[v4] checkpatch.pl: Add SPDX license tag check

Message ID 20171220234625.16521-1-robh@kernel.org
State New

Commit Message

Rob Herring Dec. 20, 2017, 11:46 p.m.
Add SPDX license tag check based on the rules defined in
Documentation/process/license-rules.rst. To summarize, SPDX license tags
should be on the 1st line (or 2nd line in scripts) using the appropriate
comment style for the file type.

Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Rob Herring <robh@kernel.org>

---
Thomas, if you are inclined and Joe is happy with this, can you add this 
on top of your series adding license-rules.rst.

v4:
- Reference license-rules.rst
- Add comment style checks based on file types
- Check .rst files

v3:
- Since we specify that the tag goes on the 1st or 2nd line, the logic
  can be greatly simplified compared to v2 because we can just use the
  line number. And now the check is improved too.

 scripts/checkpatch.pl | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

-- 
2.14.1
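
As a rough illustration of the rule summarized in the commit message, the sketch below maps a file name to the comment opener the check expects in front of the tag. The helper names spdx_comment_for and spdx_prefix_for are hypothetical and are not part of checkpatch.pl; the extension lists are taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper: return the comment opener expected for a file name,
# or nothing (undef) if the file type is not covered by the check.
sub spdx_comment_for {
    my ($file) = @_;
    return '/*' if $file =~ /\.(?:h|s|S)$/;
    return '//' if $file =~ /\.(?:c|dts|dtsi)$/;
    return '#'  if $file =~ /\.(?:sh|pl|py)$/;
    return '..' if $file =~ /\.rst$/;
    return;
}

# Hypothetical helper: build the start of the tag line expected on
# line 1 (or line 2 in a script that begins with a shebang).
sub spdx_prefix_for {
    my ($file) = @_;
    my $comment = spdx_comment_for($file);
    return defined($comment) ? "$comment SPDX-License-Identifier: " : undef;
}

for my $file (qw(foo.c foo.h foo.sh foo.rst foo.txt)) {
    my $prefix = spdx_prefix_for($file);
    printf "%-8s %s\n", $file, defined($prefix) ? $prefix : '(not checked)';
}
```

File types without an entry, such as .txt here, are simply not checked.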

Comments

Joe Perches Dec. 21, 2017, 6:28 a.m. | #1
On Wed, 2017-12-20 at 17:46 -0600, Rob Herring wrote:
> Add SPDX license tag check based on the rules defined in
> Documentation/process/license-rules.rst. To summarize, SPDX license tags
> should be on the 1st line (or 2nd line in scripts) using the appropriate
> comment style for the file type.
>
> Cc: Andy Whitcroft <apw@canonical.com>
> Cc: Joe Perches <joe@perches.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Philippe Ombredanne <pombredanne@nexb.com>
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
> Thomas, if you are inclined and Joe is happy with this, can you add this
> on top of your series adding license-rules.rst.
>
> v4:
> - Reference license-rules.rst
> - Add comment style checks based on file types
> - Check .rst files
>
> v3:
> - Since we specify that the tag goes on the 1st or 2nd line, the logic
>   can be greatly simplified compared to v2 because we can just use the
>   line number. And now the check is improved too.
>
>  scripts/checkpatch.pl | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2866,6 +2866,31 @@ sub process {
>  			}
>  		}
>  
> +# check for using SPDX license tag at beginning of files
> +		if ($rawline =~ /^\+/ && !($realline == 1 && $rawline =~ /^[\s\+]#!/)) {

This test will enter this block for every added line of the patch.

Needs to be /^[ \+]/ and not [\t\+] and probably should just be ^\+

I'd probably have something like
	my $checklicenseline = 1;
	
at the start of sub process

and use something

		if ($realline == $checklicenseline) {
			if ($realfile =~ /\.(?:sh|pl|py)/ && $rawline =~ /^[ \+]\s*\#\!/) {
				$checklicenseline = 2;
			} elsif (etc...) {
			}
		}

> +			} elsif ($realfile =~ /\.rst$/) {

> +				$comment = '..';


\.\.

What about .txt, .json, .cocci, and .awk ?
Philippe Ombredanne Dec. 21, 2017, 7:15 a.m. | #2
Rob,

On Thu, Dec 21, 2017 at 12:46 AM, Rob Herring <robh@kernel.org> wrote:
> Add SPDX license tag check based on the rules defined in
> Documentation/process/license-rules.rst. To summarize, SPDX license tags
> should be on the 1st line (or 2nd line in scripts) using the appropriate
> comment style for the file type.
>
> Cc: Andy Whitcroft <apw@canonical.com>
> Cc: Joe Perches <joe@perches.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Philippe Ombredanne <pombredanne@nexb.com>
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
> Thomas, if you are inclined and Joe is happy with this, can you add this
> on top of your series adding license-rules.rst.
>
> v4:
> - Reference license-rules.rst
> - Add comment style checks based on file types
> - Check .rst files
>
> v3:
> - Since we specify that the tag goes on the 1st or 2nd line, the logic
>   can be greatly simplified compared to v2 because we can just use the
>   line number. And now the check is improved too.
>
>  scripts/checkpatch.pl | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 31031f10fe56..0324f845011d 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2866,6 +2866,31 @@ sub process {
>                         }
>                 }
>
> +# check for using SPDX license tag at beginning of files
> +               if ($rawline =~ /^\+/ && !($realline == 1 && $rawline =~ /^[\s\+]#!/)) {
> +                       my $ln = 1;
> +                       my $comment = "";
> +
> +                       if ($realfile =~ /\.(h|s|S)$/) {
> +                               $comment = '/\*';
> +                       } elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
> +                               $comment = '//';
> +                       } elsif ($realfile =~ /\.(sh|pl|py)$/) {
> +                               if ($prevrawline =~ /^[\s\+]#!/) {
> +                                       $ln = 2;
> +                               }
> +                               $comment = '#';
> +                       } elsif ($realfile =~ /\.rst$/) {
> +                               $comment = '..';
> +                       }
> +
> +                       if ($comment !~ /^$/ &&
> +                           ($realline == $ln xor $rawline =~ m@^\+$comment SPDX-License-Identifier: @)) {
> +                               WARN("SPDX_LICENSE_TAG",
> +                                    "Missing or malformed SPDX-License-Identifier tag in 1st (or 2nd for scripts) line\n" . $herecurr);
> +                       }
> +               }
> +
>  # check we are in a valid source file if not then ignore this hunk
>                 next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);
>
> --
> 2.14.1
>

My Perl is terribly rusty. But heck, this is checkpatch.pl, not
checkpatch.py ;) This looks good to me though.

FWIW I maintain a comprehensive license expression parser and boolean
minimizer [1] that could be a nice addition but is likely overkill even
for deeper checks.

Instead, in the future what we could add to checkpatch.pl could be
some simple table lookup to ensure that the actual expression is a
known one since we have a finite number of licenses in the kernel.

Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>


[1] https://github.com/nexB/license-expression/
-- 
Cordially
Philippe Ombredanne
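
The table lookup floated in comment #2 could look roughly like the sketch below. It is only an illustration of the idea: the table holds a small illustrative subset of identifiers rather than the kernel's actual list, and the helper name known_spdx_expression is hypothetical.

```perl
use strict;
use warnings;

# Illustrative subset only; the kernel's real list of licenses is longer.
my %known_license = map { $_ => 1 } ('GPL-2.0', 'GPL-2.0+', 'BSD-3-Clause', 'MIT');

# Hypothetical helper: return 1 if the line carries a tag whose whole
# expression is in the table, 0 if the expression is unknown, and
# nothing (undef) if the line has no tag at all.
sub known_spdx_expression {
    my ($line) = @_;
    # Capture the expression, dropping a trailing "*/" for C block comments.
    return unless $line =~ /SPDX-License-Identifier:\s+(.+?)\s*(?:\*\/)?\s*$/;
    return exists($known_license{$1}) ? 1 : 0;
}

for my $line ('// SPDX-License-Identifier: GPL-2.0',
              '/* SPDX-License-Identifier: MIT */',
              '# SPDX-License-Identifier: Foo-1.0') {
    printf "%-40s %s\n", $line, known_spdx_expression($line) ? 'known' : 'unknown';
}
```

A plain hash lookup like this treats the whole expression as one key, so compound expressions would each need their own table entry; that is the trade-off against a real expression parser.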
Rob Herring Dec. 21, 2017, 5:04 p.m. | #3
On Wed, Dec 20, 2017 at 10:28:48PM -0800, Joe Perches wrote:
> On Wed, 2017-12-20 at 17:46 -0600, Rob Herring wrote:
> > Add SPDX license tag check based on the rules defined in
> > Documentation/process/license-rules.rst. To summarize, SPDX license tags
> > should be on the 1st line (or 2nd line in scripts) using the appropriate
> > comment style for the file type.
> >
> > Cc: Andy Whitcroft <apw@canonical.com>
> > Cc: Joe Perches <joe@perches.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Philippe Ombredanne <pombredanne@nexb.com>
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---
> > Thomas, if you are inclined and Joe is happy with this, can you add this
> > on top of your series adding license-rules.rst.
> >
> > v4:
> > - Reference license-rules.rst
> > - Add comment style checks based on file types
> > - Check .rst files
> >
> > v3:
> > - Since we specify that the tag goes on the 1st or 2nd line, the logic
> >   can be greatly simplified compared to v2 because we can just use the
> >   line number. And now the check is improved too.
> >
> >  scripts/checkpatch.pl | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
> > @@ -2866,6 +2866,31 @@ sub process {
> >  			}
> >  		}
> >  
> > +# check for using SPDX license tag at beginning of files
> > +		if ($rawline =~ /^\+/ && !($realline == 1 && $rawline =~ /^[\s\+]#!/)) {
>
> This test will enter this block for every added line of the patch.
>
> Needs to be /^[ \+]/ and not [\t\+] and probably should just be ^\+
>
> I'd probably have something like
> 	my $checklicenseline = 1;
>
> at the start of sub process
>
> and use something
>
> 		if ($realline == $checklicenseline) {
> 			if ($realfile =~ /\.(?:sh|pl|py)/ && $rawline =~ /^[ \+]\s*\#\!/) {
> 				$checklicenseline = 2;
> 			} elsif (etc...) {
> 			}
> 		}

Okay, here's what I've ended up with:

		if ($realline == $checklicenseline) {
			if ($realfile =~ /\.(?:sh|pl|py)/ && $rawline =~ /^[ \+]\s*\#\!/) {
				$checklicenseline = 2;
			} elsif ($rawline =~ /^\+/) {
				my $comment = "";
				if ($realfile =~ /\.(h|s|S)$/) {
					$comment = '/\*';
				} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
					$comment = '//';
				} elsif ($realfile =~ /\.(sh|pl|py)$/) {
					$comment = '#';
				} elsif ($realfile =~ /\.rst$/) {
					$comment = '\.\.';
				}

				if ($comment !~ /^$/ &&
				    $rawline !~ m@^\+$comment SPDX-License-Identifier: @) {
					WARN("SPDX_LICENSE_TAG",
					     "Missing or malformed SPDX-License-Identifier tag in 1st (or 2nd for scripts) line\n" . $herecurr);
				}
			}
		}


>
> > +			} elsif ($realfile =~ /\.rst$/) {
> > +				$comment = '..';
>
> \.\.
>
> What about .txt, .json, .cocci, and .awk ?

What about them? They aren't documented in license-rules.rst and I'm 
just implementing what's documented as you said I should on v3.

Rob
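
To try the reworked check from comment #3 outside checkpatch.pl, a stand-alone model along these lines may help. It is a sketch under assumptions: the sub spdx_warns is hypothetical, it takes a file name plus the raw patch lines of a newly added file (each starting with "+"), and it returns 1 where checkpatch would call WARN instead of printing a message.

```perl
use strict;
use warnings;

# Hypothetical stand-alone model of the reworked check.
# Returns 1 if the warning would fire for this file, 0 otherwise.
sub spdx_warns {
    my ($realfile, @rawlines) = @_;
    my $checklicenseline = 1;    # line on which the tag is expected
    my $realline = 0;

    for my $rawline (@rawlines) {
        $realline++;
        next unless $realline == $checklicenseline;

        if ($realfile =~ /\.(?:sh|pl|py)$/ && $rawline =~ /^[ \+]\s*\#\!/) {
            # A shebang on line 1 of a script moves the tag to line 2.
            $checklicenseline = 2;
        } elsif ($rawline =~ /^\+/) {
            my $comment = "";
            if ($realfile =~ /\.(?:h|s|S)$/) {
                $comment = '/\*';
            } elsif ($realfile =~ /\.(?:c|dts|dtsi)$/) {
                $comment = '//';
            } elsif ($realfile =~ /\.(?:sh|pl|py)$/) {
                $comment = '#';
            } elsif ($realfile =~ /\.rst$/) {
                $comment = '\.\.';
            }

            # $comment is already regex-escaped, so it is interpolated as is.
            if ($comment !~ /^$/ &&
                $rawline !~ m@^\+$comment SPDX-License-Identifier: @) {
                return 1;
            }
        }
    }
    return 0;
}

print spdx_warns('foo.c', '+// SPDX-License-Identifier: GPL-2.0'), "\n";
print spdx_warns('run.sh', '+#!/bin/sh', '+echo hi'), "\n";
```

Unlike the v4 patch, only the line numbered $checklicenseline is ever inspected, so later added lines never reach the comment-style test.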
Joe Perches Dec. 21, 2017, 5:22 p.m. | #4
On Thu, 2017-12-21 at 11:04 -0600, Rob Herring wrote:
> Okay, here's what I've ended up with:
>
> 		if ($realline == $checklicenseline) {
> 			if ($realfile =~ /\.(?:sh|pl|py)/ && $rawline =~ /^[ \+]\s*\#\!/) {
> 				$checklicenseline = 2;
> 			} elsif ($rawline =~ /^\+/) {
> 				my $comment = "";
> 				if ($realfile =~ /\.(h|s|S)$/) {
> 					$comment = '/\*';
> 				} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
> 					$comment = '//';
> 				} elsif ($realfile =~ /\.(sh|pl|py)$/) {
> 					$comment = '#';
> 				} elsif ($realfile =~ /\.rst$/) {
> 					$comment = '\.\.';
> 				}
>
> 				if ($comment !~ /^$/ &&
> 				    $rawline !~ m@^\+$comment SPDX-License-Identifier: @) {
> 					WARN("SPDX_LICENSE_TAG",
> 					     "Missing or malformed SPDX-License-Identifier tag in 1st (or 2nd for scripts) line\n" . $herecurr);
> 				}
> 			}
> 		}

Seems sensible enough.

Maybe it's better to use \Q$comment\E and a consistent
style on comment and rawline.

Any checkpatch patch for license style requirements should
not be applied until after Documentation/process/license-rules.rst
is in -next.
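
The point of the \Q$comment\E suggestion can be seen in a small sketch. The helper has_tag and the sample lines are hypothetical; the third argument selects whether the comment string is quoted before it is interpolated into the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: does $line start with "+<comment> SPDX-License-Identifier: "?
# With $quote set, \Q...\E makes every character of $comment literal.
sub has_tag {
    my ($line, $comment, $quote) = @_;
    return $quote
        ? ($line =~ /^\+\Q$comment\E SPDX-License-Identifier: / ? 1 : 0)
        : ($line =~ /^\+$comment SPDX-License-Identifier: /     ? 1 : 0);
}

my $bogus = '+ab SPDX-License-Identifier: GPL-2.0';   # not an .rst comment
my $good  = '+.. SPDX-License-Identifier: GPL-2.0';

print has_tag($bogus, '..', 0), "\n";   # prints 1: unquoted dots match any character
print has_tag($bogus, '..', 1), "\n";   # prints 0: quoted dots are literal
print has_tag($good,  '..', 1), "\n";   # prints 1
```

With \Q...\E the comment strings can be stored unescaped ('/*', '..') instead of hand-escaping each one.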

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 31031f10fe56..0324f845011d 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2866,6 +2866,31 @@  sub process {
 			}
 		}
 
+# check for using SPDX license tag at beginning of files
+		if ($rawline =~ /^\+/ && !($realline == 1 && $rawline =~ /^[\s\+]#!/)) {
+			my $ln = 1;
+			my $comment = "";
+
+			if ($realfile =~ /\.(h|s|S)$/) {
+				$comment = '/\*';
+			} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
+				$comment = '//';
+			} elsif ($realfile =~ /\.(sh|pl|py)$/) {
+				if ($prevrawline =~ /^[\s\+]#!/) {
+					$ln = 2;
+				}
+				$comment = '#';
+			} elsif ($realfile =~ /\.rst$/) {
+				$comment = '..';
+			}
+
+			if ($comment !~ /^$/ &&
+			    ($realline == $ln xor $rawline =~ m@^\+$comment SPDX-License-Identifier: @)) {
+				WARN("SPDX_LICENSE_TAG",
+				     "Missing or malformed SPDX-License-Identifier tag in 1st (or 2nd for scripts) line\n" . $herecurr);
+			}
+		}
+
 # check we are in a valid source file if not then ignore this hunk
 		next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);