Message ID | 20171109011047.20040-1-robh@kernel.org |
---|---|
State | New |
Headers | show |
Series | checkpatch.pl: Add SPDX license tag check | expand |
On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: > Add a check warning if SPDX-License-Identifier tags are not used in > newly added files. If this is to be done, and I think it's not a great idea, there are better ways of doing this that emit this warning on a per-file basis instead of a per-patch.
On Wed, Nov 08, 2017 at 06:10:19PM -0800, Joe Perches wrote: > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: > > Add a check warning if SPDX-License-Identifier tags are not used in > > newly added files. > > If this is to be done, and I think it's not a great idea, > there are better ways of doing this that emit this warning > on a per-file basis instead of a per-patch. Any hints as to how to do that? :) thanks, greg k-h
On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@perches.com> wrote: > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: >> Add a check warning if SPDX-License-Identifier tags are not used in >> newly added files. > > If this is to be done, and I think it's not a great idea, Which part? SPDX tags or checking new files or just using checkpatch for this? > there are better ways of doing this that emit this warning > on a per-file basis instead of a per-patch. You had mentioned using something like checkincludes.pl before. The problem I see with that is few people run those tools. Lots of people run checkpatch.pl. We want this to be correct when added, not after the fact by someone else who is not the author. It fits in the workflow, because if checkpatch doesn't catch it, then I have to in reviews. I do agree though that the implementation is a bit ugly given the line by line way checkpatch works. Rob
On Thu, 2017-11-09 at 10:12 +0100, Greg Kroah-Hartman wrote: > On Wed, Nov 08, 2017 at 06:10:19PM -0800, Joe Perches wrote: > > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: > > > Add a check warning if SPDX-License-Identifier tags are not used in > > > newly added files. > > > > If this is to be done, and I think it's not a great idea, > > there are better ways of doing this that emit this warning > > on a per-file basis instead of a per-patch. > > Any hints as to how to do that? :) Sure, but only after it's decided if adding spdx license tag to all compilable source files is appropriate. And if that's true, then that requirement also has to be added in the Documentation directory somewhere.
On Thu, 2017-11-09 at 07:47 -0600, Rob Herring wrote: > On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@perches.com> wrote: > > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: > > > Add a check warning if SPDX-License-Identifier tags are not used in > > > newly added files. > > > > If this is to be done, and I think it's not a great idea, > > Which part? SPDX tags or checking new files or just using checkpatch for this? SPDX tags in all files. There's no real way to check a patch for this. You have to check the entire file. checkpatch could, as you've done, scan for new files against /dev/null, but a single patch can add multiple files and each newly added file should have a missing SPDX indicator check. My concern is that there are ~50,000 files in the kernel source tree and, after that scripted patch adding the tags, only about a quarter of them have an SPDX tag. So which files actually _need_ a SPDX tag? files in -next with an SPDX tag: $ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \ while read file ; do basename $file ; done | \ sed -r -e 's/^.*(\..*)/\1/' | \ sort | uniq -c | sort -rn | head -10 7514 .h 3435 .c 1193 Makefile 486 .S 221 .dts 186 Kconfig 185 .dtsi 97 .sh 34 .tc 24 .debug vs all files in -next (not Documentation/) $ git ls-files | grep -v "^Documentation/" | \ while read file ; do basename $file ; done | \ sed -r -e 's/^.*(\..*)/\1/' | \ sort | uniq -c | sort -rn | head -10 25946 .c 20360 .h 2437 Makefile 1454 .S 1442 .dts 1380 Kconfig 1099 .dtsi 207 .json 204 .gitignore 194 .sh
On Thu, Nov 9, 2017 at 9:39 AM, Joe Perches <joe@perches.com> wrote: > On Thu, 2017-11-09 at 07:47 -0600, Rob Herring wrote: >> On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@perches.com> wrote: >> > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: >> > > Add a check warning if SPDX-License-Identifier tags are not used in >> > > newly added files. >> > >> > If this is to be done, and I think it's not a great idea, >> >> Which part? SPDX tags or checking new files or just using checkpatch for this? > > SPDX tags in all files. > > There's no real way to check a patch for this. > > You have to check the entire file. Changing existing files is a separate problem. There is a script for that (though the data file is not public). I'm only worried with new files here because that's what I review and have to tell folks to replace their 2 pages of license text with SPDX tags. (It will be much easier to just tell them to run checkpatch. ;) ). > checkpatch could, as you've done, scan for new files > against /dev/null, but a single patch can add > multiple files and each newly added file should have > a missing SPDX indicator check. I was going with the easy route of just giving one warning per patch. I'd hope that's enough info for folks to figure out what's needed from there. However, it should be possible to make it per file. The main complication is we need to look for either '^+++' or the end of the patch which I didn't see an easy/clean way to do. > My concern is that there are ~50,000 files in the > kernel source tree and, after that scripted patch > adding the tags, only about a quarter of them have > an SPDX tag. > > So which files actually _need_ a SPDX tag? > > files in -next with an SPDX tag: > > $ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \ > while read file ; do basename $file ; done | \ > sed -r -e 's/^.*(\..*)/\1/' | \ > sort | uniq -c | sort -rn | head -10 > 7514 .h > 3435 .c > 1193 Makefile > 486 .S > 221 .dts > 186 Kconfig > 185 .dtsi > 97 .sh > 34 .tc > 24 .debug > > vs all files in -next (not Documentation/) > > $ git ls-files | grep -v "^Documentation/" | \ > while read file ; do basename $file ; done | \ > sed -r -e 's/^.*(\..*)/\1/' | \ > sort | uniq -c | sort -rn | head -10 > 25946 .c > 20360 .h > 2437 Makefile > 1454 .S > 1442 .dts > 1380 Kconfig > 1099 .dtsi > 207 .json > 204 .gitignore > 194 .sh >
On Thu, 2017-11-09 at 12:12 -0600, Rob Herring wrote: > On Thu, Nov 9, 2017 at 9:39 AM, Joe Perches <joe@perches.com> wrote: > > On Thu, 2017-11-09 at 07:47 -0600, Rob Herring wrote: > > > On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@perches.com> wrote: > > > > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote: > > > > > Add a check warning if SPDX-License-Identifier tags are not used in > > > > > newly added files. > > > > > > > > If this is to be done, and I think it's not a great idea, > > > > > > Which part? SPDX tags or checking new files or just using checkpatch for this? > > > > SPDX tags in all files. Is having an SPDX tag in every file really desired? > > > > There's no real way to check a patch for this. > > > > You have to check the entire file. > > Changing existing files is a separate problem. There is a script for > that (though the data file is not public). I'm only worried with new > files here because that's what I review and have to tell folks to > replace their 2 pages of license text with SPDX tags. (It will be much > easier to just tell them to run checkpatch. ;) ). > > > checkpatch could, as you've done, scan for new files > > against /dev/null, but a single patch can add > > multiple files and each newly added file should have > > a missing SPDX indicator check. > > I was going with the easy route of just giving one warning per patch. > I'd hope that's enough info for folks to figure out what's needed from > there. However, it should be possible to make it per file. The main > complication is we need to look for either '^+++' or the end of the > patch which I didn't see an easy/clean way to do. EOF is easy. There already is a $realfile test for start of file. > > My concern is that there are ~50,000 files in the > > kernel source tree and, after that scripted patch > > adding the tags, only about a quarter of them have > > an SPDX tag. > > > > So which files actually _need_ a SPDX tag? > > > > files in -next with an SPDX tag: > > > > $ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \ > > while read file ; do basename $file ; done | \ > > sed -r -e 's/^.*(\..*)/\1/' | \ > > sort | uniq -c | sort -rn | head -10 > > 7514 .h > > 3435 .c > > 1193 Makefile > > 486 .S > > 221 .dts > > 186 Kconfig > > 185 .dtsi > > 97 .sh > > 34 .tc > > 24 .debug > > > > vs all files in -next (not Documentation/) > > > > $ git ls-files | grep -v "^Documentation/" | \ > > while read file ; do basename $file ; done | \ > > sed -r -e 's/^.*(\..*)/\1/' | \ > > sort | uniq -c | sort -rn | head -10 > > 25946 .c > > 20360 .h > > 2437 Makefile > > 1454 .S > > 1442 .dts > > 1380 Kconfig > > 1099 .dtsi > > 207 .json > > 204 .gitignore > > 194 .sh > >
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 8b80bac055e4..6665735123e5 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2185,6 +2185,8 @@ sub process { my $commit_log_has_diff = 0; my $reported_maintainer_file = 0; my $non_utf8_charset = 0; + my $added_file = 0; + my $missing_spdx_license = 0; my $last_blank_line = 0; my $last_coalesced_string_linenr = -1; @@ -2368,6 +2370,16 @@ sub process { $here = "#$linenr: " if (!$file); $here = "#$realline: " if ($file); + # determine if this is a new file + if ($line =~ m@^\-\-\-\s/@) { + if ($line =~ m@/dev/null@) { + $added_file = 1; + $missing_spdx_license++; + } else { + $added_file = 0; + } + } + my $found_file = 0; # extract the filename as it passes if ($line =~ /^diff --git.*?(\S+)$/) { @@ -2865,6 +2877,14 @@ sub process { } } +# check for using SPDX tag instead of free form license text + if ($added_file && + ($rawline =~ /\bSPDX-License-Identifier/ || + $realfile =~ /Documentation/)) { + $missing_spdx_license--; + $added_file = 0; + } + # check we are in a valid source file if not then ignore this hunk next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/); @@ -6399,6 +6419,11 @@ sub process { "Missing Signed-off-by: line(s)\n"); } + if ($missing_spdx_license) { + WARN("SPDX_LICENSE_TAG", + "Missing SPDX-License-Identifier tags in added files. Use tags instead of full license text.\n"); + } + print report_dump(); if ($summary && !($clean == 1 && $quiet == 1)) { print "$filename " if ($summary_file);
Add a check warning if SPDX-License-Identifier tags are not used in newly added files. Cc: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rob Herring <robh@kernel.org> --- I rewrote my previous version to check more than just dts files. It also now looks for a tag in added files rather than trying a fuzzy match on freeform license text. scripts/checkpatch.pl | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) -- 2.14.1