[v6] checkpatch.pl: Add SPDX license tag check

Message ID 20180202154026.15298-1-robh@kernel.org
State Accepted
Commit 9f3a89926d6dfc30a4fd1bbcb92cc7b218d3786d
Headers show
Series
  • [v6] checkpatch.pl: Add SPDX license tag check
Related show

Commit Message

Rob Herring Feb. 2, 2018, 3:40 p.m.
Add SPDX license tag check based on the rules defined in
Documentation/process/license-rules.rst. To summarize, SPDX license tags
should be on the 1st line (or 2nd line in scripts) using the appropriate
comment style for the file type.

Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Rob Herring <robh@kernel.org>

---
v6:
- Dropped script extension check and only look for #!/... on 1st line. A 
  text executable file was not reliable either.
- Support .awk and .tc which may or may not have a #!/.
- Fixed a typo in script "#!" regex and also match on first /.
- Add Greg's ack.

 scripts/checkpatch.pl | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

-- 
2.14.1

Comments

Igor Stoppa Feb. 2, 2018, 3:49 p.m. | #1
On 02/02/18 17:40, Rob Herring wrote:
> Add SPDX license tag check based on the rules defined in


Shouldn't it also check that the license is compatible?

[...]

> +				} elsif ($realfile =~ /\.rst$/) {

> +					$comment = '..';


What is the correct/acceptable license for documentation?
Creative Commons? AFAIK GPL is for source code.

Googling didn't bring the wished-for enlightenment.

--
igor
gregkh Feb. 2, 2018, 4:12 p.m. | #2
On Fri, Feb 02, 2018 at 05:49:20PM +0200, Igor Stoppa wrote:
> On 02/02/18 17:40, Rob Herring wrote:

> > Add SPDX license tag check based on the rules defined in

> 

> Shouldn't it also check that the license is compatible?


Baby steps please :)

> [...]

> 

> > +				} elsif ($realfile =~ /\.rst$/) {

> > +					$comment = '..';

> 

> What is the correct/acceptable license for documentation?

> Creative Commons? AFAIK GPL is for source code.

> 

> Googling didn't bring the wished-for enlightenment.


It depends on what you want to allow the documentation to be used for.
It's not a simple answer :(

thanks,

greg k-h
Jonathan Corbet Feb. 2, 2018, 4:17 p.m. | #3
On Fri, 2 Feb 2018 17:12:46 +0100
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> > What is the correct/acceptable license for documentation?

> > Creative Commons? AFAIK GPL is for source code.

> > 

> > Googling didn't bring the wished-for enlightenment.  

> 

> It depends on what you want to allow the documentation to be used for.

> It's not a simple answer :(


Honestly, GPL (or more permissive) is the only thing that really makes
sense.  Much of the documentation, once processed, includes an awful lot
of stuff directly from the kernel source; there's really no way to say
that it's not a derived product of the kernel.  So the output of "make
htmldocs" or "make pdfdocs" really has to be GPL, suggesting that the
input should be GPL-compatible.

GPL isn't the best license for documentation.  If we were starting today
I'd try to find a way to use CC-SA instead, but that's not where we're at.
And GPL is workable enough, I think.

At least that's how I see it, not that I really know any more than anybody
else.

jon
Rob Herring Feb. 2, 2018, 6:27 p.m. | #4
On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:
> On 02/02/18 17:40, Rob Herring wrote:

>> Add SPDX license tag check based on the rules defined in

>

> Shouldn't it also check that the license is compatible?

>


Perhaps we shouldn't try to script legal advice.

> [...]

>

>> +                             } elsif ($realfile =~ /\.rst$/) {

>> +                                     $comment = '..';

>

> What is the correct/acceptable license for documentation?

> Creative Commons? AFAIK GPL is for source code.


At least for the DT bindings, we probably want those to be Apache v2
because that's what the DT spec is. But that really only matters on
the common bindings. But then are device specific bindings derived
from those? And we have include/dt-bindings/ which are part of binding
definition/doc and those get used in dts (GPL, MIT or BSD) and source
files. In summary, it's complicated...

Rob
Joe Perches Feb. 2, 2018, 7:06 p.m. | #5
On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:
> On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:

> > On 02/02/18 17:40, Rob Herring wrote:

> > > Add SPDX license tag check based on the rules defined in

> > 

> > Shouldn't it also check that the license is compatible?

> > 

> 

> Perhaps we shouldn't try to script legal advice.


True.

I believe what was meant was that the
entry was a valid SPDX License entry
that already exists as a specific file
in the LICENSES/ path.

So that entry must be some combination of:

$ git ls-files LICENSES/ | cut -f3- -d'/' | sort
BSD-2-Clause
BSD-3-Clause
BSD-3-Clause-Clear
GPL-1.0
GPL-2.0
LGPL-2.0
LGPL-2.1
Linux-syscall-note
MIT
MPL-1.1

From my perspective, it'd be better if the
various + uses had their own individual
license files in the LICENSES/ path.

Right now, there are many missing licenses
that are already used by various existing
SPDX-License-Identifier: entries.


APACHE-2.0
BSD
CDDL
CDDL-1.0
ISC
GPL-1.0+
GPL-2.0+
LGPL-2.1+
OpenSSL

There are odd entries like:

GPL-2.0-only

Parentheses around AND/OR aren't consistent.
Joe Perches Feb. 2, 2018, 8:55 p.m. | #6
On Fri, 2018-02-02 at 14:18 -0600, Kate Stewart wrote:
> On Fri, Feb 2, 2018 at 1:06 PM, Joe Perches <joe@perches.com> wrote:

> 

> > On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:

> > > On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com>

> > 

> > wrote:

> > > > On 02/02/18 17:40, Rob Herring wrote:

> > > > > Add SPDX license tag check based on the rules defined in

> > > > 

> > > > Shouldn't it also check that the license is compatible?

> > > > 

> > > 

> > > Perhaps we shouldn't try to script legal advice.

> > 

> > True.

> > 

> > I believe what was meant was that the

> > entry was a valid SPDX License entry

> > that already exists as a specific file

> > in the LICENSES/ path.

> > 

> > So that entry must be some combination of:

> > 

> > $ git ls-files LICENSES/ | cut -f3- -d'/' | sort

> > BSD-2-Clause

> > BSD-3-Clause

> > BSD-3-Clause-Clear

> > GPL-1.0

> > GPL-2.0

> > LGPL-2.0

> > LGPL-2.1

> > Linux-syscall-note

> > MIT

> > MPL-1.1

> > 

> > From my perspective, it'd be better if the

> > various + uses had their own individual

> > license files in the LICENSES/ path.

> > 

> 

> At the end of december, the SPDX license list[1] was rev'd to

> Version: 3.0 28 December 2017.   At the request of

> FSF, the GNU license family would not use the "+" notation,


That's rather more sensible to me.

This should probably be updated in linux-next in
the near future rather than later.

> and would bias towards using "-only" and "-or-later", explicitly.

> So adding both variants to the LICENSES/ path aligns with

> this forward direction.


It's probably better to remove the + variants.

> > Right now, there are many missing licenses

> > that are already used by various existing

> > SPDX-License-Identifier: entries.

> > 

> > 

> > APACHE-2.0

> > BSD

> > CDDL


CDDL does not exist standalone and was caused by my
defective eyeballs when scanning the SPDX list via:

$ git grep -w "SPDX-License-Identifier:" | \
  cut -f3- -d":" | \
  sed -r -e 's/^\s+//' -e 's/\*\/\s*//' -e 's/\s+$//' | \
  sort | uniq -c | sort -rn

> > CDDL-1.0

> > ISC

> > GPL-1.0+

> > GPL-2.0+

> > LGPL-2.1+

> > OpenSSL

> > 

> > There are odd entries like:

> > 

> > GPL-2.0-only

> > 

> 

> This is the new way to represent GPLv2 only, as described above.

> While the GPL-2.0 and GPL-2.0+ notation is still valid,  it is deprecated

> in the latest version, so transitioning existing over time will probably

> be needed.


Probably better to remove and replace the old notation
instead of doing it piecemeal.

When the appropriate LICENSE file changes exist, a
generic substitution could work well.

$ git grep --name-only "SPDX-License-Identifier:" | \
  grep -vP "^(?:LICENSES/|Documentation/process/license-rules\.rst)" | \
  xargs perl -p -i -e 's/SPDX-License-Identifier:\s*(L?GPL-\d\.\d)\+/SPDX-License-Identifier: \1-or-later/;s/SPDX-License-Identifier:\s*(L?GPL-\d\.\d)(?!-or-later)/SPDX-License-Identifier: \1-only/'

> So I think the list of licenses to be added to

> LICENSES/ path is:

> 

> APACHE-2.0

> BSD

> CDDL

> CDDL-1.0

> ISC

> GPL-1.0-only

> GPL-1.0-or-later (note: actually same contents as one GPL-1.0-only)

> GPL-2.0-only

> GPL-2.0-or-later (same contents as GPL-2.0-only)

> LGPL-2.0-only

> LGPL-2.0-or-later (same contents as LGPL-2.0-only)

> LGPL-2.1-only

> LGPL-2.1-or-later (same contents as LGPL-2.1-only)


If LGPL-n.m -only and -or-later are the same,
there's probably no need for duplicate LICENSE
files and just making sure LGPL-n.m without any
other wording is exclusively used is proper.

> OpenSSL

> 

> Having files with the same contents, but different names is

> irritating, but I can't see a another way of complying with REUSE

> guidelines.   Any better suggestions?


What and where are the REUSE guidelines?

https://reuse.software/dev/
doesn't show me much.

> 

> 

> > Parentheses around AND/OR aren't consistent.

> > 

> 

> The SPDX specification has an appendix that calls for "(",")"

> around every license expresssion.   After discussion with some

> developers it was decided to be ok to relax that, and only add them

> when they were essential to clarify the logic.   The next rev of the

> SPDX specification will have this clarified as well.   I think we caught

> most of the changes in the kernel documentation patches for describing

> this,  but if you have specific cases to be reviewed,  happy to have

> a look.

> 

> Thanks, Kate

> 

> 

> [1] https://spdx.org/licenses/
Rob Herring Feb. 2, 2018, 8:57 p.m. | #7
On Fri, Feb 2, 2018 at 1:06 PM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:

>> On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:

>> > On 02/02/18 17:40, Rob Herring wrote:

>> > > Add SPDX license tag check based on the rules defined in

>> >

>> > Shouldn't it also check that the license is compatible?

>> >

>>

>> Perhaps we shouldn't try to script legal advice.

>

> True.

>

> I believe what was meant was that the

> entry was a valid SPDX License entry

> that already exists as a specific file

> in the LICENSES/ path.

>

> So that entry must be some combination of:

>

> $ git ls-files LICENSES/ | cut -f3- -d'/' | sort

> BSD-2-Clause

> BSD-3-Clause

> BSD-3-Clause-Clear

> GPL-1.0

> GPL-2.0

> LGPL-2.0

> LGPL-2.1

> Linux-syscall-note

> MIT

> MPL-1.1

>

> From my perspective, it'd be better if the

> various + uses had their own individual

> license files in the LICENSES/ path.

>

> Right now, there are many missing licenses

> that are already used by various existing

> SPDX-License-Identifier: entries.

>

>

> APACHE-2.0


Given that Apache 2.0 is not compatible with GPL 2, that would pretty
much mean anything with Apache license is dual licensed and it would
be the other license that applies. Do we really want/need license
texts for all the other possible licenses that don't apply to kernel
files? If so, this should all just be scripted to sync LICENSES/ with
found SPDX tags in the kernel.

> BSD

> CDDL

> CDDL-1.0

> ISC

> GPL-1.0+

> GPL-2.0+

> LGPL-2.1+

> OpenSSL


ISC and OpenSSL are only in license-rules.rst as examples. We should
just fix those examples to something else.

Rob
Joe Perches Feb. 2, 2018, 9:10 p.m. | #8
On Fri, 2018-02-02 at 14:57 -0600, Rob Herring wrote:
> On Fri, Feb 2, 2018 at 1:06 PM, Joe Perches <joe@perches.com> wrote:

[]
> > Right now, there are many missing licenses

> > that are already used by various existing

> > SPDX-License-Identifier: entries.

> > 

> > APACHE-2.0

> 

> Given that Apache 2.0 is not compatible with GPL 2, that would pretty

> much mean anything with Apache license is dual licensed and it would

> be the other license that applies. Do we really want/need license

> texts for all the other possible licenses that don't apply to kernel

> files?


There are 2 uses of SPDX.*Apache in staging.

drivers/staging/android/ashmem.h:// SPDX-License-Identifier: (GPL-2.0 OR Apache-2.0)
drivers/staging/android/uapi/ashmem.h:// SPDX-License-Identifier: (GPL-2.0 OR Apache-2.0)

I believe each specific license text should be
available in either the actual source file with
the Apache license or in a specific LICENSES/
file in the kernel tree.

> If so, 

> this should all just be scripted to sync LICENSES/ with

> found SPDX tags in the kernel.


Right.  That's probably best.

> ISC and OpenSSL are only in license-rules.rst as examples. We should

> just fix those examples to something else.


Sounds good to me.

I should have excluded Documentation/process/license-rules.rst
from the SPDX matches.

Are significant portions of that document generic across
multiple projects or is it entirely specific to linux?

cheers, Joe
Joe Perches Feb. 2, 2018, 9:18 p.m. | #9
On Fri, 2018-02-02 at 09:40 -0600, Rob Herring wrote:
> Add SPDX license tag check based on the rules defined in

> Documentation/process/license-rules.rst. To summarize, SPDX license tags

> should be on the 1st line (or 2nd line in scripts) using the appropriate

> comment style for the file type.

> 

> Cc: Andy Whitcroft <apw@canonical.com>

> Cc: Joe Perches <joe@perches.com>

> Cc: Thomas Gleixner <tglx@linutronix.de>

> Cc: Philippe Ombredanne <pombredanne@nexb.com>

> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


> Signed-off-by: Rob Herring <robh@kernel.org>


Signed-off-by: Joe Perches <joe@perches.com>


> ---

> v6:

> - Dropped script extension check and only look for #!/... on 1st line. A 

>   text executable file was not reliable either.

> - Support .awk and .tc which may or may not have a #!/.

> - Fixed a typo in script "#!" regex and also match on first /.

> - Add Greg's ack.

> 

>  scripts/checkpatch.pl | 27 +++++++++++++++++++++++++++

>  1 file changed, 27 insertions(+)

> 

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl

> index ba03f17ff662..6db245e5f93b 100755

> --- a/scripts/checkpatch.pl

> +++ b/scripts/checkpatch.pl

> @@ -2225,6 +2225,8 @@ sub process {

>  

>  	my $camelcase_file_seeded = 0;

>  

> +	my $checklicenseline = 1;

> +

>  	sanitise_line_reset();

>  	my $line;

>  	foreach my $rawline (@rawlines) {

> @@ -2416,6 +2418,7 @@ sub process {

>  			} else {

>  				$check = $check_orig;

>  			}

> +			$checklicenseline = 1;

>  			next;

>  		}

>  

> @@ -2866,6 +2869,30 @@ sub process {

>  			}

>  		}

>  

> +# check for using SPDX license tag at beginning of files

> +		if ($realline == $checklicenseline) {

> +			if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {

> +				$checklicenseline = 2;

> +			} elsif ($rawline =~ /^\+/) {

> +				my $comment = "";

> +				if ($realfile =~ /\.(h|s|S)$/) {

> +					$comment = '/*';

> +				} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {

> +					$comment = '//';

> +				} elsif (($checklicenseline == 2) || $realfile =~ /\.(sh|pl|py|awk|tc)$/) {

> +					$comment = '#';

> +				} elsif ($realfile =~ /\.rst$/) {

> +					$comment = '..';

> +				}

> +

> +				if ($comment !~ /^$/ &&

> +				    $rawline !~ /^\+\Q$comment\E SPDX-License-Identifier: /) {

> +					WARN("SPDX_LICENSE_TAG",

> +					     "Missing or malformed SPDX-License-Identifier tag in line $checklicenseline\n" . $herecurr);

> +				}

> +			}

> +		}

> +

>  # check we are in a valid source file if not then ignore this hunk

>  		next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);

>
Igor Stoppa Feb. 3, 2018, 1:41 p.m. | #10
On 02/02/18 21:06, Joe Perches wrote:
> On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:

>> On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:

>>> On 02/02/18 17:40, Rob Herring wrote:

>>>> Add SPDX license tag check based on the rules defined in

>>>

>>> Shouldn't it also check that the license is compatible?

>>>

>>

>> Perhaps we shouldn't try to script legal advice.

> 

> True.

> 

> I believe what was meant was that the

> entry was a valid SPDX License entry

> that already exists as a specific file

> in the LICENSES/ path.


I expect that there is a finite number of compatible licenses.
Maybe I'm too optimistic about what can be taken as legal advice or not,
but I would expect that a warning about unmatched license type does not
constitute legal advice.

Is it too optimistic? :-D

--
igor
Philippe Ombredanne Feb. 8, 2018, 2:35 p.m. | #11
even
On Fri, Feb 2, 2018 at 8:06 PM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:

>> On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:

>> > On 02/02/18 17:40, Rob Herring wrote:

>> > > Add SPDX license tag check based on the rules defined in

>> >

>> > Shouldn't it also check that the license is compatible?

>> >

>>

>> Perhaps we shouldn't try to script legal advice.

>

> True.

>

> I believe what was meant was that the

> entry was a valid SPDX License entry

> that already exists as a specific file

> in the LICENSES/ path.

>

> So that entry must be some combination of:

>

> $ git ls-files LICENSES/ | cut -f3- -d'/' | sort

> BSD-2-Clause

> BSD-3-Clause

> BSD-3-Clause-Clear

> GPL-1.0

> GPL-2.0

> LGPL-2.0

> LGPL-2.1

> Linux-syscall-note

> MIT

> MPL-1.1

>

> From my perspective, it'd be better if the

> various + uses had their own individual

> license files in the LICENSES/ path.

>

> Right now, there are many missing licenses

> that are already used by various existing

> SPDX-License-Identifier: entries.

>

>

> APACHE-2.0

> BSD

> CDDL

> CDDL-1.0

> ISC

> GPL-1.0+

> GPL-2.0+

> LGPL-2.1+

> OpenSSL

>

> There are odd entries like:

>

> GPL-2.0-only

>

> Parentheses around AND/OR aren't consistent.


Joe,
I have a comprehensive license expressions checker/parser [1] in
Python ;) if it is ever needed, but that's likely overkill for the
kernel. (this is not in Perl for one thing and second it is based on a
boolean expression parser and minimizer, hence overkill for the
limited kernel use case IMHO)

However checking that licenses ids are known and listed in the kernel
doc is essential IMHO to avoid drift and insulate the kernel from SPDX
updates. Case in point  the new SPDX "GPL-2.0-only" is NOT what was
documented by tglx and therefore should not be used and banned until
we update the doc accordingly. and until we update ALL the GPL-2.0 to
GPL-2.0-only eventually which is best done at once. Otherwise, this is
going to be a total mess on top of a complicated topic that requires
quite a bit of maintainer energy!


[1] https://github.com/nexB/license-expression/
-- 
Cordially
Philippe Ombredanne
Philippe Ombredanne Feb. 8, 2018, 2:41 p.m. | #12
Kate,

On Fri, Feb 2, 2018 at 9:18 PM, Kate Stewart
<kstewart@linuxfoundation.org> wrote:
> This is the new way to represent GPLv2 only, as described above.

> While the GPL-2.0 and GPL-2.0+ notation is still valid,  it is deprecated

> in the latest version, so transitioning existing over time will probably

> be needed.   So I think the list of licenses to be added to

> LICENSES/ path is:

>

> APACHE-2.0

> BSD

> CDDL

> CDDL-1.0

> ISC

> GPL-1.0-only

> GPL-1.0-or-later (note: actually same contents as one GPL-1.0-only)

> GPL-2.0-only

> GPL-2.0-or-later (same contents as GPL-2.0-only)

> LGPL-2.0-only

> LGPL-2.0-or-later (same contents as LGPL-2.0-only)

> LGPL-2.1-only

> LGPL-2.1-or-later (same contents as LGPL-2.1-only)

> OpenSSL


Yes, this is the new SPDX was (-only) but this is not yet the kernel
way and doc.

IMHO as long as the new SPDX "GPL-2.0-only" are not what is in the
kernel doc they should not be used and banned. Otherwise, we are
looking at creating an infinite source of confusion
therefore we should not add the -only license for now until they
become the kernel ways.
Joe already spotted drifts and I tried to fight these as much as I
could. One id for one license at a time is the only sane way to go.
-- 
Cordially
Philippe Ombredanne
Philippe Ombredanne Feb. 8, 2018, 2:44 p.m. | #13
On Sat, Feb 3, 2018 at 2:41 PM, Igor Stoppa <igor.stoppa@huawei.com> wrote:
>

>

> On 02/02/18 21:06, Joe Perches wrote:

>> On Fri, 2018-02-02 at 12:27 -0600, Rob Herring wrote:

>>> On Fri, Feb 2, 2018 at 9:49 AM, Igor Stoppa <igor.stoppa@huawei.com> wrote:

>>>> On 02/02/18 17:40, Rob Herring wrote:

>>>>> Add SPDX license tag check based on the rules defined in

>>>>

>>>> Shouldn't it also check that the license is compatible?

>>>>

>>>

>>> Perhaps we shouldn't try to script legal advice.

>>

>> True.

>>

>> I believe what was meant was that the

>> entry was a valid SPDX License entry

>> that already exists as a specific file

>> in the LICENSES/ path.

>

> I expect that there is a finite number of compatible licenses.

> Maybe I'm too optimistic about what can be taken as legal advice or not,

> but I would expect that a warning about unmatched license type does not

> constitute legal advice.

>

> Is it too optimistic? :-D


That's very reasonable IMHO and this not legal advice alright to me.
This would be just a tool that warns you that your license expression
does not match known licenses in the kernel.

-- 
Cordially
Philippe Ombredanne
Joe Perches Feb. 8, 2018, 5:24 p.m. | #14
On Thu, 2018-02-08 at 15:35 +0100, Philippe Ombredanne wrote:
> However checking that licenses ids are known and listed in the kernel

> doc is essential IMHO to avoid drift and insulate the kernel from SPDX

> updates. Case in point  the new SPDX "GPL-2.0-only" is NOT what was

> documented by tglx and therefore should not be used and banned until

> we update the doc accordingly. and until we update ALL the GPL-2.0 to

> GPL-2.0-only eventually which is best done at once.


Agree and I've attached what I believe to be a
reasonable script for that conversion only after
LICENSE directories are updated with the
appropriate and license files and after
Documentation/process/license-rules.rst is modified.

> Otherwise, this is

> going to be a total mess on top of a complicated topic that requires

> quite a bit of maintainer energy!


There will always be some energy requirement and
no doubt some legal advice involvement too.

In another vein:

The existing license files in spdx.org seem
somewhat sloppily edited and perhaps have less
clarity and precision than desired.

For instance:

If the newer SPDX descriptor "GPL-2.0-only" is to
be used, why does this license URL:

https://spdx.org/licenses/GPL-2.0-only.html

still contain the phrase ", or (at your option) any later version".

The current diff between GPL-2.0-only and GPL-2.0-or-later:

$ wget -q https://spdx.org/licenses/GPL-2.0-only.html
$ wget -q https://spdx.org/licenses/GPL-2.0-or-later.html
$ diff -U0 GPL-2.0-only.html GPL-2.0-or-later.html 
--- GPL-2.0-only.html	2017-12-28 12:17:20.000000000 -0800
+++ GPL-2.0-or-later.html	2017-12-28 12:17:22.000000000 -0800
@@ -15 +15 @@
-    <title>GNU General Public License v2.0 only | Software Package Data Exchange (SPDX)</title>
+    <title>GNU General Public License v2.0 or later | Software Package Data Exchange (SPDX)</title>
@@ -141 +141 @@
-      <h1 property="dc:title">GNU General Public License v2.0 only</h1>
+      <h1 property="dc:title">GNU General Public License v2.0 or later</h1>
@@ -144 +144 @@
-          <p style="margin-left: 20px;"><code property="spdx:name">GNU General Public License v2.0 only</code></p>
+          <p style="margin-left: 20px;"><code property="spdx:name">GNU General Public License v2.0 or later</code></p>
@@ -147 +147 @@
-          <p style="margin-left: 20px;"><code property="spdx:licenseId">GPL-2.0-only</code></p>
+          <p style="margin-left: 20px;"><code property="spdx:licenseId">GPL-2.0-or-later</code></p>
@@ -160 +160 @@
-          <p style="margin-left: 20px;">This license was released: June 1991 This refers to when this GPL 2.0 only is being used (as opposed to GPLv2 or later).</p>
+          <p style="margin-left: 20px;">This license was released: June 1991</p>
@@ -679 +679,2 @@
-        as published by the Free Software Foundation; version 2.
+	as published by the Free Software Foundation; version 2
+	or any later version.


I am not a lawyer, this is not legal advice, etc... but:

The "1991 This" use in the -only file seems be missing
a period.

In any case it is awkwardly phrased as "or later" perhaps
should not be referenced at all.

The GPL 2.0 license as published by the Free Software
Foundation includes the option for using later versions.

Perhaps the SPDX -only licenses should be more specific
when it uses the phrase "as published by the Free
Software Foundation; version <n>." to specifically
exclude the option of any later version.
Philippe Ombredanne Feb. 8, 2018, 6:09 p.m. | #15
Joe,

On Thu, Feb 8, 2018 at 6:24 PM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2018-02-08 at 15:35 +0100, Philippe Ombredanne wrote:

>> However checking that licenses ids are known and listed in the kernel

>> doc is essential IMHO to avoid drift and insulate the kernel from SPDX

>> updates. Case in point  the new SPDX "GPL-2.0-only" is NOT what was

>> documented by tglx and therefore should not be used and banned until

>> we update the doc accordingly. and until we update ALL the GPL-2.0 to

>> GPL-2.0-only eventually which is best done at once.

>

> Agree and I've attached what I believe to be a

> reasonable script for that conversion only after

> LICENSE directories are updated with the

> appropriate and license files and after

> Documentation/process/license-rules.rst is modified.


Excellent and clean!

>> Otherwise, this is

>> going to be a total mess on top of a complicated topic that requires

>> quite a bit of maintainer energy!

>

> There will always be some energy requirement and

> no doubt some legal advice involvement too.

>

> In another vein:

>

> The existing license files in spdx.org seem

> somewhat sloppily edited and perhaps have less

> clarity and precision than desired.

>

> For instance:

>

> If the newer SPDX descriptor "GPL-2.0-only" is to

> be used, why does this license URL:

>

> https://spdx.org/licenses/GPL-2.0-only.html

>

> still contain the phrase ", or (at your option) any later version".

>

> The current diff between GPL-2.0-only and GPL-2.0-or-later:

>

> $ wget -q https://spdx.org/licenses/GPL-2.0-only.html

> $ wget -q https://spdx.org/licenses/GPL-2.0-or-later.html

> $ diff -U0 GPL-2.0-only.html GPL-2.0-or-later.html

> --- GPL-2.0-only.html   2017-12-28 12:17:20.000000000 -0800

> +++ GPL-2.0-or-later.html       2017-12-28 12:17:22.000000000 -0800

> @@ -15 +15 @@

> -    <title>GNU General Public License v2.0 only | Software Package Data Exchange (SPDX)</title>

> +    <title>GNU General Public License v2.0 or later | Software Package Data Exchange (SPDX)</title>

> @@ -141 +141 @@

> -      <h1 property="dc:title">GNU General Public License v2.0 only</h1>

> +      <h1 property="dc:title">GNU General Public License v2.0 or later</h1>

> @@ -144 +144 @@

> -          <p style="margin-left: 20px;"><code property="spdx:name">GNU General Public License v2.0 only</code></p>

> +          <p style="margin-left: 20px;"><code property="spdx:name">GNU General Public License v2.0 or later</code></p>

> @@ -147 +147 @@

> -          <p style="margin-left: 20px;"><code property="spdx:licenseId">GPL-2.0-only</code></p>

> +          <p style="margin-left: 20px;"><code property="spdx:licenseId">GPL-2.0-or-later</code></p>

> @@ -160 +160 @@

> -          <p style="margin-left: 20px;">This license was released: June 1991 This refers to when this GPL 2.0 only is being used (as opposed to GPLv2 or later).</p>

> +          <p style="margin-left: 20px;">This license was released: June 1991</p>

> @@ -679 +679,2 @@

> -        as published by the Free Software Foundation; version 2.

> +       as published by the Free Software Foundation; version 2

> +       or any later version.

>

>

> I am not a lawyer, this is not legal advice, etc... but:

>

> The "1991 This" use in the -only file seems be missing

> a period.

>

> In any case it is awkwardly phrased as "or later" perhaps

> should not be referenced at all.

>

> The GPL 2.0 license as published by the Free Software

> Foundation includes the option for using later versions.

>

> Perhaps the SPDX -only licenses should be more specific

> when it uses the phrase "as published by the Free

> Software Foundation; version <n>." to specifically

> exclude the option of any later version.


Good points and this is why we have and need to use the kernel doc as
the stable reference IMHO.

FWIW, I have raised a ticket with SPDX [2] so that the issue you have
found can be properly fixed there.  Also, I think this  (the new -only
license ids that I think we should not yet use) has been reviewed in
details by the SPDX legal group and by the FSF. At least rms posted an
article about it last December [2] ?

[1] https://github.com/spdx/license-list-XML/issues/610
[2] https://web.archive.org/web/20171221220428/https://www.gnu.org/licenses/identify-licenses-clearly.html

-- 
Cordially
Philippe Ombredanne
Joe Perches Feb. 9, 2018, 12:35 a.m. | #16
On Fri, 2018-02-02 at 13:18 -0800, Joe Perches wrote:
> On Fri, 2018-02-02 at 09:40 -0600, Rob Herring wrote:

> > Add SPDX license tag check based on the rules defined in

> > Documentation/process/license-rules.rst. To summarize, SPDX license tags

> > should be on the 1st line (or 2nd line in scripts) using the appropriate

> > comment style for the file type.

> > 

> > Cc: Andy Whitcroft <apw@canonical.com>

> > Cc: Joe Perches <joe@perches.com>

> > Cc: Thomas Gleixner <tglx@linutronix.de>

> > Cc: Philippe Ombredanne <pombredanne@nexb.com>

> > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

> > Signed-off-by: Rob Herring <robh@kernel.org>

> 

> Signed-off-by: Joe Perches <joe@perches.com>


Andrew, would you pick this up please?

> > ---

> > v6:

> > - Dropped script extension check and only look for #!/... on 1st line. A 

> >   text executable file was not reliable either.

> > - Support .awk and .tc which may or may not have a #!/.

> > - Fixed a typo in script "#!" regex and also match on first /.

> > - Add Greg's ack.

> > 

> >  scripts/checkpatch.pl | 27 +++++++++++++++++++++++++++

> >  1 file changed, 27 insertions(+)

> > 

> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl

> > index ba03f17ff662..6db245e5f93b 100755

> > --- a/scripts/checkpatch.pl

> > +++ b/scripts/checkpatch.pl

> > @@ -2225,6 +2225,8 @@ sub process {

> >  

> >  	my $camelcase_file_seeded = 0;

> >  

> > +	my $checklicenseline = 1;

> > +

> >  	sanitise_line_reset();

> >  	my $line;

> >  	foreach my $rawline (@rawlines) {

> > @@ -2416,6 +2418,7 @@ sub process {

> >  			} else {

> >  				$check = $check_orig;

> >  			}

> > +			$checklicenseline = 1;

> >  			next;

> >  		}

> >  

> > @@ -2866,6 +2869,30 @@ sub process {

> >  			}

> >  		}

> >  

> > +# check for using SPDX license tag at beginning of files

> > +		if ($realline == $checklicenseline) {

> > +			if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {

> > +				$checklicenseline = 2;

> > +			} elsif ($rawline =~ /^\+/) {

> > +				my $comment = "";

> > +				if ($realfile =~ /\.(h|s|S)$/) {

> > +					$comment = '/*';

> > +				} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {

> > +					$comment = '//';

> > +				} elsif (($checklicenseline == 2) || $realfile =~ /\.(sh|pl|py|awk|tc)$/) {

> > +					$comment = '#';

> > +				} elsif ($realfile =~ /\.rst$/) {

> > +					$comment = '..';

> > +				}

> > +

> > +				if ($comment !~ /^$/ &&

> > +				    $rawline !~ /^\+\Q$comment\E SPDX-License-Identifier: /) {

> > +					WARN("SPDX_LICENSE_TAG",

> > +					     "Missing or malformed SPDX-License-Identifier tag in line $checklicenseline\n" . $herecurr);

> > +				}

> > +			}

> > +		}

> > +

> >  # check we are in a valid source file if not then ignore this hunk

> >  		next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);

> >
Philippe Ombredanne Feb. 9, 2018, 5:58 a.m. | #17
On Fri, Feb 9, 2018 at 1:35 AM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2018-02-02 at 13:18 -0800, Joe Perches wrote:

>> On Fri, 2018-02-02 at 09:40 -0600, Rob Herring wrote:

>> > Add SPDX license tag check based on the rules defined in

>> > Documentation/process/license-rules.rst. To summarize, SPDX license tags

>> > should be on the 1st line (or 2nd line in scripts) using the appropriate

>> > comment style for the file type.

>> >

>> > Cc: Andy Whitcroft <apw@canonical.com>

>> > Cc: Joe Perches <joe@perches.com>

>> > Cc: Thomas Gleixner <tglx@linutronix.de>

>> > Cc: Philippe Ombredanne <pombredanne@nexb.com>

>> > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

>> > Signed-off-by: Rob Herring <robh@kernel.org>

>>

>> Signed-off-by: Joe Perches <joe@perches.com>

>

> Andrew, would you pick this up please?

>

>> > ---

>> > v6:

>> > - Dropped script extension check and only look for #!/... on 1st line. A

>> >   text executable file was not reliable either.

>> > - Support .awk and .tc which may or may not have a #!/.

>> > - Fixed a typo in script "#!" regex and also match on first /.

>> > - Add Greg's ack.

>> >

>> >  scripts/checkpatch.pl | 27 +++++++++++++++++++++++++++

>> >  1 file changed, 27 insertions(+)

>> >

>> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl

>> > index ba03f17ff662..6db245e5f93b 100755

>> > --- a/scripts/checkpatch.pl

>> > +++ b/scripts/checkpatch.pl

>> > @@ -2225,6 +2225,8 @@ sub process {

>> >

>> >     my $camelcase_file_seeded = 0;

>> >

>> > +   my $checklicenseline = 1;

>> > +

>> >     sanitise_line_reset();

>> >     my $line;

>> >     foreach my $rawline (@rawlines) {

>> > @@ -2416,6 +2418,7 @@ sub process {

>> >                     } else {

>> >                             $check = $check_orig;

>> >                     }

>> > +                   $checklicenseline = 1;

>> >                     next;

>> >             }

>> >

>> > @@ -2866,6 +2869,30 @@ sub process {

>> >                     }

>> >             }

>> >

>> > +# check for using SPDX license tag at beginning of files

>> > +           if ($realline == $checklicenseline) {

>> > +                   if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {

>> > +                           $checklicenseline = 2;

>> > +                   } elsif ($rawline =~ /^\+/) {

>> > +                           my $comment = "";

>> > +                           if ($realfile =~ /\.(h|s|S)$/) {

>> > +                                   $comment = '/*';

>> > +                           } elsif ($realfile =~ /\.(c|dts|dtsi)$/) {

>> > +                                   $comment = '//';

>> > +                           } elsif (($checklicenseline == 2) || $realfile =~ /\.(sh|pl|py|awk|tc)$/) {

>> > +                                   $comment = '#';

>> > +                           } elsif ($realfile =~ /\.rst$/) {

>> > +                                   $comment = '..';

>> > +                           }

>> > +

>> > +                           if ($comment !~ /^$/ &&

>> > +                               $rawline !~ /^\+\Q$comment\E SPDX-License-Identifier: /) {

>> > +                                   WARN("SPDX_LICENSE_TAG",

>> > +                                        "Missing or malformed SPDX-License-Identifier tag in line $checklicenseline\n" . $herecurr);

>> > +                           }

>> > +                   }

>> > +           }

>> > +

>> >  # check we are in a valid source file if not then ignore this hunk

>> >             next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);

>> >


BTW I forgot this if you like to add it:

Acked-by: Philippe Ombredanne <pombredanne@nexb.com>

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ba03f17ff662..6db245e5f93b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2225,6 +2225,8 @@  sub process {
 
 	my $camelcase_file_seeded = 0;
 
+	my $checklicenseline = 1;
+
 	sanitise_line_reset();
 	my $line;
 	foreach my $rawline (@rawlines) {
@@ -2416,6 +2418,7 @@  sub process {
 			} else {
 				$check = $check_orig;
 			}
+			$checklicenseline = 1;
 			next;
 		}
 
@@ -2866,6 +2869,30 @@  sub process {
 			}
 		}
 
+# check for using SPDX license tag at beginning of files
+		if ($realline == $checklicenseline) {
+			if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {
+				$checklicenseline = 2;
+			} elsif ($rawline =~ /^\+/) {
+				my $comment = "";
+				if ($realfile =~ /\.(h|s|S)$/) {
+					$comment = '/*';
+				} elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
+					$comment = '//';
+				} elsif (($checklicenseline == 2) || $realfile =~ /\.(sh|pl|py|awk|tc)$/) {
+					$comment = '#';
+				} elsif ($realfile =~ /\.rst$/) {
+					$comment = '..';
+				}
+
+				if ($comment !~ /^$/ &&
+				    $rawline !~ /^\+\Q$comment\E SPDX-License-Identifier: /) {
+					WARN("SPDX_LICENSE_TAG",
+					     "Missing or malformed SPDX-License-Identifier tag in line $checklicenseline\n" . $herecurr);
+				}
+			}
+		}
+
 # check we are in a valid source file if not then ignore this hunk
 		next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);