[0/7] atomics: generate atomic headers

Message ID 20180529180746.29684-1-mark.rutland@arm.com
Headers show
Series
  • atomics: generate atomic headers
Related show

Message

Mark Rutland May 29, 2018, 6:07 p.m.
This series scripts the generation of the various atomic headers, to
ensure that the various atomic APIs remain consistent, reducing the risk
of human error, and simplifying future rework.

The series is based on my atomic API cleanup patches [1,2], and can be
found on my atomics/generated branch [3] on kernel.org.

The first three patches are preparatory rework, with patch four
introducing the infrastructure. The final three patches migrate to
generated headers. The scripts themselves are mostly POSIX sh (modulo
local), without bashisms, and work in dash and bash.

Per Linus request that it is possible to use git grep to inspect the
atomic headers [3], the headers are committed (and not generated by
kbuild). Since we now expand the fallback definitions inline, each
*should* be easier to find with grep. Each header also has a comment at
the top with a path to the script used to generate it. 

While the diff stat looks like a huge addition, the scripting comes in
at ~800 lines in total, including the fallback definitions, so we're
removing ~1000 lines of hand-written code. At the same time, we fill in
gaps in the atomic_long API, and the instrumented atomic wrappers.

Longer-term, I think things could be simplified if we were to rework the
headers such that we have:

* arch/*/include/asm/atomic.h providing arch_atomic_*().

* include/linux/atomic-raw.h building raw_atomic_*() atop of the
  arch_atomic_*() definitions, filling in gaps in the API. Having
  separate arch_ and raw_ namespaces would simplify the ifdeffery.

* include/linux/atomic.h building atomic_*() atop of the raw_atomic_*()
  definitions, complete with any instrumentation. Instrumenting at this
  level would lower the instrumentation overhead, and would not require
  any ifdeffery as the whole raw_atomic_*() API would be available.

... I've avoided this for the time being due to the necessary churn in
arch code.

Thanks,
Mark.

[1] https://lkml.kernel.org/r/20180529154346.3168-1-mark.rutland@arm.com
[2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/api-cleanup
[3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/generated
[4] https://lkml.kernel.org/r/CA+55aFxjU0op8QLLu0n-RjHBs7gQsLvD8jcyedgw6jeZFN7B+Q@mail.gmail.com

Mark Rutland (7):
  atomics/tty: add missing atomic_long_t * cast
  atomics/treewide: rework ordering barriers
  atomics: separate basic {cmp,}xchg()
  atomics: add common header generation files
  atomics: switch to generated fallbacks
  atomics: switch to generated atomic-long
  atomics: switch to generated instrumentation

 MAINTAINERS                                      |    1 +
 arch/alpha/include/asm/atomic.h                  |    8 +-
 arch/powerpc/include/asm/atomic.h                |   17 +-
 arch/riscv/include/asm/atomic.h                  |   17 +-
 drivers/tty/tty_ldsem.c                          |    2 +-
 include/asm-generic/atomic-instrumented-atomic.h | 1629 ++++++++++++++++
 include/asm-generic/atomic-instrumented.h        |  394 +---
 include/asm-generic/atomic-long.h                | 1150 +++++++++--
 include/linux/atomic-fallback.h                  | 2223 ++++++++++++++++++++++
 include/linux/atomic.h                           | 1244 +-----------
 scripts/atomic/atomic-tbl.sh                     |  186 ++
 scripts/atomic/atomics.tbl                       |   41 +
 scripts/atomic/fallbacks/acquire                 |    9 +
 scripts/atomic/fallbacks/add_negative            |   16 +
 scripts/atomic/fallbacks/add_unless              |   16 +
 scripts/atomic/fallbacks/andnot                  |    7 +
 scripts/atomic/fallbacks/dec                     |    7 +
 scripts/atomic/fallbacks/dec_and_test            |   15 +
 scripts/atomic/fallbacks/dec_if_positive         |   15 +
 scripts/atomic/fallbacks/dec_unless_positive     |   14 +
 scripts/atomic/fallbacks/fence                   |   11 +
 scripts/atomic/fallbacks/fetch_add_unless        |   23 +
 scripts/atomic/fallbacks/inc                     |    7 +
 scripts/atomic/fallbacks/inc_and_test            |   15 +
 scripts/atomic/fallbacks/inc_not_zero            |   14 +
 scripts/atomic/fallbacks/inc_unless_negative     |   14 +
 scripts/atomic/fallbacks/read_acquire            |    7 +
 scripts/atomic/fallbacks/release                 |    8 +
 scripts/atomic/fallbacks/set_release             |    7 +
 scripts/atomic/fallbacks/sub_and_test            |   16 +
 scripts/atomic/fallbacks/try_cmpxchg             |   11 +
 scripts/atomic/gen-atomic-fallback.sh            |  144 ++
 scripts/atomic/gen-atomic-instrumented.sh        |  122 ++
 scripts/atomic/gen-atomic-long.sh                |   98 +
 34 files changed, 5671 insertions(+), 1837 deletions(-)
 create mode 100644 include/asm-generic/atomic-instrumented-atomic.h
 create mode 100644 include/linux/atomic-fallback.h
 create mode 100755 scripts/atomic/atomic-tbl.sh
 create mode 100644 scripts/atomic/atomics.tbl
 create mode 100644 scripts/atomic/fallbacks/acquire
 create mode 100644 scripts/atomic/fallbacks/add_negative
 create mode 100644 scripts/atomic/fallbacks/add_unless
 create mode 100644 scripts/atomic/fallbacks/andnot
 create mode 100644 scripts/atomic/fallbacks/dec
 create mode 100644 scripts/atomic/fallbacks/dec_and_test
 create mode 100644 scripts/atomic/fallbacks/dec_if_positive
 create mode 100644 scripts/atomic/fallbacks/dec_unless_positive
 create mode 100644 scripts/atomic/fallbacks/fence
 create mode 100644 scripts/atomic/fallbacks/fetch_add_unless
 create mode 100644 scripts/atomic/fallbacks/inc
 create mode 100644 scripts/atomic/fallbacks/inc_and_test
 create mode 100644 scripts/atomic/fallbacks/inc_not_zero
 create mode 100644 scripts/atomic/fallbacks/inc_unless_negative
 create mode 100644 scripts/atomic/fallbacks/read_acquire
 create mode 100644 scripts/atomic/fallbacks/release
 create mode 100644 scripts/atomic/fallbacks/set_release
 create mode 100644 scripts/atomic/fallbacks/sub_and_test
 create mode 100644 scripts/atomic/fallbacks/try_cmpxchg
 create mode 100755 scripts/atomic/gen-atomic-fallback.sh
 create mode 100755 scripts/atomic/gen-atomic-instrumented.sh
 create mode 100755 scripts/atomic/gen-atomic-long.sh

-- 
2.11.0

Comments

Dmitry Vyukov June 4, 2018, 4:21 p.m. | #1
On Tue, May 29, 2018 at 8:07 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> This series scripts the generation of the various atomic headers, to

> ensure that the various atomic APIs remain consistent, reducing the risk

> of human error, and simplifying future rework.

>

> The series is based on my atomic API cleanup patches [1,2], and can be

> found on my atomics/generated branch [3] on kernel.org.

>

> The first three patches are preparatory rework, with patch four

> introducing the infrastructure. The final three patches migrate to

> generated headers. The scripts themselves are mostly POSIX sh (modulo

> local), without bashisms, and work in dash and bash.

>

> Per Linus request that it is possible to use git grep to inspect the

> atomic headers [3], the headers are committed (and not generated by

> kbuild). Since we now expand the fallback definitions inline, each

> *should* be easier to find with grep. Each header also has a comment at

> the top with a path to the script used to generate it.

>

> While the diff stat looks like a huge addition, the scripting comes in

> at ~800 lines in total, including the fallback definitions, so we're

> removing ~1000 lines of hand-written code. At the same time, we fill in

> gaps in the atomic_long API, and the instrumented atomic wrappers.

>

> Longer-term, I think things could be simplified if we were to rework the

> headers such that we have:

>

> * arch/*/include/asm/atomic.h providing arch_atomic_*().

>

> * include/linux/atomic-raw.h building raw_atomic_*() atop of the

>   arch_atomic_*() definitions, filling in gaps in the API. Having

>   separate arch_ and raw_ namespaces would simplify the ifdeffery.

>

> * include/linux/atomic.h building atomic_*() atop of the raw_atomic_*()

>   definitions, complete with any instrumentation. Instrumenting at this

>   level would lower the instrumentation overhead, and would not require

>   any ifdeffery as the whole raw_atomic_*() API would be available.

>

> ... I've avoided this for the time being due to the necessary churn in

> arch code.

>

> Thanks,

> Mark.


Wow! Nice work!

I can say that besides KASAN, we will need hooking into atomics for
KMSAN and KTSAN in future. Without this it would be lots of manual
work with possibility of copy-paste errors.




> [1] https://lkml.kernel.org/r/20180529154346.3168-1-mark.rutland@arm.com

> [2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/api-cleanup

> [3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/generated

> [4] https://lkml.kernel.org/r/CA+55aFxjU0op8QLLu0n-RjHBs7gQsLvD8jcyedgw6jeZFN7B+Q@mail.gmail.com

>

> Mark Rutland (7):

>   atomics/tty: add missing atomic_long_t * cast

>   atomics/treewide: rework ordering barriers

>   atomics: separate basic {cmp,}xchg()

>   atomics: add common header generation files

>   atomics: switch to generated fallbacks

>   atomics: switch to generated atomic-long

>   atomics: switch to generated instrumentation

>

>  MAINTAINERS                                      |    1 +

>  arch/alpha/include/asm/atomic.h                  |    8 +-

>  arch/powerpc/include/asm/atomic.h                |   17 +-

>  arch/riscv/include/asm/atomic.h                  |   17 +-

>  drivers/tty/tty_ldsem.c                          |    2 +-

>  include/asm-generic/atomic-instrumented-atomic.h | 1629 ++++++++++++++++

>  include/asm-generic/atomic-instrumented.h        |  394 +---

>  include/asm-generic/atomic-long.h                | 1150 +++++++++--

>  include/linux/atomic-fallback.h                  | 2223 ++++++++++++++++++++++

>  include/linux/atomic.h                           | 1244 +-----------

>  scripts/atomic/atomic-tbl.sh                     |  186 ++

>  scripts/atomic/atomics.tbl                       |   41 +

>  scripts/atomic/fallbacks/acquire                 |    9 +

>  scripts/atomic/fallbacks/add_negative            |   16 +

>  scripts/atomic/fallbacks/add_unless              |   16 +

>  scripts/atomic/fallbacks/andnot                  |    7 +

>  scripts/atomic/fallbacks/dec                     |    7 +

>  scripts/atomic/fallbacks/dec_and_test            |   15 +

>  scripts/atomic/fallbacks/dec_if_positive         |   15 +

>  scripts/atomic/fallbacks/dec_unless_positive     |   14 +

>  scripts/atomic/fallbacks/fence                   |   11 +

>  scripts/atomic/fallbacks/fetch_add_unless        |   23 +

>  scripts/atomic/fallbacks/inc                     |    7 +

>  scripts/atomic/fallbacks/inc_and_test            |   15 +

>  scripts/atomic/fallbacks/inc_not_zero            |   14 +

>  scripts/atomic/fallbacks/inc_unless_negative     |   14 +

>  scripts/atomic/fallbacks/read_acquire            |    7 +

>  scripts/atomic/fallbacks/release                 |    8 +

>  scripts/atomic/fallbacks/set_release             |    7 +

>  scripts/atomic/fallbacks/sub_and_test            |   16 +

>  scripts/atomic/fallbacks/try_cmpxchg             |   11 +

>  scripts/atomic/gen-atomic-fallback.sh            |  144 ++

>  scripts/atomic/gen-atomic-instrumented.sh        |  122 ++

>  scripts/atomic/gen-atomic-long.sh                |   98 +

>  34 files changed, 5671 insertions(+), 1837 deletions(-)

>  create mode 100644 include/asm-generic/atomic-instrumented-atomic.h

>  create mode 100644 include/linux/atomic-fallback.h

>  create mode 100755 scripts/atomic/atomic-tbl.sh

>  create mode 100644 scripts/atomic/atomics.tbl

>  create mode 100644 scripts/atomic/fallbacks/acquire

>  create mode 100644 scripts/atomic/fallbacks/add_negative

>  create mode 100644 scripts/atomic/fallbacks/add_unless

>  create mode 100644 scripts/atomic/fallbacks/andnot

>  create mode 100644 scripts/atomic/fallbacks/dec

>  create mode 100644 scripts/atomic/fallbacks/dec_and_test

>  create mode 100644 scripts/atomic/fallbacks/dec_if_positive

>  create mode 100644 scripts/atomic/fallbacks/dec_unless_positive

>  create mode 100644 scripts/atomic/fallbacks/fence

>  create mode 100644 scripts/atomic/fallbacks/fetch_add_unless

>  create mode 100644 scripts/atomic/fallbacks/inc

>  create mode 100644 scripts/atomic/fallbacks/inc_and_test

>  create mode 100644 scripts/atomic/fallbacks/inc_not_zero

>  create mode 100644 scripts/atomic/fallbacks/inc_unless_negative

>  create mode 100644 scripts/atomic/fallbacks/read_acquire

>  create mode 100644 scripts/atomic/fallbacks/release

>  create mode 100644 scripts/atomic/fallbacks/set_release

>  create mode 100644 scripts/atomic/fallbacks/sub_and_test

>  create mode 100644 scripts/atomic/fallbacks/try_cmpxchg

>  create mode 100755 scripts/atomic/gen-atomic-fallback.sh

>  create mode 100755 scripts/atomic/gen-atomic-instrumented.sh

>  create mode 100755 scripts/atomic/gen-atomic-long.sh

>

> --

> 2.11.0

>
Mark Rutland June 5, 2018, 5:56 a.m. | #2
On Mon, Jun 04, 2018 at 06:21:22PM +0200, Dmitry Vyukov wrote:
> On Tue, May 29, 2018 at 8:07 PM, Mark Rutland <mark.rutland@arm.com> wrote:

> > This series scripts the generation of the various atomic headers, to

> > ensure that the various atomic APIs remain consistent, reducing the risk

> > of human error, and simplifying future rework.

> >

> > The series is based on my atomic API cleanup patches [1,2], and can be

> > found on my atomics/generated branch [3] on kernel.org.

> >

> > The first three patches are preparatory rework, with patch four

> > introducing the infrastructure. The final three patches migrate to

> > generated headers. The scripts themselves are mostly POSIX sh (modulo

> > local), without bashisms, and work in dash and bash.

> >

> > Per Linus request that it is possible to use git grep to inspect the

> > atomic headers [3], the headers are committed (and not generated by

> > kbuild). Since we now expand the fallback definitions inline, each

> > *should* be easier to find with grep. Each header also has a comment at

> > the top with a path to the script used to generate it.

> >

> > While the diff stat looks like a huge addition, the scripting comes in

> > at ~800 lines in total, including the fallback definitions, so we're

> > removing ~1000 lines of hand-written code. At the same time, we fill in

> > gaps in the atomic_long API, and the instrumented atomic wrappers.

> >

> > Longer-term, I think things could be simplified if we were to rework the

> > headers such that we have:

> >

> > * arch/*/include/asm/atomic.h providing arch_atomic_*().

> >

> > * include/linux/atomic-raw.h building raw_atomic_*() atop of the

> >   arch_atomic_*() definitions, filling in gaps in the API. Having

> >   separate arch_ and raw_ namespaces would simplify the ifdeffery.

> >

> > * include/linux/atomic.h building atomic_*() atop of the raw_atomic_*()

> >   definitions, complete with any instrumentation. Instrumenting at this

> >   level would lower the instrumentation overhead, and would not require

> >   any ifdeffery as the whole raw_atomic_*() API would be available.

> >

> > ... I've avoided this for the time being due to the necessary churn in

> > arch code.

> >

> > Thanks,

> > Mark.

> 

> Wow! Nice work!


Thanks!

> I can say that besides KASAN, we will need hooking into atomics for

> KMSAN and KTSAN in future. Without this it would be lots of manual

> work with possibility of copy-paste errors.


Indeed!

This was largely driven by wanting the arm64 atomics instrumented, and
generating that does end up easier to get right than manually expanding all the
acquire/release/relaxed ifdeffery manually.

First we need to get the preparatory patches [1,2] merged. Those have had a few
fixups since they were last posted, so I'll send an updated version come
v4.18-rc1 unless the atomics maintainers really want to queue those ASAP.

There'll be some additional prep work for instrumentation on arm64, too (e.g.
fiddling with the (cmp)xchg instrumentation), so I'll look into that in the
mean time.

Thanks,
Mark.

> > [1] https://lkml.kernel.org/r/20180529154346.3168-1-mark.rutland@arm.com

> > [2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/api-cleanup

> > [3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git atomics/generated

> > [4] https://lkml.kernel.org/r/CA+55aFxjU0op8QLLu0n-RjHBs7gQsLvD8jcyedgw6jeZFN7B+Q@mail.gmail.com
Peter Zijlstra June 5, 2018, 1:29 p.m. | #3
On Tue, May 29, 2018 at 07:07:39PM +0100, Mark Rutland wrote:
> Longer-term, I think things could be simplified if we were to rework the

> headers such that we have:

> 

> * arch/*/include/asm/atomic.h providing arch_atomic_*().

> 

> * include/linux/atomic-raw.h building raw_atomic_*() atop of the

>   arch_atomic_*() definitions, filling in gaps in the API. Having

>   separate arch_ and raw_ namespaces would simplify the ifdeffery.

> 

> * include/linux/atomic.h building atomic_*() atop of the raw_atomic_*()

>   definitions, complete with any instrumentation. Instrumenting at this

>   level would lower the instrumentation overhead, and would not require

>   any ifdeffery as the whole raw_atomic_*() API would be available.

> 

> ... I've avoided this for the time being due to the necessary churn in

> arch code.


I'm not entirely sure I get the point of raw_atomic, we only need to
instrument the arch_atomic bits, right? When those are done, everything
that's build on top will also automagically be instrumented.
Peter Zijlstra June 5, 2018, 1:31 p.m. | #4
On Mon, Jun 04, 2018 at 06:21:22PM +0200, Dmitry Vyukov wrote:
> On Tue, May 29, 2018 at 8:07 PM, Mark Rutland <mark.rutland@arm.com> wrote:

> > While the diff stat looks like a huge addition, the scripting comes in

> > at ~800 lines in total, including the fallback definitions, so we're

> > removing ~1000 lines of hand-written code. At the same time, we fill in

> > gaps in the atomic_long API, and the instrumented atomic wrappers.


> Wow! Nice work!


Indeed, looking good Mark, thanks!
Mark Rutland June 5, 2018, 1:58 p.m. | #5
On Tue, Jun 05, 2018 at 03:29:49PM +0200, Peter Zijlstra wrote:
> On Tue, May 29, 2018 at 07:07:39PM +0100, Mark Rutland wrote:

> > Longer-term, I think things could be simplified if we were to rework the

> > headers such that we have:

> > 

> > * arch/*/include/asm/atomic.h providing arch_atomic_*().

> > 

> > * include/linux/atomic-raw.h building raw_atomic_*() atop of the

> >   arch_atomic_*() definitions, filling in gaps in the API. Having

> >   separate arch_ and raw_ namespaces would simplify the ifdeffery.

> > 

> > * include/linux/atomic.h building atomic_*() atop of the raw_atomic_*()

> >   definitions, complete with any instrumentation. Instrumenting at this

> >   level would lower the instrumentation overhead, and would not require

> >   any ifdeffery as the whole raw_atomic_*() API would be available.

> > 

> > ... I've avoided this for the time being due to the necessary churn in

> > arch code.

> 

> I'm not entirely sure I get the point of raw_atomic, we only need to

> instrument the arch_atomic bits, right?


Well, we only *need* to instrument the top-level atomic. KASAN (and
KTSAN/KMSAN) only care that we're touching a memory location, and not
how many times we happen to touch it.

e.g. when we have fallbacks we might have:

static inline int atomic_fetch_add_unless(atomic_t *v, int a, int u)
{
	int c = atomic_read(v);

	do {
		if (unlikely(c == u))
			break;
	} while (!atomic_try_cmpxchg(v, &c, c + a));

	return c;
}

... where:

* atomic_read() is a simple wrapper around arch_atomic_read(), with
  instrumentation.

* atomic_try_cmpxchg() might be a simple wrapper around
  arch_atomic_try_cmpxchg, or a wrapper around atomic_cmpxchg(), which
  calls arch_atomic_cmpxchg(). Either way, one of the two is
  instrumented.

... so each call to atomic_fetch_add_unless() calls the instrumentation
at least once for the read, and at least once per retry. Whereas if
implemented in arch code, it only calls the instrumentation once.

> When those are done, everything that's build on top will also

> automagically be instrumented.


Sure, it all works, it's just less than optimal as above, and also means
that we have to duplicate the ifdeffery for optional atomics -- once in
the instrumented atomics, then in the "real" atomics.

Whereas if we filled in the raw atomics atop of the arch atomics,
everything above that can assume the whole API is present, no ifdeffery
required.

Thanks,
Mark.
Peter Zijlstra June 5, 2018, 2:14 p.m. | #6
On Tue, Jun 05, 2018 at 02:58:23PM +0100, Mark Rutland wrote:

> Sure, it all works, it's just less than optimal as above, and also means

> that we have to duplicate the ifdeffery for optional atomics -- once in

> the instrumented atomics, then in the "real" atomics.

> 

> Whereas if we filled in the raw atomics atop of the arch atomics,

> everything above that can assume the whole API is present, no ifdeffery

> required.


Aah, I see your point now. I don't think performance is a particular
concern when you enable K*SAN, but getting rid of a fair bunch of
ifdeffery is always nice.
Mark Rutland June 5, 2018, 2:23 p.m. | #7
On Tue, Jun 05, 2018 at 04:14:16PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 05, 2018 at 02:58:23PM +0100, Mark Rutland wrote:

> 

> > Sure, it all works, it's just less than optimal as above, and also means

> > that we have to duplicate the ifdeffery for optional atomics -- once in

> > the instrumented atomics, then in the "real" atomics.

> > 

> > Whereas if we filled in the raw atomics atop of the arch atomics,

> > everything above that can assume the whole API is present, no ifdeffery

> > required.

> 

> Aah, I see your point now. I don't think performance is a particular

> concern when you enable K*SAN, but getting rid of a fair bunch of

> ifdeffery is always nice.


I agree that performance isn't a concern there when debugging, but I
would like to keep the overhead down when fuzzing.

Regardless, we'd have to move arches over to arch_atomic_* first, and
once that's done the raw_atomic_* stuff is fairly easy to implement by
reworking the scripts.

Thanks,
Mark.