[2/2] arm64: bpf: add BPF XADD instruction

Message ID 1447195301-16757-3-git-send-email-yang.shi@linaro.org
State New

Commit Message

Yang Shi Nov. 10, 2015, 10:41 p.m. UTC
aarch64 doesn't have native support for the XADD instruction, so implement it
with the following instruction sequence:

Load (dst + off) to a register
Add src to it
Store it back to (dst + off)

Signed-off-by: Yang Shi <yang.shi@linaro.org>

CC: Zi Shen Lim <zlim.lnx@gmail.com>
CC: Xi Wang <xi.wang@gmail.com>
---
 arch/arm64/net/bpf_jit_comp.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

-- 
2.0.2

Comments

Yang Shi Nov. 11, 2015, 12:26 a.m. UTC | #1
On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
>> aarch64 doesn't have native support for XADD instruction, implement it by
>> the below instruction sequence:
>>
>> Load (dst + off) to a register
>> Add src to it
>> Store it back to (dst + off)
>
> Not really what is needed ?
>
> See this BPF_XADD as an atomic_add() equivalent.

I see. Thanks. The documentation doesn't say much about "exclusive" add;
if so, it would need load-acquire/store-release.

I will rework it.

Yang

Zi Shen Lim Nov. 11, 2015, 2:52 a.m. UTC | #2
Yang,

On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
>> On 11/10/2015 4:08 PM, Eric Dumazet wrote:
>> >On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
>> >>aarch64 doesn't have native support for XADD instruction, implement it by
>> >>the below instruction sequence:

aarch64 supports atomic add in ARMv8.1.
For ARMv8(.0), please consider using LDXR/STXR sequence.
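
(As an aside for readers: a minimal sketch of the LDXR/STXR shape Zi Shen
suggests, written in the style of the arm64 JIT. The A64_LDXR, A64_STXR and
A64_CBNZ encoding helpers are assumed for illustration and are not part of
the patch under review:)

	/* Hypothetical sketch: tmp2 = dst + off, then retry an exclusive
	 * load/add/store until the store-exclusive succeeds. isdw selects
	 * 32- vs 64-bit; tmp3 receives the STXR status flag (0 on success).
	 */
	emit_a64_mov_i(1, tmp2, off, ctx);
	emit(A64_ADD(1, tmp2, tmp2, dst), ctx);
	emit(A64_LDXR(isdw, tmp, tmp2), ctx);
	emit(A64_ADD(isdw, tmp, tmp, src), ctx);
	emit(A64_STXR(isdw, tmp, tmp2, tmp3), ctx);
	emit(A64_CBNZ(0, tmp3, -3), ctx); /* branch back to the LDXR */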

>> >>
>> >>Load (dst + off) to a register
>> >>Add src to it
>> >>Store it back to (dst + off)
>> >
>> >Not really what is needed ?
>> >
>> >See this BPF_XADD as an atomic_add() equivalent.
>>
>> I see. Thanks. The documentation doesn't say too much about "exclusive" add.
>> If so it should need load-acquire/store-release.
>
> I think doc is clear enough, but it can always be improved. Pls suggest a patch.
> It's quite hard to write a test for atomicity in test_bpf framework, so
> code review is the key. Eric, thanks for catching it!

Will Deacon Nov. 11, 2015, 10:24 a.m. UTC | #3
On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
> On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
> > On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > > On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
> > >> On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> > >> >On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
> > >> >>aarch64 doesn't have native support for XADD instruction, implement it by
> > >> >>the below instruction sequence:
> >
> > aarch64 supports atomic add in ARMv8.1.
> > For ARMv8(.0), please consider using LDXR/STXR sequence.
>
> Is it worth optimizing for the 8.1 case? It would add a bit of complexity
> to make the code depend on the CPU feature, but it's certainly doable.

What's the atomicity required for? Put another way, what are we racing
with (I thought bpf was single-threaded)? Do we need to worry about
memory barriers?

Apologies if these are stupid questions, but all I could find was
samples/bpf/sock_example.c and it didn't help much :(

Will
Will Deacon Nov. 11, 2015, 11:58 a.m. UTC | #4
Hi Daniel,

On Wed, Nov 11, 2015 at 11:42:11AM +0100, Daniel Borkmann wrote:
> On 11/11/2015 11:24 AM, Will Deacon wrote:
> >On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
> >>On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
> >>>On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
> >>><alexei.starovoitov@gmail.com> wrote:
> >>>>On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
> >>>>>On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> >>>>>>On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
> >>>>>>>aarch64 doesn't have native support for XADD instruction, implement it by
> >>>>>>>the below instruction sequence:
> >>>
> >>>aarch64 supports atomic add in ARMv8.1.
> >>>For ARMv8(.0), please consider using LDXR/STXR sequence.
> >>
> >>Is it worth optimizing for the 8.1 case? It would add a bit of complexity
> >>to make the code depend on the CPU feature, but it's certainly doable.
> >
> >What's the atomicity required for? Put another way, what are we racing
> >with (I thought bpf was single-threaded)? Do we need to worry about
> >memory barriers?
> >
> >Apologies if these are stupid questions, but all I could find was
> >samples/bpf/sock_example.c and it didn't help much :(
>
> The equivalent code more readable in restricted C syntax (that can be
> compiled by llvm) can be found in samples/bpf/sockex1_kern.c. So the
> built-in __sync_fetch_and_add() will be translated into a BPF_XADD
> insn variant.
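
(As a condensed illustration of the pattern Daniel points to, loosely modeled
on samples/bpf/sockex1_kern.c; the map name and fields are illustrative, not
a verbatim excerpt:)

	int index = 0;	/* map key; derived from the packet in the real sample */
	long *value;

	/* skb is the program's struct __sk_buff context argument */
	value = bpf_map_lookup_elem(&my_map, &index);
	if (value)
		__sync_fetch_and_add(value, skb->len);

LLVM's eBPF backend compiles the __sync_fetch_and_add() call above down to a
single BPF_XADD instruction.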


Yikes, so the memory-model for BPF is based around the deprecated GCC
__sync builtins, that inherit their semantics from ia64? Any reason not
to use the C11-compatible __atomic builtins[1] as a base?
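
(A plain host-side C sketch of the distinction being drawn here: the __sync
form is implicitly sequentially consistent, whereas the C11-style builtins
take an explicit memory-order argument:)

	long counter = 0;

	/* sequentially consistent; implies full barriers on weakly-ordered CPUs */
	long old1 = __sync_fetch_and_add(&counter, 1);

	/* atomicity only; no ordering guarantees beyond the RMW itself */
	long old2 = __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);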

> What you can race against is that an eBPF map can be _shared_ by
> multiple eBPF programs that are attached somewhere in the system, and
> they could all update a particular entry/counter from the map at the
> same time.

Ok, so it does sound like eBPF needs to define/choose a memory-model and
I worry that riding on the back of __sync isn't necessarily the right
thing to do, particularly as it's fallen out of favour with the compiler
folks. On weakly-ordered architectures, it's also going to result in
heavy-weight barriers for all atomic operations.

Will

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
Will Deacon Nov. 11, 2015, 12:38 p.m. UTC | #5
On Wed, Nov 11, 2015 at 01:21:04PM +0100, Daniel Borkmann wrote:
> On 11/11/2015 12:58 PM, Will Deacon wrote:
> >On Wed, Nov 11, 2015 at 11:42:11AM +0100, Daniel Borkmann wrote:
> >>On 11/11/2015 11:24 AM, Will Deacon wrote:
> >>>On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
> >>>>On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
> >>>>>On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
> >>>>><alexei.starovoitov@gmail.com> wrote:
> >>>>>>On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
> >>>>>>>On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> >>>>>>>>On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
> >>>>>>>>>aarch64 doesn't have native support for XADD instruction, implement it by
> >>>>>>>>>the below instruction sequence:
> >>>>>
> >>>>>aarch64 supports atomic add in ARMv8.1.
> >>>>>For ARMv8(.0), please consider using LDXR/STXR sequence.
> >>>>
> >>>>Is it worth optimizing for the 8.1 case? It would add a bit of complexity
> >>>>to make the code depend on the CPU feature, but it's certainly doable.
> >>>
> >>>What's the atomicity required for? Put another way, what are we racing
> >>>with (I thought bpf was single-threaded)? Do we need to worry about
> >>>memory barriers?
> >>>
> >>>Apologies if these are stupid questions, but all I could find was
> >>>samples/bpf/sock_example.c and it didn't help much :(
> >>
> >>The equivalent code more readable in restricted C syntax (that can be
> >>compiled by llvm) can be found in samples/bpf/sockex1_kern.c. So the
> >>built-in __sync_fetch_and_add() will be translated into a BPF_XADD
> >>insn variant.
> >
> >Yikes, so the memory-model for BPF is based around the deprecated GCC
> >__sync builtins, that inherit their semantics from ia64? Any reason not
> >to use the C11-compatible __atomic builtins[1] as a base?
>
> Hmm, gcc doesn't have an eBPF compiler backend, so this won't work on
> gcc at all. The eBPF backend in LLVM recognizes the __sync_fetch_and_add()
> keyword and maps that to a BPF_XADD version (BPF_W or BPF_DW). In the
> interpreter (__bpf_prog_run()), as Eric mentioned, this maps to atomic_add()
> and atomic64_add(), respectively. So the struct bpf_insn prog[] you saw
> from sock_example.c can be regarded as one possible equivalent program
> section output from the compiler.

Ok, so if I understand you correctly, then __sync_fetch_and_add() has
different semantics depending on the backend target. That seems counter
to the LLVM atomics Documentation:

  http://llvm.org/docs/Atomics.html

which specifically calls out the __sync_* primitives as being
sequentially-consistent and requiring barriers on ARM (which isn't the
case for atomic[64]_add in the kernel).

If we re-use the __sync_* naming scheme in the source language, I don't
think we can overlay our own semantics in the backend. The
__sync_fetch_and_add primitive is also expected to return the old value,
which doesn't appear to be the case for BPF_XADD.
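
(The fetch semantics in question, as a minimal example; the builtin is
specified to return the value the location held before the addition:)

	int x = 5;
	int old = __sync_fetch_and_add(&x, 3);
	/* old == 5, x == 8; BPF_XADD, by contrast, returns nothing */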

Will
Will Deacon Nov. 11, 2015, 4:23 p.m. UTC | #6
Hi Daniel,

Thanks for investigating this further.

On Wed, Nov 11, 2015 at 04:52:00PM +0100, Daniel Borkmann wrote:
> I played a bit around with eBPF code to assign the __sync_fetch_and_add()
> return value to a var and dump it to trace pipe, or use it as return code.
> llvm compiles it (with the result assignment) and it looks like:
>
> [...]
> 206: (b7) r3 = 3
> 207: (db) lock *(u64 *)(r0 +0) += r3
> 208: (bf) r1 = r10
> 209: (07) r1 += -16
> 210: (b7) r2 = 10
> 211: (85) call 6 // r3 dumped here
> [...]
>
> [...]
> 206: (b7) r5 = 3
> 207: (db) lock *(u64 *)(r0 +0) += r5
> 208: (bf) r1 = r10
> 209: (07) r1 += -16
> 210: (b7) r2 = 10
> 211: (b7) r3 = 43
> 212: (b7) r4 = 42
> 213: (85) call 6 // r5 dumped here
> [...]
>
> [...]
> 11: (b7) r0 = 3
> 12: (db) lock *(u64 *)(r1 +0) += r0
> 13: (95) exit // r0 returned here
> [...]
>
> What it seems is that we 'get back' the value (== 3 here in r3, r5, r0)
> that we're adding, at least that's what seems to be generated wrt
> register assignments. Hmm, the semantic differences of bpf target
> should be documented somewhere for people writing eBPF programs to
> be aware of.

If we're going to document it, a bug tracker might be a good place to
start. The behaviour, as it stands, is broken wrt the definition of the
__sync primitives. That is, there is no way to build __sync_fetch_and_add
out of BPF_XADD without changing its semantics.

We could fix this by either:

(1) Defining BPF_XADD to match __sync_fetch_and_add (including memory
    barriers).

(2) Introducing some new BPF_ atomics, that map to something like the
    C11 __atomic builtins and deprecating BPF_XADD in favour of these.

(3) Introducing new source-language intrinsics to match what BPF can do
    (unlikely to be popular).

As it stands, I'm not especially keen on adding BPF_XADD to the arm64
JIT backend until we have at least (1) and preferably (2) as well.

Will
Will Deacon Nov. 11, 2015, 5:44 p.m. UTC | #7
On Wed, Nov 11, 2015 at 12:35:48PM -0500, David Miller wrote:
> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Date: Wed, 11 Nov 2015 09:27:00 -0800
>
> > BPF_XADD == atomic_add() in kernel. period.
> > we are not going to deprecate it or introduce something else.
>
> Agreed, it makes no sense to try and tie C99 or whatever atomic
> semantics to something that is already clearly defined to have
> exactly kernel atomic_add() semantics.

... and which is emitted by LLVM when asked to compile __sync_fetch_and_add,
which has clearly defined (yet conflicting) semantics.

If the discrepancy is in LLVM (and it sounds like it is), then I'll raise
a bug over there instead.

Will
Will Deacon Nov. 11, 2015, 6:46 p.m. UTC | #8
On Wed, Nov 11, 2015 at 10:11:33AM -0800, Alexei Starovoitov wrote:
> On Wed, Nov 11, 2015 at 06:57:41PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 11, 2015 at 12:35:48PM -0500, David Miller wrote:
> > > From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > > Date: Wed, 11 Nov 2015 09:27:00 -0800
> > >
> > > > BPF_XADD == atomic_add() in kernel. period.
> > > > we are not going to deprecate it or introduce something else.
> > >
> > > Agreed, it makes no sense to try and tie C99 or whatever atomic
> > > semantics to something that is already clearly defined to have
> > > exactly kernel atomic_add() semantics.
> >
> > Dave, this really doesn't make any sense to me. __sync primitives have
> > well defined semantics and (e)BPF is violating this.
>
> bpf_xadd was never meant to be __sync_fetch_and_add equivalent.
> From the day one it meant to be atomic_add() as kernel does it.
> I did piggy back on __sync in the llvm backend because it was the quick
> and dirty way to move forward.
> In retrospect I should have introduced a clean intrinstic for that instead,
> but it's not too late to do it now. user space we can change at any time
> unlike kernel.

But it's not just "user space", it's the source language definition!
I also don't see how you can change it now, without simply rejecting
the __sync primitives outright.

> > Furthermore, the fetch_and_add (or XADD) name has well defined
> > semantics, which (e)BPF also violates.
>
> bpf_xadd also didn't meant to be 'fetch'. It was void return from the beginning.

Right, so it's just a misnomer.
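
(For reference, the kernel primitives BPF_XADD is being equated with are
indeed void-returning; simplified prototypes, exact types vary by
architecture:)

	void atomic_add(int i, atomic_t *v);
	void atomic64_add(long i, atomic64_t *v);

The kernel-side semantics never included a fetch of the old value.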

> > Atomicy is hard enough as it is, backends giving random interpretations
> > to them isn't helping anybody.
>
> no randomness. bpf_xadd == atomic_add() in kernel.
> imo that is the simplest and cleanest intepretantion one can have, no?

I don't really mind, as long as there is a semantic that everybody agrees
on. Really, I just want this to be consistent because memory models are
a PITA enough without having multiple interpretations flying around.

> > It also baffles me that Alexei is seemingly unwilling to change/rev the
> > (e)BPF instructions, which would be invisible to the regular user, he
> > does want to change the language itself, which will impact all
> > 'scripts'.
>
> well, we cannot change it in kernel because it's ABI.
> I'm not against adding new insns. We definitely can, but let's figure out why?
> Is anything broken? No. So what new insns make sense?

If you end up needing a suite of atomics, I would suggest the __atomic
builtins because they are likely to be more portable and more flexible
than trying to use the kernel memory model outside of the environment
for which it was developed. However, I agree with you that we can cross
that bridge when we get there.

> Adding new intrinsic to llvm is not a big deal. I'll add it as soon
> as I have time to work on it or if somebody beats me to it I would be
> glad to test it and apply it.

I'm more interested in what you do about the existing intrinsic. Anyway,
I'll raise a ticket against LLVM so that they're aware (and maybe
somebody else will fix it :).

Will
David Miller Nov. 11, 2015, 7:01 p.m. UTC | #9
From: Will Deacon <will.deacon@arm.com>
Date: Wed, 11 Nov 2015 17:44:01 +0000

> On Wed, Nov 11, 2015 at 12:35:48PM -0500, David Miller wrote:
>> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> Date: Wed, 11 Nov 2015 09:27:00 -0800
>>
>> > BPF_XADD == atomic_add() in kernel. period.
>> > we are not going to deprecate it or introduce something else.
>>
>> Agreed, it makes no sense to try and tie C99 or whatever atomic
>> semantics to something that is already clearly defined to have
>> exactly kernel atomic_add() semantics.
>
> ... and which is emitted by LLVM when asked to compile __sync_fetch_and_add,
> which has clearly defined (yet conflicting) semantics.

Alexei clearly stated that he knows about this issue and will fully
fix this up in LLVM.

What more do you need to hear from him once he's stated that he is
aware and is working on it?  Meanwhile you should make your JIT emit
what is expected, rather than arguing to change the semantics.

Thanks.

Patch

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 49c1f1b..0b1d2d3 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -609,7 +609,21 @@  emit_cond_jmp:
 	case BPF_STX | BPF_XADD | BPF_W:
 	/* STX XADD: lock *(u64 *)(dst + off) += src */
 	case BPF_STX | BPF_XADD | BPF_DW:
-		goto notyet;
+		ctx->tmp_used = 1;
+		emit_a64_mov_i(1, tmp2, off, ctx);
+		switch (BPF_SIZE(code)) {
+		case BPF_W:
+			emit(A64_LDR32(tmp, dst, tmp2), ctx);
+			emit(A64_ADD(is64, tmp, tmp, src), ctx);
+			emit(A64_STR32(tmp, dst, tmp2), ctx);
+			break;
+		case BPF_DW:
+			emit(A64_LDR64(tmp, dst, tmp2), ctx);
+			emit(A64_ADD(is64, tmp, tmp, src), ctx);
+			emit(A64_STR64(tmp, dst, tmp2), ctx);
+			break;
+		}
+		break;
 
 	/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
 	case BPF_LD | BPF_ABS | BPF_W:
@@ -679,9 +693,6 @@  emit_cond_jmp:
 		}
 		break;
 	}
-notyet:
-		pr_info_once("*** NOT YET: opcode %02x ***\n", code);
-		return -EFAULT;
 
 	default:
 		pr_err_once("unknown opcode %02x\n", code);