diff mbox series

[7/9] Input: n64joy - Fix DMA buffer alignment.

Message ID 20221127144116.1418083-8-jic23@kernel.org
State New
Series Input: Fix insufficient DMA alignment.

Commit Message

Jonathan Cameron Nov. 27, 2022, 2:41 p.m. UTC
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

The use of ____cacheline_aligned to ensure a buffer is DMA safe only
enforces the start of the buffer alignment. In this case, sufficient
alignment is already ensured by the use of kzalloc().
____cacheline_aligned does not ensure that no other members of the
structure are placed in the same cacheline after the end of the
buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
of the structure.

Whilst here, switch from ____cacheline_aligned to
__aligned(ARCH_KMALLOC_MINALIGN), as some architectures with variable
sized cache lines across their cache hierarchy require this
greater alignment guarantee for DMA safety.  Make this change throughout
the driver as it reduces the need for a reader to know about the
particular architecture.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lauri Kasanen <cand@gmx.com>
--
Only partly compile tested as I don't have a mips toolchain set up.
Would be great to add the stubs to be able to build these drivers
with COMPILE_TEST.
---
 drivers/input/joystick/n64joy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
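
The layout argument in the commit message can be checked with a small
userspace sketch. Everything in it is an assumption made for illustration:
DMA_ALIGN stands in for ARCH_KMALLOC_MINALIGN with an arbitrary value of 128,
the two structures are simplified stand-ins for struct n64joy_priv, and the
GCC/Clang __attribute__((aligned)) syntax replaces the kernel macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for ARCH_KMALLOC_MINALIGN; the real value is per-architecture. */
#define DMA_ALIGN 128

/* Buffer first: its start is aligned, but 'opened' is placed right after it. */
struct buf_first {
	uint64_t si_buf[8] __attribute__((aligned(DMA_ALIGN)));
	uint8_t opened;
};

/* Buffer last: the compiler pads the struct out to a multiple of DMA_ALIGN. */
struct buf_last {
	uint8_t opened;
	uint64_t si_buf[8] __attribute__((aligned(DMA_ALIGN)));
};

/* Index of the DMA_ALIGN-sized block that a byte offset falls into. */
static size_t block_of(size_t offset)
{
	return offset / DMA_ALIGN;
}

/* Returns 1 if 'opened' lies in the block holding the last byte of si_buf. */
static int buf_first_shares_block(void)
{
	size_t buf_end = offsetof(struct buf_first, si_buf) + sizeof(uint64_t[8]) - 1;

	return block_of(offsetof(struct buf_first, opened)) == block_of(buf_end);
}

/* Returns 1 if 'opened' lies in any block touched by si_buf. */
static int buf_last_shares_block(void)
{
	size_t buf_start = offsetof(struct buf_last, si_buf);
	size_t buf_end = buf_start + sizeof(uint64_t[8]) - 1;
	size_t opened = block_of(offsetof(struct buf_last, opened));

	return opened >= block_of(buf_start) && opened <= block_of(buf_end);
}
```

Under these assumptions buf_first_shares_block() reports sharing and
buf_last_shares_block() does not, and sizeof(struct buf_last) is a multiple of
DMA_ALIGN, so nothing placed after the structure can land in the buffer's
block either. This only holds if the structure itself starts on a DMA_ALIGN
boundary, which the commit message attributes to kzalloc().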

Comments

Lauri Kasanen Nov. 27, 2022, 4:48 p.m. UTC | #1
On Sun, 27 Nov 2022 14:41:14 +0000
Jonathan Cameron <jic23@kernel.org> wrote:

> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> enforces the start of the buffer alignment. In this case, sufficient
> alignment is already ensured by the use of kzalloc().
> ____cacheline_aligned does not ensure that no other members of the
> structure are placed in the same cacheline after the end of the
> buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> of the structure.

This move is unnecessary, because the cacheline is 16 bytes and the
buffer is 64 bytes.

> Whilst here, switch from ____cacheline_aligned to
> __aligned(ARCH_KMALLOC_MINALIGN), as some architectures with variable
> sized cache lines across their cache hierarchy require this
> greater alignment guarantee for DMA safety.  Make this change throughout
> the driver as it reduces the need for a reader to know about the
> particular architecture.

This change looks ok.

- Lauri
Jonathan Cameron Nov. 27, 2022, 6:01 p.m. UTC | #2
On Sun, 27 Nov 2022 18:48:44 +0200
Lauri Kasanen <cand@gmx.com> wrote:

> On Sun, 27 Nov 2022 14:41:14 +0000
> Jonathan Cameron <jic23@kernel.org> wrote:
> 
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> > enforces the start of the buffer alignment. In this case, sufficient
> > alignment is already ensured by the use of kzalloc().
> > ____cacheline_aligned does not ensure that no other members of the
> > structure are placed in the same cacheline after the end of the
> > buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> > of the structure.  
> 
> This move is unnecessary, because the cacheline is 16 bytes and the
> buffer is 64 bytes.

Ah.  That particular option hadn't occurred to me (and I'd failed to notice
how big the buffer is :( ).
The marking isn't needed at all then as the allocation is already
guaranteed to be sufficiently aligned. However, maybe that is a bit subtle
and having some sort of marking is useful.

Curious question though, why is the buffer so big?
Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them.

Is the hardware putting garbage into the remaining 2 cachelines or is there
something subtle going on?

Or given my earlier success, maybe I'm misreading the code entirely.

Jonathan

> 
> > Whilst here, switch from ____cacheline_aligned to
> > __aligned(ARCH_KMALLOC_MINALIGN), as some architectures with variable
> > sized cache lines across their cache hierarchy require this
> > greater alignment guarantee for DMA safety.  Make this change throughout
> > the driver as it reduces the need for a reader to know about the
> > particular architecture.
> 
> This change looks ok.
> 
> - Lauri
Lauri Kasanen Nov. 28, 2022, 6:49 a.m. UTC | #3
On Sun, 27 Nov 2022 18:01:26 +0000
Jonathan Cameron <jic23@kernel.org> wrote:

> > This move is unnecessary, because the cacheline is 16 bytes and the
> > buffer is 64 bytes.
> 
> Ah.  That particular option hadn't occurred to me (and I'd failed to notice
> how big the buffer is :( ).
> The marking isn't needed at all then as the allocation is already
> guaranteed to be sufficiently aligned. However, maybe that is a bit subtle
> and having some sort of marking is useful.

You can replace the ____cacheline_aligned annotation with a comment;
that's totally fine.

> Curious question though, why is the buffer so big?
> Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them.
> 
> Is the hardware putting garbage into the remaining 2 cachelines or is there
> something subtle going on?

That chip operates on 64-byte units. I don't remember whether the
remaining area is garbage or the input bytes or something else.

- Lauri
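
The size arithmetic discussed in this exchange can be written out as a short
sketch. The figures are taken from the thread rather than verified against
the hardware: a 16-byte cache line, an 8-byte struct joydata (stated above
only as "I think"), and four entries accessed by the driver. The macro and
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE      16u                     /* cache line size quoted in the thread */
#define SI_BUF_BYTES   (8 * sizeof(uint64_t))  /* u64 si_buf[8] */
#define JOYDATA_BYTES  8u                      /* assumed size of struct joydata */
#define CONTROLLERS    4u                      /* entries the driver is said to access */

/* Whole cache lines covered by the buffer. */
static size_t lines_in_buffer(void)
{
	return SI_BUF_BYTES / CACHELINE;
}

/* Cache lines covered by the entries the driver reads, rounded up. */
static size_t lines_read(void)
{
	return (CONTROLLERS * JOYDATA_BYTES + CACHELINE - 1) / CACHELINE;
}

/* Cache lines of the buffer that the driver never looks at. */
static size_t lines_unread(void)
{
	return lines_in_buffer() - lines_read();
}
```

With these numbers the buffer spans four lines, the driver reads two, and two
are left over, which matches the "remaining 2 cachelines" in the question.
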
Dmitry Torokhov Nov. 28, 2022, 6:04 p.m. UTC | #4
On Sun, Nov 27, 2022 at 06:48:44PM +0200, Lauri Kasanen wrote:
> On Sun, 27 Nov 2022 14:41:14 +0000
> Jonathan Cameron <jic23@kernel.org> wrote:
> 
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> > enforces the start of the buffer alignment. In this case, sufficient
> > alignment is already ensured by the use of kzalloc().
> > ____cacheline_aligned does not ensure that no other members of the
> > structure are placed in the same cacheline after the end of the
> > buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> > of the structure.
> 
> This move is unnecessary, because the cacheline is 16 bytes and the
> buffer is 64 bytes.

I think it is still worth moving it or alternatively adding a comment
why we believe the following member will not be sharing cacheline with
the buffer.

Thanks.
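
One way to record the reasoning asked for here is a comment backed by
compile-time checks. The following is a userspace sketch, not the driver's
code: N64_CACHELINE and struct priv_sketch are hypothetical names, the
16-byte line size is the figure quoted earlier in the thread, and the
structure is assumed to start on a cache line boundary (as it would when
allocated with kzalloc()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N64_CACHELINE 16 /* hypothetical macro; value quoted in the thread */

struct priv_sketch {
	/*
	 * 64 bytes starting on a cache line boundary: the buffer fills
	 * exactly four 16-byte lines, so the next member starts a new line.
	 */
	uint64_t si_buf[8] __attribute__((aligned(N64_CACHELINE)));
	uint8_t opened;
};

/* Fail the build if the assumptions in the comment stop holding. */
_Static_assert(sizeof(uint64_t[8]) % N64_CACHELINE == 0,
	       "si_buf must be a whole number of cache lines");
_Static_assert(offsetof(struct priv_sketch, si_buf) % N64_CACHELINE == 0,
	       "si_buf must start on a cache line boundary");

/* Returns 1 if the member following si_buf begins on a fresh cache line. */
static int next_member_on_new_line(void)
{
	return offsetof(struct priv_sketch, opened) % N64_CACHELINE == 0;
}
```

The assertions turn the comment into something the compiler enforces, so a
later change to the buffer size or its position would be caught at build
time rather than relying on a reader to redo the arithmetic.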

Patch

diff --git a/drivers/input/joystick/n64joy.c b/drivers/input/joystick/n64joy.c
index 9dbca366613e..d8c50103c108 100644
--- a/drivers/input/joystick/n64joy.c
+++ b/drivers/input/joystick/n64joy.c
@@ -44,12 +44,12 @@  static const char *n64joy_phys[MAX_CONTROLLERS] = {
 };
 
 struct n64joy_priv {
-	u64 si_buf[8] ____cacheline_aligned;
 	struct timer_list timer;
 	struct mutex n64joy_mutex;
 	struct input_dev *n64joy_dev[MAX_CONTROLLERS];
 	u32 __iomem *reg_base;
 	u8 n64joy_opened;
+	u64 si_buf[8] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 struct joydata {
@@ -129,7 +129,7 @@  static void n64joy_exec_pif(struct n64joy_priv *priv, const u64 in[8])
 	local_irq_restore(flags);
 }
 
-static const u64 polldata[] ____cacheline_aligned = {
+static const u64 polldata[] __aligned(ARCH_KMALLOC_MINALIGN) = {
 	0xff010401ffffffff,
 	0xff010401ffffffff,
 	0xff010401ffffffff,
@@ -222,7 +222,7 @@  static void n64joy_close(struct input_dev *dev)
 	mutex_unlock(&priv->n64joy_mutex);
 }
 
-static const u64 __initconst scandata[] ____cacheline_aligned = {
+static const u64 __initconst scandata[] __aligned(ARCH_KMALLOC_MINALIGN) = {
 	0xff010300ffffffff,
 	0xff010300ffffffff,
 	0xff010300ffffffff,