diff mbox series

scsi: ipr: fix build on 32-bit architectures

Message ID 20180608144617.2900894-1-arnd@arndb.de
State New
Headers show
Series scsi: ipr: fix build on 32-bit architectures | expand

Commit Message

Arnd Bergmann June 8, 2018, 2:46 p.m. UTC
Replacing writeq() with writeq_relaxed() doesn't work on many architectures,
as that variant is not available in general:

net/Makefile:24: CC cannot link executables. Skipping bpfilter.
drivers/scsi/ipr.c: In function 'ipr_mask_and_clear_interrupts':
drivers/scsi/ipr.c:767:3: error: implicit declaration of function 'writeq_relaxed'; did you mean 'writew_relaxed'? [-Werror=implicit-function-declaration]
   writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
   ^~~~~~~~~~~~~~
   writew_relaxed

The other issue here is that the patch eliminated the wrong barrier.
As per a long discussion that followed Sinan's original patch submission,
the conclusion was that drivers should generally assume that the barrier
implied by writel() is sufficient for ordering DMA, so this reverts his
change and instead removes the extraneous wmb() before it, which is no
longer needed on any architecture now.

Fixes: 0109a4f2e02d ("scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
 drivers/scsi/ipr.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

-- 
2.9.0

Comments

Arnd Bergmann June 8, 2018, 3:47 p.m. UTC | #1
On Fri, Jun 8, 2018 at 5:27 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> +Will,

>

> On 6/8/2018 10:46 AM, Arnd Bergmann wrote:

>> Replacing writeq() with writeq_relaxed() doesn't work on many architectures,

>> as that variant is not available in general:

>>

>> net/Makefile:24: CC cannot link executables. Skipping bpfilter.

>> drivers/scsi/ipr.c: In function 'ipr_mask_and_clear_interrupts':

>> drivers/scsi/ipr.c:767:3: error: implicit declaration of function 'writeq_relaxed'; did you mean 'writew_relaxed'? [-Werror=implicit-function-declaration]

>>    writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);

>>    ^~~~~~~~~~~~~~

>>    writew_relaxed

>>

>> The other issue here is that the patch eliminated the wrong barrier.

>> As per a long discussion that followed Sinan's original patch submission,

>> the conclusion was that drivers should generally assume that the barrier

>> implied by writel() is sufficient for ordering DMA, so this reverts his

>> change and instead removes the extraneous wmb() before it, which is no

>> longer needed on any architecture now.

>>

>> Fixes: 0109a4f2e02d ("scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs")

>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

>

> This looks good on paper however we need an input from the driver maintainer

> because some drivers like Intel NIC are using write barriers in place of

> a SMP barrier + write barrier combination as an optimizatin.

>

> Removing the barrier itself can actually break the driver if SMP barrier is

> actually needed instead.

>

> So, it is difficult to judge how this barrier has been used without an

> expert opinion.

>

> Changing

>

> wmb() + writel()

>

> to

>

> wmb() + writel_relaxed()

>

> is safer than dropping the wmb() altogether.


If the wmb() was not just about the writeq() then I would argue your patch
description was misleading. We certainly shouldn't replace random writeq()
calls with writeq_relaxed() just because we can show that the driver has
a barrier in front of it.

In particular, the ipr_mask_and_clear_interrupts() function has multiple
writeq() or writel() calls, and even a readl() and your patch only changes
one of them, which seems like a rather pointless exercise as the function
still fully synchronizes the I/O multiple times.

> Will Deacon should probably look at why writeq_relaxed is missing on some ARM

> arches too.

>

> Drivers shouldn't worry about write derivatives.


This driver defines writeq() itself for architectures that don't have it, but
it doesn't define writeq_relaxed() and doesn't include
linux/io-64-nonatomic-lo-hi.h
or linux/io-64-nonatomic-hi-lo.h. It seems that it needs a different behavior
from all other drivers here, storing the upper 32 bits into the lower
address and
the lower 32 bits into the upper address.

         Arnd
Sinan Kaya June 8, 2018, 4:10 p.m. UTC | #2
On 6/8/2018 11:47 AM, Arnd Bergmann wrote:
> On Fri, Jun 8, 2018 at 5:27 PM, Sinan Kaya <okaya@codeaurora.org> wrote:

>> +Will,

>>


[snip]

>> So, it is difficult to judge how this barrier has been used without an

>> expert opinion.

>>

>> Changing

>>

>> wmb() + writel()

>>

>> to

>>

>> wmb() + writel_relaxed()

>>

>> is safer than dropping the wmb() altogether.

> 

> If the wmb() was not just about the writeq() then I would argue your patch

> description was misleading. We certainly shouldn't replace random writeq()

> calls with writeq_relaxed() just because we can show that the driver has

> a barrier in front of it.

> 

> In particular, the ipr_mask_and_clear_interrupts() function has multiple

> writeq() or writel() calls, and even a readl() and your patch only changes

> one of them, which seems like a rather pointless exercise as the function

> still fully synchronizes the I/O multiple times.


You are right, I only searched wmb() + writel() combinations. Changed only
the places where I found issues. 

We were given a direction to go to eliminating barriers direction as you already
noted. 

I just wanted to highlight the difficulty of wholesale dropping of wmb() without
careful inspection. [1] [2]

We certainly need a better patch that covers all use cases. Your patch is
a step in the good direction. We just need some attention from the maintainer
that we are not actually breaking something.

> 

>> Will Deacon should probably look at why writeq_relaxed is missing on some ARM

>> arches too.

>>

>> Drivers shouldn't worry about write derivatives.

> 

> This driver defines writeq() itself for architectures that don't have it, but

> it doesn't define writeq_relaxed() and doesn't include

> linux/io-64-nonatomic-lo-hi.h

> or linux/io-64-nonatomic-hi-lo.h. It seems that it needs a different behavior

> from all other drivers here, storing the upper 32 bits into the lower

> address and

> the lower 32 bits into the upper address.


I don't think there is a consensus about using these includes in the community.
I bumped into this issue before and came up with an include you pointed.
I didn't get too much enthusiasm from the maintainer.

Why are we pushing the responsibility into the drivers? I'd think that architecture
should take care of this. Is there a portability issue that I'm missing from some
architecture I never heart of? (I work on Little-Endian machines most of the time)

[1] https://patchwork.kernel.org/patch/10301861/
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg227443.html

> 

>          Arnd

> 



-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Martin K. Petersen June 13, 2018, 5:14 p.m. UTC | #3
Arnd,

> the conclusion was that drivers should generally assume that the

> barrier implied by writel() is sufficient for ordering DMA, so this

> reverts his change and instead removes the extraneous wmb() before it,

> which is no longer needed on any architecture now.


Applied to 4.18/scsi-fixes and squashed with Sinan's patch.

-- 
Martin K. Petersen	Oracle Linux Engineering
diff mbox series

Patch

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index 865c07dc11ea..d2f67a41fcdd 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -760,13 +760,12 @@  static void ipr_mask_and_clear_interrupts(struct ipr_ioa_cfg *ioa_cfg,
 		ioa_cfg->hrrq[i].allow_interrupts = 0;
 		spin_unlock(&ioa_cfg->hrrq[i]._lock);
 	}
-	wmb();
 
 	/* Set interrupt mask to stop all new interrupts */
 	if (ioa_cfg->sis64)
-		writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writeq(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 	else
-		writel_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writel(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 
 	/* Clear any pending interrupts */
 	if (ioa_cfg->sis64)
@@ -8401,10 +8400,9 @@  static int ipr_reset_enable_ioa(struct ipr_cmnd *ipr_cmd)
 		ioa_cfg->hrrq[i].allow_interrupts = 1;
 		spin_unlock(&ioa_cfg->hrrq[i]._lock);
 	}
-	wmb();
 	if (ioa_cfg->sis64) {
 		/* Set the adapter to the correct endian mode. */
-		writel_relaxed(IPR_ENDIAN_SWAP_KEY,
+		writel(IPR_ENDIAN_SWAP_KEY,
 			       ioa_cfg->regs.endian_swap_reg);
 		int_reg = readl(ioa_cfg->regs.endian_swap_reg);
 	}