Message ID | 1410264760-29756-1-git-send-email-daniel.thompson@linaro.org |
---|---|
State | New |
Headers | show |
Hi Daniel, On Tue, Sep 09, 2014 at 01:12:40PM +0100, Daniel Thompson wrote: > Currently the read[bwlq]_relaxed() family are implemented on every > architecture except blackfin, m68k[1], metag, openrisc, s390[2] and > score. Increasingly drivers are being optimized to exploit relaxed > reads putting these architectures at risk of compilation failures for > shared drivers. > > This patch addresses this by providing implementations of > read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). > All the above architectures include asm-generic/io.h . > > Note that currently only eight architectures (alpha, arm, arm64, avr32, > hexagon, microblaze, mips and sh) implement write[bwlq]_relaxed() meaning > these functions are deliberately not included in this patch. > > [1] m68k includes the relaxed family only when configured *without* MMU. > [2] s390 requires CONFIG_PCI to include the relaxed family. > > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > Cc: Will Deacon <will.deacon@arm.com> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: linux-arch@vger.kernel.org > --- > include/asm-generic/io.h | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) I have a larger series adding these (and the write equivalents) to all architectures that I periodically post and then fail to get on top of. The key part you're missing is defining some generic semantics for these accessors. Without those, I don't think it makes sense to put them into asm-generic, because drivers can't safely infer any meaning from the relaxed definition. Ben and I agreed on something back in May: https://lkml.org/lkml/2014/5/22/468 but I need to send a new version including: - ioreadX_relaxed and iowriteX_relaxed - Strengthening non-relaxed I/O accessors on architectures with non-empty mmiowb() I'll bump it up the list. In the meantime, you can have a look at my io branch on kernel.org Will
Hi Daniel, On Tue, Sep 9, 2014 at 2:12 PM, Daniel Thompson <daniel.thompson@linaro.org> wrote: > Currently the read[bwlq]_relaxed() family are implemented on every > architecture except blackfin, m68k[1], metag, openrisc, s390[2] and > score. Increasingly drivers are being optimized to exploit relaxed > reads putting these architectures at risk of compilation failures for > shared drivers. > > This patch addresses this by providing implementations of > read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). > All the above architectures include asm-generic/io.h . m68k does not include asm-generic/io.h. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On 09/09/14 13:28, Will Deacon wrote: > Hi Daniel, > > On Tue, Sep 09, 2014 at 01:12:40PM +0100, Daniel Thompson wrote: >> Currently the read[bwlq]_relaxed() family are implemented on every >> architecture except blackfin, m68k[1], metag, openrisc, s390[2] and >> score. Increasingly drivers are being optimized to exploit relaxed >> reads putting these architectures at risk of compilation failures for >> shared drivers. >> >> This patch addresses this by providing implementations of >> read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). >> All the above architectures include asm-generic/io.h . >> >> Note that currently only eight architectures (alpha, arm, arm64, avr32, >> hexagon, microblaze, mips and sh) implement write[bwlq]_relaxed() meaning >> these functions are deliberately not included in this patch. >> >> [1] m68k includes the relaxed family only when configured *without* MMU. >> [2] s390 requires CONFIG_PCI to include the relaxed family. >> >> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> >> Cc: Will Deacon <will.deacon@arm.com> >> Cc: Arnd Bergmann <arnd@arndb.de> >> Cc: linux-arch@vger.kernel.org >> --- >> include/asm-generic/io.h | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) > > I have a larger series adding these (and the write equivalents) to all > architectures that I periodically post and then fail to get on top of. That's why you're on Cc:... > The key part you're missing is defining some generic semantics for these > accessors. Without those, I don't think it makes sense to put them into > asm-generic, because drivers can't safely infer any meaning from the relaxed > definition. Currently the semantics are described as: --- cut here --- PCI ordering rules also guarantee that PIO read responses arrive after any outstanding DMA writes from that bus, since for some devices the result of a readb call may signal to the driver that a DMA transaction is complete. In many cases, however, the driver may want to indicate that the next readb call has no relation to any previous DMA writes performed by the device. The driver can use readb_relaxed for these cases, although only some platforms will honor the relaxed semantics. Using the relaxed read functions will provide significant performance benefits on platforms that support it. The qla2xxx driver provides examples of how to use readX_relaxed . In many cases, a majority of the driver’s readX calls can safely be converted to readX_relaxed calls, since only a few will indicate or depend on DMA completion. --- cut here --- The implementation provided in the patch trivially meets this definition (by not honouring the relaxedness). > Ben and I agreed on something back in May: > > https://lkml.org/lkml/2014/5/22/468 ... and didn't you also conclude with hpa that the very relaxed x86 implementation of readl_relaxed() already meets this definition (as do these changes to asm-generic/io.h). Thus allowing its use to perculate more widely really shouldn't do an harm. > but I need to send a new version including: > > - ioreadX_relaxed and iowriteX_relaxed > - Strengthening non-relaxed I/O accessors on architectures with non-empty > mmiowb() > > I'll bump it up the list. In the meantime, you can have a look at my io > branch on kernel.org I'd really like to see your work included (which I spotted after I wrote the patch and when it occured to me to visit https://www.google.com/search?q=asm-generic+readl_relaxed to see if there was a well known reason not to make this change). However... I really can't see why we should delay introducing an already documented function to the remaining architectures. Daniel.
On 09/09/14 13:31, Geert Uytterhoeven wrote: > Hi Daniel, > > On Tue, Sep 9, 2014 at 2:12 PM, Daniel Thompson > <daniel.thompson@linaro.org> wrote: >> Currently the read[bwlq]_relaxed() family are implemented on every >> architecture except blackfin, m68k[1], metag, openrisc, s390[2] and >> score. Increasingly drivers are being optimized to exploit relaxed >> reads putting these architectures at risk of compilation failures for >> shared drivers. >> >> This patch addresses this by providing implementations of >> read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). >> All the above architectures include asm-generic/io.h . > > m68k does not include asm-generic/io.h. Looking back through my command history I can't figure out how I made this mistake. I will change the comment. For the record I have also just double checked blackfin, metag, openrisc, s390 and score and these really do use asm-generic/io.h . Daniel.
On 09/09/14 14:03, Daniel Thompson wrote: > On 09/09/14 13:28, Will Deacon wrote: >> Hi Daniel, >> >> On Tue, Sep 09, 2014 at 01:12:40PM +0100, Daniel Thompson wrote: >>> Currently the read[bwlq]_relaxed() family are implemented on every >>> architecture except blackfin, m68k[1], metag, openrisc, s390[2] and >>> score. Increasingly drivers are being optimized to exploit relaxed >>> reads putting these architectures at risk of compilation failures for >>> shared drivers. >>> >>> This patch addresses this by providing implementations of >>> read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). >>> All the above architectures include asm-generic/io.h . >>> >>> Note that currently only eight architectures (alpha, arm, arm64, avr32, >>> hexagon, microblaze, mips and sh) implement write[bwlq]_relaxed() meaning >>> these functions are deliberately not included in this patch. >>> >>> [1] m68k includes the relaxed family only when configured *without* MMU. >>> [2] s390 requires CONFIG_PCI to include the relaxed family. >>> >>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> >>> Cc: Will Deacon <will.deacon@arm.com> >>> Cc: Arnd Bergmann <arnd@arndb.de> >>> Cc: linux-arch@vger.kernel.org >>> --- >>> include/asm-generic/io.h | 14 ++++++++++++++ >>> 1 file changed, 14 insertions(+) >> >> I have a larger series adding these (and the write equivalents) to all >> architectures that I periodically post and then fail to get on top of. > > That's why you're on Cc:... > > >> The key part you're missing is defining some generic semantics for these >> accessors. Without those, I don't think it makes sense to put them into >> asm-generic, because drivers can't safely infer any meaning from the relaxed >> definition. > > Currently the semantics are described as: > --- cut here --- > PCI ordering rules also guarantee that PIO read responses arrive after > any outstanding DMA writes from that bus, since for some devices the > result of a readb call may signal to the driver that a DMA transaction > is complete. In many cases, however, the driver may want to indicate > that the next readb call has no relation to any previous DMA writes > performed by the device. The driver can use readb_relaxed for these > cases, although only some platforms will honor the relaxed semantics. > Using the relaxed read functions will provide significant performance > benefits on platforms that support it. The qla2xxx driver provides > examples of how to use readX_relaxed . In many cases, a majority of the > driver’s readX calls can safely be converted to readX_relaxed calls, > since only a few will indicate or depend on DMA completion. > --- cut here --- > > The implementation provided in the patch trivially meets this definition > (by not honouring the relaxedness). > > >> Ben and I agreed on something back in May: >> >> https://lkml.org/lkml/2014/5/22/468 > > ... and didn't you also conclude with hpa that the very relaxed x86 > implementation of readl_relaxed() already meets this definition (as do > these changes to asm-generic/io.h). Sorry. "very relaxed" is always a very stupid thing to say about x86 (especially to an arm guy). More exactly I was referring to the absence of memory clobber in x86 readl_relaxed(). > > Thus allowing its use to perculate more widely really shouldn't do an harm. > > >> but I need to send a new version including: >> >> - ioreadX_relaxed and iowriteX_relaxed >> - Strengthening non-relaxed I/O accessors on architectures with non-empty >> mmiowb() >> >> I'll bump it up the list. In the meantime, you can have a look at my io >> branch on kernel.org > > I'd really like to see your work included (which I spotted after I wrote > the patch and when it occured to me to visit > https://www.google.com/search?q=asm-generic+readl_relaxed to see if > there was a well known reason not to make this change). > > However... I really can't see why we should delay introducing an already > documented function to the remaining architectures. > > > Daniel. >
On Tue, Sep 09, 2014 at 02:14:54PM +0100, Daniel Thompson wrote: > On 09/09/14 14:03, Daniel Thompson wrote: > > On 09/09/14 13:28, Will Deacon wrote: > >> I have a larger series adding these (and the write equivalents) to all > >> architectures that I periodically post and then fail to get on top of. > > > > That's why you're on Cc:... Ok, so why not just pick the asm-generic patch out of my series? > >> The key part you're missing is defining some generic semantics for these > >> accessors. Without those, I don't think it makes sense to put them into > >> asm-generic, because drivers can't safely infer any meaning from the relaxed > >> definition. > > > > Currently the semantics are described as: > > --- cut here --- > > PCI ordering rules also guarantee that PIO read responses arrive after > > any outstanding DMA writes from that bus, since for some devices the > > result of a readb call may signal to the driver that a DMA transaction > > is complete. In many cases, however, the driver may want to indicate > > that the next readb call has no relation to any previous DMA writes > > performed by the device. The driver can use readb_relaxed for these > > cases, although only some platforms will honor the relaxed semantics. > > Using the relaxed read functions will provide significant performance > > benefits on platforms that support it. The qla2xxx driver provides > > examples of how to use readX_relaxed . In many cases, a majority of the > > driver’s readX calls can safely be converted to readX_relaxed calls, > > since only a few will indicate or depend on DMA completion. > > --- cut here --- > > > > The implementation provided in the patch trivially meets this definition > > (by not honouring the relaxedness). I still think we need to mention ordering of relaxed reads against each other and also against spinlocks. > >> Ben and I agreed on something back in May: > >> > >> https://lkml.org/lkml/2014/5/22/468 > > > > ... and didn't you also conclude with hpa that the very relaxed x86 > > implementation of readl_relaxed() already meets this definition (as do > > these changes to asm-generic/io.h). > > Sorry. "very relaxed" is always a very stupid thing to say about x86 > (especially to an arm guy). > > More exactly I was referring to the absence of memory clobber in x86 > readl_relaxed(). Yeah, my series just adds the relaxed write accessors for x86. > > Thus allowing its use to perculate more widely really shouldn't do an harm. > > > > > >> but I need to send a new version including: > >> > >> - ioreadX_relaxed and iowriteX_relaxed > >> - Strengthening non-relaxed I/O accessors on architectures with non-empty > >> mmiowb() > >> > >> I'll bump it up the list. In the meantime, you can have a look at my io > >> branch on kernel.org > > > > I'd really like to see your work included (which I spotted after I wrote > > the patch and when it occured to me to visit > > https://www.google.com/search?q=asm-generic+readl_relaxed to see if > > there was a well known reason not to make this change). > > > > However... I really can't see why we should delay introducing an already > > documented function to the remaining architectures. I'd just rather fix the interface once instead of churning it about. How about I dust off the series again? Will
On 09/09/14 15:15, Will Deacon wrote: > On Tue, Sep 09, 2014 at 02:14:54PM +0100, Daniel Thompson wrote: >> On 09/09/14 14:03, Daniel Thompson wrote: >>> On 09/09/14 13:28, Will Deacon wrote: >>>> I have a larger series adding these (and the write equivalents) to all >>>> architectures that I periodically post and then fail to get on top of. >>> >>> That's why you're on Cc:... > > Ok, so why not just pick the asm-generic patch out of my series? Only really that your patch introducing the writeX_relaxed() family at the same time. However it would be fine to subset your patch rather than use mine. You did post it first... >>>> The key part you're missing is defining some generic semantics for these >>>> accessors. Without those, I don't think it makes sense to put them into >>>> asm-generic, because drivers can't safely infer any meaning from the relaxed >>>> definition. >>> >>> Currently the semantics are described as: >>> --- cut here --- >>> PCI ordering rules also guarantee that PIO read responses arrive after >>> any outstanding DMA writes from that bus, since for some devices the >>> result of a readb call may signal to the driver that a DMA transaction >>> is complete. In many cases, however, the driver may want to indicate >>> that the next readb call has no relation to any previous DMA writes >>> performed by the device. The driver can use readb_relaxed for these >>> cases, although only some platforms will honor the relaxed semantics. >>> Using the relaxed read functions will provide significant performance >>> benefits on platforms that support it. The qla2xxx driver provides >>> examples of how to use readX_relaxed . In many cases, a majority of the >>> driver’s readX calls can safely be converted to readX_relaxed calls, >>> since only a few will indicate or depend on DMA completion. >>> --- cut here --- >>> >>> The implementation provided in the patch trivially meets this definition >>> (by not honouring the relaxedness). > > I still think we need to mention ordering of relaxed reads against each > other and also against spinlocks. I don't disagree. I just think the documentation being sub-optimal is not a good reason to avoid implementing the read functions. >>>> Ben and I agreed on something back in May: >>>> >>>> https://lkml.org/lkml/2014/5/22/468 >>> >>> ... and didn't you also conclude with hpa that the very relaxed x86 >>> implementation of readl_relaxed() already meets this definition (as do >>> these changes to asm-generic/io.h). >> >> Sorry. "very relaxed" is always a very stupid thing to say about x86 >> (especially to an arm guy). >> >> More exactly I was referring to the absence of memory clobber in x86 >> readl_relaxed(). > > Yeah, my series just adds the relaxed write accessors for x86. > >>> Thus allowing its use to perculate more widely really shouldn't do an harm. >>> >>> >>>> but I need to send a new version including: >>>> >>>> - ioreadX_relaxed and iowriteX_relaxed >>>> - Strengthening non-relaxed I/O accessors on architectures with non-empty >>>> mmiowb() >>>> >>>> I'll bump it up the list. In the meantime, you can have a look at my io >>>> branch on kernel.org >>> >>> I'd really like to see your work included (which I spotted after I wrote >>> the patch and when it occured to me to visit >>> https://www.google.com/search?q=asm-generic+readl_relaxed to see if >>> there was a well known reason not to make this change). >>> >>> However... I really can't see why we should delay introducing an already >>> documented function to the remaining architectures. > > I'd just rather fix the interface once instead of churning it about Churn? 12 lines of code where two people independently produce the same thing (apart from ordering within the file)? > How about I dust off the series again? Dusting off the series again would be great. Would you consider putting readX_relaxed() and its documentation at the front of the patchset? That way if the writel_relaxed() side log jams again we can still get some of it delivered. Daniel.
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h index 975e1cc..85ea117 100644 --- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -66,6 +66,16 @@ static inline u32 readl(const volatile void __iomem *addr) return __le32_to_cpu(__raw_readl(addr)); } +#ifndef readb_relaxed +#define readb_relaxed __raw_readb +#endif +#ifndef readw_relaxed +#define readw_relaxed readw +#endif +#ifndef readl_relaxed +#define readl_relaxed readl +#endif + #ifndef __raw_writeb static inline void __raw_writeb(u8 b, volatile void __iomem *addr) { @@ -105,6 +115,10 @@ static inline u64 readq(const volatile void __iomem *addr) return __le64_to_cpu(__raw_readq(addr)); } +#ifndef readq_relaxed +#define readq_relaxed readq +#endif + #ifndef __raw_writeq static inline void __raw_writeq(u64 b, volatile void __iomem *addr) {
Currently the read[bwlq]_relaxed() family are implemented on every architecture except blackfin, m68k[1], metag, openrisc, s390[2] and score. Increasingly drivers are being optimized to exploit relaxed reads putting these architectures at risk of compilation failures for shared drivers. This patch addresses this by providing implementations of read[bwlq]_relaxed() that are identical to the equivalent read[bwlq](). All the above architectures include asm-generic/io.h . Note that currently only eight architectures (alpha, arm, arm64, avr32, hexagon, microblaze, mips and sh) implement write[bwlq]_relaxed() meaning these functions are deliberately not included in this patch. [1] m68k includes the relaxed family only when configured *without* MMU. [2] s390 requires CONFIG_PCI to include the relaxed family. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-arch@vger.kernel.org --- include/asm-generic/io.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) -- 1.9.3