Message ID: 1326033862-25351-3-git-send-email-richard.zhao@linaro.org
State: Changes Requested
On Sun, Jan 08, 2012 at 10:44:22PM +0800, Richard Zhao wrote:
> readl/writel is more generic. And if CONFIG_ARM_DMA_MEM_BUFFERABLE is set,
> they include the necessary memory barriers.
>
> Signed-off-by: Richard Zhao <richard.zhao@linaro.org>

Acked-by: Shawn Guo <shawn.guo@linaro.org>

Regards,
Shawn
On Sun, Jan 08, 2012 at 10:44:22PM +0800, Richard Zhao wrote:
> readl/writel is more generic. And if CONFIG_ARM_DMA_MEM_BUFFERABLE is set,
> they include the necessary memory barriers.

In a DMA engine driver, you need to use the barrier accessors when:

1. You finally enable the DMA engine to perform a transfer.
   The included barrier ensures that writes to the descriptors are visible
   to the DMA engine.

2. You read from a status register before examining the descriptors.
   This ensures that the descriptor accesses won't be ordered before the
   status register read.

Provided other accesses are within the same 1K region, the remainder of
them do not have to be the strictly ordered accessors, and you can use
the _relaxed variants (but only in ARM specific drivers.)

So, if your DMA engine has a control register, and a descriptor pointer
register, you can write the descriptor pointer register with a
writel_relaxed().  When you write the control register to enable the
transfer, use writel() to ensure there's a barrier so the descriptors
are visible.
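To make the rule concrete, a minimal sketch of such a driver follows. The
register offsets, bit values and function names are hypothetical, purely for
illustration; only the accessors themselves (readl/writel and their _relaxed
variants) are real kernel interfaces.

#include <linux/types.h>
#include <linux/io.h>

/* hypothetical register layout, for illustration only */
#define DMA_DESC_PTR	0x00	/* descriptor pointer register */
#define DMA_CTRL	0x04	/* control register */
#define DMA_STATUS	0x08	/* status register */
#define DMA_CTRL_START	0x1
#define DMA_STAT_DONE	0x1

static void dma_start_xfer(void __iomem *regs, dma_addr_t desc_phys)
{
	/* same 1K region, engine not yet started: _relaxed is sufficient */
	writel_relaxed(desc_phys, regs + DMA_DESC_PTR);

	/*
	 * writel() includes the barrier, so the descriptors already written
	 * to coherent memory are visible to the engine before it starts.
	 */
	writel(DMA_CTRL_START, regs + DMA_CTRL);
}

static bool dma_xfer_done(void __iomem *regs)
{
	/*
	 * readl() keeps subsequent descriptor reads from being ordered
	 * before this status register read.
	 */
	return readl(regs + DMA_STATUS) & DMA_STAT_DONE;
}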
On Mon, Jan 9, 2012 at 7:51 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sun, Jan 08, 2012 at 10:44:22PM +0800, Richard Zhao wrote:
>> readl/writel is more generic. And if CONFIG_ARM_DMA_MEM_BUFFERABLE is set,
>> they include the necessary memory barriers.
>
> In a DMA engine driver, you need to use the barrier accessors when:
>
> 1. You finally enable the DMA engine to perform a transfer.
>    The included barrier ensures that writes to the descriptors are visible
>    to the DMA engine.
>
> 2. You read from a status register before examining the descriptors.
>    This ensures that the descriptor accesses won't be ordered before the
>    status register read.
>
> Provided other accesses are within the same 1K region, the remainder of
> them do not have to be the strictly ordered accessors, and you can use
> the _relaxed variants (but only in ARM specific drivers.)

Russell,

Does this also mean when endian conversion is not necessary, the __raw_*
version will be better here? Or generally the _relaxed variants are more
recommended as endian conversion will be optimized away anyway with
these AMBA accesses as both sides are little-endian?

> So, if your DMA engine has a control register, and a descriptor pointer
> register, you can write the descriptor pointer register with a
> writel_relaxed().  When you write the control register to enable the
> transfer, use writel() to ensure there's a barrier so the descriptors
> are visible.
On Mon, Jan 09, 2012 at 09:25:12PM +0800, Eric Miao wrote:
> Does this also mean when endian conversion is not necessary, the __raw_*
> version will be better here? Or generally the _relaxed variants are more
> recommended as endian conversion will be optimized away anyway with
> these AMBA accesses as both sides are little-endian?

Useless endian conversions are always optimized away.  Here's the
definitions:

If your CPU is operating in little endian mode, for 32-bit and 16-bit:

#define __cpu_to_le32(x) ((__force __le32)(__u32)(x))
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
#define __cpu_to_le16(x) ((__force __le16)(__u16)(x))
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))

So these are just casts to keep sparse happy and able to check this stuff.

#define __cpu_to_be32(x) ((__force __be32)__swab32((x)))
#define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x))
#define __cpu_to_be16(x) ((__force __be16)__swab16((x)))
#define __be16_to_cpu(x) __swab16((__force __u16)(__be16)(x))

These do the endian conversion.

If your CPU is running in big endian mode:

#define __cpu_to_le32(x) ((__force __le32)__swab32((x)))
#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
#define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))

So these do the endian conversion, and:

#define __cpu_to_be32(x) ((__force __be32)(__u32)(x))
#define __be32_to_cpu(x) ((__force __u32)(__be32)(x))
#define __cpu_to_be16(x) ((__force __be16)(__u16)(x))
#define __be16_to_cpu(x) ((__force __u16)(__be16)(x))

These are just casts.
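As a small illustration of how these annotations typically appear in a driver
(the structure and field below are hypothetical):

#include <linux/types.h>
#include <asm/byteorder.h>

/* hypothetical DMA descriptor with a device-endian (little-endian) field */
struct my_desc {
	__le32	buf_addr;
};

static u32 desc_buf_addr(const struct my_desc *d)
{
	/*
	 * On a little-endian CPU this is only the sparse-visible cast shown
	 * above and generates no code; on a big-endian CPU it becomes a
	 * __swab32().
	 */
	return le32_to_cpu(d->buf_addr);
}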
On Mon, Jan 09, 2012 at 11:51:43AM +0000, Russell King - ARM Linux wrote:
> On Sun, Jan 08, 2012 at 10:44:22PM +0800, Richard Zhao wrote:
> > readl/writel is more generic. And if CONFIG_ARM_DMA_MEM_BUFFERABLE is set,
> > they include the necessary memory barriers.
>
> In a DMA engine driver, you need to use the barrier accessors when:
>
> 1. You finally enable the DMA engine to perform a transfer.
>    The included barrier ensures that writes to the descriptors are visible
>    to the DMA engine.
>
> 2. You read from a status register before examining the descriptors.
>    This ensures that the descriptor accesses won't be ordered before the
>    status register read.
>
> Provided other accesses are within the same 1K region, the remainder of
> them do not have to be the strictly ordered accessors, and you can use
> the _relaxed variants (but only in ARM specific drivers.)
>
> So, if your DMA engine has a control register, and a descriptor pointer
> register, you can write the descriptor pointer register with a
> writel_relaxed().  When you write the control register to enable the
> transfer, use writel() to ensure there's a barrier so the descriptors
> are visible.

Thanks very much for the explanation; I understand now what _relaxed means
here. I'll change all register accesses to the _relaxed version except in
sdma_enable_channel.

Freescale PowerPC also has SDMA; I guess we can use _relaxed until they
move to sharing the same driver (I don't know when that will be).

Thanks
Richard
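A sketch of what that plan could look like for two of the functions touched
by this patch (just the idea, not the final follow-up patch):

static void sdma_event_enable(struct sdma_channel *sdmac, unsigned int event)
{
	struct sdma_engine *sdma = sdmac->sdma;
	int channel = sdmac->channel;
	u32 val;
	u32 chnenbl = chnenbl_ofs(sdma, event);

	/* ordinary register access: the _relaxed accessors are enough */
	val = readl_relaxed(sdma->regs + chnenbl);
	val |= (1 << channel);
	writel_relaxed(val, sdma->regs + chnenbl);
}

static void sdma_enable_channel(struct sdma_engine *sdma, int channel)
{
	/*
	 * keep writel() here: its barrier makes the buffer descriptors
	 * visible to the SDMA engine before the channel is started
	 */
	writel(1 << channel, sdma->regs + SDMA_H_START);
}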
On Mon, Jan 09, 2012 at 10:21:28PM +0800, Richard Zhao wrote:
> Freescale PowerPC also has SDMA; I guess we can use _relaxed until they
> move to sharing the same driver (I don't know when that will be).

The answer to that is to persuade powerpc to gain the _relaxed versions
(which can simply call the non-relaxed versions as a first stab.)  They
have the readx_relaxed() stuff already.
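A sketch of that kind of first stab (a generic fallback, not actual powerpc
code):

/*
 * Architectures lacking the _relaxed accessors can simply alias them to
 * the fully ordered versions until real relaxed implementations exist.
 */
#ifndef readl_relaxed
#define readl_relaxed(c)	readl(c)
#endif
#ifndef writel_relaxed
#define writel_relaxed(v, c)	writel(v, c)
#endif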
On Mon, Jan 09, 2012 at 02:46:27PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 09, 2012 at 10:21:28PM +0800, Richard Zhao wrote:
> > Freescale PowerPC also has SDMA; I guess we can use _relaxed until they
> > move to sharing the same driver (I don't know when that will be).
>
> The answer to that is to persuade powerpc to gain the _relaxed versions
> (which can simply call the non-relaxed versions as a first stab.)  They
> have the readx_relaxed() stuff already.

Quite right.

Thanks
Richard
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index c2bc4f1..2cc96c4 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -368,9 +368,9 @@ static int sdma_config_ownership(struct sdma_channel *sdmac,
 	if (event_override && mcu_override && dsp_override)
 		return -EINVAL;

-	evt = __raw_readl(sdma->regs + SDMA_H_EVTOVR);
-	mcu = __raw_readl(sdma->regs + SDMA_H_HOSTOVR);
-	dsp = __raw_readl(sdma->regs + SDMA_H_DSPOVR);
+	evt = readl(sdma->regs + SDMA_H_EVTOVR);
+	mcu = readl(sdma->regs + SDMA_H_HOSTOVR);
+	dsp = readl(sdma->regs + SDMA_H_DSPOVR);

 	if (dsp_override)
 		dsp &= ~(1 << channel);
@@ -387,16 +387,16 @@ static int sdma_config_ownership(struct sdma_channel *sdmac,
 	else
 		mcu |= (1 << channel);

-	__raw_writel(evt, sdma->regs + SDMA_H_EVTOVR);
-	__raw_writel(mcu, sdma->regs + SDMA_H_HOSTOVR);
-	__raw_writel(dsp, sdma->regs + SDMA_H_DSPOVR);
+	writel(evt, sdma->regs + SDMA_H_EVTOVR);
+	writel(mcu, sdma->regs + SDMA_H_HOSTOVR);
+	writel(dsp, sdma->regs + SDMA_H_DSPOVR);

 	return 0;
 }

 static void sdma_enable_channel(struct sdma_engine *sdma, int channel)
 {
-	__raw_writel(1 << channel, sdma->regs + SDMA_H_START);
+	writel(1 << channel, sdma->regs + SDMA_H_START);
 }

 /*
@@ -460,9 +460,9 @@ static void sdma_event_enable(struct sdma_channel *sdmac, unsigned int event)
 	u32 val;
 	u32 chnenbl = chnenbl_ofs(sdma, event);

-	val = __raw_readl(sdma->regs + chnenbl);
+	val = readl(sdma->regs + chnenbl);
 	val |= (1 << channel);
-	__raw_writel(val, sdma->regs + chnenbl);
+	writel(val, sdma->regs + chnenbl);
 }

 static void sdma_event_disable(struct sdma_channel *sdmac, unsigned int event)
@@ -472,9 +472,9 @@ static void sdma_event_disable(struct sdma_channel *sdmac, unsigned int event)
 	u32 chnenbl = chnenbl_ofs(sdma, event);
 	u32 val;

-	val = __raw_readl(sdma->regs + chnenbl);
+	val = readl(sdma->regs + chnenbl);
 	val &= ~(1 << channel);
-	__raw_writel(val, sdma->regs + chnenbl);
+	writel(val, sdma->regs + chnenbl);
 }

 static void sdma_handle_channel_loop(struct sdma_channel *sdmac)
@@ -552,8 +552,8 @@ static irqreturn_t sdma_int_handler(int irq, void *dev_id)
 	struct sdma_engine *sdma = dev_id;
 	u32 stat;

-	stat = __raw_readl(sdma->regs + SDMA_H_INTR);
-	__raw_writel(stat, sdma->regs + SDMA_H_INTR);
+	stat = readl(sdma->regs + SDMA_H_INTR);
+	writel(stat, sdma->regs + SDMA_H_INTR);

 	while (stat) {
 		int channel = fls(stat) - 1;
@@ -707,7 +707,7 @@ static void sdma_disable_channel(struct sdma_channel *sdmac)
 	struct sdma_engine *sdma = sdmac->sdma;
 	int channel = sdmac->channel;

-	__raw_writel(1 << channel, sdma->regs + SDMA_H_STATSTOP);
+	writel(1 << channel, sdma->regs + SDMA_H_STATSTOP);
 	sdmac->status = DMA_ERROR;
 }

@@ -780,7 +780,7 @@ static int sdma_set_channel_priority(struct sdma_channel *sdmac,
 		return -EINVAL;
 	}

-	__raw_writel(priority, sdma->regs + SDMA_CHNPRI_0 + 4 * channel);
+	writel(priority, sdma->regs + SDMA_CHNPRI_0 + 4 * channel);

 	return 0;
 }
@@ -1228,7 +1228,7 @@ static int __init sdma_init(struct sdma_engine *sdma)
 	clk_enable(sdma->clk);

 	/* Be sure SDMA has not started yet */
-	__raw_writel(0, sdma->regs + SDMA_H_C0PTR);
+	writel(0, sdma->regs + SDMA_H_C0PTR);

 	sdma->channel_control = dma_alloc_coherent(NULL,
 			MAX_DMA_CHANNELS * sizeof (struct sdma_channel_control) +
@@ -1251,11 +1251,11 @@

 	/* disable all channels */
 	for (i = 0; i < sdma->num_events; i++)
-		__raw_writel(0, sdma->regs + chnenbl_ofs(sdma, i));
+		writel(0, sdma->regs + chnenbl_ofs(sdma, i));

 	/* All channels have priority 0 */
 	for (i = 0; i < MAX_DMA_CHANNELS; i++)
-		__raw_writel(0, sdma->regs + SDMA_CHNPRI_0 + i * 4);
+		writel(0, sdma->regs + SDMA_CHNPRI_0 + i * 4);

 	ret = sdma_request_channel(&sdma->channel[0]);
 	if (ret)
@@ -1264,16 +1264,16 @@
 	sdma_config_ownership(&sdma->channel[0], false, true, false);

 	/* Set Command Channel (Channel Zero) */
-	__raw_writel(0x4050, sdma->regs + SDMA_CHN0ADDR);
+	writel(0x4050, sdma->regs + SDMA_CHN0ADDR);

 	/* Set bits of CONFIG register but with static context switching */
 	/* FIXME: Check whether to set ACR bit depending on clock ratios */
-	__raw_writel(0, sdma->regs + SDMA_H_CONFIG);
+	writel(0, sdma->regs + SDMA_H_CONFIG);

-	__raw_writel(ccb_phys, sdma->regs + SDMA_H_C0PTR);
+	writel(ccb_phys, sdma->regs + SDMA_H_C0PTR);

 	/* Set bits of CONFIG register with given context switching mode */
-	__raw_writel(SDMA_H_CONFIG_CSM, sdma->regs + SDMA_H_CONFIG);
+	writel(SDMA_H_CONFIG_CSM, sdma->regs + SDMA_H_CONFIG);

 	/* Initializes channel's priorities */
 	sdma_set_channel_priority(&sdma->channel[0], 7);
readl/writel is more generic. And if CONFIG_ARM_DMA_MEM_BUFFERABLE is set,
they include the necessary memory barriers.

Signed-off-by: Richard Zhao <richard.zhao@linaro.org>
---
 drivers/dma/imx-sdma.c |   44 ++++++++++++++++++++++----------------------
 1 files changed, 22 insertions(+), 22 deletions(-)