diff mbox series

[08/25] block/nvme: Simplify device reset

Message ID 20201027135547.374946-9-philmd@redhat.com
State New
Headers show
Series block/nvme: Fix Aarch64 host | expand

Commit Message

Philippe Mathieu-Daudé Oct. 27, 2020, 1:55 p.m. UTC
Avoid multiple endianess conversion by using device endianess.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Keith Busch Oct. 27, 2020, 2:54 p.m. UTC | #1
On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:
> Avoid multiple endianess conversion by using device endianess.
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index e95d59d3126..be14350f959 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
>  
>      /* Reset device to get a clean state. */
> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
> +    regs->cc &= const_le32(0xFE);

This doesn't look right. The 'regs' is an MMIO address, correct? Memory
mappings use the CPU native.
Keith Busch Oct. 27, 2020, 2:58 p.m. UTC | #2
On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:
> Avoid multiple endianess conversion by using device endianess.

> 

> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---

>  block/nvme.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> diff --git a/block/nvme.c b/block/nvme.c

> index e95d59d3126..be14350f959 100644

> --- a/block/nvme.c

> +++ b/block/nvme.c

> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,

>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);

>  

>      /* Reset device to get a clean state. */

> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);

> +    regs->cc &= const_le32(0xFE);


This doesn't look right. The 'regs' is an MMIO address, correct? Memory
mappings use the CPU native access.
Philippe Mathieu-Daudé Oct. 27, 2020, 3:53 p.m. UTC | #3
On 10/27/20 3:58 PM, Keith Busch wrote:
> On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:

>> Avoid multiple endianess conversion by using device endianess.

>>

>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>> ---

>>  block/nvme.c | 2 +-

>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> diff --git a/block/nvme.c b/block/nvme.c

>> index e95d59d3126..be14350f959 100644

>> --- a/block/nvme.c

>> +++ b/block/nvme.c

>> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,

>>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);

>>  

>>      /* Reset device to get a clean state. */

>> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);

>> +    regs->cc &= const_le32(0xFE);

> 

> This doesn't look right. The 'regs' is an MMIO address, correct? Memory

> mappings use the CPU native access.


cc is little-endian uint32_t.

on big-endian: const_le32(0xFE) = 0xfe000000;
so: regs->cc &= 0xfe000000.

Anyway this is an example of unproductive patch, as it makes
things more confuse to you. Let's ignore it.
Keith Busch Oct. 27, 2020, 4:55 p.m. UTC | #4
On Tue, Oct 27, 2020 at 04:53:31PM +0100, Philippe Mathieu-Daudé wrote:
> On 10/27/20 3:58 PM, Keith Busch wrote:

> > On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:

> >> Avoid multiple endianess conversion by using device endianess.

> >>

> >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> >> ---

> >>  block/nvme.c | 2 +-

> >>  1 file changed, 1 insertion(+), 1 deletion(-)

> >>

> >> diff --git a/block/nvme.c b/block/nvme.c

> >> index e95d59d3126..be14350f959 100644

> >> --- a/block/nvme.c

> >> +++ b/block/nvme.c

> >> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,

> >>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);

> >>  

> >>      /* Reset device to get a clean state. */

> >> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);

> >> +    regs->cc &= const_le32(0xFE);

> > 

> > This doesn't look right. The 'regs' is an MMIO address, correct? Memory

> > mappings use the CPU native access.

> 

> cc is little-endian uint32_t.


Well, yes and no. PCI is defined as a little endian transport, so all
CPUs have to automatically convert from their native format when
accessing memory mapped addresses over that transport, so you always use
the arch native format from the host software.

This isn't just for CC. This includes all memory mapped registers, so
this driver's CSTS, AQA, doorbells, etc... shouldn't have any endian
swapping.

See also: every other nvme driver. :)
Stefan Hajnoczi Oct. 28, 2020, 3:02 p.m. UTC | #5
On Tue, Oct 27, 2020 at 09:55:34AM -0700, Keith Busch wrote:
> On Tue, Oct 27, 2020 at 04:53:31PM +0100, Philippe Mathieu-Daudé wrote:
> > On 10/27/20 3:58 PM, Keith Busch wrote:
> > > On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:
> > >> Avoid multiple endianess conversion by using device endianess.
> > >>
> > >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > >> ---
> > >>  block/nvme.c | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/block/nvme.c b/block/nvme.c
> > >> index e95d59d3126..be14350f959 100644
> > >> --- a/block/nvme.c
> > >> +++ b/block/nvme.c
> > >> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
> > >>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
> > >>  
> > >>      /* Reset device to get a clean state. */
> > >> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
> > >> +    regs->cc &= const_le32(0xFE);
> > > 
> > > This doesn't look right. The 'regs' is an MMIO address, correct? Memory
> > > mappings use the CPU native access.
> > 
> > cc is little-endian uint32_t.
> 
> Well, yes and no. PCI is defined as a little endian transport, so all
> CPUs have to automatically convert from their native format when
> accessing memory mapped addresses over that transport, so you always use
> the arch native format from the host software.
> 
> This isn't just for CC. This includes all memory mapped registers, so
> this driver's CSTS, AQA, doorbells, etc... shouldn't have any endian
> swapping.
> 
> See also: every other nvme driver. :)

I don't see the opposite in Linux. The Linux NVMe drivers use byteswap
instructions because readl()/writel() and friends perform little-endian
memory accesses, not native endian memory accesses:

  static int nvme_pci_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
  {
  	writel(val, to_nvme_dev(ctrl)->bar + off);
  	return 0;
  }

arch/arm/include/asm/io.h:

  #define writel(v,c)     ({ __iowmb(); writel_relaxed(v,c); })

where the byteswap happens here:

  #define writel_relaxed(v,c) __raw_writel((__force u32) cpu_to_le32(v),c)

The CPU is using explicit byteswaps, which matches what the QEMU driver
is doing. Am I missing something?

Stefan
Keith Busch Oct. 28, 2020, 3:10 p.m. UTC | #6
On Wed, Oct 28, 2020 at 03:02:09PM +0000, Stefan Hajnoczi wrote:
> On Tue, Oct 27, 2020 at 09:55:34AM -0700, Keith Busch wrote:
> > On Tue, Oct 27, 2020 at 04:53:31PM +0100, Philippe Mathieu-Daudé wrote:
> > > On 10/27/20 3:58 PM, Keith Busch wrote:
> > > > On Tue, Oct 27, 2020 at 02:55:30PM +0100, Philippe Mathieu-Daudé wrote:
> > > >> Avoid multiple endianess conversion by using device endianess.
> > > >>
> > > >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > >> ---
> > > >>  block/nvme.c | 2 +-
> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/block/nvme.c b/block/nvme.c
> > > >> index e95d59d3126..be14350f959 100644
> > > >> --- a/block/nvme.c
> > > >> +++ b/block/nvme.c
> > > >> @@ -755,7 +755,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
> > > >>      timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
> > > >>  
> > > >>      /* Reset device to get a clean state. */
> > > >> -    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
> > > >> +    regs->cc &= const_le32(0xFE);
> > > > 
> > > > This doesn't look right. The 'regs' is an MMIO address, correct? Memory
> > > > mappings use the CPU native access.
> > > 
> > > cc is little-endian uint32_t.
> > 
> > Well, yes and no. PCI is defined as a little endian transport, so all
> > CPUs have to automatically convert from their native format when
> > accessing memory mapped addresses over that transport, so you always use
> > the arch native format from the host software.
> > 
> > This isn't just for CC. This includes all memory mapped registers, so
> > this driver's CSTS, AQA, doorbells, etc... shouldn't have any endian
> > swapping.
> > 
> > See also: every other nvme driver. :)
> 
> I don't see the opposite in Linux. The Linux NVMe drivers use byteswap
> instructions because readl()/writel() and friends perform little-endian
> memory accesses, not native endian memory accesses:
> 
>   static int nvme_pci_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
>   {
>   	writel(val, to_nvme_dev(ctrl)->bar + off);
>   	return 0;
>   }
> 
> arch/arm/include/asm/io.h:
> 
>   #define writel(v,c)     ({ __iowmb(); writel_relaxed(v,c); })
> 
> where the byteswap happens here:
> 
>   #define writel_relaxed(v,c) __raw_writel((__force u32) cpu_to_le32(v),c)
> 
> The CPU is using explicit byteswaps, which matches what the QEMU driver
> is doing. Am I missing something?

You're not missing anything. I just made a mistake. Looks like I never
followed the implementation that far along for the BE archs.
diff mbox series

Patch

diff --git a/block/nvme.c b/block/nvme.c
index e95d59d3126..be14350f959 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -755,7 +755,7 @@  static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
 
     /* Reset device to get a clean state. */
-    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
+    regs->cc &= const_le32(0xFE);
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
     while (NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {