[0/3] MTD: spinand: Add spi_mem_poll_status() support

Message ID	20210426143934.25275-1-patrice.chotard@foss.st.com
Headers	show Return-Path: <linux-spi-owner@kernel.org> From: <patrice.chotard@foss.st.com> To: Mark Brown <broonie@kernel.org>, Miquel Raynal <miquel.raynal@bootlin.com>, Vignesh Raghavendra <vigneshr@ti.com>, Boris Brezillon <boris.brezillon@collabora.com>, <linux-mtd@lists.infradead.org>, Alexandre Torgue <alexandre.torgue@foss.st.com>, <linux-spi@vger.kernel.org>, <linux-stm32@st-md-mailman.stormreply.com>, <linux-arm-kernel@lists.infradead.org>, <linux-kernel@vger.kernel.org> CC: <patrice.chotard@foss.st.com>, <christophe.kerello@foss.st.com> Subject: [PATCH 0/3] MTD: spinand: Add spi_mem_poll_status() support Date: Mon, 26 Apr 2021 16:39:31 +0200 Message-ID: <20210426143934.25275-1-patrice.chotard@foss.st.com> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk
Series	MTD: spinand: Add spi_mem_poll_status() support \| expand [0/3] MTD: spinand: Add spi_mem_poll_status() support [2/3] mtd: spinand: use the spi-mem poll status APIs

Patrice CHOTARD April 26, 2021, 2:39 p.m. UTC

From: Patrice Chotard <patrice.chotard@foss.st.com>

This series adds support for the spi_mem_poll_status() spinand
interface.
Some QSPI controllers allows to poll automatically memory 
status during operations (erase or write). This allows to 
offload the CPU for this task.
STM32 QSPI is supporting this feature, driver update are also
part of this series.

Christophe Kerello (3):
  spi: spi-mem: add automatic poll status functions
  mtd: spinand: use the spi-mem poll status APIs
  spi: stm32-qspi: add automatic poll status feature

 drivers/mtd/nand/spi/core.c  | 22 ++++++++--
 drivers/spi/spi-mem.c        | 34 +++++++++++++++
 drivers/spi/spi-stm32-qspi.c | 80 ++++++++++++++++++++++++++++++++----
 include/linux/mtd/spinand.h  |  1 +
 include/linux/spi/spi-mem.h  |  8 ++++
 5 files changed, 133 insertions(+), 12 deletions(-)

Mark Brown April 26, 2021, 4:51 p.m. UTC | #1

On Mon, Apr 26, 2021 at 09:56:12PM +0530, Pratyush Yadav wrote:
> On 26/04/21 04:39PM, patrice.chotard@foss.st.com wrote:

> > + * spi_mem_poll_status() - Poll memory device status
> > + * @mem: SPI memory device
> > + * @op: the memory operation to execute
> > + * @mask: status bitmask to ckeck
> > + * @match: status expected value

> Technically, (status & mask) expected value. Dunno if that is obvious 
> enough to not spell out explicitly.

Is it possible there's some situation where you're waiting for some bits
to clear as well?

> > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);

I'm not sure I like this name since it makes me think the driver is
going to poll when really it's offloaded to the hardware, but I can't
think of any better ideas either and it *is* what the hardware is going
to be doing so meh.

> I wonder if it is better to let spi-mem core take care of the timeout 
> part. On one hand it reduces code duplication on the driver side a 
> little bit. Plus it makes sure drivers don't mess anything up with bad 
> (or no) handling of the timeout. But on the other hand the interface 
> becomes a bit awkward since you'd have to pass a struct completion 
> around, and it isn't something particularly hard to get right either. 
> What do you think?

We already have the core handling other timeouts.  We don't pass around
completions but rather have an API function that the driver has to call
when the operation completes, a similar pattern might work here.  Part
of the thing with those APIs which I'm missing here is that this will
just return -EOPNOTSUPP if the driver can't do the delay in hardware, I
think it would be cleaner if this API were similar and the core dealt
with doing the delay/poll on the CPU.  That way the users don't need to
repeat the handling for the offload/non-offload cases.

Pratyush Yadav April 26, 2021, 5:39 p.m. UTC | #2

On 26/04/21 05:51PM, Mark Brown wrote:
> On Mon, Apr 26, 2021 at 09:56:12PM +0530, Pratyush Yadav wrote:
> > On 26/04/21 04:39PM, patrice.chotard@foss.st.com wrote:
> 
> > > + * spi_mem_poll_status() - Poll memory device status
> > > + * @mem: SPI memory device
> > > + * @op: the memory operation to execute
> > > + * @mask: status bitmask to ckeck
> > > + * @match: status expected value
> 
> > Technically, (status & mask) expected value. Dunno if that is obvious 
> > enough to not spell out explicitly.
> 
> Is it possible there's some situation where you're waiting for some bits
> to clear as well?

Yes. In fact, that is the more common situation. Both SPI NOR 
(spi_nor_sr_ready()) and SPI NAND (spinand_wait()) need to wait for the 
"busy" bit to be cleared.

AFAICT this API is supposed to check for (status & mask) == (match & 
mask) so it should be able to handle both polarities for the bits being 
polled.

> 
> > > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);
> 
> I'm not sure I like this name since it makes me think the driver is
> going to poll when really it's offloaded to the hardware, but I can't
> think of any better ideas either and it *is* what the hardware is going
> to be doing so meh.
> 
> > I wonder if it is better to let spi-mem core take care of the timeout 
> > part. On one hand it reduces code duplication on the driver side a 
> > little bit. Plus it makes sure drivers don't mess anything up with bad 
> > (or no) handling of the timeout. But on the other hand the interface 
> > becomes a bit awkward since you'd have to pass a struct completion 
> > around, and it isn't something particularly hard to get right either. 
> > What do you think?
> 
> We already have the core handling other timeouts.  We don't pass around
> completions but rather have an API function that the driver has to call
> when the operation completes, a similar pattern might work here.  Part
> of the thing with those APIs which I'm missing here is that this will
> just return -EOPNOTSUPP if the driver can't do the delay in hardware, I
> think it would be cleaner if this API were similar and the core dealt
> with doing the delay/poll on the CPU.  That way the users don't need to
> repeat the handling for the offload/non-offload cases.

Makes sense to me.

Patrice CHOTARD April 30, 2021, 2:16 p.m. UTC | #3

Hi Pratyush

On 4/26/21 6:26 PM, Pratyush Yadav wrote:
> Hi,

> 

> On 26/04/21 04:39PM, patrice.chotard@foss.st.com wrote:

>> From: Christophe Kerello <christophe.kerello@foss.st.com>

>>

>> With STM32 QSPI, it is possible to poll the status register of the device.

>> This could be done to offload the CPU during an operation (erase or

>> program a SPI NAND for example).

>>

>> spi_mem_poll_status API has been added to handle this feature.

>>

>> Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

>> Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

>> ---

>>  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

>>  include/linux/spi/spi-mem.h |  8 ++++++++

>>  2 files changed, 42 insertions(+)

>>

>> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

>> index 1513553e4080..43dce4b0efa4 100644

>> --- a/drivers/spi/spi-mem.c

>> +++ b/drivers/spi/spi-mem.c

>> @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

>>  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

>>  }

>>  

>> +/**

>> + * spi_mem_poll_status() - Poll memory device status

>> + * @mem: SPI memory device

>> + * @op: the memory operation to execute

>> + * @mask: status bitmask to ckeck

>> + * @match: status expected value

> 

> Technically, (status & mask) expected value. Dunno if that is obvious 

> enough to not spell out explicitly.


Yes, match = (status & mask)
I will update the comment accordingly.

> 

>> + * @timeout: timeout

>> + *

>> + * This function send a polling status request to the controller driver

>> + *

>> + * Return: 0 in case of success, -ETIMEDOUT in case of error,

>> + *         -EOPNOTSUPP if not supported.

>> + */

>> +int spi_mem_poll_status(struct spi_mem *mem,

>> +			const struct spi_mem_op *op,

>> +			u8 mask, u8 match, u16 timeout)

>> +{

>> +	struct spi_controller *ctlr = mem->spi->controller;

>> +	int ret = -EOPNOTSUPP;

>> +

>> +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> 

> You should call spi_mem_supports_op() before sending any ops to the 

> controller. Invalid/unsupported ops can cause unexpected behavior.


Ok i will add it.

Thanks
Patrice

> 

>> +		ret = spi_mem_access_start(mem);

>> +		if (ret)

>> +			return ret;

>> +

>> +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);

> 

> I wonder if it is better to let spi-mem core take care of the timeout 

> part. On one hand it reduces code duplication on the driver side a 

> little bit. Plus it makes sure drivers don't mess anything up with bad 

> (or no) handling of the timeout. But on the other hand the interface 

> becomes a bit awkward since you'd have to pass a struct completion 

> around, and it isn't something particularly hard to get right either. 

> What do you think?

> 

>> +

>> +		spi_mem_access_end(mem);

>> +	}

>> +

>> +	return ret;

>> +}

>> +EXPORT_SYMBOL_GPL(spi_mem_poll_status);

>> +

>>  static int spi_mem_probe(struct spi_device *spi)

>>  {

>>  	struct spi_mem_driver *memdrv = to_spi_mem_drv(spi->dev.driver);

>> diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h

>> index 2b65c9edc34e..5f78917c0f68 100644

>> --- a/include/linux/spi/spi-mem.h

>> +++ b/include/linux/spi/spi-mem.h

>> @@ -250,6 +250,7 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)

>>   *		  the currently mapped area), and the caller of

>>   *		  spi_mem_dirmap_write() is responsible for calling it again in

>>   *		  this case.

>> + * @poll_status: poll memory device status

>>   *

>>   * This interface should be implemented by SPI controllers providing an

>>   * high-level interface to execute SPI memory operation, which is usually the

>> @@ -274,6 +275,9 @@ struct spi_controller_mem_ops {

>>  			       u64 offs, size_t len, void *buf);

>>  	ssize_t (*dirmap_write)(struct spi_mem_dirmap_desc *desc,

>>  				u64 offs, size_t len, const void *buf);

>> +	int (*poll_status)(struct spi_mem *mem,

>> +			   const struct spi_mem_op *op,

>> +			   u8 mask, u8 match, u16 timeout);

>>  };

>>  

>>  /**

>> @@ -369,6 +373,10 @@ devm_spi_mem_dirmap_create(struct device *dev, struct spi_mem *mem,

>>  void devm_spi_mem_dirmap_destroy(struct device *dev,

>>  				 struct spi_mem_dirmap_desc *desc);

>>  

>> +int spi_mem_poll_status(struct spi_mem *mem,

>> +			const struct spi_mem_op *op,

>> +			u8 mask, u8 match, u16 timeout);

>> +

>>  int spi_mem_driver_register_with_owner(struct spi_mem_driver *drv,

>>  				       struct module *owner);

>

Patrice CHOTARD April 30, 2021, 2:22 p.m. UTC | #4

Hi Mark, Pratyush

On 4/26/21 6:51 PM, Mark Brown wrote:
> On Mon, Apr 26, 2021 at 09:56:12PM +0530, Pratyush Yadav wrote:

>> On 26/04/21 04:39PM, patrice.chotard@foss.st.com wrote:

> 

>>> + * spi_mem_poll_status() - Poll memory device status

>>> + * @mem: SPI memory device

>>> + * @op: the memory operation to execute

>>> + * @mask: status bitmask to ckeck

>>> + * @match: status expected value

> 

>> Technically, (status & mask) expected value. Dunno if that is obvious 

>> enough to not spell out explicitly.

> 

> Is it possible there's some situation where you're waiting for some bits

> to clear as well?

> 

Yes, we are waiting STATUS_BUSY bit to be cleared, see patch 2 which is making 
usage of this API.

>>> +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);

> 

> I'm not sure I like this name since it makes me think the driver is

> going to poll when really it's offloaded to the hardware, but I can't

> think of any better ideas either and it *is* what the hardware is going

> to be doing so meh.

> 

>> I wonder if it is better to let spi-mem core take care of the timeout 

>> part. On one hand it reduces code duplication on the driver side a 

>> little bit. Plus it makes sure drivers don't mess anything up with bad 

>> (or no) handling of the timeout. But on the other hand the interface 

>> becomes a bit awkward since you'd have to pass a struct completion 

>> around, and it isn't something particularly hard to get right either. 

>> What do you think?

> 

> We already have the core handling other timeouts.  We don't pass around

> completions but rather have an API function that the driver has to call

> when the operation completes, a similar pattern might work here.  Part


So, if i correctly understood, you make allusion to what is already done
in SPI core framework with spi_finalize_current_transfer() right ?

> of the thing with those APIs which I'm missing here is that this will

> just return -EOPNOTSUPP if the driver can't do the delay in hardware, I

> think it would be cleaner if this API were similar and the core dealt

> with doing the delay/poll on the CPU.  That way the users don't need to

> repeat the handling for the offload/non-offload cases.


Sorry, i didn't catch what you mean here. In PATCH 2, that's the case,
if spi_mem_poll_status() is not supported, the core is dealing with 
the delay/poll on the CPU in spinand_wait().

Patrice
Thanks


>

Mark Brown April 30, 2021, 3:56 p.m. UTC | #5

On Fri, Apr 30, 2021 at 04:22:34PM +0200, Patrice CHOTARD wrote:
> On 4/26/21 6:51 PM, Mark Brown wrote:

> > On Mon, Apr 26, 2021 at 09:56:12PM +0530, Pratyush Yadav wrote:


> > Is it possible there's some situation where you're waiting for some bits

> > to clear as well?


> Yes, we are waiting STATUS_BUSY bit to be cleared, see patch 2 which is making 

> usage of this API.


Then the inverse question applies - is there no circumstance where we
might be waiting for a bit to be set?

> > We already have the core handling other timeouts.  We don't pass around

> > completions but rather have an API function that the driver has to call

> > when the operation completes, a similar pattern might work here.  Part


> So, if i correctly understood, you make allusion to what is already done

> in SPI core framework with spi_finalize_current_transfer() right ?


Yes, and _current_message().

> > of the thing with those APIs which I'm missing here is that this will

> > just return -EOPNOTSUPP if the driver can't do the delay in hardware, I

> > think it would be cleaner if this API were similar and the core dealt

> > with doing the delay/poll on the CPU.  That way the users don't need to

> > repeat the handling for the offload/non-offload cases.


> Sorry, i didn't catch what you mean here. In PATCH 2, that's the case,

> if spi_mem_poll_status() is not supported, the core is dealing with 

> the delay/poll on the CPU in spinand_wait().


That's in the NAND core, not in spi-mem.  Any other users of spi-mem
will also need to open code stuff.

Boris Brezillon April 30, 2021, 4:51 p.m. UTC | #6

On Mon, 26 Apr 2021 16:39:32 +0200
<patrice.chotard@foss.st.com> wrote:

> From: Christophe Kerello <christophe.kerello@foss.st.com>

> 

> With STM32 QSPI, it is possible to poll the status register of the device.

> This could be done to offload the CPU during an operation (erase or

> program a SPI NAND for example).

> 

> spi_mem_poll_status API has been added to handle this feature.

> 

> Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> ---

>  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

>  include/linux/spi/spi-mem.h |  8 ++++++++

>  2 files changed, 42 insertions(+)

> 

> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> index 1513553e4080..43dce4b0efa4 100644

> --- a/drivers/spi/spi-mem.c

> +++ b/drivers/spi/spi-mem.c

> @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

>  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

>  }

>  

> +/**

> + * spi_mem_poll_status() - Poll memory device status

> + * @mem: SPI memory device

> + * @op: the memory operation to execute

> + * @mask: status bitmask to ckeck

> + * @match: status expected value

> + * @timeout: timeout

> + *

> + * This function send a polling status request to the controller driver

> + *

> + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> + *         -EOPNOTSUPP if not supported.

> + */

> +int spi_mem_poll_status(struct spi_mem *mem,

> +			const struct spi_mem_op *op,

> +			u8 mask, u8 match, u16 timeout)

> +{

> +	struct spi_controller *ctlr = mem->spi->controller;

> +	int ret = -EOPNOTSUPP;

> +

> +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> +		ret = spi_mem_access_start(mem);


You should probably check that op is a single byte read before
accepting the command.

> +		if (ret)

> +			return ret;

> +

> +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);


You also need some sort of ->poll_status_is_supported() to validate
that the controller supports the status polling for this specific op (I
can imagine some controllers having a limit on the number of dummy
cycles/address bytes). I guess you could just fall back on SW-based
status polling if ctlr->mem_ops->poll_status() returns -ENOTSUPP.

> +

> +		spi_mem_access_end(mem);

> +	}

> +

> +	return ret;

> +}

> +EXPORT_SYMBOL_GPL(spi_mem_poll_status);

> +

>  static int spi_mem_probe(struct spi_device *spi)

>  {

>  	struct spi_mem_driver *memdrv = to_spi_mem_drv(spi->dev.driver);

> diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h

> index 2b65c9edc34e..5f78917c0f68 100644

> --- a/include/linux/spi/spi-mem.h

> +++ b/include/linux/spi/spi-mem.h

> @@ -250,6 +250,7 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)

>   *		  the currently mapped area), and the caller of

>   *		  spi_mem_dirmap_write() is responsible for calling it again in

>   *		  this case.

> + * @poll_status: poll memory device status

>   *

>   * This interface should be implemented by SPI controllers providing an

>   * high-level interface to execute SPI memory operation, which is usually the

> @@ -274,6 +275,9 @@ struct spi_controller_mem_ops {

>  			       u64 offs, size_t len, void *buf);

>  	ssize_t (*dirmap_write)(struct spi_mem_dirmap_desc *desc,

>  				u64 offs, size_t len, const void *buf);

> +	int (*poll_status)(struct spi_mem *mem,

> +			   const struct spi_mem_op *op,

> +			   u8 mask, u8 match, u16 timeout);

>  };

>  

>  /**

> @@ -369,6 +373,10 @@ devm_spi_mem_dirmap_create(struct device *dev, struct spi_mem *mem,

>  void devm_spi_mem_dirmap_destroy(struct device *dev,

>  				 struct spi_mem_dirmap_desc *desc);

>  

> +int spi_mem_poll_status(struct spi_mem *mem,

> +			const struct spi_mem_op *op,

> +			u8 mask, u8 match, u16 timeout);

> +

>  int spi_mem_driver_register_with_owner(struct spi_mem_driver *drv,

>  				       struct module *owner);

>

Boris Brezillon April 30, 2021, 4:52 p.m. UTC | #7

On Mon, 26 Apr 2021 16:39:32 +0200
<patrice.chotard@foss.st.com> wrote:

> +/**

> + * spi_mem_poll_status() - Poll memory device status

> + * @mem: SPI memory device

> + * @op: the memory operation to execute

> + * @mask: status bitmask to ckeck

> + * @match: status expected value

> + * @timeout: timeout


What's the unit?

Boris Brezillon April 30, 2021, 4:53 p.m. UTC | #8

On Mon, 26 Apr 2021 16:39:34 +0200
<patrice.chotard@foss.st.com> wrote:

> From: Christophe Kerello <christophe.kerello@foss.st.com>

> 

> STM32 QSPI is able to automatically poll a specified register inside the

> memory and relieve the CPU from this task.

> 

> As example, when erasing a large memory area, we got cpu load

> equal to 50%.


Ouch! I feel like this should be fixed at the core level too so we
don't poll the status so frequently, or at least wait a bit before
polling the status (especially for erase/write operations).

> This patch allows to perform the same operation

> with a cpu load around 2%.

> 

> Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> ---

>  drivers/spi/spi-stm32-qspi.c | 80 ++++++++++++++++++++++++++++++++----

>  1 file changed, 72 insertions(+), 8 deletions(-)

> 

> diff --git a/drivers/spi/spi-stm32-qspi.c b/drivers/spi/spi-stm32-qspi.c

> index 7e640ccc7e77..2f9e5941f14f 100644

> --- a/drivers/spi/spi-stm32-qspi.c

> +++ b/drivers/spi/spi-stm32-qspi.c

> @@ -36,6 +36,7 @@

>  #define CR_FTIE			BIT(18)

>  #define CR_SMIE			BIT(19)

>  #define CR_TOIE			BIT(20)

> +#define CR_APMS			BIT(22)

>  #define CR_PRESC_MASK		GENMASK(31, 24)

>  

>  #define QSPI_DCR		0x04

> @@ -53,6 +54,7 @@

>  #define QSPI_FCR		0x0c

>  #define FCR_CTEF		BIT(0)

>  #define FCR_CTCF		BIT(1)

> +#define FCR_CSMF		BIT(3)

>  

>  #define QSPI_DLR		0x10

>  

> @@ -107,6 +109,7 @@ struct stm32_qspi {

>  	u32 clk_rate;

>  	struct stm32_qspi_flash flash[STM32_QSPI_MAX_NORCHIP];

>  	struct completion data_completion;

> +	struct completion match_completion;

>  	u32 fmode;

>  

>  	struct dma_chan *dma_chtx;

> @@ -115,6 +118,7 @@ struct stm32_qspi {

>  

>  	u32 cr_reg;

>  	u32 dcr_reg;

> +	u16 status_timeout;

>  

>  	/*

>  	 * to protect device configuration, could be different between

> @@ -128,11 +132,20 @@ static irqreturn_t stm32_qspi_irq(int irq, void *dev_id)

>  	struct stm32_qspi *qspi = (struct stm32_qspi *)dev_id;

>  	u32 cr, sr;

>  

> +	cr = readl_relaxed(qspi->io_base + QSPI_CR);

>  	sr = readl_relaxed(qspi->io_base + QSPI_SR);

>  

> +	if (cr & CR_SMIE && sr & SR_SMF) {

> +		/* disable irq */

> +		cr &= ~CR_SMIE;

> +		writel_relaxed(cr, qspi->io_base + QSPI_CR);

> +		complete(&qspi->match_completion);

> +

> +		return IRQ_HANDLED;

> +	}

> +

>  	if (sr & (SR_TEF | SR_TCF)) {

>  		/* disable irq */

> -		cr = readl_relaxed(qspi->io_base + QSPI_CR);

>  		cr &= ~CR_TCIE & ~CR_TEIE;

>  		writel_relaxed(cr, qspi->io_base + QSPI_CR);

>  		complete(&qspi->data_completion);

> @@ -319,6 +332,24 @@ static int stm32_qspi_wait_cmd(struct stm32_qspi *qspi,

>  	return err;

>  }

>  

> +static int stm32_qspi_wait_poll_status(struct stm32_qspi *qspi,

> +				       const struct spi_mem_op *op)

> +{

> +	u32 cr;

> +

> +	reinit_completion(&qspi->match_completion);

> +	cr = readl_relaxed(qspi->io_base + QSPI_CR);

> +	writel_relaxed(cr | CR_SMIE, qspi->io_base + QSPI_CR);

> +

> +	if (!wait_for_completion_timeout(&qspi->match_completion,

> +				msecs_to_jiffies(qspi->status_timeout)))

> +		return -ETIMEDOUT;

> +

> +	writel_relaxed(FCR_CSMF, qspi->io_base + QSPI_FCR);

> +

> +	return 0;

> +}

> +

>  static int stm32_qspi_get_mode(struct stm32_qspi *qspi, u8 buswidth)

>  {

>  	if (buswidth == 4)

> @@ -332,7 +363,7 @@ static int stm32_qspi_send(struct spi_mem *mem, const struct spi_mem_op *op)

>  	struct stm32_qspi *qspi = spi_controller_get_devdata(mem->spi->master);

>  	struct stm32_qspi_flash *flash = &qspi->flash[mem->spi->chip_select];

>  	u32 ccr, cr;

> -	int timeout, err = 0;

> +	int timeout, err = 0, err_poll_status = 0;

>  

>  	dev_dbg(qspi->dev, "cmd:%#x mode:%d.%d.%d.%d addr:%#llx len:%#x\n",

>  		op->cmd.opcode, op->cmd.buswidth, op->addr.buswidth,

> @@ -378,6 +409,9 @@ static int stm32_qspi_send(struct spi_mem *mem, const struct spi_mem_op *op)

>  	if (op->addr.nbytes && qspi->fmode != CCR_FMODE_MM)

>  		writel_relaxed(op->addr.val, qspi->io_base + QSPI_AR);

>  

> +	if (qspi->fmode == CCR_FMODE_APM)

> +		err_poll_status = stm32_qspi_wait_poll_status(qspi, op);

> +

>  	err = stm32_qspi_tx(qspi, op);

>  

>  	/*

> @@ -387,7 +421,7 @@ static int stm32_qspi_send(struct spi_mem *mem, const struct spi_mem_op *op)

>  	 *  byte of device (device size - fifo size). like device size is not

>  	 *  knows, the prefetching is always stop.

>  	 */

> -	if (err || qspi->fmode == CCR_FMODE_MM)

> +	if (err || err_poll_status || qspi->fmode == CCR_FMODE_MM)

>  		goto abort;

>  

>  	/* wait end of tx in indirect mode */

> @@ -406,15 +440,43 @@ static int stm32_qspi_send(struct spi_mem *mem, const struct spi_mem_op *op)

>  						    cr, !(cr & CR_ABORT), 1,

>  						    STM32_ABT_TIMEOUT_US);

>  

> -	writel_relaxed(FCR_CTCF, qspi->io_base + QSPI_FCR);

> +	writel_relaxed(FCR_CTCF | FCR_CSMF, qspi->io_base + QSPI_FCR);

>  

> -	if (err || timeout)

> -		dev_err(qspi->dev, "%s err:%d abort timeout:%d\n",

> -			__func__, err, timeout);

> +	if (err || err_poll_status || timeout)

> +		dev_err(qspi->dev, "%s err:%d err_poll_status:%d abort timeout:%d\n",

> +			__func__, err, err_poll_status, timeout);

>  

>  	return err;

>  }

>  

> +static int stm32_qspi_poll_status(struct spi_mem *mem, const struct spi_mem_op *op,

> +				  u8 mask, u8 match, u16 timeout_ms)

> +{

> +	struct stm32_qspi *qspi = spi_controller_get_devdata(mem->spi->master);

> +	int ret;

> +

> +	ret = pm_runtime_get_sync(qspi->dev);

> +	if (ret < 0) {

> +		pm_runtime_put_noidle(qspi->dev);

> +		return ret;

> +	}

> +

> +	mutex_lock(&qspi->lock);

> +

> +	writel_relaxed(mask, qspi->io_base + QSPI_PSMKR);

> +	writel_relaxed(match, qspi->io_base + QSPI_PSMAR);

> +	qspi->fmode = CCR_FMODE_APM;

> +	qspi->status_timeout = timeout_ms;

> +

> +	ret = stm32_qspi_send(mem, op);

> +	mutex_unlock(&qspi->lock);

> +

> +	pm_runtime_mark_last_busy(qspi->dev);

> +	pm_runtime_put_autosuspend(qspi->dev);

> +

> +	return ret;

> +}

> +

>  static int stm32_qspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)

>  {

>  	struct stm32_qspi *qspi = spi_controller_get_devdata(mem->spi->master);

> @@ -527,7 +589,7 @@ static int stm32_qspi_setup(struct spi_device *spi)

>  	flash->presc = presc;

>  

>  	mutex_lock(&qspi->lock);

> -	qspi->cr_reg = 3 << CR_FTHRES_SHIFT | CR_SSHIFT | CR_EN;

> +	qspi->cr_reg = CR_APMS | 3 << CR_FTHRES_SHIFT | CR_SSHIFT | CR_EN;

>  	writel_relaxed(qspi->cr_reg, qspi->io_base + QSPI_CR);

>  

>  	/* set dcr fsize to max address */

> @@ -607,6 +669,7 @@ static const struct spi_controller_mem_ops stm32_qspi_mem_ops = {

>  	.exec_op	= stm32_qspi_exec_op,

>  	.dirmap_create	= stm32_qspi_dirmap_create,

>  	.dirmap_read	= stm32_qspi_dirmap_read,

> +	.poll_status	= stm32_qspi_poll_status,

>  };

>  

>  static int stm32_qspi_probe(struct platform_device *pdev)

> @@ -661,6 +724,7 @@ static int stm32_qspi_probe(struct platform_device *pdev)

>  	}

>  

>  	init_completion(&qspi->data_completion);

> +	init_completion(&qspi->match_completion);

>  

>  	qspi->clk = devm_clk_get(dev, NULL);

>  	if (IS_ERR(qspi->clk)) {

Pratyush Yadav May 3, 2021, 8:47 a.m. UTC | #9

On 30/04/21 06:51PM, Boris Brezillon wrote:
> On Mon, 26 Apr 2021 16:39:32 +0200

> <patrice.chotard@foss.st.com> wrote:

> 

> > From: Christophe Kerello <christophe.kerello@foss.st.com>

> > 

> > With STM32 QSPI, it is possible to poll the status register of the device.

> > This could be done to offload the CPU during an operation (erase or

> > program a SPI NAND for example).

> > 

> > spi_mem_poll_status API has been added to handle this feature.

> > 

> > Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> > Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> > ---

> >  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

> >  include/linux/spi/spi-mem.h |  8 ++++++++

> >  2 files changed, 42 insertions(+)

> > 

> > diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> > index 1513553e4080..43dce4b0efa4 100644

> > --- a/drivers/spi/spi-mem.c

> > +++ b/drivers/spi/spi-mem.c

> > @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

> >  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

> >  }

> >  

> > +/**

> > + * spi_mem_poll_status() - Poll memory device status

> > + * @mem: SPI memory device

> > + * @op: the memory operation to execute

> > + * @mask: status bitmask to ckeck

> > + * @match: status expected value

> > + * @timeout: timeout

> > + *

> > + * This function send a polling status request to the controller driver

> > + *

> > + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> > + *         -EOPNOTSUPP if not supported.

> > + */

> > +int spi_mem_poll_status(struct spi_mem *mem,

> > +			const struct spi_mem_op *op,

> > +			u8 mask, u8 match, u16 timeout)

> > +{

> > +	struct spi_controller *ctlr = mem->spi->controller;

> > +	int ret = -EOPNOTSUPP;

> > +

> > +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> > +		ret = spi_mem_access_start(mem);

> 

> You should probably check that op is a single byte read before

> accepting the command.


Please do not discriminate against 8D-8D-8D flashes ;-)

> 

> > +		if (ret)

> > +			return ret;

> > +

> > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);

> 

> You also need some sort of ->poll_status_is_supported() to validate

> that the controller supports the status polling for this specific op (I


I don't think a separate function is needed for checking if the poll 
status op is supported. Return value of -EOPNOTSUPP should be able to 
signal that. This can also be used to check if Octal DDR capable 
controllers are able to poll using 2-byte reads.

> can imagine some controllers having a limit on the number of dummy

> cycles/address bytes). I guess you could just fall back on SW-based

> status polling if ctlr->mem_ops->poll_status() returns -ENOTSUPP.

> 

> > +

> > +		spi_mem_access_end(mem);

> > +	}

> > +

> > +	return ret;

> > +}

> > +EXPORT_SYMBOL_GPL(spi_mem_poll_status);

> > +

> >  static int spi_mem_probe(struct spi_device *spi)

> >  {

> >  	struct spi_mem_driver *memdrv = to_spi_mem_drv(spi->dev.driver);

> > diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h

> > index 2b65c9edc34e..5f78917c0f68 100644

> > --- a/include/linux/spi/spi-mem.h

> > +++ b/include/linux/spi/spi-mem.h

> > @@ -250,6 +250,7 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)

> >   *		  the currently mapped area), and the caller of

> >   *		  spi_mem_dirmap_write() is responsible for calling it again in

> >   *		  this case.

> > + * @poll_status: poll memory device status

> >   *

> >   * This interface should be implemented by SPI controllers providing an

> >   * high-level interface to execute SPI memory operation, which is usually the

> > @@ -274,6 +275,9 @@ struct spi_controller_mem_ops {

> >  			       u64 offs, size_t len, void *buf);

> >  	ssize_t (*dirmap_write)(struct spi_mem_dirmap_desc *desc,

> >  				u64 offs, size_t len, const void *buf);

> > +	int (*poll_status)(struct spi_mem *mem,

> > +			   const struct spi_mem_op *op,

> > +			   u8 mask, u8 match, u16 timeout);

> >  };

> >  

> >  /**

> > @@ -369,6 +373,10 @@ devm_spi_mem_dirmap_create(struct device *dev, struct spi_mem *mem,

> >  void devm_spi_mem_dirmap_destroy(struct device *dev,

> >  				 struct spi_mem_dirmap_desc *desc);

> >  

> > +int spi_mem_poll_status(struct spi_mem *mem,

> > +			const struct spi_mem_op *op,

> > +			u8 mask, u8 match, u16 timeout);

> > +

> >  int spi_mem_driver_register_with_owner(struct spi_mem_driver *drv,

> >  				       struct module *owner);

> >  


-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

Boris Brezillon May 3, 2021, 9:11 a.m. UTC | #10

On Mon, 3 May 2021 14:17:44 +0530
Pratyush Yadav <p.yadav@ti.com> wrote:

> On 30/04/21 06:51PM, Boris Brezillon wrote:

> > On Mon, 26 Apr 2021 16:39:32 +0200

> > <patrice.chotard@foss.st.com> wrote:

> >   

> > > From: Christophe Kerello <christophe.kerello@foss.st.com>

> > > 

> > > With STM32 QSPI, it is possible to poll the status register of the device.

> > > This could be done to offload the CPU during an operation (erase or

> > > program a SPI NAND for example).

> > > 

> > > spi_mem_poll_status API has been added to handle this feature.

> > > 

> > > Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> > > Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> > > ---

> > >  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

> > >  include/linux/spi/spi-mem.h |  8 ++++++++

> > >  2 files changed, 42 insertions(+)

> > > 

> > > diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> > > index 1513553e4080..43dce4b0efa4 100644

> > > --- a/drivers/spi/spi-mem.c

> > > +++ b/drivers/spi/spi-mem.c

> > > @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

> > >  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

> > >  }

> > >  

> > > +/**

> > > + * spi_mem_poll_status() - Poll memory device status

> > > + * @mem: SPI memory device

> > > + * @op: the memory operation to execute

> > > + * @mask: status bitmask to ckeck

> > > + * @match: status expected value

> > > + * @timeout: timeout

> > > + *

> > > + * This function send a polling status request to the controller driver

> > > + *

> > > + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> > > + *         -EOPNOTSUPP if not supported.

> > > + */

> > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > +			const struct spi_mem_op *op,

> > > +			u8 mask, u8 match, u16 timeout)

> > > +{

> > > +	struct spi_controller *ctlr = mem->spi->controller;

> > > +	int ret = -EOPNOTSUPP;

> > > +

> > > +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> > > +		ret = spi_mem_access_start(mem);  

> > 

> > You should probably check that op is a single byte read before

> > accepting the command.  

> 

> Please do not discriminate against 8D-8D-8D flashes ;-).


Then mask and match should probably be u16 :P. And the check as it is
seems a bit lax to me. Drivers will of course be able to reject the op
when there's more than one byte (or 16bit word in case of 8D) to read,
but it feels like the core could automate that a bit.

> 

> >   

> > > +		if (ret)

> > > +			return ret;

> > > +

> > > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);  

> > 

> > You also need some sort of ->poll_status_is_supported() to validate

> > that the controller supports the status polling for this specific op (I  

> 

> I don't think a separate function is needed for checking if the poll 

> status op is supported. Return value of -EOPNOTSUPP should be able to 

> signal that. This can also be used to check if Octal DDR capable 

> controllers are able to poll using 2-byte reads.


Yeah, I had something more complex in mind to avoid doing this 'try
native mode and fall back on sw-based more if not supported' dance
every time a status poll is requested (something similar to what we do
for dirmaps, with a status poll desc), but I guess that's a bit
premature (and probably uneeded).

> 

> > can imagine some controllers having a limit on the number of dummy

> > cycles/address bytes). I guess you could just fall back on SW-based

> > status polling if ctlr->mem_ops->poll_status() returns -ENOTSUPP.

> >   

> > > +

> > > +		spi_mem_access_end(mem);

> > > +	}

> > > +

> > > +	return ret;

> > > +}

> > > +EXPORT_SYMBOL_GPL(spi_mem_poll_status);

> > > +

> > >  static int spi_mem_probe(struct spi_device *spi)

> > >  {

> > >  	struct spi_mem_driver *memdrv = to_spi_mem_drv(spi->dev.driver);

> > > diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h

> > > index 2b65c9edc34e..5f78917c0f68 100644

> > > --- a/include/linux/spi/spi-mem.h

> > > +++ b/include/linux/spi/spi-mem.h

> > > @@ -250,6 +250,7 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)

> > >   *		  the currently mapped area), and the caller of

> > >   *		  spi_mem_dirmap_write() is responsible for calling it again in

> > >   *		  this case.

> > > + * @poll_status: poll memory device status

> > >   *

> > >   * This interface should be implemented by SPI controllers providing an

> > >   * high-level interface to execute SPI memory operation, which is usually the

> > > @@ -274,6 +275,9 @@ struct spi_controller_mem_ops {

> > >  			       u64 offs, size_t len, void *buf);

> > >  	ssize_t (*dirmap_write)(struct spi_mem_dirmap_desc *desc,

> > >  				u64 offs, size_t len, const void *buf);

> > > +	int (*poll_status)(struct spi_mem *mem,

> > > +			   const struct spi_mem_op *op,

> > > +			   u8 mask, u8 match, u16 timeout);

> > >  };

> > >  

> > >  /**

> > > @@ -369,6 +373,10 @@ devm_spi_mem_dirmap_create(struct device *dev, struct spi_mem *mem,

> > >  void devm_spi_mem_dirmap_destroy(struct device *dev,

> > >  				 struct spi_mem_dirmap_desc *desc);

> > >  

> > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > +			const struct spi_mem_op *op,

> > > +			u8 mask, u8 match, u16 timeout);

> > > +

> > >  int spi_mem_driver_register_with_owner(struct spi_mem_driver *drv,

> > >  				       struct module *owner);

> > >    

>

Pratyush Yadav May 3, 2021, 9:29 a.m. UTC | #11

On 03/05/21 11:11AM, Boris Brezillon wrote:
> On Mon, 3 May 2021 14:17:44 +0530

> Pratyush Yadav <p.yadav@ti.com> wrote:

> 

> > On 30/04/21 06:51PM, Boris Brezillon wrote:

> > > On Mon, 26 Apr 2021 16:39:32 +0200

> > > <patrice.chotard@foss.st.com> wrote:

> > >   

> > > > From: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > 

> > > > With STM32 QSPI, it is possible to poll the status register of the device.

> > > > This could be done to offload the CPU during an operation (erase or

> > > > program a SPI NAND for example).

> > > > 

> > > > spi_mem_poll_status API has been added to handle this feature.

> > > > 

> > > > Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> > > > ---

> > > >  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

> > > >  include/linux/spi/spi-mem.h |  8 ++++++++

> > > >  2 files changed, 42 insertions(+)

> > > > 

> > > > diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> > > > index 1513553e4080..43dce4b0efa4 100644

> > > > --- a/drivers/spi/spi-mem.c

> > > > +++ b/drivers/spi/spi-mem.c

> > > > @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

> > > >  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

> > > >  }

> > > >  

> > > > +/**

> > > > + * spi_mem_poll_status() - Poll memory device status

> > > > + * @mem: SPI memory device

> > > > + * @op: the memory operation to execute

> > > > + * @mask: status bitmask to ckeck

> > > > + * @match: status expected value

> > > > + * @timeout: timeout

> > > > + *

> > > > + * This function send a polling status request to the controller driver

> > > > + *

> > > > + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> > > > + *         -EOPNOTSUPP if not supported.

> > > > + */

> > > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > > +			const struct spi_mem_op *op,

> > > > +			u8 mask, u8 match, u16 timeout)

> > > > +{

> > > > +	struct spi_controller *ctlr = mem->spi->controller;

> > > > +	int ret = -EOPNOTSUPP;

> > > > +

> > > > +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> > > > +		ret = spi_mem_access_start(mem);  

> > > 

> > > You should probably check that op is a single byte read before

> > > accepting the command.  

> > 

> > Please do not discriminate against 8D-8D-8D flashes ;-).

> 

> Then mask and match should probably be u16 :P. And the check as it is

> seems a bit lax to me. Drivers will of course be able to reject the op

> when there's more than one byte (or 16bit word in case of 8D) to read,

> but it feels like the core could automate that a bit.


The two 8D flashes that are currently supported in SPI NOR both have a 
1-byte status register. But to read it, the read op should be 2-byte 
long to avoid partial cycles at the end. The second byte is simply 
discarded.

2-byte wide registers might show up in the future, but for now at least 
we don't have to worry about them.

> 

> > 

> > >   

> > > > +		if (ret)

> > > > +			return ret;

> > > > +

> > > > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);  

> > > 

> > > You also need some sort of ->poll_status_is_supported() to validate

> > > that the controller supports the status polling for this specific op (I  

> > 

> > I don't think a separate function is needed for checking if the poll 

> > status op is supported. Return value of -EOPNOTSUPP should be able to 

> > signal that. This can also be used to check if Octal DDR capable 

> > controllers are able to poll using 2-byte reads.

> 

> Yeah, I had something more complex in mind to avoid doing this 'try

> native mode and fall back on sw-based more if not supported' dance

> every time a status poll is requested (something similar to what we do

> for dirmaps, with a status poll desc), but I guess that's a bit

> premature (and probably uneeded).


I think Mark also suggested something similar. Make the CPU/non-CPU case 
transparent to the caller. I agree with with this direction. Makes the 
caller simpler.

I also mentioned in a reply to this patch that supports_op() should be 
called before the op is executed. That should take care of "base" 
support for the op. The poll-specific checks can go in the poll_status() 
function itself. If either of those say the op is not supported, it 
should fall back to CPU based polling. That's the design that makes the 
most sense to me.

> 

> > 

> > > can imagine some controllers having a limit on the number of dummy

> > > cycles/address bytes). I guess you could just fall back on SW-based

> > > status polling if ctlr->mem_ops->poll_status() returns -ENOTSUPP.

> > >   

> > > > +

> > > > +		spi_mem_access_end(mem);

> > > > +	}

> > > > +

> > > > +	return ret;

> > > > +}

> > > > +EXPORT_SYMBOL_GPL(spi_mem_poll_status);

> > > > +

> > > >  static int spi_mem_probe(struct spi_device *spi)

> > > >  {

> > > >  	struct spi_mem_driver *memdrv = to_spi_mem_drv(spi->dev.driver);

> > > > diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h

> > > > index 2b65c9edc34e..5f78917c0f68 100644

> > > > --- a/include/linux/spi/spi-mem.h

> > > > +++ b/include/linux/spi/spi-mem.h

> > > > @@ -250,6 +250,7 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)

> > > >   *		  the currently mapped area), and the caller of

> > > >   *		  spi_mem_dirmap_write() is responsible for calling it again in

> > > >   *		  this case.

> > > > + * @poll_status: poll memory device status

> > > >   *

> > > >   * This interface should be implemented by SPI controllers providing an

> > > >   * high-level interface to execute SPI memory operation, which is usually the

> > > > @@ -274,6 +275,9 @@ struct spi_controller_mem_ops {

> > > >  			       u64 offs, size_t len, void *buf);

> > > >  	ssize_t (*dirmap_write)(struct spi_mem_dirmap_desc *desc,

> > > >  				u64 offs, size_t len, const void *buf);

> > > > +	int (*poll_status)(struct spi_mem *mem,

> > > > +			   const struct spi_mem_op *op,

> > > > +			   u8 mask, u8 match, u16 timeout);

> > > >  };

> > > >  

> > > >  /**

> > > > @@ -369,6 +373,10 @@ devm_spi_mem_dirmap_create(struct device *dev, struct spi_mem *mem,

> > > >  void devm_spi_mem_dirmap_destroy(struct device *dev,

> > > >  				 struct spi_mem_dirmap_desc *desc);

> > > >  

> > > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > > +			const struct spi_mem_op *op,

> > > > +			u8 mask, u8 match, u16 timeout);

> > > > +

> > > >  int spi_mem_driver_register_with_owner(struct spi_mem_driver *drv,

> > > >  				       struct module *owner);

> > > >    

> > 

> 


-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

Boris Brezillon May 3, 2021, 9:52 a.m. UTC | #12

On Mon, 3 May 2021 14:59:37 +0530
Pratyush Yadav <p.yadav@ti.com> wrote:

> On 03/05/21 11:11AM, Boris Brezillon wrote:

> > On Mon, 3 May 2021 14:17:44 +0530

> > Pratyush Yadav <p.yadav@ti.com> wrote:

> >   

> > > On 30/04/21 06:51PM, Boris Brezillon wrote:  

> > > > On Mon, 26 Apr 2021 16:39:32 +0200

> > > > <patrice.chotard@foss.st.com> wrote:

> > > >     

> > > > > From: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > > 

> > > > > With STM32 QSPI, it is possible to poll the status register of the device.

> > > > > This could be done to offload the CPU during an operation (erase or

> > > > > program a SPI NAND for example).

> > > > > 

> > > > > spi_mem_poll_status API has been added to handle this feature.

> > > > > 

> > > > > Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > > Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> > > > > ---

> > > > >  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

> > > > >  include/linux/spi/spi-mem.h |  8 ++++++++

> > > > >  2 files changed, 42 insertions(+)

> > > > > 

> > > > > diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> > > > > index 1513553e4080..43dce4b0efa4 100644

> > > > > --- a/drivers/spi/spi-mem.c

> > > > > +++ b/drivers/spi/spi-mem.c

> > > > > @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

> > > > >  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

> > > > >  }

> > > > >  

> > > > > +/**

> > > > > + * spi_mem_poll_status() - Poll memory device status

> > > > > + * @mem: SPI memory device

> > > > > + * @op: the memory operation to execute

> > > > > + * @mask: status bitmask to ckeck

> > > > > + * @match: status expected value

> > > > > + * @timeout: timeout

> > > > > + *

> > > > > + * This function send a polling status request to the controller driver

> > > > > + *

> > > > > + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> > > > > + *         -EOPNOTSUPP if not supported.

> > > > > + */

> > > > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > > > +			const struct spi_mem_op *op,

> > > > > +			u8 mask, u8 match, u16 timeout)

> > > > > +{

> > > > > +	struct spi_controller *ctlr = mem->spi->controller;

> > > > > +	int ret = -EOPNOTSUPP;

> > > > > +

> > > > > +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> > > > > +		ret = spi_mem_access_start(mem);    

> > > > 

> > > > You should probably check that op is a single byte read before

> > > > accepting the command.    

> > > 

> > > Please do not discriminate against 8D-8D-8D flashes ;-).  

> > 

> > Then mask and match should probably be u16 :P. And the check as it is

> > seems a bit lax to me. Drivers will of course be able to reject the op

> > when there's more than one byte (or 16bit word in case of 8D) to read,

> > but it feels like the core could automate that a bit.  

> 

> The two 8D flashes that are currently supported in SPI NOR both have a 

> 1-byte status register. But to read it, the read op should be 2-byte 

> long to avoid partial cycles at the end. The second byte is simply 

> discarded.

> 

> 2-byte wide registers might show up in the future, but for now at least 

> we don't have to worry about them.


Well, I guess it doesn't hurt to take it into account now. I mean,
what's happening on the bus in that case is a 2byte transfer, with the
second byte being ignored, which you can describe with a 16bit mask
of 0xMM00 (assuming big endian transfers here, as done for other ops).

> 

> >   

> > >   

> > > >     

> > > > > +		if (ret)

> > > > > +			return ret;

> > > > > +

> > > > > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);    

> > > > 

> > > > You also need some sort of ->poll_status_is_supported() to validate

> > > > that the controller supports the status polling for this specific op (I    

> > > 

> > > I don't think a separate function is needed for checking if the poll 

> > > status op is supported. Return value of -EOPNOTSUPP should be able to 

> > > signal that. This can also be used to check if Octal DDR capable 

> > > controllers are able to poll using 2-byte reads.  

> > 

> > Yeah, I had something more complex in mind to avoid doing this 'try

> > native mode and fall back on sw-based more if not supported' dance

> > every time a status poll is requested (something similar to what we do

> > for dirmaps, with a status poll desc), but I guess that's a bit

> > premature (and probably uneeded).  

> 

> I think Mark also suggested something similar. Make the CPU/non-CPU case 

> transparent to the caller. I agree with with this direction. Makes the 

> caller simpler.


It's kind of orthogonal to what I was suggesting, but yes, that's
definitely a good idea. We certainly don't want the spi-nor layer to
open code the same logic if the spi-mem layer can do it for us.

> 

> I also mentioned in a reply to this patch that supports_op() should be 

> called before the op is executed. That should take care of "base" 

> support for the op. The poll-specific checks can go in the poll_status() 

> function itself. If either of those say the op is not supported, it 

> should fall back to CPU based polling. That's the design that makes the 

> most sense to me.


What I had in mind was more:

1/ create a poll desc with spi_mem_create_poll_status_desc(). The
   "operation supported" check is done here. The controller can store
   all its HW-specific state in there. If the operation is not natively
   supported, a SW-based poll descriptor (similar to the SW-based
   dirmap) is created
2/ poll the status with spi_mem_poll_status(). This function is passed
   a poll descriptor which helps select the path that should be taken
   without having to check every time whether the hardware supports a
   specific status polling op. I can also imagine some preparation
   being done during the desc creation if that makes sense (preparing
   reg values to be written when a status poll request is issued for
   instance)

Anyway, as I said, this sort of optimization might be a bit premature.

Pratyush Yadav May 4, 2021, 7:46 a.m. UTC | #13

On 03/05/21 11:52AM, Boris Brezillon wrote:
> On Mon, 3 May 2021 14:59:37 +0530

> Pratyush Yadav <p.yadav@ti.com> wrote:

> 

> > On 03/05/21 11:11AM, Boris Brezillon wrote:

> > > On Mon, 3 May 2021 14:17:44 +0530

> > > Pratyush Yadav <p.yadav@ti.com> wrote:

> > >   

> > > > On 30/04/21 06:51PM, Boris Brezillon wrote:  

> > > > > On Mon, 26 Apr 2021 16:39:32 +0200

> > > > > <patrice.chotard@foss.st.com> wrote:

> > > > >     

> > > > > > From: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > > > 

> > > > > > With STM32 QSPI, it is possible to poll the status register of the device.

> > > > > > This could be done to offload the CPU during an operation (erase or

> > > > > > program a SPI NAND for example).

> > > > > > 

> > > > > > spi_mem_poll_status API has been added to handle this feature.

> > > > > > 

> > > > > > Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

> > > > > > Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

> > > > > > ---

> > > > > >  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

> > > > > >  include/linux/spi/spi-mem.h |  8 ++++++++

> > > > > >  2 files changed, 42 insertions(+)

> > > > > > 

> > > > > > diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

> > > > > > index 1513553e4080..43dce4b0efa4 100644

> > > > > > --- a/drivers/spi/spi-mem.c

> > > > > > +++ b/drivers/spi/spi-mem.c

> > > > > > @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

> > > > > >  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

> > > > > >  }

> > > > > >  

> > > > > > +/**

> > > > > > + * spi_mem_poll_status() - Poll memory device status

> > > > > > + * @mem: SPI memory device

> > > > > > + * @op: the memory operation to execute

> > > > > > + * @mask: status bitmask to ckeck

> > > > > > + * @match: status expected value

> > > > > > + * @timeout: timeout

> > > > > > + *

> > > > > > + * This function send a polling status request to the controller driver

> > > > > > + *

> > > > > > + * Return: 0 in case of success, -ETIMEDOUT in case of error,

> > > > > > + *         -EOPNOTSUPP if not supported.

> > > > > > + */

> > > > > > +int spi_mem_poll_status(struct spi_mem *mem,

> > > > > > +			const struct spi_mem_op *op,

> > > > > > +			u8 mask, u8 match, u16 timeout)

> > > > > > +{

> > > > > > +	struct spi_controller *ctlr = mem->spi->controller;

> > > > > > +	int ret = -EOPNOTSUPP;

> > > > > > +

> > > > > > +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

> > > > > > +		ret = spi_mem_access_start(mem);    

> > > > > 

> > > > > You should probably check that op is a single byte read before

> > > > > accepting the command.    

> > > > 

> > > > Please do not discriminate against 8D-8D-8D flashes ;-).  

> > > 

> > > Then mask and match should probably be u16 :P. And the check as it is

> > > seems a bit lax to me. Drivers will of course be able to reject the op

> > > when there's more than one byte (or 16bit word in case of 8D) to read,

> > > but it feels like the core could automate that a bit.  

> > 

> > The two 8D flashes that are currently supported in SPI NOR both have a 

> > 1-byte status register. But to read it, the read op should be 2-byte 

> > long to avoid partial cycles at the end. The second byte is simply 

> > discarded.

> > 

> > 2-byte wide registers might show up in the future, but for now at least 

> > we don't have to worry about them.

> 

> Well, I guess it doesn't hurt to take it into account now. I mean,

> what's happening on the bus in that case is a 2byte transfer, with the

> second byte being ignored, which you can describe with a 16bit mask

> of 0xMM00 (assuming big endian transfers here, as done for other ops).


Makes sense.

> 

> > 

> > >   

> > > >   

> > > > >     

> > > > > > +		if (ret)

> > > > > > +			return ret;

> > > > > > +

> > > > > > +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);    

> > > > > 

> > > > > You also need some sort of ->poll_status_is_supported() to validate

> > > > > that the controller supports the status polling for this specific op (I    

> > > > 

> > > > I don't think a separate function is needed for checking if the poll 

> > > > status op is supported. Return value of -EOPNOTSUPP should be able to 

> > > > signal that. This can also be used to check if Octal DDR capable 

> > > > controllers are able to poll using 2-byte reads.  

> > > 

> > > Yeah, I had something more complex in mind to avoid doing this 'try

> > > native mode and fall back on sw-based more if not supported' dance

> > > every time a status poll is requested (something similar to what we do

> > > for dirmaps, with a status poll desc), but I guess that's a bit

> > > premature (and probably uneeded).  

> > 

> > I think Mark also suggested something similar. Make the CPU/non-CPU case 

> > transparent to the caller. I agree with with this direction. Makes the 

> > caller simpler.

> 

> It's kind of orthogonal to what I was suggesting, but yes, that's

> definitely a good idea. We certainly don't want the spi-nor layer to

> open code the same logic if the spi-mem layer can do it for us.

> 

> > 

> > I also mentioned in a reply to this patch that supports_op() should be 

> > called before the op is executed. That should take care of "base" 

> > support for the op. The poll-specific checks can go in the poll_status() 

> > function itself. If either of those say the op is not supported, it 

> > should fall back to CPU based polling. That's the design that makes the 

> > most sense to me.

> 

> What I had in mind was more:

> 

> 1/ create a poll desc with spi_mem_create_poll_status_desc(). The

>    "operation supported" check is done here. The controller can store

>    all its HW-specific state in there. If the operation is not natively

>    supported, a SW-based poll descriptor (similar to the SW-based

>    dirmap) is created

> 2/ poll the status with spi_mem_poll_status(). This function is passed

>    a poll descriptor which helps select the path that should be taken

>    without having to check every time whether the hardware supports a

>    specific status polling op. I can also imagine some preparation

>    being done during the desc creation if that makes sense (preparing

>    reg values to be written when a status poll request is issued for

>    instance)

> 

> Anyway, as I said, this sort of optimization might be a bit premature.


Indeed, this sounds a bit premature to me too.

-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

Patrice CHOTARD May 5, 2021, 7:21 a.m. UTC | #14

Hi Mark

On 4/30/21 5:56 PM, Mark Brown wrote:
> On Fri, Apr 30, 2021 at 04:22:34PM +0200, Patrice CHOTARD wrote:

>> On 4/26/21 6:51 PM, Mark Brown wrote:

>>> On Mon, Apr 26, 2021 at 09:56:12PM +0530, Pratyush Yadav wrote:

> 

>>> Is it possible there's some situation where you're waiting for some bits

>>> to clear as well?

> 

>> Yes, we are waiting STATUS_BUSY bit to be cleared, see patch 2 which is making 

>> usage of this API.

> 

> Then the inverse question applies - is there no circumstance where we

> might be waiting for a bit to be set?

> 

>>> We already have the core handling other timeouts.  We don't pass around

>>> completions but rather have an API function that the driver has to call

>>> when the operation completes, a similar pattern might work here.  Part

> 

>> So, if i correctly understood, you make allusion to what is already done

>> in SPI core framework with spi_finalize_current_transfer() right ?

> 

> Yes, and _current_message().

> 

>>> of the thing with those APIs which I'm missing here is that this will

>>> just return -EOPNOTSUPP if the driver can't do the delay in hardware, I

>>> think it would be cleaner if this API were similar and the core dealt

>>> with doing the delay/poll on the CPU.  That way the users don't need to

>>> repeat the handling for the offload/non-offload cases.

> 

>> Sorry, i didn't catch what you mean here. In PATCH 2, that's the case,

>> if spi_mem_poll_status() is not supported, the core is dealing with 

>> the delay/poll on the CPU in spinand_wait().

> 

> That's in the NAND core, not in spi-mem.  Any other users of spi-mem

> will also need to open code stuff.

> 


Ok, got it, i will transfer what is done in spi_nand_wait() into spi_mem_poll_status() 
in order to get the full feature in spi-mem which will profit to all spi-mem users as requested.

Thanks
Patrice

Patrice CHOTARD May 5, 2021, 7:26 a.m. UTC | #15

Hi Boris, Pratyush

On 5/3/21 11:52 AM, Boris Brezillon wrote:
> On Mon, 3 May 2021 14:59:37 +0530

> Pratyush Yadav <p.yadav@ti.com> wrote:

> 

>> On 03/05/21 11:11AM, Boris Brezillon wrote:

>>> On Mon, 3 May 2021 14:17:44 +0530

>>> Pratyush Yadav <p.yadav@ti.com> wrote:

>>>   

>>>> On 30/04/21 06:51PM, Boris Brezillon wrote:  

>>>>> On Mon, 26 Apr 2021 16:39:32 +0200

>>>>> <patrice.chotard@foss.st.com> wrote:

>>>>>     

>>>>>> From: Christophe Kerello <christophe.kerello@foss.st.com>

>>>>>>

>>>>>> With STM32 QSPI, it is possible to poll the status register of the device.

>>>>>> This could be done to offload the CPU during an operation (erase or

>>>>>> program a SPI NAND for example).

>>>>>>

>>>>>> spi_mem_poll_status API has been added to handle this feature.

>>>>>>

>>>>>> Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>

>>>>>> Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>

>>>>>> ---

>>>>>>  drivers/spi/spi-mem.c       | 34 ++++++++++++++++++++++++++++++++++

>>>>>>  include/linux/spi/spi-mem.h |  8 ++++++++

>>>>>>  2 files changed, 42 insertions(+)

>>>>>>

>>>>>> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c

>>>>>> index 1513553e4080..43dce4b0efa4 100644

>>>>>> --- a/drivers/spi/spi-mem.c

>>>>>> +++ b/drivers/spi/spi-mem.c

>>>>>> @@ -743,6 +743,40 @@ static inline struct spi_mem_driver *to_spi_mem_drv(struct device_driver *drv)

>>>>>>  	return container_of(drv, struct spi_mem_driver, spidrv.driver);

>>>>>>  }

>>>>>>  

>>>>>> +/**

>>>>>> + * spi_mem_poll_status() - Poll memory device status

>>>>>> + * @mem: SPI memory device

>>>>>> + * @op: the memory operation to execute

>>>>>> + * @mask: status bitmask to ckeck

>>>>>> + * @match: status expected value

>>>>>> + * @timeout: timeout

>>>>>> + *

>>>>>> + * This function send a polling status request to the controller driver

>>>>>> + *

>>>>>> + * Return: 0 in case of success, -ETIMEDOUT in case of error,

>>>>>> + *         -EOPNOTSUPP if not supported.

>>>>>> + */

>>>>>> +int spi_mem_poll_status(struct spi_mem *mem,

>>>>>> +			const struct spi_mem_op *op,

>>>>>> +			u8 mask, u8 match, u16 timeout)

>>>>>> +{

>>>>>> +	struct spi_controller *ctlr = mem->spi->controller;

>>>>>> +	int ret = -EOPNOTSUPP;

>>>>>> +

>>>>>> +	if (ctlr->mem_ops && ctlr->mem_ops->poll_status) {

>>>>>> +		ret = spi_mem_access_start(mem);    

>>>>>

>>>>> You should probably check that op is a single byte read before

>>>>> accepting the command.    

>>>>

>>>> Please do not discriminate against 8D-8D-8D flashes ;-).  

>>>

>>> Then mask and match should probably be u16 :P. And the check as it is

>>> seems a bit lax to me. Drivers will of course be able to reject the op

>>> when there's more than one byte (or 16bit word in case of 8D) to read,

>>> but it feels like the core could automate that a bit.  

>>

>> The two 8D flashes that are currently supported in SPI NOR both have a 

>> 1-byte status register. But to read it, the read op should be 2-byte 

>> long to avoid partial cycles at the end. The second byte is simply 

>> discarded.

>>

>> 2-byte wide registers might show up in the future, but for now at least 

>> we don't have to worry about them.

> 

> Well, I guess it doesn't hurt to take it into account now. I mean,

> what's happening on the bus in that case is a 2byte transfer, with the

> second byte being ignored, which you can describe with a 16bit mask

> of 0xMM00 (assuming big endian transfers here, as done for other ops).


OK, i will take the 2 byte case into account.

I will grab all remarks done and come back with an updated proposal in a few days.

Thanks
Patrice

> 

>>

>>>   

>>>>   

>>>>>     

>>>>>> +		if (ret)

>>>>>> +			return ret;

>>>>>> +

>>>>>> +		ret = ctlr->mem_ops->poll_status(mem, op, mask, match, timeout);    

>>>>>

>>>>> You also need some sort of ->poll_status_is_supported() to validate

>>>>> that the controller supports the status polling for this specific op (I    

>>>>

>>>> I don't think a separate function is needed for checking if the poll 

>>>> status op is supported. Return value of -EOPNOTSUPP should be able to 

>>>> signal that. This can also be used to check if Octal DDR capable 

>>>> controllers are able to poll using 2-byte reads.  

>>>

>>> Yeah, I had something more complex in mind to avoid doing this 'try

>>> native mode and fall back on sw-based more if not supported' dance

>>> every time a status poll is requested (something similar to what we do

>>> for dirmaps, with a status poll desc), but I guess that's a bit

>>> premature (and probably uneeded).  

>>

>> I think Mark also suggested something similar. Make the CPU/non-CPU case 

>> transparent to the caller. I agree with with this direction. Makes the 

>> caller simpler.

> 

> It's kind of orthogonal to what I was suggesting, but yes, that's

> definitely a good idea. We certainly don't want the spi-nor layer to

> open code the same logic if the spi-mem layer can do it for us.

> 

>>

>> I also mentioned in a reply to this patch that supports_op() should be 

>> called before the op is executed. That should take care of "base" 

>> support for the op. The poll-specific checks can go in the poll_status() 

>> function itself. If either of those say the op is not supported, it 

>> should fall back to CPU based polling. That's the design that makes the 

>> most sense to me.

> 

> What I had in mind was more:

> 

> 1/ create a poll desc with spi_mem_create_poll_status_desc(). The

>    "operation supported" check is done here. The controller can store

>    all its HW-specific state in there. If the operation is not natively

>    supported, a SW-based poll descriptor (similar to the SW-based

>    dirmap) is created

> 2/ poll the status with spi_mem_poll_status(). This function is passed

>    a poll descriptor which helps select the path that should be taken

>    without having to check every time whether the hardware supports a

>    specific status polling op. I can also imagine some preparation

>    being done during the desc creation if that makes sense (preparing

>    reg values to be written when a status poll request is issued for

>    instance)

> 

> Anyway, as I said, this sort of optimization might be a bit premature.

>

[0/3] MTD: spinand: Add spi_mem_poll_status() support

Message

Comments