mbox series

[v2,0/4] Add support for hwspinlock bust

Message ID 20240524-hwspinlock-bust-v2-0-fb88fd17ca0b@quicinc.com
Headers show
Series Add support for hwspinlock bust | expand

Message

Chris Lew May 25, 2024, 1:26 a.m. UTC
hwspinlocks can be acquired by many devices on the SoC. If any of these
devices go into a bad state before the device releases the hwspinlock,
then that hwspinlock may end up in an unusable state.

In the case of smem, each remoteproc takes a hwspinlock before trying to
allocate an smem item. If the remoteproc were to suddenly crash without
releasing this, it would be impossible for other remoteprocs to allocate
any smem items.

We propose a new api to bust a hwspinlock. A driver can use the
the bust api if it detects the device has gone into an error state, thus
allowing other entities in the system to use the hwspinlock.

These patches were tested on an sm8650 mtp using engineering cdsp
firmware that triggers a watchdog with the smem hwspinlock acquired.

Checked for error in dt-bindings with below.
 - make DT_CHECKER_FLAGS=-m DT_SCHEMA_FILES=remoteproc/qcom,pas-common.yaml dt_binding_check
 - make qcom/sm8650-mtp.dtb CHECK_DTBS=1

Signed-off-by: Chris Lew <quic_clew@quicinc.com>
---
Changes in v2:
- Remove extra print in qcom_q6v5_pas as suggested by Konrad
- Expose SMEM API that remotproc can call to bust the smem hwlock as suggested by Krzysztof
- Drop patches related to multiple references on a hwlock
- Link to v1: https://lore.kernel.org/r/20240516-hwspinlock-bust-v1-0-47a90a859238@quicinc.com

---
Chris Lew (1):
      soc: qcom: smem: Add qcom_smem_bust_hwspin_lock_by_host()

Richard Maina (3):
      hwspinlock: Introduce hwspin_lock_bust()
      hwspinlock: qcom: implement bust operation
      remoteproc: qcom_q6v5_pas: Add hwspinlock bust on stop

 Documentation/locking/hwspinlock.rst     | 11 +++++++++++
 drivers/hwspinlock/hwspinlock_core.c     | 28 ++++++++++++++++++++++++++++
 drivers/hwspinlock/hwspinlock_internal.h |  3 +++
 drivers/hwspinlock/qcom_hwspinlock.c     | 25 +++++++++++++++++++++++++
 drivers/remoteproc/qcom_q6v5_pas.c       | 11 +++++++++++
 drivers/soc/qcom/smem.c                  | 28 ++++++++++++++++++++++++++++
 include/linux/hwspinlock.h               |  6 ++++++
 include/linux/soc/qcom/smem.h            |  2 ++
 8 files changed, 114 insertions(+)
---
base-commit: e7b4ef8fffaca247809337bb78daceb406659f2d
change-id: 20240509-hwspinlock-bust-d497a70c1a3a

Best regards,

Comments

Bjorn Andersson May 28, 2024, 9:55 p.m. UTC | #1
On Fri, May 24, 2024 at 06:26:42PM GMT, Chris Lew wrote:
> Add qcom_smem_bust_hwspin_lock_by_host to enable remoteproc to bust the
> hwspin_lock owned by smem. In the event the remoteproc crashes
> unexpectedly, the remoteproc driver can invoke this API to try and bust
> the hwspin_lock and release the lock if still held by the remoteproc
> device.
> 
> Signed-off-by: Chris Lew <quic_clew@quicinc.com>
> ---
>  drivers/soc/qcom/smem.c       | 28 ++++++++++++++++++++++++++++
>  include/linux/soc/qcom/smem.h |  2 ++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
> index 7191fa0c087f..683599990387 100644
> --- a/drivers/soc/qcom/smem.c
> +++ b/drivers/soc/qcom/smem.c
> @@ -359,6 +359,34 @@ static struct qcom_smem *__smem;
>  /* Timeout (ms) for the trylock of remote spinlocks */
>  #define HWSPINLOCK_TIMEOUT	1000
>  
> +/* The qcom hwspinlock id is always plus one from the smem host id */
> +#define SMEM_HOST_ID_TO_HWSPINLOCK_ID(__x) ((__x) + 1)
> +
> +/**
> + * qcom_smem_bust_hwspin_lock_by_host() - bust the smem hwspinlock for an smem host id
> + * @host:	remote processor id
> + *
> + * Busts the hwspin_lock for the given smem host id. This helper is intended for remoteproc drivers
> + * that manage remoteprocs with an equivalent smem driver instance in the remote firmware. Drivers
> + * can force a release of the smem hwspin_lock if the rproc unexpectedly goes into a bad state.

Please wrap these at 80 characters.

> + *
> + * Context: Process context.
> + *
> + * Returns: 0 on success, otherwise negative errno.
> + */
> +int qcom_smem_bust_hwspin_lock_by_host(unsigned host)
> +{
> +	if (!__smem)
> +		return -EPROBE_DEFER;

This would be called at a time where -EPROBE_DEFER isn't appropriate,
the client should invoke qcom_smem_is_available() at probe time to guard
against this.

> +
> +	/* This function is for remote procs, so ignore SMEM_HOST_APPS */
> +	if (host == SMEM_HOST_APPS ||host >= SMEM_HOST_COUNT)

Missing <space> after ||

Regards,
Bjorn

> +		return -EINVAL;
> +
> +	return hwspin_lock_bust(__smem->hwlock, SMEM_HOST_ID_TO_HWSPINLOCK_ID(host));
> +}
> +EXPORT_SYMBOL_GPL(qcom_smem_bust_hwspin_lock_by_host);
> +
>  /**
>   * qcom_smem_is_available() - Check if SMEM is available
>   *
> diff --git a/include/linux/soc/qcom/smem.h b/include/linux/soc/qcom/smem.h
> index a36a3b9d4929..959eea0812bb 100644
> --- a/include/linux/soc/qcom/smem.h
> +++ b/include/linux/soc/qcom/smem.h
> @@ -14,4 +14,6 @@ phys_addr_t qcom_smem_virt_to_phys(void *p);
>  
>  int qcom_smem_get_soc_id(u32 *id);
>  
> +int qcom_smem_bust_hwspin_lock_by_host(unsigned host);
> +
>  #endif
> 
> -- 
> 2.25.1
>
Chris Lew May 28, 2024, 10:50 p.m. UTC | #2
On 5/28/2024 2:55 PM, Bjorn Andersson wrote:
> On Fri, May 24, 2024 at 06:26:42PM GMT, Chris Lew wrote:
>> Add qcom_smem_bust_hwspin_lock_by_host to enable remoteproc to bust the
>> hwspin_lock owned by smem. In the event the remoteproc crashes
>> unexpectedly, the remoteproc driver can invoke this API to try and bust
>> the hwspin_lock and release the lock if still held by the remoteproc
>> device.
>>
>> Signed-off-by: Chris Lew <quic_clew@quicinc.com>
>> ---
>>   drivers/soc/qcom/smem.c       | 28 ++++++++++++++++++++++++++++
>>   include/linux/soc/qcom/smem.h |  2 ++
>>   2 files changed, 30 insertions(+)
>>
>> diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
>> index 7191fa0c087f..683599990387 100644
>> --- a/drivers/soc/qcom/smem.c
>> +++ b/drivers/soc/qcom/smem.c
...
>> + *
>> + * Context: Process context.
>> + *
>> + * Returns: 0 on success, otherwise negative errno.
>> + */
>> +int qcom_smem_bust_hwspin_lock_by_host(unsigned host)
>> +{
>> +	if (!__smem)
>> +		return -EPROBE_DEFER;
> 
> This would be called at a time where -EPROBE_DEFER isn't appropriate,
> the client should invoke qcom_smem_is_available() at probe time to guard
> against this.
> 

Should we keep the null pointer check to prevent null pointer 
dereference and return 0? Or would it be better to allow the null 
pointer deference to go through so we can catch misuse of the API and 
ask clients to use qcom_smem_is_available()?
Bjorn Andersson May 29, 2024, 2:08 a.m. UTC | #3
On Tue, May 28, 2024 at 03:50:25PM -0700, Chris Lew wrote:
> 
> 
> On 5/28/2024 2:55 PM, Bjorn Andersson wrote:
> > On Fri, May 24, 2024 at 06:26:42PM GMT, Chris Lew wrote:
> > > Add qcom_smem_bust_hwspin_lock_by_host to enable remoteproc to bust the
> > > hwspin_lock owned by smem. In the event the remoteproc crashes
> > > unexpectedly, the remoteproc driver can invoke this API to try and bust
> > > the hwspin_lock and release the lock if still held by the remoteproc
> > > device.
> > > 
> > > Signed-off-by: Chris Lew <quic_clew@quicinc.com>
> > > ---
> > >   drivers/soc/qcom/smem.c       | 28 ++++++++++++++++++++++++++++
> > >   include/linux/soc/qcom/smem.h |  2 ++
> > >   2 files changed, 30 insertions(+)
> > > 
> > > diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
> > > index 7191fa0c087f..683599990387 100644
> > > --- a/drivers/soc/qcom/smem.c
> > > +++ b/drivers/soc/qcom/smem.c
> ...
> > > + *
> > > + * Context: Process context.
> > > + *
> > > + * Returns: 0 on success, otherwise negative errno.
> > > + */
> > > +int qcom_smem_bust_hwspin_lock_by_host(unsigned host)
> > > +{
> > > +	if (!__smem)
> > > +		return -EPROBE_DEFER;
> > 
> > This would be called at a time where -EPROBE_DEFER isn't appropriate,
> > the client should invoke qcom_smem_is_available() at probe time to guard
> > against this.
> > 
> 
> Should we keep the null pointer check to prevent null pointer dereference
> and return 0? Or would it be better to allow the null pointer deference to
> go through so we can catch misuse of the API and ask clients to use
> qcom_smem_is_available()?
> 

I like the helpful callstack you get from the NULL pointer
dereference...

Regards,
Bjorn