diff mbox series

regmap: move readable check before accessing regcache.

Message ID 20210618113558.10046-1-srinivas.kandagatla@linaro.org
State New
Headers show
Series regmap: move readable check before accessing regcache. | expand

Commit Message

Srinivas Kandagatla June 18, 2021, 11:35 a.m. UTC
The issue that I encountered is when doing regmap_update_bits on
a write only register. In regcache path this will not do the right
thing as the register is not readable and driver which is using
regmap_update_bits will never notice that it can not do a update
bits on write only register leading to inconsistent writes and
random hardware behavior.

There seems to be missing checks in regcache_read() which is
now added by moving the orignal check in _regmap_read() before
accessing regcache.

Cc: stable@vger.kernel.org
Fixes: 5d1729e7f02f ("regmap: Incorporate the regcache core into regmap")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

---
 drivers/base/regmap/regmap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.21.0

Comments

Mark Brown June 18, 2021, 11:51 a.m. UTC | #1
On Fri, Jun 18, 2021 at 12:35:58PM +0100, Srinivas Kandagatla wrote:

> The issue that I encountered is when doing regmap_update_bits on

> a write only register. In regcache path this will not do the right

> thing as the register is not readable and driver which is using

> regmap_update_bits will never notice that it can not do a update

> bits on write only register leading to inconsistent writes and

> random hardware behavior.


Why will use of regmap_update_bits() mean that a driver will never
notice a write failure?  Shouldn't remgap_update_bits() be fixed to
report any errors it isn't reporting, or the driver fixed to check 
error codes?  I really don't understand the issue you're trying to
report - what is "the right thing" and what makes you believe that a
driver can't do an _update_bits() on a write only but cached register?
Can you specify in concrete terms what the problem is.

> There seems to be missing checks in regcache_read() which is

> now added by moving the orignal check in _regmap_read() before

> accessing regcache.


> Cc: stable@vger.kernel.org

> Fixes: 5d1729e7f02f ("regmap: Incorporate the regcache core into regmap")


Are you *sure* you've identified the actual issue here - nobody has seen
any problems with this in the past decade?  Please don't just pick a
random commit for the sake of adding a Fixes tag.

> @@ -2677,6 +2677,9 @@ static int _regmap_read(struct regmap *map, unsigned int reg,

>  	int ret;

>  	void *context = _regmap_map_get_context(map);

>  

> +	if (!regmap_readable(map, reg))

> +		return -EIO;

> +

>  	if (!map->cache_bypass) {

>  		ret = regcache_read(map, reg, val);

>  		if (ret == 0)

> @@ -2686,9 +2689,6 @@ static int _regmap_read(struct regmap *map, unsigned int reg,

>  	if (map->cache_only)

>  		return -EBUSY;

>  

> -	if (!regmap_readable(map, reg))

> -		return -EIO;

> -


This puts the readability check before the cache check which will break
all drivers using the cache on write only registers.
Srinivas Kandagatla June 18, 2021, 12:29 p.m. UTC | #2
Thanks Mark for review,

On 18/06/2021 12:51, Mark Brown wrote:
> On Fri, Jun 18, 2021 at 12:35:58PM +0100, Srinivas Kandagatla wrote:

> 

>> The issue that I encountered is when doing regmap_update_bits on

>> a write only register. In regcache path this will not do the right

>> thing as the register is not readable and driver which is using

>> regmap_update_bits will never notice that it can not do a update

>> bits on write only register leading to inconsistent writes and

>> random hardware behavior.

> 

> Why will use of regmap_update_bits() mean that a driver will never

> notice a write failure?  Shouldn't remgap_update_bits() be fixed to

> report any errors it isn't reporting, or the driver fixed to check


usecase is performing regmap_update_bits() on a *write-only* registers.

_regmap_update_bits() checks _regmap_read() return value before bailing 
out. In non cache path we have this regmap_readable() check however in 
cached patch we do not have this check, so _regmap_read() will return 
success in this case so regmap_update_bits() never reports any error.

driver in question does check the return value.

> error codes?  I really don't understand the issue you're trying to

> report - what is "the right thing" and what makes you believe that a

> driver can't do an _update_bits() on a write only but cached register?

> Can you specify in concrete terms what the problem is.


So one of recent patch ("ASoC: qcom: Fix for DMA interrupt clear reg 
overwriting) 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20210618&id=da0363f7bfd3c32f8d5918e40bfddb9905c86ee1

broke audio on DragonBoard 410c.

This patch simply converts writes to regmap_update_bits for that 
particular dma channel. The register that its updating is IRQ_CLEAR 
register which is software "WRITE-ONLY" and Hardware read-only register.

The bits in particular case is updating is a period interrupt clear bit.

Because we are using regmap cache in this driver,

first regmap_update_bits(map, 0x1, 0x1) on first period interrupt will 
update the cache and write to IRQ_CLEAR hardware register which then 
clears the interrupt latch.
On second period interrupt we do regmap_update_bits(map, 0x1, 0x1) with 
the same bits, Because we are using cache for this regmap caches sees no 
change in the cache value vs the new value so it will never write/update 
  IRQ_CLEAR hardware register, so hardware is stuck here waiting for 
IRQ_CLEAR write from driver and audio keeps repeating the last period.

> 

>> There seems to be missing checks in regcache_read() which is

>> now added by moving the orignal check in _regmap_read() before

>> accessing regcache.

> 

>> Cc: stable@vger.kernel.org

>> Fixes: 5d1729e7f02f ("regmap: Incorporate the regcache core into regmap")

> 

> Are you *sure* you've identified the actual issue here - nobody has seen


I think so, my above triage does summarizes the problem in detail.

> any problems with this in the past decade?  Please don't just pick a

> random commit for the sake of adding a Fixes tag.


I did git blame and picked up this changeset which is when the cache was 
integrated.

> 

>> @@ -2677,6 +2677,9 @@ static int _regmap_read(struct regmap *map, unsigned int reg,

>>   	int ret;

>>   	void *context = _regmap_map_get_context(map);

>>   

>> +	if (!regmap_readable(map, reg))

>> +		return -EIO;

>> +

>>   	if (!map->cache_bypass) {

>>   		ret = regcache_read(map, reg, val);

>>   		if (ret == 0)

>> @@ -2686,9 +2689,6 @@ static int _regmap_read(struct regmap *map, unsigned int reg,

>>   	if (map->cache_only)

>>   		return -EBUSY;

>>   

>> -	if (!regmap_readable(map, reg))

>> -		return -EIO;

>> -

> 

> This puts the readability check before the cache check which will break

> all drivers using the cache on write only registers.

Initially I added check in regcache_read(), later I moved it to 
_regmap_read. do you think check in regcache_read() is the correct place?

--srini
>
Mark Brown June 18, 2021, 3:48 p.m. UTC | #3
On Fri, Jun 18, 2021 at 01:29:50PM +0100, Srinivas Kandagatla wrote:
> On 18/06/2021 12:51, Mark Brown wrote:


> > Why will use of regmap_update_bits() mean that a driver will never

> > notice a write failure?  Shouldn't remgap_update_bits() be fixed to

> > report any errors it isn't reporting, or the driver fixed to check


> usecase is performing regmap_update_bits() on a *write-only* registers.


> _regmap_update_bits() checks _regmap_read() return value before bailing out.

> In non cache path we have this regmap_readable() check however in cached

> patch we do not have this check, so _regmap_read() will return success in

> this case so regmap_update_bits() never reports any error.


> driver in question does check the return value.


OK, so everything is working fine then - what's the problem?  The value
in the register is cached, the read returns that cached value and then
the relevant bits are updated and the new value written out to both the
cache and the hardware.  No part of that sounds like a problem to me.

> > error codes?  I really don't understand the issue you're trying to

> > report - what is "the right thing" and what makes you believe that a

> > driver can't do an _update_bits() on a write only but cached register?

> > Can you specify in concrete terms what the problem is.


> So one of recent patch ("ASoC: qcom: Fix for DMA interrupt clear reg

> overwriting) https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20210618&id=da0363f7bfd3c32f8d5918e40bfddb9905c86ee1

> 

> broke audio on DragonBoard 410c.


> This patch simply converts writes to regmap_update_bits for that particular

> dma channel. The register that its updating is IRQ_CLEAR register which is

> software "WRITE-ONLY" and Hardware read-only register.


So the driver is buggy then, the register is clearly volatile and can't
be cached, and it sounds like it's unsafe to use _update_bits() on it.
The register is clearly not write only since it can be read and it's
volatile since the readback value does not reflect what was written (and
presumably can change),.  Even without that it's buggy to use
_update_bits() here since the device needs the write to happen
regardless of the value that is read back, with the register marked as
volatile that will still potentially fail if the readback value happens
to be the same as whatever bits the driver is trying to set.

Your description of the register is being "software write only" and
"hardware read only" should be a warning sign here, think about what
that might mean for a minute.

> The bits in particular case is updating is a period interrupt clear bit.


> Because we are using regmap cache in this driver,


> first regmap_update_bits(map, 0x1, 0x1) on first period interrupt will

> update the cache and write to IRQ_CLEAR hardware register which then clears

> the interrupt latch.

> On second period interrupt we do regmap_update_bits(map, 0x1, 0x1) with the

> same bits, Because we are using cache for this regmap caches sees no change

> in the cache value vs the new value so it will never write/update  IRQ_CLEAR

> hardware register, so hardware is stuck here waiting for IRQ_CLEAR write

> from driver and audio keeps repeating the last period.


This is because the register is volatile and we can't cache the written
value, it is a driver bug.  If the value in the register does not change
from the value written unexpectedly then the register can be cached and
there is no problem but that's not the case here.
Srinivas Kandagatla June 21, 2021, 10:30 a.m. UTC | #4
Thanks Mark for your inputs,

On 18/06/2021 16:48, Mark Brown wrote:
> On Fri, Jun 18, 2021 at 01:29:50PM +0100, Srinivas Kandagatla wrote:

>> On 18/06/2021 12:51, Mark Brown wrote:

> 

>>> Why will use of regmap_update_bits() mean that a driver will never

>>> notice a write failure?  Shouldn't remgap_update_bits() be fixed to

>>> report any errors it isn't reporting, or the driver fixed to check

> 

>> usecase is performing regmap_update_bits() on a *write-only* registers.

> 

>> _regmap_update_bits() checks _regmap_read() return value before bailing out.

>> In non cache path we have this regmap_readable() check however in cached

>> patch we do not have this check, so _regmap_read() will return success in

>> this case so regmap_update_bits() never reports any error.

> 

>> driver in question does check the return value.

> 

> OK, so everything is working fine then - what's the problem?  The value


How can this be working fine?

In this particular setup the register is marked as write only and is not 
readable. Should it really store value in cache at the first instance?

Also on the other note, if we mark the same regmap as uncached this 
usecase will fail straightaway with -EIO, so why is the behavior 
different in regcache path?

Shouldn't the regcache path check if the register is readable before 
trying to cache the value?

> in the register is cached, the read returns that cached value and then

> the relevant bits are updated and the new value written out to both the

> cache and the hardware.  No part of that sounds like a problem to me.

> 

>>> error codes?  I really don't understand the issue you're trying to

>>> report - what is "the right thing" and what makes you believe that a

>>> driver can't do an _update_bits() on a write only but cached register?

>>> Can you specify in concrete terms what the problem is.

> 

>> So one of recent patch ("ASoC: qcom: Fix for DMA interrupt clear reg

>> overwriting) https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20210618&id=da0363f7bfd3c32f8d5918e40bfddb9905c86ee1

>>

>> broke audio on DragonBoard 410c.

> 

>> This patch simply converts writes to regmap_update_bits for that particular

>> dma channel. The register that its updating is IRQ_CLEAR register which is

>> software "WRITE-ONLY" and Hardware read-only register.

> 

> So the driver is buggy then, the register is clearly volatile and can't

> be cached, and it sounds like it's unsafe to use _update_bits() on it.

> The register is clearly not write only since it can be read and it's

> volatile since the readback value does not reflect what was written (and

> presumably can change),.  Even without that it's buggy to use

> _update_bits() here since the device needs the write to happen

> regardless of the value that is read back, with the register marked as

> volatile that will still potentially fail if the readback value happens

> to be the same as whatever bits the driver is trying to set.


 From "APQ8016E Technical Reference Manual" 
https://developer.qualcomm.com/qfile/28813/lm80-p0436-7_f_410e_proc_apq8016e_device_spec.pdf

Section: 4.5.9.6.19
this register LPASS_LPAIF_IRQ_CLEARa is clearly marked with Type: W

with this description:
"Writing a 1 to a bit in this register clears the latched interrupt event

So am not 100% sure if we read this we will get anything real from the 
register. I always get zeros if I do that.

Should this behavior treated as volatile?

If we mark this register as volatile and make it readable then it would 
work but that just sounds like a hack to avoid cache.

Am sure other hardware platforms have similar write-only registers, how 
do they handle regmap_update_bits case if they have regcache enabled?

thanks,
srini
Mark Brown June 21, 2021, 11:27 a.m. UTC | #5
On Mon, Jun 21, 2021 at 11:30:00AM +0100, Srinivas Kandagatla wrote:
> On 18/06/2021 16:48, Mark Brown wrote:

> > On Fri, Jun 18, 2021 at 01:29:50PM +0100, Srinivas Kandagatla wrote:


> > > _regmap_update_bits() checks _regmap_read() return value before bailing out.

> > > In non cache path we have this regmap_readable() check however in cached

> > > patch we do not have this check, so _regmap_read() will return success in

> > > this case so regmap_update_bits() never reports any error.

> > 

> > > driver in question does check the return value.


> > OK, so everything is working fine then - what's the problem?  The value


> How can this be working fine?


> In this particular setup the register is marked as write only and is not

> readable. Should it really store value in cache at the first instance?


Yes, we know exactly what the value in the register is since we wrote it
so there's no problem with us remembering and using that.

> Also on the other note, if we mark the same regmap as uncached this usecase

> will fail straightaway with -EIO, so why is the behavior different in

> regcache path?


If the register is marked as uncachable then obviously the cache
behaviour is going to be different to that for a register which we can
cache for whatever reason the register was marked volatile.

> Shouldn't the regcache path check if the register is readable before trying

> to cache the value?


Why?  If we know what the value is we can cache it and then use it,
meaning things like restoring the value in a cache sync and update_bits()
work, this is useful especially on devices which have no read support at
all.  What would the benefit it not caching it be?

> From "APQ8016E Technical Reference Manual" https://developer.qualcomm.com/qfile/28813/lm80-p0436-7_f_410e_proc_apq8016e_device_spec.pdf


> Section: 4.5.9.6.19

> this register LPASS_LPAIF_IRQ_CLEARa is clearly marked with Type: W


> with this description:

> "Writing a 1 to a bit in this register clears the latched interrupt event


> So am not 100% sure if we read this we will get anything real from the

> register. I always get zeros if I do that.


> Should this behavior treated as volatile?


Yes.  This is indistingusihable from a register that is volatile because
it doesn't latch written values, given that you're saying readback
actually works there's an argument here that the documentation isn't
accurate here.  My guess is that this device doesn't have any write only
registers as far as anything outside the device is concerned since the
I/O hardware won't fault or anything on reads, it just has addresses
where the read side isn't wired up to anything.

> If we mark this register as volatile and make it readable then it would work

> but that just sounds like a hack to avoid cache.


> Am sure other hardware platforms have similar write-only registers, how do

> they handle regmap_update_bits case if they have regcache enabled?


They either mark the registers as volatile or just don't do any
operations that involve reading the value so it's a non-issue.
Srinivas Kandagatla June 21, 2021, 12:43 p.m. UTC | #6
On 21/06/2021 12:27, Mark Brown wrote:
> On Mon, Jun 21, 2021 at 11:30:00AM +0100, Srinivas Kandagatla wrote:

>> On 18/06/2021 16:48, Mark Brown wrote:

>>> On Fri, Jun 18, 2021 at 01:29:50PM +0100, Srinivas Kandagatla wrote:

> 

>>>> _regmap_update_bits() checks _regmap_read() return value before bailing out.

>>>> In non cache path we have this regmap_readable() check however in cached

>>>> patch we do not have this check, so _regmap_read() will return success in

>>>> this case so regmap_update_bits() never reports any error.

>>>

>>>> driver in question does check the return value.

> 

>>> OK, so everything is working fine then - what's the problem?  The value

> 

>> How can this be working fine?

> 

>> In this particular setup the register is marked as write only and is not

>> readable. Should it really store value in cache at the first instance?

> 

> Yes, we know exactly what the value in the register is since we wrote it

> so there's no problem with us remembering and using that.

> 

>> Also on the other note, if we mark the same regmap as uncached this usecase

>> will fail straightaway with -EIO, so why is the behavior different in

>> regcache path?

> 

> If the register is marked as uncachable then obviously the cache

> behaviour is going to be different to that for a register which we can

> cache for whatever reason the register was marked volatile.

> 

>> Shouldn't the regcache path check if the register is readable before trying

>> to cache the value?

> 

> Why?  If we know what the value is we can cache it and then use it,

> meaning things like restoring the value in a cache sync and update_bits()

> work, this is useful especially on devices which have no read support at


Thanks for the insight,
Yes that makes more sense to have cache for write-only too.

> all.  What would the benefit it not caching it be? >

>>  From "APQ8016E Technical Reference Manual" https://developer.qualcomm.com/qfile/28813/lm80-p0436-7_f_410e_proc_apq8016e_device_spec.pdf

> 

>> Section: 4.5.9.6.19

>> this register LPASS_LPAIF_IRQ_CLEARa is clearly marked with Type: W

> 

>> with this description:

>> "Writing a 1 to a bit in this register clears the latched interrupt event

> 

>> So am not 100% sure if we read this we will get anything real from the

>> register. I always get zeros if I do that.

> 

>> Should this behavior treated as volatile?

> 

> Yes.  This is indistingusihable from a register that is volatile because

> it doesn't latch written values, given that you're saying readback

> actually works there's an argument here that the documentation isn't

> accurate here.  My guess is that this device doesn't have any write only

> registers as far as anything outside the device is concerned since the

> I/O hardware won't fault or anything on reads, it just has addresses

> where the read side isn't wired up to anything.

> 

>> If we mark this register as volatile and make it readable then it would work

>> but that just sounds like a hack to avoid cache.

> 

>> Am sure other hardware platforms have similar write-only registers, how do

>> they handle regmap_update_bits case if they have regcache enabled?

> 

> They either mark the registers as volatile or just don't do any

> operations that involve reading the value so it's a non-issue.

I agree,
qcom lpass driver is already doing the second one here, so we should 
mark the register as volatile and readable to avoid the reported issue.

--srini

>
diff mbox series

Patch

diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 297e95be25b3..3ed37a09a8e9 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -2677,6 +2677,9 @@  static int _regmap_read(struct regmap *map, unsigned int reg,
 	int ret;
 	void *context = _regmap_map_get_context(map);
 
+	if (!regmap_readable(map, reg))
+		return -EIO;
+
 	if (!map->cache_bypass) {
 		ret = regcache_read(map, reg, val);
 		if (ret == 0)
@@ -2686,9 +2689,6 @@  static int _regmap_read(struct regmap *map, unsigned int reg,
 	if (map->cache_only)
 		return -EBUSY;
 
-	if (!regmap_readable(map, reg))
-		return -EIO;
-
 	ret = map->reg_read(context, reg, val);
 	if (ret == 0) {
 		if (regmap_should_log(map))