mtd: nand: omap: Fix NAND enumeration on 3430 LDP

Message ID 1415185258-11747-1-git-send-email-rogerq@ti.com
State New
Headers show

Commit Message

Roger Quadros Nov. 5, 2014, 11 a.m.
In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
we switched back to using 1-bit SW ECC scheme by default. However
commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
when checking for small page devices because it was already got rid of
one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
if we can proceed with small page devices or not.

Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")

Cc: <stable@vger.kernel.org>        [3.17+]
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
---
 drivers/mtd/nand/omap2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Tony Lindgren Nov. 6, 2014, 6:03 p.m. | #1
* Roger Quadros <rogerq@ti.com> [141105 03:02]:
> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
> we switched back to using 1-bit SW ECC scheme by default. However
> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
> when checking for small page devices because it was already got rid of
> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
> if we can proceed with small page devices or not.
> 
> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
> 
> Cc: <stable@vger.kernel.org>        [3.17+]
> Reported-by: Tony Lindgren <tony@atomide.com>
> Signed-off-by: Roger Quadros <rogerq@ti.com>
> ---
>  drivers/mtd/nand/omap2.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
> index 3b357e9..758e594 100644
> --- a/drivers/mtd/nand/omap2.c
> +++ b/drivers/mtd/nand/omap2.c
> @@ -1742,7 +1742,8 @@ static int omap_nand_probe(struct platform_device *pdev)
>  	}
>  
>  	/* check for small page devices */
> -	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW)) {
> +	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW) &&
> +	    (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_SW)) {
>  		dev_err(&info->pdev->dev, "small page devices are not supported\n");
>  		err = -EINVAL;
>  		goto return_error;

Should this maybe have || instead of && For the OMAP_ECC_HAM1_CODE_SW?

With this patch applied on top of the other pending fixes, I still
get:

nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xba
nand: Micron MT29F2G16ABD
nand: 128MiB, SLC, page size: 1024, OOB size: 32
omap2-nand omap2-nand.0: small page devices are not supported
omap2-nand: probe of omap2-nand.0 failed with error -22

Regards,

Tony

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roger Quadros Nov. 7, 2014, 9:35 a.m. | #2
On 11/06/2014 08:03 PM, Tony Lindgren wrote:
> * Roger Quadros <rogerq@ti.com> [141105 03:02]:
>> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
>> we switched back to using 1-bit SW ECC scheme by default. However
>> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
>> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
>> when checking for small page devices because it was already got rid of
>> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
>> if we can proceed with small page devices or not.
>>
>> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
>>
>> Cc: <stable@vger.kernel.org>        [3.17+]
>> Reported-by: Tony Lindgren <tony@atomide.com>
>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>> ---
>>  drivers/mtd/nand/omap2.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
>> index 3b357e9..758e594 100644
>> --- a/drivers/mtd/nand/omap2.c
>> +++ b/drivers/mtd/nand/omap2.c
>> @@ -1742,7 +1742,8 @@ static int omap_nand_probe(struct platform_device *pdev)
>>  	}
>>  
>>  	/* check for small page devices */
>> -	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW)) {
>> +	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW) &&
>> +	    (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_SW)) {
>>  		dev_err(&info->pdev->dev, "small page devices are not supported\n");
>>  		err = -EINVAL;
>>  		goto return_error;
> 
> Should this maybe have || instead of && For the OMAP_ECC_HAM1_CODE_SW?

The code is right.

there is a bug in omap3-ldp.dts.

there it says 
ti,nand-ecc-opt = "bch8";

This is wrong. OMAP3 doesn't support bch8. I think you should use either "ham1" or "sw"

Ideally we want "ham1" but to be compatible with with NAND partitions created
using legacy boot we need to stick with "sw"

see board_nand_init() in mach-omap2/board-flash.c

cheers,
-roger

> 
> With this patch applied on top of the other pending fixes, I still
> get:
> 
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xba
> nand: Micron MT29F2G16ABD
> nand: 128MiB, SLC, page size: 1024, OOB size: 32
> omap2-nand omap2-nand.0: small page devices are not supported
> omap2-nand: probe of omap2-nand.0 failed with error -22
> 
> Regards,
> 
> Tony
> 

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roger Quadros Nov. 7, 2014, 9:58 a.m. | #3
On 11/07/2014 11:35 AM, Roger Quadros wrote:
> On 11/06/2014 08:03 PM, Tony Lindgren wrote:
>> * Roger Quadros <rogerq@ti.com> [141105 03:02]:
>>> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
>>> we switched back to using 1-bit SW ECC scheme by default. However
>>> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
>>> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
>>> when checking for small page devices because it was already got rid of
>>> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
>>> if we can proceed with small page devices or not.
>>>
>>> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
>>>
>>> Cc: <stable@vger.kernel.org>        [3.17+]
>>> Reported-by: Tony Lindgren <tony@atomide.com>
>>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>>> ---
>>>  drivers/mtd/nand/omap2.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
>>> index 3b357e9..758e594 100644
>>> --- a/drivers/mtd/nand/omap2.c
>>> +++ b/drivers/mtd/nand/omap2.c
>>> @@ -1742,7 +1742,8 @@ static int omap_nand_probe(struct platform_device *pdev)
>>>  	}
>>>  
>>>  	/* check for small page devices */
>>> -	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW)) {
>>> +	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW) &&
>>> +	    (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_SW)) {
>>>  		dev_err(&info->pdev->dev, "small page devices are not supported\n");
>>>  		err = -EINVAL;
>>>  		goto return_error;
>>
>> Should this maybe have || instead of && For the OMAP_ECC_HAM1_CODE_SW?
> 
> The code is right.
> 
> there is a bug in omap3-ldp.dts.
> 
> there it says 
> ti,nand-ecc-opt = "bch8";
> 
> This is wrong. OMAP3 doesn't support bch8. I think you should use either "ham1" or "sw"

Well I'm wrong about the OMAP3 information. OMAP3 does support BCH4 and BCH8 as well.

I'm don't thinkg small page check is right. For BCH8 we need 13 bytes per 512 bytes.
In the LDP case we have page size:1024 and OOB size: 32.
Thus for BCH8 we need 26 bytes per page. which should fit in 32 bytes OOB.
So this check is bogus in that case.

Pekon,

can you please explain why you check for 64 bytes OOB size for all ECC schemes?

Tony,

The question for backward compatibility still remains. Even if OMAP3 supports bch8
do we stick to the scheme that was used in legacy boot or do we switch?

Then there is the question of boot rom compatibility. OMAP3 bootloaders use HAM1 scheme.
So if you want to be able to flash bootloader from the kernel we have to stick with HAM1.

changing the ECC scheme would mean that NAND filesystems created earlier won't work
and will have to be erased and recreated.

cheers,
-roger
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Nov. 7, 2014, 10:48 p.m. | #4
* Roger Quadros <rogerq@ti.com> [141107 01:59]:
> On 11/07/2014 11:35 AM, Roger Quadros wrote:
> > On 11/06/2014 08:03 PM, Tony Lindgren wrote:
> >> * Roger Quadros <rogerq@ti.com> [141105 03:02]:
> >>> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
> >>> we switched back to using 1-bit SW ECC scheme by default. However
> >>> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
> >>> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
> >>> when checking for small page devices because it was already got rid of
> >>> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
> >>> if we can proceed with small page devices or not.
> >>>
> >>> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
> >>>
> >>> Cc: <stable@vger.kernel.org>        [3.17+]
> >>> Reported-by: Tony Lindgren <tony@atomide.com>
> >>> Signed-off-by: Roger Quadros <rogerq@ti.com>
> >>> ---
> >>>  drivers/mtd/nand/omap2.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
> >>> index 3b357e9..758e594 100644
> >>> --- a/drivers/mtd/nand/omap2.c
> >>> +++ b/drivers/mtd/nand/omap2.c
> >>> @@ -1742,7 +1742,8 @@ static int omap_nand_probe(struct platform_device *pdev)
> >>>  	}
> >>>  
> >>>  	/* check for small page devices */
> >>> -	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW)) {
> >>> +	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW) &&
> >>> +	    (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_SW)) {
> >>>  		dev_err(&info->pdev->dev, "small page devices are not supported\n");
> >>>  		err = -EINVAL;
> >>>  		goto return_error;
> >>
> >> Should this maybe have || instead of && For the OMAP_ECC_HAM1_CODE_SW?
> > 
> > The code is right.
> > 
> > there is a bug in omap3-ldp.dts.
> > 
> > there it says 
> > ti,nand-ecc-opt = "bch8";
> > 
> > This is wrong. OMAP3 doesn't support bch8. I think you should use either "ham1" or "sw"
> 
> Well I'm wrong about the OMAP3 information. OMAP3 does support BCH4 and BCH8 as well.
> 
> I'm don't thinkg small page check is right. For BCH8 we need 13 bytes per 512 bytes.
> In the LDP case we have page size:1024 and OOB size: 32.
> Thus for BCH8 we need 26 bytes per page. which should fit in 32 bytes OOB.
> So this check is bogus in that case.
> 
> Pekon,
> 
> can you please explain why you check for 64 bytes OOB size for all ECC schemes?
> 
> Tony,
> 
> The question for backward compatibility still remains. Even if OMAP3 supports bch8
> do we stick to the scheme that was used in legacy boot or do we switch?

Well for the boards that had a standard file system, we should support
that. But at least for the LDP, there was only the "bootable microSD card"
AFAIK:

http://support.logicpd.com/ProductDownloads/LegacyProducts/ZoomOMAP34xMDK.aspx
 
> Then there is the question of boot rom compatibility. OMAP3 bootloaders use HAM1 scheme.
> So if you want to be able to flash bootloader from the kernel we have to stick with HAM1.
> 
> changing the ECC scheme would mean that NAND filesystems created earlier won't work
> and will have to be erased and recreated.

Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
must be ham1 to boot at all, that's what we should change for the
devices that do not have not standardized on bch8.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
pekon Nov. 9, 2014, 7:29 p.m. | #5
On Saturday 08 November 2014 04:18 AM, Tony Lindgren wrote:
> * Roger Quadros <rogerq@ti.com> [141107 01:59]:
>> On 11/07/2014 11:35 AM, Roger Quadros wrote:
>>> On 11/06/2014 08:03 PM, Tony Lindgren wrote:
>>>> * Roger Quadros <rogerq@ti.com> [141105 03:02]:
>>>>> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
>>>>> we switched back to using 1-bit SW ECC scheme by default. However
>>>>> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
>>>>> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
>>>>> when checking for small page devices because it was already got rid of
>>>>> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
>>>>> if we can proceed with small page devices or not.
>>>>>
>>>>> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
>>>>>
>>>>> Cc: <stable@vger.kernel.org>        [3.17+]
>>>>> Reported-by: Tony Lindgren <tony@atomide.com>
>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>>>>> ---
[...]
>>
>> Well I'm wrong about the OMAP3 information. OMAP3 does support BCH4 and BCH8 as well.
>>
>> I'm don't thinkg small page check is right. For BCH8 we need 13 bytes per 512 bytes.
>> In the LDP case we have page size:1024 and OOB size: 32.
>> Thus for BCH8 we need 26 bytes per page. which should fit in 32 bytes OOB.
>> So this check is bogus in that case.
>>
>> Pekon,
>>
>> can you please explain why you check for 64 bytes OOB size for all ECC schemes?
>>
Accept this is a bug.
As I re-review now, small-page check is applicable only for BCH ECC 
schemes, and not for HAM1.

>> Tony,
>>
>> The question for backward compatibility still remains. Even if OMAP3 supports bch8
>> do we stick to the scheme that was used in legacy boot or do we switch?
>
> Well for the boards that had a standard file system, we should support
> that. But at least for the LDP, there was only the "bootable microSD card"
> AFAIK:
>
> http://support.logicpd.com/ProductDownloads/LegacyProducts/ZoomOMAP34xMDK.aspx
>
OMAP3 supports BCH ECC scheme.
Document: http://www.ti.com/lit/pdf/spruf98
(1) GPMC in OMAP3 supports BCH8
	Section: 11.1.5.14.3.2 BCH Code
(2) ROM code in GPMC supports BCH (but its not clear whether its BCH4 or 
BCH8)
	Section: 25.4.7.4.3 MLC NAND Read Sector Procedure

But I agree that now it doesn't make sense to put effort on OMAP3 for 
upgrading default ECC schemes. If user has to migrate then there are 
other easy ways of doing so.



>> Then there is the question of boot rom compatibility. OMAP3 bootloaders use HAM1 scheme.
>> So if you want to be able to flash bootloader from the kernel we have to stick with HAM1.
>>
>> changing the ECC scheme would mean that NAND filesystems created earlier won't work
>> and will have to be erased and recreated.
>
> Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
> must be ham1 to boot at all, that's what we should change for the
> devices that do not have not standardized on bch8.
>
My view on this is slightly different:
- Lately, everyone seems to have migrated to BCH8.

- Also HAM1 does not have strength to fix bitflips in aging NAND. So if 
someone already has OMAP3-LDP deployed on field then its NAND would have 
already aged to such an extend that HAM1 may not be sufficient.

My question is..
Moving back to HAM1 should be decided based on statistics rather than 
rule of breaking backward compatibility. So please review:
(1) How many user of OMAP3-Zoom or other old boards complaining about 
using BCH8 in mainline kernel? OR
(2) How many users of OMAP3 legacy boards are migrating to newer kernel?

At-least I have not, rather most of the OMAP3 users I interacted via 
TI's E2E forum wanted to migrate to BCH8 or even BCH16, as HAM1 was not 
sufficient for their usage.

So whatever you decide on this topic, please be cautious that you don't 
re-invent work for yourself, as I did. It took me lot of time and 
testing effort on multiple boards to migrate HAM1 to BCH8, And add BCH16 
for newer devices.

Also, now I realize u-boot (boot-loaders) are better place for all this 
migration.


> Regards,
>
> Tony
>

with regards, pekon

------------------------
Powered by BigRock.com

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Nov. 12, 2014, 6:02 p.m. | #6
* pekon <pekon@pek-sem.com> [141109 11:31]:
> On Saturday 08 November 2014 04:18 AM, Tony Lindgren wrote:
> >
> >Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
> >must be ham1 to boot at all, that's what we should change for the
> >devices that do not have not standardized on bch8.
> >
> My view on this is slightly different:
> - Lately, everyone seems to have migrated to BCH8.
> 
> - Also HAM1 does not have strength to fix bitflips in aging NAND. So if
> someone already has OMAP3-LDP deployed on field then its NAND would have
> already aged to such an extend that HAM1 may not be sufficient.

OK so it makes sense to keep it as BCH8 then. Would be nice to get
the writing u-boot from kernel issue fixed somehow though as people
are hitting that for sure.
 
> My question is..
> Moving back to HAM1 should be decided based on statistics rather than rule
> of breaking backward compatibility. So please review:
> (1) How many user of OMAP3-Zoom or other old boards complaining about using
> BCH8 in mainline kernel? OR

0

> (2) How many users of OMAP3 legacy boards are migrating to newer kernel?

Quite a few for sure, but are probably also using rootfs on MMC instead.
 
> At-least I have not, rather most of the OMAP3 users I interacted via TI's
> E2E forum wanted to migrate to BCH8 or even BCH16, as HAM1 was not
> sufficient for their usage.
> 
> So whatever you decide on this topic, please be cautious that you don't
> re-invent work for yourself, as I did. It took me lot of time and testing
> effort on multiple boards to migrate HAM1 to BCH8, And add BCH16 for newer
> devices.

Right no objections to using BCH8 for rootfs, except it stopped working
over past year or so.

And it seems the settings should be partition specific because of u-boot
requiring HAM1.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roger Quadros Nov. 13, 2014, 11:29 a.m. | #7
On 11/12/2014 08:02 PM, Tony Lindgren wrote:
> * pekon <pekon@pek-sem.com> [141109 11:31]:
>> On Saturday 08 November 2014 04:18 AM, Tony Lindgren wrote:
>>>
>>> Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
>>> must be ham1 to boot at all, that's what we should change for the
>>> devices that do not have not standardized on bch8.
>>>
>> My view on this is slightly different:
>> - Lately, everyone seems to have migrated to BCH8.
>>
>> - Also HAM1 does not have strength to fix bitflips in aging NAND. So if
>> someone already has OMAP3-LDP deployed on field then its NAND would have
>> already aged to such an extend that HAM1 may not be sufficient.
> 
> OK so it makes sense to keep it as BCH8 then. Would be nice to get
> the writing u-boot from kernel issue fixed somehow though as people
> are hitting that for sure.

What about performance impact? OMAP3 doesn't have ELM module. So error location
for BCH8 has to be done in software.

>  
>> My question is..
>> Moving back to HAM1 should be decided based on statistics rather than rule
>> of breaking backward compatibility. So please review:
>> (1) How many user of OMAP3-Zoom or other old boards complaining about using
>> BCH8 in mainline kernel? OR
> 
> 0
> 
>> (2) How many users of OMAP3 legacy boards are migrating to newer kernel?
> 
> Quite a few for sure, but are probably also using rootfs on MMC instead.
>  
>> At-least I have not, rather most of the OMAP3 users I interacted via TI's
>> E2E forum wanted to migrate to BCH8 or even BCH16, as HAM1 was not
>> sufficient for their usage.
>>
>> So whatever you decide on this topic, please be cautious that you don't
>> re-invent work for yourself, as I did. It took me lot of time and testing
>> effort on multiple boards to migrate HAM1 to BCH8, And add BCH16 for newer
>> devices.
> 
> Right no objections to using BCH8 for rootfs, except it stopped working
> over past year or so.

This would be BCH8-sw on OMAP3 right? AM3xx uses BCH8-hw and that seems to work fine.
So it seems nobody has used/tested BCH8-sw.

> 
> And it seems the settings should be partition specific because of u-boot
> requiring HAM1.
> 
I don't think we have partition specific ECC scheme support in u-boot or kernel,
so it will become messy to manage.
That means we either stick with HAM1 for all partitions or don't support NAND boot
at all and go with any other ECC scheme for OMAP3.

cheers,
-roger
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roger Quadros Nov. 13, 2014, noon | #8
On 11/09/2014 09:29 PM, pekon wrote:
> On Saturday 08 November 2014 04:18 AM, Tony Lindgren wrote:
>> * Roger Quadros <rogerq@ti.com> [141107 01:59]:
>>> On 11/07/2014 11:35 AM, Roger Quadros wrote:
>>>> On 11/06/2014 08:03 PM, Tony Lindgren wrote:
>>>>> * Roger Quadros <rogerq@ti.com> [141105 03:02]:
>>>>>> In commit 7d5929c1f343 ("mtd: nand: omap: Revert to using software ECC by default"),
>>>>>> we switched back to using 1-bit SW ECC scheme by default. However
>>>>>> commit b491da7233d5 ("mtd: nand: omap: clean-up ecc layout for BCH ecc schemes")
>>>>>> didn't take into account the 1-bit SW scheme (i.e. OMAP_ECC_HAM1_CODE_SW)
>>>>>> when checking for small page devices because it was already got rid of
>>>>>> one commit earlier. Consider OMAP_ECC_HAM1_CODE_SW while deciding
>>>>>> if we can proceed with small page devices or not.
>>>>>>
>>>>>> Fixes: 7d5929c1f34 ("mtd: nand: omap: Revert to using software ECC by default")
>>>>>>
>>>>>> Cc: <stable@vger.kernel.org>        [3.17+]
>>>>>> Reported-by: Tony Lindgren <tony@atomide.com>
>>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>>>>>> ---
> [...]
>>>
>>> Well I'm wrong about the OMAP3 information. OMAP3 does support BCH4 and BCH8 as well.
>>>
>>> I'm don't thinkg small page check is right. For BCH8 we need 13 bytes per 512 bytes.
>>> In the LDP case we have page size:1024 and OOB size: 32.
>>> Thus for BCH8 we need 26 bytes per page. which should fit in 32 bytes OOB.
>>> So this check is bogus in that case.
>>>
>>> Pekon,
>>>
>>> can you please explain why you check for 64 bytes OOB size for all ECC schemes?
>>>
> Accept this is a bug.
> As I re-review now, small-page check is applicable only for BCH ECC schemes, and not for HAM1.

But 32 bytes OOB can accomodate BCH8 codes. right?

> 
>>> Tony,
>>>
>>> The question for backward compatibility still remains. Even if OMAP3 supports bch8
>>> do we stick to the scheme that was used in legacy boot or do we switch?
>>
>> Well for the boards that had a standard file system, we should support
>> that. But at least for the LDP, there was only the "bootable microSD card"
>> AFAIK:
>>
>> http://support.logicpd.com/ProductDownloads/LegacyProducts/ZoomOMAP34xMDK.aspx
>>
> OMAP3 supports BCH ECC scheme.
> Document: http://www.ti.com/lit/pdf/spruf98
> (1) GPMC in OMAP3 supports BCH8
>     Section: 11.1.5.14.3.2 BCH Code
> (2) ROM code in GPMC supports BCH (but its not clear whether its BCH4 or BCH8)
>     Section: 25.4.7.4.3 MLC NAND Read Sector Procedure

No, ROM code on OMAP3 doesn't support standard BCH but its own proprietary scheme which combines BCH
checksum and other redundancy techniques, which means that MLC NAND is not recommended to be used
as boot media for OMAP3.

cheers,
-roger

> 
> But I agree that now it doesn't make sense to put effort on OMAP3 for upgrading default ECC schemes. If user has to migrate then there are other easy ways of doing so.
> 
> 
> 
>>> Then there is the question of boot rom compatibility. OMAP3 bootloaders use HAM1 scheme.
>>> So if you want to be able to flash bootloader from the kernel we have to stick with HAM1.
>>>
>>> changing the ECC scheme would mean that NAND filesystems created earlier won't work
>>> and will have to be erased and recreated.
>>
>> Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
>> must be ham1 to boot at all, that's what we should change for the
>> devices that do not have not standardized on bch8.
>>
> My view on this is slightly different:
> - Lately, everyone seems to have migrated to BCH8.
> 
> - Also HAM1 does not have strength to fix bitflips in aging NAND. So if someone already has OMAP3-LDP deployed on field then its NAND would have already aged to such an extend that HAM1 may not be sufficient.
> 
> My question is..
> Moving back to HAM1 should be decided based on statistics rather than rule of breaking backward compatibility. So please review:
> (1) How many user of OMAP3-Zoom or other old boards complaining about using BCH8 in mainline kernel? OR
> (2) How many users of OMAP3 legacy boards are migrating to newer kernel?
> 
> At-least I have not, rather most of the OMAP3 users I interacted via TI's E2E forum wanted to migrate to BCH8 or even BCH16, as HAM1 was not sufficient for their usage.
> 
> So whatever you decide on this topic, please be cautious that you don't re-invent work for yourself, as I did. It took me lot of time and testing effort on multiple boards to migrate HAM1 to BCH8, And add BCH16 for newer devices.
> 
> Also, now I realize u-boot (boot-loaders) are better place for all this migration.
> 
> 
>> Regards,
>>
>> Tony
>>
> 
> with regards, pekon
> 
> ------------------------
> Powered by BigRock.com
> 

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Rini Nov. 13, 2014, 4 p.m. | #9
On 11/13/2014 06:29 AM, Roger Quadros wrote:
> On 11/12/2014 08:02 PM, Tony Lindgren wrote:
>> * pekon <pekon@pek-sem.com> [141109 11:31]:
>>> On Saturday 08 November 2014 04:18 AM, Tony Lindgren wrote:
>>>>
>>>> Right. I doubt anybody has bch8 rootfs on LDP.. And considering u-boot
>>>> must be ham1 to boot at all, that's what we should change for the
>>>> devices that do not have not standardized on bch8.
>>>>
>>> My view on this is slightly different:
>>> - Lately, everyone seems to have migrated to BCH8.
>>>
>>> - Also HAM1 does not have strength to fix bitflips in aging NAND. So if
>>> someone already has OMAP3-LDP deployed on field then its NAND would have
>>> already aged to such an extend that HAM1 may not be sufficient.
>>
>> OK so it makes sense to keep it as BCH8 then. Would be nice to get
>> the writing u-boot from kernel issue fixed somehow though as people
>> are hitting that for sure.
> 
> What about performance impact? OMAP3 doesn't have ELM module. So error location
> for BCH8 has to be done in software.
> 
>>  
>>> My question is..
>>> Moving back to HAM1 should be decided based on statistics rather than rule
>>> of breaking backward compatibility. So please review:
>>> (1) How many user of OMAP3-Zoom or other old boards complaining about using
>>> BCH8 in mainline kernel? OR
>>
>> 0
>>
>>> (2) How many users of OMAP3 legacy boards are migrating to newer kernel?
>>
>> Quite a few for sure, but are probably also using rootfs on MMC instead.
>>  
>>> At-least I have not, rather most of the OMAP3 users I interacted via TI's
>>> E2E forum wanted to migrate to BCH8 or even BCH16, as HAM1 was not
>>> sufficient for their usage.
>>>
>>> So whatever you decide on this topic, please be cautious that you don't
>>> re-invent work for yourself, as I did. It took me lot of time and testing
>>> effort on multiple boards to migrate HAM1 to BCH8, And add BCH16 for newer
>>> devices.
>>
>> Right no objections to using BCH8 for rootfs, except it stopped working
>> over past year or so.
> 
> This would be BCH8-sw on OMAP3 right? AM3xx uses BCH8-hw and that seems to work fine.
> So it seems nobody has used/tested BCH8-sw.

A few years back, and I want to choose my words carefully when talking
about error correction, BCH8-sw was "working" well enough for rootfs (I
didn't induce any particular amount of errors or try and cause corner
cases, etc, etc).

>> And it seems the settings should be partition specific because of u-boot
>> requiring HAM1.
>>
> I don't think we have partition specific ECC scheme support in u-boot or kernel,
> so it will become messy to manage.
> That means we either stick with HAM1 for all partitions or don't support NAND boot
> at all and go with any other ECC scheme for OMAP3.

This is indeed where life starts getting more complicated than what we
allow for today in both U-Boot and Kernel, as I recall things anyhow.  I
think omap3 does (and I know am335x and later do) allow for saying ECC
is all done on-chip and ROM should do nothing.  But if you want to boot
you're going to be limited to HAM1 (or in some cases BCH4?  I didn't
have to deal with these parts myself so second-hand recollections here)
if you want to _boot_.  So you'll be in that particular area of the part
life-span where you have to worry about read disturbances and so forth.

We _really_ do need (in both kernel and U-Boot) the ability to say a
"partition" has ECC scheme X and another has scheme Y due to limitations
on what you can boot from vs what you need for the (continued) life span
of the device in question.
Tony Lindgren Nov. 13, 2014, 5:54 p.m. | #10
* Tom Rini <trini@ti.com> [141113 08:02]:
> On 11/13/2014 06:29 AM, Roger Quadros wrote:
> > On 11/12/2014 08:02 PM, Tony Lindgren wrote:
> >>
> >> Right no objections to using BCH8 for rootfs, except it stopped working
> >> over past year or so.
> > 
> > This would be BCH8-sw on OMAP3 right? AM3xx uses BCH8-hw and that seems to work fine.
> > So it seems nobody has used/tested BCH8-sw.
> 
> A few years back, and I want to choose my words carefully when talking
> about error correction, BCH8-sw was "working" well enough for rootfs (I
> didn't induce any particular amount of errors or try and cause corner
> cases, etc, etc).

Yes it still works, but for LDP it broke because of the regression
discussed in $thread.
 
> >> And it seems the settings should be partition specific because of u-boot
> >> requiring HAM1.
> >>
> > I don't think we have partition specific ECC scheme support in u-boot or kernel,
> > so it will become messy to manage.
> > That means we either stick with HAM1 for all partitions or don't support NAND boot
> > at all and go with any other ECC scheme for OMAP3.
> 
> This is indeed where life starts getting more complicated than what we
> allow for today in both U-Boot and Kernel, as I recall things anyhow.  I
> think omap3 does (and I know am335x and later do) allow for saying ECC
> is all done on-chip and ROM should do nothing.  But if you want to boot
> you're going to be limited to HAM1 (or in some cases BCH4?  I didn't
> have to deal with these parts myself so second-hand recollections here)
> if you want to _boot_.  So you'll be in that particular area of the part
> life-span where you have to worry about read disturbances and so forth.
> 
> We _really_ do need (in both kernel and U-Boot) the ability to say a
> "partition" has ECC scheme X and another has scheme Y due to limitations
> on what you can boot from vs what you need for the (continued) life span
> of the device in question.

Totally. Updating the bootloader should be doable safely from userspace
after booting the kernel. Or else should disable those partitions for
kernel because people are trashing their u-boot while trying to update.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
pekon Nov. 15, 2014, 11:12 a.m. | #11
Hi,

On Thursday 13 November 2014 04:59 PM, Roger Quadros wrote:
[...]
>
> What about performance impact? OMAP3 doesn't have ELM module. So error location
> for BCH8 has to be done in software.
>

Performance impact happens only when there are bit-flips in an page.
- For BCH8_HW ECC schemes
ELM comes in the picture only when during reads there is an ECC 
mis-match, if there is no ECC mis-match GPMC engine alone handles all 
the decoding part.

- For BCH8_SW ECC schemes
Same-way, lib/bch.8: decode_bch() path is taken only when ECC mismatch 
is detected on the read path, for normal read/write GPMC engine handles 
everything.

Yes, you may argue that with aging of the NAND the occurrence of 
bit-flips is common. That's true, but for new devices especially for SLC 
NAND the occurrence is usually rare on fresh parts. Also considering 
that most SLC are now manufactured in matured technologies.

Also, if you are using UBIFS on top of NAND, then it will scrub the data 
on first detection of bit-flips, thereby reducing the accumulation of 
bit-flips and thereby conserving read performance.


>>
>>
[...]
>> Right no objections to using BCH8 for rootfs, except it stopped working
>> over past year or so.
>
> This would be BCH8-sw on OMAP3 right? AM3xx uses BCH8-hw and that seems to work fine.
> So it seems nobody has used/tested BCH8-sw.
>
I have tested BCH8_SW on both AM335x and later SoC, and checked its 
compatibility with various combinations of u-boot. [1] should give you 
some indications of various combinations users can try ..


with regards, pekon

[1]
http://processors.wiki.ti.com/index.php/Linux_Core_NAND_User%27s_Guide#ECC_schemes_support


------------------------
Powered by BigRock.com

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch hide | download patch | download mbox

diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index 3b357e9..758e594 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1742,7 +1742,8 @@  static int omap_nand_probe(struct platform_device *pdev)
 	}
 
 	/* check for small page devices */
-	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW)) {
+	if ((mtd->oobsize < 64) && (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_HW) &&
+	    (pdata->ecc_opt != OMAP_ECC_HAM1_CODE_SW)) {
 		dev_err(&info->pdev->dev, "small page devices are not supported\n");
 		err = -EINVAL;
 		goto return_error;