diff mbox series

[01/14] mtd: gpmi: change the BCH layout setting for large oob NAND

Message ID 20200504140903.23602-2-peng.fan@nxp.com
State New
Headers show
Series mtd: nand: i.MX update | expand

Commit Message

Peng Fan May 4, 2020, 2:08 p.m. UTC
From: Ye Li <ye.li at nxp.com>

The code change updated the NAND driver BCH ECC layout algorithm to
support large oob size NAND chips(oob > 1024 bytes) and proposed a new
way to set ECC layout.

Current implementation requires each chunk size larger than oob size so
the bad block marker (BBM) can be guaranteed located in data chunk. The
ECC layout always using the unbalanced layout(Ecc for both meta and
Data0 chunk), but for the NAND chips with oob larger than 1k, the driver
cannot support because BCH doesn?t support GF 15 for 2K chunk.

The change keeps the data chunk no larger than 1k and adjust the ECC
strength or ECC layout to locate the BBM in data chunk. General idea for
large oob NAND chips is

1.Try all ECC strength from the minimum value required by NAND spec to
the maximum one that works, any ECC makes the BBM locate in data chunk
can be chosen.

2.If none of them works, using separate ECC for meta, which will add one
extra ecc with the same ECC strength as other data chunks. This extra
ECC can guarantee BBM located in data chunk, of course, we need to check
if oob can afford it.

Previous code has two methods for ECC layout setting, the
legacy_calc_ecc_layout and calc_ecc_layout_by_info, the difference
between these two methods is, legacy_calc_ecc_layout set the chunk size
larger chan oob size and then set the maximum ECC strength that oob can
afford. While the calc_ecc_layout_by_info set chunk size and ECC
strength according to NAND spec. It has been proved that the first
method cannot provide safe ECC strength for some modern NAND chips, so
in current code,

1. Driver read NAND parameters first and then chose the proper ECC
layout setting method.

2. If the oob is large or NAND required data chunk larger than oob size,
chose calc_ecc_for_large_oob, otherwise use calc_ecc_layout_by_info

3. legacy_calc_ecc_layout only used for some NAND chips does not contains
necessary information. So this is only a backup plan, it is NOT
recommended to use these NAND chips.

Signed-off-by: Han Xu <b45815 at freescale.com>
Signed-off-by: Ye Li <ye.li at nxp.com>
Signed-off-by: Peng Fan <peng.fan at nxp.com>
---
 drivers/mtd/nand/raw/mxs_nand.c | 205 ++++++++++++++++++++++++++--------------
 include/mxs_nand.h              |  12 ++-
 2 files changed, 144 insertions(+), 73 deletions(-)

Comments

Stefano Babic May 11, 2020, 10:16 a.m. UTC | #1
> From: Ye Li <ye.li at nxp.com>
> The code change updated the NAND driver BCH ECC layout algorithm to
> support large oob size NAND chips(oob > 1024 bytes) and proposed a new
> way to set ECC layout.
> Current implementation requires each chunk size larger than oob size so
> the bad block marker (BBM) can be guaranteed located in data chunk. The
> ECC layout always using the unbalanced layout(Ecc for both meta and
> Data0 chunk), but for the NAND chips with oob larger than 1k, the driver
> cannot support because BCH doesn?t support GF 15 for 2K chunk.
> The change keeps the data chunk no larger than 1k and adjust the ECC
> strength or ECC layout to locate the BBM in data chunk. General idea for
> large oob NAND chips is
> 1.Try all ECC strength from the minimum value required by NAND spec to
> the maximum one that works, any ECC makes the BBM locate in data chunk
> can be chosen.
> 2.If none of them works, using separate ECC for meta, which will add one
> extra ecc with the same ECC strength as other data chunks. This extra
> ECC can guarantee BBM located in data chunk, of course, we need to check
> if oob can afford it.
> Previous code has two methods for ECC layout setting, the
> legacy_calc_ecc_layout and calc_ecc_layout_by_info, the difference
> between these two methods is, legacy_calc_ecc_layout set the chunk size
> larger chan oob size and then set the maximum ECC strength that oob can
> afford. While the calc_ecc_layout_by_info set chunk size and ECC
> strength according to NAND spec. It has been proved that the first
> method cannot provide safe ECC strength for some modern NAND chips, so
> in current code,
> 1. Driver read NAND parameters first and then chose the proper ECC
> layout setting method.
> 2. If the oob is large or NAND required data chunk larger than oob size,
> chose calc_ecc_for_large_oob, otherwise use calc_ecc_layout_by_info
> 3. legacy_calc_ecc_layout only used for some NAND chips does not contains
> necessary information. So this is only a backup plan, it is NOT
> recommended to use these NAND chips.
> Signed-off-by: Han Xu <b45815 at freescale.com>
> Signed-off-by: Ye Li <ye.li at nxp.com>
> Signed-off-by: Peng Fan <peng.fan at nxp.com>
Applied to u-boot-imx, master, thanks !

Best regards,
Stefano Babic
Fabio Estevam March 16, 2022, 2:08 p.m. UTC | #2
Adding Han Xu's NXP email on Cc.

On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
<frieder.schrempf@kontron.de> wrote:
>
> Hello everyone,
>
> sorry to dig out an old thread, but the below patch which was applied
> upstream as 616f03dabacb causes a regression for me when trying to
> attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
> AMD/Spansion parallel NAND.
>
> The failure looks like this:
>
> ubi0: attaching mtd2
> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> from PEB 0:0, read 64 bytes
> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> from PEB 0:2048, read 2048 bytes
> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> from PEB 1:0, read 64 bytes
> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> from PEB 1:2048, read 2048 bytes
>
> The NAND as reported by Linux is:
>
> nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
> nand: AMD/Spansion S34ML04G1
> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>
> A different revision of the same board with a different NAND from
> manufacturer ESMT doesn't show the issue:
>
> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
> nand: ESMT NAND 512MiB 3,3V 8-bit
> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>
> When I revert the mentioned commit (see patch here: [1]), the UBI boot
> starts working again.
>
> Does anyone know what the problem is and how to properly solve it?
>
> Thanks for any help!
> Frieder
>
> [1]
> https://zerobin.net/?57a57a322bbdcf3c#rZa3vHlWi+RxtRomoljtrngqWwiv6v4Js/2LNfdV10o=
>
> Am 04.05.20 um 16:08 schrieb Peng Fan:
> > From: Ye Li <ye.li@nxp.com>
> >
> > The code change updated the NAND driver BCH ECC layout algorithm to
> > support large oob size NAND chips(oob > 1024 bytes) and proposed a new
> > way to set ECC layout.
> >
> > Current implementation requires each chunk size larger than oob size so
> > the bad block marker (BBM) can be guaranteed located in data chunk. The
> > ECC layout always using the unbalanced layout(Ecc for both meta and
> > Data0 chunk), but for the NAND chips with oob larger than 1k, the driver
> > cannot support because BCH doesn’t support GF 15 for 2K chunk.
> >
> > The change keeps the data chunk no larger than 1k and adjust the ECC
> > strength or ECC layout to locate the BBM in data chunk. General idea for
> > large oob NAND chips is
> >
> > 1.Try all ECC strength from the minimum value required by NAND spec to
> > the maximum one that works, any ECC makes the BBM locate in data chunk
> > can be chosen.
> >
> > 2.If none of them works, using separate ECC for meta, which will add one
> > extra ecc with the same ECC strength as other data chunks. This extra
> > ECC can guarantee BBM located in data chunk, of course, we need to check
> > if oob can afford it.
> >
> > Previous code has two methods for ECC layout setting, the
> > legacy_calc_ecc_layout and calc_ecc_layout_by_info, the difference
> > between these two methods is, legacy_calc_ecc_layout set the chunk size
> > larger chan oob size and then set the maximum ECC strength that oob can
> > afford. While the calc_ecc_layout_by_info set chunk size and ECC
> > strength according to NAND spec. It has been proved that the first
> > method cannot provide safe ECC strength for some modern NAND chips, so
> > in current code,
> >
> > 1. Driver read NAND parameters first and then chose the proper ECC
> > layout setting method.
> >
> > 2. If the oob is large or NAND required data chunk larger than oob size,
> > chose calc_ecc_for_large_oob, otherwise use calc_ecc_layout_by_info
> >
> > 3. legacy_calc_ecc_layout only used for some NAND chips does not contains
> > necessary information. So this is only a backup plan, it is NOT
> > recommended to use these NAND chips.
> >
> > Signed-off-by: Han Xu <b45815@freescale.com>
> > Signed-off-by: Ye Li <ye.li@nxp.com>
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/mtd/nand/raw/mxs_nand.c | 205 ++++++++++++++++++++++++++--------------
> >  include/mxs_nand.h              |  12 ++-
> >  2 files changed, 144 insertions(+), 73 deletions(-)
> >
> > diff --git a/drivers/mtd/nand/raw/mxs_nand.c b/drivers/mtd/nand/raw/mxs_nand.c
> > index fe8097c146..8da59e39c0 100644
> > --- a/drivers/mtd/nand/raw/mxs_nand.c
> > +++ b/drivers/mtd/nand/raw/mxs_nand.c
> > @@ -112,53 +112,32 @@ static uint32_t mxs_nand_aux_status_offset(void)
> >       return (MXS_NAND_METADATA_SIZE + 0x3) & ~0x3;
> >  }
> >
> > -static inline int mxs_nand_calc_mark_offset(struct bch_geometry *geo,
> > -                                         uint32_t page_data_size)
> > +static inline bool mxs_nand_bbm_in_data_chunk(struct bch_geometry *geo, struct mtd_info *mtd,
> > +             unsigned int *chunk_num)
> >  {
> > -     uint32_t chunk_data_size_in_bits = geo->ecc_chunk_size * 8;
> > -     uint32_t chunk_ecc_size_in_bits = geo->ecc_strength * geo->gf_len;
> > -     uint32_t chunk_total_size_in_bits;
> > -     uint32_t block_mark_chunk_number;
> > -     uint32_t block_mark_chunk_bit_offset;
> > -     uint32_t block_mark_bit_offset;
> > +     unsigned int i, j;
> >
> > -     chunk_total_size_in_bits =
> > -                     chunk_data_size_in_bits + chunk_ecc_size_in_bits;
> > -
> > -     /* Compute the bit offset of the block mark within the physical page. */
> > -     block_mark_bit_offset = page_data_size * 8;
> > -
> > -     /* Subtract the metadata bits. */
> > -     block_mark_bit_offset -= MXS_NAND_METADATA_SIZE * 8;
> > -
> > -     /*
> > -      * Compute the chunk number (starting at zero) in which the block mark
> > -      * appears.
> > -      */
> > -     block_mark_chunk_number =
> > -                     block_mark_bit_offset / chunk_total_size_in_bits;
> > -
> > -     /*
> > -      * Compute the bit offset of the block mark within its chunk, and
> > -      * validate it.
> > -      */
> > -     block_mark_chunk_bit_offset = block_mark_bit_offset -
> > -                     (block_mark_chunk_number * chunk_total_size_in_bits);
> > +     if (geo->ecc_chunk0_size != geo->ecc_chunkn_size) {
> > +             dev_err(this->dev, "The size of chunk0 must equal to chunkn\n");
> > +             return false;
> > +     }
> >
> > -     if (block_mark_chunk_bit_offset > chunk_data_size_in_bits)
> > -             return -EINVAL;
> > +     i = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) /
> > +             (geo->gf_len * geo->ecc_strength +
> > +                             geo->ecc_chunkn_size * 8);
> >
> > -     /*
> > -      * Now that we know the chunk number in which the block mark appears,
> > -      * we can subtract all the ECC bits that appear before it.
> > -      */
> > -     block_mark_bit_offset -=
> > -             block_mark_chunk_number * chunk_ecc_size_in_bits;
> > +     j = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) -
> > +             (geo->gf_len * geo->ecc_strength +
> > +                             geo->ecc_chunkn_size * 8) * i;
> >
> > -     geo->block_mark_byte_offset = block_mark_bit_offset >> 3;
> > -     geo->block_mark_bit_offset = block_mark_bit_offset & 0x7;
> > +     if (j < geo->ecc_chunkn_size * 8) {
> > +             *chunk_num = i + 1;
> > +             dev_dbg(this->dev, "Set ecc to %d and bbm in chunk %d\n",
> > +                     geo->ecc_strength, *chunk_num);
> > +             return true;
> > +     }
> >
> > -     return 0;
> > +     return false;
> >  }
> >
> >  static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> > @@ -168,6 +147,7 @@ static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> >  {
> >       struct nand_chip *chip = mtd_to_nand(mtd);
> >       struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > +     unsigned int block_mark_bit_offset;
> >
> >       switch (ecc_step) {
> >       case SZ_512:
> > @@ -180,45 +160,51 @@ static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> >               return -EINVAL;
> >       }
> >
> > -     geo->ecc_chunk_size = ecc_step;
> > +     geo->ecc_chunk0_size = ecc_step;
> > +     geo->ecc_chunkn_size = ecc_step;
> >       geo->ecc_strength = round_up(ecc_strength, 2);
> >
> >       /* Keep the C >= O */
> > -     if (geo->ecc_chunk_size < mtd->oobsize)
> > +     if (geo->ecc_chunkn_size < mtd->oobsize)
> >               return -EINVAL;
> >
> >       if (geo->ecc_strength > nand_info->max_ecc_strength_supported)
> >               return -EINVAL;
> >
> > -     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
> > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> > +
> > +     /* For bit swap. */
> > +     block_mark_bit_offset = mtd->writesize * 8 -
> > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
> > +                             + MXS_NAND_METADATA_SIZE * 8);
> > +
> > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> >
> >       return 0;
> >  }
> >
> > -static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
> > +static inline int mxs_nand_legacy_calc_ecc_layout(struct bch_geometry *geo,
> >                                          struct mtd_info *mtd)
> >  {
> >       struct nand_chip *chip = mtd_to_nand(mtd);
> >       struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > +     unsigned int block_mark_bit_offset;
> >
> >       /* The default for the length of Galois Field. */
> >       geo->gf_len = 13;
> >
> >       /* The default for chunk size. */
> > -     geo->ecc_chunk_size = 512;
> > +     geo->ecc_chunk0_size = 512;
> > +     geo->ecc_chunkn_size = 512;
> >
> > -     if (geo->ecc_chunk_size < mtd->oobsize) {
> > +     if (geo->ecc_chunkn_size < mtd->oobsize) {
> >               geo->gf_len = 14;
> > -             geo->ecc_chunk_size *= 2;
> > +             geo->ecc_chunk0_size *= 2;
> > +             geo->ecc_chunkn_size *= 2;
> >       }
> >
> > -     if (mtd->oobsize > geo->ecc_chunk_size) {
> > -             printf("Not support the NAND chips whose oob size is larger then %d bytes!\n",
> > -                    geo->ecc_chunk_size);
> > -             return -EINVAL;
> > -     }
> > -
> > -     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
> > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> >
> >       /*
> >        * Determine the ECC layout with the formula:
> > @@ -234,6 +220,84 @@ static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
> >       geo->ecc_strength = min(round_down(geo->ecc_strength, 2),
> >                               nand_info->max_ecc_strength_supported);
> >
> > +     block_mark_bit_offset = mtd->writesize * 8 -
> > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
> > +                             + MXS_NAND_METADATA_SIZE * 8);
> > +
> > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> > +
> > +     return 0;
> > +}
> > +
> > +static inline int mxs_nand_calc_ecc_for_large_oob(struct bch_geometry *geo,
> > +                                        struct mtd_info *mtd)
> > +{
> > +     struct nand_chip *chip = mtd_to_nand(mtd);
> > +     struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > +     unsigned int block_mark_bit_offset;
> > +     unsigned int max_ecc;
> > +     unsigned int bbm_chunk;
> > +     unsigned int i;
> > +
> > +     /* sanity check for the minimum ecc nand required */
> > +     if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
> > +             return -EINVAL;
> > +     geo->ecc_strength = chip->ecc_strength_ds;
> > +
> > +     /* calculate the maximum ecc platform can support*/
> > +     geo->gf_len = 14;
> > +     geo->ecc_chunk0_size = 1024;
> > +     geo->ecc_chunkn_size = 1024;
> > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> > +     max_ecc = ((mtd->oobsize - MXS_NAND_METADATA_SIZE) * 8)
> > +                     / (geo->gf_len * geo->ecc_chunk_count);
> > +     max_ecc = min(round_down(max_ecc, 2),
> > +                             nand_info->max_ecc_strength_supported);
> > +
> > +
> > +     /* search a supported ecc strength that makes bbm */
> > +     /* located in data chunk  */
> > +     geo->ecc_strength = chip->ecc_strength_ds;
> > +     while (!(geo->ecc_strength > max_ecc)) {
> > +             if (mxs_nand_bbm_in_data_chunk(geo, mtd, &bbm_chunk))
> > +                     break;
> > +             geo->ecc_strength += 2;
> > +     }
> > +
> > +     /* if none of them works, keep using the minimum ecc */
> > +     /* nand required but changing ecc page layout  */
> > +     if (geo->ecc_strength > max_ecc) {
> > +             geo->ecc_strength = chip->ecc_strength_ds;
> > +             /* add extra ecc for meta data */
> > +             geo->ecc_chunk0_size = 0;
> > +             geo->ecc_chunk_count = (mtd->writesize / geo->ecc_chunkn_size) + 1;
> > +             geo->ecc_for_meta = 1;
> > +             /* check if oob can afford this extra ecc chunk */
> > +             if (mtd->oobsize * 8 < MXS_NAND_METADATA_SIZE * 8 +
> > +                             geo->gf_len * geo->ecc_strength
> > +                             * geo->ecc_chunk_count) {
> > +                     printf("unsupported NAND chip with new layout\n");
> > +                     return -EINVAL;
> > +             }
> > +
> > +             /* calculate in which chunk bbm located */
> > +             bbm_chunk = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8 -
> > +                     geo->gf_len * geo->ecc_strength) /
> > +                     (geo->gf_len * geo->ecc_strength +
> > +                                     geo->ecc_chunkn_size * 8) + 1;
> > +     }
> > +
> > +     /* calculate the number of ecc chunk behind the bbm */
> > +     i = (mtd->writesize / geo->ecc_chunkn_size) - bbm_chunk + 1;
> > +
> > +     block_mark_bit_offset = mtd->writesize * 8 -
> > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - i)
> > +                             + MXS_NAND_METADATA_SIZE * 8);
> > +
> > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> > +
> >       return 0;
> >  }
> >
> > @@ -983,18 +1047,23 @@ static int mxs_nand_set_geometry(struct mtd_info *mtd, struct bch_geometry *geo)
> >       struct nand_chip *nand = mtd_to_nand(mtd);
> >       struct mxs_nand_info *nand_info = nand_get_controller_data(nand);
> >
> > -     if (chip->ecc.strength > 0 && chip->ecc.size > 0)
> > -             return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> > -                             chip->ecc.strength, chip->ecc.size);
> > +     if (chip->ecc_strength_ds > nand_info->max_ecc_strength_supported) {
> > +             printf("unsupported NAND chip, minimum ecc required %d\n"
> > +                     , chip->ecc_strength_ds);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0) &&
> > +                     (mtd->oobsize < 1024)) {
> > +             dev_warn(this->dev, "use legacy bch geometry\n");
> > +             return mxs_nand_legacy_calc_ecc_layout(geo, mtd);
> > +     }
> >
> > -     if (nand_info->use_minimum_ecc ||
> > -             mxs_nand_calc_ecc_layout(geo, mtd)) {
> > -             if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
> > -                     return -EINVAL;
> > +     if (mtd->oobsize > 1024 || chip->ecc_step_ds < mtd->oobsize)
> > +             return mxs_nand_calc_ecc_for_large_oob(geo, mtd);
> >
> > -             return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> > +     return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> >                               chip->ecc_strength_ds, chip->ecc_step_ds);
> > -     }
> >
> >       return 0;
> >  }
> > @@ -1025,8 +1094,6 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> >       if (ret)
> >               return ret;
> >
> > -     mxs_nand_calc_mark_offset(geo, mtd->writesize);
> > -
> >       /* Configure BCH and set NFC geometry */
> >       mxs_reset_block(&bch_regs->hw_bch_ctrl_reg);
> >
> > @@ -1034,7 +1101,7 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> >       tmp = (geo->ecc_chunk_count - 1) << BCH_FLASHLAYOUT0_NBLOCKS_OFFSET;
> >       tmp |= MXS_NAND_METADATA_SIZE << BCH_FLASHLAYOUT0_META_SIZE_OFFSET;
> >       tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT0_ECC0_OFFSET;
> > -     tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > +     tmp |= geo->ecc_chunk0_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> >       tmp |= (geo->gf_len == 14 ? 1 : 0) <<
> >               BCH_FLASHLAYOUT0_GF13_0_GF14_1_OFFSET;
> >       writel(tmp, &bch_regs->hw_bch_flash0layout0);
> > @@ -1043,7 +1110,7 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> >       tmp = (mtd->writesize + mtd->oobsize)
> >               << BCH_FLASHLAYOUT1_PAGE_SIZE_OFFSET;
> >       tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT1_ECCN_OFFSET;
> > -     tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > +     tmp |= geo->ecc_chunkn_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> >       tmp |= (geo->gf_len == 14 ? 1 : 0) <<
> >               BCH_FLASHLAYOUT1_GF13_0_GF14_1_OFFSET;
> >       writel(tmp, &bch_regs->hw_bch_flash0layout1);
> > @@ -1268,7 +1335,7 @@ int mxs_nand_init_ctrl(struct mxs_nand_info *nand_info)
> >
> >       nand->ecc.layout        = &fake_ecc_layout;
> >       nand->ecc.mode          = NAND_ECC_HW;
> > -     nand->ecc.size          = nand_info->bch_geometry.ecc_chunk_size;
> > +     nand->ecc.size          = nand_info->bch_geometry.ecc_chunkn_size;
> >       nand->ecc.strength      = nand_info->bch_geometry.ecc_strength;
> >
> >       /* second phase scan */
> > diff --git a/include/mxs_nand.h b/include/mxs_nand.h
> > index ada20483d0..497da77a16 100644
> > --- a/include/mxs_nand.h
> > +++ b/include/mxs_nand.h
> > @@ -16,22 +16,26 @@
> >   * @gf_len:                   The length of Galois Field. (e.g., 13 or 14)
> >   * @ecc_strength:             A number that describes the strength of the ECC
> >   *                            algorithm.
> > - * @ecc_chunk_size:           The size, in bytes, of a single ECC chunk. Note
> > - *                            the first chunk in the page includes both data and
> > - *                            metadata, so it's a bit larger than this value.
> > + * @ecc_chunk0_size:          The size, in bytes, of a first ECC chunk.
> > + * @ecc_chunkn_size:          The size, in bytes, of a single ECC chunk after
> > + *                            the first chunk in the page.
> >   * @ecc_chunk_count:          The number of ECC chunks in the page,
> >   * @block_mark_byte_offset:   The byte offset in the ECC-based page view at
> >   *                            which the underlying physical block mark appears.
> >   * @block_mark_bit_offset:    The bit offset into the ECC-based page view at
> >   *                            which the underlying physical block mark appears.
> > + * @ecc_for_meta:             The flag to indicate if there is a dedicate ecc
> > + *                            for meta.
> >   */
> >  struct bch_geometry {
> >       unsigned int  gf_len;
> >       unsigned int  ecc_strength;
> > -     unsigned int  ecc_chunk_size;
> > +     unsigned int  ecc_chunk0_size;
> > +     unsigned int  ecc_chunkn_size;
> >       unsigned int  ecc_chunk_count;
> >       unsigned int  block_mark_byte_offset;
> >       unsigned int  block_mark_bit_offset;
> > +     unsigned int  ecc_for_meta; /* ECC for meta data */
> >  };
> >
> >  struct mxs_nand_info {
> >
Tim Harvey March 16, 2022, 4:34 p.m. UTC | #3
On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com> wrote:
>
> Adding Han Xu's NXP email on Cc.
>
> On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
> <frieder.schrempf@kontron.de> wrote:
> >
> > Hello everyone,
> >
> > sorry to dig out an old thread, but the below patch which was applied
> > upstream as 616f03dabacb causes a regression for me when trying to
> > attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
> > AMD/Spansion parallel NAND.
> >
> > The failure looks like this:
> >
> > ubi0: attaching mtd2
> > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> > from PEB 0:0, read 64 bytes
> > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> > from PEB 0:2048, read 2048 bytes
> > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> > from PEB 1:0, read 64 bytes
> > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> > from PEB 1:2048, read 2048 bytes
> >
> > The NAND as reported by Linux is:
> >
> > nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
> > nand: AMD/Spansion S34ML04G1
> > nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >
> > A different revision of the same board with a different NAND from
> > manufacturer ESMT doesn't show the issue:
> >
> > nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
> > nand: ESMT NAND 512MiB 3,3V 8-bit
> > nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >
> > When I revert the mentioned commit (see patch here: [1]), the UBI boot
> > starts working again.
> >
> > Does anyone know what the problem is and how to properly solve it?
> >
> > Thanks for any help!
> > Frieder
> >
> > [1]
> > https://zerobin.net/?57a57a322bbdcf3c#rZa3vHlWi+RxtRomoljtrngqWwiv6v4Js/2LNfdV10o=
> >

Frieder,

I see the same issue here with IMX6Q/DL GPMI NAND.

If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi &&
nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
find that U-Boot can attach and mount the ubi fine but Linux will have
issues

Ventana > ubi part rootfs
ubi0: attaching mtd3
ubi0: scanning is finished
ubi0: attached mtd3 (name "rootfs", size 2031 MiB)
ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
ubi0: good PEBs: 8123, bad PEBs: 1, corrupted PEBs: 0
ubi0: user volume: 3, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 1/0, WL threshold: 4096, image sequence
number: 1632759352
ubi0: available PEBs: 0, total reserved PEBs: 8123, PEBs reserved for
bad PEB handling: 159
Ventana > ubifsmount ubi0:boot
Ventana > ubifsls
            38158  Mon Sep 27 17:18:10 2021  gateworks-imx6-imx6q-gw5400-a.dtb
            41659  Mon Sep 27 17:18:07 2021  gateworks-imx6-imx6dl-gw53xx.dtb
            42964  Mon Sep 27 17:18:10 2021  gateworks-imx6-imx6q-gw53xx.dtb
...

[    2.065328] ubi0: attaching mtd2
[    2.092015] ubi0 warning: ubi_io_read: error -74 (ECC error) while
reading 253952 bytes from PEB 11:8192, read only 253952 bytes, retry
[    2.096132] mmc0: host does not support reading read-only switch,
assuming write-enable
[    2.119427] mmc0: new high speed SDHC card at address 0007
[    2.126035] mmcblk0: mmc0:0007 SD32G 29.9 GiB
[    2.130221] ubi0 warning: ubi_io_read: error -74 (ECC error) while
reading 253952 bytes from PEB 11:8192, read only 253952 bytes, retry
[    2.131746]  mmcblk0: p1
[    2.164579] ubi0 warning: ubi_io_read: error -74 (ECC error) while
reading 253952 bytes from PEB 11:8192, read only 253952 bytes, retry
[    2.197969] ubi0 error: ubi_io_read: error -74 (ECC error) while
reading 253952 bytes from PEB 11:8192, read 253952 bytes

Best regards,

Tim

> > Am 04.05.20 um 16:08 schrieb Peng Fan:
> > > From: Ye Li <ye.li@nxp.com>
> > >
> > > The code change updated the NAND driver BCH ECC layout algorithm to
> > > support large oob size NAND chips(oob > 1024 bytes) and proposed a new
> > > way to set ECC layout.
> > >
> > > Current implementation requires each chunk size larger than oob size so
> > > the bad block marker (BBM) can be guaranteed located in data chunk. The
> > > ECC layout always using the unbalanced layout(Ecc for both meta and
> > > Data0 chunk), but for the NAND chips with oob larger than 1k, the driver
> > > cannot support because BCH doesn’t support GF 15 for 2K chunk.
> > >
> > > The change keeps the data chunk no larger than 1k and adjust the ECC
> > > strength or ECC layout to locate the BBM in data chunk. General idea for
> > > large oob NAND chips is
> > >
> > > 1.Try all ECC strength from the minimum value required by NAND spec to
> > > the maximum one that works, any ECC makes the BBM locate in data chunk
> > > can be chosen.
> > >
> > > 2.If none of them works, using separate ECC for meta, which will add one
> > > extra ecc with the same ECC strength as other data chunks. This extra
> > > ECC can guarantee BBM located in data chunk, of course, we need to check
> > > if oob can afford it.
> > >
> > > Previous code has two methods for ECC layout setting, the
> > > legacy_calc_ecc_layout and calc_ecc_layout_by_info, the difference
> > > between these two methods is, legacy_calc_ecc_layout set the chunk size
> > > larger chan oob size and then set the maximum ECC strength that oob can
> > > afford. While the calc_ecc_layout_by_info set chunk size and ECC
> > > strength according to NAND spec. It has been proved that the first
> > > method cannot provide safe ECC strength for some modern NAND chips, so
> > > in current code,
> > >
> > > 1. Driver read NAND parameters first and then chose the proper ECC
> > > layout setting method.
> > >
> > > 2. If the oob is large or NAND required data chunk larger than oob size,
> > > chose calc_ecc_for_large_oob, otherwise use calc_ecc_layout_by_info
> > >
> > > 3. legacy_calc_ecc_layout only used for some NAND chips does not contains
> > > necessary information. So this is only a backup plan, it is NOT
> > > recommended to use these NAND chips.
> > >
> > > Signed-off-by: Han Xu <b45815@freescale.com>
> > > Signed-off-by: Ye Li <ye.li@nxp.com>
> > > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > > ---
> > >  drivers/mtd/nand/raw/mxs_nand.c | 205 ++++++++++++++++++++++++++--------------
> > >  include/mxs_nand.h              |  12 ++-
> > >  2 files changed, 144 insertions(+), 73 deletions(-)
> > >
> > > diff --git a/drivers/mtd/nand/raw/mxs_nand.c b/drivers/mtd/nand/raw/mxs_nand.c
> > > index fe8097c146..8da59e39c0 100644
> > > --- a/drivers/mtd/nand/raw/mxs_nand.c
> > > +++ b/drivers/mtd/nand/raw/mxs_nand.c
> > > @@ -112,53 +112,32 @@ static uint32_t mxs_nand_aux_status_offset(void)
> > >       return (MXS_NAND_METADATA_SIZE + 0x3) & ~0x3;
> > >  }
> > >
> > > -static inline int mxs_nand_calc_mark_offset(struct bch_geometry *geo,
> > > -                                         uint32_t page_data_size)
> > > +static inline bool mxs_nand_bbm_in_data_chunk(struct bch_geometry *geo, struct mtd_info *mtd,
> > > +             unsigned int *chunk_num)
> > >  {
> > > -     uint32_t chunk_data_size_in_bits = geo->ecc_chunk_size * 8;
> > > -     uint32_t chunk_ecc_size_in_bits = geo->ecc_strength * geo->gf_len;
> > > -     uint32_t chunk_total_size_in_bits;
> > > -     uint32_t block_mark_chunk_number;
> > > -     uint32_t block_mark_chunk_bit_offset;
> > > -     uint32_t block_mark_bit_offset;
> > > +     unsigned int i, j;
> > >
> > > -     chunk_total_size_in_bits =
> > > -                     chunk_data_size_in_bits + chunk_ecc_size_in_bits;
> > > -
> > > -     /* Compute the bit offset of the block mark within the physical page. */
> > > -     block_mark_bit_offset = page_data_size * 8;
> > > -
> > > -     /* Subtract the metadata bits. */
> > > -     block_mark_bit_offset -= MXS_NAND_METADATA_SIZE * 8;
> > > -
> > > -     /*
> > > -      * Compute the chunk number (starting at zero) in which the block mark
> > > -      * appears.
> > > -      */
> > > -     block_mark_chunk_number =
> > > -                     block_mark_bit_offset / chunk_total_size_in_bits;
> > > -
> > > -     /*
> > > -      * Compute the bit offset of the block mark within its chunk, and
> > > -      * validate it.
> > > -      */
> > > -     block_mark_chunk_bit_offset = block_mark_bit_offset -
> > > -                     (block_mark_chunk_number * chunk_total_size_in_bits);
> > > +     if (geo->ecc_chunk0_size != geo->ecc_chunkn_size) {
> > > +             dev_err(this->dev, "The size of chunk0 must equal to chunkn\n");
> > > +             return false;
> > > +     }
> > >
> > > -     if (block_mark_chunk_bit_offset > chunk_data_size_in_bits)
> > > -             return -EINVAL;
> > > +     i = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) /
> > > +             (geo->gf_len * geo->ecc_strength +
> > > +                             geo->ecc_chunkn_size * 8);
> > >
> > > -     /*
> > > -      * Now that we know the chunk number in which the block mark appears,
> > > -      * we can subtract all the ECC bits that appear before it.
> > > -      */
> > > -     block_mark_bit_offset -=
> > > -             block_mark_chunk_number * chunk_ecc_size_in_bits;
> > > +     j = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) -
> > > +             (geo->gf_len * geo->ecc_strength +
> > > +                             geo->ecc_chunkn_size * 8) * i;
> > >
> > > -     geo->block_mark_byte_offset = block_mark_bit_offset >> 3;
> > > -     geo->block_mark_bit_offset = block_mark_bit_offset & 0x7;
> > > +     if (j < geo->ecc_chunkn_size * 8) {
> > > +             *chunk_num = i + 1;
> > > +             dev_dbg(this->dev, "Set ecc to %d and bbm in chunk %d\n",
> > > +                     geo->ecc_strength, *chunk_num);
> > > +             return true;
> > > +     }
> > >
> > > -     return 0;
> > > +     return false;
> > >  }
> > >
> > >  static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> > > @@ -168,6 +147,7 @@ static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> > >  {
> > >       struct nand_chip *chip = mtd_to_nand(mtd);
> > >       struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > > +     unsigned int block_mark_bit_offset;
> > >
> > >       switch (ecc_step) {
> > >       case SZ_512:
> > > @@ -180,45 +160,51 @@ static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
> > >               return -EINVAL;
> > >       }
> > >
> > > -     geo->ecc_chunk_size = ecc_step;
> > > +     geo->ecc_chunk0_size = ecc_step;
> > > +     geo->ecc_chunkn_size = ecc_step;
> > >       geo->ecc_strength = round_up(ecc_strength, 2);
> > >
> > >       /* Keep the C >= O */
> > > -     if (geo->ecc_chunk_size < mtd->oobsize)
> > > +     if (geo->ecc_chunkn_size < mtd->oobsize)
> > >               return -EINVAL;
> > >
> > >       if (geo->ecc_strength > nand_info->max_ecc_strength_supported)
> > >               return -EINVAL;
> > >
> > > -     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
> > > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> > > +
> > > +     /* For bit swap. */
> > > +     block_mark_bit_offset = mtd->writesize * 8 -
> > > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
> > > +                             + MXS_NAND_METADATA_SIZE * 8);
> > > +
> > > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> > >
> > >       return 0;
> > >  }
> > >
> > > -static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
> > > +static inline int mxs_nand_legacy_calc_ecc_layout(struct bch_geometry *geo,
> > >                                          struct mtd_info *mtd)
> > >  {
> > >       struct nand_chip *chip = mtd_to_nand(mtd);
> > >       struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > > +     unsigned int block_mark_bit_offset;
> > >
> > >       /* The default for the length of Galois Field. */
> > >       geo->gf_len = 13;
> > >
> > >       /* The default for chunk size. */
> > > -     geo->ecc_chunk_size = 512;
> > > +     geo->ecc_chunk0_size = 512;
> > > +     geo->ecc_chunkn_size = 512;
> > >
> > > -     if (geo->ecc_chunk_size < mtd->oobsize) {
> > > +     if (geo->ecc_chunkn_size < mtd->oobsize) {
> > >               geo->gf_len = 14;
> > > -             geo->ecc_chunk_size *= 2;
> > > +             geo->ecc_chunk0_size *= 2;
> > > +             geo->ecc_chunkn_size *= 2;
> > >       }
> > >
> > > -     if (mtd->oobsize > geo->ecc_chunk_size) {
> > > -             printf("Not support the NAND chips whose oob size is larger then %d bytes!\n",
> > > -                    geo->ecc_chunk_size);
> > > -             return -EINVAL;
> > > -     }
> > > -
> > > -     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
> > > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> > >
> > >       /*
> > >        * Determine the ECC layout with the formula:
> > > @@ -234,6 +220,84 @@ static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
> > >       geo->ecc_strength = min(round_down(geo->ecc_strength, 2),
> > >                               nand_info->max_ecc_strength_supported);
> > >
> > > +     block_mark_bit_offset = mtd->writesize * 8 -
> > > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
> > > +                             + MXS_NAND_METADATA_SIZE * 8);
> > > +
> > > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> > > +
> > > +     return 0;
> > > +}
> > > +
> > > +static inline int mxs_nand_calc_ecc_for_large_oob(struct bch_geometry *geo,
> > > +                                        struct mtd_info *mtd)
> > > +{
> > > +     struct nand_chip *chip = mtd_to_nand(mtd);
> > > +     struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
> > > +     unsigned int block_mark_bit_offset;
> > > +     unsigned int max_ecc;
> > > +     unsigned int bbm_chunk;
> > > +     unsigned int i;
> > > +
> > > +     /* sanity check for the minimum ecc nand required */
> > > +     if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
> > > +             return -EINVAL;
> > > +     geo->ecc_strength = chip->ecc_strength_ds;
> > > +
> > > +     /* calculate the maximum ecc platform can support*/
> > > +     geo->gf_len = 14;
> > > +     geo->ecc_chunk0_size = 1024;
> > > +     geo->ecc_chunkn_size = 1024;
> > > +     geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
> > > +     max_ecc = ((mtd->oobsize - MXS_NAND_METADATA_SIZE) * 8)
> > > +                     / (geo->gf_len * geo->ecc_chunk_count);
> > > +     max_ecc = min(round_down(max_ecc, 2),
> > > +                             nand_info->max_ecc_strength_supported);
> > > +
> > > +
> > > +     /* search a supported ecc strength that makes bbm */
> > > +     /* located in data chunk  */
> > > +     geo->ecc_strength = chip->ecc_strength_ds;
> > > +     while (!(geo->ecc_strength > max_ecc)) {
> > > +             if (mxs_nand_bbm_in_data_chunk(geo, mtd, &bbm_chunk))
> > > +                     break;
> > > +             geo->ecc_strength += 2;
> > > +     }
> > > +
> > > +     /* if none of them works, keep using the minimum ecc */
> > > +     /* nand required but changing ecc page layout  */
> > > +     if (geo->ecc_strength > max_ecc) {
> > > +             geo->ecc_strength = chip->ecc_strength_ds;
> > > +             /* add extra ecc for meta data */
> > > +             geo->ecc_chunk0_size = 0;
> > > +             geo->ecc_chunk_count = (mtd->writesize / geo->ecc_chunkn_size) + 1;
> > > +             geo->ecc_for_meta = 1;
> > > +             /* check if oob can afford this extra ecc chunk */
> > > +             if (mtd->oobsize * 8 < MXS_NAND_METADATA_SIZE * 8 +
> > > +                             geo->gf_len * geo->ecc_strength
> > > +                             * geo->ecc_chunk_count) {
> > > +                     printf("unsupported NAND chip with new layout\n");
> > > +                     return -EINVAL;
> > > +             }
> > > +
> > > +             /* calculate in which chunk bbm located */
> > > +             bbm_chunk = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8 -
> > > +                     geo->gf_len * geo->ecc_strength) /
> > > +                     (geo->gf_len * geo->ecc_strength +
> > > +                                     geo->ecc_chunkn_size * 8) + 1;
> > > +     }
> > > +
> > > +     /* calculate the number of ecc chunk behind the bbm */
> > > +     i = (mtd->writesize / geo->ecc_chunkn_size) - bbm_chunk + 1;
> > > +
> > > +     block_mark_bit_offset = mtd->writesize * 8 -
> > > +             (geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - i)
> > > +                             + MXS_NAND_METADATA_SIZE * 8);
> > > +
> > > +     geo->block_mark_byte_offset = block_mark_bit_offset / 8;
> > > +     geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
> > > +
> > >       return 0;
> > >  }
> > >
> > > @@ -983,18 +1047,23 @@ static int mxs_nand_set_geometry(struct mtd_info *mtd, struct bch_geometry *geo)
> > >       struct nand_chip *nand = mtd_to_nand(mtd);
> > >       struct mxs_nand_info *nand_info = nand_get_controller_data(nand);
> > >
> > > -     if (chip->ecc.strength > 0 && chip->ecc.size > 0)
> > > -             return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> > > -                             chip->ecc.strength, chip->ecc.size);
> > > +     if (chip->ecc_strength_ds > nand_info->max_ecc_strength_supported) {
> > > +             printf("unsupported NAND chip, minimum ecc required %d\n"
> > > +                     , chip->ecc_strength_ds);
> > > +             return -EINVAL;
> > > +     }
> > > +
> > > +     if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0) &&
> > > +                     (mtd->oobsize < 1024)) {
> > > +             dev_warn(this->dev, "use legacy bch geometry\n");
> > > +             return mxs_nand_legacy_calc_ecc_layout(geo, mtd);
> > > +     }
> > >
> > > -     if (nand_info->use_minimum_ecc ||
> > > -             mxs_nand_calc_ecc_layout(geo, mtd)) {
> > > -             if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
> > > -                     return -EINVAL;
> > > +     if (mtd->oobsize > 1024 || chip->ecc_step_ds < mtd->oobsize)
> > > +             return mxs_nand_calc_ecc_for_large_oob(geo, mtd);
> > >
> > > -             return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> > > +     return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
> > >                               chip->ecc_strength_ds, chip->ecc_step_ds);
> > > -     }
> > >
> > >       return 0;
> > >  }
> > > @@ -1025,8 +1094,6 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> > >       if (ret)
> > >               return ret;
> > >
> > > -     mxs_nand_calc_mark_offset(geo, mtd->writesize);
> > > -
> > >       /* Configure BCH and set NFC geometry */
> > >       mxs_reset_block(&bch_regs->hw_bch_ctrl_reg);
> > >
> > > @@ -1034,7 +1101,7 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> > >       tmp = (geo->ecc_chunk_count - 1) << BCH_FLASHLAYOUT0_NBLOCKS_OFFSET;
> > >       tmp |= MXS_NAND_METADATA_SIZE << BCH_FLASHLAYOUT0_META_SIZE_OFFSET;
> > >       tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT0_ECC0_OFFSET;
> > > -     tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > > +     tmp |= geo->ecc_chunk0_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > >       tmp |= (geo->gf_len == 14 ? 1 : 0) <<
> > >               BCH_FLASHLAYOUT0_GF13_0_GF14_1_OFFSET;
> > >       writel(tmp, &bch_regs->hw_bch_flash0layout0);
> > > @@ -1043,7 +1110,7 @@ int mxs_nand_setup_ecc(struct mtd_info *mtd)
> > >       tmp = (mtd->writesize + mtd->oobsize)
> > >               << BCH_FLASHLAYOUT1_PAGE_SIZE_OFFSET;
> > >       tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT1_ECCN_OFFSET;
> > > -     tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > > +     tmp |= geo->ecc_chunkn_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
> > >       tmp |= (geo->gf_len == 14 ? 1 : 0) <<
> > >               BCH_FLASHLAYOUT1_GF13_0_GF14_1_OFFSET;
> > >       writel(tmp, &bch_regs->hw_bch_flash0layout1);
> > > @@ -1268,7 +1335,7 @@ int mxs_nand_init_ctrl(struct mxs_nand_info *nand_info)
> > >
> > >       nand->ecc.layout        = &fake_ecc_layout;
> > >       nand->ecc.mode          = NAND_ECC_HW;
> > > -     nand->ecc.size          = nand_info->bch_geometry.ecc_chunk_size;
> > > +     nand->ecc.size          = nand_info->bch_geometry.ecc_chunkn_size;
> > >       nand->ecc.strength      = nand_info->bch_geometry.ecc_strength;
> > >
> > >       /* second phase scan */
> > > diff --git a/include/mxs_nand.h b/include/mxs_nand.h
> > > index ada20483d0..497da77a16 100644
> > > --- a/include/mxs_nand.h
> > > +++ b/include/mxs_nand.h
> > > @@ -16,22 +16,26 @@
> > >   * @gf_len:                   The length of Galois Field. (e.g., 13 or 14)
> > >   * @ecc_strength:             A number that describes the strength of the ECC
> > >   *                            algorithm.
> > > - * @ecc_chunk_size:           The size, in bytes, of a single ECC chunk. Note
> > > - *                            the first chunk in the page includes both data and
> > > - *                            metadata, so it's a bit larger than this value.
> > > + * @ecc_chunk0_size:          The size, in bytes, of a first ECC chunk.
> > > + * @ecc_chunkn_size:          The size, in bytes, of a single ECC chunk after
> > > + *                            the first chunk in the page.
> > >   * @ecc_chunk_count:          The number of ECC chunks in the page,
> > >   * @block_mark_byte_offset:   The byte offset in the ECC-based page view at
> > >   *                            which the underlying physical block mark appears.
> > >   * @block_mark_bit_offset:    The bit offset into the ECC-based page view at
> > >   *                            which the underlying physical block mark appears.
> > > + * @ecc_for_meta:             The flag to indicate if there is a dedicate ecc
> > > + *                            for meta.
> > >   */
> > >  struct bch_geometry {
> > >       unsigned int  gf_len;
> > >       unsigned int  ecc_strength;
> > > -     unsigned int  ecc_chunk_size;
> > > +     unsigned int  ecc_chunk0_size;
> > > +     unsigned int  ecc_chunkn_size;
> > >       unsigned int  ecc_chunk_count;
> > >       unsigned int  block_mark_byte_offset;
> > >       unsigned int  block_mark_bit_offset;
> > > +     unsigned int  ecc_for_meta; /* ECC for meta data */
> > >  };
> > >
> > >  struct mxs_nand_info {
> > >
Frieder Schrempf March 17, 2022, 8:06 a.m. UTC | #4
Hi Tim,

Am 16.03.22 um 17:34 schrieb Tim Harvey:
> On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com> wrote:
>>
>> Adding Han Xu's NXP email on Cc.
>>
>> On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
>> <frieder.schrempf@kontron.de> wrote:
>>>
>>> Hello everyone,
>>>
>>> sorry to dig out an old thread, but the below patch which was applied
>>> upstream as 616f03dabacb causes a regression for me when trying to
>>> attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
>>> AMD/Spansion parallel NAND.
>>>
>>> The failure looks like this:
>>>
>>> ubi0: attaching mtd2
>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>> from PEB 0:0, read 64 bytes
>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>> from PEB 0:2048, read 2048 bytes
>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>> from PEB 1:0, read 64 bytes
>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>> from PEB 1:2048, read 2048 bytes
>>>
>>> The NAND as reported by Linux is:
>>>
>>> nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
>>> nand: AMD/Spansion S34ML04G1
>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>
>>> A different revision of the same board with a different NAND from
>>> manufacturer ESMT doesn't show the issue:
>>>
>>> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
>>> nand: ESMT NAND 512MiB 3,3V 8-bit
>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>
>>> When I revert the mentioned commit (see patch here: [1]), the UBI boot
>>> starts working again.
>>>
>>> Does anyone know what the problem is and how to properly solve it?
>>>
>>> Thanks for any help!
>>> Frieder
>>>
>>> [1]
>>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fzerobin.net%2F%3F57a57a322bbdcf3c%23rZa3vHlWi%2BRxtRomoljtrngqWwiv6v4Js%2F2LNfdV10o%3D&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C978979eaf1fa48171bab08da076aea84%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637830453031595997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=NGl%2FrFnffV2oC90js0aHyTvoZjnrzW%2FD830lOnR4TbQ%3D&amp;reserved=0
>>>
> 
> Frieder,
> 
> I see the same issue here with IMX6Q/DL GPMI NAND.
> 
> If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi &&
> nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
> find that U-Boot can attach and mount the ubi fine but Linux will have
> issues

Interesting! This sounds like U-Boot and Linux somehow diverge in how
they handle the ECC data in OOB. I'm pretty confident that Linux does
things "correctly" and U-Boot should match what Linux does in this case.

Does the patch (revert of 616f03dabacb) I mentioned before "solve" the
issue for your case, too?

@Han, Ye, Peng: As you signed-off the mentioned commit, do you have any
ideas for a fix?

Thanks
Frieder
Frieder Schrempf March 17, 2022, 12:59 p.m. UTC | #5
Am 17.03.22 um 09:06 schrieb Frieder Schrempf:
> Hi Tim,
> 
> Am 16.03.22 um 17:34 schrieb Tim Harvey:
>> On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com> wrote:
>>>
>>> Adding Han Xu's NXP email on Cc.
>>>
>>> On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
>>> <frieder.schrempf@kontron.de> wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> sorry to dig out an old thread, but the below patch which was applied
>>>> upstream as 616f03dabacb causes a regression for me when trying to
>>>> attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
>>>> AMD/Spansion parallel NAND.
>>>>
>>>> The failure looks like this:
>>>>
>>>> ubi0: attaching mtd2
>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>>> from PEB 0:0, read 64 bytes
>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>>> from PEB 0:2048, read 2048 bytes
>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>>> from PEB 1:0, read 64 bytes
>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>>> from PEB 1:2048, read 2048 bytes
>>>>
>>>> The NAND as reported by Linux is:
>>>>
>>>> nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
>>>> nand: AMD/Spansion S34ML04G1
>>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>>
>>>> A different revision of the same board with a different NAND from
>>>> manufacturer ESMT doesn't show the issue:
>>>>
>>>> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
>>>> nand: ESMT NAND 512MiB 3,3V 8-bit
>>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>>
>>>> When I revert the mentioned commit (see patch here: [1]), the UBI boot
>>>> starts working again.
>>>>
>>>> Does anyone know what the problem is and how to properly solve it?
>>>>
>>>> Thanks for any help!
>>>> Frieder
>>>>
>>>> [1]
>>>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fzerobin.net%2F%3F57a57a322bbdcf3c%23rZa3vHlWi%2BRxtRomoljtrngqWwiv6v4Js%2F2LNfdV10o%3D&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C978979eaf1fa48171bab08da076aea84%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637830453031595997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=NGl%2FrFnffV2oC90js0aHyTvoZjnrzW%2FD830lOnR4TbQ%3D&amp;reserved=0
>>>>
>>
>> Frieder,
>>
>> I see the same issue here with IMX6Q/DL GPMI NAND.
>>
>> If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi &&
>> nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
>> find that U-Boot can attach and mount the ubi fine but Linux will have
>> issues
> 
> Interesting! This sounds like U-Boot and Linux somehow diverge in how
> they handle the ECC data in OOB. I'm pretty confident that Linux does
> things "correctly" and U-Boot should match what Linux does in this case.
> 
> Does the patch (revert of 616f03dabacb) I mentioned before "solve" the
> issue for your case, too?
> 
> @Han, Ye, Peng: As you signed-off the mentioned commit, do you have any
> ideas for a fix?

So the proper fix seems to be to revert to the "legacy" BCH layout that
is used by Linux. Sean already tried to get this fixed almost a year ago
[1] but Han turned the change down in favor of changing the layout on
the kernel side. But this series [2] never made it upstream and it
doesn't look like it will anytime soon.

I will try to resurrect Sean's fix.

[1]
https://patchwork.ozlabs.org/project/uboot/patch/20210510100043.449294-1-sean@geanix.com/
[2]
https://patchwork.ozlabs.org/project/linux-mtd/patch/20210522205136.19465-1-han.xu@nxp.com/
Sean Nyekjaer March 17, 2022, 1:14 p.m. UTC | #6
Hi Frieder,

On Thu, Mar 17, 2022 at 01:59:07PM +0100, Frieder Schrempf wrote:
> Am 17.03.22 um 09:06 schrieb Frieder Schrempf:
> > Hi Tim,
> > 
> > Am 16.03.22 um 17:34 schrieb Tim Harvey:
> >> On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com> wrote:
> >>>
> >>> Adding Han Xu's NXP email on Cc.
> >>>
> >>> On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
> >>> <frieder.schrempf@kontron.de> wrote:
> >>>>
> >>>> Hello everyone,
> >>>>
> >>>> sorry to dig out an old thread, but the below patch which was applied
> >>>> upstream as 616f03dabacb causes a regression for me when trying to
> >>>> attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
> >>>> AMD/Spansion parallel NAND.
> >>>>
> >>>> The failure looks like this:
> >>>>
> >>>> ubi0: attaching mtd2
> >>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> >>>> from PEB 0:0, read 64 bytes
> >>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> >>>> from PEB 0:2048, read 2048 bytes
> >>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
> >>>> from PEB 1:0, read 64 bytes
> >>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
> >>>> from PEB 1:2048, read 2048 bytes
> >>>>
> >>>> The NAND as reported by Linux is:
> >>>>
> >>>> nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
> >>>> nand: AMD/Spansion S34ML04G1
> >>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >>>>
> >>>> A different revision of the same board with a different NAND from
> >>>> manufacturer ESMT doesn't show the issue:
> >>>>
> >>>> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
> >>>> nand: ESMT NAND 512MiB 3,3V 8-bit
> >>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >>>>
> >>>> When I revert the mentioned commit (see patch here: [1]), the UBI boot
> >>>> starts working again.
> >>>>
> >>>> Does anyone know what the problem is and how to properly solve it?
> >>>>
> >>>> Thanks for any help!
> >>>> Frieder
> >>>>
> >>>> [1]
> >>>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fzerobin.net%2F%3F57a57a322bbdcf3c%23rZa3vHlWi%2BRxtRomoljtrngqWwiv6v4Js%2F2LNfdV10o%3D&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C978979eaf1fa48171bab08da076aea84%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637830453031595997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=NGl%2FrFnffV2oC90js0aHyTvoZjnrzW%2FD830lOnR4TbQ%3D&amp;reserved=0
> >>>>
> >>
> >> Frieder,
> >>
> >> I see the same issue here with IMX6Q/DL GPMI NAND.
> >>
> >> If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi &&
> >> nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
> >> find that U-Boot can attach and mount the ubi fine but Linux will have
> >> issues
> > 
> > Interesting! This sounds like U-Boot and Linux somehow diverge in how
> > they handle the ECC data in OOB. I'm pretty confident that Linux does
> > things "correctly" and U-Boot should match what Linux does in this case.
> > 
> > Does the patch (revert of 616f03dabacb) I mentioned before "solve" the
> > issue for your case, too?
> > 
> > @Han, Ye, Peng: As you signed-off the mentioned commit, do you have any
> > ideas for a fix?
> 
> So the proper fix seems to be to revert to the "legacy" BCH layout that
> is used by Linux. Sean already tried to get this fixed almost a year ago
> [1] but Han turned the change down in favor of changing the layout on
> the kernel side. But this series [2] never made it upstream and it
> doesn't look like it will anytime soon.
> 
> I will try to resurrect Sean's fix.
> 
> [1]
> https://patchwork.ozlabs.org/project/uboot/patch/20210510100043.449294-1-sean@geanix.com/
> [2]
> https://patchwork.ozlabs.org/project/linux-mtd/patch/20210522205136.19465-1-han.xu@nxp.com/
> 

Thanks for bringing this up again. I have only had time to send Han a
ping a few times.
I would still prefer that we added the new(correct) metod as a option.

/Sean
Frieder Schrempf March 17, 2022, 1:19 p.m. UTC | #7
Hi Sean,

Am 17.03.22 um 14:14 schrieb Sean Nyekjaer:
> [Sie erhalten nicht oft E-Mail von "sean@geanix.com". Weitere Informationen, warum dies wichtig ist, finden Sie unter "http://aka.ms/LearnAboutSenderIdentification".]
> 
> Hi Frieder,
> 
> On Thu, Mar 17, 2022 at 01:59:07PM +0100, Frieder Schrempf wrote:
>> Am 17.03.22 um 09:06 schrieb Frieder Schrempf:
>>> Hi Tim,
>>>
>>> Am 16.03.22 um 17:34 schrieb Tim Harvey:
>>>> On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com> wrote:
>>>>>
>>>>> Adding Han Xu's NXP email on Cc.
>>>>>
>>>>> On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
>>>>> <frieder.schrempf@kontron.de> wrote:
>>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> sorry to dig out an old thread, but the below patch which was applied
>>>>>> upstream as 616f03dabacb causes a regression for me when trying to
>>>>>> attach an UBI volume with U-Boot 2022.01 on a board with i.MX6 Solo and
>>>>>> AMD/Spansion parallel NAND.
>>>>>>
>>>>>> The failure looks like this:
>>>>>>
>>>>>> ubi0: attaching mtd2
>>>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>>>>> from PEB 0:0, read 64 bytes
>>>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>>>>> from PEB 0:2048, read 2048 bytes
>>>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes
>>>>>> from PEB 1:0, read 64 bytes
>>>>>> ubi0 error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes
>>>>>> from PEB 1:2048, read 2048 bytes
>>>>>>
>>>>>> The NAND as reported by Linux is:
>>>>>>
>>>>>> nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
>>>>>> nand: AMD/Spansion S34ML04G1
>>>>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>>>>
>>>>>> A different revision of the same board with a different NAND from
>>>>>> manufacturer ESMT doesn't show the issue:
>>>>>>
>>>>>> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
>>>>>> nand: ESMT NAND 512MiB 3,3V 8-bit
>>>>>> nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>>>>
>>>>>> When I revert the mentioned commit (see patch here: [1]), the UBI boot
>>>>>> starts working again.
>>>>>>
>>>>>> Does anyone know what the problem is and how to properly solve it?
>>>>>>
>>>>>> Thanks for any help!
>>>>>> Frieder
>>>>>>
>>>>>> [1]
>>>>>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fzerobin.net%2F%3F57a57a322bbdcf3c%23rZa3vHlWi%2BRxtRomoljtrngqWwiv6v4Js%2F2LNfdV10o%3D&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C7a58965995b34fe86ef808da08180b2a%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637831196603019724%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=eexagX96EdnCNJiOvxTL%2FXVVWtQYsF5yt1zeMy2SghQ%3D&amp;reserved=0
>>>>>>
>>>>
>>>> Frieder,
>>>>
>>>> I see the same issue here with IMX6Q/DL GPMI NAND.
>>>>
>>>> If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi &&
>>>> nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
>>>> find that U-Boot can attach and mount the ubi fine but Linux will have
>>>> issues
>>>
>>> Interesting! This sounds like U-Boot and Linux somehow diverge in how
>>> they handle the ECC data in OOB. I'm pretty confident that Linux does
>>> things "correctly" and U-Boot should match what Linux does in this case.
>>>
>>> Does the patch (revert of 616f03dabacb) I mentioned before "solve" the
>>> issue for your case, too?
>>>
>>> @Han, Ye, Peng: As you signed-off the mentioned commit, do you have any
>>> ideas for a fix?
>>
>> So the proper fix seems to be to revert to the "legacy" BCH layout that
>> is used by Linux. Sean already tried to get this fixed almost a year ago
>> [1] but Han turned the change down in favor of changing the layout on
>> the kernel side. But this series [2] never made it upstream and it
>> doesn't look like it will anytime soon.
>>
>> I will try to resurrect Sean's fix.
>>
>> [1]
>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.ozlabs.org%2Fproject%2Fuboot%2Fpatch%2F20210510100043.449294-1-sean%40geanix.com%2F&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C7a58965995b34fe86ef808da08180b2a%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637831196603019724%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=ZzCm2Jl9BPd8hP%2B13judqwblR4tzmyZnIHUR922KLdQ%3D&amp;reserved=0
>> [2]
>> https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.ozlabs.org%2Fproject%2Flinux-mtd%2Fpatch%2F20210522205136.19465-1-han.xu%40nxp.com%2F&amp;data=04%7C01%7Cfrieder.schrempf%40kontron.de%7C7a58965995b34fe86ef808da08180b2a%7C8c9d3c973fd941c8a2b1646f3942daf1%7C0%7C0%7C637831196603019724%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=iiBenZkPMWw%2Bg%2FLQjnBSgqZABScRQXt6%2F08typx7buY%3D&amp;reserved=0
>>
> 
> Thanks for bringing this up again. I have only had time to send Han a
> ping a few times.
> I would still prefer that we added the new(correct) metod as a option.

The top priority should be to fix the regression. And reading the
discussion with the Linux MTD maintainer Miquel in the thread for Han's
kernel patches linked above makes me doubt that the new method is really
a good option. Even if it would land in the kernel, it would break
systems that use older versions of U-Boot which is just not acceptable
in my opinion.

Frieder
Ye Li March 18, 2022, 8:47 a.m. UTC | #8
Hi Frieder,

On Thu, 2022-03-17 at 09:06 +0100, Frieder Schrempf wrote:
> Caution: EXT Email
> 
> Hi Tim,
> 
> Am 16.03.22 um 17:34 schrieb Tim Harvey:
> > 
> > On Wed, Mar 16, 2022 at 7:09 AM Fabio Estevam <festevam@gmail.com>
> > wrote:
> > > 
> > > 
> > > Adding Han Xu's NXP email on Cc.
> > > 
> > > On Mon, Mar 14, 2022 at 10:31 AM Frieder Schrempf
> > > <frieder.schrempf@kontron.de> wrote:
> > > > 
> > > > 
> > > > Hello everyone,
> > > > 
> > > > sorry to dig out an old thread, but the below patch which was
> > > > applied
> > > > upstream as 616f03dabacb causes a regression for me when trying
> > > > to
> > > > attach an UBI volume with U-Boot 2022.01 on a board with i.MX6
> > > > Solo and
> > > > AMD/Spansion parallel NAND.
> > > > 
> > > > The failure looks like this:
> > > > 
> > > > ubi0: attaching mtd2
> > > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64
> > > > bytes
> > > > from PEB 0:0, read 64 bytes
> > > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading
> > > > 2048 bytes
> > > > from PEB 0:2048, read 2048 bytes
> > > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64
> > > > bytes
> > > > from PEB 1:0, read 64 bytes
> > > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading
> > > > 2048 bytes
> > > > from PEB 1:2048, read 2048 bytes
> > > > 
> > > > The NAND as reported by Linux is:
> > > > 
> > > > nand: device found, Manufacturer ID: 0x01, Chip ID: 0xdc
> > > > nand: AMD/Spansion S34ML04G1
> > > > nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB
> > > > size: 64
> > > > 
> > > > A different revision of the same board with a different NAND
> > > > from
> > > > manufacturer ESMT doesn't show the issue:
> > > > 
> > > > nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xdc
> > > > nand: ESMT NAND 512MiB 3,3V 8-bit
> > > > nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB
> > > > size: 64
> > > > 
> > > > When I revert the mentioned commit (see patch here: [1]), the
> > > > UBI boot
> > > > starts working again.
> > > > 
> > > > Does anyone know what the problem is and how to properly solve
> > > > it?
> > > > 
> > > > Thanks for any help!
> > > > Frieder
> > > > 
> > > > [1]
> > > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F
> > > > %2Fzerobin.net%2F%3F57a57a322bbdcf3c%23rZa3vHlWi%2BRxtRomoljtrn
> > > > gqWwiv6v4Js%2F2LNfdV10o%3D&amp;data=04%7C01%7Cye.li%40nxp.com%7
> > > > Cee3b85ceee3d4ece78e108da07ed1e2c%7C686ea1d3bc2b4c6fa92cd99c5c3
> > > > 01635%7C0%7C0%7C637831012279864791%7CUnknown%7CTWFpbGZsb3d8eyJW
> > > > IjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%
> > > > 7C3000&amp;sdata=O6MzwCv9ClB8ZW7i3%2BV1RmOaviGqQRNa8L0uxE4M%2F%
> > > > 2BI%3D&amp;reserved=0
> > > > 
> > Frieder,
> > 
> > I see the same issue here with IMX6Q/DL GPMI NAND.
> > 
> > If I re-flash the ubi within U-Boot (tftpboot $loadaddr rootfs.ubi
> > &&
> > nand erase.part rootfs && nand write $loadaddr rootfs $filesize) I
> > find that U-Boot can attach and mount the ubi fine but Linux will
> > have
> > issues
> Interesting! This sounds like U-Boot and Linux somehow diverge in how
> they handle the ECC data in OOB. I'm pretty confident that Linux does
> things "correctly" and U-Boot should match what Linux does in this
> case.
> 
> Does the patch (revert of 616f03dabacb) I mentioned before "solve"
> the
> issue for your case, too?
> 
> @Han, Ye, Peng: As you signed-off the mentioned commit, do you have
> any
> ideas for a fix?

The dt nand driver will check "fsl,legacy-bch-geometry" property to use
legacy bch. If this can't work for you in case you don't use DM driver,
I prefer adding a config to select the legacy bch not reverting the
patch. 

Best regards,
Ye Li
> 
> Thanks
> Frieder
diff mbox series

Patch

diff --git a/drivers/mtd/nand/raw/mxs_nand.c b/drivers/mtd/nand/raw/mxs_nand.c
index fe8097c146..8da59e39c0 100644
--- a/drivers/mtd/nand/raw/mxs_nand.c
+++ b/drivers/mtd/nand/raw/mxs_nand.c
@@ -112,53 +112,32 @@  static uint32_t mxs_nand_aux_status_offset(void)
 	return (MXS_NAND_METADATA_SIZE + 0x3) & ~0x3;
 }
 
-static inline int mxs_nand_calc_mark_offset(struct bch_geometry *geo,
-					    uint32_t page_data_size)
+static inline bool mxs_nand_bbm_in_data_chunk(struct bch_geometry *geo, struct mtd_info *mtd,
+		unsigned int *chunk_num)
 {
-	uint32_t chunk_data_size_in_bits = geo->ecc_chunk_size * 8;
-	uint32_t chunk_ecc_size_in_bits = geo->ecc_strength * geo->gf_len;
-	uint32_t chunk_total_size_in_bits;
-	uint32_t block_mark_chunk_number;
-	uint32_t block_mark_chunk_bit_offset;
-	uint32_t block_mark_bit_offset;
+	unsigned int i, j;
 
-	chunk_total_size_in_bits =
-			chunk_data_size_in_bits + chunk_ecc_size_in_bits;
-
-	/* Compute the bit offset of the block mark within the physical page. */
-	block_mark_bit_offset = page_data_size * 8;
-
-	/* Subtract the metadata bits. */
-	block_mark_bit_offset -= MXS_NAND_METADATA_SIZE * 8;
-
-	/*
-	 * Compute the chunk number (starting at zero) in which the block mark
-	 * appears.
-	 */
-	block_mark_chunk_number =
-			block_mark_bit_offset / chunk_total_size_in_bits;
-
-	/*
-	 * Compute the bit offset of the block mark within its chunk, and
-	 * validate it.
-	 */
-	block_mark_chunk_bit_offset = block_mark_bit_offset -
-			(block_mark_chunk_number * chunk_total_size_in_bits);
+	if (geo->ecc_chunk0_size != geo->ecc_chunkn_size) {
+		dev_err(this->dev, "The size of chunk0 must equal to chunkn\n");
+		return false;
+	}
 
-	if (block_mark_chunk_bit_offset > chunk_data_size_in_bits)
-		return -EINVAL;
+	i = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) /
+		(geo->gf_len * geo->ecc_strength +
+				geo->ecc_chunkn_size * 8);
 
-	/*
-	 * Now that we know the chunk number in which the block mark appears,
-	 * we can subtract all the ECC bits that appear before it.
-	 */
-	block_mark_bit_offset -=
-		block_mark_chunk_number * chunk_ecc_size_in_bits;
+	j = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8) -
+		(geo->gf_len * geo->ecc_strength +
+				geo->ecc_chunkn_size * 8) * i;
 
-	geo->block_mark_byte_offset = block_mark_bit_offset >> 3;
-	geo->block_mark_bit_offset = block_mark_bit_offset & 0x7;
+	if (j < geo->ecc_chunkn_size * 8) {
+		*chunk_num = i + 1;
+		dev_dbg(this->dev, "Set ecc to %d and bbm in chunk %d\n",
+			geo->ecc_strength, *chunk_num);
+		return true;
+	}
 
-	return 0;
+	return false;
 }
 
 static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
@@ -168,6 +147,7 @@  static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
+	unsigned int block_mark_bit_offset;
 
 	switch (ecc_step) {
 	case SZ_512:
@@ -180,45 +160,51 @@  static inline int mxs_nand_calc_ecc_layout_by_info(struct bch_geometry *geo,
 		return -EINVAL;
 	}
 
-	geo->ecc_chunk_size = ecc_step;
+	geo->ecc_chunk0_size = ecc_step;
+	geo->ecc_chunkn_size = ecc_step;
 	geo->ecc_strength = round_up(ecc_strength, 2);
 
 	/* Keep the C >= O */
-	if (geo->ecc_chunk_size < mtd->oobsize)
+	if (geo->ecc_chunkn_size < mtd->oobsize)
 		return -EINVAL;
 
 	if (geo->ecc_strength > nand_info->max_ecc_strength_supported)
 		return -EINVAL;
 
-	geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
+	geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
+
+	/* For bit swap. */
+	block_mark_bit_offset = mtd->writesize * 8 -
+		(geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
+				+ MXS_NAND_METADATA_SIZE * 8);
+
+	geo->block_mark_byte_offset = block_mark_bit_offset / 8;
+	geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
 
 	return 0;
 }
 
-static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
+static inline int mxs_nand_legacy_calc_ecc_layout(struct bch_geometry *geo,
 					   struct mtd_info *mtd)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
+	unsigned int block_mark_bit_offset;
 
 	/* The default for the length of Galois Field. */
 	geo->gf_len = 13;
 
 	/* The default for chunk size. */
-	geo->ecc_chunk_size = 512;
+	geo->ecc_chunk0_size = 512;
+	geo->ecc_chunkn_size = 512;
 
-	if (geo->ecc_chunk_size < mtd->oobsize) {
+	if (geo->ecc_chunkn_size < mtd->oobsize) {
 		geo->gf_len = 14;
-		geo->ecc_chunk_size *= 2;
+		geo->ecc_chunk0_size *= 2;
+		geo->ecc_chunkn_size *= 2;
 	}
 
-	if (mtd->oobsize > geo->ecc_chunk_size) {
-		printf("Not support the NAND chips whose oob size is larger then %d bytes!\n",
-		       geo->ecc_chunk_size);
-		return -EINVAL;
-	}
-
-	geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunk_size;
+	geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
 
 	/*
 	 * Determine the ECC layout with the formula:
@@ -234,6 +220,84 @@  static inline int mxs_nand_calc_ecc_layout(struct bch_geometry *geo,
 	geo->ecc_strength = min(round_down(geo->ecc_strength, 2),
 				nand_info->max_ecc_strength_supported);
 
+	block_mark_bit_offset = mtd->writesize * 8 -
+		(geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - 1)
+				+ MXS_NAND_METADATA_SIZE * 8);
+
+	geo->block_mark_byte_offset = block_mark_bit_offset / 8;
+	geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
+
+	return 0;
+}
+
+static inline int mxs_nand_calc_ecc_for_large_oob(struct bch_geometry *geo,
+					   struct mtd_info *mtd)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	struct mxs_nand_info *nand_info = nand_get_controller_data(chip);
+	unsigned int block_mark_bit_offset;
+	unsigned int max_ecc;
+	unsigned int bbm_chunk;
+	unsigned int i;
+
+	/* sanity check for the minimum ecc nand required */
+	if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
+		return -EINVAL;
+	geo->ecc_strength = chip->ecc_strength_ds;
+
+	/* calculate the maximum ecc platform can support*/
+	geo->gf_len = 14;
+	geo->ecc_chunk0_size = 1024;
+	geo->ecc_chunkn_size = 1024;
+	geo->ecc_chunk_count = mtd->writesize / geo->ecc_chunkn_size;
+	max_ecc = ((mtd->oobsize - MXS_NAND_METADATA_SIZE) * 8)
+			/ (geo->gf_len * geo->ecc_chunk_count);
+	max_ecc = min(round_down(max_ecc, 2),
+				nand_info->max_ecc_strength_supported);
+
+
+	/* search a supported ecc strength that makes bbm */
+	/* located in data chunk  */
+	geo->ecc_strength = chip->ecc_strength_ds;
+	while (!(geo->ecc_strength > max_ecc)) {
+		if (mxs_nand_bbm_in_data_chunk(geo, mtd, &bbm_chunk))
+			break;
+		geo->ecc_strength += 2;
+	}
+
+	/* if none of them works, keep using the minimum ecc */
+	/* nand required but changing ecc page layout  */
+	if (geo->ecc_strength > max_ecc) {
+		geo->ecc_strength = chip->ecc_strength_ds;
+		/* add extra ecc for meta data */
+		geo->ecc_chunk0_size = 0;
+		geo->ecc_chunk_count = (mtd->writesize / geo->ecc_chunkn_size) + 1;
+		geo->ecc_for_meta = 1;
+		/* check if oob can afford this extra ecc chunk */
+		if (mtd->oobsize * 8 < MXS_NAND_METADATA_SIZE * 8 +
+				geo->gf_len * geo->ecc_strength
+				* geo->ecc_chunk_count) {
+			printf("unsupported NAND chip with new layout\n");
+			return -EINVAL;
+		}
+
+		/* calculate in which chunk bbm located */
+		bbm_chunk = (mtd->writesize * 8 - MXS_NAND_METADATA_SIZE * 8 -
+			geo->gf_len * geo->ecc_strength) /
+			(geo->gf_len * geo->ecc_strength +
+					geo->ecc_chunkn_size * 8) + 1;
+	}
+
+	/* calculate the number of ecc chunk behind the bbm */
+	i = (mtd->writesize / geo->ecc_chunkn_size) - bbm_chunk + 1;
+
+	block_mark_bit_offset = mtd->writesize * 8 -
+		(geo->ecc_strength * geo->gf_len * (geo->ecc_chunk_count - i)
+				+ MXS_NAND_METADATA_SIZE * 8);
+
+	geo->block_mark_byte_offset = block_mark_bit_offset / 8;
+	geo->block_mark_bit_offset  = block_mark_bit_offset % 8;
+
 	return 0;
 }
 
@@ -983,18 +1047,23 @@  static int mxs_nand_set_geometry(struct mtd_info *mtd, struct bch_geometry *geo)
 	struct nand_chip *nand = mtd_to_nand(mtd);
 	struct mxs_nand_info *nand_info = nand_get_controller_data(nand);
 
-	if (chip->ecc.strength > 0 && chip->ecc.size > 0)
-		return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
-				chip->ecc.strength, chip->ecc.size);
+	if (chip->ecc_strength_ds > nand_info->max_ecc_strength_supported) {
+		printf("unsupported NAND chip, minimum ecc required %d\n"
+			, chip->ecc_strength_ds);
+		return -EINVAL;
+	}
+
+	if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0) &&
+			(mtd->oobsize < 1024)) {
+		dev_warn(this->dev, "use legacy bch geometry\n");
+		return mxs_nand_legacy_calc_ecc_layout(geo, mtd);
+	}
 
-	if (nand_info->use_minimum_ecc ||
-		mxs_nand_calc_ecc_layout(geo, mtd)) {
-		if (!(chip->ecc_strength_ds > 0 && chip->ecc_step_ds > 0))
-			return -EINVAL;
+	if (mtd->oobsize > 1024 || chip->ecc_step_ds < mtd->oobsize)
+		return mxs_nand_calc_ecc_for_large_oob(geo, mtd);
 
-		return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
+	return mxs_nand_calc_ecc_layout_by_info(geo, mtd,
 				chip->ecc_strength_ds, chip->ecc_step_ds);
-	}
 
 	return 0;
 }
@@ -1025,8 +1094,6 @@  int mxs_nand_setup_ecc(struct mtd_info *mtd)
 	if (ret)
 		return ret;
 
-	mxs_nand_calc_mark_offset(geo, mtd->writesize);
-
 	/* Configure BCH and set NFC geometry */
 	mxs_reset_block(&bch_regs->hw_bch_ctrl_reg);
 
@@ -1034,7 +1101,7 @@  int mxs_nand_setup_ecc(struct mtd_info *mtd)
 	tmp = (geo->ecc_chunk_count - 1) << BCH_FLASHLAYOUT0_NBLOCKS_OFFSET;
 	tmp |= MXS_NAND_METADATA_SIZE << BCH_FLASHLAYOUT0_META_SIZE_OFFSET;
 	tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT0_ECC0_OFFSET;
-	tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
+	tmp |= geo->ecc_chunk0_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
 	tmp |= (geo->gf_len == 14 ? 1 : 0) <<
 		BCH_FLASHLAYOUT0_GF13_0_GF14_1_OFFSET;
 	writel(tmp, &bch_regs->hw_bch_flash0layout0);
@@ -1043,7 +1110,7 @@  int mxs_nand_setup_ecc(struct mtd_info *mtd)
 	tmp = (mtd->writesize + mtd->oobsize)
 		<< BCH_FLASHLAYOUT1_PAGE_SIZE_OFFSET;
 	tmp |= (geo->ecc_strength >> 1) << BCH_FLASHLAYOUT1_ECCN_OFFSET;
-	tmp |= geo->ecc_chunk_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
+	tmp |= geo->ecc_chunkn_size >> MXS_NAND_CHUNK_DATA_CHUNK_SIZE_SHIFT;
 	tmp |= (geo->gf_len == 14 ? 1 : 0) <<
 		BCH_FLASHLAYOUT1_GF13_0_GF14_1_OFFSET;
 	writel(tmp, &bch_regs->hw_bch_flash0layout1);
@@ -1268,7 +1335,7 @@  int mxs_nand_init_ctrl(struct mxs_nand_info *nand_info)
 
 	nand->ecc.layout	= &fake_ecc_layout;
 	nand->ecc.mode		= NAND_ECC_HW;
-	nand->ecc.size		= nand_info->bch_geometry.ecc_chunk_size;
+	nand->ecc.size		= nand_info->bch_geometry.ecc_chunkn_size;
 	nand->ecc.strength	= nand_info->bch_geometry.ecc_strength;
 
 	/* second phase scan */
diff --git a/include/mxs_nand.h b/include/mxs_nand.h
index ada20483d0..497da77a16 100644
--- a/include/mxs_nand.h
+++ b/include/mxs_nand.h
@@ -16,22 +16,26 @@ 
  * @gf_len:                   The length of Galois Field. (e.g., 13 or 14)
  * @ecc_strength:             A number that describes the strength of the ECC
  *                            algorithm.
- * @ecc_chunk_size:           The size, in bytes, of a single ECC chunk. Note
- *                            the first chunk in the page includes both data and
- *                            metadata, so it's a bit larger than this value.
+ * @ecc_chunk0_size:          The size, in bytes, of a first ECC chunk.
+ * @ecc_chunkn_size:          The size, in bytes, of a single ECC chunk after
+ *                            the first chunk in the page.
  * @ecc_chunk_count:          The number of ECC chunks in the page,
  * @block_mark_byte_offset:   The byte offset in the ECC-based page view at
  *                            which the underlying physical block mark appears.
  * @block_mark_bit_offset:    The bit offset into the ECC-based page view at
  *                            which the underlying physical block mark appears.
+ * @ecc_for_meta:             The flag to indicate if there is a dedicate ecc
+ *                            for meta.
  */
 struct bch_geometry {
 	unsigned int  gf_len;
 	unsigned int  ecc_strength;
-	unsigned int  ecc_chunk_size;
+	unsigned int  ecc_chunk0_size;
+	unsigned int  ecc_chunkn_size;
 	unsigned int  ecc_chunk_count;
 	unsigned int  block_mark_byte_offset;
 	unsigned int  block_mark_bit_offset;
+	unsigned int  ecc_for_meta; /* ECC for meta data */
 };
 
 struct mxs_nand_info {