LD Fix for STMicroelectronics hardware erratum STM32L4XX 629360.

From: Christophe Monat <christophe.monat@st.com>

Hi list,

This is a proposal for an ld patch that implements a work-around for
an STMicroelectronics hardware erratum.

* Description of the problem:
The STM32L4x6xx family of microcontrollers suffers from a memory
controller ('FMC') limitation, described in the publicly available
document:
<www.st.com/st-web-ui/static/active/en/resource/technical/document/errata_sheet/DM00111498.pdf>
See "Read burst access of 9 words or more is not supported by
FMC" for the full description.
In a few words, the multiple loads that are issued from the code
(ldm, vldm) through the FMC may receive corrupt data when 9 words
or more are involved. Accesses should be limited to chunks of at
most 8 words.
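
As an illustration of the condition above, here is a minimal C sketch
that counts the words loaded by a Thumb-2 ldmia.w or ldmdb and flags
the instruction when more than 8 are involved. It is not taken from
the patch: the function names are hypothetical, and the masks assume
the standard 32-bit Thumb-2 LDM encodings with the first halfword in
the upper 16 bits. Whether the access really goes through the FMC
cannot be known from the encoding alone, so such a check is
conservative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Thumb-2 LDMIA.W: 1110 1000 10W1 Rn | P M 0 <13-bit register list>.
   The W (write-back) bit is ignored by the mask.  */
static bool
is_ldmia_w (uint32_t insn)
{
  return (insn & 0xffd02000u) == 0xe8900000u;
}

/* Thumb-2 LDMDB: 1110 1001 00W1 Rn | P M 0 <13-bit register list>.  */
static bool
is_ldmdb_w (uint32_t insn)
{
  return (insn & 0xffd02000u) == 0xe9100000u;
}

/* Count the bits set in X.  */
static unsigned
count_bits (uint32_t x)
{
  unsigned n = 0;

  while (x != 0)
    {
      x &= x - 1;   /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Return true if INSN is an LDMIA.W/LDMDB loading more than 8 words
   (hypothetical helper, for illustration only).  */
static bool
ldm_loads_too_many_words (uint32_t insn)
{
  if (!is_ldmia_w (insn) && !is_ldmdb_w (insn))
    return false;

  /* One word is loaded per register named in the low halfword.  */
  return count_bits (insn & 0xffffu) > 8;
}
```

For instance, "ldmia.w r0, {r1-r9}" (0xe89003fe) names 9 registers and
is flagged, while "ldmia.w r0, {r1-r8}" (0xe89001fe) is not.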

* Impact of the problem:
The STM32L476 product is available to anybody worldwide either
directly through ST or through the distributors.  As there is no
plan for a revision of that silicon and it will live for at least
10 years, in the end tens of millions of parts or even more are
potentially affected.

* Possible solutions to the problem:
The compilation tools can be changed to limit their code
generation to appropriate multiple load sizes, but this solution
does not account for existing libraries and third-party software
libraries that cannot be recompiled.

An alternative solution is to add a specific treatment at
the linker level that identifies the faulty instruction
and diverts it to a correct sequence. It can thus deal with any
pre-existing .o or .a contribution that may expose the issue
and produce a fixed final executable.

* Purpose of the patch:
The work-around is disabled by default, in which case it has no
observable impact. When it is enabled, it checks if the expected
architecture is used and warns otherwise. It scans the code sections
for faulty ldm and vldm instructions and, when one is detected, it
creates a veneer that contains a fixed sequence and a branch back to
the following instruction, and replaces the original instruction with
a branch to that veneer.

The veneer contents are emitted based on the original
instruction details, depending on the instruction family (9
cases for ldm, 3 for vldm) and various other details (PC being
or not being part of the set of restored registers, write-back
of the base register, etc.).
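
To give an idea of the kind of transformation involved, the sketch
below splits a 16-bit ldm register list into two lists of at most 8
registers each, keeping the ascending register order. This is only an
illustration under simplifying assumptions (split_reglist is a
hypothetical helper, not part of the patch); the sequences really
emitted also have to deal with write-back, with the base register
and with the PC being in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split LIST into *FIRST (its 8 lowest-numbered registers at most)
   and *SECOND (whatever remains).  Since a list names at most 16
   registers, each half names at most 8.  */
static void
split_reglist (uint16_t list, uint16_t *first, uint16_t *second)
{
  uint16_t low = 0;
  unsigned taken = 0;

  assert (first != NULL && second != NULL);

  for (unsigned reg = 0; reg < 16 && taken < 8; reg++)
    if (list & (1u << reg))
      {
        low |= (uint16_t) (1u << reg);
        taken++;
      }

  *first = low;
  *second = (uint16_t) (list & ~low);
}
```

With the list {r1-r9} (0x03fe), the first half is {r1-r8} (0x01fe)
and the second half is {r9} (0x0200).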

* Implementation:
The implementation is strongly inspired by the so-called VFP11
work-around, and works in a similar way with regard to veneer
management. The only differences are the detection of the specific
triggering conditions and the emitted code, which is supported by
quite a few instruction encoding helpers.

One deviation from the VFP11 work-around is that the veneers are not
created in a dedicated section (as was done with .vfp11_veneer),
but in a .text.stm32l4xx_veneer section that is handled naturally by
the linker scripts.

* Testing:
- Unit tests have been written to cover the veneer emission
    code for all ldm and vldm sub-cases, the IT block detection
    logic, the detection of specific IT block issues, and the
    detection of branch overflows when creating veneers. They are
    part of the proposed change.
- No regression on the ld test suite under the arm-none-eabi and
    arm-linux-gnueabi configurations.
- Our standard QA has been run with a 4_9-2015q2-20150609
    Linaro embedded compiler patched with these linker changes,
    in two variants: default linker settings and fix enabled,
    with the linker warnings turned into hard errors. Our QA
    contains various commercial test suites, industrial
    benchmarks and proprietary ST applications. We have not
    observed behavioral changes with and without the fix, nor
    detected the occurrence of unhandled sequences (IT block
    limitations), nor observed branch limitations.
- Hardware testing has been done on an STM32L476 "Discovery
    kit".
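
The branch overflow case covered by the unit tests comes from the
limited range of the Thumb-2 b.w instruction. A rough sketch of such
a check follows, assuming the T4 encoding of b.w (signed, even 25-bit
offset relative to the branch address plus 4). The function name is
hypothetical and the code is not the one used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if a Thumb-2 B.W located at BRANCH_ADDR can reach
   TARGET_ADDR, i.e. if the offset fits in [-2^24, 2^24 - 2].  */
static bool
thumb2_branch_in_range (uint32_t branch_addr, uint32_t target_addr)
{
  int64_t offset;

  /* Thumb instructions are halfword aligned.  */
  assert ((branch_addr & 1u) == 0 && (target_addr & 1u) == 0);

  /* In Thumb state the PC reads as the instruction address + 4.  */
  offset = (int64_t) target_addr - ((int64_t) branch_addr + 4);
  return offset >= -16777216 && offset <= 16777214;
}
```

A veneer placed further than about 16MB away from the faulty
instruction fails this test, in which case no replacement branch can
be encoded.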

* Impact:
- The code size increase has been measured at 1.5% (geometric
    mean) over our whole set of QA applications, with a peak
    at 3.2%.

* Limitations:
- Veneers are sized for the longest sequence that might be
    generated for a given instruction class (4 words for ldm,
    6 for vldm), even though the actual sequence may be
    shorter.
- Faulty ldm or vldm instructions that are part of an IT block
    are transformed if and only if they appear as the last
    instruction of the block, which is the only position where
    a branch is legitimate. Otherwise the linker emits a
    warning and leaves the instruction unchanged.
- In ldmdb cases where the pc is part of the restored
    register list, the memory access order is reversed compared
    to what it was originally.
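
The IT block restriction relies on knowing how many instructions an
IT instruction covers. A minimal sketch, assuming the standard 16-bit
Thumb IT encoding (0xbf00 | firstcond << 4 | mask, with a non-zero
mask); it_block_length is a hypothetical name, not a function of the
patch:

```c
#include <stdint.h>

/* Return the number of instructions covered by the 16-bit Thumb IT
   instruction INSN, or 0 if INSN is not an IT instruction.  */
static unsigned
it_block_length (uint16_t insn)
{
  unsigned mask = insn & 0xfu;
  unsigned trailing_zeros = 0;

  /* A zero mask with the 0xbf prefix encodes a hint (e.g. nop).  */
  if ((insn & 0xff00u) != 0xbf00u || mask == 0)
    return 0;

  /* The position of the lowest set bit of the mask gives the
     length: 1000 -> 1, x100 -> 2, xx10 -> 3, xxx1 -> 4.  */
  while ((mask & 1u) == 0)
    {
      mask >>= 1;
      trailing_zeros++;
    }
  return 4 - trailing_zeros;
}
```

A faulty ldm found at position n (counting from 1) after the IT
instruction would then be replaceable only when n equals this length.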

Changelogs:

bfd/ChangeLog:

2015-09-17  Laurent Alfonsi <laurent.alfonsi@st.com>
             Christophe Monat <christophe.monat@st.com>

      * bfd-in2.h: Regenerate.
      * bfd-in.h (bfd_arm_stm32l4xx_fix): New enum. Specify how
      STM32L4XX instruction scanning should be done.
      (bfd_elf32_arm_set_stm32l4xx_fix)
      (bfd_elf32_arm_stm32l4xx_erratum_scan)
      (bfd_elf32_arm_stm32l4xx_fix_veneer_locations): Add prototypes.
      (bfd_elf32_arm_set_target_relocs): Add stm32l4xx fix type argument
      to prototype.
      * elf32-arm.c (STM32L4XX_ERRATUM_VENEER_SECTION_NAME)
      (STM32L4XX_ERRATUM_VENEER_ENTRY_NAME): Define macros.
      (elf32_stm32l4xx_erratum_type): New enum.
      (elf32_stm32l4xx_erratum_list): New struct. List of veneers or
      jumps to veneers.
      (_arm_elf_section_data): Add stm32l4xx_erratumcount,
      stm32l4xx_erratumlist.
      (elf32_arm_link_hash_table): Add stm32l4xx_erratum_glue_size,
      stm32l4xx_fix and num_stm32l4xx_fixes fields.
      (ctz): New function.
      (popcount): New function.
      (elf32_arm_link_hash_table_create): Initialize stm32l4xx_fix.
      (put_thumb2_insn): New function.
      (STM32L4XX_ERRATUM_LDM_VENEER_SIZE): Define. Size of a veneer for
      LDM instructions.
      (STM32L4XX_ERRATUM_VLDM_VENEER_SIZE): Define. Size of a veneer for
      VLDM instructions.
      (bfd_elf32_arm_allocate_interworking_sections): Initialise erratum
      glue section.
      (record_stm32l4xx_erratum_veneer): New function. Create a single
      veneer, and its associated symbols.
      (bfd_elf32_arm_add_glue_sections_to_bfd): Add STM32L4XX erratum glue.
      (bfd_elf32_arm_set_stm32l4xx_fix): New function. Set the type of
      erratum workaround required.
      (bfd_elf32_arm_stm32l4xx_fix_veneer_locations): New function. Find
      out where veneers and branches to veneers have been placed in
      virtual memory after layout.
      (is_thumb2_ldmia): New function.
      (is_thumb2_ldmdb): Likewise.
      (is_thumb2_vldm): Likewise.
      (stm32l4xx_need_create_replacing_stub): New function. Decide if a
      veneer must be emitted.
      (bfd_elf32_arm_stm32l4xx_erratum_scan): Scan the sections of an
      input BFD for potential erratum-triggering insns. Record results.
      (bfd_elf32_arm_set_target_relocs): Set stm32l4xx_fix field in
      global hash table.
      (elf32_arm_size_dynamic_sections): Collect glue information.
      (create_instruction_branch_absolute): New function.
      (create_instruction_ldmia): Likewise.
      (create_instruction_ldmdb): Likewise.
      (create_instruction_mov): Likewise.
      (create_instruction_sub): Likewise.
      (create_instruction_vldmia): Likewise.
      (create_instruction_vldmdb): Likewise.
      (create_instruction_udf_w): Likewise.
      (create_instruction_udf): Likewise.
      (push_thumb2_insn32): Likewise.
      (push_thumb2_insn16): Likewise.
      (stm32l4xx_fill_stub_udf): Likewise.
      (stm32l4xx_create_replacing_stub_ldmia): New function. Expands the
      replacing stub for ldmia instructions.
      (stm32l4xx_create_replacing_stub_ldmdb): Likewise for ldmdb.
      (stm32l4xx_create_replacing_stub_vldm): Likewise for vldm.
      (stm32l4xx_create_replacing_stub): New function. Dispatches the
      stub emission to the appropriate functions.
      (elf32_arm_write_section): Output veneers, and branches to veneers.

ld/ChangeLog:

2015-09-17  Laurent Alfonsi <laurent.alfonsi@st.com>
             Christophe Monat <christophe.monat@st.com>

      * ld.texinfo: Description of the STM32L4xx erratum workaround.
      * emultempl/armelf.em (stm32l4xx_fix): New.
      (arm_elf_before_allocation): Choose the type of fix, scan for
      erratum.
      (gld${EMULATION_NAME}_finish): Fix veneer locations.
      (arm_elf_create_output_section_statements): Propagate
      stm32l4xx_fix value.
      (PARSE_AND_LIST_PROLOGUE): Define OPTION_STM32L4XX_FIX.
      (PARSE_AND_LIST_LONGOPTS): Add entry for handling
      --fix-stm32l4xx-629360.
      (PARSE_AND_LIST_OPTION): Add entry for helping on
      --fix-stm32l4xx-629360.
      (PARSE_AND_LIST_ARGS_CASES): Treat OPTION_STM32L4XX_FIX.

ld/testsuite/ChangeLog:

2015-09-17  Laurent Alfonsi <laurent.alfonsi@st.com>
             Christophe Monat <christophe.monat@st.com>

      * ld-arm/arm-elf.exp (armelftests_common): Add STM32L4XX
        tests.
      * ld-arm/stm32l4xx-cannot-fix-far-ldm.d: New.
      * ld-arm/stm32l4xx-cannot-fix-far-ldm.s: Likewise.
      * ld-arm/stm32l4xx-cannot-fix-it-block.d: Likewise.
      * ld-arm/stm32l4xx-cannot-fix-it-block.s: Likewise.
      * ld-arm/stm32l4xx-fix-all.d: Likewise.
      * ld-arm/stm32l4xx-fix-all.s: Likewise.
      * ld-arm/stm32l4xx-fix-it-block.d: Likewise.
      * ld-arm/stm32l4xx-fix-it-block.s: Likewise.
      * ld-arm/stm32l4xx-fix-ldm.d: Likewise.
      * ld-arm/stm32l4xx-fix-ldm.s: Likewise.
      * ld-arm/stm32l4xx-fix-vldm.d: Likewise.
      * ld-arm/stm32l4xx-fix-vldm.s: Likewise.

Patch attached, waiting for your review.
--C
From 32867caf5cb11bd444fada043df4556959195ebc Mon Sep 17 00:00:00 2001
From: Christophe Monat <christophe.monat@st.com>
Date: Thu, 15 Oct 2015 15:52:06 +0200
Subject: [PATCH] Fix for STMicroelectronics hardware erratum STM32L4XX 629360.

        * Description of the problem:
        The STM32L4x6xx family of microcontrollers suffers from a memory
        controller ('FMC') limitation, described in the publicly available
        document:
        <www.st.com/st-web-ui/static/active/en/resource/technical/document/errata_sheet/DM00111498.pdf>
        See "Read burst access of 9 words or more is not supported by
        FMC" for the full description.
        In a few words, the multiple loads that are issued from the code
        (ldm, vldm) through the FMC may receive corrupt data when 9 words
        or more are involved. Accesses should be limited to chunks of at
        most 8 words.

        * Impact of the problem:
        The STM32L476 product is available to anybody worldwide either
        directly through ST or through the distributors.  As there is no
        plan for a revision of that silicon and it will live for at least
        10 years, in the end tens of millions of parts or even more are
        potentially affected.

        * Possible solutions to the problem:
        The compilation tools can be changed to limit their code
        generation to appropriate multiple load sizes, but this solution
        does not account for existing libraries and third-party software
        libraries that cannot be recompiled.

        An alternative solution is to add a specific treatment at
        the linker level that identifies the faulty instruction
        and diverts it to a correct sequence. It can thus deal with any
        pre-existing .o or .a contribution that may expose the issue
        and produce a fixed final executable.

        * Purpose of the patch:
        The work-around is disabled by default, in which case it has no
        observable impact. When it is enabled, it checks if the expected
        architecture is used and warns otherwise. It scans the code
        sections for faulty ldm and vldm instructions and, when one is
        detected, it creates a veneer that contains a fixed sequence and
        a branch back to the following instruction, and replaces the
        original instruction with a branch to that veneer.

        The veneer contents are emitted based on the original
        instruction details, depending on the instruction family (9
        cases for ldm, 3 for vldm) and various other details (PC being
        or not being part of the set of restored registers, write-back
        of the base register, etc.).

        * Implementation:
        The implementation is strongly inspired by the so-called VFP11
        work-around, and works in a similar way with regard to veneer
        management. The only differences are the detection of the specific
        triggering conditions and the emitted code, which is supported by
        quite a few instruction encoding helpers.

        One deviation from the VFP11 work-around is that the veneers are
        not created in a dedicated section (as was done with
        .vfp11_veneer), but in a .text.stm32l4xx_veneer section that is
        handled naturally by the linker scripts.

        * Testing:
        - Unit tests have been written to cover the veneer emission
          code for all ldm and vldm sub-cases, the IT block detection
          logic, the detection of specific IT block issues, and the
          detection of branch overflows when creating veneers. They are
          part of the proposed change.
        - No regression on the ld test suite under the arm-none-eabi and
          arm-linux-gnueabi configurations.
        - Our standard QA has been run with a 4_9-2015q2-20150609
          Linaro embedded compiler patched with these linker changes,
          in two variants: default linker settings and fix enabled,
          with the linker warnings turned into hard errors. Our QA
          contains various commercial test suites, industrial
          benchmarks and proprietary ST applications. We have not
          observed behavioral changes with and without the fix, nor
          detected the occurrence of unhandled sequences (IT block
          limitations), nor observed branch limitations.
        - Hardware testing has been done on an STM32L476 "Discovery
          kit".

        * Impact:
        - The code size increase has been measured at 1.5% (geometric
          mean) over our whole set of QA applications, with a peak
          at 3.2%.

        * Limitations:
        - Veneers are sized for the longest sequence that might be
          generated for a given instruction class (4 words for ldm,
          6 for vldm), even though the actual sequence may be
          shorter.
        - Faulty ldm or vldm instructions that are part of an IT block
          are transformed if and only if they appear as the last
          instruction of the block, which is the only position where
          a branch is legitimate. Otherwise the linker emits a
          warning and leaves the instruction unchanged.
        - In ldmdb cases where the pc is part of the restored
          register list, the memory access order is reversed compared
          to what it was originally.

Signed-off-by: Laurent ALFONSI <laurent.alfonsi@st.com>
---
 bfd/bfd-in.h                                       |   19 +-
 bfd/bfd-in2.h                                      |   19 +-
 bfd/elf32-arm.c                                    | 1433 +++++++++++++++++++-
 ld/emultempl/armelf.em                             |   32 +-
 ld/ld.texinfo                                      |   42 +
 ld/testsuite/ld-arm/arm-elf.exp                    |   18 +
 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-far-ldm.d |   25 +
 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-far-ldm.s |   27 +
 .../ld-arm/stm32l4xx-cannot-fix-it-block.d         |   16 +
 .../ld-arm/stm32l4xx-cannot-fix-it-block.s         |   16 +
 ld/testsuite/ld-arm/stm32l4xx-fix-all.d            |   83 ++
 ld/testsuite/ld-arm/stm32l4xx-fix-all.s            |   22 +
 ld/testsuite/ld-arm/stm32l4xx-fix-it-block.d       |  189 +++
 ld/testsuite/ld-arm/stm32l4xx-fix-it-block.s       |   92 ++
 ld/testsuite/ld-arm/stm32l4xx-fix-ldm.d            |  174 +++
 ld/testsuite/ld-arm/stm32l4xx-fix-ldm.s            |  147 ++
 ld/testsuite/ld-arm/stm32l4xx-fix-vldm.d           |   49 +
 ld/testsuite/ld-arm/stm32l4xx-fix-vldm.s           |   26 +
 18 files changed, 2423 insertions(+), 6 deletions(-)
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-far-ldm.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-far-ldm.s
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-it-block.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-cannot-fix-it-block.s
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-all.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-all.s
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-it-block.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-it-block.s
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-ldm.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-ldm.s
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-vldm.d
 create mode 100644 ld/testsuite/ld-arm/stm32l4xx-fix-vldm.s
