Message ID | 20230925200127.504256-4-Benjamin.Cheatham@amd.com |
---|---|
State | New |
Series | CXL, ACPI, APEI, EINJ: Update EINJ for CXL 1.1 error types |
On Mon, 25 Sep 2023 15:01:27 -0500
Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:

> Update EINJ documentation to include CXL errors in available_error_types
> table and usage of the types.
>
> Also fix a formatting error in the param4 file description that caused
> the description to be on the same line as the bullet point.
>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

A trivial comment inline.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  .../firmware-guide/acpi/apei/einj.rst | 25 ++++++++++++++++---
>  1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
> index d6b61d22f525..c6f28118c48b 100644
> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> @@ -32,6 +32,9 @@ configuration::
>    CONFIG_ACPI_APEI
>    CONFIG_ACPI_APEI_EINJ
>
> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
> +
>  The EINJ user interface is in <debugfs mount point>/apei/einj.
>
>  The following files belong to it:
> @@ -40,9 +43,9 @@ The following files belong to it:
>
>    This file shows which error types are supported:
>
> -  ================  ===================================
> +  ================  =========================================
>    Error Type Value  Error Description
> -  ================  ===================================
> +  ================  =========================================
>    0x00000001        Processor Correctable
>    0x00000002        Processor Uncorrectable non-fatal
>    0x00000004        Processor Uncorrectable fatal
> @@ -55,7 +58,13 @@ The following files belong to it:
>    0x00000200        Platform Correctable
>    0x00000400        Platform Uncorrectable non-fatal
>    0x00000800        Platform Uncorrectable fatal
> -  ================  ===================================
> +  0x00001000        CXL.cache Protocol Correctable
> +  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
> +  0x00004000        CXL.cache Protocol Uncorrectable fatal
> +  0x00008000        CXL.mem Protocol Correctable
> +  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
> +  0x00020000        CXL.mem Protocol Uncorrectable fatal
> +  ================  =========================================
>
>    The format of the file contents are as above, except present are only
>    the available error types.
> @@ -106,6 +115,7 @@ The following files belong to it:
>    Used when the 0x1 bit is set in "flags" to specify the APIC id
>
>  - param4
> +

#Unrelated change. Probably reasonable but should be separate patch really.

>    Used when the 0x4 bit is set in "flags" to specify target PCIe device
>
>  - notrigger
> @@ -159,6 +169,13 @@ and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
>  documentation for details (and expect changes to this API if vendors
>  creativity in using this feature expands beyond our expectations).
>
> +CXL error types are supported from ACPI 6.5 onwards. To use these error
> +types you need the MMIO address of a CXL 1.1 downstream port. You can
> +find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
> +(it's possible that the dport is under the CXL root, in that case the
> +path would be /sys/us/cxl/devices/rootX/dportY/cxl_rcrb_addr).
> +From there, write the address to param1 and continue as you would for a
> +memory error type.
>
>  An error injection example::
>
> @@ -201,4 +218,4 @@ The following sequence can be used:
>    7) Read from the virtual address. This will trigger the error
>
>  For more information about EINJ, please refer to ACPI specification
> -version 4.0, section 17.5 and ACPI 5.0, section 18.6.
> +version 4.0, section 17.5 and ACPI 6.5, section 18.6.
On 9/26/23 6:05 AM, Jonathan Cameron wrote:
> On Mon, 25 Sep 2023 15:01:27 -0500
> Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:
>
>> Update EINJ documentation to include CXL errors in available_error_types
>> table and usage of the types.
>>
>> Also fix a formatting error in the param4 file description that caused
>> the description to be on the same line as the bullet point.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> A trivial comment inline.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
>> ---
>>  .../firmware-guide/acpi/apei/einj.rst | 25 ++++++++++++++++---
>>  1 file changed, 21 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..c6f28118c48b 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -32,6 +32,9 @@ configuration::
>>    CONFIG_ACPI_APEI
>>    CONFIG_ACPI_APEI_EINJ
>>
>> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
>> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
>> +
>>  The EINJ user interface is in <debugfs mount point>/apei/einj.
>>
>>  The following files belong to it:
>> @@ -40,9 +43,9 @@ The following files belong to it:
>>
>>    This file shows which error types are supported:
>>
>> -  ================  ===================================
>> +  ================  =========================================
>>    Error Type Value  Error Description
>> -  ================  ===================================
>> +  ================  =========================================
>>    0x00000001        Processor Correctable
>>    0x00000002        Processor Uncorrectable non-fatal
>>    0x00000004        Processor Uncorrectable fatal
>> @@ -55,7 +58,13 @@ The following files belong to it:
>>    0x00000200        Platform Correctable
>>    0x00000400        Platform Uncorrectable non-fatal
>>    0x00000800        Platform Uncorrectable fatal
>> -  ================  ===================================
>> +  0x00001000        CXL.cache Protocol Correctable
>> +  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
>> +  0x00004000        CXL.cache Protocol Uncorrectable fatal
>> +  0x00008000        CXL.mem Protocol Correctable
>> +  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
>> +  0x00020000        CXL.mem Protocol Uncorrectable fatal
>> +  ================  =========================================
>>
>>    The format of the file contents are as above, except present are only
>>    the available error types.
>> @@ -106,6 +115,7 @@ The following files belong to it:
>>    Used when the 0x1 bit is set in "flags" to specify the APIC id
>>
>>  - param4
>> +
>
> #Unrelated change. Probably reasonable but should be separate patch really.
>

I'll take that out.

Thanks,
Ben

>>    Used when the 0x4 bit is set in "flags" to specify target PCIe device
>>
>>  - notrigger
>> @@ -159,6 +169,13 @@ and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
>>  documentation for details (and expect changes to this API if vendors
>>  creativity in using this feature expands beyond our expectations).
>>
>> +CXL error types are supported from ACPI 6.5 onwards. To use these error
>> +types you need the MMIO address of a CXL 1.1 downstream port. You can
>> +find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
>> +(it's possible that the dport is under the CXL root, in that case the
>> +path would be /sys/us/cxl/devices/rootX/dportY/cxl_rcrb_addr).
>> +From there, write the address to param1 and continue as you would for a
>> +memory error type.
>>
>>  An error injection example::
>>
>> @@ -201,4 +218,4 @@ The following sequence can be used:
>>    7) Read from the virtual address. This will trigger the error
>>
>>  For more information about EINJ, please refer to ACPI specification
>> -version 4.0, section 17.5 and ACPI 5.0, section 18.6.
>> +version 4.0, section 17.5 and ACPI 6.5, section 18.6.
>
On Mon, Sep 25, 2023 at 03:01:27PM -0500, Ben Cheatham wrote:
> Update EINJ documentation to include CXL errors in available_error_types
> table and usage of the types.
>
> Also fix a formatting error in the param4 file description that caused
> the description to be on the same line as the bullet point.
>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> ---
>  .../firmware-guide/acpi/apei/einj.rst | 25 ++++++++++++++++---
>  1 file changed, 21 insertions(+), 4 deletions(-)

I always feel like the documentation update should be in the same
patch as the new functionality so it's easy to match up with the code
and keep things together when backporting. But I know that sentiment
is not universal and maybe there's good reason to keep them separate.

> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
> index d6b61d22f525..c6f28118c48b 100644
> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> @@ -32,6 +32,9 @@ configuration::
>    CONFIG_ACPI_APEI
>    CONFIG_ACPI_APEI_EINJ
>
> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
> ...
On 9/26/23 3:24 PM, Bjorn Helgaas wrote:
> On Mon, Sep 25, 2023 at 03:01:27PM -0500, Ben Cheatham wrote:
>> Update EINJ documentation to include CXL errors in available_error_types
>> table and usage of the types.
>>
>> Also fix a formatting error in the param4 file description that caused
>> the description to be on the same line as the bullet point.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
>> ---
>>  .../firmware-guide/acpi/apei/einj.rst | 25 ++++++++++++++++---
>>  1 file changed, 21 insertions(+), 4 deletions(-)
>
> I always feel like the documentation update should be in the same
> patch as the new functionality so it's easy to match up with the code
> and keep things together when backporting. But I know that sentiment
> is not universal and maybe there's good reason to keep them separate.
>

I put it into a separate patch since the documentation change was substantial,
but if it gets shorter in v6 I'll put it into the previous patch.

Thanks,
Ben

>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..c6f28118c48b 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -32,6 +32,9 @@ configuration::
>>    CONFIG_ACPI_APEI
>>    CONFIG_ACPI_APEI_EINJ
>>
>> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
>> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
>> ...
diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index d6b61d22f525..c6f28118c48b 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -32,6 +32,9 @@ configuration::
   CONFIG_ACPI_APEI
   CONFIG_ACPI_APEI_EINJ

+To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
+value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
+
 The EINJ user interface is in <debugfs mount point>/apei/einj.

 The following files belong to it:
@@ -40,9 +43,9 @@ The following files belong to it:

   This file shows which error types are supported:

-  ================  ===================================
+  ================  =========================================
   Error Type Value  Error Description
-  ================  ===================================
+  ================  =========================================
   0x00000001        Processor Correctable
   0x00000002        Processor Uncorrectable non-fatal
   0x00000004        Processor Uncorrectable fatal
@@ -55,7 +58,13 @@ The following files belong to it:
   0x00000200        Platform Correctable
   0x00000400        Platform Uncorrectable non-fatal
   0x00000800        Platform Uncorrectable fatal
-  ================  ===================================
+  0x00001000        CXL.cache Protocol Correctable
+  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
+  0x00004000        CXL.cache Protocol Uncorrectable fatal
+  0x00008000        CXL.mem Protocol Correctable
+  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
+  0x00020000        CXL.mem Protocol Uncorrectable fatal
+  ================  =========================================

   The format of the file contents are as above, except present are only
   the available error types.
@@ -106,6 +115,7 @@ The following files belong to it:
   Used when the 0x1 bit is set in "flags" to specify the APIC id

 - param4
+
   Used when the 0x4 bit is set in "flags" to specify target PCIe device

 - notrigger
@@ -159,6 +169,13 @@ and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
 documentation for details (and expect changes to this API if vendors
 creativity in using this feature expands beyond our expectations).

+CXL error types are supported from ACPI 6.5 onwards. To use these error
+types you need the MMIO address of a CXL 1.1 downstream port. You can
+find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
+(it's possible that the dport is under the CXL root, in that case the
+path would be /sys/us/cxl/devices/rootX/dportY/cxl_rcrb_addr).
+From there, write the address to param1 and continue as you would for a
+memory error type.

 An error injection example::

@@ -201,4 +218,4 @@ The following sequence can be used:
   7) Read from the virtual address. This will trigger the error

 For more information about EINJ, please refer to ACPI specification
-version 4.0, section 17.5 and ACPI 5.0, section 18.6.
+version 4.0, section 17.5 and ACPI 6.5, section 18.6.
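
The sequence described in the new CXL paragraph might be sketched as below. This is only an illustration and not part of the patch. The helper names (rcrb_addr_path, plan_cxl_injection) and the example address are hypothetical, /sys/kernel/debug is assumed as the debugfs mount point, and the error_type and error_inject file names are assumed from the existing EINJ interface because they are not visible in the hunks above. The sysfs path assumes /sys/bus/ is intended for both the portX and rootX cases. The patch does not say whether param2 or flags must also be written for CXL types, so they are left out. The code only builds the list of writes and does not touch debugfs.

```python
# CXL error type values, copied from the table added by this patch.
CXL_ERROR_TYPES = {
    0x00001000: "CXL.cache Protocol Correctable",
    0x00002000: "CXL.cache Protocol Uncorrectable non-fatal",
    0x00004000: "CXL.cache Protocol Uncorrectable fatal",
    0x00008000: "CXL.mem Protocol Correctable",
    0x00010000: "CXL.mem Protocol Uncorrectable non-fatal",
    0x00020000: "CXL.mem Protocol Uncorrectable fatal",
}


def rcrb_addr_path(port: str, dport: str) -> str:
    """Return the sysfs file that holds the downstream port's MMIO address.

    `port` is e.g. "port1" or "root0"; both layouts use the same pattern
    (assumption: /sys/bus/ is the intended prefix in both cases).
    """
    return f"/sys/bus/cxl/devices/{port}/{dport}/cxl_rcrb_addr"


def plan_cxl_injection(rcrb_addr: int, error_type: int,
                       einj_dir: str = "/sys/kernel/debug/apei/einj"):
    """Build the ordered (file, value) writes for a CXL error injection.

    Nothing is written here; the caller decides whether to apply the plan.
    """
    if error_type not in CXL_ERROR_TYPES:
        raise ValueError(f"{error_type:#x} is not a CXL error type")
    return [
        (f"{einj_dir}/param1", hex(rcrb_addr)),       # address of the dport
        (f"{einj_dir}/error_type", hex(error_type)),  # assumed file name
        (f"{einj_dir}/error_inject", "1"),            # assumed file name
    ]


if __name__ == "__main__":
    # 0xfed00000 is a made-up address; read the real one from rcrb_addr_path().
    for path, value in plan_cxl_injection(0xfed00000, 0x00008000):
        print(f"echo {value} > {path}")
```

Keeping the plan separate from the actual writes means the sequence can be reviewed, or printed as shell commands, before anything is injected on a live system.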
Update EINJ documentation to include CXL errors in available_error_types
table and usage of the types.

Also fix a formatting error in the param4 file description that caused
the description to be on the same line as the bullet point.

Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 .../firmware-guide/acpi/apei/einj.rst | 25 ++++++++++++++++---
 1 file changed, 21 insertions(+), 4 deletions(-)
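
As a rough illustration of how the extended table could be consumed, the sketch below parses text in the documented "value, then description" layout and picks out the CXL entries. It is not part of the patch. The function names are hypothetical, the sample text is made up from rows of the table, and it assumes each line of the file starts with a hex value followed by whitespace and the description.

```python
def parse_error_types(text: str) -> dict:
    """Map each error type value to its description.

    Assumes one entry per line: a hex value, whitespace, then the description.
    """
    types = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        value = int(parts[0], 16)
        types[value] = parts[1].strip() if len(parts) > 1 else ""
    return types


def cxl_types(types: dict) -> dict:
    """Keep only the CXL entries (0x00001000 through 0x00020000 in the table)."""
    return {v: d for v, d in types.items() if 0x00001000 <= v <= 0x00020000}


if __name__ == "__main__":
    # Hypothetical file contents built from rows of the table in this patch.
    sample = (
        "0x00000200\tPlatform Correctable\n"
        "0x00008000\tCXL.mem Protocol Correctable\n"
    )
    for value, desc in cxl_types(parse_error_types(sample)).items():
        print(f"{value:#010x} {desc}")
```

A platform that does not advertise any CXL types would simply yield an empty dictionary from cxl_types, which is a convenient check to run before attempting an injection.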