mbox series

[V4,0/9] CXL: Process event logs

Message ID 20221212070627.1372402-1-ira.weiny@intel.com
Headers show
Series CXL: Process event logs | expand

Message

Ira Weiny Dec. 12, 2022, 7:06 a.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

This code has been tested with a newer qemu which allows for more events to be
returned at a time as well an additional QMP event and interrupt injection.
Those patches will follow once they have been cleaned up.

The series is now in 3 parts:

	1) Base functionality including interrupts
	2) Tracing specific events (Dynamic Capacity Event Record is defered)
	3) cxl-test infrastructure for basic tests

Changes from V3
	Feedback from Dan
	Spit out ACPI changes for Bjorn

- Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/


Davidlohr Bueso (1):
  cxl/mem: Wire up event interrupts

Ira Weiny (8):
  PCI/CXL: Export native CXL error reporting control
  cxl/mem: Read, trace, and clear events on driver load
  cxl/mem: Trace General Media Event Record
  cxl/mem: Trace DRAM Event Record
  cxl/mem: Trace Memory Module Event Record
  cxl/test: Add generic mock events
  cxl/test: Add specific events
  cxl/test: Simulate event log overflow

 drivers/acpi/pci_root.c       |   3 +
 drivers/cxl/core/mbox.c       | 186 +++++++++++++
 drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h             |  16 ++
 drivers/cxl/cxlmem.h          | 171 ++++++++++++
 drivers/cxl/cxlpci.h          |   6 +
 drivers/cxl/pci.c             | 236 +++++++++++++++++
 drivers/pci/probe.c           |   1 +
 include/linux/pci.h           |   1 +
 tools/testing/cxl/test/Kbuild |   2 +-
 tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
 11 files changed, 1452 insertions(+), 1 deletion(-)


base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e

Comments

Dan Williams Dec. 13, 2022, 8:15 p.m. UTC | #1
ira.weiny@ wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> ---
> Changes from V3:
> 	Adjust based on changes in patch 1
> 	Consolidate event setup into cxl_event_config()
> 	Consistently use cxl_event_* for function names
> 	Remove cxl_event_int_is_msi()
> 	Ensure DCD log is ignored in status
> 	Simplify event status loop logic
> 	Dan
> 		Fail driver load if the irq's are not allocated
> 		move cxl_event_config_msgnums() to pci.c
> 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> 		s/devm_kmalloc/devm_kzalloc
> 		Fix up pci_alloc_irq_vectors() comment
> 		Pass pdev to cxl_alloc_irq_vectors()
> 		Check FW irq policy prior to configuration
> 	Jonathan
> 		Use FIELD_GET instead of manual masking
> ---
>  drivers/cxl/cxl.h    |   4 +
>  drivers/cxl/cxlmem.h |  19 ++++
>  drivers/cxl/cxlpci.h |   6 ++
>  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 231 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  				 CXLDEV_EVENT_STATUS_FAIL |	\
>  				 CXLDEV_EVENT_STATUS_FATAL)
>  
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
>  	return 0;
>  }
>  
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> +	int nvecs;
> +
> +	/*
> +	 * CXL requires MSI/MSIX support.

Spec reference please.

"Per CXL 3.0 3.1.1 CXL.io Endpoint a function on a CXL device must not
generate INTx messages if that function participates in CXL.cache or
CXL.mem."

> +	 *
> +	 * Additionally pci_alloc_irq_vectors() handles calling
> +	 * pci_free_irq_vectors() automatically despite not being called
> +	 * pcim_*.  See pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 1) {
> +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> +		return -ENXIO;
> +	}
> +	return 0;
> +}
> +
> +struct cxl_dev_id {
> +	struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> +	struct cxl_dev_id *dev_id = id;
> +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	u32 status;
> +
> +	do {
> +		/*
> +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> +		 * ignore the reserved upper 32 bits
> +		 */
> +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +		/* Ignore logs unknown to the driver */
> +		status &= CXLDEV_EVENT_STATUS_ALL;
> +		if (!status)
> +			break;
> +		cxl_mem_get_event_records(cxlds, status);
> +		cond_resched();
> +	} while (status);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_id *dev_id;
> +	int irq;
> +
> +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> +		return -ENXIO;
> +
> +	/* dev_id must be globally unique and must contain the cxlds */
> +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> +	if (!dev_id)
> +		return -ENOMEM;
> +	dev_id->cxlds = cxlds;
> +
> +	irq =  pci_irq_vector(pdev,
> +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> +	if (irq < 0)
> +		return irq;
> +
> +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> +					 IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> +		.payload_out = policy,
> +		.size_out = sizeof(*policy),
> +	};
> +	int rc;
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0)
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +
> +	return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd;
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;

I prefer this syntax for command payload initialization if only because
it 0-fills all fields:

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index d42d87faddb8..e8b4bd5e517f 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -541,10 +541,12 @@ static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
        struct cxl_mbox_cmd mbox_cmd;
        int rc;
 
-       policy->info_settings = CXL_INT_MSI_MSIX;
-       policy->warn_settings = CXL_INT_MSI_MSIX;
-       policy->failure_settings = CXL_INT_MSI_MSIX;
-       policy->fatal_settings = CXL_INT_MSI_MSIX;
+       *policy = (struct cxl_event_interrupt_policy) {
+               .info_settings = CXL_INT_MSI_MSIX,
+               .warn_settings = CXL_INT_MSI_MSIX,
+               .failure_settings = CXL_INT_MSI_MSIX,
+               .fatal_settings = CXL_INT_MSI_MSIX,
+       };
 
        mbox_cmd = (struct cxl_mbox_cmd) {
                .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,

...but that can be a follow-on fixup because I am not seeing much else
to need respinning this patch.

> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +		.payload_in = policy,
> +		.size_in = sizeof(*policy),
> +	};
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	/* Retrieve final interrupt settings */
> +	return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	rc = cxl_event_config_msgnums(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> +	return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> +			    struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	/*
> +	 * When BIOS maintains CXL error reporting control, it will process
> +	 * event records.  Only one agent can do so.
> +	 */
> +	if (!host_bridge->native_cxl_error)
> +		return 0;
> +
> +	rc = cxl_event_get_int_policy(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	if (cxl_event_int_is_fw(policy.info_settings) ||
> +	    cxl_event_int_is_fw(policy.warn_settings) ||
> +	    cxl_event_int_is_fw(policy.failure_settings) ||
> +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> +		return -EBUSY;
> +	}
> +
> +	rc = cxl_event_irqsetup(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> +	return 0;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_alloc_irq_vectors(pdev);
> +	if (rc)
> +		return rc;
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	/*
> -	 * When BIOS maintains CXL error reporting control, it will process
> -	 * event records.  Only one agent can do so.
> -	 */
> -	if (host_bridge->native_cxl_error)
> -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	rc = cxl_event_config(host_bridge, cxlds);
> +	if (rc)
> +		return rc;

In the future, it would help me if these kinds of policy changes mid
patch set, if they cannot be avoided, are at least declared in the
changelog. It saps some reviewer resources to invalidate previous
understanding within the same review session.
Jonathan Cameron Dec. 16, 2022, 12:25 p.m. UTC | #2
On Sun, 11 Dec 2022 23:06:18 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> This code has been tested with a newer qemu which allows for more events to be
> returned at a time as well an additional QMP event and interrupt injection.
> Those patches will follow once they have been cleaned up.
> 
> The series is now in 3 parts:
> 
> 	1) Base functionality including interrupts
> 	2) Tracing specific events (Dynamic Capacity Event Record is defered)
> 	3) cxl-test infrastructure for basic tests
> 
> Changes from V3
> 	Feedback from Dan
> 	Spit out ACPI changes for Bjorn
> 
> - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/

Because I'm in a grumpy mood (as my colleagues will attest!)...
This is dependent on the patch that moves the trace definitions and
that's not upstream yet except in cxl/preview which is optimistic
place to use for a base commit.  The id isn't the one below either which
isn't in either mailine or the current CXL trees.

Not that I actually checked the cover letter until it failed to apply
(and hence already knew what was missing) but still, please call out
dependencies unless they are in the branches Dan has queued up to push.

I just want to play with Dave's fix for the RAS errors so having to jump
through these other sets.

Thanks,

Jonathan

> 
> 
> Davidlohr Bueso (1):
>   cxl/mem: Wire up event interrupts
> 
> Ira Weiny (8):
>   PCI/CXL: Export native CXL error reporting control
>   cxl/mem: Read, trace, and clear events on driver load
>   cxl/mem: Trace General Media Event Record
>   cxl/mem: Trace DRAM Event Record
>   cxl/mem: Trace Memory Module Event Record
>   cxl/test: Add generic mock events
>   cxl/test: Add specific events
>   cxl/test: Simulate event log overflow
> 
>  drivers/acpi/pci_root.c       |   3 +
>  drivers/cxl/core/mbox.c       | 186 +++++++++++++
>  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h             |  16 ++
>  drivers/cxl/cxlmem.h          | 171 ++++++++++++
>  drivers/cxl/cxlpci.h          |   6 +
>  drivers/cxl/pci.c             | 236 +++++++++++++++++
>  drivers/pci/probe.c           |   1 +
>  include/linux/pci.h           |   1 +
>  tools/testing/cxl/test/Kbuild |   2 +-
>  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
>  11 files changed, 1452 insertions(+), 1 deletion(-)
> 
> 
> base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
Jonathan Cameron Dec. 16, 2022, 2:09 p.m. UTC | #3
On Sun, 11 Dec 2022 23:06:19 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
Jonathan Cameron Dec. 16, 2022, 2:24 p.m. UTC | #4
On Sun, 11 Dec 2022 23:06:21 -0800
ira.weiny@intel.com wrote:

> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 

> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;

This is an issue for your QEMU code which has this set at 5 bytes.
Guess our handling of record lengths needs updating now we have two different
spec versions to support and hence these can have multiple lengths.

Btw, do you have an updated version of the QEMU patches you can share?
I was planning on just doing the AER type RAS stuff for the first pull this cycle
but given this set means we never reach that code I probably need to do QEMU
support for this and the stuff to support those all in one go - otherwise
no one will be able to test it :)  We rather optimistically have the OSC set
to say the OS can have control of these, but upstream code doesn't emulate
anything yet. Oops. Should have pretended the hardware was handling them
until we had this support in place in QEMU.

Jonathan

> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
Dan Williams Dec. 16, 2022, 5:01 p.m. UTC | #5
Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:18 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > This code has been tested with a newer qemu which allows for more events to be
> > returned at a time as well an additional QMP event and interrupt injection.
> > Those patches will follow once they have been cleaned up.
> > 
> > The series is now in 3 parts:
> > 
> > 	1) Base functionality including interrupts
> > 	2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > 	3) cxl-test infrastructure for basic tests
> > 
> > Changes from V3
> > 	Feedback from Dan
> > 	Spit out ACPI changes for Bjorn
> > 
> > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/
> 
> Because I'm in a grumpy mood (as my colleagues will attest!)...
> This is dependent on the patch that moves the trace definitions and
> that's not upstream yet except in cxl/preview which is optimistic
> place to use for a base commit.  The id isn't the one below either which
> isn't in either mailine or the current CXL trees.

I do not want to commit to a new baseline until after -rc1, so yes this
is in a messy period.

> Not that I actually checked the cover letter until it failed to apply
> (and hence already knew what was missing) but still, please call out
> dependencies unless they are in the branches Dan has queued up to push.
> 
> I just want to play with Dave's fix for the RAS errors so having to jump
> through these other sets.

Yes, that is annoying, apologies.

> 
> Thanks,
> 
> Jonathan
> 
> > 
> > 
> > Davidlohr Bueso (1):
> >   cxl/mem: Wire up event interrupts
> > 
> > Ira Weiny (8):
> >   PCI/CXL: Export native CXL error reporting control
> >   cxl/mem: Read, trace, and clear events on driver load
> >   cxl/mem: Trace General Media Event Record
> >   cxl/mem: Trace DRAM Event Record
> >   cxl/mem: Trace Memory Module Event Record
> >   cxl/test: Add generic mock events
> >   cxl/test: Add specific events
> >   cxl/test: Simulate event log overflow
> > 
> >  drivers/acpi/pci_root.c       |   3 +
> >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h             |  16 ++
> >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> >  drivers/cxl/cxlpci.h          |   6 +
> >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> >  drivers/pci/probe.c           |   1 +
> >  include/linux/pci.h           |   1 +
> >  tools/testing/cxl/test/Kbuild |   2 +-
> >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > 
> > 
> > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> 

I think going forward these base-commits need to be something that are
reachable on cxl.git. For now I have pushed out a baseline for both Dave
and Ira's patches to cxl/preview which will rebase after -rc1 comes out.

Just the small matter of needing some acks/reviews on those lead in
patches so I can move them to through cxl/pending to cxl/next:

http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com
Ira Weiny Dec. 16, 2022, 6:15 p.m. UTC | #6
On Fri, Dec 16, 2022 at 09:01:13AM -0800, Dan Williams wrote:
> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > ira.weiny@intel.com wrote:
> > 
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time as well an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > > 
> > > The series is now in 3 parts:
> > > 
> > > 	1) Base functionality including interrupts
> > > 	2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > > 	3) cxl-test infrastructure for basic tests
> > > 
> > > Changes from V3
> > > 	Feedback from Dan
> > > 	Spit out ACPI changes for Bjorn
> > > 
> > > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/
> > 
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions and
> > that's not upstream yet except in cxl/preview which is optimistic
> > place to use for a base commit.  The id isn't the one below either which
> > isn't in either mailine or the current CXL trees.
> 
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.
> 
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing) but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> > 
> > I just want to play with Dave's fix for the RAS errors so having to jump
> > through these other sets.
> 
> Yes, that is annoying, apologies.
> 
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> > > 
> > > 
> > > Davidlohr Bueso (1):
> > >   cxl/mem: Wire up event interrupts
> > > 
> > > Ira Weiny (8):
> > >   PCI/CXL: Export native CXL error reporting control
> > >   cxl/mem: Read, trace, and clear events on driver load
> > >   cxl/mem: Trace General Media Event Record
> > >   cxl/mem: Trace DRAM Event Record
> > >   cxl/mem: Trace Memory Module Event Record
> > >   cxl/test: Add generic mock events
> > >   cxl/test: Add specific events
> > >   cxl/test: Simulate event log overflow
> > > 
> > >  drivers/acpi/pci_root.c       |   3 +
> > >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> > >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h             |  16 ++
> > >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> > >  drivers/cxl/cxlpci.h          |   6 +
> > >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> > >  drivers/pci/probe.c           |   1 +
> > >  include/linux/pci.h           |   1 +
> > >  tools/testing/cxl/test/Kbuild |   2 +-
> > >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> > >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> > 
> 
> I think going forward these base-commits need to be something that are
> reachable on cxl.git.

Agreed.  I thought this was in preview.  But even preview is not stable and I
should have waited and asked to see this land in next first.

Ira

> For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
> 
> Just the small matter of needing some acks/reviews on those lead in
> patches so I can move them to through cxl/pending to cxl/next:
> 
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com
Jonathan Cameron Dec. 16, 2022, 6:21 p.m. UTC | #7
On Sun, 11 Dec 2022 23:06:21 -0800
ira.weiny@intel.com wrote:

> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

Sometimes it feels like we go around in circles as I'm sure we've
fixed this at least 3 times now in different sets that got dropped
(was definitely in the DOE interrupt support set and was controversial ;)

Nothing in this patch set calls pci_set_master() so no interrupts
will even be delivered.  I don't think there are any sets in flight
that would fix that.

Jonathan


> 
> ---
> Changes from V3:
> 	Adjust based on changes in patch 1
> 	Consolidate event setup into cxl_event_config()
> 	Consistently use cxl_event_* for function names
> 	Remove cxl_event_int_is_msi()
> 	Ensure DCD log is ignored in status
> 	Simplify event status loop logic
> 	Dan
> 		Fail driver load if the irq's are not allocated
> 		move cxl_event_config_msgnums() to pci.c
> 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> 		s/devm_kmalloc/devm_kzalloc
> 		Fix up pci_alloc_irq_vectors() comment
> 		Pass pdev to cxl_alloc_irq_vectors()
> 		Check FW irq policy prior to configuration
> 	Jonathan
> 		Use FIELD_GET instead of manual masking
> ---
>  drivers/cxl/cxl.h    |   4 +
>  drivers/cxl/cxlmem.h |  19 ++++
>  drivers/cxl/cxlpci.h |   6 ++
>  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 231 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  				 CXLDEV_EVENT_STATUS_FAIL |	\
>  				 CXLDEV_EVENT_STATUS_FATAL)
>  
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
>  	return 0;
>  }
>  
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> +	int nvecs;
> +
> +	/*
> +	 * CXL requires MSI/MSIX support.
> +	 *
> +	 * Additionally pci_alloc_irq_vectors() handles calling
> +	 * pci_free_irq_vectors() automatically despite not being called
> +	 * pcim_*.  See pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 1) {
> +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> +		return -ENXIO;
> +	}
> +	return 0;
> +}
> +
> +struct cxl_dev_id {
> +	struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> +	struct cxl_dev_id *dev_id = id;
> +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	u32 status;
> +
> +	do {
> +		/*
> +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> +		 * ignore the reserved upper 32 bits
> +		 */
> +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +		/* Ignore logs unknown to the driver */
> +		status &= CXLDEV_EVENT_STATUS_ALL;
> +		if (!status)
> +			break;
> +		cxl_mem_get_event_records(cxlds, status);
> +		cond_resched();
> +	} while (status);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_id *dev_id;
> +	int irq;
> +
> +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> +		return -ENXIO;
> +
> +	/* dev_id must be globally unique and must contain the cxlds */
> +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> +	if (!dev_id)
> +		return -ENOMEM;
> +	dev_id->cxlds = cxlds;
> +
> +	irq =  pci_irq_vector(pdev,
> +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> +	if (irq < 0)
> +		return irq;
> +
> +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> +					 IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> +		.payload_out = policy,
> +		.size_out = sizeof(*policy),
> +	};
> +	int rc;
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0)
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +
> +	return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd;
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +		.payload_in = policy,
> +		.size_in = sizeof(*policy),
> +	};
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	/* Retrieve final interrupt settings */
> +	return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	rc = cxl_event_config_msgnums(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> +	return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> +			    struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	/*
> +	 * When BIOS maintains CXL error reporting control, it will process
> +	 * event records.  Only one agent can do so.
> +	 */
> +	if (!host_bridge->native_cxl_error)
> +		return 0;
> +
> +	rc = cxl_event_get_int_policy(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	if (cxl_event_int_is_fw(policy.info_settings) ||
> +	    cxl_event_int_is_fw(policy.warn_settings) ||
> +	    cxl_event_int_is_fw(policy.failure_settings) ||
> +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> +		return -EBUSY;
> +	}
> +
> +	rc = cxl_event_irqsetup(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> +	return 0;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_alloc_irq_vectors(pdev);
> +	if (rc)
> +		return rc;
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	/*
> -	 * When BIOS maintains CXL error reporting control, it will process
> -	 * event records.  Only one agent can do so.
> -	 */
> -	if (host_bridge->native_cxl_error)
> -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	rc = cxl_event_config(host_bridge, cxlds);
> +	if (rc)
> +		return rc;
>  
>  	if (cxlds->regs.ras) {
>  		pci_enable_pcie_error_reporting(pdev);
Jonathan Cameron Dec. 16, 2022, 6:39 p.m. UTC | #8
On Fri, 16 Dec 2022 09:01:13 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > ira.weiny@intel.com wrote:
> >   
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time as well an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > > 
> > > The series is now in 3 parts:
> > > 
> > > 	1) Base functionality including interrupts
> > > 	2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > > 	3) cxl-test infrastructure for basic tests
> > > 
> > > Changes from V3
> > > 	Feedback from Dan
> > > 	Spit out ACPI changes for Bjorn
> > > 
> > > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/  
> > 
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions and
> > that's not upstream yet except in cxl/preview which is optimistic
> > place to use for a base commit.  The id isn't the one below either which
> > isn't in either mailine or the current CXL trees.  
> 
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.

Fully understood. I only push trees out as 'testing' for 0-day to hit
until I can rebase on rc1.

> 
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing) but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> > 
> > I just want to play with Dave's fix for the RAS errors so having to jump
> > through these other sets.  
> 
> Yes, that is annoying, apologies.
Not really a problem I just felt like grumbling :)

Have a good weekend.

Jonathan

> 
> > 
> > Thanks,
> > 
> > Jonathan
> >   
> > > 
> > > 
> > > Davidlohr Bueso (1):
> > >   cxl/mem: Wire up event interrupts
> > > 
> > > Ira Weiny (8):
> > >   PCI/CXL: Export native CXL error reporting control
> > >   cxl/mem: Read, trace, and clear events on driver load
> > >   cxl/mem: Trace General Media Event Record
> > >   cxl/mem: Trace DRAM Event Record
> > >   cxl/mem: Trace Memory Module Event Record
> > >   cxl/test: Add generic mock events
> > >   cxl/test: Add specific events
> > >   cxl/test: Simulate event log overflow
> > > 
> > >  drivers/acpi/pci_root.c       |   3 +
> > >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> > >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h             |  16 ++
> > >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> > >  drivers/cxl/cxlpci.h          |   6 +
> > >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> > >  drivers/pci/probe.c           |   1 +
> > >  include/linux/pci.h           |   1 +
> > >  tools/testing/cxl/test/Kbuild |   2 +-
> > >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> > >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e  
> >   
> 
> I think going forward these base-commits need to be something that are
> reachable on cxl.git. For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
> 
> Just the small matter of needing some acks/reviews on those lead in
> patches so I can move them to through cxl/pending to cxl/next:


Don't move too fast with Ira's.  Some issues coming up in testing..
(admittedly half of them were things where QEMU hadn't kept up with
what the kernel code now uses).


> 
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com
Jonathan Cameron Dec. 16, 2022, 6:42 p.m. UTC | #9
On Fri, 16 Dec 2022 14:24:38 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Sun, 11 Dec 2022 23:06:21 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support less than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports less
> > then the PCI infrastructure will allocate that number.  Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> >   
> 
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;  
> 
> This is an issue for your QEMU code which has this set at 5 bytes.
> Guess our handling of record lengths needs updating now we have two different
> spec versions to support and hence these can have multiple lengths.
> 
> Btw, do you have an updated version of the QEMU patches you can share?

Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
stuff you have already done!

Jonathan

> I was planning on just doing the AER type RAS stuff for the first pull this cycle
> but given this set means we never reach that code I probably need to do QEMU
> support for this and the stuff to support those all in one go - otherwise
> no one will be able to test it :)  We rather optimistically have the OSC set
> to say the OS can have control of these, but upstream code doesn't emulate
> anything yet. Oops. Should have pretended the hardware was handling them
> until we had this support in place in QEMU.
> 
> Jonathan
> 
> > +} __packed;
> > +
> >  /**
> >   * struct cxl_event_state - Event log driver state
> >   *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
Ira Weiny Dec. 16, 2022, 9:28 p.m. UTC | #10
On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> On Fri, 16 Dec 2022 14:24:38 +0000
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > ira.weiny@intel.com wrote:
> > 
> > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries.  The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > > 
> > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > then the PCI infrastructure will allocate that number.  Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > > 
> > > CXL device events are signaled via interrupts.  Each event log may have
> > > a different interrupt message number.  These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > > 
> > > Add interrupt support for event logs.  Interrupts are allocated as
> > > shared interrupts.  Therefore, all or some event logs can share the same
> > > message number.
> > > 
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > > 
> > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > >   
> > 
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > +	CXL_INT_NONE		= 0x00,
> > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > +	CXL_INT_FW		= 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > +	u8 info_settings;
> > > +	u8 warn_settings;
> > > +	u8 failure_settings;
> > > +	u8 fatal_settings;  
> > 
> > This is an issue for your QEMU code which has this set at 5 bytes.
> > Guess our handling of record lengths needs updating now we have two different
> > spec versions to support and hence these can have multiple lengths.
> > 
> > Btw, do you have an updated version of the QEMU patches you can share?
> 
> Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> stuff you have already done!

I do have updates to handle more than 3 records at a time which I was polishing
until I saw your review.

I was getting all geared up to respin the patches but I'm not sure there is an
issue now that I look at it.

I'm still investigating.

Ira

> 
> Jonathan
> 
> > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > but given this set means we never reach that code I probably need to do QEMU
> > support for this and the stuff to support those all in one go - otherwise
> > no one will be able to test it :)  We rather optimistically have the OSC set
> > to say the OS can have control of these, but upstream code doesn't emulate
> > anything yet. Oops. Should have pretended the hardware was handling them
> > until we had this support in place in QEMU.
> > 
> > Jonathan
> > 
> > > +} __packed;
> > > +
> > >  /**
> > >   * struct cxl_event_state - Event log driver state
> > >   *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,  
>
Ira Weiny Dec. 16, 2022, 9:33 p.m. UTC | #11
On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:21 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support less than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports less
> > then the PCI infrastructure will allocate that number.  Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> Sometimes it feels like we go around in circles as I'm sure we've
> fixed this at least 3 times now in different sets that got dropped
> (was definitely in the DOE interrupt support set and was controversial ;)

Well I certainly seem to be spinning ATM.  I probably need more eggnog.  ;-)

> 
> Nothing in this patch set calls pci_set_master() so no interrupts
> will even be delivered.

Is this a bug in Qemu then?  Because this _is_ tested and irqs _were_
delivered.  I thought we determined that pci_set_master() was called in
pcim_enable_device() but I don't see that now.  :-(

What I do know is this works with Qemu.  :-/

Ira

> I don't think there are any sets in flight
> that would fix that.
> 
> Jonathan
> 
> 
> > 
> > ---
> > Changes from V3:
> > 	Adjust based on changes in patch 1
> > 	Consolidate event setup into cxl_event_config()
> > 	Consistently use cxl_event_* for function names
> > 	Remove cxl_event_int_is_msi()
> > 	Ensure DCD log is ignored in status
> > 	Simplify event status loop logic
> > 	Dan
> > 		Fail driver load if the irq's are not allocated
> > 		move cxl_event_config_msgnums() to pci.c
> > 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > 		s/devm_kmalloc/devm_kzalloc
> > 		Fix up pci_alloc_irq_vectors() comment
> > 		Pass pdev to cxl_alloc_irq_vectors()
> > 		Check FW irq policy prior to configuration
> > 	Jonathan
> > 		Use FIELD_GET instead of manual masking
> > ---
> >  drivers/cxl/cxl.h    |   4 +
> >  drivers/cxl/cxlmem.h |  19 ++++
> >  drivers/cxl/cxlpci.h |   6 ++
> >  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
> >  4 files changed, 231 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 5974d1082210..b3964149c77b 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> >  				 CXLDEV_EVENT_STATUS_FAIL |	\
> >  				 CXLDEV_EVENT_STATUS_FATAL)
> >  
> > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> > +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> > +
> >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> >  	struct range dvsec_range[2];
> >  };
> >  
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;
> > +} __packed;
> > +
> >  /**
> >   * struct cxl_event_state - Event log driver state
> >   *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index 77dbdb980b12..a8ea04f536ab 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -53,6 +53,12 @@
> >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> >  
> > +/*
> > + * NOTE: Currently all the functions which are enabled for CXL require their
> > + * vectors to be in the first 16.  Use this as the default max.
> > + */
> > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > +
> >  /* Register Block Identifier (RBI) */
> >  enum cxl_regloc_type {
> >  	CXL_REGLOC_RBI_EMPTY = 0,
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index a2d8382bc593..d42d87faddb8 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> >  	return 0;
> >  }
> >  
> > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > +{
> > +	int nvecs;
> > +
> > +	/*
> > +	 * CXL requires MSI/MSIX support.
> > +	 *
> > +	 * Additionally pci_alloc_irq_vectors() handles calling
> > +	 * pci_free_irq_vectors() automatically despite not being called
> > +	 * pcim_*.  See pci_setup_msi_context().
> > +	 */
> > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > +	if (nvecs < 1) {
> > +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > +		return -ENXIO;
> > +	}
> > +	return 0;
> > +}
> > +
> > +struct cxl_dev_id {
> > +	struct cxl_dev_state *cxlds;
> > +};
> > +
> > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_id *dev_id = id;
> > +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> > +	u32 status;
> > +
> > +	do {
> > +		/*
> > +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > +		 * ignore the reserved upper 32 bits
> > +		 */
> > +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +		/* Ignore logs unknown to the driver */
> > +		status &= CXLDEV_EVENT_STATUS_ALL;
> > +		if (!status)
> > +			break;
> > +		cxl_mem_get_event_records(cxlds, status);
> > +		cond_resched();
> > +	} while (status);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct cxl_dev_id *dev_id;
> > +	int irq;
> > +
> > +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > +		return -ENXIO;
> > +
> > +	/* dev_id must be globally unique and must contain the cxlds */
> > +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > +	if (!dev_id)
> > +		return -ENOMEM;
> > +	dev_id->cxlds = cxlds;
> > +
> > +	irq =  pci_irq_vector(pdev,
> > +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > +	if (irq < 0)
> > +		return irq;
> > +
> > +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > +					 IRQF_SHARED, NULL, dev_id);
> > +}
> > +
> > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > +				    struct cxl_event_interrupt_policy *policy)
> > +{
> > +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > +		.payload_out = policy,
> > +		.size_out = sizeof(*policy),
> > +	};
> > +	int rc;
> > +
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0)
> > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > +			rc);
> > +
> > +	return rc;
> > +}
> > +
> > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > +				    struct cxl_event_interrupt_policy *policy)
> > +{
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > +		.payload_in = policy,
> > +		.size_in = sizeof(*policy),
> > +	};
> > +
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0) {
> > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	/* Retrieve final interrupt settings */
> > +	return cxl_event_get_int_policy(cxlds, policy);
> > +}
> > +
> > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy policy;
> > +	int rc;
> > +
> > +	rc = cxl_event_config_msgnums(cxlds, &policy);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > +		return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static bool cxl_event_int_is_fw(u8 setting)
> > +{
> > +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > +
> > +	return mode == CXL_INT_FW;
> > +}
> > +
> > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > +			    struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy policy;
> > +	int rc;
> > +
> > +	/*
> > +	 * When BIOS maintains CXL error reporting control, it will process
> > +	 * event records.  Only one agent can do so.
> > +	 */
> > +	if (!host_bridge->native_cxl_error)
> > +		return 0;
> > +
> > +	rc = cxl_event_get_int_policy(cxlds, &policy);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (cxl_event_int_is_fw(policy.info_settings) ||
> > +	    cxl_event_int_is_fw(policy.warn_settings) ||
> > +	    cxl_event_int_is_fw(policy.failure_settings) ||
> > +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> > +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	rc = cxl_event_irqsetup(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +
> > +	return 0;
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	rc = cxl_alloc_irq_vectors(pdev);
> > +	if (rc)
> > +		return rc;
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > -	/*
> > -	 * When BIOS maintains CXL error reporting control, it will process
> > -	 * event records.  Only one agent can do so.
> > -	 */
> > -	if (host_bridge->native_cxl_error)
> > -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +	rc = cxl_event_config(host_bridge, cxlds);
> > +	if (rc)
> > +		return rc;
> >  
> >  	if (cxlds->regs.ras) {
> >  		pci_enable_pcie_error_reporting(pdev);
>
Jonathan Cameron Dec. 17, 2022, 4:40 p.m. UTC | #12
On Fri, 16 Dec 2022 13:28:36 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> > On Fri, 16 Dec 2022 14:24:38 +0000
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >   
> > > On Sun, 11 Dec 2022 23:06:21 -0800
> > > ira.weiny@intel.com wrote:
> > >   
> > > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > > 
> > > > Currently the only CXL features targeted for irq support require their
> > > > message numbers to be within the first 16 entries.  The device may
> > > > however support less than 16 entries depending on the support it
> > > > provides.
> > > > 
> > > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > > then the PCI infrastructure will allocate that number.  Upon successful
> > > > allocation, users can plug in their respective isr at any point
> > > > thereafter.
> > > > 
> > > > CXL device events are signaled via interrupts.  Each event log may have
> > > > a different interrupt message number.  These message numbers are
> > > > reported in the Get Event Interrupt Policy mailbox command.
> > > > 
> > > > Add interrupt support for event logs.  Interrupts are allocated as
> > > > shared interrupts.  Therefore, all or some event logs can share the same
> > > > message number.
> > > > 
> > > > In addition all logs are queried on any interrupt in order of the most
> > > > to least severe based on the status register.
> > > > 
> > > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > > >     
> > >   
> > > > +/**
> > > > + * Event Interrupt Policy
> > > > + *
> > > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > > + */
> > > > +enum cxl_event_int_mode {
> > > > +	CXL_INT_NONE		= 0x00,
> > > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > > +	CXL_INT_FW		= 0x02
> > > > +};
> > > > +struct cxl_event_interrupt_policy {
> > > > +	u8 info_settings;
> > > > +	u8 warn_settings;
> > > > +	u8 failure_settings;
> > > > +	u8 fatal_settings;    
> > > 
> > > This is an issue for your QEMU code which has this set at 5 bytes.
> > > Guess our handling of record lengths needs updating now we have two different
> > > spec versions to support and hence these can have multiple lengths.
> > > 
> > > Btw, do you have an updated version of the QEMU patches you can share?  
> > 
> > Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> > stuff you have already done!  
> 
> I do have updates to handle more than 3 records at a time which I was polishing
> until I saw your review.

Great. I'll work on other stuff in the meantime and hopefully we can get all the
error emulation upstream this QEMU cycle.

> 
> I was getting all geared up to respin the patches but I'm not sure there is an
> issue now that I look at it.
> 
> I'm still investigating.
> 
> Ira
> 
> > 
> > Jonathan
> >   
> > > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > > but given this set means we never reach that code I probably need to do QEMU
> > > support for this and the stuff to support those all in one go - otherwise
> > > no one will be able to test it :)  We rather optimistically have the OSC set
> > > to say the OS can have control of these, but upstream code doesn't emulate
> > > anything yet. Oops. Should have pretended the hardware was handling them
> > > until we had this support in place in QEMU.
> > > 
> > > Jonathan
> > >   
> > > > +} __packed;
> > > > +
> > > >  /**
> > > >   * struct cxl_event_state - Event log driver state
> > > >   *
> > > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,    
> >
Jonathan Cameron Dec. 17, 2022, 4:43 p.m. UTC | #13
On Fri, 16 Dec 2022 13:33:43 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > ira.weiny@intel.com wrote:
> >   
> > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries.  The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > > 
> > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > then the PCI infrastructure will allocate that number.  Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > > 
> > > CXL device events are signaled via interrupts.  Each event log may have
> > > a different interrupt message number.  These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > > 
> > > Add interrupt support for event logs.  Interrupts are allocated as
> > > shared interrupts.  Therefore, all or some event logs can share the same
> > > message number.
> > > 
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > > 
> > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>  
> > 
> > Sometimes it feels like we go around in circles as I'm sure we've
> > fixed this at least 3 times now in different sets that got dropped
> > (was definitely in the DOE interrupt support set and was controversial ;)  
> 
> Well I certainly seem to be spinning ATM.  I probably need more eggnog.  ;-)
> 
> > 
> > Nothing in this patch set calls pci_set_master() so no interrupts
> > will even be delivered.  
> 
> Is this a bug in Qemu then?  Because this _is_ tested and irqs _were_
> delivered.  I thought we determined that pci_set_master() was called in
> pcim_enable_device() but I don't see that now.  :-(
> 
> What I do know is this works with Qemu.  :-/

Without the addition of a pci_set_master() the kernel code, 
I don't get any interrupts on my arm64 test setup. Only reason I realized the
call was missing.  Possible there is something different on x86 I guess.

Jonathan

> 
> Ira
> 
> > I don't think there are any sets in flight
> > that would fix that.
> > 
> > Jonathan
> > 
> >   
> > > 
> > > ---
> > > Changes from V3:
> > > 	Adjust based on changes in patch 1
> > > 	Consolidate event setup into cxl_event_config()
> > > 	Consistently use cxl_event_* for function names
> > > 	Remove cxl_event_int_is_msi()
> > > 	Ensure DCD log is ignored in status
> > > 	Simplify event status loop logic
> > > 	Dan
> > > 		Fail driver load if the irq's are not allocated
> > > 		move cxl_event_config_msgnums() to pci.c
> > > 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > > 		s/devm_kmalloc/devm_kzalloc
> > > 		Fix up pci_alloc_irq_vectors() comment
> > > 		Pass pdev to cxl_alloc_irq_vectors()
> > > 		Check FW irq policy prior to configuration
> > > 	Jonathan
> > > 		Use FIELD_GET instead of manual masking
> > > ---
> > >  drivers/cxl/cxl.h    |   4 +
> > >  drivers/cxl/cxlmem.h |  19 ++++
> > >  drivers/cxl/cxlpci.h |   6 ++
> > >  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
> > >  4 files changed, 231 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index 5974d1082210..b3964149c77b 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> > >  				 CXLDEV_EVENT_STATUS_FAIL |	\
> > >  				 CXLDEV_EVENT_STATUS_FATAL)
> > >  
> > > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > > +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> > > +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> > > +
> > >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> > >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> > >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> > >  	struct range dvsec_range[2];
> > >  };
> > >  
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > +	CXL_INT_NONE		= 0x00,
> > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > +	CXL_INT_FW		= 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > +	u8 info_settings;
> > > +	u8 warn_settings;
> > > +	u8 failure_settings;
> > > +	u8 fatal_settings;
> > > +} __packed;
> > > +
> > >  /**
> > >   * struct cxl_event_state - Event log driver state
> > >   *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > > index 77dbdb980b12..a8ea04f536ab 100644
> > > --- a/drivers/cxl/cxlpci.h
> > > +++ b/drivers/cxl/cxlpci.h
> > > @@ -53,6 +53,12 @@
> > >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> > >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> > >  
> > > +/*
> > > + * NOTE: Currently all the functions which are enabled for CXL require their
> > > + * vectors to be in the first 16.  Use this as the default max.
> > > + */
> > > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > > +
> > >  /* Register Block Identifier (RBI) */
> > >  enum cxl_regloc_type {
> > >  	CXL_REGLOC_RBI_EMPTY = 0,
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index a2d8382bc593..d42d87faddb8 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > >  	return 0;
> > >  }
> > >  
> > > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > > +{
> > > +	int nvecs;
> > > +
> > > +	/*
> > > +	 * CXL requires MSI/MSIX support.
> > > +	 *
> > > +	 * Additionally pci_alloc_irq_vectors() handles calling
> > > +	 * pci_free_irq_vectors() automatically despite not being called
> > > +	 * pcim_*.  See pci_setup_msi_context().
> > > +	 */
> > > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > > +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > > +	if (nvecs < 1) {
> > > +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > > +		return -ENXIO;
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +struct cxl_dev_id {
> > > +	struct cxl_dev_state *cxlds;
> > > +};
> > > +
> > > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > > +{
> > > +	struct cxl_dev_id *dev_id = id;
> > > +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> > > +	u32 status;
> > > +
> > > +	do {
> > > +		/*
> > > +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > > +		 * ignore the reserved upper 32 bits
> > > +		 */
> > > +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > +		/* Ignore logs unknown to the driver */
> > > +		status &= CXLDEV_EVENT_STATUS_ALL;
> > > +		if (!status)
> > > +			break;
> > > +		cxl_mem_get_event_records(cxlds, status);
> > > +		cond_resched();
> > > +	} while (status);
> > > +
> > > +	return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > > +{
> > > +	struct device *dev = cxlds->dev;
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	struct cxl_dev_id *dev_id;
> > > +	int irq;
> > > +
> > > +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > > +		return -ENXIO;
> > > +
> > > +	/* dev_id must be globally unique and must contain the cxlds */
> > > +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > > +	if (!dev_id)
> > > +		return -ENOMEM;
> > > +	dev_id->cxlds = cxlds;
> > > +
> > > +	irq =  pci_irq_vector(pdev,
> > > +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > > +	if (irq < 0)
> > > +		return irq;
> > > +
> > > +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > > +					 IRQF_SHARED, NULL, dev_id);
> > > +}
> > > +
> > > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > > +				    struct cxl_event_interrupt_policy *policy)
> > > +{
> > > +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > > +		.payload_out = policy,
> > > +		.size_out = sizeof(*policy),
> > > +	};
> > > +	int rc;
> > > +
> > > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +	if (rc < 0)
> > > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > > +			rc);
> > > +
> > > +	return rc;
> > > +}
> > > +
> > > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > +				    struct cxl_event_interrupt_policy *policy)
> > > +{
> > > +	struct cxl_mbox_cmd mbox_cmd;
> > > +	int rc;
> > > +
> > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > > +
> > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > > +		.payload_in = policy,
> > > +		.size_in = sizeof(*policy),
> > > +	};
> > > +
> > > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +	if (rc < 0) {
> > > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > > +			rc);
> > > +		return rc;
> > > +	}
> > > +
> > > +	/* Retrieve final interrupt settings */
> > > +	return cxl_event_get_int_policy(cxlds, policy);
> > > +}
> > > +
> > > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct cxl_event_interrupt_policy policy;
> > > +	int rc;
> > > +
> > > +	rc = cxl_event_config_msgnums(cxlds, &policy);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static bool cxl_event_int_is_fw(u8 setting)
> > > +{
> > > +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > > +
> > > +	return mode == CXL_INT_FW;
> > > +}
> > > +
> > > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > > +			    struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct cxl_event_interrupt_policy policy;
> > > +	int rc;
> > > +
> > > +	/*
> > > +	 * When BIOS maintains CXL error reporting control, it will process
> > > +	 * event records.  Only one agent can do so.
> > > +	 */
> > > +	if (!host_bridge->native_cxl_error)
> > > +		return 0;
> > > +
> > > +	rc = cxl_event_get_int_policy(cxlds, &policy);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	if (cxl_event_int_is_fw(policy.info_settings) ||
> > > +	    cxl_event_int_is_fw(policy.warn_settings) ||
> > > +	    cxl_event_int_is_fw(policy.failure_settings) ||
> > > +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> > > +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > > +		return -EBUSY;
> > > +	}
> > > +
> > > +	rc = cxl_event_irqsetup(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > +	rc = cxl_alloc_irq_vectors(pdev);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > >  	cxlmd = devm_cxl_add_memdev(cxlds);
> > >  	if (IS_ERR(cxlmd))
> > >  		return PTR_ERR(cxlmd);
> > > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > -	/*
> > > -	 * When BIOS maintains CXL error reporting control, it will process
> > > -	 * event records.  Only one agent can do so.
> > > -	 */
> > > -	if (host_bridge->native_cxl_error)
> > > -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +	rc = cxl_event_config(host_bridge, cxlds);
> > > +	if (rc)
> > > +		return rc;
> > >  
> > >  	if (cxlds->regs.ras) {
> > >  		pci_enable_pcie_error_reporting(pdev);  
> >   
>
Ira Weiny Jan. 5, 2023, 3:16 a.m. UTC | #14
On Sun, Dec 11, 2022 at 11:06:19PM -0800, Ira wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 

Rafael,

I should have CC'ed you on this patch.  Could I get an ack on it so Dan can
take it through the CXL tree?

Thanks,
Ira

> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> -- 
> 2.37.2
>