From patchwork Thu Feb 29 07:13:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 777211 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C7891446B6; Thu, 29 Feb 2024 07:13:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.20 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190805; cv=none; b=uyzU6SFgBPuStMc9ulH5wkfKpaoq/UzrA/YrNip/2dNsZZvkrpDSpCJ5w8MkaOZGYyxlxDd+3lpinj5Oc68asBaxCNpp1PDNBF7n6ED9cSlfg2POW/QurOvYe82g7WhnXM6SzII1TLL8d2B8P7YO8Lo/5iS/rWC+HqqGk0JJfeY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190805; c=relaxed/simple; bh=tyzJnjQ8r78M9dJ56ewIMb+ho6PqTtHsXYNzlsBHLqQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=U5cNAp2DHnp8nhmN+peibbgtbN6GO07ZRwx5yNBTvXSc6h0vkA6+xKEdicX2GSUxen2xpcJeSO6DNzdns0ZKMmUTaGDm1K/3Ek6EzghoXgnc1VWgPaK+2SVy423KRxcANRtUGhWlppfd5hc8CBAmbZorenSIs8hWcuAb58hccTQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Xko1Zge6; arc=none smtp.client-ip=198.175.65.20 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Xko1Zge6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709190804; x=1740726804; 
h=from:date:subject:mime-version:content-transfer-encoding: message-id:references:in-reply-to:to:cc; bh=tyzJnjQ8r78M9dJ56ewIMb+ho6PqTtHsXYNzlsBHLqQ=; b=Xko1Zge6QMn1eDwWUsahfr82pB+yxChclDdeodAdEGyLoGjmItBKzXuD ztQJEWqSvCz627hyioQAQ3KYhqZDeS1L5NQ+y8c23FOTplwamne6sm4Hc Om0SvZH1JSVUjHjyrvOFNI8eSTH10UTgNIDKrfSI+ZK11i3Bp0VloBX12 PEmSxcQi5P6r9quriHfTqs/qIiNL3jeJT49pz99MnUeGWuWtz1hYmDp8Q bC7kS+IGhNz23/8oNe1ClekEyuatQg2GqqNUI3uiGCQoiCPcKc3VsvQOx 0I7JGtscs3KlW561IBEe++Z2lM+I/Gz4ifYPTt6UxFkyz1eGM6SL4Uzkj w==; X-IronPort-AV: E=McAfee;i="6600,9927,10998"; a="3519853" X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="3519853" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="8283945" Received: from iweiny-desk3.amr.corp.intel.com (HELO localhost) ([10.213.166.213]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:22 -0800 From: Ira Weiny Date: Wed, 28 Feb 2024 23:13:16 -0800 Subject: [PATCH 1/4] cxl/event: Add missing include files Precedence: bulk X-Mailing-List: linux-efi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240228-cxl-cper3-v1-1-6aa3f1343c6c@intel.com> References: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com> In-Reply-To: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com> To: Dan Williams , Jonathan Cameron , Smita Koralahalli , Shiju Jose Cc: Dan Carpenter , Yazen Ghannam , Davidlohr Bueso , Dave Jiang , Alison Schofield , Vishal Verma , Ard Biesheuvel , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org, Ira Weiny X-Mailer: b4 0.13-dev-2d940 X-Developer-Signature: v=1; a=ed25519-sha256; t=1709190800; l=932; i=ira.weiny@intel.com; s=20221222; h=from:subject:message-id; bh=tyzJnjQ8r78M9dJ56ewIMb+ho6PqTtHsXYNzlsBHLqQ=; 
b=aVJn1c1YctwcOn+6/FjTVDtOQD+bGI+t0/QiVekrpKTZX4YnK16qBReM6jWtAab8dwuINloJs
 KuCLIe4VS7BCimoINRmT58TzbdnJ23ABGgWCYGNbqACI1H/+fAsCMXv
X-Developer-Key: i=ira.weiny@intel.com; a=ed25519;
 pk=brwqReAJklzu/xZ9FpSsMPSQ/qkSalbg6scP3w809Ec=

Additional event testing using the cxl-event.h header revealed that it
did not include the correct headers for the types used.  Compile errors
such as:

include/linux/cxl-event.h|11 col 9| error: unknown type name ‘u8’
...

were seen.

Add the correct prerequisite headers.

Omit the Fixes tag because this does not cause any issues until the
header is used again in other code.

Signed-off-by: Ira Weiny
---
 include/linux/cxl-event.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..812ed16ffc2f 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -3,6 +3,9 @@
 #ifndef _LINUX_CXL_EVENT_H
 #define _LINUX_CXL_EVENT_H
 
+#include
+#include
+
 /*
  * Common Event Record Format
  * CXL rev 3.0 section 8.2.9.2.1; Table 8-42

From patchwork Thu Feb 29 07:13:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 776931
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 501F544C67; Thu, 29 Feb 2024 07:13:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.20
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190807; cv=none; b=fGMAgZQ2XovNuQc5dKzFXT4386jyA89ILsuD0ClKJ1RyhCsJE71N9DT5Ti0peL/68n7yRRbAQ6Y6+SfaYQlprib6vel7buOtBzedpQs6oaMiT95Cl94cQfLW72ySXf2yfQgyirINF58CaFocF1820FAo02NeKTHJwAD/rWAzfxc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190807; c=relaxed/simple;
bh=5eM5w7VZ6NPyCjmyhwlq6ycAHOyOHREv7WbgfWiRXaU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Al5gWNWTgvit/fVHc/PHg+c5nDNdi4/Jt2y0Gt1aR25wV9fzLj4s8jyKQB3FzaulqkDuNtQp9kC7jBjt7nH0bsHNfrCSq3N9wD7/jnyeWxQHtg9i7HDf+mxL8l809fn+zJG1j9qXniRbT46ZhkBtcNjX5f4hfR5Fqm6vlhQYm0c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Y0D5PMke; arc=none smtp.client-ip=198.175.65.20 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Y0D5PMke" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709190805; x=1740726805; h=from:date:subject:mime-version:content-transfer-encoding: message-id:references:in-reply-to:to:cc; bh=5eM5w7VZ6NPyCjmyhwlq6ycAHOyOHREv7WbgfWiRXaU=; b=Y0D5PMkeXhhvdjDFjvT2hTvXxdBi/tXv4qKJ64wsFQ4rtS5NVjPeWLjx fkvXZMiVDp9ia0+R0xIhtDuNdFhmtvWEVLXs4kASO9Kjdnqrt3l4Q+Pms gZl1Ynnd528jXXPhsEYnjDilEjX+4KexfMPEGy+/HPQi9XBdh+7xqKauz HewGKB52+9s7kfL0xHwp0ndpEmMVzSP9rWkh86L7CPIOZzcx45HXgVCVp w3MHAU4d3Ya614l/9iasCYoqvIK1mtXibLFFulYjCO+MQIWheJcN2H7P+ xopw8B90VBMSR8O2zL8MDBbtt9JrYqRK1gqpzQPGXaZ61pjRMsBEgWQzm A==; X-IronPort-AV: E=McAfee;i="6600,9927,10998"; a="3519861" X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="3519861" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="8283950" Received: from iweiny-desk3.amr.corp.intel.com (HELO localhost) ([10.213.166.213]) by 
orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:24 -0800
From: Ira Weiny
Date: Wed, 28 Feb 2024 23:13:17 -0800
Subject: [PATCH 2/4] acpi/ghes: Process CXL Component Events
Precedence: bulk
X-Mailing-List: linux-efi@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20240228-cxl-cper3-v1-2-6aa3f1343c6c@intel.com>
References: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com>
In-Reply-To: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com>
To: Dan Williams, Jonathan Cameron, Smita Koralahalli, Shiju Jose
Cc: Dan Carpenter, Yazen Ghannam, Davidlohr Bueso, Dave Jiang,
 Alison Schofield, Vishal Verma, Ard Biesheuvel,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-cxl@vger.kernel.org, Ira Weiny, "Rafael J. Wysocki", Tony Luck,
 Borislav Petkov
X-Mailer: b4 0.13-dev-2d940
X-Developer-Signature: v=1; a=ed25519-sha256; t=1709190800; l=8205;
 i=ira.weiny@intel.com; s=20221222; h=from:subject:message-id;
 bh=5eM5w7VZ6NPyCjmyhwlq6ycAHOyOHREv7WbgfWiRXaU=;
 b=V6e5g2S1NJtnryWVa+ppMCFhjuHNy2s9mMoKu4D4/wbJLBGxxya6zJermX0/mP4lB7F6qNjL0
 5naxmG8GfIEDl0CNuRES5QIAjXsHBmUCO0eNwM8q1ZlH/MhuSxKVoKA
X-Developer-Key: i=ira.weiny@intel.com; a=ed25519;
 pk=brwqReAJklzu/xZ9FpSsMPSQ/qkSalbg6scP3w809Ec=

BIOS can configure memory devices as firmware first.  This will send CXL
events to the firmware instead of the OS.  The firmware can then send
these events to the OS via UEFI.

Currently a configuration such as this will trace a non-standard event
in the log.  Using the specific CXL trace points with the additional
information CXL can provide is much more useful to users.  Specifically,
future support can be added to CXL to provide the DPA to HPA mapping
configured at the time of the event.

UEFI v2.10 section N.2.14 defines a Common Platform Error Record (CPER)
format for CXL Component Events.  The format is mostly the same as the
CXL Common Event Record Format.
The difference is the use of a GUID in the Section Type rather than a
UUID as part of the event itself.

Add GHES support to detect CXL CPER records and call into the CXL code
to process the event.

Multiple methods were considered for the call into the CXL code.  A
notifier chain was considered for the callback, but the use case did not
justify its complexity.  Furthermore, the CXL code must be called from
process context because it needs to take a device lock, so a simple
callback registration proved difficult.

Dan Williams suggested using 2 work items as an atomic way of switching
between a callback being registered and not.  This allows the callback
to run without any locking.[1]

Note that a local work item is required to drop any messages seen during
a race between any check done in cxl_cper_post_event() and the
scheduling of work.  That said, no attempt is made to stop the addition
of messages into the kfifo because this local work item provides a hook
to add a local CXL CPER trace point in a future patch.

This new combined patch addresses the report by Dan Carpenter[2], hence
the Reported-by tag.

[1] https://lore.kernel.org/all/65d111eb87115_6c745294ac@dwillia2-xfh.jf.intel.com.notmuch/
[2] https://lore.kernel.org/all/b963c490-2c13-4b79-bbe7-34c6568423c7@moroto.mountain/

Cc: Ard Biesheuvel
Cc: "Rafael J.
Wysocki"
Cc: Tony Luck
Cc: Borislav Petkov
Reported-by: Dan Carpenter
Suggested-by: Dan Williams
Signed-off-by: Ira Weiny
---
[djbw: use kfifo for record data]
[djbw: Use work struct for sync between cxl reg and ghes code]
---
 drivers/acpi/apei/ghes.c  | 127 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/cxl-event.h |  18 +++++++
 2 files changed, 145 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ab2a82cb1b0b..f433f4eae888 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -33,6 +34,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -673,6 +675,116 @@ static void ghes_defer_non_standard_event(struct acpi_hest_generic_data *gdata,
 	schedule_work(&entry->work);
 }
 
+/* CXL Event record UUIDs are formatted as GUIDs and reported in section type */
+
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CPER_SEC_CXL_GEN_MEDIA_GUID					\
+	GUID_INIT(0xfbcd0a77, 0xc260, 0x417f,				\
+		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6)
+
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CPER_SEC_CXL_DRAM_GUID						\
+	GUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,				\
+		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24)
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CPER_SEC_CXL_MEM_MODULE_GUID					\
+	GUID_INIT(0xfe927475, 0xdd59, 0x4339,				\
+		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74)
+
+struct cxl_cper_work_data {
+	enum cxl_event_type event_type;
+	struct cxl_cper_event_rec rec;
+};
+
+DEFINE_KFIFO(cxl_cper_fifo, struct cxl_cper_work_data, 32);
+static DEFINE_SPINLOCK(cxl_cper_read_lock);
+static DEFINE_SPINLOCK(cxl_cper_write_lock);
+
+static cxl_cper_callback cper_callback;
+/* cb function dumps the records */
+static void cxl_cper_cb_fn(struct work_struct *work)
+{
+	struct cxl_cper_work_data wd;
+
+	while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
+				    &cxl_cper_read_lock)) {
+		cper_callback(wd.event_type, &wd.rec);
+	}
+}
+static DECLARE_WORK(cxl_cb_work, cxl_cper_cb_fn);
+
+static void cxl_cper_local_fn(struct work_struct *work)
+{
+	struct cxl_cper_work_data wd;
+
+	while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
+				    &cxl_cper_read_lock)) {
+		/* drop msg */
+	}
+}
+static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
+
+/* Pointer for atomic switch of record processing */
+struct work_struct *cxl_cper_work = &cxl_local_work;
+
+static void cxl_cper_post_event(enum cxl_event_type event_type,
+				struct cxl_cper_event_rec *rec)
+{
+	struct cxl_cper_work_data wd;
+
+	if (rec->hdr.length <= sizeof(rec->hdr) ||
+	    rec->hdr.length > sizeof(*rec)) {
+		pr_err(FW_WARN "CXL CPER Invalid section length (%u)\n",
+		       rec->hdr.length);
+		return;
+	}
+
+	if (!(rec->hdr.validation_bits & CPER_CXL_COMP_EVENT_LOG_VALID)) {
+		pr_err(FW_WARN "CXL CPER invalid event\n");
+		return;
+	}
+
+	wd.event_type = event_type;
+	memcpy(&wd.rec, rec, sizeof(wd.rec));
+
+	kfifo_in_spinlocked(&cxl_cper_fifo, &wd, 1, &cxl_cper_write_lock);
+	schedule_work(cxl_cper_work);
+}
+
+int cxl_cper_register_callback(cxl_cper_callback callback)
+{
+	if (cper_callback)
+		return -EINVAL;
+	cper_callback = callback;
+	/* Atomic switch back to callback processing */
+	cxl_cper_work = &cxl_cb_work;
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cper_register_callback, CXL);
+
+int cxl_cper_unregister_callback(cxl_cper_callback callback)
+{
+	if (callback != cper_callback)
+		return -EINVAL;
+
+	/* Atomic switch back to ghes processing */
+	cxl_cper_work = &cxl_local_work;
+	cancel_work_sync(&cxl_cb_work);
+	cper_callback = NULL;
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_callback, CXL);
+
 static bool ghes_do_proc(struct ghes *ghes,
 			 const struct acpi_hest_generic_status *estatus)
 {
@@ -707,6 +819,21 @@ static bool ghes_do_proc(struct ghes *ghes,
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM))
{
 			queued = ghes_handle_arm_hw_error(gdata, sev, sync);
+		}
+		else if (guid_equal(sec_type, &CPER_SEC_CXL_GEN_MEDIA_GUID)) {
+			struct cxl_cper_event_rec *rec = acpi_hest_get_payload(gdata);
+
+			cxl_cper_post_event(CXL_CPER_EVENT_GEN_MEDIA, rec);
+		}
+		else if (guid_equal(sec_type, &CPER_SEC_CXL_DRAM_GUID)) {
+			struct cxl_cper_event_rec *rec = acpi_hest_get_payload(gdata);
+
+			cxl_cper_post_event(CXL_CPER_EVENT_DRAM, rec);
+		}
+		else if (guid_equal(sec_type, &CPER_SEC_CXL_MEM_MODULE_GUID)) {
+			struct cxl_cper_event_rec *rec = acpi_hest_get_payload(gdata);
+
+			cxl_cper_post_event(CXL_CPER_EVENT_MEM_MODULE, rec);
 		} else {
 			void *err = acpi_hest_get_payload(gdata);
 
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 812ed16ffc2f..4834cf23656e 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -143,4 +143,22 @@ struct cxl_cper_event_rec {
 	union cxl_event event;
 } __packed;
 
+typedef void (*cxl_cper_callback)(enum cxl_event_type type,
+				  struct cxl_cper_event_rec *rec);
+
+#ifdef CONFIG_ACPI_APEI_GHES
+int cxl_cper_register_callback(cxl_cper_callback callback);
+int cxl_cper_unregister_callback(cxl_cper_callback callback);
+#else
+static inline int cxl_cper_register_callback(cxl_cper_callback callback)
+{
+	return 0;
+}
+
+static inline int cxl_cper_unregister_callback(cxl_cper_callback callback)
+{
+	return 0;
+}
+#endif
+
 #endif /* _LINUX_CXL_EVENT_H */

From patchwork Thu Feb 29 07:13:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 777210
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 924C245950; Thu, 29 Feb 2024 07:13:26 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.20
ARC-Seal: i=1; a=rsa-sha256;
d=subspace.kernel.org; s=arc-20240116; t=1709190808; cv=none; b=ds5Yq/q6/5sZLflpZH/HdGaCitnu/V1W6Qf9YAw8SvkH9MPQWZUQDxOjIsjq96pKR8ElBVO3Do3JiXN8QSarUPxxw1wpizjuVpUNWICWFkhFkR96iKbrJcyUz6M9P1ZPKO+UUpgdx/33liibOgTE3XkiV2mM/LFwzVjVac/aZPc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190808; c=relaxed/simple; bh=x1aGCgFq02k93Mdw2reAcXjZLCdUEfh0lfFGNDEloKo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=UkAgTDWvv+Oh4eEMrG5LX0FCiPuZuvdCXQERo5ajqHNmq85OAP8uRhdmMu+gsa4dk6PoH5f0dqwfcoqIdy4SRmlSSEw7V+uHqiq8Z6oPQrpALND3IpLkXYaN9+JEDpUY7HRkPvAWXfO9IC35WJVtoILJKKYd1OAClDcCXZmuzuY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=n7fcNErE; arc=none smtp.client-ip=198.175.65.20 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="n7fcNErE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709190807; x=1740726807; h=from:date:subject:mime-version:content-transfer-encoding: message-id:references:in-reply-to:to:cc; bh=x1aGCgFq02k93Mdw2reAcXjZLCdUEfh0lfFGNDEloKo=; b=n7fcNErEOOOd6Uw2aH5GmMrw9OlxLLNEIhGMNmzVchfHpIEFpci/2UVu Dp505rwQKydmq8tAjf9NA8Ev2W+t5H/J3Fqt/QXfgegT4RI75I4r0GuPa KhjMBQjfO3ogaVpwJ2aNXuur3WOUrzCO8QbSn33jH+Nojn2sx6FH/5NE9 +0ap21/3w3rcNATKC+u+joixwLI5GNonoVQJ+c6sMg0Rw1uNPv+hL18Ou Oytug1DW8xp/zdRbdlzwa2Xa4DSmQxnD/KkfHp443nS97L4Zr4LHk9W2S 4iV5yal6MtfILJ9AbCM9jcXNSGK8P4yCvkMzpUgOa3iOVtTE1MR9xtuYi A==; X-IronPort-AV: E=McAfee;i="6600,9927,10998"; a="3519871" X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; 
d="scan'208";a="3519871"
Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:26 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="8283954"
Received: from iweiny-desk3.amr.corp.intel.com (HELO localhost) ([10.213.166.213]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:25 -0800
From: Ira Weiny
Date: Wed, 28 Feb 2024 23:13:18 -0800
Subject: [PATCH 3/4] cxl/pci: Register for and process CPER events
Precedence: bulk
X-Mailing-List: linux-efi@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20240228-cxl-cper3-v1-3-6aa3f1343c6c@intel.com>
References: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com>
In-Reply-To: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com>
To: Dan Williams, Jonathan Cameron, Smita Koralahalli, Shiju Jose
Cc: Dan Carpenter, Yazen Ghannam, Davidlohr Bueso, Dave Jiang,
 Alison Schofield, Vishal Verma, Ard Biesheuvel,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-cxl@vger.kernel.org, Ira Weiny
X-Mailer: b4 0.13-dev-2d940
X-Developer-Signature: v=1; a=ed25519-sha256; t=1709190800; l=3078;
 i=ira.weiny@intel.com; s=20221222; h=from:subject:message-id;
 bh=x1aGCgFq02k93Mdw2reAcXjZLCdUEfh0lfFGNDEloKo=;
 b=Hc+j7X+CQb6x5t3VaGt2xuCh8yHcqTF5bbPhMcdj23T1v3wWyZH2yhK64ynk1UrM2LS/hapc/
 ZiE+EB7FFEsBNojB1u0I3kLZ9WtRlw9rLtru3OuSc/cNHjn6GL1791X
X-Developer-Key: i=ira.weiny@intel.com; a=ed25519;
 pk=brwqReAJklzu/xZ9FpSsMPSQ/qkSalbg6scP3w809Ec=

If the firmware has configured CXL event support to be firmware first,
the OS can process those events through CPER records.  The CXL layer has
unique DPA to HPA knowledge and standard event trace parsing in place.

CPER records contain Bus, Device, Function information which can be used
to identify the PCI device that sent the event.
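
As an illustration of the lookup described above, the following
user-space sketch packs a device and function number into the devfn
value that a call such as pci_get_domain_bus_and_slot() takes.  It is
not part of this series: struct example_devid and example_devfn() are
hypothetical names, and PCI_DEVFN, PCI_SLOT and PCI_FUNC are local
copies of the kernel's definitions so that the sketch builds on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies mirroring the kernel's include/uapi/linux/pci.h macros. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical stand-in for the device ID carried in a CXL CPER record.
 * The field names follow the patch; the layout is an assumption.
 */
struct example_devid {
	uint16_t segment_num;
	uint8_t bus_num;
	uint8_t device_num;
	uint8_t func_num;
};

/* Pack device (5 bits) and function (3 bits) into a single devfn byte. */
unsigned int example_devfn(const struct example_devid *id)
{
	assert(id);
	return PCI_DEVFN(id->device_num, id->func_num);
}
```

With device 0x1f and function 0x2 the packed devfn is 0xfa, and
PCI_SLOT() and PCI_FUNC() recover the original two values.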
Add a CXL CPER callback to process events through the CXL trace
subsystem.

Future patches will provide additional region information such as HPA.

Signed-off-by: Ira Weiny
---
Changes:
[iweiny: Add back in after the revert in 6.8]
---
 drivers/cxl/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 2ff361e756d6..6cf8336d1b33 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -974,6 +974,73 @@ static struct pci_driver cxl_pci_driver = {
 	},
 };
 
-module_pci_driver(cxl_pci_driver);
+#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0)
+static void cxl_cper_event_call(enum cxl_event_type ev_type,
+				struct cxl_cper_event_rec *rec)
+{
+	struct cper_cxl_event_devid *device_id = &rec->hdr.device_id;
+	struct pci_dev *pdev __free(pci_dev_put) = NULL;
+	enum cxl_event_log_type log_type;
+	struct cxl_dev_state *cxlds;
+	unsigned int devfn;
+	u32 hdr_flags;
+
+	pr_debug("CPER event for device %u:%u:%u.%u\n",
+		 device_id->segment_num, device_id->bus_num,
+		 device_id->device_num, device_id->func_num);
+
+	devfn = PCI_DEVFN(device_id->device_num, device_id->func_num);
+	pdev = pci_get_domain_bus_and_slot(device_id->segment_num,
+					   device_id->bus_num, devfn);
+	if (!pdev) {
+		pr_err("CPER event device %u:%u:%u.%u not found\n",
+		       device_id->segment_num, device_id->bus_num,
+		       device_id->device_num, device_id->func_num);
+		return;
+	}
+
+	dev_dbg(&pdev->dev, "Found device %u:%u.%u\n", device_id->bus_num,
+		device_id->device_num, device_id->func_num);
+
+	guard(device)(&pdev->dev);
+	if (pdev->driver != &cxl_pci_driver)
+		return;
+
+	cxlds = pci_get_drvdata(pdev);
+	if (!cxlds)
+		return;
+
+	/* Fabricate a log type */
+	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
+	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
+
+	dev_dbg(&pdev->dev, "Tracing %d\n", ev_type);
+	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
+
			       &uuid_null, &rec->event);
+}
+
+static int __init cxl_pci_driver_init(void)
+{
+	int rc;
+
+	rc = pci_register_driver(&cxl_pci_driver);
+	if (rc)
+		return rc;
+
+	rc = cxl_cper_register_callback(cxl_cper_event_call);
+	if (rc)
+		pci_unregister_driver(&cxl_pci_driver);
+
+	return rc;
+}
+
+static void __exit cxl_pci_driver_exit(void)
+{
+	cxl_cper_unregister_callback(cxl_cper_event_call);
+	pci_unregister_driver(&cxl_pci_driver);
+}
+
+module_init(cxl_pci_driver_init);
+module_exit(cxl_pci_driver_exit);
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);

From patchwork Thu Feb 29 07:13:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 776930
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 07DA147F5F; Thu, 29 Feb 2024 07:13:28 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.20
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190809; cv=none; b=T4PKZ/NnqUYGHHYhfx3spcMOeCFNbMAZ4HhFu9sf9TyOu7yZE2t0N/rAeciq5FuTXYLRiQLNbCuaAybV8coBHDEst/8kgd0XTbIxUVCypNwJOIaqFiLoG5e91wYj0ej0sQ/gdFVtTT0714NSQQ1VReLV6YCujiLTGftu36rDBaE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709190809; c=relaxed/simple; bh=d6szCcRr1xJz3ioCRqW/Jzj0XJlHmJymM212OssLqV8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=NtTfOAkE5frWyItjJ69cHiMV8Ldiw+0e+P45kRsfF0feKgjXup5N018Ge2d15Q3FT7YIodKK5Dy/26j59ymGHflRmPStASC95ZwxiyeQLkfW/kIlVI4wzhX7FBuqv5jFPIUpNPkKfISZhh4w/BkmiFfKV3xI8Apyd9TPZqkwK/8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com
header.i=@intel.com header.b=Pxhe90Kx; arc=none smtp.client-ip=198.175.65.20 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Pxhe90Kx" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709190808; x=1740726808; h=from:date:subject:mime-version:content-transfer-encoding: message-id:references:in-reply-to:to:cc; bh=d6szCcRr1xJz3ioCRqW/Jzj0XJlHmJymM212OssLqV8=; b=Pxhe90Kx5tV+wVG0ENzmQjFvO+8OC+/Fy+JB4Jt2Dnp8vpvgeE2wroNI GZwDcV4Jki1UsxfyNIbTJKYZa+nJCdYdJ3MxCcTy2Xne2RNZshY4lvDGN 9rZYdL2gHAQ0lz1m9bWZGQl1X0KOY23p/+Qf5yZrTU9VkE4EihnwlVHR1 /dQlFDh401b5ypsTjhO6X/5yurnFEZVgSmCZYgMrYUl0JkN9/xy7QuDl9 J9hWiF83Fyblpx9sXdkLUaCMgLLHVTMjmh/i4J/3yX9ALT2zoXcks42AZ /TGCzurK/5vrKP9aTN5HWyN6Wt7xs7FRm7J9vMpEjrnJU/tyUe4V3pGq6 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10998"; a="3519880" X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="3519880" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:28 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,192,1705392000"; d="scan'208";a="8283957" Received: from iweiny-desk3.amr.corp.intel.com (HELO localhost) ([10.213.166.213]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2024 23:13:26 -0800 From: Ira Weiny Date: Wed, 28 Feb 2024 23:13:19 -0800 Subject: [PATCH 4/4] ras/events: Trace CXL CPER events even without the CXL stack loaded Precedence: bulk X-Mailing-List: linux-efi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240228-cxl-cper3-v1-4-6aa3f1343c6c@intel.com> References: <20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com> In-Reply-To: 
<20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com>
To: Dan Williams, Jonathan Cameron, Smita Koralahalli, Shiju Jose
Cc: Dan Carpenter, Yazen Ghannam, Davidlohr Bueso, Dave Jiang,
 Alison Schofield, Vishal Verma, Ard Biesheuvel,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-cxl@vger.kernel.org, Ira Weiny, "Rafael J. Wysocki", Tony Luck,
 Borislav Petkov
X-Mailer: b4 0.13-dev-2d940
X-Developer-Signature: v=1; a=ed25519-sha256; t=1709190800; l=5654;
 i=ira.weiny@intel.com; s=20221222; h=from:subject:message-id;
 bh=d6szCcRr1xJz3ioCRqW/Jzj0XJlHmJymM212OssLqV8=;
 b=QdCvHwK33KkZUETbo+6ZurAroMJdTuJYMUu7rMVJkz+oC972HYA7r1dhQZm4t1iEdiv3KLVOP
 0cKA5vRSNM9ADzIJG8qQCFbEUSfAWvHU0VweYpvcwxA8cAvDH6vaDOQ
X-Developer-Key: i=ira.weiny@intel.com; a=ed25519;
 pk=brwqReAJklzu/xZ9FpSsMPSQ/qkSalbg6scP3w809Ec=

If CXL is solely managed by firmware (including HDM configuration and
event processing via firmware first) it is possible to run the system
without the CXL software loaded.  In this case no CXL callback will be
registered and CXL CPER errors will not be processed at all.

Memory device and region (HPA) information is missing in this
configuration, but omitting the error completely is not friendly to such
a user.  Some device information is available in the generic event which
could prove useful to a user.

Utilize the local work item to trace a generic CXL CPER event.
Duplicate the pattern of decoding the CXL event header to aid in adding
future trace points if needed.

This was an easy lift from the CXL trace points.  Stop at header
decoding, however, because this is an unlikely configuration for the
system.  Further decoding can be obtained with user space tools or added
later if needed.

Cc: Ard Biesheuvel
Cc: "Rafael J.
Wysocki"
Cc: Tony Luck
Cc: Borislav Petkov
Suggested-by: Dan Williams
Signed-off-by: Ira Weiny
---
 drivers/acpi/apei/ghes.c |  5 ++-
 include/ras/ras_event.h  | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f433f4eae888..9ac323cbf195 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -729,7 +729,10 @@ static void cxl_cper_local_fn(struct work_struct *work)
 
 	while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
 				    &cxl_cper_read_lock)) {
-		/* drop msg */
+		struct cxl_cper_event_rec *rec = &wd.rec;
+		union cxl_event *evt = &rec->event;
+
+		trace_cper_cxl_gen_event(rec, &evt->generic);
 	}
 }
 static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index cbd3ddd7c33d..319faf552b65 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -422,6 +422,96 @@ TRACE_EVENT(memory_failure_event,
 	)
 );
 #endif /* CONFIG_MEMORY_FAILURE */
+
+#include
+#include
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
+#define show_hdr_flags(flags)	__print_flags(flags, " | ",			\
+	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION" },	\
+	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED" },		\
+	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED" },	\
+	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED" }	\
+)
+
+/*
+ * Define macros for the common header of each CPER CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ *	1) Add CPER_CXL_EVT_TP_entry to TP_STRUCT__entry
+ *	2) Use CPER_CXL_EVT_TP_fast_assign within TP_fast_assign;
+ *	   pass the CPER record and CXL event header
+ *	3) Use CPER_CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the cper_cxl_gen_event tracepoint as an example.
+ */
+#define CPER_CXL_EVT_TP_entry						\
+	__field(u16, segment)						\
+	__field(u8, bus)						\
+	__field(u8, device)						\
+	__field(u8, func)						\
+	__field(u64, serial)						\
+	__field(u32, hdr_flags)						\
+	__field(u16, hdr_handle)					\
+	__field(u16, hdr_related_handle)				\
+	__field(u64, hdr_timestamp)					\
+	__field(u8, hdr_length)						\
+	__field(u8, hdr_maint_op_class)
+
+#define CPER_CXL_EVT_TP_fast_assign(cper_rec, evt_hdr)				\
+	__entry->segment = cper_rec->hdr.device_id.segment_num;		\
+	__entry->bus = cper_rec->hdr.device_id.bus_num;				\
+	__entry->device = cper_rec->hdr.device_id.device_num;			\
+	__entry->func = cper_rec->hdr.device_id.func_num;			\
+	__entry->serial = (((u64)cper_rec->hdr.dev_serial_num.upper_dw) << 32) |\
+			  cper_rec->hdr.dev_serial_num.lower_dw;		\
+	__entry->hdr_length = (evt_hdr).length;					\
+	__entry->hdr_flags = get_unaligned_le24((evt_hdr).flags);		\
+	__entry->hdr_handle = le16_to_cpu((evt_hdr).handle);			\
+	__entry->hdr_related_handle = le16_to_cpu((evt_hdr).related_handle);	\
+	__entry->hdr_timestamp = le64_to_cpu((evt_hdr).timestamp);		\
+	__entry->hdr_maint_op_class = (evt_hdr).maint_op_class
+
+#define CPER_CXL_EVT_TP_printk(fmt, ...)
\
+	TP_printk("device=%04x:%02x:%02x.%02x serial=%lld : time=%llu "	\
+		"len=%d flags='%s' handle=%x related_handle=%x "		\
+		"maint_op_class=%u : " fmt,					\
+		__entry->segment, __entry->bus, __entry->device, __entry->func,	\
+		__entry->serial,						\
+		__entry->hdr_timestamp, __entry->hdr_length,			\
+		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
+		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
+		##__VA_ARGS__)
+
+TRACE_EVENT(cper_cxl_gen_event,
+
+	TP_PROTO(struct cxl_cper_event_rec *cper_rec,
+		 struct cxl_event_generic *gen_rec),
+
+	TP_ARGS(cper_rec, gen_rec),
+
+	TP_STRUCT__entry(
+		CPER_CXL_EVT_TP_entry
+		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+	),
+
+	TP_fast_assign(
+		CPER_CXL_EVT_TP_fast_assign(cper_rec, gen_rec->hdr);
+		memcpy(__entry->data, gen_rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+	),
+
+	CPER_CXL_EVT_TP_printk("%s",
+		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
 #endif /* _TRACE_HW_EVENT_MC_H */
 
 /* This part must be outside protection */
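
The header decoding performed by CPER_CXL_EVT_TP_fast_assign and
show_hdr_flags() can be approximated in user space as below.  This is a
sketch under assumptions, not kernel code: every ex_* and EX_* name is
hypothetical, ex_get_le24() stands in for get_unaligned_le24(), and the
severity mask assumes the bits 1:0 layout that patch 3 expresses as
GENMASK(1, 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Flag bits as defined in the patch (CXL 3.0 Table 8-42). */
#define EX_FLAG_PERMANENT	(1u << 2)
#define EX_FLAG_MAINT_NEEDED	(1u << 3)
#define EX_FLAG_PERF_DEGRADED	(1u << 4)
#define EX_FLAG_HW_REPLACE	(1u << 5)
/* Assumed severity field, bits 1:0 of the flags. */
#define EX_FLAG_SEVERITY_MASK	0x3u

/* Read a 24-bit little-endian value from a possibly unaligned buffer. */
uint32_t ex_get_le24(const uint8_t *p)
{
	assert(p);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Combine the two 32-bit halves of the device serial number. */
uint64_t ex_serial(uint32_t upper_dw, uint32_t lower_dw)
{
	return ((uint64_t)upper_dw << 32) | lower_dw;
}

/*
 * Render the set flags as "A | B" into buf.
 * Returns the string length, or -1 if buf is too small.
 */
int ex_show_flags(uint32_t flags, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} names[] = {
		{ EX_FLAG_PERMANENT,	 "PERMANENT_CONDITION" },
		{ EX_FLAG_MAINT_NEEDED,	 "MAINTENANCE_NEEDED" },
		{ EX_FLAG_PERF_DEGRADED, "PERFORMANCE_DEGRADED" },
		{ EX_FLAG_HW_REPLACE,	 "HARDWARE_REPLACEMENT_NEEDED" },
	};
	size_t used = 0;
	size_t i;

	assert(buf && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(flags & names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " | " : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return -1;	/* output would not fit */
		used += (size_t)n;
	}
	return (int)used;
}
```

For the raw flag bytes 0x0d 0x00 0x00 this decodes to a severity of 1
with the PERMANENT_CONDITION and MAINTENANCE_NEEDED flags set.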