diff mbox series

[RFC,2/2] arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate

Message ID 20240312135958.727765-3-dwmw2@infradead.org
State Superseded
Series Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

Commit Message

David Woodhouse March 12, 2024, 1:51 p.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk>

The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
function which is analogous to ACPI S4 state. This will allow hosting
environments to determine that a guest is hibernated rather than just
powered off, and handle that state appropriately on subsequent launches.

Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
poweroff") the EFI shutdown method is deliberately preferred over PSCI
or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
*only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
a last resort via the legacy pm_power_off function pointer.
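
To illustrate the ordering described above, here is a minimal userspace sketch, not kernel code. All model_* names and the numeric priorities are hypothetical; only their relative order matters. It assumes every handler that acts succeeds in powering the machine off, whereas in the kernel a handler whose firmware call returns simply lets the chain continue.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Which (modelled) mechanism ended up powering the machine off. */
enum model_who {
	MODEL_NONE,
	MODEL_PSCI_SYSTEM_OFF2,	/* hibernate-only handler added by this patch */
	MODEL_EFI,		/* EFI shutdown, preferred for a plain power-off */
	MODEL_PSCI_SYSTEM_OFF,	/* legacy pm_power_off fallback */
};

struct model_handler {
	int priority;		/* higher runs first */
	enum model_who who;
	bool hibernate_only;	/* acts only when hibernating */
};

/* Deliberately unsorted; priorities are placeholders, not kernel values. */
static const struct model_handler model_chain[] = {
	{ .priority = 0,   .who = MODEL_PSCI_SYSTEM_OFF,  .hibernate_only = false },
	{ .priority = 102, .who = MODEL_PSCI_SYSTEM_OFF2, .hibernate_only = true  },
	{ .priority = 101, .who = MODEL_EFI,              .hibernate_only = false },
};

/* Walk the handlers from highest to lowest priority until one acts. */
static enum model_who model_power_off(bool hibernating)
{
	const size_t n = sizeof(model_chain) / sizeof(model_chain[0]);
	int last = INT_MAX;

	for (;;) {
		const struct model_handler *next = NULL;

		/* Pick the highest-priority handler not yet tried. */
		for (size_t i = 0; i < n; i++) {
			const struct model_handler *h = &model_chain[i];

			assert(h->who != MODEL_NONE);
			if (h->priority < last &&
			    (!next || h->priority > next->priority))
				next = h;
		}
		if (!next)
			return MODEL_NONE;
		if (hibernating || !next->hibernate_only)
			return next->who;
		last = next->priority;	/* handler declined; try the next one */
	}
}
```

In this model, model_power_off(true) selects the SYSTEM_OFF2 handler, while model_power_off(false) falls through to EFI, leaving plain SYSTEM_OFF as the last resort.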

The hibernation code already exports a system_entering_hibernation()
function which is used by the higher-priority handler to check for
hibernation. That existing function just returns the value of a static
boolean variable from hibernate.c, which was previously only set in the
hibernation_platform_enter() code path. Set the same flag in the simpler
code path around the call to kernel_power_off() too.
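
The flag handling can be pictured with a small userspace sketch, again not kernel code. The model_* names are hypothetical stand-ins for entering_platform_hibernation, system_entering_hibernation(), kernel_power_off() and power_down(); the modelled power-off always returns, which in the kernel would only happen if powering off failed.

```c
#include <stdbool.h>

/* Stands in for the static flag in hibernate.c. */
static bool model_entering_hibernation;

/* Records what a power-off handler observed when it last ran. */
static bool model_seen_by_handler;

/* Stands in for system_entering_hibernation(): just reports the flag. */
static bool model_system_entering_hibernation(void)
{
	return model_entering_hibernation;
}

/* Stands in for kernel_power_off(): a handler queries the flag here. */
static void model_kernel_power_off(void)
{
	model_seen_by_handler = model_system_entering_hibernation();
}

/* Stands in for the HIBERNATION_SHUTDOWN branch of power_down(). */
static void model_power_down(void)
{
	model_entering_hibernation = true;
	model_kernel_power_off();
	/* Only reached if the power-off attempt returned. */
	model_entering_hibernation = false;
}
```

The handler sees the flag set only when it is reached via model_power_down(); a direct call to model_kernel_power_off(), as for an ordinary shutdown, sees it clear.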

An alternative way to hook SYSTEM_OFF2 into the hibernation code would
be to register a platform_hibernation_ops structure with an ->enter()
method which makes the new SYSTEM_OFF2 call. But that would have the
unwanted side-effect of making hibernation take a completely different
code path in hibernation_platform_enter(), invoking a lot of special dpm
callbacks.

Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
indicate whether the power off is for hibernation.

But this version works and is relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
 kernel/power/hibernate.c     |  5 ++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

Comments

David Woodhouse March 12, 2024, 4:36 p.m. UTC | #1
On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> Looked briefly at register_sys_off_handler and it should be OK to call
> it from psci_init_system_off2() below. Any particular reason for having
> separate initcall to do this ? We can even eliminate the need for
> psci_init_system_off2 if it can be called from there. What am I missing ?

My first attempt did that. I don't think we can kmalloc that early:

[    0.000000] psci: SMC Calling Convention v1.1
[    0.000000] Unable to handle kernel read from unreadable memory at virtual address 0000000000000018
[    0.000000] Mem abort info:
[    0.000000]   ESR = 0x0000000096000004
[    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.000000]   SET = 0, FnV = 0
[    0.000000]   EA = 0, S1PTW = 0
[    0.000000]   FSC = 0x04: level 0 translation fault
[    0.000000] Data abort info:
[    0.000000]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    0.000000]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.000000]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.000000] [0000000000000018] user address but active_mm is swapper
[    0.000000] Internal error: Oops: 0000000096000004 [#1] SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc3+ #30
[    0.000000] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.000000] pc : kmalloc_trace+0x138/0x340
[    0.000000] lr : register_sys_off_handler+0x60/0x258
[    0.000000] sp : ffff8000827d3d10
[    0.000000] x29: ffff8000827d3d20 x28: 000000005cd7e0ac x27: 0000000000001f3f
[    0.000000] x26: 0000000000000000 x25: ffff8000802bd890 x24: ffff8000802bd890
[    0.000000] x23: 0000000000000040 x22: 0000000000000dc0 x21: 0000000000000001
[    0.000000] x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000006
[    0.000000] x17: 000000000036fd40 x16: 000000005ec902c0 x15: ffff8000827d37c0
[    0.000000] x14: 0000000000000000 x13: 312e3176206e6f69 x12: 746e65766e6f4320
[    0.000000] x11: 00000000ffffdfff x10: ffff8000828cebe0 x9 : ffff80008281ea10
[    0.000000] x8 : ffff8000827d3d78 x7 : 0000000000000000 x6 : 0000000000000000
[    0.000000] x5 : 0000000000000000 x4 : ffff8000827e0000 x3 : ffff8000827f41c0
[    0.000000] x2 : 0000000000000040 x1 : 0000000000000dc0 x0 : 0000000000000000
[    0.000000] Call trace:
[    0.000000]  kmalloc_trace+0x138/0x340
[    0.000000]  register_sys_off_handler+0x60/0x258
[    0.000000]  psci_probe+0x2cc/0x350
[    0.000000]  psci_acpi_init+0x50/0x88
[    0.000000]  setup_arch+0x194/0x278
[    0.000000]  start_kernel+0x7c/0x410
[    0.000000]  __primary_switched+0xb8/0xc8
[    0.000000] Code: b5000f7a f94003f4 aa1803fe d50320ff (b9401a64) 
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

Sudeep Holla March 13, 2024, 3:34 p.m. UTC | #2
On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
> On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> > Looked briefly at register_sys_off_handler and it should be OK to call
> > it from psci_init_system_off2() below. Any particular reason for having
> > separate initcall to do this ? We can even eliminate the need for
> > psci_init_system_off2 if it can be called from there. What am I missing ?
>
> My first attempt did that. I don't think we can kmalloc that early:
>

That was my initial guess. But a quick hack on my setup and running it on
the FVP model didn't complain. I think either I messed up or something else
is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
for the response.

Patch

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index d9629ff87861..69d2f6969438 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -78,6 +78,7 @@  struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
 
 static u32 psci_cpu_suspend_feature;
 static bool psci_system_reset2_supported;
+static bool psci_system_off2_supported;
 
 static inline bool psci_has_ext_power_state(void)
 {
@@ -333,6 +334,28 @@  static void psci_sys_poweroff(void)
 	invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 }
 
+#ifdef CONFIG_HIBERNATION
+static int psci_sys_hibernate(struct sys_off_data *data)
+{
+	if (system_entering_hibernation())
+		invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
+			       PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
+	return NOTIFY_DONE;
+}
+
+static int __init psci_hibernate_init(void)
+{
+	if (psci_system_off2_supported) {
+		/* Higher priority than EFI shutdown, but only for hibernate */
+		register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
+					 SYS_OFF_PRIO_FIRMWARE + 2,
+					 psci_sys_hibernate, NULL);
+	}
+	return 0;
+}
+subsys_initcall(psci_hibernate_init);
+#endif
+
 static int psci_features(u32 psci_func_id)
 {
 	return invoke_psci_fn(PSCI_1_0_FN_PSCI_FEATURES,
@@ -364,6 +387,7 @@  static const struct {
 	PSCI_ID_NATIVE(1_1, SYSTEM_RESET2),
 	PSCI_ID(1_1, MEM_PROTECT),
 	PSCI_ID_NATIVE(1_1, MEM_PROTECT_CHECK_RANGE),
+	PSCI_ID_NATIVE(1_3, SYSTEM_OFF2),
 };
 
 static int psci_debugfs_read(struct seq_file *s, void *data)
@@ -523,6 +547,16 @@  static void __init psci_init_system_reset2(void)
 		psci_system_reset2_supported = true;
 }
 
+static void __init psci_init_system_off2(void)
+{
+	int ret;
+
+	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
+
+	if (ret != PSCI_RET_NOT_SUPPORTED)
+		psci_system_off2_supported = true;
+}
+
 static void __init psci_init_system_suspend(void)
 {
 	int ret;
@@ -653,6 +687,7 @@  static int __init psci_probe(void)
 		psci_init_cpu_suspend();
 		psci_init_system_suspend();
 		psci_init_system_reset2();
+		psci_init_system_off2();
 		kvm_init_hyp_services();
 	}
 
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 4b0b7cf2e019..ac87b3cb670c 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -676,8 +676,11 @@  static void power_down(void)
 		}
 		fallthrough;
 	case HIBERNATION_SHUTDOWN:
-		if (kernel_can_power_off())
+		if (kernel_can_power_off()) {
+			entering_platform_hibernation = true;
 			kernel_power_off();
+			entering_platform_hibernation = false;
+		}
 		break;
 	}
 	kernel_halt();