[RFC,0/6] Bootloader based hibernation

Message ID 1652860121-24092-1-git-send-email-quic_vivekuma@quicinc.com

Vivek Kumar May 18, 2022, 7:48 a.m. UTC
Kernel Hibernation

The Linux kernel already supports hibernation. The process freezes all
userspace tasks, quiesces all kernel device drivers, and then takes a
snapshot of DDR, which is saved to a swap partition on disk. After the
save, the system can either shut down or continue running. On the next
power cycle, the kernel boots and probes almost all of its drivers;
then, at the late_initcall stage, it checks whether a hibernation image
is present in the specified swap slot. If a valid image is found, the
kernel replaces the currently executing kernel with the older kernel
from the snapshot, calls the drivers' restore callbacks, and unfreezes
the userspace tasks. Hibernation requires CONFIG_HIBERNATION and a
designated swap partition.

Bootloader Based Hibernation:

Automotive use cases require better boot KPIs, hence we are proposing a
bootloader based hibernation restore. Its purpose is to reduce the
overall boot time, measured from the power-on reset key press until the
first display frame is seen on the screen or a camera application can
be launched from userspace. This RFC patchset implements a slightly
tweaked version of hibernation in which the older snapshot is restored
into DDR by the bootloader (ABL) itself. This saves some time (1 second
measured on an msm-4.14 kernel) by not running a temporary kernel that
looks for the hibernation image at the late_initcall stage. To achieve
this, the bootloader checks the swap partition for a hibernation image
at a very early stage, parses the image, and loads it into DDR instead
of loading the boot image from the boot partition.

Since the temporary kernel is not run, the basic ARM setup it would
have done (MMU enablement, EL2 setup, CPU setup, etc.) does not happen,
so the entry point into the hibernation snapshot image from the
bootloader is different. Similarly, all device drivers now re-program
their IO-mapped registers as part of the restore callback (which is
triggered from the hibernation framework) to bring hardware and software
state back in sync.
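
The early check described above can be sketched in plain C. This is only
an illustration, not the ABL code: it assumes a 4 KiB page and assumes
the swap header layout used by kernel/power/swap.c, where the last 10
bytes of the first swap page hold "S1SUSPEND" when an image is present.
The names choose_boot_path(), BOOT_COLD and BOOT_RESTORE are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE     4096u        /* assumed page size */
#define SIG_LEN       10u          /* signature field length, assumed */
#define HIBERNATE_SIG "S1SUSPEND"  /* 9 characters plus NUL = 10 bytes */

enum boot_path { BOOT_COLD, BOOT_RESTORE };

/*
 * Hypothetical helper: look at the first page of the swap partition and
 * decide whether to restore a snapshot or load the normal boot image.
 */
static enum boot_path choose_boot_path(const unsigned char *first_swap_page)
{
	const unsigned char *sig;

	assert(first_swap_page != NULL);
	/* The signature is assumed to sit in the last SIG_LEN bytes. */
	sig = first_swap_page + PAGE_SIZE - SIG_LEN;
	if (memcmp(sig, HIBERNATE_SIG, SIG_LEN) == 0)
		return BOOT_RESTORE;
	return BOOT_COLD;
}
```

A page whose tail still carries the ordinary swap signature takes the
cold boot path, so the normal boot image is loaded from the boot
partition as before.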

Other factors, such as the read speed of the secondary storage device
and the organization of the hibernation image in the swap partition,
affect the total image restore time and the overall boot time. In our
current implementation we have serialized the allocation of swap
partition slots in the kernel, so that when the hibernation image is
saved to disk its pages are not scattered across various swap-slot
offsets but are laid out serially. For example, if the DDR page at page
frame number (PFN) 0x8005 is located at swap-slot offset 50, the next
valid DDR page, at PFN 0x8006, will be present at swap-slot offset 51.
With this optimization in place, the bootloader can issue larger disk
reads (~50 MB at once) from the swap slot, up to the maximum the storage
allows, and parsing the image also becomes simpler because it is
available contiguously.
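
To illustrate why the serial layout helps, here is a minimal sketch that
counts how many storage reads a loader would need for a given slot
layout. It is not taken from the patches; count_reads() and its
parameters are hypothetical, and the only assumption is that consecutive
slot offsets can be merged into one read of at most max_pages pages.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the reads needed to fetch n image pages whose swap-slot offsets
 * are listed in slots[] in image order. A page joins the current read
 * only if its slot directly follows the previous one and the read has
 * not yet reached max_pages pages.
 */
static size_t count_reads(const unsigned long *slots, size_t n,
			  size_t max_pages)
{
	size_t reads = 0, run = 0;

	assert(max_pages > 0);
	for (size_t i = 0; i < n; i++) {
		int contiguous = i > 0 && slots[i] == slots[i - 1] + 1;

		if (!contiguous || run == max_pages) {
			reads++;	/* start a new read request */
			run = 0;
		}
		run++;
	}
	return reads;
}
```

With serialized slots the number of reads is bounded by the image size
divided by the chunk size, whereas a scattered layout can degrade to one
read per page. With 4 KiB pages, a ~50 MB chunk would correspond to a
max_pages of about 12800.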



Vivek Kumar (6):
  arm64: hibernate: Introduce new entry point to kernel
  PM: Hibernate: Add option to disable disk offset randomization
  block: gendisk: Add a new genhd capability flag
  mm: swap: Add randomization check for swapon/off calls
  Hibernate: Add check for pte_valid in saveable page
  irqchip/gic-v3: Re-init GIC hardware upon hibernation restore

 Documentation/admin-guide/kernel-parameters.txt |  11 ++
 arch/arm64/kernel/hibernate.c                   |   9 ++
 drivers/irqchip/irq-gic-v3.c                    | 138 ++++++++++++++++-
 include/linux/blkdev.h                          |   1 +
 kernel/power/snapshot.c                         |  43 ++++++++
 kernel/power/swap.c                             |  12 +++
 mm/swapfile.c                                   |   6 +-
 7 files changed, 216 insertions(+), 4 deletions(-)

Comments

Marc Zyngier May 19, 2022, 3:27 p.m. UTC | #1
On 2022-05-18 08:48, Vivek Kumar wrote:
> Introduce a new entry point to hibernated kernel image.
> This is generally needed when bootloader restores the
> hibernated image from disc to ddr and passes control
> to it by turning off the mmu, also initialize this new
> entry point with cpu_resume which turns on the mmu and
> then proceeds with restore routines.
> 
> Signed-off-by: Vivek Kumar <quic_vivekuma@quicinc.com>
> Signed-off-by: Prasanna Kumar <quic_kprasan@quicinc.com>
> ---
>  arch/arm64/kernel/hibernate.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 6328308..4e294b3 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -74,6 +74,14 @@ static struct arch_hibernate_hdr {
>  	void		(*reenter_kernel)(void);
> 
>  	/*
> +	 * Another entry point if jump to kernel happens with mmu disabled,
> +	 * generally done when restoring hibernation image from bootloader
> +	 * context
> +	 */
> +
> +	phys_addr_t	phys_reenter_kernel;
> +
> +	/*
>  	 * We need to know where the __hyp_stub_vectors are after restore to
>  	 * re-configure el2.
>  	 */
> @@ -116,6 +124,7 @@ int arch_hibernation_header_save(void *addr, unsigned int max_size)
>  	arch_hdr_invariants(&hdr->invariants);
>  	hdr->ttbr1_el1		= __pa_symbol(swapper_pg_dir);
>  	hdr->reenter_kernel	= _cpu_resume;
> +	hdr->phys_reenter_kernel  = __pa(cpu_resume);
> 
>  	/* We can't use __hyp_get_vectors() because kvm may still be loaded */
>  	if (el2_reset_needed())

So here, you are creating a new ABI with the bootloader, based on
a data structure that isn't meant to be ABI. It means that we
wouldn't be allowed to ever change this data structure, as this
would mean having to update the bootloader in sync.

Clearly, this isn't acceptable.

         M.
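
The concern can be made concrete with a small userspace sketch. The
structs below are simplified stand-ins, not the real struct
arch_hibernate_hdr, and bootloader_read_entry() is a hypothetical
bootloader routine that reads the entry point at an offset fixed when
the bootloader was built. If a later kernel adds a field ahead of it,
the same bootloader silently reads the wrong value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the header layout as proposed. */
struct hdr_proposed {
	uint64_t reenter_kernel;
	uint64_t phys_reenter_kernel;
	uint64_t hyp_stub_vectors;
};

/* Hypothetical later kernel that adds a field at the front. */
struct hdr_later {
	uint64_t new_field;
	uint64_t reenter_kernel;
	uint64_t phys_reenter_kernel;
	uint64_t hyp_stub_vectors;
};

/* Offset baked into a bootloader built against hdr_proposed. */
#define BOOTLOADER_ENTRY_OFFSET \
	offsetof(struct hdr_proposed, phys_reenter_kernel)

static uint64_t bootloader_read_entry(const void *hdr)
{
	uint64_t entry;

	assert(hdr != NULL);
	/* The bootloader knows only the offset, not the kernel's struct. */
	memcpy(&entry, (const unsigned char *)hdr + BOOTLOADER_ENTRY_OFFSET,
	       sizeof(entry));
	return entry;
}
```

Against hdr_proposed the routine returns phys_reenter_kernel; against
hdr_later it returns reenter_kernel instead, a virtual address that is
unusable with the MMU off. Keeping the two in step is what would freeze
the structure as ABI.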
Mark Rutland May 20, 2022, 4:43 p.m. UTC | #2
Hi,

On Wed, May 18, 2022 at 01:18:35PM +0530, Vivek Kumar wrote:
> Kernel Hibernation
> 
> Linux Kernel has been already supporting hibernation, a process which
> involves freezing of all userspace tasks, followed by quiescing of all
> kernel device drivers and then a DDR snapshot is taken which is saved
> to disc-swap partition, after the save, the system can either shutdown
> or continue further. Generally during the next power cycle when kernel
> boots and after probing almost all of the drivers, in the late_init()
> part, it checks if a hibernation image is present in the specified swap
> slot, if a valid hibernation image is found, it superimposes the currently
> executing Kernel with an older kernel from the snapshot, moving further,
> it calls the restore of the drivers and unfreezes the userspace tasks.
> CONFIG_HIBERNATION and a designated swap partition needs to be present
> for to enable Hibernation.
> 
> Bootloader Based Hibernation:
> 
> Automotive usecases require better boot KPIs, Hence we are proposing a
> bootloader based hibernation restore.

At a high-level, I'm not a fan of adding new ways to enter the kernel, and for
the same reasons that the existing hibernate handover is deliberately *not* a
stable ABI, I don't think we should add an ABI for this. This is not going to
remain maintainable or compatible over time as the kernel evolves.

> Purpose of bootloader based hibernation is to improve the overall boot time
> till the first display frame is seen on the screen or a camera application
> can be launched from userspace after the power on reset key is pressed.

Can you break down the time taken for that today?

What does a cold boot look like?

What *exactly* are you trying to skip by using hibernation?

Thanks,
Mark.