kernel/power : add pr_err() for debugging "Error -14 resuming" error

Message ID 20221111045242.530607-1-luoxueqin@kylinos.cn
State Superseded
Series kernel/power : add pr_err() for debugging "Error -14 resuming" error

Commit Message

Xueqin Luo Nov. 11, 2022, 4:52 a.m. UTC
The system memory map can change over a hibernation-restore cycle due
to a defect in the platform firmware, and some of the page frames used
by the kernel before hibernation may no longer be available during
the subsequent restore, which leads to the error below:

[  T357] PM: Image loading progress:   0%
[  T357] PM: Read 2681596 kbytes in 0.03 seconds (89386.53 MB/s)
[  T357] PM: Error -14 resuming
[  T357] PM: Failed to load hibernation image, recovering.
[  T357] PM: Basic memory bitmaps freed
[  T357] OOM killer enabled.
[  T357] Restarting tasks ... done.
[  T357] PM: resume from hibernation failed (-14)
[  T357] PM: Hibernation image not present or could not be loaded.

Add an error message to unpack_orig_pfns() that prints the physical
address of the first page frame failing the pfn_valid() check, so that
the cause of an "Error -14 resuming" failure during resume from
hibernation (S4) can be identified quickly. This saves developers
debugging time.

Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
---
v3: Modify the pr_err() function output again

v2: Modify the commit message and pr_err() function output

 kernel/power/snapshot.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
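
For readers decoding the log above: the "-14" is -EFAULT on Linux, and the
address the new message prints is the page frame number shifted left by the
page shift. The following is a minimal user-space sketch of that arithmetic,
not kernel code. The names sketch_pfn_phys() and sketch_is_efault() are
hypothetical, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for the kernel's PFN_PHYS(): convert a page
 * frame number into the physical byte address of that frame.
 */
static uint64_t sketch_pfn_phys(uint64_t pfn)
{
	/* The shift must not push significant bits out of 64 bits. */
	assert(pfn <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Nonzero if err is the code shown as "Error -14 resuming" (-EFAULT). */
static int sketch_is_efault(int err)
{
	return err == -EFAULT;
}
```

With these assumptions, page frame 0x100000 corresponds to physical address
0x100000000, which is the kind of value the new message would report.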

Comments

Rafael J. Wysocki Nov. 30, 2022, 6:45 p.m. UTC | #1
On Fri, Nov 11, 2022 at 5:53 AM Xueqin Luo <luoxueqin@kylinos.cn> wrote:
>
> The system memory map can change over a hibernation-restore cycle due
> to a defect in the platform firmware, and some of the page frames used
> by the kernel before hibernation may no longer be available during
> the subsequent restore, which leads to the error below:
>
> [  T357] PM: Image loading progress:   0%
> [  T357] PM: Read 2681596 kbytes in 0.03 seconds (89386.53 MB/s)
> [  T357] PM: Error -14 resuming
> [  T357] PM: Failed to load hibernation image, recovering.
> [  T357] PM: Basic memory bitmaps freed
> [  T357] OOM killer enabled.
> [  T357] Restarting tasks ... done.
> [  T357] PM: resume from hibernation failed (-14)
> [  T357] PM: Hibernation image not present or could not be loaded.
>
> Add an error message to unpack_orig_pfns() that prints the physical
> address of the first page frame failing the pfn_valid() check, so that
> the cause of an "Error -14 resuming" failure during resume from
> hibernation (S4) can be identified quickly. This saves developers
> debugging time.
>
> Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
> ---
> v3: Modify the pr_err() function output again
>
> v2: Modify the commit message and pr_err() function output
>
>  kernel/power/snapshot.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index c20ca5fb9adc..e7bd4531faf2 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -2259,10 +2259,14 @@ static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
>                 if (unlikely(buf[j] == BM_END_OF_MAP))
>                         break;
>
> -               if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
> +               if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
>                         memory_bm_set_bit(bm, buf[j]);
> -               else
> +               } else {
> +                       if (!pfn_valid(buf[j]))
> +                               pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
> +                                               PFN_PHYS(buf[j]));
>                         return -EFAULT;
> +               }
>         }
>
>         return 0;
> --

Applied as 6.2 material under a new subject and with some edits in the
changelog.

Thanks!

Patch

diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index c20ca5fb9adc..e7bd4531faf2 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -2259,10 +2259,14 @@  static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
 		if (unlikely(buf[j] == BM_END_OF_MAP))
 			break;
 
-		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
+		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
 			memory_bm_set_bit(bm, buf[j]);
-		else
+		} else {
+			if (!pfn_valid(buf[j]))
+				pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
+						PFN_PHYS(buf[j]));
 			return -EFAULT;
+		}
 	}
 
 	return 0;
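
The control flow of the hunk above can be modelled in user space as a rough
sketch. Everything prefixed with sketch_ is hypothetical: the memory map is
reduced to a single upper bound, the memory_bm_pfn_present() check is left
out (treated as always true), and fprintf() stands in for pr_err(). The
explicit cast to unsigned long long is this sketch's own choice, made so that
the %llx conversion is well-defined whatever the width of the address type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_END_OF_MAP (~0UL)	/* stand-in for BM_END_OF_MAP */
#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */

/* Hypothetical memory map: PFNs in [0, max_pfn) exist after restore. */
struct sketch_memmap {
	unsigned long max_pfn;
};

/* Simplified model of pfn_valid(): only checks the upper bound. */
static int sketch_pfn_valid(const struct sketch_memmap *map, unsigned long pfn)
{
	return pfn < map->max_pfn;
}

/*
 * Walk the saved PFN list. Stop at the end marker, count frames that are
 * still valid, and fail with -EFAULT (after logging the physical address)
 * on the first frame that the current memory map no longer contains.
 */
static int sketch_unpack_pfns(const unsigned long *buf, size_t n,
			      const struct sketch_memmap *map,
			      size_t *accepted)
{
	assert(buf && map && accepted);
	*accepted = 0;

	for (size_t j = 0; j < n; j++) {
		if (buf[j] == SKETCH_END_OF_MAP)
			break;

		if (!sketch_pfn_valid(map, buf[j])) {
			fprintf(stderr,
				"Memory map mismatch at 0x%llx after hibernation\n",
				(unsigned long long)buf[j] << SKETCH_PAGE_SHIFT);
			return -EFAULT;
		}
		(*accepted)++;
	}

	return 0;
}
```

In the actual patch, a frame that passes pfn_valid() but is missing from the
bitmap still returns -EFAULT without printing anything, so the message only
covers the firmware memory-map case.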