[REGRESSION] Kernel reboots unexpectedly on i.MX8X when Cortex-M4 is running and it was started by U-Boot bootaux

Message ID 20250404141713.ac2ntcsjsf7epdfa@hiago-nb
State New

Commit Message

Hiago De Franco April 4, 2025, 2:17 p.m. UTC
#regzbot introduced: 4f6c983261

Hi Peng and all,

Commit 4f6c9832613b ("genpd: imx: scu-pd: initialize is_off according to
HW state") introduced a regression where the kernel reboots unexpectedly
(without any warnings, crashes or errors) when the Cortex-M4 has been loaded
and started by U-Boot using the bootaux command:

# load mmc 0:2 ${loadaddr} /home/root/hello_world.bin
# bootaux ${loadaddr} 0
# boot

This is a simple hello world binary that prints a message to the
M40.UART0 pin (demo from NXP MCUXpresso).

Before this commit, everything worked as expected: Linux booted fine and
the HMP core kept running and printing messages to the UART. After the
commit, the kernel reboots at the beginning of the boot process. The
only relevant message is printed by U-Boot after the reset:

"Reset cause: SCFW fault reset"

I found this commit by bisecting; the same device tree, U-Boot version, and
SCFW version were used throughout. Reverting this commit fixes the issue.

For testing purposes, I created the following patch which also fixes the
issue:

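diff --git a/drivers/pmdomain/imx/scu-pd.c b/drivers/pmdomain/imx/scu-pd.c
index 38f3cdd21042..0477b3fb4991 100644
--- a/drivers/pmdomain/imx/scu-pd.c
+++ b/drivers/pmdomain/imx/scu-pd.c
@@ -539,6 +539,9 @@ imx_scu_add_pm_domain(struct device *dev, int idx,
                return NULL;
        }

+       if (strstr("cm40", sc_pd->name) != NULL)
+               is_off = true;
+
        ret = pm_genpd_init(&sc_pd->pd, NULL, is_off);
        if (ret) {
                dev_warn(dev, "failed to init pd %s rsrc id %d",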

Test Environment:
- Hardware: Colibri iMX8DX 1GB with Colibri Evaluation Board.
- U-Boot Version: 2024.04
- U-Boot Build info:
	SCFW 83624b99, SECO-FW c9de51c0, IMX-MKIMAGE 4622115c, ATF 7c64d4e

The issue is not present on: v6.5

The real root cause is still unclear to me. Does anybody have any ideas? I am
happy to share more details if needed.

Cheers,
Hiago.
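
A side note on the test patch, as a minimal user-space sketch rather than kernel code: strstr() takes the string to be searched first and the substring second, so the order of the two arguments decides which names match. The helper names below are hypothetical, and the domain name used in the note after the block is an assumption for illustration; the thread does not list the actual names.

```c
#include <stdbool.h>
#include <string.h>

/* Order used in the test patch: looks for `name` inside the literal "cm40". */
static bool matches_as_posted(const char *name)
{
	return strstr("cm40", name) != NULL;
}

/* Swapped order: looks for the literal "cm40" inside `name`. */
static bool matches_swapped(const char *name)
{
	return strstr(name, "cm40") != NULL;
}
```

With a hypothetical name such as "cm40-pid0", matches_as_posted() returns false and matches_swapped() returns true.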

Comments

Hiago De Franco April 11, 2025, 12:50 p.m. UTC | #1
On Fri, Apr 04, 2025 at 11:17:13AM -0300, Hiago De Franco wrote:
> #regzbot introduced: 4f6c983261
> 
> Hi Peng and all,
> 
> Commit 4f6c9832613b ("genpd: imx: scu-pd: initialize is_off according to
> HW state") introduced a regression where the Kernel reboots unexpectedly
> (without any warnings, crashes or errors) when the Cortex-M4 was loaded
> and running by U-Boot, using the bootaux command:
> 
> # load mmc 0:2 ${loadaddr} /home/root/hello_world.bin
> # bootaux ${loadaddr} 0
> # boot
> 
> This is a simple hello world binary that prints a message into the
> M40.UART0 pin (demo from NXP MCUXpresso).
> 
> Before this commit, everything worked as expected, Linux boots fine and
> the HMP core keeps running and printing messages to the UART. After the
> commit, the kernel reboots at the beginning of the boot process. The
> only relevant message is printed by U-Boot after reset:
> 
> "Reset cause: SCFW fault reset"
> 
> I found this commit by bisecting; the same device tree, U-Boot version, and
> SCFW version were used throughout. Reverting this commit fixes the issue.
> 
> For testing purposes, I created the following patch which also fixes the
> issue:
> 
> diff --git a/drivers/pmdomain/imx/scu-pd.c b/drivers/pmdomain/imx/scu-pd.c
> index 38f3cdd21042..0477b3fb4991 100644
> --- a/drivers/pmdomain/imx/scu-pd.c
> +++ b/drivers/pmdomain/imx/scu-pd.c
> @@ -539,6 +539,9 @@ imx_scu_add_pm_domain(struct device *dev, int idx,
>                 return NULL;
>         }
> 
> +       if (strstr("cm40", sc_pd->name) != NULL)
> +               is_off = true;
> +
>         ret = pm_genpd_init(&sc_pd->pd, NULL, is_off);
>         if (ret) {
>                 dev_warn(dev, "failed to init pd %s rsrc id %d",
> 
> 
> Test Environment:
> - Hardware: Colibri iMX8DX 1GB with Colibri Evaluation Board.
> - U-Boot Version: 2024.04
> - U-Boot Build info:
> 	SCFW 83624b99, SECO-FW c9de51c0, IMX-MKIMAGE 4622115c, ATF 7c64d4e
> 
> The issue is not present on: v6.5
> 
> The real root cause is still unclear to me. Anybody has any ideas? I am
> happy to share more details if needed.

Hello everyone, as this introduced a regression, should I send a revert
for 4f6c983261? Or are there any ideas that might help fix this issue?

> 
> Cheers,
> Hiago.

Cheers,
Hiago.
Peng Fan April 11, 2025, 1:23 p.m. UTC | #2
Hi,

Sorry for the late reply.
> On Fri, Apr 04, 2025 at 11:17:13AM -0300, Hiago De Franco wrote:
> > This is a simple hello world binary that prints a message into the
> > M40.UART0 pin (demo from NXP MCUXpresso).

Which release is this image from?

> > The real root cause is still unclear to me. Anybody has any ideas? I
> > am happy to share more details if needed.

Have you tried pd_ignore_unused?

I think Linux powers down the M4 while the M4 is running, and then SCFW
reports an error. So please give pd_ignore_unused a try.

If this is the case, may I know whether you have M4 nodes in the DTS,
with power domains included?

Anyway, I will give it a try on the i.MX8QM EVK.

> 
> Hello everyone, as this introduced a regression, should I send a revert
> for 4f6c983261? 

Please wait a while; I think we need to find the root cause.

Thanks,
Peng.

Hiago De Franco April 11, 2025, 4:23 p.m. UTC | #3
Hi Peng,

On Fri, Apr 11, 2025 at 01:23:32PM +0000, Peng Fan wrote:
> Which release is this image from?

This is MCUXpresso SDK 2.9.0.

> Have you tried pd_ignore_unused? 
> 
> I think it is linux power down M4 which M4 is running, then SCFW
> reports error. So please give a try pd_ignore_unused.

For debugging purposes, I tried it and it works: the kernel boots fine with
the M4 running when the pd_ignore_unused parameter is set.

> 
> If this is the case, may I know do you have m4 nodes in dts and
> with power domain included?

This is the device tree overlay I am testing:

/dts-v1/;
/plugin/;

#include <dt-bindings/clock/imx8mm-clock.h>
#include <dt-bindings/firmware/imx/rsrc.h>

/ {
	compatible = "toradex,colibri-imx8x";
};

&{/} {
	imx8x-cm4 {
		compatible = "fsl,imx8qxp-cm4";
		mbox-names = "tx", "rx", "rxdb";
		mboxes = <&lsio_mu5 0 1
			  &lsio_mu5 1 1
			  &lsio_mu5 3 1>;
		memory-region = <&vdevbuffer>, <&vdev0vring0>, <&vdev0vring1>,
				<&vdev1vring0>, <&vdev1vring1>, <&rsc_table>;
		power-domains = <&pd IMX_SC_R_M4_0_PID0>,
				<&pd IMX_SC_R_M4_0_MU_1A>;
		fsl,entry-address = <0x34fe0000>;
		fsl,resource-id = <IMX_SC_R_M4_0_PID0>;
	};

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		vdev0vring0: memory@90000000 {
			reg = <0 0x90000000 0 0x8000>;
			no-map;
		};

		vdev0vring1: memory@90008000 {
			reg = <0 0x90008000 0 0x8000>;
			no-map;
		};

		vdev1vring0: memory@90010000 {
			reg = <0 0x90010000 0 0x8000>;
			no-map;
		};

		vdev1vring1: memory@90018000 {
			reg = <0 0x90018000 0 0x8000>;
			no-map;
		};

		rsc_table: memory@900ff000 {
			reg = <0 0x900ff000 0 0x1000>;
			no-map;
		};

		vdevbuffer: memory@90400000 {
			compatible = "shared-dma-pool";
			reg = <0 0x90400000 0 0x100000>;
			no-map;
		};
	};
};

&lsio_mu5 {
	status = "okay";
};

This was basically copied from
arch/arm64/boot/dts/freescale/imx8qxp-mek.dts. Do you see anything
wrong? Should I also add the "clocks" property to the imx8x-cm4 node?

> 
> Anyway, I will give a try on i.MX8QM EVK.

Great, thanks.

Cheers,
Hiago.
Peng Fan April 14, 2025, 6:09 a.m. UTC | #4
> This was basically copied from
> arch/arm64/boot/dts/freescale/imx8qxp-mek.dts. Do you see anything
> wrong? Should I also add the "clocks" property to imx8x-cm4 node?

In your case, the M4 is in the same SCU partition as the A cores, so the
M4 power domain is manageable (owned) by Linux.

However, for M4 early boot (kicked by the bootloader):
if you do not want Linux to handle the M4, use scu_rm
to create a separate partition in U-Boot.
If you want Linux to handle the M4, but do not want Linux
to shut down the PD during kernel boot, imx_rproc.c
needs to be built in, and you need to add a clock entry
or use the optional clock API in imx_rproc.c.

The current imx_rproc.c needs a clock entry for probe to pass.

I think in your case this driver does not pass probe, so the
M4 PD still gets powered off.
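
The reasoning above can be modelled in a few lines of user-space C. This is a simplified sketch, not kernel code: the struct, the function names and the single consumer_bound flag are hypothetical stand-ins for what the genpd core and imx_rproc do. It assumes that a powered domain with no bound consumer is powered off late in boot unless pd_ignore_unused is passed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one power domain during boot. */
struct pd_model {
	bool powered;        /* state reported by firmware at init */
	bool consumer_bound; /* a driver probed and holds the domain */
};

/* Models a probe that requires a clock entry in the device tree node. */
static int probe_requires_clock(struct pd_model *pd, bool has_clock_entry)
{
	if (!has_clock_entry)
		return -ENOENT; /* probe fails, nothing holds the domain */
	pd->consumer_bound = true;
	return 0;
}

/* Models the late sweep that powers off domains nobody is using. */
static void power_off_unused(struct pd_model *pd, bool pd_ignore_unused)
{
	if (pd_ignore_unused)
		return;
	if (pd->powered && !pd->consumer_bound)
		pd->powered = false; /* a running M4 would lose power here */
}
```

In this model the domain stays powered either when the probe succeeds (clock entry present) or when pd_ignore_unused is set, which matches the two workarounds reported in this thread.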


Regards,
Peng.

Hiago De Franco April 14, 2025, 10:44 p.m. UTC | #5
Hi Peng,

On Mon, Apr 14, 2025 at 06:09:49AM +0000, Peng Fan wrote:
> 
> In your case, m4 is in same scu partition as a53, so m4
> power domain is manageable(owned) by Linux.
> 
> However to m4 earlyboot(kicked by bootloader),
> if you not wanna linux to handle m4, use scu_rm
> to create a separate partition in u-boot.
> If you wanna linux to handle m4, but not wanna linux
> to shutdown the pd in kernel boot, imx_rproc.c
> needs to be built in, and need to add a clock entry
> or use clock optional api in imx_rproc.c .
> 
> Current imx_rproc.c needs a clock entry to probe pass.
> 
> I think in your case, this driver not probe pass, so the
> M4 pd still get powered off.

This was correct, indeed. I was not able to find exactly where the
Cortex-M4 clock is defined, so I added a clk_dummy to the imx8x-cm4
remoteproc node and now it works: the code continues to run and I can
control the M4 from Linux. Thanks!

One thing I noticed is that I cannot make RPMsg work with this
devicetree node, even though I assigned the correct memory-regions
(vdev0buffer, vdev0ring0...). I also tested with rpmsg-lite from
linux-imx. Is this supposed to work with RPMsg as well?

Hiago
Peng Fan April 15, 2025, 12:11 a.m. UTC | #6
> This was correct, indeed. I was not able to find exactly where the
> cortex-m4 clock is defined, so I added a clk_dummy to the imx8x-cm4
> remoteproc node and now it works, the code continues to run and I
> can control the m4 with Linux. Thanks!
> 
> One thing that I noticed is I cannot make the RPMsg work with this
> devicetree node, even though I assigned the correct memory-regions
> (vdev0buffer, vdev0ring0...). Also tested with the rpmsg-lite from the
> linux-imx. Is this supposed to work with RPMsg as well?

To make RPMsg work, you need an M4 demo that publishes a
resource table, such as the i.MX tty echo or pingpong demo.

There is a downstream rpmsg driver under drivers/rpmsg/imx_*.c
that can be used to talk to the M4.

The hello world demo does not have any virtio devices, so there is no RPMsg.

Regards,
Peng.

Hiago De Franco April 15, 2025, 6:40 p.m. UTC | #7
On Tue, Apr 15, 2025 at 12:11:33AM +0000, Peng Fan wrote:
> To make rpmsg work, you need a m4 demo that could publish
> resource table, such as i.MX tty echo or pingpong demo.
> 
> There is downstream rpmsg driver under drivers/rpmsg/imx_*.c
> that could be used to talk with m4.
> 
> helloworld demo does not have any virtio devices, so no rpmsg.

Got it, I was able to make it work with the downstream pingpong driver
and the MCUXpresso demo. I can launch the firmware using remoteproc
and exchange data between the two cores.

There is something I noticed: when I start the pingpong demo with
U-Boot, it does not work. I modprobe the pingpong driver on Linux, but the
name service is never announced. It only works if I start the firmware with
remoteproc on Linux, not with U-Boot. Is this because Linux is not able
to control the M4?

If I start the binary using U-Boot, the remoteproc driver always reports
the "state" as "offline".

Thanks,
Hiago.
Peng Fan April 16, 2025, 8:19 a.m. UTC | #8
> Got it, I was able to make it work with the downstream pingpong
> driver and the MCUXpresso demo. I can launch the firmware using the
> remoteproc and exchange data between the two cores.
> 
> There is something I noticed, when I start the pingpong demo with
> U-Boot, it does not work. I run the pingpong modprobe on Linux but the
> name service is never announced. It only works if I start with the
> remoteproc on Linux, not U-Boot. Is this because of Linux not being
> able to control the M4?

No. In your case, you could start it using remoteproc, which means
Linux can control the M4.

> 
> If I start the binary using U-Boot, the "state" always report as "offline"
> by the remoteproc driver.

In drivers/remoteproc/imx_rproc.c, the IMX_RPROC_SCU_API case of
imx_rproc_detect_mode is used to detect the mode of the M4 on i.MX8Q/X
platforms. Please take a look at where it returns.

When U-Boot starts the M4, Linux should report the remoteproc state as "attached".
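
To check which state the driver reports, the sysfs attribute can be read from user space. A minimal sketch, assuming the M4 is registered as remoteproc0 (the instance number is an assumption and may differ on a given board); the helper name is hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Example path; the remoteproc index is an assumption. */
#define RPROC_STATE_PATH "/sys/class/remoteproc/remoteproc0/state"

/*
 * Read the first line of `path` into `buf` and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_rproc_state(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

Calling read_rproc_state(RPROC_STATE_PATH, buf, sizeof(buf)) after boot gives the string to compare against "attached" or "offline".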

Regards,
Peng.

Hiago De Franco April 16, 2025, 9:57 p.m. UTC | #9
Hi Peng,

On Wed, Apr 16, 2025 at 08:19:27AM +0000, Peng Fan wrote:
> > Got it, I was able to make it work with the downstream pingpong
> > driver and the MCUXpresso demo. I can launch the firmware using the
> > remoteproc and exchange data between the two cores.
> >
> > There is something I noticed, when I start the pingpong demo with
> > U-Boot, it does not work. I run the pingpong modprobe on Linux but the
> > name service is never announced. It only works if I start with the
> > remoteproc on Linux, not U-Boot. Is this because of Linux not being
> > able to control the M4?
>
> No. In you case, you could start using remoteproc, that means
> Linux could control M4.
>
> >
> > If I start the binary using U-Boot, the "state" always report as "offline"
> > by the remoteproc driver.
>
> In drivers/remoteproc/imx_rproc.c,  imx_rproc_detect_mode
> case IMX_RPROC_SCU_API is used for detect mode of M4 for i.MX8Q/X
> platform. Please give a look where it returns.
>
> For U-Boot start m4, linux should remote state: "attached"

OK, in this case it looks like it does not work. I start the firmware with
U-Boot and the state is always "offline". Inside the IMX_RPROC_SCU_API case,
the function returns at this point:

		...
		/*
		 * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
		 * and Linux could only do IPC with Mcore and nothing else.
		 */
		if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
			if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
				return -EINVAL;

			return imx_rproc_attach_pd(priv); // <-- Returns here
		...

And this function, imx_rproc_attach_pd, returns 0 at the end:

	...
	return 0; // <-- Returns here at the end

detach_pd:
	while (--i >= 0) {
	...

So it looks like in this case the partition is *not* owned by the A core;
it is still owned by the Mcore, and I cannot attach.

For debugging purposes, I made the following patch, inverting the logic:

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 592a34cfa75e..2fcc9086e916 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -1072,7 +1072,7 @@ static int imx_rproc_detect_mode(struct imx_rproc *priv)
                 * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
                 * and Linux could only do IPC with Mcore and nothing else.
                 */
-               if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
+               if (!imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
                        if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
                                return -EINVAL;

And now the remoteproc driver attaches to the Mcore successfully, which
is exactly what I want. However, less than one second later the kernel
reboots with the "SCFW fault reset" again.

Do you know what could be the issue in this case? Maybe the partitions are
not yet correct?

>
> Regards,
> Peng.

Cheers,
Hiago.
Peng Fan April 17, 2025, 2:46 a.m. UTC | #10
> OK, in this case it looks like it does not work. I start the firmware with
> U-Boot and the state is always "offline". Inside the IMX_RPROC_SCU_API case,
> the function returns at this point:
> 
> 		...
> 		/*
> 		 * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
> 		 * and Linux could only do IPC with Mcore and nothing else.
> 		 */
> 		if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
> 			if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
> 				return -EINVAL;
> 
> 			return imx_rproc_attach_pd(priv); // <-- Returns here
> 		...
> 
> And this function, imx_rproc_attach_pd, returns 0 at the end:
> 
> 	...
> 	return 0; // <-- Returns here at the end

        ret = dev_pm_domain_attach_list(dev, &pd_data, &priv->pd_list);
        return ret < 0 ? ret : 0;

It should return 0 here, because you mentioned you added two entries:
		power-domains = <&pd IMX_SC_R_M4_0_PID0>,
				<&pd IMX_SC_R_M4_0_MU_1A>;

https://lore.kernel.org/all/20250411162328.y2kchvdb4v4xi2lj@hiago-nb/


> 
> detach_pd:
> 	while (--i >= 0) {
> 	...

It should not run into detach_pd. Where does it fail?

> 
> So looks like in this case the partition is *not* owned by the A core, it
> is still being owned by the Mcore and I can not attach.

No. It is owned, because imx_sc_rm_is_resource_owned returns
true from what you said above.

> 
> For debugging purposes, I did the following patch, inverting the logic:
> 
> diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
> index 592a34cfa75e..2fcc9086e916 100644
> --- a/drivers/remoteproc/imx_rproc.c
> +++ b/drivers/remoteproc/imx_rproc.c
> @@ -1072,7 +1072,7 @@ static int imx_rproc_detect_mode(struct imx_rproc *priv)
>                  * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
>                  * and Linux could only do IPC with Mcore and nothing else.
>                  */
> -               if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
> +               if (!imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
>                         if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
>                                 return -EINVAL;

Please no.

> 
> And now the remoteproc driver attaches to the MCore successfully,
> which is exactly what I want. However, less than one second later the
> kernel reboots with the "SCFW fault reset" again.

The resources used by M4 are unexpectedly shut down by Linux,
as I understand.

Please check imx_rproc_attach_pd; it should return success,
with dev_pm_domain_attach_list returning 0.

Regards,
Peng

> 
> Do you know what could be the issue in this case? Maybe the
> partitions are not yet correct?
> 
> >
> > Regards,
> > Peng.
> 
> Cheers,
> Hiago.
Hiago De Franco April 17, 2025, 9:37 p.m. UTC | #11
Hi Peng,

On Thu, Apr 17, 2025 at 02:46:46AM +0000, Peng Fan wrote:
> > Subject: Re: [REGRESSION] Kernel reboots unexpectdely on i.MX8X
> > when Cortex-M4 is running and it was started by U-Boot bootaux
> >
> > Hi Peng,
> >
> > On Wed, Apr 16, 2025 at 08:19:27AM +0000, Peng Fan wrote:
> > > > Got it, I was able to make it work with the downstream pingpong
> > > > driver and the MCUXpresso demo. I can launch the firmware using the
> > > > remoteproc and exchange data between the two cores.
> > > >
> > > > There is something I noticed: when I start the pingpong demo with
> > > > U-Boot, it does not work. I run the pingpong modprobe on Linux but the
> > > > name service is never announced. It only works if I start with the
> > > > remoteproc on Linux, not U-Boot. Is this because of Linux not being
> > > > able to control the M4?
> > >
> > > No. In your case, you could start using remoteproc, which means Linux
> > > could control M4.
> > >
> > > >
> > > > If I start the binary using U-Boot, the "state" is always reported as
> > > > "offline" by the remoteproc driver.
> > >
> > > In drivers/remoteproc/imx_rproc.c, the IMX_RPROC_SCU_API case of
> > > imx_rproc_detect_mode is used to detect the mode of the M4 on the
> > > i.MX8Q/X platform. Please take a look at where it returns.
> > >
> > > If U-Boot starts the M4, Linux should report the state "attached".
> >
> > Ok, in this case it looks like it does not work. I start the firmware
> > with U-Boot and the state is always "offline". Inside the
> > IMX_RPROC_SCU_API case, the function returns at this point:
> >
> > 		...
> > 		/*
> > 		 * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
> > 		 * and Linux could only do IPC with Mcore and nothing else.
> > 		 */
> > 		if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
> > 			if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
> > 				return -EINVAL;
> >
> > 			return imx_rproc_attach_pd(priv); // <-- Returns here
> > 		...
> >
> > And this function, imx_rproc_attach_pd, returns 0 at the end:
> >
> > 	...
> > 	return 0; // <-- Returns here at the end
>
>         ret = dev_pm_domain_attach_list(dev, &pd_data, &priv->pd_list);
>         return ret < 0 ? ret : 0;
>
> It should return 0 here, because you mentioned you added two entries:
> 		power-domains = <&pd IMX_SC_R_M4_0_PID0>,
> 				<&pd IMX_SC_R_M4_0_MU_1A>;
>
> https://lore.kernel.org/all/20250411162328.y2kchvdb4v4xi2lj@hiago-nb/
>
>
> >
> > detach_pd:
> > 	while (--i >= 0) {
> > 	...
>
> It should not run into detach_pd. Where does it fail?
>
> >
> > So looks like in this case the partition is *not* owned by the A core, it
> > is still being owned by the Mcore and I can not attach.
>
> No. It is owned, because imx_sc_rm_is_resource_owned returns
> true from what you said above.
>
> >
> > For debugging purposes, I did the following patch, inverting the logic:
> >
> > diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
> > index 592a34cfa75e..2fcc9086e916 100644
> > --- a/drivers/remoteproc/imx_rproc.c
> > +++ b/drivers/remoteproc/imx_rproc.c
> > @@ -1072,7 +1072,7 @@ static int imx_rproc_detect_mode(struct imx_rproc *priv)
> >                  * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
> >                  * and Linux could only do IPC with Mcore and nothing else.
> >                  */
> > -               if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
> > +               if (!imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
> >                         if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
> >                                 return -EINVAL;
>
> Please no.
>
> >
> > And now the remoteproc driver attaches to the MCore successfully,
> > which is exactly what I want. However, less than one second later the
> > kernel reboots with the "SCFW fault reset" again.
>
> The resources used by M4 are unexpectedly shut down by Linux,
> as I understand.
>
> Please check imx_rproc_attach_pd; it should return success,
> with dev_pm_domain_attach_list returning 0.

I was able to debug this issue further. The problem is not with the power
domains or the clock; those were solved by using the correct device
tree and the optional clk API:

@@ -1033,7 +1034,7 @@ static int imx_rproc_clk_enable(struct imx_rproc *priv)
        if (dcfg->method == IMX_RPROC_NONE)
                return 0;

-       priv->clk = devm_clk_get(dev, NULL);
+       priv->clk = devm_clk_get_optional(dev, NULL);
        if (IS_ERR(priv->clk)) {
                dev_err(dev, "Failed to get clock\n");
                return PTR_ERR(priv->clk);

And the device tree node:

	imx8x-cm4 {
		compatible = "fsl,imx8qxp-cm4";
		mbox-names = "tx", "rx", "rxdb";
		mboxes = <&lsio_mu5 0 1
			  &lsio_mu5 1 1
			  &lsio_mu5 3 1>;
		memory-region = <&vdev0buffer>, <&vdev0vring0>, <&vdev0vring1>,
				<&vdev1vring0>, <&vdev1vring1>, <&rsc_table>;
		power-domains = <&pd IMX_SC_R_M4_0_PID0>,
				<&pd IMX_SC_R_M4_0_MU_1A>;
		fsl,entry-address = <0x34fe0000>;
		fsl,resource-id = <IMX_SC_R_M4_0_PID0>;
	};

The issue is: the .attach callback is never called when the code was
started by U-Boot bootaux. It should be called inside the rproc_validate()
function from remoteproc_core.c. However, the rproc->state when we reach
this function is RPROC_OFFLINE, which causes the state to always be
reported as offline.
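
To make that dependency concrete, here is a minimal user-space sketch of
the decision as I understand it. It is a model, not kernel code: every
name in it (model_state, model_rproc, model_attach, model_core_boot) is
made up for illustration, and the real remoteproc core does much more
than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the remoteproc states named in this thread. */
enum model_state {
	MODEL_OFFLINE,
	MODEL_DETACHED,
	MODEL_ATTACHED,
};

/* Hypothetical model of a remote processor; not the kernel's struct rproc. */
struct model_rproc {
	enum model_state state;
	bool attach_called;	/* records whether the .attach stand-in ran */
};

/* Stand-in for a platform driver's .attach callback. */
void model_attach(struct model_rproc *r)
{
	assert(r);
	r->attach_called = true;
	r->state = MODEL_ATTACHED;
}

/*
 * Model of the decision described above: attach only if the platform
 * driver marked the core as detached during probe. An offline core is
 * left alone until firmware is started explicitly.
 */
void model_core_boot(struct model_rproc *r)
{
	assert(r);
	if (r->state == MODEL_DETACHED)
		model_attach(r);
}
```

In this model, a core left in MODEL_OFFLINE by probe never reaches
model_attach(), which matches the "offline" state seen after bootaux.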

In the imx_rproc.c driver, imx_rproc_detect_mode() returns at:

		/*
		 * If Mcore resource is not owned by Acore partition, It is kicked by ROM,
		 * and Linux could only do IPC with Mcore and nothing else.
		 */
		if (imx_sc_rm_is_resource_owned(priv->ipc_handle, priv->rsrc_id)) {
			if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
				return -EINVAL;

			return imx_rproc_attach_pd(priv);
		}

imx_sc_rm_is_resource_owned() returns 1, as the M4 is owned by Linux, and
imx_rproc_attach_pd() returns 0, as both power domains are set. The
function therefore returns _before_ "priv->rproc->state = RPROC_DETACHED"
is set one line below, so the state stays offline and the .attach
callback is never called. For debugging, this patch solves this issue:

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 74299af1d7f1..b5af44a5d542 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -949,6 +949,7 @@ static int imx_rproc_detect_mode(struct imx_rproc *priv)
                        if (of_property_read_u32(dev->of_node, "fsl,entry-address", &priv->entry))
                                return -EINVAL;

+                       priv->rproc->state = RPROC_DETACHED;
                        return imx_rproc_attach_pd(priv);
                }

With that, the running M4 is attached successfully:

root@colibri-imx8x-07308754:~# dmesg | grep remote
[    0.481561] remoteproc remoteproc0: imx-rproc is available
[    0.481656] remoteproc remoteproc0: attaching to imx-rproc
[    0.481743] remoteproc remoteproc0: remote processor imx-rproc is now attached

But as you know, this is not the correct solution, because it will
also attach when the M4 is _not_ running, since the M4 will be owned by
Linux without any code running, which I think is wrong. I think this
"if" statement is not fully correct; there must be a way to attach to
the running M4. Any ideas or suggestions to fix this part?
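
To spell out the behaviour I am after, here is a small user-space sketch
comparing the three variants discussed in this mail. It is only a model
under assumptions: all names are hypothetical, and the m4_running flag in
particular is an assumption, since how the driver could learn whether
code is already running on the M4 is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

enum model_probe_state {
	MODEL_PROBE_OFFLINE,	/* Linux must load and start the firmware itself */
	MODEL_PROBE_DETACHED,	/* core already runs; the core code should attach */
};

struct model_probe_input {
	bool owned_by_linux;	/* what imx_sc_rm_is_resource_owned() reported */
	bool m4_running;	/* HYPOTHETICAL: no known way to query this yet */
};

/* Current behaviour: an owned core returns early and stays offline. */
enum model_probe_state model_detect_current(const struct model_probe_input *in)
{
	assert(in);
	return in->owned_by_linux ? MODEL_PROBE_OFFLINE : MODEL_PROBE_DETACHED;
}

/* Debug patch: the core is marked detached whether or not it is running. */
enum model_probe_state model_detect_debug_patch(const struct model_probe_input *in)
{
	assert(in);
	(void)in;
	return MODEL_PROBE_DETACHED;
}

/* Desired behaviour: an owned core is detached only if code is running. */
enum model_probe_state model_detect_desired(const struct model_probe_input *in)
{
	assert(in);
	if (!in->owned_by_linux)
		return MODEL_PROBE_DETACHED;
	return in->m4_running ? MODEL_PROBE_DETACHED : MODEL_PROBE_OFFLINE;
}
```

For an owned M4 that was started by bootaux, only the debug patch and the
desired variant yield the detached state; for an owned M4 that is idle,
only the current and the desired variant keep it offline.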

>
> Regards,
> Peng
>
> >
> > Do you know what could be the issue in this case? Maybe the
> > partitions are not yet correct?
> >
> > >
> > > Regards,
> > > Peng.
> >
> > Cheers,
> > Hiago.

Cheers,
Hiago.

Patch

diff --git a/drivers/pmdomain/imx/scu-pd.c b/drivers/pmdomain/imx/scu-pd.c
index 38f3cdd21042..0477b3fb4991 100644
--- a/drivers/pmdomain/imx/scu-pd.c
+++ b/drivers/pmdomain/imx/scu-pd.c
@@ -539,6 +539,9 @@  imx_scu_add_pm_domain(struct device *dev, int idx,
                return NULL;
        }

+       if (strstr("cm40", sc_pd->name) != NULL)
+               is_off = true;
+
        ret = pm_genpd_init(&sc_pd->pd, NULL, is_off);
        if (ret) {
                dev_warn(dev, "failed to init pd %s rsrc id %d",