
[v5,0/3] Add Qualcomm MPM irqchip driver support

Message ID 20220216132830.32490-1-shawn.guo@linaro.org

Message

Shawn Guo Feb. 16, 2022, 1:28 p.m. UTC
The series starts by updating the cpuidle-psci driver to send the
CPU_CLUSTER_PM_ENTER notification, and then adds the DT binding and driver
support for the Qualcomm MPM (MSM Power Manager) interrupt controller.

Changes for v5:
- Drop inline attributes and let the compiler decide
- Use _irqsave/_irqrestore flavour for spin lock
- Do the assignment on a single line for the irq_resolve_mapping() call
- Add documentation to explain vMPM ownership transition
- Move the MPM pin map data into the device tree so that a generic
  compatible string can be used (see the parsing sketch after this list)
- Drop the code that counts CPUs in PM and use CPU_CLUSTER_PM_ENTER
  notification instead
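
As a rough illustration of the pin map move mentioned above (the
"qcom,mpm-pin-map" property name, the helper name and the error handling are
assumptions for this sketch, not necessarily what the final binding uses):

#include <linux/irqdomain.h>
#include <linux/of.h>
#include <linux/slab.h>

struct mpm_gic_map {
	int pin;		/* MPM pin number */
	irq_hw_number_t hwirq;	/* corresponding GIC SPI */
};

/*
 * Sketch: read <MPM pin, GIC SPI> pairs from the device tree instead of
 * carrying per-SoC pin map tables in the driver.
 */
static int mpm_read_pin_map(struct device_node *np,
			    struct mpm_gic_map **maps, unsigned int *cnt)
{
	unsigned int i;
	int n;

	n = of_property_count_u32_elems(np, "qcom,mpm-pin-map");
	if (n <= 0 || n % 2)
		return -EINVAL;

	*cnt = n / 2;
	*maps = kcalloc(*cnt, sizeof(**maps), GFP_KERNEL);
	if (!*maps)
		return -ENOMEM;

	for (i = 0; i < *cnt; i++) {
		u32 pin, hwirq;

		/* The element count was validated above, so these reads succeed. */
		of_property_read_u32_index(np, "qcom,mpm-pin-map", 2 * i, &pin);
		of_property_read_u32_index(np, "qcom,mpm-pin-map", 2 * i + 1, &hwirq);
		(*maps)[i].pin = pin;
		(*maps)[i].hwirq = hwirq;
	}

	return 0;
}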

Changes for v4:
- Add the missing include of <linux/interrupt.h> to fix build errors
  on arm architecture.
- Leave the IRQCHIP_PLATFORM_DRIVER infrastructure unchanged, and use
  of_find_device_by_node() to get the platform_device pointer.

Changes for v3:
- Support module build
- Use relaxed accessors
- Add barrier call to ensure MMIO write completes
- Use d->chip_data to pass driver private data
- Use raw spinlock
- Use BIT() for bit shifts
- Create a single irq domain to cover both types of MPM pins
- Call irq_resolve_mapping() to find out Linux irq number
- Avoid the ternary conditional operator and use switch/case in the
  .irq_set_type callback
- Drop unnecessary .irq_disable hook
- Align qcom_mpm_chip and qcom_mpm_ops members vertically
- Use helper irq_domain_translate_twocell()
- Move mailbox requesting forward in probe function
- Improve the documentation on qcm2290_gic_pins[]
- Use the IRQCHIP_PLATFORM_DRIVER infrastructure
- Use a cpu_pm notifier instead of the .suspend_late hook to write the MPM
  for sleep, so that the MPM can be set up for both the suspend and idle
  contexts.  The TIMER0/1 setup is currently omitted for the idle use case
  though, as I haven't been able to successfully test the idle context.
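
For reference, the cpu_pm notifier pattern the last item relies on looks
roughly like this (a minimal sketch; the callback body and names are
illustrative rather than the actual driver code):

#include <linux/cpu_pm.h>
#include <linux/notifier.h>

/*
 * Sketch: cpu_pm_enter()/cpu_cluster_pm_enter() in kernel/cpu_pm.c broadcast
 * these events, and the MPM driver reacts to them instead of using a
 * .suspend_late hook.
 */
static int mpm_cpu_pm_callback(struct notifier_block *nb,
			       unsigned long action, void *data)
{
	switch (action) {
	case CPU_CLUSTER_PM_ENTER:
		/* write the wake-up (sleep) configuration to the MPM */
		break;
	case CPU_CLUSTER_PM_EXIT:
		/* restore the normal (non-sleep) configuration */
		break;
	}

	return NOTIFY_OK;
}

static struct notifier_block mpm_cpu_pm_nb = {
	.notifier_call = mpm_cpu_pm_callback,
};

/* registered once at probe time: cpu_pm_register_notifier(&mpm_cpu_pm_nb); */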

Shawn Guo (3):
  cpuidle: psci: Call cpu_cluster_pm_enter() on the last CPU
  dt-bindings: interrupt-controller: Add Qualcomm MPM support
  irqchip: Add Qualcomm MPM controller driver

 .../interrupt-controller/qcom,mpm.yaml        |  94 ++++
 drivers/cpuidle/cpuidle-psci.c                |  13 +
 drivers/irqchip/Kconfig                       |   8 +
 drivers/irqchip/Makefile                      |   1 +
 drivers/irqchip/qcom-mpm.c                    | 440 ++++++++++++++++++
 5 files changed, 556 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/qcom,mpm.yaml
 create mode 100644 drivers/irqchip/qcom-mpm.c

Comments

Marc Zyngier Feb. 16, 2022, 2:39 p.m. UTC | #1
On 2022-02-16 13:28, Shawn Guo wrote:
> Make a call to cpu_cluster_pm_enter() on the last CPU going to low power
> state (and cpu_cluster_pm_exit() on the first CPU coming back), so that
> platforms can be notified to set up hardware for getting into the cluster
> low power state.
> 
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>  drivers/cpuidle/cpuidle-psci.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index b51b5df08450..c748c1a7d7b1 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -37,6 +37,7 @@ struct psci_cpuidle_data {
>  static DEFINE_PER_CPU_READ_MOSTLY(struct psci_cpuidle_data, psci_cpuidle_data);
>  static DEFINE_PER_CPU(u32, domain_state);
>  static bool psci_cpuidle_use_cpuhp;
> +static atomic_t cpus_in_idle;
> 
>  void psci_set_domain_state(u32 state)
>  {
> @@ -67,6 +68,14 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
>  	if (ret)
>  		return -1;
> 
> +	if (atomic_inc_return(&cpus_in_idle) == num_online_cpus()) {
> +		ret = cpu_cluster_pm_enter();
> +		if (ret) {
> +			ret = -1;
> +			goto dec_atomic;
> +		}
> +	}
> +
>  	/* Do runtime PM to manage a hierarchical CPU toplogy. */
>  	rcu_irq_enter_irqson();
>  	if (s2idle)
> @@ -88,6 +97,10 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
>  		pm_runtime_get_sync(pd_dev);
>  	rcu_irq_exit_irqson();
> 
> +	if (atomic_read(&cpus_in_idle) == num_online_cpus())
> +		cpu_cluster_pm_exit();
> +dec_atomic:
> +	atomic_dec(&cpus_in_idle);
>  	cpu_pm_exit();
> 
>  	/* Clear the domain state to start fresh when back from idle. */

Is it just me, or does anyone else find it a bit odd that a cpuidle driver
calls back into the core cpuidle code to generate new events?

Also, why is this PSCI specific? I would assume that the core cpuidle code
should be responsible for these transitions, not a random cpuidle driver.

Thanks,

         M.
Sudeep Holla Feb. 16, 2022, 2:49 p.m. UTC | #2
+Ulf (as he is the author of cpuidle-psci-domains.c and can help you
with that if you require)

On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
> Make a call to cpu_cluster_pm_enter() on the last CPU going to low power
> state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> platforms can be notified to set up hardware for getting into the cluster
> low power state.
>

NACK. We are not getting the notion of CPU cluster back to cpuidle again.
That must die. Remember the cluster doesn't map to idle states especially
in the DSU systems where HMP CPUs are in the same cluster but can be in 
different power domains.

You need to decide which PSCI CPU_SUSPEND mode you want to use first. If it is
Platform-Coordinated (PC), then you need not notify anything to the platform.
Just request the desired idle state on each CPU and the platform will take
care of it from there.

If for whatever reason you have chosen OS-initiated mode (OSI), then specify
the PSCI power domains correctly in the DT, which will make use of the
cpuidle-psci-domains driver and handle the so-called "cluster" state correctly.
Marc Zyngier Feb. 16, 2022, 3:58 p.m. UTC | #3
On 2022-02-16 14:49, Sudeep Holla wrote:
> +Ulf (as you he is the author of cpuidle-psci-domains.c and can help 
> you
> with that if you require)
> 
> On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
>> Make a call to cpu_cluster_pm_enter() on the last CPU going to low 
>> power
>> state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
>> platforms can be notified to set up hardware for getting into the 
>> cluster
>> low power state.
>> 
> 
> NACK. We are not getting the notion of CPU cluster back to cpuidle 
> again.
> That must die. Remember the cluster doesn't map to idle states 
> especially
> in the DSU systems where HMP CPUs are in the same cluster but can be in
> different power domains.
> 
> You need to decide which PSCI CPU_SUSPEND mode you want to use first. 
> If it is
> Platform Co-ordinated(PC), then you need not notify anything to the 
> platform.
> Just request the desired idle state on each CPU and platform will take 
> care
> from there.
> 
> If for whatever reason you have chosen OS initiated mode(OSI), then 
> specify
> the PSCI power domains correctly in the DT which will make use of the
> cpuidle-psci-domains and handle the so called "cluster" state 
> correctly.

My understanding is that what Shawn is after is a way to detect the "last
man standing" on the system to kick off some funky wake-up controller that
really should be handled by the power controller (because only that guy
knows for sure who is the last CPU on the block).

There was previously some really funky stuff (copy pasted from the existing
rpmh_rsc_cpu_pm_callback()), which I totally objected to having hidden in
an irqchip driver.

My ask was that if we needed such information, and assuming that it is
possible to obtain it in a reliable way, this should come from the core
code, and not be invented by random drivers.

Thanks,

         M.
Shawn Guo Feb. 17, 2022, 2:28 a.m. UTC | #4
On Wed, Feb 16, 2022 at 02:39:26PM +0000, Marc Zyngier wrote:
> On 2022-02-16 13:28, Shawn Guo wrote:
> > Make a call to cpu_cluster_pm_enter() on the last CPU going to low power
> > state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> > platforms can be notified to set up hardware for getting into the
> > cluster
> > low power state.
> > 
> > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > ---
> >  drivers/cpuidle/cpuidle-psci.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/cpuidle/cpuidle-psci.c
> > b/drivers/cpuidle/cpuidle-psci.c
> > index b51b5df08450..c748c1a7d7b1 100644
> > --- a/drivers/cpuidle/cpuidle-psci.c
> > +++ b/drivers/cpuidle/cpuidle-psci.c
> > @@ -37,6 +37,7 @@ struct psci_cpuidle_data {
> >  static DEFINE_PER_CPU_READ_MOSTLY(struct psci_cpuidle_data,
> > psci_cpuidle_data);
> >  static DEFINE_PER_CPU(u32, domain_state);
> >  static bool psci_cpuidle_use_cpuhp;
> > +static atomic_t cpus_in_idle;
> > 
> >  void psci_set_domain_state(u32 state)
> >  {
> > @@ -67,6 +68,14 @@ static int __psci_enter_domain_idle_state(struct
> > cpuidle_device *dev,
> >  	if (ret)
> >  		return -1;
> > 
> > +	if (atomic_inc_return(&cpus_in_idle) == num_online_cpus()) {
> > +		ret = cpu_cluster_pm_enter();
> > +		if (ret) {
> > +			ret = -1;
> > +			goto dec_atomic;
> > +		}
> > +	}
> > +
> >  	/* Do runtime PM to manage a hierarchical CPU toplogy. */
> >  	rcu_irq_enter_irqson();
> >  	if (s2idle)
> > @@ -88,6 +97,10 @@ static int __psci_enter_domain_idle_state(struct
> > cpuidle_device *dev,
> >  		pm_runtime_get_sync(pd_dev);
> >  	rcu_irq_exit_irqson();
> > 
> > +	if (atomic_read(&cpus_in_idle) == num_online_cpus())
> > +		cpu_cluster_pm_exit();
> > +dec_atomic:
> > +	atomic_dec(&cpus_in_idle);
> >  	cpu_pm_exit();
> > 
> >  	/* Clear the domain state to start fresh when back from idle. */
> 
> Is it just me, or does anyone else find it a bit odd that a cpuidle driver
> calls back into the core cpuidle code to generate new events?

It's not uncommon for a platform driver to call helper functions provided
by the core.

> Also, why is this PSCI specific? I would assume that the core cpuidle code
> should be responsible for these transitions, not a random cpuidle driver.

The CPU PM helpers cpu_pm_enter() and cpu_cluster_pm_enter() are provided
by kernel/cpu_pm.c rather than the cpuidle core.  This PSCI cpuidle driver
already calls cpu_pm_enter(), and my patch additionally adds a call to
cpu_cluster_pm_enter().

Shawn
Shawn Guo Feb. 17, 2022, 7:31 a.m. UTC | #5
On Wed, Feb 16, 2022 at 03:58:41PM +0000, Marc Zyngier wrote:
> On 2022-02-16 14:49, Sudeep Holla wrote:
> > +Ulf (as you he is the author of cpuidle-psci-domains.c and can help you
> > with that if you require)

Thanks, Sudeep!

> > 
> > On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
> > > Make a call to cpu_cluster_pm_enter() on the last CPU going to low
> > > power
> > > state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> > > platforms can be notified to set up hardware for getting into the
> > > cluster
> > > low power state.
> > > 
> > 
> > NACK. We are not getting the notion of CPU cluster back to cpuidle
> > again.
> > That must die. Remember the cluster doesn't map to idle states
> > especially
> > in the DSU systems where HMP CPUs are in the same cluster but can be in
> > different power domains.

The 'cluster' in cpu_cluster_pm_enter() doesn't necessarily mean
a physical CPU cluster.  I think the documentation of the function has a
better description.

 * Notifies listeners that all cpus in a power domain are entering a low power
 * state that may cause some blocks in the same power domain to reset.

So cpu_domain_pm_enter() might be a better name?  Anyways ...

> > 
> > You need to decide which PSCI CPU_SUSPEND mode you want to use first. If
> > it is
> > Platform Co-ordinated(PC), then you need not notify anything to the
> > platform.
> > Just request the desired idle state on each CPU and platform will take
> > care
> > from there.
> > 
> > If for whatever reason you have chosen OS initiated mode(OSI), then
> > specify
> > the PSCI power domains correctly in the DT which will make use of the
> > cpuidle-psci-domains and handle the so called "cluster" state correctly.

Yes, I'm running a Qualcomm platform that has OSI supported in PSCI.

> 
> My understanding is that what Shawn is after is a way to detect the "last
> man standing" on the system to kick off some funky wake-up controller that
> really should be handled by the power controller (because only that guy
> knows for sure who is the last CPU on the block).
> 
> There was previously some really funky stuff (copy pasted from the existing
> rpmh_rsc_cpu_pm_callback()), which I totally objected to having hidden in
> an irqchip driver.
> 
> My ask was that if we needed such information, and assuming that it is
> possible to obtain it in a reliable way, this should come from the core
> code, and not be invented by random drivers.

Thanks, Marc, for explaining my problem!

Right, all I need is a notification in the MPM irqchip driver when the CPU
domain/cluster is about to enter a low power state.  As cpu_pm
(kernel/cpu_pm.c) already has the helper cpu_cluster_pm_enter() sending the
CPU_CLUSTER_PM_ENTER event, I just need to find a caller for this cpu_pm
helper.

Is the .power_off hook of generic_pm_domain a better place for calling the
helper?

Shawn

----8<------------
diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index ff2c3f8e4668..58aad15851f9 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -10,6 +10,7 @@
 #define pr_fmt(fmt) "CPUidle PSCI: " fmt
 
 #include <linux/cpu.h>
+#include <linux/cpu_pm.h>
 #include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/platform_device.h>
@@ -33,6 +34,7 @@ static int psci_pd_power_off(struct generic_pm_domain *pd)
 {
        struct genpd_power_state *state = &pd->states[pd->state_idx];
        u32 *pd_state;
+       int ret;
 
        if (!state->data)
                return 0;
@@ -44,6 +46,16 @@ static int psci_pd_power_off(struct generic_pm_domain *pd)
        pd_state = state->data;
        psci_set_domain_state(*pd_state);
 
+       if (list_empty(&pd->child_links)) {
+               /*
+                * The top domain (not being a child of anyone) should be the
+                * best one triggering PM notification.
+                */
+               ret = cpu_cluster_pm_enter();
+               if (ret)
+                       return ret;
+       }
+
        return 0;
 }
Marc Zyngier Feb. 17, 2022, 8:52 a.m. UTC | #6
On Thu, 17 Feb 2022 07:31:32 +0000,
Shawn Guo <shawn.guo@linaro.org> wrote:
> 
> On Wed, Feb 16, 2022 at 03:58:41PM +0000, Marc Zyngier wrote:
> > On 2022-02-16 14:49, Sudeep Holla wrote:
> > > +Ulf (as you he is the author of cpuidle-psci-domains.c and can help you
> > > with that if you require)
> 
> Thanks, Sudeep!
> 
> > > 
> > > On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
> > > > Make a call to cpu_cluster_pm_enter() on the last CPU going to low
> > > > power
> > > > state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> > > > platforms can be notified to set up hardware for getting into the
> > > > cluster
> > > > low power state.
> > > > 
> > > 
> > > NACK. We are not getting the notion of CPU cluster back to cpuidle
> > > again.
> > > That must die. Remember the cluster doesn't map to idle states
> > > especially
> > > in the DSU systems where HMP CPUs are in the same cluster but can be in
> > > different power domains.
> 
> The 'cluster' in cpu_cluster_pm_enter() doesn't necessarily means
> a physical CPU cluster.  I think the documentation of the function has a
> better description.
> 
>  * Notifies listeners that all cpus in a power domain are entering a low power
>  * state that may cause some blocks in the same power domain to reset.
> 
> So cpu_domain_pm_enter() might be a better name?  Anyways ...
> 
> > > 
> > > You need to decide which PSCI CPU_SUSPEND mode you want to use first. If
> > > it is
> > > Platform Co-ordinated(PC), then you need not notify anything to the
> > > platform.
> > > Just request the desired idle state on each CPU and platform will take
> > > care
> > > from there.
> > > 
> > > If for whatever reason you have chosen OS initiated mode(OSI), then
> > > specify
> > > the PSCI power domains correctly in the DT which will make use of the
> > > cpuidle-psci-domains and handle the so called "cluster" state correctly.
> 
> Yes, I'm running a Qualcomm platform that has OSI supported in PSCI.
> 
> > 
> > My understanding is that what Shawn is after is a way to detect the "last
> > man standing" on the system to kick off some funky wake-up controller that
> > really should be handled by the power controller (because only that guy
> > knows for sure who is the last CPU on the block).
> > 
> > There was previously some really funky stuff (copy pasted from the existing
> > rpmh_rsc_cpu_pm_callback()), which I totally objected to having hidden in
> > an irqchip driver.
> > 
> > My ask was that if we needed such information, and assuming that it is
> > possible to obtain it in a reliable way, this should come from the core
> > code, and not be invented by random drivers.
> 
> Thanks Marc for explain my problem!
> 
> Right, all I need is a notification in MPM irqchip driver when the CPU
> domain/cluster is about to enter low power state.  As cpu_pm -
> kernel/cpu_pm.c, already has helper cpu_cluster_pm_enter() sending
> CPU_CLUSTER_PM_ENTER event, I just need to find a caller to this cpu_pm
> helper.  
> 
> Is .power_off hook of generic_pm_domain a better place for calling the
> helper?

I really don't understand why you want a cluster PM event generated by
the idle driver, especially considering that you are not after a
*cluster* PM event, but after some sort of system-wide event (last man
standing).

It looks to me that having a predicate that can be called from a PM
notifier event to find out whether you're the last in line would be
better suited, and could further be used to remove the crap from the
rpmh-rsc driver.
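
Something along these lines, say (the predicate name below is completely
made up, just to show the shape of the idea):

/* assumes <linux/cpu_pm.h> and <linux/notifier.h> */

/* hypothetical helper provided by core code */
bool cpu_pm_last_cpu_entering_idle(void);

static int mpm_cpu_pm_notify(struct notifier_block *nb,
			     unsigned long action, void *data)
{
	if (action == CPU_PM_ENTER && cpu_pm_last_cpu_entering_idle()) {
		/* only the last CPU going down sets up the wake-up HW */
	}

	return NOTIFY_OK;
}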

	M.
Shawn Guo Feb. 17, 2022, 1:14 p.m. UTC | #7
On Wed, Feb 16, 2022 at 03:48:42PM +0000, Marc Zyngier wrote:
...
> > +static irq_hw_number_t get_parent_hwirq(struct qcom_mpm_priv *priv, int pin)
> > +{
> > +	const struct mpm_gic_map *maps = priv->maps;
> > +	int i;
> > +
> > +	for (i = 0; i < priv->map_cnt; i++) {
> > +		if (maps[i].pin == pin)
> > +			return maps[i].hwirq;
> > +	}
> > +
> > +	return MPM_NO_PARENT_IRQ;
> 
> I'd rather you change this helper to return a pointer to the mapping data,
> and NULL on failure. This will avoid inventing magic values which may
> or may not clash with an interrupt number in the future.
> 
> > +}
> > +
> > +static int qcom_mpm_alloc(struct irq_domain *domain, unsigned int virq,
> > +			  unsigned int nr_irqs, void *data)
> > +{
> > +	struct qcom_mpm_priv *priv = domain->host_data;
> > +	struct irq_fwspec *fwspec = data;
> > +	struct irq_fwspec parent_fwspec;
> > +	irq_hw_number_t parent_hwirq;
> 
> Isn't this the pin number? If so, I'd rather you call it that.

We use the term 'pin' in the driver as the hwirq of the MPM, while the
parent_hwirq here means the hwirq of the GIC.  But it will be dropped anyway
as I'm following your suggestion to check the mapping data instead of
parent_hwirq.
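
i.e. something roughly like the below for the next version (just a sketch of
the direction, reusing the existing struct mpm_gic_map / qcom_mpm_priv from
the driver; the helper name is tentative):

/*
 * Return the mapping entry for an MPM pin, or NULL if the pin has no GIC
 * connection, instead of a magic MPM_NO_PARENT_IRQ value.
 */
static const struct mpm_gic_map *get_mpm_gic_map(struct qcom_mpm_priv *priv,
						 int pin)
{
	const struct mpm_gic_map *maps = priv->maps;
	int i;

	for (i = 0; i < priv->map_cnt; i++) {
		if (maps[i].pin == pin)
			return &maps[i];
	}

	return NULL;
}

and then in qcom_mpm_alloc(), something like:

	map = get_mpm_gic_map(priv, hwirq);
	if (!map)
		return irq_domain_disconnect_hierarchy(domain->parent, virq);
	...
	parent_fwspec.param[1] = map->hwirq;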

I will address all other review comments.  Thanks, Marc!

Shawn

> 
> > +	irq_hw_number_t hwirq;
> > +	unsigned int type;
> > +	int  ret;
> > +
> > +	ret = irq_domain_translate_twocell(domain, fwspec, &hwirq, &type);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = irq_domain_set_hwirq_and_chip(domain, virq, hwirq,
> > +					    &qcom_mpm_chip, priv);
> > +	if (ret)
> > +		return ret;
> > +
> > +	parent_hwirq = get_parent_hwirq(priv, hwirq);
> > +	if (parent_hwirq == MPM_NO_PARENT_IRQ)
> > +		return irq_domain_disconnect_hierarchy(domain->parent, virq);
> > +
> > +	if (type & IRQ_TYPE_EDGE_BOTH)
> > +		type = IRQ_TYPE_EDGE_RISING;
> > +
> > +	if (type & IRQ_TYPE_LEVEL_MASK)
> > +		type = IRQ_TYPE_LEVEL_HIGH;
> > +
> > +	parent_fwspec.fwnode = domain->parent->fwnode;
> > +	parent_fwspec.param_count = 3;
> > +	parent_fwspec.param[0] = 0;
> > +	parent_fwspec.param[1] = parent_hwirq;
> > +	parent_fwspec.param[2] = type;
> > +
> > +	return irq_domain_alloc_irqs_parent(domain, virq, nr_irqs,
> > +					    &parent_fwspec);
> > +}
Shawn Guo Feb. 17, 2022, 1:37 p.m. UTC | #8
On Thu, Feb 17, 2022 at 08:52:35AM +0000, Marc Zyngier wrote:
> On Thu, 17 Feb 2022 07:31:32 +0000,
> Shawn Guo <shawn.guo@linaro.org> wrote:
> > 
> > On Wed, Feb 16, 2022 at 03:58:41PM +0000, Marc Zyngier wrote:
> > > On 2022-02-16 14:49, Sudeep Holla wrote:
> > > > +Ulf (as you he is the author of cpuidle-psci-domains.c and can help you
> > > > with that if you require)
> > 
> > Thanks, Sudeep!
> > 
> > > > 
> > > > On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
> > > > > Make a call to cpu_cluster_pm_enter() on the last CPU going to low
> > > > > power
> > > > > state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> > > > > platforms can be notified to set up hardware for getting into the
> > > > > cluster
> > > > > low power state.
> > > > > 
> > > > 
> > > > NACK. We are not getting the notion of CPU cluster back to cpuidle
> > > > again.
> > > > That must die. Remember the cluster doesn't map to idle states
> > > > especially
> > > > in the DSU systems where HMP CPUs are in the same cluster but can be in
> > > > different power domains.
> > 
> > The 'cluster' in cpu_cluster_pm_enter() doesn't necessarily means
> > a physical CPU cluster.  I think the documentation of the function has a
> > better description.
> > 
> >  * Notifies listeners that all cpus in a power domain are entering a low power
> >  * state that may cause some blocks in the same power domain to reset.
> > 
> > So cpu_domain_pm_enter() might be a better name?  Anyways ...
> > 
> > > > 
> > > > You need to decide which PSCI CPU_SUSPEND mode you want to use first. If
> > > > it is
> > > > Platform Co-ordinated(PC), then you need not notify anything to the
> > > > platform.
> > > > Just request the desired idle state on each CPU and platform will take
> > > > care
> > > > from there.
> > > > 
> > > > If for whatever reason you have chosen OS initiated mode(OSI), then
> > > > specify
> > > > the PSCI power domains correctly in the DT which will make use of the
> > > > cpuidle-psci-domains and handle the so called "cluster" state correctly.
> > 
> > Yes, I'm running a Qualcomm platform that has OSI supported in PSCI.
> > 
> > > 
> > > My understanding is that what Shawn is after is a way to detect the "last
> > > man standing" on the system to kick off some funky wake-up controller that
> > > really should be handled by the power controller (because only that guy
> > > knows for sure who is the last CPU on the block).
> > > 
> > > There was previously some really funky stuff (copy pasted from the existing
> > > rpmh_rsc_cpu_pm_callback()), which I totally objected to having hidden in
> > > an irqchip driver.
> > > 
> > > My ask was that if we needed such information, and assuming that it is
> > > possible to obtain it in a reliable way, this should come from the core
> > > code, and not be invented by random drivers.
> > 
> > Thanks Marc for explain my problem!
> > 
> > Right, all I need is a notification in MPM irqchip driver when the CPU
> > domain/cluster is about to enter low power state.  As cpu_pm -
> > kernel/cpu_pm.c, already has helper cpu_cluster_pm_enter() sending
> > CPU_CLUSTER_PM_ENTER event, I just need to find a caller to this cpu_pm
> > helper.  
> > 
> > Is .power_off hook of generic_pm_domain a better place for calling the
> > helper?
> 
> I really don't understand why you want a cluster PM event generated by
> the idle driver.  Specially considering that you are not after a
> *cluster* PM event, but after some sort of system-wide event (last man
> standing).

That's primarily because the MPM driver is used in the idle context, either
s2idle or cpuidle, and the idle driver already has some infrastructure that
could help trigger the event.

> It looks to me that having a predicate that can be called from a PM
> notifier event to find out whether you're the last in line would be
> better suited, and could further be used to remove the crap from the
> rpmh-rsc driver.

I will see if I can come up with a patch for discussion.

Shawn
Shawn Guo Feb. 19, 2022, 11:45 a.m. UTC | #9
On Thu, Feb 17, 2022 at 09:37:03PM +0800, Shawn Guo wrote:
> On Thu, Feb 17, 2022 at 08:52:35AM +0000, Marc Zyngier wrote:
> > On Thu, 17 Feb 2022 07:31:32 +0000,
> > Shawn Guo <shawn.guo@linaro.org> wrote:
> > > 
> > > On Wed, Feb 16, 2022 at 03:58:41PM +0000, Marc Zyngier wrote:
> > > > On 2022-02-16 14:49, Sudeep Holla wrote:
> > > > > +Ulf (as you he is the author of cpuidle-psci-domains.c and can help you
> > > > > with that if you require)
> > > 
> > > Thanks, Sudeep!
> > > 
> > > > > 
> > > > > On Wed, Feb 16, 2022 at 09:28:28PM +0800, Shawn Guo wrote:
> > > > > > Make a call to cpu_cluster_pm_enter() on the last CPU going to low
> > > > > > power
> > > > > > state (and cpu_cluster_pm_exit() on the firt CPU coming back), so that
> > > > > > platforms can be notified to set up hardware for getting into the
> > > > > > cluster
> > > > > > low power state.
> > > > > > 
> > > > > 
> > > > > NACK. We are not getting the notion of CPU cluster back to cpuidle
> > > > > again.
> > > > > That must die. Remember the cluster doesn't map to idle states
> > > > > especially
> > > > > in the DSU systems where HMP CPUs are in the same cluster but can be in
> > > > > different power domains.
> > > 
> > > The 'cluster' in cpu_cluster_pm_enter() doesn't necessarily means
> > > a physical CPU cluster.  I think the documentation of the function has a
> > > better description.
> > > 
> > >  * Notifies listeners that all cpus in a power domain are entering a low power
> > >  * state that may cause some blocks in the same power domain to reset.
> > > 
> > > So cpu_domain_pm_enter() might be a better name?  Anyways ...
> > > 
> > > > > 
> > > > > You need to decide which PSCI CPU_SUSPEND mode you want to use first. If
> > > > > it is
> > > > > Platform Co-ordinated(PC), then you need not notify anything to the
> > > > > platform.
> > > > > Just request the desired idle state on each CPU and platform will take
> > > > > care
> > > > > from there.
> > > > > 
> > > > > If for whatever reason you have chosen OS initiated mode(OSI), then
> > > > > specify
> > > > > the PSCI power domains correctly in the DT which will make use of the
> > > > > cpuidle-psci-domains and handle the so called "cluster" state correctly.
> > > 
> > > Yes, I'm running a Qualcomm platform that has OSI supported in PSCI.
> > > 
> > > > 
> > > > My understanding is that what Shawn is after is a way to detect the "last
> > > > man standing" on the system to kick off some funky wake-up controller that
> > > > really should be handled by the power controller (because only that guy
> > > > knows for sure who is the last CPU on the block).
> > > > 
> > > > There was previously some really funky stuff (copy pasted from the existing
> > > > rpmh_rsc_cpu_pm_callback()), which I totally objected to having hidden in
> > > > an irqchip driver.
> > > > 
> > > > My ask was that if we needed such information, and assuming that it is
> > > > possible to obtain it in a reliable way, this should come from the core
> > > > code, and not be invented by random drivers.
> > > 
> > > Thanks Marc for explain my problem!
> > > 
> > > Right, all I need is a notification in MPM irqchip driver when the CPU
> > > domain/cluster is about to enter low power state.  As cpu_pm -
> > > kernel/cpu_pm.c, already has helper cpu_cluster_pm_enter() sending
> > > CPU_CLUSTER_PM_ENTER event, I just need to find a caller to this cpu_pm
> > > helper.  
> > > 
> > > Is .power_off hook of generic_pm_domain a better place for calling the
> > > helper?
> > 
> > I really don't understand why you want a cluster PM event generated by
> > the idle driver.  Specially considering that you are not after a
> > *cluster* PM event, but after some sort of system-wide event (last man
> > standing).
> 
> That's primarily because MPM driver is used in the idle context, either
> s2idle or cpuidle, and idle driver already has some infrastructure that
> could help trigger the event.
> 
> > It looks to me that having a predicate that can be called from a PM
> > notifier event to find out whether you're the last in line would be
> > better suited, and could further be used to remove the crap from the
> > rpmh-rsc driver.
> 
> I will see if I can come up with a patch for discussion.

How about this?

---8<-----------