[v14,0/6] Invoke rpmh_flush for non OSI targets

Message ID 1585244270-637-1-git-send-email-mkshah@codeaurora.org

Message

Maulik Shah March 26, 2020, 5:37 p.m. UTC
Changes in v14:
- Address Doug's comments on change 3 from v13
- Drop new APIs for start and end transaction from change 4 in v13
- Update change 4 to use cpu pm notifications instead
- Add [5] as change 5 to enable use of WAKE TCS when ACTIVE TCS count is 0
- Add change 6 to allow multiple WAKE TCSes to be used as ACTIVE TCSes
- First 4 changes can be merged even without change 5 and 6.

Changes in v13:
- Address Stephen's comment to maintain COMPILE_TEST
- Address Doug's comments and add new APIs for start and end transaction

Changes in v12:
- Kconfig change to remove COMPILE_TEST was dropped in v11, reinclude it.

Changes in v11:
- Address Doug's comments on change 2 and 3
- Include change to invalidate TCSes before flush from [4]

Changes in v10:
- Address Evan's comments to update commit message on change 2
- Add Evan's Reviewed by on change 2
- Remove comment from rpmh_flush() related to last CPU invoking it
- Rebase all changes on top of next-20200302

Changes in v9:
- Keep rpmh_flush() to invoke from within cache_lock
- Remove comments related to only last cpu invoking rpmh_flush()

Changes in v8:
- Address Stephen's comments on changes 2 and 3
- Add Reviewed by from Stephen on change 1

Changes in v7:
- Address Srinivas's comments to update commit text
- Add Reviewed by from Srinivas

Changes in v6:
- Drop 1 & 2 changes from v5 as they already landed in maintainer tree
- Drop 3 & 4 changes from v5 as no user at present for power domain in rsc
- Rename subject to appropriate since power domain changes are dropped
- Rebase other changes on top of next-20200221

Changes in v5:
- Add Rob's Acked by on dt-bindings change
- Drop firmware psci change
- Update cpuidle stats in dtsi to follow PC mode
- Include change to update dirty flag when data is updated from [4]
- Add change to invoke rpmh_flush when caches are dirty

Changes in v4:
- Add change to allow hierarchical topology in PC mode
- Drop hierarchical domain idle states converter from v3
- Address Merge sc7180 dtsi change to add low power modes

Changes in v3:
- Address Rob's comment on dt property value
- Address Stephen's comments on rpmh-rsc driver change
- Include sc7180 cpuidle low power mode changes from [1]
- Include hierarchical domain idle states converter change from [2]

Changes in v2:
- Add Stephen's Reviewed-By to the first three patches
- Addressed Stephen's comments on fourth patch
- Include changes to connect rpmh domain to cpuidle and genpds

The Resource State Coordinator (RSC) is responsible for powering off or
lowering the CPU subsystem's requirements for associated hardware such as
buses, clocks, and regulators when all CPUs and the cluster are powered down.

The RSC power domain uses the last-man activities provided by the genpd
framework, based on Ulf Hansson's patch series [3], when the cluster of CPUs
enters its deepest idle state. As part of domain power-off, the RSC can lower
resource state requirements by flushing the cached sleep and wake state votes
for various resources.
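
To make the "flush only when the cache is dirty" idea from changes 2 and 4
concrete, here is a minimal userspace sketch. It is not the driver code: the
struct and function names (sketch_ctrlr_cache, sketch_cache_votes,
sketch_flush_if_dirty) are made up for illustration, and the real flush
writes TCSes rather than bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one cached resource vote. */
struct sketch_ctrlr_cache {
    uint32_t sleep_val;
    uint32_t wake_val;
    bool dirty;      /* true when cached votes differ from what was flushed */
    int flushes;     /* counts how often a flush actually did work */
};

/* Cache new votes; mark the cache dirty only if a value really changed. */
static void sketch_cache_votes(struct sketch_ctrlr_cache *c,
                               uint32_t sleep_val, uint32_t wake_val)
{
    if (c->sleep_val == sleep_val && c->wake_val == wake_val)
        return;

    c->sleep_val = sleep_val;
    c->wake_val = wake_val;
    c->dirty = true;
}

/* Flush only when there is something new; returns true if a flush ran. */
static bool sketch_flush_if_dirty(struct sketch_ctrlr_cache *c)
{
    if (!c->dirty)
        return false;

    c->flushes++;    /* the real code would program SLEEP/WAKE TCSes here */
    c->dirty = false;
    return true;
}
```

With this shape, re-sending identical votes leaves the dirty flag clear, so
the next idle entry can skip the flush entirely.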

[1] https://patchwork.kernel.org/patch/11218965
[2] https://patchwork.kernel.org/patch/10941671
[3] https://patchwork.kernel.org/project/linux-arm-msm/list/?series=222355
[4] https://patchwork.kernel.org/project/linux-arm-msm/list/?series=236503
[5] https://patchwork.kernel.org/patch/10818129

Maulik Shah (5):
  arm64: dts: qcom: sc7180: Add cpuidle low power states
  soc: qcom: rpmh: Update dirty flag only when data changes
  soc: qcom: rpmh: Invalidate SLEEP and WAKE TCSes before flushing new
    data
  soc: qcom: rpmh: Invoke rpmh_flush() for dirty caches
  soc: qcom: rpmh-rsc: Allow using free WAKE TCS for active request

Raju P.L.S.S.S.N (1):
  soc: qcom: rpmh-rsc: Clear active mode configuration for wake TCS

 arch/arm64/boot/dts/qcom/sc7180.dtsi |  78 ++++++++++++++
 drivers/soc/qcom/rpmh-internal.h     |   8 ++
 drivers/soc/qcom/rpmh-rsc.c          | 203 ++++++++++++++++++++++++++++-------
 drivers/soc/qcom/rpmh.c              |  71 ++++++------
 4 files changed, 280 insertions(+), 80 deletions(-)

Comments

Doug Anderson March 26, 2020, 6:20 p.m. UTC | #1
Hi,

On Thu, Mar 26, 2020 at 10:38 AM Maulik Shah <mkshah@codeaurora.org> wrote:
>
> From: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org>
>
> For RSCs that have sleep & wake TCS but no dedicated active TCS, wake
> TCS can be re-purposed to send active requests. Once the active requests
> are sent and response is received, the active mode configuration needs
> to be cleared so that controller can use wake TCS for sending wake
> requests.
>
> Introduce enable_tcs_irq() to enable completion IRQ for repurposed TCSes.
>
> Fixes: 2de4b8d33eab (drivers: qcom: rpmh-rsc: allow active requests from wake TCS)
> Signed-off-by: Raju P.L.S.S.S.N <rplsssn@codeaurora.org>
> [mkshah: call enable_tcs_irq() within drv->lock, update commit message]
> Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
> ---
>  drivers/soc/qcom/rpmh-rsc.c | 77 +++++++++++++++++++++++++++++++--------------
>  1 file changed, 54 insertions(+), 23 deletions(-)

I was writing my review of v1 at the same time you sent this.  Looks
like your patch does address the most important piece of feedback I
had (adjusting the interrupt enable under spinlock), but some of the
other feedback might be nice to incorporate:

https://lore.kernel.org/r/CAD=FV=XmBQb8yfx14T-tMQ68F-h=3UHog744b3X3JZViu15+4g@mail.gmail.com

-Doug

Doug Anderson March 26, 2020, 9:18 p.m. UTC | #2
Hi,

On Thu, Mar 26, 2020 at 10:38 AM Maulik Shah <mkshah@codeaurora.org> wrote:
>
> Add changes to invoke rpmh flush() from cpu pm notification.
> This is done when last cpu is entering power collapse and

s/when last/when the last/

> controller is not busy.

A few overall comments:

* CPU_PM certainly seems interesting and I wasn't aware of it.

* Your series is predicated on nobody calling any of the RPMH
functions after the last CPU enters low power mode with CPU_PM_ENTER
and before some CPU calls CPU_PM_EXIT.  Is it worth adding a check for
that to the code?  This is super important for the 'zero-active-TCS'
case but even for the 'some-active-TCS' it's still important that
nobody tries to update the sleep/wake values after you flush.

* I think your series is focused on idle, right?  What about
suspend/resume?  You need a flush before suspend, right?  Do you need
to register for something to handle suspend, or do the pm callbacks
all get called for full system suspend?  If so, how late do they get
called?
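
For the second point above, one possible shape for such a check is sketched
here in plain userspace C. Everything in it is hypothetical (the
flushed_for_pm flag, the sketch_* names, returning -EBUSY); in the driver a
WARN_ON() might be more appropriate than an error return, and the flag would
need whatever locking the final design uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical controller state; none of these names come from the driver. */
struct sketch_rpmh {
    bool flushed_for_pm;   /* set from last CPU_PM_ENTER to first CPU_PM_EXIT */
    uint32_t sleep_val;
};

/* Called once the last CPU has entered low power and the flush is done. */
static void sketch_last_cpu_entered(struct sketch_rpmh *r)
{
    r->flushed_for_pm = true;
}

/* Called when some CPU comes back out of low power. */
static void sketch_cpu_exited(struct sketch_rpmh *r)
{
    r->flushed_for_pm = false;
}

/* Reject updates that would silently go stale after the flush. */
static int sketch_write_sleep_vote(struct sketch_rpmh *r, uint32_t val)
{
    if (r->flushed_for_pm)
        return -EBUSY;

    r->sleep_val = val;
    return 0;
}
```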


> Controller that do have 'HW solver' mode do not need to register

s/Controller/Controllers/


> @@ -85,22 +85,30 @@ struct rpmh_ctrlr {
>   * Resource State Coordinator controller (RSC)
>   *
>   * @name:       controller identifier
> + * @base:       start address of the RSC's DRV registers
>   * @tcs_base:   start address of the TCS registers in this controller
>   * @id:         instance id in the controller (Direct Resource Voter)
>   * @num_tcs:    number of TCSes in this DRV
> + * @rsc_pm:     CPU PM notifier for controller
> + * @cpus_in_pc: CPU mask for cpus in idle power collapse

Name "cpus_entered_pm" to match Linux naming?  Also the comment should
clearly say that this is only used in non-solver mode.


>   * @tcs:        TCS groups
>   * @tcs_in_use: s/w state of the TCS
>   * @lock:       synchronize state of the controller
> + * @pm_lock:    synchronize during PM notifications

Comment should say that this is only for non-solver mode.


> @@ -521,6 +527,71 @@ int rpmh_rsc_write_ctrl_data(struct rsc_drv *drv, const struct tcs_request *msg)
>         return tcs_ctrl_write(drv, msg);
>  }
>
> +/**
> + * rpmh_rsc_ctrlr_is_busy: Check if any of the AMCs are busy.
> + *
> + * @drv: The controller
> + *
> + * Returns True if the TCSes are engaged in handling requests, False otherwise.

This is almost, but not quite in kernel-doc syntax.  Why not make it
match the example in "Documentation/doc-guide/kernel-doc.rst"?
Specifically trying to follow that style:

 * rpmh_rsc_ctrlr_is_busy() - Check if any of the AMCs are busy.
 * @drv: The controller
 *
 * Return: true if TCSes are engaged in handling requests; else false


> +static bool rpmh_rsc_ctrlr_is_busy(struct rsc_drv *drv)
> +{
> +       int m;
> +       struct tcs_group *tcs = get_tcs_of_type(drv, ACTIVE_TCS);
> +
> +       /**

nit: why double star in "/**"?  That's for kerneldoc comments.  I
don't think it makes sense here...


> +        * If we made an active request on a RSC that does not have a
> +        * dedicated TCS for active state use, then re-purposed wake TCSes
> +        * should be checked for not busy
> +        */
> +       if (!tcs->num_tcs)
> +               tcs = get_tcs_of_type(drv, WAKE_TCS);
> +
> +       for (m = tcs->offset; m < tcs->offset + tcs->num_tcs; m++) {
> +               if (!tcs_is_free(drv, m))

As per my documentation patch, tcs_is_free():

 * Must be called with the drv->lock held or the tcs_lock for the TCS being
 * tested.

...I think you're OK calling without the lock since you're running on
the last CPU (and you're blocking any other CPUs from coming online)
and interrupts are disabled, but please document.


> +static int rpmh_rsc_cpu_pm_callback(struct notifier_block *nfb,
> +                                   unsigned long action, void *v)
> +{
> +       struct rsc_drv *drv = container_of(nfb, struct rsc_drv, rsc_pm);
> +       unsigned long flags;
> +       int ret = NOTIFY_OK;
> +
> +       spin_lock_irqsave(&drv->pm_lock, flags);

nit: you can get away with just spin_lock().  cpu_pm_enter() is
documented to be called with interrupts already disabled so the
"irqsave" and "irqrestore" is useless.


> +       switch (action) {
> +       case CPU_PM_ENTER:
> +               cpumask_set_cpu(raw_smp_processor_id(), &drv->cpus_in_pc);
> +
> +               if (!cpumask_equal(&drv->cpus_in_pc, cpu_online_mask))

Is it ever possible to offline a CPU after doing "CPU_PM_ENTER" on it?
If so you'll never be "equal" to the online mask again.  This seems
safer:

!cpumask_subset(cpu_online_mask, &drv->cpus_in_pc)
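
To illustrate the difference, here is a small userspace sketch that models
the two masks as plain 32-bit bitmaps (bit n = CPU n). The helper names are
made up; only the subset semantics are meant to match cpumask_subset(a, b),
i.e. "every CPU in a is also in b". Whether a CPU can really be offlined in
that window is still the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint32_t mask_t;

/* True when every CPU in a is also in b (same sense as cpumask_subset). */
static bool mask_subset(mask_t a, mask_t b)
{
    return (a & ~b) == 0;
}

/* Last-CPU test as posted: both masks must match bit for bit. */
static bool last_cpu_by_equal(mask_t in_pc, mask_t online)
{
    return in_pc == online;
}

/* Last-CPU test as suggested: all online CPUs have entered PM. */
static bool last_cpu_by_subset(mask_t in_pc, mask_t online)
{
    return mask_subset(online, in_pc);
}
```

If CPU 3 enters PM and is then offlined without a matching exit, in_pc keeps
a stale bit 3. Once CPUs 0-2 have also entered, in_pc is 0xF while online is
0x7: the equality test never fires again, while the subset test still does.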


> +                       goto exit;
> +               break;
> +       case CPU_PM_ENTER_FAILED:
> +       case CPU_PM_EXIT:
> +               cpumask_clear_cpu(raw_smp_processor_id(), &drv->cpus_in_pc);
> +               goto exit;
> +       }
> +
> +       ret = rpmh_rsc_ctrlr_is_busy(drv);
> +       if (ret) {
> +               ret = NOTIFY_BAD;
> +               goto exit;
> +       }

Are you sure you can't just skip the call to rpmh_rsc_ctrlr_is_busy()?
Specifically, I'm not sure why you need to block until the last
active mode transaction is done.

a) If the active mode transaction was on its own TCS, I'd hope that
the RPMH hardware would know to let the active transitions happen
before triggering the sleep (please confirm)

b) If the active mode transaction was borrowing the wake TCS the flush
should notice and return NOTIFY_BAD anyway.

Oh!  ...but looking closer at case b) points out a problem anyway.
You need to go into rpmh_flush() and stop having it loop on 'while
(ret == -EAGAIN)'.  As per my documentation series (please review)
when -EAGAIN is returned it's important that the caller enable
interrupts for a little while before trying again.  You're not doing
that here.  Maybe if you fix it then you can get rid of your special
case rpmh_rsc_ctrlr_is_busy()?
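
Roughly what I have in mind, as a userspace sketch rather than real driver
code: the flush makes a single attempt and reports -EAGAIN, and the PM
callback turns any failure into a veto so the caller gets to re-enable
interrupts before anything is retried. All names are hypothetical and the
SKETCH_NOTIFY_* values are local stand-ins, not the kernel's constants.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the notifier return codes. */
enum { SKETCH_NOTIFY_OK = 1, SKETCH_NOTIFY_BAD = 2 };

struct sketch_ctrlr {
    bool tcs_busy;        /* a borrowed wake TCS still has a request in flight */
    int flush_attempts;   /* how many times the flush was tried */
};

/* Single-shot flush: report -EAGAIN instead of spinning on it. */
static int sketch_flush_once(struct sketch_ctrlr *c)
{
    c->flush_attempts++;
    return c->tcs_busy ? -EAGAIN : 0;
}

/* PM-enter path: one attempt only; any failure vetoes the idle entry. */
static int sketch_pm_enter(struct sketch_ctrlr *c)
{
    return sketch_flush_once(c) ? SKETCH_NOTIFY_BAD : SKETCH_NOTIFY_OK;
}
```

Since a busy TCS already makes the flush fail in this shape, a separate
"is busy" check in front of it would be redundant.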


> +       ret = rpmh_flush(&drv->client);

Back when RPMH first landed we had a whole argument about whether it
should be one file or two.  Given that it's currently two files, I
think the abstraction is that rpmh-rsc generally doesn't call into
rpmh except for the very special rpmh_tx_done().

...so seems like the whole pm_callback should move into rpmh.c and not
in rpmh-rsc.c?


> @@ -533,21 +604,20 @@ static int rpmh_probe_tcs_config(struct platform_device *pdev,
>         int i, ret, n, st = 0;
>         struct tcs_group *tcs;
>         struct resource *res;
> -       void __iomem *base;
>         char drv_id[10] = {0};
>
>         snprintf(drv_id, ARRAY_SIZE(drv_id), "drv-%d", drv->id);
>         res = platform_get_resource_byname(pdev, IORESOURCE_MEM, drv_id);
> -       base = devm_ioremap_resource(&pdev->dev, res);
> -       if (IS_ERR(base))
> -               return PTR_ERR(base);
> +       drv->base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(drv->base))
> +               return PTR_ERR(drv->base);

I don't understand this part of the change.  Why move "base" into the
driver data?  It's not needed after probe.  Please undo this.  If you
really feel like base needs to be separate please move to its own
patch.


> @@ -663,6 +734,20 @@ static int rpmh_rsc_probe(struct platform_device *pdev)
>         if (ret)
>                 return ret;
>
> +       /*
> +        * CPU PM notification are not required for controllers that support
> +        * 'HW solver' mode where they can be in autonomous mode executing low
> +        * power mode to power down.
> +        */
> +       solver_config = readl_relaxed(drv->base + DRV_SOLVER_CONFIG);
> +       solver_config &= DRV_HW_SOLVER_MASK << DRV_HW_SOLVER_SHIFT;
> +       solver_config = solver_config >> DRV_HW_SOLVER_SHIFT;
> +       if (!solver_config) {
> +               drv->rsc_pm.notifier_call = rpmh_rsc_cpu_pm_callback;
> +               cpu_pm_register_notifier(&drv->rsc_pm);
> +               spin_lock_init(&drv->pm_lock);

nit: init the spin lock before registering instead of after.


>   * @ctrlr: controller making request to flush cached data
>   *
> - * Return: -EBUSY if the controller is busy, probably waiting on a response
> - * to a RPMH request sent earlier.
> + * Return: 0 on success, error number otherwise.
>   *
> - * This function is always called from the sleep code from the last CPU
> - * that is powering down the entire system. Since no other RPMH API would be
> - * executing at this time, it is safe to run lockless.
> + * This function can either be called from sleep code on the last CPU
> + * (thus no spinlock needed) or with the ctrlr->cache_lock already held.

Now you can remove the "or with the ctrlr->cache_lock already held"
since it's no longer true.


-Doug

Doug Anderson March 27, 2020, 6:22 p.m. UTC | #3
Hi,

On Fri, Mar 27, 2020 at 4:00 AM Maulik Shah <mkshah@codeaurora.org> wrote:
>
> > >   * @ctrlr: controller making request to flush cached data
> > >   *
> > > - * Return: -EBUSY if the controller is busy, probably waiting on a response
> > > - * to a RPMH request sent earlier.
> > > + * Return: 0 on success, error number otherwise.
> > >   *
> > > - * This function is always called from the sleep code from the last CPU
> > > - * that is powering down the entire system. Since no other RPMH API would be
> > > - * executing at this time, it is safe to run lockless.
> > > + * This function can either be called from sleep code on the last CPU
> > > + * (thus no spinlock needed) or with the ctrlr->cache_lock already held.
> >
> > Now you can remove the "or with the ctrlr->cache_lock already held"
> > since it's no longer true.
>
> It can be true for other RSCs, so I kept it as it is.

I don't really understand this.  The cache_lock is only a concept in
"rpmh.c".  How could another RSC grab the cache lock?  If nothing
else, can you remove this comment until support for those other RSCs
is added and we can evaluate then?

-Doug