From patchwork Wed Mar 18 18:04:48 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Pitre X-Patchwork-Id: 45988 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-la0-f71.google.com (mail-la0-f71.google.com [209.85.215.71]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id EF3EF2153C for ; Wed, 18 Mar 2015 18:08:43 +0000 (UTC) Received: by lams18 with SMTP id s18sf8803896lam.2 for ; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:subject:date:message-id :in-reply-to:references:cc:precedence:list-id:list-unsubscribe :list-archive:list-post:list-help:list-subscribe:mime-version :content-type:content-transfer-encoding:sender:errors-to :x-original-sender:x-original-authentication-results:mailing-list; bh=UdHQBPc2apC+Pcp1TArdLb/xbtQ7Hr73SSVQRLMVRY8=; b=ZsvWTT0XUbUgG63PxOFKDSZkm6H7GFhz8XMyrWKZWv6VcvcErv0PASvqcse7CJU0ne 26aOvW9d3/++PaOm+3L6qXnhiB5nbyP+RlDjBH0NomEIQpLFGTSuPll4YbECWtH1z0xx zcR+P9p2VvHWfBqwwX+4dG+2s207WIWCPRo2qfom8JUY7zNCUVmFF/gNt3z/PKyhz3ZM aaPsDzDlhjgiCADRlUsmkuCCM3PdC6mnu0lpbf+7oaD4fPIx2kx9uUMwJb9jCrJg+YZb MkhqyBbAVL53YZJQz5eNrh5tRZ7CHpj0YdfZzmFDMO5mVo5OFBpODEf1u4GBcE/opBs0 ao1Q== X-Gm-Message-State: ALoCoQnZ4IMbeTdO0IWbehjoLOyrRUr6M1rZqGzhg470WTbNRlGL44aVlXUV0stSVjbW/kHOn8ia X-Received: by 10.194.95.67 with SMTP id di3mr11358625wjb.0.1426702122899; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.245.4 with SMTP id xk4ls76031lac.77.gmail; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) X-Received: by 10.152.163.67 with SMTP id yg3mr55791793lab.42.1426702122584; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) Received: from mail-lb0-f181.google.com (mail-lb0-f181.google.com. [209.85.217.181]) by mx.google.com with ESMTPS id nb6si13450223lbb.161.2015.03.18.11.08.42 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 18 Mar 2015 11:08:42 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.181 as permitted sender) client-ip=209.85.217.181; Received: by lbnq5 with SMTP id q5so7663077lbn.0 for ; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) X-Received: by 10.152.116.11 with SMTP id js11mr43626887lab.106.1426702122436; Wed, 18 Mar 2015 11:08:42 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.35.133 with SMTP id h5csp1319404lbj; Wed, 18 Mar 2015 11:08:41 -0700 (PDT) X-Received: by 10.70.26.129 with SMTP id l1mr162520156pdg.163.1426702120304; Wed, 18 Mar 2015 11:08:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id g3si37492217pdb.251.2015.03.18.11.08.39 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 18 Mar 2015 11:08:40 -0700 (PDT) Received-SPF: none (google.com: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org does not designate permitted sender hosts) client-ip=2001:1868:205::9; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YYIMV-00049k-Tl; Wed, 18 Mar 2015 18:06:35 +0000 Received: from relais.videotron.ca ([24.201.245.36]) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YYILa-0003lU-D3 for linux-arm-kernel@lists.infradead.org; Wed, 18 Mar 2015 18:05:40 +0000 Received: from yoda.home ([66.131.180.142]) by VL-VM-MR002.ip.videotron.ca (Oracle Communications Messaging Exchange Server 7u4-22.01 64bit (built Apr 21 2011)) with ESMTP id <0NLF00IZG68IZI00@VL-VM-MR002.ip.videotron.ca> for linux-arm-kernel@lists.infradead.org; Wed, 18 Mar 2015 14:05:07 -0400 (EDT) Received: from xanadu.home (xanadu.home [192.168.2.2]) by yoda.home (Postfix) with ESMTP id 81F622DA0553; Wed, 18 Mar 2015 14:05:06 -0400 (EDT) From: Nicolas Pitre To: linux-arm-kernel@lists.infradead.org Subject: [PATCH RFC/RFT 1/6] MCPM: move the algorithmic complexity to the core code Date: Wed, 18 Mar 2015 14:04:48 -0400 Message-id: <1426701893-25589-2-git-send-email-nicolas.pitre@linaro.org> X-Mailer: git-send-email 2.1.0 In-reply-to: <1426701893-25589-1-git-send-email-nicolas.pitre@linaro.org> References: <1426701893-25589-1-git-send-email-nicolas.pitre@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150318_110538_564210_0D8E5EED X-CRM114-Status: GOOD ( 31.60 ) X-Spam-Score: 1.0 (+) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (1.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [24.201.245.36 listed in wl.mailspike.net] -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [24.201.245.36 listed in list.dnswl.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: Kevin Hilman , arm@kernel.org, Haojian Zhuang , Abhilash Kesavan , Dave Martin , linaro-kernel@lists.linaro.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: nicolas.pitre@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.181 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 All backends are reimplementing a variation of the same CPU reference count handling. They are also responsible for driving the MCPM special low-level locking. This is needless duplication, involving algorithmic requirements that are not necessarily obvious to the uninitiated. And from past code review experience, those were all initially implemented badly. After 3 years, it is time to refactor as much common code to the core MCPM facility to make the backends as simple as possible. To avoid a flag day, the new scheme is introduced in parallel to the existing backend interface. When all backends are converted over, the compatibility interface could be removed. The new MCPM backend interface implements simpler methods addressing very platform specific tasks performed under lock protection while keeping the algorithmic complexity and race avoidance local to the core code. Signed-off-by: Nicolas Pitre --- arch/arm/common/mcpm_entry.c | 202 ++++++++++++++++++++++++++++++++++++------- arch/arm/include/asm/mcpm.h | 65 +++++++++++++- 2 files changed, 233 insertions(+), 34 deletions(-) diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c index 3c165fc2dc..5f8a52ac7e 100644 --- a/arch/arm/common/mcpm_entry.c +++ b/arch/arm/common/mcpm_entry.c @@ -55,22 +55,81 @@ bool mcpm_is_available(void) return (platform_ops) ? true : false; } +/* + * We can't use regular spinlocks. In the switcher case, it is possible + * for an outbound CPU to call power_down() after its inbound counterpart + * is already live using the same logical CPU number which trips lockdep + * debugging. + */ +static arch_spinlock_t mcpm_lock = __ARCH_SPIN_LOCK_UNLOCKED; + +static int mcpm_cpu_use_count[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER]; + +static inline bool mcpm_cluster_unused(unsigned int cluster) +{ + int i, cnt; + for (i = 0, cnt = 0; i < MAX_CPUS_PER_CLUSTER; i++) + cnt |= mcpm_cpu_use_count[cluster][i]; + return !cnt; +} + int mcpm_cpu_power_up(unsigned int cpu, unsigned int cluster) { + bool cpu_is_down, cluster_is_down; + int ret = 0; + if (!platform_ops) return -EUNATCH; /* try not to shadow power_up errors */ might_sleep(); - return platform_ops->power_up(cpu, cluster); + + /* backward compatibility callback */ + if (platform_ops->power_up) + return platform_ops->power_up(cpu, cluster); + + pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); + + /* + * Since this is called with IRQs enabled, and no arch_spin_lock_irq + * variant exists, we need to disable IRQs manually here. + */ + local_irq_disable(); + arch_spin_lock(&mcpm_lock); + + cpu_is_down = !mcpm_cpu_use_count[cluster][cpu]; + cluster_is_down = mcpm_cluster_unused(cluster); + + mcpm_cpu_use_count[cluster][cpu]++; + /* + * The only possible values are: + * 0 = CPU down + * 1 = CPU (still) up + * 2 = CPU requested to be up before it had a chance + * to actually make itself down. + * Any other value is a bug. + */ + BUG_ON(mcpm_cpu_use_count[cluster][cpu] != 1 && + mcpm_cpu_use_count[cluster][cpu] != 2); + + if (cluster_is_down) + ret = platform_ops->cluster_powerup(cluster); + if (cpu_is_down && !ret) + ret = platform_ops->cpu_powerup(cpu, cluster); + + arch_spin_unlock(&mcpm_lock); + local_irq_enable(); + return ret; } typedef void (*phys_reset_t)(unsigned long); void mcpm_cpu_power_down(void) { + unsigned int mpidr, cpu, cluster; + bool cpu_going_down, last_man; phys_reset_t phys_reset; - if (WARN_ON_ONCE(!platform_ops || !platform_ops->power_down)) - return; + if (WARN_ON_ONCE(!platform_ops)) + return; BUG_ON(!irqs_disabled()); /* @@ -79,28 +138,65 @@ void mcpm_cpu_power_down(void) */ setup_mm_for_reboot(); - platform_ops->power_down(); + /* backward compatibility callback */ + if (platform_ops->power_down) { + platform_ops->power_down(); + goto not_dead; + } + + mpidr = read_cpuid_mpidr(); + cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); + cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); + pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); + + __mcpm_cpu_going_down(cpu, cluster); + arch_spin_lock(&mcpm_lock); + BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP); + + mcpm_cpu_use_count[cluster][cpu]--; + BUG_ON(mcpm_cpu_use_count[cluster][cpu] != 0 && + mcpm_cpu_use_count[cluster][cpu] != 1); + cpu_going_down = !mcpm_cpu_use_count[cluster][cpu]; + last_man = mcpm_cluster_unused(cluster); + + if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) { + platform_ops->cpu_powerdown_prepare(cpu, cluster); + platform_ops->cluster_powerdown_prepare(cluster); + arch_spin_unlock(&mcpm_lock); + platform_ops->cluster_cache_disable(); + __mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN); + } else { + if (cpu_going_down) + platform_ops->cpu_powerdown_prepare(cpu, cluster); + arch_spin_unlock(&mcpm_lock); + /* + * If cpu_going_down is false here, that means a power_up + * request raced ahead of us. Even if we do not want to + * shut this CPU down, the caller still expects execution + * to return through the system resume entry path, like + * when the WFI is aborted due to a new IRQ or the like.. + * So let's continue with cache cleaning in all cases. + */ + platform_ops->cpu_cache_disable(); + } + + __mcpm_cpu_down(cpu, cluster); + + /* Now we are prepared for power-down, do it: */ + if (cpu_going_down) + wfi(); + +not_dead: /* * It is possible for a power_up request to happen concurrently * with a power_down request for the same CPU. In this case the - * power_down method might not be able to actually enter a - * powered down state with the WFI instruction if the power_up - * method has removed the required reset condition. The - * power_down method is then allowed to return. We must perform - * a re-entry in the kernel as if the power_up method just had - * deasserted reset on the CPU. - * - * To simplify race issues, the platform specific implementation - * must accommodate for the possibility of unordered calls to - * power_down and power_up with a usage count. Therefore, if a - * call to power_up is issued for a CPU that is not down, then - * the next call to power_down must not attempt a full shutdown - * but only do the minimum (normally disabling L1 cache and CPU - * coherency) and return just as if a concurrent power_up request - * had happened as described above. + * CPU might not be able to actually enter a powered down state + * with the WFI instruction if the power_up request has removed + * the required reset condition. We must perform a re-entry in + * the kernel as if the power_up method just had deasserted reset + * on the CPU. */ - phys_reset = (phys_reset_t)(unsigned long)virt_to_phys(cpu_reset); phys_reset(virt_to_phys(mcpm_entry_point)); @@ -125,26 +221,66 @@ int mcpm_wait_for_cpu_powerdown(unsigned int cpu, unsigned int cluster) void mcpm_cpu_suspend(u64 expected_residency) { - phys_reset_t phys_reset; - - if (WARN_ON_ONCE(!platform_ops || !platform_ops->suspend)) + if (WARN_ON_ONCE(!platform_ops)) return; - BUG_ON(!irqs_disabled()); - /* Very similar to mcpm_cpu_power_down() */ - setup_mm_for_reboot(); - platform_ops->suspend(expected_residency); - phys_reset = (phys_reset_t)(unsigned long)virt_to_phys(cpu_reset); - phys_reset(virt_to_phys(mcpm_entry_point)); - BUG(); + /* backward compatibility callback */ + if (platform_ops->suspend) { + phys_reset_t phys_reset; + BUG_ON(!irqs_disabled()); + setup_mm_for_reboot(); + platform_ops->suspend(expected_residency); + phys_reset = (phys_reset_t)(unsigned long)virt_to_phys(cpu_reset); + phys_reset(virt_to_phys(mcpm_entry_point)); + BUG(); + } + + /* Some platforms might have to enable special resume modes, etc. */ + if (platform_ops->cpu_suspend_prepare) { + unsigned int mpidr = read_cpuid_mpidr(); + unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); + unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); + arch_spin_lock(&mcpm_lock); + platform_ops->cpu_suspend_prepare(cpu, cluster); + arch_spin_unlock(&mcpm_lock); + } + mcpm_cpu_power_down(); } int mcpm_cpu_powered_up(void) { + unsigned int mpidr, cpu, cluster; + bool cpu_was_down, first_man; + unsigned long flags; + if (!platform_ops) return -EUNATCH; - if (platform_ops->powered_up) + + /* backward compatibility callback */ + if (platform_ops->powered_up) { platform_ops->powered_up(); + return 0; + } + + mpidr = read_cpuid_mpidr(); + cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); + cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); + local_irq_save(flags); + arch_spin_lock(&mcpm_lock); + + cpu_was_down = !mcpm_cpu_use_count[cluster][cpu]; + first_man = mcpm_cluster_unused(cluster); + + if (first_man && platform_ops->cluster_is_up) + platform_ops->cluster_is_up(cluster); + if (cpu_was_down) + mcpm_cpu_use_count[cluster][cpu] = 1; + if (platform_ops->cpu_is_up) + platform_ops->cpu_is_up(cpu, cluster); + + arch_spin_unlock(&mcpm_lock); + local_irq_restore(flags); + return 0; } @@ -334,8 +470,10 @@ int __init mcpm_sync_init( } mpidr = read_cpuid_mpidr(); this_cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); - for_each_online_cpu(i) + for_each_online_cpu(i) { + mcpm_cpu_use_count[this_cluster][i] = 1; mcpm_sync.clusters[this_cluster].cpus[i].cpu = CPU_UP; + } mcpm_sync.clusters[this_cluster].cluster = CLUSTER_UP; sync_cache_w(&mcpm_sync); diff --git a/arch/arm/include/asm/mcpm.h b/arch/arm/include/asm/mcpm.h index 3446f6a1d9..50b378f59e 100644 --- a/arch/arm/include/asm/mcpm.h +++ b/arch/arm/include/asm/mcpm.h @@ -171,12 +171,73 @@ void mcpm_cpu_suspend(u64 expected_residency); int mcpm_cpu_powered_up(void); /* - * Platform specific methods used in the implementation of the above API. + * Platform specific callbacks used in the implementation of the above API. + * + * cpu_powerup: + * Make given CPU runable. Called with MCPM lock held and IRQs disabled. + * The given cluster is assumed to be set up (cluster_powerup would have + * been called beforehand). Must return 0 for success or negative error code. + * + * cluster_powerup: + * Set up power for given cluster. Called with MCPM lock held and IRQs + * disabled. Called before first cpu_powerup when cluster is down. Must + * return 0 for success or negative error code. + * + * cpu_suspend_prepare: + * Special suspend configuration. Called on target CPU with MCPM lock held + * and IRQs disabled. This callback is optional. If provided, it is called + * before cpu_powerdown_prepare. + * + * cpu_powerdown_prepare: + * Configure given CPU for power down. Called on target CPU with MCPM lock + * held and IRQs disabled. Power down must be effective only at the next WFI instruction. + * + * cluster_powerdown_prepare: + * Configure given cluster for power down. Called on one CPU from target + * cluster with MCPM lock held and IRQs disabled. A cpu_powerdown_prepare + * for each CPU in the cluster has happened when this occurs. + * + * cpu_cache_disable: + * Clean and disable CPU level cache for the calling CPU. Called on with IRQs + * disabled only. The CPU is no longer cache coherent with the rest of the + * system when this returns. + * + * cluster_cache_disable: + * Clean and disable the cluster wide cache as well as the CPU level cache + * for the calling CPU. No call to cpu_cache_disable will happen for this + * CPU. Called with IRQs disabled and only when all the other CPUs are done + * with their own cpu_cache_disable. The cluster is no longer cache coherent + * with the rest of the system when this returns. + * + * cpu_is_up: + * Called on given CPU after it has been powered up or resumed. The MCPM lock + * is held and IRQs disabled. This callback is optional. + * + * cluster_is_up: + * Called by the first CPU to be powered up or resumed in given cluster. + * The MCPM lock is held and IRQs disabled. This callback is optional. If + * provided, it is called before cpu_is_up for that CPU. + * + * wait_for_powerdown: + * Wait until given CPU is powered down. This is called in sleeping context. + * Some reasonable timeout must be considered. Must return 0 for success or + * negative error code. */ struct mcpm_platform_ops { + int (*cpu_powerup)(unsigned int cpu, unsigned int cluster); + int (*cluster_powerup)(unsigned int cluster); + void (*cpu_suspend_prepare)(unsigned int cpu, unsigned int cluster); + void (*cpu_powerdown_prepare)(unsigned int cpu, unsigned int cluster); + void (*cluster_powerdown_prepare)(unsigned int cluster); + void (*cpu_cache_disable)(void); + void (*cluster_cache_disable)(void); + void (*cpu_is_up)(unsigned int cpu, unsigned int cluster); + void (*cluster_is_up)(unsigned int cluster); + int (*wait_for_powerdown)(unsigned int cpu, unsigned int cluster); + + /* deprecated callbacks */ int (*power_up)(unsigned int cpu, unsigned int cluster); void (*power_down)(void); - int (*wait_for_powerdown)(unsigned int cpu, unsigned int cluster); void (*suspend)(u64); void (*powered_up)(void); };