From patchwork Tue Oct 13 08:09:04 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Viresh Kumar <viresh.kumar@linaro.org>
X-Patchwork-Id: 54827
Return-Path: <patchwork-forward+bncBCRIZROPVQPBBVHY6KYAKGQEMREPKTA@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-lb0-f199.google.com (mail-lb0-f199.google.com
 [209.85.217.199])
 by patches.linaro.org (Postfix) with ESMTPS id D8E9623001
 for <linaro@patches.linaro.org>; Tue, 13 Oct 2015 08:09:57 +0000 (UTC)
Received: by lbbti1 with SMTP id ti1sf4994323lbb.3
 for <linaro@patches.linaro.org>; Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:delivered-to:from:to:cc:subject
 :date:message-id:in-reply-to:references:in-reply-to:references
 :sender:precedence:list-id:x-original-sender
 :x-original-authentication-results:mailing-list:list-post:list-help
 :list-archive:list-unsubscribe;
 bh=kWYtCdlsn2V68TOU2mjyr6B5qMCl0wd0sFDAoPl5ozU=;
 b=X6KsEggmfk1yzTv6j9s2Fiui1IpX0cF+M2yAdksEl0QPv6OOi2Z9PMLEY0EX9/Dz/q
 FXvRhyY9aOw2SQrFTZlGLrXPRjwOFIu8nvNMPhiUihuYzN/kq55E3JUjG0Ah5vxAcbLq
 fwOEaPsNBKR5MZKdrD+VQ9lf9xKwIXcEpgoUlJLOC2HstHVfjovIOEKNRZXOpmq2SA18
 lvXuZpU7tI0eZ8KKXUCZ1tguIkcJzwhiwQKOs49H320xOhiXvHsNdRH+rseqpPd/DoiE
 orJGAx0oRfmrQ4bIqcpRFoUfWKl9MXD7EATexK9kkevKjtLFr3Fy6OyQp4i2GoNTzk+Z
 JMAA==
X-Gm-Message-State: ALoCoQnvMT8s6tp4nsOaO7DHUkAd3xC1zKW1BBOZcoZ2Aw8ec3WAWOnGlrLPjGIr1/qbrzZD1yYj
X-Received: by 10.112.209.73 with SMTP id mk9mr6596126lbc.14.1444723796842; 
 Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
MIME-Version: 1.0
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.25.38.206 with SMTP id m197ls562896lfm.55.gmail; Tue, 13 Oct
 2015 01:09:56 -0700 (PDT)
X-Received: by 10.25.90.83 with SMTP id o80mr9164698lfb.47.1444723796670;
 Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
Received: from mail-lb0-f176.google.com (mail-lb0-f176.google.com.
 [209.85.217.176]) by mx.google.com with ESMTPS id
 d5si1358221lbi.126.2015.10.13.01.09.56
 for <patchwork-forward@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.217.176 as permitted sender) client-ip=209.85.217.176; 
Received: by lbwr8 with SMTP id r8so10303111lbw.2
 for <patchwork-forward@linaro.org>;
 Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
X-Received: by 10.112.199.170 with SMTP id jl10mr8858103lbc.88.1444723796532; 
 Tue, 13 Oct 2015 01:09:56 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.112.59.35 with SMTP id w3csp1974505lbq;
 Tue, 13 Oct 2015 01:09:55 -0700 (PDT)
X-Received: by 10.107.166.201 with SMTP id p192mr38282810ioe.0.1444723795515; 
 Tue, 13 Oct 2015 01:09:55 -0700 (PDT)
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
 by mx.google.com with ESMTP id
 jf2si11615713igb.62.2015.10.13.01.09.55; 
 Tue, 13 Oct 2015 01:09:55 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67
 as permitted sender) client-ip=209.132.180.67; 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1752896AbbJMIJs (ORCPT <rfc822;pingbo.wen@linaro.org>
 + 30 others); Tue, 13 Oct 2015 04:09:48 -0400
Received: from mail-pa0-f41.google.com ([209.85.220.41]:33973 "EHLO
 mail-pa0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752804AbbJMIJn (ORCPT
 <rfc822;linux-kernel@vger.kernel.org>);
 Tue, 13 Oct 2015 04:09:43 -0400
Received: by padhy16 with SMTP id hy16so14225222pad.1
 for <linux-kernel@vger.kernel.org>;
 Tue, 13 Oct 2015 01:09:42 -0700 (PDT)
X-Received: by 10.68.92.164 with SMTP id cn4mr38587853pbb.156.1444723782844; 
 Tue, 13 Oct 2015 01:09:42 -0700 (PDT)
Received: from localhost ([223.227.239.124]) by smtp.gmail.com with ESMTPSA id
 tb9sm2264958pab.13.2015.10.13.01.09.40
 (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
 Tue, 13 Oct 2015 01:09:41 -0700 (PDT)
From: Viresh Kumar <viresh.kumar@linaro.org>
To: Rafael Wysocki <rjw@rjwysocki.net>
Cc: linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org,
 Viresh Kumar <viresh.kumar@linaro.org>,
 linux-kernel@vger.kernel.org (open list)
Subject: [PATCH V3 4/5] cpufreq: governor: Quit work-handlers early if
 governor is stopped
Date: Tue, 13 Oct 2015 13:39:04 +0530
Message-Id: <1e579d2bf8dbee09295725cda37bd92222fe61fb.1444723240.git.viresh.kumar@linaro.org>
X-Mailer: git-send-email 2.4.0
In-Reply-To: <cover.1444723240.git.viresh.kumar@linaro.org>
References: <cover.1444723240.git.viresh.kumar@linaro.org>
In-Reply-To: <cover.1444723240.git.viresh.kumar@linaro.org>
References: <cover.1444723240.git.viresh.kumar@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: list
List-ID: <patchwork-forward.linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Removed-Original-Auth: Dkim didn't pass.
X-Original-Sender: viresh.kumar@linaro.org
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.217.176 as permitted sender)
 smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org
Mailing-list: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
X-Google-Group-Id: 836684582541
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Unsubscribe: <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>, 
 <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>

cpufreq_governor_lock is abused by using it outside of cpufreq core,
i.e. in cpufreq-governors. But we didn't had a better solution to the
problem (described later) at that point of time, and following was the
only acceptable solution:

6f1e4efd882e ("cpufreq: Fix timer/workqueue corruption by protecting
reading governor_enabled")

The cpufreq governor core is fixed against possible races now and things
are in much better shape.

The original problem:

When a CPU is hot unplugged, we cancel delayed works for all
policy->cpus via gov_cancel_work(). If the work is already running on
any CPU, the workqueue code will wait for the work to finish, to prevent
the work items from re-queuing themselves.

This works most of the time, except for the case where the work handler
determines that it should adjust the delay for all other CPUs, that the
policy is managing. When this happens, the canceling CPU will cancel its
own work but can queue up the works on other policy->cpus.

For example, consider CPU 0-4 in a policy and we called
gov_cancel_work() for them. Workqueue core canceled the works for 0-3
and is waiting for the handler to finish on CPU4. At that time, handler
on CPU4 can restart works on CPU 0-3 again. Which makes 0-3 run works,
which the governor core thinks are canceled.

To fix that in a different (non-hacky) way, set set shared->policy to
false before trying to cancel the work. It should be updated within
timer_mutex, which will prevent the work-handlers to start. Once the
work-handlers finds that we are already trying to stop the governor, it
will exit early. And that will prevent queuing of works again as well.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 750626d8fb03..931424ca96d9 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -171,10 +171,6 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 {
 	int i;
 
-	mutex_lock(&cpufreq_governor_lock);
-	if (!policy->governor_enabled)
-		goto out_unlock;
-
 	if (!all_cpus) {
 		/*
 		 * Use raw_smp_processor_id() to avoid preemptible warnings.
@@ -188,9 +184,6 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 		for_each_cpu(i, policy->cpus)
 			__gov_queue_work(i, dbs_data, delay);
 	}
-
-out_unlock:
-	mutex_unlock(&cpufreq_governor_lock);
 }
 EXPORT_SYMBOL_GPL(gov_queue_work);
 
@@ -229,13 +222,24 @@ static void dbs_timer(struct work_struct *work)
 	struct cpu_dbs_info *cdbs = container_of(work, struct cpu_dbs_info,
 						 dwork.work);
 	struct cpu_common_dbs_info *shared = cdbs->shared;
-	struct cpufreq_policy *policy = shared->policy;
-	struct dbs_data *dbs_data = policy->governor_data;
+	struct cpufreq_policy *policy;
+	struct dbs_data *dbs_data;
 	unsigned int sampling_rate, delay;
 	bool modify_all = true;
 
 	mutex_lock(&shared->timer_mutex);
 
+	policy = shared->policy;
+
+	/*
+	 * Governor might already be disabled and there is no point continuing
+	 * with the work-handler.
+	 */
+	if (!policy)
+		goto unlock;
+
+	dbs_data = policy->governor_data;
+
 	if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -252,6 +256,7 @@ static void dbs_timer(struct work_struct *work)
 	delay = dbs_data->cdata->gov_dbs_timer(cdbs, dbs_data, modify_all);
 	gov_queue_work(dbs_data, policy, delay, modify_all);
 
+unlock:
 	mutex_unlock(&shared->timer_mutex);
 }
 
@@ -488,9 +493,17 @@ static int cpufreq_governor_stop(struct cpufreq_policy *policy,
 	if (!shared || !shared->policy)
 		return -EBUSY;
 
+	/*
+	 * Work-handler must see this updated, as it should not proceed any
+	 * further after governor is disabled. And so timer_mutex is taken while
+	 * updating this value.
+	 */
+	mutex_lock(&shared->timer_mutex);
+	shared->policy = NULL;
+	mutex_unlock(&shared->timer_mutex);
+
 	gov_cancel_work(dbs_data, policy);
 
-	shared->policy = NULL;
 	mutex_destroy(&shared->timer_mutex);
 	return 0;
 }