From patchwork Fri Dec 20 18:59:46 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Stultz X-Patchwork-Id: 22697 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-ig0-f199.google.com (mail-ig0-f199.google.com [209.85.213.199]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 0B09720555 for ; Fri, 20 Dec 2013 19:00:01 +0000 (UTC) Received: by mail-ig0-f199.google.com with SMTP id hk11sf12126899igb.2 for ; Fri, 20 Dec 2013 11:00:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:delivered-to:from:to:cc:subject :date:message-id:in-reply-to:references:x-original-sender :x-original-authentication-results:precedence:mailing-list:list-id :list-post:list-help:list-archive:list-unsubscribe; bh=nLKrEyNbfSZhQ0Bb2M9oABdrsfpzqxjXh0Nc93sH7+o=; b=Vj0gIg/nPATkMVpcoXJUve3qXYMciuHZmHF9oWiG5Zr6HG6OGOZBUh72VRivOs00rR lXnzm9Wtv3oF4hcWk3j+wRn2SyvovcpIkupmQBvoDZnbEOVomVB4/8iR9DJ+NAf1GF6V za0JHoF+I17UoSny0zmZDlVThN0UbMBwKaaCAeYTfZQosUjCx9+H4/0ru5jPoMY1qUFz rIeZR5azAQY+BoSrBIVQB4rKw/JxieAl9Mvo9MzKgUDfFU2NkTVNs8LSuv8AkWVhMZcu PMUYAZcR0GOsS+hkd7tYKKDYs1gp/NlD6cDjFDf+hFk7+G+mvhKs3J+spxbjtb3jYiTU PNdw== X-Gm-Message-State: ALoCoQk+SFH9vjswg20ekWgxx15mbxRAzZ7KG4tHW/ddYGCAH/Stf9p4YK1PO/3gyEiJaEFvhj8X X-Received: by 10.43.163.202 with SMTP id mp10mr3798734icc.19.1387566001295; Fri, 20 Dec 2013 11:00:01 -0800 (PST) MIME-Version: 1.0 X-BeenThere: patchwork-forward@linaro.org Received: by 10.49.86.232 with SMTP id s8ls868326qez.67.gmail; Fri, 20 Dec 2013 11:00:01 -0800 (PST) X-Received: by 10.58.100.197 with SMTP id fa5mr10905veb.24.1387566001096; Fri, 20 Dec 2013 11:00:01 -0800 (PST) Received: from mail-vb0-f41.google.com (mail-vb0-f41.google.com [209.85.212.41]) by mx.google.com with ESMTPS id tl2si1694074vdc.155.2013.12.20.11.00.01 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 20 Dec 2013 11:00:01 -0800 (PST) Received-SPF: neutral (google.com: 209.85.212.41 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) client-ip=209.85.212.41; Received: by mail-vb0-f41.google.com with SMTP id p14so1671652vbm.14 for ; Fri, 20 Dec 2013 11:00:01 -0800 (PST) X-Received: by 10.52.164.16 with SMTP id ym16mr4517vdb.39.1387566001010; Fri, 20 Dec 2013 11:00:01 -0800 (PST) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patches@linaro.org Received: by 10.59.13.131 with SMTP id ey3csp83375ved; Fri, 20 Dec 2013 11:00:00 -0800 (PST) X-Received: by 10.66.119.136 with SMTP id ku8mr10331902pab.121.1387565999862; Fri, 20 Dec 2013 10:59:59 -0800 (PST) Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by mx.google.com with ESMTPS id xh9si5935944pab.35.2013.12.20.10.59.59 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 20 Dec 2013 10:59:59 -0800 (PST) Received-SPF: neutral (google.com: 209.85.220.53 is neither permitted nor denied by best guess record for domain of john.stultz@linaro.org) client-ip=209.85.220.53; Received: by mail-pa0-f53.google.com with SMTP id hz1so2986315pad.40 for ; Fri, 20 Dec 2013 10:59:59 -0800 (PST) X-Received: by 10.68.66.1 with SMTP id b1mr10388836pbt.43.1387565999436; Fri, 20 Dec 2013 10:59:59 -0800 (PST) Received: from localhost.localdomain (c-67-170-153-23.hsd1.or.comcast.net. [67.170.153.23]) by mx.google.com with ESMTPSA id z10sm20919924pas.6.2013.12.20.10.59.57 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 20 Dec 2013 10:59:58 -0800 (PST) From: John Stultz To: LKML Cc: John Stultz , Thomas Gleixner , Prarit Bhargava , Richard Cochran , Ingo Molnar , Sasha Levin , stable Subject: [PATCH 3/3] timekeeping: Avoid possible deadlock from clock_was_set_delayed Date: Fri, 20 Dec 2013 10:59:46 -0800 Message-Id: <1387565986-11738-4-git-send-email-john.stultz@linaro.org> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1387565986-11738-1-git-send-email-john.stultz@linaro.org> References: <1387565986-11738-1-git-send-email-john.stultz@linaro.org> X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: john.stultz@linaro.org X-Original-Authentication-Results: mx.google.com; spf=neutral (google.com: 209.85.212.41 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Precedence: list Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org List-ID: X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , As part of normal operaions, the hrtimer subsystem frequently calls into the timekeeping code, creating a locking order of hrtimer locks -> timekeeping locks clock_was_set_delayed() was suppoed to allow us to avoid deadlocks between the timekeeping the hrtimer subsystem, so that we could notify the hrtimer subsytem the time had changed while holding the timekeeping locks. This was done by scheduling delayed work that would run later once we were out of the timekeeing code. But unfortunately the lock chains are complex enoguh that in scheduling delayed work, we end up eventually trying to grab an hrtimer lock. Sasha Levin noticed this in testing when the new seqlock lockdep enablement triggered the following (somewhat abrieviated) message: [ 251.100221] ====================================================== [ 251.100221] [ INFO: possible circular locking dependency detected ] [ 251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted [ 251.101967] ------------------------------------------------------- [ 251.101967] kworker/10:1/4506 is trying to acquire lock: [ 251.101967] (timekeeper_seq){----..}, at: [] retrigger_next_event+0x56/0x70 [ 251.101967] [ 251.101967] but task is already holding lock: [ 251.101967] (hrtimer_bases.lock#11){-.-...}, at: [] retrigger_next_event+0x3c/0x70 [ 251.101967] [ 251.101967] which lock already depends on the new lock. [ 251.101967] [ 251.101967] [ 251.101967] the existing dependency chain (in reverse order) is: [ 251.101967] -> #5 (hrtimer_bases.lock#11){-.-...}: [snipped] -> #4 (&rt_b->rt_runtime_lock){-.-...}: [snipped] -> #3 (&rq->lock){-.-.-.}: [snipped] -> #2 (&p->pi_lock){-.-.-.}: [snipped] -> #1 (&(&pool->lock)->rlock){-.-...}: [ 251.101967] [] validate_chain+0x6c3/0x7b0 [ 251.101967] [] __lock_acquire+0x4ad/0x580 [ 251.101967] [] lock_acquire+0x182/0x1d0 [ 251.101967] [] _raw_spin_lock+0x40/0x80 [ 251.101967] [] __queue_work+0x1a9/0x3f0 [ 251.101967] [] queue_work_on+0x98/0x120 [ 251.101967] [] clock_was_set_delayed+0x21/0x30 [ 251.101967] [] do_adjtimex+0x111/0x160 [ 251.101967] [] compat_sys_adjtimex+0x41/0x70 [ 251.101967] [] ia32_sysret+0x0/0x5 [ 251.101967] -> #0 (timekeeper_seq){----..}: [snipped] [ 251.101967] other info that might help us debug this: [ 251.101967] [ 251.101967] Chain exists of: timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11 [ 251.101967] Possible unsafe locking scenario: [ 251.101967] [ 251.101967] CPU0 CPU1 [ 251.101967] ---- ---- [ 251.101967] lock(hrtimer_bases.lock#11); [ 251.101967] lock(&rt_b->rt_runtime_lock); [ 251.101967] lock(hrtimer_bases.lock#11); [ 251.101967] lock(timekeeper_seq); [ 251.101967] [ 251.101967] *** DEADLOCK *** [ 251.101967] [ 251.101967] 3 locks held by kworker/10:1/4506: [ 251.101967] #0: (events){.+.+.+}, at: [] process_one_work+0x200/0x530 [ 251.101967] #1: (hrtimer_work){+.+...}, at: [] process_one_work+0x200/0x530 [ 251.101967] #2: (hrtimer_bases.lock#11){-.-...}, at: [] retrigger_next_event+0x3c/0x70 [ 251.101967] [ 251.101967] stack backtrace: [ 251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 [ 251.101967] Workqueue: events clock_was_set_work So the best solution is to avoid calling clock_was_set_delayed() while holding the timekeeping lock, and instead using a flag variable to decide if we should call clock_was_set() once we've released the locks. This works for the case here, where the do_adjtimex() was the deadlock trigger point. Unfortuantely, in update_wall_time() we still hold the jiffies lock, which would deadlock with the ipi triggered by clock_was_set(), preventing us from calling it even after we drop the timekeeping lock. So instead call clock_was_set_delayed() at that point. Cc: Thomas Gleixner Cc: Prarit Bhargava Cc: Richard Cochran Cc: Ingo Molnar Cc: Sasha Levin Cc: stable #3.10+ Reported-by: Sasha Levin Tested-by: Sasha Levin Signed-off-by: John Stultz --- kernel/time/timekeeping.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index d4c78f0..41564cc 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -1278,7 +1278,6 @@ static inline unsigned int accumulate_nsecs_to_secs(struct timekeeper *tk) __timekeeping_set_tai_offset(tk, tk->tai_offset - leap); - clock_was_set_delayed(); clock_was_set = TK_CLOCK_WAS_SET; } } @@ -1442,6 +1441,19 @@ static void update_wall_time(void) write_seqcount_end(&timekeeper_seq); out: raw_spin_unlock_irqrestore(&timekeeper_lock, flags); + if (clock_was_set) { + /* + * XXX - I'd rather we just call clock_was_set(), but + * since we're currently holding the jiffies lock, calling + * clock_was_set would trigger an ipi which would then grab + * the jiffies lock and we'd deadlock. :( + * The right solution should probably be droping + * the jiffies lock before calling update_wall_time + * but that requires some rework of the tick sched + * code. + */ + clock_was_set_delayed(); + } } /** @@ -1702,11 +1714,13 @@ int do_adjtimex(struct timex *txc) if (tai != orig_tai) { __timekeeping_set_tai_offset(tk, tai); timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET); - clock_was_set_delayed(); } write_seqcount_end(&timekeeper_seq); raw_spin_unlock_irqrestore(&timekeeper_lock, flags); + if (tai != orig_tai) + clock_was_set(); + ntp_notify_cmos_timer(); return ret;