From patchwork Sat Jan 9 02:05:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 360167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A9AAC4332D for ; Sat, 9 Jan 2021 02:06:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E6C1523B09 for ; Sat, 9 Jan 2021 02:06:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726642AbhAICGc (ORCPT ); Fri, 8 Jan 2021 21:06:32 -0500 Received: from mail.kernel.org ([198.145.29.99]:41500 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726571AbhAICGb (ORCPT ); Fri, 8 Jan 2021 21:06:31 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 37FCA23AC2; Sat, 9 Jan 2021 02:05:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1610157950; bh=QHfGc37y+RJrvHoN5PghJVWpKd0a96DMtNk0V+pDWzU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WMc+KZAtrsmNDDcqxe+XSFxq5ypEOt1JbOY2t9CfEO4fZHof9U1yifzQJe74DvKjv gApJGnRVxYcwYV5Cn7UFLyoD3KYySEjvrs1VYl3ptFcJH+YWDkqlu/pRd0eWSBKeM7 dTiCLVhsvBJl9LU7haVmjIITosHYDHYt05kC9VXtSIJ7piZ0EK6at4cMeIMHLJGOBJ r0ERgkx07SXSjI9SEO96S99VNIPUhxYmQPZIfHQh7W9/JmoNHX3OheTa20UCOUTIP5 9ShxNJMWn65me9H7vWDJrseKcbNuyQvHup+NcSz/D+vmk3eCDaTSNmt/WPiKFmsvxo tE3rI+wA4Aayg== From: Frederic Weisbecker To: Peter Zijlstra , "Paul E . McKenney" Cc: LKML , Frederic Weisbecker , "Rafael J . Wysocki" , Ingo Molnar , Thomas Gleixner , stable@vger.kernel.org Subject: [RFC PATCH 4/8] rcu/nocb: Trigger self-IPI on late deferred wake up before user resume Date: Sat, 9 Jan 2021 03:05:32 +0100 Message-Id: <20210109020536.127953-5-frederic@kernel.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210109020536.127953-1-frederic@kernel.org> References: <20210109020536.127953-1-frederic@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced. Unfortunately the call to rcu_user_enter() is already past the last rescheduling opportunity before we resume to userspace or to guest mode. We may escape there with the woken task ignored. The ultimate resort to fix every callsites is to trigger a self-IPI (nohz_full depends on IRQ_WORK) that will trigger a reschedule on IRQ tail or guest exit. Eventually every site that want a saner treatment will need to carefully place a call to rcu_nocb_flush_deferred_wakeup() before the last explicit need_resched() check upon resume. Reported-by: Paul E. McKenney Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf) Cc: stable@vger.kernel.org Cc: Rafael J. Wysocki Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Ingo Molnar Signed-off-by: Frederic Weisbecker --- kernel/rcu/tree.c | 22 +++++++++++++++++++++- kernel/rcu/tree.h | 2 +- kernel/rcu/tree_plugin.h | 25 ++++++++++++++++--------- 3 files changed, 38 insertions(+), 11 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index b6e1377774e3..2920dfc9f58c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -676,6 +676,18 @@ void rcu_idle_enter(void) EXPORT_SYMBOL_GPL(rcu_idle_enter); #ifdef CONFIG_NO_HZ_FULL + +/* + * An empty function that will trigger a reschedule on + * IRQ tail once IRQs get re-enabled on userspace resume. + */ +static void late_wakeup_func(struct irq_work *work) +{ +} + +static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) = + IRQ_WORK_INIT(late_wakeup_func); + /** * rcu_user_enter - inform RCU that we are resuming userspace. * @@ -692,9 +704,17 @@ noinstr void rcu_user_enter(void) struct rcu_data *rdp = this_cpu_ptr(&rcu_data); lockdep_assert_irqs_disabled(); - do_nocb_deferred_wakeup(rdp); + /* + * We may be past the last rescheduling opportunity in the entry code. + * Trigger a self IPI that will fire and reschedule once we resume to + * user/guest mode. + */ + if (do_nocb_deferred_wakeup(rdp) && need_resched()) + irq_work_queue(this_cpu_ptr(&late_wakeup_work)); + rcu_eqs_enter(true); } + #endif /* CONFIG_NO_HZ_FULL */ /** diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 7708ed161f4a..9226f4021a36 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -433,7 +433,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp, static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty, unsigned long flags); static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp); -static void do_nocb_deferred_wakeup(struct rcu_data *rdp); +static bool do_nocb_deferred_wakeup(struct rcu_data *rdp); static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp); static void rcu_spawn_cpu_nocb_kthread(int cpu); static void __init rcu_spawn_nocb_kthreads(void); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index d5b38c28abd1..384856e4d13e 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1631,8 +1631,8 @@ bool rcu_is_nocb_cpu(int cpu) * Kick the GP kthread for this NOCB group. Caller holds ->nocb_lock * and this function releases it. */ -static void wake_nocb_gp(struct rcu_data *rdp, bool force, - unsigned long flags) +static bool wake_nocb_gp(struct rcu_data *rdp, bool force, + unsigned long flags) __releases(rdp->nocb_lock) { bool needwake = false; @@ -1643,7 +1643,7 @@ static void wake_nocb_gp(struct rcu_data *rdp, bool force, trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("AlreadyAwake")); rcu_nocb_unlock_irqrestore(rdp, flags); - return; + return false; } del_timer(&rdp->nocb_timer); rcu_nocb_unlock_irqrestore(rdp, flags); @@ -1656,6 +1656,8 @@ static void wake_nocb_gp(struct rcu_data *rdp, bool force, raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags); if (needwake) wake_up_process(rdp_gp->nocb_gp_kthread); + + return needwake; } /* @@ -2152,20 +2154,23 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp) } /* Do a deferred wakeup of rcu_nocb_kthread(). */ -static void do_nocb_deferred_wakeup_common(struct rcu_data *rdp) +static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp) { unsigned long flags; int ndw; + int ret; rcu_nocb_lock_irqsave(rdp, flags); if (!rcu_nocb_need_deferred_wakeup(rdp)) { rcu_nocb_unlock_irqrestore(rdp, flags); - return; + return false; } ndw = READ_ONCE(rdp->nocb_defer_wakeup); WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); - wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags); + ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags); trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake")); + + return ret; } /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */ @@ -2181,10 +2186,11 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t) * This means we do an inexact common-case check. Note that if * we miss, ->nocb_timer will eventually clean things up. */ -static void do_nocb_deferred_wakeup(struct rcu_data *rdp) +static bool do_nocb_deferred_wakeup(struct rcu_data *rdp) { if (rcu_nocb_need_deferred_wakeup(rdp)) - do_nocb_deferred_wakeup_common(rdp); + return do_nocb_deferred_wakeup_common(rdp); + return false; } void rcu_nocb_flush_deferred_wakeup(void) @@ -2523,8 +2529,9 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp) return false; } -static void do_nocb_deferred_wakeup(struct rcu_data *rdp) +static bool do_nocb_deferred_wakeup(struct rcu_data *rdp) { + return false; } static void rcu_spawn_cpu_nocb_kthread(int cpu)