From patchwork Mon Aug 8 12:55:53 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, Michal Hocko, Roman Gushchin, Johannes Weiner, Shakeel Butt, David Oberhollenzer
Subject: [PATCH v2 01/10] mm/memcg: Revert ("mm/memcg: optimize user context object stock access")
Date: Mon, 8 Aug 2022 14:55:53 +0200
Message-Id: <20220808125602.97747-2-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Michal Hocko

The optimisation is based on a micro benchmark where local_irq_save() is
more expensive than a preempt_disable(). There is no evidence that it is
visible in a real-world workload, and there are CPUs where the opposite
is true (local_irq_save() is cheaper than preempt_disable()).

Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
where preempt_disable() is optimized away. There is no improvement with
PREEMPT_DYNAMIC since the preemption counter is always available.

The optimisation also makes the PREEMPT_RT integration more complicated,
since most of its assumptions do not hold on PREEMPT_RT. Revert the
optimisation since it complicates the PREEMPT_RT integration and the
improvement is hardly visible.
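For reference, the two-stock scheme removed by this revert was reached
through a pair of helpers; a condensed copy of the deleted accessors
(comments added here):

	static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
	{
		struct memcg_stock_pcp *stock;

		if (likely(in_task())) {
			/* task context: disabling preemption is enough */
			*pflags = 0UL;
			preempt_disable();
			stock = this_cpu_ptr(&memcg_stock);
			return &stock->task_obj;
		}

		/* interrupt context: interrupts must be disabled */
		local_irq_save(*pflags);
		stock = this_cpu_ptr(&memcg_stock);
		return &stock->irq_obj;
	}

	static inline void put_obj_stock(unsigned long flags)
	{
		if (likely(in_task()))
			preempt_enable();
		else
			local_irq_restore(flags);
	}

On PREEMPT_NONE the preempt_disable()/preempt_enable() pair compiles away
entirely, which is exactly the benefit this revert gives up.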
[ bigeasy: Patch body around Michal's diff ]

Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
Signed-off-by: Michal Hocko
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Roman Gushchin
Acked-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Acked-by: Michal Hocko
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 94 ++++++++++++++-----------------------------------
 1 file changed, 27 insertions(+), 67 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 971546bb99e0..a07e068989b4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2102,23 +2102,17 @@ void unlock_page_memcg(struct page *page)
 }
 EXPORT_SYMBOL(unlock_page_memcg);
 
-struct obj_stock {
+struct memcg_stock_pcp {
+	struct mem_cgroup *cached; /* this never be root cgroup */
+	unsigned int nr_pages;
+
 #ifdef CONFIG_MEMCG_KMEM
 	struct obj_cgroup *cached_objcg;
 	struct pglist_data *cached_pgdat;
 	unsigned int nr_bytes;
 	int nr_slab_reclaimable_b;
 	int nr_slab_unreclaimable_b;
-#else
-	int dummy[0];
 #endif
-};
-
-struct memcg_stock_pcp {
-	struct mem_cgroup *cached; /* this never be root cgroup */
-	unsigned int nr_pages;
-	struct obj_stock task_obj;
-	struct obj_stock irq_obj;
 
 	struct work_struct work;
 	unsigned long flags;
@@ -2128,12 +2122,12 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 #ifdef CONFIG_MEMCG_KMEM
-static void drain_obj_stock(struct obj_stock *stock);
+static void drain_obj_stock(struct memcg_stock_pcp *stock);
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg);
 
 #else
-static inline void drain_obj_stock(struct obj_stock *stock)
+static inline void drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 }
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@ -2144,41 +2138,6 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 #endif
 
 /*
- * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
- * sequence used in this case to access content from object stock is slow.
- * To optimize for user context access, there are now two object stocks for
- * task context and interrupt context access respectively.
- *
- * The task context object stock can be accessed by disabling preemption only
- * which is cheap in non-preempt kernel. The interrupt context object stock
- * can only be accessed after disabling interrupt. User context code can
- * access interrupt object stock, but not vice versa.
- */
-static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
-{
-	struct memcg_stock_pcp *stock;
-
-	if (likely(in_task())) {
-		*pflags = 0UL;
-		preempt_disable();
-		stock = this_cpu_ptr(&memcg_stock);
-		return &stock->task_obj;
-	}
-
-	local_irq_save(*pflags);
-	stock = this_cpu_ptr(&memcg_stock);
-	return &stock->irq_obj;
-}
-
-static inline void put_obj_stock(unsigned long flags)
-{
-	if (likely(in_task()))
-		preempt_enable();
-	else
-		local_irq_restore(flags);
-}
-
-/**
  * consume_stock: Try to consume stocked charge on this cpu.
  * @memcg: memcg to consume from.
  * @nr_pages: how many pages to charge.
@@ -2245,9 +2204,7 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(&stock->irq_obj);
-	if (in_task())
-		drain_obj_stock(&stock->task_obj);
+	drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
@@ -3084,10 +3041,13 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	int *bytes;
 
+	local_irq_save(flags);
+	stock = this_cpu_ptr(&memcg_stock);
+
 	/*
 	 * Save vmstat data in stock and skip vmstat array update unless
 	 * accumulating over a page of vmstat data or when pgdat or idx
@@ -3138,26 +3098,29 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	bool ret = false;
 
+	local_irq_save(flags);
+
+	stock = this_cpu_ptr(&memcg_stock);
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
 		stock->nr_bytes -= nr_bytes;
 		ret = true;
 	}
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 
 	return ret;
 }
 
-static void drain_obj_stock(struct obj_stock *stock)
+static void drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 	struct obj_cgroup *old = stock->cached_objcg;
 
@@ -3213,13 +3176,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;
 
-	if (in_task() && stock->task_obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
-		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
-			return true;
-	}
-	if (stock->irq_obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
+	if (stock->cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}
@@ -3230,10 +3188,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			     bool allow_uncharge)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	unsigned int nr_pages = 0;
 
+	local_irq_save(flags);
+
+	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
 		drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
@@ -3249,7 +3210,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
@@ -6874,7 +6835,6 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 	unsigned long nr_pages;
 	struct mem_cgroup *memcg;
 	struct obj_cgroup *objcg;
-	bool use_objcg = PageMemcgKmem(page);
 
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
@@ -6883,7 +6843,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 	 * page memcg or objcg at this point, we have fully
 	 * exclusive access to the page.
 	 */
-	if (use_objcg) {
+	if (PageMemcgKmem(page)) {
 		objcg = __page_objcg(page);
 		/*
 		 * This get matches the put at the end of the function and
@@ -6911,7 +6871,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 
 	nr_pages = compound_nr(page);
 
-	if (use_objcg) {
+	if (PageMemcgKmem(page)) {
 		ug->nr_memory += nr_pages;
 		ug->nr_kmem += nr_pages;

From patchwork Mon Aug 8 12:55:54 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, Michal Hocko, Michal Koutný, Roman Gushchin, Johannes Weiner, Shakeel Butt, David Oberhollenzer
Subject: [PATCH v2 02/10] mm/memcg: Disable threshold event handlers on PREEMPT_RT
Date: Mon, 8 Aug 2022 14:55:54 +0200
Message-Id: <20220808125602.97747-3-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

During the integration of PREEMPT_RT support, the code flow around
memcg_check_events() resulted in `twisted code'. Moving the code around
and avoiding that would lead to an additional local-irq-save section
within memcg_check_events(). While looking better, it adds a
local-irq-save section to a code flow which is usually within a
local-irq-off block on non-PREEMPT_RT configurations.

The threshold event handler is a deprecated memcg v1 feature. Instead of
trying to get it to work under PREEMPT_RT, just disable it. There should
be no users on PREEMPT_RT, and from that perspective it makes even less
sense to get it to work under PREEMPT_RT while having zero users.

Make memory.soft_limit_in_bytes and cgroup.event_control return
-EOPNOTSUPP on PREEMPT_RT. Make memcg_check_events() empty and make
memcg_write_event_control() return only -EOPNOTSUPP on PREEMPT_RT.
Document that the two knobs are disabled on PREEMPT_RT.
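Note that the gate is a compile-time constant: IS_ENABLED(CONFIG_PREEMPT_RT)
folds to 0 or 1, so the dead branch is eliminated while both branches stay
compile-checked on every configuration. A minimal sketch of the pattern
(the helper name is illustrative, not from the patch):

	/* IS_ENABLED() expands to a constant, so this folds at compile time. */
	static int soft_limit_write(struct mem_cgroup *memcg, unsigned long nr_pages)
	{
		if (IS_ENABLED(CONFIG_PREEMPT_RT))
			return -EOPNOTSUPP;	/* knob not available on RT kernels */

		memcg->soft_limit = nr_pages;
		return 0;
	}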
Suggested-by: Michal Hocko
Suggested-by: Michal Koutný
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Roman Gushchin
Acked-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Acked-by: Michal Hocko
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  2 ++
 mm/memcontrol.c                                | 14 ++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 41191b5fb69d..c45291ac9ffb 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -64,6 +64,7 @@ Brief summary of control files.
 				 threads
 cgroup.procs			 show list of processes
 cgroup.event_control		 an interface for event_fd()
+				 This knob is not available on CONFIG_PREEMPT_RT systems.
 memory.usage_in_bytes		 show current usage for memory
 				 (See 5.5 for details)
 memory.memsw.usage_in_bytes	 show current usage for memory+Swap
@@ -75,6 +76,7 @@ Brief summary of control files.
 memory.max_usage_in_bytes	 show max memory usage recorded
 memory.memsw.max_usage_in_bytes show max memory+Swap usage recorded
 memory.soft_limit_in_bytes	 set/show soft limit of memory usage
+				 This knob is not available on CONFIG_PREEMPT_RT systems.
 memory.stat			 show various statistics
 memory.use_hierarchy		 set/show hierarchical account enabled
 				 This knob is deprecated and shouldn't be
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a07e068989b4..6078d57aee0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -909,6 +909,9 @@ static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
  */
 static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
 {
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return;
+
 	/* threshold event is triggered in finer grain than soft limit */
 	if (unlikely(mem_cgroup_event_ratelimit(memcg,
 						MEM_CGROUP_TARGET_THRESH))) {
@@ -3777,8 +3780,12 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		}
 		break;
 	case RES_SOFT_LIMIT:
-		memcg->soft_limit = nr_pages;
-		ret = 0;
+		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+			ret = -EOPNOTSUPP;
+		} else {
+			memcg->soft_limit = nr_pages;
+			ret = 0;
+		}
 		break;
 	}
 	return ret ?: nbytes;
@@ -4754,6 +4761,9 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 	char *endp;
 	int ret;
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return -EOPNOTSUPP;
+
 	buf = strstrip(buf);
 
 	efd = simple_strtoul(buf, &endp, 10);

From patchwork Mon Aug 8 12:55:55 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, Roman Gushchin, David Oberhollenzer
Subject: [PATCH v2 03/10] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed.
Date: Mon, 8 Aug 2022 14:55:55 +0200
Message-Id: <20220808125602.97747-4-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

The per-CPU counters are modified with the non-atomic modifier.
Consistency is ensured by disabling interrupts for the update. On
non-PREEMPT_RT configurations this works because acquiring a spinlock_t
typed lock with the _irq() suffix disables interrupts. On PREEMPT_RT
configurations the RMW operation can be interrupted.

Another problem is that mem_cgroup_swapout() expects to be invoked with
disabled interrupts because the caller has to acquire a spinlock_t which
is acquired with disabled interrupts. Since spinlock_t never disables
interrupts on PREEMPT_RT, the interrupts are never disabled at this
point.

The code is never called from in_irq() context on PREEMPT_RT, therefore
disabling preemption during the update is sufficient on PREEMPT_RT. The
sections which explicitly disable interrupts can remain on PREEMPT_RT
because the sections remain short and they don't involve sleeping locks
(memcg_check_events() is doing nothing on PREEMPT_RT).

Disable preemption during update of the per-CPU variables which do not
explicitly disable interrupts.
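The accessors added below reduce to preempt_disable()/preempt_enable() on
PREEMPT_RT and to a debug assertion (or a no-op for __memcg_stats_lock())
elsewhere. The intended usage, condensed from the hunks that follow:

	memcg_stats_lock();	/* RT: preemption off; !RT: assert IRQs are off */
	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);	/* non-atomic RMW */
	memcg_stats_unlock();	/* RT: preemption on; !RT: no-op */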
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Roman Gushchin
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6078d57aee0f..e566c34466a1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -654,6 +654,35 @@ static u64 flush_next_time;
 
 #define FLUSH_TIME (2UL*HZ)
 
+/*
+ * Accessors to ensure that preemption is disabled on PREEMPT_RT because it can
+ * not rely on this as part of an acquired spinlock_t lock. These functions are
+ * never used in hardirq context on PREEMPT_RT and therefore disabling preemption
+ * is sufficient.
+ */
+static void memcg_stats_lock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_disable();
+#else
+	VM_BUG_ON(!irqs_disabled());
+#endif
+}
+
+static void __memcg_stats_lock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_disable();
+#endif
+}
+
+static void memcg_stats_unlock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_enable();
+#endif
+}
+
 static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 {
 	unsigned int x;
@@ -737,6 +766,20 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	memcg = pn->memcg;
 
+	/*
+	 * The callers from rmap rely on disabled preemption because they never
+	 * update their counter from in-interrupt context. For these two
+	 * counters we check that the update is never performed from an
+	 * interrupt context while other callers need to have disabled interrupts.
+	 */
+	__memcg_stats_lock();
+	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
+		if (idx == NR_ANON_MAPPED || idx == NR_FILE_MAPPED)
+			WARN_ON_ONCE(!in_task());
+		else
+			WARN_ON_ONCE(!irqs_disabled());
+	}
+
 	/* Update memcg */
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
 
@@ -744,6 +787,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
 
 	memcg_rstat_updated(memcg, val);
+	memcg_stats_unlock();
 }
 
 /**
@@ -844,8 +888,10 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 	if (mem_cgroup_disabled())
 		return;
 
+	memcg_stats_lock();
 	__this_cpu_add(memcg->vmstats_percpu->events[idx], count);
 	memcg_rstat_updated(memcg, count);
+	memcg_stats_unlock();
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
@@ -7211,8 +7257,9 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 * important here to have the interrupts disabled because it is the
 	 * only synchronisation we have for updating the per-CPU variables.
 	 */
-	VM_BUG_ON(!irqs_disabled());
+	memcg_stats_lock();
 	mem_cgroup_charge_statistics(memcg, page, -nr_entries);
+	memcg_stats_unlock();
 	memcg_check_events(memcg, page);
 
 	css_put(&memcg->css);

From patchwork Mon Aug 8 12:55:56 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, Johannes Weiner, Shakeel Butt, Roman Gushchin, Michal Hocko, David Oberhollenzer
Subject: [PATCH v2 04/10] mm/memcg: Opencode the inner part of obj_cgroup_uncharge_pages() in drain_obj_stock()
Date: Mon, 8 Aug 2022 14:55:56 +0200
Message-Id: <20220808125602.97747-5-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Johannes Weiner

Provide the inner part of refill_stock() as __refill_stock() without
disabling interrupts. This eases the integration of local_lock_t where
recursive locking must be avoided.

Open code obj_cgroup_uncharge_pages() in drain_obj_stock() and use
__refill_stock(). The caller of drain_obj_stock() already disables
interrupts.

[bigeasy: Patch body around Johannes' diff ]
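The split follows the usual kernel convention that a __-prefixed helper
assumes the caller already provides the protection while the plain-named
wrapper establishes it. Condensed from the diff below (the body of
__refill_stock() is abridged):

	/* Inner part: caller must have interrupts disabled already. */
	static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
	{
		struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);

		/* ... cache memcg, accumulate nr_pages, drain if over batch ... */
	}

	static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
	{
		unsigned long flags;

		local_irq_save(flags);
		__refill_stock(memcg, nr_pages);
		local_irq_restore(flags);
	}

drain_obj_stock(), which already runs with interrupts disabled, can then
call __refill_stock() directly without recursing on the protection.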
Signed-off-by: Johannes Weiner
Signed-off-by: Sebastian Andrzej Siewior
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e566c34466a1..2cf55b3ab4b0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2264,12 +2264,9 @@ static void drain_local_stock(struct work_struct *dummy)
  * Cache charges(val) to local per_cpu area.
  * This will be consumed by consume_stock() function, later.
  */
-static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	struct memcg_stock_pcp *stock;
-	unsigned long flags;
-
-	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached != memcg) { /* reset if necessary */
@@ -2281,7 +2278,14 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 
 	if (stock->nr_pages > MEMCG_CHARGE_BATCH)
 		drain_stock(stock);
+}
 
+static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__refill_stock(memcg, nr_pages);
 	local_irq_restore(flags);
 }
 
@@ -3180,8 +3184,18 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock)
 		unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
 		unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1);
 
-		if (nr_pages)
-			obj_cgroup_uncharge_pages(old, nr_pages);
+		if (nr_pages) {
+			struct mem_cgroup *memcg;
+
+			memcg = get_mem_cgroup_from_objcg(old);
+
+			if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+				page_counter_uncharge(&memcg->kmem, nr_pages);
+
+			__refill_stock(memcg, nr_pages);
+
+			css_put(&memcg->css);
+		}
 
 		/*
 		 * The leftover is flushed to the centralized per-memcg value.

From patchwork Mon Aug 8 12:55:57 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, kernel test robot, David Oberhollenzer
Subject: [PATCH v2 05/10] mm/memcg: Protect memcg_stock with a local_lock_t
Date: Mon, 8 Aug 2022 14:55:57 +0200
Message-Id: <20220808125602.97747-6-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior
The members of the per-CPU structure memcg_stock_pcp are protected by
disabling interrupts. This does not work on PREEMPT_RT because it creates
atomic context in which actions are performed that require preemptible
context. One example is obj_cgroup_release().

The IRQ-disable sections can be replaced with local_lock_t, which
preserves the explicit disabling of interrupts while keeping the code
preemptible on PREEMPT_RT.

drain_obj_stock() drops a reference on obj_cgroup which leads to an
invocation of obj_cgroup_release() if it is the last object. This in turn
leads to recursive locking of the local_lock_t. To avoid this,
obj_cgroup_release() is invoked outside of the locked section.

obj_cgroup_uncharge_pages() can be invoked with the local_lock_t acquired
and without it. This will later lead to a recursion in refill_stock(). To
avoid the locking recursion, provide obj_cgroup_uncharge_pages_locked()
which uses the locked version of refill_stock().

- Replace disabling interrupts for memcg_stock with a local_lock_t.
- Let drain_obj_stock() return the old struct obj_cgroup which is passed
  to obj_cgroup_put() outside of the locked section.
- Provide obj_cgroup_uncharge_pages_locked() which uses the locked
  version of refill_stock() to avoid recursive locking in
  drain_obj_stock().
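On !PREEMPT_RT, local_lock_irqsave() compiles down to local_irq_save(); on
PREEMPT_RT it acquires a per-CPU spinlock and leaves the task preemptible.
A minimal sketch of the structure change (stock_op() is an illustrative
name, not from the patch):

	struct memcg_stock_pcp {
		local_lock_t stock_lock;	/* protects all fields below */
		struct mem_cgroup *cached;
		unsigned int nr_pages;
	};

	static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
		.stock_lock = INIT_LOCAL_LOCK(stock_lock),
	};

	static void stock_op(void)
	{
		unsigned long flags;

		local_lock_irqsave(&memcg_stock.stock_lock, flags);
		this_cpu_ptr(&memcg_stock)->nr_pages++;
		local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	}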
Link: https://lkml.kernel.org/r/20220209014709.GA26885@xsang-OptiPlex-9020
Reported-by: kernel test robot
Signed-off-by: Sebastian Andrzej Siewior
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 55 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2cf55b3ab4b0..20bce9682e8a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2152,6 +2152,7 @@ void unlock_page_memcg(struct page *page)
 EXPORT_SYMBOL(unlock_page_memcg);
 
 struct memcg_stock_pcp {
+	local_lock_t stock_lock;
 	struct mem_cgroup *cached; /* this never be root cgroup */
 	unsigned int nr_pages;
 
@@ -2167,17 +2168,20 @@ struct memcg_stock_pcp {
 	unsigned long flags;
 #define FLUSHING_CACHED_CHARGE	0
 };
-static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
+static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
+	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
+};
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 #ifdef CONFIG_MEMCG_KMEM
-static void drain_obj_stock(struct memcg_stock_pcp *stock);
+static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock);
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg);
 
 #else
-static inline void drain_obj_stock(struct memcg_stock_pcp *stock)
+static inline struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 {
+	return NULL;
 }
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg)
@@ -2206,7 +2210,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 	if (nr_pages > MEMCG_CHARGE_BATCH)
 		return ret;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
@@ -2214,7 +2218,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 		ret = true;
 	}
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	return ret;
 }
@@ -2243,6 +2247,7 @@ static void drain_stock(struct memcg_stock_pcp *stock)
 static void drain_local_stock(struct work_struct *dummy)
 {
 	struct memcg_stock_pcp *stock;
+	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 
 	/*
@@ -2250,14 +2255,16 @@ static void drain_local_stock(struct work_struct *dummy)
 	 * drain_stock races is that we always operate on local CPU stock
 	 * here with IRQ disabled
 	 */
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(stock);
+	old = drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	if (old)
+		obj_cgroup_put(old);
 }
 
 /*
@@ -2284,9 +2291,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 	__refill_stock(memcg, nr_pages);
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
 
 /*
@@ -3095,10 +3102,11 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
 	struct memcg_stock_pcp *stock;
+	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 	int *bytes;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 	stock = this_cpu_ptr(&memcg_stock);
 
 	/*
@@ -3107,7 +3115,7 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	 * changes.
 	 */
 	if (stock->cached_objcg != objcg) {
-		drain_obj_stock(stock);
+		old = drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
 		stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes)
 				? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0;
@@ -3151,7 +3159,9 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	if (old)
+		obj_cgroup_put(old);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
@@ -3160,7 +3170,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	unsigned long flags;
 	bool ret = false;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
@@ -3168,17 +3178,17 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 		ret = true;
 	}
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	return ret;
 }
 
-static void drain_obj_stock(struct memcg_stock_pcp *stock)
+static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 	struct obj_cgroup *old = stock->cached_objcg;
 
 	if (!old)
-		return;
+		return NULL;
 
 	if (stock->nr_bytes) {
 		unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
@@ -3230,8 +3240,8 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock)
 		stock->cached_pgdat = NULL;
 	}
 
-	obj_cgroup_put(old);
 	stock->cached_objcg = NULL;
+	return old;
 }
 
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@ -3252,14 +3262,15 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			     bool allow_uncharge)
 {
 	struct memcg_stock_pcp *stock;
+	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 	unsigned int nr_pages = 0;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
-		drain_obj_stock(stock);
+		old = drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
 		stock->cached_objcg = objcg;
 		stock->nr_bytes =
			atomic_read(&objcg->nr_charged_bytes)
@@ -3273,7 +3284,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	if (old)
+		obj_cgroup_put(old);
 
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);

From patchwork Mon Aug 8 12:55:58 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, Michal Hocko, David Oberhollenzer
Subject: [PATCH v2 06/10] mm/memcg: Disable migration instead of preemption in drain_all_stock().
Date: Mon, 8 Aug 2022 14:55:58 +0200
Message-Id: <20220808125602.97747-7-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

Before the for-each-CPU loop, preemption is disabled so that
drain_local_stock() can be invoked directly instead of scheduling a
worker. Ensuring that drain_local_stock() completed on the local CPU is
not a correctness problem. It _could_ be that the charging path will be
forced to reclaim memory because cached charges are still waiting for
their draining.

Disabling preemption before invoking drain_local_stock() is problematic
on PREEMPT_RT due to the sleeping locks involved. To ensure that no CPU
migration happens across for_each_online_cpu(), it is enough to use
migrate_disable(), which disables migration and keeps the context
preemptible so that a sleeping lock can be acquired.

A race with CPU hotplug is not a problem because pcp data is not going
away. In the worst case we just schedule draining of an empty stock.

Use migrate_disable() instead of get_cpu() around the
for_each_online_cpu() loop.
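Condensed, and with the flush-required check elided, the loop changes
shape as follows; the old get_cpu() pinned the task and disabled
preemption, while migrate_disable() only pins:

	migrate_disable();			/* pin to this CPU, stay preemptible */
	curcpu = smp_processor_id();		/* stable while migration is off */
	for_each_online_cpu(cpu) {
		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);

		if (cpu == curcpu)
			drain_local_stock(&stock->work);	/* may sleep on RT */
		else
			schedule_work_on(cpu, &stock->work);
	}
	migrate_enable();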
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Michal Hocko
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 20bce9682e8a..4b5888bf0a60 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2313,7 +2313,8 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 	 * as well as workers from this path always operate on the local
 	 * per-cpu data. CPU up doesn't touch memcg_stock at all.
 	 */
-	curcpu = get_cpu();
+	migrate_disable();
+	curcpu = smp_processor_id();
 	for_each_online_cpu(cpu) {
 		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
 		struct mem_cgroup *memcg;
@@ -2336,7 +2337,7 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 			schedule_work_on(cpu, &stock->work);
 		}
 	}
-	put_cpu();
+	migrate_enable();
 	mutex_unlock(&percpu_charge_mutex);
 }
 

From patchwork Mon Aug 8 12:55:59 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, David Oberhollenzer
Subject: [PATCH v2 07/10] mm/memcg: Add missing counter indices which are not updated in interrupt context.
Date: Mon, 8 Aug 2022 14:55:59 +0200
Message-Id: <20220808125602.97747-8-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

Shakeel Butt reported that I missed a few counters which are not updated
in interrupt context and for which disabling preemption is therefore
fine.

Please fold into:
   "Protect per-CPU counter by disabling preemption on PREEMPT_RT"
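With the additional indices, the debug check becomes a switch over the
two update disciplines; the same code as the hunk below, with the
rationale spelled out in comments:

	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
		switch (idx) {
		case NR_ANON_MAPPED:
		case NR_FILE_MAPPED:
		case NR_ANON_THPS:
		case NR_SHMEM_PMDMAPPED:
		case NR_FILE_PMDMAPPED:
			/* rmap counters: never updated from interrupt context */
			WARN_ON_ONCE(!in_task());
			break;
		default:
			/* all other counters rely on disabled interrupts */
			WARN_ON_ONCE(!irqs_disabled());
		}
	}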
Signed-off-by: Sebastian Andrzej Siewior
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4b5888bf0a60..50d2d5432018 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -774,10 +774,17 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	 */
 	__memcg_stats_lock();
 	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
-		if (idx == NR_ANON_MAPPED || idx == NR_FILE_MAPPED)
+		switch (idx) {
+		case NR_ANON_MAPPED:
+		case NR_FILE_MAPPED:
+		case NR_ANON_THPS:
+		case NR_SHMEM_PMDMAPPED:
+		case NR_FILE_PMDMAPPED:
 			WARN_ON_ONCE(!in_task());
-		else
+			break;
+		default:
 			WARN_ON_ONCE(!irqs_disabled());
+		}
 	}
 
 	/* Update memcg */

From patchwork Mon Aug 8 12:56:00 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, David Oberhollenzer
Subject: [PATCH v2 08/10] mm/memcg: Add a comment regarding the release of `obj'.
Date: Mon, 8 Aug 2022 14:56:00 +0200
Message-Id: <20220808125602.97747-9-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

Please fold into
   "mm/memcg: Protect memcg_stock with a local_lock_t"

Signed-off-by: Sebastian Andrzej Siewior
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 50d2d5432018..70c374761bb8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3249,6 +3249,10 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 	}
 
 	stock->cached_objcg = NULL;
+	/*
+	 * The `old' object needs to be released by the caller via
+	 * obj_cgroup_put() outside of memcg_stock_pcp::stock_lock.
+	 */
 	return old;
 }
 

From patchwork Mon Aug 8 12:56:01 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, David Oberhollenzer
Subject: [PATCH v2 09/10] mm/memcg: Only perform the debug checks on !PREEMPT_RT
Date: Mon, 8 Aug 2022 14:56:01 +0200
Message-Id: <20220808125602.97747-10-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

From: Sebastian Andrzej Siewior

On PREEMPT_RT, interrupts and preemption are always enabled. The locking
function __memcg_stats_lock() always disables preemption. The recently
added debug checks need to be performed only on !PREEMPT_RT, where
disabled preemption and disabled interrupts are used.

Please fold into:
   "Protect per-CPU counter by disabling preemption on PREEMPT_RT"
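Since both IS_ENABLED() operands are compile-time constants, the combined
guard makes the whole debug block vanish on PREEMPT_RT kernels:

	/* Check the locking discipline only where it actually applies. */
	if (IS_ENABLED(CONFIG_DEBUG_VM) && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
		/* per-index WARN_ON_ONCE() checks, see patch 07/10 */
	}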
Signed-off-by: Sebastian Andrzej Siewior
[do: backported to v5.15]
Signed-off-by: David Oberhollenzer
---
 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 70c374761bb8..74025c2e2bcc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -773,7 +773,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	 * interrupt context while other callers need to have disabled interrupts.
 	 */
 	__memcg_stats_lock();
-	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
 		switch (idx) {
 		case NR_ANON_MAPPED:
 		case NR_FILE_MAPPED:

From patchwork Mon Aug 8 12:56:02 2022
From: David Oberhollenzer
To: linux-rt-users@vger.kernel.org
Cc: williams@redhat.com, bigeasy@linutronix.de, richard@nod.at, joseph.salisbury@canonical.com, David Oberhollenzer
Subject: [PATCH v2 10/10] Allow MEMCG on PREEMPT_RT
Date: Mon, 8 Aug 2022 14:56:02 +0200
Message-Id: <20220808125602.97747-11-goliath@infraroot.at>
In-Reply-To: <20220808125602.97747-1-goliath@infraroot.at>
References: <20220808125602.97747-1-goliath@infraroot.at>

With these backported, it is safe to use MEMCG again:

0001-mm-memcg-Revert-mm-memcg-optimize-user-context-objec.patch
0002-mm-memcg-Disable-threshold-event-handlers-on-PREEMPT.patch
0003-mm-memcg-Protect-per-CPU-counter-by-disabling-preemp.patch
0004-mm-memcg-Opencode-the-inner-part-of-obj_cgroup_uncha.patch
0005-mm-memcg-Protect-memcg_stock-with-a-local_lock_t.patch
0006-mm-memcg-Disable-migration-instead-of-preemption-in-.patch
0007-mm-memcg-Add-missing-counter-index-which-are-not-upd.patch
0008-mm-memcg-Add-a-comment-regarding-the-release-obj.patch
0009-mm-memcg-Only-perform-the-debug-checks-on-PREEMPT_RT.patch
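After the one-line removal below, the MEMCG entry reduces to the
following (help text elided, as in this archive):

	config MEMCG
		bool "Memory controller"
		select PAGE_COUNTER
		select EVENTFD
		# help text unchanged (elided)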
Signed-off-by: David Oberhollenzer
---
 init/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index 160f836f81c7..1646c25f6ec5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -943,7 +943,6 @@ config PAGE_COUNTER
 
 config MEMCG
 	bool "Memory controller"
-	depends on !PREEMPT_RT
 	select PAGE_COUNTER
 	select EVENTFD
 	help