From patchwork Fri Feb 11 02:13:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 542145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62C7CC433F5 for ; Fri, 11 Feb 2022 02:14:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242102AbiBKCOB (ORCPT ); Thu, 10 Feb 2022 21:14:01 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:34752 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237351AbiBKCOA (ORCPT ); Thu, 10 Feb 2022 21:14:00 -0500 Received: from out01.mta.xmission.com (out01.mta.xmission.com [166.70.13.231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DDF4E5FB0; Thu, 10 Feb 2022 18:14:00 -0800 (PST) Received: from in01.mta.xmission.com ([166.70.13.51]:56772) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRe-000fPs-PC; Thu, 10 Feb 2022 19:13:59 -0700 Received: from ip68-227-174-4.om.om.cox.net ([68.227.174.4]:52650 helo=localhost.localdomain) by in01.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRc-00FMXV-OH; Thu, 10 Feb 2022 19:13:58 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: Alexey Gladkov , Kees Cook , Shuah Khan , Christian Brauner , Solar Designer , Ran Xiaokai , containers@lists.linux-foundation.org, =?utf-8?q?Michal_Koutn=C3=BD?= , "Eric W. Biederman" , stable@vger.kernel.org Date: Thu, 10 Feb 2022 20:13:17 -0600 Message-Id: <20220211021324.4116773-1-ebiederm@xmission.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> References: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1nILRc-00FMXV-OH; ; ; mid=<20220211021324.4116773-1-ebiederm@xmission.com>; ; ; hst=in01.mta.xmission.com; ; ; ip=68.227.174.4; ; ; frm=ebiederm@xmission.com; ; ; spf=neutral X-XM-AID: U2FsdGVkX1/PfBild3s35N2DnDkS1MpwyGk5lTzuMOo= X-SA-Exim-Connect-IP: 68.227.174.4 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH 1/8] ucounts: Fix RLIMIT_NPROC regression X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Michal Koutný wrote: > It was reported that v5.14 behaves differently when enforcing > RLIMIT_NPROC limit, namely, it allows one more task than previously. > This is consequence of the commit 21d1c5e386bc ("Reimplement > RLIMIT_NPROC on top of ucounts") that missed the sharpness of > equality in the forking path. This can be fixed either by fixing the test or by moving the increment to be before the test. Fix it my moving copy_creds which contains the increment before is_ucounts_overlimit. This is necessary so that in the case of CLONE_NEWUSER the new credential with the new ucounts is available of is_ucounts_overlimit to test. Both the test in fork and the test in set_user were semantically changed when the code moved to ucounts. The change of the test in fork was bad because it was before the increment. The test in set_user was wrong and the change to ucounts fixed it. So this fix is only restore the old behavior in one lcatio not two. Link: https://lkml.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com Cc: stable@vger.kernel.org Reported-by: Michal Koutný Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" --- kernel/fork.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index d75a528f7b21..17d8a8c85e3b 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2021,18 +2021,18 @@ static __latent_entropy struct task_struct *copy_process( #ifdef CONFIG_PROVE_LOCKING DEBUG_LOCKS_WARN_ON(!p->softirqs_enabled); #endif + retval = copy_creds(p, clone_flags); + if (retval < 0) + goto bad_fork_free; + retval = -EAGAIN; if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { if (p->real_cred->user != INIT_USER && !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) - goto bad_fork_free; + goto bad_fork_cleanup_count; } current->flags &= ~PF_NPROC_EXCEEDED; - retval = copy_creds(p, clone_flags); - if (retval < 0) - goto bad_fork_free; - /* * If multiple threads are within copy_process(), then this check * triggers too late. This doesn't hurt, the check is only there From patchwork Fri Feb 11 02:13:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 542144 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01FB9C433EF for ; Fri, 11 Feb 2022 02:14:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347230AbiBKCON (ORCPT ); Thu, 10 Feb 2022 21:14:13 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:34834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347204AbiBKCOI (ORCPT ); Thu, 10 Feb 2022 21:14:08 -0500 Received: from out01.mta.xmission.com (out01.mta.xmission.com [166.70.13.231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC3EB5FB4; Thu, 10 Feb 2022 18:14:07 -0800 (PST) Received: from in01.mta.xmission.com ([166.70.13.51]:56980) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRm-000fRD-NU; Thu, 10 Feb 2022 19:14:06 -0700 Received: from ip68-227-174-4.om.om.cox.net ([68.227.174.4]:52650 helo=localhost.localdomain) by in01.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRj-00FMXV-UM; Thu, 10 Feb 2022 19:14:06 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: Alexey Gladkov , Kees Cook , Shuah Khan , Christian Brauner , Solar Designer , Ran Xiaokai , containers@lists.linux-foundation.org, =?utf-8?q?Michal_Koutn=C3=BD?= , "Eric W. Biederman" , stable@vger.kernel.org Date: Thu, 10 Feb 2022 20:13:19 -0600 Message-Id: <20220211021324.4116773-3-ebiederm@xmission.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> References: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1nILRj-00FMXV-UM; ; ; mid=<20220211021324.4116773-3-ebiederm@xmission.com>; ; ; hst=in01.mta.xmission.com; ; ; ip=68.227.174.4; ; ; frm=ebiederm@xmission.com; ; ; spf=neutral X-XM-AID: U2FsdGVkX1+t/nauytizkF2Z+L1yNTkR9kdtUK0tpC4= X-SA-Exim-Connect-IP: 68.227.174.4 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH 3/8] ucounts: Fix and simplify RLIMIT_NPROC handling during setuid()+execve X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Michal Koutný wrote: > The check is currently against the current->cred but since those are > going to change and we want to check RLIMIT_NPROC condition after the > switch, supply the capability check with the new cred. > But since we're checking new_user being INIT_USER any new cred's > capability-based allowance may be redundant when the check fails and the > alternative solution would be revert of the commit 2863643fb8b9 > ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") As of commit 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") setting the flag to see if execve should check RLIMIT_NPROC is buggy, as it tests the capabilites from before the credential change and not aftwards. As of commit 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") examining the rlimit is buggy as cred->ucounts has not yet been properly set in the new credential. Make the code correct and more robust moving the test to see if execve() needs to test RLIMIT_NPROC into commit_creds, and defer all of the rest of the logic into execve() itself. As the flag only indicateds that RLIMIT_NPROC should be checked in execve rename it from PF_NPROC_EXCEEDED to PF_NPROC_CHECK. Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220207121800.5079-2-mkoutny@suse.com Link: https://lkml.kernel.org/r/20220207121800.5079-3-mkoutny@suse.com Reported-by: Michal Koutný Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 15 ++++++++------- include/linux/sched.h | 2 +- kernel/cred.c | 13 +++++++++---- kernel/fork.c | 2 +- kernel/sys.c | 14 -------------- 5 files changed, 19 insertions(+), 27 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 79f2c9483302..1e7f757cbc2c 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1875,20 +1875,21 @@ static int do_execveat_common(int fd, struct filename *filename, return PTR_ERR(filename); /* - * We move the actual failure in case of RLIMIT_NPROC excess from - * set*uid() to execve() because too many poorly written programs - * don't check setuid() return code. Here we additionally recheck - * whether NPROC limit is still exceeded. + * After calling set*uid() is RLIMT_NPROC exceeded? + * This can not be checked in set*uid() because too many programs don't + * check the setuid() return code. */ - if ((current->flags & PF_NPROC_EXCEEDED) && - is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { + if ((current->flags & PF_NPROC_CHECK) && + is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) && + (current_user() != INIT_USER) && + !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) { retval = -EAGAIN; goto out_ret; } /* We're below the limit (still or again), so we don't want to make * further execve() calls fail. */ - current->flags &= ~PF_NPROC_EXCEEDED; + current->flags &= ~PF_NPROC_CHECK; bprm = alloc_bprm(fd, filename); if (IS_ERR(bprm)) { diff --git a/include/linux/sched.h b/include/linux/sched.h index 75ba8aa60248..6605a262a6be 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1678,7 +1678,7 @@ extern struct pid *cad_pid; #define PF_DUMPCORE 0x00000200 /* Dumped core */ #define PF_SIGNALED 0x00000400 /* Killed by a signal */ #define PF_MEMALLOC 0x00000800 /* Allocating memory */ -#define PF_NPROC_EXCEEDED 0x00001000 /* set_user() noticed that RLIMIT_NPROC was exceeded */ +#define PF_NPROC_CHECK 0x00001000 /* Check in execve if RLIMIT_NPROC was exceeded */ #define PF_USED_MATH 0x00002000 /* If unset the fpu must be initialized before use */ #define PF_NOFREEZE 0x00008000 /* This thread should not be frozen */ #define PF_FROZEN 0x00010000 /* Frozen for system suspend */ diff --git a/kernel/cred.c b/kernel/cred.c index 933155c96922..229cff081167 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -490,13 +490,18 @@ int commit_creds(struct cred *new) if (!gid_eq(new->fsgid, old->fsgid)) key_fsgid_changed(new); - /* do it - * RLIMIT_NPROC limits on user->processes have already been checked - * in set_user(). + /* + * Remember if the NPROC limit may be exceeded. The set*uid() functions + * can not fail if the NPROC limit is exceeded as too many programs + * don't check the return code. Instead enforce the NPROC limit for + * programs doing set*uid()+execve by harmlessly defering the failure + * to the execve() stage. */ alter_cred_subscribers(new, 2); - if (new->user != old->user || new->user_ns != old->user_ns) + if (new->user != old->user || new->user_ns != old->user_ns) { inc_rlimit_ucounts(new->ucounts, UCOUNT_RLIMIT_NPROC, 1); + task->flags |= PF_NPROC_CHECK; + } rcu_assign_pointer(task->real_cred, new); rcu_assign_pointer(task->cred, new); if (new->user != old->user || new->user_ns != old->user_ns) diff --git a/kernel/fork.c b/kernel/fork.c index 17d8a8c85e3b..2b6a28a86325 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2031,7 +2031,7 @@ static __latent_entropy struct task_struct *copy_process( !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) goto bad_fork_cleanup_count; } - current->flags &= ~PF_NPROC_EXCEEDED; + current->flags &= ~PF_NPROC_CHECK; /* * If multiple threads are within copy_process(), then this check diff --git a/kernel/sys.c b/kernel/sys.c index ecc4cf019242..b1ed21d79f3b 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -472,20 +472,6 @@ static int set_user(struct cred *new) if (!new_user) return -EAGAIN; - /* - * We don't fail in case of NPROC limit excess here because too many - * poorly written programs don't check set*uid() return code, assuming - * it never fails if called by root. We may still enforce NPROC limit - * for programs doing set*uid()+execve() by harmlessly deferring the - * failure to the execve() stage. - */ - if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) && - new_user != INIT_USER && - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) - current->flags |= PF_NPROC_EXCEEDED; - else - current->flags &= ~PF_NPROC_EXCEEDED; - free_uid(new->user); new->user = new_user; return 0; From patchwork Fri Feb 11 02:13:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 542143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E3CCC4332F for ; Fri, 11 Feb 2022 02:14:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347196AbiBKCOQ (ORCPT ); Thu, 10 Feb 2022 21:14:16 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:34964 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347218AbiBKCOO (ORCPT ); Thu, 10 Feb 2022 21:14:14 -0500 Received: from out01.mta.xmission.com (out01.mta.xmission.com [166.70.13.231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 690C35FB9; Thu, 10 Feb 2022 18:14:14 -0800 (PST) Received: from in01.mta.xmission.com ([166.70.13.51]:57088) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRt-000fSD-Ju; Thu, 10 Feb 2022 19:14:13 -0700 Received: from ip68-227-174-4.om.om.cox.net ([68.227.174.4]:52650 helo=localhost.localdomain) by in01.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nILRr-00FMXV-Lf; Thu, 10 Feb 2022 19:14:13 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: Alexey Gladkov , Kees Cook , Shuah Khan , Christian Brauner , Solar Designer , Ran Xiaokai , containers@lists.linux-foundation.org, =?utf-8?q?Michal_Koutn=C3=BD?= , "Eric W. Biederman" , stable@vger.kernel.org Date: Thu, 10 Feb 2022 20:13:21 -0600 Message-Id: <20220211021324.4116773-5-ebiederm@xmission.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> References: <87o83e2mbu.fsf@email.froward.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1nILRr-00FMXV-Lf; ; ; mid=<20220211021324.4116773-5-ebiederm@xmission.com>; ; ; hst=in01.mta.xmission.com; ; ; ip=68.227.174.4; ; ; frm=ebiederm@xmission.com; ; ; spf=neutral X-XM-AID: U2FsdGVkX19+LUNNXy+P6wbS8hab2BGjP4edjzZTdos= X-SA-Exim-Connect-IP: 68.227.174.4 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH 5/8] ucounts: Handle wrapping in is_ucounts_overlimit X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org While examining is_ucounts_overlimit and reading the various messages I realized that is_ucounts_overlimit fails to deal with counts that may have wrapped. Being wrapped should be a transitory state for counts and they should never be wrapped for long, but it can happen so handle it. Cc: stable@vger.kernel.org Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" --- kernel/ucount.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/ucount.c b/kernel/ucount.c index 65b597431c86..06ea04d44685 100644 --- a/kernel/ucount.c +++ b/kernel/ucount.c @@ -350,7 +350,8 @@ bool is_ucounts_overlimit(struct ucounts *ucounts, enum ucount_type type, unsign if (rlimit > LONG_MAX) max = LONG_MAX; for (iter = ucounts; iter; iter = iter->ns->ucounts) { - if (get_ucounts_value(iter, type) > max) + long val = get_ucounts_value(iter, type); + if (val < 0 || val > max) return true; max = READ_ONCE(iter->ns->ucount_max[type]); }