[v9,1/7] cgroup/cpuset: Don't let child cpusets restrict parent in default hierarchy

Message ID 20211205183220.818872-2-longman@redhat.com
State Accepted
Commit 1f1562fcd04a485734e94390660e741c3be47867
Series cgroup/cpuset: Add new cpuset partition type & empty effecitve cpus

Commit Message

Waiman Long Dec. 5, 2021, 6:32 p.m. UTC
Since v2.6.12, validate_change() has checked that each child cpuset is
a subset of its parent cpuset.  IOW, it
allows child cpusets to restrict what changes can be made to a parent's
"cpuset.cpus". This actually violates one of the core principles of the
default hierarchy where a cgroup higher up in the hierarchy should be
able to change configuration however it sees fit as delegation breaks
down otherwise.

To address this issue, the check is now removed for the default hierarchy
to free parent cpusets from being restricted by child cpusets. The
check will still apply for legacy hierarchy.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

Comments

Tejun Heo Dec. 13, 2021, 8:41 p.m. UTC | #1
On Sun, Dec 05, 2021 at 01:32:14PM -0500, Waiman Long wrote:
> Since v2.6.12, validate_change() has checked that each child cpuset is
> a subset of its parent cpuset.  IOW, it
> allows child cpusets to restrict what changes can be made to a parent's
> "cpuset.cpus". This actually violates one of the core principles of the
> default hierarchy where a cgroup higher up in the hierarchy should be
> able to change configuration however it sees fit as delegation breaks
> down otherwise.
> 
> To address this issue, the check is now removed for the default hierarchy
> to free parent cpusets from being restricted by child cpusets. The
> check will still apply for legacy hierarchy.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Waiman Long <longman@redhat.com>

Applied to cgroup/for-5.17.

Thanks.
Michal Koutný Dec. 15, 2021, 12:23 p.m. UTC | #2
On Mon, Dec 13, 2021 at 10:41:23AM -1000, Tejun Heo <tj@kernel.org> wrote:
> > To address this issue, the check is now removed for the default hierarchy
> > to free parent cpusets from being restricted by child cpusets. The
> > check will still apply for legacy hierarchy.

I'm trying to find whether something in update_cpumasks_hier() ensures
the constraint is checked on the legacy hierarchy but it seems to me this
baby was thrown out with the bathwater. How is the legacy check still
applied?

> Applied to cgroup/for-5.17.

It comes out a bit more complex if I want to achieve both variants in
the below followup:

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 0dd7d853ed17..8b6e06f504f6 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -590,6 +590,35 @@ static inline void free_cpuset(struct cpuset *cs)
 	kfree(cs);
 }
 
+/*
+ * validate_change_legacy() - Validate conditions specific to legacy (v1)
+ *                            behavior.
+ */
+static int validate_change_legacy(struct cpuset *cur, struct cpuset *trial)
+{
+	struct cgroup_subsys_state *css;
+	struct cpuset *c, *par;
+	int ret;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	/* Each of our child cpusets must be a subset of us */
+	ret = -EBUSY;
+	cpuset_for_each_child(c, css, cur)
+		if (!is_cpuset_subset(c, trial))
+			goto out;
+
+	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
+	ret = -EACCES;
+	par = parent_cs(cur);
+	if (par && !is_cpuset_subset(trial, par))
+		goto out;
+
+	ret = 0;
+out:
+	return ret;
+}
+
 /*
  * validate_change() - Used to validate that any proposed cpuset change
  *		       follows the structural rules for cpusets.
@@ -614,20 +643,21 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 {
 	struct cgroup_subsys_state *css;
 	struct cpuset *c, *par;
-	int ret;
-
-	/* The checks don't apply to root cpuset */
-	if (cur == &top_cpuset)
-		return 0;
+	int ret = 0;
 
 	rcu_read_lock();
-	par = parent_cs(cur);
 
-	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
-	ret = -EACCES;
-	if (!is_in_v2_mode() && !is_cpuset_subset(trial, par))
+	ret = validate_change_legacy(cur, trial);
+	if (ret)
+		goto out;
+
+	/* Remaining checks don't apply to root cpuset */
+	ret = 0;
+	if (cur == &top_cpuset)
 		goto out;
 
+	par = parent_cs(cur);
+
 	/*
 	 * If either I or some sibling (!= me) is exclusive, we can't
 	 * overlap
@@ -1175,9 +1205,7 @@ enum subparts_cmd {
  *
  * Because of the implicit cpu exclusive nature of a partition root,
  * cpumask changes that violates the cpu exclusivity rule will not be
- * permitted when checked by validate_change(). The validate_change()
- * function will also prevent any changes to the cpu list if it is not
- * a superset of children's cpu lists.
+ * permitted when checked by validate_change().
  */
 static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
 					  struct cpumask *newmask,
Waiman Long Dec. 15, 2021, 5:59 p.m. UTC | #3
On 12/15/21 07:23, Michal Koutný wrote:
> On Mon, Dec 13, 2021 at 10:41:23AM -1000, Tejun Heo <tj@kernel.org> wrote:
>>> To address this issue, the check is now removed for the default hierarchy
>>> to free parent cpusets from being restricted by child cpusets. The
>>> check will still apply for legacy hierarchy.
> I'm trying to find whether something in update_cpumasks_hier() ensures
> the constraint is checked on the legacy hierarchy but it seems to me this
> baby was thrown out with the bathwater. How is the legacy check still
> applied?
Yes, you are right. I did remove the check for legacy hierarchy too.
>> Applied to cgroup/for-5.17.
> It comes out a bit more complex if I want to achieve both variants in
> the below followup:
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 0dd7d853ed17..8b6e06f504f6 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -590,6 +590,35 @@ static inline void free_cpuset(struct cpuset *cs)
>   	kfree(cs);
>   }
>   
> +/*
> + * validate_change_legacy() - Validate conditions specific to legacy (v1)
> + *                            behavior.
> + */
> +static int validate_change_legacy(struct cpuset *cur, struct cpuset *trial)
> +{
> +	struct cgroup_subsys_state *css;
> +	struct cpuset *c, *par;
> +	int ret;
> +
> +	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	/* Each of our child cpusets must be a subset of us */
> +	ret = -EBUSY;
> +	cpuset_for_each_child(c, css, cur)
> +		if (!is_cpuset_subset(c, trial))
> +			goto out;
> +
> +	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
> +	ret = -EACCES;
> +	par = parent_cs(cur);
> +	if (par && !is_cpuset_subset(trial, par))
> +		goto out;
> +
> +	ret = 0;
> +out:
> +	return ret;
> +}
> +
>   /*
>    * validate_change() - Used to validate that any proposed cpuset change
>    *		       follows the structural rules for cpusets.
> @@ -614,20 +643,21 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>   {
>   	struct cgroup_subsys_state *css;
>   	struct cpuset *c, *par;
> -	int ret;
> -
> -	/* The checks don't apply to root cpuset */
> -	if (cur == &top_cpuset)
> -		return 0;
> +	int ret = 0;
>   
>   	rcu_read_lock();
> -	par = parent_cs(cur);
>   
> -	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
> -	ret = -EACCES;
> -	if (!is_in_v2_mode() && !is_cpuset_subset(trial, par))

I think you still need to guard it with "!is_in_v2_mode()".

         if (!is_in_v2_mode()) {
                 ret = validate_change_legacy(cur, trial);
                 if (ret)
                         goto out;
         }

> +	ret = validate_change_legacy(cur, trial);
> +	if (ret)
> +		goto out;
> +
> +	/* Remaining checks don't apply to root cpuset */
> +	ret = 0;
> +	if (cur == &top_cpuset)
>   		goto out;
>   
> +	par = parent_cs(cur);
> +
>   	/*
>   	 * If either I or some sibling (!= me) is exclusive, we can't
>   	 * overlap
Cheers,
Longman
Waiman Long Dec. 17, 2021, 4:34 p.m. UTC | #4
On 12/17/21 10:48, Michal Koutný wrote:
> The commit 1f1562fcd04a ("cgroup/cpuset: Don't let child cpusets
> restrict parent in default hierarchy") intended to relax the check only
> on the default hierarchy (or v2 mode) but it dropped the check in v1
> too.
>
> This patch restores and separates the legacy-only validations so that
> they can be considered only in the v1 mode, which should enforce the old
> constraints for the sake of compatibility.
>
> Fixes: 1f1562fcd04a ("cgroup/cpuset: Don't let child cpusets restrict parent in default hierarchy")
> Suggested-by: Waiman Long <longman@redhat.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>   kernel/cgroup/cpuset.c | 52 ++++++++++++++++++++++++++++++++----------
>   1 file changed, 40 insertions(+), 12 deletions(-)
>
> This is formatted as a separate patch fixing the already queued change in
> for-5.17 but it can be eventually squashed into the referenced commit AFAIAC.
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 0dd7d853ed17..ce6929ddc0b0 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -590,6 +590,35 @@ static inline void free_cpuset(struct cpuset *cs)
>   	kfree(cs);
>   }
>   
> +/*
> + * validate_change_legacy() - Validate conditions specific to legacy (v1)
> + *                            behavior.
> + */
> +static int validate_change_legacy(struct cpuset *cur, struct cpuset *trial)
> +{
> +	struct cgroup_subsys_state *css;
> +	struct cpuset *c, *par;
> +	int ret;
> +
> +	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	/* Each of our child cpusets must be a subset of us */
> +	ret = -EBUSY;
> +	cpuset_for_each_child(c, css, cur)
> +		if (!is_cpuset_subset(c, trial))
> +			goto out;
> +
> +	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
> +	ret = -EACCES;
> +	par = parent_cs(cur);
> +	if (par && !is_cpuset_subset(trial, par))
> +		goto out;
> +
> +	ret = 0;
> +out:
> +	return ret;
> +}
> +
>   /*
>    * validate_change() - Used to validate that any proposed cpuset change
>    *		       follows the structural rules for cpusets.
> @@ -614,20 +643,21 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>   {
>   	struct cgroup_subsys_state *css;
>   	struct cpuset *c, *par;
> -	int ret;
> -
> -	/* The checks don't apply to root cpuset */
> -	if (cur == &top_cpuset)
> -		return 0;
> +	int ret = 0;
>   
>   	rcu_read_lock();
> -	par = parent_cs(cur);
>   
> -	/* On legacy hierarchy, we must be a subset of our parent cpuset. */
> -	ret = -EACCES;
> -	if (!is_in_v2_mode() && !is_cpuset_subset(trial, par))
> +	if (!is_in_v2_mode())
> +		ret = validate_change_legacy(cur, trial);
> +	if (ret)
> +		goto out;
> +
> +	/* Remaining checks don't apply to root cpuset */
> +	if (cur == &top_cpuset)
>   		goto out;
>   
> +	par = parent_cs(cur);
> +
>   	/*
>   	 * If either I or some sibling (!= me) is exclusive, we can't
>   	 * overlap
> @@ -1175,9 +1205,7 @@ enum subparts_cmd {
>    *
>    * Because of the implicit cpu exclusive nature of a partition root,
>    * cpumask changes that violates the cpu exclusivity rule will not be
> - * permitted when checked by validate_change(). The validate_change()
> - * function will also prevent any changes to the cpu list if it is not
> - * a superset of children's cpu lists.
> + * permitted when checked by validate_change().
>    */
>   static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
>   					  struct cpumask *newmask,

Thanks for addressing this issue.

Reviewed-by: Waiman Long <longman@redhat.com>

Patch

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index d0e163a02099..0dd7d853ed17 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -616,19 +616,11 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 	struct cpuset *c, *par;
 	int ret;
 
-	rcu_read_lock();
-
-	/* Each of our child cpusets must be a subset of us */
-	ret = -EBUSY;
-	cpuset_for_each_child(c, css, cur)
-		if (!is_cpuset_subset(c, trial))
-			goto out;
-
-	/* Remaining checks don't apply to root cpuset */
-	ret = 0;
+	/* The checks don't apply to root cpuset */
 	if (cur == &top_cpuset)
-		goto out;
+		return 0;
 
+	rcu_read_lock();
 	par = parent_cs(cur);
 
 	/* On legacy hierarchy, we must be a subset of our parent cpuset. */