mbox series

[v3,0/3] cpuset: Allow setscheduler regardless of manipulated task

Message ID 20230703172741.25392-1-mkoutny@suse.com
Headers show
Series cpuset: Allow setscheduler regardless of manipulated task | expand

Message

Michal Koutný July 3, 2023, 5:27 p.m. UTC
Changes in v3:
- only skip permissions check when effective affinity doesn't change
- update commit message accordingly

Changes in v2 (https://lore.kernel.org/r/20230630183908.32148-1-mkoutny@suse.com):
- rebased on mainline
- drop is_in_v2_mode()

Changes in v1 (https://lore.kernel.org/r/20230629091146.28801-1-mkoutny@suse.com):
- added selftests
- comments rewording

RFC in https://lore.kernel.org/r/20220623124944.2753-1-mkoutny@suse.com

Michal Koutný (3):
  cpuset: Allow setscheduler regardless of manipulated task
  selftests: cgroup: Minor code reorganizations
  selftests: cgroup: Add cpuset migrations testcase

 MAINTAINERS                                   |   2 +
 kernel/cgroup/cpuset.c                        |  19 +-
 tools/testing/selftests/cgroup/.gitignore     |   1 +
 tools/testing/selftests/cgroup/Makefile       |   2 +
 tools/testing/selftests/cgroup/cgroup_util.c  |   2 +
 tools/testing/selftests/cgroup/cgroup_util.h  |   2 +
 tools/testing/selftests/cgroup/test_core.c    |   2 +-
 tools/testing/selftests/cgroup/test_cpuset.c  | 275 ++++++++++++++++++
 .../selftests/cgroup/test_cpuset_prs.sh       |   2 +-
 9 files changed, 302 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/cgroup/test_cpuset.c


base-commit: e55e5df193d247a38a5e1ac65a5316a0adcc22fa

Comments

Waiman Long July 3, 2023, 6 p.m. UTC | #1
On 7/3/23 13:27, Michal Koutný wrote:
> When we migrate a task between two cgroups, one of the checks is a
> verification whether we can modify task's scheduler settings
> (cap_task_setscheduler()).
>
> An implicit migration occurs also when enabling a controller on the
> unified hierarchy (think of parent to child migration). The
> aforementioned check may be problematic if the caller of the migration
> (enabling a controller) has no permissions over migrated tasks.
> For instance, a user's cgroup that ends up running a process of a
> different user. Although cgroup permissions are configured favorably,
> the enablement fails due to the foreign process [1].
>
> Change the behavior by relaxing the permissions check on the unified
> hierarchy when no effective change would happen.
> This is in accordance with unified hierarchy attachment behavior when
> permissions of the source to target cgroups are decisive whereas the
> migrated task is opaque (as opposed to more restrictive check in
> __cgroup1_procs_write()).
>
> Notice that foreign task's affinity may still be modified if the user
> can modify destination cgroup's cpuset attributes
> (update_tasks_cpumask() does no permissions check). The permissions
> check could thus be skipped on v2 even when affinity changes. Stay
> conservative in this patch though.
>
> [1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649
>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>   kernel/cgroup/cpuset.c | 19 ++++++++++++++++---
>   1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 58e6f18f01c1..0a9b860844ca 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2487,6 +2487,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   	struct cgroup_subsys_state *css;
>   	struct cpuset *cs, *oldcs;
>   	struct task_struct *task;
> +	bool cpus_updated, mems_updated;
>   	int ret;
>   
>   	/* used later by cpuset_attach() */
> @@ -2501,13 +2502,25 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   	if (ret)
>   		goto out_unlock;
>   
> +	cpus_updated = !cpumask_equal(cs->effective_cpus, oldcs->effective_cpus);
> +	mems_updated = !nodes_equal(cs->effective_mems, oldcs->effective_mems);
> +
>   	cgroup_taskset_for_each(task, css, tset) {
>   		ret = task_can_attach(task);
>   		if (ret)
>   			goto out_unlock;
> -		ret = security_task_setscheduler(task);
> -		if (ret)
> -			goto out_unlock;
> +
> +		/*
> +		 * Skip rights over task check in v2 when nothing changes,
> +		 * migration permission derives from hierarchy ownership in
> +		 * cgroup_procs_write_permission()).
> +		 */
> +		if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
> +		    (cpus_updated || mems_updated)) {
> +			ret = security_task_setscheduler(task);
> +			if (ret)
> +				goto out_unlock;
> +		}
>   
>   		if (dl_task(task)) {
>   			cs->nr_migrate_dl_tasks++;
Reviewed-by: Waiman Long <longman@redhat.com>

Thanks,
Longman
Tejun Heo July 10, 2023, 8:30 p.m. UTC | #2
Applied to cgroup/for-6.6.

Thanks.