
[PATCH-cgroup,v4,0/6] cgroup/cpuset: Add new cpuset partition type & empty effective cpus

Message ID 20210811030607.13824-1-longman@redhat.com

Message

Waiman Long Aug. 11, 2021, 3:06 a.m. UTC
v4:
 - Rebased to the for-5.15 branch of cgroup git tree and dropped the
   first 3 patches of v3 series which have been merged.
 - Besides prohibiting violation of the cpu exclusivity rule, allow
   arbitrary changes to cpuset.cpus of a partition root and force the
   partition root to become invalid in case any of the partition root
   constraints are violated. The documentation file and self test are
   modified accordingly.

v3:
 - Add two new patches (patches 2 & 3) to fix bugs found during the
   testing process.
 - Add a new patch to enable inotify event notification when a partition
   becomes invalid.
 - Add a test for event notification when a partition becomes invalid.

v2:
 - Drop v1 patch 1.
 - Break out some cosmetic changes into a separate patch (patch #1).
 - Add a new patch to clarify that the transition to an invalid partition
   root is mainly caused by hotplug events.
 - Enhance the partition root state test including CPU online/offline
   behavior and fix issues found by the test.

This patchset makes four enhancements to the cpuset v2 code.

 Patch 1: Enable event notification on "cpuset.cpus.partition" whenever
 the state of a partition changes.
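 As an illustration (not part of the patchset itself), a user-space agent
 could watch for such state transitions with a sketch like the one below.
 It assumes the `inotifywait` utility from inotify-tools, a cgroup v2
 hierarchy mounted at /sys/fs/cgroup, and a hypothetical partition root
 named "part1"; patch 6 adds a small wait_inotify.c helper that fills the
 same role in the self test:

```shell
# Sketch: block until the partition state of "part1" changes, then
# report the new state. Requires root and a cgroup v2 mount; the cgroup
# name "part1" is a placeholder for this example.
CG=/sys/fs/cgroup/part1

# kernfs notifications on cgroup control files surface as inotify
# modify events, so waiting on IN_MODIFY catches hotplug-induced
# transitions such as "root" -> "root invalid".
inotifywait -e modify "$CG/cpuset.cpus.partition"

# Read back the state that triggered the event.
cat "$CG/cpuset.cpus.partition"
```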

 Patch 2: Properly handle the partition root tree and make a partition
 invalid in case changes to cpuset.cpus violate any of the partition
 root constraints.

 Patch 3: Add a new partition state "isolated" to create a partition
 root without load balancing. This is for handling intermittent workloads
 that have a strict low latency requirement.
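 Under the interface this patch adds, carving out such an isolated
 partition might look like the following sketch (assuming a cgroup v2
 mount at /sys/fs/cgroup, root privileges, and that CPUs 2-3 can be
 given up by the parent cpuset; the cgroup name "isol" is a placeholder):

```shell
# Sketch: create a partition root with load balancing disabled.
# Enable the cpuset controller for children of the root cgroup.
echo "+cpuset" > /sys/fs/cgroup/cgroup.subtree_control

# Create the child cgroup and claim CPUs 2-3 for it.
mkdir /sys/fs/cgroup/isol
echo 2-3 > /sys/fs/cgroup/isol/cpuset.cpus

# Switch it to the new "isolated" partition type; its CPUs are then
# excluded from scheduler load balancing.
echo isolated > /sys/fs/cgroup/isol/cpuset.cpus.partition

# Verify: this should now read "isolated" if the constraints were met.
cat /sys/fs/cgroup/isol/cpuset.cpus.partition
```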

 Patch 4: Allow partition roots that are not the top cpuset to distribute
 all their CPUs to child partitions as long as there is no task associated
 with that partition root. This allows more flexibility for middleware
 to manage multiple partitions.
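 The behavior this patch permits can be sketched as follows (assuming a
 cgroup v2 mount at /sys/fs/cgroup and root privileges; "mid" and
 "child" are placeholder cgroup names, and "mid" must hold no tasks):

```shell
# Sketch: a non-top partition root hands all of its CPUs to a child
# partition, ending up with an empty effective cpu list.
echo "+cpuset" > /sys/fs/cgroup/cgroup.subtree_control
mkdir -p /sys/fs/cgroup/mid/child

# Make "mid" a partition root owning CPUs 4-5.
echo 4-5 > /sys/fs/cgroup/mid/cpuset.cpus
echo root > /sys/fs/cgroup/mid/cpuset.cpus.partition
echo "+cpuset" > /sys/fs/cgroup/mid/cgroup.subtree_control

# Give *all* of mid's CPUs to the child partition. Before this patch
# the last write would be rejected; with it, the write succeeds as long
# as no task is associated with "mid".
echo 4-5 > /sys/fs/cgroup/mid/child/cpuset.cpus
echo root > /sys/fs/cgroup/mid/child/cpuset.cpus.partition
```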

Patch 5 updates the cgroup-v2.rst file accordingly. Patch 6 adds a new
cpuset test to test the new cpuset partition code.

Waiman Long (6):
  cgroup/cpuset: Enable event notification when partition state changes
  cgroup/cpuset: Properly handle partition root tree
  cgroup/cpuset: Add a new isolated cpus.partition type
  cgroup/cpuset: Allow non-top parent partition root to distribute out
    all CPUs
  cgroup/cpuset: Update description of cpuset.cpus.partition in
    cgroup-v2.rst
  kselftest/cgroup: Add cpuset v2 partition root state test

 Documentation/admin-guide/cgroup-v2.rst       | 104 +--
 kernel/cgroup/cpuset.c                        | 282 +++++---
 tools/testing/selftests/cgroup/Makefile       |   5 +-
 .../selftests/cgroup/test_cpuset_prs.sh       | 632 ++++++++++++++++++
 tools/testing/selftests/cgroup/wait_inotify.c |  86 +++
 5 files changed, 980 insertions(+), 129 deletions(-)
 create mode 100755 tools/testing/selftests/cgroup/test_cpuset_prs.sh
 create mode 100644 tools/testing/selftests/cgroup/wait_inotify.c

Comments

Tejun Heo Aug. 11, 2021, 6:15 p.m. UTC | #1
Hello,

On Tue, Aug 10, 2021 at 11:06:06PM -0400, Waiman Long wrote:
> +	Poll and inotify events are triggered whenever the state
> +	of "cpuset.cpus.partition" changes.  That includes changes
> +	caused by write to "cpuset.cpus.partition" and cpu hotplug.
> +	This will allow a user space agent to monitor changes caused
> +	by hotplug events.

It might be useful to emphasize that this is the primary mechanism to
signify errors and thus should always be monitored.

Thanks.
Waiman Long Aug. 11, 2021, 6:19 p.m. UTC | #2
On 8/11/21 2:15 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Aug 10, 2021 at 11:06:06PM -0400, Waiman Long wrote:
>> +	Poll and inotify events are triggered whenever the state
>> +	of "cpuset.cpus.partition" changes.  That includes changes
>> +	caused by write to "cpuset.cpus.partition" and cpu hotplug.
>> +	This will allow a user space agent to monitor changes caused
>> +	by hotplug events.
> It might be useful to emphasize that this is the primary mechanism to
> signify errors and thus should always be monitored.

Sure, will do that in the next version.

Cheers,
Longman