Message ID | 20220726180502.2932619-1-jstultz@google.com |
---|---|
State | New |
Headers | show |
Series | cyclictest: Fix threads being affined even when -a isn't set | expand |
On Tue, 26 Jul 2022, John Stultz wrote: > Using cyclictest without specifying affinity via -a, I was > noticing a strange issue where the rt threads where not > migrating when being blocked. > > After lots of debugging in the kernel, I found its actually an > issue with cyclictest. > > When using -t there is no behavioral difference between specifying > -a or not specifying -a. > > This can be confirmed by adding printf messages around the > pthread_setaffinity_np() call in the threadtest function. > > Currently: > > root@localhost:~/rt-tests# ./cyclictest -t -a -q -D1 > Affining thread 0 to cpu: 0 > Affining thread 1 to cpu: 1 > Affining thread 2 to cpu: 2 > Affining thread 3 to cpu: 3 > Affining thread 4 to cpu: 4 > Affining thread 5 to cpu: 5 > Affining thread 7 to cpu: 7 > Affining thread 6 to cpu: 6 > T: 0 (15034) P: 0 I:1000 C: 1000 Min: 82 Act: 184 Avg: 180 Max: 705 > ... > > root@localhost:~/rt-tests# ./cyclictest -t -q -D1 > Affining thread 0 to cpu: 0 > Affining thread 1 to cpu: 1 > Affining thread 2 to cpu: 2 > Affining thread 3 to cpu: 3 > Affining thread 4 to cpu: 4 > Affining thread 5 to cpu: 5 > Affining thread 6 to cpu: 6 > Affining thread 7 to cpu: 7 > T: 0 (15044) P: 0 I:1000 C: 1000 Min: 74 Act: 144 Avg: 162 Max: 860 > .. > > This issue seems to come from the logic in process_options(): > /* if smp wasn't requested, test for numa automatically */ > if (!smp) { > numa = numa_initialize(); > if (setaffinity == AFFINITY_UNSPECIFIED) > setaffinity = AFFINITY_USEALL; > } > > Here, by setting setaffinity = AFFINITY_USEALL, we effectively > pin each thread to its respective cpu, same as the "-a" option. > > This was most recently introduced in commit bdb8350f1b0b > ("Revert "cyclictest: Use affinity_mask for steering > thread placement""). > > This seems erronious to me, so I wanted to share this patch > which removes the overriding AFFINITY_UNSPECIFIED with > AFFINITY_USEALL by default. > > With this patch, we no longer call pthread_setaffinity_np() in the > "./cyclictest -t -q -D1" case. > > Cc: John Kacur <jkacur@redhat.com> > Cc: Connor O'Brien <connoro@google.com> > Cc: Qais Yousef <qais.yousef@arm.com> > Signed-off-by: John Stultz <jstultz@google.com> > --- > src/cyclictest/cyclictest.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c > index decea78..02f32f6 100644 > --- a/src/cyclictest/cyclictest.c > +++ b/src/cyclictest/cyclictest.c > @@ -1270,8 +1270,6 @@ static void process_options(int argc, char *argv[], int max_cpus) > /* if smp wasn't requested, test for numa automatically */ > if (!smp) { > numa = numa_initialize(); > - if (setaffinity == AFFINITY_UNSPECIFIED) > - setaffinity = AFFINITY_USEALL; > } > > if (option_affinity) { > -- > 2.37.1.359.gd136c6c3e2-goog > > Okay, I see what's happening. We removed the numa flag to automate if the user doesn't specify smi, then we use numa if we detect that it is available and in that case we assume AFFINITY_USEALL. Part of this was an attempt to reduce the alphabet soup that the cyclictest commandline was becoming. However, I see your point, if a person doesn't ask for affinity, then, it shouldn't be assumed. Your patch of course doesn't work as it is, because we'll end-up passing a cpu=-1 value to rt_numa_numa_node_of_cpu(int cpu) resulting in FATAL: invalid cpu passed to numa_node_of_cpu(-1) but it illustrated your point. I'm off for a few days, but I'll tackle this when I come back on. Thanks John
On Thu, Jul 28, 2022 at 7:48 AM John Kacur <jkacur@redhat.com> wrote: > On Tue, 26 Jul 2022, John Stultz wrote: > > diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c > > index decea78..02f32f6 100644 > > --- a/src/cyclictest/cyclictest.c > > +++ b/src/cyclictest/cyclictest.c > > @@ -1270,8 +1270,6 @@ static void process_options(int argc, char *argv[], int max_cpus) > > /* if smp wasn't requested, test for numa automatically */ > > if (!smp) { > > numa = numa_initialize(); > > - if (setaffinity == AFFINITY_UNSPECIFIED) > > - setaffinity = AFFINITY_USEALL; > > } > > > > if (option_affinity) { > > -- > > 2.37.1.359.gd136c6c3e2-goog > > > > > > Okay, I see what's happening. We removed the numa flag to automate if > the user doesn't specify smi, then we use numa if we detect that it is > available and in that case we assume AFFINITY_USEALL. Part of this was an > attempt to reduce the alphabet soup that the cyclictest commandline was > becoming. However, I see your point, if a person doesn't ask for affinity, > then, it shouldn't be assumed. > > Your patch of course doesn't work as it is, because we'll end-up passing a > cpu=-1 value to rt_numa_numa_node_of_cpu(int cpu) resulting in > FATAL: invalid cpu passed to numa_node_of_cpu(-1) > but it illustrated your point. Ah. Apologies! Interestingly I didn't see this on the device I was testing with, but I can reproduce it on another system. > I'm off for a few days, but I'll tackle this when I come back on. I've got an approach that in the numa case, has a separate node_cpu value which is normally cpu but if cpu is -1 it will use the result of cpu_for_thread_ua(). That should at least match previous behavior with regards to the numa allocation logic. I'll send that out in a second here and you can take or tweak it as you feel best when you're back. thanks -john
diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c index decea78..02f32f6 100644 --- a/src/cyclictest/cyclictest.c +++ b/src/cyclictest/cyclictest.c @@ -1270,8 +1270,6 @@ static void process_options(int argc, char *argv[], int max_cpus) /* if smp wasn't requested, test for numa automatically */ if (!smp) { numa = numa_initialize(); - if (setaffinity == AFFINITY_UNSPECIFIED) - setaffinity = AFFINITY_USEALL; } if (option_affinity) {
Using cyclictest without specifying affinity via -a, I was noticing a strange issue where the rt threads where not migrating when being blocked. After lots of debugging in the kernel, I found its actually an issue with cyclictest. When using -t there is no behavioral difference between specifying -a or not specifying -a. This can be confirmed by adding printf messages around the pthread_setaffinity_np() call in the threadtest function. Currently: root@localhost:~/rt-tests# ./cyclictest -t -a -q -D1 Affining thread 0 to cpu: 0 Affining thread 1 to cpu: 1 Affining thread 2 to cpu: 2 Affining thread 3 to cpu: 3 Affining thread 4 to cpu: 4 Affining thread 5 to cpu: 5 Affining thread 7 to cpu: 7 Affining thread 6 to cpu: 6 T: 0 (15034) P: 0 I:1000 C: 1000 Min: 82 Act: 184 Avg: 180 Max: 705 ... root@localhost:~/rt-tests# ./cyclictest -t -q -D1 Affining thread 0 to cpu: 0 Affining thread 1 to cpu: 1 Affining thread 2 to cpu: 2 Affining thread 3 to cpu: 3 Affining thread 4 to cpu: 4 Affining thread 5 to cpu: 5 Affining thread 6 to cpu: 6 Affining thread 7 to cpu: 7 T: 0 (15044) P: 0 I:1000 C: 1000 Min: 74 Act: 144 Avg: 162 Max: 860 .. This issue seems to come from the logic in process_options(): /* if smp wasn't requested, test for numa automatically */ if (!smp) { numa = numa_initialize(); if (setaffinity == AFFINITY_UNSPECIFIED) setaffinity = AFFINITY_USEALL; } Here, by setting setaffinity = AFFINITY_USEALL, we effectively pin each thread to its respective cpu, same as the "-a" option. This was most recently introduced in commit bdb8350f1b0b ("Revert "cyclictest: Use affinity_mask for steering thread placement""). This seems erronious to me, so I wanted to share this patch which removes the overriding AFFINITY_UNSPECIFIED with AFFINITY_USEALL by default. With this patch, we no longer call pthread_setaffinity_np() in the "./cyclictest -t -q -D1" case. Cc: John Kacur <jkacur@redhat.com> Cc: Connor O'Brien <connoro@google.com> Cc: Qais Yousef <qais.yousef@arm.com> Signed-off-by: John Stultz <jstultz@google.com> --- src/cyclictest/cyclictest.c | 2 -- 1 file changed, 2 deletions(-)