Message ID: CALAqxLXLf_hB0xQ_vnBYoHyDF2-Bkh9GznLQK=ikYFdcw5u3WQ@mail.gmail.com
State: New
On Fri, Aug 12, 2016 at 6:44 PM, Om Dhyade <odhyade@codeaurora.org> wrote:
> Update from my tests:
> Use-case: Android application launches.
>
> I tested the patches on the Android N build; I see a max latency of ~7ms.
> In my tests, the wait is due to copy_process() (fork.c) blocking all
> threads in __cgroup_procs_write(), including threads which are not part
> of the forking process's thread-group.
>
> Dmitry had provided a hack patch which reverts to the per-process rwsem,
> which had a max latency of ~2ms.
>
> The Android user-space binder library does two cgroup write operations
> per transaction. Apart from the copy_process() (fork.c) wait, I see
> preemption in __cgroup_procs_write() causing waits.

Hey Peter, Tejun, Oleg,

So while your tweaks to the percpu-rwsem have greatly helped the
regression folks were seeing (many thanks, by the way), as noted above,
cgroup migration with the global lock is still ~3x slower than on earlier
kernels (though again, much better than the 80x slowdown seen earlier).

So I was wondering if patches to go back to the per-signal_struct locking
would still be considered? Or is the global lock approach the only way
forward?

At a higher level, I'm worried that Android's use of cgroups as a
priority-enforcement mechanism is at odds with developers focusing on
them as a container-enforcement mechanism: in the latter it's not common
for tasks to change between cgroups, but with the former priority
adjustments are quite common.

thanks
-john
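[Editor's note: the contention described above comes from cgroup_threadgroup_rwsem being a single, global percpu-rwsem: every cgroup.procs/tasks write takes it for writing, while every fork in the system takes it for reading, so one slow migration write stalls forks in completely unrelated processes. The sketch below is a heavily simplified illustration of that pattern under those assumptions, not the actual kernel code; the older per-signal_struct rwsem scoped the same exclusion to a single thread-group, which is why Dmitry's revert brought the worst-case latency back down.]

/*
 * Heavily simplified sketch of the global-lock interaction described
 * above (illustrative only; see kernel/cgroup.c and kernel/fork.c in
 * the 4.x kernels for the real code).
 */
#include <linux/percpu-rwsem.h>
#include <linux/sched.h>

extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem;

/* Writer side: any write to cgroup.procs or tasks migrates a task. */
static void cgroup_migrate_task_sketch(struct task_struct *task)
{
	percpu_down_write(&cgroup_threadgroup_rwsem);	/* stalls every fork */
	/* ... move @task to its new css_set ... */
	percpu_up_write(&cgroup_threadgroup_rwsem);
}

/* Reader side: every fork on the system takes the same global lock. */
static void copy_process_sketch(void)
{
	percpu_down_read(&cgroup_threadgroup_rwsem);	/* waits behind any writer */
	/* ... duplicate the task and attach it to the parent's cgroups ... */
	percpu_up_read(&cgroup_threadgroup_rwsem);
}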
On Wed, Aug 24, 2016 at 2:30 PM, Tejun Heo <tj@kernel.org> wrote:
> On Wed, Aug 24, 2016 at 02:16:52PM -0700, John Stultz wrote:
>>
>> So I was wondering if patches to go back to the per-signal_struct
>> locking would still be considered? Or is the global lock approach the
>> only way forward?
>
> We can't simply revert, but we can make the lock per-signal_struct
> again. It's just that it'd be quite a bit more complex (but, again,
> if we need it...), and for cases where migrations aren't as frequent
> percpu-rwsem would be at least a bit lower overhead. Can you please
> test with the following patch applied, just in case?
>
> https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/commit/?h=for-4.8-fixes&id=568ac888215c7fb2fabe8ea739b00ec3c1f5d440

That does provide a reasonable improvement in my testing! But I'll pass
it along to folks who are doing more in-depth performance analysis to
make sure. If that doesn't work, I'll talk with Dmitry about submitting
his patch reworking things back to the per-signal_struct locking.

>> At a higher level, I'm worried that Android's use of cgroups as a
>> priority-enforcement mechanism is at odds with developers focusing on
>> them as a container-enforcement mechanism: in the latter it's not
>> common for tasks to change between cgroups, but with the former
>> priority adjustments are quite common.
>
> It has been at odds for as long as Android has existed. cgroup used to
> have a synchronous synchronize_rcu() in the migration path, which the
> Android kernel simply deleted (it didn't break Android's use case);
> there was re-labeling of every page's memcg ownership on foreground and
> background switches (does it still do that? it's simply unworkable);
> and so on.

Yeah. Android folks can correct me, but I think the memcg
foreground/background bits have been dropped. They still use an apps
memcg group for low-memory notification in their userland
low-memory-killer daemon (though my understanding is that it has yet to
sufficiently replace the in-kernel low-memory killer).

> I find it difficult to believe that Android's requirements can be
> satisfied only through moving processes around. The cgroup usage model
> is affected by container use cases but isn't limited to them. If
> Android is willing to change, I'd be more than happy to work with you
> and solve obstacles. If not, we'll surely try not to break it any more
> than it has always been broken.

I think folks are open to ideas, but going from idea-on-the-list to
something reliable enough to ship is always a bit fraught, so it often
takes some time and effort to really validate whether those ideas are
workable.

I have on my list to re-submit the can_attach logic Android uses to
provide permission checks on processes migrating other tasks between
cgroups (a rough sketch of the shape of that check follows below), so at
least we can try to reduce the separate domains of focus and get folks
sharing the same code bases.

thanks
-john
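[Editor's note: the check referred to above is a cgroup ->can_attach-style hook that rejects a migration unless the writer has sufficient privilege over the task being moved. The following is only a hypothetical sketch of that shape; the function name, the choice of CAP_SYS_NICE, and the exact ownership policy are illustrative assumptions, not the patch Android actually ships.]

/*
 * Hypothetical sketch of a permission check in a ->can_attach callback.
 * Names and policy are illustrative; this is not the shipped Android patch.
 */
static int example_can_attach(struct cgroup_taskset *tset)
{
	struct task_struct *task;
	struct cgroup_subsys_state *css;
	const struct cred *cred = current_cred(), *tcred;
	int ret = 0;

	cgroup_taskset_for_each(task, css, tset) {
		rcu_read_lock();
		tcred = __task_cred(task);

		/* Allow privileged writers, or moving a task you own. */
		if (!capable(CAP_SYS_NICE) &&
		    !uid_eq(cred->euid, tcred->uid) &&
		    !uid_eq(cred->euid, tcred->suid))
			ret = -EACCES;
		rcu_read_unlock();

		if (ret)
			break;
	}
	return ret;
}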
On Wed, Aug 24, 2016 at 2:30 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, John.
>
> On Wed, Aug 24, 2016 at 02:16:52PM -0700, John Stultz wrote:
>> Hey Peter, Tejun, Oleg,
>>
>> So while your tweaks to the percpu-rwsem have greatly helped the
>> regression folks were seeing (many thanks, by the way), as noted
>> above, cgroup migration with the global lock is still ~3x slower than
>> on earlier kernels (though again, much better than the 80x slowdown
>> seen earlier).
>>
>> So I was wondering if patches to go back to the per-signal_struct
>> locking would still be considered? Or is the global lock approach the
>> only way forward?
>
> We can't simply revert, but we can make the lock per-signal_struct
> again. It's just that it'd be quite a bit more complex (but, again,
> if we need it...), and for cases where migrations aren't as frequent
> percpu-rwsem would be at least a bit lower overhead. Can you please
> test with the following patch applied, just in case?
>
> https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/commit/?h=for-4.8-fixes&id=568ac888215c7fb2fabe8ea739b00ec3c1f5d440

Hey! Good news. This patch, along with Peter's locking changes, pushes
the latencies down to an apparently acceptable level! Many thanks for
the pointer!

thanks
-john
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index db27804..9e9200b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5394,6 +5394,8 @@ int __init cgroup_init(void)
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files));
 
+	rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
+
 	mutex_lock(&cgroup_mutex);
 
 	/* Add init_css_set to the hash table */
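[Editor's note: for readers wondering why this one-liner helps: with a percpu-rwsem, the expensive part of a write acquisition is the sched-RCU grace period needed to switch readers off their per-cpu fast path. Calling rcu_sync_enter() on the rwsem's rcu_sync state once at init pins readers on the slow path for good, so each cgroup migration write no longer pays that grace-period latency. The sketch below is a rough, renamed outline of the mechanism under that reading, not the real implementation; see kernel/locking/percpu-rwsem.c of that era for the actual write path. The trade-off, as I understand it, is that the reader side (fork) permanently uses the slightly more expensive slow path.]

/*
 * Rough, renamed sketch of the percpu-rwsem write path of that era
 * (simplified; not the real implementation).
 */
void percpu_down_write_sketch(struct percpu_rw_semaphore *sem)
{
	/*
	 * Move readers from their per-cpu fast path to the shared slow
	 * path.  Without the init-time rcu_sync_enter() added above, this
	 * is where every migration write waits for a sched-RCU grace
	 * period; with readers already pinned to the slow path it returns
	 * almost immediately.
	 */
	rcu_sync_enter(&sem->rss);

	/* Then exclude current readers and other writers as usual. */
	down_write(&sem->rw_sem);
}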