Message ID: 20180228121458.2230-1-mark.rutland@arm.com
State: New
Series: Detect early free of a live mm
Ugh, I messed up Peter's email when sending this out. please
s/infraded/infradead/ if replying to the first mail.

Sorry about that.

Mark.

On Wed, Feb 28, 2018 at 12:14:58PM +0000, Mark Rutland wrote:
> KASAN splats indicate that in some cases we free a live mm, then
> continue to access it, with potentially disastrous results. This is
> likely due to a mismatched mmdrop() somewhere in the kernel, but so far
> the culprit remains elusive.
>
> Let's have __mmdrop() verify that the mm isn't live for the current
> task, similar to the existing check for init_mm. This way, we can catch
> this class of issue earlier, and without requiring KASAN.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Peter Zijlstra <peterz@infraded.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  kernel/fork.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> Hi,
>
> For context, we're seeing an intermittent use-after-free of an mm on
> arm64 [1], where it looks like an mm has been freed earlier than
> expected. So far KASAN has only caught legitimate mmdrop() uses, where
> mm_count is presumably already bogus.
>
> Mark.
>
> [1] https://lkml.kernel.org/r/20180214120254.qq4w4s42ecxio7lu@lakrids.cambridge.arm.com
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index e5d9d405ae4e..6922d93551b8 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -595,6 +595,8 @@ static void check_mm(struct mm_struct *mm)
>  void __mmdrop(struct mm_struct *mm)
>  {
>  	BUG_ON(mm == &init_mm);
> +	BUG_ON(mm == current->mm);
> +	BUG_ON(mm == current->active_mm);
>  	mm_free_pgd(mm);
>  	destroy_context(mm);
>  	hmm_mm_destroy(mm);
> --
> 2.11.0
On Wed 28-02-18 12:18:10, Mark Rutland wrote:
[...]
> On Wed, Feb 28, 2018 at 12:14:58PM +0000, Mark Rutland wrote:
> > KASAN splats indicate that in some cases we free a live mm, then
> > continue to access it, with potentially disastrous results. This is
> > likely due to a mismatched mmdrop() somewhere in the kernel, but so far
> > the culprit remains elusive.
> >
> > Let's have __mmdrop() verify that the mm isn't live for the current
> > task, similar to the existing check for init_mm. This way, we can catch
> > this class of issue earlier, and without requiring KASAN.

Wouldn't it be better to check the mm_users count instead?

	{VM_}BUG_ON(atomic_read(&mm->mm_users));

Relying on current->mm resetting works currently but this is quite a
subtle dependency.

--
Michal Hocko
SUSE Labs
On Thu, Mar 01, 2018 at 10:35:22AM +0100, Michal Hocko wrote:
> On Wed 28-02-18 12:18:10, Mark Rutland wrote:
[...]
> > > Let's have __mmdrop() verify that the mm isn't live for the current
> > > task, similar to the existing check for init_mm. This way, we can catch
> > > this class of issue earlier, and without requiring KASAN.
>
> Wouldn't it be better to check the mm_users count instead?
>
> 	{VM_}BUG_ON(atomic_read(&mm->mm_users));

Perhaps, but that won't catch a mismatched mmput(), which could
decrement mm_users (and mm_count) to zero early, before current->mm is
cleared in exit_mm().

Locally, I'm testing with:

	BUG_ON(mm == current->mm);
	BUG_ON(mm == current->active_mm);
	BUG_ON(refcount_read(&mm->mm_users) != 0);
	BUG_ON(refcount_read(&mm->mm_count) != 0);

... so as to also catch an early free from another thread.

> Relying on current->mm resetting works currently but this is quite a
> subtle dependency.

Do you mean because it only catches an early free in a thread using that
mm, or is there another subtlety you had in mind?

Thanks,
Mark.
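[For reference, the mmput()/mmdrop() pairing behind Mark's point looks
roughly like this, a sketch paraphrased from kernel/fork.c around
v4.16 with the teardown details elided (the mainline counters were
still atomic_t at the time):

	void mmput(struct mm_struct *mm)
	{
		might_sleep();

		/* The final mmput() drops mm_users to zero ... */
		if (atomic_dec_and_test(&mm->mm_users))
			__mmput(mm);	/* ... which unmaps everything and ends in mmdrop() */
	}

So a mismatched mmput() can reach the final mmdrop() and free the mm
while current->mm still points at it, which a bare mm_users check in
__mmdrop() would not flag.]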
On Thu 01-03-18 11:22:37, Mark Rutland wrote:
> On Thu, Mar 01, 2018 at 10:35:22AM +0100, Michal Hocko wrote:
> > Wouldn't it be better to check the mm_users count instead?
> >
> > 	{VM_}BUG_ON(atomic_read(&mm->mm_users));
>
> Perhaps, but that won't catch a mismatched mmput(), which could
> decrement mm_users (and mm_count) to zero early, before current->mm is
> cleared in exit_mm().

true

> Locally, I'm testing with:
>
> 	BUG_ON(mm == current->mm);
> 	BUG_ON(mm == current->active_mm);
> 	BUG_ON(refcount_read(&mm->mm_users) != 0);
> 	BUG_ON(refcount_read(&mm->mm_count) != 0);
>
> ... so as to also catch an early free from another thread.

I would be careful about active_mm. At least {un}use_mm does some tricks
with it.

> > Relying on current->mm resetting works currently but this is quite a
> > subtle dependency.
>
> Do you mean because it only catches an early free in a thread using that
> mm, or is there another subtlety you had in mind?

No, I was more thinking about future changes when we stop clearing
tsk->mm at exit. This is rather unlikely to be honest because that
requires more changes.

--
Michal Hocko
SUSE Labs
On Thu, Mar 01, 2018 at 01:40:24PM +0100, Michal Hocko wrote:
> On Thu 01-03-18 11:22:37, Mark Rutland wrote:
> > Locally, I'm testing with:
> >
> > 	BUG_ON(mm == current->mm);
> > 	BUG_ON(mm == current->active_mm);
> > 	BUG_ON(refcount_read(&mm->mm_users) != 0);
> > 	BUG_ON(refcount_read(&mm->mm_count) != 0);
> >
> > ... so as to also catch an early free from another thread.
>
> I would be careful about active_mm. At least {un}use_mm does some tricks
> with it.

If that can leave mm == current->active_mm during __mmdrop(), I think
that would be a bug in its own right, since that would leave
current->active_mm pointing at a freed mm.

AFAICT, use_mm() does the usual mm_count balancing when flipping
active_mm (i.e. we mmgrab() the new mm, make it active_mm, then
mmdrop() the old active_mm), so even if the mmdrop() ends up freeing
the old active_mm, that shouldn't trigger the BUG_ON(mm ==
current->active_mm).

In unuse_mm() we don't poke active_mm at all, and leave that to the
usual context_switch() + finish_task_switch() grab/drop.

> > > Relying on current->mm resetting works currently but this is quite a
> > > subtle dependency.
> >
> > Do you mean because it only catches an early free in a thread using that
> > mm, or is there another subtlety you had in mind?
>
> No, I was more thinking about future changes when we stop clearing
> tsk->mm at exit. This is rather unlikely to be honest because that
> requires more changes.

Ok.

Thanks,
Mark.
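[For reference, the mm_count balancing Mark describes is visible in
use_mm(); the following is a sketch paraphrased from mm/mmu_context.c
around v4.16, with the arch post-switch hook omitted:

	void use_mm(struct mm_struct *mm)
	{
		struct mm_struct *active_mm;
		struct task_struct *tsk = current;

		task_lock(tsk);
		active_mm = tsk->active_mm;
		if (active_mm != mm) {
			mmgrab(mm);		/* reference taken before mm becomes active_mm */
			tsk->active_mm = mm;
		}
		tsk->mm = mm;
		switch_mm(active_mm, mm, tsk);
		task_unlock(tsk);

		/*
		 * The old active_mm is only dropped after the switch, so
		 * even if this mmdrop() frees it, tsk->active_mm already
		 * points at the new mm, not the freed one.
		 */
		if (active_mm != mm)
			mmdrop(active_mm);
	}
]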
On Wed, 28 Feb 2018 12:14:58 +0000 Mark Rutland <mark.rutland@arm.com> wrote:

> KASAN splats indicate that in some cases we free a live mm, then
> continue to access it, with potentially disastrous results. This is
> likely due to a mismatched mmdrop() somewhere in the kernel, but so far
> the culprit remains elusive.
>
> Let's have __mmdrop() verify that the mm isn't live for the current
> task, similar to the existing check for init_mm. This way, we can catch
> this class of issue earlier, and without requiring KASAN.

Presumably the results usually aren't disastrous. But they will be if
we go and add BUG_ON()s! Can we make this WARN_ON[_ONCE]()? We should
still get the same info from testers.
On Thu, Mar 01, 2018 at 05:16:09PM -0800, Andrew Morton wrote:
> On Wed, 28 Feb 2018 12:14:58 +0000 Mark Rutland <mark.rutland@arm.com> wrote:
> > Let's have __mmdrop() verify that the mm isn't live for the current
> > task, similar to the existing check for init_mm. This way, we can catch
> > this class of issue earlier, and without requiring KASAN.
>
> Presumably the results usually aren't disastrous.

It seems so.

> But they will be if we go and add BUG_ON()s! Can we make this
> WARN_ON[_ONCE]()? We should still get the same info from testers.

I fuzz the kernel with panic_on_warn=1, so that's practically the same
for me. I'll respin to that effect.

Thanks,
Mark.
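[A respin along those lines might look like the following; this is a
sketch only, not a posted v2, and assumes the same kernel/fork.c
context as the original patch:

	void __mmdrop(struct mm_struct *mm)
	{
		BUG_ON(mm == &init_mm);
		/* Non-fatal checks, as suggested: report and keep going */
		WARN_ON_ONCE(mm == current->mm);
		WARN_ON_ONCE(mm == current->active_mm);
		mm_free_pgd(mm);
		destroy_context(mm);
		hmm_mm_destroy(mm);
		/* ... remainder as in mainline ... */
	}
]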
On Sat, Mar 10, 2018 at 05:11:39AM +0800, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: 94d3a254089a7cd4f11b7071b4323afd98eea0a6 ("Detect early free of a live mm")
> url: https://github.com/0day-ci/linux/commits/Mark-Rutland/Detect-early-free-of-a-live-mm/20180303-144149
>
> [ 47.208935] kernel BUG at kernel/fork.c:599!
> [ 47.210365] invalid opcode: 0000 [#1] SMP PTI
> [ 47.211336] Modules linked in:
> [ 47.212145] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-rc3-00001-g94d3a25 #1
> [ 47.213966] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 47.215869] RIP: 0010:__mmdrop+0x136/0x170
> [ 47.216866] RSP: 0018:ffffffff82803dd8 EFLAGS: 00010293
> [ 47.218160] RAX: ffffffff82818500 RBX: ffff880115770000 RCX: ffffffff810ae876
> [ 47.219758] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880115770000
> [ 47.221306] RBP: ffffffff82803e00 R08: 0000000000000001 R09: 0000000000000000
> [ 47.223268] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82818500
> [ 47.224961] R13: ffffffff82a8ce20 R14: ffff88013ff534c0 R15: 00000000000003e7
> [ 47.226716] FS: 0000000000000000(0000) GS:ffff88013b200000(0000) knlGS:0000000000000000
> [ 47.228550] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 47.229884] CR2: 00007fbfc2cc0190 CR3: 0000000002812000 CR4: 00000000000006f0
> [ 47.231580] Call Trace:
> [ 47.232144]  idle_task_exit+0x53/0x60

Luckily this is a spurious warning. In idle_task_exit(), we switch to
the init_mm, but leave active_mm stale before calling mmdrop().

In addition to the WARN_ON[_ONCE] changes, I'll drop the following in:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e7c535eee0a6..0ef844abc2da 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5506,6 +5506,7 @@ void idle_task_exit(void)
 
 	if (mm != &init_mm) {
 		switch_mm(mm, &init_mm, current);
+		current->active_mm = &init_mm;
 		finish_arch_post_lock_switch();
 	}
 	mmdrop(mm);

Thanks,
Mark.
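[For context, idle_task_exit() with the above one-liner applied would
look roughly like this, a sketch based on kernel/sched/core.c around
v4.16:

	void idle_task_exit(void)
	{
		struct mm_struct *mm = current->active_mm;

		BUG_ON(cpu_online(smp_processor_id()));

		if (mm != &init_mm) {
			switch_mm(mm, &init_mm, current);
			current->active_mm = &init_mm;	/* keep active_mm in sync with the switch */
			finish_arch_post_lock_switch();
		}
		mmdrop(mm);
	}

Without the added line, active_mm still points at the old mm when
mmdrop() frees it, which is exactly the (benign, here) condition the
new check trips over.]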
diff --git a/kernel/fork.c b/kernel/fork.c
index e5d9d405ae4e..6922d93551b8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -595,6 +595,8 @@ static void check_mm(struct mm_struct *mm)
 void __mmdrop(struct mm_struct *mm)
 {
 	BUG_ON(mm == &init_mm);
+	BUG_ON(mm == current->mm);
+	BUG_ON(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	hmm_mm_destroy(mm);
KASAN splats indicate that in some cases we free a live mm, then
continue to access it, with potentially disastrous results. This is
likely due to a mismatched mmdrop() somewhere in the kernel, but so far
the culprit remains elusive.

Let's have __mmdrop() verify that the mm isn't live for the current
task, similar to the existing check for init_mm. This way, we can catch
this class of issue earlier, and without requiring KASAN.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infraded.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 kernel/fork.c | 2 ++
 1 file changed, 2 insertions(+)

Hi,

For context, we're seeing an intermittent use-after-free of an mm on
arm64 [1], where it looks like an mm has been freed earlier than
expected. So far KASAN has only caught legitimate mmdrop() uses, where
mm_count is presumably already bogus.

Mark.

[1] https://lkml.kernel.org/r/20180214120254.qq4w4s42ecxio7lu@lakrids.cambridge.arm.com

--
2.11.0