diff mbox series

[Stable-8.1.1,11/34] softmmu: Assert data in bounds in iotlb_to_section

Message ID 20230909102747.346522-11-mjt@tls.msk.ru
State New
Headers show
Series None | expand

Commit Message

Michael Tokarev Sept. 9, 2023, 10:27 a.m. UTC
From: Richard Henderson <richard.henderson@linaro.org>

Acked-by: Alex Bennée <alex.bennee@linaro.org>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Comments

Michael Tokarev Sept. 18, 2023, 9:19 a.m. UTC | #1
09.09.2023 13:27, Michael Tokarev wrote:
> From: Richard Henderson <richard.henderson@linaro.org>
> 
> Acked-by: Alex Bennée <alex.bennee@linaro.org>
> Suggested-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> (cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc)
> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
> 
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 3df73542e1..7597dc1c39 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -2413,9 +2413,15 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu,
>       int asidx = cpu_asidx_from_attrs(cpu, attrs);
>       CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
>       AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
> -    MemoryRegionSection *sections = d->map.sections;
> +    int section_index = index & ~TARGET_PAGE_MASK;
> +    MemoryRegionSection *ret;
> +
> +    assert(section_index < d->map.sections_nb);

This assert now triggers on staging-8.1

https://ci.debian.net/data/autopkgtest/testing/amd64/d/dropbear/37993610/log.gz
https://ci.debian.net/data/autopkgtest/testing/amd64/c/cryptsetup/37993606/log.gz

> +    ret = d->map.sections + section_index;
> +    assert(ret->mr);
> +    assert(ret->mr->ops);
>   
> -    return &sections[index & ~TARGET_PAGE_MASK];
> +    return ret;
>   }
>   
>   static void io_mem_init(void)

In this upload I removed softmmu-Use-async_run_on_cpu-in-tcg_commit.patch (0d58c660689f6da1),
and the test run uses tcg and -smp 4, which is the configuration which 0d58c6606
was supposed to fix.

qemu-system-x86_64 -no-user-config -nodefaults -name autopkgtest-cryptsetup-cryptroot-sysvinit \
  -machine type=q35,graphics=off -cpu qemu64,-svm,-vmx -smp cpus=4 -m size=2G \
  -vga none -display none -object rng....

I wonder if I should keep 0d58c6606 for 8.1.1 (the deadline is tomorrow)..

Thanks,

/mjt
Michael Tokarev Sept. 19, 2023, 7:23 a.m. UTC | #2
18.09.2023 12:19, Michael Tokarev wrote:
> 09.09.2023 13:27, Michael Tokarev wrote:
>> From: Richard Henderson <richard.henderson@linaro.org>
>>
>> Acked-by: Alex Bennée <alex.bennee@linaro.org>
>> Suggested-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> (cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc)
>> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
>>
>> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>> index 3df73542e1..7597dc1c39 100644
>> --- a/softmmu/physmem.c
>> +++ b/softmmu/physmem.c
>> @@ -2413,9 +2413,15 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu,
>>       int asidx = cpu_asidx_from_attrs(cpu, attrs);
>>       CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
>>       AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
>> -    MemoryRegionSection *sections = d->map.sections;
>> +    int section_index = index & ~TARGET_PAGE_MASK;
>> +    MemoryRegionSection *ret;
>> +
>> +    assert(section_index < d->map.sections_nb);
> 
> This assert now triggers on staging-8.1
> 
> https://ci.debian.net/data/autopkgtest/testing/amd64/d/dropbear/37993610/log.gz
> https://ci.debian.net/data/autopkgtest/testing/amd64/c/cryptsetup/37993606/log.gz
> 
>> +    ret = d->map.sections + section_index;
>> +    assert(ret->mr);
>> +    assert(ret->mr->ops);
>> -    return &sections[index & ~TARGET_PAGE_MASK];
>> +    return ret;
>>   }
>>   static void io_mem_init(void)
> 
> In this upload I removed softmmu-Use-async_run_on_cpu-in-tcg_commit.patch (0d58c660689f6da1),
> and the test run uses tcg and -smp 4, which is the configuration which 0d58c6606
> was supposed to fix.

So, should this change not be in 8.1.1 too (together with 0d58c6606),
or is it just the "messenger"?

Or both should go?

Today is the deadline day for 8.1.1.

Thanks!

/mjt
Alex Bennée Sept. 20, 2023, 9:23 a.m. UTC | #3
Michael Tokarev <mjt@tls.msk.ru> writes:

> 09.09.2023 13:27, Michael Tokarev wrote:
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Acked-by: Alex Bennée <alex.bennee@linaro.org>
>> Suggested-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> (cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc)
>> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
>> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>> index 3df73542e1..7597dc1c39 100644
>> --- a/softmmu/physmem.c
>> +++ b/softmmu/physmem.c
>> @@ -2413,9 +2413,15 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu,
>>       int asidx = cpu_asidx_from_attrs(cpu, attrs);
>>       CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
>>       AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
>> -    MemoryRegionSection *sections = d->map.sections;
>> +    int section_index = index & ~TARGET_PAGE_MASK;
>> +    MemoryRegionSection *ret;
>> +
>> +    assert(section_index < d->map.sections_nb);
>
> This assert now triggers on staging-8.1
>
> https://ci.debian.net/data/autopkgtest/testing/amd64/d/dropbear/37993610/log.gz
> https://ci.debian.net/data/autopkgtest/testing/amd64/c/cryptsetup/37993606/log.gz

The asserts are basically there to detect when we attempt to do a MR
lookup and it is not fully committed. If they are firing its because
things have gone wrong (which we know because we still have patches in
flight for the full fix).

The main benefit of the assert is we fail early rather than later on in
various cryptic ways (evidenced by the fact we raised 3 different bugs
which exhibited slightly different symptoms that where all fundamentally
caused by getting bogus memory region data).

>
>> +    ret = d->map.sections + section_index;
>> +    assert(ret->mr);
>> +    assert(ret->mr->ops);
>>   -    return &sections[index & ~TARGET_PAGE_MASK];
>> +    return ret;
>>   }
>>     static void io_mem_init(void)
>
> In this upload I removed softmmu-Use-async_run_on_cpu-in-tcg_commit.patch (0d58c660689f6da1),
> and the test run uses tcg and -smp 4, which is the configuration which 0d58c6606
> was supposed to fix.
>
> qemu-system-x86_64 -no-user-config -nodefaults -name autopkgtest-cryptsetup-cryptroot-sysvinit \
>  -machine type=q35,graphics=off -cpu qemu64,-svm,-vmx -smp cpus=4 -m size=2G \
>  -vga none -display none -object rng....
>
> I wonder if I should keep 0d58c6606 for 8.1.1 (the deadline is
> tomorrow)..

Unfortunately 0d58c is not the full fix, it papered over one crack but
revealed others. It might be leading to a false sense of security. So I
would argue:

  - keep the assert - better to fail early than to fail later in a hard
    to understand way
  - toss a coin for the 0d58c66 fix, if we include it we may end up
    reverting later once we have the "complete" fix but at least its
    slightly better for x86 while definitely breaking MIPS
    
>
> Thanks,
>
> /mjt
Alex Bennée Sept. 20, 2023, 3:04 p.m. UTC | #4
Michael Tokarev <mjt@tls.msk.ru> writes:

> 18.09.2023 12:19, Michael Tokarev wrote:
>> 09.09.2023 13:27, Michael Tokarev wrote:
>>> From: Richard Henderson <richard.henderson@linaro.org>
>>>
>>> Acked-by: Alex Bennée <alex.bennee@linaro.org>
>>> Suggested-by: Alex Bennée <alex.bennee@linaro.org>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> (cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc)
>>> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
>>>
>>> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>>> index 3df73542e1..7597dc1c39 100644
>>> --- a/softmmu/physmem.c
>>> +++ b/softmmu/physmem.c
>>> @@ -2413,9 +2413,15 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu,
>>>       int asidx = cpu_asidx_from_attrs(cpu, attrs);
>>>       CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
>>>       AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
>>> -    MemoryRegionSection *sections = d->map.sections;
>>> +    int section_index = index & ~TARGET_PAGE_MASK;
>>> +    MemoryRegionSection *ret;
>>> +
>>> +    assert(section_index < d->map.sections_nb);
>> This assert now triggers on staging-8.1
>> https://ci.debian.net/data/autopkgtest/testing/amd64/d/dropbear/37993610/log.gz
>> https://ci.debian.net/data/autopkgtest/testing/amd64/c/cryptsetup/37993606/log.gz
>> 
>>> +    ret = d->map.sections + section_index;
>>> +    assert(ret->mr);
>>> +    assert(ret->mr->ops);
>>> -    return &sections[index & ~TARGET_PAGE_MASK];
>>> +    return ret;
>>>   }
>>>   static void io_mem_init(void)
>> In this upload I removed
>> softmmu-Use-async_run_on_cpu-in-tcg_commit.patch (0d58c660689f6da1),
>> and the test run uses tcg and -smp 4, which is the configuration which 0d58c6606
>> was supposed to fix.
>
> So, should this change not be in 8.1.1 too (together with 0d58c6606),
> or is it just the "messenger"?

Sorry my previous reply was eaten by my MUA.

The main purpose of the asserts is to catch corruption to the Memory
Regions early so we don't see weird failures later on (c.f. the 3
separate bugs for crashes in slightly different places).

IOW is we are crashing on the asserts in this patch but it's booting
without it we are just getting lucky.

>
> Or both should go?
>
> Today is the deadline day for 8.1.1.
>
> Thanks!
>
> /mjt
Michael Tokarev Sept. 22, 2023, 2:43 p.m. UTC | #5
20.09.2023 18:04, Alex Bennée wrote:
..
> Sorry my previous reply was eaten by my MUA.

That happens.. especially when MUA becomes hungry :)

> The main purpose of the asserts is to catch corruption to the Memory
> Regions early so we don't see weird failures later on (c.f. the 3
> separate bugs for crashes in slightly different places).
> 
> IOW is we are crashing on the asserts in this patch but it's booting
> without it we are just getting lucky.

Thank you very much for this summary. Yeah, this confirms my thought,
but I wanted to be sure.  I left it in.

/mjt
Michael Tokarev Sept. 22, 2023, 8:21 p.m. UTC | #6
20.09.2023 12:23, Alex Bennée:
..
>> I wonder if I should keep 0d58c6606 for 8.1.1 (the deadline is
>> tomorrow)..
> 
> Unfortunately 0d58c is not the full fix, it papered over one crack but
> revealed others. It might be leading to a false sense of security. So I
> would argue:
> 
>    - keep the assert - better to fail early than to fail later in a hard
>      to understand way
>    - toss a coin for the 0d58c66 fix, if we include it we may end up
>      reverting later once we have the "complete" fix but at least its
>      slightly better for x86 while definitely breaking MIPS

Heh. I've read this email just now, way after 8.1.1 has been tagged and
the announcement sent.

I haven't included 0d58c66 for now, without tossing coins - just to be
on-par with 8.1.0, or else it is confusing at best (which stable releases
brings with new issues).

This whole thing is definitely worth a 8.1.2 once the fix is in.

Meanwhile I pushed qemu with 0d58c66 and the "always require can_do_io"
patchset to debian, - this one fixed all regressions so far.
https://salsa.debian.org/qemu-team/qemu/-/tree/debian/1%258.1.0+ds-6/debian/patches/always-can-do-io-1866
https://gitlab.com/mjt0k/qemu/-/commits/staging-8.1-always-require-can_do_io/

Thank you for the thoughts, much apprecated!

/mjt
diff mbox series

Patch

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..7597dc1c39 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2413,9 +2413,15 @@  MemoryRegionSection *iotlb_to_section(CPUState *cpu,
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
     CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
     AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
-    MemoryRegionSection *sections = d->map.sections;
+    int section_index = index & ~TARGET_PAGE_MASK;
+    MemoryRegionSection *ret;
+
+    assert(section_index < d->map.sections_nb);
+    ret = d->map.sections + section_index;
+    assert(ret->mr);
+    assert(ret->mr->ops);
 
-    return &sections[index & ~TARGET_PAGE_MASK];
+    return ret;
 }
 
 static void io_mem_init(void)