diff mbox series

accel/tcg/translate-all: expand cpu_restore_state retaddr check

Message ID 20171107165226.22546-1-alex.bennee@linaro.org
State New
Headers show
Series accel/tcg/translate-all: expand cpu_restore_state retaddr check | expand

Commit Message

Alex Bennée Nov. 7, 2017, 4:52 p.m. UTC
We are still seeing signals during translation time when we walk over
a page protection boundary. This expands the check to ensure the
retaddr is inside the code generation buffer. The original suggestion
was to check versus tcg_ctx.code_gen_ptr but as we now segment the
translation buffer we have to settle for just a general check for
being inside.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
---
 accel/tcg/translate-all.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

-- 
2.14.2

Comments

Peter Maydell Nov. 7, 2017, 5:04 p.m. UTC | #1
On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote:
> We are still seeing signals during translation time when we walk over

> a page protection boundary. This expands the check to ensure the

> retaddr is inside the code generation buffer. The original suggestion

> was to check versus tcg_ctx.code_gen_ptr but as we now segment the

> translation buffer we have to settle for just a general check for

> being inside.

>

> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

> Reported-by: Peter Maydell <peter.maydell@linaro.org>

> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

> Cc: Richard Henderson <rth@twiddle.net>

> ---

>  accel/tcg/translate-all.c | 20 ++++++++++++--------

>  1 file changed, 12 insertions(+), 8 deletions(-)

>

> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c

> index 34c5e28d07..eb255af402 100644

> --- a/accel/tcg/translate-all.c

> +++ b/accel/tcg/translate-all.c

> @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)

>      TranslationBlock *tb;

>      bool r = false;

>

> -    /* A retaddr of zero is invalid so we really shouldn't have ended

> -     * up here. The target code has likely forgotten to check retaddr

> -     * != 0 before attempting to restore state. We return early to

> -     * avoid blowing up on a recursive tb_lock(). The target must have

> -     * previously survived a failed cpu_restore_state because

> -     * tb_find_pc(0) would have failed anyway. It still should be

> -     * fixed though.

> +    /* The retaddr has to be in the region of current code buffer. If

> +     * it's not we will not be able to resolve it here. If it is zero

> +     * the calling code has likely forgotten to check retaddr before

> +     * calling here.


This part of the comment isn't correct -- it's entirely expected
that we will get here with a zero retaddr, because that is
how the rest of the code tells this function "no state restoration
required".

> If it is not in the translated code we could be

> +     * faulting during translation itself.

> +     *

> +     * Either way we need return early to avoid blowing up on a

> +     * recursive tb_lock() as we can't resolve it here.

>       */

>

> -    if (!retaddr) {

> +    if (!retaddr ||

> +        (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) ||

> +        (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer +

> +                                tcg_init_ctx.code_gen_buffer_size))) {

>          return r;

>      }


thanks
-- PMM
Alex Bennée Nov. 7, 2017, 6:45 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote:

>> We are still seeing signals during translation time when we walk over

>> a page protection boundary. This expands the check to ensure the

>> retaddr is inside the code generation buffer. The original suggestion

>> was to check versus tcg_ctx.code_gen_ptr but as we now segment the

>> translation buffer we have to settle for just a general check for

>> being inside.

>>

>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

>> Reported-by: Peter Maydell <peter.maydell@linaro.org>

>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

>> Cc: Richard Henderson <rth@twiddle.net>

>> ---

>>  accel/tcg/translate-all.c | 20 ++++++++++++--------

>>  1 file changed, 12 insertions(+), 8 deletions(-)

>>

>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c

>> index 34c5e28d07..eb255af402 100644

>> --- a/accel/tcg/translate-all.c

>> +++ b/accel/tcg/translate-all.c

>> @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)

>>      TranslationBlock *tb;

>>      bool r = false;

>>

>> -    /* A retaddr of zero is invalid so we really shouldn't have ended

>> -     * up here. The target code has likely forgotten to check retaddr

>> -     * != 0 before attempting to restore state. We return early to

>> -     * avoid blowing up on a recursive tb_lock(). The target must have

>> -     * previously survived a failed cpu_restore_state because

>> -     * tb_find_pc(0) would have failed anyway. It still should be

>> -     * fixed though.

>> +    /* The retaddr has to be in the region of current code buffer. If

>> +     * it's not we will not be able to resolve it here. If it is zero

>> +     * the calling code has likely forgotten to check retaddr before

>> +     * calling here.

>

> This part of the comment isn't correct -- it's entirely expected

> that we will get here with a zero retaddr, because that is

> how the rest of the code tells this function "no state restoration

> required".


Then why call cpu_restore_state at all? We should be consistent as there
are plenty of places that do things like:

    if (pc) {
        /* now we have a real cpu fault */
        cpu_restore_state(cs, pc);
    }

I'm happy to make a 0 retaddr officially valid and actually document it
in exec-all.h. It's not like most callers even bother checking the
return code.

>

>> If it is not in the translated code we could be

>> +     * faulting during translation itself.

>> +     *

>> +     * Either way we need return early to avoid blowing up on a

>> +     * recursive tb_lock() as we can't resolve it here.

>>       */

>>

>> -    if (!retaddr) {

>> +    if (!retaddr ||

>> +        (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) ||

>> +        (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer +

>> +                                tcg_init_ctx.code_gen_buffer_size))) {

>>          return r;

>>      }

>

> thanks

> -- PMM



--
Alex Bennée
Peter Maydell Nov. 7, 2017, 6:53 p.m. UTC | #3
On 7 November 2017 at 18:45, Alex Bennée <alex.bennee@linaro.org> wrote:
>

> Peter Maydell <peter.maydell@linaro.org> writes:

>

>> On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote:

>>> We are still seeing signals during translation time when we walk over

>>> a page protection boundary. This expands the check to ensure the

>>> retaddr is inside the code generation buffer. The original suggestion

>>> was to check versus tcg_ctx.code_gen_ptr but as we now segment the

>>> translation buffer we have to settle for just a general check for

>>> being inside.

>>>

>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

>>> Reported-by: Peter Maydell <peter.maydell@linaro.org>

>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

>>> Cc: Richard Henderson <rth@twiddle.net>

>>> ---

>>>  accel/tcg/translate-all.c | 20 ++++++++++++--------

>>>  1 file changed, 12 insertions(+), 8 deletions(-)

>>>

>>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c

>>> index 34c5e28d07..eb255af402 100644

>>> --- a/accel/tcg/translate-all.c

>>> +++ b/accel/tcg/translate-all.c

>>> @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)

>>>      TranslationBlock *tb;

>>>      bool r = false;

>>>

>>> -    /* A retaddr of zero is invalid so we really shouldn't have ended

>>> -     * up here. The target code has likely forgotten to check retaddr

>>> -     * != 0 before attempting to restore state. We return early to

>>> -     * avoid blowing up on a recursive tb_lock(). The target must have

>>> -     * previously survived a failed cpu_restore_state because

>>> -     * tb_find_pc(0) would have failed anyway. It still should be

>>> -     * fixed though.

>>> +    /* The retaddr has to be in the region of current code buffer. If

>>> +     * it's not we will not be able to resolve it here. If it is zero

>>> +     * the calling code has likely forgotten to check retaddr before

>>> +     * calling here.

>>

>> This part of the comment isn't correct -- it's entirely expected

>> that we will get here with a zero retaddr, because that is

>> how the rest of the code tells this function "no state restoration

>> required".

>

> Then why call cpu_restore_state at all? We should be consistent as there

> are plenty of places that do things like:

>

>     if (pc) {

>         /* now we have a real cpu fault */

>         cpu_restore_state(cs, pc);

>     }

>

> I'm happy to make a 0 retaddr officially valid and actually document it

> in exec-all.h. It's not like most callers even bother checking the

> return code.


Hmm, there's more places than I expected that do that "don't call
if 0" check than I thought. Overall it seems better to me to officially
allow the zero, rather than having lots of callsites that all have
to remember to check.

Incidentally if retaddr is zero then
(retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer)
is always true and you don't need to explicitly check for zero, though
it might be clearer to do so if we think we might change the rest
of the condition in future.

thanks
-- PMM
Richard Henderson Nov. 8, 2017, 8:52 a.m. UTC | #4
On 11/07/2017 07:53 PM, Peter Maydell wrote:
>> Then why call cpu_restore_state at all? We should be consistent as there

>> are plenty of places that do things like:

>>

>>     if (pc) {

>>         /* now we have a real cpu fault */

>>         cpu_restore_state(cs, pc);

>>     }

>>

>> I'm happy to make a 0 retaddr officially valid and actually document it

>> in exec-all.h. It's not like most callers even bother checking the

>> return code.


This is exactly the discussion that we had last time, and we did just that.

> Hmm, there's more places than I expected that do that "don't call

> if 0" check than I thought. Overall it seems better to me to officially

> allow the zero, rather than having lots of callsites that all have

> to remember to check.


... what we didn't do is go through and change all of the call sites to remove
the check for zero.

> Incidentally if retaddr is zero then

> (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer)

> is always true and you don't need to explicitly check for zero, though

> it might be clearer to do so if we think we might change the rest

> of the condition in future.


Indeed, I was thinking

  retaddr - code_gen_buffer < code_gen_buffer_size

which works well with unsigned arithmetic.  And a large comment re zero.


r~
diff mbox series

Patch

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 34c5e28d07..eb255af402 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -357,16 +357,20 @@  bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)
     TranslationBlock *tb;
     bool r = false;
 
-    /* A retaddr of zero is invalid so we really shouldn't have ended
-     * up here. The target code has likely forgotten to check retaddr
-     * != 0 before attempting to restore state. We return early to
-     * avoid blowing up on a recursive tb_lock(). The target must have
-     * previously survived a failed cpu_restore_state because
-     * tb_find_pc(0) would have failed anyway. It still should be
-     * fixed though.
+    /* The retaddr has to be in the region of current code buffer. If
+     * it's not we will not be able to resolve it here. If it is zero
+     * the calling code has likely forgotten to check retaddr before
+     * calling here. If it is not in the translated code we could be
+     * faulting during translation itself.
+     *
+     * Either way we need return early to avoid blowing up on a
+     * recursive tb_lock() as we can't resolve it here.
      */
 
-    if (!retaddr) {
+    if (!retaddr ||
+        (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) ||
+        (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer +
+                                tcg_init_ctx.code_gen_buffer_size))) {
         return r;
     }