diff mbox series

[2/3] i686: Do not raise exception traps on fesetexcept (BZ 30989)

Message ID 20231023213659.3236496-3-adhemerval.zanella@linaro.org
State New
Headers show
Series Fix fesetexcept/fesetexceptflag on powerpc and x86 | expand

Commit Message

Adhemerval Zanella Oct. 23, 2023, 9:36 p.m. UTC
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).

The flags can be set in the 387 unit or in the SSE unit.  To set
a flag, it is sufficient to do it in the SSE unit, because that is
guaranteed to not trap.  However, on i386 CPUs that have only a
387 unit, set the flags in the 387, as long as this cannot trap.

Checked on i686-linux-gnu.
---
 math/test-fesetexcept-traps.c            | 11 ++++++
 sysdeps/i386/fpu/fesetexcept.c           | 43 ++++++++++++++++++++++--
 sysdeps/i386/fpu/math-tests-trap-force.h | 29 ++++++++++++++++
 3 files changed, 81 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/i386/fpu/math-tests-trap-force.h

Comments

Joseph Myers Oct. 23, 2023, 11:32 p.m. UTC | #1
On Mon, 23 Oct 2023, Adhemerval Zanella wrote:

> +      if (temp.__control_word & temp.__status_word & excepts)
> +        /* Setting the exception flags may trigger a trap (at the next
> +           floating-point instruction, but that does not matter).
> +           ISO C 23 ?? 7.6.4.5 does not allow it.  */
> +        return -1;

Don't you need to check (~temp.__control_word), as in patch 3, since the 
bits in the control word are set for masked exceptions, clear for 
trapping?
Bruno Haible Oct. 24, 2023, 12:17 a.m. UTC | #2
Adhemerval Zanella wrote:
> diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c
> index 96f6c4752f..122c23eb7e 100644
> --- a/math/test-fesetexcept-traps.c
> +++ b/math/test-fesetexcept-traps.c
> @@ -43,6 +44,16 @@ do_test (void)
>       where setting the exception might result in traps the function should
>       return a nonzero value.  */
>    ret = fesetexcept (FE_ALL_EXCEPT);
> +
> +  /* Execute some floating-point operations, since on some CPUs exceptions
> +     triggers a trap only at the next floating-point instruction.  */
> +  double a = 1.0;
> +  double b = a + a;
> +  math_force_eval (b);
> +  long double al = 1.0L;
> +  long double bl = al + al;

I would mark the variables a, b, al, bl as 'volatile', otherwise it's too
easy for GCC to do constant-folding and thus eliminate the floating-point
operations.

> diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c
> index 18949e982a..a4c70cd1d1 100644
> --- a/sysdeps/i386/fpu/fesetexcept.c
> +++ b/sysdeps/i386/fpu/fesetexcept.c
> @@ -17,15 +17,54 @@
>     <https://www.gnu.org/licenses/>.  */
>  
>  #include <fenv.h>
> +#include <ldsodefs.h>
>  
>  int
>  fesetexcept (int excepts)
>  {
> +  /* The flags can be set in the 387 unit or in the SSE unit.  To set a flag,
> +     it is sufficient to do it in the SSE unit, because that is guaranteed to
> +     not trap.  However, on i386 CPUs that have only a 387 unit, set the flags
> +     in the 387, as long as this cannot trap.  */
> +
>    fenv_t temp;
>  
> +  excepts &= FE_ALL_EXCEPT;
> +
>    __asm__ ("fnstenv %0" : "=m" (*&temp));

The variable 'temp' and this __asm__ statement can be moved to the 'else'
branch, since in the case that an SSE unit is present, we don't need to
touch the 387 unit at all. (Since the job here is to _set_ some exception
flag bits.)

> +  if (CPU_FEATURE_USABLE (SSE))
> +    {
> +      /* Clear relevant flags.  */
> +      temp.__status_word &= ~excepts;
> +
> +      /* Store the new status word (along with the rest of the environment).  */
> +      __asm__ ("fldenv %0" : : "m" (*&temp));
> +

These last two statements can be removed, since in this case, we don't need to
touch the 387 unit at all.

> +      /* And now similarly for SSE.  */
> +      unsigned int mxcsr;
> +      __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
> +
> +      /* Set relevant flags.  */
> +      mxcsr |= excepts & FE_ALL_EXCEPT;

No need for the ' & FE_ALL_EXCEPT' here, since it was already done at function
entry.

> +      /* Put the new data in effect.  */
> +      __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
> +    }
> +  else
> +    {
> +      /* Clear or set relevant flags.  */
> +      temp.__status_word ^= temp.__status_word & excepts;

This last statement is not right: It clears bits from temp.__status_word,
but should *set* these bits instead. Change this to:

         /* Set relevant flags.  */
         temp.__status_word |= excepts;

> +      if (temp.__control_word & temp.__status_word & excepts)

The temp.__status_word does not need to be tested here, since we just
set all EXCEPTS bit in it just before. With Joseph's remark, this line
becomes

         if ((~temp.__control_word) & excepts)

> +        /* Setting the exception flags may trigger a trap (at the next
> +           floating-point instruction, but that does not matter).
> +           ISO C 23 § 7.6.4.5 does not allow it.  */

In this function, we need to reference § 7.6.4.4.

> diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h
> new file mode 100644
> index 0000000000..d88229c271
> --- /dev/null
> +++ b/sysdeps/i386/fpu/math-tests-trap-force.h
> @@ -0,0 +1,29 @@
> +/* Configuration for math tests: support for setting exception flags
> +   without causing enabled traps.  i686 version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H
> +#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1
> +
> +#include <cpu-features.h>
> +
> +/* Setting exception flags in FPSCR results in enabled traps for those
> +   exceptions being taken.  */

The i386 does not have an FPSCR register. The exception flags are stored
in the register that gdb calls 'fstat' instead. I would use the same name.
(The Intel reference does not have a short name for this register; it
calls it "FPU Status Register".)

Bruno
Adhemerval Zanella Oct. 24, 2023, 11:12 a.m. UTC | #3
On 23/10/23 21:17, Bruno Haible wrote:
> Adhemerval Zanella wrote:
>> diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c
>> index 96f6c4752f..122c23eb7e 100644
>> --- a/math/test-fesetexcept-traps.c
>> +++ b/math/test-fesetexcept-traps.c
>> @@ -43,6 +44,16 @@ do_test (void)
>>       where setting the exception might result in traps the function should
>>       return a nonzero value.  */
>>    ret = fesetexcept (FE_ALL_EXCEPT);
>> +
>> +  /* Execute some floating-point operations, since on some CPUs exceptions
>> +     triggers a trap only at the next floating-point instruction.  */
>> +  double a = 1.0;
>> +  double b = a + a;
>> +  math_force_eval (b);
>> +  long double al = 1.0L;
>> +  long double bl = al + al;
> 
> I would mark the variables a, b, al, bl as 'volatile', otherwise it's too
> easy for GCC to do constant-folding and thus eliminate the floating-point
> operations.

It should not matter for i386, since it still generates a floating-point
loading, but I agree that forcing the fp operation seems better.

> 
>> diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c
>> index 18949e982a..a4c70cd1d1 100644
>> --- a/sysdeps/i386/fpu/fesetexcept.c
>> +++ b/sysdeps/i386/fpu/fesetexcept.c
>> @@ -17,15 +17,54 @@
>>     <https://www.gnu.org/licenses/>.  */
>>  
>>  #include <fenv.h>
>> +#include <ldsodefs.h>
>>  
>>  int
>>  fesetexcept (int excepts)
>>  {
>> +  /* The flags can be set in the 387 unit or in the SSE unit.  To set a flag,
>> +     it is sufficient to do it in the SSE unit, because that is guaranteed to
>> +     not trap.  However, on i386 CPUs that have only a 387 unit, set the flags
>> +     in the 387, as long as this cannot trap.  */
>> +
>>    fenv_t temp;
>>  
>> +  excepts &= FE_ALL_EXCEPT;
>> +
>>    __asm__ ("fnstenv %0" : "=m" (*&temp));
> 
> The variable 'temp' and this __asm__ statement can be moved to the 'else'
> branch, since in the case that an SSE unit is present, we don't need to
> touch the 387 unit at all. (Since the job here is to _set_ some exception
> flag bits.)

Indeed, I will change it.

> 
>> +  if (CPU_FEATURE_USABLE (SSE))
>> +    {
>> +      /* Clear relevant flags.  */
>> +      temp.__status_word &= ~excepts;
>> +
>> +      /* Store the new status word (along with the rest of the environment).  */
>> +      __asm__ ("fldenv %0" : : "m" (*&temp));
>> +
> 
> These last two statements can be removed, since in this case, we don't need to
> touch the 387 unit at all.

Ack.

> 
>> +      /* And now similarly for SSE.  */
>> +      unsigned int mxcsr;
>> +      __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
>> +
>> +      /* Set relevant flags.  */
>> +      mxcsr |= excepts & FE_ALL_EXCEPT;
> 
> No need for the ' & FE_ALL_EXCEPT' here, since it was already done at function
> entry.

Ack.

> 
>> +      /* Put the new data in effect.  */
>> +      __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
>> +    }
>> +  else
>> +    {
>> +      /* Clear or set relevant flags.  */
>> +      temp.__status_word ^= temp.__status_word & excepts;
> 
> This last statement is not right: It clears bits from temp.__status_word,
> but should *set* these bits instead. Change this to:
> 
>          /* Set relevant flags.  */
>          temp.__status_word |= excepts;

Indeed, I think I missed it because I don't have some old chip that actually
stress it.  I will check if qemu-user can emulated one.

> 
>> +      if (temp.__control_word & temp.__status_word & excepts)
> 
> The temp.__status_word does not need to be tested here, since we just
> set all EXCEPTS bit in it just before. With Joseph's remark, this line
> becomes
> 
>          if ((~temp.__control_word) & excepts)

Ack.

> 
>> +        /* Setting the exception flags may trigger a trap (at the next
>> +           floating-point instruction, but that does not matter).
>> +           ISO C 23 § 7.6.4.5 does not allow it.  */
> 
> In this function, we need to reference § 7.6.4.4.

Ack (and I will remove § to avoid non ascii characteres).

> 
>> diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h
>> new file mode 100644
>> index 0000000000..d88229c271
>> --- /dev/null
>> +++ b/sysdeps/i386/fpu/math-tests-trap-force.h
>> @@ -0,0 +1,29 @@
>> +/* Configuration for math tests: support for setting exception flags
>> +   without causing enabled traps.  i686 version.
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H
>> +#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1
>> +
>> +#include <cpu-features.h>
>> +
>> +/* Setting exception flags in FPSCR results in enabled traps for those
>> +   exceptions being taken.  */
> 
> The i386 does not have an FPSCR register. The exception flags are stored
> in the register that gdb calls 'fstat' instead. I would use the same name.
> (The Intel reference does not have a short name for this register; it
> calls it "FPU Status Register".)
> 

Ack.

> Bruno
> 
> 
>
Bruno Haible Oct. 24, 2023, 1:35 p.m. UTC | #4
Adhemerval Zanella Netto wrote:
> > In this function, we need to reference § 7.6.4.4.
> 
> Ack (and I will remove § to avoid non ascii characteres).

We can have UTF-8 encoded non-ASCII characters in the source code comments
nowadays. Such as in glibc/include/idx.h, glibc/string/strverscmp.c, etc.

Bruno
diff mbox series

Patch

diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c
index 96f6c4752f..122c23eb7e 100644
--- a/math/test-fesetexcept-traps.c
+++ b/math/test-fesetexcept-traps.c
@@ -19,6 +19,7 @@ 
 #include <fenv.h>
 #include <stdio.h>
 #include <math-tests.h>
+#include <math-barriers.h>
 
 static int
 do_test (void)
@@ -43,6 +44,16 @@  do_test (void)
      where setting the exception might result in traps the function should
      return a nonzero value.  */
   ret = fesetexcept (FE_ALL_EXCEPT);
+
+  /* Execute some floating-point operations, since on some CPUs exceptions
+     triggers a trap only at the next floating-point instruction.  */
+  double a = 1.0;
+  double b = a + a;
+  math_force_eval (b);
+  long double al = 1.0L;
+  long double bl = al + al;
+  math_force_eval (bl);
+
   if (ret == 0)
     puts ("fesetexcept (FE_ALL_EXCEPT) succeeded");
   else if (!EXCEPTION_SET_FORCES_TRAP)
diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c
index 18949e982a..a4c70cd1d1 100644
--- a/sysdeps/i386/fpu/fesetexcept.c
+++ b/sysdeps/i386/fpu/fesetexcept.c
@@ -17,15 +17,54 @@ 
    <https://www.gnu.org/licenses/>.  */
 
 #include <fenv.h>
+#include <ldsodefs.h>
 
 int
 fesetexcept (int excepts)
 {
+  /* The flags can be set in the 387 unit or in the SSE unit.  To set a flag,
+     it is sufficient to do it in the SSE unit, because that is guaranteed to
+     not trap.  However, on i386 CPUs that have only a 387 unit, set the flags
+     in the 387, as long as this cannot trap.  */
+
   fenv_t temp;
 
+  excepts &= FE_ALL_EXCEPT;
+
   __asm__ ("fnstenv %0" : "=m" (*&temp));
-  temp.__status_word |= excepts & FE_ALL_EXCEPT;
-  __asm__ ("fldenv %0" : : "m" (*&temp));
+
+  if (CPU_FEATURE_USABLE (SSE))
+    {
+      /* Clear relevant flags.  */
+      temp.__status_word &= ~excepts;
+
+      /* Store the new status word (along with the rest of the environment).  */
+      __asm__ ("fldenv %0" : : "m" (*&temp));
+
+      /* And now similarly for SSE.  */
+      unsigned int mxcsr;
+      __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
+
+      /* Set relevant flags.  */
+      mxcsr |= excepts & FE_ALL_EXCEPT;
+
+      /* Put the new data in effect.  */
+      __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
+    }
+  else
+    {
+      /* Clear or set relevant flags.  */
+      temp.__status_word ^= temp.__status_word & excepts;
+
+      if (temp.__control_word & temp.__status_word & excepts)
+        /* Setting the exception flags may trigger a trap (at the next
+           floating-point instruction, but that does not matter).
+           ISO C 23 § 7.6.4.5 does not allow it.  */
+        return -1;
+
+      /* Store the new status word (along with the rest of the environment).  */
+      __asm__ ("fldenv %0" : : "m" (*&temp));
+    }
 
   return 0;
 }
diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h
new file mode 100644
index 0000000000..d88229c271
--- /dev/null
+++ b/sysdeps/i386/fpu/math-tests-trap-force.h
@@ -0,0 +1,29 @@ 
+/* Configuration for math tests: support for setting exception flags
+   without causing enabled traps.  i686 version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H
+#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1
+
+#include <cpu-features.h>
+
+/* Setting exception flags in FPSCR results in enabled traps for those
+   exceptions being taken.  */
+#define EXCEPTION_SET_FORCES_TRAP (CPU_FEATURE_USABLE (SSE))
+
+#endif /* math-tests-trap-force.h.  */