diff mbox

ARM: Add support for AT_HWCAP2 in _dl_procinfo

Message ID 1403701970-21947-1-git-send-email-will.newton@linaro.org
State Accepted
Headers show

Commit Message

Will Newton June 25, 2014, 1:12 p.m. UTC
Add support for the new HWCAP2 values for ARMv8 added in the
3.15 kernel. Tested using QEMU which supports these extensions.

ChangeLog:

2014-06-25  Will Newton  <will.newton@linaro.org>

	* sysdeps/unix/sysv/linux/arm/dl-procinfo.c
	(_dl_arm_cap_flags): Add HWCAP2 values.
	* sysdeps/unix/sysv/linux/arm/dl-procinfo.h
	(_DL_HWCAP_COUNT): Increase to 37.
	(_DL_HWCAP_LAST): New define.
	(_DL_HWCAP2_LAST): New define.
	(_dl_procinfo): Add support for printing
	AT_HWCAP2 entries.
	(_dl_string_hwcap): Use _dl_hwcap_string.
---
 sysdeps/unix/sysv/linux/arm/dl-procinfo.c |  5 ++-
 sysdeps/unix/sysv/linux/arm/dl-procinfo.h | 67 +++++++++++++++++++------------
 2 files changed, 44 insertions(+), 28 deletions(-)

Comments

Joseph Myers June 25, 2014, 2:56 p.m. UTC | #1
On Wed, 25 Jun 2014, Will Newton wrote:

> Add support for the new HWCAP2 values for ARMv8 added in the
> 3.15 kernel. Tested using QEMU which supports these extensions.
> 
> ChangeLog:
> 
> 2014-06-25  Will Newton  <will.newton@linaro.org>
> 
> 	* sysdeps/unix/sysv/linux/arm/dl-procinfo.c
> 	(_dl_arm_cap_flags): Add HWCAP2 values.
> 	* sysdeps/unix/sysv/linux/arm/dl-procinfo.h
> 	(_DL_HWCAP_COUNT): Increase to 37.
> 	(_DL_HWCAP_LAST): New define.
> 	(_DL_HWCAP2_LAST): New define.
> 	(_dl_procinfo): Add support for printing
> 	AT_HWCAP2 entries.
> 	(_dl_string_hwcap): Use _dl_hwcap_string.

OK.
Roland McGrath June 25, 2014, 4:40 p.m. UTC | #2
> +  switch(type)

Missing space.
Richard Earnshaw June 26, 2014, 9:14 a.m. UTC | #3
On 25/06/14 14:12, Will Newton wrote:
> Add support for the new HWCAP2 values for ARMv8 added in the
> 3.15 kernel. Tested using QEMU which supports these extensions.
> 
> ChangeLog:
> 
> 2014-06-25  Will Newton  <will.newton@linaro.org>
> 
> 	* sysdeps/unix/sysv/linux/arm/dl-procinfo.c
> 	(_dl_arm_cap_flags): Add HWCAP2 values.
> 	* sysdeps/unix/sysv/linux/arm/dl-procinfo.h
> 	(_DL_HWCAP_COUNT): Increase to 37.
> 	(_DL_HWCAP_LAST): New define.
> 	(_DL_HWCAP2_LAST): New define.
> 	(_dl_procinfo): Add support for printing
> 	AT_HWCAP2 entries.
> 	(_dl_string_hwcap): Use _dl_hwcap_string.

I don't have a specific comment about this patch.

I do have a general comment that I think the HWCAPs exported by the
kernel for 32-bit ARM are a joke.  The principle problem is that there
is precisely zero way to determine the base architecture.  You cannot
even tell whether you are running on ARMv6 or ARMv7, let alone whether
you have key features such as Thumb2.

I've heard it suggested that you can part the architecture string (eg
armv7l), but 1) the format of this string is not precisely defined in a
way that allows you to predict what future cores will generate and 2)
parsing strings in ifunc code when function calls can't be made is
likely to be hairy at best.

We really need a better solution than this.

R.
Will Newton June 26, 2014, 9:36 a.m. UTC | #4
On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 25/06/14 14:12, Will Newton wrote:
>> Add support for the new HWCAP2 values for ARMv8 added in the
>> 3.15 kernel. Tested using QEMU which supports these extensions.
>>
>> ChangeLog:
>>
>> 2014-06-25  Will Newton  <will.newton@linaro.org>
>>
>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.c
>>       (_dl_arm_cap_flags): Add HWCAP2 values.
>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.h
>>       (_DL_HWCAP_COUNT): Increase to 37.
>>       (_DL_HWCAP_LAST): New define.
>>       (_DL_HWCAP2_LAST): New define.
>>       (_dl_procinfo): Add support for printing
>>       AT_HWCAP2 entries.
>>       (_dl_string_hwcap): Use _dl_hwcap_string.
>
> I don't have a specific comment about this patch.
>
> I do have a general comment that I think the HWCAPs exported by the
> kernel for 32-bit ARM are a joke.  The principle problem is that there
> is precisely zero way to determine the base architecture.  You cannot
> even tell whether you are running on ARMv6 or ARMv7, let alone whether
> you have key features such as Thumb2.

I agree, and the situation on AArch64 looks no better. It's pretty
much impossible to determine the micro-architecture from userland too
- which may not be that much of an issue for ARM, but AArch64 likely
much more so

> I've heard it suggested that you can part the architecture string (eg
> armv7l), but 1) the format of this string is not precisely defined in a
> way that allows you to predict what future cores will generate and 2)
> parsing strings in ifunc code when function calls can't be made is
> likely to be hairy at best.

I think the platform is probably the best way to pass that info. The
kernel currently sets it to:

        snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c",
                 list->elf_name, ENDIANNESS);

Where elf_name is one of:

v4
v5
v5t
v6
v7
v7m

That doesn't look too intractable, and we can work with the kernel
guys to make sure nothing too surprising is added there. The string
parsing of architecture revision is pretty trivial in those cases.
Perhaps this is something we can discuss at the GNU Tools Cauldron
next month.

On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
Richard Earnshaw June 26, 2014, 10:01 a.m. UTC | #5
On 26/06/14 10:36, Will Newton wrote:
> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 25/06/14 14:12, Will Newton wrote:
>>> Add support for the new HWCAP2 values for ARMv8 added in the
>>> 3.15 kernel. Tested using QEMU which supports these extensions.
>>>
>>> ChangeLog:
>>>
>>> 2014-06-25  Will Newton  <will.newton@linaro.org>
>>>
>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.c
>>>       (_dl_arm_cap_flags): Add HWCAP2 values.
>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.h
>>>       (_DL_HWCAP_COUNT): Increase to 37.
>>>       (_DL_HWCAP_LAST): New define.
>>>       (_DL_HWCAP2_LAST): New define.
>>>       (_dl_procinfo): Add support for printing
>>>       AT_HWCAP2 entries.
>>>       (_dl_string_hwcap): Use _dl_hwcap_string.
>>
>> I don't have a specific comment about this patch.
>>
>> I do have a general comment that I think the HWCAPs exported by the
>> kernel for 32-bit ARM are a joke.  The principle problem is that there
>> is precisely zero way to determine the base architecture.  You cannot
>> even tell whether you are running on ARMv6 or ARMv7, let alone whether
>> you have key features such as Thumb2.
> 
> I agree, and the situation on AArch64 looks no better. It's pretty
> much impossible to determine the micro-architecture from userland too
> - which may not be that much of an issue for ARM, but AArch64 likely
> much more so
> 

There has to be a better way of addressing that issue than reading the
microarchitecture name and then switching on that.  The list is
potentially unbounded: what do you do when you encounter a new
micro-architecture?

>> I've heard it suggested that you can part the architecture string (eg
>> armv7l), but 1) the format of this string is not precisely defined in a
>> way that allows you to predict what future cores will generate and 2)
>> parsing strings in ifunc code when function calls can't be made is
>> likely to be hairy at best.
> 
> I think the platform is probably the best way to pass that info. The
> kernel currently sets it to:
> 
>         snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c",
>                  list->elf_name, ENDIANNESS);
> 
> Where elf_name is one of:
> 
> v4
> v5
> v5t
> v6
> v7
> v7m
> 
> That doesn't look too intractable, and we can work with the kernel
> guys to make sure nothing too surprising is added there. 

Until you realize that these do not have a total ordering; that is,
while you can write

v4 < v5 < v5t < v6 < v7

You cannot insert v7m in that list at any point, since it is both more
and less than v6.  In fact, it's both more and less than the baseline
v7, since it also has a divide instruction.


> The string
> parsing of architecture revision is pretty trivial in those cases.
> Perhaps this is something we can discuss at the GNU Tools Cauldron
> next month.
> 
> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
> 

Yeah, but then, I don't think reading this string is a useful way of
solving this problem.

R.
Will Newton June 26, 2014, 1:07 p.m. UTC | #6
On 26 June 2014 11:01, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 26/06/14 10:36, Will Newton wrote:
>> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote:
>>> On 25/06/14 14:12, Will Newton wrote:
>>>> Add support for the new HWCAP2 values for ARMv8 added in the
>>>> 3.15 kernel. Tested using QEMU which supports these extensions.
>>>>
>>>> ChangeLog:
>>>>
>>>> 2014-06-25  Will Newton  <will.newton@linaro.org>
>>>>
>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.c
>>>>       (_dl_arm_cap_flags): Add HWCAP2 values.
>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.h
>>>>       (_DL_HWCAP_COUNT): Increase to 37.
>>>>       (_DL_HWCAP_LAST): New define.
>>>>       (_DL_HWCAP2_LAST): New define.
>>>>       (_dl_procinfo): Add support for printing
>>>>       AT_HWCAP2 entries.
>>>>       (_dl_string_hwcap): Use _dl_hwcap_string.
>>>
>>> I don't have a specific comment about this patch.
>>>
>>> I do have a general comment that I think the HWCAPs exported by the
>>> kernel for 32-bit ARM are a joke.  The principle problem is that there
>>> is precisely zero way to determine the base architecture.  You cannot
>>> even tell whether you are running on ARMv6 or ARMv7, let alone whether
>>> you have key features such as Thumb2.
>>
>> I agree, and the situation on AArch64 looks no better. It's pretty
>> much impossible to determine the micro-architecture from userland too
>> - which may not be that much of an issue for ARM, but AArch64 likely
>> much more so
>>
>
> There has to be a better way of addressing that issue than reading the
> microarchitecture name and then switching on that.  The list is
> potentially unbounded: what do you do when you encounter a new
> micro-architecture?

In the context of an ifunc resolver, do what you do now which is use
the hwcap bits. The use case that I can imagine is where we have a
microarchitecture that suffers from a particularly poor behaviour with
e.g. the default memcpy then we can switch it to use a custom version.
At the moment the only way I can see to deal with that is read /proc
and stash that information inside ld.so somewhere at startup but that
is rather ugly...

>>> I've heard it suggested that you can part the architecture string (eg
>>> armv7l), but 1) the format of this string is not precisely defined in a
>>> way that allows you to predict what future cores will generate and 2)
>>> parsing strings in ifunc code when function calls can't be made is
>>> likely to be hairy at best.
>>
>> I think the platform is probably the best way to pass that info. The
>> kernel currently sets it to:
>>
>>         snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c",
>>                  list->elf_name, ENDIANNESS);
>>
>> Where elf_name is one of:
>>
>> v4
>> v5
>> v5t
>> v6
>> v7
>> v7m
>>
>> That doesn't look too intractable, and we can work with the kernel
>> guys to make sure nothing too surprising is added there.
>
> Until you realize that these do not have a total ordering; that is,
> while you can write
>
> v4 < v5 < v5t < v6 < v7
>
> You cannot insert v7m in that list at any point, since it is both more
> and less than v6.  In fact, it's both more and less than the baseline
> v7, since it also has a divide instruction.

I don't see why we would care about ordering these values - isn't it
just a symbolic value? What use case do you have in mind?

>> The string
>> parsing of architecture revision is pretty trivial in those cases.
>> Perhaps this is something we can discuss at the GNU Tools Cauldron
>> next month.
>>
>> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
>>
>
> Yeah, but then, I don't think reading this string is a useful way of
> solving this problem.

Do you have an alternative proposal? ;-)
Richard Earnshaw June 26, 2014, 1:27 p.m. UTC | #7
On 26/06/14 14:07, Will Newton wrote:
> On 26 June 2014 11:01, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 26/06/14 10:36, Will Newton wrote:
>>> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote:
>>>> On 25/06/14 14:12, Will Newton wrote:
>>>>> Add support for the new HWCAP2 values for ARMv8 added in the
>>>>> 3.15 kernel. Tested using QEMU which supports these extensions.
>>>>>
>>>>> ChangeLog:
>>>>>
>>>>> 2014-06-25  Will Newton  <will.newton@linaro.org>
>>>>>
>>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.c
>>>>>       (_dl_arm_cap_flags): Add HWCAP2 values.
>>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.h
>>>>>       (_DL_HWCAP_COUNT): Increase to 37.
>>>>>       (_DL_HWCAP_LAST): New define.
>>>>>       (_DL_HWCAP2_LAST): New define.
>>>>>       (_dl_procinfo): Add support for printing
>>>>>       AT_HWCAP2 entries.
>>>>>       (_dl_string_hwcap): Use _dl_hwcap_string.
>>>>
>>>> I don't have a specific comment about this patch.
>>>>
>>>> I do have a general comment that I think the HWCAPs exported by the
>>>> kernel for 32-bit ARM are a joke.  The principle problem is that there
>>>> is precisely zero way to determine the base architecture.  You cannot
>>>> even tell whether you are running on ARMv6 or ARMv7, let alone whether
>>>> you have key features such as Thumb2.
>>>
>>> I agree, and the situation on AArch64 looks no better. It's pretty
>>> much impossible to determine the micro-architecture from userland too
>>> - which may not be that much of an issue for ARM, but AArch64 likely
>>> much more so
>>>
>>
>> There has to be a better way of addressing that issue than reading the
>> microarchitecture name and then switching on that.  The list is
>> potentially unbounded: what do you do when you encounter a new
>> micro-architecture?
> 
> In the context of an ifunc resolver, do what you do now which is use
> the hwcap bits. 

And that's my major gripe.  They don't work, since they don't tell you
what you really need to know.  There's no HWCAP bit to say Thumb2;
similarly there's no HWCAP bit to say that you've got the integer SIMD
instructions or the v5e DSP instructions.  It's assumed you can work
this out from the architecture name; but there's insufficient
specification for that to make it work going forwards to new
architecture variants.

The net result is that you have to start making inferences based on
other HWCAPS (eg if I've got Neon, then I've got thumb2); but the
inverse is not always implied (for example, if I have thumb2 I don't
necessarily have Neon), so there will still be cases that are missed.
It's also dangerous to do this since some of the inferences may end up
being incorrect in the long run.

I don't have an answer to any of this as of today, other than to go read
the full hardware features bits out of CP15; but that's a privileged
operation not available to user-space code.



> The use case that I can imagine is where we have a
> microarchitecture that suffers from a particularly poor behaviour with
> e.g. the default memcpy then we can switch it to use a custom version.
> At the moment the only way I can see to deal with that is read /proc
> and stash that information inside ld.so somewhere at startup but that
> is rather ugly...
> 

For some of these, puting the relevant code in a VDSO might be a better
approach; at least for critical system routines.

>>>> I've heard it suggested that you can part the architecture string (eg
>>>> armv7l), but 1) the format of this string is not precisely defined in a
>>>> way that allows you to predict what future cores will generate and 2)
>>>> parsing strings in ifunc code when function calls can't be made is
>>>> likely to be hairy at best.
>>>
>>> I think the platform is probably the best way to pass that info. The
>>> kernel currently sets it to:
>>>
>>>         snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c",
>>>                  list->elf_name, ENDIANNESS);
>>>
>>> Where elf_name is one of:
>>>
>>> v4
>>> v5
>>> v5t
>>> v6
>>> v7
>>> v7m
>>>
>>> That doesn't look too intractable, and we can work with the kernel
>>> guys to make sure nothing too surprising is added there.
>>
>> Until you realize that these do not have a total ordering; that is,
>> while you can write
>>
>> v4 < v5 < v5t < v6 < v7
>>
>> You cannot insert v7m in that list at any point, since it is both more
>> and less than v6.  In fact, it's both more and less than the baseline
>> v7, since it also has a divide instruction.
> 
> I don't see why we would care about ordering these values - isn't it
> just a symbolic value? What use case do you have in mind?

Again, the sort of question like, can I use Thumb2 instructions?

> 
>>> The string
>>> parsing of architecture revision is pretty trivial in those cases.
>>> Perhaps this is something we can discuss at the GNU Tools Cauldron
>>> next month.
>>>
>>> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
>>>
>>
>> Yeah, but then, I don't think reading this string is a useful way of
>> solving this problem.
> 
> Do you have an alternative proposal? ;-)
> 

Not really, at this time.  Sorry.

R.
diff mbox

Patch

diff --git a/sysdeps/unix/sysv/linux/arm/dl-procinfo.c b/sysdeps/unix/sysv/linux/arm/dl-procinfo.c
index 113cda5..7cb3be9 100644
--- a/sysdeps/unix/sysv/linux/arm/dl-procinfo.c
+++ b/sysdeps/unix/sysv/linux/arm/dl-procinfo.c
@@ -23,7 +23,7 @@ 
    If anything should be added here check whether the size of each string
    is still ok with the given array size.
 
-   All the #ifdefs in the definitions ar equite irritating but
+   All the #ifdefs in the definitions are quite irritating but
    necessary if we want to avoid duplicating the information.  There
    are three different modes:
 
@@ -46,13 +46,14 @@ 
 #if !defined PROCINFO_DECL && defined SHARED
   ._dl_arm_cap_flags
 #else
-PROCINFO_CLASS const char _dl_arm_cap_flags[22][10]
+PROCINFO_CLASS const char _dl_arm_cap_flags[37][10]
 #endif
 #ifndef PROCINFO_DECL
 = {
     "swp", "half", "thumb", "26bit", "fastmult", "fpa", "vfp", "edsp",
     "java", "iwmmxt", "crunch", "thumbee", "neon", "vfpv3", "vfpv3d16",
     "tls", "vfpv4", "idiva", "idivt", "vfpd32", "lpae", "evtstrm",
+    "aes", "pmull", "sha1", "sha2", "crc32",
   }
 #endif
 #if !defined SHARED || defined PROCINFO_DECL
diff --git a/sysdeps/unix/sysv/linux/arm/dl-procinfo.h b/sysdeps/unix/sysv/linux/arm/dl-procinfo.h
index 20a3e92..f7557b9 100644
--- a/sysdeps/unix/sysv/linux/arm/dl-procinfo.h
+++ b/sysdeps/unix/sysv/linux/arm/dl-procinfo.h
@@ -23,32 +23,17 @@ 
 #include <ldsodefs.h>
 #include <sysdep.h>
 
-#define _DL_HWCAP_COUNT 22
+#define _DL_HWCAP_COUNT 37
 
-/* The kernel provides platform data but it is not interesting.  */
-#define _DL_HWCAP_PLATFORM 	0
-
-
-static inline int
-__attribute__ ((unused))
-_dl_procinfo (unsigned int type, unsigned long int word)
-{
-  int i;
-
-  /* Fallback to unknown output mechanism.  */
-  if (type == AT_HWCAP2)
-    return -1;
-
-  _dl_printf ("AT_HWCAP:   ");
+/* Low 22 bits are allocated in HWCAP.  */
+#define _DL_HWCAP_LAST		21
 
-  for (i = 0; i < _DL_HWCAP_COUNT; ++i)
-    if (word & (1 << i))
-      _dl_printf (" %s", GLRO(dl_arm_cap_flags)[i]);
+/* Low 5 bits are allocated in HWCAP2.  */
+#define _DL_HWCAP2_LAST		4
 
-  _dl_printf ("\n");
+/* The kernel provides platform data but it is not interesting.  */
+#define _DL_HWCAP_PLATFORM	0
 
-  return 0;
-}
 
 static inline const char *
 __attribute__ ((unused))
@@ -57,17 +42,47 @@  _dl_hwcap_string (int idx)
   return GLRO(dl_arm_cap_flags)[idx];
 };
 
+static inline int
+__attribute__ ((unused))
+_dl_procinfo (unsigned int type, unsigned long int word)
+{
+  switch(type)
+    {
+    case AT_HWCAP:
+      _dl_printf ("AT_HWCAP:       ");
+
+      for (int i = 0; i <= _DL_HWCAP_LAST; ++i)
+	if (word & (1 << i))
+	  _dl_printf (" %s", _dl_hwcap_string (i));
+      break;
+    case AT_HWCAP2:
+      {
+	unsigned int offset = _DL_HWCAP_LAST + 1;
+
+	_dl_printf ("AT_HWCAP2:      ");
+
+	for (int i = 0; i <= _DL_HWCAP2_LAST; ++i)
+	  if (word & (1 << i))
+	    _dl_printf (" %s", _dl_hwcap_string (offset + i));
+	break;
+      }
+    default:
+      /* This should not happen.  */
+      return -1;
+    }
+  _dl_printf ("\n");
+  return 0;
+}
+
 #define HWCAP_IMPORTANT		(HWCAP_ARM_VFP | HWCAP_ARM_NEON)
 
 static inline int
 __attribute__ ((unused))
 _dl_string_hwcap (const char *str)
 {
-  int i;
-
-  for (i = 0; i < _DL_HWCAP_COUNT; i++)
+  for (int i = 0; i < _DL_HWCAP_COUNT; i++)
     {
-      if (strcmp (str, GLRO(dl_arm_cap_flags)[i]) == 0)
+      if (strcmp (str, _dl_hwcap_string (i)) == 0)
 	return i;
     }
   return -1;