diff mbox series

selftests/resctrl: Fix noncont_cat_run_test for AMD

Message ID 3a6c9dd9dc6bda6e2582db049bfe853cd836139f.1717622080.git.babu.moger@amd.com
State New
Headers show
Series selftests/resctrl: Fix noncont_cat_run_test for AMD | expand

Commit Message

Moger, Babu June 5, 2024, 9:36 p.m. UTC
The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
is, AMD supports non contiguous CBM masks but does not report it via CPUID.

Update noncont_cat_run_test to check for the vendor when verifying CPUID.

Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
This was part of the series
https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
Sending this as a separate fix per review comments.
---
 tools/testing/selftests/resctrl/cat_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Reinette Chatre June 6, 2024, 8:33 p.m. UTC | #1
Hi Babu,

On 6/5/24 2:36 PM, Babu Moger wrote:
> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
> is, AMD supports non contiguous CBM masks but does not report it via CPUID.
> 
> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
> 
> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> This was part of the series
> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
> Sending this as a separate fix per review comments.
> ---
>   tools/testing/selftests/resctrl/cat_test.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index d4dffc934bc3..b2988888786e 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>   	else
>   		return -EINVAL;
>   
> -	if (sparse_masks != ((ecx >> 3) & 1)) {
> +	if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) & 1)) {
>   		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>   		return 1;
>   	}

Since AMD does not report this support via CPUID it does not seem
appropriate to use CPUID at all on AMD when doing the hardware check.
I think the above check makes it difficult to understand what is different
on AMD.

What if instead there is a new function, for example,
"static bool arch_supports_noncont_cat(const struct resctrl_test *test)"
that returns true if the hardware supports non-contiguous CBM?

The vendor check can be in there to make it obvious what is going on:

	/* AMD always supports non-contiguous CBM. */
	if (get_vendor() == AMD)
		return true;

	/* CPUID check for Intel here. */

The "sparse_masks" from kernel can then be checked against
hardware support with an appropriate (no mention of CPUID)
error message if this fails.

Reinette
Moger, Babu June 6, 2024, 11:09 p.m. UTC | #2
Hi Reinette,


On 6/6/2024 3:33 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/5/24 2:36 PM, Babu Moger wrote:
>> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
>> is, AMD supports non contiguous CBM masks but does not report it via 
>> CPUID.
>>
>> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
>>
>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT 
>> test")
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> This was part of the series
>> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
>> Sending this as a separate fix per review comments.
>> ---
>>   tools/testing/selftests/resctrl/cat_test.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c 
>> b/tools/testing/selftests/resctrl/cat_test.c
>> index d4dffc934bc3..b2988888786e 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct 
>> resctrl_test *test,
>>       else
>>           return -EINVAL;
>> -    if (sparse_masks != ((ecx >> 3) & 1)) {
>> +    if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) & 
>> 1)) {
>>           ksft_print_msg("CPUID output doesn't match 'sparse_masks' 
>> file content!\n");
>>           return 1;
>>       }
> 
> Since AMD does not report this support via CPUID it does not seem
> appropriate to use CPUID at all on AMD when doing the hardware check.
> I think the above check makes it difficult to understand what is different
> on AMD.
> 
> What if instead there is a new function, for example,
> "static bool arch_supports_noncont_cat(const struct resctrl_test *test)"
> that returns true if the hardware supports non-contiguous CBM?

Sure.

> 
> The vendor check can be in there to make it obvious what is going on:
> 
>      /* AMD always supports non-contiguous CBM. */
>      if (get_vendor() == AMD)
>          return true;
> 
>      /* CPUID check for Intel here. */
> 
> The "sparse_masks" from kernel can then be checked against
> hardware support with an appropriate (no mention of CPUID)
> error message if this fails.
> 

Something like this?


diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index d4dffc934bc3..b75d220f29f6 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -288,11 +288,30 @@ static int cat_run_test(const struct resctrl_test 
*test, const struct user_param
         return ret;
  }

+static bool arch_supports_noncont_cat(const struct resctrl_test *test)
+{
+       unsigned int eax, ebx, ecx, edx;
+
+       /* AMD always supports non-contiguous CBM. */
+       if (get_vendor() == ARCH_AMD) {
+               return true;
+       } else {
+               if (!strcmp(test->resource, "L3"))
+                       __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
+               else if (!strcmp(test->resource, "L2"))
+                       __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
+               else
+                       return false;
+
+               return ((ecx >> 3) & 1);
+       }
+}
+
  static int noncont_cat_run_test(const struct resctrl_test *test,
                                 const struct user_params *uparams)
  {
         unsigned long full_cache_mask, cont_mask, noncont_mask;
-       unsigned int eax, ebx, ecx, edx, sparse_masks;
+       unsigned int sparse_masks;
         int bit_center, ret;
         char schemata[64];

@@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct 
resctrl_test *test,
         if (ret)
                 return ret;

-       if (!strcmp(test->resource, "L3"))
-               __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
-       else if (!strcmp(test->resource, "L2"))
-               __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
-       else
-               return -EINVAL;
-
-       if (sparse_masks != ((ecx >> 3) & 1)) {
-               ksft_print_msg("CPUID output doesn't match 
'sparse_masks' file content!\n");
+       if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
+               ksft_print_msg("Hardware does not support non-contiguous 
CBM!\n");
                 return 1;
         }
Reinette Chatre June 6, 2024, 11:58 p.m. UTC | #3
Hi Babu,

On 6/6/24 4:09 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> 
> On 6/6/2024 3:33 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 6/5/24 2:36 PM, Babu Moger wrote:
>>> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
>>> is, AMD supports non contiguous CBM masks but does not report it via CPUID.
>>>
>>> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
>>>
>>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> This was part of the series
>>> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
>>> Sending this as a separate fix per review comments.
>>> ---
>>>   tools/testing/selftests/resctrl/cat_test.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index d4dffc934bc3..b2988888786e 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>>>       else
>>>           return -EINVAL;
>>> -    if (sparse_masks != ((ecx >> 3) & 1)) {
>>> +    if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) & 1)) {
>>>           ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>>>           return 1;
>>>       }
>>
>> Since AMD does not report this support via CPUID it does not seem
>> appropriate to use CPUID at all on AMD when doing the hardware check.
>> I think the above check makes it difficult to understand what is different
>> on AMD.
>>
>> What if instead there is a new function, for example,
>> "static bool arch_supports_noncont_cat(const struct resctrl_test *test)"
>> that returns true if the hardware supports non-contiguous CBM?
> 
> Sure.
> 
>>
>> The vendor check can be in there to make it obvious what is going on:
>>
>>      /* AMD always supports non-contiguous CBM. */
>>      if (get_vendor() == AMD)
>>          return true;
>>
>>      /* CPUID check for Intel here. */
>>
>> The "sparse_masks" from kernel can then be checked against
>> hardware support with an appropriate (no mention of CPUID)
>> error message if this fails.
>>
> 
> Something like this?
> 
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index d4dffc934bc3..b75d220f29f6 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -288,11 +288,30 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>          return ret;
>   }
> 
> +static bool arch_supports_noncont_cat(const struct resctrl_test *test)
> +{
> +       unsigned int eax, ebx, ecx, edx;
> +
> +       /* AMD always supports non-contiguous CBM. */
> +       if (get_vendor() == ARCH_AMD) {
> +               return true;
> +       } else {

The else can be dropped since it follows a return.
The rest of the code can be prefixed with a matching
comment like:
	/* Intel support for non-contiguous CBM needs to be discovered. */

(please feel free to improve)

> +               if (!strcmp(test->resource, "L3"))
> +                       __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
> +               else if (!strcmp(test->resource, "L2"))
> +                       __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
> +               else
> +                       return false;
> +
> +               return ((ecx >> 3) & 1);
> +       }
> +}
> +
>   static int noncont_cat_run_test(const struct resctrl_test *test,
>                                  const struct user_params *uparams)
>   {
>          unsigned long full_cache_mask, cont_mask, noncont_mask;
> -       unsigned int eax, ebx, ecx, edx, sparse_masks;
> +       unsigned int sparse_masks;
>          int bit_center, ret;
>          char schemata[64];
> 
> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>          if (ret)
>                  return ret;
> 
> -       if (!strcmp(test->resource, "L3"))
> -               __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
> -       else if (!strcmp(test->resource, "L2"))
> -               __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
> -       else
> -               return -EINVAL;
> -
> -       if (sparse_masks != ((ecx >> 3) & 1)) {
> -               ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
> +       if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
> +               ksft_print_msg("Hardware does not support non-contiguous CBM!\n");

Please fix the test as well as the message. It is not an error if hardware does
not support non-contiguous CBM. It is an error if the hardware and kernel disagrees whether
non-contiguous CBM is supported.

Reinette
Ilpo Järvinen June 7, 2024, 10:21 a.m. UTC | #4
On Wed, 5 Jun 2024, Babu Moger wrote:

> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
> is, AMD supports non contiguous CBM masks but does not report it via CPUID.
> 
> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
> 
> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> This was part of the series
> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
> Sending this as a separate fix per review comments.
> ---
>  tools/testing/selftests/resctrl/cat_test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index d4dffc934bc3..b2988888786e 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>  	else
>  		return -EINVAL;
>  
> -	if (sparse_masks != ((ecx >> 3) & 1)) {
> +	if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) & 1)) {
>  		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>  		return 1;
>  	}
> 

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Moger, Babu June 7, 2024, 6:16 p.m. UTC | #5
Hi Reinette,

On 6/6/2024 6:58 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/6/24 4:09 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>>
>> On 6/6/2024 3:33 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 6/5/24 2:36 PM, Babu Moger wrote:
>>>> The selftest noncont_cat_run_test fails on AMD with the warnings. 
>>>> Reason
>>>> is, AMD supports non contiguous CBM masks but does not report it via 
>>>> CPUID.
>>>>
>>>> Update noncont_cat_run_test to check for the vendor when verifying 
>>>> CPUID.
>>>>
>>>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT 
>>>> test")
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> This was part of the series
>>>> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
>>>> Sending this as a separate fix per review comments.
>>>> ---
>>>>   tools/testing/selftests/resctrl/cat_test.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c 
>>>> b/tools/testing/selftests/resctrl/cat_test.c
>>>> index d4dffc934bc3..b2988888786e 100644
>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct 
>>>> resctrl_test *test,
>>>>       else
>>>>           return -EINVAL;
>>>> -    if (sparse_masks != ((ecx >> 3) & 1)) {
>>>> +    if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) 
>>>> & 1)) {
>>>>           ksft_print_msg("CPUID output doesn't match 'sparse_masks' 
>>>> file content!\n");
>>>>           return 1;
>>>>       }
>>>
>>> Since AMD does not report this support via CPUID it does not seem
>>> appropriate to use CPUID at all on AMD when doing the hardware check.
>>> I think the above check makes it difficult to understand what is 
>>> different
>>> on AMD.
>>>
>>> What if instead there is a new function, for example,
>>> "static bool arch_supports_noncont_cat(const struct resctrl_test *test)"
>>> that returns true if the hardware supports non-contiguous CBM?
>>
>> Sure.
>>
>>>
>>> The vendor check can be in there to make it obvious what is going on:
>>>
>>>      /* AMD always supports non-contiguous CBM. */
>>>      if (get_vendor() == AMD)
>>>          return true;
>>>
>>>      /* CPUID check for Intel here. */
>>>
>>> The "sparse_masks" from kernel can then be checked against
>>> hardware support with an appropriate (no mention of CPUID)
>>> error message if this fails.
>>>
>>
>> Something like this?
>>
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c 
>> b/tools/testing/selftests/resctrl/cat_test.c
>> index d4dffc934bc3..b75d220f29f6 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -288,11 +288,30 @@ static int cat_run_test(const struct 
>> resctrl_test *test, const struct user_param
>>          return ret;
>>   }
>>
>> +static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>> +{
>> +       unsigned int eax, ebx, ecx, edx;
>> +
>> +       /* AMD always supports non-contiguous CBM. */
>> +       if (get_vendor() == ARCH_AMD) {
>> +               return true;
>> +       } else {
> 
> The else can be dropped since it follows a return.
Sure

> The rest of the code can be prefixed with a matching
> comment like:
>      /* Intel support for non-contiguous CBM needs to be discovered. */

Sure
> 
> (please feel free to improve)
> 
>> +               if (!strcmp(test->resource, "L3"))
>> +                       __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> +               else if (!strcmp(test->resource, "L2"))
>> +                       __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> +               else
>> +                       return false;
>> +
>> +               return ((ecx >> 3) & 1);
>> +       }
>> +}
>> +
>>   static int noncont_cat_run_test(const struct resctrl_test *test,
>>                                  const struct user_params *uparams)
>>   {
>>          unsigned long full_cache_mask, cont_mask, noncont_mask;
>> -       unsigned int eax, ebx, ecx, edx, sparse_masks;
>> +       unsigned int sparse_masks;
>>          int bit_center, ret;
>>          char schemata[64];
>>
>> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct 
>> resctrl_test *test,
>>          if (ret)
>>                  return ret;
>>
>> -       if (!strcmp(test->resource, "L3"))
>> -               __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> -       else if (!strcmp(test->resource, "L2"))
>> -               __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> -       else
>> -               return -EINVAL;
>> -
>> -       if (sparse_masks != ((ecx >> 3) & 1)) {
>> -               ksft_print_msg("CPUID output doesn't match 
>> 'sparse_masks' file content!\n");
>> +       if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
>> +               ksft_print_msg("Hardware does not support 
>> non-contiguous CBM!\n");
> 
> Please fix the test as well as the message. It is not an error if 
> hardware does
> not support non-contiguous CBM. It is an error if the hardware and 
> kernel disagrees whether
> non-contiguous CBM is supported.

Not sure about this comment.

Did you mean?

  if (!arch_supports_noncont_cat(test)) {
                 ksft_print_msg("Hardware does not support 
non-contiguous CBM!\n");
                 return 0;
          } else if (arch_supports_noncont_cat(test) && !sparse_masks)) {
                 ksft_print_msg("Hardware and kernel support for 
non-contiguous CBM does not match!\n");
                 return 1;
         }
Reinette Chatre June 7, 2024, 9:47 p.m. UTC | #6
Hi Babu,

On 6/7/24 11:16 AM, Moger, Babu wrote:
> On 6/6/2024 6:58 PM, Reinette Chatre wrote:
>> On 6/6/24 4:09 PM, Moger, Babu wrote:

>>> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>>>          if (ret)
>>>                  return ret;
>>>
>>> -       if (!strcmp(test->resource, "L3"))
>>> -               __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>>> -       else if (!strcmp(test->resource, "L2"))
>>> -               __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>>> -       else
>>> -               return -EINVAL;
>>> -
>>> -       if (sparse_masks != ((ecx >> 3) & 1)) {
>>> -               ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>>> +       if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
>>> +               ksft_print_msg("Hardware does not support non-contiguous CBM!\n");
>>
>> Please fix the test as well as the message. It is not an error if hardware does
>> not support non-contiguous CBM. It is an error if the hardware and kernel disagrees whether
>> non-contiguous CBM is supported.
> 
> Not sure about this comment.
> 
> Did you mean?
> 
>   if (!arch_supports_noncont_cat(test)) {
>                  ksft_print_msg("Hardware does not support non-contiguous CBM!\n");
>                  return 0;

The above changes whether support for non-contiguous CBM is treated as an error but the
test should still proceed since the test goes on to write different CBM to the system
and verifies results are as expected based on what hardware supports.

>           } else if (arch_supports_noncont_cat(test) && !sparse_masks)) {
>                  ksft_print_msg("Hardware and kernel support for non-contiguous CBM does not match!\n");
>                  return 1;

I can see how this will work for AMD for the scenario being checked but not for
the different Intel variants.

I think this can all be simplified with something like:
	if (arch_supports_noncont_cat(test) != sparse_masks)) {
		ksft_print_msg("Hardware and kernel differ on non-contiguous CBM support!\n");
		return 1;
	}

I modified the message slightly since non-contiguous CBM does not actually require kernel
support.

What do you think?

Reinette
Moger, Babu June 7, 2024, 10:35 p.m. UTC | #7
Hi Reinette,

On 6/7/2024 4:47 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/7/24 11:16 AM, Moger, Babu wrote:
>> On 6/6/2024 6:58 PM, Reinette Chatre wrote:
>>> On 6/6/24 4:09 PM, Moger, Babu wrote:
> 
>>>> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct 
>>>> resctrl_test *test,
>>>>          if (ret)
>>>>                  return ret;
>>>>
>>>> -       if (!strcmp(test->resource, "L3"))
>>>> -               __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>>>> -       else if (!strcmp(test->resource, "L2"))
>>>> -               __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>>>> -       else
>>>> -               return -EINVAL;
>>>> -
>>>> -       if (sparse_masks != ((ecx >> 3) & 1)) {
>>>> -               ksft_print_msg("CPUID output doesn't match 
>>>> 'sparse_masks' file content!\n");
>>>> +       if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
>>>> +               ksft_print_msg("Hardware does not support 
>>>> non-contiguous CBM!\n");
>>>
>>> Please fix the test as well as the message. It is not an error if 
>>> hardware does
>>> not support non-contiguous CBM. It is an error if the hardware and 
>>> kernel disagrees whether
>>> non-contiguous CBM is supported.
>>
>> Not sure about this comment.
>>
>> Did you mean?
>>
>>   if (!arch_supports_noncont_cat(test)) {
>>                  ksft_print_msg("Hardware does not support 
>> non-contiguous CBM!\n");
>>                  return 0;
> 
> The above changes whether support for non-contiguous CBM is treated as 
> an error but the
> test should still proceed since the test goes on to write different CBM 
> to the system
> and verifies results are as expected based on what hardware supports.
> 
>>           } else if (arch_supports_noncont_cat(test) && !sparse_masks)) {
>>                  ksft_print_msg("Hardware and kernel support for 
>> non-contiguous CBM does not match!\n");
>>                  return 1;
> 
> I can see how this will work for AMD for the scenario being checked but 
> not for
> the different Intel variants.
> 
> I think this can all be simplified with something like:
>      if (arch_supports_noncont_cat(test) != sparse_masks)) {
>          ksft_print_msg("Hardware and kernel differ on non-contiguous 
> CBM support!\n");
>          return 1;
>      }
> 
> I modified the message slightly since non-contiguous CBM does not 
> actually require kernel
> support.
> 
> What do you think?

Yes. That is fine.
Thank you
- Babu Moger
Ilpo Järvinen June 10, 2024, 4:20 p.m. UTC | #8
On Mon, 10 Jun 2024, Babu Moger wrote:

> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason

noncont_cat_run_test()

(In general, use () when refering to a function, same thing in the 
shortlog).

"the warnings" sounds like I should know about what warning it fails with
but there's no previous context which tells that information. I suggest 
you either use "a warning" or quote the warning it fails with into the 
commit message.

> is, AMD supports non contiguous CBM masks but does not report it via CPUID.

non-contiguous

> Update noncont_cat_run_test to check for the vendor when verifying CPUID.

()

> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v2: Moved the non contiguous verification to a new function
>     arch_supports_noncont_cat.
> 
> v1:
> This was part of the series
> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
> Sending this as a separate fix per review comments.
> ---
>  tools/testing/selftests/resctrl/cat_test.c | 32 +++++++++++++++-------
>  1 file changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index d4dffc934bc3..742782438ca3 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -288,11 +288,30 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>  	return ret;
>  }
>  
> +static bool arch_supports_noncont_cat(const struct resctrl_test *test)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	/* AMD always supports non-contiguous CBM. */
> +	if (get_vendor() == ARCH_AMD)
> +		return true;
> +
> +	/* Intel support for non-contiguous CBM needs to be discovered. */
> +	if (!strcmp(test->resource, "L3"))
> +		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
> +	else if (!strcmp(test->resource, "L2"))
> +		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
> +	else
> +		return false;
> +
> +	return ((ecx >> 3) & 1);
> +}
> +
>  static int noncont_cat_run_test(const struct resctrl_test *test,
>  				const struct user_params *uparams)
>  {
>  	unsigned long full_cache_mask, cont_mask, noncont_mask;
> -	unsigned int eax, ebx, ecx, edx, sparse_masks;
> +	unsigned int sparse_masks;
>  	int bit_center, ret;
>  	char schemata[64];
>  
> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>  	if (ret)
>  		return ret;
>  
> -	if (!strcmp(test->resource, "L3"))
> -		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
> -	else if (!strcmp(test->resource, "L2"))
> -		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
> -	else
> -		return -EINVAL;
> -
> -	if (sparse_masks != ((ecx >> 3) & 1)) {
> -		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
> +	if (arch_supports_noncont_cat(test) != sparse_masks) {
> +		ksft_print_msg("Hardware and kernel differ on non-contiguous CBM support!\n");
>  		return 1;

This looks better than the previous version, thanks.
Moger, Babu June 10, 2024, 5:51 p.m. UTC | #9
Hi

On 6/10/24 11:20, Ilpo Järvinen wrote:
> On Mon, 10 Jun 2024, Babu Moger wrote:
> 
>> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
> 
> noncont_cat_run_test()

I want to mention the test here. not function. How about this?

"The selftest non-contiguous CBM test fails on AMD with the warnings."

> 
> (In general, use () when refering to a function, same thing in the 
> shortlog).
> 
> "the warnings" sounds like I should know about what warning it fails with
> but there's no previous context which tells that information. I suggest 
> you either use "a warning" or quote the warning it fails with into the 
> commit message.
> 
>> is, AMD supports non contiguous CBM masks but does not report it via CPUID.
> 
> non-contiguous

Sure.

> 
>> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
> 
> ()

Sure.

> 
>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v2: Moved the non contiguous verification to a new function
>>     arch_supports_noncont_cat.
>>
>> v1:
>> This was part of the series
>> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
>> Sending this as a separate fix per review comments.
>> ---
>>  tools/testing/selftests/resctrl/cat_test.c | 32 +++++++++++++++-------
>>  1 file changed, 22 insertions(+), 10 deletions(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>> index d4dffc934bc3..742782438ca3 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -288,11 +288,30 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>>  	return ret;
>>  }
>>  
>> +static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>> +{
>> +	unsigned int eax, ebx, ecx, edx;
>> +
>> +	/* AMD always supports non-contiguous CBM. */
>> +	if (get_vendor() == ARCH_AMD)
>> +		return true;
>> +
>> +	/* Intel support for non-contiguous CBM needs to be discovered. */
>> +	if (!strcmp(test->resource, "L3"))
>> +		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> +	else if (!strcmp(test->resource, "L2"))
>> +		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> +	else
>> +		return false;
>> +
>> +	return ((ecx >> 3) & 1);
>> +}
>> +
>>  static int noncont_cat_run_test(const struct resctrl_test *test,
>>  				const struct user_params *uparams)
>>  {
>>  	unsigned long full_cache_mask, cont_mask, noncont_mask;
>> -	unsigned int eax, ebx, ecx, edx, sparse_masks;
>> +	unsigned int sparse_masks;
>>  	int bit_center, ret;
>>  	char schemata[64];
>>  
>> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct resctrl_test *test,
>>  	if (ret)
>>  		return ret;
>>  
>> -	if (!strcmp(test->resource, "L3"))
>> -		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> -	else if (!strcmp(test->resource, "L2"))
>> -		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> -	else
>> -		return -EINVAL;
>> -
>> -	if (sparse_masks != ((ecx >> 3) & 1)) {
>> -		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>> +	if (arch_supports_noncont_cat(test) != sparse_masks) {
>> +		ksft_print_msg("Hardware and kernel differ on non-contiguous CBM support!\n");
>>  		return 1;
> 
> This looks better than the previous version, thanks.

Thanks.
Babu Moger
Reinette Chatre June 10, 2024, 9:28 p.m. UTC | #10
Hi Babu,

On 6/10/24 10:51 AM, Moger, Babu wrote:
> Hi
> 
> On 6/10/24 11:20, Ilpo Järvinen wrote:
>> On Mon, 10 Jun 2024, Babu Moger wrote:
>>
>>> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
>>
>> noncont_cat_run_test()
> 
> I want to mention the test here. not function. How about this?
> 
> "The selftest non-contiguous CBM test fails on AMD with the warnings."
> 

This does not take Ilpo's feedback regarding "the warnings" into account.
I agree with Ilpo (keeping his comment below) that "with the warnings"
sounds incomplete. Also, it is not necessary to mention both "selftest"
and "test" in "The selftest non-contiguous CBM test".

You can change it to something like:
	The non-contiguous CBM test fails on AMD with:
		/* insert error message here */

	AMD always supports non-contiguous CBM but does not report
	it via CPUID.

	Fix the non-contiguous CBM test to use CPUID to discover
	non-contiguous CBM support only on Intel.

>>
>> (In general, use () when refering to a function, same thing in the
>> shortlog).
>>
>> "the warnings" sounds like I should know about what warning it fails with
>> but there's no previous context which tells that information. I suggest
>> you either use "a warning" or quote the warning it fails with into the
>> commit message.
>>
>>> is, AMD supports non contiguous CBM masks but does not report it via CPUID.

The "M" in CBM is for "mask" so  there is no need to write "mask" after CBM.

Reinette
Moger, Babu June 10, 2024, 11:11 p.m. UTC | #11
Hi Reinette,

On 6/10/2024 4:32 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> (please do not send new version of patch in response to previous version)

Yes. I saw your comments on other thread. I will take care of it.

> 
> On 6/10/24 9:00 AM, Babu Moger wrote:
>> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
>> is, AMD supports non contiguous CBM masks but does not report it via 
>> CPUID.
>>
>> Update noncont_cat_run_test to check for the vendor when verifying CPUID.
>>
>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT 
>> test")
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> The changelog needs to be reworked based on discussion in other thread.

Sure.

> With that complete, for this patch you can add:
> | Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Thanks
Ilpo Järvinen June 11, 2024, 6:50 a.m. UTC | #12
On Mon, 10 Jun 2024, Moger, Babu wrote:
> On 6/10/24 11:20, Ilpo Järvinen wrote:
> > On Mon, 10 Jun 2024, Babu Moger wrote:
> > 
> >> The selftest noncont_cat_run_test fails on AMD with the warnings. Reason
> > 
> > noncont_cat_run_test()
> 
> I want to mention the test here. not function. How about this?
> 
> "The selftest non-contiguous CBM test fails on AMD with the warnings."

Yes, it's possible to refer to something with natural language (in fact, I 
personally find that better than using a function name when both options 
exist).

The underscores are C artifacts to replace spaces that do not belong to 
natural language so if one uses underscores, I'll always take it as a 
direct reference to C code.

The quote still has "the warnings" though (but I see Reinette already 
noted that).

> > (In general, use () when refering to a function, same thing in the 
> > shortlog).
> > 
> > "the warnings" sounds like I should know about what warning it fails with
> > but there's no previous context which tells that information. I suggest 
> > you either use "a warning" or quote the warning it fails with into the 
> > commit message.
> > 
> >> is, AMD supports non contiguous CBM masks but does not report it via CPUID.
Reinette Chatre June 11, 2024, 11:14 p.m. UTC | #13
On 6/10/24 4:11 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 6/10/2024 4:32 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> (please do not send new version of patch in response to previous version)
> 
> Yes. I saw your comments on other thread. I will take care of it.
> 

... and yet you proceed to send v3 in response to previous version again.
diff mbox series

Patch

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index d4dffc934bc3..b2988888786e 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -308,7 +308,7 @@  static int noncont_cat_run_test(const struct resctrl_test *test,
 	else
 		return -EINVAL;
 
-	if (sparse_masks != ((ecx >> 3) & 1)) {
+	if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3) & 1)) {
 		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
 		return 1;
 	}