diff mbox series

[v4,04/19] selftests/resctrl: Close perf value read fd on errors

Message ID 20230713131932.133258-5-ilpo.jarvinen@linux.intel.com
State New
Series selftests/resctrl: Fixes and cleanups

Commit Message

Ilpo Järvinen July 13, 2023, 1:19 p.m. UTC
Perf event fd (fd_lm) is not closed on some error paths.

Always close fd_lm in get_llc_perf() and add close into an error
handling block in cat_val().

Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 tools/testing/selftests/resctrl/cache.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
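
The "always close" pattern described in the commit message can be illustrated in isolation. The sketch below is not the selftest code: read_and_close() is a hypothetical helper, and it reads from an ordinary file descriptor rather than a perf event fd. One detail it adds on top of the patch, as an assumption about what is desirable, is saving errno before close(), since a failing close() would overwrite the errno that perror() is meant to report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of cache.c): read one value from fd,
 * close fd on every path, and report a read() failure using the errno
 * that read() itself set.
 */
static int read_and_close(int fd, unsigned long *val)
{
	unsigned long buf = 0;
	int saved_errno;
	ssize_t ret;

	assert(val != NULL);

	ret = read(fd, &buf, sizeof(buf));
	saved_errno = errno;	/* capture before close() can change it */
	close(fd);		/* runs on success and on failure */

	if (ret == -1) {
		errno = saved_errno;
		perror("Could not read value");
		return -1;
	}
	if (ret != (ssize_t)sizeof(buf)) {
		fprintf(stderr, "Short read\n");
		return -1;
	}

	*val = buf;
	return 0;
}
```

Because close() sits between read() and the error check, no early return can skip it, which is the property the patch relies on in get_llc_perf().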

Comments

Reinette Chatre July 14, 2023, 5:36 p.m. UTC | #1
Hi Ilpo,

On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> On Thu, 13 Jul 2023, Reinette Chatre wrote:
>> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
>>> Perf event fd (fd_lm) is not closed on some error paths.
>>>
>>> Always close fd_lm in get_llc_perf() and add close into an error
>>> handling block in cat_val().
>>>
>>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> ---
>>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
>>> index 8a4fe8693be6..ced47b445d1e 100644
>>> --- a/tools/testing/selftests/resctrl/cache.c
>>> +++ b/tools/testing/selftests/resctrl/cache.c
>>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
>>>  static int get_llc_perf(unsigned long *llc_perf_miss)
>>>  {
>>>  	__u64 total_misses;
>>> +	int ret;
>>>  
>>>  	/* Stop counters after one span to get miss rate */
>>>  
>>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
>>>  
>>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
>>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
>>> +	close(fd_lm);
>>> +	if (ret == -1) {
>>>  		perror("Could not get llc misses through perf");
>>> -
>>>  		return -1;
>>>  	}
>>>  
>>>  	total_misses = rf_cqm.values[0].value;
>>> -
>>> -	close(fd_lm);
>>> -
>>>  	*llc_perf_miss = total_misses;
>>>  
>>>  	return 0;
>>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
>>>  					 memflush, operation, resctrl_val)) {
>>>  				fprintf(stderr, "Error-running fill buffer\n");
>>>  				ret = -1;
>>> +				close(fd_lm);
>>>  				break;
>>>  			}
>>>  
>>
>> Instead of fixing these existing patterns I think it would make the code
>> easier to understand and maintain if it is made symmetrical.
>> Having the perf event fd opened in one place but its close()
>> scattered elsewhere has the potential for confusion and making later
>> mistakes easy to miss.
>>
>> What if perf event fd is closed in a new "disable_llc_perf()" that
>> is matched with "reset_enable_llc_perf()" and called
>> from cat_val()?
>>
>> I think this raises another issue with the test trickery where
>> measure_cache_vals() has some assumptions about state based on the
>> test name.
> 
> I very much agree on the principle here, and thus I have already created
> patches which will do a major cleanup in this area. The cleaned-up code
> has pe_fd as a local variable in cat_val() and handles closing it in
> cat_val() with the usual patterns.
> 
> However, the patch currently resides after the L3 CAT test rewrite.
> Backporting the cleanups/refactors into this series would require
> considerable effort due to how convoluted all those n-step cleanup patches
> and the L3 CAT test rewrite are in this area. There's just very much to
> clean up here and the L3 rewrite will touch the same areas, so it's a net
> full of conflicts.
> 
> Do you want me to spend the effort to backport them into this series
> (I expect it will take some time)?

Considering the "Fixes" tag, having a smaller fix that can easily
be backported would be ideal so I am ok with deferring a bigger
rework.

I do think this fix can be made more robust with a couple of small
changes that should not introduce significant conflicts:
* initialize fd_lm to -1 
* do not close() fd_lm in get_llc_perf() but instead move its
  close() to the exit of cat_val().
* add check in get_llc_perf() that it does not attempt ioctl()
  on "fd_lm == -1" (later addition would be error checking of
  the ioctl())
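
The three suggestions above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the selftest code: open_counter(), read_counter(), close_counter() and run_once() are hypothetical stand-ins for reset_enable_llc_perf(), get_llc_perf(), the proposed close at exit, and cat_val(), and /dev/zero stands in for the perf event so that the sketch runs without perf privileges.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the selftest's global perf event fd; -1 means "not open". */
static int fd_lm = -1;

/* Hypothetical stand-in for reset_enable_llc_perf(): opens /dev/zero. */
static int open_counter(void)
{
	fd_lm = open("/dev/zero", O_RDONLY);
	if (fd_lm == -1) {
		perror("Could not open counter");
		return -1;
	}
	return 0;
}

/* Hypothetical stand-in for get_llc_perf(): never closes the fd itself. */
static int read_counter(unsigned long *val)
{
	unsigned long buf = 0;

	assert(val != NULL);
	if (fd_lm == -1)	/* nothing was opened, so do not touch the fd */
		return -1;
	if (read(fd_lm, &buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
		fprintf(stderr, "Could not read counter\n");
		return -1;
	}
	*val = buf;
	return 0;
}

/* Single close point; safe to call whether or not the fd is open. */
static void close_counter(void)
{
	if (fd_lm != -1) {
		close(fd_lm);
		fd_lm = -1;
	}
}

/* Hypothetical stand-in for cat_val(): every path leaves through "out". */
static int run_once(int fail_midway)
{
	unsigned long val;
	int ret;

	ret = open_counter();
	if (ret)
		goto out;
	if (fail_midway) {	/* simulates the fill buffer error path */
		ret = -1;
		goto out;
	}
	ret = read_counter(&val);
out:
	close_counter();
	return ret;
}
```

With the fd reset to -1 after closing, a repeated close_counter() call or a read_counter() call made before any open is harmless, which is the robustness the list is aiming for.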

> I currently have these items pending besides this series (in order):
> - L3 CAT test rewrite and its preparatory patches
> - More cleanups (including the pe_fd cleanup)
> - New generalized test framework
> - L2 CAT test

Thank you very much for taking this on.

Reinette

Ilpo Järvinen July 17, 2023, 1:05 p.m. UTC | #2
On Fri, 14 Jul 2023, Reinette Chatre wrote:
> On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> > On Thu, 13 Jul 2023, Reinette Chatre wrote:
> >> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
> >>> Perf event fd (fd_lm) is not closed on some error paths.
> >>>
> >>> Always close fd_lm in get_llc_perf() and add close into an error
> >>> handling block in cat_val().
> >>>
> >>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> ---
> >>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
> >>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> >>> index 8a4fe8693be6..ced47b445d1e 100644
> >>> --- a/tools/testing/selftests/resctrl/cache.c
> >>> +++ b/tools/testing/selftests/resctrl/cache.c
> >>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> >>>  static int get_llc_perf(unsigned long *llc_perf_miss)
> >>>  {
> >>>  	__u64 total_misses;
> >>> +	int ret;
> >>>  
> >>>  	/* Stop counters after one span to get miss rate */
> >>>  
> >>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >>>  
> >>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
> >>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> >>> +	close(fd_lm);
> >>> +	if (ret == -1) {
> >>>  		perror("Could not get llc misses through perf");
> >>> -
> >>>  		return -1;
> >>>  	}
> >>>  
> >>>  	total_misses = rf_cqm.values[0].value;
> >>> -
> >>> -	close(fd_lm);
> >>> -
> >>>  	*llc_perf_miss = total_misses;
> >>>  
> >>>  	return 0;
> >>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
> >>>  					 memflush, operation, resctrl_val)) {
> >>>  				fprintf(stderr, "Error-running fill buffer\n");
> >>>  				ret = -1;
> >>> +				close(fd_lm);
> >>>  				break;
> >>>  			}
> >>>  
> >>
> >> Instead of fixing these existing patterns I think it would make the code
> >> easier to understand and maintain if it is made symmetrical.
> >> Having the perf event fd opened in one place but its close()
> >> scattered elsewhere has the potential for confusion and making later
> >> mistakes easy to miss.
> >>
> >> What if perf event fd is closed in a new "disable_llc_perf()" that
> >> is matched with "reset_enable_llc_perf()" and called
> >> from cat_val()?
> >>
> >> I think this raises another issue with the test trickery where
> >> measure_cache_vals() has some assumptions about state based on the
> >> test name.
> > 
> > I very much agree on the principle here, and thus I have already created
> > patches which will do a major cleanup in this area. The cleaned-up code
> > has pe_fd as a local variable in cat_val() and handles closing it in
> > cat_val() with the usual patterns.
> > 
> > However, the patch currently resides after the L3 CAT test rewrite.
> > Backporting the cleanups/refactors into this series would require
> > considerable effort due to how convoluted all those n-step cleanup patches
> > and the L3 CAT test rewrite are in this area. There's just very much to
> > clean up here and the L3 rewrite will touch the same areas, so it's a net
> > full of conflicts.
> > 
> > Do you want me to spend the effort to backport them into this series
> > (I expect it will take some time)?
> 
> Considering the "Fixes" tag, having a smaller fix that can easily
> be backported would be ideal so I am ok with deferring a bigger
> rework.
> 
> I do think this fix can be made more robust with a couple of small
> changes that should not introduce significant conflicts:
> * initialize fd_lm to -1 

> * do not close() fd_lm in get_llc_perf() but instead move its
>   close() to the exit of cat_val().

I changed the test to only close the fd in cat_val(), which is the
direction the later refactor/cleanup changes (not in this series) were
moving anyway.

> * add check in get_llc_perf() that it does not attempt ioctl()
>   on "fd_lm == -1" (later addition would be error checking of
>   the ioctl())

The other two suggestions seem unnecessary and I have not implemented
them. I don't think fd_lm can be -1 at ioctl(). Given this code is going
to be replaced soonish, putting any extra "safety" effort into it now
seems a waste of time.

Reinette Chatre July 17, 2023, 4:09 p.m. UTC | #3
Hi Ilpo,

On 7/17/2023 6:05 AM, Ilpo Järvinen wrote:
> On Fri, 14 Jul 2023, Reinette Chatre wrote:
>> * add check in get_llc_perf() that it does not attempt ioctl()
>>   on "fd_lm == -1" (later addition would be error checking of
>>   the ioctl())
> 
> The other two suggestions seem unnecessary and I have not implemented
> them. I don't think fd_lm can be -1 at ioctl(). Given this code is going
> to be replaced soonish, putting any extra "safety" effort into it now
> seems a waste of time.

Yes, this suggestion was indeed to make the code more robust. I
certainly do not want to waste your time. Please keep in mind 
when you respond that I do not have insight into the reworks
you are still planning. 

Reinette

Patch

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 8a4fe8693be6..ced47b445d1e 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -87,21 +87,20 @@  static int reset_enable_llc_perf(pid_t pid, int cpu_no)
 static int get_llc_perf(unsigned long *llc_perf_miss)
 {
 	__u64 total_misses;
+	int ret;
 
 	/* Stop counters after one span to get miss rate */
 
 	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
 
-	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
+	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
+	close(fd_lm);
+	if (ret == -1) {
 		perror("Could not get llc misses through perf");
-
 		return -1;
 	}
 
 	total_misses = rf_cqm.values[0].value;
-
-	close(fd_lm);
-
 	*llc_perf_miss = total_misses;
 
 	return 0;
@@ -253,6 +252,7 @@  int cat_val(struct resctrl_val_param *param)
 					 memflush, operation, resctrl_val)) {
 				fprintf(stderr, "Error-running fill buffer\n");
 				ret = -1;
+				close(fd_lm);
 				break;
 			}