diff mbox series

[v4,02/16] selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1) only

Message ID 20240520123020.18938-3-ilpo.jarvinen@linux.intel.com
State Superseded
Headers show
Series selftests/resctrl: resctrl_val() related cleanups & improvements | expand

Commit Message

Ilpo Järvinen May 20, 2024, 12:30 p.m. UTC
For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs
the measurement over a duration of sleep(1) call. The memory bandwidth
numbers from IMC are derived over this duration. The resctrl FS derived
memory bandwidth, however, is calculated inside measure_vals() and only
takes delta between the previous value and the current one which
besides the actual test, also samples inter-test noise.

Rework the logic in measure_vals() and get_mem_bw_imc() such that the
resctrl FS memory bandwidth section covers much shorter duration
closely matching that of the IMC perf counters to improve measurement
accuracy. Open two the resctrl mem bw files twice to avoid opening
after the test during measurement period (reading the same file twice
returns the same value so two files are needed).

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---

v4:
- Open resctrl mem bw file (twice) beforehand to avoid opening it during
  the test
v3:
- Don't drop Return: entry from perf_open_imc_mem_bw() func comment
---
 tools/testing/selftests/resctrl/resctrl_val.c | 119 +++++++++++++-----
 1 file changed, 85 insertions(+), 34 deletions(-)

Comments

Ilpo Järvinen May 24, 2024, 7:57 a.m. UTC | #1
On Thu, 23 May 2024, Reinette Chatre wrote:
> On 5/20/24 5:30 AM, Ilpo Järvinen wrote:
> > For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs
> > the measurement over a duration of sleep(1) call. The memory bandwidth
> > numbers from IMC are derived over this duration. The resctrl FS derived
> > memory bandwidth, however, is calculated inside measure_vals() and only
> > takes delta between the previous value and the current one which
> > besides the actual test, also samples inter-test noise.
> > 
> > Rework the logic in measure_vals() and get_mem_bw_imc() such that the
> > resctrl FS memory bandwidth section covers much shorter duration
> > closely matching that of the IMC perf counters to improve measurement
> > accuracy. Open two the resctrl mem bw files twice to avoid opening
> > after the test during measurement period (reading the same file twice
> > returns the same value so two files are needed).
> 
> I think this is only because of how the current reading is done, resctrl
> surely supports keeping a file open and reading from it multiple times.
> 
> There seems to be two things that prevent current code from doing this
> correctly:
> (a) the fscanf() code does not take into account that resctrl also
>     prints a "\n" ... (this seems to be the part that may cause the same
>     value to be returned).
>     So:
> 	if (fscanf(fp, "%lu", mbm_total) <= 0) {
>     should be:
> 	if (fscanf(fp, "%lu\n", mbm_total) <= 0) {
> (b) the current reading does not reset the file position so a second
>     read will attempt to read past the beginning. A "rewind(fp)"
>     should help here.

(b) cannot be the cause for returning the same value again. It would 
not be able to reread the number at all if file position is not moved.

I certainly tried with fseek() and it is when I got same value on the 
second read which is when I just went to two files solution.

> A small program like below worked for me by showing different values
> on every read:
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> 
> const char *mbm_total_path =
> "/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes";
> 
> int main(void)
> {
> 	unsigned long mbm_total;
> 	FILE *fp;
> 	int count;
> 
> 	fp = fopen(mbm_total_path, "r");
> 	if (!fp) {
> 		perror("Opening data file\n");
> 		exit(1);
> 	}
> 	for (count = 0; count < 100; count++) {
> 		if (fscanf(fp, "%lu\n", &mbm_total) <= 0) {
> 			perror("Unable to read from data file\n");
> 			exit(1);
> 		}
> 		printf("Read %d: %lu\n",count ,mbm_total );
> 		sleep(1);
> 		rewind(fp);
> 	}
> 	fclose(fp);
> 	return 0;
> }

Okay, so perhaps it's your explanation (a) but can libc be trusted to not 
do buffering/caching for FILE *? So to be on the safe side, it would 
need to use syscalls directly to guarantee it's read the file twice.

If I convert it into fds, fscanf() cannot be used which would complicate 
the string processing by adding extra steps.
Reinette Chatre May 24, 2024, 3:14 p.m. UTC | #2
Hi Ilpo,

On 5/24/24 12:57 AM, Ilpo Järvinen wrote:
> On Thu, 23 May 2024, Reinette Chatre wrote:
>> On 5/20/24 5:30 AM, Ilpo Järvinen wrote:
>>> For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs
>>> the measurement over a duration of sleep(1) call. The memory bandwidth
>>> numbers from IMC are derived over this duration. The resctrl FS derived
>>> memory bandwidth, however, is calculated inside measure_vals() and only
>>> takes delta between the previous value and the current one which
>>> besides the actual test, also samples inter-test noise.
>>>
>>> Rework the logic in measure_vals() and get_mem_bw_imc() such that the
>>> resctrl FS memory bandwidth section covers much shorter duration
>>> closely matching that of the IMC perf counters to improve measurement
>>> accuracy. Open two the resctrl mem bw files twice to avoid opening
>>> after the test during measurement period (reading the same file twice
>>> returns the same value so two files are needed).
>>
>> I think this is only because of how the current reading is done, resctrl
>> surely supports keeping a file open and reading from it multiple times.
>>
>> There seems to be two things that prevent current code from doing this
>> correctly:
>> (a) the fscanf() code does not take into account that resctrl also
>>      prints a "\n" ... (this seems to be the part that may cause the same
>>      value to be returned).
>>      So:
>> 	if (fscanf(fp, "%lu", mbm_total) <= 0) {
>>      should be:
>> 	if (fscanf(fp, "%lu\n", mbm_total) <= 0) {
>> (b) the current reading does not reset the file position so a second
>>      read will attempt to read past the beginning. A "rewind(fp)"
>>      should help here.
> 
> (b) cannot be the cause for returning the same value again. It would
> not be able to reread the number at all if file position is not moved.

I know. This was not intended to explain the duplicate answer but instead
describe another change required to use current code in a loop. I
specifically said in (a) that "(this seems to be the part that may cause
the same value to be returned)".

> I certainly tried with fseek() and it is when I got same value on the
> second read which is when I just went to two files solution.
> 
>> A small program like below worked for me by showing different values
>> on every read:
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>>
>> const char *mbm_total_path =
>> "/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes";
>>
>> int main(void)
>> {
>> 	unsigned long mbm_total;
>> 	FILE *fp;
>> 	int count;
>>
>> 	fp = fopen(mbm_total_path, "r");
>> 	if (!fp) {
>> 		perror("Opening data file\n");
>> 		exit(1);
>> 	}
>> 	for (count = 0; count < 100; count++) {
>> 		if (fscanf(fp, "%lu\n", &mbm_total) <= 0) {
>> 			perror("Unable to read from data file\n");
>> 			exit(1);
>> 		}
>> 		printf("Read %d: %lu\n",count ,mbm_total );
>> 		sleep(1);
>> 		rewind(fp);
>> 	}
>> 	fclose(fp);
>> 	return 0;
>> }
> 
> Okay, so perhaps it's your explanation (a) but can libc be trusted to not
> do buffering/caching for FILE *? So to be on the safe side, it would

Coding with expectation that libc cannot be trusted sounds strange to me.

> need to use syscalls directly to guarantee it's read the file twice.
> 
> If I convert it into fds, fscanf() cannot be used which would complicate
> the string processing by adding extra steps.
> 

It is not clear to me why you think that fscanf() cannot be used. Could
you please elaborate what the buffering issues are?

It is not necessary to open and close the file every time a value needs
to be read from it.

Reinette
Reinette Chatre May 28, 2024, 3:15 p.m. UTC | #3
Hi Ilpo,

On 5/28/24 3:19 AM, Ilpo Järvinen wrote:
> On Fri, 24 May 2024, Ilpo Järvinen wrote:
> 
>> On Fri, 24 May 2024, Reinette Chatre wrote:
>>> On 5/24/24 12:57 AM, Ilpo Järvinen wrote:
>>>> On Thu, 23 May 2024, Reinette Chatre wrote:
>>>>> On 5/20/24 5:30 AM, Ilpo Järvinen wrote:
>>>>>> For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs
>>>>>> the measurement over a duration of sleep(1) call. The memory bandwidth
>>>>>> numbers from IMC are derived over this duration. The resctrl FS derived
>>>>>> memory bandwidth, however, is calculated inside measure_vals() and only
>>>>>> takes delta between the previous value and the current one which
>>>>>> besides the actual test, also samples inter-test noise.
>>>>>>
>>>>>> Rework the logic in measure_vals() and get_mem_bw_imc() such that the
>>>>>> resctrl FS memory bandwidth section covers much shorter duration
>>>>>> closely matching that of the IMC perf counters to improve measurement
>>>>>> accuracy. Open two the resctrl mem bw files twice to avoid opening
>>>>>> after the test during measurement period (reading the same file twice
>>>>>> returns the same value so two files are needed).
>>>>>
>>>>> I think this is only because of how the current reading is done, resctrl
>>>>> surely supports keeping a file open and reading from it multiple times.
>>>>>
>>>>> There seems to be two things that prevent current code from doing this
>>>>> correctly:
>>>>> (a) the fscanf() code does not take into account that resctrl also
>>>>>       prints a "\n" ... (this seems to be the part that may cause the same
>>>>>       value to be returned).
>>>>>       So:
>>>>> 	if (fscanf(fp, "%lu", mbm_total) <= 0) {
>>>>>       should be:
>>>>> 	if (fscanf(fp, "%lu\n", mbm_total) <= 0) {
>>>>> (b) the current reading does not reset the file position so a second
>>>>>       read will attempt to read past the beginning. A "rewind(fp)"
>>>>>       should help here.
>>>>
>>>> (b) cannot be the cause for returning the same value again. It would
>>>> not be able to reread the number at all if file position is not moved.
>>>
>>> I know. This was not intended to explain the duplicate answer but instead
>>> describe another change required to use current code in a loop. I
>>> specifically said in (a) that "(this seems to be the part that may cause
>>> the same value to be returned)".
>>>
>>>> I certainly tried with fseek() and it is when I got same value on the
>>>> second read which is when I just went to two files solution.
>>>>
>>>>> A small program like below worked for me by showing different values
>>>>> on every read:
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <stdlib.h>
>>>>> #include <unistd.h>
>>>>>
>>>>> const char *mbm_total_path =
>>>>> "/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes";
>>>>>
>>>>> int main(void)
>>>>> {
>>>>> 	unsigned long mbm_total;
>>>>> 	FILE *fp;
>>>>> 	int count;
>>>>>
>>>>> 	fp = fopen(mbm_total_path, "r");
>>>>> 	if (!fp) {
>>>>> 		perror("Opening data file\n");
>>>>> 		exit(1);
>>>>> 	}
>>>>> 	for (count = 0; count < 100; count++) {
>>>>> 		if (fscanf(fp, "%lu\n", &mbm_total) <= 0) {
>>>>> 			perror("Unable to read from data file\n");
>>>>> 			exit(1);
>>>>> 		}
>>>>> 		printf("Read %d: %lu\n",count ,mbm_total );
>>>>> 		sleep(1);
>>>>> 		rewind(fp);
>>>>> 	}
>>>>> 	fclose(fp);
>>>>> 	return 0;
>>>>> }
>>>>
>>>> Okay, so perhaps it's your explanation (a) but can libc be trusted to not
>>>> do buffering/caching for FILE *? So to be on the safe side, it would
>>>
>>> Coding with expectation that libc cannot be trusted sounds strange to me.
>>>
>>>> need to use syscalls directly to guarantee it's read the file twice.
>>>>
>>>> If I convert it into fds, fscanf() cannot be used which would complicate
>>>> the string processing by adding extra steps.
>>>>
>>>
>>> It is not clear to me why you think that fscanf() cannot be used.
>>
>> This was related to fscanf() not being able to read from an fd which is
>> different interface than what libc's FILE * is.

The part I am missing is why you believe syscalls are required. Could
you please elaborate why FILE * cannot be used?

>>
>>> Could you please elaborate what the buffering issues are?
>>
>> I'm pretty sure that by default libc does some buffering (even std*
>> streams are line buffered and others streams even more). I'm not entirely
>> sure about the extent of that buffering but here we need to always read
>> the up to date value from the file itself, not from some buffer.
>>
>> Maybe there never is any problem that the earlier read values are returned
>> from some libc buffer when lseek/rewind is used, I just don't know that
>> for sure. You seem to be more certain but I've not seen on what basis
>> (other than the anecdotial test you provided).

I demonstrated that it works. I have not heard a clear reason why this conclusion
is incorrect. The above remains vague to me and I cannot find a description of
a clear problem that can be studied.

>>
>>> It is not necessary to open and close the file every time a value needs
>>> to be read from it.
> 
> I'm bit unsure where to go with this. While I could change the code to
> match what you described, I realized with the two files approach there's
> no need to do even review/lseek() call during the measurement. It might
> not be very significant compared with the open that was there initially
> but it's still extra.
> 

We are discussing the resctrl selftests that will accompany the resctrl
filesystem in the kernel. When in doubt on how to interact with resctrl users
use the selftests as reference. Needing to open and close a resctrl file
every time a value needs to be read from it is not the correct guidance.

Reinette
Ilpo Järvinen May 31, 2024, 12:51 p.m. UTC | #4
On Thu, 30 May 2024, Reinette Chatre wrote:

> Hi Ilpo,
> 
> On 5/30/24 4:11 AM, Ilpo Järvinen wrote:
> > On Tue, 28 May 2024, Reinette Chatre wrote:
> > > On 5/28/24 3:19 AM, Ilpo Järvinen wrote:
> > > > On Fri, 24 May 2024, Ilpo Järvinen wrote:
> > > > > On Fri, 24 May 2024, Reinette Chatre wrote:
> > > > > > On 5/24/24 12:57 AM, Ilpo Järvinen wrote:
> > > > > > > On Thu, 23 May 2024, Reinette Chatre wrote:
> 
> ...
> 
> > > > > > It is not necessary to open and close the file every time a value
> > > > > > needs
> > > > > > to be read from it.
> > > > 
> > > > I'm bit unsure where to go with this. While I could change the code to
> > > > match what you described, I realized with the two files approach there's
> > > > no need to do even review/lseek() call during the measurement. It might
> > > > not be very significant compared with the open that was there initially
> > > > but it's still extra.
> > > 
> > > We are discussing the resctrl selftests that will accompany the resctrl
> > > filesystem in the kernel. When in doubt on how to interact with resctrl
> > > users
> > > use the selftests as reference. Needing to open and close a resctrl file
> > > every time a value needs to be read from it is not the correct guidance.
> > 
> > That's actually a different goal from the earlier, but I've no problem
> > adjusting to it.
> > 
> > Initially, this open/close() refactoring was made because of another goal
> > which was to avoid doing extra syscalls during the test.
> > 
> 
> It is not clear what you hint at here. Reading twice from an open file
> should not be a huge adjustment so it is not clear to me how this results
> in a big change to this work. As I understand this does match with original
> goal
> of reducing syscalls since the file need not be opened and closed twice.

What I tried to say is that with a single file, the test uses rewind() 
that also needs to do a syscall within the test period, whereas if the 
file is opened twice in advance rewind() won't be needed.

But I've converted it into single file for the sake of serving as an 
example for other resctrl users.
diff mbox series

Patch

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index f55f5989de72..bd5a1cbb8a21 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -306,18 +306,13 @@  static void perf_close_imc_mem_bw(void)
 }
 
 /*
- * get_mem_bw_imc:	Memory band width as reported by iMC counters
- * @cpu_no:		CPU number that the benchmark PID is binded to
- * @bw_report:		Bandwidth report type (reads, writes)
- *
- * Memory B/W utilized by a process on a socket can be calculated using
- * iMC counters. Perf events are used to read these counters.
+ * perf_open_imc_mem_bw - Open perf fds for IMCs
+ * @cpu_no: CPU number that the benchmark PID is binded to
  *
  * Return: = 0 on success. < 0 on failure.
  */
-static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
+static int perf_open_imc_mem_bw(int cpu_no)
 {
-	float reads, writes, of_mul_read, of_mul_write;
 	int imc, ret;
 
 	for (imc = 0; imc < imcs; imc++) {
@@ -325,8 +320,6 @@  static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 		imc_counters_config[imc][WRITE].fd = -1;
 	}
 
-	/* Start all iMC counters to log values (both read and write) */
-	reads = 0, writes = 0, of_mul_read = 1, of_mul_write = 1;
 	for (imc = 0; imc < imcs; imc++) {
 		ret = open_perf_event(imc, cpu_no, READ);
 		if (ret)
@@ -334,7 +327,26 @@  static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 		ret = open_perf_event(imc, cpu_no, WRITE);
 		if (ret)
 			goto close_fds;
+	}
 
+	return 0;
+
+close_fds:
+	perf_close_imc_mem_bw();
+	return -1;
+}
+
+/*
+ * do_mem_bw_test - Perform memory bandwidth test
+ *
+ * Runs memory bandwidth test over one second period. Also, handles starting
+ * and stopping of the IMC perf counters around the test.
+ */
+static void do_imc_mem_bw_test(void)
+{
+	int imc;
+
+	for (imc = 0; imc < imcs; imc++) {
 		membw_ioctl_perf_event_ioc_reset_enable(imc, READ);
 		membw_ioctl_perf_event_ioc_reset_enable(imc, WRITE);
 	}
@@ -346,6 +358,24 @@  static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 		membw_ioctl_perf_event_ioc_disable(imc, READ);
 		membw_ioctl_perf_event_ioc_disable(imc, WRITE);
 	}
+}
+
+/*
+ * get_mem_bw_imc - Memory band width as reported by iMC counters
+ * @bw_report: Bandwidth report type (reads, writes)
+ *
+ * Memory B/W utilized by a process on a socket can be calculated using
+ * iMC counters. Perf events are used to read these counters.
+ *
+ * Return: = 0 on success. < 0 on failure.
+ */
+static int get_mem_bw_imc(char *bw_report, float *bw_imc)
+{
+	float reads, writes, of_mul_read, of_mul_write;
+	int imc;
+
+	/* Start all iMC counters to log values (both read and write) */
+	reads = 0, writes = 0, of_mul_read = 1, of_mul_write = 1;
 
 	/*
 	 * Get results which are stored in struct type imc_counter_config
@@ -462,24 +492,23 @@  static void initialize_mem_bw_resctrl(const char *ctrlgrp, const char *mongrp,
  * 1. If con_mon grp is given, then read from it
  * 2. If con_mon grp is not given, then read from root con_mon grp
  */
-static int get_mem_bw_resctrl(unsigned long *mbm_total)
+static FILE *open_mem_bw_resctrl(const char *mbm_bw_file)
 {
 	FILE *fp;
 
-	fp = fopen(mbm_total_path, "r");
-	if (!fp) {
+	fp = fopen(mbm_bw_file, "r");
+	if (!fp)
 		ksft_perror("Failed to open total bw file");
 
-		return -1;
-	}
+	return fp;
+}
+
+static int get_mem_bw_resctrl(FILE *fp, unsigned long *mbm_total)
+{
 	if (fscanf(fp, "%lu", mbm_total) <= 0) {
 		ksft_perror("Could not get mbm local bytes");
-		fclose(fp);
-
 		return -1;
 	}
-	fclose(fp);
-
 	return 0;
 }
 
@@ -616,13 +645,22 @@  static void initialize_llc_occu_resctrl(const char *ctrlgrp, const char *mongrp,
 }
 
 static int measure_vals(const struct user_params *uparams,
-			struct resctrl_val_param *param,
-			unsigned long *bw_resc_start)
+			struct resctrl_val_param *param)
 {
-	unsigned long bw_resc, bw_resc_end;
+	unsigned long bw_resc, bw_resc_start, bw_resc_end;
+	FILE *mem_bw_fp, *mem_bw_fp2;
 	float bw_imc;
 	int ret;
 
+	mem_bw_fp = open_mem_bw_resctrl(mbm_total_path);
+	if (!mem_bw_fp)
+		return -1;
+	mem_bw_fp2 = open_mem_bw_resctrl(mbm_total_path);
+	if (!mem_bw_fp2) {
+		ret = -1;
+		goto close_fp;
+	}
+
 	/*
 	 * Measure memory bandwidth from resctrl and from
 	 * another source which is perf imc value or could
@@ -630,22 +668,36 @@  static int measure_vals(const struct user_params *uparams,
 	 * Compare the two values to validate resctrl value.
 	 * It takes 1sec to measure the data.
 	 */
-	ret = get_mem_bw_imc(uparams->cpu, param->bw_report, &bw_imc);
+	ret = perf_open_imc_mem_bw(uparams->cpu);
 	if (ret < 0)
-		return ret;
+		goto close_fp2;
 
-	ret = get_mem_bw_resctrl(&bw_resc_end);
+	ret = get_mem_bw_resctrl(mem_bw_fp, &bw_resc_start);
 	if (ret < 0)
-		return ret;
+		goto close_fp2;
 
-	bw_resc = (bw_resc_end - *bw_resc_start) / MB;
-	ret = print_results_bw(param->filename, bm_pid, bw_imc, bw_resc);
-	if (ret)
-		return ret;
+	do_imc_mem_bw_test();
+
+	ret = get_mem_bw_resctrl(mem_bw_fp2, &bw_resc_end);
+	if (ret < 0)
+		goto close_fp2;
+
+	ret = get_mem_bw_imc(param->bw_report, &bw_imc);
+	if (ret < 0)
+		goto close_fp2;
 
-	*bw_resc_start = bw_resc_end;
+	fclose(mem_bw_fp2);
+	fclose(mem_bw_fp);
 
-	return 0;
+	bw_resc = (bw_resc_end - bw_resc_start) / MB;
+
+	return print_results_bw(param->filename, bm_pid, bw_imc, bw_resc);
+
+close_fp2:
+	fclose(mem_bw_fp2);
+close_fp:
+	fclose(mem_bw_fp);
+	return ret;
 }
 
 /*
@@ -719,7 +771,6 @@  int resctrl_val(const struct resctrl_test *test,
 		struct resctrl_val_param *param)
 {
 	char *resctrl_val = param->resctrl_val;
-	unsigned long bw_resc_start = 0;
 	struct sigaction sigact;
 	int ret = 0, pipefd[2];
 	char pipe_message = 0;
@@ -861,7 +912,7 @@  int resctrl_val(const struct resctrl_test *test,
 
 		if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
 		    !strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
-			ret = measure_vals(uparams, param, &bw_resc_start);
+			ret = measure_vals(uparams, param);
 			if (ret)
 				break;
 		} else if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {