[v2,1/5] app/testpmd: clock gettime call in throughput calculation

Message ID 20200626220943.22526-1-honnappa.nagarahalli@arm.com
State Superseded
Headers show
Series
  • [v2,1/5] app/testpmd: clock gettime call in throughput calculation
Related show

Commit Message

Honnappa Nagarahalli June 26, 2020, 10:09 p.m.
The throughput calculation requires a counter that measures
passing of time. However, the kernel saves and restores the PMU
state when a thread is unscheduled and scheduled. This ensures
that the PMU cycles are not counted towards a thread that is
not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
the PMU cycles do not represent the passing of time.
This results in incorrect calculation of throughput numbers.
Use clock_gettime system call to calculate the time passed since
last call.

Bugzilla ID: 450
Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
Cc: zhihong.wang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

Reviewed-by: Phil Yang <phil.yang@arm.com>

Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

Tested-by: Phil Yang <phil.yang@arm.com>

Tested-by: Ali Alnubani <alialnu@mellanox.com>

---
v2: Updated commit log in 1/5 (Jerin)

 app/test-pmd/config.c | 44 +++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

-- 
2.17.1

Comments

Ferruh Yigit July 6, 2020, 3:36 p.m. | #1
On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> The throughput calculation requires a counter that measures

> passing of time. However, the kernel saves and restores the PMU

> state when a thread is unscheduled and scheduled. This ensures

> that the PMU cycles are not counted towards a thread that is

> not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,

> the PMU cycles do not represent the passing of time.


Does this mean 'rte_rdtsc()' is broken for Arm?
Wouldn't it cause more serious issue than testpmd throughput stats?

> This results in incorrect calculation of throughput numbers.

> Use clock_gettime system call to calculate the time passed since

> last call.

> 

> Bugzilla ID: 450

> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")

> Cc: zhihong.wang@intel.com

> Cc: stable@dpdk.org

> 

> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> Reviewed-by: Phil Yang <phil.yang@arm.com>

> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

> Tested-by: Phil Yang <phil.yang@arm.com>

> Tested-by: Ali Alnubani <alialnu@mellanox.com>


<...>

> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)

>  		}

>  	}

>  

> -	diff_cycles = prev_cycles[port_id];

> -	prev_cycles[port_id] = rte_rdtsc();

> -	if (diff_cycles > 0)

> -		diff_cycles = prev_cycles[port_id] - diff_cycles;

> +	diff_ns = 0;

> +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {


I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think
it is OK.

<...>

> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)

>  		(stats.obytes - prev_bytes_tx[port_id]) : 0;

>  	prev_bytes_rx[port_id] = stats.ibytes;

>  	prev_bytes_tx[port_id] = stats.obytes;

> -	mbps_rx = diff_cycles > 0 ?

> -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;

> -	mbps_tx = diff_cycles > 0 ?

> -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;

> +	mbps_rx = diff_ns > 0 ?

> +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;

> +	mbps_tx = diff_ns > 0 ?

> +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;


The calculation also fixes other issue in the stats.
With previous method, if the sampling between two stats is a little long,
"diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.


Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Honnappa Nagarahalli July 6, 2020, 4:53 p.m. | #2
Hi Ferruh,
	Thanks for your comments.

<snip>

> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput

> calculation

> 

> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:

> > The throughput calculation requires a counter that measures passing of

> > time. However, the kernel saves and restores the PMU state when a

> > thread is unscheduled and scheduled. This ensures that the PMU cycles

> > are not counted towards a thread that is not scheduled. Hence, when

> > RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not

> represent

> > the passing of time.

> 

> Does this mean 'rte_rdtsc()' is broken for Arm?

It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.

> Wouldn't it cause more serious issue than testpmd throughput stats?

Within DPDK, it does not seem to be used for anything critical (I have fixed one).

> 

> > This results in incorrect calculation of throughput numbers.

> > Use clock_gettime system call to calculate the time passed since last

> > call.

> >

> > Bugzilla ID: 450

> > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")

> > Cc: zhihong.wang@intel.com

> > Cc: stable@dpdk.org

> >

> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> > Reviewed-by: Phil Yang <phil.yang@arm.com>

> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

> > Tested-by: Phil Yang <phil.yang@arm.com>

> > Tested-by: Ali Alnubani <alialnu@mellanox.com>

> 

> <...>

> 

> > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)

> >  		}

> >  	}

> >

> > -	diff_cycles = prev_cycles[port_id];

> > -	prev_cycles[port_id] = rte_rdtsc();

> > -	if (diff_cycles > 0)

> > -		diff_cycles = prev_cycles[port_id] - diff_cycles;

> > +	diff_ns = 0;

> > +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {

> 

> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is

> OK.

> 

> <...>

> 

> > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)

> >  		(stats.obytes - prev_bytes_tx[port_id]) : 0;

> >  	prev_bytes_rx[port_id] = stats.ibytes;

> >  	prev_bytes_tx[port_id] = stats.obytes;

> > -	mbps_rx = diff_cycles > 0 ?

> > -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;

> > -	mbps_tx = diff_cycles > 0 ?

> > -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;

> > +	mbps_rx = diff_ns > 0 ?

> > +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;

> > +	mbps_tx = diff_ns > 0 ?

> > +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;

> 

> The calculation also fixes other issue in the stats.

> With previous method, if the sampling between two stats is a little long,

> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.

> 

> 

> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit July 6, 2020, 5:19 p.m. | #3
On 7/6/2020 5:53 PM, Honnappa Nagarahalli wrote:
> Hi Ferruh,

> 	Thanks for your comments.

> 

> <snip>

> 

>> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput

>> calculation

>>

>> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:

>>> The throughput calculation requires a counter that measures passing of

>>> time. However, the kernel saves and restores the PMU state when a

>>> thread is unscheduled and scheduled. This ensures that the PMU cycles

>>> are not counted towards a thread that is not scheduled. Hence, when

>>> RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not

>> represent

>>> the passing of time.

>>

>> Does this mean 'rte_rdtsc()' is broken for Arm?

> It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.

> 

>> Wouldn't it cause more serious issue than testpmd throughput stats?

> Within DPDK, it does not seem to be used for anything critical (I have fixed one).


OK, thanks for clarification.

> 

>>

>>> This results in incorrect calculation of throughput numbers.

>>> Use clock_gettime system call to calculate the time passed since last

>>> call.

>>>

>>> Bugzilla ID: 450

>>> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")

>>> Cc: zhihong.wang@intel.com

>>> Cc: stable@dpdk.org

>>>

>>> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

>>> Reviewed-by: Phil Yang <phil.yang@arm.com>

>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

>>> Tested-by: Phil Yang <phil.yang@arm.com>

>>> Tested-by: Ali Alnubani <alialnu@mellanox.com>

>>

>> <...>

>>

>>> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)

>>>  		}

>>>  	}

>>>

>>> -	diff_cycles = prev_cycles[port_id];

>>> -	prev_cycles[port_id] = rte_rdtsc();

>>> -	if (diff_cycles > 0)

>>> -		diff_cycles = prev_cycles[port_id] - diff_cycles;

>>> +	diff_ns = 0;

>>> +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {

>>

>> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is

>> OK.

>>

>> <...>

>>

>>> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)

>>>  		(stats.obytes - prev_bytes_tx[port_id]) : 0;

>>>  	prev_bytes_rx[port_id] = stats.ibytes;

>>>  	prev_bytes_tx[port_id] = stats.obytes;

>>> -	mbps_rx = diff_cycles > 0 ?

>>> -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;

>>> -	mbps_tx = diff_cycles > 0 ?

>>> -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;

>>> +	mbps_rx = diff_ns > 0 ?

>>> +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;

>>> +	mbps_tx = diff_ns > 0 ?

>>> +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;

>>

>> The calculation also fixes other issue in the stats.

>> With previous method, if the sampling between two stats is a little long,

>> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.

>>

>>

>> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Patch

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 016bcb09c..91fbf99f8 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -54,6 +54,14 @@ 
 
 #define ETHDEV_FWVERS_LEN 32
 
+#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
+#else
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC
+#endif
+
+#define NS_PER_SEC 1E9
+
 static char *flowtype_to_str(uint16_t flow_type);
 
 static const struct {
@@ -136,9 +144,10 @@  nic_stats_display(portid_t port_id)
 	static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS];
 	static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS];
 	static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS];
-	static uint64_t prev_cycles[RTE_MAX_ETHPORTS];
+	static uint64_t prev_ns[RTE_MAX_ETHPORTS];
+	struct timespec cur_time;
 	uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx,
-								diff_cycles;
+								diff_ns;
 	uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx;
 	struct rte_eth_stats stats;
 	struct rte_port *port = &ports[port_id];
@@ -195,10 +204,17 @@  nic_stats_display(portid_t port_id)
 		}
 	}
 
-	diff_cycles = prev_cycles[port_id];
-	prev_cycles[port_id] = rte_rdtsc();
-	if (diff_cycles > 0)
-		diff_cycles = prev_cycles[port_id] - diff_cycles;
+	diff_ns = 0;
+	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
+		uint64_t ns;
+
+		ns = cur_time.tv_sec * NS_PER_SEC;
+		ns += cur_time.tv_nsec;
+
+		if (prev_ns[port_id] != 0)
+			diff_ns = ns - prev_ns[port_id];
+		prev_ns[port_id] = ns;
+	}
 
 	diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ?
 		(stats.ipackets - prev_pkts_rx[port_id]) : 0;
@@ -206,10 +222,10 @@  nic_stats_display(portid_t port_id)
 		(stats.opackets - prev_pkts_tx[port_id]) : 0;
 	prev_pkts_rx[port_id] = stats.ipackets;
 	prev_pkts_tx[port_id] = stats.opackets;
-	mpps_rx = diff_cycles > 0 ?
-		diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0;
-	mpps_tx = diff_cycles > 0 ?
-		diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0;
+	mpps_rx = diff_ns > 0 ?
+		(double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0;
+	mpps_tx = diff_ns > 0 ?
+		(double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0;
 
 	diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ?
 		(stats.ibytes - prev_bytes_rx[port_id]) : 0;
@@ -217,10 +233,10 @@  nic_stats_display(portid_t port_id)
 		(stats.obytes - prev_bytes_tx[port_id]) : 0;
 	prev_bytes_rx[port_id] = stats.ibytes;
 	prev_bytes_tx[port_id] = stats.obytes;
-	mbps_rx = diff_cycles > 0 ?
-		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
-	mbps_tx = diff_cycles > 0 ?
-		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
+	mbps_rx = diff_ns > 0 ?
+		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
+	mbps_tx = diff_ns > 0 ?
+		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
 
 	printf("\n  Throughput (since last show)\n");
 	printf("  Rx-pps: %12"PRIu64"          Rx-bps: %12"PRIu64"\n  Tx-pps: %12"