diff mbox

[1/5] vmstat: Implement refresh_vm_stats()

Message ID 1338553446-22292-1-git-send-email-anton.vorontsov@linaro.org
State New
Headers show

Commit Message

Anton Vorontsov June 1, 2012, 12:24 p.m. UTC
This function forcibly flushes per-cpu vmstat diff counters to the
global counters.

Note that we don't try to flush percpu pagesets, the pcp will be
still flushed once per 3 seconds.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 include/linux/vmstat.h |    2 ++
 mm/vmstat.c            |   22 +++++++++++++++++++++-
 2 files changed, 23 insertions(+), 1 deletion(-)

Comments

Christoph Lameter (Ampere) June 5, 2012, 2:30 p.m. UTC | #1
On Fri, 1 Jun 2012, Anton Vorontsov wrote:

> This function forcibly flushes per-cpu vmstat diff counters to the
> global counters.

Why is it necessary to have a function that does not expire the pcps? Is
that side effect important? We use refresh_vm_cpu_stats(cpu) in
page_alloc.c already to flush the vmstat counters. Is the flushing of the
pcps in 2 seconds insteads of 3 once really that important?

Also if we do this

Can we therefore also name the function in a different way like

	flush_vmstats()


> @@ -456,11 +457,15 @@ void refresh_cpu_vm_stats(int cpu)
>  				local_irq_restore(flags);
>  				atomic_long_add(v, &zone->vm_stat[i]);
>  				global_diff[i] += v;
> +				if (!drain_pcp)
> +					continue;
>  #ifdef CONFIG_NUMA
>  				/* 3 seconds idle till flush */
>  				p->expire = 3;
>  #endif

Erm. This should be

#ifdef CONFIG_NUMA
	if (drain_pcp)
		p->expire = 3;
#endif

The construct using "continue" is weird.


>  			}
> +		if (!drain_pcp)
> +			continue;
>  		cond_resched();
>  #ifdef CONFIG_NUMA
>  		/*
> @@ -495,6 +500,21 @@ void refresh_cpu_vm_stats(int cpu)
>  			atomic_long_add(global_diff[i], &vm_stat[i]);
>  }
>
> +void refresh_cpu_vm_stats(int cpu)
> +{
> +	__refresh_cpu_vm_stats(cpu, 1);
> +}

Fold __refresh_cpu_vm_stats into this function and modify the caller
of refresh_cpu_vm_stats instead.
KOSAKI Motohiro June 8, 2012, 3:17 a.m. UTC | #2
(6/1/12 8:24 AM), Anton Vorontsov wrote:
> This function forcibly flushes per-cpu vmstat diff counters to the
> global counters.
> 
> Note that we don't try to flush percpu pagesets, the pcp will be
> still flushed once per 3 seconds.
> 
> Signed-off-by: Anton Vorontsov<anton.vorontsov@linaro.org>

No.

This is insane. Your patch improved vmevent accuracy a _bit_. But instead of,
decrease a performance of large systems. That's no good deal. 99% user never
uses vmevent.

MOREOVER, this patch don't solve vmevent accuracy issue AT ALL. because of,
a second is enough big to make large inaccuracy. Modern cpus are fast. Guys,
the fact is, user land monitor can't use implicit batch likes vmstat. That's
a reason why perf don't use vmstat. I suggest you see perf APIs. It may bring
you good inspiration.
diff mbox

Patch

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 65efb92..2a53896 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -200,6 +200,7 @@  extern void __inc_zone_state(struct zone *, enum zone_stat_item);
 extern void dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 
+void refresh_vm_stats(void);
 void refresh_cpu_vm_stats(int);
 void refresh_zone_stat_thresholds(void);
 
@@ -253,6 +254,7 @@  static inline void __dec_zone_page_state(struct page *page,
 
 #define set_pgdat_percpu_threshold(pgdat, callback) { }
 
+static inline void refresh_vm_stats(void) { }
 static inline void refresh_cpu_vm_stats(int cpu) { }
 static inline void refresh_zone_stat_thresholds(void) { }
 
diff --git a/mm/vmstat.c b/mm/vmstat.c
index f600557..4a9d432 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -13,6 +13,7 @@ 
 #include <linux/err.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
 #include <linux/cpu.h>
 #include <linux/vmstat.h>
 #include <linux/sched.h>
@@ -434,7 +435,7 @@  EXPORT_SYMBOL(dec_zone_page_state);
  * with the global counters. These could cause remote node cache line
  * bouncing and will have to be only done when necessary.
  */
-void refresh_cpu_vm_stats(int cpu)
+static void __refresh_cpu_vm_stats(int cpu, bool drain_pcp)
 {
 	struct zone *zone;
 	int i;
@@ -456,11 +457,15 @@  void refresh_cpu_vm_stats(int cpu)
 				local_irq_restore(flags);
 				atomic_long_add(v, &zone->vm_stat[i]);
 				global_diff[i] += v;
+				if (!drain_pcp)
+					continue;
 #ifdef CONFIG_NUMA
 				/* 3 seconds idle till flush */
 				p->expire = 3;
 #endif
 			}
+		if (!drain_pcp)
+			continue;
 		cond_resched();
 #ifdef CONFIG_NUMA
 		/*
@@ -495,6 +500,21 @@  void refresh_cpu_vm_stats(int cpu)
 			atomic_long_add(global_diff[i], &vm_stat[i]);
 }
 
+void refresh_cpu_vm_stats(int cpu)
+{
+	__refresh_cpu_vm_stats(cpu, 1);
+}
+
+static void smp_call_refresh_cpu_vm_stats(void *info)
+{
+	__refresh_cpu_vm_stats(smp_processor_id(), 0);
+}
+
+void refresh_vm_stats(void)
+{
+	smp_call_function(smp_call_refresh_cpu_vm_stats, NULL, 1);
+}
+
 #endif
 
 #ifdef CONFIG_NUMA