[2/4] perf: Keep AUX flags in the output handle

Message ID 20170220133352.17995-3-alexander.shishkin@linux.intel.com
State New
Headers show
Series
  • [1/4] perf: Export AUX buffer helpers to modules
Related show

Commit Message

Alexander Shishkin Feb. 20, 2017, 1:33 p.m.
From: Will Deacon <will.deacon@arm.com>


In preparation for adding more flags to perf AUX records, introduce a
separate API for setting the flags for a session, rather than appending
more bool arguments to perf_aux_output_end. This allows to set each
flag at the time a corresponding condition is detected, instead of
tracking it in each driver's private state.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

---
 arch/x86/events/intel/bts.c                      | 16 +++++------
 arch/x86/events/intel/pt.c                       | 17 ++++++------
 arch/x86/events/intel/pt.h                       |  1 -
 drivers/hwtracing/coresight/coresight-etb10.c    |  7 +++--
 drivers/hwtracing/coresight/coresight-etm-perf.c |  9 +++----
 drivers/hwtracing/coresight/coresight-priv.h     |  2 --
 drivers/hwtracing/coresight/coresight-tmc-etf.c  |  7 +++--
 include/linux/coresight.h                        |  2 +-
 include/linux/perf_event.h                       |  8 +++---
 kernel/events/ring_buffer.c                      | 34 ++++++++++++++++--------
 10 files changed, 55 insertions(+), 48 deletions(-)

-- 
2.11.0

Comments

kernel test robot Feb. 20, 2017, 5:01 p.m. | #1
Hi Will,

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.10 next-20170220]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Alexander-Shishkin/perf-pt-coresight-AUX-flags-and-VMX-update/20170220-220625
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm64 

All errors (new ones prefixed by >>):

   In file included from ./arch/arm64/include/generated/asm/local.h:1:0,
                    from drivers/hwtracing/coresight/coresight-etb10.c:15:
   drivers/hwtracing/coresight/coresight-etb10.c: In function 'etb_update_buffer':
>> drivers/hwtracing/coresight/coresight-etb10.c:431:17: error: 'struct cs_buffers' has no member named 'lost'

      local_inc(&buf->lost);
                    ^
   include/asm-generic/local.h:30:40: note: in definition of macro 'local_inc'
    #define local_inc(l) atomic_long_inc(&(l)->a)
                                           ^

vim +431 drivers/hwtracing/coresight/coresight-etb10.c

2997aa40 Mathieu Poirier 2016-02-17  425  		read_ptr = (write_ptr + drvdata->buffer_depth) -
2997aa40 Mathieu Poirier 2016-02-17  426  					to_read / ETB_FRAME_SIZE_WORDS;
2997aa40 Mathieu Poirier 2016-02-17  427  		/* Wrap around if need be*/
bedffda8 Mathieu Poirier 2016-05-03  428  		if (read_ptr > (drvdata->buffer_depth - 1))
bedffda8 Mathieu Poirier 2016-05-03  429  			read_ptr -= drvdata->buffer_depth;
2997aa40 Mathieu Poirier 2016-02-17  430  		/* let the decoder know we've skipped ahead */
2997aa40 Mathieu Poirier 2016-02-17 @431  		local_inc(&buf->lost);
2997aa40 Mathieu Poirier 2016-02-17  432  	}
2997aa40 Mathieu Poirier 2016-02-17  433  
2997aa40 Mathieu Poirier 2016-02-17  434  	/* finally tell HW where we want to start reading from */

:::::: The code at line 431 was first introduced by commit
:::::: 2997aa4063d97fdb39450c6078bd81a7b0504f22 coresight: etb10: implementing AUX API

:::::: TO: Mathieu Poirier <mathieu.poirier@linaro.org>
:::::: CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Alexander Shishkin Feb. 20, 2017, 5:17 p.m. | #2
kbuild test robot <lkp@intel.com> writes:

> All errors (new ones prefixed by >>):

>

>    In file included from ./arch/arm64/include/generated/asm/local.h:1:0,

>                     from drivers/hwtracing/coresight/coresight-etb10.c:15:

>    drivers/hwtracing/coresight/coresight-etb10.c: In function 'etb_update_buffer':

>>> drivers/hwtracing/coresight/coresight-etb10.c:431:17: error: 'struct cs_buffers' has no member named 'lost'

>       local_inc(&buf->lost);

>                     ^

>    include/asm-generic/local.h:30:40: note: in definition of macro 'local_inc'


Ah shoot. Peter, can you fold this in:

From 8272adc208eb2ad874e3952766282624d035ea50 Mon Sep 17 00:00:00 2001
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>

Date: Mon, 20 Feb 2017 19:15:27 +0200
Subject: [PATCH] fixup! perf: Keep AUX flags in the output handle

---
 drivers/hwtracing/coresight/coresight-etb10.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.11.0diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 82c8ddcf09..979ea6ec79 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -428,7 +428,7 @@ static void etb_update_buffer(struct coresight_device *csdev,
 		if (read_ptr > (drvdata->buffer_depth - 1))
 			read_ptr -= drvdata->buffer_depth;
 		/* let the decoder know we've skipped ahead */
-		local_inc(&buf->lost);
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 	}
 
 	/* finally tell HW where we want to start reading from */

Mathieu Poirier Feb. 20, 2017, 8:42 p.m. | #3
On 20 February 2017 at 06:33, Alexander Shishkin
<alexander.shishkin@linux.intel.com> wrote:
> From: Will Deacon <will.deacon@arm.com>

>

> In preparation for adding more flags to perf AUX records, introduce a

> separate API for setting the flags for a session, rather than appending

> more bool arguments to perf_aux_output_end. This allows to set each

> flag at the time a corresponding condition is detected, instead of

> tracking it in each driver's private state.

>

> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>

> Signed-off-by: Will Deacon <will.deacon@arm.com>

> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

> ---

>  arch/x86/events/intel/bts.c                      | 16 +++++------

>  arch/x86/events/intel/pt.c                       | 17 ++++++------

>  arch/x86/events/intel/pt.h                       |  1 -

>  drivers/hwtracing/coresight/coresight-etb10.c    |  7 +++--

>  drivers/hwtracing/coresight/coresight-etm-perf.c |  9 +++----

>  drivers/hwtracing/coresight/coresight-priv.h     |  2 --

>  drivers/hwtracing/coresight/coresight-tmc-etf.c  |  7 +++--


For the CS files:
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>


>  include/linux/coresight.h                        |  2 +-

>  include/linux/perf_event.h                       |  8 +++---

>  kernel/events/ring_buffer.c                      | 34 ++++++++++++++++--------

>  10 files changed, 55 insertions(+), 48 deletions(-)

>

> diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c

> index 982c9e31da..8ae8c5ce3a 100644

> --- a/arch/x86/events/intel/bts.c

> +++ b/arch/x86/events/intel/bts.c

> @@ -63,7 +63,6 @@ struct bts_buffer {

>         unsigned int    cur_buf;

>         bool            snapshot;

>         local_t         data_size;

> -       local_t         lost;

>         local_t         head;

>         unsigned long   end;

>         void            **data_pages;

> @@ -199,7 +198,8 @@ static void bts_update(struct bts_ctx *bts)

>                         return;

>

>                 if (ds->bts_index >= ds->bts_absolute_maximum)

> -                       local_inc(&buf->lost);

> +                       perf_aux_output_flag(&bts->handle,

> +                                            PERF_AUX_FLAG_TRUNCATED);

>

>                 /*

>                  * old and head are always in the same physical buffer, so we

> @@ -276,7 +276,7 @@ static void bts_event_start(struct perf_event *event, int flags)

>         return;

>

>  fail_end_stop:

> -       perf_aux_output_end(&bts->handle, 0, false);

> +       perf_aux_output_end(&bts->handle, 0);

>

>  fail_stop:

>         event->hw.state = PERF_HES_STOPPED;

> @@ -319,9 +319,8 @@ static void bts_event_stop(struct perf_event *event, int flags)

>                                 bts->handle.head =

>                                         local_xchg(&buf->data_size,

>                                                    buf->nr_pages << PAGE_SHIFT);

> -

> -                       perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),

> -                                           !!local_xchg(&buf->lost, 0));

> +                       perf_aux_output_end(&bts->handle,

> +                                           local_xchg(&buf->data_size, 0));

>                 }

>

>                 cpuc->ds->bts_index = bts->ds_back.bts_buffer_base;

> @@ -484,8 +483,7 @@ int intel_bts_interrupt(void)

>         if (old_head == local_read(&buf->head))

>                 return handled;

>

> -       perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),

> -                           !!local_xchg(&buf->lost, 0));

> +       perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0));

>

>         buf = perf_aux_output_begin(&bts->handle, event);

>         if (buf)

> @@ -500,7 +498,7 @@ int intel_bts_interrupt(void)

>                          * cleared handle::event

>                          */

>                         barrier();

> -                       perf_aux_output_end(&bts->handle, 0, false);

> +                       perf_aux_output_end(&bts->handle, 0);

>                 }

>         }

>

> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c

> index d92a60ef08..a2d0050fde 100644

> --- a/arch/x86/events/intel/pt.c

> +++ b/arch/x86/events/intel/pt.c

> @@ -823,7 +823,8 @@ static void pt_handle_status(struct pt *pt)

>                  */

>                 if (!pt_cap_get(PT_CAP_topa_multiple_entries) ||

>                     buf->output_off == sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size)) {

> -                       local_inc(&buf->lost);

> +                       perf_aux_output_flag(&pt->handle,

> +                                            PERF_AUX_FLAG_TRUNCATED);

>                         advance++;

>                 }

>         }

> @@ -916,8 +917,10 @@ static int pt_buffer_reset_markers(struct pt_buffer *buf,

>

>         /* can't stop in the middle of an output region */

>         if (buf->output_off + handle->size + 1 <

> -           sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size))

> +           sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size)) {

> +               perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

>                 return -EINVAL;

> +       }

>

>

>         /* single entry ToPA is handled by marking all regions STOP=1 INT=1 */

> @@ -1269,8 +1272,7 @@ void intel_pt_interrupt(void)

>

>         pt_update_head(pt);

>

> -       perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0),

> -                           local_xchg(&buf->lost, 0));

> +       perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0));

>

>         if (!event->hw.state) {

>                 int ret;

> @@ -1285,7 +1287,7 @@ void intel_pt_interrupt(void)

>                 /* snapshot counters don't use PMI, so it's safe */

>                 ret = pt_buffer_reset_markers(buf, &pt->handle);

>                 if (ret) {

> -                       perf_aux_output_end(&pt->handle, 0, true);

> +                       perf_aux_output_end(&pt->handle, 0);

>                         return;

>                 }

>

> @@ -1357,7 +1359,7 @@ static void pt_event_start(struct perf_event *event, int mode)

>         return;

>

>  fail_end_stop:

> -       perf_aux_output_end(&pt->handle, 0, true);

> +       perf_aux_output_end(&pt->handle, 0);

>  fail_stop:

>         hwc->state = PERF_HES_STOPPED;

>  }

> @@ -1398,8 +1400,7 @@ static void pt_event_stop(struct perf_event *event, int mode)

>                         pt->handle.head =

>                                 local_xchg(&buf->data_size,

>                                            buf->nr_pages << PAGE_SHIFT);

> -               perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0),

> -                                   local_xchg(&buf->lost, 0));

> +               perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0));

>         }

>  }

>

> diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h

> index fa81f8d8ed..25fa9710f4 100644

> --- a/arch/x86/events/intel/pt.h

> +++ b/arch/x86/events/intel/pt.h

> @@ -144,7 +144,6 @@ struct pt_buffer {

>         size_t                  output_off;

>         unsigned long           nr_pages;

>         local_t                 data_size;

> -       local_t                 lost;

>         local64_t               head;

>         bool                    snapshot;

>         unsigned long           stop_pos, intr_pos;

> diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c

> index d7325c6534..82c8ddcf09 100644

> --- a/drivers/hwtracing/coresight/coresight-etb10.c

> +++ b/drivers/hwtracing/coresight/coresight-etb10.c

> @@ -321,7 +321,7 @@ static int etb_set_buffer(struct coresight_device *csdev,

>

>  static unsigned long etb_reset_buffer(struct coresight_device *csdev,

>                                       struct perf_output_handle *handle,

> -                                     void *sink_config, bool *lost)

> +                                     void *sink_config)

>  {

>         unsigned long size = 0;

>         struct cs_buffers *buf = sink_config;

> @@ -343,7 +343,6 @@ static unsigned long etb_reset_buffer(struct coresight_device *csdev,

>                  * resetting parameters here and squaring off with the ring

>                  * buffer API in the tracer PMU is fine.

>                  */

> -               *lost = !!local_xchg(&buf->lost, 0);

>                 size = local_xchg(&buf->data_size, 0);

>         }

>

> @@ -385,7 +384,7 @@ static void etb_update_buffer(struct coresight_device *csdev,

>                         (unsigned long)write_ptr);

>

>                 write_ptr &= ~(ETB_FRAME_SIZE_WORDS - 1);

> -               local_inc(&buf->lost);

> +               perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

>         }

>

>         /*

> @@ -396,7 +395,7 @@ static void etb_update_buffer(struct coresight_device *csdev,

>          */

>         status = readl_relaxed(drvdata->base + ETB_STATUS_REG);

>         if (status & ETB_STATUS_RAM_FULL) {

> -               local_inc(&buf->lost);

> +               perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

>                 to_read = capacity;

>                 read_ptr = write_ptr;

>         } else {

> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c

> index 70eaa74dc2..47ea0eee67 100644

> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c

> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c

> @@ -301,7 +301,8 @@ static void etm_event_start(struct perf_event *event, int flags)

>         return;

>

>  fail_end_stop:

> -       perf_aux_output_end(handle, 0, true);

> +       perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

> +       perf_aux_output_end(handle, 0);

>  fail:

>         event->hw.state = PERF_HES_STOPPED;

>         goto out;

> @@ -309,7 +310,6 @@ static void etm_event_start(struct perf_event *event, int flags)

>

>  static void etm_event_stop(struct perf_event *event, int mode)

>  {

> -       bool lost;

>         int cpu = smp_processor_id();

>         unsigned long size;

>         struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);

> @@ -347,10 +347,9 @@ static void etm_event_stop(struct perf_event *event, int mode)

>                         return;

>

>                 size = sink_ops(sink)->reset_buffer(sink, handle,

> -                                                   event_data->snk_config,

> -                                                   &lost);

> +                                                   event_data->snk_config);

>

> -               perf_aux_output_end(handle, size, lost);

> +               perf_aux_output_end(handle, size);

>         }

>

>         /* Disabling the path make its elements available to other sessions */

> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h

> index ef9d8e93e3..5f662d8205 100644

> --- a/drivers/hwtracing/coresight/coresight-priv.h

> +++ b/drivers/hwtracing/coresight/coresight-priv.h

> @@ -76,7 +76,6 @@ enum cs_mode {

>   * @nr_pages:  max number of pages granted to us

>   * @offset:    offset within the current buffer

>   * @data_size: how much we collected in this run

> - * @lost:      other than zero if we had a HW buffer wrap around

>   * @snapshot:  is this run in snapshot mode

>   * @data_pages:        a handle the ring buffer

>   */

> @@ -85,7 +84,6 @@ struct cs_buffers {

>         unsigned int            nr_pages;

>         unsigned long           offset;

>         local_t                 data_size;

> -       local_t                 lost;

>         bool                    snapshot;

>         void                    **data_pages;

>  };

> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c

> index 1549436e24..aec61a6d5c 100644

> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c

> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c

> @@ -329,7 +329,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev,

>

>  static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,

>                                           struct perf_output_handle *handle,

> -                                         void *sink_config, bool *lost)

> +                                         void *sink_config)

>  {

>         long size = 0;

>         struct cs_buffers *buf = sink_config;

> @@ -350,7 +350,6 @@ static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,

>                  * resetting parameters here and squaring off with the ring

>                  * buffer API in the tracer PMU is fine.

>                  */

> -               *lost = !!local_xchg(&buf->lost, 0);

>                 size = local_xchg(&buf->data_size, 0);

>         }

>

> @@ -389,7 +388,7 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,

>          */

>         status = readl_relaxed(drvdata->base + TMC_STS);

>         if (status & TMC_STS_FULL) {

> -               local_inc(&buf->lost);

> +               perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

>                 to_read = drvdata->size;

>         } else {

>                 to_read = CIRC_CNT(write_ptr, read_ptr, drvdata->size);

> @@ -434,7 +433,7 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,

>                         read_ptr -= drvdata->size;

>                 /* Tell the HW */

>                 writel_relaxed(read_ptr, drvdata->base + TMC_RRP);

> -               local_inc(&buf->lost);

> +               perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

>         }

>

>         cur = buf->cur;

> diff --git a/include/linux/coresight.h b/include/linux/coresight.h

> index 2a5982c37d..035c16c9a5 100644

> --- a/include/linux/coresight.h

> +++ b/include/linux/coresight.h

> @@ -201,7 +201,7 @@ struct coresight_ops_sink {

>                           void *sink_config);

>         unsigned long (*reset_buffer)(struct coresight_device *csdev,

>                                       struct perf_output_handle *handle,

> -                                     void *sink_config, bool *lost);

> +                                     void *sink_config);

>         void (*update_buffer)(struct coresight_device *csdev,

>                               struct perf_output_handle *handle,

>                               void *sink_config);

> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h

> index a39564314e..cdbaa88dc8 100644

> --- a/include/linux/perf_event.h

> +++ b/include/linux/perf_event.h

> @@ -812,6 +812,7 @@ struct perf_output_handle {

>         struct ring_buffer              *rb;

>         unsigned long                   wakeup;

>         unsigned long                   size;

> +       u64                             aux_flags;

>         union {

>                 void                    *addr;

>                 unsigned long           head;

> @@ -860,10 +861,11 @@ perf_cgroup_from_task(struct task_struct *task, struct perf_event_context *ctx)

>  extern void *perf_aux_output_begin(struct perf_output_handle *handle,

>                                    struct perf_event *event);

>  extern void perf_aux_output_end(struct perf_output_handle *handle,

> -                               unsigned long size, bool truncated);

> +                               unsigned long size);

>  extern int perf_aux_output_skip(struct perf_output_handle *handle,

>                                 unsigned long size);

>  extern void *perf_get_aux(struct perf_output_handle *handle);

> +extern void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags);

>

>  extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);

>  extern void perf_pmu_unregister(struct pmu *pmu);

> @@ -1278,8 +1280,8 @@ static inline void *

>  perf_aux_output_begin(struct perf_output_handle *handle,

>                       struct perf_event *event)                         { return NULL; }

>  static inline void

> -perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,

> -                   bool truncated)                                     { }

> +perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)

> +                                                                       { }

>  static inline int

>  perf_aux_output_skip(struct perf_output_handle *handle,

>                      unsigned long size)                                { return -EINVAL; }

> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c

> index 6415807169..f3ebafe060 100644

> --- a/kernel/events/ring_buffer.c

> +++ b/kernel/events/ring_buffer.c

> @@ -297,6 +297,19 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)

>                 rb->paused = 1;

>  }

>

> +void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags)

> +{

> +       /*

> +        * OVERWRITE is determined by perf_aux_output_end() and can't

> +        * be passed in directly.

> +        */

> +       if (WARN_ON_ONCE(flags & PERF_AUX_FLAG_OVERWRITE))

> +               return;

> +

> +       handle->aux_flags |= flags;

> +}

> +EXPORT_SYMBOL_GPL(perf_aux_output_flag);

> +

>  /*

>   * This is called before hardware starts writing to the AUX area to

>   * obtain an output handle and make sure there's room in the buffer.

> @@ -360,6 +373,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,

>         handle->event = event;

>         handle->head = aux_head;

>         handle->size = 0;

> +       handle->aux_flags = 0;

>

>         /*

>          * In overwrite mode, AUX data stores do not depend on aux_tail,

> @@ -409,34 +423,32 @@ EXPORT_SYMBOL_GPL(perf_aux_output_begin);

>   * of the AUX buffer management code is that after pmu::stop(), the AUX

>   * transaction must be stopped and therefore drop the AUX reference count.

>   */

> -void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,

> -                        bool truncated)

> +void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)

>  {

>         struct ring_buffer *rb = handle->rb;

> -       bool wakeup = truncated;

> +       bool wakeup = !!handle->aux_flags;

>         unsigned long aux_head;

> -       u64 flags = 0;

> -

> -       if (truncated)

> -               flags |= PERF_AUX_FLAG_TRUNCATED;

>

>         /* in overwrite mode, driver provides aux_head via handle */

>         if (rb->aux_overwrite) {

> -               flags |= PERF_AUX_FLAG_OVERWRITE;

> +               handle->aux_flags |= PERF_AUX_FLAG_OVERWRITE;

>

>                 aux_head = handle->head;

>                 local_set(&rb->aux_head, aux_head);

>         } else {

> +               handle->aux_flags &= ~PERF_AUX_FLAG_OVERWRITE;

> +

>                 aux_head = local_read(&rb->aux_head);

>                 local_add(size, &rb->aux_head);

>         }

>

> -       if (size || flags) {

> +       if (size || handle->aux_flags) {

>                 /*

>                  * Only send RECORD_AUX if we have something useful to communicate

>                  */

>

> -               perf_event_aux_event(handle->event, aux_head, size, flags);

> +               perf_event_aux_event(handle->event, aux_head, size,

> +                                    handle->aux_flags);

>         }

>

>         aux_head = rb->user_page->aux_head = local_read(&rb->aux_head);

> @@ -447,7 +459,7 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,

>         }

>

>         if (wakeup) {

> -               if (truncated)

> +               if (handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)

>                         handle->event->pending_disable = 1;

>                 perf_output_wakeup(handle);

>         }

> --

> 2.11.0

>
Will Deacon Feb. 21, 2017, 10:42 a.m. | #4
Hi Alexander,

Thanks for picking this up/adapting it. Just one comment below

On Mon, Feb 20, 2017 at 03:33:50PM +0200, Alexander Shishkin wrote:
> From: Will Deacon <will.deacon@arm.com>

> 

> In preparation for adding more flags to perf AUX records, introduce a

> separate API for setting the flags for a session, rather than appending

> more bool arguments to perf_aux_output_end. This allows to set each

> flag at the time a corresponding condition is detected, instead of

> tracking it in each driver's private state.

> 

> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>

> Signed-off-by: Will Deacon <will.deacon@arm.com>

> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

> ---


--->8

> +void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags)

> +{

> +	/*

> +	 * OVERWRITE is determined by perf_aux_output_end() and can't

> +	 * be passed in directly.

> +	 */

> +	if (WARN_ON_ONCE(flags & PERF_AUX_FLAG_OVERWRITE))

> +		return;


Now that you've added this check...

> +	handle->aux_flags |= flags;

> +}

> +EXPORT_SYMBOL_GPL(perf_aux_output_flag);

> +

>  /*

>   * This is called before hardware starts writing to the AUX area to

>   * obtain an output handle and make sure there's room in the buffer.

> @@ -360,6 +373,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,

>  	handle->event = event;

>  	handle->head = aux_head;

>  	handle->size = 0;

> +	handle->aux_flags = 0;

>  

>  	/*

>  	 * In overwrite mode, AUX data stores do not depend on aux_tail,

> @@ -409,34 +423,32 @@ EXPORT_SYMBOL_GPL(perf_aux_output_begin);

>   * of the AUX buffer management code is that after pmu::stop(), the AUX

>   * transaction must be stopped and therefore drop the AUX reference count.

>   */

> -void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,

> -			 bool truncated)

> +void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)

>  {

>  	struct ring_buffer *rb = handle->rb;

> -	bool wakeup = truncated;

> +	bool wakeup = !!handle->aux_flags;

>  	unsigned long aux_head;

> -	u64 flags = 0;

> -

> -	if (truncated)

> -		flags |= PERF_AUX_FLAG_TRUNCATED;

>  

>  	/* in overwrite mode, driver provides aux_head via handle */

>  	if (rb->aux_overwrite) {

> -		flags |= PERF_AUX_FLAG_OVERWRITE;

> +		handle->aux_flags |= PERF_AUX_FLAG_OVERWRITE;

>  

>  		aux_head = handle->head;

>  		local_set(&rb->aux_head, aux_head);

>  	} else {

> +		handle->aux_flags &= ~PERF_AUX_FLAG_OVERWRITE;

> +


... I don't think you need this addition anymore. It's harmless, though.

Will
Alexander Shishkin Feb. 21, 2017, 10:54 a.m. | #5
Will Deacon <will.deacon@arm.com> writes:

> Hi Alexander,


Hi Will,

>> +		handle->aux_flags &= ~PERF_AUX_FLAG_OVERWRITE;

>> +

>

> ... I don't think you need this addition anymore. It's harmless, though.


Right, assuming the PMU driver isn't doing anything fishy to
handle->aux_flags directly.

Regards,
--
Alex

Patch

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 982c9e31da..8ae8c5ce3a 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -63,7 +63,6 @@  struct bts_buffer {
 	unsigned int	cur_buf;
 	bool		snapshot;
 	local_t		data_size;
-	local_t		lost;
 	local_t		head;
 	unsigned long	end;
 	void		**data_pages;
@@ -199,7 +198,8 @@  static void bts_update(struct bts_ctx *bts)
 			return;
 
 		if (ds->bts_index >= ds->bts_absolute_maximum)
-			local_inc(&buf->lost);
+			perf_aux_output_flag(&bts->handle,
+			                     PERF_AUX_FLAG_TRUNCATED);
 
 		/*
 		 * old and head are always in the same physical buffer, so we
@@ -276,7 +276,7 @@  static void bts_event_start(struct perf_event *event, int flags)
 	return;
 
 fail_end_stop:
-	perf_aux_output_end(&bts->handle, 0, false);
+	perf_aux_output_end(&bts->handle, 0);
 
 fail_stop:
 	event->hw.state = PERF_HES_STOPPED;
@@ -319,9 +319,8 @@  static void bts_event_stop(struct perf_event *event, int flags)
 				bts->handle.head =
 					local_xchg(&buf->data_size,
 						   buf->nr_pages << PAGE_SHIFT);
-
-			perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
-					    !!local_xchg(&buf->lost, 0));
+			perf_aux_output_end(&bts->handle,
+			                    local_xchg(&buf->data_size, 0));
 		}
 
 		cpuc->ds->bts_index = bts->ds_back.bts_buffer_base;
@@ -484,8 +483,7 @@  int intel_bts_interrupt(void)
 	if (old_head == local_read(&buf->head))
 		return handled;
 
-	perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
-			    !!local_xchg(&buf->lost, 0));
+	perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0));
 
 	buf = perf_aux_output_begin(&bts->handle, event);
 	if (buf)
@@ -500,7 +498,7 @@  int intel_bts_interrupt(void)
 			 * cleared handle::event
 			 */
 			barrier();
-			perf_aux_output_end(&bts->handle, 0, false);
+			perf_aux_output_end(&bts->handle, 0);
 		}
 	}
 
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index d92a60ef08..a2d0050fde 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -823,7 +823,8 @@  static void pt_handle_status(struct pt *pt)
 		 */
 		if (!pt_cap_get(PT_CAP_topa_multiple_entries) ||
 		    buf->output_off == sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size)) {
-			local_inc(&buf->lost);
+			perf_aux_output_flag(&pt->handle,
+			                     PERF_AUX_FLAG_TRUNCATED);
 			advance++;
 		}
 	}
@@ -916,8 +917,10 @@  static int pt_buffer_reset_markers(struct pt_buffer *buf,
 
 	/* can't stop in the middle of an output region */
 	if (buf->output_off + handle->size + 1 <
-	    sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size))
+	    sizes(TOPA_ENTRY(buf->cur, buf->cur_idx)->size)) {
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 		return -EINVAL;
+	}
 
 
 	/* single entry ToPA is handled by marking all regions STOP=1 INT=1 */
@@ -1269,8 +1272,7 @@  void intel_pt_interrupt(void)
 
 	pt_update_head(pt);
 
-	perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0),
-			    local_xchg(&buf->lost, 0));
+	perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0));
 
 	if (!event->hw.state) {
 		int ret;
@@ -1285,7 +1287,7 @@  void intel_pt_interrupt(void)
 		/* snapshot counters don't use PMI, so it's safe */
 		ret = pt_buffer_reset_markers(buf, &pt->handle);
 		if (ret) {
-			perf_aux_output_end(&pt->handle, 0, true);
+			perf_aux_output_end(&pt->handle, 0);
 			return;
 		}
 
@@ -1357,7 +1359,7 @@  static void pt_event_start(struct perf_event *event, int mode)
 	return;
 
 fail_end_stop:
-	perf_aux_output_end(&pt->handle, 0, true);
+	perf_aux_output_end(&pt->handle, 0);
 fail_stop:
 	hwc->state = PERF_HES_STOPPED;
 }
@@ -1398,8 +1400,7 @@  static void pt_event_stop(struct perf_event *event, int mode)
 			pt->handle.head =
 				local_xchg(&buf->data_size,
 					   buf->nr_pages << PAGE_SHIFT);
-		perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0),
-				    local_xchg(&buf->lost, 0));
+		perf_aux_output_end(&pt->handle, local_xchg(&buf->data_size, 0));
 	}
 }
 
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index fa81f8d8ed..25fa9710f4 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -144,7 +144,6 @@  struct pt_buffer {
 	size_t			output_off;
 	unsigned long		nr_pages;
 	local_t			data_size;
-	local_t			lost;
 	local64_t		head;
 	bool			snapshot;
 	unsigned long		stop_pos, intr_pos;
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index d7325c6534..82c8ddcf09 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -321,7 +321,7 @@  static int etb_set_buffer(struct coresight_device *csdev,
 
 static unsigned long etb_reset_buffer(struct coresight_device *csdev,
 				      struct perf_output_handle *handle,
-				      void *sink_config, bool *lost)
+				      void *sink_config)
 {
 	unsigned long size = 0;
 	struct cs_buffers *buf = sink_config;
@@ -343,7 +343,6 @@  static unsigned long etb_reset_buffer(struct coresight_device *csdev,
 		 * resetting parameters here and squaring off with the ring
 		 * buffer API in the tracer PMU is fine.
 		 */
-		*lost = !!local_xchg(&buf->lost, 0);
 		size = local_xchg(&buf->data_size, 0);
 	}
 
@@ -385,7 +384,7 @@  static void etb_update_buffer(struct coresight_device *csdev,
 			(unsigned long)write_ptr);
 
 		write_ptr &= ~(ETB_FRAME_SIZE_WORDS - 1);
-		local_inc(&buf->lost);
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 	}
 
 	/*
@@ -396,7 +395,7 @@  static void etb_update_buffer(struct coresight_device *csdev,
 	 */
 	status = readl_relaxed(drvdata->base + ETB_STATUS_REG);
 	if (status & ETB_STATUS_RAM_FULL) {
-		local_inc(&buf->lost);
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 		to_read = capacity;
 		read_ptr = write_ptr;
 	} else {
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 70eaa74dc2..47ea0eee67 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -301,7 +301,8 @@  static void etm_event_start(struct perf_event *event, int flags)
 	return;
 
 fail_end_stop:
-	perf_aux_output_end(handle, 0, true);
+	perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+	perf_aux_output_end(handle, 0);
 fail:
 	event->hw.state = PERF_HES_STOPPED;
 	goto out;
@@ -309,7 +310,6 @@  static void etm_event_start(struct perf_event *event, int flags)
 
 static void etm_event_stop(struct perf_event *event, int mode)
 {
-	bool lost;
 	int cpu = smp_processor_id();
 	unsigned long size;
 	struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);
@@ -347,10 +347,9 @@  static void etm_event_stop(struct perf_event *event, int mode)
 			return;
 
 		size = sink_ops(sink)->reset_buffer(sink, handle,
-						    event_data->snk_config,
-						    &lost);
+						    event_data->snk_config);
 
-		perf_aux_output_end(handle, size, lost);
+		perf_aux_output_end(handle, size);
 	}
 
 	/* Disabling the path make its elements available to other sessions */
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index ef9d8e93e3..5f662d8205 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -76,7 +76,6 @@  enum cs_mode {
  * @nr_pages:	max number of pages granted to us
  * @offset:	offset within the current buffer
  * @data_size:	how much we collected in this run
- * @lost:	other than zero if we had a HW buffer wrap around
  * @snapshot:	is this run in snapshot mode
  * @data_pages:	a handle the ring buffer
  */
@@ -85,7 +84,6 @@  struct cs_buffers {
 	unsigned int		nr_pages;
 	unsigned long		offset;
 	local_t			data_size;
-	local_t			lost;
 	bool			snapshot;
 	void			**data_pages;
 };
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 1549436e24..aec61a6d5c 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -329,7 +329,7 @@  static int tmc_set_etf_buffer(struct coresight_device *csdev,
 
 static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,
 					  struct perf_output_handle *handle,
-					  void *sink_config, bool *lost)
+					  void *sink_config)
 {
 	long size = 0;
 	struct cs_buffers *buf = sink_config;
@@ -350,7 +350,6 @@  static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,
 		 * resetting parameters here and squaring off with the ring
 		 * buffer API in the tracer PMU is fine.
 		 */
-		*lost = !!local_xchg(&buf->lost, 0);
 		size = local_xchg(&buf->data_size, 0);
 	}
 
@@ -389,7 +388,7 @@  static void tmc_update_etf_buffer(struct coresight_device *csdev,
 	 */
 	status = readl_relaxed(drvdata->base + TMC_STS);
 	if (status & TMC_STS_FULL) {
-		local_inc(&buf->lost);
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 		to_read = drvdata->size;
 	} else {
 		to_read = CIRC_CNT(write_ptr, read_ptr, drvdata->size);
@@ -434,7 +433,7 @@  static void tmc_update_etf_buffer(struct coresight_device *csdev,
 			read_ptr -= drvdata->size;
 		/* Tell the HW */
 		writel_relaxed(read_ptr, drvdata->base + TMC_RRP);
-		local_inc(&buf->lost);
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
 	}
 
 	cur = buf->cur;
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 2a5982c37d..035c16c9a5 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -201,7 +201,7 @@  struct coresight_ops_sink {
 			  void *sink_config);
 	unsigned long (*reset_buffer)(struct coresight_device *csdev,
 				      struct perf_output_handle *handle,
-				      void *sink_config, bool *lost);
+				      void *sink_config);
 	void (*update_buffer)(struct coresight_device *csdev,
 			      struct perf_output_handle *handle,
 			      void *sink_config);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a39564314e..cdbaa88dc8 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -812,6 +812,7 @@  struct perf_output_handle {
 	struct ring_buffer		*rb;
 	unsigned long			wakeup;
 	unsigned long			size;
+	u64				aux_flags;
 	union {
 		void			*addr;
 		unsigned long		head;
@@ -860,10 +861,11 @@  perf_cgroup_from_task(struct task_struct *task, struct perf_event_context *ctx)
 extern void *perf_aux_output_begin(struct perf_output_handle *handle,
 				   struct perf_event *event);
 extern void perf_aux_output_end(struct perf_output_handle *handle,
-				unsigned long size, bool truncated);
+				unsigned long size);
 extern int perf_aux_output_skip(struct perf_output_handle *handle,
 				unsigned long size);
 extern void *perf_get_aux(struct perf_output_handle *handle);
+extern void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags);
 
 extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);
 extern void perf_pmu_unregister(struct pmu *pmu);
@@ -1278,8 +1280,8 @@  static inline void *
 perf_aux_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event)				{ return NULL; }
 static inline void
-perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
-		    bool truncated)					{ }
+perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
+									{ }
 static inline int
 perf_aux_output_skip(struct perf_output_handle *handle,
 		     unsigned long size)				{ return -EINVAL; }
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 6415807169..f3ebafe060 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -297,6 +297,19 @@  ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 		rb->paused = 1;
 }
 
+void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags)
+{
+	/*
+	 * OVERWRITE is determined by perf_aux_output_end() and can't
+	 * be passed in directly.
+	 */
+	if (WARN_ON_ONCE(flags & PERF_AUX_FLAG_OVERWRITE))
+		return;
+
+	handle->aux_flags |= flags;
+}
+EXPORT_SYMBOL_GPL(perf_aux_output_flag);
+
 /*
  * This is called before hardware starts writing to the AUX area to
  * obtain an output handle and make sure there's room in the buffer.
@@ -360,6 +373,7 @@  void *perf_aux_output_begin(struct perf_output_handle *handle,
 	handle->event = event;
 	handle->head = aux_head;
 	handle->size = 0;
+	handle->aux_flags = 0;
 
 	/*
 	 * In overwrite mode, AUX data stores do not depend on aux_tail,
@@ -409,34 +423,32 @@  EXPORT_SYMBOL_GPL(perf_aux_output_begin);
  * of the AUX buffer management code is that after pmu::stop(), the AUX
  * transaction must be stopped and therefore drop the AUX reference count.
  */
-void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
-			 bool truncated)
+void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
 {
 	struct ring_buffer *rb = handle->rb;
-	bool wakeup = truncated;
+	bool wakeup = !!handle->aux_flags;
 	unsigned long aux_head;
-	u64 flags = 0;
-
-	if (truncated)
-		flags |= PERF_AUX_FLAG_TRUNCATED;
 
 	/* in overwrite mode, driver provides aux_head via handle */
 	if (rb->aux_overwrite) {
-		flags |= PERF_AUX_FLAG_OVERWRITE;
+		handle->aux_flags |= PERF_AUX_FLAG_OVERWRITE;
 
 		aux_head = handle->head;
 		local_set(&rb->aux_head, aux_head);
 	} else {
+		handle->aux_flags &= ~PERF_AUX_FLAG_OVERWRITE;
+
 		aux_head = local_read(&rb->aux_head);
 		local_add(size, &rb->aux_head);
 	}
 
-	if (size || flags) {
+	if (size || handle->aux_flags) {
 		/*
 		 * Only send RECORD_AUX if we have something useful to communicate
 		 */
 
-		perf_event_aux_event(handle->event, aux_head, size, flags);
+		perf_event_aux_event(handle->event, aux_head, size,
+		                     handle->aux_flags);
 	}
 
 	aux_head = rb->user_page->aux_head = local_read(&rb->aux_head);
@@ -447,7 +459,7 @@  void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
 	}
 
 	if (wakeup) {
-		if (truncated)
+		if (handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)
 			handle->event->pending_disable = 1;
 		perf_output_wakeup(handle);
 	}