[2/2] perf_event_open.2: Document write_backward

Message ID 1477049893-143199-2-git-send-email-wangnan0@huawei.com
State New

Commit Message

Wang Nan Oct. 21, 2016, 11:38 a.m.
Linux 4.7 (commit 9ecda41acb971ebd07c8fb35faf24005c0baea12) introduced the
write_backward attribute in perf_event_attr. Document this feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
 man2/perf_event_open.2 | 56 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 53 insertions(+), 3 deletions(-)

-- 
2.10.1
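
For orientation, requesting a backward ring-buffer amounts to setting one bit in perf_event_attr before calling perf_event_open(2). The following is a minimal sketch, assuming kernel headers from Linux 4.7 or later. The helper name init_backward_attr and the choice of the CPU-clock software event are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/* Hypothetical helper: prepare an attr that asks for a backward ring-buffer.
 * The event type and sample fields are arbitrary choices for illustration. */
void init_backward_attr(struct perf_event_attr *attr)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_CPU_CLOCK;
    attr->sample_period = 1;
    attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME;

    /* The bit documented by this patch: write samples from the end
     * of the ring-buffer toward the beginning. */
    attr->write_backward = 1;
}
```

The filled-in structure would then be passed to perf_event_open(2), and the resulting file descriptor mapped with PROT_READ only to obtain an overwritable buffer.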

Comments

Vince Weaver Oct. 21, 2016, 9:25 p.m. | #1
On Fri, 21 Oct 2016, Wang Nan wrote:

>            context_switch :  1,  /* context switch data */
> -
> -          __reserved_1   : 37;
> +          write_backward :  1,  /* Write ring buffer from end to beginning */
> +          __reserved_1   : 36;

This removes a blank line, not sure if intentional or not.

> +.IR "write_backward" " (since Linux 4.6)"

It wasn't committed until Linux 4.7, from what I can tell?

> +This makes the resuling event use a backward ring-buffer, which

resulting

> +writes samples from the end of the ring-buffer.
> +
> +It is not allowed to connect events with backward and forward
> +ring-buffer settings together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +
> +Backward ring-buffer is useful when the ring-buffer is overwritable
> +(created by readonly
> +.BR mmap (2)
> +).


A ring buffer is over-writable when it is mmapped readonly?
Is this a hard requirement?
Can you set the read-backwards bit if not mapped readonly?

Otherwise the documentation seems reasonable.

Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
Michael Kerrisk (man-pages) Oct. 22, 2016, 10:05 a.m. | #2
On 10/21/2016 11:25 PM, Vince Weaver wrote:
> On Fri, 21 Oct 2016, Wang Nan wrote:
>
>>            context_switch :  1,  /* context switch data */
>> -
>> -          __reserved_1   : 37;
>> +          write_backward :  1,  /* Write ring buffer from end to beginning */
>> +          __reserved_1   : 36;
>
> This removes a blank line, not sure if intentional or not.

Maybe it would be better to keep it. I don't feel too strongly about
this though.

>> +.IR "write_backward" " (since Linux 4.6)"
>
> It wasn't committed until Linux 4.7, from what I can tell?

Yes, that's my recollection too.

>
>> +This makes the resuling event use a backward ring-buffer, which
> resulting
>
>> +writes samples from the end of the ring-buffer.
>> +
>> +It is not allowed to connect events with backward and forward
>> +ring-buffer settings together using
>> +.B PERF_EVENT_IOC_SET_OUTPUT.
>> +
>> +Backward ring-buffer is useful when the ring-buffer is overwritable
>> +(created by readonly
>> +.BR mmap (2)
>> +).
>
> A ring buffer is over-writable when it is mmapped readonly?
> Is this a hard requirement?
> Can you set the read-backwards bit if not mapped readonly?

Wang Nan, could you perhaps clarify this in the next version of the patch?

>
> Otherwise the documentation seems reasonable.
>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>

Thanks for reviewing both patches, Vince. Wang Nan, please include the
Reviewed-by: in the next patch iteration.


Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Patch

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 2d3acad..e5fdfec 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -244,8 +244,8 @@  struct perf_event_attr {
                                    due to exec */
           use_clockid    :  1,  /* use clockid for time fields */
           context_switch :  1,  /* context switch data */
-
-          __reserved_1   : 37;
+          write_backward :  1,  /* Write ring buffer from end to beginning */
+          __reserved_1   : 36;
 
     union {
         __u32 wakeup_events;    /* wakeup every n events */
@@ -1127,6 +1127,29 @@  The advantage of this method is that it will give full
 information even with strict
 .I perf_event_paranoid
 settings.
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the resuling event use a backward ring-buffer, which
+writes samples from the end of the ring-buffer.
+
+It is not allowed to connect events with backward and forward
+ring-buffer settings together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+
+Backward ring-buffer is useful when the ring-buffer is overwritable
+(created by readonly
+.BR mmap (2)
+). In this case,
+.IR data_tail
+is not used, and
+.IR data_head
+points to the head of the most recent sample in a backward
+ring-buffer. It is easy to iterate over the whole ring-buffer by reading
+samples one by one because the size of a sample can be found by decoding
+its header. In contrast, in a forward overwritable ring-buffer, the only
+information is the end of the most recent sample, which is pointed to by
+.IR data_head ,
+but the size of a sample can't be determined from its end.
 .TP
 .IR "wakeup_events" ", " "wakeup_watermark"
 This union sets how many samples
@@ -1671,7 +1694,9 @@  And vice versa:
 .TP
 .I data_head
 This points to the head of the data section.
-The value continuously increases, it does not wrap.
+The value continuously increases (or decreases if
+.IR write_backward
+is set); it does not wrap.
 The value needs to be manually wrapped by the size of the mmap buffer
 before accessing the samples.
 
@@ -2727,6 +2752,24 @@  Starting with Linux 3.18,
 .B POLL_HUP
 is indicated if the event being monitored is attached to a different
 process and that process exits.
+.SS Reading from overwritable ring-buffer
+A reader is unable to update
+.IR data_tail
+if the mapping is not
+.BR PROT_WRITE .
+In this case, the kernel overwrites data without considering whether
+it has been read, so the ring-buffer is overwritable and
+behaves like a flight recorder. To read from an overwritable
+ring-buffer, setting
+.IR write_backward
+is suggested; otherwise it is hard to find a proper position to start
+decoding. In addition, the ring-buffer should be paused before reading,
+using
+.BR ioctl (2)
+with
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT ,
+to avoid races between the kernel and the reader. The ring-buffer should
+be resumed after reading finishes.
 .SS rdpmc instruction
 Starting with Linux 3.4 on x86, you can use the
 .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
@@ -2839,6 +2882,13 @@  The file descriptors must all be on the same CPU.
 
 The argument specifies the desired file descriptor, or \-1 if
 output should be ignored.
+
+Two events with different
+.IR write_backward
+settings cannot be connected together using
+.BR PERF_EVENT_IOC_SET_OUTPUT ;
+.B EINVAL
+is returned in this case.
 .TP
 .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
 .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
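
The traversal that the patch text describes for a backward ring-buffer (start at data_head, decode each header, step forward by its size) can be sketched as below. This is a minimal illustration on a plain in-memory buffer, not code from the patch or from perf. The struct sample_header type is a local stand-in that mirrors the layout of struct perf_event_header, and walk_backward and its parameters are hypothetical names. The sketch assumes that the data area size is a power of two, that records are 8-byte aligned, and that a zeroed header marks space the kernel has not written yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct perf_event_header, redefined here so the
 * sketch compiles without kernel headers. */
struct sample_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;   /* total record size in bytes, header included */
};

/*
 * Walk a backward ring-buffer from the newest record to the oldest.
 *   data       start of the data area
 *   data_size  size of the data area (assumed to be a power of two)
 *   data_head  head value as read from the control page; it only
 *              decreases, so it must be masked before use
 *   offsets    out: wrapped offset of each record found, newest first
 *   max        capacity of offsets
 * Returns the number of records found.
 */
size_t walk_backward(const unsigned char *data, uint64_t data_size,
                     uint64_t data_head, uint64_t *offsets, size_t max)
{
    assert(data_size != 0 && (data_size & (data_size - 1)) == 0);

    uint64_t mask = data_size - 1;
    uint64_t pos = data_head;
    size_t n = 0;

    while (n < max) {
        struct sample_header hdr;
        uint64_t off = pos & mask;

        /* Stop if the header itself would run past the buffer end. */
        if (off + sizeof(hdr) > data_size)
            break;
        memcpy(&hdr, data + off, sizeof(hdr));

        /* A zero (or otherwise too small) size means unwritten space. */
        if (hdr.size < sizeof(hdr))
            break;

        /* Stop once the walk would lap the buffer: the remaining
         * record has been partially overwritten by newer data. */
        if (pos - data_head + hdr.size > data_size)
            break;

        offsets[n++] = off;
        pos += hdr.size;   /* the next (older) record follows this one */
    }
    return n;
}
```

In real use, data and data_size would describe the data area of the read-only mapping and data_head would be read from the control page, ideally while the ring-buffer is paused with PERF_EVENT_IOC_PAUSE_OUTPUT. A record body may wrap around the end of the data area, so a caller that copies records out would need to handle that case separately.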