From patchwork Tue Nov 4 00:28:37 2014
X-Patchwork-Submitter: Pawel Moll
X-Patchwork-Id: 40085
From: Pawel Moll <pawel.moll@arm.com>
To: Richard Cochran, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
	Paul Mackerras, Arnaldo Carvalho de Melo, John Stultz,
	Masami Hiramatsu, Christopher Covington, Namhyung Kim,
	David Ahern, Thomas Gleixner, Tomeu Vizoso
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Pawel Moll
Subject: [PATCH v3 2/3] perf: Userspace event
Date: Tue, 4 Nov 2014 00:28:37 +0000
Message-Id: <1415060918-19954-3-git-send-email-pawel.moll@arm.com>
In-Reply-To: <1415060918-19954-1-git-send-email-pawel.moll@arm.com>
References: <1415060918-19954-1-git-send-email-pawel.moll@arm.com>

This patch adds a PR_TASK_PERF_UEVENT prctl call which any process can
use to inject custom data into the perf data stream as a new
PERF_RECORD_UEVENT record, provided the process is being observed or is
running on a CPU being observed by the perf framework.

The prctl call takes the following arguments:

	prctl(PR_TASK_PERF_UEVENT, type, size, data, flags);

- type: a number describing the content of the data that follows.
  The kernel does not interpret it and merely passes it on in the perf
  data, so its meaning must be agreed between the event producer (the
  process being observed) and the consumer (the performance analysis
  tool). The perf userspace tool will contain a repository of "well
  known" types and reference implementations of their decoders.
- size: Length of the data in bytes.
- data: Pointer to the data.
- flags: Reserved for future use. Always pass zero.

Perf contexts that are supposed to receive events generated with the
prctl above must be opened with perf_event_attr.uevents set to 1.

The PERF_RECORD_UEVENT records consist of a standard perf event header,
a 32-bit type value, a 32-bit data size and the data itself, followed by
padding to align the overall record size to 8 bytes and an optional,
standard sample_id field.

Example use cases:

- a "perf_printf"-like mechanism to add logging messages to perf data;
  in the simplest case it can be just

	prctl(PR_TASK_PERF_UEVENT, 0, 8, "Message", 0);

- synchronisation of performance data generated in user space with the
  perf stream coming from the kernel. For example, a marker can be
  inserted by a JIT engine after it has generated a portion of code, but
  before that code is executed for the first time, allowing the
  post-processor to pick the correct debugging information.
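As a hedged illustration of the "perf_printf" use case, the sketch below
wraps the proposed prctl call. It is not part of the patch and only does
anything useful on a kernel with this series applied. The helper names
perf_uevent_payload_len() and perf_puts() and the convention that type 0
means "NUL-terminated text" are assumptions made for the example;
PR_TASK_PERF_UEVENT is defined locally with the value proposed here
because released headers do not carry it.

```c
#include <string.h>
#include <sys/prctl.h>

/*
 * Value proposed by this patch. Defined here because released uapi
 * headers do not provide it; only meaningful on a patched kernel.
 */
#ifndef PR_TASK_PERF_UEVENT
#define PR_TASK_PERF_UEVENT 43
#endif

/* Assumed convention for this example: type 0 is NUL-terminated text. */
#define UEVENT_TYPE_TEXT 0UL

/* Payload length of a text message, including the terminating NUL. */
static inline unsigned long perf_uevent_payload_len(const char *msg)
{
	return strlen(msg) + 1;
}

/*
 * Hypothetical perf_printf-style helper: emits msg as a userspace event.
 * Returns 0 on success and -1 with errno set otherwise, for instance
 * E2BIG when the payload is larger than PAGE_SIZE.
 */
static inline int perf_puts(const char *msg)
{
	return prctl(PR_TASK_PERF_UEVENT, UEVENT_TYPE_TEXT,
		     perf_uevent_payload_len(msg), (unsigned long)msg, 0UL);
}
```

With this helper, perf_puts("Message") passes a size of 8, matching the
literal call shown in the first use case: seven characters plus the NUL.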
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
---
 include/linux/perf_event.h      |  4 +++
 include/uapi/linux/perf_event.h | 23 ++++++++++++-
 include/uapi/linux/prctl.h      | 10 ++++++
 kernel/events/core.c            | 71 +++++++++++++++++++++++++++++++++++++++++
 kernel/sys.c                    |  5 +++
 5 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ba490d5..867415d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -721,6 +721,8 @@ extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks
 extern void perf_event_exec(void);
 extern void perf_event_comm(struct task_struct *tsk, bool exec);
 extern void perf_event_fork(struct task_struct *tsk);
+extern int perf_uevent(struct task_struct *tsk, u32 type, u32 size,
+		const char __user *data);
 
 /* Callchains */
 DECLARE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
@@ -830,6 +832,8 @@ static inline void perf_event_mmap(struct vm_area_struct *vma) { }
 static inline void perf_event_exec(void) { }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
 static inline void perf_event_fork(struct task_struct *tsk) { }
+static inline int perf_uevent(struct task_struct *tsk, u32 type, u32 size,
+		const char __user *data) { return -1; };
 static inline void perf_event_init(void) { }
 static inline int perf_swevent_get_recursion_context(void) { return -1; }
 static inline void perf_swevent_put_recursion_context(int rctx) { }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9d84540..9a64eb1 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -303,7 +303,8 @@ struct perf_event_attr {
 				exclude_callchain_user : 1, /* exclude user callchains */
 				mmap2 : 1, /* include mmap with inode data */
 				comm_exec : 1, /* flag comm events that are due to an exec */
-				__reserved_1 : 39;
+				uevents : 1, /* allow uevents into the buffer */
+				__reserved_1 : 38;
 
 	union {
 		__u32 wakeup_events; /* wakeup every n events */
@@ -712,6 +713,26 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_MMAP2 = 10,
 
+	/*
+	 * Data in userspace event record is transparent for the kernel
+	 *
+	 * Userspace perf tool code maintains a list of known types with
+	 * reference implementations of parsers for the data field.
+	 *
+	 * Overall size of the record (including type and size fields)
+	 * is always aligned to 8 bytes by adding padding after the data.
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u32 type;
+	 *	u32 size;
+	 *	char data[size];
+	 *	char __padding[-size & 7];
+	 *	struct sample_id sample_id;
+	 * };
+	 */
+	PERF_RECORD_UEVENT = 11,
+
 	PERF_RECORD_MAX, /* non-ABI */
 };
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 513df75..2a6852f 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -179,4 +179,14 @@ struct prctl_mm_map {
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Perf userspace event generation
+ *
+ * second argument: event type
+ * third argument: data size
+ * fourth argument: pointer to data
+ * fifth argument: flags (currently unused, pass 0)
+ */
+#define PR_TASK_PERF_UEVENT 43
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ea3d6d3..3738e9c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5565,6 +5565,77 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 }
 
 /*
+ * Userspace-generated event
+ */
+
+struct perf_uevent {
+	struct perf_event_header header;
+	u32 type;
+	u32 size;
+	u8 data[0];
+};
+
+static void perf_uevent_output(struct perf_event *event, void *data)
+{
+	struct perf_uevent *uevent = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	int size = uevent->header.size;
+
+	if (!event->attr.uevents)
+		return;
+
+	perf_event_header__init_id(&uevent->header, &sample, event);
+
+	if (perf_output_begin(&handle, event, uevent->header.size) != 0)
+		goto out;
+	perf_output_put(&handle, uevent->header);
+	perf_output_put(&handle, uevent->type);
+	perf_output_put(&handle, uevent->size);
+	__output_copy(&handle, uevent->data, uevent->size);
+
+	/* Padding to align overall data size to 8 bytes */
+	perf_output_skip(&handle, -uevent->size & (sizeof(u64) - 1));
+
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+out:
+	uevent->header.size = size;
+}
+
+int perf_uevent(struct task_struct *tsk, u32 type, u32 size,
+		const char __user *data)
+{
+	struct perf_uevent *uevent;
+
+	/* Need some reasonable limit */
+	if (size > PAGE_SIZE)
+		return -E2BIG;
+
+	uevent = kmalloc(sizeof(*uevent) + size, GFP_KERNEL);
+	if (!uevent)
+		return -ENOMEM;
+
+	uevent->header.type = PERF_RECORD_UEVENT;
+	uevent->header.size = sizeof(*uevent) + ALIGN(size, sizeof(u64));
+
+	uevent->type = type;
+	uevent->size = size;
+	if (copy_from_user(uevent->data, data, size)) {
+		kfree(uevent);
+		return -EFAULT;
+	}
+
+	perf_event_aux(perf_uevent_output, uevent, NULL);
+
+	kfree(uevent);
+
+	return 0;
+}
+
+
+/*
  * Generic event overflow handling, sampling.
  */
 
diff --git a/kernel/sys.c b/kernel/sys.c
index 1eaa2f0..1c83677 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2121,6 +2121,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_TASK_PERF_EVENTS_ENABLE:
 		error = perf_event_task_enable();
 		break;
+	case PR_TASK_PERF_UEVENT:
+		if (arg5 != 0)
+			return -EINVAL;
+		error = perf_uevent(me, arg2, arg3, (char __user *)arg4);
+		break;
 	case PR_GET_TIMERSLACK:
 		error = current->timer_slack_ns;
 		break;
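
On the consumer side, the record layout documented in the uapi comment
of this patch could be walked roughly as in the sketch below. This is an
illustration only, not code from the series: struct uevent_fixed, struct
uevent_view and the function names are hypothetical, the buffer is
assumed to hold one complete record in host byte order, and the optional
sample_id trailer is left undecoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the fixed part of the record: perf_event_header, type, size. */
struct uevent_fixed {
	uint32_t hdr_type;	/* perf_event_header.type */
	uint16_t hdr_misc;	/* perf_event_header.misc */
	uint16_t hdr_size;	/* perf_event_header.size, whole record */
	uint32_t type;		/* producer-defined payload type */
	uint32_t size;		/* payload length in bytes */
};
static_assert(sizeof(struct uevent_fixed) == 16, "unexpected layout");

/* Padding after the payload that keeps the record 8-byte aligned. */
static inline uint32_t uevent_padding(uint32_t size)
{
	return -size & 7u;
}

/* Record length up to, but not including, the optional sample_id. */
static inline uint64_t uevent_len_without_id(uint32_t size)
{
	return (uint64_t)sizeof(struct uevent_fixed) + size +
	       uevent_padding(size);
}

/* Hypothetical decoded view; data points into the caller's buffer. */
struct uevent_view {
	uint32_t type;
	uint32_t size;
	const uint8_t *data;
};

/* Returns 0 on success, -1 if the buffer cannot hold the record. */
static inline int uevent_parse(const uint8_t *buf, size_t len,
			       struct uevent_view *out)
{
	struct uevent_fixed fixed;

	if (len < sizeof(fixed))
		return -1;
	memcpy(&fixed, buf, sizeof(fixed));	/* avoid unaligned access */
	if (fixed.hdr_size > len ||
	    uevent_len_without_id(fixed.size) > fixed.hdr_size)
		return -1;

	out->type = fixed.type;
	out->size = fixed.size;
	out->data = buf + sizeof(fixed);
	return 0;
}
```

For the 8-byte "Message" payload from the commit message the padding is
0, so the record is 24 bytes long before any sample_id; a 7-byte payload
would get 1 byte of padding and occupy the same 24 bytes.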