diff mbox

[RFC] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

Message ID 1435274682-9425-1-git-send-email-john.stultz@linaro.org
State New
Headers show

Commit Message

John Stultz June 25, 2015, 11:24 p.m. UTC
From: Ruchi Kandoi <kandoiruchi@google.com>

This patch has been in the Android tree for ahwile, and I've not
seen it posted for review. Unfortunately the PR_SET_TIMERSLACK_PID
value has collided many times w/ upstream (first w/
PR_SET_THP_DISABLE, and again w/ PR_MPX_ENABLE_MANAGEMENT).

So to try to avoid further ABI trouble, I wanted to send this out
for initial review to see if there were any major objections,
or ideas for alternative approaches.

Thoughts and feedback would be appreciated.
thanks
-john


This patch allows power/performance management software to set
timer slack for other threads according to its policy for the
thread (such as when the thread is designated foreground vs.
background activity)

Second argument is similar to PR_SET_TIMERSLACK, if non-zero
then the slack is set to that value otherwise sets it to the
default for the thread.

Takes PID of the thread as the third argument.

This interface checks check for CAP_SYS_NICE to make sure that
arbitrary apps do not change the timer slack for other apps.

Additional fixes from Ruchi and Micha Kalfon <micha@cellrox.com>
have been folded into this patch to make it easier to reivew.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Micha Kalfon <micha@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
[jstultz:
 * Folded in CAP_SYS_NICE check from Ruchi.
 * Folded in fix misplaced PR_SET_TIMERSLACK_PID case fix from
   Micha.
 * Folded in make PR_SET_TIMERSLACK_PID pid namespace aware fix
   from Micha.
 * Changed PR_SET_TIMERSLACK_PID so it didn't collide with
   already upstream prctrl values.
 * Reworked commit message.]
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/uapi/linux/prctl.h |  7 +++++++
 kernel/sys.c               | 24 ++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

Comments

John Stultz June 26, 2015, 9:36 p.m. UTC | #1
On Thu, Jun 25, 2015 at 6:21 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> On 6/25/2015 4:24 PM, John Stultz wrote:
>>
>> From: Ruchi Kandoi <kandoiruchi@google.com>
>>
>> This patch has been in the Android tree for ahwile, and I've not
>> seen it posted for review. Unfortunately the PR_SET_TIMERSLACK_PID
>> value has collided many times w/ upstream (first w/
>> PR_SET_THP_DISABLE, and again w/ PR_MPX_ENABLE_MANAGEMENT).
>>
>> So to try to avoid further ABI trouble, I wanted to send this out
>> for initial review to see if there were any major objections,
>> or ideas for alternative approaches.
>>
>> Thoughts and feedback would be appreciated.
>
>
> I am wondering if it would be more natural to make this a cgroup
> property.. and have a way for threads to inherit their cgroup
> effective value.

I'm hesitant here, as I know the android cgroup usage (ie: having a
forground cgroup and background cgroup), is frowned upon by the cgroup
maintainers, since frequently migrating processes and memory between
the cgroups is costly. So  they've been suggesting (assuming I
understood their feedback properly) to use per-application cgroups
(changing the cgroup knobs to "forground settings" or "background
settings") as needed.

So if you were thinking that one could set the slack for larger
cgroups to avoid per-process adjustments, that may not be the right
way forward. But maybe you're suggesting something else?

> it'll fit more natural in many places than just a per pid setting.
> (I have nothing against it per se other than that I think the
> security check is wrong; I fully expected to be on the level of
> ptrace_may_access(PTRACE_MODE_ATTACH) or the like.. not just NICE.
> you can seriously change task behavior with this by setting a 4 second
> value or so.. much more than just with nice)

Yea, they've sort of overloaded CAP_SYS_NICE for a number of Android
specific checks.  I think using a different existing cap or
introducing a new one would be fine, but I'm not sure which cap would
really be best. Are there any other opinions here, or is
PTRACE_MODE_ATTACH ok?

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
diff mbox

Patch

diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 31891d9..7086be4 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -187,6 +187,13 @@  struct prctl_mm_map {
 
 #define PR_SET_FP_MODE		45
 #define PR_GET_FP_MODE		46
+
+/* Sets the timerslack for arbitrary threads
+ * arg2 slack value, 0 means "use default"
+ * arg3 pid of the thread whose timer slack needs to be set
+ */
+#define PR_SET_TIMERSLACK_PID	47
+
 # define PR_FP_MODE_FR		(1 << 0)	/* 64b FP registers */
 # define PR_FP_MODE_FRE		(1 << 1)	/* 32b compatibility */
 
diff --git a/kernel/sys.c b/kernel/sys.c
index 8571296..75b32f0 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -41,6 +41,9 @@ 
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
 #include <linux/ctype.h>
+#include <linux/mm.h>
+#include <linux/mempolicy.h>
+#include <linux/sched.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -2053,6 +2056,7 @@  SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
 	struct task_struct *me = current;
+	struct task_struct *tsk;
 	unsigned char comm[sizeof(me->comm)];
 	long error;
 
@@ -2195,6 +2199,26 @@  SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_GET_TID_ADDRESS:
 		error = prctl_get_tid_address(me, (int __user **)arg2);
 		break;
+	case PR_SET_TIMERSLACK_PID:
+		if (task_pid_vnr(current) != (pid_t)arg3 &&
+				!capable(CAP_SYS_NICE))
+			return -EPERM;
+		rcu_read_lock();
+		tsk = find_task_by_vpid((pid_t)arg3);
+		if (tsk == NULL) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		get_task_struct(tsk);
+		rcu_read_unlock();
+		if (arg2 <= 0)
+			tsk->timer_slack_ns =
+				tsk->default_timer_slack_ns;
+		else
+			tsk->timer_slack_ns = arg2;
+		put_task_struct(tsk);
+		error = 0;
+		break;
 	case PR_SET_CHILD_SUBREAPER:
 		me->signal->is_child_subreaper = !!arg2;
 		break;