From patchwork Fri Apr 20 12:05:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arnd Bergmann
X-Patchwork-Id: 133871
From: Arnd Bergmann
To: y2038@lists.linaro.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, linux-api@vger.kernel.org,
 linux-arch@vger.kernel.org, Arnd Bergmann, Paul Eggert,
 "Eric W. Biederman", Richard Henderson, Ivan Kokshaysky, Matt Turner,
 Al Viro, Dominik Brodowski, Thomas Gleixner, Andrew Morton,
 linux-alpha@vger.kernel.org
Subject: [PATCH v2 1/2] y2038: rusage: Use __kernel_old_timeval for process times
Date: Fri, 20 Apr 2018 14:05:13 +0200
Message-Id: <20180420120605.1612248-1-arnd@arndb.de>
X-Mailer: git-send-email 2.9.0
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

'struct rusage' contains the run times of a process in 'timeval' format
and is accessed through the wait4() and getrusage() system calls. This
is not a problem for y2038 safety by itself, but causes an issue when
the C library starts using 64-bit time_t on 32-bit architectures because
the structure layout becomes incompatible.

There are four possible ways of dealing with this:

a) Deprecate the wait4() and getrusage() system calls, and create
   a set of kernel interfaces based around a newly defined structure that
   could solve multiple problems at once, e.g. provide more fine-grained
   timestamps. The C library could then implement the POSIX interfaces
   on top of the new system calls.

b) Extend the approach taken by the x32 ABI, and use the 64-bit
   native structure layout for rusage on all architectures through new
   system calls that are otherwise compatible. A downside of this is
   that it requires a number of ugly hacks to deal with all the other
   fields of the structure also becoming 64 bits wide. Especially on
   big-endian architectures, we can't easily use the union trick from
   glibc.

c) Change the definition of struct rusage to be independent of
   time_t. This is the easiest change, as it does not involve new system
   call entry points, but it requires the C library to convert between
   the kernel format of the structure and the user space definition.

d) Add a new ABI variant of 'struct rusage' that corresponds to the
   current layout with 32-bit counters but 64-bit time_t. This would
   minimize the libc changes but require additional kernel code to
   handle a third binary layout on 64-bit kernels.

I'm picking approach c) for its simplicity. As pointed out by reviewers,
simply using the kernel structure in user space would not be POSIX
compliant, but I have verified that none of the usual C libraries
(glibc, musl, uclibc-ng, newlib) do that. Instead, they all provide
their own definition of 'struct rusage' to applications in
sys/resource.h.

To be on the safe side, I'm only changing the definition inside of the
kernel and for user space with an updated 'time_t'. All existing users
will see the traditional layout that is compatible with what the C
libraries export.
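
Approach c) leaves the translation between the two layouts to the C
library. A minimal sketch of what such a conversion might look like is
shown here. The structure and function names (k_rusage, u_rusage,
rusage_kernel_to_user) and the member widths are assumptions made for
illustration only; they are not taken from the kernel or from any libc.

```c
#include <assert.h>
#include <stdint.h>

#define RU_NCOUNTERS 14	/* ru_maxrss .. ru_nivcsw */

/*
 * Hypothetical kernel-side layout on a 32-bit ABI: 32-bit seconds and
 * microseconds, in the spirit of __kernel_old_timeval.
 */
struct k_old_timeval {
	int32_t tv_sec;
	int32_t tv_usec;
};

struct k_rusage {
	struct k_old_timeval ru_utime;
	struct k_old_timeval ru_stime;
	int32_t ru_counters[RU_NCOUNTERS];
};

/* Hypothetical user-space layout once time_t is 64 bits wide. */
struct u_timeval64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct u_rusage {
	struct u_timeval64 ru_utime;
	struct u_timeval64 ru_stime;
	long ru_counters[RU_NCOUNTERS];
};

/* Widen the two time fields and copy the counters unchanged. */
void rusage_kernel_to_user(const struct k_rusage *k, struct u_rusage *u)
{
	int i;

	assert(k && u);

	u->ru_utime.tv_sec = k->ru_utime.tv_sec;
	u->ru_utime.tv_usec = k->ru_utime.tv_usec;
	u->ru_stime.tv_sec = k->ru_stime.tv_sec;
	u->ru_stime.tv_usec = k->ru_stime.tv_usec;

	for (i = 0; i < RU_NCOUNTERS; i++)
		u->ru_counters[i] = k->ru_counters[i];
}
```

A library wrapper would fill the kernel-format structure through the
existing system call and hand the converted copy to the application.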

A 32-bit application that includes linux/resource.h but uses an updated
C library with 64-bit time_t will now see the low-level kernel structure
that corresponds to the getrusage() system call interface but that will
be different from the one defined in sys/resource.h for the getrusage
library interface.

Link: https://patchwork.kernel.org/patch/10077527/
Cc: Paul Eggert
Cc: Eric W. Biederman
Signed-off-by: Arnd Bergmann
---
 arch/alpha/kernel/osf_sys.c   | 15 +++++++++------
 include/uapi/linux/resource.h | 14 ++++++++++++--
 kernel/sys.c                  |  4 ++--
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.9.0

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 89faa6f4de47..cad03ee445b3 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1184,6 +1184,7 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,
 		struct rusage32 __user *, ur)
 {
 	unsigned int status = 0;
+	struct rusage32 r32;
 	struct rusage r;
 	long err = kernel_wait4(pid, &status, options, &r);
 	if (err <= 0)
@@ -1192,12 +1193,14 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,
 		return -EFAULT;
 	if (!ur)
 		return err;
-	if (put_tv_to_tv32(&ur->ru_utime, &r.ru_utime))
-		return -EFAULT;
-	if (put_tv_to_tv32(&ur->ru_stime, &r.ru_stime))
-		return -EFAULT;
-	if (copy_to_user(&ur->ru_maxrss, &r.ru_maxrss,
-	      sizeof(struct rusage32) - offsetof(struct rusage32, ru_maxrss)))
+	r32.ru_utime.tv_sec = r.ru_utime.tv_sec;
+	r32.ru_utime.tv_usec = r.ru_utime.tv_usec;
+	r32.ru_stime.tv_sec = r.ru_stime.tv_sec;
+	r32.ru_stime.tv_usec = r.ru_stime.tv_usec;
+	memcpy(&r32.ru_maxrss, &r.ru_maxrss,
+	       sizeof(struct rusage32) - offsetof(struct rusage32, ru_maxrss));
+
+	if (copy_to_user(ur, &r32, sizeof(r32)))
 		return -EFAULT;
 	return err;
 }
diff --git a/include/uapi/linux/resource.h b/include/uapi/linux/resource.h
index cc00fd079631..611d3745c70a 100644
--- a/include/uapi/linux/resource.h
+++ b/include/uapi/linux/resource.h
@@ -22,8 +22,18 @@
 #define RUSAGE_THREAD 1 /* only the calling thread */
 
 struct rusage {
-	struct timeval ru_utime; /* user time used */
-	struct timeval ru_stime; /* system time used */
+#if (__BITS_PER_LONG != 32 || !defined(__USE_TIME_BITS64)) && !defined(__KERNEL__)
+	struct timeval ru_utime; /* user time used */
+	struct timeval ru_stime; /* system time used */
+#else
+	/*
+	 * For 32-bit user space with 64-bit time_t, the binary layout
+	 * in these fields is incompatible with 'struct timeval', so the
+	 * C library has to translate this into the POSIX compatible layout.
+	 */
+	struct __kernel_old_timeval ru_utime;
+	struct __kernel_old_timeval ru_stime;
+#endif
 	__kernel_long_t ru_maxrss; /* maximum resident set size */
 	__kernel_long_t ru_ixrss; /* integral shared memory size */
 	__kernel_long_t ru_idrss; /* integral unshared data size */
diff --git a/kernel/sys.c b/kernel/sys.c
index ad692183dfe9..1de538f622e8 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1769,8 +1769,8 @@ void getrusage(struct task_struct *p, int who, struct rusage *r)
 	unlock_task_sighand(p, &flags);
 
 out:
-	r->ru_utime = ns_to_timeval(utime);
-	r->ru_stime = ns_to_timeval(stime);
+	r->ru_utime = ns_to_kernel_old_timeval(utime);
+	r->ru_stime = ns_to_kernel_old_timeval(stime);
 
 	if (who != RUSAGE_CHILDREN) {
 		struct mm_struct *mm = get_task_mm(p);

From patchwork Fri Apr 20 12:05:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arnd Bergmann
X-Patchwork-Id: 133872
From: Arnd Bergmann
To: y2038@lists.linaro.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, linux-api@vger.kernel.org,
 linux-arch@vger.kernel.org, Arnd Bergmann, Paul Eggert,
 "Eric W. Biederman", Richard Henderson, Ivan Kokshaysky, Matt Turner,
 Al Viro, Dominik Brodowski, Thomas Gleixner, Andrew Morton,
 linux-alpha@vger.kernel.org, Deepa Dinamani, Ingo Molnar
Subject: [PATCH v2 2/2] rusage: allow 64-bit times ru_utime/ru_stime
Date: Fri, 20 Apr 2018 14:05:14 +0200
Message-Id: <20180420120605.1612248-2-arnd@arndb.de>
X-Mailer: git-send-email 2.9.0
In-Reply-To: <20180420120605.1612248-1-arnd@arndb.de>
References: <20180420120605.1612248-1-arnd@arndb.de>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

As part of the y2038 system call work, the getrusage/wait4/waitid system
calls came up, since they pass time data to user space in 'timeval'
format as required by POSIX.

This means the existing kernel data structure is no longer compatible
with the one from user space once we have a C library exposing 64-bit
time_t, requiring the library to convert the kernel structure into its
own structure.

This patch moves that conversion into the kernel itself, providing a set
of system calls that can directly be used to implement the libc
getrusage/wait4/waitid functions as we have traditionally done.

There are two advantages to this:

- The new path becomes the native case, avoiding the conversion overhead
  for future 32-bit C libraries. At least glibc will still have to
  implement conversion logic as a fallback in order to run new
  applications on older kernels, but that does not have to be used on
  new kernels.

- The range for ru_utime/ru_stime is no longer limited to a 31-bit
  second counter (about 68 years). That limit may theoretically be hit
  on large SMP systems with a single process running for an extended
  time, e.g. 256 concurrent threads running for more than 97 days
  (2^31 seconds divided by 256 is roughly 97 days). Note that there is
  no overflow in 2038, as all the times are relative to the start of a
  process.

The downside of this is obviously the added complexity of having three
additional system call entry points plus their respective compat
handlers, and updated syscall tables on each architecture (not included
in this patch).

Overall, I think this is *not* worth it, but I feel it's important to
show how it can be done and what the cost is. There are probably some
minor improvements that can be implemented on top, as well as bugs that
I introduced. When reviewing this patch, let's for now focus instead on
the question of whether we want it at all or not.
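
The fallback described in the first point could be structured roughly as
in the following sketch. It rests on stated assumptions: the two system
calls are modelled as function pointers that return 0 or a negative
errno, and every name in it (getrusage_times64, times32, times64 and the
fake_* helpers) is hypothetical, since this patch does not assign
syscall numbers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical reduced views of the old and new time fields. */
struct times32 { int32_t utime_sec, utime_usec, stime_sec, stime_usec; };
struct times64 { int64_t utime_sec, utime_usec, stime_sec, stime_usec; };

typedef int (*call64_fn)(int who, struct times64 *out);
typedef int (*call32_fn)(int who, struct times32 *out);

/*
 * Try the 64-bit entry point first; only if the kernel does not know it
 * (-ENOSYS), fall back to the old call and widen the result.
 */
int getrusage_times64(int who, struct times64 *out,
		      call64_fn call64, call32_fn call32)
{
	struct times32 old;
	int ret;

	assert(out && call64 && call32);

	ret = call64(who, out);
	if (ret != -ENOSYS)
		return ret;	/* native path succeeded or failed for real */

	ret = call32(who, &old);
	if (ret < 0)
		return ret;

	out->utime_sec = old.utime_sec;
	out->utime_usec = old.utime_usec;
	out->stime_sec = old.stime_sec;
	out->stime_usec = old.stime_usec;
	return 0;
}

/* Stand-in for an older kernel without the new entry point. */
int fake_call64_missing(int who, struct times64 *out)
{
	(void)who;
	(void)out;
	return -ENOSYS;
}

/* Stand-in for a newer kernel reporting more than 2^31 seconds. */
int fake_call64_present(int who, struct times64 *out)
{
	(void)who;
	out->utime_sec = 5000000000LL;
	out->utime_usec = 1;
	out->stime_sec = 2;
	out->stime_usec = 3;
	return 0;
}

/* Stand-in for the traditional call with 32-bit time fields. */
int fake_call32(int who, struct times32 *out)
{
	(void)who;
	out->utime_sec = 12;
	out->utime_usec = 345678;
	out->stime_sec = 1;
	out->stime_usec = 5;
	return 0;
}
```

On a kernel that provides the new call, the second branch is never
taken, which is the "native case" the commit message refers to.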

Signed-off-by: Arnd Bergmann
---
 arch/alpha/kernel/osf_sys.c   |   2 +-
 include/linux/compat.h        |  26 ++++++++-
 include/linux/resource.h      |   4 +-
 include/linux/sched/task.h    |   4 +-
 include/linux/syscalls.h      |   8 +++
 include/uapi/linux/resource.h |  29 ++++++++++
 kernel/compat.c               |  30 ++++++++++-
 kernel/exit.c                 | 120 ++++++++++++++++++++++++++++++++++++++----
 kernel/sys.c                  |  74 +++++++++++++++++++++++---
 9 files changed, 275 insertions(+), 22 deletions(-)

-- 
2.9.0

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index cad03ee445b3..aecdb48257b5 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1185,7 +1185,7 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,
 {
 	unsigned int status = 0;
 	struct rusage32 r32;
-	struct rusage r;
+	struct __kernel_rusage r;
 	long err = kernel_wait4(pid, &status, options, &r);
 	if (err <= 0)
 		return err;
diff --git a/include/linux/compat.h b/include/linux/compat.h
index b73e2616a409..2ef30d314c48 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -105,7 +105,7 @@ typedef __compat_uid32_t compat_uid_t;
 typedef __compat_gid32_t compat_gid_t;
 
 struct compat_sel_arg_struct;
-struct rusage;
+struct __kernel_rusage;
 
 struct compat_itimerspec {
 	struct compat_timespec it_interval;
@@ -321,9 +321,31 @@ struct compat_rusage {
 	compat_long_t ru_nivcsw;
 };
 
-extern int put_compat_rusage(const struct rusage *,
+struct compat_rusage_time64 {
+	struct __kernel_rusage_timeval ru_utime;
+	struct __kernel_rusage_timeval ru_stime;
+	compat_long_t ru_maxrss;
+	compat_long_t ru_ixrss;
+	compat_long_t ru_idrss;
+	compat_long_t ru_isrss;
+	compat_long_t ru_minflt;
+	compat_long_t ru_majflt;
+	compat_long_t ru_nswap;
+	compat_long_t ru_inblock;
+	compat_long_t ru_oublock;
+	compat_long_t ru_msgsnd;
+	compat_long_t ru_msgrcv;
+	compat_long_t ru_nsignals;
+	compat_long_t ru_nvcsw;
+	compat_long_t ru_nivcsw;
+};
+
+extern int put_compat_rusage(const struct __kernel_rusage *,
 			     struct compat_rusage __user *);
 
+extern int put_compat_rusage_time64(const struct __kernel_rusage *,
+			     struct compat_rusage_time64 __user *);
+
 struct compat_siginfo;
 
 struct compat_dirent {
diff --git a/include/linux/resource.h b/include/linux/resource.h
index bdf491cbcab7..8cebf90e76b7 100644
--- a/include/linux/resource.h
+++ b/include/linux/resource.h
@@ -7,8 +7,10 @@
 
 struct task_struct;
 
-void getrusage(struct task_struct *p, int who, struct rusage *ru);
+void getrusage(struct task_struct *p, int who, struct __kernel_rusage *ru);
 int do_prlimit(struct task_struct *tsk, unsigned int resource,
 		struct rlimit *new_rlim, struct rlimit *old_rlim);
 
+int put_rusage(const struct __kernel_rusage *rk, struct rusage __user *ru);
+
 #endif
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 5be31eb7b266..cc54ae5e6010 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -10,7 +10,7 @@
 #include
 
 struct task_struct;
-struct rusage;
+struct __kernel_rusage;
 union thread_union;
 
 /*
@@ -75,7 +75,7 @@ extern long _do_fork(unsigned long, unsigned long, unsigned long, int __user *,
 extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *);
 struct task_struct *fork_idle(int);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
-extern long kernel_wait4(pid_t, int *, int, struct rusage *);
+extern long kernel_wait4(pid_t, int *, int, struct __kernel_rusage *);
 
 extern void free_task(struct task_struct *tsk);
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a756ab42894f..084360078f29 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -37,6 +37,7 @@ struct pollfd;
 struct rlimit;
 struct rlimit64;
 struct rusage;
+struct __kernel_rusage;
 struct sched_param;
 struct sched_attr;
 struct sel_arg_struct;
@@ -522,6 +523,10 @@ asmlinkage long sys_waitid(int which, pid_t pid,
 			   struct siginfo __user *infop,
 			   int options, struct rusage __user *ru);
 
+asmlinkage long sys_waitid_time64(int which, pid_t pid,
+			   struct siginfo __user *infop,
+			   int options, struct __kernel_rusage __user *ru);
+
 /* kernel/fork.c */
 asmlinkage long sys_set_tid_address(int __user *tidptr);
 asmlinkage long sys_unshare(unsigned long unshare_flags);
@@ -656,6 +661,7 @@ asmlinkage long sys_getrlimit(unsigned int resource,
 asmlinkage long sys_setrlimit(unsigned int resource,
 				struct rlimit __user *rlim);
 asmlinkage long sys_getrusage(int who, struct rusage __user *ru);
+asmlinkage long sys_getrusage_time64(int who, struct __kernel_rusage __user *ru);
 asmlinkage long sys_umask(int mask);
 asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3,
 			unsigned long arg4, unsigned long arg5);
@@ -821,6 +827,8 @@ asmlinkage long sys_recvmmsg(int fd, struct mmsghdr __user *msg,
 
 asmlinkage long sys_wait4(pid_t pid, int __user *stat_addr,
 				int options, struct rusage __user *ru);
+asmlinkage long sys_wait4_time64(pid_t pid, int __user *stat_addr,
+				int options, struct __kernel_rusage __user *ru);
 asmlinkage long sys_prlimit64(pid_t pid, unsigned int resource,
 				const struct rlimit64 __user *new_rlim,
 				struct rlimit64 __user *old_rlim);
diff --git a/include/uapi/linux/resource.h b/include/uapi/linux/resource.h
index 611d3745c70a..a822e716e122 100644
--- a/include/uapi/linux/resource.h
+++ b/include/uapi/linux/resource.h
@@ -50,6 +50,35 @@ struct rusage {
 	__kernel_long_t ru_nivcsw; /* involuntary " */
 };
 
+/*
+ * __kernel_rusage is the binary layout that we expect 32-bit C libraries
+ * to provide for their 'struct rusage' after migrating to a 64-bit
+ * time_t.
+ */
+struct __kernel_rusage_timeval {
+	__s64 tv_sec;
+	__s64 tv_usec;
+};
+
+struct __kernel_rusage {
+	struct __kernel_rusage_timeval ru_utime; /* user time used */
+	struct __kernel_rusage_timeval ru_stime; /* system time used */
+	__kernel_long_t ru_maxrss; /* maximum resident set size */
+	__kernel_long_t ru_ixrss; /* integral shared memory size */
+	__kernel_long_t ru_idrss; /* integral unshared data size */
+	__kernel_long_t ru_isrss; /* integral unshared stack size */
+	__kernel_long_t ru_minflt; /* page reclaims */
+	__kernel_long_t ru_majflt; /* page faults */
+	__kernel_long_t ru_nswap; /* swaps */
+	__kernel_long_t ru_inblock; /* block input operations */
+	__kernel_long_t ru_oublock; /* block output operations */
+	__kernel_long_t ru_msgsnd; /* messages sent */
+	__kernel_long_t ru_msgrcv; /* messages received */
+	__kernel_long_t ru_nsignals; /* signals received */
+	__kernel_long_t ru_nvcsw; /* voluntary context switches */
+	__kernel_long_t ru_nivcsw; /* involuntary " */
+};
+
 struct rlimit {
 	__kernel_ulong_t rlim_cur;
 	__kernel_ulong_t rlim_max;
diff --git a/kernel/compat.c b/kernel/compat.c
index 51a081b46832..e3cb7c14558a 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -234,7 +234,7 @@ COMPAT_SYSCALL_DEFINE3(sigprocmask, int, how,
 
 #endif
 
-int put_compat_rusage(const struct rusage *r, struct compat_rusage __user *ru)
+int put_compat_rusage(const struct __kernel_rusage *r, struct compat_rusage __user *ru)
 {
 	struct compat_rusage r32;
 	memset(&r32, 0, sizeof(r32));
@@ -261,6 +261,34 @@ int put_compat_rusage(const struct rusage *r, struct compat_rusage __user *ru)
 	return 0;
 }
 
+int put_compat_rusage_time64(const struct __kernel_rusage *r,
+			     struct compat_rusage_time64 __user *ru)
+{
+	struct compat_rusage_time64 r32;
+	memset(&r32, 0, sizeof(r32));
+	r32.ru_utime.tv_sec = r->ru_utime.tv_sec;
+	r32.ru_utime.tv_usec = r->ru_utime.tv_usec;
+	r32.ru_stime.tv_sec = r->ru_stime.tv_sec;
+	r32.ru_stime.tv_usec = r->ru_stime.tv_usec;
+	r32.ru_maxrss = r->ru_maxrss;
+	r32.ru_ixrss = r->ru_ixrss;
+	r32.ru_idrss = r->ru_idrss;
+	r32.ru_isrss = r->ru_isrss;
+	r32.ru_minflt = r->ru_minflt;
+	r32.ru_majflt = r->ru_majflt;
+	r32.ru_nswap = r->ru_nswap;
+	r32.ru_inblock = r->ru_inblock;
+	r32.ru_oublock = r->ru_oublock;
+	r32.ru_msgsnd = r->ru_msgsnd;
+	r32.ru_msgrcv = r->ru_msgrcv;
+	r32.ru_nsignals = r->ru_nsignals;
+	r32.ru_nvcsw = r->ru_nvcsw;
+	r32.ru_nivcsw = r->ru_nivcsw;
+	if (copy_to_user(ru, &r32, sizeof(r32)))
+		return -EFAULT;
+	return 0;
+}
+
 static int compat_get_user_cpu_mask(compat_ulong_t __user *user_mask_ptr,
 				    unsigned len, struct cpumask *new_mask)
 {
diff --git a/kernel/exit.c b/kernel/exit.c
index c3c7ac560114..5088c671ea74 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -995,7 +995,7 @@ struct wait_opts {
 
 	struct waitid_info *wo_info;
 	int wo_stat;
-	struct rusage *wo_rusage;
+	struct __kernel_rusage *wo_rusage;
 
 	wait_queue_entry_t child_wait;
 	int notask_error;
@@ -1548,7 +1548,7 @@ static long do_wait(struct wait_opts *wo)
 }
 
 static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
-			  int options, struct rusage *ru)
+			  int options, struct __kernel_rusage *ru)
 {
 	struct wait_opts wo;
 	struct pid *pid = NULL;
@@ -1596,7 +1596,7 @@ static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
 SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
 		infop, int, options, struct rusage __user *, ru)
 {
-	struct rusage r;
+	struct __kernel_rusage r;
 	struct waitid_info info = {.status = 0};
 	long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL);
 	int signo = 0;
@@ -1604,7 +1604,41 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
 	if (err > 0) {
 		signo = SIGCHLD;
 		err = 0;
-		if (ru && copy_to_user(ru, &r, sizeof(struct rusage)))
+		if (ru && put_rusage(&r, ru))
+			return -EFAULT;
+	}
+	if (!infop)
+		return err;
+
+	if (!access_ok(VERIFY_WRITE, infop, sizeof(*infop)))
+		return -EFAULT;
+
+	user_access_begin();
+	unsafe_put_user(signo, &infop->si_signo, Efault);
+	unsafe_put_user(0, &infop->si_errno, Efault);
+	unsafe_put_user(info.cause, &infop->si_code, Efault);
+	unsafe_put_user(info.pid, &infop->si_pid, Efault);
+	unsafe_put_user(info.uid, &infop->si_uid, Efault);
+	unsafe_put_user(info.status, &infop->si_status, Efault);
+	user_access_end();
+	return err;
+Efault:
+	user_access_end();
+	return -EFAULT;
+}
+
+SYSCALL_DEFINE5(waitid_time64, int, which, pid_t, upid, struct siginfo __user *,
+		infop, int, options, struct __kernel_rusage __user *, ru)
+{
+	struct __kernel_rusage r;
+	struct waitid_info info = {.status = 0};
+	long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL);
+	int signo = 0;
+
+	if (err > 0) {
+		signo = SIGCHLD;
+		err = 0;
+		if (ru && copy_to_user(ru, &r, sizeof(struct __kernel_rusage)))
 			return -EFAULT;
 	}
 	if (!infop)
@@ -1628,7 +1662,7 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
 }
 
 long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
-		  struct rusage *ru)
+		  struct __kernel_rusage *ru)
 {
 	struct wait_opts wo;
 	struct pid *pid = NULL;
@@ -1673,11 +1707,24 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
 SYSCALL_DEFINE4(wait4, pid_t, upid, int __user *, stat_addr,
 		int, options, struct rusage __user *, ru)
 {
-	struct rusage r;
+	struct __kernel_rusage r;
 	long err = kernel_wait4(upid, stat_addr, options, ru ? &r : NULL);
 
 	if (err > 0) {
-		if (ru && copy_to_user(ru, &r, sizeof(struct rusage)))
+		if (ru && put_rusage(&r, ru))
+			return -EFAULT;
+	}
+	return err;
+}
+
+SYSCALL_DEFINE4(wait4_time64, pid_t, upid, int __user *, stat_addr,
+		int, options, struct __kernel_rusage __user *, ru)
+{
+	struct __kernel_rusage r;
+	long err = kernel_wait4(upid, stat_addr, options, ru ? &r : NULL);
+
+	if (err > 0) {
+		if (ru && copy_to_user(ru, &r, sizeof(struct __kernel_rusage)))
 			return -EFAULT;
 	}
 	return err;
@@ -1703,7 +1750,7 @@ COMPAT_SYSCALL_DEFINE4(wait4,
 	int, options,
 	struct compat_rusage __user *, ru)
 {
-	struct rusage r;
+	struct __kernel_rusage r;
 	long err = kernel_wait4(pid, stat_addr, options, ru ? &r : NULL);
 	if (err > 0) {
 		if (ru && put_compat_rusage(&r, ru))
@@ -1712,12 +1759,27 @@ COMPAT_SYSCALL_DEFINE4(wait4,
 	return err;
 }
 
+COMPAT_SYSCALL_DEFINE4(wait4_time64,
+	compat_pid_t, pid,
+	compat_uint_t __user *, stat_addr,
+	int, options,
+	struct compat_rusage_time64 __user *, ru)
+{
+	struct __kernel_rusage r;
+	long err = kernel_wait4(pid, stat_addr, options, ru ? &r : NULL);
+	if (err > 0) {
+		if (ru && put_compat_rusage_time64(&r, ru))
+			return -EFAULT;
+	}
+	return err;
+}
+
 COMPAT_SYSCALL_DEFINE5(waitid,
 		int, which, compat_pid_t, pid,
 		struct compat_siginfo __user *, infop, int, options,
 		struct compat_rusage __user *, uru)
 {
-	struct rusage ru;
+	struct __kernel_rusage ru;
 	struct waitid_info info = {.status = 0};
 	long err = kernel_waitid(which, pid, &info, options, uru ? &ru : NULL);
 	int signo = 0;
@@ -1754,6 +1816,46 @@ COMPAT_SYSCALL_DEFINE5(waitid,
 	user_access_end();
 	return -EFAULT;
 }
+
+COMPAT_SYSCALL_DEFINE5(waitid_time64,
+		int, which, compat_pid_t, pid,
+		struct compat_siginfo __user *, infop, int, options,
+		struct compat_rusage_time64 __user *, uru)
+{
+	struct __kernel_rusage ru;
+	struct waitid_info info = {.status = 0};
+	long err = kernel_waitid(which, pid, &info, options, uru ? &ru : NULL);
+	int signo = 0;
+	if (err > 0) {
+		signo = SIGCHLD;
+		err = 0;
+		if (uru) {
+			/* kernel_waitid() overwrites everything in ru */
+			err = put_compat_rusage_time64(&ru, uru);
+			if (err)
+				return -EFAULT;
+		}
+	}
+
+	if (!infop)
+		return err;
+
+	if (!access_ok(VERIFY_WRITE, infop, sizeof(*infop)))
+		return -EFAULT;
+
+	user_access_begin();
+	unsafe_put_user(signo, &infop->si_signo, Efault);
+	unsafe_put_user(0, &infop->si_errno, Efault);
+	unsafe_put_user(info.cause, &infop->si_code, Efault);
+	unsafe_put_user(info.pid, &infop->si_pid, Efault);
+	unsafe_put_user(info.uid, &infop->si_uid, Efault);
+	unsafe_put_user(info.status, &infop->si_status, Efault);
+	user_access_end();
+	return err;
+Efault:
+	user_access_end();
+	return -EFAULT;
+}
 #endif
 
 __weak void abort(void)
diff --git a/kernel/sys.c b/kernel/sys.c
index 1de538f622e8..5b5f2dc19e79 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1699,7 +1699,7 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
  *
  */
 
-static void accumulate_thread_rusage(struct task_struct *t, struct rusage *r)
+static void accumulate_thread_rusage(struct task_struct *t, struct __kernel_rusage *r)
 {
 	r->ru_nvcsw += t->nvcsw;
 	r->ru_nivcsw += t->nivcsw;
@@ -1709,12 +1709,13 @@ static void accumulate_thread_rusage(struct task_struct *t, struct rusage *r)
 	r->ru_oublock += task_io_get_oublock(t);
 }
 
-void getrusage(struct task_struct *p, int who, struct rusage *r)
+void getrusage(struct task_struct *p, int who, struct __kernel_rusage *r)
 {
 	struct task_struct *t;
 	unsigned long flags;
 	u64 tgutime, tgstime, utime, stime;
 	unsigned long maxrss = 0;
+	struct timespec64 ts;
 
 	memset((char *)r, 0, sizeof (*r));
 	utime = stime = 0;
@@ -1769,8 +1770,12 @@ void getrusage(struct task_struct *p, int who, struct rusage *r)
 	unlock_task_sighand(p, &flags);
 
 out:
-	r->ru_utime = ns_to_kernel_old_timeval(utime);
-	r->ru_stime = ns_to_kernel_old_timeval(stime);
+	ts = ns_to_timespec64(utime);
+	r->ru_utime.tv_sec = ts.tv_sec;
+	r->ru_utime.tv_usec = ts.tv_nsec / NSEC_PER_USEC;
+	ts = ns_to_timespec64(stime);
+	r->ru_stime.tv_sec = ts.tv_sec;
+	r->ru_stime.tv_usec = ts.tv_nsec / NSEC_PER_USEC;
 
 	if (who != RUSAGE_CHILDREN) {
 		struct mm_struct *mm = get_task_mm(p);
@@ -1783,10 +1788,54 @@ void getrusage(struct task_struct *p, int who, struct rusage *r)
 	r->ru_maxrss = maxrss * (PAGE_SIZE / 1024); /* convert pages to KBs */
 }
 
-SYSCALL_DEFINE2(getrusage, int, who, struct rusage __user *, ru)
+int put_rusage(const struct __kernel_rusage *rk, struct rusage __user *ru)
 {
 	struct rusage r;
 
+	if (IS_ENABLED(CONFIG_64BIT))
+		return copy_to_user(ru, rk, sizeof(*rk)) ? -EFAULT : 0;
+
+	memset(&r, 0, sizeof(r));
+	r.ru_utime.tv_sec = rk->ru_utime.tv_sec;
+	r.ru_utime.tv_usec = rk->ru_utime.tv_usec;
+	r.ru_stime.tv_sec = rk->ru_stime.tv_sec;
+	r.ru_stime.tv_usec = rk->ru_stime.tv_usec;
+	r.ru_maxrss = rk->ru_maxrss;
+	r.ru_ixrss = rk->ru_ixrss;
+	r.ru_idrss = rk->ru_idrss;
+	r.ru_isrss = rk->ru_isrss;
+	r.ru_minflt = rk->ru_minflt;
+	r.ru_majflt = rk->ru_majflt;
+	r.ru_nswap = rk->ru_nswap;
+	r.ru_inblock = rk->ru_inblock;
+	r.ru_oublock = rk->ru_oublock;
+	r.ru_msgsnd = rk->ru_msgsnd;
+	r.ru_msgrcv = rk->ru_msgrcv;
+	r.ru_nsignals = rk->ru_nsignals;
+	r.ru_nvcsw = rk->ru_nvcsw;
+	r.ru_nivcsw = rk->ru_nivcsw;
+	if (copy_to_user(ru, &r, sizeof(r)))
+		return -EFAULT;
+	return 0;
+}
+
+SYSCALL_DEFINE2(getrusage, int, who, struct rusage __user *, ru)
+{
+	struct __kernel_rusage r;
+
+	if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
+	    who != RUSAGE_THREAD)
+		return -EINVAL;
+
+	getrusage(current, who, &r);
+	return put_rusage(&r, ru);
+}
+
+#ifndef CONFIG_64BIT
+SYSCALL_DEFINE2(getrusage_time64, int, who, struct __kernel_rusage __user *, ru)
+{
+	struct __kernel_rusage r;
+
 	if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
 	    who != RUSAGE_THREAD)
 		return -EINVAL;
@@ -1794,11 +1843,12 @@ SYSCALL_DEFINE2(getrusage, int, who, struct rusage __user *, ru)
 	getrusage(current, who, &r);
 	return copy_to_user(ru, &r, sizeof(r)) ? -EFAULT : 0;
 }
+#endif
 
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE2(getrusage, int, who, struct compat_rusage __user *, ru)
 {
-	struct rusage r;
+	struct __kernel_rusage r;
 
 	if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
 	    who != RUSAGE_THREAD)
@@ -1807,6 +1857,18 @@ COMPAT_SYSCALL_DEFINE2(getrusage, int, who, struct compat_rusage __user *, ru)
 	getrusage(current, who, &r);
 	return put_compat_rusage(&r, ru);
 }
+
+COMPAT_SYSCALL_DEFINE2(getrusage_time64, int, who, struct compat_rusage_time64 __user *, ru)
+{
+	struct __kernel_rusage r;
+
+	if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
+	    who != RUSAGE_THREAD)
+		return -EINVAL;
+
+	getrusage(current, who, &r);
+	return put_compat_rusage_time64(&r, ru);
+}
 #endif
 
 SYSCALL_DEFINE1(umask, int, mask)
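
The getrusage() hunk in kernel/sys.c of this patch splits a nanosecond
count into seconds and microseconds by way of ns_to_timespec64(). The
same arithmetic can be sketched in plain user-space C as follows. The
names ns_to_rusage_timeval and struct rusage_timeval64 are hypothetical,
and this is an illustration of the calculation, not the kernel helper
itself.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_USEC 1000ULL

/* Hypothetical stand-in for a timeval with two 64-bit members. */
struct rusage_timeval64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

/*
 * Split an unsigned nanosecond count into whole seconds and the
 * remaining microseconds; sub-microsecond precision is truncated.
 */
struct rusage_timeval64 ns_to_rusage_timeval(uint64_t ns)
{
	struct rusage_timeval64 tv;

	tv.tv_sec = (int64_t)(ns / NSEC_PER_SEC);
	tv.tv_usec = (int64_t)((ns % NSEC_PER_SEC) / NSEC_PER_USEC);

	/* The microsecond part always stays below one second. */
	assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
	return tv;
}
```

With 64-bit seconds, a CPU time of 2^31 seconds or more is represented
without truncation, which is the range extension the commit message
describes.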