From patchwork Fri Jan 31 10:13:15 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 23945
From: Ard Biesheuvel
To: will.deacon@arm.com, catalin.marinas@arm.com, linux-arm-kernel@lists.infradead.org
Subject: [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
Date: Fri, 31 Jan 2014 11:13:15 +0100
Message-Id: <1391163196-27619-1-git-send-email-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 1.8.3.2
Cc: Ard Biesheuvel

If a task gets scheduled out and back in again and nothing has touched
its FPSIMD state in the meantime, there is really no reason to reload
it from memory. Similarly, repeated calls to kernel_neon_begin() and
kernel_neon_end() currently save and restore the FPSIMD state every
time.

This patch defers the FPSIMD state restore to the last possible moment,
i.e., right before the task re-enters userland. If a task does not enter
userland at all (for any reason), the existing FPSIMD state is preserved
and may be reused by the owning task if it gets scheduled in again on the
same CPU.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/include/asm/fpsimd.h      |  3 ++
 arch/arm64/include/asm/thread_info.h |  4 +-
 arch/arm64/kernel/entry.S            |  2 +-
 arch/arm64/kernel/fpsimd.c           | 79 +++++++++++++++++++++++++++++++-----
 arch/arm64/kernel/process.c          |  3 +-
 arch/arm64/kernel/signal.c           |  3 ++
 6 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index c43b4ac13008..609bc44ceb8d 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -37,6 +37,8 @@ struct fpsimd_state {
 			u32 fpcr;
 		};
 	};
+	/* the id of the last cpu to have restored this state */
+	unsigned int last_cpu;
 };

 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
@@ -57,6 +59,7 @@ extern void fpsimd_load_state(struct fpsimd_state *state);

 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);
+extern void fpsimd_reload_fpstate(void);

 #endif

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 720e70b66ffd..4a1ca1cfb2f8 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_SIGPENDING		0
 #define TIF_NEED_RESCHED	1
 #define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
+#define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_SYSCALL_TRACE	8
 #define TIF_POLLING_NRFLAG	16
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
@@ -112,10 +113,11 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
+#define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
 #define _TIF_32BIT		(1 << TIF_32BIT)

 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
-				 _TIF_NOTIFY_RESUME)
+				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE)

 #endif /* __KERNEL__ */
 #endif /* __ASM_THREAD_INFO_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 39ac630d83de..80464e2fb1a5 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -576,7 +576,7 @@ fast_work_pending:
 	str	x0, [sp, #S_X0]			// returned x0
 work_pending:
 	tbnz	x1, #TIF_NEED_RESCHED, work_resched
-	/* TIF_SIGPENDING or TIF_NOTIFY_RESUME case */
+	/* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
 	ldr	x2, [sp, #S_PSTATE]
 	mov	x0, sp				// 'regs'
 	tst	x2, #PSR_MODE_MASK		// user mode regs?
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 4aef42a04bdc..226a495e019c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -35,6 +35,23 @@
 #define FPEXC_IDF	(1 << 7)

 /*
+ * In order to reduce the number of times the fpsimd state is needlessly saved
+ * and restored, keep track here of which task's userland owns the current state
+ * of the FPSIMD register file.
+ *
+ * This percpu variable points to the fpsimd_state.last_cpu field of the task
+ * whose FPSIMD state was most recently loaded onto this cpu. The last_cpu field
+ * itself contains the id of the cpu onto which the task's FPSIMD state was
+ * loaded most recently. So, to decide whether we can skip reloading the FPSIMD
+ * state, we need to check
+ * (a) whether this task was the last one to have its FPSIMD state loaded onto
+ *     this cpu
+ * (b) whether this task may have manipulated its FPSIMD state on another cpu in
+ *     the meantime
+ */
+static DEFINE_PER_CPU(unsigned int *, fpsimd_last_task);
+
+/*
  * Trapped FP/ASIMD access.
  */
 void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs)
@@ -72,18 +89,56 @@ void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)

 void fpsimd_thread_switch(struct task_struct *next)
 {
-	/* check if not kernel threads */
-	if (current->mm)
+	/*
+	 * The thread flag TIF_FOREIGN_FPSTATE conveys that the userland FPSIMD
+	 * state belonging to the current task is not present in the registers
+	 * but has (already) been saved to memory in order for the kernel to be
+	 * able to go off and use the registers for something else. Therefore,
+	 * we must not (re)save the register contents if this flag is set.
+	 */
+	if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE))
 		fpsimd_save_state(&current->thread.fpsimd_state);
-	if (next->mm)
-		fpsimd_load_state(&next->thread.fpsimd_state);
+
+	if (next->mm) {
+		/*
+		 * If we are switching to a task whose most recent userland NEON
+		 * contents are already in the registers of *this* cpu, we can
+		 * skip loading the state from memory. Otherwise, set the
+		 * TIF_FOREIGN_FPSTATE flag so the state will be loaded upon the
+		 * next entry of userland.
+		 */
+		struct fpsimd_state *st = &next->thread.fpsimd_state;
+
+		if (__get_cpu_var(fpsimd_last_task) == &st->last_cpu
+		    && st->last_cpu == smp_processor_id())
+			clear_ti_thread_flag(task_thread_info(next),
+					     TIF_FOREIGN_FPSTATE);
+		else
+			set_ti_thread_flag(task_thread_info(next),
+					   TIF_FOREIGN_FPSTATE);
+	}
 }

 void fpsimd_flush_thread(void)
 {
-	preempt_disable();
 	memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
-	fpsimd_load_state(&current->thread.fpsimd_state);
+	set_thread_flag(TIF_FOREIGN_FPSTATE);
+}
+
+void fpsimd_reload_fpstate(void)
+{
+	preempt_disable();
+	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
+		/*
+		 * We are entering userland and the userland context is not yet
+		 * present in the registers.
+		 */
+		struct fpsimd_state *st = &current->thread.fpsimd_state;
+
+		fpsimd_load_state(st);
+		__get_cpu_var(fpsimd_last_task) = &st->last_cpu;
+		st->last_cpu = smp_processor_id();
+	}
 	preempt_enable();
 }

@@ -98,16 +153,20 @@ void kernel_neon_begin(void)
 	BUG_ON(in_interrupt());
 	preempt_disable();

-	if (current->mm)
+	/*
+	 * Save the userland FPSIMD state if we have one and if we haven't done
+	 * so already. Clear fpsimd_last_task to indicate that there is no
+	 * longer userland context in the registers.
+	 */
+	if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
 		fpsimd_save_state(&current->thread.fpsimd_state);
+	__get_cpu_var(fpsimd_last_task) = NULL;
+
 }
 EXPORT_SYMBOL(kernel_neon_begin);

 void kernel_neon_end(void)
 {
-	if (current->mm)
-		fpsimd_load_state(&current->thread.fpsimd_state);
-
 	preempt_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 248a15db37f2..274316df860f 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -205,7 +205,8 @@ void release_thread(struct task_struct *dead_task)

 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
-	fpsimd_save_state(&current->thread.fpsimd_state);
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
+		fpsimd_save_state(&current->thread.fpsimd_state);
 	*dst = *src;
 	return 0;
 }
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 890a591f75dd..0a9eccf4fc0f 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -416,4 +416,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
 	}
+
+	if (thread_flags & _TIF_FOREIGN_FPSTATE)
+		fpsimd_reload_fpstate();
 }
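
For readers who want to trace the bookkeeping outside the kernel, the following is a minimal userspace sketch of the ownership check described in the patch. It is not kernel code and makes several assumptions: struct model_task, the model_* functions, the last_task[] array and NR_CPUS are hypothetical stand-ins for thread.fpsimd_state, the TIF_FOREIGN_FPSTATE flag and the per-cpu fpsimd_last_task pointer, and "loading" or "saving" registers is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2	/* assumption: a tiny fixed cpu count for the model */

/* Hypothetical stand-in for a task's fpsimd_state plus its thread flag. */
struct model_task {
	unsigned int last_cpu;	/* cpu that most recently loaded this state */
	bool foreign_fpstate;	/* models TIF_FOREIGN_FPSTATE */
	unsigned int loads;	/* simulated register reloads */
	unsigned int saves;	/* simulated register saves */
};

/* Models the per-cpu fpsimd_last_task pointer. */
static unsigned int *last_task[NR_CPUS];

/* Switching away: save only if the registers still hold this task's state. */
static void model_switch_out(struct model_task *prev)
{
	if (!prev->foreign_fpstate)
		prev->saves++;
}

/* Switching in: checks (a) and (b) from the comment in fpsimd.c. */
static void model_switch_in(struct model_task *next, unsigned int cpu)
{
	bool still_live = last_task[cpu] == &next->last_cpu &&
			  next->last_cpu == cpu;

	next->foreign_fpstate = !still_live;
}

/* Returning to userland: reload only if the flag says we must. */
static void model_return_to_user(struct model_task *t, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	if (t->foreign_fpstate) {
		t->foreign_fpstate = false;
		t->loads++;
		last_task[cpu] = &t->last_cpu;
		t->last_cpu = cpu;
	}
}

/* Kernel-mode NEON use: save at most once, then disown the registers. */
static void model_kernel_neon_begin(struct model_task *t, unsigned int cpu)
{
	if (!t->foreign_fpstate) {
		t->foreign_fpstate = true;
		t->saves++;
	}
	last_task[cpu] = NULL;
}

/* Runs one scenario on cpu 0 and returns how often task a was reloaded. */
unsigned int model_demo(void)
{
	struct model_task a = { .foreign_fpstate = true };
	struct model_task b = { .foreign_fpstate = true };

	last_task[0] = last_task[1] = NULL;

	model_return_to_user(&a, 0);	/* first entry: reload */
	model_switch_out(&a);
	model_switch_in(&a, 0);		/* nothing else ran in between */
	model_return_to_user(&a, 0);	/* reload skipped */

	model_switch_out(&a);
	model_switch_in(&b, 0);
	model_return_to_user(&b, 0);	/* b now owns cpu 0's registers */
	model_switch_out(&b);
	model_switch_in(&a, 0);
	model_return_to_user(&a, 0);	/* reload needed again */

	model_kernel_neon_begin(&a, 0);	/* saves a's state once */
	model_kernel_neon_begin(&a, 0);	/* already saved: no second save */
	model_return_to_user(&a, 0);	/* reload after kernel NEON use */

	return a.loads;
}
```

In this model, calling model_demo() from a small test program shows that task a is reloaded only when another owner or kernel-mode NEON use has intervened, and that back-to-back model_kernel_neon_begin() calls do not save twice. The real patch additionally relies on preemption being disabled around these steps, which the sketch does not model.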