From patchwork Fri Jan 27 10:34:55 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= X-Patchwork-Id: 92594 Delivered-To: patch@linaro.org Received: by 10.182.3.34 with SMTP id 2csp141490obz; Fri, 27 Jan 2017 02:47:12 -0800 (PST) X-Received: by 10.200.49.167 with SMTP id h36mr6708460qte.11.1485514032600; Fri, 27 Jan 2017 02:47:12 -0800 (PST) Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id y190si778762qke.241.2017.01.27.02.47.12 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 27 Jan 2017 02:47:12 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:44270 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cX43h-0001DA-TW for patch@linaro.org; Fri, 27 Jan 2017 05:47:09 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48050) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cX3sM-0007y4-Gj for qemu-devel@nongnu.org; Fri, 27 Jan 2017 05:35:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cX3sL-0002EG-5y for qemu-devel@nongnu.org; Fri, 27 Jan 2017 05:35:26 -0500 Received: from mail-wm0-x22b.google.com ([2a00:1450:400c:c09::22b]:36114) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1cX3sK-0002E3-TA for qemu-devel@nongnu.org; Fri, 27 Jan 2017 05:35:25 -0500 Received: by mail-wm0-x22b.google.com with SMTP id c85so110462826wmi.1 for ; Fri, 27 Jan 2017 02:35:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=erpZnd0pitnHmgzHfCuX3/9CIVLvrSFtg5nbi7i7aX8=; b=C+VcGO4GRH+8jmfwXDB02n0ENOjrgdCsDigED+LPu57h29MO3kv1IH0Hh3VFfn6yws WI7RVMoaOvScuqgv5wpE7laT3hFGoQu/stvHfvc75VzSFr+LnEI4XySkblXiv7qEM00J VWdr3HwxGNGQ6yA5aKqtEfvJHK4dpH+wRmSE4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=erpZnd0pitnHmgzHfCuX3/9CIVLvrSFtg5nbi7i7aX8=; b=eVwF6PnGqdPk6XgrXr6K2GTReKtRH3t0Q8eWQHHBnDHSjPecrrApOlDCKhR1g6rP3G 00B2BfR8/7XC9/UYKGs8IrjgsM7spJA02sbJqg0sffkoKtb1y6CBu38I1YI6t06MYbRv wVk0UwFXHJj1JxTkTBoe8zndgyBVcjvldUCaw6aJqa7ERV3kHjUt2V/rILMzVS8pAnK+ TsCfM6mR97/NxpC0cCbBqQW/0VoOhhgCbVW93ioETFBflj+W9FStnTSD8N8EpBH3vSbS xxgK4NtXsaDKf7G9rwNCGnImw0Dl3VJDoXJcNHTt7Pyk+KooAkiYVrczAAjtmJRebP/W XscQ== X-Gm-Message-State: AIkVDXLucNqHA5QHCnn9+lJEpsozFt9gGb7o7OlaTtTG3fXFo/XL+JL/J5gWYJcUED2RBQ8o X-Received: by 10.28.23.66 with SMTP id 63mr2550657wmx.46.1485513323769; Fri, 27 Jan 2017 02:35:23 -0800 (PST) Received: from zen.linaro.local ([81.128.185.34]) by smtp.gmail.com with ESMTPSA id r5sm3067553wme.23.2017.01.27.02.35.13 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 27 Jan 2017 02:35:17 -0800 (PST) Received: from zen.linaroharston (localhost [127.0.0.1]) by zen.linaro.local (Postfix) with ESMTP id ABF6B3E3767; Fri, 27 Jan 2017 10:35:06 +0000 (GMT) From: =?utf-8?q?Alex_Benn=C3=A9e?= To: Date: Fri, 27 Jan 2017 10:34:55 +0000 Message-Id: <20170127103505.18606-16-alex.bennee@linaro.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20170127103505.18606-1-alex.bennee@linaro.org> References: <20170127103505.18606-1-alex.bennee@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2a00:1450:400c:c09::22b Subject: [Qemu-devel] [PATCH v8 15/25] cputlb: introduce tlb_flush_* async work. X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Crosthwaite , "open list:Overall" , Paolo Bonzini , KONRAD Frederic , =?utf-8?q?Alex_Benn=C3=A9e?= , Richard Henderson Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" From: KONRAD Frederic Some architectures allow to flush the tlb of other VCPUs. This is not a problem when we have only one thread for all VCPUs but it definitely needs to be an asynchronous work when we are in true multithreaded work. We take the tb_lock() when doing this to avoid racing with other threads which may be invalidating TB's at the same time. The alternative would be to use proper atomic primitives to clear the tlb entries en-mass. This patch doesn't do anything to protect other cputlb function being called in MTTCG mode making cross vCPU changes. Signed-off-by: KONRAD Frederic [AJB: remove need for g_malloc on defer, make check fixes, tb_lock] Signed-off-by: Alex Bennée --- v8 - fix merge failure mentioning global flush v6 (base patches) - don't use cmpxchg_bool (we drop it later anyway) - use RUN_ON_CPU macros instead of inlines - bug out of tlb_flush if !tcg_enabled() (MacOSX make check failure) v5 (base patches) - take tb_lock() for memset - ensure tb_flush_page properly asyncs work for other vCPUs - use run_on_cpu_data v4 (base_patches) - brought forward from arm enabling series - restore pending_tlb_flush flag v1 - Remove tlb_flush_all just do the check in tlb_flush. - remove the need to g_malloc - tlb_flush calls direct if !cpu->created --- cputlb.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++-- include/exec/exec-all.h | 1 + include/qom/cpu.h | 6 +++++ 3 files changed, 71 insertions(+), 2 deletions(-) -- 2.11.0 diff --git a/cputlb.c b/cputlb.c index 94fa9977c5..5dfd3c3ba9 100644 --- a/cputlb.c +++ b/cputlb.c @@ -64,6 +64,10 @@ } \ } while (0) +/* run_on_cpu_data.target_ptr should always be big enough for a + * target_ulong even on 32 bit builds */ +QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data)); + /* statistics */ int tlb_flush_count; @@ -72,13 +76,22 @@ int tlb_flush_count; * flushing more entries than required is only an efficiency issue, * not a correctness issue. */ -void tlb_flush(CPUState *cpu) +static void tlb_flush_nocheck(CPUState *cpu) { CPUArchState *env = cpu->env_ptr; + /* The QOM tests will trigger tlb_flushes without setting up TCG + * so we bug out here in that case. + */ + if (!tcg_enabled()) { + return; + } + assert_cpu_is_self(cpu); tlb_debug("(count: %d)\n", tlb_flush_count++); + tb_lock(); + memset(env->tlb_table, -1, sizeof(env->tlb_table)); memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table)); memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache)); @@ -86,6 +99,27 @@ void tlb_flush(CPUState *cpu) env->vtlb_index = 0; env->tlb_flush_addr = -1; env->tlb_flush_mask = 0; + + tb_unlock(); + + atomic_mb_set(&cpu->pending_tlb_flush, false); +} + +static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data) +{ + tlb_flush_nocheck(cpu); +} + +void tlb_flush(CPUState *cpu) +{ + if (cpu->created && !qemu_cpu_is_self(cpu)) { + if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == true) { + async_run_on_cpu(cpu, tlb_flush_global_async_work, + RUN_ON_CPU_NULL); + } + } else { + tlb_flush_nocheck(cpu); + } } static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp) @@ -95,6 +129,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp) assert_cpu_is_self(cpu); tlb_debug("start\n"); + tb_lock(); + for (;;) { int mmu_idx = va_arg(argp, int); @@ -109,6 +145,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp) } memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache)); + + tb_unlock(); } void tlb_flush_by_mmuidx(CPUState *cpu, ...) @@ -131,13 +169,15 @@ static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr) } } -void tlb_flush_page(CPUState *cpu, target_ulong addr) +static void tlb_flush_page_async_work(CPUState *cpu, run_on_cpu_data data) { CPUArchState *env = cpu->env_ptr; + target_ulong addr = (target_ulong) data.target_ptr; int i; int mmu_idx; assert_cpu_is_self(cpu); + tlb_debug("page :" TARGET_FMT_lx "\n", addr); /* Check if we need to flush due to large pages. */ @@ -167,6 +207,18 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr) tb_flush_jmp_cache(cpu, addr); } +void tlb_flush_page(CPUState *cpu, target_ulong addr) +{ + tlb_debug("page :" TARGET_FMT_lx "\n", addr); + + if (!qemu_cpu_is_self(cpu)) { + async_run_on_cpu(cpu, tlb_flush_page_async_work, + RUN_ON_CPU_TARGET_PTR(addr)); + } else { + tlb_flush_page_async_work(cpu, RUN_ON_CPU_TARGET_PTR(addr)); + } +} + void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...) { CPUArchState *env = cpu->env_ptr; @@ -213,6 +265,16 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...) tb_flush_jmp_cache(cpu, addr); } +void tlb_flush_page_all(target_ulong addr) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + async_run_on_cpu(cpu, tlb_flush_page_async_work, + RUN_ON_CPU_TARGET_PTR(addr)); + } +} + /* update the TLBs so that writes to code in the virtual page 'addr' can be detected */ void tlb_protect_code(ram_addr_t ram_addr) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index bd4622ac5d..e43cb68355 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -158,6 +158,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr, void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr); void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx, uintptr_t retaddr); +void tlb_flush_page_all(target_ulong addr); #else static inline void tlb_flush_page(CPUState *cpu, target_ulong addr) { diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 1a06ae5938..7f1d6a81a0 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -398,6 +398,12 @@ struct CPUState { bool hax_vcpu_dirty; struct hax_vcpu_state *hax_vcpu; + + /* The pending_tlb_flush flag is set and cleared atomically to + * avoid potential races. The aim of the flag is to avoid + * unnecessary flushes. + */ + bool pending_tlb_flush; }; QTAILQ_HEAD(CPUTailQ, CPUState);