From patchwork Wed May 17 14:52:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 100002
From: Alex Bennée
To: pbonzini@redhat.com, stefanha@redhat.com
Date: Wed, 17 May 2017 15:52:58 +0100
Message-Id: <20170517145259.28979-6-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170517145259.28979-1-alex.bennee@linaro.org>
References: <20170517145259.28979-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v2 5/6] cputlb: add trace events
Cc: Richard Henderson, cota@braap.org, Alex Bennée, qemu-devel@nongnu.org, Peter Crosthwaite

The costs of the various SoftMMU TLB operations range widely, from
deferring work for the currently running vCPU to bringing the whole
emulated machine to a stop for synchronised updates, so simple counters
are of limited use. Instead we log events via the trace infrastructure
and can then post-process the data in a range of ways.

  tlb_flush_self - the vCPU flushed its own TLB
  tlb_flush_async_schedule - work was scheduled and the vCPU kicked
  tlb_flush_synced_schedule - exclusive work was scheduled on a vCPU
  tlb_flush_work - scheduled work was done

We can use the difference between the work being scheduled and
tlb_flush_work to calculate the latency introduced.

Signed-off-by: Alex Bennée
---
 cputlb.c     | 34 +++++++++++++++++++++++++++++++++-
 trace-events |  7 +++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

--
2.11.0

Reviewed-by: Philippe Mathieu-Daudé

diff --git a/cputlb.c b/cputlb.c
index d1859c3f37..b2490863e4 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -127,6 +127,7 @@ static void tlb_flush_nocheck(CPUState *cpu)
 
 static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
 {
+    trace_tlb_flush_work(__func__, cpu->cpu_index);
     tlb_flush_nocheck(cpu);
 }
 
@@ -135,17 +136,22 @@ void tlb_flush(CPUState *cpu)
     if (cpu->created && !qemu_cpu_is_self(cpu)) {
         if (atomic_mb_read(&cpu->pending_tlb_flush) != ALL_MMUIDX_BITS) {
             atomic_mb_set(&cpu->pending_tlb_flush, ALL_MMUIDX_BITS);
+            trace_tlb_flush_async_schedule(__func__, current_cpu ?
+                                           current_cpu->cpu_index :
+                                           cpu->cpu_index, cpu->cpu_index);
             async_run_on_cpu(cpu, tlb_flush_global_async_work,
                              RUN_ON_CPU_NULL);
         }
     } else {
-        tlb_flush_nocheck(cpu);
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
+        tlb_flush_global_async_work(cpu, RUN_ON_CPU_NULL);
     }
 }
 
 void tlb_flush_all_cpus(CPUState *src_cpu)
 {
     const run_on_cpu_func fn = tlb_flush_global_async_work;
+    trace_tlb_flush_async_schedule(__func__, src_cpu->cpu_index, -1);
     flush_all_helper(src_cpu, fn, RUN_ON_CPU_NULL);
     fn(src_cpu, RUN_ON_CPU_NULL);
 }
@@ -153,6 +159,7 @@ void tlb_flush_all_cpus(CPUState *src_cpu)
 void tlb_flush_all_cpus_synced(CPUState *src_cpu)
 {
     const run_on_cpu_func fn = tlb_flush_global_async_work;
+    trace_tlb_flush_synced_schedule(__func__, src_cpu->cpu_index, -1);
     flush_all_helper(src_cpu, fn, RUN_ON_CPU_NULL);
     async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_NULL);
 }
@@ -163,6 +170,8 @@ static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
     unsigned long mmu_idx_bitmask = data.host_int;
     int mmu_idx;
 
+    trace_tlb_flush_work(__func__, cpu->cpu_index);
+
     assert_cpu_is_self(cpu);
 
     tb_lock();
@@ -196,12 +205,16 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 
         if (pending_flushes) {
             tlb_debug("reduced mmu_idx: 0x%" PRIx16 "\n", pending_flushes);
 
+            trace_tlb_flush_async_schedule(__func__,
+                                           current_cpu->cpu_index,
+                                           cpu->cpu_index);
             atomic_or(&cpu->pending_tlb_flush, pending_flushes);
             async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
                              RUN_ON_CPU_HOST_INT(pending_flushes));
         }
     } else {
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
         tlb_flush_by_mmuidx_async_work(cpu,
                                        RUN_ON_CPU_HOST_INT(idxmap));
     }
@@ -212,6 +225,7 @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *src_cpu, uint16_t idxmap)
     const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
 
     tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+    trace_tlb_flush_async_schedule(__func__, src_cpu->cpu_index, -1);
 
     flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
     fn(src_cpu, RUN_ON_CPU_HOST_INT(idxmap));
@@ -223,6 +237,7 @@ void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
     const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
 
     tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+    trace_tlb_flush_synced_schedule(__func__, src_cpu->cpu_index, -1);
 
     flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
     async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
@@ -252,6 +267,7 @@ static void tlb_flush_page_async_work(CPUState *cpu, run_on_cpu_data data)
     assert_cpu_is_self(cpu);
 
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
+    trace_tlb_flush_work(__func__, cpu->cpu_index);
 
     /* Check if we need to flush due to large pages.  */
     if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
@@ -285,9 +301,12 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     if (!qemu_cpu_is_self(cpu)) {
+        trace_tlb_flush_async_schedule(__func__, current_cpu->cpu_index,
+                                       cpu->cpu_index);
         async_run_on_cpu(cpu, tlb_flush_page_async_work,
                          RUN_ON_CPU_TARGET_PTR(addr));
     } else {
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
         tlb_flush_page_async_work(cpu, RUN_ON_CPU_TARGET_PTR(addr));
     }
 }
@@ -312,6 +331,7 @@ static void tlb_flush_page_by_mmuidx_async_work(CPUState *cpu,
 
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx:0x%lx\n",
               page, addr, mmu_idx_bitmap);
+    trace_tlb_flush_work(__func__, cpu->cpu_index);
 
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         if (test_bit(mmu_idx, &mmu_idx_bitmap)) {
@@ -336,6 +356,7 @@ static void tlb_check_page_and_flush_by_mmuidx_async_work(CPUState *cpu,
     unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
 
     tlb_debug("addr:"TARGET_FMT_lx" mmu_idx: %04lx\n", addr, mmu_idx_bitmap);
+    trace_tlb_flush_work(__func__, cpu->cpu_index);
 
     /* Check if we need to flush due to large pages.  */
     if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
@@ -343,9 +364,11 @@ static void tlb_check_page_and_flush_by_mmuidx_async_work(CPUState *cpu,
                   TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
                   env->tlb_flush_addr, env->tlb_flush_mask);
 
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
         tlb_flush_by_mmuidx_async_work(cpu,
                                        RUN_ON_CPU_HOST_INT(mmu_idx_bitmap));
     } else {
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
        tlb_flush_page_by_mmuidx_async_work(cpu, data);
     }
 }
@@ -361,9 +384,12 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
     addr_and_mmu_idx |= idxmap;
 
     if (!qemu_cpu_is_self(cpu)) {
+        trace_tlb_flush_async_schedule(__func__, current_cpu->cpu_index,
+                                       cpu->cpu_index);
         async_run_on_cpu(cpu, tlb_check_page_and_flush_by_mmuidx_async_work,
                          RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
     } else {
+        trace_tlb_flush_self(__func__, cpu->cpu_index);
         tlb_check_page_and_flush_by_mmuidx_async_work(
             cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
     }
@@ -376,6 +402,7 @@ void tlb_flush_page_by_mmuidx_all_cpus(CPUState *src_cpu, target_ulong addr,
     target_ulong addr_and_mmu_idx;
 
     tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+    trace_tlb_flush_async_schedule(__func__, src_cpu->cpu_index, -1);
 
     /* This should already be page aligned */
     addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
@@ -393,6 +420,7 @@ void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
     target_ulong addr_and_mmu_idx;
 
     tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+    trace_tlb_flush_synced_schedule(__func__, src_cpu->cpu_index, -1);
 
     /* This should already be page aligned */
     addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
@@ -406,6 +434,8 @@ void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
 {
     const run_on_cpu_func fn = tlb_flush_page_async_work;
 
+    trace_tlb_flush_async_schedule(__func__, src->cpu_index, -1);
+
     flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
     fn(src, RUN_ON_CPU_TARGET_PTR(addr));
 }
@@ -415,6 +445,8 @@ void tlb_flush_page_all_cpus_synced(CPUState *src,
 {
     const run_on_cpu_func fn = tlb_flush_page_async_work;
 
+    trace_tlb_flush_synced_schedule(__func__, src->cpu_index, -1);
+
     flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
     async_safe_run_on_cpu(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
 }
diff --git a/trace-events b/trace-events
index e582d6315d..500f84b8c4 100644
--- a/trace-events
+++ b/trace-events
@@ -78,6 +78,13 @@ disable exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
 disable exec_tb_nocache(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
 disable exec_tb_exit(void *last_tb, unsigned int flags) "tb:%p flags=%x"
 
+# SoftMMU TLB flushes
+#  - setting to=-1 indicates work scheduled on all vCPUs
+tlb_flush_self(const char *fn, int vcpu) "%s: cpu %d"
+tlb_flush_async_schedule(const char *fn, int from, int to) "%s: from_cpu=%d to_cpu=%d"
+tlb_flush_synced_schedule(const char *fn, int from, int to) "%s: from_cpu=%d to_cpu=%d"
+tlb_flush_work(const char *fn, int vcpu) "%s: cpu %d"
+
 # translate-all.c
 translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%"PRIxPTR", tb_code:%p"
 
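
The commit message notes that the gap between a *_schedule event and the
matching tlb_flush_work gives the latency introduced. Below is a rough
post-processing sketch in Python. The log line format it parses
(timestamp, event name, then the arguments as formatted by the trace
strings above) is an assumption for illustration, not the exact output of
any particular QEMU trace backend, and broadcast schedules (to_cpu=-1)
are skipped for brevity.

```python
import re

# Events that mark work being scheduled on a target vCPU.
SCHEDULE_EVENTS = {"tlb_flush_async_schedule", "tlb_flush_synced_schedule"}

def flush_latencies(lines):
    """Pair each schedule event with the next tlb_flush_work on the
    target vCPU and yield (cpu, latency) tuples."""
    pending = {}  # to_cpu -> timestamps of unmatched schedule events
    for line in lines:
        # Assumed format: "<timestamp> <event> <fn>: <formatted args>"
        m = re.match(r"(\d+) (\w+) \S+: (.*)", line)
        if not m:
            continue
        ts, event, args = int(m.group(1)), m.group(2), m.group(3)
        if event in SCHEDULE_EVENTS:
            to_cpu = int(re.search(r"to_cpu=(-?\d+)", args).group(1))
            if to_cpu >= 0:  # skip broadcast (-1) for this sketch
                pending.setdefault(to_cpu, []).append(ts)
        elif event == "tlb_flush_work":
            cpu = int(re.search(r"cpu (\d+)", args).group(1))
            if pending.get(cpu):
                # Match oldest outstanding schedule on this vCPU (FIFO).
                yield cpu, ts - pending[cpu].pop(0)

# Hypothetical two-line trace: a flush scheduled at t=100, done at t=160.
log = [
    "100 tlb_flush_async_schedule tlb_flush: from_cpu=0 to_cpu=1",
    "160 tlb_flush_work tlb_flush_global_async_work: cpu 1",
]
print(list(flush_latencies(log)))  # [(1, 60)]
```

The FIFO pairing is a simplification; a real analysis would also have to
account for flushes that coalesce while one is already pending.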