From patchwork Mon Jan 28 15:59:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 156774 Delivered-To: patch@linaro.org Received: by 2002:ac9:7558:0:0:0:0:0 with SMTP id r24csp3533954oct; Mon, 28 Jan 2019 08:13:09 -0800 (PST) X-Google-Smtp-Source: ALg8bN51i8hXHq1sZf4oriImLIV0uX1aHn3WQ7O14jEyaa5lkBtpW3iiiohYH3l9OS1OWZwikclw X-Received: by 2002:a5d:4fcb:: with SMTP id h11mr22846264wrw.139.1548691989263; Mon, 28 Jan 2019 08:13:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548691989; cv=none; d=google.com; s=arc-20160816; b=mEYBmCbe5K2KDWVUwK/KXwtz9sWwnhLWkbAV6mHohRs/+NebuEaddDCXEgwJ+PjKZx mqD395dsgHdyqD7CI+Ei4P5HXuop+2Qvrn8jSJDED4JPLz/68Igu+qs5e2DihWM5c0Y/ 5wQdyje7PlM9MOLvbDJoau5kLTRUzk6Pmnqbi8ka9YNflvaZlFBIC64U+rBRCEqdq+a7 1m4v1g+TpwT4lX6tHQ5Qk1/ORqmh9I4AD1An12NsNZOzNWMUhf/oXwWE7R3BJRxPUIrU YwY3AJ4RjCP5uCiDfboVqWJaypwHQtjIx0mFNAzbAc0v+2+hIZ4x+inMD7ZAJJAnDz8Z 2/KA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=C2BrQaGdJrcnzidVlPeIc02gojy3eZ+R8AGWPoviy0s=; b=R1squmBoPFXoCAOV7rQaMGweqYG2oF2Slin08Z8OLggy8QofP9XmDJm1BoMzWnct39 S3GRE4Z84bQUDw1Rw/wfFCm21gYQYbvZGfxnmykHc/Yb9PHd7CYBCdT+jIyacV740ZbV eOcuS3faojyD+RRaOeNp2MHanITjh9PbYniz4I90C39XgEaezdVYsD0HAGXb/ABIjChq vvh514qImP3ieHj8uNVChB4vzJDvZn4WsahKd9VE7wiZDF5PShCreYFcle+d/2H3yrg2 3SojvIm77L0NYJGjr+CO0+2fHefC/9F6RBiOIOiaotIYwgPyj0aifDzjiAYv9oki2sPJ Gcjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=kamK8Rk4; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id l22si46010446wmc.35.2019.01.28.08.13.08 for (version=TLS1 cipher=AES128-SHA bits=128/128); Mon, 28 Jan 2019 08:13:09 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=kamK8Rk4; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:34345 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1go9X2-0006uL-41 for patch@linaro.org; Mon, 28 Jan 2019 11:13:08 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33613) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1go9K6-0005DR-G7 for qemu-devel@nongnu.org; Mon, 28 Jan 2019 10:59:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1go9K4-0007vb-Km for qemu-devel@nongnu.org; Mon, 28 Jan 2019 10:59:46 -0500 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]:35950) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1go9K4-0007qY-8i for qemu-devel@nongnu.org; Mon, 28 Jan 2019 10:59:44 -0500 Received: by mail-pf1-x42a.google.com with SMTP id b85so8190178pfc.3 for ; Mon, 28 Jan 2019 07:59:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=C2BrQaGdJrcnzidVlPeIc02gojy3eZ+R8AGWPoviy0s=; b=kamK8Rk4a+jCO6iqsF2h6hjY6yq8DGKjuc70zmq6769+hxP3u+XjRPusFjiEQHLVBS Us7YcQlkNEcwCjS2deCjFA4gPotVEU+SR7YeHdYVhRKf5HOTPo55h6Z4Qk146NsrUsqk 2JLw9NKyVWOCIMCZNtlbLsxJW8sfqtCecdGpo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=C2BrQaGdJrcnzidVlPeIc02gojy3eZ+R8AGWPoviy0s=; b=h4hPOiDyvzAG8ZsQz0AeSxbnwXt7Vb/iyeHBNuAI0WdV8X6Pmaa2fni+MvXvaRS4jm li5AfoGE2BvhiRjjy9nB2AruLi8h0S3/TDzWi6H+v5LZxubBZVVn6JajhOkxiN8RlgZ8 OryGBKkvzwpDul446FCKY41g9Ea+jzbyVSWnGGgTw8j005NGaitv02NxRE6Lu2ZQ5/mv xQD3NsfkPGYaSglY3KUFm9yoXg7o3hrRRuj0jvKJEGltrqdBoTQhbe0tmFJfEpigPpjI OYxjy3PFqR+t2pKsa4FKUjK8cdyKiuay6HwjqF5Kg5+jmPm3Jj3ZmnNiHdpm6wggyuhU uGlw== X-Gm-Message-State: AJcUukfjQHf3oy0eSvnjlRR+U82aFzQCRMzNtMQnmc1Uh5/r4TCiDLEo 1qyl5r90pv9tLNelsiRFxM8fDuL0Gcs= X-Received: by 2002:a63:e001:: with SMTP id e1mr14102341pgh.39.1548691177092; Mon, 28 Jan 2019 07:59:37 -0800 (PST) Received: from cloudburst.twiddle.net (50-233-235-3-static.hfc.comcastbusiness.net. [50.233.235.3]) by smtp.gmail.com with ESMTPSA id p2sm45518687pfp.125.2019.01.28.07.59.35 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 28 Jan 2019 07:59:36 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 28 Jan 2019 07:59:03 -0800 Message-Id: <20190128155907.20607-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190128155907.20607-1-richard.henderson@linaro.org> References: <20190128155907.20607-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::42a Subject: [Qemu-devel] [PULL 19/23] tcg/arm: enable dynamic TLB sizing X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/arm/tcg-target.h | 2 +- tcg/arm/tcg-target.inc.c | 143 +++++++++++++++++++-------------------- 2 files changed, 72 insertions(+), 73 deletions(-) -- 2.17.2 diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h index c5a7064bdc..679aaf097e 100644 --- a/tcg/arm/tcg-target.h +++ b/tcg/arm/tcg-target.h @@ -60,7 +60,7 @@ extern int arm_arch; #undef TCG_TARGET_STACK_GROWSUP #define TCG_TARGET_INSN_UNIT_SIZE 4 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16 -#define TCG_TARGET_IMPLEMENTS_DYN_TLB 0 +#define TCG_TARGET_IMPLEMENTS_DYN_TLB 1 typedef enum { TCG_REG_R0 = 0, diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c index 49f57d655e..2245a8aeb9 100644 --- a/tcg/arm/tcg-target.inc.c +++ b/tcg/arm/tcg-target.inc.c @@ -500,6 +500,12 @@ static inline void tcg_out_ldrd_r(TCGContext *s, int cond, TCGReg rt, tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 0); } +static inline void tcg_out_ldrd_rwb(TCGContext *s, int cond, TCGReg rt, + TCGReg rn, TCGReg rm) +{ + tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1); +} + static inline void tcg_out_strd_8(TCGContext *s, int cond, TCGReg rt, TCGReg rn, int imm8) { @@ -1229,8 +1235,13 @@ static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg, #define TLB_SHIFT (CPU_TLB_ENTRY_BITS + CPU_TLB_BITS) -/* We're expecting to use an 8-bit immediate and to mask. */ -QEMU_BUILD_BUG_ON(CPU_TLB_BITS > 8); +/* We expect tlb_mask to be before tlb_table. */ +QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table) < + offsetof(CPUArchState, tlb_mask)); + +/* We expect to use a 20-bit unsigned offset from ENV. */ +QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table[NB_MMU_MODES - 1]) + > 0xfffff); /* Load and compare a TLB entry, leaving the flags set. Returns the register containing the addend of the tlb entry. Clobbers R0, R1, R2, TMP. */ @@ -1238,84 +1249,72 @@ QEMU_BUILD_BUG_ON(CPU_TLB_BITS > 8); static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi, TCGMemOp opc, int mem_index, bool is_load) { - TCGReg base = TCG_AREG0; - int cmp_off = - (is_load - ? offsetof(CPUArchState, tlb_table[mem_index][0].addr_read) - : offsetof(CPUArchState, tlb_table[mem_index][0].addr_write)); - int add_off = offsetof(CPUArchState, tlb_table[mem_index][0].addend); - int mask_off; + int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read) + : offsetof(CPUTLBEntry, addr_write)); + int mask_off = offsetof(CPUArchState, tlb_mask[mem_index]); + int table_off = offsetof(CPUArchState, tlb_table[mem_index]); + TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0; unsigned s_bits = opc & MO_SIZE; unsigned a_bits = get_alignment_bits(opc); - /* V7 generates the following: - * ubfx r0, addrlo, #TARGET_PAGE_BITS, #CPU_TLB_BITS - * add r2, env, #high - * add r2, r2, r0, lsl #CPU_TLB_ENTRY_BITS - * ldr r0, [r2, #cmp] - * ldr r2, [r2, #add] - * movw tmp, #page_align_mask - * bic tmp, addrlo, tmp - * cmp r0, tmp - * - * Otherwise we generate: - * shr tmp, addrlo, #TARGET_PAGE_BITS - * add r2, env, #high - * and r0, tmp, #(CPU_TLB_SIZE - 1) - * add r2, r2, r0, lsl #CPU_TLB_ENTRY_BITS - * ldr r0, [r2, #cmp] - * ldr r2, [r2, #add] - * tst addrlo, #s_mask - * cmpeq r0, tmp, lsl #TARGET_PAGE_BITS - */ - if (use_armv7_instructions) { - tcg_out_extract(s, COND_AL, TCG_REG_R0, addrlo, - TARGET_PAGE_BITS, CPU_TLB_BITS); - } else { - tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, - 0, addrlo, SHIFT_IMM_LSR(TARGET_PAGE_BITS)); - } + if (table_off > 0xfff) { + int mask_hi = mask_off & ~0xfff; + int table_hi = table_off & ~0xfff; + int rot; - /* Add portions of the offset until the memory access is in range. - * If we plan on using ldrd, reduce to an 8-bit offset; otherwise - * we can use a 12-bit offset. */ - if (use_armv6_instructions && TARGET_LONG_BITS == 64) { - mask_off = 0xff; - } else { - mask_off = 0xfff; - } - while (cmp_off > mask_off) { - int shift = ctz32(cmp_off & ~mask_off) & ~1; - int rot = ((32 - shift) << 7) & 0xf00; - int addend = cmp_off & (0xff << shift); - tcg_out_dat_imm(s, COND_AL, ARITH_ADD, TCG_REG_R2, base, - rot | ((cmp_off >> shift) & 0xff)); - base = TCG_REG_R2; - add_off -= addend; - cmp_off -= addend; - } - - if (!use_armv7_instructions) { - tcg_out_dat_imm(s, COND_AL, ARITH_AND, - TCG_REG_R0, TCG_REG_TMP, CPU_TLB_SIZE - 1); - } - tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_R2, base, - TCG_REG_R0, SHIFT_IMM_LSL(CPU_TLB_ENTRY_BITS)); - - /* Load the tlb comparator. Use ldrd if needed and available, - but due to how the pointer needs setting up, ldm isn't useful. - Base arm5 doesn't have ldrd, but armv5te does. */ - if (use_armv6_instructions && TARGET_LONG_BITS == 64) { - tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_REG_R2, cmp_off); - } else { - tcg_out_ld32_12(s, COND_AL, TCG_REG_R0, TCG_REG_R2, cmp_off); - if (TARGET_LONG_BITS == 64) { - tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R2, cmp_off + 4); + table_base = TCG_REG_R2; + if (mask_hi == table_hi) { + mask_base = table_base; + } else if (mask_hi) { + mask_base = TCG_REG_TMP; + rot = encode_imm(mask_hi); + assert(rot >= 0); + tcg_out_dat_imm(s, COND_AL, ARITH_ADD, mask_base, TCG_AREG0, + rotl(mask_hi, rot) | (rot << 7)); } + rot = encode_imm(table_hi); + assert(rot >= 0); + tcg_out_dat_imm(s, COND_AL, ARITH_ADD, table_base, TCG_AREG0, + rotl(table_hi, rot) | (rot << 7)); + + mask_off -= mask_hi; + table_off -= table_hi; + } + + /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */ + tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP, mask_base, mask_off); + tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R2, table_base, table_off); + + /* Extract the tlb index from the address into TMP. */ + tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_TMP, TCG_REG_TMP, addrlo, + SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS)); + + /* + * Add the tlb_table pointer, creating the CPUTLBEntry address in R2. + * Load the tlb comparator into R0/R1 and the fast path addend into R2. + */ + if (cmp_off == 0) { + if (use_armv6_instructions && TARGET_LONG_BITS == 64) { + tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R0, TCG_REG_R2, TCG_REG_TMP); + } else { + tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R0, TCG_REG_R2, TCG_REG_TMP); + } + } else { + tcg_out_dat_reg(s, COND_AL, ARITH_ADD, + TCG_REG_R2, TCG_REG_R2, TCG_REG_TMP, 0); + if (use_armv6_instructions && TARGET_LONG_BITS == 64) { + tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_REG_R2, cmp_off); + } else { + tcg_out_ld32_12(s, COND_AL, TCG_REG_R0, TCG_REG_R2, cmp_off); + } + } + if (!use_armv6_instructions && TARGET_LONG_BITS == 64) { + tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R2, cmp_off + 4); } /* Load the tlb addend. */ - tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R2, add_off); + tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R2, + offsetof(CPUTLBEntry, addend)); /* Check alignment. We don't support inline unaligned acceses, but we can easily support overalignment checks. */