From patchwork Fri Nov 23 14:45:30 2018
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 151895
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Fri, 23 Nov 2018 15:45:30 +0100
Message-Id: <20181123144558.5048-10-richard.henderson@linaro.org>
In-Reply-To: <20181123144558.5048-1-richard.henderson@linaro.org>
References: <20181123144558.5048-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH for-4.0 v2 09/37] tcg/i386: Use TCG_TARGET_NEED_LDST_OOL_LABELS
Cc: Alistair.Francis@wdc.com

Move the entire memory operation out of line.

Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target.h     |   2 +-
 tcg/i386/tcg-target.inc.c | 391 ++++++++++++++++----------------------
 2 files changed, 162 insertions(+), 231 deletions(-)

-- 
2.17.2

Reviewed-by: Alex Bennée

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 2441658865..1b2d4e1b0d 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -220,7 +220,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr,
 #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
 
 #ifdef CONFIG_SOFTMMU
-#define TCG_TARGET_NEED_LDST_LABELS
+#define TCG_TARGET_NEED_LDST_OOL_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 50e5dc31b3..5c68cbd43d 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1643,7 +1643,7 @@ static void tcg_out_nopn(TCGContext *s, int n)
 }
 
 #if defined(CONFIG_SOFTMMU)
-#include "tcg-ldst.inc.c"
+#include "tcg-ldst-ool.inc.c"
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * int mmu_idx, uintptr_t ra)
  */
@@ -1656,6 +1656,14 @@ static void * const qemu_ld_helpers[16] = {
     [MO_BEUW] = helper_be_lduw_mmu,
     [MO_BEUL] = helper_be_ldul_mmu,
     [MO_BEQ]  = helper_be_ldq_mmu,
+
+    [MO_SB]   = helper_ret_ldsb_mmu,
+    [MO_LESW] = helper_le_ldsw_mmu,
+    [MO_BESW] = helper_be_ldsw_mmu,
+#if TCG_TARGET_REG_BITS == 64
+    [MO_LESL] = helper_le_ldsl_mmu,
+    [MO_BESL] = helper_be_ldsl_mmu,
+#endif
 };
 
 /* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
@@ -1765,18 +1773,18 @@ static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     }
 
     /* jne slow_path */
-    tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+    tcg_out_opc(s, OPC_JCC_short + JCC_JNE, 0, 0, 0);
     label_ptr[0] = s->code_ptr;
-    s->code_ptr += 4;
+    s->code_ptr += 1;
 
     if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
         /* cmp 4(r0), addrhi */
         tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, 4);
 
         /* jne slow_path */
-        tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+        tcg_out_opc(s, OPC_JCC_short + JCC_JNE, 0, 0, 0);
         label_ptr[1] = s->code_ptr;
-        s->code_ptr += 4;
+        s->code_ptr += 1;
     }
 
     /* TLB Hit.  */
@@ -1788,181 +1796,6 @@ static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     return base;
 }
 
-/*
- * Record the context of a call to the out of line helper code for the slow path
- * for a load or store, so that we can later generate the correct helper code
- */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
-                                TCGReg datalo, TCGReg datahi,
-                                TCGReg addrlo, TCGReg addrhi,
-                                tcg_insn_unit *raddr,
-                                tcg_insn_unit **label_ptr)
-{
-    TCGLabelQemuLdst *label = new_ldst_label(s);
-
-    label->is_ld = is_ld;
-    label->oi = oi;
-    label->datalo_reg = datalo;
-    label->datahi_reg = datahi;
-    label->addrlo_reg = addrlo;
-    label->addrhi_reg = addrhi;
-    label->raddr = raddr;
-    label->label_ptr[0] = label_ptr[0];
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-        label->label_ptr[1] = label_ptr[1];
-    }
-}
-
-/*
- * Generate code for the slow path for a load at the end of block
- */
-static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
-{
-    TCGMemOpIdx oi = l->oi;
-    TCGMemOp opc = get_memop(oi);
-    TCGReg data_reg;
-    tcg_insn_unit **label_ptr = &l->label_ptr[0];
-
-    /* resolve label address */
-    tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-        tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
-    }
-
-    if (TCG_TARGET_REG_BITS == 32) {
-        int ofs = 0;
-
-        tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        if (TARGET_LONG_BITS == 64) {
-            tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
-            ofs += 4;
-        }
-
-        tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
-    } else {
-        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-        /* The second argument is already loaded with addrlo.  */
-        tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
-        tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
-                     (uintptr_t)l->raddr);
-    }
-
-    tcg_out_call(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
-
-    data_reg = l->datalo_reg;
-    switch (opc & MO_SSIZE) {
-    case MO_SB:
-        tcg_out_ext8s(s, data_reg, TCG_REG_EAX, P_REXW);
-        break;
-    case MO_SW:
-        tcg_out_ext16s(s, data_reg, TCG_REG_EAX, P_REXW);
-        break;
-#if TCG_TARGET_REG_BITS == 64
-    case MO_SL:
-        tcg_out_ext32s(s, data_reg, TCG_REG_EAX);
-        break;
-#endif
-    case MO_UB:
-    case MO_UW:
-        /* Note that the helpers have zero-extended to tcg_target_long.  */
-    case MO_UL:
-        tcg_out_mov(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX);
-        break;
-    case MO_Q:
-        if (TCG_TARGET_REG_BITS == 64) {
-            tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX);
-        } else if (data_reg == TCG_REG_EDX) {
-            /* xchg %edx, %eax */
-            tcg_out_opc(s, OPC_XCHG_ax_r32 + TCG_REG_EDX, 0, 0, 0);
-            tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EAX);
-        } else {
-            tcg_out_mov(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX);
-            tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX);
-        }
-        break;
-    default:
-        tcg_abort();
-    }
-
-    /* Jump to the code corresponding to next IR of qemu_st */
-    tcg_out_jmp(s, l->raddr);
-}
-
-/*
- * Generate code for the slow path for a store at the end of block
- */
-static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
-{
-    TCGMemOpIdx oi = l->oi;
-    TCGMemOp opc = get_memop(oi);
-    TCGMemOp s_bits = opc & MO_SIZE;
-    tcg_insn_unit **label_ptr = &l->label_ptr[0];
-    TCGReg retaddr;
-
-    /* resolve label address */
-    tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-        tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
-    }
-
-    if (TCG_TARGET_REG_BITS == 32) {
-        int ofs = 0;
-
-        tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        if (TARGET_LONG_BITS == 64) {
-            tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
-            ofs += 4;
-        }
-
-        tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        if (s_bits == MO_64) {
-            tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs);
-            ofs += 4;
-        }
-
-        tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        retaddr = TCG_REG_EAX;
-        tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
-        tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
-    } else {
-        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-        /* The second argument is already loaded with addrlo.  */
-        tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
-                    tcg_target_call_iarg_regs[2], l->datalo_reg);
-        tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
-
-        if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
-            retaddr = tcg_target_call_iarg_regs[4];
-            tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
-        } else {
-            retaddr = TCG_REG_RAX;
-            tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
-            tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP,
-                       TCG_TARGET_CALL_STACK_OFFSET);
-        }
-    }
-
-    /* "Tail call" to the helper, with the return address back inline.  */
-    tcg_out_push(s, retaddr);
-    tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
-}
 #elif defined(__x86_64__) && defined(__linux__)
 # include <asm/prctl.h>
 # include <sys/prctl.h>
@@ -2091,7 +1924,6 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     TCGReg datahi __attribute__((unused)) = -1;
     TCGReg addrhi __attribute__((unused)) = -1;
     TCGMemOpIdx oi;
-    TCGMemOp opc;
     int i = -1;
 
     datalo = args[++i];
@@ -2103,35 +1935,25 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
         addrhi = args[++i];
     }
     oi = args[++i];
-    opc = get_memop(oi);
 
 #if defined(CONFIG_SOFTMMU)
-    {
-        int mem_index = get_mmuidx(oi);
-        tcg_insn_unit *label_ptr[2];
-        TCGReg base;
-
-        tcg_debug_assert(datalo == softmmu_arg(ARG_LDVAL, is64, 0));
-        if (TCG_TARGET_REG_BITS == 32 && is64) {
-            tcg_debug_assert(datahi == softmmu_arg(ARG_LDVAL, is64, 1));
-        }
-        tcg_debug_assert(addrlo == softmmu_arg(ARG_ADDR, 0, 0));
-        if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-            tcg_debug_assert(addrhi == softmmu_arg(ARG_ADDR, 0, 1));
-        }
-
-        base = tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
-                                label_ptr, offsetof(CPUTLBEntry, addr_read));
-
-        /* TLB Hit.  */
-        tcg_out_qemu_ld_direct(s, datalo, datahi, base, -1, 0, 0, opc);
-
-        /* Record the current context of a load into ldst label */
-        add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
-                            s->code_ptr, label_ptr);
+    /* Assert that we've set up the constraints properly.  */
+    tcg_debug_assert(datalo == softmmu_arg(ARG_LDVAL, is64, 0));
+    if (TCG_TARGET_REG_BITS == 32 && is64) {
+        tcg_debug_assert(datahi == softmmu_arg(ARG_LDVAL, is64, 1));
     }
+    tcg_debug_assert(addrlo == softmmu_arg(ARG_ADDR, 0, 0));
+    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+        tcg_debug_assert(addrhi == softmmu_arg(ARG_ADDR, 0, 1));
+    }
+
+    /* Call to thunk.  */
+    tcg_out8(s, OPC_CALL_Jz);
+    add_ldst_ool_label(s, true, is64, oi, R_386_PC32, -4);
+    s->code_ptr += 4;
 #else
     {
+        TCGMemOp opc = get_memop(oi);
         int32_t offset = guest_base;
        TCGReg base = addrlo;
         int index = -1;
@@ -2246,7 +2068,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     TCGReg datahi __attribute__((unused)) = -1;
     TCGReg addrhi __attribute__((unused)) = -1;
     TCGMemOpIdx oi;
-    TCGMemOp opc;
     int i = -1;
 
     datalo = args[++i];
@@ -2258,35 +2079,25 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
         addrhi = args[++i];
     }
     oi = args[++i];
-    opc = get_memop(oi);
 
 #if defined(CONFIG_SOFTMMU)
-    {
-        int mem_index = get_mmuidx(oi);
-        tcg_insn_unit *label_ptr[2];
-        TCGReg base;
-
-        tcg_debug_assert(datalo == softmmu_arg(ARG_STVAL, is64, 0));
-        if (TCG_TARGET_REG_BITS == 32 && is64) {
-            tcg_debug_assert(datahi == softmmu_arg(ARG_STVAL, is64, 1));
-        }
-        tcg_debug_assert(addrlo == softmmu_arg(ARG_ADDR, 0, 0));
-        if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-            tcg_debug_assert(addrhi == softmmu_arg(ARG_ADDR, 0, 1));
-        }
-
-        base = tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
-                                label_ptr, offsetof(CPUTLBEntry, addr_write));
-
-        /* TLB Hit.  */
-        tcg_out_qemu_st_direct(s, datalo, datahi, base, 0, 0, opc);
-
-        /* Record the current context of a store into ldst label */
-        add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
-                            s->code_ptr, label_ptr);
+    /* Assert that we've set up the constraints properly.  */
+    tcg_debug_assert(datalo == softmmu_arg(ARG_STVAL, is64, 0));
+    if (TCG_TARGET_REG_BITS == 32 && is64) {
+        tcg_debug_assert(datahi == softmmu_arg(ARG_STVAL, is64, 1));
     }
+    tcg_debug_assert(addrlo == softmmu_arg(ARG_ADDR, 0, 0));
+    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+        tcg_debug_assert(addrhi == softmmu_arg(ARG_ADDR, 0, 1));
+    }
+
+    /* Call to thunk.  */
+    tcg_out8(s, OPC_CALL_Jz);
+    add_ldst_ool_label(s, false, is64, oi, R_386_PC32, -4);
+    s->code_ptr += 4;
 #else
     {
+        TCGMemOp opc = get_memop(oi);
         int32_t offset = guest_base;
         TCGReg base = addrlo;
         int seg = 0;
@@ -2321,6 +2132,126 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 #endif
 }
 
+#if defined(CONFIG_SOFTMMU)
+/*
+ * Generate code for an out-of-line thunk performing a load.
+ */
+static tcg_insn_unit *tcg_out_qemu_ldst_ool(TCGContext *s, bool is_ld,
+                                            bool is_64, TCGMemOpIdx oi)
+{
+    TCGMemOp opc = get_memop(oi);
+    int mem_index = get_mmuidx(oi);
+    tcg_insn_unit *label_ptr[2], *thunk;
+    TCGReg datalo, addrlo, base;
+    TCGReg datahi __attribute__((unused)) = -1;
+    TCGReg addrhi __attribute__((unused)) = -1;
+    int i;
+
+    /* Since we're amortizing the cost, align the thunk.  */
+    thunk = QEMU_ALIGN_PTR_UP(s->code_ptr, 16);
+    if (thunk != s->code_ptr) {
+        memset(s->code_ptr, 0x90, thunk - s->code_ptr);
+        s->code_ptr = thunk;
+    }
+
+    /* Discover where the inputs are held.  */
+    addrlo = softmmu_arg(ARG_ADDR, 0, 0);
+    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+        addrhi = softmmu_arg(ARG_ADDR, 0, 1);
+    }
+    datalo = softmmu_arg(is_ld ? ARG_LDVAL : ARG_STVAL, is_64, 0);
+    if (TCG_TARGET_REG_BITS == 32 && is_64) {
+        datahi = softmmu_arg(is_ld ? ARG_LDVAL : ARG_STVAL, is_64, 1);
+    }
+
+    base = tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc, label_ptr,
+                            is_ld ? offsetof(CPUTLBEntry, addr_read)
+                            : offsetof(CPUTLBEntry, addr_write));
+
+    /* TLB Hit.  */
+    if (is_ld) {
+        tcg_out_qemu_ld_direct(s, datalo, datahi, base, -1, 0, 0, opc);
+    } else {
+        tcg_out_qemu_st_direct(s, datalo, datahi, base, 0, 0, opc);
+    }
+    tcg_out_opc(s, OPC_RET, 0, 0, 0);
+
+    /* TLB Miss.  */
+
+    /* resolve label address */
+    tcg_patch8(label_ptr[0], s->code_ptr - label_ptr[0] - 1);
+    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+        tcg_patch8(label_ptr[1], s->code_ptr - label_ptr[1] - 1);
+    }
+
+    if (TCG_TARGET_REG_BITS == 32) {
+        /* Copy the return address into a temporary.  */
+        tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_ESP, 0);
+        i = 4;
+
+        tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, i);
+        i += 4;
+
+        tcg_out_st(s, TCG_TYPE_I32, addrlo, TCG_REG_ESP, i);
+        i += 4;
+
+        if (TARGET_LONG_BITS == 64) {
+            tcg_out_st(s, TCG_TYPE_I32, addrhi, TCG_REG_ESP, i);
+            i += 4;
+        }
+
+        if (!is_ld) {
+            tcg_out_st(s, TCG_TYPE_I32, datalo, TCG_REG_ESP, i);
+            i += 4;
+
+            if (is_64) {
+                tcg_out_st(s, TCG_TYPE_I32, datahi, TCG_REG_ESP, i);
+                i += 4;
+            }
+        }
+
+        tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, i);
+        i += 4;
+
+        tcg_out_st(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_ESP, i);
+    } else {
+        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+
+        /* The address and data values have been placed by constraints.  */
+        tcg_debug_assert(addrlo == tcg_target_call_iarg_regs[1]);
+        if (is_ld) {
+            i = 2;
+        } else {
+            tcg_debug_assert(datalo == tcg_target_call_iarg_regs[2]);
+            i = 3;
+        }
+
+        tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[i++], oi);
+
+        /* Copy the return address from the stack to the rvalue argument.
+         * WIN64 runs out of argument registers for stores.
+         */
+        if (i < (int)ARRAY_SIZE(tcg_target_call_iarg_regs)) {
+            tcg_out_ld(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[i],
+                       TCG_REG_ESP, 0);
+        } else {
+            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_RAX, TCG_REG_ESP, 0);
+            tcg_out_st(s, TCG_TYPE_PTR, TCG_REG_RAX, TCG_REG_ESP,
+                       TCG_TARGET_CALL_STACK_OFFSET + 8);
+        }
+    }
+
+    /* Tail call to the helper.  */
+    if (is_ld) {
+        tcg_out_jmp(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]);
+    } else {
+        tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
+    }
+
+    return thunk;
+}
+#endif
+
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                               const TCGArg *args, const int *const_args)
 {