From patchwork Mon Jan 14 11:56:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 155453 Delivered-To: patch@linaro.org Received: by 2002:a02:48:0:0:0:0:0 with SMTP id 69csp3570766jaa; Mon, 14 Jan 2019 03:57:28 -0800 (PST) X-Google-Smtp-Source: ALg8bN7I7h/q6Q7oxBjP5E9VyzFhURCeDI9mzdOLIndEuiyySxfx+Up9XAiTgI5F2zGIj8IEhCfZ X-Received: by 2002:a1c:7d54:: with SMTP id y81mr11495832wmc.106.1547467048285; Mon, 14 Jan 2019 03:57:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1547467048; cv=none; d=google.com; s=arc-20160816; b=xJLj2hWoqEPjWQ6dVTR7Ht7e5Du6ph7UeGLsCHn7RKy9eIIdmONuXJfGVNAWm6iHJE meeaL7JVNxX3JaqZWjcZ2+SpIwPryGcYzRXWNfln+7AvN3+wBjgmw/fcLnHBr14nKX6Y 8RhUNWdM6IW28cny0wOABHcLlpbIOnQOWUx0DWyiOsZtEHGyqCSfLjEcTBbnnckYv2bf UmQLBPrbQH/tZ/5rPuMQ3GTUkvjCQK3EJRbgKzmdK+yVX9m2J0MozaS7X+lJCsX8J3t/ FSFTU5r07RIGQJWHorgziYQaflrKe/sD+3JCYjNFkjEGErCB0bauV8243TERaK9858VS hWvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject :content-transfer-encoding:mime-version:message-id:date:to:from :dkim-signature; bh=JJr1C7xnGnYQ9yTznPJqg/6XqVIgZuUV+lI4IdS1fkg=; b=f1HSicc1BHN9qm7W36atiRkznYp1t5SXaRTyHAQdqKgnyPbvMUqgOWfiVbIvnZi3YD X9vU3Wu6HZXz0SgFJoXWJrCmnj1IVoAqJF2wcwffWWVNysyjm3f3Pl73avkbBZpS3vxd Ce6nL3kV8KS9bD6/Igt5X0TCoKn+cxprBp3O0LFPqiMfwa0En1dG53tnJEWVv+Uz3Sfw /NQR8h4lt30rY1+FVkv9+QPbWE153jY93gwSBT4xsG2XGSJS29IcLNYDU3iQNCzcCNJD 90H/kSUfwbMXA+02KUtGQeBurT106tAxsGIGGuEwX43dPaXl2a2qMfOIoOpTdPEPJj5E gTZw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Htag7P//"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id m185si18529623wmd.125.2019.01.14.03.57.26 for (version=TLS1 cipher=AES128-SHA bits=128/128); Mon, 14 Jan 2019 03:57:27 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Htag7P//"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:57655 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gj0ru-0006mB-6X for patch@linaro.org; Mon, 14 Jan 2019 06:57:26 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47145) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gj0rE-0006WA-OD for qemu-devel@nongnu.org; Mon, 14 Jan 2019 06:56:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gj0rC-0005p3-9K for qemu-devel@nongnu.org; Mon, 14 Jan 2019 06:56:44 -0500 Received: from mail-wr1-x441.google.com ([2a00:1450:4864:20::441]:35585) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gj0rC-0005oR-2B for qemu-devel@nongnu.org; Mon, 14 Jan 2019 06:56:42 -0500 Received: by mail-wr1-x441.google.com with SMTP id 96so22525491wrb.2 for ; Mon, 14 Jan 2019 03:56:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=JJr1C7xnGnYQ9yTznPJqg/6XqVIgZuUV+lI4IdS1fkg=; b=Htag7P//Pdt/NoM0cfb3lmZpmJaZf9R9v+gIYddkeY/AlBYRBGgIZ6gh2CtlhGyIi4 IOaBHhXg7nfrZJ2Brua0NN1zf/uld0ltxHPCqds/Qyhh9N+/ggMqwc+Yjfxm1PtqsiBp P7B13qKlH5ESmcYdPhDtHo7Awz9qBbx7uA50M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=JJr1C7xnGnYQ9yTznPJqg/6XqVIgZuUV+lI4IdS1fkg=; b=UetnMSxBAyFSTaYsVcYUN+SWAkfVHQucl4i+RaqQEOL3j7FH1op9EH9p98mGqwC0CX Qx6IpK2OVCZdksR1WQAXn8yCRdTvmyjP8qiFlQdA6cH6HGNKc8pouGJaZTGN8x1op/TW wHbeiefv/n9EMo5obB+zZj0rDzuw0ulNhmJAgy9A0gYL8Byy2E5g6PIV83eiDP33z259 FId5NwtlLcLJpTFK1syXUTdI8o6JAAngBuqKABB3Y6lcZxiE7FPyE2FHqrLD0WsXNypU oAQD5QNHxdSiPcVMCfFhlpPhz38ycQBHYKhBqM7e4kiwxdhy9cSYRlYO9HpzI8AVkppl Y+JQ== X-Gm-Message-State: AJcUukdaEyGdy0qLcYRBGl++Nz2CEpMXR3B2wYq2vxgYlBUtsQ9pK+Ex XJ4OM12AatYrVFBG5SU6aE0lKQ== X-Received: by 2002:adf:c042:: with SMTP id c2mr24930786wrf.158.1547467000666; Mon, 14 Jan 2019 03:56:40 -0800 (PST) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id 125sm47835842wmm.26.2019.01.14.03.56.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 14 Jan 2019 03:56:39 -0800 (PST) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Date: Mon, 14 Jan 2019 11:56:33 +0000 Message-Id: <20190114115637.6335-1-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4864:20::441 Subject: [Qemu-devel] [PATCH v2 0/4] tcg: support heterogenous CPU clusters X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , Peter Crosthwaite , Alistair Francis , Richard Henderson , "Emilio G . Cota" , "Edgar E. Iglesias" , Paolo Bonzini , Aleksandar Markovic Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Changes v1->v2: just fixing a bug in patch 3 that meant we were accidentally never reusing TBs. Patch 3 is the only one that needs review. Original cover letter included below for context. thanks -- PMM Currently, TCG implicitly assumes that all CPUs are alike, because we have a single cache of generated TBs and we don't account for which CPU generated the code or is looking for the TB when adding or searching for generated TBs. This can go wrong in two situations: (1) two CPUs have different physical address spaces (eg CPU 1 has one lot of RAM/ROM, and CPU 2 has different RAM/ROM): the physical address alone is then not sufficient to distinguish what code to run (2) two CPUs have different features (eg FPU vs no FPU): since our TCG frontends bake assumptions into the generated code about the presence/absence of features, if a CPU with FPU picks up a TB for one generated without an FPU it will behave wrongly This is unfortunate, because we already have one board in the tree which has a heterogenous setup: the xlnx-zcu102. This board has 4xCortex-A53 and 2xCortex-R5F. Although most "normal" guest code doesn't run into this, it is possible to demonstrate the bug with it. There's a test case at http://people.linaro.org/~peter.maydell/xlnx-het.tgz which arranges to run a fragment of code in AArch32 which should behave differently on the two CPUs, but does not (either the A53 gets the behaviour the R5 should have, or vice-versa). This patchset adds the "cluster ID" to the set of things we include in the TCG TB hash, which means that the two sets of CPUs in this board can no longer accidentally get TBs generated by the other cluster. It fixes the bug demonstrated by the test case. Design notes: * Adding cluster ID to the hash is RTH's suggestion. It might also be possible to make each cluster have a TCGContext code generation context, and have the CPUs use the right one for the cluster, but that would be a lot more code rework compared to this approach. * I put the cluster number in the existing cflags, since we have plenty of spare bits there. If in future we need either more than 256 clusters (unlikely) or want to reclaim bits in the cflags field for some other purpose we can always extend our xxhash from 28 to 32 bytes. (I actually wrote the patch to do that before realising I didn't need it...) * The cluster object now fills in the CPU object's cluster_index field when the cluster object is realized. This imposes an ordering constraint that all CPUs must be added to a cluster before realizing the cluster. That's not a big deal, except that unfortunately QOM provides no way that I could find to enforce this. If anybody has a suggestion for a better approach that would be great. Patch 1 therefore makes sure our only current user of the cluster feature obeys the ordering constraint. * Patch 4 is a tidyup to the gdbstub code which is possible now that the CPUState struct has the cluster ID in it. I believe that this fix is basically all we need to support heterogenous setups (assuming of course that you just mean "within an architecture family" like Arm, rather than systems with say both an Arm and a Microblaze core in them). The other things I have considered are: * code that calls cpu_physical_memory_read() and friends, uses address_space_memory, or otherwise assumes there's only a single view of "CPU memory": I sent some patches out before Christmas that fixed these in generic code like the monitor and disassembler. There are also some uses in target-arch or device-specific code, which I think we can in practice ignore unless/until we come to implement a board that is heterogenous and also uses those devices or target architectures. * handling of TB invalidation: I think this should Just Work, because tb_invalidate_phys_addr() takes an AddressSpace+hwaddr, which it converts into a ramaddr for (the badly misnamed) tb_invalidate_phys_page_range(). So the TB caching all works on ramaddrs, and if CPU 1 writes to some ram X that's mapped at physaddr A for it but at physaddr B for CPU 2, we will still correctly invalidate any TBs that CPU 2 had for code that from its point of view lives at physaddr B. Note that translate-all.c has a single l1_map[] structure which will have PageDesc entries for pages for all CPUs in the system. I believe this is OK because the only thing we use them for is TB invalidation. * dirty memory tracking: like TB invalidation, this is OK because it all works either on MemoryRegions or on ramaddrs, not on physaddrs. Have I missed anything that we also need to fix ? thanks -- PMM Peter Maydell (4): hw/arm/xlx-zynqmp: Realize cluster after putting RPUs in it qom/cpu: Add cluster_index to CPUState accel/tcg: Add cluster number to TCG TB hash gdbstub: Simplify gdb_get_cpu_pid() to use cpu->cluster_index include/exec/exec-all.h | 4 +++- include/hw/cpu/cluster.h | 19 ++++++++++++++++ include/qom/cpu.h | 7 ++++++ accel/tcg/cpu-exec.c | 3 +++ accel/tcg/translate-all.c | 3 +++ gdbstub.c | 48 ++++----------------------------------- hw/arm/xlnx-zynqmp.c | 4 ++-- hw/cpu/cluster.c | 33 +++++++++++++++++++++++++++ qom/cpu.c | 1 + 9 files changed, 76 insertions(+), 46 deletions(-) -- 2.20.1