From patchwork Fri May 16 16:53:15 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 890781
From: Tomeu Vizoso
Date: Fri, 16 May 2025 18:53:15 +0200
Subject: [PATCH v3 01/10] dt-bindings: npu: rockchip,rknn: Add bindings
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250516-6-10-rocket-v3-1-7051ac9225db@tomeuvizoso.net>
References: <20250516-6-10-rocket-v3-0-7051ac9225db@tomeuvizoso.net>
In-Reply-To: <20250516-6-10-rocket-v3-0-7051ac9225db@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
    Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
    Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
    Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
    dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
    linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
    Tomeu Vizoso
X-Mailer: b4 0.14.2

Add the bindings for the Neural Processing Unit IP from Rockchip.

v2:
- Adapt to new node structure (one node per core, each with its own IOMMU)
- Several misc. fixes from Sebastian Reichel

v3:
- Split the register block into its constituent subblocks, and only require
  the ones that the kernel would ever use (Nicolas Frattaroli)
- Group supplies (Rob Herring)
- Explain the way in which the top core is special (Rob Herring)

Signed-off-by: Tomeu Vizoso
Signed-off-by: Sebastian Reichel
---
 .../bindings/npu/rockchip,rknn-core.yaml | 162 +++++++++++++++++++++
 1 file changed, 162 insertions(+)

diff --git a/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml b/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..4572fb777f1454d0147da29791033fc27c53b8d2
--- /dev/null
+++ b/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml
@@ -0,0 +1,162 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Neural Processing Unit IP from Rockchip
+
+maintainers:
+  - Tomeu Vizoso
+
+description:
+  Rockchip IP for accelerating inference of neural networks, based on NVIDIA's
+  open source NVDLA IP.
+
+  There is to be one node per core in the NPU. In Rockchip's design there
+  will be one core that is special and needs to be powered on before any of
+  the other cores can be used. This special core is called the top core and
+  should have the compatible string for top cores.
+
+properties:
+  $nodename:
+    pattern: '^npu-core@[a-f0-9]+$'
+
+  compatible:
+    oneOf:
+      - items:
+          - enum:
+              - rockchip,rk3588-rknn-core-top
+      - items:
+          - enum:
+              - rockchip,rk3588-rknn-core
+
+  reg:
+    minItems: 3
+
+  reg-names:
+    minItems: 3
+    items:
+      - const: pc
+      - const: cna
+      - const: core
+
+  clocks:
+    minItems: 2
+    maxItems: 4
+
+  clock-names:
+    items:
+      - const: aclk
+      - const: hclk
+      - const: npu
+      - const: pclk
+    minItems: 2
+
+  interrupts:
+    maxItems: 1
+
+  iommus:
+    maxItems: 1
+
+  npu-supply: true
+
+  power-domains:
+    maxItems: 1
+
+  resets:
+    maxItems: 2
+
+  reset-names:
+    items:
+      - const: srst_a
+      - const: srst_h
+
+  sram-supply: true
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - interrupts
+  - iommus
+  - power-domains
+  - resets
+  - reset-names
+  - npu-supply
+  - sram-supply
+
+allOf:
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - rockchip,rknn-core-top
+    then:
+      properties:
+        clocks:
+          minItems: 4
+
+        clock-names:
+          minItems: 4
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - rockchip,rknn-core
+    then:
+      properties:
+        clocks:
+          maxItems: 2
+        clock-names:
+          maxItems: 2
+
+additionalProperties: false
+
+examples:
+  - |
+    #include
+    #include
+    #include
+    #include
+    #include
+
+    bus {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      rknn_core_top: npu-core@fdab0000 {
+        compatible = "rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top";
+        reg = <0x0 0xfdab0000 0x0 0x9000>;
+        assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+        assigned-clock-rates = <200000000>;
+        clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+                 <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+        clock-names = "aclk", "hclk", "npu", "pclk";
+        interrupts = ;
+        iommus = <&rknn_mmu_top>;
+        npu-supply = <&vdd_npu_s0>;
+        power-domains = <&power RK3588_PD_NPUTOP>;
+        resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+        reset-names = "srst_a", "srst_h";
+        sram-supply = <&vdd_npu_mem_s0>;
+      };
+
+      rknn_core_1: npu-core@fdac0000 {
+        compatible = "rockchip,rk3588-rknn-core", "rockchip,rknn-core";
+        reg = <0x0 0xfdac0000 0x0 0x9000>;
+        clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+        clock-names = "aclk", "hclk";
+        interrupts = ;
+        iommus = <&rknn_mmu_1>;
+        npu-supply = <&vdd_npu_s0>;
+        power-domains = <&power RK3588_PD_NPU1>;
+        resets = <&cru SRST_A_RKNN1>, <&cru SRST_H_RKNN1>;
+        reset-names = "srst_a", "srst_h";
+        sram-supply = <&vdd_npu_mem_s0>;
+      };
+    };
+...
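
For reference, a per-core probe routine that consumes this binding would be shaped roughly like the sketch below. It is an illustration only, not code from this series; the struct and function names (rknn_core, rknn_core_probe) are hypothetical.

/*
 * Minimal sketch of a probe consuming the binding above: the three register
 * sub-blocks named in reg-names, the two clocks common to both core types,
 * and the per-core interrupt. Identifiers are hypothetical.
 */
#include <linux/clk.h>
#include <linux/err.h>
#include <linux/platform_device.h>

struct rknn_core {
	void __iomem *pc;
	void __iomem *cna;
	void __iomem *core;
	struct clk *aclk;
	struct clk *hclk;
	int irq;
};

static int rknn_core_probe(struct platform_device *pdev)
{
	struct rknn_core *core;

	core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
	if (!core)
		return -ENOMEM;

	/* Map the register sub-blocks by the names required in reg-names. */
	core->pc = devm_platform_ioremap_resource_byname(pdev, "pc");
	if (IS_ERR(core->pc))
		return PTR_ERR(core->pc);
	core->cna = devm_platform_ioremap_resource_byname(pdev, "cna");
	if (IS_ERR(core->cna))
		return PTR_ERR(core->cna);
	core->core = devm_platform_ioremap_resource_byname(pdev, "core");
	if (IS_ERR(core->core))
		return PTR_ERR(core->core);

	/*
	 * "aclk" and "hclk" are required for both core types; only the top
	 * core additionally takes "npu" and "pclk".
	 */
	core->aclk = devm_clk_get(&pdev->dev, "aclk");
	if (IS_ERR(core->aclk))
		return PTR_ERR(core->aclk);
	core->hclk = devm_clk_get(&pdev->dev, "hclk");
	if (IS_ERR(core->hclk))
		return PTR_ERR(core->hclk);

	core->irq = platform_get_irq(pdev, 0);
	if (core->irq < 0)
		return core->irq;

	return 0;
}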
From patchwork Fri May 16 16:53:17 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 890780
From: Tomeu Vizoso
Date: Fri, 16 May 2025 18:53:17 +0200
Subject: [PATCH v3 03/10] arm64: dts: rockchip: Enable the NPU on quartzpro64
Message-Id: <20250516-6-10-rocket-v3-3-7051ac9225db@tomeuvizoso.net>

Enable the NPU nodes that a previous commit added to the rk3588s device
tree.

v2:
- Split nodes (Sebastian Reichel)
- Sort nodes (Sebastian Reichel)
- Add board regulators (Sebastian Reichel)

Signed-off-by: Tomeu Vizoso
---
 .../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 30 ++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
index 78aaa6635b5d20a650aba8d8c2d0d4f498ff0d33..2e45b213c25b99571dd71ce90bc7970418f60276 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
@@ -415,6 +415,36 @@ &pcie3x4 {
 	status = "okay";
 };
 
+&rknn_core_top {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_core_1 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_core_2 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_mmu_top {
+	status = "okay";
+};
+
+&rknn_mmu_1 {
+	status = "okay";
+};
+
+&rknn_mmu_2 {
+	status = "okay";
+};
+
 &saradc {
 	vref-supply = <&vcc_1v8_s0>;
 	status = "okay";

From patchwork Fri May 16 16:53:20 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 890779
From: Tomeu Vizoso
Date: Fri, 16 May 2025 18:53:20 +0200
Subject: [PATCH v3 06/10] accel/rocket: Add IOCTL for BO creation
Message-Id: <20250516-6-10-rocket-v3-6-7051ac9225db@tomeuvizoso.net>

This uses the DRM SHMEM helpers, and buffers are mapped right away on
both the CPU and NPU sides, as all buffers are expected to be accessed
from both.

v2:
- Sync the IOMMUs for the other cores when mapping and unmapping.

v3:
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/Makefile        |   3 +-
 drivers/accel/rocket/rocket_device.c |   4 ++
 drivers/accel/rocket/rocket_device.h |   2 +
 drivers/accel/rocket/rocket_drv.c    |   7 +-
 drivers/accel/rocket/rocket_gem.c    | 131 +++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_gem.h    |  26 +++++++
 include/uapi/drm/rocket_accel.h      |  44 ++++++++++++
 7 files changed, 215 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile index abdd75f2492eaecf8bf5e78a2ac150ea19ac3e96..4deef267f9e1238c4d8bd108dcc8afd9dc8b2b8f 100644 --- a/drivers/accel/rocket/Makefile +++ b/drivers/accel/rocket/Makefile @@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ROCKET) := rocket.o rocket-y := \ rocket_core.o \ rocket_device.o \ - rocket_drv.o + rocket_drv.o \ + rocket_gem.o
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c index bb469ac87d36249157f4ba9d9f7106ad558309e4..eb10bda13e695fb0c89c1e3464145cdc63748de1 100644 --- a/drivers/accel/rocket/rocket_device.c +++ b/drivers/accel/rocket/rocket_device.c @@ -3,6 +3,7 @@ #include #include +#include #include "rocket_device.h" @@ -30,10 +31,13 @@ int rocket_device_init(struct rocket_device *rdev) if (err) return err; + mutex_init(&rdev->iommu_lock); + return 0; } void rocket_device_fini(struct rocket_device *rdev) { + mutex_destroy(&rdev->iommu_lock); rocket_core_fini(&rdev->cores[0]); }
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h index ba2301e9302120ae338c07baa7d12dd99cb925a9..7381a5148f291ef217c3b256f03d5ab357291688 100644 --- a/drivers/accel/rocket/rocket_device.h +++ b/drivers/accel/rocket/rocket_device.h @@ -14,6 +14,8 @@ struct rocket_device { struct clk *clk_npu; struct clk *pclk; + struct mutex iommu_lock; + struct rocket_core *cores; unsigned int num_cores; };
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c index 82f4cc374bfaa92678da791849537d51bb4c0ba8..4f346df06bcde5a24022bdb651c434d0c6e3c468
100644 --- a/drivers/accel/rocket/rocket_drv.c +++ b/drivers/accel/rocket/rocket_drv.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -14,6 +15,7 @@ #include #include "rocket_drv.h" +#include "rocket_gem.h" static int rocket_open(struct drm_device *dev, struct drm_file *file) @@ -42,6 +44,8 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file) static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = { #define ROCKET_IOCTL(n, func) \ DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0) + + ROCKET_IOCTL(CREATE_BO, create_bo), }; DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops); @@ -51,9 +55,10 @@ DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops); * - 1.0 - initial interface */ static const struct drm_driver rocket_drm_driver = { - .driver_features = DRIVER_COMPUTE_ACCEL, + .driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM, .open = rocket_open, .postclose = rocket_postclose, + .gem_create_object = rocket_gem_create_object, .ioctls = rocket_drm_driver_ioctls, .num_ioctls = ARRAY_SIZE(rocket_drm_driver_ioctls), .fops = &rocket_accel_driver_fops, diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c new file mode 100644 index 0000000000000000000000000000000000000000..8a8a7185daac4740081293aae6945c9b2bbeb2dd --- /dev/null +++ b/drivers/accel/rocket/rocket_gem.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright 2024-2025 Tomeu Vizoso */ + +#include +#include +#include +#include +#include + +#include "rocket_device.h" +#include "rocket_gem.h" + +static void rocket_gem_bo_free(struct drm_gem_object *obj) +{ + struct rocket_device *rdev = to_rocket_device(obj->dev); + struct rocket_gem_object *bo = to_rocket_bo(obj); + struct sg_table *sgt; + + drm_WARN_ON(obj->dev, bo->base.pages_use_count > 1); + + mutex_lock(&rdev->iommu_lock); + + sgt = drm_gem_shmem_get_pages_sgt(&bo->base); + + /* Unmap this object from the IOMMUs for cores > 0 */ + for (unsigned int core = 1; core < rdev->num_cores; core++) { + struct iommu_domain *domain = iommu_get_domain_for_dev(rdev->cores[core].dev); + size_t unmapped = iommu_unmap(domain, sgt->sgl->dma_address, bo->size); + + drm_WARN_ON(obj->dev, unmapped != bo->size); + } + + /* This will unmap the pages from the IOMMU linked to core 0 */ + drm_gem_shmem_free(&bo->base); + + mutex_unlock(&rdev->iommu_lock); +} + +static const struct drm_gem_object_funcs rocket_gem_funcs = { + .free = rocket_gem_bo_free, + .print_info = drm_gem_shmem_object_print_info, + .pin = drm_gem_shmem_object_pin, + .unpin = drm_gem_shmem_object_unpin, + .get_sg_table = drm_gem_shmem_object_get_sg_table, + .vmap = drm_gem_shmem_object_vmap, + .vunmap = drm_gem_shmem_object_vunmap, + .mmap = drm_gem_shmem_object_mmap, + .vm_ops = &drm_gem_shmem_vm_ops, +}; + +struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size) +{ + struct rocket_gem_object *obj; + + obj = kzalloc(sizeof(*obj), GFP_KERNEL); + if (!obj) + return ERR_PTR(-ENOMEM); + + obj->base.base.funcs = &rocket_gem_funcs; + + return &obj->base.base; +} + +int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct drm_rocket_create_bo *args = data; + struct rocket_device *rdev = to_rocket_device(dev); + struct drm_gem_shmem_object *shmem_obj; + struct rocket_gem_object *rkt_obj; + struct drm_gem_object *gem_obj; + struct sg_table *sgt; + int ret; + + shmem_obj = drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem_obj)) + return PTR_ERR(shmem_obj); + + gem_obj = 
&shmem_obj->base; + rkt_obj = to_rocket_bo(gem_obj); + + rkt_obj->size = args->size; + rkt_obj->offset = 0; + + ret = drm_gem_handle_create(file, gem_obj, &args->handle); + drm_gem_object_put(gem_obj); + if (ret) + goto err; + + mutex_lock(&rdev->iommu_lock); + + /* This will map the pages to the IOMMU linked to core 0 */ + sgt = drm_gem_shmem_get_pages_sgt(shmem_obj); + if (IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + goto err_unlock; + } + + /* Map the pages to the IOMMUs linked to the other cores, so all cores can access this BO */ + for (unsigned int core = 1; core < rdev->num_cores; core++) { + ret = iommu_map_sgtable(iommu_get_domain_for_dev(rdev->cores[core].dev), + sgt->sgl->dma_address, + sgt, + IOMMU_READ | IOMMU_WRITE); + if (ret < 0 || ret < args->size) { + drm_err(dev, "failed to map buffer: size=%d request_size=%u\n", + ret, args->size); + ret = -ENOMEM; + goto err_unlock; + } + + /* iommu_map_sgtable might have aligned the size */ + rkt_obj->size = ret; + + dma_sync_sgtable_for_device(rdev->cores[core].dev, shmem_obj->sgt, + DMA_BIDIRECTIONAL); + } + + mutex_unlock(&rdev->iommu_lock); + + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node); + args->dma_address = sg_dma_address(shmem_obj->sgt->sgl); + + return 0; + +err_unlock: + mutex_unlock(&rdev->iommu_lock); +err: + drm_gem_shmem_object_free(gem_obj); + + return ret; +} diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h new file mode 100644 index 0000000000000000000000000000000000000000..41497554366961cfe18cf6c7e93ab1e4e5dc1886 --- /dev/null +++ b/drivers/accel/rocket/rocket_gem.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright 2024-2025 Tomeu Vizoso */ + +#ifndef __ROCKET_GEM_H__ +#define __ROCKET_GEM_H__ + +#include + +struct rocket_gem_object { + struct drm_gem_shmem_object base; + + size_t size; + u32 offset; +}; + +struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size); + +int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file); + +static inline +struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj) +{ + return container_of(to_drm_gem_shmem_obj(obj), struct rocket_gem_object, base); +} + +#endif diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h new file mode 100644 index 0000000000000000000000000000000000000000..95720702b7c4413d72b89c1f0f59abb22dc8c6b3 --- /dev/null +++ b/include/uapi/drm/rocket_accel.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2024 Tomeu Vizoso + */ +#ifndef __DRM_UAPI_ROCKET_ACCEL_H__ +#define __DRM_UAPI_ROCKET_ACCEL_H__ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +#define DRM_ROCKET_CREATE_BO 0x00 + +#define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo) + +/** + * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs. + * + */ +struct drm_rocket_create_bo { + /** Input: Size of the requested BO. */ + __u32 size; + + /** Output: GEM handle for the BO. */ + __u32 handle; + + /** + * Output: DMA address for the BO in the NPU address space. This address + * is private to the DRM fd and is valid for the lifetime of the GEM + * handle. + */ + __u64 dma_address; + + /** Output: Offset into the drm node to use for subsequent mmap call. 
 */
+	__u64 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __DRM_UAPI_ROCKET_ACCEL_H__ */
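
From userspace, the new ioctl can be exercised roughly as in the following sketch. It is not part of the patch; the accel node path and the header install location are assumptions, and error handling is minimal.

/* Hypothetical userspace sketch: create a BO and mmap it through the accel fd. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#include <drm/rocket_accel.h>	/* install path is an assumption */

int main(void)
{
	/* Accel devices typically show up under /dev/accel/; path assumed. */
	int fd = open("/dev/accel/accel0", O_RDWR);
	if (fd < 0)
		return 1;

	struct drm_rocket_create_bo create = { .size = 4096 };

	if (ioctl(fd, DRM_IOCTL_ROCKET_CREATE_BO, &create))
		return 1;

	/* The returned offset is the fake offset to pass to mmap(). */
	void *map = mmap(NULL, create.size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, create.offset);
	if (map == MAP_FAILED)
		return 1;

	/* The BO is already mapped on the NPU side at create.dma_address. */
	memset(map, 0, create.size);
	printf("handle %u, NPU address 0x%llx\n",
	       create.handle, (unsigned long long)create.dma_address);

	return 0;
}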
From patchwork Fri May 16 16:53:21 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 890778
From: Tomeu Vizoso
Date: Fri, 16 May 2025 18:53:21 +0200
Subject: [PATCH v3 07/10] accel/rocket: Add job submission IOCTL
Message-Id: <20250516-6-10-rocket-v3-7-7051ac9225db@tomeuvizoso.net>

This uses the DRM GPU scheduler infrastructure, with one scheduler per
core. Userspace can request that a series of tasks be executed
sequentially on the same core, so that SRAM locality can be taken
advantage of.

The job submission code was initially based on Panfrost.

v2:
- Remove hardcoded number of cores
- Misc. style fixes (Jeffrey Hugo)
- Repack IOCTL struct (Jeffrey Hugo)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas Frattaroli)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- Use drm_* logging functions (Thomas Zimmermann)
- Rename reg i/o macros (Thomas Zimmermann)
- Add padding to ioctls and check for zero (Jeff Hugo)
- Improve error handling (Nicolas Frattaroli)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/Makefile        |   3 +-
 drivers/accel/rocket/rocket_core.c   |  14 +-
 drivers/accel/rocket/rocket_core.h   |  14 +
 drivers/accel/rocket/rocket_device.c |   2 +
 drivers/accel/rocket/rocket_device.h |   2 +
 drivers/accel/rocket/rocket_drv.c    |  15 +
 drivers/accel/rocket/rocket_drv.h    |   4 +
 drivers/accel/rocket/rocket_job.c    | 723 +++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_job.h    |  50 +++
 include/uapi/drm/rocket_accel.h      |  64 ++++
 10 files changed, 888 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile index 4deef267f9e1238c4d8bd108dcc8afd9dc8b2b8f..3713dfe223d6ec6293ced3ef9291af2f3d144131 100644 --- a/drivers/accel/rocket/Makefile +++ b/drivers/accel/rocket/Makefile @@ -6,4 +6,5 @@ rocket-y := \ rocket_core.o \ rocket_device.o \ rocket_drv.o \ - rocket_gem.o + rocket_gem.o \ + rocket_job.o
diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c index a947c3120558e8af90bc0730a4d30ac796d5683d..6c28654c5f3122fb09a10c7df22da7650a8c7d32 100644 --- a/drivers/accel/rocket/rocket_core.c +++ b/drivers/accel/rocket/rocket_core.c @@ -8,6 +8,7 @@ #include #include "rocket_core.h" +#include "rocket_job.h" static int rocket_clk_init(struct rocket_core *core) { @@ -61,6 +62,10 @@ int rocket_core_init(struct rocket_core *core) return
PTR_ERR(core->core_iomem); } + err = rocket_job_init(core); + if (err) + return err; + pm_runtime_use_autosuspend(dev); /* @@ -74,9 +79,13 @@ int rocket_core_init(struct rocket_core *core) pm_runtime_enable(dev); err = pm_runtime_get_sync(dev); + if (err) { + rocket_job_fini(core); + return err; + } - version = rocket_pc_read(core, VERSION); - version += rocket_pc_read(core, VERSION_NUM) & 0xffff; + version = rocket_pc_readl(core, VERSION); + version += rocket_pc_readl(core, VERSION_NUM) & 0xffff; pm_runtime_mark_last_busy(dev); pm_runtime_put_autosuspend(dev); @@ -90,4 +99,5 @@ void rocket_core_fini(struct rocket_core *core) { pm_runtime_dont_use_autosuspend(core->dev); pm_runtime_disable(core->dev); + rocket_job_fini(core); } diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h index 3bde8ad8e6e45f9000ee377d7a5ea9ca01f9ac53..1b21f07b6100fa0ea73cd3ef2accbb0b6fc8777a 100644 --- a/drivers/accel/rocket/rocket_core.h +++ b/drivers/accel/rocket/rocket_core.h @@ -37,6 +37,20 @@ struct rocket_core { void __iomem *core_iomem; struct clk *a_clk; struct clk *h_clk; + + struct rocket_job *in_flight_job; + + spinlock_t job_lock; + + struct { + struct workqueue_struct *wq; + struct work_struct work; + atomic_t pending; + } reset; + + struct drm_gpu_scheduler sched; + u64 fence_context; + u64 emit_seqno; }; int rocket_core_init(struct rocket_core *core); diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c index eb10bda13e695fb0c89c1e3464145cdc63748de1..14b4e95539d67b693c1810c5b5a89b6a16ca2232 100644 --- a/drivers/accel/rocket/rocket_device.c +++ b/drivers/accel/rocket/rocket_device.c @@ -32,12 +32,14 @@ int rocket_device_init(struct rocket_device *rdev) return err; mutex_init(&rdev->iommu_lock); + mutex_init(&rdev->sched_lock); return 0; } void rocket_device_fini(struct rocket_device *rdev) { + mutex_destroy(&rdev->sched_lock); mutex_destroy(&rdev->iommu_lock); rocket_core_fini(&rdev->cores[0]); } diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h index 7381a5148f291ef217c3b256f03d5ab357291688..01f6dc0b657f19eb4037557b0bf5a4c6fb653261 100644 --- a/drivers/accel/rocket/rocket_device.h +++ b/drivers/accel/rocket/rocket_device.h @@ -11,6 +11,8 @@ struct rocket_device { struct drm_device ddev; + struct mutex sched_lock; + struct clk *clk_npu; struct clk *pclk; diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c index 4f346df06bcde5a24022bdb651c434d0c6e3c468..18ff051c336a14b7dda72d235faffb7a55a0a8ee 100644 --- a/drivers/accel/rocket/rocket_drv.c +++ b/drivers/accel/rocket/rocket_drv.c @@ -16,12 +16,14 @@ #include "rocket_drv.h" #include "rocket_gem.h" +#include "rocket_job.h" static int rocket_open(struct drm_device *dev, struct drm_file *file) { struct rocket_device *rdev = to_rocket_device(dev); struct rocket_file_priv *rocket_priv; + int ret; rocket_priv = kzalloc(sizeof(*rocket_priv), GFP_KERNEL); if (!rocket_priv) @@ -30,7 +32,15 @@ rocket_open(struct drm_device *dev, struct drm_file *file) rocket_priv->rdev = rdev; file->driver_priv = rocket_priv; + ret = rocket_job_open(rocket_priv); + if (ret) + goto err_free; + return 0; + +err_free: + kfree(rocket_priv); + return ret; } static void @@ -38,6 +48,7 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file) { struct rocket_file_priv *rocket_priv = file->driver_priv; + rocket_job_close(rocket_priv); kfree(rocket_priv); } @@ -46,6 +57,7 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = { 
DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0) ROCKET_IOCTL(CREATE_BO, create_bo), + ROCKET_IOCTL(SUBMIT, submit), }; DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops); @@ -288,6 +300,9 @@ static int rocket_device_runtime_suspend(struct device *dev) if (core < 0) return -ENODEV; + if (!rocket_job_is_idle(&rdev->cores[core])) + return -EBUSY; + clk_disable_unprepare(rdev->cores[core].a_clk); clk_disable_unprepare(rdev->cores[core].h_clk); diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h index bd3a697ab7c8e378967ce638b04d7d86845b53c7..b4055cfad6bd431b7c59b0848653748ab945615c 100644 --- a/drivers/accel/rocket/rocket_drv.h +++ b/drivers/accel/rocket/rocket_drv.h @@ -4,10 +4,14 @@ #ifndef __ROCKET_DRV_H__ #define __ROCKET_DRV_H__ +#include + #include "rocket_device.h" struct rocket_file_priv { struct rocket_device *rdev; + + struct drm_sched_entity sched_entity; }; #endif diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c new file mode 100644 index 0000000000000000000000000000000000000000..aee6ebdb2bd227439449fdfcab3ce7d1e39cd4c4 --- /dev/null +++ b/drivers/accel/rocket/rocket_job.c @@ -0,0 +1,723 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright 2019 Linaro, Ltd, Rob Herring */ +/* Copyright 2019 Collabora ltd. */ +/* Copyright 2024-2025 Tomeu Vizoso */ + +#include +#include +#include +#include +#include +#include +#include + +#include "rocket_core.h" +#include "rocket_device.h" +#include "rocket_drv.h" +#include "rocket_job.h" +#include "rocket_registers.h" + +#define JOB_TIMEOUT_MS 500 + +static struct rocket_job * +to_rocket_job(struct drm_sched_job *sched_job) +{ + return container_of(sched_job, struct rocket_job, base); +} + +struct rocket_fence { + struct dma_fence base; + struct drm_device *dev; + /* rocket seqno for signaled() test */ + u64 seqno; + int queue; +}; + +#define to_rocket_fence(dma_fence) \ + ((struct rocket_fence *)container_of(dma_fence, struct rocket_fence, base)) + +static const char *rocket_fence_get_driver_name(struct dma_fence *fence) +{ + return "rocket"; +} + +static const char *rocket_fence_get_timeline_name(struct dma_fence *fence) +{ + return "rockchip-npu"; +} + +static const struct dma_fence_ops rocket_fence_ops = { + .get_driver_name = rocket_fence_get_driver_name, + .get_timeline_name = rocket_fence_get_timeline_name, +}; + +static struct dma_fence *rocket_fence_create(struct rocket_core *core) +{ + struct rocket_device *rdev = core->rdev; + struct rocket_fence *fence; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + return ERR_PTR(-ENOMEM); + + fence->dev = &rdev->ddev; + fence->seqno = ++core->emit_seqno; + dma_fence_init(&fence->base, &rocket_fence_ops, &core->job_lock, + core->fence_context, fence->seqno); + + return &fence->base; +} + +static int +rocket_copy_tasks(struct drm_device *dev, + struct drm_file *file_priv, + struct drm_rocket_job *job, + struct rocket_job *rjob) +{ + struct drm_rocket_task *tasks; + int ret = 0; + int i; + + rjob->task_count = job->task_count; + + if (!rjob->task_count) + return 0; + + tasks = kvmalloc_array(rjob->task_count, sizeof(*tasks), GFP_KERNEL); + if (!tasks) { + ret = -ENOMEM; + drm_dbg(dev, "Failed to allocate incoming tasks\n"); + goto fail; + } + + if (copy_from_user(tasks, + (void __user *)(uintptr_t)job->tasks, + rjob->task_count * sizeof(*tasks))) { + ret = -EFAULT; + drm_dbg(dev, "Failed to copy incoming tasks\n"); + goto fail; + } + + rjob->tasks = kvmalloc_array(job->task_count, sizeof(*rjob->tasks), 
GFP_KERNEL); + if (!rjob->tasks) { + drm_dbg(dev, "Failed to allocate task array\n"); + ret = -ENOMEM; + goto fail; + } + + for (i = 0; i < rjob->task_count; i++) { + if (tasks[i].reserved != 0) { + drm_dbg(dev, "Reserved field in drm_rocket_task struct should be 0.\n"); + return -EINVAL; + } + + if (tasks[i].regcmd_count == 0) { + ret = -EINVAL; + goto fail; + } + rjob->tasks[i].regcmd = tasks[i].regcmd; + rjob->tasks[i].regcmd_count = tasks[i].regcmd_count; + } + +fail: + kvfree(tasks); + return ret; +} + +static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_job *job) +{ + struct rocket_task *task; + bool task_pp_en = 1; + bool task_count = 1; + + /* GO ! */ + + /* Don't queue the job if a reset is in progress */ + if (!atomic_read(&core->reset.pending)) { + task = &job->tasks[job->next_task_idx]; + job->next_task_idx++; /* TODO: Do this only after a successful run? */ + + rocket_pc_writel(core, BASE_ADDRESS, 0x1); + + rocket_cna_writel(core, S_POINTER, 0xe + 0x10000000 * core->index); + rocket_core_writel(core, S_POINTER, 0xe + 0x10000000 * core->index); + + rocket_pc_writel(core, BASE_ADDRESS, task->regcmd); + rocket_pc_writel(core, REGISTER_AMOUNTS, (task->regcmd_count + 1) / 2 - 1); + + rocket_pc_writel(core, INTERRUPT_MASK, + PC_INTERRUPT_MASK_DPU_0 | PC_INTERRUPT_MASK_DPU_1); + rocket_pc_writel(core, INTERRUPT_CLEAR, + PC_INTERRUPT_CLEAR_DPU_0 | PC_INTERRUPT_CLEAR_DPU_1); + + rocket_pc_writel(core, TASK_CON, ((0x6 | task_pp_en) << 12) | task_count); + + rocket_pc_writel(core, TASK_DMA_BASE_ADDR, 0x0); + + rocket_pc_writel(core, OPERATION_ENABLE, 0x1); + + dev_dbg(core->dev, + "Submitted regcmd at 0x%llx to core %d", + task->regcmd, core->index); + } +} + +static int rocket_acquire_object_fences(struct drm_gem_object **bos, + int bo_count, + struct drm_sched_job *job, + bool is_write) +{ + int i, ret; + + for (i = 0; i < bo_count; i++) { + ret = dma_resv_reserve_fences(bos[i]->resv, 1); + if (ret) + return ret; + + ret = drm_sched_job_add_implicit_dependencies(job, bos[i], + is_write); + if (ret) + return ret; + } + + return 0; +} + +static void rocket_attach_object_fences(struct drm_gem_object **bos, + int bo_count, + struct dma_fence *fence) +{ + int i; + + for (i = 0; i < bo_count; i++) + dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE); +} + +static int rocket_job_push(struct rocket_job *job) +{ + struct rocket_device *rdev = job->rdev; + struct drm_gem_object **bos; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *), + GFP_KERNEL); + memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *)); + memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *)); + + ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx); + if (ret) + goto err; + + mutex_lock(&rdev->sched_lock); + drm_sched_job_arm(&job->base); + + job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished); + + ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base, false); + if (ret) { + mutex_unlock(&rdev->sched_lock); + goto err_unlock; + } + + ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base, true); + if (ret) { + mutex_unlock(&rdev->sched_lock); + goto err_unlock; + } + + kref_get(&job->refcount); /* put by scheduler job completion */ + + drm_sched_entity_push_job(&job->base); + + mutex_unlock(&rdev->sched_lock); + + rocket_attach_object_fences(job->out_bos, job->out_bo_count, 
job->inference_done_fence); + +err_unlock: + drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx); +err: + kfree(bos); + + return ret; +} + +static void rocket_job_cleanup(struct kref *ref) +{ + struct rocket_job *job = container_of(ref, struct rocket_job, + refcount); + unsigned int i; + + dma_fence_put(job->done_fence); + dma_fence_put(job->inference_done_fence); + + if (job->in_bos) { + for (i = 0; i < job->in_bo_count; i++) + drm_gem_object_put(job->in_bos[i]); + + kvfree(job->in_bos); + } + + if (job->out_bos) { + for (i = 0; i < job->out_bo_count; i++) + drm_gem_object_put(job->out_bos[i]); + + kvfree(job->out_bos); + } + + kfree(job->tasks); + + kfree(job); +} + +static void rocket_job_put(struct rocket_job *job) +{ + kref_put(&job->refcount, rocket_job_cleanup); +} + +static void rocket_job_free(struct drm_sched_job *sched_job) +{ + struct rocket_job *job = to_rocket_job(sched_job); + + drm_sched_job_cleanup(sched_job); + + rocket_job_put(job); +} + +static struct rocket_core *sched_to_core(struct rocket_device *rdev, + struct drm_gpu_scheduler *sched) +{ + unsigned int core; + + for (core = 0; core < rdev->num_cores; core++) { + if (&rdev->cores[core].sched == sched) + return &rdev->cores[core]; + } + + return NULL; +} + +static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job) +{ + struct rocket_job *job = to_rocket_job(sched_job); + struct rocket_device *rdev = job->rdev; + struct rocket_core *core = sched_to_core(rdev, sched_job->sched); + struct dma_fence *fence = NULL; + int ret; + + if (unlikely(job->base.s_fence->finished.error)) + return NULL; + + /* + * Nothing to execute: can happen if the job has finished while + * we were resetting the GPU. + */ + if (job->next_task_idx == job->task_count) + return NULL; + + fence = rocket_fence_create(core); + if (IS_ERR(fence)) + return fence; + + if (job->done_fence) + dma_fence_put(job->done_fence); + job->done_fence = dma_fence_get(fence); + + ret = pm_runtime_get_sync(core->dev); + if (ret < 0) + return fence; + + spin_lock(&core->job_lock); + + core->in_flight_job = job; + rocket_job_hw_submit(core, job); + + spin_unlock(&core->job_lock); + + return fence; +} + +static void rocket_job_handle_done(struct rocket_core *core, + struct rocket_job *job) +{ + if (job->next_task_idx < job->task_count) { + rocket_job_hw_submit(core, job); + return; + } + + core->in_flight_job = NULL; + dma_fence_signal_locked(job->done_fence); + pm_runtime_put_autosuspend(core->dev); +} + +static void rocket_job_handle_irq(struct rocket_core *core) +{ + u32 status, raw_status; + + pm_runtime_mark_last_busy(core->dev); + + status = rocket_pc_readl(core, INTERRUPT_STATUS); + raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS); + + rocket_pc_writel(core, OPERATION_ENABLE, 0x0); + rocket_pc_writel(core, INTERRUPT_CLEAR, 0x1ffff); + + spin_lock(&core->job_lock); + + if (core->in_flight_job) + rocket_job_handle_done(core, core->in_flight_job); + + spin_unlock(&core->job_lock); +} + +static void +rocket_reset(struct rocket_core *core, struct drm_sched_job *bad) +{ + bool cookie; + + if (!atomic_read(&core->reset.pending)) + return; + + /* + * Stop the scheduler. + * + * FIXME: We temporarily get out of the dma_fence_signalling section + * because the cleanup path generate lockdep splats when taking locks + * to release job resources. 
We should rework the code to follow this + * pattern: + * + * try_lock + * if (locked) + * release + * else + * schedule_work_to_release_later + */ + drm_sched_stop(&core->sched, bad); + + cookie = dma_fence_begin_signalling(); + + if (bad) + drm_sched_increase_karma(bad); + + /* + * Mask job interrupts and synchronize to make sure we won't be + * interrupted during our reset. + */ + rocket_pc_writel(core, INTERRUPT_MASK, 0x0); + synchronize_irq(core->irq); + + /* Handle the remaining interrupts before we reset. */ + rocket_job_handle_irq(core); + + /* + * Remaining interrupts have been handled, but we might still have + * stuck jobs. Let's make sure the PM counters stay balanced by + * manually calling pm_runtime_put_noidle() and + * rocket_devfreq_record_idle() for each stuck job. + * Let's also make sure the cycle counting register's refcnt is + * kept balanced to prevent it from running forever + */ + spin_lock(&core->job_lock); + if (core->in_flight_job) + pm_runtime_put_noidle(core->dev); + + core->in_flight_job = NULL; + spin_unlock(&core->job_lock); + + /* Proceed with reset now. */ + pm_runtime_force_suspend(core->dev); + pm_runtime_force_resume(core->dev); + + /* GPU has been reset, we can clear the reset pending bit. */ + atomic_set(&core->reset.pending, 0); + + /* + * Now resubmit jobs that were previously queued but didn't have a + * chance to finish. + * FIXME: We temporarily get out of the DMA fence signalling section + * while resubmitting jobs because the job submission logic will + * allocate memory with the GFP_KERNEL flag which can trigger memory + * reclaim and exposes a lock ordering issue. + */ + dma_fence_end_signalling(cookie); + drm_sched_resubmit_jobs(&core->sched); + cookie = dma_fence_begin_signalling(); + + /* Restart the scheduler */ + drm_sched_start(&core->sched, 0); + + dma_fence_end_signalling(cookie); +} + +static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_job) +{ + struct rocket_job *job = to_rocket_job(sched_job); + struct rocket_device *rdev = job->rdev; + struct rocket_core *core = sched_to_core(rdev, sched_job->sched); + + /* + * If the GPU managed to complete this jobs fence, the timeout is + * spurious. Bail out. + */ + if (dma_fence_is_signaled(job->done_fence)) + return DRM_GPU_SCHED_STAT_NOMINAL; + + /* + * Rocket IRQ handler may take a long time to process an interrupt + * if there is another IRQ handler hogging the processing. + * For example, the HDMI encoder driver might be stuck in the IRQ + * handler for a significant time in a case of bad cable connection. + * In order to catch such cases and not report spurious rocket + * job timeouts, synchronize the IRQ handler and re-check the fence + * status. 
+ */ + synchronize_irq(core->irq); + + if (dma_fence_is_signaled(job->done_fence)) { + dev_warn(core->dev, "unexpectedly high interrupt latency\n"); + return DRM_GPU_SCHED_STAT_NOMINAL; + } + + dev_err(core->dev, "gpu sched timeout"); + + atomic_set(&core->reset.pending, 1); + rocket_reset(core, sched_job); + + return DRM_GPU_SCHED_STAT_NOMINAL; +} + +static void rocket_reset_work(struct work_struct *work) +{ + struct rocket_core *core; + + core = container_of(work, struct rocket_core, reset.work); + rocket_reset(core, NULL); +} + +static const struct drm_sched_backend_ops rocket_sched_ops = { + .run_job = rocket_job_run, + .timedout_job = rocket_job_timedout, + .free_job = rocket_job_free +}; + +static irqreturn_t rocket_job_irq_handler_thread(int irq, void *data) +{ + struct rocket_core *core = data; + + rocket_job_handle_irq(core); + + return IRQ_HANDLED; +} + +static irqreturn_t rocket_job_irq_handler(int irq, void *data) +{ + struct rocket_core *core = data; + u32 raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS); + + WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR); + WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR); + + if (!(raw_status & PC_INTERRUPT_RAW_STATUS_DPU_0 || + raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1)) + return IRQ_NONE; + + rocket_pc_writel(core, INTERRUPT_MASK, 0x0); + + return IRQ_WAKE_THREAD; +} + +int rocket_job_init(struct rocket_core *core) +{ + struct drm_sched_init_args args = { + .ops = &rocket_sched_ops, + .num_rqs = DRM_SCHED_PRIORITY_COUNT, + .credit_limit = 1, + .timeout = msecs_to_jiffies(JOB_TIMEOUT_MS), + .name = dev_name(core->dev), + .dev = core->dev, + }; + int ret; + + INIT_WORK(&core->reset.work, rocket_reset_work); + spin_lock_init(&core->job_lock); + + core->irq = platform_get_irq(to_platform_device(core->dev), 0); + if (core->irq < 0) + return core->irq; + + ret = devm_request_threaded_irq(core->dev, core->irq, + rocket_job_irq_handler, + rocket_job_irq_handler_thread, + IRQF_SHARED, KBUILD_MODNAME "-job", + core); + if (ret) { + dev_err(core->dev, "failed to request job irq"); + return ret; + } + + core->reset.wq = alloc_ordered_workqueue("rocket-reset-%d", 0, core->index); + if (!core->reset.wq) + return -ENOMEM; + + core->fence_context = dma_fence_context_alloc(1); + + args.timeout_wq = core->reset.wq; + ret = drm_sched_init(&core->sched, &args); + if (ret) { + dev_err(core->dev, "Failed to create scheduler: %d.", ret); + goto err_sched; + } + + return 0; + +err_sched: + drm_sched_fini(&core->sched); + + destroy_workqueue(core->reset.wq); + return ret; +} + +void rocket_job_fini(struct rocket_core *core) +{ + drm_sched_fini(&core->sched); + + cancel_work_sync(&core->reset.work); + destroy_workqueue(core->reset.wq); +} + +int rocket_job_open(struct rocket_file_priv *rocket_priv) +{ + struct rocket_device *rdev = rocket_priv->rdev; + struct drm_gpu_scheduler **scheds = kmalloc_array(rdev->num_cores, sizeof(scheds), + GFP_KERNEL); + unsigned int core; + int ret; + + for (core = 0; core < rdev->num_cores; core++) + scheds[core] = &rdev->cores[core].sched; + + ret = drm_sched_entity_init(&rocket_priv->sched_entity, + DRM_SCHED_PRIORITY_NORMAL, + scheds, + rdev->num_cores, NULL); + if (WARN_ON(ret)) + return ret; + + return 0; +} + +void rocket_job_close(struct rocket_file_priv *rocket_priv) +{ + struct drm_sched_entity *entity = &rocket_priv->sched_entity; + + kfree(entity->sched_list); + drm_sched_entity_destroy(entity); +} + +int rocket_job_is_idle(struct rocket_core *core) +{ + /* If there are any jobs in this 
HW queue, we're not idle */ + if (atomic_read(&core->sched.credit_count)) + return false; + + return true; +} + +static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file, + struct drm_rocket_job *job) +{ + struct rocket_device *rdev = to_rocket_device(dev); + struct rocket_file_priv *file_priv = file->driver_priv; + struct rocket_job *rjob = NULL; + int ret = 0; + + if (job->task_count == 0) + return -EINVAL; + + rjob = kzalloc(sizeof(*rjob), GFP_KERNEL); + if (!rjob) + return -ENOMEM; + + kref_init(&rjob->refcount); + + rjob->rdev = rdev; + + ret = drm_sched_job_init(&rjob->base, + &file_priv->sched_entity, + 1, NULL); + if (ret) + goto out_put_job; + + ret = rocket_copy_tasks(dev, file, job, rjob); + if (ret) + goto out_cleanup_job; + + ret = drm_gem_objects_lookup(file, + (void __user *)(uintptr_t)job->in_bo_handles, + job->in_bo_handle_count, &rjob->in_bos); + if (ret) + goto out_cleanup_job; + + rjob->in_bo_count = job->in_bo_handle_count; + + ret = drm_gem_objects_lookup(file, + (void __user *)(uintptr_t)job->out_bo_handles, + job->out_bo_handle_count, &rjob->out_bos); + if (ret) + goto out_cleanup_job; + + rjob->out_bo_count = job->out_bo_handle_count; + + ret = rocket_job_push(rjob); + if (ret) + goto out_cleanup_job; + +out_cleanup_job: + if (ret) + drm_sched_job_cleanup(&rjob->base); +out_put_job: + rocket_job_put(rjob); + + return ret; +} + +int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct drm_rocket_submit *args = data; + struct drm_rocket_job *jobs; + int ret = 0; + unsigned int i = 0; + + if (args->reserved != 0) { + drm_dbg(dev, "Reserved field in drm_rocket_submit struct should be 0.\n"); + return -EINVAL; + } + + jobs = kvmalloc_array(args->job_count, sizeof(*jobs), GFP_KERNEL); + if (!jobs) { + drm_dbg(dev, "Failed to allocate incoming job array\n"); + return -ENOMEM; + } + + if (copy_from_user(jobs, + (void __user *)(uintptr_t)args->jobs, + args->job_count * sizeof(*jobs))) { + ret = -EFAULT; + drm_dbg(dev, "Failed to copy incoming job array\n"); + goto exit; + } + + for (i = 0; i < args->job_count; i++) { + if (jobs[i].reserved != 0) { + drm_dbg(dev, "Reserved field in drm_rocket_job struct should be 0.\n"); + return -EINVAL; + } + + rocket_ioctl_submit_job(dev, file, &jobs[i]); + } + +exit: + kfree(jobs); + + return ret; +} diff --git a/drivers/accel/rocket/rocket_job.h b/drivers/accel/rocket/rocket_job.h new file mode 100644 index 0000000000000000000000000000000000000000..99e1928fbd89f9b506c63bf9dd591124feeb54b5 --- /dev/null +++ b/drivers/accel/rocket/rocket_job.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright 2024-2025 Tomeu Vizoso */ + +#ifndef __ROCKET_JOB_H__ +#define __ROCKET_JOB_H__ + +#include +#include + +#include "rocket_core.h" +#include "rocket_drv.h" + +struct rocket_task { + u64 regcmd; + u32 regcmd_count; +}; + +struct rocket_job { + struct drm_sched_job base; + + struct rocket_device *rdev; + + struct drm_gem_object **in_bos; + struct drm_gem_object **out_bos; + + u32 in_bo_count; + u32 out_bo_count; + + struct rocket_task *tasks; + u32 task_count; + u32 next_task_idx; + + /* Fence to be signaled by drm-sched once its done with the job */ + struct dma_fence *inference_done_fence; + + /* Fence to be signaled by IRQ handler when the job is complete. 
+	struct dma_fence *done_fence;
+
+	struct kref refcount;
+};
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_job_init(struct rocket_core *core);
+void rocket_job_fini(struct rocket_core *core);
+int rocket_job_open(struct rocket_file_priv *rocket_priv);
+void rocket_job_close(struct rocket_file_priv *rocket_priv);
+int rocket_job_is_idle(struct rocket_core *core);
+
+#endif
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index 95720702b7c4413d72b89c1f0f59abb22dc8c6b3..cb1b5934c201160e7650aabd1b3a2b1c77b1fd7b 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -12,8 +12,10 @@ extern "C" {
 #endif
 
 #define DRM_ROCKET_CREATE_BO 0x00
+#define DRM_ROCKET_SUBMIT 0x01
 
 #define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
+#define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -37,6 +39,68 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+/**
+ * struct drm_rocket_task - A task to be run on the NPU
+ *
+ * A task is the smallest unit of work that can be run on the NPU.
+ */
+struct drm_rocket_task {
+	/** Input: DMA address to NPU mapping of register command buffer */
+	__u64 regcmd;
+
+	/** Input: Number of commands in the register command buffer */
+	__u32 regcmd_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
+/**
+ * struct drm_rocket_job - A job to be run on the NPU
+ *
+ * The kernel will schedule the execution of this job taking into account its
+ * dependencies with other jobs. All tasks in the same job will be executed
+ * sequentially on the same core, to benefit from memory residency in SRAM.
+ */
+struct drm_rocket_job {
+	/** Input: Pointer to an array of struct drm_rocket_task. */
+	__u64 tasks;
+
+	/** Input: Pointer to a u32 array of the BOs that are read by the job. */
+	__u64 in_bo_handles;
+
+	/** Input: Pointer to a u32 array of the BOs that are written to by the job. */
+	__u64 out_bo_handles;
+
+	/** Input: Number of tasks passed in. */
+	__u32 task_count;
+
+	/** Input: Number of input BO handles passed in (size is that times 4). */
+	__u32 in_bo_handle_count;
+
+	/** Input: Number of output BO handles passed in (size is that times 4). */
+	__u32 out_bo_handle_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
+/**
+ * struct drm_rocket_submit - ioctl argument for submitting commands to the NPU.
+ *
+ * The kernel will schedule the execution of these jobs in dependency order.
+ */
+struct drm_rocket_submit {
+	/** Input: Pointer to an array of struct drm_rocket_job. */
+	__u64 jobs;
+
+	/** Input: Number of jobs passed in. */
+	__u32 job_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
 #if defined(__cplusplus)
 }
 #endif
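
For illustration only (this aside is not part of the patch): a minimal user-space
submission against the UAPI above might look roughly like the sketch below. It
assumes a render node already open as fd, a register command buffer already set
up for the NPU at DMA address regcmd_dma with regcmd_count commands, and BO
handles previously obtained through DRM_IOCTL_ROCKET_CREATE_BO. The helper name
rocket_submit_one_job and the <drm/rocket_accel.h> include path are assumptions,
and error handling is omitted.

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/rocket_accel.h>

/* Hypothetical helper: submit a single job made of a single task. */
static int rocket_submit_one_job(int fd, __u64 regcmd_dma, __u32 regcmd_count,
				 __u32 *in_handles, __u32 in_count,
				 __u32 *out_handles, __u32 out_count)
{
	struct drm_rocket_task task = {
		.regcmd = regcmd_dma,
		.regcmd_count = regcmd_count,
	};
	struct drm_rocket_job job = {
		.tasks = (__u64)(uintptr_t)&task,
		.task_count = 1,
		.in_bo_handles = (__u64)(uintptr_t)in_handles,
		.in_bo_handle_count = in_count,
		.out_bo_handles = (__u64)(uintptr_t)out_handles,
		.out_bo_handle_count = out_count,
	};
	struct drm_rocket_submit submit = {
		.jobs = (__u64)(uintptr_t)&job,
		.job_count = 1,
	};

	/* The designated initializers leave the reserved fields at zero, which
	 * the kernel now requires.
	 */
	return ioctl(fd, DRM_IOCTL_ROCKET_SUBMIT, &submit);
}

The in/out BO handle arrays are what lets the kernel order this job against
other work touching the same buffers, per the drm_rocket_job documentation
above.
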
From patchwork Fri May 16 16:53:22 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 890777
From: Tomeu Vizoso
Date: Fri, 16 May 2025 18:53:22 +0200
Subject: [PATCH v3 08/10] accel/rocket: Add IOCTLs for synchronizing memory accesses
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20250516-6-10-rocket-v3-8-7051ac9225db@tomeuvizoso.net>
References: <20250516-6-10-rocket-v3-0-7051ac9225db@tomeuvizoso.net>
In-Reply-To: <20250516-6-10-rocket-v3-0-7051ac9225db@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
 Tomeu Vizoso
X-Mailer: b4 0.14.2

The NPU cores have their own access to the memory bus, and this isn't
cache coherent with the CPUs.

Add IOCTLs so userspace can mark when the caches need to be flushed, and
also when a writer job needs to be waited for before the buffer can be
accessed from the CPU.

Initially based on the same IOCTLs from the Etnaviv driver.

v2:
- Don't break UABI by reordering the IOCTL IDs (Jeff Hugo)

v3:
- Check that padding fields in IOCTLs are zero (Jeff Hugo)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/rocket_drv.c |  2 +
 drivers/accel/rocket/rocket_gem.c | 80 +++++++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_gem.h |  5 +++
 include/uapi/drm/rocket_accel.h   | 37 ++++++++++++++++++
 4 files changed, 124 insertions(+)

diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index 18ff051c336a14b7dda72d235faffb7a55a0a8ee..75ccf6e14b2bed80005a70b8cc06925b7c3ac405 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -58,6 +58,8 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
 	ROCKET_IOCTL(CREATE_BO, create_bo),
 	ROCKET_IOCTL(SUBMIT, submit),
+	ROCKET_IOCTL(PREP_BO, prep_bo),
+	ROCKET_IOCTL(FINI_BO, fini_bo),
 };
 
 DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
index 8a8a7185daac4740081293aae6945c9b2bbeb2dd..cdc5238a93fa5978129dc1ac8ec8de955160dc18 100644
--- a/drivers/accel/rocket/rocket_gem.c
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -129,3 +129,83 @@ int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file)
 
 	return ret;
 }
+
+static inline enum dma_data_direction rocket_op_to_dma_dir(u32 op)
+{
+	if (op & ROCKET_PREP_READ)
+		return DMA_FROM_DEVICE;
+	else if (op & ROCKET_PREP_WRITE)
+		return DMA_TO_DEVICE;
+	else
+		return DMA_BIDIRECTIONAL;
+}
+
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_prep_bo *args = data;
+	unsigned long timeout = drm_timeout_abs_to_jiffies(args->timeout_ns);
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct drm_gem_object *gem_obj;
+	struct drm_gem_shmem_object *shmem_obj;
+	bool write = !!(args->op & ROCKET_PREP_WRITE);
+	long ret = 0;
+
+	if (args->op & ~(ROCKET_PREP_READ | ROCKET_PREP_WRITE))
+		return -EINVAL;
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	ret = dma_resv_wait_timeout(gem_obj->resv, dma_resv_usage_rw(write),
+				    true, timeout);
+	if (!ret)
+		ret = timeout ? -ETIMEDOUT : -EBUSY;
+
+	shmem_obj = &to_rocket_bo(gem_obj)->base;
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_cpu(rdev->cores[core].dev, shmem_obj->sgt,
+					 rocket_op_to_dma_dir(args->op));
+	}
+
+	to_rocket_bo(gem_obj)->last_cpu_prep_op = args->op;
+
+	drm_gem_object_put(gem_obj);
+
+	return ret;
+}
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct drm_rocket_fini_bo *args = data;
+	struct drm_gem_shmem_object *shmem_obj;
+	struct rocket_gem_object *rkt_obj;
+	struct drm_gem_object *gem_obj;
+
+	if (args->reserved != 0) {
+		drm_dbg(dev, "Reserved field in drm_rocket_fini_bo struct should be 0.\n");
+		return -EINVAL;
+	}
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	rkt_obj = to_rocket_bo(gem_obj);
+	shmem_obj = &rkt_obj->base;
+
+	WARN_ON(rkt_obj->last_cpu_prep_op == 0);
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_device(rdev->cores[core].dev, shmem_obj->sgt,
+					    rocket_op_to_dma_dir(rkt_obj->last_cpu_prep_op));
+	}
+
+	rkt_obj->last_cpu_prep_op = 0;
+
+	drm_gem_object_put(gem_obj);
+
+	return 0;
+}
diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h
index 41497554366961cfe18cf6c7e93ab1e4e5dc1886..2caa268f7f496f782996c6ad2c4eb851a225a86f 100644
--- a/drivers/accel/rocket/rocket_gem.h
+++ b/drivers/accel/rocket/rocket_gem.h
@@ -11,12 +11,17 @@ struct rocket_gem_object {
 	size_t size;
 	u32 offset;
+	u32 last_cpu_prep_op;
 };
 
 struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size);
 
 int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file);
 
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
 static inline struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj)
 {
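
To illustrate the CPU-access protocol this patch adds (again an illustrative
sketch, not part of the patch): user space brackets CPU accesses to a BO with
the two new IOCTLs declared in the UAPI hunk below. The helper names here are
hypothetical, fd and bo_handle are assumed to come from the existing
create/mmap path, and error handling is omitted. Note that timeout_ns is fed
to drm_timeout_abs_to_jiffies() on the kernel side, so it is interpreted as an
absolute CLOCK_MONOTONIC timestamp rather than a duration.

#include <stdint.h>
#include <time.h>
#include <sys/ioctl.h>
#include <drm/rocket_accel.h>

/* Turn a relative wait into the absolute CLOCK_MONOTONIC value that
 * timeout_ns expects.
 */
static int64_t rocket_abs_timeout(int64_t relative_ns)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000ll + ts.tv_nsec + relative_ns;
}

/* Hypothetical helper: wait for pending NPU writers and take CPU ownership of
 * the buffer for reading, e.g. to read back inference results.
 */
static int rocket_bo_begin_cpu_read(int fd, uint32_t bo_handle)
{
	struct drm_rocket_prep_bo prep = {
		.handle = bo_handle,
		.op = ROCKET_PREP_READ,
		.timeout_ns = rocket_abs_timeout(1000000000ll),	/* ~1s from now */
	};

	return ioctl(fd, DRM_IOCTL_ROCKET_PREP_BO, &prep);
}

/* Hypothetical helper: hand the buffer back to the NPU once the CPU is done
 * with it, so the caches are synchronized for device access again.
 */
static int rocket_bo_end_cpu_access(int fd, uint32_t bo_handle)
{
	struct drm_rocket_fini_bo fini = {
		.handle = bo_handle,
	};

	return ioctl(fd, DRM_IOCTL_ROCKET_FINI_BO, &fini);
}

A result read-back would thus be: PREP_BO with ROCKET_PREP_READ, read the
mapping, then FINI_BO; filling in new input data would use ROCKET_PREP_WRITE
before the CPU writes and FINI_BO afterwards, as described in the commit
message above.
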
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index cb1b5934c201160e7650aabd1b3a2b1c77b1fd7b..b5c80dd767be56e9720b51e4a82617a425a881a1 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -13,9 +13,13 @@ extern "C" {
 #endif
 
 #define DRM_ROCKET_CREATE_BO 0x00
 #define DRM_ROCKET_SUBMIT 0x01
+#define DRM_ROCKET_PREP_BO 0x02
+#define DRM_ROCKET_FINI_BO 0x03
 
 #define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
 #define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
+#define DRM_IOCTL_ROCKET_PREP_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_PREP_BO, struct drm_rocket_prep_bo)
+#define DRM_IOCTL_ROCKET_FINI_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_FINI_BO, struct drm_rocket_fini_bo)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -39,6 +43,39 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+#define ROCKET_PREP_READ	0x01
+#define ROCKET_PREP_WRITE	0x02
+
+/**
+ * struct drm_rocket_prep_bo - ioctl argument for starting CPU ownership of the BO.
+ *
+ * Takes care of waiting for any NPU jobs that might still be using the BO, and
+ * performs cache synchronization.
+ */
+struct drm_rocket_prep_bo {
+	/** Input: GEM handle of the buffer object. */
+	__u32 handle;
+
+	/** Input: mask of ROCKET_PREP_x flags, indicating the direction of the access. */
+	__u32 op;
+
+	/** Input: Amount of time to wait for NPU jobs. */
+	__s64 timeout_ns;
+};
+
+/**
+ * struct drm_rocket_fini_bo - ioctl argument for finishing CPU ownership of the BO.
+ *
+ * Synchronize caches for NPU access.
+ */
+struct drm_rocket_fini_bo {
+	/** Input: GEM handle of the buffer object. */
+	__u32 handle;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
 /**
  * struct drm_rocket_task - A task to be run on the NPU
  *