From patchwork Tue Sep 19 23:34:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 724417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A988CE79AD for ; Tue, 19 Sep 2023 23:36:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233524AbjISXgP (ORCPT ); Tue, 19 Sep 2023 19:36:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229935AbjISXgN (ORCPT ); Tue, 19 Sep 2023 19:36:13 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [46.235.227.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 586C2C6; Tue, 19 Sep 2023 16:36:07 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 6AE646606F85; Wed, 20 Sep 2023 00:36:05 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1695166565; bh=40cxse3JQ9cf/lsBsQtEwOdtU5azU2vw7zoB5H2GdF8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=C8j7q0NoShp33ewoqS/bCOJPwycWv+TT1lhUQRGUIPNZ82Kc+QLj93juuM22dtMSW GAmi2b4qx576lFXfIlc36ry7QJEn4wFB71l1C4WWwEb0GODqqRVctXYG4cjvGTodbN 36YJoXnQ16SHVzc6Zdtx9Vs8iqVVqICc40xzA1KKYXqmsTSSj1jrMAIsMn9GFQHByz V3AqUK9dOPrkkUBmccFiS935dw/OG+5+cwME0GJeNyN+onC8k6zt8C6UxFXaiyQu5p iYNZfsyXIgqj6bXHLRFH6aEARBTHxwIn8+J1RmNYf4s+rh6j+GM8AQ72JjywSLHmxO 9eaH+y1hsl+xA== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon Subject: [PATCH v6 1/6] drm/panfrost: Add cycle count GPU register definitions Date: Wed, 20 Sep 2023 00:34:49 +0100 Message-ID: <20230919233556.1458793-2-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230919233556.1458793-1-adrian.larumbe@collabora.com> References: <20230919233556.1458793-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org These GPU registers will be used when programming the cycle counter, which we need for providing accurate fdinfo drm-cycles values to user space. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon Reviewed-by: Steven Price --- drivers/gpu/drm/panfrost/panfrost_regs.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h index 919f44ac853d..55ec807550b3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_regs.h +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h @@ -46,6 +46,8 @@ #define GPU_CMD_SOFT_RESET 0x01 #define GPU_CMD_PERFCNT_CLEAR 0x03 #define GPU_CMD_PERFCNT_SAMPLE 0x04 +#define GPU_CMD_CYCLE_COUNT_START 0x05 +#define GPU_CMD_CYCLE_COUNT_STOP 0x06 #define GPU_CMD_CLEAN_CACHES 0x07 #define GPU_CMD_CLEAN_INV_CACHES 0x08 #define GPU_STATUS 0x34 @@ -73,6 +75,9 @@ #define GPU_PRFCNT_TILER_EN 0x74 #define GPU_PRFCNT_MMU_L2_EN 0x7c +#define GPU_CYCLE_COUNT_LO 0x90 +#define GPU_CYCLE_COUNT_HI 0x94 + #define GPU_THREAD_MAX_THREADS 0x0A0 /* (RO) Maximum number of threads per core */ #define GPU_THREAD_MAX_WORKGROUP_SIZE 0x0A4 /* (RO) Maximum workgroup size */ #define GPU_THREAD_MAX_BARRIER_SIZE 0x0A8 /* (RO) Maximum threads waiting at a barrier */ From patchwork Tue Sep 19 23:34:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 724416 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FC3CCE79AE for ; Tue, 19 Sep 2023 23:36:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229521AbjISXgQ (ORCPT ); Tue, 19 Sep 2023 19:36:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233061AbjISXgN (ORCPT ); Tue, 19 Sep 2023 19:36:13 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BDB93CE; Tue, 19 Sep 2023 16:36:07 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 34ADE6607083; Wed, 20 Sep 2023 00:36:06 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1695166566; bh=WfmVqHp4HoJXfVeYE4CYy4eKkMUnawX/XEzu8rruf8M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=M5CB5UvA5bECx5iHWKHHROxLc3ssUZ7NVNy2A6FCxMFv9Mw9/nfsItnoj7fMdcrnc QJ9p7so/hyytmlx+/JmFvlBUT5ncB7u9kuN8YHVPhcTJ9yFJnLm2/xY5a3VfuyDj9I m63jg2JFP6eCw6QqgPc995qVwseFtiO9NeXUs44r30NHe4Q1hzOly0hGG8fjWnm0ev VwpRGcASccjmvsPZKiSWeffEE2A868udfqBsiPRMSTfVoYqIhvSOTlni8Fzm4pO6Lo dmWxnP81ZOfxSuUJXbJyCsWc8FSzo7DK6m/xVV+/aM6Hyg4bm09M0sOWpGOVkaCYKR RcTwDmepl/6uA== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon Subject: [PATCH v6 4/6] drm/drm_file: Add DRM obj's RSS reporting function for fdinfo Date: Wed, 20 Sep 2023 00:34:52 +0100 Message-ID: <20230919233556.1458793-5-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230919233556.1458793-1-adrian.larumbe@collabora.com> References: <20230919233556.1458793-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org Some BO's might be mapped onto physical memory chunkwise and on demand, like Panfrost's tiler heap. In this case, even though the drm_gem_shmem_object page array might already be allocated, only a very small fraction of the BO is currently backed by system memory, but drm_show_memory_stats will then proceed to add its entire virtual size to the file's total resident size regardless. This led to very unrealistic RSS sizes being reckoned for Panfrost, where said tiler heap buffer is initially allocated with a virtual size of 128 MiB, but only a small part of it will eventually be backed by system memory after successive GPU page faults. Provide a new DRM object generic function that would allow drivers to return a more accurate RSS size for their BOs. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon Reviewed-by: Steven Price --- drivers/gpu/drm/drm_file.c | 5 ++++- include/drm/drm_gem.h | 9 +++++++++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 883d83bc0e3d..762965e3d503 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) } if (s & DRM_GEM_OBJECT_RESIDENT) { - status.resident += obj->size; + if (obj->funcs && obj->funcs->rss) + status.resident += obj->funcs->rss(obj); + else + status.resident += obj->size; } else { /* If already purged or not yet backed by pages, don't * count it as purgeable: diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index bc9f6aa2f3fe..16364487fde9 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { */ enum drm_gem_object_status (*status)(struct drm_gem_object *obj); + /** + * @rss: + * + * Return resident size of the object in physical memory. + * + * Called by drm_show_memory_stats(). + */ + size_t (*rss)(struct drm_gem_object *obj); + /** * @vm_ops: * From patchwork Tue Sep 19 23:34:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 724415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F7B8CE79A8 for ; Tue, 19 Sep 2023 23:36:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233569AbjISXgR (ORCPT ); Tue, 19 Sep 2023 19:36:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42004 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233527AbjISXgP (ORCPT ); Tue, 19 Sep 2023 19:36:15 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D289C0; Tue, 19 Sep 2023 16:36:10 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 7D4B566071A3; Wed, 20 Sep 2023 00:36:06 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1695166566; bh=D8FJsJmU8pGFx+LOiGoEx6ey1t/Dq5oamCR5JeST1eU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YCYqfPyetEQv6KYvkVIOrIdkxnPy2Hcc2l9PSOFksuwp59HyGtjFay7IV5ca4jjGY JGFH5A0wMriM+Ew5F3IZXMpjTDtDFrobT4KwMs34gy9LcqaAcjUfEt+KC+JK4je3BT bb+tF1+EwCr5G5yh2VH3KEBgOGx80IJsbuQXkzO+UxcB814wKf0YU1dUpXcvyeWOv0 +d67z+WcaJecv+45WCTmYYK1fm4sHFIijzaPs6pYwo8F1v6xZeI6E4OKCiW0et7obi +b0zUzgPl3JqPtXmyx/KMPseKccsOQImlMKkj3k+k2qCSNzm+nk2lgbpBL+ZJlLZ1v POf6zVZGikmNQ== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon Subject: [PATCH v6 5/6] drm/panfrost: Implement generic DRM object RSS reporting function Date: Wed, 20 Sep 2023 00:34:53 +0100 Message-ID: <20230919233556.1458793-6-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230919233556.1458793-1-adrian.larumbe@collabora.com> References: <20230919233556.1458793-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org BO's RSS is updated every time new pages are allocated on demand and mapped for the object at GPU page fault's IRQ handler, but only for heap buffers. The reason this is unnecessary for non-heap buffers is that they are mapped onto the GPU's VA space and backed by physical memory in their entirety at BO creation time. This calculation is unnecessary for imported PRIME objects, since heap buffers cannot be exported by our driver, and the actual BO RSS size is the one reported in its attached dmabuf structure. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon Reviewed-by: Steven Price --- drivers/gpu/drm/panfrost/panfrost_gem.c | 15 +++++++++++++++ drivers/gpu/drm/panfrost/panfrost_gem.h | 5 +++++ drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 + 3 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index 7d8f83d20539..4365434b48db 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -208,6 +208,20 @@ static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj return res; } +static size_t panfrost_gem_rss(struct drm_gem_object *obj) +{ + struct panfrost_gem_object *bo = to_panfrost_bo(obj); + + if (bo->is_heap) { + return bo->heap_rss_size; + } else if (bo->base.pages) { + WARN_ON(bo->heap_rss_size); + return bo->base.base.size; + } else { + return 0; + } +} + static const struct drm_gem_object_funcs panfrost_gem_funcs = { .free = panfrost_gem_free_object, .open = panfrost_gem_open, @@ -220,6 +234,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = { .vunmap = drm_gem_shmem_object_vunmap, .mmap = drm_gem_shmem_object_mmap, .status = panfrost_gem_status, + .rss = panfrost_gem_rss, .vm_ops = &drm_gem_shmem_vm_ops, }; diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h index ad2877eeeccd..13c0a8149c3a 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.h +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h @@ -36,6 +36,11 @@ struct panfrost_gem_object { */ atomic_t gpu_usecount; + /* + * Object chunk size currently mapped onto physical memory + */ + size_t heap_rss_size; + bool noexec :1; bool is_heap :1; }; diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index d54d4e7b2195..7b1490cdaa48 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -522,6 +522,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); bomapping->active = true; + bo->heap_rss_size += SZ_2; dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr);