From patchwork Tue Sep 5 18:45:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 601FBC83F3E for ; Tue, 5 Sep 2023 19:07:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232982AbjIETH4 (ORCPT ); Tue, 5 Sep 2023 15:07:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241782AbjIETHt (ORCPT ); Tue, 5 Sep 2023 15:07:49 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 45DB8CFD; Tue, 5 Sep 2023 12:07:17 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 69C5266071F8; Tue, 5 Sep 2023 19:45:44 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939544; bh=40cxse3JQ9cf/lsBsQtEwOdtU5azU2vw7zoB5H2GdF8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=U4d860lZYNiD+uQ8PYYCwqyD/zDNxDDoQ0nDrUu6G9IPoMKqJv/w0xhHnn9o6E6FS fEBvzZNPX63Fj8NAYMM2W5e1lne8Wf9EexMYLOGpVmaYuYv0hgkX8teD4bZgrS/iyv 64cIa0uZYV+84JB5wwK7y+LL7C1hV4WT2ee/41Uyzh3hdg5jDrCwhbaldfA33z/+XP JQ0bIfhLJOdV5gwSNXv2/kqJxCHH+Z/pLOFBnH9YAom5aB0i11qLux0FvJHqoJahXQ DSUbjOKV4TXe00lnjrFtfnGAPHBTZLv72n7aR/WraPtJ5w3tF/zxJSLpsbgj/0e9oM GlwImsrKAX+Og== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon Subject: [PATCH v3 1/8] drm/panfrost: Add cycle count GPU register definitions Date: Tue, 5 Sep 2023 19:45:17 +0100 Message-ID: <20230905184533.959171-2-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org These GPU registers will be used when programming the cycle counter, which we need for providing accurate fdinfo drm-cycles values to user space. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon Reviewed-by: Steven Price --- drivers/gpu/drm/panfrost/panfrost_regs.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h index 919f44ac853d..55ec807550b3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_regs.h +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h @@ -46,6 +46,8 @@ #define GPU_CMD_SOFT_RESET 0x01 #define GPU_CMD_PERFCNT_CLEAR 0x03 #define GPU_CMD_PERFCNT_SAMPLE 0x04 +#define GPU_CMD_CYCLE_COUNT_START 0x05 +#define GPU_CMD_CYCLE_COUNT_STOP 0x06 #define GPU_CMD_CLEAN_CACHES 0x07 #define GPU_CMD_CLEAN_INV_CACHES 0x08 #define GPU_STATUS 0x34 @@ -73,6 +75,9 @@ #define GPU_PRFCNT_TILER_EN 0x74 #define GPU_PRFCNT_MMU_L2_EN 0x7c +#define GPU_CYCLE_COUNT_LO 0x90 +#define GPU_CYCLE_COUNT_HI 0x94 + #define GPU_THREAD_MAX_THREADS 0x0A0 /* (RO) Maximum number of threads per core */ #define GPU_THREAD_MAX_WORKGROUP_SIZE 0x0A4 /* (RO) Maximum workgroup size */ #define GPU_THREAD_MAX_BARRIER_SIZE 0x0A8 /* (RO) Maximum threads waiting at a barrier */ From patchwork Tue Sep 5 18:45:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39A78C83F3E for ; Tue, 5 Sep 2023 19:07:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241197AbjIETHP (ORCPT ); Tue, 5 Sep 2023 15:07:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42434 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241181AbjIETHO (ORCPT ); Tue, 5 Sep 2023 15:07:14 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0C5FBE42; Tue, 5 Sep 2023 12:06:39 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id A4F36660725E; Tue, 5 Sep 2023 19:45:44 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939544; bh=T0w/9ewtGaSWFKXNKJjvPNLeGF3c6rlMCZOOwei27ac=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fKWc2FrDs195RFkGo1XHWwPKlwIxs1Tnoex74M1zYAWh2kptEVsSKvvTMx3LqGGNg vU89h7tUm31YrrPaGmbqvJ+tQ89x2bog8QB00CAaNxu7nyNNkhiN/wQJLuo1Ll6V0O 4G9qBbIgcnIFAPm4p6rzI9WLQxt1cq9RnypvuxRoX57Ua2GDFfyg6s+pswC0AUsCQ8 T86rSVf3R9u8Q8hFLK36FAEwhfKeuHNVlQMjf1Y+Ix6RgbZT+fhUw9K+Gk8/wWrClZ P2gJKzaaHPWLrwCog78L1aQ8LWYYNBI0PDYylm8i6N1t6+STz/zNtrXSkBrrDqvaKk aJ7++Zmov1COQ== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 2/8] drm/panfrost: Enable cycle counter register upon job submission Date: Tue, 5 Sep 2023 19:45:18 +0100 Message-ID: <20230905184533.959171-3-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org In a future development, we will want to keep track of the number of GPU cycles spent on a given job. That means we should enable it only when the GPU has work to do, and switch it off whenever it is idle to avoid power waste. To avoid race conditions during enablement/disabling, a reference counting mechanism was introduced, and a job flag that tells us whether a given job increased the refcount. This is necessary, because a future development will let user space toggle cycle counting through a debugfs file, and a given job might have been in flight by the time cycle counting was disabled. Toggling of GPU cycle counting has to be done through a module parameter. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panfrost/panfrost_device.c | 5 +++ drivers/gpu/drm/panfrost/panfrost_device.h | 6 +++ drivers/gpu/drm/panfrost/panfrost_gpu.c | 43 ++++++++++++++++++++++ drivers/gpu/drm/panfrost/panfrost_gpu.h | 6 +++ drivers/gpu/drm/panfrost/panfrost_job.c | 10 +++++ drivers/gpu/drm/panfrost/panfrost_job.h | 1 + 6 files changed, 71 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c index fa1a086a862b..1ea2ac3804f0 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.c +++ b/drivers/gpu/drm/panfrost/panfrost_device.c @@ -18,6 +18,9 @@ #include "panfrost_mmu.h" #include "panfrost_perfcnt.h" +static bool profile; +module_param(profile, bool, 0600); + static int panfrost_reset_init(struct panfrost_device *pfdev) { pfdev->rstc = devm_reset_control_array_get_optional_exclusive(pfdev->dev); @@ -207,6 +210,8 @@ int panfrost_device_init(struct panfrost_device *pfdev) spin_lock_init(&pfdev->as_lock); + atomic_set(&pfdev->profile_mode, profile); + err = panfrost_clk_init(pfdev); if (err) { dev_err(pfdev->dev, "clk init failed %d\n", err); diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index b0126b9fbadc..5c09c9f3ae08 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -107,6 +107,7 @@ struct panfrost_device { struct list_head scheduled_jobs; struct panfrost_perfcnt *perfcnt; + atomic_t profile_mode; struct mutex sched_lock; @@ -121,6 +122,11 @@ struct panfrost_device { struct shrinker shrinker; struct panfrost_devfreq pfdevfreq; + + struct { + atomic_t use_count; + spinlock_t lock; + } cycle_counter; }; struct panfrost_mmu { diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c index 2faa344d89ee..fddbc72bf093 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c @@ -73,6 +73,8 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev) gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL); gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL); + atomic_set(&pfdev->cycle_counter.use_count, 0); + return 0; } @@ -321,6 +323,46 @@ static void panfrost_gpu_init_features(struct panfrost_device *pfdev) pfdev->features.shader_present, pfdev->features.l2_present); } +void panfrost_cycle_counter_get(struct panfrost_device *pfdev) +{ + if (atomic_inc_not_zero(&pfdev->cycle_counter.use_count)) + return; + + spin_lock(&pfdev->cycle_counter.lock); + if (atomic_inc_return(&pfdev->cycle_counter.use_count) == 1) + gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_START); + spin_unlock(&pfdev->cycle_counter.lock); +} + +void panfrost_cycle_counter_put(struct panfrost_device *pfdev) +{ + if (atomic_add_unless(&pfdev->cycle_counter.use_count, -1, 1)) + return; + + spin_lock(&pfdev->cycle_counter.lock); + if (atomic_dec_return(&pfdev->cycle_counter.use_count) == 0) + gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_STOP); + spin_unlock(&pfdev->cycle_counter.lock); +} + +void panfrost_cycle_counter_stop(struct panfrost_device *pfdev) +{ + atomic_set(&pfdev->profile_mode, 0); + gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_STOP); +} + +unsigned long long panfrost_cycle_counter_read(struct panfrost_device *pfdev) +{ + u32 hi, lo; + + do { + hi = gpu_read(pfdev, GPU_CYCLE_COUNT_HI); + lo = gpu_read(pfdev, GPU_CYCLE_COUNT_LO); + } while (hi != gpu_read(pfdev, GPU_CYCLE_COUNT_HI)); + + return ((u64)hi << 32) | lo; +} + void panfrost_gpu_power_on(struct panfrost_device *pfdev) { int ret; @@ -367,6 +409,7 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev) void panfrost_gpu_power_off(struct panfrost_device *pfdev) { + panfrost_cycle_counter_stop(pfdev); gpu_write(pfdev, TILER_PWROFF_LO, 0); gpu_write(pfdev, SHADER_PWROFF_LO, 0); gpu_write(pfdev, L2_PWROFF_LO, 0); diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.h b/drivers/gpu/drm/panfrost/panfrost_gpu.h index 468c51e7e46d..4d62e8901c79 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gpu.h +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.h @@ -16,6 +16,12 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev); void panfrost_gpu_power_on(struct panfrost_device *pfdev); void panfrost_gpu_power_off(struct panfrost_device *pfdev); +void panfrost_stop_cycle_counter(struct panfrost_device *pfdev); +void panfrost_cycle_counter_get(struct panfrost_device *pfdev); +void panfrost_cycle_counter_stop(struct panfrost_device *pfdev); +void panfrost_cycle_counter_put(struct panfrost_device *pfdev); +unsigned long long panfrost_cycle_counter_read(struct panfrost_device *pfdev); + void panfrost_gpu_amlogic_quirk(struct panfrost_device *pfdev); #endif diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 033f5e684707..8b1bf6ac48f8 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -297,6 +297,11 @@ int panfrost_job_push(struct panfrost_job *job) kref_get(&job->refcount); /* put by scheduler job completion */ + if (atomic_read(&pfdev->profile_mode)) { + panfrost_cycle_counter_get(pfdev); + job->is_profiled = true; + } + drm_sched_entity_push_job(&job->base); mutex_unlock(&pfdev->sched_lock); @@ -351,6 +356,9 @@ static void panfrost_job_free(struct drm_sched_job *sched_job) drm_sched_job_cleanup(sched_job); + if (job->is_profiled) + panfrost_cycle_counter_put(job->pfdev); + panfrost_job_put(job); } @@ -842,6 +850,8 @@ int panfrost_job_init(struct panfrost_device *pfdev) } } + spin_lock_init(&pfdev->cycle_counter.lock); + panfrost_job_enable_interrupts(pfdev); return 0; diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h index 8becc1ba0eb9..2aa0add35459 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.h +++ b/drivers/gpu/drm/panfrost/panfrost_job.h @@ -32,6 +32,7 @@ struct panfrost_job { /* Fence to be signaled by drm-sched once its done with the job */ struct dma_fence *render_done_fence; + bool is_profiled; }; int panfrost_job_init(struct panfrost_device *pfdev); From patchwork Tue Sep 5 18:45:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11165CA0FF7 for ; Tue, 5 Sep 2023 18:47:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230358AbjIESrZ (ORCPT ); Tue, 5 Sep 2023 14:47:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234289AbjIESrV (ORCPT ); Tue, 5 Sep 2023 14:47:21 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7184FE74; Tue, 5 Sep 2023 11:46:49 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id DCE59660728F; Tue, 5 Sep 2023 19:45:44 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939545; bh=MhxhJBJbBzaBTGhs6JZKNFkL0G+t+vvnmMffEkd8aFk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=c65tNdljPU4DPTedFXeJd+X/GP8ohzlgh8/44ZVw7K7d7+vv/Obg37TR0Qw09qS8f Zdh/UVc4balNx+Z7hgXdXOi/SgLPqkY5kRjJPFP5Nzr/fgKINnorLQpNhxlnJjbsL6 lsTx6MV97psqo5Dr+C4KiZaxYs7n7cZUTVMUgVYX4My/j3DxH3W6tPHuxrCispOCgH fczr4qDm3NRre6dEf++Rlow1zVhjZZlTxZ3omEHS5UFAk2AXKWBc9QitMHV3yx/dB9 lOIe5rHQ+oqCQQ5r2rZgC8ucvcDmXwHiueUN7m6qs5QwV5zJH1qL/MmFDWi82sHCXK JdhXxeav0pjuw== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 3/8] drm/panfrost: Enable debugfs toggling of cycle counter register Date: Tue, 5 Sep 2023 19:45:19 +0100 Message-ID: <20230905184533.959171-4-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org Allow user space to decide whether the cycle counting register should be enabled. The main goal is letting tools like nvtop or IGT's gputop access this information in debug builds to obtain engine utilisation numbers. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panfrost/Makefile | 2 + drivers/gpu/drm/panfrost/panfrost_debugfs.c | 51 +++++++++++++++++++++ drivers/gpu/drm/panfrost/panfrost_debugfs.h | 13 ++++++ drivers/gpu/drm/panfrost/panfrost_drv.c | 5 ++ 4 files changed, 71 insertions(+) create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.c create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.h diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile index 7da2b3f02ed9..2c01c1e7523e 100644 --- a/drivers/gpu/drm/panfrost/Makefile +++ b/drivers/gpu/drm/panfrost/Makefile @@ -12,4 +12,6 @@ panfrost-y := \ panfrost_perfcnt.o \ panfrost_dump.o +panfrost-$(CONFIG_DEBUG_FS) += panfrost_debugfs.o + obj-$(CONFIG_DRM_PANFROST) += panfrost.o diff --git a/drivers/gpu/drm/panfrost/panfrost_debugfs.c b/drivers/gpu/drm/panfrost/panfrost_debugfs.c new file mode 100644 index 000000000000..48d5ddfcb1c6 --- /dev/null +++ b/drivers/gpu/drm/panfrost/panfrost_debugfs.c @@ -0,0 +1,51 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright 2023 Collabora ltd. */ + +#include +#include +#include +#include + +#include "panfrost_device.h" +#include "panfrost_gpu.h" +#include "panfrost_debugfs.h" + +static int +profile_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct panfrost_device *pfdev = dev->dev_private; + + *val = atomic_read(&pfdev->profile_mode); + + return 0; +} + +static int +profile_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct panfrost_device *pfdev = dev->dev_private; + + if (atomic_read(&pfdev->profile_mode) == val) + return 0; + + if (val == false) + panfrost_cycle_counter_stop(pfdev); + else + atomic_set(&pfdev->profile_mode, 1); + + return 0; +} + +DEFINE_DEBUGFS_ATTRIBUTE(profile_fops, + profile_get, profile_set, + "%llu\n"); + +void panfrost_debugfs_init(struct drm_minor *minor) +{ + struct drm_device *dev = minor->dev; + + debugfs_create_file("profile", 0600, minor->debugfs_root, + dev, &profile_fops); +} diff --git a/drivers/gpu/drm/panfrost/panfrost_debugfs.h b/drivers/gpu/drm/panfrost/panfrost_debugfs.h new file mode 100644 index 000000000000..db1c158bcf2f --- /dev/null +++ b/drivers/gpu/drm/panfrost/panfrost_debugfs.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright 2023 Collabora ltd. + */ + +#ifndef PANFROST_DEBUGFS_H +#define PANFROST_DEBUGFS_H + +#ifdef CONFIG_DEBUG_FS +void panfrost_debugfs_init(struct drm_minor *minor); +#endif + +#endif /* PANFROST_DEBUGFS_H */ diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index a2ab99698ca8..2dfd9f79a31b 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -20,6 +20,7 @@ #include "panfrost_job.h" #include "panfrost_gpu.h" #include "panfrost_perfcnt.h" +#include "panfrost_debugfs.h" static bool unstable_ioctls; module_param_unsafe(unstable_ioctls, bool, 0600); @@ -546,6 +547,10 @@ static const struct drm_driver panfrost_drm_driver = { .gem_create_object = panfrost_gem_create_object, .gem_prime_import_sg_table = panfrost_gem_prime_import_sg_table, + +#ifdef CONFIG_DEBUG_FS + .debugfs_init = panfrost_debugfs_init, +#endif }; static int panfrost_probe(struct platform_device *pdev) From patchwork Tue Sep 5 18:45:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD34CCA1001 for ; Tue, 5 Sep 2023 19:11:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230502AbjIETLv (ORCPT ); Tue, 5 Sep 2023 15:11:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233371AbjIETLv (ORCPT ); Tue, 5 Sep 2023 15:11:51 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 430469B; Tue, 5 Sep 2023 12:11:26 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 2331B6607293; Tue, 5 Sep 2023 19:45:45 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939545; bh=Pu30e0FtKjnhfwqpwlIGFfeFS8D9XUETeCOm/pfZLMo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OpTsLa3rAKZ30ROExmo9ckNsvNcVUXf0R1QO2uv5PhuadMYc/+r1Te7KIBhaC0SFl UjFIRYsFsd+jhPL2WaB3clNGdRhVK4dN/yz3OFxLfytJR8SIehDCeIRdZ2VelQKod5 tID8ZS0GtbYZ441VBufUk52xG0paerToyvvXKhQHT3AWScGGp+XqHbu3W6J9D6RdzY /9i8uStBeBZAaumAQAIbr3KA4iCEqv/4J3IdxaHsS6XawLjWbrcwsxG6dg0J71Vxs4 3vf2fxFVIn5Z46ysX24AaXOXq/TSo/21pnnEmUTCjCyZ0SEGYO558NTIWSo9vu8Mnc sYu+1NocsVRkQ== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 4/8] drm/panfrost: Add fdinfo support GPU load metrics Date: Tue, 5 Sep 2023 19:45:20 +0100 Message-ID: <20230905184533.959171-5-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org The drm-stats fdinfo tags made available to user space are drm-engine, drm-cycles, drm-max-freq and drm-curfreq, one per job slot. This deviates from standard practice in other DRM drivers, where a single set of key:value pairs is provided for the whole render engine. However, Panfrost has separate queues for fragment and vertex/tiler jobs, so a decision was made to calculate bus cycles and workload times separately. Maximum operating frequency is calculated at devfreq initialisation time. Current frequency is made available to user space because nvtop uses it when performing engine usage calculations. It is important to bear in mind that both GPU cycle and kernel time numbers provided are at best rough estimations, and always reported in excess from the actual figure because of two reasons: - Excess time because of the delay between the end of a job processing, the subsequent job IRQ and the actual time of the sample. - Time spent in the engine queue waiting for the GPU to pick up the next job. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panfrost/panfrost_devfreq.c | 8 +++ drivers/gpu/drm/panfrost/panfrost_devfreq.h | 3 ++ drivers/gpu/drm/panfrost/panfrost_device.h | 13 +++++ drivers/gpu/drm/panfrost/panfrost_drv.c | 59 ++++++++++++++++++++- drivers/gpu/drm/panfrost/panfrost_job.c | 27 ++++++++++ drivers/gpu/drm/panfrost/panfrost_job.h | 4 ++ 6 files changed, 113 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c index 58dfb15a8757..28caffc689e2 100644 --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c @@ -58,6 +58,7 @@ static int panfrost_devfreq_get_dev_status(struct device *dev, spin_lock_irqsave(&pfdevfreq->lock, irqflags); panfrost_devfreq_update_utilization(pfdevfreq); + pfdevfreq->current_frequency = status->current_frequency; status->total_time = ktime_to_ns(ktime_add(pfdevfreq->busy_time, pfdevfreq->idle_time)); @@ -117,6 +118,7 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) struct devfreq *devfreq; struct thermal_cooling_device *cooling; struct panfrost_devfreq *pfdevfreq = &pfdev->pfdevfreq; + unsigned long freq = ULONG_MAX; if (pfdev->comp->num_supplies > 1) { /* @@ -172,6 +174,12 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) return ret; } + /* Find the fastest defined rate */ + opp = dev_pm_opp_find_freq_floor(dev, &freq); + if (IS_ERR(opp)) + return PTR_ERR(opp); + pfdevfreq->fast_rate = freq; + dev_pm_opp_put(opp); /* diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h b/drivers/gpu/drm/panfrost/panfrost_devfreq.h index 1514c1f9d91c..48dbe185f206 100644 --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h @@ -19,6 +19,9 @@ struct panfrost_devfreq { struct devfreq_simple_ondemand_data gov_data; bool opp_of_table_added; + unsigned long current_frequency; + unsigned long fast_rate; + ktime_t busy_time; ktime_t idle_time; ktime_t time_last_update; diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index 5c09c9f3ae08..7ad3973b1a3a 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -24,6 +24,7 @@ struct panfrost_perfcnt; #define NUM_JOB_SLOTS 3 #define MAX_PM_DOMAINS 5 +#define MAX_SLOT_NAME_LEN 25 struct panfrost_features { u16 id; @@ -141,12 +142,24 @@ struct panfrost_mmu { struct list_head list; }; +struct drm_info_gpu { + unsigned long maxfreq; + + struct engine_info { + unsigned long long elapsed_ns; + unsigned long long cycles; + char name[MAX_SLOT_NAME_LEN]; + } engines[NUM_JOB_SLOTS]; +}; + struct panfrost_file_priv { struct panfrost_device *pfdev; struct drm_sched_entity sched_entity[NUM_JOB_SLOTS]; struct panfrost_mmu *mmu; + + struct drm_info_gpu fdinfo; }; static inline struct panfrost_device *to_panfrost_device(struct drm_device *ddev) diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 2dfd9f79a31b..94787f4aee27 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -268,6 +268,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, job->requirements = args->requirements; job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev); job->mmu = file_priv->mmu; + job->priv = file_priv; slot = panfrost_job_get_slot(job); @@ -484,6 +485,14 @@ panfrost_open(struct drm_device *dev, struct drm_file *file) goto err_free; } + snprintf(panfrost_priv->fdinfo.engines[0].name, MAX_SLOT_NAME_LEN, "fragment"); + snprintf(panfrost_priv->fdinfo.engines[1].name, MAX_SLOT_NAME_LEN, "vertex-tiler"); +/* Not exposed to userspace yet */ +#if 0 + snprintf(panfrost_priv->fdinfo.engines[2].name, MAX_SLOT_NAME_LEN, "compute-only"); +#endif + panfrost_priv->fdinfo.maxfreq = pfdev->pfdevfreq.fast_rate; + ret = panfrost_job_open(panfrost_priv); if (ret) goto err_job; @@ -524,7 +533,54 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = { PANFROST_IOCTL(MADVISE, madvise, DRM_RENDER_ALLOW), }; -DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops); + +static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev, + struct panfrost_file_priv *panfrost_priv, + struct drm_printer *p) +{ + int i; + + /* + * IMPORTANT NOTE: drm-cycles and drm-engine measurements are not + * accurate, as they only provide a rough estimation of the number of + * GPU cycles and CPU time spent in a given context. This is due to two + * different factors: + * - Firstly, we must consider the time the CPU and then the kernel + * takes to process the GPU interrupt, which means additional time and + * GPU cycles will be added in excess to the real figure. + * - Secondly, the pipelining done by the Job Manager (2 job slots per + * engine) implies there is no way to know exactly how much time each + * job spent on the GPU. + */ + + for (i = 0; i < NUM_JOB_SLOTS - 1; i++) { + struct engine_info *ei = &panfrost_priv->fdinfo.engines[i]; + + drm_printf(p, "drm-engine-%s:\t%llu ns\n", + ei->name, ei->elapsed_ns); + drm_printf(p, "drm-cycles-%s:\t%llu\n", + ei->name, ei->cycles); + drm_printf(p, "drm-maxfreq-%s:\t%lu Hz\n", + ei->name, panfrost_priv->fdinfo.maxfreq); + drm_printf(p, "drm-curfreq-%s:\t%lu Hz\n", + ei->name, pfdev->pfdevfreq.current_frequency); + } +} + +static void panfrost_show_fdinfo(struct drm_printer *p, struct drm_file *file) +{ + struct drm_device *dev = file->minor->dev; + struct panfrost_device *pfdev = dev->dev_private; + + panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p); + +} + +static const struct file_operations panfrost_drm_driver_fops = { + .owner = THIS_MODULE, + DRM_GEM_FOPS, + .show_fdinfo = drm_show_fdinfo, +}; /* * Panfrost driver version: @@ -536,6 +592,7 @@ static const struct drm_driver panfrost_drm_driver = { .driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ, .open = panfrost_open, .postclose = panfrost_postclose, + .show_fdinfo = panfrost_show_fdinfo, .ioctls = panfrost_drm_driver_ioctls, .num_ioctls = ARRAY_SIZE(panfrost_drm_driver_ioctls), .fops = &panfrost_drm_driver_fops, diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 8b1bf6ac48f8..8a02e1ee9f9b 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -159,6 +159,25 @@ panfrost_dequeue_job(struct panfrost_device *pfdev, int slot) struct panfrost_job *job = pfdev->jobs[slot][0]; WARN_ON(!job); + if (job->priv) { + struct engine_info *engine_info = &job->priv->fdinfo.engines[slot]; + + if (atomic_read(&pfdev->profile_mode)) { + engine_info->elapsed_ns += + ktime_to_ns(ktime_sub(ktime_get(), job->start_time)); + engine_info->cycles += + panfrost_cycle_counter_read(pfdev) - job->start_cycles; + + /* Reset in case the job has to be requeued */ + job->start_time = 0; + /* A GPU reset puts the Cycle Counter register back to 0 */ + job->start_cycles = atomic_read(&pfdev->reset.pending) ? + 0 : panfrost_cycle_counter_read(pfdev); + } + } else + dev_WARN(pfdev->dev, "Panfrost DRM file closed when job was on flight\n"); + + pfdev->jobs[slot][0] = pfdev->jobs[slot][1]; pfdev->jobs[slot][1] = NULL; @@ -233,6 +252,11 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) subslot = panfrost_enqueue_job(pfdev, js, job); /* Don't queue the job if a reset is in progress */ if (!atomic_read(&pfdev->reset.pending)) { + if (atomic_read(&pfdev->profile_mode)) { + job->start_time = ktime_get(); + job->start_cycles = panfrost_cycle_counter_read(pfdev); + } + job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START); dev_dbg(pfdev->dev, "JS: Submitting atom %p to js[%d][%d] with head=0x%llx AS %d", @@ -936,6 +960,9 @@ void panfrost_job_close(struct panfrost_file_priv *panfrost_priv) } job_write(pfdev, JS_COMMAND(i), cmd); + + /* Jobs can outlive their file context */ + job->priv = NULL; } } spin_unlock(&pfdev->js->job_lock); diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h index 2aa0add35459..63bc830e057d 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.h +++ b/drivers/gpu/drm/panfrost/panfrost_job.h @@ -32,7 +32,11 @@ struct panfrost_job { /* Fence to be signaled by drm-sched once its done with the job */ struct dma_fence *render_done_fence; + + struct panfrost_file_priv *priv; bool is_profiled; + ktime_t start_time; + u64 start_cycles; }; int panfrost_job_init(struct panfrost_device *pfdev); From patchwork Tue Sep 5 18:45:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720422 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FC89CA1001 for ; Tue, 5 Sep 2023 19:02:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241476AbjIETCP (ORCPT ); Tue, 5 Sep 2023 15:02:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241625AbjIETCJ (ORCPT ); Tue, 5 Sep 2023 15:02:09 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EC54E5D; Tue, 5 Sep 2023 12:01:31 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 5F0B36607296; Tue, 5 Sep 2023 19:45:45 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939545; bh=8W3x2GM4aZpUf1aH5bfmuiuyCVBlSin1LhFM+JEzsek=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Q/+KFXXqyQVjPW30DuDcdWKy22MeFyMG0iPQtuzTuea0ZTohoGNMrkvrdH0w9ZMr0 7JWg/Ervd6qs2DTIV4jhky5OqRxNP3KAlRDu4gFvO1/Gb8IDlyMZIK5Er9v7UvLWtf 2g67T7TP9XQ+oOhKGSwiPLuCmeRuFtUm/BhgjZg8K/CjYw4AxayHQhyNCQUPmqbUa+ ZNXFTACTCGXKFaxnUsGf0H0kAgm4z4HR2rvNmZryPZ3Bxn32Axmm7Vw7o5yjx9QmJ2 t0HEGF7liHMJNcndLrO4BEBSoHAbtU1aWfMV5EqW+vEjEhTQKMGi13iXrBT3xgSWTb 8PcxMyVB31Jdw== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 5/8] drm/panfrost: Add fdinfo support for memory stats Date: Tue, 5 Sep 2023 19:45:21 +0100 Message-ID: <20230905184533.959171-6-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org A new DRM GEM object function is added so that drm_show_memory_stats can provide more accurate memory usage numbers. Ideally, in panfrost_gem_status, the BO's purgeable flag would be checked after locking the driver's shrinker mutex, but drm_show_memory_stats takes over the drm file's object handle database spinlock, so there's potential for a race condition here. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panfrost/panfrost_drv.c | 1 + drivers/gpu/drm/panfrost/panfrost_gem.c | 14 ++++++++++++++ 2 files changed, 15 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 94787f4aee27..3de7e821ef23 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -574,6 +574,7 @@ static void panfrost_show_fdinfo(struct drm_printer *p, struct drm_file *file) panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p); + drm_show_memory_stats(p, file); } static const struct file_operations panfrost_drm_driver_fops = { diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index 3c812fbd126f..7d8f83d20539 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -195,6 +195,19 @@ static int panfrost_gem_pin(struct drm_gem_object *obj) return drm_gem_shmem_pin(&bo->base); } +static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj) +{ + struct panfrost_gem_object *bo = to_panfrost_bo(obj); + enum drm_gem_object_status res = 0; + + res |= (bo->base.madv == PANFROST_MADV_DONTNEED) ? + DRM_GEM_OBJECT_PURGEABLE : 0; + + res |= (bo->base.pages) ? DRM_GEM_OBJECT_RESIDENT : 0; + + return res; +} + static const struct drm_gem_object_funcs panfrost_gem_funcs = { .free = panfrost_gem_free_object, .open = panfrost_gem_open, @@ -206,6 +219,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = { .vmap = drm_gem_shmem_object_vmap, .vunmap = drm_gem_shmem_object_vunmap, .mmap = drm_gem_shmem_object_mmap, + .status = panfrost_gem_status, .vm_ops = &drm_gem_shmem_vm_ops, }; From patchwork Tue Sep 5 18:45:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D480C83F3E for ; Tue, 5 Sep 2023 19:02:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239208AbjIETCO (ORCPT ); Tue, 5 Sep 2023 15:02:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229950AbjIETCH (ORCPT ); Tue, 5 Sep 2023 15:02:07 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 247C7E5B; Tue, 5 Sep 2023 12:01:31 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 997C3660729D; Tue, 5 Sep 2023 19:45:45 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939545; bh=Jdh25YtuAbKOt5fTW39+W92sC7niIc+e4Omx5zuGqvY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PnqzKGe7k4QPTjz0fZBFwmKFCSHXkKBXzpTs1L6xmv5L6Fv009DItbuG1lx2c0M8X AnwlpmH3YznCckJ82u89F7zmZRq8+K460fFm8/GUZBZr03xFGnwFsOM3l9fPSL3/cy vMcL69ND9Y3T+NXBPeA3jjGjec1wXyNXU+itIRyXNIzajDlex039Ni+IafxKt/XltC jl5uOMpLUk2/tkKBb2LY/OJD2lJhnEDPeLXVvtUyjEG2GodNy4D8GvZSszAssMaZUI Z2gtOFs5FeP9Ivjyuz4868JT9vrTFCG138SQPSiDzj0vjeRLM9z54phUNBJAM8Zvir FZcxuO+yiFNRA== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon Subject: [PATCH v3 6/8] drm/drm_file: Add DRM obj's RSS reporting function for fdinfo Date: Tue, 5 Sep 2023 19:45:22 +0100 Message-ID: <20230905184533.959171-7-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org Some BO's might be mapped onto physical memory chunkwise and on demand, like Panfrost's tiler heap. In this case, even though the drm_gem_shmem_object page array might already be allocated, only a very small fraction of the BO is currently backed by system memory, but drm_show_memory_stats will then proceed to add its entire virtual size to the file's total resident size regardless. This led to very unrealistic RSS sizes being reckoned for Panfrost, where said tiler heap buffer is initially allocated with a virtual size of 128 MiB, but only a small part of it will eventually be backed by system memory after successive GPU page faults. Provide a new DRM object generic function that would allow drivers to return a more accurate RSS size for their BOs. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon --- drivers/gpu/drm/drm_file.c | 5 ++++- include/drm/drm_gem.h | 9 +++++++++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 883d83bc0e3d..762965e3d503 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) } if (s & DRM_GEM_OBJECT_RESIDENT) { - status.resident += obj->size; + if (obj->funcs && obj->funcs->rss) + status.resident += obj->funcs->rss(obj); + else + status.resident += obj->size; } else { /* If already purged or not yet backed by pages, don't * count it as purgeable: diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index bc9f6aa2f3fe..16364487fde9 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { */ enum drm_gem_object_status (*status)(struct drm_gem_object *obj); + /** + * @rss: + * + * Return resident size of the object in physical memory. + * + * Called by drm_show_memory_stats(). + */ + size_t (*rss)(struct drm_gem_object *obj); + /** * @vm_ops: * From patchwork Tue Sep 5 18:45:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720424 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09E86CA0FFB for ; Tue, 5 Sep 2023 18:54:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238913AbjIESyb (ORCPT ); Tue, 5 Sep 2023 14:54:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239661AbjIESy2 (ORCPT ); Tue, 5 Sep 2023 14:54:28 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [46.235.227.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69E2093; Tue, 5 Sep 2023 11:53:53 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id D618466072A3; Tue, 5 Sep 2023 19:45:45 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939546; bh=F1oEaAaBD16J9RAyVjjHzcIbd0a4ooHmWfqGUIHgGJg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=csvAiCNzyGIF6/V3+/q21HQ2A7qfmt5A9YgNLXZ3/VDEnO/vz8C2W0rWw/QP6pmTC ua5Quy7WFxS1kBnpN+3c44XxwMpVpYU1dKv5hgjCyr+jxNTgF4LUSkhVxBtDxud5uu aSfNjoh/i0kN9fZhzKpcAX1eU+O+5OYW97mTmos917SC5JGoYMzkBQeT6oedTF435M hnLlRMKLEvF/WkSKSBizWxa+GXStAj+3M97vKINqDkEOHTOTuQPtdQuEc+CklHvaIb aYXfpAnVAM0biAfrFoUZH65XpZH715G+86omGQWJVk/Cjeh3VTg8y9qSV3yNZn3KNz ngoKS6wC+P1Ow== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 7/8] drm/panfrost: Implement generic DRM object RSS reporting function Date: Tue, 5 Sep 2023 19:45:23 +0100 Message-ID: <20230905184533.959171-8-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org BO's RSS is updated every time new pages are allocated on demand and mapped for the object at GPU page fault's IRQ handler, but only for heap buffers. The reason this is unnecessary for non-heap buffers is that they are mapped onto the GPU's VA space and backed by physical memory in their entirety at BO creation time. This calculation is unnecessary for imported PRIME objects, since heap buffers cannot be exported by our driver, and the actual BO RSS size is the one reported in its attached dmabuf structure. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panfrost/panfrost_gem.c | 14 ++++++++++++++ drivers/gpu/drm/panfrost/panfrost_gem.h | 5 +++++ drivers/gpu/drm/panfrost/panfrost_mmu.c | 12 ++++++++---- 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index 7d8f83d20539..cb92c0ed7615 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -208,6 +208,19 @@ static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj return res; } +static size_t panfrost_gem_rss(struct drm_gem_object *obj) +{ + struct panfrost_gem_object *bo = to_panfrost_bo(obj); + + if (bo->is_heap) + return bo->heap_rss_size; + else if (bo->base.pages) { + WARN_ON(bo->heap_rss_size); + return bo->base.base.size; + } else + return 0; +} + static const struct drm_gem_object_funcs panfrost_gem_funcs = { .free = panfrost_gem_free_object, .open = panfrost_gem_open, @@ -220,6 +233,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = { .vunmap = drm_gem_shmem_object_vunmap, .mmap = drm_gem_shmem_object_mmap, .status = panfrost_gem_status, + .rss = panfrost_gem_rss, .vm_ops = &drm_gem_shmem_vm_ops, }; diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h index ad2877eeeccd..13c0a8149c3a 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.h +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h @@ -36,6 +36,11 @@ struct panfrost_gem_object { */ atomic_t gpu_usecount; + /* + * Object chunk size currently mapped onto physical memory + */ + size_t heap_rss_size; + bool noexec :1; bool is_heap :1; }; diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index d54d4e7b2195..67c206124781 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -285,17 +285,19 @@ static void panfrost_mmu_flush_range(struct panfrost_device *pfdev, pm_runtime_put_autosuspend(pfdev->dev); } -static int mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu, +static size_t mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu, u64 iova, int prot, struct sg_table *sgt) { unsigned int count; struct scatterlist *sgl; struct io_pgtable_ops *ops = mmu->pgtbl_ops; u64 start_iova = iova; + size_t total = 0; for_each_sgtable_dma_sg(sgt, sgl, count) { unsigned long paddr = sg_dma_address(sgl); size_t len = sg_dma_len(sgl); + total += len; dev_dbg(pfdev->dev, "map: as=%d, iova=%llx, paddr=%lx, len=%zx", mmu->as, iova, paddr, len); @@ -315,7 +317,7 @@ static int mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu, panfrost_mmu_flush_range(pfdev, mmu, start_iova, iova - start_iova); - return 0; + return total; } int panfrost_mmu_map(struct panfrost_gem_mapping *mapping) @@ -447,6 +449,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, pgoff_t page_offset; struct sg_table *sgt; struct page **pages; + size_t mapped_size; bomapping = addr_to_mapping(pfdev, as, addr); if (!bomapping) @@ -518,10 +521,11 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, if (ret) goto err_map; - mmu_map_sg(pfdev, bomapping->mmu, addr, - IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); + mapped_size = mmu_map_sg(pfdev, bomapping->mmu, addr, + IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); bomapping->active = true; + bo->heap_rss_size += mapped_size; dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr); From patchwork Tue Sep 5 18:45:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 720423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 890CECA0FF7 for ; Tue, 5 Sep 2023 18:55:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233358AbjIESz6 (ORCPT ); Tue, 5 Sep 2023 14:55:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240933AbjIESze (ORCPT ); Tue, 5 Sep 2023 14:55:34 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BDF3173C; Tue, 5 Sep 2023 11:55:09 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a02:8010:65b5:0:1ac0:4dff:feee:236a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: alarumbe) by madras.collabora.co.uk (Postfix) with ESMTPSA id 1D71A66072B1; Tue, 5 Sep 2023 19:45:46 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1693939546; bh=/++GjTI/qMqzskPDEJeSXv7mzw9FD5fHfY1lAL+uK60=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MSdMHnxFCvB7g/mjGI/EVGU5LPnt05aiikstG4ovb4UAk7nvluMUlTQdg3efzTDH+ J4ddyTO0hTck2F5y3W8xHNkwxrpGMQc4ZSZmwPLkiS+rHsAEaC9ehAV2mEsd6pIVpx uIt3YObU65QtRztDv/CLXyQr/uqEhWllHfUnlJxBi604KWaWADEyjjuggl0ZeOKhS8 fw3DOW8G7kwYHIfEb0XHg0LXzemH88yJni8UXm1ZmIQ1uyJ3Ik29p9uOUKR+XuN32l x4+HtdoRpIlIMJDSAQU6r3NixRn8Q1hzhtG7ouHEAE/upErS0b4Eg0JBU1XzWy0x1I Mai/lZW+TsnEA== From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com Subject: [PATCH v3 8/8] drm/drm-file: Show finer-grained BO sizes in drm_show_memory_stats Date: Tue, 5 Sep 2023 19:45:24 +0100 Message-ID: <20230905184533.959171-9-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20230905184533.959171-1-adrian.larumbe@collabora.com> References: <20230905184533.959171-1-adrian.larumbe@collabora.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org The current implementation will try to pick the highest available size display unit as soon as the BO size exceeds that of the previous multiplier. By selecting a higher threshold, we could show more accurate size numbers. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/drm_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 762965e3d503..0b5fbd493e05 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -879,7 +879,7 @@ static void print_size(struct drm_printer *p, const char *stat, unsigned u; for (u = 0; u < ARRAY_SIZE(units) - 1; u++) { - if (sz < SZ_1K) + if (sz < (SZ_1K * 10000)) break; sz = div_u64(sz, SZ_1K); }