From patchwork Mon Jul 26 14:43:57 2021
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 486852
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: Rob Clark, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
 Sumit Semwal, Christian König,
 linux-arm-msm@vger.kernel.org (open list:DRM DRIVER FOR MSM ADRENO GPU),
 freedreno@lists.freedesktop.org (open list:DRM DRIVER FOR MSM ADRENO GPU),
 linux-kernel@vger.kernel.org (open list),
 linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK),
 linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
Subject: [PATCH 1/2] drm/msm: Let fences read directly from memptrs
Date: Mon, 26 Jul 2021 07:43:57 -0700
Message-Id: <20210726144359.2179302-2-robdclark@gmail.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210726144359.2179302-1-robdclark@gmail.com>
References: <20210726144359.2179302-1-robdclark@gmail.com>
X-Mailing-List: linux-arm-msm@vger.kernel.org

From: Rob Clark

Let dma_fence::signaled, etc, read directly from the address that the hw
is writing with updated completed fence seqno, so we can potentially
notice that the fence is signaled sooner.

Plus add some docs.

Signed-off-by: Rob Clark
---
 drivers/gpu/drm/msm/msm_fence.c      | 11 ++++++--
 drivers/gpu/drm/msm/msm_fence.h      | 41 +++++++++++++++++++++++++---
 drivers/gpu/drm/msm/msm_ringbuffer.c |  2 +-
 3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index cd59a5918038..b92a9091a1e2 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -11,7 +11,8 @@
 
 
 struct msm_fence_context *
-msm_fence_context_alloc(struct drm_device *dev, const char *name)
+msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
+		const char *name)
 {
 	struct msm_fence_context *fctx;
 
@@ -22,6 +23,7 @@ msm_fence_context_alloc(struct drm_device *dev, const char *name)
 	fctx->dev = dev;
 	strncpy(fctx->name, name, sizeof(fctx->name));
 	fctx->context = dma_fence_context_alloc(1);
+	fctx->fenceptr = fenceptr;
 	init_waitqueue_head(&fctx->event);
 	spin_lock_init(&fctx->spinlock);
 
@@ -35,7 +37,12 @@ void msm_fence_context_free(struct msm_fence_context *fctx)
 
 static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fence)
 {
-	return (int32_t)(fctx->completed_fence - fence) >= 0;
+	/*
+	 * Note: Check completed_fence first, as fenceptr is in a write-combine
+	 * mapping, so it will be more expensive to read.
+	 */
+	return (int32_t)(fctx->completed_fence - fence) >= 0 ||
+		(int32_t)(*fctx->fenceptr - fence) >= 0;
 }
 
 /* legacy path for WAIT_FENCE ioctl: */
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 2d9af66dcca5..6ab97062ff1a 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -9,19 +9,52 @@
 
 #include "msm_drv.h"
 
+/**
+ * struct msm_fence_context - fence context for gpu
+ *
+ * Each ringbuffer has a single fence context, with the GPU writing an
+ * incrementing fence seqno at the end of each submit
+ */
 struct msm_fence_context {
 	struct drm_device *dev;
+	/** name: human readable name for fence timeline */
 	char name[32];
+	/** context: see dma_fence_context_alloc() */
 	unsigned context;
-	/* last_fence == completed_fence --> no pending work */
-	uint32_t last_fence; /* last assigned fence */
-	uint32_t completed_fence; /* last completed fence */
+
+	/**
+	 * last_fence:
+	 *
+	 * Last assigned fence, incremented each time a fence is created
+	 * on this fence context. If last_fence == completed_fence,
+	 * there is no remaining pending work
+	 */
+	uint32_t last_fence;
+
+	/**
+	 * completed_fence:
+	 *
+	 * The last completed fence, updated from the CPU after interrupt
+	 * from GPU
+	 */
+	uint32_t completed_fence;
+
+	/**
+	 * fenceptr:
+	 *
+	 * The address that the GPU directly writes with completed fence
+	 * seqno. This can be ahead of completed_fence. We can peek at
+	 * this to see if a fence has already signaled but the CPU hasn't
+	 * gotten around to handling the irq and updating completed_fence
+	 */
+	volatile uint32_t *fenceptr;
+
 	wait_queue_head_t event;
 	spinlock_t spinlock;
 };
 
 struct msm_fence_context * msm_fence_context_alloc(struct drm_device *dev,
-		const char *name);
+		volatile uint32_t *fenceptr, const char *name);
 void msm_fence_context_free(struct msm_fence_context *fctx);
 
 int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 4d2a2a4abef8..7e92d9532454 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -51,7 +51,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
 
 	snprintf(name, sizeof(name), "gpu-ring-%d", ring->id);
 
-	ring->fctx = msm_fence_context_alloc(gpu->dev, name);
+	ring->fctx = msm_fence_context_alloc(gpu->dev, &ring->memptrs->fence, name);
 
 	return ring;
 

From patchwork Mon Jul 26 14:43:58 2021
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 485909
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: Rob Clark, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
 linux-arm-msm@vger.kernel.org (open list:DRM DRIVER FOR MSM ADRENO GPU),
 freedreno@lists.freedesktop.org (open list:DRM DRIVER FOR MSM ADRENO GPU),
 linux-kernel@vger.kernel.org (open list)
Subject: [PATCH 2/2] drm/msm: Signal fences sooner
Date: Mon, 26 Jul 2021 07:43:58 -0700
Message-Id: <20210726144359.2179302-3-robdclark@gmail.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210726144359.2179302-1-robdclark@gmail.com>
References: <20210726144359.2179302-1-robdclark@gmail.com>
X-Mailing-List: linux-arm-msm@vger.kernel.org

From: Rob Clark

Nothing we do in update_fences() can't be done in an atomic context, so
move this into the GPU's irq context to reduce latency (and call
dma_fence_signal() so we aren't relying on dma_fence_is_signaled(),
which would defeat the purpose).

Signed-off-by: Rob Clark
---
 drivers/gpu/drm/msm/msm_gpu.c | 44 +++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 0ebf7bc6ad09..647af45cf892 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -278,16 +278,18 @@ static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		uint32_t fence)
 {
 	struct msm_gem_submit *submit;
+	unsigned long flags;
 
-	spin_lock(&ring->submit_lock);
+	spin_lock_irqsave(&ring->submit_lock, flags);
 	list_for_each_entry(submit, &ring->submits, node) {
 		if (submit->seqno > fence)
 			break;
 
 		msm_update_fence(submit->ring->fctx,
 			submit->fence->seqno);
+		dma_fence_signal(submit->fence);
 	}
-	spin_unlock(&ring->submit_lock);
+	spin_unlock_irqrestore(&ring->submit_lock, flags);
 }
 
 #ifdef CONFIG_DEV_COREDUMP
@@ -443,15 +445,16 @@ static struct msm_gem_submit *
 find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 {
 	struct msm_gem_submit *submit;
+	unsigned long flags;
 
-	spin_lock(&ring->submit_lock);
+	spin_lock_irqsave(&ring->submit_lock, flags);
 	list_for_each_entry(submit, &ring->submits, node) {
 		if (submit->seqno == fence) {
-			spin_unlock(&ring->submit_lock);
+			spin_unlock_irqrestore(&ring->submit_lock, flags);
 			return submit;
 		}
 	}
-	spin_unlock(&ring->submit_lock);
+	spin_unlock_irqrestore(&ring->submit_lock, flags);
 
 	return NULL;
 }
@@ -547,11 +550,12 @@ static void recover_worker(struct kthread_work *work)
 		 */
 		for (i = 0; i < gpu->nr_rings; i++) {
 			struct msm_ringbuffer *ring = gpu->rb[i];
+			unsigned long flags;
 
-			spin_lock(&ring->submit_lock);
+			spin_lock_irqsave(&ring->submit_lock, flags);
 			list_for_each_entry(submit, &ring->submits, node)
 				gpu->funcs->submit(gpu, submit);
-			spin_unlock(&ring->submit_lock);
+			spin_unlock_irqrestore(&ring->submit_lock, flags);
 		}
 	}
 
@@ -641,7 +645,7 @@ static void hangcheck_handler(struct timer_list *t)
 		hangcheck_timer_reset(gpu);
 
 	/* workaround for missing irq: */
-	kthread_queue_work(gpu->worker, &gpu->retire_work);
+	msm_gpu_retire(gpu);
 }
 
 /*
@@ -752,6 +756,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 	int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT;
 	volatile struct msm_gpu_submit_stats *stats;
 	u64 elapsed, clock = 0;
+	unsigned long flags;
 	int i;
 
 	stats = &ring->memptrs->stats[index];
@@ -781,9 +786,9 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 	pm_runtime_mark_last_busy(&gpu->pdev->dev);
 	pm_runtime_put_autosuspend(&gpu->pdev->dev);
 
-	spin_lock(&ring->submit_lock);
+	spin_lock_irqsave(&ring->submit_lock, flags);
 	list_del(&submit->node);
-	spin_unlock(&ring->submit_lock);
+	spin_unlock_irqrestore(&ring->submit_lock, flags);
 
 	msm_gem_submit_put(submit);
 }
@@ -798,11 +803,12 @@ static void retire_submits(struct msm_gpu *gpu)
 
 		while (true) {
 			struct msm_gem_submit *submit = NULL;
+			unsigned long flags;
 
-			spin_lock(&ring->submit_lock);
+			spin_lock_irqsave(&ring->submit_lock, flags);
 			submit = list_first_entry_or_null(&ring->submits,
 					struct msm_gem_submit, node);
-			spin_unlock(&ring->submit_lock);
+			spin_unlock_irqrestore(&ring->submit_lock, flags);
 
 			/*
 			 * If no submit, we are done. If submit->fence hasn't
@@ -821,10 +827,6 @@ static void retire_submits(struct msm_gpu *gpu)
 static void retire_worker(struct kthread_work *work)
 {
 	struct msm_gpu *gpu = container_of(work, struct msm_gpu, retire_work);
-	int i;
-
-	for (i = 0; i < gpu->nr_rings; i++)
-		update_fences(gpu, gpu->rb[i], gpu->rb[i]->memptrs->fence);
 
 	retire_submits(gpu);
 }
@@ -832,6 +834,11 @@ static void retire_worker(struct kthread_work *work)
 /* call from irq handler to schedule work to retire bo's */
 void msm_gpu_retire(struct msm_gpu *gpu)
 {
+	int i;
+
+	for (i = 0; i < gpu->nr_rings; i++)
+		update_fences(gpu, gpu->rb[i], gpu->rb[i]->memptrs->fence);
+
 	kthread_queue_work(gpu->worker, &gpu->retire_work);
 	update_sw_cntrs(gpu);
 }
@@ -842,6 +849,7 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct drm_device *dev = gpu->dev;
 	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_ringbuffer *ring = submit->ring;
+	unsigned long flags;
 	int i;
 
 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -879,9 +887,9 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	 */
 	msm_gem_submit_get(submit);
 
-	spin_lock(&ring->submit_lock);
+	spin_lock_irqsave(&ring->submit_lock, flags);
 	list_add_tail(&submit->node, &ring->submits);
-	spin_unlock(&ring->submit_lock);
+	spin_unlock_irqrestore(&ring->submit_lock, flags);
 
 	gpu->funcs->submit(gpu, submit);
 	priv->lastctx = submit->queue->ctx;
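
Note on the seqno comparison used by fence_completed() in patch 1/2: the standalone sketch below tries to show why casting the unsigned difference to int32_t keeps the test correct across 32-bit counter wraparound, and how the CPU-cached value is consulted before the hardware-written one. It is an illustration only, not driver code. `struct fence_ctx_sketch`, `seqno_passed()`, `fence_completed_sketch()` and `seqno_selftest()` are hypothetical names, and an ordinary variable stands in for the write-combined memptrs mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two seqno sources in struct msm_fence_context. */
struct fence_ctx_sketch {
	uint32_t completed_fence;      /* updated by the CPU after the irq */
	volatile uint32_t *fenceptr;   /* written directly by the "GPU" */
};

/*
 * True if 'current' is at or past 'fence'.  The unsigned subtraction wraps
 * modulo 2^32; reading the result as signed keeps the test correct as long
 * as the two values are less than 2^31 apart.
 */
static inline bool seqno_passed(uint32_t current, uint32_t fence)
{
	return (int32_t)(current - fence) >= 0;
}

/* Check the cheap cached value first, then peek at the hw-written value. */
static inline bool fence_completed_sketch(const struct fence_ctx_sketch *ctx,
		uint32_t fence)
{
	return seqno_passed(ctx->completed_fence, fence) ||
		seqno_passed(*ctx->fenceptr, fence);
}

/* Small self-check; the values are made up for illustration. */
void seqno_selftest(void)
{
	uint32_t hw = 5;
	struct fence_ctx_sketch ctx = { .completed_fence = 3, .fenceptr = &hw };

	assert(fence_completed_sketch(&ctx, 3));   /* cached value suffices */
	assert(fence_completed_sketch(&ctx, 5));   /* only the hw value reached 5 */
	assert(!fence_completed_sketch(&ctx, 6));  /* reached by neither source */
	assert(seqno_passed(2u, 0xfffffffeu));     /* 2 is past 0xfffffffe after wrap */
	assert(!seqno_passed(0xfffffffeu, 2u));
}
```

In the sketch the second read only happens when the cached value has not yet caught up, which mirrors the ordering rationale given in the fence_completed() comment.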