From patchwork Mon Jul 19 14:51:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 480486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-20.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED26BC636CE for ; Mon, 19 Jul 2021 16:24:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DB74161629 for ; Mon, 19 Jul 2021 16:24:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347665AbhGSPna (ORCPT ); Mon, 19 Jul 2021 11:43:30 -0400 Received: from mail.kernel.org ([198.145.29.99]:34938 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347828AbhGSPjh (ORCPT ); Mon, 19 Jul 2021 11:39:37 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 54ABD6127C; Mon, 19 Jul 2021 16:19:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1626711558; bh=x+cz/X891inrQ3//V1KNZH9uM4o88sLn4CZYgOTpkpA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=elTdL9QK2FX+njvaXcozJPwY9ZoKVAOfnicv5dPHTbuhEfC2sfbpLhu7wWSR87Ewa 0fQP3/BbEDPpSx1LOV6OoGJTD581yDyePCfvtyUD8EWlslTVfya1RYj5MiVJcsBuWa HrOGwneF5Q96HydkJmFpHMAsoItzoL24Or1jcK3Q= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Matthew Auld , Jon Bloomfield , Chris Wilson , Daniel Vetter , Daniel Vetter , Rodrigo Vivi Subject: [PATCH 5.12 015/292] drm/i915/gtt: drop the page table optimisation Date: Mon, 19 Jul 2021 16:51:17 +0200 Message-Id: <20210719144943.030257610@linuxfoundation.org> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210719144942.514164272@linuxfoundation.org> References: <20210719144942.514164272@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Matthew Auld commit 0abb33bfca0fb74df76aac03e90ce685016ef7be upstream. We skip filling out the pt with scratch entries if the va range covers the entire pt, since we later have to fill it with the PTEs for the object pages anyway. However this might leave open a small window where the PTEs don't point to anything valid for the HW to consume. When for example using 2M GTT pages this fill_px() showed up as being quite significant in perf measurements, and ends up being completely wasted since we ignore the pt and just use the pde directly. Anyway, currently we have our PTE construction split between alloc and insert, which is probably slightly iffy nowadays, since the alloc doesn't actually allocate anything anymore, instead it just sets up the page directories and points the PTEs at the scratch page. Later when we do the insert step we re-program the PTEs again. Better might be to squash the alloc and insert into a single step, then bringing back this optimisation(along with some others) should be possible. Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables") Signed-off-by: Matthew Auld Cc: Jon Bloomfield Cc: Chris Wilson Cc: Daniel Vetter Cc: # v4.15+ Reviewed-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com (cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -298,10 +298,7 @@ static void __gen8_ppgtt_alloc(struct i9 __i915_gem_object_pin_pages(pt->base); i915_gem_object_make_unshrinkable(pt->base); - if (lvl || - gen8_pt_count(*start, end) < I915_PDES || - intel_vgpu_active(vm->i915)) - fill_px(pt, vm->scratch[lvl]->encode); + fill_px(pt, vm->scratch[lvl]->encode); spin_lock(&pd->lock); if (likely(!pd->entry[idx])) {