[PATCHv3,0/7] System Cache support for GPU and required SMMU support

Message ID: cover.1593344119.git.saiprakash.ranjan@codeaurora.org

Sai Prakash Ranjan June 29, 2020, 3:52 p.m. UTC
Some hardware variants contain a system cache, or last level cache
(LLC). This cache is typically a large block which is shared by
multiple clients on the SoC. The GPU uses the system cache to cache
both GPU data buffers (like textures) as well as the SMMU pagetables.
This helps with improved render performance as well as lower power
consumption by reducing the bus traffic to system memory.

The system cache architecture allows the cache to be split into slices
which can then be used by multiple SoC clients. This patch series is an
effort to enable and use two of those slices preallocated for the GPU,
one for the GPU data buffers and another for the GPU SMMU hardware
pagetables.

Patch 1 adds an init_context_bank implementation hook to set SCTLR.HUPCF.
Patches 2, 3, 6 and 7 add system cache support in the SMMU and GPU drivers.
Patches 4 and 5 are minor cleanups for arm-smmu impl.

Changes in v3:
 * Move domain attribute setting to before iommu_attach_device()
 * Fix a few code style and checkpatch warnings
 * Rebase on top of Jordan's latest split pagetables and per-instance
   pagetables support [1][2]

Changes in v2:
 * Addressed review comments and rebased on top of Jordan's split
   pagetables series

[1] https://lore.kernel.org/patchwork/cover/1264446/
[2] https://lore.kernel.org/patchwork/cover/1264460/

Jordan Crouse (1):
  iommu/arm-smmu: Add a init_context_bank implementation hook

Sai Prakash Ranjan (4):
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add domain attribute for system cache
  iommu: arm-smmu-impl: Remove unwanted extra blank lines
  iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl

Sharat Masetty (2):
  drm/msm: rearrange the gpu_rmw() function
  drm/msm/a6xx: Add support for using system cache(LLC)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 82 +++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  3 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 23 ++++++-
 drivers/gpu/drm/msm/msm_drv.c           |  8 +++
 drivers/gpu/drm/msm/msm_drv.h           |  1 +
 drivers/gpu/drm/msm/msm_gpu.h           |  5 +-
 drivers/gpu/drm/msm/msm_iommu.c         |  3 +
 drivers/gpu/drm/msm/msm_mmu.h           |  4 ++
 drivers/iommu/arm-smmu-impl.c           | 13 ++--
 drivers/iommu/arm-smmu-qcom.c           | 13 ++++
 drivers/iommu/arm-smmu.c                | 46 +++++++++-----
 drivers/iommu/arm-smmu.h                | 13 ++++
 drivers/iommu/io-pgtable-arm.c          |  7 ++-
 include/linux/io-pgtable.h              |  4 ++
 include/linux/iommu.h                   |  1 +
 15 files changed, 198 insertions(+), 28 deletions(-)

Comments

Will Deacon July 3, 2020, 1:37 p.m. UTC | #1
On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index f455c597f76d..bd1d58229cc2 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
>  		iova |= GENMASK_ULL(63, 49);
>
>
> +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> +		prot |= IOMMU_SYS_CACHE_ONLY;


Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
looks like it should actually be a property on the domain because we never
need to configure it on a per-mapping basis within a domain, and therefore
it shouldn't be exposed by the IOMMU API as a prot flag.

Do you agree?

Will
Sai Prakash Ranjan July 3, 2020, 2:53 p.m. UTC | #2
Hi Will,

On 2020-07-03 19:07, Will Deacon wrote:
> On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
>> index f455c597f76d..bd1d58229cc2 100644
>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
>>  		iova |= GENMASK_ULL(63, 49);
>>
>>
>> +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
>> +		prot |= IOMMU_SYS_CACHE_ONLY;
>
> Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
> looks like it should actually be a property on the domain because we never
> need to configure it on a per-mapping basis within a domain, and therefore
> it shouldn't be exposed by the IOMMU API as a prot flag.
>
> Do you agree?


The GPU is the only user for now, but there are other clients which can
use this. Plus, how do we set the memory attributes if we do not expose
this as a prot flag?

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Will Deacon July 3, 2020, 3:40 p.m. UTC | #3
On Fri, Jul 03, 2020 at 08:23:07PM +0530, Sai Prakash Ranjan wrote:
> On 2020-07-03 19:07, Will Deacon wrote:
> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> > > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> > > index f455c597f76d..bd1d58229cc2 100644
> > > --- a/drivers/gpu/drm/msm/msm_iommu.c
> > > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > > @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> > >  		iova |= GENMASK_ULL(63, 49);
> > >
> > >
> > > +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> > > +		prot |= IOMMU_SYS_CACHE_ONLY;
> >
> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
> > looks like it should actually be a property on the domain because we never
> > need to configure it on a per-mapping basis within a domain, and therefore
> > it shouldn't be exposed by the IOMMU API as a prot flag.
> >
> > Do you agree?
>
> GPU being the only user is for now, but there are other clients which can
> use this.
> Plus how do we set the memory attributes if we do not expose this as prot
> flag?


I just don't understand the need for it to be a per-map operation. Put
another way: if we extended the domain attribute to apply to cacheable
mappings on the domain and not just the table walk, what would break?

Will
Rob Clark July 3, 2020, 4:04 p.m. UTC | #4
On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
>
> Hi Will,
>
> On 2020-07-03 19:07, Will Deacon wrote:
> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> >> index f455c597f76d..bd1d58229cc2 100644
> >> --- a/drivers/gpu/drm/msm/msm_iommu.c
> >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> >>              iova |= GENMASK_ULL(63, 49);
> >>
> >>
> >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> >> +            prot |= IOMMU_SYS_CACHE_ONLY;
> >
> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
> > looks like it should actually be a property on the domain because we never
> > need to configure it on a per-mapping basis within a domain, and therefore
> > it shouldn't be exposed by the IOMMU API as a prot flag.
> >
> > Do you agree?
> >
>
> GPU being the only user is for now, but there are other clients which
> can use this.
> Plus how do we set the memory attributes if we do not expose this as
> prot flag?


It does appear that the downstream kgsl driver sets this for basically
all mappings.. well there is some conditional stuff around
DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
domain.  (Jordan may know more about what that is about.)  But looks
like there are a lot of different paths into iommu_map in kgsl so I
might have missed something.

Assuming there isn't some case where we specifically don't want to use
the system cache for some mapping, I think it could be a domain
attribute that sets an io_pgtable_cfg::quirks flag

BR,
-R
Sai Prakash Ranjan July 3, 2020, 4:59 p.m. UTC | #5
On 2020-07-03 21:34, Rob Clark wrote:
> On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
>>
>> Hi Will,
>>
>> On 2020-07-03 19:07, Will Deacon wrote:
>> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
>> >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
>> >> index f455c597f76d..bd1d58229cc2 100644
>> >> --- a/drivers/gpu/drm/msm/msm_iommu.c
>> >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
>> >>              iova |= GENMASK_ULL(63, 49);
>> >>
>> >>
>> >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
>> >> +            prot |= IOMMU_SYS_CACHE_ONLY;
>> >
>> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
>> > looks like it should actually be a property on the domain because we never
>> > need to configure it on a per-mapping basis within a domain, and therefore
>> > it shouldn't be exposed by the IOMMU API as a prot flag.
>> >
>> > Do you agree?
>> >
>>
>> GPU being the only user is for now, but there are other clients which
>> can use this.
>> Plus how do we set the memory attributes if we do not expose this as
>> prot flag?
>
> It does appear that the downstream kgsl driver sets this for basically
> all mappings.. well there is some conditional stuff around
> DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
> domain.  (Jordan may know more about what that is about.)  But looks
> like there are a lot of different paths into iommu_map in kgsl so I
> might have missed something.
>
> Assuming there isn't some case where we specifically don't want to use
> the system cache for some mapping, I think it could be a domain
> attribute that sets an io_pgtable_cfg::quirks flag


Ok, then we are good with the removal of the unused sys cache prot flag
which Will has posted.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Jordan Crouse July 9, 2020, 4:13 p.m. UTC | #6
On Fri, Jul 03, 2020 at 09:04:49AM -0700, Rob Clark wrote:
> On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
> >
> > Hi Will,
> >
> > On 2020-07-03 19:07, Will Deacon wrote:
> > > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> > >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> > >> index f455c597f76d..bd1d58229cc2 100644
> > >> --- a/drivers/gpu/drm/msm/msm_iommu.c
> > >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> > >>              iova |= GENMASK_ULL(63, 49);
> > >>
> > >>
> > >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> > >> +            prot |= IOMMU_SYS_CACHE_ONLY;
> > >
> > > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
> > > looks like it should actually be a property on the domain because we never
> > > need to configure it on a per-mapping basis within a domain, and therefore
> > > it shouldn't be exposed by the IOMMU API as a prot flag.
> > >
> > > Do you agree?
> > >
> >
> > GPU being the only user is for now, but there are other clients which
> > can use this.
> > Plus how do we set the memory attributes if we do not expose this as
> > prot flag?
>
> It does appear that the downstream kgsl driver sets this for basically
> all mappings.. well there is some conditional stuff around
> DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
> domain.  (Jordan may know more about what that is about.)  But looks
> like there are a lot of different paths into iommu_map in kgsl so I
> might have missed something.


Downstream does set it universally. There are some theoretical use cases
where it might be beneficial to set it on a per-mapping basis with a
bunch of hinting from userspace, but nobody has tried to characterize
this on real hardware, so it is not clear to me if it is worth it.

I think a domain-wide attribute works for now, but if a compelling
per-mapping use case does come down the pipeline we need to have a
backup in mind - possibly a prot flag to disable NWA?

Jordan

> Assuming there isn't some case where we specifically don't want to use
> the system cache for some mapping, I think it could be a domain
> attribute that sets an io_pgtable_cfg::quirks flag
>
> BR,
> -R


-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project