
[v6,4/6] drm/drm_file: Add DRM obj's RSS reporting function for fdinfo

Message ID 20230919233556.1458793-5-adrian.larumbe@collabora.com
State New
Series Add fdinfo support to Panfrost

Commit Message

Adrián Larumbe Sept. 19, 2023, 11:34 p.m. UTC
Some BOs might be mapped onto physical memory chunkwise and on demand,
like Panfrost's tiler heap. In this case, even though the
drm_gem_shmem_object page array might already be allocated, only a very
small fraction of the BO is currently backed by system memory, yet
drm_show_memory_stats() still adds the BO's entire virtual size to the
file's total resident size.

This led to very unrealistic RSS figures being reported for Panfrost, where
the tiler heap buffer is initially allocated with a virtual size of 128
MiB, but only a small part of it is eventually backed by system memory
after successive GPU page faults.

Add a generic DRM GEM object callback that allows drivers to return a
more accurate RSS for their BOs.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/drm_file.c | 5 ++++-
 include/drm/drm_gem.h      | 9 +++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
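
To illustrate how a driver might use the new callback, here is a minimal
userspace sketch. It is not kernel code: the mock_* structures are
hypothetical stand-ins that carry only the fields needed here, and
mock_heap_rss() / backed_bytes are invented names for a driver that tracks
how much of a heap BO has been populated by page faults.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the GEM structures (assumption: only these fields). */
struct mock_gem_object;

struct mock_gem_object_funcs {
	/* Optional: resident size of the object in physical memory. */
	size_t (*rss)(struct mock_gem_object *obj);
};

struct mock_gem_object {
	size_t size; /* virtual size of the BO */
	const struct mock_gem_object_funcs *funcs;
};

/* Hypothetical driver BO: a heap that is populated chunk by chunk. */
struct mock_heap_bo {
	struct mock_gem_object base; /* must stay the first member */
	size_t backed_bytes;         /* grows as GPU page faults are serviced */
};

static size_t mock_heap_rss(struct mock_gem_object *obj)
{
	/*
	 * Valid because base is the first member; real kernel code would
	 * use container_of() instead of a plain cast.
	 */
	struct mock_heap_bo *bo = (struct mock_heap_bo *)obj;

	return bo->backed_bytes;
}

static const struct mock_gem_object_funcs mock_heap_funcs = {
	.rss = mock_heap_rss,
};

/* Same decision as the hunk below: prefer the callback, else fall back. */
static size_t mock_resident_size(struct mock_gem_object *obj)
{
	assert(obj != NULL);

	if (obj->funcs && obj->funcs->rss)
		return obj->funcs->rss(obj);

	return obj->size;
}
```

With this, a 128 MiB heap that has only 2 MiB backed contributes 2 MiB to
the resident total, while objects without the callback keep contributing
their full size.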

Comments

Adrián Larumbe Sept. 22, 2023, 10:58 a.m. UTC | #1
On 20.09.2023 16:53, Tvrtko Ursulin wrote:
>
>On 20/09/2023 00:34, Adrián Larumbe wrote:
>> Some BO's might be mapped onto physical memory chunkwise and on demand,
>> like Panfrost's tiler heap. In this case, even though the
>> drm_gem_shmem_object page array might already be allocated, only a very
>> small fraction of the BO is currently backed by system memory, but
>> drm_show_memory_stats will then proceed to add its entire virtual size to
>> the file's total resident size regardless.
>> 
>> This led to very unrealistic RSS sizes being reckoned for Panfrost, where
>> said tiler heap buffer is initially allocated with a virtual size of 128
>> MiB, but only a small part of it will eventually be backed by system memory
>> after successive GPU page faults.
>> 
>> Provide a new DRM object generic function that would allow drivers to
>> return a more accurate RSS size for their BOs.
>> 
>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>> ---
>>   drivers/gpu/drm/drm_file.c | 5 ++++-
>>   include/drm/drm_gem.h      | 9 +++++++++
>>   2 files changed, 13 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> index 883d83bc0e3d..762965e3d503 100644
>> --- a/drivers/gpu/drm/drm_file.c
>> +++ b/drivers/gpu/drm/drm_file.c
>> @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
>>   		}
>>   		if (s & DRM_GEM_OBJECT_RESIDENT) {
>> -			status.resident += obj->size;
>> +			if (obj->funcs && obj->funcs->rss)
>> +				status.resident += obj->funcs->rss(obj);
>> +			else
>> +				status.resident += obj->size;
>
>Presumably you'd want the same smaller size in both active and purgeable? Or
>you can end up with more in those two than in rss which would look odd.

I didn't think of this. I guess when an object is both resident and purgeable,
then its RSS and purgeable sizes should be the same.
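
One possible shape for this, as a userspace sketch only: compute the
per-object size once and use it for all three counters. The acct_* names
are hypothetical, and the busy flag is merely a stand-in for whatever
"still active" check the real function performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum acct_status {
	ACCT_RESIDENT  = 1 << 0,
	ACCT_PURGEABLE = 1 << 1,
};

struct acct_obj {
	size_t size;                               /* virtual size */
	size_t backed;                             /* bytes actually backed */
	size_t (*rss)(const struct acct_obj *obj); /* optional, may be NULL */
	unsigned int status;
	bool busy; /* stand-in for "object still in use by the GPU" */
};

struct acct_stats {
	size_t resident, active, purgeable;
};

static size_t acct_backed(const struct acct_obj *obj)
{
	return obj->backed;
}

static void acct_add(struct acct_stats *st, const struct acct_obj *obj)
{
	unsigned int s;
	size_t add_size;

	assert(st != NULL && obj != NULL);

	s = obj->status;
	/* Computed once so resident, active and purgeable always agree. */
	add_size = obj->rss ? obj->rss(obj) : obj->size;

	if (s & ACCT_RESIDENT)
		st->resident += add_size;
	else
		s &= ~(unsigned int)ACCT_PURGEABLE; /* not backed: not purgeable */

	if (obj->busy) {
		st->active += add_size;
		s &= ~(unsigned int)ACCT_PURGEABLE; /* still active: not purgeable */
	}

	if (s & ACCT_PURGEABLE)
		st->purgeable += add_size;
}
```

This way neither active nor purgeable can exceed resident for a single
object, which avoids the odd-looking totals mentioned above.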

>Also, an alternative to adding a new callback could be adding multiple output
>parameters to the existing obj->funcs->status(), which may end up simpler due to
>fewer callbacks?
>
>Like:
>
> s = obj->funcs->status(obj, &supported_status, &rss)
>
>And adjust the code flow to pick up the rss if driver signaled it supports
>reporting it.

I personally find having a separate object callback more readable in this case.
There's also the question of which output parameter value would signal that
the relevant BO's RSS doesn't differ from its virtual size. I guess '0' would
be alright, but only on the assumption that it could never be a legitimate BO
virtual size in any DRM driver. I guess most of them round the size up to the
nearest page multiple at BO creation time.
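
For comparison, the output-parameter variant could look roughly like the
sketch below. This is only my reading of the suggestion: st_status(), the
ST_HAS_RSS bit and the st_obj structure are all hypothetical. Using a
"supported" bit rather than a magic value sidesteps the question of whether
0 is a legitimate size.

```c
#include <assert.h>
#include <stddef.h>

enum st_bits {
	ST_RESIDENT = 1 << 0,
	ST_HAS_RSS  = 1 << 1, /* hypothetical: "the driver filled in *rss" */
};

struct st_obj {
	size_t size;     /* virtual size */
	size_t backed;   /* bytes actually backed */
	int reports_rss; /* whether this driver knows its RSS */
};

/* Hypothetical combined callback: status plus optional outputs. */
static unsigned int st_status(const struct st_obj *obj,
			      unsigned int *supported, size_t *rss)
{
	assert(obj != NULL && supported != NULL && rss != NULL);

	*supported = ST_RESIDENT;
	if (obj->reports_rss) {
		*supported |= ST_HAS_RSS;
		*rss = obj->backed;
	}

	return ST_RESIDENT;
}

static size_t st_resident_size(const struct st_obj *obj)
{
	unsigned int supported = 0;
	size_t rss = 0;
	unsigned int s = st_status(obj, &supported, &rss);

	if (!(s & ST_RESIDENT))
		return 0;

	/* Trust rss only if the driver said it reports it, even when it is 0. */
	return (supported & ST_HAS_RSS) ? rss : obj->size;
}
```

Whether this ends up more readable than a dedicated callback is a matter of
taste; the sketch only shows that no sentinel value is strictly needed.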

>
>Regards,
>
>Tvrtko
>
>>   		} else {
>>   			/* If already purged or not yet backed by pages, don't
>>   			 * count it as purgeable:
>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>> index bc9f6aa2f3fe..16364487fde9 100644
>> --- a/include/drm/drm_gem.h
>> +++ b/include/drm/drm_gem.h
>> @@ -208,6 +208,15 @@ struct drm_gem_object_funcs {
>>   	 */
>>   	enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
>> +	/**
>> +	 * @rss:
>> +	 *
>> +	 * Return resident size of the object in physical memory.
>> +	 *
>> +	 * Called by drm_show_memory_stats().
>> +	 */
>> +	size_t (*rss)(struct drm_gem_object *obj);
>> +
>>   	/**
>>   	 * @vm_ops:
>>   	 *
Tvrtko Ursulin Sept. 27, 2023, 2:36 p.m. UTC | #2
On 22/09/2023 11:58, Adrián Larumbe wrote:
> On 20.09.2023 16:53, Tvrtko Ursulin wrote:
>>
>> On 20/09/2023 00:34, Adrián Larumbe wrote:
>>> Some BO's might be mapped onto physical memory chunkwise and on demand,
>>> like Panfrost's tiler heap. In this case, even though the
>>> drm_gem_shmem_object page array might already be allocated, only a very
>>> small fraction of the BO is currently backed by system memory, but
>>> drm_show_memory_stats will then proceed to add its entire virtual size to
>>> the file's total resident size regardless.
>>>
>>> This led to very unrealistic RSS sizes being reckoned for Panfrost, where
>>> said tiler heap buffer is initially allocated with a virtual size of 128
>>> MiB, but only a small part of it will eventually be backed by system memory
>>> after successive GPU page faults.
>>>
>>> Provide a new DRM object generic function that would allow drivers to
>>> return a more accurate RSS size for their BOs.
>>>
>>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>>> Reviewed-by: Steven Price <steven.price@arm.com>
>>> ---
>>>    drivers/gpu/drm/drm_file.c | 5 ++++-
>>>    include/drm/drm_gem.h      | 9 +++++++++
>>>    2 files changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>> index 883d83bc0e3d..762965e3d503 100644
>>> --- a/drivers/gpu/drm/drm_file.c
>>> +++ b/drivers/gpu/drm/drm_file.c
>>> @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
>>>    		}
>>>    		if (s & DRM_GEM_OBJECT_RESIDENT) {
>>> -			status.resident += obj->size;
>>> +			if (obj->funcs && obj->funcs->rss)
>>> +				status.resident += obj->funcs->rss(obj);
>>> +			else
>>> +				status.resident += obj->size;
>>
>> Presumably you'd want the same smaller size in both active and purgeable? Or
>> you can end up with more in those two than in rss which would look odd.
> 
> I didn't think of this. I guess when an object is both resident and purgeable,
> then its RSS and purgeable sizes should be the same.
> 
>> Also, an alternative to adding a new callback could be adding multiple output
>> parameters to the existing obj->funcs->status(), which may end up simpler due to
>> fewer callbacks?
>>
>> Like:
>>
>> s = obj->funcs->status(obj, &supported_status, &rss)
>>
>> And adjust the code flow to pick up the rss if driver signaled it supports
>> reporting it.
> 
> I personally find having a separate object callback more readable in this case.
> There's also the question of which output parameter value would signal that
> the relevant BO's RSS doesn't differ from its virtual size. I guess '0' would
> be alright, but only on the assumption that it could never be a legitimate BO
> virtual size in any DRM driver. I guess most of them round the size up to the
> nearest page multiple at BO creation time.

Okay. See how it will look once you need to apply it to resident and 
purgeable. I wonder if "driver knows better" will end up a dominant case 
and we do end up considering reversing the scheme (like ask the driver 
to fill in the meminfo record). TBH I do not remember all the flavours 
both Rob and I proposed at this point.
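
To make the "reversed" idea a bit more concrete, a purely illustrative
userspace sketch follows. None of these names exist in DRM: mi_stats stands
in for the memory stats record, and fill_stats is a hypothetical optional
hook through which the driver accounts for the object itself.

```c
#include <assert.h>
#include <stddef.h>

struct mi_stats {
	size_t resident, purgeable, active;
};

struct mi_obj {
	size_t size;   /* virtual size */
	size_t backed; /* bytes actually backed */
	/* Hypothetical hook: the driver fills in the record directly. */
	void (*fill_stats)(const struct mi_obj *obj, struct mi_stats *out);
};

/* Hypothetical driver hook for a partially backed heap. */
static void mi_heap_fill(const struct mi_obj *obj, struct mi_stats *out)
{
	out->resident += obj->backed;
}

static void mi_account(const struct mi_obj *obj, struct mi_stats *out)
{
	assert(obj != NULL && out != NULL);

	if (obj->fill_stats) {
		/* "Driver knows better": let it do all the accounting. */
		obj->fill_stats(obj, out);
		return;
	}

	/* Generic fallback: assume the whole object is resident. */
	out->resident += obj->size;
}
```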

Regards,

Tvrtko

> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>    		} else {
>>>    			/* If already purged or not yet backed by pages, don't
>>>    			 * count it as purgeable:
>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>>> index bc9f6aa2f3fe..16364487fde9 100644
>>> --- a/include/drm/drm_gem.h
>>> +++ b/include/drm/drm_gem.h
>>> @@ -208,6 +208,15 @@ struct drm_gem_object_funcs {
>>>    	 */
>>>    	enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
>>> +	/**
>>> +	 * @rss:
>>> +	 *
>>> +	 * Return resident size of the object in physical memory.
>>> +	 *
>>> +	 * Called by drm_show_memory_stats().
>>> +	 */
>>> +	size_t (*rss)(struct drm_gem_object *obj);
>>> +
>>>    	/**
>>>    	 * @vm_ops:
>>>    	 *

Patch

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 883d83bc0e3d..762965e3d503 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -944,7 +944,10 @@  void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
 		}
 
 		if (s & DRM_GEM_OBJECT_RESIDENT) {
-			status.resident += obj->size;
+			if (obj->funcs && obj->funcs->rss)
+				status.resident += obj->funcs->rss(obj);
+			else
+				status.resident += obj->size;
 		} else {
 			/* If already purged or not yet backed by pages, don't
 			 * count it as purgeable:
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bc9f6aa2f3fe..16364487fde9 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -208,6 +208,15 @@  struct drm_gem_object_funcs {
 	 */
 	enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
 
+	/**
+	 * @rss:
+	 *
+	 * Return resident size of the object in physical memory.
+	 *
+	 * Called by drm_show_memory_stats().
+	 */
+	size_t (*rss)(struct drm_gem_object *obj);
+
 	/**
 	 * @vm_ops:
 	 *