Message ID | 20230302235356.3148279-16-robdclark@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | dma-fence: Deadline awareness | expand |
On 03/03/2023 14:48, Rob Clark wrote: > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > <tvrtko.ursulin@linux.intel.com> wrote: >> >> >> On 03/03/2023 03:21, Rodrigo Vivi wrote: >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: >>>> From: Rob Clark <robdclark@chromium.org> >>>> >>> >>> missing some wording here... >>> >>>> v2: rebase >>>> >>>> Signed-off-by: Rob Clark <robdclark@chromium.org> >>>> --- >>>> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ >>>> 1 file changed, 20 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c >>>> index 7503dcb9043b..44491e7e214c 100644 >>>> --- a/drivers/gpu/drm/i915/i915_request.c >>>> +++ b/drivers/gpu/drm/i915/i915_request.c >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) >>>> return i915_request_enable_breadcrumb(to_request(fence)); >>>> } >>>> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) >>>> +{ >>>> + struct i915_request *rq = to_request(fence); >>>> + >>>> + if (i915_request_completed(rq)) >>>> + return; >>>> + >>>> + if (i915_request_started(rq)) >>>> + return; >>> >>> why do we skip the boost if already started? >>> don't we want to boost the freq anyway? >> >> I'd wager Rob is just copying the current i915 wait boost logic. > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > compared to your RFC, this could be the bug Hm, there I have preserved this same !i915_request_started logic. Presumably it's not just fewer boosts but lower performance. How is he setting the deadline? Somehow from clFlush or so? Regards, Tvrtko P.S. Take note that I did not post the latest version of my RFC. The one where I fix the fence chain and array misses you pointed out. I did not think it would be worthwhile given no universal love for it, but if people are testing with it more widely that I was aware perhaps I should. >>>> + >>>> + /* >>>> + * TODO something more clever for deadlines that are in the >>>> + * future. I think probably track the nearest deadline in >>>> + * rq->timeline and set timer to trigger boost accordingly? >>>> + */ >>> >>> I'm afraid it will be very hard to find some heuristics of what's >>> late enough for the boost no? >>> I mean, how early to boost the freq on an upcoming deadline for the >>> timer? >> >> We can off load this patch from Rob and deal with it separately, or >> after the fact? > > That is completely my intention, I expect you to replace my i915 patch ;-) > > Rough idea when everyone is happy with the core bits is to setup an > immutable branch without the driver specific patches, which could be > merged into drm-next and $driver-next and then each driver team can > add there own driver patches on top > > BR, > -R > >> It's a half solution without a smarter scheduler too. Like >> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/, >> or if GuC plans to do something like that at any point. >> >> Or bump the priority too if deadline is looming? >> >> IMO it is not very effective to fiddle with the heuristic on an ad-hoc >> basis. For instance I have a new heuristics which improves the >> problematic OpenCL cases for further 5% (relative to the current >> waitboost improvement from adding missing syncobj waitboost). But I >> can't really test properly for regressions over platforms, stacks, >> workloads.. :( >> >> Regards, >> >> Tvrtko >> >>> >>>> + >>>> + intel_rps_boost(rq); >>>> +} >>>> + >>>> static signed long i915_fence_wait(struct dma_fence *fence, >>>> bool interruptible, >>>> signed long timeout) >>>> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { >>>> .signaled = i915_fence_signaled, >>>> .wait = i915_fence_wait, >>>> .release = i915_fence_release, >>>> + .set_deadline = i915_fence_set_deadline, >>>> }; >>>> >>>> static void irq_execute_cb(struct irq_work *wrk) >>>> -- >>>> 2.39.1 >>>>
On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote: > > > On 03/03/2023 14:48, Rob Clark wrote: > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > <tvrtko.ursulin@linux.intel.com> wrote: > >> > >> > >> On 03/03/2023 03:21, Rodrigo Vivi wrote: > >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > >>>> From: Rob Clark <robdclark@chromium.org> > >>>> > >>> > >>> missing some wording here... > >>> > >>>> v2: rebase > >>>> > >>>> Signed-off-by: Rob Clark <robdclark@chromium.org> > >>>> --- > >>>> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ > >>>> 1 file changed, 20 insertions(+) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > >>>> index 7503dcb9043b..44491e7e214c 100644 > >>>> --- a/drivers/gpu/drm/i915/i915_request.c > >>>> +++ b/drivers/gpu/drm/i915/i915_request.c > >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) > >>>> return i915_request_enable_breadcrumb(to_request(fence)); > >>>> } > >>>> > >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > >>>> +{ > >>>> + struct i915_request *rq = to_request(fence); > >>>> + > >>>> + if (i915_request_completed(rq)) > >>>> + return; > >>>> + > >>>> + if (i915_request_started(rq)) > >>>> + return; > >>> > >>> why do we skip the boost if already started? > >>> don't we want to boost the freq anyway? > >> > >> I'd wager Rob is just copying the current i915 wait boost logic. > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > compared to your RFC, this could be the bug > > Hm, there I have preserved this same !i915_request_started logic. > > Presumably it's not just fewer boosts but lower performance. How is he > setting the deadline? Somehow from clFlush or so? Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa hack to set the DEADLINE flag on syncobj waits, but it seems likely that I missed something somewhere BR, -R > Regards, > > Tvrtko > > P.S. Take note that I did not post the latest version of my RFC. The one > where I fix the fence chain and array misses you pointed out. I did not > think it would be worthwhile given no universal love for it, but if > people are testing with it more widely that I was aware perhaps I should. > > >>>> + > >>>> + /* > >>>> + * TODO something more clever for deadlines that are in the > >>>> + * future. I think probably track the nearest deadline in > >>>> + * rq->timeline and set timer to trigger boost accordingly? > >>>> + */ > >>> > >>> I'm afraid it will be very hard to find some heuristics of what's > >>> late enough for the boost no? > >>> I mean, how early to boost the freq on an upcoming deadline for the > >>> timer? > >> > >> We can off load this patch from Rob and deal with it separately, or > >> after the fact? > > > > That is completely my intention, I expect you to replace my i915 patch ;-) > > > > Rough idea when everyone is happy with the core bits is to setup an > > immutable branch without the driver specific patches, which could be > > merged into drm-next and $driver-next and then each driver team can > > add there own driver patches on top > > > > BR, > > -R > > > >> It's a half solution without a smarter scheduler too. Like > >> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/, > >> or if GuC plans to do something like that at any point. > >> > >> Or bump the priority too if deadline is looming? > >> > >> IMO it is not very effective to fiddle with the heuristic on an ad-hoc > >> basis. For instance I have a new heuristics which improves the > >> problematic OpenCL cases for further 5% (relative to the current > >> waitboost improvement from adding missing syncobj waitboost). But I > >> can't really test properly for regressions over platforms, stacks, > >> workloads.. :( > >> > >> Regards, > >> > >> Tvrtko > >> > >>> > >>>> + > >>>> + intel_rps_boost(rq); > >>>> +} > >>>> + > >>>> static signed long i915_fence_wait(struct dma_fence *fence, > >>>> bool interruptible, > >>>> signed long timeout) > >>>> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { > >>>> .signaled = i915_fence_signaled, > >>>> .wait = i915_fence_wait, > >>>> .release = i915_fence_release, > >>>> + .set_deadline = i915_fence_set_deadline, > >>>> }; > >>>> > >>>> static void irq_execute_cb(struct irq_work *wrk) > >>>> -- > >>>> 2.39.1 > >>>>
On Fri, Mar 3, 2023 at 7:20 AM Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote: > > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > > <tvrtko.ursulin@linux.intel.com> wrote: > > > > > > > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > > >> From: Rob Clark <robdclark@chromium.org> > > > > >> > > > > > > > > > > missing some wording here... > > > > > > > > > >> v2: rebase > > > > >> > > > > >> Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > >> --- > > > > >> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ > > > > >> 1 file changed, 20 insertions(+) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > > > > >> index 7503dcb9043b..44491e7e214c 100644 > > > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) > > > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > > > >> } > > > > >> > > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > > > > >> +{ > > > > >> + struct i915_request *rq = to_request(fence); > > > > >> + > > > > >> + if (i915_request_completed(rq)) > > > > >> + return; > > > > >> + > > > > >> + if (i915_request_started(rq)) > > > > >> + return; > > > > > > > > > > why do we skip the boost if already started? > > > > > don't we want to boost the freq anyway? > > > > > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > > compared to your RFC, this could be the bug > > > > I don't think i915 calls drm_atomic_helper_wait_for_fences() > > so that could explain something. > > Oh, I guess this wasn't even supposed to take over the current > display boost stuff since you didn't remove the old one. Right, I didn't try to replace the current thing.. but hopefully at least make it possible for i915 to use more of the atomic helpers in the future BR, -R > The current one just boosts after a missed vblank. The deadline > could use your timer approach I suppose and boost already a bit > earlier in the hopes of not missing the vblank. > > -- > Ville Syrjälä > Intel
On Fri, Mar 3, 2023 at 10:08 AM Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote: > > > On 03/03/2023 14:48, Rob Clark wrote: > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > <tvrtko.ursulin@linux.intel.com> wrote: > >> > >> > >> On 03/03/2023 03:21, Rodrigo Vivi wrote: > >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > >>>> From: Rob Clark <robdclark@chromium.org> > >>>> > >>> > >>> missing some wording here... > >>> > >>>> v2: rebase > >>>> > >>>> Signed-off-by: Rob Clark <robdclark@chromium.org> > >>>> --- > >>>> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ > >>>> 1 file changed, 20 insertions(+) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > >>>> index 7503dcb9043b..44491e7e214c 100644 > >>>> --- a/drivers/gpu/drm/i915/i915_request.c > >>>> +++ b/drivers/gpu/drm/i915/i915_request.c > >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) > >>>> return i915_request_enable_breadcrumb(to_request(fence)); > >>>> } > >>>> > >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > >>>> +{ > >>>> + struct i915_request *rq = to_request(fence); > >>>> + > >>>> + if (i915_request_completed(rq)) > >>>> + return; > >>>> + > >>>> + if (i915_request_started(rq)) > >>>> + return; > >>> > >>> why do we skip the boost if already started? > >>> don't we want to boost the freq anyway? > >> > >> I'd wager Rob is just copying the current i915 wait boost logic. > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > compared to your RFC, this could be the bug > > Hm, there I have preserved this same !i915_request_started logic. > > Presumably it's not just fewer boosts but lower performance. How is he > setting the deadline? Somehow from clFlush or so? > > Regards, > > Tvrtko > > P.S. Take note that I did not post the latest version of my RFC. The one > where I fix the fence chain and array misses you pointed out. I did not > think it would be worthwhile given no universal love for it, but if > people are testing with it more widely that I was aware perhaps I should. Yep, that would be great. We're interested in it for ChromeOS. Please Cc me on the series when you send it.
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 7503dcb9043b..44491e7e214c 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) return i915_request_enable_breadcrumb(to_request(fence)); } +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) +{ + struct i915_request *rq = to_request(fence); + + if (i915_request_completed(rq)) + return; + + if (i915_request_started(rq)) + return; + + /* + * TODO something more clever for deadlines that are in the + * future. I think probably track the nearest deadline in + * rq->timeline and set timer to trigger boost accordingly? + */ + + intel_rps_boost(rq); +} + static signed long i915_fence_wait(struct dma_fence *fence, bool interruptible, signed long timeout) @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { .signaled = i915_fence_signaled, .wait = i915_fence_wait, .release = i915_fence_release, + .set_deadline = i915_fence_set_deadline, }; static void irq_execute_cb(struct irq_work *wrk)