Message ID | 20220310234611.424743-3-robdclark@gmail.com
---|---
State | New
Series | drm/msm/gpu: More system suspend fixes
Am 17.03.22 um 16:10 schrieb Rob Clark:
> [SNIP]
> userspace frozen != kthread frozen .. that is what this patch is
> trying to address, so we aren't racing between shutting down the hw
> and the scheduler shoveling more jobs at us.

Well, exactly that's the problem. The scheduler is supposed to keep
shoveling more jobs at us until it is empty.

Thinking more about it, we will then keep some dma_fence instances
unsignaled, and that is an extremely bad idea since it can lead to
deadlocks during suspend.

So this patch here is an absolutely clear NAK from my side. If amdgpu is
doing something similar, that is a severe bug and needs to be addressed
somehow.

Regards,
Christian.
Am 17.03.22 um 17:18 schrieb Rob Clark:
> On Thu, Mar 17, 2022 at 9:04 AM Christian König
> <christian.koenig@amd.com> wrote:
>> [SNIP]
>> Thinking more about it, we will then keep some dma_fence instances
>> unsignaled, and that is an extremely bad idea since it can lead to
>> deadlocks during suspend.
> Hmm, perhaps that is true if you need to migrate things out of vram?
> It is at least not a problem when vram is not involved.

No, it's much wider than that.

See, what can happen is that the memory management shrinkers want to
wait for a dma_fence during suspend. And if you stop the scheduler they
will just wait forever.

What you need to do instead is to drain the scheduler, e.g. call
drm_sched_entity_flush() with a proper timeout for each entity you have
created.

Regards,
Christian.

>> So this patch here is an absolutely clear NAK from my side. If amdgpu
>> is doing something similar, that is a severe bug and needs to be
>> addressed somehow.
> I think amdgpu's use of kthread_park is not related to suspend, but
> didn't look too closely.
>
> And perhaps the solution for this problem is more complex in the case
> of amdgpu, I'm not super familiar with the constraints there. But I
> think it is a fine solution for integrated GPUs.
>
> BR,
> -R
On Thu, Mar 17, 2022 at 9:45 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 17.03.22 um 17:18 schrieb Rob Clark:
> > [SNIP]
> > Hmm, perhaps that is true if you need to migrate things out of vram?
> > It is at least not a problem when vram is not involved.
>
> No, it's much wider than that.
>
> See, what can happen is that the memory management shrinkers want to
> wait for a dma_fence during suspend.

we don't wait on fences in shrinker, only purging or evicting things
that are already ready. Actually, waiting on fences in shrinker path
sounds like a pretty bad idea.

> And if you stop the scheduler they will just wait forever.
>
> What you need to do instead is to drain the scheduler, e.g. call
> drm_sched_entity_flush() with a proper timeout for each entity you have
> created.

yeah, it would work to drain the scheduler.. I guess that might be the
more portable approach as far as generic solution for suspend.

BR,
-R
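For illustration, a minimal sketch of what draining scheduler entities
on suspend could look like, assuming the driver keeps a list of contexts
that each own a drm_sched_entity. my_gpu, my_context, the contexts list
and the simplified return-value handling are assumptions here, not code
from drm/msm or amdgpu:

#include <drm/gpu_scheduler.h>
#include <linux/jiffies.h>
#include <linux/list.h>

static int my_gpu_drain_sched_entities(struct my_gpu *gpu)
{
	struct my_context *ctx;

	list_for_each_entry(ctx, &gpu->contexts, node) {
		/* Wait (bounded) for jobs already queued on this entity
		 * to be handed off to the hardware ring.
		 */
		long ret = drm_sched_entity_flush(&ctx->entity,
						  msecs_to_jiffies(1000));
		if (ret == 0)
			return -ETIMEDOUT; /* simplified: treat 0 as timeout */
	}

	return 0;
}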
On 2022-03-17 12:04, Christian König wrote:
> [SNIP]
> So this patch here is an absolutely clear NAK from my side. If amdgpu
> is doing something similar, that is a severe bug and needs to be
> addressed somehow.

From looking at the latest amd-staging-drm-next, we only use
kthread_park directly in the debugfs IB hooks. For S3 suspend
(amdgpu_pmops_suspend) we only flush all the HW fences
(amdgpu_fence_wait_empty), so we neither freeze the scheduler thread nor
flush scheduler entities.

Andrey
On 2022-03-17 13:35, Rob Clark wrote:
> On Thu, Mar 17, 2022 at 9:45 AM Christian König
> <christian.koenig@amd.com> wrote:
>> [SNIP]
>> What you need to do instead is to drain the scheduler, e.g. call
>> drm_sched_entity_flush() with a proper timeout for each entity you
>> have created.
> yeah, it would work to drain the scheduler.. I guess that might be the
> more portable approach as far as generic solution for suspend.
>
> BR,
> -R

I am not sure how this drains the scheduler. Suppose we have done the
waiting in drm_sched_entity_flush; what prevents someone from pushing
another job into the same entity's queue right after that? Shouldn't we
first disable further pushing of jobs into the entity before we wait for
sched->job_scheduled?

Andrey
On Thu, Mar 17, 2022 at 11:10 AM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> [SNIP]
> I am not sure how this drains the scheduler. Suppose we have done the
> waiting in drm_sched_entity_flush; what prevents someone from pushing
> another job into the same entity's queue right after that? Shouldn't we
> first disable further pushing of jobs into the entity before we wait
> for sched->job_scheduled?

In the system suspend path, userspace processes will have already been
frozen, so there should be no way to push more jobs to the scheduler,
unless they are pushed from the kernel itself. We don't do that in
drm/msm, but maybe you need to, to move things between vram and system
memory? But even in that case, if the # of jobs you push is bounded I
guess that is ok?

BR,
-R
On 2022-03-17 14:25, Rob Clark wrote:
> On Thu, Mar 17, 2022 at 11:10 AM Andrey Grodzovsky
> <andrey.grodzovsky@amd.com> wrote:
>> [SNIP]
>> Shouldn't we first disable further pushing of jobs into the entity
>> before we wait for sched->job_scheduled?
>>
> In the system suspend path, userspace processes will have already been
> frozen, so there should be no way to push more jobs to the scheduler,
> unless they are pushed from the kernel itself.

That was my suspicion, but I wasn't sure about it.

> We don't do that in
> drm/msm, but maybe you need to, to move things between vram and system
> memory?

Exactly, that was my main concern - if we use this method we have to use
it at a point in the suspend sequence when all in-kernel job submission
activity is already suspended.

> But even in that case, if the # of jobs you push is bounded I guess
> that is ok?

Submissions to scheduler entities use an unbounded queue; the bounded
part is when you extract the next job from the entity to submit to the
HW ring, and it rejects the job if the submission limit is reached
(drm_sched_ready).

In general, it looks to me that what we want here is more of a drain
operation than a flush (i.e. we first want to disable any further job
submission to the entity's queue and then flush all in-flight ones). As
an example for this I was looking at flush_workqueue vs.
drain_workqueue.

Andrey

> BR,
> -R
On Thu, Mar 17, 2022 at 12:50 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> [SNIP]
> In general, it looks to me that what we want here is more of a drain
> operation than a flush (i.e. we first want to disable any further job
> submission to the entity's queue and then flush all in-flight ones). As
> an example for this I was looking at flush_workqueue vs.
> drain_workqueue.

Would it be possible for amdgpu to, in the system suspend task,

1) first queue up all the jobs needed to migrate bos out of vram, and
   whatever other housekeeping jobs are needed
2) then drain gpu scheduler's queues
3) and then finally wait for jobs executing on GPU to complete

BR,
-R
On 2022-03-17 16:35, Rob Clark wrote:
> On Thu, Mar 17, 2022 at 12:50 PM Andrey Grodzovsky
> <andrey.grodzovsky@amd.com> wrote:
>> [SNIP]
> Would it be possible for amdgpu to, in the system suspend task,
>
> 1) first queue up all the jobs needed to migrate bos out of vram, and
>    whatever other housekeeping jobs are needed
> 2) then drain gpu scheduler's queues
> 3) and then finally wait for jobs executing on GPU to complete

We already do most of it in amdgpu_device_suspend:
amdgpu_device_ip_suspend_phase1 followed by amdgpu_device_evict_resources
followed by amdgpu_fence_driver_hw_fini is exactly steps 1 + 3. What we
are missing is step 2). For this step I suggest adding a function called
drm_sched_entity_drain which basically sets entity->stopped = true and
then calls drm_sched_entity_flush. This will both reject any new
insertions into the entity's job queue and flush all pending job
submissions to HW from that entity. One point is that we need to make
drm_sched_entity_push_job return a value so the caller knows about job
enqueue rejection.

What about runtime suspend? I guess the same issue with the scheduler
racing against HW suspend is relevant there?

Also, could you point to a particular buggy scenario where the race
between the SW scheduler and suspend is causing a problem?

Andrey
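A rough sketch of the drm_sched_entity_drain() helper proposed above:
stop the entity so that further drm_sched_entity_push_job() calls are
rejected, then flush what is already queued. This helper does not exist
upstream; taking entity->rq_lock around the stopped flag mirrors what
drm_sched_entity_fini() does and is an assumption here:

#include <drm/gpu_scheduler.h>

long drm_sched_entity_drain(struct drm_sched_entity *entity, long timeout)
{
	/* Reject any new job submissions to this entity ... */
	spin_lock(&entity->rq_lock);
	entity->stopped = true;
	spin_unlock(&entity->rq_lock);

	/* ... then wait for jobs already queued to reach the HW ring. */
	return drm_sched_entity_flush(entity, timeout);
}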
On Fri, Mar 18, 2022 at 9:04 AM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> [SNIP]
> For this step I suggest adding a function called drm_sched_entity_drain
> which basically sets entity->stopped = true and then calls
> drm_sched_entity_flush. This will both reject any new insertions into
> the entity's job queue and flush all pending job submissions to HW from
> that entity. One point is that we need to make
> drm_sched_entity_push_job return a value so the caller knows about job
> enqueue rejection.

Hmm, seems like a job enqueue that is rejected because we are in the
process of suspending should be more of a WARN_ON() sort of thing? Not
sure if there is something sensible to do for the caller at that point?

> What about runtime suspend? I guess the same issue with the scheduler
> racing against HW suspend is relevant there?

Runtime suspend should be ok, as long as the driver holds a runpm
reference whenever the hw needs to be awake. The problem with system
suspend (at least if you are using pm_runtime_force_suspend() or doing
something equivalent) is that it bypasses the runpm reference. (Which,
IMO, seems like a bad design..)

> Also, could you point to a particular buggy scenario where the race
> between the SW scheduler and suspend is causing a problem?

I wrote a piglit test[1] to try to trigger this scenario.. it isn't
really that easy to hit.

BR,
-R

[1] https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/643
On 2022-03-18 12:20, Rob Clark wrote:
> On Fri, Mar 18, 2022 at 9:04 AM Andrey Grodzovsky
> <andrey.grodzovsky@amd.com> wrote:
>> [SNIP]
>> One point is that we need to make drm_sched_entity_push_job return a
>> value so the caller knows about job enqueue rejection.
> Hmm, seems like a job enqueue that is rejected because we are in the
> process of suspending should be more of a WARN_ON() sort of thing? Not
> sure if there is something sensible to do for the caller at that point?

What about the job's fence the caller is waiting on? If we rejected the
job submission, the caller must know about it so it doesn't get stuck
waiting on that fence.

>> What about runtime suspend? I guess the same issue with the scheduler
>> racing against HW suspend is relevant there?
> Runtime suspend should be ok, as long as the driver holds a runpm
> reference whenever the hw needs to be awake. The problem with system
> suspend (at least if you are using pm_runtime_force_suspend() or doing
> something equivalent) is that it bypasses the runpm reference. (Which,
> IMO, seems like a bad design..)

I am not totally clear yet - can you expand a bit on why one case is ok
but the other is problematic?

Andrey

> I wrote a piglit test[1] to try to trigger this scenario.. it isn't
> really that easy to hit.
>
> BR,
> -R
>
> [1] https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/643
On Fri, Mar 18, 2022 at 9:27 AM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> [SNIP]
> What about the job's fence the caller is waiting on? If we rejected the
> job submission, the caller must know about it so it doesn't get stuck
> waiting on that fence.

Hmm, perhaps I'm not being imaginative enough, but this sort of scenario
seems like it should only arise from a bug in the driver's suspend path,
i.e. not doing all the job submission before shutting down the
scheduler. I don't think anything good is going to result either way,
which is why I was thinking you'd want a WARN_ON() to help debug/fix
that case.

> I am not totally clear yet - can you expand a bit on why one case is ok
> but the other is problematic?

Sure. Normally pm_runtime_get/put increment a reference count; as long
as there have been more gets than puts, the device won't runtime
suspend. So, for example, msm's run_job function does a
pm_runtime_get_sync(), and retire_submit(), which runs after the job
completes on the GPU, does a pm_runtime_put_autosuspend().

System suspend, OTOH, bypasses this reference counting. Which is why
extra care is needed.

BR,
-R
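As a concrete illustration of the reference counting described here, a
simplified run_job/retire pair might look like the following. This is a
sketch, not the actual drm/msm code; my_gpu, my_job and the helpers
around them are placeholders:

#include <linux/pm_runtime.h>

static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
	struct my_job *job = to_my_job(sched_job);
	struct my_gpu *gpu = job->gpu;

	/* Keep the hw awake (runpm ref held) while the job is on the ring. */
	pm_runtime_get_sync(gpu->dev);

	return my_hw_submit(gpu, job);
}

static void my_retire_job(struct my_gpu *gpu, struct my_job *job)
{
	/* The job has completed on the GPU; drop the reference taken in
	 * run_job so the device may runtime-suspend again after the
	 * autosuspend delay.
	 */
	pm_runtime_mark_last_busy(gpu->dev);
	pm_runtime_put_autosuspend(gpu->dev);
}

System suspend does not consult this reference count when
pm_runtime_force_suspend() is wired up directly as the sleep callback,
which is why the patch below replaces it with adreno_system_suspend()
and adreno_system_resume() wrappers that park the scheduler threads
first.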
On 2022-03-18 13:22, Rob Clark wrote:
> On Fri, Mar 18, 2022 at 9:27 AM Andrey Grodzovsky
> <andrey.grodzovsky@amd.com> wrote:
>> [SNIP]
>> What about the job's fence the caller is waiting on? If we rejected
>> the job submission, the caller must know about it so it doesn't get
>> stuck waiting on that fence.
> Hmm, perhaps I'm not being imaginative enough, but this sort of
> scenario seems like it should only arise from a bug in the driver's
> suspend path, i.e. not doing all the job submission before shutting
> down the scheduler. I don't think anything good is going to result
> either way, which is why I was thinking you'd want a WARN_ON() to help
> debug/fix that case.

Yes, I just wanted the code to not allow such bugs to go through
unnoticed. I guess a WARN_ON should give a loud enough warning anyway.

Andrey

> [SNIP]
> BR,
> -R
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 8859834b51b8..0440a98988fc 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
 static int adreno_runtime_suspend(struct device *dev)
 {
 	struct msm_gpu *gpu = dev_to_gpu(dev);
-	int remaining;
+
+	/*
+	 * We should be holding a runpm ref, which will prevent
+	 * runtime suspend.  In the system suspend path, we've
+	 * already waited for active jobs to complete.
+	 */
+	WARN_ON_ONCE(gpu->active_submits);
+
+	return gpu->funcs->pm_suspend(gpu);
+}
+
+static void suspend_scheduler(struct msm_gpu *gpu)
+{
+	int i;
+
+	/*
+	 * Shut down the scheduler before we force suspend, so that
+	 * suspend isn't racing with scheduler kthread feeding us
+	 * more work.
+	 *
+	 * Note, we just want to park the thread, and let any jobs
+	 * that are already on the hw queue complete normally, as
+	 * opposed to the drm_sched_stop() path used for handling
+	 * faulting/timed-out jobs.  We can't really cancel any jobs
+	 * already on the hw queue without racing with the GPU.
+	 */
+	for (i = 0; i < gpu->nr_rings; i++) {
+		struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
+		kthread_park(sched->thread);
+	}
+}
+
+static void resume_scheduler(struct msm_gpu *gpu)
+{
+	int i;
+
+	for (i = 0; i < gpu->nr_rings; i++) {
+		struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
+		kthread_unpark(sched->thread);
+	}
+}
+
+static int adreno_system_suspend(struct device *dev)
+{
+	struct msm_gpu *gpu = dev_to_gpu(dev);
+	int remaining, ret;
+
+	suspend_scheduler(gpu);
 
 	remaining = wait_event_timeout(gpu->retire_event,
 				       active_submits(gpu) == 0,
 				       msecs_to_jiffies(1000));
 	if (remaining == 0) {
 		dev_err(dev, "Timeout waiting for GPU to suspend\n");
-		return -EBUSY;
+		ret = -EBUSY;
+		goto out;
 	}
 
-	return gpu->funcs->pm_suspend(gpu);
+	ret = pm_runtime_force_suspend(dev);
+out:
+	if (ret)
+		resume_scheduler(gpu);
+
+	return ret;
 }
+
+static int adreno_system_resume(struct device *dev)
+{
+	resume_scheduler(dev_to_gpu(dev));
+	return pm_runtime_force_resume(dev);
+}
+
 #endif
 
 static const struct dev_pm_ops adreno_pm_ops = {
-	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+	SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume)
 	SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL)
 };