diff mbox series

drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl

Message ID 20230526130131.16521-1-lm0963hack@gmail.com
State New
Headers show
Series drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl | expand

Commit Message

Min Li May 26, 2023, 1:01 p.m. UTC
If it is async, runqueue_node is freed in g2d_runqueue_worker on another
worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
then executes the following if statement, there will be use-after-free.

Signed-off-by: Min Li <lm0963hack@gmail.com>
---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andi Shyti May 30, 2023, 10:21 p.m. UTC | #1
Hi Min,

On Fri, May 26, 2023 at 09:01:31PM +0800, Min Li wrote:
> If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> then executes the following if statement, there will be use-after-free.
> 
> Signed-off-by: Min Li <lm0963hack@gmail.com>
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index ec784e58da5c..414e585ec7dd 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
>  	/* Let the runqueue know that there is work to do. */
>  	queue_work(g2d->g2d_workq, &g2d->runqueue_work);
>  
> -	if (runqueue_node->async)
> +	if (req->async)

did you actually hit this? If you did, then the fix is not OK.

Andi

>  		goto out;
>  
>  	wait_for_completion(&runqueue_node->complete);
> -- 
> 2.34.1
>
Min Li May 31, 2023, 4:39 a.m. UTC | #2
Hi Andi,

On Wed, May 31, 2023 at 6:21 AM Andi Shyti <andi.shyti@kernel.org> wrote:
>
> Hi Min,
>
> On Fri, May 26, 2023 at 09:01:31PM +0800, Min Li wrote:
> > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > then executes the following if statement, there will be use-after-free.
> >
> > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > ---
> >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > index ec784e58da5c..414e585ec7dd 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> >       /* Let the runqueue know that there is work to do. */
> >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> >
> > -     if (runqueue_node->async)
> > +     if (req->async)
>
> did you actually hit this? If you did, then the fix is not OK.

No, I didn't actually hit this. I found it through code review. This
is only a theoretical issue that can only be triggered in extreme
cases.

>
> Andi
>
> >               goto out;
> >
> >       wait_for_completion(&runqueue_node->complete);
> > --
> > 2.34.1
> >
대인기/Tizen Platform Lab(SR)/삼성전자 May 31, 2023, 7:45 a.m. UTC | #3
Hi,


> -----Original Message-----
> From: Min Li <lm0963hack@gmail.com>
> Sent: Friday, May 26, 2023 10:02 PM
> To: inki.dae@samsung.com
> Cc: sw0312.kim@samsung.com; kyungmin.park@samsung.com; airlied@gmail.com;
> daniel@ffwll.ch; krzysztof.kozlowski@linaro.org; alim.akhtar@samsung.com;
> dri-devel@lists.freedesktop.org; linux-arm-kernel@lists.infradead.org;
> linux-samsung-soc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH] drm/exynos: fix race condition UAF in
> exynos_g2d_exec_ioctl
> 
> If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> then executes the following if statement, there will be use-after-free.
> 

I received a report about the related issue from a white hacker before.
Thanks for contribution. :)

> Signed-off-by: Min Li <lm0963hack@gmail.com>
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index ec784e58da5c..414e585ec7dd 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device
*drm_dev,
> void *data,
>  	/* Let the runqueue know that there is work to do. */
>  	queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> 
> -	if (runqueue_node->async)
> +	if (req->async)
>  		goto out;
> 
>  	wait_for_completion(&runqueue_node->complete);
> --
> 2.34.1
Andi Shyti May 31, 2023, 8:19 a.m. UTC | #4
Hi Min,

> > > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > > then executes the following if statement, there will be use-after-free.
> > >
> > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > ---
> > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > index ec784e58da5c..414e585ec7dd 100644
> > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> > >       /* Let the runqueue know that there is work to do. */
> > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > >
> > > -     if (runqueue_node->async)
> > > +     if (req->async)
> >
> > did you actually hit this? If you did, then the fix is not OK.
> 
> No, I didn't actually hit this. I found it through code review. This
> is only a theoretical issue that can only be triggered in extreme
> cases.

first of all runqueue is used again two lines below this, which
means that if you don't hit the uaf here you will hit it
immediately after.

Second, if runqueue is freed, than we need to remove the part
where it's freed because it doesn't make sense to free runqueue
at this stage.

Finally, can you elaborate on the code review that you did so
that we all understand it?

Andi
Min Li May 31, 2023, 10:54 a.m. UTC | #5
Hi Andi,

On Wed, May 31, 2023 at 4:19 PM Andi Shyti <andi.shyti@kernel.org> wrote:
>
> Hi Min,
>
> > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > > > then executes the following if statement, there will be use-after-free.
> > > >
> > > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > index ec784e58da5c..414e585ec7dd 100644
> > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> > > >       /* Let the runqueue know that there is work to do. */
> > > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > >
> > > > -     if (runqueue_node->async)
> > > > +     if (req->async)
> > >
> > > did you actually hit this? If you did, then the fix is not OK.
> >
> > No, I didn't actually hit this. I found it through code review. This
> > is only a theoretical issue that can only be triggered in extreme
> > cases.
>
> first of all runqueue is used again two lines below this, which
> means that if you don't hit the uaf here you will hit it
> immediately after.

No, if async is true, then it will goto out, which will directly return.

if (runqueue_node->async)
    goto out;   // here, go to out, will directly return

wait_for_completion(&runqueue_node->complete);      // not hit
g2d_free_runqueue_node(g2d, runqueue_node);

out:
return 0;

>
> Second, if runqueue is freed, than we need to remove the part
> where it's freed because it doesn't make sense to free runqueue
> at this stage.

It is freed by g2d_free_runqueue_node in g2d_runqueue_worker

static void g2d_runqueue_worker(struct work_struct *work)
{
    ......
    if (runqueue_node) {
        pm_runtime_mark_last_busy(g2d->dev);
        pm_runtime_put_autosuspend(g2d->dev);

        complete(&runqueue_node->complete);
        if (runqueue_node->async)
            g2d_free_runqueue_node(g2d, runqueue_node);        // freed here
    }

>
> Finally, can you elaborate on the code review that you did so
> that we all understand it?

queue_work(g2d->g2d_workq, &g2d->runqueue_work);
msleep(100);        // add sleep here to let g2d_runqueue_worker run first
if (runqueue_node->async)
    goto out;


>
> Andi



--
Min Li
Andi Shyti May 31, 2023, 12:05 p.m. UTC | #6
Hi Min,

On Wed, May 31, 2023 at 06:54:34PM +0800, lm0963 wrote:
> Hi Andi,
> 
> On Wed, May 31, 2023 at 4:19 PM Andi Shyti <andi.shyti@kernel.org> wrote:
> >
> > Hi Min,
> >
> > > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > > > > then executes the following if statement, there will be use-after-free.
> > > > >
> > > > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> > > > >       /* Let the runqueue know that there is work to do. */
> > > > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > >
> > > > > -     if (runqueue_node->async)
> > > > > +     if (req->async)
> > > >
> > > > did you actually hit this? If you did, then the fix is not OK.
> > >
> > > No, I didn't actually hit this. I found it through code review. This
> > > is only a theoretical issue that can only be triggered in extreme
> > > cases.
> >
> > first of all runqueue is used again two lines below this, which
> > means that if you don't hit the uaf here you will hit it
> > immediately after.
> 
> No, if async is true, then it will goto out, which will directly return.
> 
> if (runqueue_node->async)
>     goto out;   // here, go to out, will directly return
> 
> wait_for_completion(&runqueue_node->complete);      // not hit
> g2d_free_runqueue_node(g2d, runqueue_node);
> 
> out:
> return 0;

that's right, sorry, I misread it.

> > Second, if runqueue is freed, than we need to remove the part
> > where it's freed because it doesn't make sense to free runqueue
> > at this stage.
> 
> It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> 
> static void g2d_runqueue_worker(struct work_struct *work)
> {
>     ......
>     if (runqueue_node) {
>         pm_runtime_mark_last_busy(g2d->dev);
>         pm_runtime_put_autosuspend(g2d->dev);
> 
>         complete(&runqueue_node->complete);
>         if (runqueue_node->async)
>             g2d_free_runqueue_node(g2d, runqueue_node);        // freed here

this is what I'm wondering: is it correct to free a resource
here? The design looks to me a bit fragile and prone to mistakes.

The patch per se is OK. It doesn't make much difference to me
where you actually read async, although this patch looks a bit
safer:

Reviewed-by: Andi Shyti <andi.shyti@kernel.org> 

However some refactoring might be needed to make it a bit more
robust.

Thanks,
Andi

>     }
> 
> >
> > Finally, can you elaborate on the code review that you did so
> > that we all understand it?
> 
> queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> msleep(100);        // add sleep here to let g2d_runqueue_worker run first
> if (runqueue_node->async)
>     goto out;
> 
> 
> >
> > Andi
> 
> 
> 
> --
> Min Li
대인기/Tizen Platform Lab(SR)/삼성전자 May 31, 2023, 10:55 p.m. UTC | #7
Hi Andi,

> -----Original Message-----
> From: Andi Shyti <andi.shyti@kernel.org>
> Sent: Wednesday, May 31, 2023 9:06 PM
> To: lm0963 <lm0963hack@gmail.com>
> Cc: inki.dae@samsung.com; sw0312.kim@samsung.com;
> kyungmin.park@samsung.com; airlied@gmail.com; daniel@ffwll.ch;
> krzysztof.kozlowski@linaro.org; alim.akhtar@samsung.com; dri-
> devel@lists.freedesktop.org; linux-arm-kernel@lists.infradead.org; linux-
> samsung-soc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] drm/exynos: fix race condition UAF in
> exynos_g2d_exec_ioctl
> 
> Hi Min,
> 
> On Wed, May 31, 2023 at 06:54:34PM +0800, lm0963 wrote:
> > Hi Andi,
> >
> > On Wed, May 31, 2023 at 4:19 PM Andi Shyti <andi.shyti@kernel.org> wrote:
> > >
> > > Hi Min,
> > >
> > > > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on
> another
> > > > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs
> first, and
> > > > > > then executes the following if statement, there will be use-
> after-free.
> > > > > >
> > > > > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct
> drm_device *drm_dev, void *data,
> > > > > >       /* Let the runqueue know that there is work to do. */
> > > > > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > >
> > > > > > -     if (runqueue_node->async)
> > > > > > +     if (req->async)
> > > > >
> > > > > did you actually hit this? If you did, then the fix is not OK.
> > > >
> > > > No, I didn't actually hit this. I found it through code review. This
> > > > is only a theoretical issue that can only be triggered in extreme
> > > > cases.
> > >
> > > first of all runqueue is used again two lines below this, which
> > > means that if you don't hit the uaf here you will hit it
> > > immediately after.
> >
> > No, if async is true, then it will goto out, which will directly return.
> >
> > if (runqueue_node->async)
> >     goto out;   // here, go to out, will directly return
> >
> > wait_for_completion(&runqueue_node->complete);      // not hit
> > g2d_free_runqueue_node(g2d, runqueue_node);
> >
> > out:
> > return 0;
> 
> that's right, sorry, I misread it.
> 
> > > Second, if runqueue is freed, than we need to remove the part
> > > where it's freed because it doesn't make sense to free runqueue
> > > at this stage.
> >
> > It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> >
> > static void g2d_runqueue_worker(struct work_struct *work)
> > {
> >     ......
> >     if (runqueue_node) {
> >         pm_runtime_mark_last_busy(g2d->dev);
> >         pm_runtime_put_autosuspend(g2d->dev);
> >
> >         complete(&runqueue_node->complete);
> >         if (runqueue_node->async)
> >             g2d_free_runqueue_node(g2d, runqueue_node);        // freed here
> 
> this is what I'm wondering: is it correct to free a resource
> here? The design looks to me a bit fragile and prone to mistakes.

This question seems to deviate from the purpose of this patch. If you are providing additional opinions for code quality improvement unrelated to this patch, it would be more appropriate for me to answer instead of him.

The runqueue node - which contains command list for g2d rendering - is generated when the user calls the ioctl system call. Therefore, if the user-requested command list is rendered by g2d device then there is no longer a reason to keep it. :)

> 
> The patch per se is OK. It doesn't make much difference to me
> where you actually read async, although this patch looks a bit
> safer:
> 
> Reviewed-by: Andi Shyti <andi.shyti@kernel.org>

Thanks,
Inki Dae

> 
> However some refactoring might be needed to make it a bit more
> robust.
> 
> Thanks,
> Andi
> 
> >     }
> >
> > >
> > > Finally, can you elaborate on the code review that you did so
> > > that we all understand it?
> >
> > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > msleep(100);        // add sleep here to let g2d_runqueue_worker run first
> > if (runqueue_node->async)
> >     goto out;
> >
> >
> > >
> > > Andi
> >
> >
> >
> > --
> > Min Li
Andi Shyti June 1, 2023, 8:29 a.m. UTC | #8
Hi Inki,

> > > > > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on
> > another
> > > > > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs
> > first, and
> > > > > > > then executes the following if statement, there will be use-
> > after-free.
> > > > > > >
> > > > > > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct
> > drm_device *drm_dev, void *data,
> > > > > > >       /* Let the runqueue know that there is work to do. */
> > > > > > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > > >
> > > > > > > -     if (runqueue_node->async)
> > > > > > > +     if (req->async)
> > > > > >
> > > > > > did you actually hit this? If you did, then the fix is not OK.
> > > > >
> > > > > No, I didn't actually hit this. I found it through code review. This
> > > > > is only a theoretical issue that can only be triggered in extreme
> > > > > cases.
> > > >
> > > > first of all runqueue is used again two lines below this, which
> > > > means that if you don't hit the uaf here you will hit it
> > > > immediately after.
> > >
> > > No, if async is true, then it will goto out, which will directly return.
> > >
> > > if (runqueue_node->async)
> > >     goto out;   // here, go to out, will directly return
> > >
> > > wait_for_completion(&runqueue_node->complete);      // not hit
> > > g2d_free_runqueue_node(g2d, runqueue_node);
> > >
> > > out:
> > > return 0;
> > 
> > that's right, sorry, I misread it.
> > 
> > > > Second, if runqueue is freed, than we need to remove the part
> > > > where it's freed because it doesn't make sense to free runqueue
> > > > at this stage.
> > >
> > > It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> > >
> > > static void g2d_runqueue_worker(struct work_struct *work)
> > > {
> > >     ......
> > >     if (runqueue_node) {
> > >         pm_runtime_mark_last_busy(g2d->dev);
> > >         pm_runtime_put_autosuspend(g2d->dev);
> > >
> > >         complete(&runqueue_node->complete);
> > >         if (runqueue_node->async)
> > >             g2d_free_runqueue_node(g2d, runqueue_node);        // freed here
> > 
> > this is what I'm wondering: is it correct to free a resource
> > here? The design looks to me a bit fragile and prone to mistakes.
> 
> This question seems to deviate from the purpose of this patch. If you are providing additional opinions for code quality improvement unrelated to this patch, it would be more appropriate for me to answer instead of him.

It's not deviating as the question was already made in my first
review. It just looks strange to me that a piece of data shared
amongst processes can be freed up without sinchronizing. A bunch
of if's do not make it robust enough.

The patch itself, in my point of view, is not really fixing much
and won't make any difference, it's just exposing the weakness I
mentioned.

However, honestly speaking, I don't know the driver well enough
to suggest architectural changes and that's why I r-b'ed this
one. But the first thing that comes to my mind, without looking
much at the code, is using kref's as a way to make sure that a
resource doesn't magically disappear under your nose.

But, of course, this is up to you and if in your opinion this is
OK and it fixes it... then you definitely know better :)

Thanks for this discussion,
Andi

> The runqueue node - which contains command list for g2d rendering - is generated when the user calls the ioctl system call. Therefore, if the user-requested command list is rendered by g2d device then there is no longer a reason to keep it. :)
> 
> > 
> > The patch per se is OK. It doesn't make much difference to me
> > where you actually read async, although this patch looks a bit
> > safer:
> > 
> > Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
> 
> Thanks,
> Inki Dae
> 
> > 
> > However some refactoring might be needed to make it a bit more
> > robust.
> > 
> > Thanks,
> > Andi
> > 
> > >     }
> > >
> > > >
> > > > Finally, can you elaborate on the code review that you did so
> > > > that we all understand it?
> > >
> > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > msleep(100);        // add sleep here to let g2d_runqueue_worker run first
> > > if (runqueue_node->async)
> > >     goto out;
> > >
> > >
> > > >
> > > > Andi
> > >
> > >
> > >
> > > --
> > > Min Li
> 
>
대인기/Tizen Platform Lab(SR)/삼성전자 June 2, 2023, 1:20 a.m. UTC | #9
Andi~ :)

> -----Original Message-----
> From: Andi Shyti <andi.shyti@kernel.org>
> Sent: Thursday, June 1, 2023 5:29 PM
> To: 대인기/Tizen Platform Lab(SR)/삼성전자 <inki.dae@samsung.com>
> Cc: 'lm0963' <lm0963hack@gmail.com>; sw0312.kim@samsung.com;
> kyungmin.park@samsung.com; airlied@gmail.com; daniel@ffwll.ch;
> krzysztof.kozlowski@linaro.org; alim.akhtar@samsung.com; dri-
> devel@lists.freedesktop.org; linux-arm-kernel@lists.infradead.org; linux-
> samsung-soc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] drm/exynos: fix race condition UAF in
> exynos_g2d_exec_ioctl
> 
> Hi Inki,
> 
> > > > > > > > If it is async, runqueue_node is freed in
> g2d_runqueue_worker on
> > > another
> > > > > > > > worker thread. So in extreme cases, if g2d_runqueue_worker
> runs
> > > first, and
> > > > > > > > then executes the following if statement, there will be use-
> > > after-free.
> > > > > > > >
> > > > > > > > Signed-off-by: Min Li <lm0963hack@gmail.com>
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct
> > > drm_device *drm_dev, void *data,
> > > > > > > >       /* Let the runqueue know that there is work to do. */
> > > > > > > >       queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > > > >
> > > > > > > > -     if (runqueue_node->async)
> > > > > > > > +     if (req->async)
> > > > > > >
> > > > > > > did you actually hit this? If you did, then the fix is not OK.
> > > > > >
> > > > > > No, I didn't actually hit this. I found it through code review.
> This
> > > > > > is only a theoretical issue that can only be triggered in
> extreme
> > > > > > cases.
> > > > >
> > > > > first of all runqueue is used again two lines below this, which
> > > > > means that if you don't hit the uaf here you will hit it
> > > > > immediately after.
> > > >
> > > > No, if async is true, then it will goto out, which will directly
> return.
> > > >
> > > > if (runqueue_node->async)
> > > >     goto out;   // here, go to out, will directly return
> > > >
> > > > wait_for_completion(&runqueue_node->complete);      // not hit
> > > > g2d_free_runqueue_node(g2d, runqueue_node);
> > > >
> > > > out:
> > > > return 0;
> > >
> > > that's right, sorry, I misread it.
> > >
> > > > > Second, if runqueue is freed, than we need to remove the part
> > > > > where it's freed because it doesn't make sense to free runqueue
> > > > > at this stage.
> > > >
> > > > It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> > > >
> > > > static void g2d_runqueue_worker(struct work_struct *work)
> > > > {
> > > >     ......
> > > >     if (runqueue_node) {
> > > >         pm_runtime_mark_last_busy(g2d->dev);
> > > >         pm_runtime_put_autosuspend(g2d->dev);
> > > >
> > > >         complete(&runqueue_node->complete);
> > > >         if (runqueue_node->async)
> > > >             g2d_free_runqueue_node(g2d, runqueue_node);        //
freed
> here
> > >
> > > this is what I'm wondering: is it correct to free a resource
> > > here? The design looks to me a bit fragile and prone to mistakes.
> >
> > This question seems to deviate from the purpose of this patch. If you
> are providing additional opinions for code quality improvement unrelated
> to this patch, it would be more appropriate for me to answer instead of
> him.
> 
> It's not deviating as the question was already made in my first
> review. It just looks strange to me that a piece of data shared
> amongst processes can be freed up without sinchronizing. A bunch

I believe that if we overlook any doubts or concerns about worrisome
aspects without completely resolving them, it wouldn't be helpful to the
community.
Therefore, I would like to clarify more explicitly in order to ensure a
better understanding.

AFAIK, the data you mentioned isn't shared between processes. This data is
generated driver-internally when the user makes a rendering request and
will be removed once the 2D GPU finishes rendering.


However, there may be another issue that I'm not aware of, so if there is
any, give me it more specifically as it would help improve driver stability.

Thanks again,
Inki Dae

> of if's do not make it robust enough.
> 
> The patch itself, in my point of view, is not really fixing much
> and won't make any difference, it's just exposing the weakness I
> mentioned.
> 
> However, honestly speaking, I don't know the driver well enough
> to suggest architectural changes and that's why I r-b'ed this
> one. But the first thing that comes to my mind, without looking
> much at the code, is using kref's as a way to make sure that a
> resource doesn't magically disappear under your nose.
> 
> But, of course, this is up to you and if in your opinion this is
> OK and it fixes it... then you definitely know better :)
> 
> Thanks for this discussion,
> Andi
> 
> > The runqueue node - which contains command list for g2d rendering - is
> generated when the user calls the ioctl system call. Therefore, if the
> user-requested command list is rendered by g2d device then there is no
> longer a reason to keep it. :)
> >
> > >
> > > The patch per se is OK. It doesn't make much difference to me
> > > where you actually read async, although this patch looks a bit
> > > safer:
> > >
> > > Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
> >
> > Thanks,
> > Inki Dae
> >
> > >
> > > However some refactoring might be needed to make it a bit more
> > > robust.
> > >
> > > Thanks,
> > > Andi
> > >
> > > >     }
> > > >
> > > > >
> > > > > Finally, can you elaborate on the code review that you did so
> > > > > that we all understand it?
> > > >
> > > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > msleep(100);        // add sleep here to let g2d_runqueue_worker run
> first
> > > > if (runqueue_node->async)
> > > >     goto out;
> > > >
> > > >
> > > > >
> > > > > Andi
> > > >
> > > >
> > > >
> > > > --
> > > > Min Li
> >
> >
diff mbox series

Patch

diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index ec784e58da5c..414e585ec7dd 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -1335,7 +1335,7 @@  int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
 	/* Let the runqueue know that there is work to do. */
 	queue_work(g2d->g2d_workq, &g2d->runqueue_work);
 
-	if (runqueue_node->async)
+	if (req->async)
 		goto out;
 
 	wait_for_completion(&runqueue_node->complete);