binder: fix memory corruption in binder_transaction binder

Message ID 20170905172152.36227-1-tkjos@google.com
State New

Commit Message

Todd Kjos Sept. 5, 2017, 5:21 p.m.
From: Xu YiPing <xuyiping@hisilicon.com>


commit 7a4408c6bd3e ("binder: make sure accesses to proc/thread are
safe") started enqueuing tcomplete on thread->todo before enqueuing
the transaction. However, in the err_dead_proc_or_thread case,
tcomplete is freed directly without being dequeued, which may corrupt
the thread->todo list.

So, dequeue it before freeing.

Signed-off-by: Xu YiPing <xuyiping@hisilicon.com>

Signed-off-by: Todd Kjos <tkjos@google.com>

---
 drivers/android/binder.c | 1 +
 1 file changed, 1 insertion(+)

-- 
2.14.1.581.gf28d330327-goog
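
The failure mode in the commit message can be modelled in userspace. The following is a minimal sketch, not the driver code: `struct list_node`, `struct work_item` and `error_path_dequeue_then_free()` are hypothetical names chosen for illustration, the list helpers only imitate the kernel's circular doubly linked list, and locking is not modelled. It shows that unlinking a queued item before freeing it leaves the list head consistent; skipping the unlink would leave the head pointing into freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal userspace stand-in for a circular doubly linked list node. */
struct list_node {
	struct list_node *prev, *next;
};

/* Hypothetical stand-in for a queued work item such as tcomplete. */
struct work_item {
	struct list_node entry;
};

static void list_init(struct list_node *head)
{
	head->prev = head;
	head->next = head;
}

/* Link 'node' at the tail of the list anchored at 'head'. */
static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink 'node' and make it point at itself, so freeing it is safe. */
static void list_del_init(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

static bool list_empty(const struct list_node *head)
{
	return head->next == head;
}

/*
 * Models the error path: the work item is already queued on 'todo'
 * when the failure is detected. Returns true if 'todo' is empty and
 * self-consistent after cleanup.
 */
static bool error_path_dequeue_then_free(struct list_node *todo)
{
	struct work_item *tcomplete = malloc(sizeof(*tcomplete));

	if (!tcomplete)
		return false;

	list_add_tail(&tcomplete->entry, todo);	/* queued before the failure */
	assert(!list_empty(todo));

	/* Without this unlink, 'todo' would keep pointers into freed memory. */
	list_del_init(&tcomplete->entry);
	free(tcomplete);

	return list_empty(todo) && todo->prev == todo;
}
```

In this model, a caller initialises a list head with `list_init()` and passes it to `error_path_dequeue_then_free()`; the head is expected to be empty afterwards. In the driver itself the unlink is done by `binder_dequeue_work()`, as in the patch.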

Comments

Amit Pundir Sept. 11, 2017, 12:18 p.m. | #1
On 5 September 2017 at 22:51, Todd Kjos <tkjos@android.com> wrote:
> From: Xu YiPing <xuyiping@hisilicon.com>
>
> commit 7a4408c6bd3e ("binder: make sure accesses to proc/thread are
> safe") made a change to enqueue tcomplete to thread->todo before
> enqueuing the transaction. However, in err_dead_proc_or_thread case,
> the tcomplete is directly freed, without dequeued. It may cause the
> thread->todo list to be corrupted.
>
> So, dequeue it before freeing.

I see Android boot loops with this patch on hikey tracking the
linux/master branch. The first boot is fine, but hikey runs into
unexpected short boot loops on the second and subsequent boots.

It takes about 3-4 iterations to finally reach a sane state and boot
to UI. I don't see this behaviour if I revert this patch.

Regards,
Amit Pundir

Todd Kjos Sept. 11, 2017, 3:40 p.m. | #2
(resend in plain-text mode -- sorry about that)

Amit,

Are you sure this patch is the culprit? That is pretty surprising
since this change can only be hit in an uncommon case (the target node
is valid when we start creating the transaction, but dead when we
check right before sending it) so it is unlikely to be hit during a
normal boot. It also fixes a corruption -- so if you were actually
hitting the case, it would likely have caused issues before and not
now. Take a look at it and see if you think it is really possible.

I just booted hikey to Android with this patch 10 times in a row with
no issues (used hikey-linaro 4.9 kernel which has this patch).

-Todd

Amit Pundir Sept. 11, 2017, 4:55 p.m. | #3
Hi Todd,

On 11 September 2017 at 21:10, Todd Kjos <tkjos@google.com> wrote:
> (resend in plain-text mode -- sorry about that)
>
> Amit,
>
> Are you sure this patch is the culprit? That is pretty surprising
> since this change can only be hit in a uncommon case (the target node
> is valid when we start creating the transaction, but dead when we
> check right before sending it) so it is unlikely to be hit during a
> normal boot. It also fixes a corruption -- so if you were actually
> hitting the case, it would likely have caused issues before and not
> now. Take a look at it and see if you think it is really possible.
>
> I just booted hikey to Android with this patch 10 times in a row with
> no issues (used hikey-linaro 4.9 kernel which has this patch).

Sorry for not being clear enough in the bug report. android-4.9 is
fine; I see this issue on the Linux mainline tree with this patch.

I can reproduce it on John's minimal Android tree for hikey hosted
here https://git.linaro.org/people/john.stultz/android-dev.git/log/?h=dev/hikey-mainline-WIP
and hikey-llct (android-4.9 patchset rebased to mainline) tree hosted
here https://android-git.linaro.org/kernel/linaro-android.git/log/?h=test/hikey-llct.
I have already reverted this patch in hikey-llct so you have to revert
that revert to reproduce this issue on hikey-llct tree.

Regards,
Amit Pundir

Greg Kroah-Hartman Sept. 11, 2017, 5:24 p.m. | #4
On Mon, Sep 11, 2017 at 10:25:14PM +0530, Amit Pundir wrote:
> Hi Todd,
>
> On 11 September 2017 at 21:10, Todd Kjos <tkjos@google.com> wrote:
> > (resend in plain-text mode -- sorry about that)
> >
> > Amit,
> >
> > Are you sure this patch is the culprit? That is pretty surprising
> > since this change can only be hit in a uncommon case (the target node
> > is valid when we start creating the transaction, but dead when we
> > check right before sending it) so it is unlikely to be hit during a
> > normal boot. It also fixes a corruption -- so if you were actually
> > hitting the case, it would likely have caused issues before and not
> > now. Take a look at it and see if you think it is really possible.
> >
> > I just booted hikey to Android with this patch 10 times in a row with
> > no issues (used hikey-linaro 4.9 kernel which has this patch).
>
> Sorry for not being clear enough in the bug report. android-4.9 is
> fine, I see this issue on linux mainline tree with this patch.

What exact kernel release?  A number of binder fixes have recently
landed in the stable trees, and in Linus's tree.

thanks,

greg k-h
Todd Kjos Sept. 11, 2017, 7:59 p.m. | #5
Amit,

I tested with https://android-git.linaro.org/kernel/linaro-android.git/log/?h=test/hikey-llct.
I added a pr_info() above the patch's single line change and in
binder_init (so I could easily prove that I was running the correct
kernel).

First I did 10 reboots with the patch. I saw one failure to reach the
Android home screen in boot #7 (but the new line of code was never
reached, so the patch cannot be the cause)... so 9 out of 10 reboots
were fine and the failure does not point to this patch.

Then I did 10 reboots without the patch. No failures.

Then 10 more with the patch. No failures.

Then with the patch: power-on, reboot twice, no failures (repeat, no failures).

I think the issue you are seeing cannot be caused by this patch --
take a look at it and see if you think it's really possible...

-Todd

Martijn Coenen Sept. 12, 2017, 8:20 a.m. | #6
Hi Amit,

Can you try with the patch I sent to LKML recently, "[PATCH v2 10/13]
ANDROID: binder: call poll_wait() unconditionally."? This fixes a
problem in binder's poll() implementation that only causes issues
under certain racy conditions. I'm not sure why it would only trigger
now, as this problem has always been there, but perhaps my patches to
remove the proc waitqueue (which were merged recently) have
exacerbated this problem.

Thanks,
Martijn

Amit Pundir Oct. 3, 2017, 8:45 a.m. | #7
Hi,

On 12 September 2017 at 13:50, Martijn Coenen <maco@google.com> wrote:
> Hi Amit,
>
> Can you try with the patch I sent to LKML recently, "[PATCH v2 10/13]
> ANDROID: binder: call poll_wait() unconditionally."? This fixes a
> problem in binder's poll() implementation that only causes issues
> under certain racy conditions. I'm not sure why it would only trigger
> now, as this problem has always been there, but perhaps my patches to
> remove the proc waitqueue (which were merged recently) have
> exacerbated this problem.
>

Sorry, it took me a while to get back to testing this patch again. I
haven't tried your binder poll fix yet. In any case, I cannot reproduce
this problem on hikey running 4.14-rc3 on the latest AOSP (rootfs)
master snapshot.

It could be that my older AOSP rootfs snapshot was running into that
random system crash on boot, causing the boot animation loop. I've been
bitten by such intermittent AOSP issues before.

I also ran the binder tests from frameworks/native/libs/binder/tests/ to
be sure and found the results mostly in line with the android-4.9 kernel.
Sorry for all the noise.

Regards,
Amit Pundir


Patch

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index d055b3f2a207..96cc28afa383 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3083,6 +3083,7 @@  static void binder_transaction(struct binder_proc *proc,
 err_dead_proc_or_thread:
 	return_error = BR_DEAD_REPLY;
 	return_error_line = __LINE__;
+	binder_dequeue_work(proc, tcomplete);
 err_translate_failed:
 err_bad_object_type:
 err_bad_offset: