diff mbox series

[1/3] mailbox: always wait in mbox_send_message for blocking Tx mode

Message ID 1490095816-10943-1-git-send-email-sudeep.holla@arm.com
State New
Headers show
Series [1/3] mailbox: always wait in mbox_send_message for blocking Tx mode | expand

Commit Message

Sudeep Holla March 21, 2017, 11:30 a.m. UTC
There exists a race when msg_submit return immediately as there was an
active request being processed which may have completed just before it's
checked again in mbox_send_message. This will result in return to the
caller without waiting in mbox_send_message even when it's blocking Tx.

This patch fixes the issue by waiting for the completion always if Tx
is in blocking mode.

Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
Cc: Jassi Brar <jassisinghbrar@gmail.com>
Reported-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

---
 drivers/mailbox/mailbox.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Hi Jassi,

Here are fixes for few issues we encountered when dealing with multiple
requests on multiple channels simultaneously.

Regards,
Sudeep

--
2.7.4

Comments

Jassi Brar March 28, 2017, 6:20 p.m. UTC | #1
Hi Sudeep,

On Tue, Mar 21, 2017 at 5:00 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:
> There exists a race when msg_submit return immediately as there was an

> active request being processed which may have completed just before it's

> checked again in mbox_send_message. This will result in return to the

> caller without waiting in mbox_send_message even when it's blocking Tx.

>

> This patch fixes the issue by waiting for the completion always if Tx

> is in blocking mode.

>

> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")

> Cc: Jassi Brar <jassisinghbrar@gmail.com>

> Reported-by: Alexey Klimov <alexey.klimov@arm.com>

> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

> ---

>  drivers/mailbox/mailbox.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

>

> Hi Jassi,

>

> Here are fixes for few issues we encountered when dealing with multiple

> requests on multiple channels simultaneously.

>

Thanks for finding the bug.

I see patch-1 tries to fix the bug.  Patch-2,3 try to fix the
ramifications of the bug
but they may change behaviour for some users. Do you face any issue even after
applying patch-1 ?

Thanks
Sudeep Holla March 29, 2017, 11:34 a.m. UTC | #2
On 28/03/17 19:20, Jassi Brar wrote:
> Hi Sudeep,

> 

> On Tue, Mar 21, 2017 at 5:00 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:

>> There exists a race when msg_submit return immediately as there was an

>> active request being processed which may have completed just before it's

>> checked again in mbox_send_message. This will result in return to the

>> caller without waiting in mbox_send_message even when it's blocking Tx.

>>

>> This patch fixes the issue by waiting for the completion always if Tx

>> is in blocking mode.

>>

>> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")

>> Cc: Jassi Brar <jassisinghbrar@gmail.com>

>> Reported-by: Alexey Klimov <alexey.klimov@arm.com>

>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

>> ---

>>  drivers/mailbox/mailbox.c | 2 +-

>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> Hi Jassi,

>>

>> Here are fixes for few issues we encountered when dealing with multiple

>> requests on multiple channels simultaneously.

>>

> Thanks for finding the bug.

> 

> I see patch-1 tries to fix the bug.  Patch-2,3 try to fix the

> ramifications of the bug but they may change behaviour for some users.

> Do you face any issue even after applying patch-1 ?

> 


Unfortunately yes. Are you concerned with the change in return value on
timeout ? I understand and then I chose -ETIME vs -ETIMEDOUT as hardware
can still use it and we can distinguish the software timer expiry from
that. Even -EIO seems incorrect for s/w timeout as it exists today, but
I agree it has some impact on existing users.

Also Patch 3 seems independent for me just to avoid complete call if it
was empty message.

Alexey also brought up another issue which is relating to ordering and
may require completion flags per message instead of per channel. Today
we can't guarantee that first blocker on the wait queue is same as the
first in the mailbox queue.

e.g.:
	Thread#1(T1)		   Thread#2(T2)
     mbox_send_message		 mbox_send_message
            |				|
	    V				|
	add_to_rbuf(M1)			V
	    |			  add_to_rbuf(M2)
	    |				|
	    |				V
	    V			   msg_submit(picks M1)
	msg_submit			|
	    |				V
	    V			wait_for_completion(on M2)
     wait_for_completion(on M1)		|  (1st in waitQ)
     	    |	(2nd in waitQ)		V
     	    V			wake_up(on completion of M1)<--incorrect

I will let him dive in as he had some thoughts on that.

-- 
Regards,
Sudeep
Jassi Brar March 29, 2017, 5:46 p.m. UTC | #3
On Wed, Mar 29, 2017 at 5:04 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:
>

> On 28/03/17 19:20, Jassi Brar wrote:

>> Hi Sudeep,

>>

>> On Tue, Mar 21, 2017 at 5:00 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:

>>> There exists a race when msg_submit return immediately as there was an

>>> active request being processed which may have completed just before it's

>>> checked again in mbox_send_message. This will result in return to the

>>> caller without waiting in mbox_send_message even when it's blocking Tx.

>>>

>>> This patch fixes the issue by waiting for the completion always if Tx

>>> is in blocking mode.

>>>

>>> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")

>>> Cc: Jassi Brar <jassisinghbrar@gmail.com>

>>> Reported-by: Alexey Klimov <alexey.klimov@arm.com>

>>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

>>> ---

>>>  drivers/mailbox/mailbox.c | 2 +-

>>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>>

>>> Hi Jassi,

>>>

>>> Here are fixes for few issues we encountered when dealing with multiple

>>> requests on multiple channels simultaneously.

>>>

>> Thanks for finding the bug.

>>

>> I see patch-1 tries to fix the bug.  Patch-2,3 try to fix the

>> ramifications of the bug but they may change behaviour for some users.

>> Do you face any issue even after applying patch-1 ?

>>

>

> Unfortunately yes. Are you concerned with the change in return value on

> timeout ? I understand and then I chose -ETIME vs -ETIMEDOUT as hardware

> can still use it and we can distinguish the software timer expiry from

> that. Even -EIO seems incorrect for s/w timeout as it exists today, but

> I agree it has some impact on existing users.

>

> Also Patch 3 seems independent for me just to avoid complete call if it

> was empty message.

>

> Alexey also brought up another issue which is relating to ordering and

> may require completion flags per message instead of per channel. Today

> we can't guarantee that first blocker on the wait queue is same as the

> first in the mailbox queue.

>

> e.g.:

>         Thread#1(T1)               Thread#2(T2)

>      mbox_send_message           mbox_send_message

>             |                           |

>             V                           |

>         add_to_rbuf(M1)                 V

>             |                     add_to_rbuf(M2)

>             |                           |

>             |                           V

>             V                      msg_submit(picks M1)

>         msg_submit                      |

>             |                           V

>             V                   wait_for_completion(on M2)

>      wait_for_completion(on M1)         |  (1st in waitQ)

>             |   (2nd in waitQ)          V

>             V                   wake_up(on completion of M1)<--incorrect

>

Yes, that is a possibility. I have sent a fix for this. It would help
if Alexey/you could give it a try.

Thanks
Alexey Klimov April 11, 2017, 1 p.m. UTC | #4
On Tue, Mar 21, 2017 at 11:30:14AM +0000, Sudeep Holla wrote:
> There exists a race when msg_submit return immediately as there was an

> active request being processed which may have completed just before it's

> checked again in mbox_send_message. This will result in return to the

> caller without waiting in mbox_send_message even when it's blocking Tx.

> 

> This patch fixes the issue by waiting for the completion always if Tx

> is in blocking mode.

> 

> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")

> Cc: Jassi Brar <jassisinghbrar@gmail.com>

> Reported-by: Alexey Klimov <alexey.klimov@arm.com>

> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>



Reviewed-by: Alexey Klimov <alexey.klimov@arm.com>




> ---

>  drivers/mailbox/mailbox.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> Hi Jassi,

> 

> Here are fixes for few issues we encountered when dealing with multiple

> requests on multiple channels simultaneously.

> 

> Regards,

> Sudeep

> 

> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c

> index 4671f8a12872..160d6640425a 100644

> --- a/drivers/mailbox/mailbox.c

> +++ b/drivers/mailbox/mailbox.c

> @@ -260,7 +260,7 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)

> 

>  	msg_submit(chan);

> 

> -	if (chan->cl->tx_block && chan->active_req) {

> +	if (chan->cl->tx_block) {

>  		unsigned long wait;

>  		int ret;

> 

> --

> 2.7.4

>
diff mbox series

Patch

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 4671f8a12872..160d6640425a 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -260,7 +260,7 @@  int mbox_send_message(struct mbox_chan *chan, void *mssg)

 	msg_submit(chan);

-	if (chan->cl->tx_block && chan->active_req) {
+	if (chan->cl->tx_block) {
 		unsigned long wait;
 		int ret;