diff mbox series

staging: vchiq: rework remove_event handling

Message ID 20181210211231.4084251-1-arnd@arndb.de
State Accepted
Commit 852b2876a8a82f2c618da40b881bb07c2d42eb5c
Headers show
Series staging: vchiq: rework remove_event handling | expand

Commit Message

Arnd Bergmann Dec. 10, 2018, 9:11 p.m. UTC
I had started the removal of semaphores in this driver without knowing
that Nicolas Saenz Julienne also worked on this. In case of the "remote
event" infrastructure, my solution seemed significantly better, so I'm
proposing this as a change on top.

The problem with using either semaphores or completions here is that
it's an overly complex way of waking up a thread, and it looks like the
'count' of the semaphore can easily get out of sync, even though I found
it hard to come up with a specific example.

Changing it to a 'wait_queue_head_t' instead of a completion simplifies
this by letting us wait directly on the 'event->fired' variable that is
set by the videocore.

Another simplification is passing the wait queue directly into the helper
functions instead of going through the fragile logic of recording the
offset inside of a structure as part of a shared memory variable. This
also avoids one uncached memory read and should be faster.

Note that I'm changing it back to 'killable' after the previous patch
changed 'killable' to 'interruptible', apparently based on a misunderstanding
of the subtle down_interruptible() macro override in vchiq_killable.h.

Fixes: f27e47bc6b8b ("staging: vchiq: use completions instead of semaphores")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
 .../interface/vchiq_arm/vchiq_core.c          | 63 +++++++------------
 .../interface/vchiq_arm/vchiq_core.h          | 12 ++--
 2 files changed, 30 insertions(+), 45 deletions(-)

-- 
2.20.0

Comments

Dan Carpenter Dec. 11, 2018, 10:07 a.m. UTC | #1
On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:
> Note that I'm changing it back to 'killable' after the previous patch

> changed 'killable' to 'interruptible', apparently based on a misunderstanding

> of the subtle down_interruptible() macro override in vchiq_killable.h.


Oh wow...  The non standard down_interruptible() macro is ugly.

regards,
dan carpenter
Nicolas Saenz Julienne Dec. 11, 2018, 12:30 p.m. UTC | #2
On Tue, 2018-12-11 at 13:07 +0300, Dan Carpenter wrote:
> On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:

> > Note that I'm changing it back to 'killable' after the previous

> > patch

> > changed 'killable' to 'interruptible', apparently based on a

> > misunderstanding

> > of the subtle down_interruptible() macro override in

> > vchiq_killable.h.

> 

> Oh wow...  The non standard down_interruptible() macro is ugly.

> 


Same reaction here, *_*. I wasn't even aware of it.

Arnd, as long as it you're not doing it already. I'll prepare some
extra fixes for the rest of completions and get rid of that file
altogether.

Regards,
Nicolas
Nicolas Saenz Julienne Dec. 11, 2018, 12:36 p.m. UTC | #3
Hi Arnd, thanks for the patch!

On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:
> I had started the removal of semaphores in this driver without

> knowing

> that Nicolas Saenz Julienne also worked on this. In case of the

> "remote

> event" infrastructure, my solution seemed significantly better, so

> I'm

> proposing this as a change on top.

> 

> The problem with using either semaphores or completions here is that

> it's an overly complex way of waking up a thread, and it looks like

> the

> 'count' of the semaphore can easily get out of sync, even though I

> found

> it hard to come up with a specific example.

> 

> Changing it to a 'wait_queue_head_t' instead of a completion

> simplifies

> this by letting us wait directly on the 'event->fired' variable that

> is

> set by the videocore.

> 

> Another simplification is passing the wait queue directly into the

> helper

> functions instead of going through the fragile logic of recording the

> offset inside of a structure as part of a shared memory variable.

> This

> also avoids one uncached memory read and should be faster.

> 

> Note that I'm changing it back to 'killable' after the previous patch

> changed 'killable' to 'interruptible', apparently based on a

> misunderstanding

> of the subtle down_interruptible() macro override in

> vchiq_killable.h.

> 

> Fixes: f27e47bc6b8b ("staging: vchiq: use completions instead of

> semaphores")

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> ---

>  .../interface/vchiq_arm/vchiq_core.c          | 63 +++++++--------

> ----

>  .../interface/vchiq_arm/vchiq_core.h          | 12 ++--

>  2 files changed, 30 insertions(+), 45 deletions(-)

> 

> diff --git

> a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c

> b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c

> index 482b5daf6c0c..eda3004a0c6a 100644

> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c

> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c

> @@ -417,26 +417,23 @@ vchiq_set_conn_state(VCHIQ_STATE_T *state,

> VCHIQ_CONNSTATE_T newstate)

>  }

>  

>  static inline void

> -remote_event_create(REMOTE_EVENT_T *event)

> +remote_event_create(wait_queue_head_t *wq, REMOTE_EVENT_T *event)

>  {

>  	event->armed = 0;

>  	/* Don't clear the 'fired' flag because it may already have

> been set

>  	** by the other side. */

> +	init_waitqueue_head(wq);

>  }

>  

>  static inline int

> -remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)

> +remote_event_wait(wait_queue_head_t *wq, REMOTE_EVENT_T *event)

>  {

>  	if (!event->fired) {

>  		event->armed = 1;

>  		dsb(sy);

> -		if (!event->fired) {

> -			if (wait_for_completion_interruptible(

> -					(struct completion *)

> -					((char *)state + event-

> >event))) {

> -				event->armed = 0;

> -				return 0;

> -			}

> +		if (wait_event_killable(*wq, event->fired)) {

> +			event->armed = 0;

> +			return 0;

>  		}

>  		event->armed = 0;

>  		wmb();

> @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state,

> REMOTE_EVENT_T *event)

>  }

>  

>  static inline void

> -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T

> *event)

> +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T

> *event)

>  {

>  	event->armed = 0;

> -	complete((struct completion *)((char *)state + event->event));

> +	wake_up_all(wq);


Shouldn't this just be "wake_up(wq)"? 

Regards,
Nicolas
Arnd Bergmann Dec. 11, 2018, 2:18 p.m. UTC | #4
On Tue, Dec 11, 2018 at 1:31 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>

> On Tue, 2018-12-11 at 13:07 +0300, Dan Carpenter wrote:

> > On Mon, Dec 10, 2018 at 10:11:58PM +0100, Arnd Bergmann wrote:

> > > Note that I'm changing it back to 'killable' after the previous

> > > patch

> > > changed 'killable' to 'interruptible', apparently based on a

> > > misunderstanding

> > > of the subtle down_interruptible() macro override in

> > > vchiq_killable.h.

> >

> > Oh wow...  The non standard down_interruptible() macro is ugly.

> >

>

> Same reaction here, *_*. I wasn't even aware of it.

>

> Arnd, as long as it you're not doing it already. I'll prepare some

> extra fixes for the rest of completions and get rid of that file

> altogether.


Ok, please do that, and add me to Cc. I think there is one
counting semaphore left (g_free_fragments_sema), which could
just use the plain down_killable() before you remove the wrapper.

I actually have another (longer) patch series that is not entirely
ready to convert all counting semaphores into a new
'struct counting_semaphore' to split them away from the
semaphores that can be converted into mutexes, completions
or other replacements.

       Arnd
Arnd Bergmann Dec. 11, 2018, 2:20 p.m. UTC | #5
On Tue, Dec 11, 2018 at 1:36 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
> On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:


> > @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state,

> > REMOTE_EVENT_T *event)

> >  }

> >

> >  static inline void

> > -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T

> > *event)

> > +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T

> > *event)

> >  {

> >       event->armed = 0;

> > -     complete((struct completion *)((char *)state + event->event));

> > +     wake_up_all(wq);

>

> Shouldn't this just be "wake_up(wq)"?


I wasn't entirely sure if we could get with more than one thread waiting
for the wakeup. With the semaphore or completion that would already
be broken because we'd only wake up one of them, but I was hoping
to stay on the safe side with wake_up_all().

       Arnd
Nicolas Saenz Julienne Dec. 12, 2018, 4:24 p.m. UTC | #6
On Tue, 2018-12-11 at 15:20 +0100, Arnd Bergmann wrote:
> On Tue, Dec 11, 2018 at 1:36 PM Nicolas Saenz Julienne

> <nsaenzjulienne@suse.de> wrote:

> > On Mon, 2018-12-10 at 22:11 +0100, Arnd Bergmann wrote:

> > > @@ -447,26 +444,26 @@ remote_event_wait(VCHIQ_STATE_T *state,

> > > REMOTE_EVENT_T *event)

> > >  }

> > > 

> > >  static inline void

> > > -remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T

> > > *event)

> > > +remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T

> > > *event)

> > >  {

> > >       event->armed = 0;

> > > -     complete((struct completion *)((char *)state + event-

> > > >event));

> > > +     wake_up_all(wq);

> > 

> > Shouldn't this just be "wake_up(wq)"?

> 

> I wasn't entirely sure if we could get with more than one thread

> waiting

> for the wakeup. With the semaphore or completion that would already

> be broken because we'd only wake up one of them, but I was hoping

> to stay on the safe side with wake_up_all().


You're right. Had a look at the code and there shouldn't be more than
one thread waiting. wake_up_all() looks OK.

Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>


Regards,
Nicolas
diff mbox series

Patch

diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
index 482b5daf6c0c..eda3004a0c6a 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c
@@ -417,26 +417,23 @@  vchiq_set_conn_state(VCHIQ_STATE_T *state, VCHIQ_CONNSTATE_T newstate)
 }
 
 static inline void
-remote_event_create(REMOTE_EVENT_T *event)
+remote_event_create(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
 	event->armed = 0;
 	/* Don't clear the 'fired' flag because it may already have been set
 	** by the other side. */
+	init_waitqueue_head(wq);
 }
 
 static inline int
-remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_wait(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
 	if (!event->fired) {
 		event->armed = 1;
 		dsb(sy);
-		if (!event->fired) {
-			if (wait_for_completion_interruptible(
-					(struct completion *)
-					((char *)state + event->event))) {
-				event->armed = 0;
-				return 0;
-			}
+		if (wait_event_killable(*wq, event->fired)) {
+			event->armed = 0;
+			return 0;
 		}
 		event->armed = 0;
 		wmb();
@@ -447,26 +444,26 @@  remote_event_wait(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
 }
 
 static inline void
-remote_event_signal_local(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_signal_local(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
 	event->armed = 0;
-	complete((struct completion *)((char *)state + event->event));
+	wake_up_all(wq);
 }
 
 static inline void
-remote_event_poll(VCHIQ_STATE_T *state, REMOTE_EVENT_T *event)
+remote_event_poll(wait_queue_head_t *wq, REMOTE_EVENT_T *event)
 {
 	if (event->fired && event->armed)
-		remote_event_signal_local(state, event);
+		remote_event_signal_local(wq, event);
 }
 
 void
 remote_event_pollall(VCHIQ_STATE_T *state)
 {
-	remote_event_poll(state, &state->local->sync_trigger);
-	remote_event_poll(state, &state->local->sync_release);
-	remote_event_poll(state, &state->local->trigger);
-	remote_event_poll(state, &state->local->recycle);
+	remote_event_poll(&state->sync_trigger_event, &state->local->sync_trigger);
+	remote_event_poll(&state->sync_release_event, &state->local->sync_release);
+	remote_event_poll(&state->trigger_event, &state->local->trigger);
+	remote_event_poll(&state->recycle_event, &state->local->recycle);
 }
 
 /* Round up message sizes so that any space at the end of a slot is always big
@@ -550,7 +547,7 @@  request_poll(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T *service, int poll_type)
 	wmb();
 
 	/* ... and ensure the slot handler runs. */
-	remote_event_signal_local(state, &state->local->trigger);
+	remote_event_signal_local(&state->trigger_event, &state->local->trigger);
 }
 
 /* Called from queue_message, by the slot handler and application threads,
@@ -1069,7 +1066,7 @@  queue_message_sync(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T *service,
 		(mutex_lock_killable(&state->sync_mutex) != 0))
 		return VCHIQ_RETRY;
 
-	remote_event_wait(state, &local->sync_release);
+	remote_event_wait(&state->sync_release_event, &local->sync_release);
 
 	rmb();
 
@@ -1887,7 +1884,7 @@  slot_handler_func(void *v)
 	while (1) {
 		DEBUG_COUNT(SLOT_HANDLER_COUNT);
 		DEBUG_TRACE(SLOT_HANDLER_LINE);
-		remote_event_wait(state, &local->trigger);
+		remote_event_wait(&state->trigger_event, &local->trigger);
 
 		rmb();
 
@@ -1976,7 +1973,7 @@  recycle_func(void *v)
 		return -ENOMEM;
 
 	while (1) {
-		remote_event_wait(state, &local->recycle);
+		remote_event_wait(&state->recycle_event, &local->recycle);
 
 		process_free_queue(state, found, length);
 	}
@@ -1998,7 +1995,7 @@  sync_func(void *v)
 		int type;
 		unsigned int localport, remoteport;
 
-		remote_event_wait(state, &local->sync_trigger);
+		remote_event_wait(&state->sync_trigger_event, &local->sync_trigger);
 
 		rmb();
 
@@ -2193,11 +2190,6 @@  vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero)
 
 	init_completion(&state->connect);
 	mutex_init(&state->mutex);
-	init_completion(&state->trigger_event);
-	init_completion(&state->recycle_event);
-	init_completion(&state->sync_trigger_event);
-	init_completion(&state->sync_release_event);
-
 	mutex_init(&state->slot_mutex);
 	mutex_init(&state->recycle_mutex);
 	mutex_init(&state->sync_mutex);
@@ -2229,24 +2221,17 @@  vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero)
 	state->data_use_count = 0;
 	state->data_quota = state->slot_queue_available - 1;
 
-	local->trigger.event = offsetof(VCHIQ_STATE_T, trigger_event);
-	remote_event_create(&local->trigger);
+	remote_event_create(&state->trigger_event, &local->trigger);
 	local->tx_pos = 0;
-
-	local->recycle.event = offsetof(VCHIQ_STATE_T, recycle_event);
-	remote_event_create(&local->recycle);
+	remote_event_create(&state->recycle_event, &local->recycle);
 	local->slot_queue_recycle = state->slot_queue_available;
-
-	local->sync_trigger.event = offsetof(VCHIQ_STATE_T, sync_trigger_event);
-	remote_event_create(&local->sync_trigger);
-
-	local->sync_release.event = offsetof(VCHIQ_STATE_T, sync_release_event);
-	remote_event_create(&local->sync_release);
+	remote_event_create(&state->sync_trigger_event, &local->sync_trigger);
+	remote_event_create(&state->sync_release_event, &local->sync_release);
 
 	/* At start-of-day, the slot is empty and available */
 	((VCHIQ_HEADER_T *)SLOT_DATA_FROM_INDEX(state, local->slot_sync))->msgid
 		= VCHIQ_MSGID_PADDING;
-	remote_event_signal_local(state, &local->sync_release);
+	remote_event_signal_local(&state->sync_release_event, &local->sync_release);
 
 	local->debug[DEBUG_ENTRIES] = DEBUG_MAX;
 
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
index b76281f7510e..aae2c59700bd 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h
@@ -37,6 +37,7 @@ 
 #include <linux/mutex.h>
 #include <linux/completion.h>
 #include <linux/kthread.h>
+#include <linux/wait.h>
 
 #include "vchiq_cfg.h"
 
@@ -262,8 +263,7 @@  typedef struct vchiq_bulk_queue_struct {
 typedef struct remote_event_struct {
 	int armed;
 	int fired;
-	/* Contains offset from the beginning of the VCHIQ_STATE_T structure */
-	u32 event;
+	u32 __unused;
 } REMOTE_EVENT_T;
 
 typedef struct opaque_platform_state_t *VCHIQ_PLATFORM_STATE_T;
@@ -426,16 +426,16 @@  struct vchiq_state_struct {
 	struct task_struct *sync_thread;
 
 	/* Local implementation of the trigger remote event */
-	struct completion trigger_event;
+	wait_queue_head_t trigger_event;
 
 	/* Local implementation of the recycle remote event */
-	struct completion recycle_event;
+	wait_queue_head_t recycle_event;
 
 	/* Local implementation of the sync trigger remote event */
-	struct completion sync_trigger_event;
+	wait_queue_head_t sync_trigger_event;
 
 	/* Local implementation of the sync release remote event */
-	struct completion sync_release_event;
+	wait_queue_head_t sync_release_event;
 
 	char *tx_data;
 	char *rx_data;