Message ID | 20201104161625.1085981-1-bjorn.andersson@linaro.org |
---|---|
State | Superseded |
Series | remoteproc: sysmon: Ensure remote notification ordering |
On Wed 04 Nov 10:16 CST 2020, Bjorn Andersson wrote:

> The reliance on the remoteproc's state for determining when to send
> sysmon notifications to a remote processor is racy with regard to
> concurrent remoteproc operations.
>
> Furthermore the advertisement of the state of other remote processors to
> a newly started remote processor might not only send the wrong state,
> but might result in a stream of state changes that are out of order.
>
> Address this by introducing state tracking within the sysmon instances
> themselves and extend the locking to ensure that the notifications are
> consistent with this state.
>
> The use of a big lock for all instances will cause contention for
> concurrent remote processor state transitions, but the correctness of
> the remote processors' view of their peers is more important.
>
> Fixes: 1f36ab3f6e3b ("remoteproc: sysmon: Inform current rproc about all active rprocs")
> Fixes: 1877f54f75ad ("remoteproc: sysmon: Add notifications for events")
> Fixes: 1fb82ee806d1 ("remoteproc: qcom: Introduce sysmon")
> Cc: stable@vger.kernel.org
> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
> ---
>  drivers/remoteproc/qcom_sysmon.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
> index 9eb2f6bccea6..1e507b66354a 100644
> --- a/drivers/remoteproc/qcom_sysmon.c
> +++ b/drivers/remoteproc/qcom_sysmon.c
> @@ -22,6 +22,8 @@ struct qcom_sysmon {
>  	struct rproc_subdev subdev;
>  	struct rproc *rproc;
>
> +	int state;
> +
>  	struct list_head node;
>
>  	const char *name;
> @@ -448,7 +450,10 @@ static int sysmon_prepare(struct rproc_subdev *subdev)
>  		.ssr_event = SSCTL_SSR_EVENT_BEFORE_POWERUP
>  	};
>
> +	mutex_lock(&sysmon_lock);

This doesn't work, because taking the big lock prevents a concurrently
failing remote processor from reaching smd or glink to indicate that that
remote is dead and the first remote's notifications should be
aborted/fail fast.

The result is in most cases that we're stuck here waiting for a timeout,
but there are extreme corner cases where the notification might be
waiting for the dead remote to drain the communication fifo.

Will send a new version that doesn't rely on the big lock, but still keeps
state information consistent.

Regards,
Bjorn
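The stall described in this reply can be modelled in userspace. What follows is only a rough single-threaded sketch, not the driver code: `big_lock` stands in for `sysmon_lock`, while `crash_path_try()`, `notify_dead_peer()` and `transport_closed` are hypothetical names invented for the illustration. It assumes the failing remote's crash path must take the big lock before it can reach the transport teardown, and uses `pthread_mutex_trylock()` where a real thread would block.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Set once the dead remote's transport has been torn down. */
static bool transport_closed;

/*
 * Crash path of the failing remote (hypothetical model): the sysmon stop
 * step needs the big lock before the transport teardown can run.
 * Returns true if the teardown was reached.
 */
static bool crash_path_try(void)
{
	if (pthread_mutex_trylock(&big_lock) != 0)
		return false;		/* a real thread would block here */
	pthread_mutex_unlock(&big_lock);
	transport_closed = true;	/* pending sends can now fail fast */
	return true;
}

/*
 * Notifier of the healthy remote: optionally holds the big lock while a
 * send to the dead peer is pending. Returns true if the send could be
 * aborted early, false if it would have to run into its timeout.
 */
static bool notify_dead_peer(bool hold_big_lock)
{
	bool aborted;

	if (hold_big_lock)
		pthread_mutex_lock(&big_lock);

	/* The peer crashes while the send is pending. */
	crash_path_try();
	aborted = transport_closed;

	if (hold_big_lock)
		pthread_mutex_unlock(&big_lock);
	return aborted;
}
```

In this model `notify_dead_peer(true)` cannot be aborted because the crash path never gets past the lock, while `notify_dead_peer(false)` sees the transport closed and can fail fast.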
diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index 9eb2f6bccea6..1e507b66354a 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -22,6 +22,8 @@ struct qcom_sysmon {
 	struct rproc_subdev subdev;
 	struct rproc *rproc;
 
+	int state;
+
 	struct list_head node;
 
 	const char *name;
@@ -448,7 +450,10 @@ static int sysmon_prepare(struct rproc_subdev *subdev)
 		.ssr_event = SSCTL_SSR_EVENT_BEFORE_POWERUP
 	};
 
+	mutex_lock(&sysmon_lock);
+	sysmon->state = SSCTL_SSR_EVENT_BEFORE_POWERUP;
 	blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+	mutex_unlock(&sysmon_lock);
 
 	return 0;
 }
@@ -472,15 +477,16 @@ static int sysmon_start(struct rproc_subdev *subdev)
 		.ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
 	};
 
+	mutex_lock(&sysmon_lock);
+	sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
 	blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
 
-	mutex_lock(&sysmon_lock);
 	list_for_each_entry(target, &sysmon_list, node) {
-		if (target == sysmon ||
-		    target->rproc->state != RPROC_RUNNING)
+		if (target == sysmon)
 			continue;
 
 		event.subsys_name = target->name;
+		event.ssr_event = target->state;
 
 		if (sysmon->ssctl_version == 2)
 			ssctl_send_event(sysmon, &event);
@@ -500,7 +506,10 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
 		.ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
 	};
 
+	mutex_lock(&sysmon_lock);
+	sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
 	blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+	mutex_unlock(&sysmon_lock);
 
 	/* Don't request graceful shutdown if we've crashed */
 	if (crashed)
@@ -521,7 +530,10 @@ static void sysmon_unprepare(struct rproc_subdev *subdev)
 		.ssr_event = SSCTL_SSR_EVENT_AFTER_SHUTDOWN
 	};
 
+	mutex_lock(&sysmon_lock);
+	sysmon->state = SSCTL_SSR_EVENT_AFTER_SHUTDOWN;
 	blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+	mutex_unlock(&sysmon_lock);
 }
 
 /**
@@ -538,7 +550,7 @@ static int sysmon_notify(struct notifier_block *nb, unsigned long event,
 	struct sysmon_event *sysmon_event = data;
 
 	/* Skip non-running rprocs and the originating instance */
-	if (rproc->state != RPROC_RUNNING ||
+	if (sysmon->state != SSCTL_SSR_EVENT_AFTER_POWERUP ||
 	    !strcmp(sysmon_event->subsys_name, sysmon->name)) {
 		dev_dbg(sysmon->dev, "not notifying %s\n", sysmon->name);
 		return NOTIFY_DONE;
The reliance on the remoteproc's state for determining when to send
sysmon notifications to a remote processor is racy with regard to
concurrent remoteproc operations.

Furthermore the advertisement of the state of other remote processors to
a newly started remote processor might not only send the wrong state,
but might result in a stream of state changes that are out of order.

Address this by introducing state tracking within the sysmon instances
themselves and extend the locking to ensure that the notifications are
consistent with this state.

The use of a big lock for all instances will cause contention for
concurrent remote processor state transitions, but the correctness of
the remote processors' view of their peers is more important.

Fixes: 1f36ab3f6e3b ("remoteproc: sysmon: Inform current rproc about all active rprocs")
Fixes: 1877f54f75ad ("remoteproc: sysmon: Add notifications for events")
Fixes: 1fb82ee806d1 ("remoteproc: qcom: Introduce sysmon")
Cc: stable@vger.kernel.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---
 drivers/remoteproc/qcom_sysmon.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)
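The state tracking described in this commit message can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the driver: the enum values stand in for the `SSCTL_SSR_EVENT_*` constants, `big_lock` for `sysmon_lock`, and `struct instance`, `peers`, `view` and `announce()` are hypothetical names. Each instance records its own last announced event under the lock, peers are filtered on that recorded state rather than on a separately maintained remoteproc state, and a newly started instance is told each peer's recorded state.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the SSCTL_SSR_EVENT_* values. */
enum ssr_event { BEFORE_POWERUP, AFTER_POWERUP, BEFORE_SHUTDOWN, AFTER_SHUTDOWN };
enum { NPEERS = 3 };

struct instance {
	const char *name;
	enum ssr_event state;		/* last event this instance announced */
	enum ssr_event view[NPEERS];	/* last event received about each peer */
};

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static struct instance peers[NPEERS] = {
	{ "modem", AFTER_SHUTDOWN, { AFTER_SHUTDOWN, AFTER_SHUTDOWN, AFTER_SHUTDOWN } },
	{ "adsp",  AFTER_SHUTDOWN, { AFTER_SHUTDOWN, AFTER_SHUTDOWN, AFTER_SHUTDOWN } },
	{ "cdsp",  AFTER_SHUTDOWN, { AFTER_SHUTDOWN, AFTER_SHUTDOWN, AFTER_SHUTDOWN } },
};

/* Record the new state and notify peers within one critical section. */
static void announce(struct instance *self, enum ssr_event event)
{
	int me = (int)(self - peers);
	int i;

	pthread_mutex_lock(&big_lock);
	self->state = event;

	/* Skip the originating instance and peers that are not powered up. */
	for (i = 0; i < NPEERS; i++) {
		if (i == me || peers[i].state != AFTER_POWERUP)
			continue;
		peers[i].view[me] = event;
	}

	/* A newly started instance learns each peer's recorded state. */
	if (event == AFTER_POWERUP) {
		for (i = 0; i < NPEERS; i++) {
			if (i != me)
				self->view[i] = peers[i].state;
		}
	}
	pthread_mutex_unlock(&big_lock);
}
```

Because the state update and the notifications share one critical section, a peer's `view` cannot observe events in a different order than the order in which they were recorded. That is also why this version serializes all instances on a single lock.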