Message ID | 1584407863-774-11-git-send-email-phil.yang@arm.com
---|---
State | Superseded
Series | None
> From: Phil Yang <phil.yang@arm.com>
> Sent: Tuesday, March 17, 2020 1:18 AM
> To: thomas@monjalon.net; Van Haaren, Harry <harry.van.haaren@intel.com>;
> Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org; maxime.coquelin@redhat.com; dev@dpdk.org
> Cc: david.marchand@redhat.com; jerinj@marvell.com; hemant.agrawal@nxp.com;
> Honnappa.Nagarahalli@arm.com; gavin.hu@arm.com; ruifeng.wang@arm.com;
> joyce.kong@arm.com; nd@arm.com; Honnappa Nagarahalli
> <honnappa.nagarahalli@arm.com>; stable@dpdk.org
> Subject: [PATCH v3 10/12] service: identify service running on another core
> correctly
>
> From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>
> The logic to identify if the MT unsafe service is running on another
> core can return -EBUSY spuriously. In such cases, running the service
> becomes costlier than using atomic operations. Assume that the
> application passes the right parameters and reduces the number of
> instructions for all cases.
>
> Cc: stable@dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>

Is this fixing broken functionality, or does it aim to only "optimize"?
The lack of a "Fixes" tag suggests optimization.

I'm cautious about the commit phrase "Assume that the application ...":
if the code was previously checking things, we must not stop checking them
now, as this may introduce race conditions in existing applications.

It seems like the "serialize_mt_unsafe" branch is being pushed further down
the call graph, and instead of branching over atomics this patch forces
always executing 2 atomics?

This feels like too specific an optimization/tradeoff, without data to back
up that there are no regressions on any DPDK-supported platform.

DPDK today doesn't have a micro-benchmark to gather such perf data, but I
would welcome one so we can have a data-driven decision.

Hope this point-of-view makes sense, -Harry

> ---
>  lib/librte_eal/common/rte_service.c | 26 ++++++++------------------
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/lib/librte_eal/common/rte_service.c
> b/lib/librte_eal/common/rte_service.c
> index 32a2f8a..0843c3c 100644
> --- a/lib/librte_eal/common/rte_service.c
> +++ b/lib/librte_eal/common/rte_service.c
> @@ -360,7 +360,7 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
>  /* Expects the service 's' is valid. */
>  static int32_t
>  service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
> -		struct rte_service_spec_impl *s)
> +		struct rte_service_spec_impl *s, uint32_t serialize_mt_unsafe)
>  {
>  	if (!s)
>  		return -EINVAL;
> @@ -374,7 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
>
>  	cs->service_active_on_lcore[i] = 1;
>
> -	if (service_mt_safe(s) == 0) {
> +	if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
>  		if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
>  			return -EBUSY;
>
> @@ -412,24 +412,14 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t serialize_mt_unsafe)
>
>  	SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
>
> -	/* Atomically add this core to the mapped cores first, then examine if
> -	 * we can run the service. This avoids a race condition between
> -	 * checking the value, and atomically adding to the mapped count.
> +	/* Increment num_mapped_cores to indicate that the service
> +	 * is running on a core.
>  	 */
> -	if (serialize_mt_unsafe)
> -		rte_atomic32_inc(&s->num_mapped_cores);
> +	rte_atomic32_inc(&s->num_mapped_cores);
>
> -	if (service_mt_safe(s) == 0 &&
> -			rte_atomic32_read(&s->num_mapped_cores) > 1) {
> -		if (serialize_mt_unsafe)
> -			rte_atomic32_dec(&s->num_mapped_cores);
> -		return -EBUSY;
> -	}
> -
> -	int ret = service_run(id, cs, UINT64_MAX, s);
> +	int ret = service_run(id, cs, UINT64_MAX, s, serialize_mt_unsafe);
>
> -	if (serialize_mt_unsafe)
> -		rte_atomic32_dec(&s->num_mapped_cores);
> +	rte_atomic32_dec(&s->num_mapped_cores);
>
>  	return ret;
>  }
> @@ -449,7 +439,7 @@ service_runner_func(void *arg)
>  		if (!service_valid(i))
>  			continue;
>  		/* return value ignored as no change to code flow */
> -		service_run(i, cs, service_mask, service_get(i));
> +		service_run(i, cs, service_mask, service_get(i), 1);
>  	}
>
>  	cs->loops++;
> --
> 2.7.4
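To make the tradeoff Harry describes concrete: the question is whether two
unconditional atomic read-modify-writes cost more than a branch that skips
them. Below is a minimal standalone micro-benchmark sketch along those lines;
it is not a DPDK test, all names in it are hypothetical, and because it is
single-threaded it captures only the uncontended cost, not the cross-core
cache-line effects a real contended run would show.

    /* Hypothetical standalone sketch: compare "branch over atomics" with
     * "always execute 2 atomics". Single-threaded, so it measures only
     * the uncontended cost. Build: cc -O2 -std=c11 bench.c -o bench
     */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    static atomic_uint num_mapped_cores;

    static uint64_t now_ns(void)
    {
    	struct timespec ts;
    	clock_gettime(CLOCK_MONOTONIC, &ts);
    	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    }

    int main(void)
    {
    	const uint64_t iters = 10000000;
    	/* volatile so the compiler cannot fold the branch away */
    	volatile unsigned int serialize_mt_unsafe = 0;

    	/* Old pattern: atomics executed only behind the branch. */
    	uint64_t t0 = now_ns();
    	for (uint64_t i = 0; i < iters; i++) {
    		if (serialize_mt_unsafe)
    			atomic_fetch_add(&num_mapped_cores, 1);
    		if (serialize_mt_unsafe)
    			atomic_fetch_sub(&num_mapped_cores, 1);
    	}
    	uint64_t t1 = now_ns();

    	/* New pattern: unconditional atomic inc/dec. */
    	for (uint64_t i = 0; i < iters; i++) {
    		atomic_fetch_add(&num_mapped_cores, 1);
    		atomic_fetch_sub(&num_mapped_cores, 1);
    	}
    	uint64_t t2 = now_ns();

    	printf("branch-over-atomics: %.2f ns/iter\n",
    	       (double)(t1 - t0) / (double)iters);
    	printf("always-2-atomics:    %.2f ns/iter\n",
    	       (double)(t2 - t1) / (double)iters);
    	return 0;
    }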
<snip>
> > Subject: [PATCH v3 10/12] service: identify service running on another
> > core correctly
> >
> > From: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> >
> > The logic to identify if the MT unsafe service is running on another
> > core can return -EBUSY spuriously. In such cases, running the service
> > becomes costlier than using atomic operations. Assume that the
> > application passes the right parameters and reduces the number of
> > instructions for all cases.
> >
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>
> Is this fixing broken functionality, or does it aim to only "optimize"?
> The lack of a "Fixes" tag suggests optimization.

Good point, it is missing the 'Fixes' tag.

Looking at the code again, the following are a few problems that I observe;
please correct me if I am wrong:

1) Looking at the API rte_service_map_lcore_set, 'num_mapped_cores' keeps
track of the number of cores the service is mapped to. However, in the
function 'rte_service_run_iter_on_app_lcore', 'num_mapped_cores' is
incremented only if 'serialize_mt_unsafe' is set. This returns an incorrect
result from the 'rte_service_runstate_get' API when 'serialize_mt_unsafe'
is not set.

2) Since incrementing 'num_mapped_cores' and checking its value is not an
atomic operation, it introduces race conditions. Consider the current code
snippet from the function 'rte_service_run_iter_on_app_lcore':

1) if (serialize_mt_unsafe)
2)	rte_atomic32_inc(&s->num_mapped_cores);
3) if (service_mt_safe(s) == 0 &&
		rte_atomic32_read(&s->num_mapped_cores) > 1) {
4)	if (serialize_mt_unsafe)
5)		rte_atomic32_dec(&s->num_mapped_cores);
6)	return -EBUSY;
7) }

It is possible that more than one thread executes line 3) concurrently, in
which case all of them will hit line 6). Because of this, the service might
never run, or it might take multiple attempts to run, wasting cycles.

If you agree these are bugs, I can add the 'Fixes' tag.

> I'm cautious about the commit phrase "Assume that the application ...":
> if the code was previously checking things, we must not stop checking them
> now, as this may introduce race conditions in existing applications.

What I meant here is: let us believe what the application says, i.e. if the
application sets 'serialize_mt_unsafe' to 1, let us assume that the service
is in fact 'unsafe'.

> It seems like the "serialize_mt_unsafe" branch is being pushed further down
> the call graph, and instead of branching over atomics this patch forces
> always executing 2 atomics?

Yes, that is correct. Please see the explanation in 1) above.

> This feels like too specific an optimization/tradeoff, without data to back
> up that there are no regressions on any DPDK-supported platform.

Apologies for missing the 'Fixes' tag; this patch is a bug fix.

> DPDK today doesn't have a micro-benchmark to gather such perf data, but I
> would welcome one so we can have a data-driven decision.

When this was developed, was the logic to avoid taking the lock measured to
provide any improvement?
> Hope this point-of-view makes sense, -Harry

<snip>
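The window Honnappa describes in 2) can be reproduced in isolation. The
following is a hedged standalone sketch (C11 atomics plus pthreads; the
names are hypothetical and this is not the DPDK code path): when the two
threads interleave their increments before either read, both observe a
count greater than 1 and both take the -EBUSY path, so the "service" runs
zero times on that attempt.

    /* Hypothetical sketch of the spurious -EBUSY described above: two
     * threads run the inc -> read -> dec sequence concurrently. If the
     * increments interleave before either read, both see a count > 1 and
     * both bail out, so the "service" is not run at all that attempt.
     * Build: cc -O2 -std=c11 race.c -lpthread -o race
     */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int num_mapped_cores;
    static atomic_int busy_count;
    static atomic_int ran_count;

    static void *try_run_service(void *arg)
    {
    	(void)arg;
    	/* Mirrors lines 1)-7) of the quoted snippet, with
    	 * serialize_mt_unsafe == 1 and an MT-unsafe service. */
    	atomic_fetch_add(&num_mapped_cores, 1);
    	if (atomic_load(&num_mapped_cores) > 1) {
    		atomic_fetch_sub(&num_mapped_cores, 1);
    		atomic_fetch_add(&busy_count, 1);	/* -EBUSY path */
    		return NULL;
    	}
    	atomic_fetch_add(&ran_count, 1);		/* service ran */
    	atomic_fetch_sub(&num_mapped_cores, 1);
    	return NULL;
    }

    int main(void)
    {
    	/* Repeat many times to hit the narrow interleaving window. */
    	for (int i = 0; i < 100000; i++) {
    		pthread_t a, b;
    		pthread_create(&a, NULL, try_run_service, NULL);
    		pthread_create(&b, NULL, try_run_service, NULL);
    		pthread_join(a, NULL);
    		pthread_join(b, NULL);
    	}
    	/* With the race, some iterations end with both threads busy. */
    	printf("ran=%d busy=%d\n", atomic_load(&ran_count),
    	       atomic_load(&busy_count));
    	return 0;
    }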
diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 32a2f8a..0843c3c 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -360,7 +360,7 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
 /* Expects the service 's' is valid. */
 static int32_t
 service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
-		struct rte_service_spec_impl *s)
+		struct rte_service_spec_impl *s, uint32_t serialize_mt_unsafe)
 {
 	if (!s)
 		return -EINVAL;
@@ -374,7 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
 
 	cs->service_active_on_lcore[i] = 1;
 
-	if (service_mt_safe(s) == 0) {
+	if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
 		if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
 			return -EBUSY;
 
@@ -412,24 +412,14 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t serialize_mt_unsafe)
 
 	SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
-	/* Atomically add this core to the mapped cores first, then examine if
-	 * we can run the service. This avoids a race condition between
-	 * checking the value, and atomically adding to the mapped count.
+	/* Increment num_mapped_cores to indicate that the service
+	 * is running on a core.
 	 */
-	if (serialize_mt_unsafe)
-		rte_atomic32_inc(&s->num_mapped_cores);
+	rte_atomic32_inc(&s->num_mapped_cores);
 
-	if (service_mt_safe(s) == 0 &&
-			rte_atomic32_read(&s->num_mapped_cores) > 1) {
-		if (serialize_mt_unsafe)
-			rte_atomic32_dec(&s->num_mapped_cores);
-		return -EBUSY;
-	}
-
-	int ret = service_run(id, cs, UINT64_MAX, s);
+	int ret = service_run(id, cs, UINT64_MAX, s, serialize_mt_unsafe);
 
-	if (serialize_mt_unsafe)
-		rte_atomic32_dec(&s->num_mapped_cores);
+	rte_atomic32_dec(&s->num_mapped_cores);
 
 	return ret;
 }
@@ -449,7 +439,7 @@ service_runner_func(void *arg)
 		if (!service_valid(i))
 			continue;
 		/* return value ignored as no change to code flow */
-		service_run(i, cs, service_mask, service_get(i));
+		service_run(i, cs, service_mask, service_get(i), 1);
 	}
 
 	cs->loops++;
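For readers unfamiliar with the API being patched, here is a hedged usage
sketch, assuming a working DPDK/EAL environment; the service name and
callback are hypothetical, not from the patch. With 'serialize_mt_unsafe'
set to 1, the patched code relies on the 'execute_lock' compare-and-set
inside service_run() rather than the removed 'num_mapped_cores' pre-check,
so -EBUSY should be returned only while the service really is executing on
another core.

    /* Illustrative sketch: register an MT-unsafe service and run one
     * iteration of it from the application lcore. Assumes a working
     * DPDK environment; link against the usual DPDK libraries. */
    #include <rte_eal.h>
    #include <rte_service.h>
    #include <rte_service_component.h>
    #include <stdio.h>

    static int32_t
    my_service_cb(void *userdata)
    {
    	(void)userdata;
    	/* ... do one iteration of work ... */
    	return 0;
    }

    int
    main(int argc, char **argv)
    {
    	if (rte_eal_init(argc, argv) < 0)
    		return -1;

    	struct rte_service_spec spec = { .callback = my_service_cb };
    	snprintf(spec.name, sizeof(spec.name), "example_service");
    	/* capabilities left 0: the service is MT unsafe. */

    	uint32_t id;
    	if (rte_service_component_register(&spec, &id) != 0)
    		return -1;
    	rte_service_component_runstate_set(id, 1);
    	rte_service_runstate_set(id, 1);

    	/* serialize_mt_unsafe = 1: serialize against any service lcore
    	 * that may also be mapped to this service. */
    	int ret = rte_service_run_iter_on_app_lcore(id, 1);
    	printf("service run returned %d\n", ret);

    	return rte_eal_cleanup();
    }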