From patchwork Tue Mar 19 04:52:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Honnappa Nagarahalli X-Patchwork-Id: 160520 Delivered-To: patch@linaro.org Received: by 2002:a02:5cc1:0:0:0:0:0 with SMTP id w62csp3443220jad; Mon, 18 Mar 2019 21:53:19 -0700 (PDT) X-Google-Smtp-Source: APXvYqz8iN1NOnselynItkPeur9Lvlk8slY0dkLfF/JlzZSNUXPRAEN3y9zEBdbwSjOnY54RL37/ X-Received: by 2002:adf:e7c3:: with SMTP id e3mr9318027wrn.164.1552971198913; Mon, 18 Mar 2019 21:53:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1552971198; cv=none; d=google.com; s=arc-20160816; b=e88ITcIIt21ihDIAEderv8N+jjWgf2GI97GO487Bh69cUn9LVSZH8hqSLm7C6ZDrqU POhOYNp8GaGxz33y0cSkb7TD55fLXWAKrJ7Hgjzcwiae4cn6xvTDCeL4hUcIDiv8/DJp 1j+tE7BSL1UGsVmKosBIJGSHced4bgkDZrbJAUKbno4pHAFMiaTygBhVcneXhcnvL27I pOFI32Kd3AGLUP5ZgTM1Nr5HbcqbfuJMpArT9LDXz3RxswqQxfM+mrm5o27TSDQcd3SO qUtTCRHeS6EWNBec0O9KFbr8pGZQz/t66sQhgLDpNF/iid46cWqwAl8YiqtEG/N2BlV2 TmZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:cc:to:from; bh=1Ce4NPGYkfJcdjtWVihGyt6zutO8YXpmkSTwBHu0JdQ=; b=y0iRR2ktGh/rmZF6ScsurhxPSYazh4M5funSLbZoPY2NDQZgRvBR4u/ai3B2o+cCcq d5M7OEKnE51DbK3Egyunmsn1Q3qNSH2SZX2n4Prq07uLpNtr3P9qfVhumXNrXNy2kLcq KLGk/v3C941Vmow8tqEvpygj8WraSrGQIbeDx4yF4+cUS3FuewllrJffYx6mte8RVDFk JNjqCRUKYrPWleD7OOe1GQTfcsybaNvpp773C/Z7RzeyRVDWXsSWzx0mO2pH6Jz8T421 FE099b1w+5oGEElY795yAsy1zs9cUGFNkYlgdKMRCV484lx0cqpRA2hvsZVr7Y7l+Gay yXpA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Return-Path: Received: from dpdk.org (dpdk.org. [92.243.14.124]) by mx.google.com with ESMTP id m10si8003701wrw.230.2019.03.18.21.53.18; Mon, 18 Mar 2019 21:53:18 -0700 (PDT) Received-SPF: pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) client-ip=92.243.14.124; Authentication-Results: mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 02A654CB5; Tue, 19 Mar 2019 05:53:03 +0100 (CET) Received: from foss.arm.com (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70]) by dpdk.org (Postfix) with ESMTP id 42F464CA0 for ; Tue, 19 Mar 2019 05:53:00 +0100 (CET) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6F1A81596; Mon, 18 Mar 2019 21:52:59 -0700 (PDT) Received: from qc2400f-1.austin.arm.com (qc2400f-1.austin.arm.com [10.118.13.209]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id F2D5A3F575; Mon, 18 Mar 2019 21:52:58 -0700 (PDT) From: Honnappa Nagarahalli To: konstantin.ananyev@intel.com, stephen@networkplumber.org, paulmck@linux.ibm.com, dev@dpdk.org Cc: honnappa.nagarahalli@arm.com, gavin.hu@arm.com, dharmik.thakkar@arm.com, malvika.gupta@arm.com Date: Mon, 18 Mar 2019 23:52:28 -0500 Message-Id: <20190319045228.46879-4-honnappa.nagarahalli@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190319045228.46879-1-honnappa.nagarahalli@arm.com> References: <20181122033055.3431-1-honnappa.nagarahalli@arm.com> <20190319045228.46879-1-honnappa.nagarahalli@arm.com> Subject: [dpdk-dev] [PATCH 3/3] doc/rcu: add lib_rcu documentation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add lib_rcu QSBR API and programmer guide documentation. Signed-off-by: Honnappa Nagarahalli --- doc/api/doxy-api-index.md | 3 +- doc/api/doxy-api.conf.in | 1 + .../prog_guide/img/rcu_general_info.svg | 494 ++++++++++++++++++ doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/rcu_lib.rst | 179 +++++++ 5 files changed, 677 insertions(+), 1 deletion(-) create mode 100644 doc/guides/prog_guide/img/rcu_general_info.svg create mode 100644 doc/guides/prog_guide/rcu_lib.rst -- 2.17.1 Reviewed-by: Marko Kovacevic diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index d95ad566c..5c1f6b477 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -54,7 +54,8 @@ The public API headers are grouped by topics: [memzone] (@ref rte_memzone.h), [mempool] (@ref rte_mempool.h), [malloc] (@ref rte_malloc.h), - [memcpy] (@ref rte_memcpy.h) + [memcpy] (@ref rte_memcpy.h), + [rcu] (@ref rte_rcu_qsbr.h) - **timers**: [cycles] (@ref rte_cycles.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index a365e669b..0b4c248a2 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -51,6 +51,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/librte_port \ @TOPDIR@/lib/librte_power \ @TOPDIR@/lib/librte_rawdev \ + @TOPDIR@/lib/librte_rcu \ @TOPDIR@/lib/librte_reorder \ @TOPDIR@/lib/librte_ring \ @TOPDIR@/lib/librte_sched \ diff --git a/doc/guides/prog_guide/img/rcu_general_info.svg b/doc/guides/prog_guide/img/rcu_general_info.svg new file mode 100644 index 000000000..3ae53bdc2 --- /dev/null +++ b/doc/guides/prog_guide/img/rcu_general_info.svg @@ -0,0 +1,494 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Page-1 + + + + Sheet.3 + + + + Sheet.4 + + + + Sheet.5 + D1 + + + + D1 + + Sheet.6 + + + + Sheet.7 + + + + Sheet.8 + D2 + + + + D2 + + Sheet.9 + + + + Sheet.10 + Reader Thread 1 + + + + Reader Thread 1 + + Sheet.11 + + + + Sheet.12 + + + + Sheet.13 + D1 + + + + D1 + + Sheet.14 + + + + Sheet.15 + + + + Sheet.16 + D2 + + + + D2 + + Sheet.17 + + + + Sheet.18 + T 2 + + + + T 2 + + Sheet.19 + + + + Sheet.20 + + + + Sheet.21 + D1 + + + + D1 + + Sheet.22 + + + + Sheet.23 + + + + Sheet.24 + D2 + + + + D2 + + Sheet.25 + + + + Sheet.26 + T 3 + + + + T 3 + + Sheet.28 + Time + + + + Time + + Sheet.29 + + + + Sheet.30 + + + + Sheet.31 + Remove reference to entry1 + + + + Remove reference to entry1 + + Sheet.33 + + + + Sheet.34 + Delete + + + + Delete + + Sheet.35 + + + + Sheet.36 + + + + Sheet.37 + Delete entry1 from D1 + + + + Delete entry1 from D1 + + Sheet.38 + + + + Sheet.39 + + + + Sheet.40 + + + + Sheet.41 + Free memory for entries1 and 2 after every reader has gone th... + + + + Free memory for entries1 and 2 after every reader has gone through at least 1 quiescent state + + Sheet.46 + Free + + + + Free + + Sheet.48 + + + + Sheet.49 + + + + Sheet.50 + Grace Period + + + + Grace Period + + Sheet.51 + + + + Sheet.52 + + + + Sheet.53 + + + + Sheet.54 + Delete entry2 from D1 + + + + Delete entry2 from D1 + + Sheet.56 + + + + Sheet.57 + + + + Sheet.58 + + + + Sheet.59 + Critical sections + + + + Critical sections + + Sheet.60 + + + + Sheet.61 + + + + Sheet.62 + + + + Sheet.63 + Quiescent states + + + + Quiescent states + + Sheet.64 + + + + Sheet.65 + + + + Sheet.66 + + + + Sheet.67 + + + + Sheet.68 + + + + Sheet.69 + + + + Sheet.70 + while(1) loop + + + + while(1) loop + + Sheet.71 + + + + Sheet.72 + Reader thread is not accessing any shared data structure. i.e... + + + + Reader thread is not accessing any shared data structure.i.e. non critical section or quiescent state. + + Sheet.73 + Dx + + + + Dx + + Sheet.74 + Reader thread is accessing the shared data structure Dx. i.e.... + + + + Reader thread is accessing the shared data structure Dx.i.e. critical section. + + Sheet.75 + Point in time when the reference to the entry is removed usin... + + + + Point in time when the reference to the entry is removed using an atomic operation. + + Sheet.76 + Delete + + + + Delete + + Sheet.77 + Point in time when the writer can free the deleted entry. + + + + Point in time when the writer can free the deleted entry. + + Sheet.78 + Free + + + + Free + + Sheet.79 + Time duration between Delete and Free, during which memory ca... + + + + Time duration between Delete and Free, during which memory cannot be freed. + + Sheet.80 + Grace Period + + + + Grace Period + + Sheet.83 + + + + Sheet.84 + + + + Sheet.85 + + + + diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index 6726b1e8d..6fb3fb921 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -55,6 +55,7 @@ Programmer's Guide metrics_lib bpf_lib ipsec_lib + rcu_lib source_org dev_kit_build_system dev_kit_root_make_help diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst new file mode 100644 index 000000000..5155dd35c --- /dev/null +++ b/doc/guides/prog_guide/rcu_lib.rst @@ -0,0 +1,179 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2019 Arm Limited. + +.. _RCU_Library: + +RCU Library +============ + +Lock-less data structures provide scalability and determinism. +They enable use cases where locking may not be allowed +(for ex: real-time applications). + +In the following paras, the term 'memory' refers to memory allocated +by typical APIs like malloc or anything that is representative of +memory, for ex: an index of a free element array. + +Since these data structures are lock less, the writers and readers +are accessing the data structures concurrently. Hence, while removing +an element from a data structure, the writers cannot return the memory +to the allocator, without knowing that the readers are not +referencing that element/memory anymore. Hence, it is required to +separate the operation of removing an element into 2 steps: + +Delete: in this step, the writer removes the reference to the element from +the data structure but does not return the associated memory to the +allocator. This will ensure that new readers will not get a reference to +the removed element. Removing the reference is an atomic operation. + +Free(Reclaim): in this step, the writer returns the memory to the +memory allocator, only after knowing that all the readers have stopped +referencing the deleted element. + +This library helps the writer determine when it is safe to free the +memory. + +This library makes use of thread Quiescent State (QS). + +What is Quiescent State +----------------------- +Quiescent State can be defined as 'any point in the thread execution where the +thread does not hold a reference to shared memory'. It is upto the application +to determine its quiescent state. + +Let us consider the following diagram: + +.. figure:: img/rcu_general_info.* + + +As shown, reader thread 1 acesses data structures D1 and D2. When it is +accessing D1, if the writer has to remove an element from D1, the +writer cannot free the memory associated with that element immediately. +The writer can return the memory to the allocator only after the reader +stops referencng D1. In other words, reader thread RT1 has to enter +a quiescent state. + +Similarly, since reader thread 2 is also accessing D1, writer has to +wait till thread 2 enters quiescent state as well. + +However, the writer does not need to wait for reader thread 3 to enter +quiescent state. Reader thread 3 was not accessing D1 when the delete +operation happened. So, reader thread 1 will not have a reference to the +deleted entry. + +It can be noted that, the critical sections for D2 is a quiescent state +for D1. i.e. for a given data structure Dx, any point in the thread execution +that does not reference Dx is a quiescent state. + +Since memory is not freed immediately, there might be a need for +provisioning of additional memory, depending on the application requirements. + +Factores affecting RCU mechanism +--------------------------------- + +It is important to make sure that this library keeps the over head of +identifying the end of grace period and subsequent freeing of memory, +to a minimum. The following paras explain how grace period and critical +section affect this overhead. + +The writer has to poll the readers to identify the end of grace period. +Polling introduces memory accesses and wastes CPU cycles. The memory +is not available for reuse during grace period. Longer grace periods +exasperate these conditions. + +The length of the critical section and the number of reader threads +is proportional to the duration of the grace period. Keeping the critical +sections smaller will keep the grace period smaller. However, keeping the +critical sections smaller requires additional CPU cycles(due to additional +reporting) in the readers. + +Hence, we need the characteristics of small grace period and large critical +section. This library addresses this by allowing the writer to do +other work without having to block till the readers report their quiescent +state. + +RCU in DPDK +----------- + +For DPDK applications, the start and end of while(1) loop (where no +references to shared data structures are kept) act as perfect quiescent +states. This will combine all the shared data structure accesses into a +single, large critical section which helps keep the over head on the +reader side to a minimum. + +DPDK supports pipeline model of packet processing and service cores. +In these use cases, a given data structure may not be used by all the +workers in the application. The writer does not have to wait for all +the workers to report their quiescent state. To provide the required +flexibility, this library has a concept of QS variable. The application +can create one QS variable per data structure to help it track the +end of grace period for each data structure. This helps keep the grace +period to a minimum. + +How to use this library +----------------------- + +The application has to allocate memory and initialize a QS variable. + +Application can call **rte_rcu_qsbr_get_memsize** to calculate the size +of memory to allocate. This API takes maximum number of reader threads, +using this variable, as a parameter. Currently, a maximum of 1024 threads +are supported. + +Further, the application can initialize a QS variable using the API +**rte_rcu_qsbr_init**. + +Each reader thread is assumed to have a unique thread ID. Currently, the +management of the thread ID (for ex: allocation/free) is left to the +application. The thread ID should be in the range of 0 to +maximum number of threads provided while creating the QS variable. +The application could also use lcore_id as the thread ID where applicable. + +**rte_rcu_qsbr_thread_register** API will register a reader thread +to report its quiescent state. This can be called from a reader thread. +A control plane thread can also call this on behalf of a reader thread. +The reader thread must call **rte_rcu_qsbr_thread_online** API to start +reporting its quiescent state. + +Some of the use cases might require the reader threads to make +blocking API calls (for ex: while using eventdev APIs). The writer thread +should not wait for such reader threads to enter quiescent state. +The reader thread must call **rte_rcu_qsbr_thread_offline** API, before calling +blocking APIs. It can call **rte_rcu_qsbr_thread_online** API once the blocking +API call returns. + +The writer thread can trigger the reader threads to report their quiescent +state by calling the API **rte_rcu_qsbr_start**. It is possible for multiple +writer threads to query the quiescent state status simultaneously. Hence, +**rte_rcu_qsbr_start** returns a token to each caller. + +The writer thread has to call **rte_rcu_qsbr_check** API with the token to +get the current quiescent state status. Option to block till all the reader +threads enter the quiescent state is provided. If this API indicates that +all the reader threads have entered the quiescent state, the application +can free the deleted entry. + +The APIs **rte_rcu_qsbr_start** and **rte_rcu_qsbr_check** are lock free. +Hence, they can be called concurrently from multiple writers even while +running as worker threads. + +The separation of triggering the reporting from querying the status provides +the writer threads flexibility to do useful work instead of blocking for the +reader threads to enter the quiescent state or go offline. This reduces the +memory accesses due to continuous polling for the status. + +**rte_rcu_qsbr_synchronize** API combines the functionality of +**rte_rcu_qsbr_start** and blocking **rte_rcu_qsbr_check** into a single API. +This API triggers the reader threads to report their quiescent state and +polls till all the readers enter the quiescent state or go offline. This +API does not allow the writer to do useful work while waiting and also +introduces additional memory accesses due to continuous polling. + +The reader thread must call **rte_rcu_qsbr_thread_offline** and +**rte_rcu_qsbr_thread_unregister** APIs to remove itself from reporting its +quiescent state. The **rte_rcu_qsbr_check** API will not wait for this reader +thread to report the quiescent state status anymore. + +The reader threads should call **rte_rcu_qsbr_update** API to indicate that +they entered a quiescent state. This API checks if a writer has triggered a +quiescent state query and update the state accordingly.