qat: fix deadlock in backlog processing

Message ID af9581e2-58f9-cc19-428f-6f18f1f83d54@redhat.com
State New

Commit Message

Mikulas Patocka Sept. 21, 2023, 8:53 p.m. UTC
Hi

I was evaluating whether it is feasible to use QAT with dm-crypt (the
answer is that it is not - QAT is slower than AES-NI for this type of
workload; QAT starts to be beneficial for encryption requests longer than
64k). And I got some deadlocks.

The reason for the deadlocks is this: suppose that one of the "if"
conditions in "qat_alg_send_message_maybacklog" is true and we jump to the
"enqueue" label. At this point, an interrupt comes in and clears all
pending messages. Now, the interrupt returns, we grab backlog->lock, add
the message to the backlog, drop backlog->lock - and there is no one to
remove the backlogged message out of the list and submit it.

I fixed it with this patch - with this patch, the test passes and there
are no longer any deadlocks. I didn't want to add a spinlock to the hot
path, so I take it only if one of the conditions suggests that queuing may
be required.
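
The race described above can be illustrated with a small user-space model.
This is only a sketch, not driver code: struct model, completion_irq() and
the two send functions are hypothetical names, the ring and backlog are
reduced to counters, and the "interrupt" is a plain function call placed in
the window between the decision to enqueue and the enqueue itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: none of these names come from the QAT driver. */
struct model {
	int in_flight;   /* requests currently on the modelled HW ring */
	int ring_limit;  /* ring counts as "nearly full" at this many requests */
	int backlogged;  /* requests parked on the modelled backlog list */
};

static bool ring_nearly_full(const struct model *m)
{
	assert(m->ring_limit > 0);
	return m->in_flight >= m->ring_limit;
}

/* Modelled completion interrupt: retire all requests, then drain backlog. */
static void completion_irq(struct model *m)
{
	m->in_flight = 0;
	while (m->backlogged > 0 && !ring_nearly_full(m)) {
		m->backlogged--;
		m->in_flight++;
	}
}

/* Old flow: the decision to enqueue is made before the lock is taken. */
static int send_unlocked_check(struct model *m, bool irq_in_window)
{
	if (m->backlogged == 0 && !ring_nearly_full(m)) {
		m->in_flight++;
		return 0;               /* stands in for -EINPROGRESS */
	}
	if (irq_in_window)
		completion_irq(m);      /* fires after the decision was made */
	m->backlogged++;                /* enqueue based on a stale decision */
	return 1;                       /* stands in for -EBUSY */
}

/* Patched flow: the conditions are evaluated again once the lock is held. */
static int send_recheck(struct model *m, bool irq_in_window)
{
	if (m->backlogged == 0 && !ring_nearly_full(m)) {
		m->in_flight++;
		return 0;
	}
	if (irq_in_window)
		completion_irq(m);
	/* Lock would be taken here; the state is re-read under it. */
	if (m->backlogged == 0 && !ring_nearly_full(m)) {
		m->in_flight++;
		return 0;
	}
	m->backlogged++;
	return 1;
}

/* A request is stranded when it is backlogged while nothing is in flight,
 * because in this model only a completion ever drains the backlog. */
static bool stranded(const struct model *m)
{
	return m->backlogged > 0 && m->in_flight == 0;
}
```

Starting from a full ring and an empty backlog, send_unlocked_check() with
the interrupt in the window leaves one request stranded, while
send_recheck() submits it directly. The model only covers the ordering where
the interrupt finishes before the lock is taken; in the other ordering the
backlog drain has to wait for the lock and then finds the request on the
list.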

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/crypto/intel/qat/qat_common/qat_algs_send.c |   31 ++++++++++++--------
 1 file changed, 20 insertions(+), 11 deletions(-)

Comments

Giovanni Cabiddu Sept. 22, 2023, 3:13 p.m. UTC | #1
Hi Mikulas,

many thanks for reporting this issue and finding a solution.

On Thu, Sep 21, 2023 at 10:53:55PM +0200, Mikulas Patocka wrote:
> I was evaluating whether it is feasible to use QAT with dm-crypt (the
> answer is that it is not - QAT is slower than AES-NI for this type of
> workload; QAT starts to be beneficial for encryption requests longer than
> 64k).
Correct. Is there anything that we can do to batch requests in a single
call?

Some time ago there was some work done to build a geniv template cipher
and optimize dm-crypt to encrypt larger block sizes in a single call,
see [1][2]. I don't know if that work was completed.

> And I got some deadlocks.
Ouch!

> The reason for the deadlocks is this: suppose that one of the "if"
> conditions in "qat_alg_send_message_maybacklog" is true and we jump to the
> "enqueue" label. At this point, an interrupt comes in and clears all
> pending messages. Now, the interrupt returns, we grab backlog->lock, add
> the message to the backlog, drop backlog->lock - and there is no one to
> remove the backlogged message out of the list and submit it.
Makes sense. In my testing I wasn't able to reproduce this condition.

> I fixed it with this patch - with this patch, the test passes and there
> are no longer any deadlocks. I didn't want to add a spinlock to the hot
> path, so I take it only if one of the conditions suggests that queuing may
> be required.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
The commit message needs a bit of rework to describe the change.
It also deserves a Fixes tag.

> 
> ---
>  drivers/crypto/intel/qat/qat_common/qat_algs_send.c |   31 ++++++++++++--------
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> Index: linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
> ===================================================================
> --- linux-2.6.orig/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
> +++ linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
> @@ -40,16 +40,6 @@ void qat_alg_send_backlog(struct qat_ins
>  	spin_unlock_bh(&backlog->lock);
>  }
>  
> -static void qat_alg_backlog_req(struct qat_alg_req *req,
> -				struct qat_instance_backlog *backlog)
> -{
> -	INIT_LIST_HEAD(&req->list);
Is the initialization of an element no longer needed?

> -
> -	spin_lock_bh(&backlog->lock);
> -	list_add_tail(&req->list, &backlog->list);
> -	spin_unlock_bh(&backlog->lock);
> -}
> -
>  static int qat_alg_send_message_maybacklog(struct qat_alg_req *req)
>  {
>  	struct qat_instance_backlog *backlog = req->backlog;
> @@ -71,8 +61,27 @@ static int qat_alg_send_message_maybackl
>  	return -EINPROGRESS;
>  
>  enqueue:
> -	qat_alg_backlog_req(req, backlog);
> +	spin_lock_bh(&backlog->lock);
> +
> +	/* If any request is already backlogged, then add to backlog list */
> +	if (!list_empty(&backlog->list))
> +		goto enqueue2;
>  
> +	/* If ring is nearly full, then add to backlog list */
> +	if (adf_ring_nearly_full(tx_ring))
> +		goto enqueue2;
> +
> +	/* If adding request to HW ring fails, then add to backlog list */
> +	if (adf_send_message(tx_ring, fw_req))
> +		goto enqueue2;
In a nutshell, you are redoing the same steps while holding the backlog
lock.

It should be possible to rewrite it so that there is one function that
attempts the enqueue and, if it fails, the same function is called again
with the lock held.
If you want, I can rework it and resubmit.

> +
> +	spin_unlock_bh(&backlog->lock);
> +	return -EINPROGRESS;
> +
> +enqueue2:
> +	list_add_tail(&req->list, &backlog->list);
> +
> +	spin_unlock_bh(&backlog->lock);
>  	return -EBUSY;
>  }
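
The rework suggested above (one function that attempts the enqueue, called a
second time under the lock if the first attempt fails) might look roughly
like the following user-space sketch. It is only an illustration under
stated assumptions: struct sketch_queue and the sketch_* helpers are
hypothetical, the lock is a flag rather than a real spinlock, and the
list_empty(), adf_ring_nearly_full() and adf_send_message() checks are
folded into two counter comparisons.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring plus backlog; not a driver structure. */
struct sketch_queue {
	int in_flight;          /* requests on the modelled HW ring */
	int ring_limit;         /* ring counts as nearly full at this value */
	int backlogged;         /* requests on the modelled backlog list */
	bool locked;            /* models backlog->lock being held */
	int lock_acquisitions;  /* how often the slow path took the lock */
};

static void sketch_lock(struct sketch_queue *q)
{
	assert(!q->locked);
	q->locked = true;
	q->lock_acquisitions++;
}

static void sketch_unlock(struct sketch_queue *q)
{
	assert(q->locked);
	q->locked = false;
}

/* Single attempt at submission, usable with or without the lock held. */
static bool sketch_try_send(struct sketch_queue *q)
{
	if (q->backlogged > 0)                 /* keep ordering behind backlog */
		return false;
	if (q->in_flight >= q->ring_limit)     /* ring nearly full */
		return false;
	q->in_flight++;                        /* modelled successful send */
	return true;
}

static int sketch_send_maybacklog(struct sketch_queue *q)
{
	/* Fast path: no lock taken when the request can go straight out. */
	if (sketch_try_send(q))
		return -EINPROGRESS;

	/* Slow path: repeat the same attempt with the lock held. */
	sketch_lock(q);
	if (sketch_try_send(q)) {
		sketch_unlock(q);
		return -EINPROGRESS;
	}
	q->backlogged++;                       /* still busy: add to backlog */
	sketch_unlock(q);
	return -EBUSY;
}
```

With a ring limit of 1, the first call returns -EINPROGRESS without touching
the lock, and the second call takes the lock once, backlogs the request and
returns -EBUSY.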

[1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1276510.html
[2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1428293.html

Regards,

Patch

Index: linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
===================================================================
--- linux-2.6.orig/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
+++ linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c
@@ -40,16 +40,6 @@  void qat_alg_send_backlog(struct qat_ins
 	spin_unlock_bh(&backlog->lock);
 }
 
-static void qat_alg_backlog_req(struct qat_alg_req *req,
-				struct qat_instance_backlog *backlog)
-{
-	INIT_LIST_HEAD(&req->list);
-
-	spin_lock_bh(&backlog->lock);
-	list_add_tail(&req->list, &backlog->list);
-	spin_unlock_bh(&backlog->lock);
-}
-
 static int qat_alg_send_message_maybacklog(struct qat_alg_req *req)
 {
 	struct qat_instance_backlog *backlog = req->backlog;
@@ -71,8 +61,27 @@  static int qat_alg_send_message_maybackl
 	return -EINPROGRESS;
 
 enqueue:
-	qat_alg_backlog_req(req, backlog);
+	spin_lock_bh(&backlog->lock);
+
+	/* If any request is already backlogged, then add to backlog list */
+	if (!list_empty(&backlog->list))
+		goto enqueue2;
 
+	/* If ring is nearly full, then add to backlog list */
+	if (adf_ring_nearly_full(tx_ring))
+		goto enqueue2;
+
+	/* If adding request to HW ring fails, then add to backlog list */
+	if (adf_send_message(tx_ring, fw_req))
+		goto enqueue2;
+
+	spin_unlock_bh(&backlog->lock);
+	return -EINPROGRESS;
+
+enqueue2:
+	list_add_tail(&req->list, &backlog->list);
+
+	spin_unlock_bh(&backlog->lock);
 	return -EBUSY;
 }