diff mbox series

crypto/hisilicon: Add null judgment to the callback interface

Message ID 20220930024320.29922-1-liulongfang@huawei.com
State New
Headers show
Series crypto/hisilicon: Add null judgment to the callback interface | expand

Commit Message

liulongfang Sept. 30, 2022, 2:43 a.m. UTC
The algorithm acceleration function interface provided by the
current crypto subsystem is in asynchronous mode, but some users
will call it in synchronous mode, thus not providing a callback
interface for the "base.complete" of the request.

In this usage scenario, there is no problem in using software to
complete encryption and decryption operations. But if the hardware
driver is loaded, a kernel calltrace error will be generated
because the "base.complete" callback interface is empty.

Kernel log:
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : 0x0
 lr : sec_skcipher_callback+0x108/0x140 [hisi_sec2]
 sp : ffff80002cda3cb0
 x29: ffff80002cda3cb0 x28: ffff0020a9a56080 x27: ffff2040076a4800
 x26: 0000000000000000 x25: dead000000000100 x24: dead000000000122
 x23: ffff0020881eb4d0 x22: ffff0020960c6f00 x21: ffff0020881eb4b0
 x20: ffff0020881eb478 x19: ffff2040076a4880 x18: 000000000000001c
 x17: 00000000a3f246e1 x16: ffffb38bc5e73d40 x15: 000000006c63877d
 x14: 0000000000000000 x13: 0000000000000000 x12: ffff2040063b1dd0
 x11: ffff2040063b1da8 x10: ffff2040063b1da8 x9 : ffffb38b8da71f78
 x8 : ffff2040063b1dd0 x7 : 0000000000000000 x6 : ffff2040063b1fd0
 x5 : 0000000000000228 x4 : 0000000000000000 x3 : ffff00209ba33000
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2040076a4820
 Call trace:
  0x0
  sec_req_cb+0xc8/0x254 [hisi_sec2]
  qm_work_process+0x1d8/0x330 [hisi_qm]
  process_one_work+0x1e0/0x450
  worker_thread+0x158/0x450
  kthread+0x118/0x120
  ret_from_fork+0x10/0x20
  Code: bad PC value
  ---[ end trace 0000000000000000 ]---

In order to prevent the occurrence of kernel errors in this scenario,
it is necessary to add a judgment on whether the "base.complete"
interface of the hardware device driver is empty.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
---
 drivers/crypto/hisilicon/hpre/hpre_crypto.c | 12 ++++++++----
 drivers/crypto/hisilicon/sec2/sec_crypto.c  | 12 ++++++++----
 2 files changed, 16 insertions(+), 8 deletions(-)

Comments

Herbert Xu Sept. 30, 2022, 2:49 a.m. UTC | #1
On Fri, Sep 30, 2022 at 10:43:20AM +0800, Longfang Liu wrote:
> The algorithm acceleration function interface provided by the
> current crypto subsystem is in asynchronous mode, but some users
> will call it in synchronous mode, thus not providing a callback
> interface for the "base.complete" of the request.

Please give more details.  Who is calling the callback functions
in synchronous mode?

Thanks,
liulongfang Oct. 28, 2022, 2:26 a.m. UTC | #2
On 2022/9/30 10:49, Herbert Xu wrote:
> On Fri, Sep 30, 2022 at 10:43:20AM +0800, Longfang Liu wrote:
>> The algorithm acceleration function interface provided by the
>> current crypto subsystem is in asynchronous mode, but some users
>> will call it in synchronous mode, thus not providing a callback
>> interface for the "base.complete" of the request.
> 
> Please give more details.  Who is calling the callback functions
> in synchronous mode?
> 
> Thanks,
> 

Hi, Herbert.
Do you consider merging this patch?
Thansk,
Longfang.
Herbert Xu Oct. 28, 2022, 3:57 a.m. UTC | #3
On Fri, Sep 30, 2022 at 11:48:02AM +0800, liulongfang wrote:
>
> Even if the task is sent in synchronous mode, when using the hardware
> driver, the hardware still informs the driver software through an
> interrupt after completing the task, and the workqueue in the driver
> software will call this callback function.
> 
> And I found that the device drivers of other manufacturers under the
> crypto subsystem are also in this asynchronous mode, and this problem
> is also encountered when using the synchronous mode.

This still makes no sense to me.

Who is making an async request with no callback?

Cheers,
liulongfang Oct. 29, 2022, 1:25 a.m. UTC | #4
On 2022/10/28 11:57, Herbert Xu wrote:
> On Fri, Sep 30, 2022 at 11:48:02AM +0800, liulongfang wrote:
>>
>> Even if the task is sent in synchronous mode, when using the hardware
>> driver, the hardware still informs the driver software through an
>> interrupt after completing the task, and the workqueue in the driver
>> software will call this callback function.
>>
>> And I found that the device drivers of other manufacturers under the
>> crypto subsystem are also in this asynchronous mode, and this problem
>> is also encountered when using the synchronous mode.
> 
> This still makes no sense to me.
> 
> Who is making an async request with no callback?
> 

The context of the problem may not have been clearly stated in the previous
description.

This is a problem found in one of our real user scenarios:
In this scenario of using an accelerator to perform encryption  services,
it was originally intended to use the CPU to perform encryption services
in synchronous mode (without loading the hardware device driver, and without
registering the asynchronous callback function), but due to a deployment
error, the Hisi hardware device driver was loaded into the system,
this wrong operation causes the encryption service to call the device
driver of the hisi hardware, but the hardware device driver executes the
asynchronous mode, so the callback interface is called after the service
is completed.
This leads to this system calltrace problem.

The purpose of this patch is to ensure that the device does not appear
calltrace in this abnormal situation.

> Cheers,
> 

Thanks,
Longfang.
Herbert Xu Nov. 4, 2022, 9:08 a.m. UTC | #5
On Sat, Oct 29, 2022 at 09:25:18AM +0800, liulongfang wrote:
>
> The context of the problem may not have been clearly stated in the previous
> description.
> 
> This is a problem found in one of our real user scenarios:
> In this scenario of using an accelerator to perform encryption  services,
> it was originally intended to use the CPU to perform encryption services
> in synchronous mode (without loading the hardware device driver, and without
> registering the asynchronous callback function), but due to a deployment
> error, the Hisi hardware device driver was loaded into the system,
> this wrong operation causes the encryption service to call the device
> driver of the hisi hardware, but the hardware device driver executes the
> asynchronous mode, so the callback interface is called after the service
> is completed.
> This leads to this system calltrace problem.
> 
> The purpose of this patch is to ensure that the device does not appear
> calltrace in this abnormal situation.

I'm still having trouble understanding this.  Please give an
exact call-trace that triggers the callback with a NULL callback
pointer.

Thanks,
liulongfang Nov. 7, 2022, 1:22 p.m. UTC | #6
On 2022/11/4 17:08, Herbert Xu wrote:
> On Sat, Oct 29, 2022 at 09:25:18AM +0800, liulongfang wrote:
>>
>> The context of the problem may not have been clearly stated in the previous
>> description.
>>
>> This is a problem found in one of our real user scenarios:
>> In this scenario of using an accelerator to perform encryption  services,
>> it was originally intended to use the CPU to perform encryption services
>> in synchronous mode (without loading the hardware device driver, and without
>> registering the asynchronous callback function), but due to a deployment
>> error, the Hisi hardware device driver was loaded into the system,
>> this wrong operation causes the encryption service to call the device
>> driver of the hisi hardware, but the hardware device driver executes the
>> asynchronous mode, so the callback interface is called after the service
>> is completed.
>> This leads to this system calltrace problem.
>>
>> The purpose of this patch is to ensure that the device does not appear
>> calltrace in this abnormal situation.
> 
> I'm still having trouble understanding this.  Please give an
> exact call-trace that triggers the callback with a NULL callback
> pointer.
> 

What do you need is the log of this call trace?

Kernel log:
 Workqueue: 0000:76:00.0 qm_work_process [hisi_qm]
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : 0x0
 lr : sec_skcipher_callback+0x108/0x140 [hisi_sec2]
 sp : ffff80002cda3cb0
 x29: ffff80002cda3cb0 x28: ffff0020a9a56080 x27: ffff2040076a4800
 x26: 0000000000000000 x25: dead000000000100 x24: dead000000000122
 x23: ffff0020881eb4d0 x22: ffff0020960c6f00 x21: ffff0020881eb4b0
 x20: ffff0020881eb478 x19: ffff2040076a4880 x18: 000000000000001c
 x17: 00000000a3f246e1 x16: ffffb38bc5e73d40 x15: 000000006c63877d
 x14: 0000000000000000 x13: 0000000000000000 x12: ffff2040063b1dd0
 x11: ffff2040063b1da8 x10: ffff2040063b1da8 x9 : ffffb38b8da71f78
 x8 : ffff2040063b1dd0 x7 : 0000000000000000 x6 : ffff2040063b1fd0
 x5 : 0000000000000228 x4 : 0000000000000000 x3 : ffff00209ba33000
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2040076a4820
 Call trace:
  0x0
  sec_req_cb+0xc8/0x254 [hisi_sec2]
  qm_work_process+0x1d8/0x330 [hisi_qm]
  process_one_work+0x1e0/0x450
  worker_thread+0x158/0x450
  kthread+0x118/0x120
  ret_from_fork+0x10/0x20
  Code: bad PC value
  ---[ end trace 0000000000000000 ]---

> Thanks,
> 
Thanks,
Longfang.
Herbert Xu Nov. 8, 2022, 9:59 a.m. UTC | #7
On Mon, Nov 07, 2022 at 09:22:22PM +0800, liulongfang wrote:
.
> What do you need is the log of this call trace?

I mean the functions in the call trace starting from the one that
sets the callback to NULL.

Cheers
liulongfang Nov. 9, 2022, 6:21 a.m. UTC | #8
On 2022/11/8 17:59, Herbert Xu wrote:
> On Mon, Nov 07, 2022 at 09:22:22PM +0800, liulongfang wrote:
> .
>> What do you need is the log of this call trace?
> 
> I mean the functions in the call trace starting from the one that
> sets the callback to NULL.
> 

The trigger method is to not call the function skcipher_request_set_callback()
when using the skcipher interface for encryption and decryption services.

> Cheers
> 
Thanks,
Longfang.
Herbert Xu Nov. 9, 2022, 9:18 a.m. UTC | #9
On Wed, Nov 09, 2022 at 02:21:11PM +0800, liulongfang wrote:
>
> The trigger method is to not call the function skcipher_request_set_callback()
> when using the skcipher interface for encryption and decryption services.

Yes but which function exactly? Please give the exact call path
leading to this crash.

Cheers,
liulongfang Nov. 10, 2022, 2:03 a.m. UTC | #10
On 2022/11/9 17:18, Herbert Xu wrote:
> On Wed, Nov 09, 2022 at 02:21:11PM +0800, liulongfang wrote:
>>
>> The trigger method is to not call the function skcipher_request_set_callback()
>> when using the skcipher interface for encryption and decryption services.
> 
> Yes but which function exactly? Please give the exact call path
> leading to this crash.
> 

This problem occurs in the application code of the encryption usage scenario
(unfortunately, these codes are not open to the public and cannot be given to you),
but its code logic is similar to test_skcipher_vec_cfg() in kernel\crypto\testmgr.c,
and it is also in accordance with this call logic execution:
crypto_alloc_skcipher()--->skcipher_request_alloc()--->crypto_skcipher_setkey()--->
sg_init_table()--->skcipher_request_set_crypt()--->
crypto_skcipher_encrypt()/crypto_skcipher_decrypt()--->
skcipher_request_free()--->crypto_free_skcipher()

Just don't add "wait" and skcipher_request_set_callback(),
use it as a synchronous mode.

Thanks
Longfang.
> Cheers,
>
Herbert Xu Nov. 10, 2022, 3:20 a.m. UTC | #11
On Thu, Nov 10, 2022 at 10:03:53AM +0800, liulongfang wrote:
.
> This problem occurs in the application code of the encryption usage scenario
> (unfortunately, these codes are not open to the public and cannot be given to you),

Are you saying this requires out-of-tree kernel code to trigger?

Then you should fix that out-of-tree code.

Thanks,
liulongfang Nov. 10, 2022, 4:11 a.m. UTC | #12
On 2022/11/10 11:20, Herbert Xu wrote:
> On Thu, Nov 10, 2022 at 10:03:53AM +0800, liulongfang wrote:
> .
>> This problem occurs in the application code of the encryption usage scenario
>> (unfortunately, these codes are not open to the public and cannot be given to you),
> 
> Are you saying this requires out-of-tree kernel code to trigger?
> 

Yes, this problem is triggered by application layer code,
but it happens on kernel driver code.

> Then you should fix that out-of-tree code.
>
When using crypto's skcipher series interfaces for encryption and decryption
services, User can use synchronous mode(by adjusting some skcipher interfaces,
here is to remove skcipher_request_set_callback()) or asynchronous mode,
but when using synchronous mode and the current asynchronous mode is loaded
it will cause a calltrace.

The current problem is that the interface of skcipher does not restrict users
to call functions in this way for encryption services.

If the current driver doesn't handle this, there is a possibility that some users
deliberately create this kind of problem to cause the kernel to crash.

> Thanks,
>
Herbert Xu Nov. 10, 2022, 8:53 a.m. UTC | #13
On Thu, Nov 10, 2022 at 12:11:15PM +0800, liulongfang wrote:
>
> When using crypto's skcipher series interfaces for encryption and decryption
> services, User can use synchronous mode(by adjusting some skcipher interfaces,
> here is to remove skcipher_request_set_callback()) or asynchronous mode,
> but when using synchronous mode and the current asynchronous mode is loaded
> it will cause a calltrace.
> 
> The current problem is that the interface of skcipher does not restrict users
> to call functions in this way for encryption services.
> 
> If the current driver doesn't handle this, there is a possibility that some users
> deliberately create this kind of problem to cause the kernel to crash.

It sounds like your code is misusing the skcipher API.  By default
skcipher is always async.  You must always set a callback.

The only way to legally use skcipher without setting a callback
is by allocating it with crypto_alloc_sync_skcipher.  In which case
unless your driver incorrectly declares itself as sync instead of
async, then it will never be used by such a user.

Cheers,
liulongfang Nov. 12, 2022, 1:51 a.m. UTC | #14
On 2022/11/10 16:53, Herbert Xu wrote:
> On Thu, Nov 10, 2022 at 12:11:15PM +0800, liulongfang wrote:
>>
>> When using crypto's skcipher series interfaces for encryption and decryption
>> services, User can use synchronous mode(by adjusting some skcipher interfaces,
>> here is to remove skcipher_request_set_callback()) or asynchronous mode,
>> but when using synchronous mode and the current asynchronous mode is loaded
>> it will cause a calltrace.
>>
>> The current problem is that the interface of skcipher does not restrict users
>> to call functions in this way for encryption services.
>>
>> If the current driver doesn't handle this, there is a possibility that some users
>> deliberately create this kind of problem to cause the kernel to crash.
> 
> It sounds like your code is misusing the skcipher API.  By default
> skcipher is always async.  You must always set a callback.
> 
> The only way to legally use skcipher without setting a callback
> is by allocating it with crypto_alloc_sync_skcipher.  In which case

OK! I found in Documentation/crypto/architecture.rst the description
that async mode must provide a callback function.

However, what is confusing is that this document does not describe
the synchronization mode so clearly.

> unless your driver incorrectly declares itself as sync instead of
> async, then it will never be used by such a user.
> 
> Cheers,
> 
Thanks,
Longfang.
diff mbox series

Patch

diff --git a/drivers/crypto/hisilicon/hpre/hpre_crypto.c b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
index 3ba6f15deafc..5b09f4a72020 100644
--- a/drivers/crypto/hisilicon/hpre/hpre_crypto.c
+++ b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
@@ -430,7 +430,8 @@  static void hpre_dh_cb(struct hpre_ctx *ctx, void *resp)
 		atomic64_inc(&dfx[HPRE_OVER_THRHLD_CNT].value);
 
 	hpre_hw_data_clr_all(ctx, req, areq->dst, areq->src);
-	kpp_request_complete(areq, ret);
+	if (areq->base.complete)
+		kpp_request_complete(areq, ret);
 	atomic64_inc(&dfx[HPRE_RECV_CNT].value);
 }
 
@@ -451,7 +452,8 @@  static void hpre_rsa_cb(struct hpre_ctx *ctx, void *resp)
 	areq = req->areq.rsa;
 	areq->dst_len = ctx->key_sz;
 	hpre_hw_data_clr_all(ctx, req, areq->dst, areq->src);
-	akcipher_request_complete(areq, ret);
+	if (areq->base.complete)
+		akcipher_request_complete(areq, ret);
 	atomic64_inc(&dfx[HPRE_RECV_CNT].value);
 }
 
@@ -1460,7 +1462,8 @@  static void hpre_ecdh_cb(struct hpre_ctx *ctx, void *resp)
 	memmove(p + curve_sz, p + areq->dst_len - curve_sz, curve_sz);
 
 	hpre_ecdh_hw_data_clr_all(ctx, req, areq->dst, areq->src);
-	kpp_request_complete(areq, ret);
+	if (areq->base.complete)
+		kpp_request_complete(areq, ret);
 
 	atomic64_inc(&dfx[HPRE_RECV_CNT].value);
 }
@@ -1766,7 +1769,8 @@  static void hpre_curve25519_cb(struct hpre_ctx *ctx, void *resp)
 	hpre_key_to_big_end(sg_virt(areq->dst), CURVE25519_KEY_SIZE);
 
 	hpre_curve25519_hw_data_clr_all(ctx, req, areq->dst, areq->src);
-	kpp_request_complete(areq, ret);
+	if (areq->base.complete)
+		kpp_request_complete(areq, ret);
 
 	atomic64_inc(&dfx[HPRE_RECV_CNT].value);
 }
diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.c b/drivers/crypto/hisilicon/sec2/sec_crypto.c
index 77c9f13cf69a..b9d74d3ac12c 100644
--- a/drivers/crypto/hisilicon/sec2/sec_crypto.c
+++ b/drivers/crypto/hisilicon/sec2/sec_crypto.c
@@ -1416,12 +1416,14 @@  static void sec_skcipher_callback(struct sec_ctx *ctx, struct sec_req *req,
 			break;
 
 		backlog_sk_req = backlog_req->c_req.sk_req;
-		backlog_sk_req->base.complete(&backlog_sk_req->base,
+		if (backlog_sk_req->base.complete)
+			backlog_sk_req->base.complete(&backlog_sk_req->base,
 						-EINPROGRESS);
 		atomic64_inc(&ctx->sec->debug.dfx.recv_busy_cnt);
 	}
 
-	sk_req->base.complete(&sk_req->base, err);
+	if (sk_req->base.complete)
+		sk_req->base.complete(&sk_req->base, err);
 }
 
 static void set_aead_auth_iv(struct sec_ctx *ctx, struct sec_req *req)
@@ -1694,12 +1696,14 @@  static void sec_aead_callback(struct sec_ctx *c, struct sec_req *req, int err)
 			break;
 
 		backlog_aead_req = backlog_req->aead_req.aead_req;
-		backlog_aead_req->base.complete(&backlog_aead_req->base,
+		if (backlog_aead_req->base.complete)
+			backlog_aead_req->base.complete(&backlog_aead_req->base,
 						-EINPROGRESS);
 		atomic64_inc(&c->sec->debug.dfx.recv_busy_cnt);
 	}
 
-	a_req->base.complete(&a_req->base, err);
+	if (a_req->base.complete)
+		a_req->base.complete(&a_req->base, err);
 }
 
 static void sec_request_uninit(struct sec_ctx *ctx, struct sec_req *req)