[0/6] crypto/realtek: add new driver

Message ID 20221013184026.63826-1-markus.stockhausen@gmx.de

Message

Markus Stockhausen Oct. 13, 2022, 6:40 p.m. UTC
This driver adds support for the Realtek crypto engine. It provides hardware
accelerated AES, SHA1 & MD5 algorithms. It is included in SoCs of the RTL838x
series, such as RTL8380, RTL8381, RTL8382, as well as SoCs from the RTL930x
series, such as RTL9301, RTL9302 and RTL9303. Some little endian and ARM based
Realtek SoCs seem to have this engine too. Nevertheless this patch was only
developed and verified on MIPS big endian devices.

Module has been successfully tested with
- lots of module loads/unloads with crypto manager extra tests enabled.
- openssl devcrypto benchmarking
- tcrypt.ko benchmarking

Benchmarks from tcrypt.ko mode=600, 402, 403 sec=5 on an 800 MHz RTL9301 SoC can
be summed up as follows:
- with the smallest block sizes the engine is 8-10 times slower than software
- break-even (hardware speed = software speed) starts at 256 byte blocks
- with large blocks the engine is roughly 2 times faster than software
- md5 performance is always worse than software

op/s with default software algorithms:
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
ecb(aes) 128 bit encrypt    513593  165651   44233   11264    7846    1411
ecb(aes) 128 bit decrypt    514819  165792   44259   11268    7851    1411
ecb(aes) 192 bit encrypt    455136  142680   37761    9579    6673    1198
ecb(aes) 192 bit decrypt    456524  142836   37790    9584    6675    1200
ecb(aes) 256 bit encrypt    412102  125771   33038    8361    5825    1048
ecb(aes) 256 bit decrypt    412321  125800   33056    8368    5827    1048
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
cbc(aes) 128 bit encrypt    476081  154228   41307   10520    7331    1318
cbc(aes) 128 bit decrypt    462068  152934   41228   10516    7326    1315
cbc(aes) 192 bit encrypt    426126  133894   35598    9041    6297    1132
cbc(aes) 192 bit decrypt    416446  133116   35542    9040    6296    1131
cbc(aes) 256 bit encrypt    386841  118950   31382    7953    5539     996
cbc(aes) 256 bit decrypt    379032  118209   31324    7952    5537     995
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
ctr(aes) 128 bit encrypt    475435  152852   40825   10372    7225    1299
ctr(aes) 128 bit decrypt    475804  152852   40862   10374    7227    1299
ctr(aes) 192 bit encrypt    426900  133025   35230    8940    6228    1120
ctr(aes) 192 bit decrypt    427377  133030   35235    8942    6228    1120
ctr(aes) 256 bit encrypt    388872  118259   31086    7875    5484     985
ctr(aes) 256 bit decrypt    388862  118260   31100    7875    5483     985
                      16 B    64 B   256 B  1024 B  2048 B  4096 B  8192 B
md5                 600185  365210  166293   52399   27389   14011    7068
sha1                230154  124734   52979   16055    8322    4237    2137

op/s with module and hardware offloading enabled:
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
ecb(aes) 128 bit encrypt     65062   58964   41380   19433   14884    2712
ecb(aes) 128 bit decrypt     65288   58507   40417   18854   14400    2627
ecb(aes) 192 bit encrypt     65233   57798   39236   17849   13534    2468
ecb(aes) 192 bit decrypt     65377   57100   38444   17336   13147    2406
ecb(aes) 256 bit encrypt     65064   56928   37400   16496   12432    2270
ecb(aes) 256 bit decrypt     64932   56115   36833   16064   12097    2219
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
cbc(aes) 128 bit encrypt     64246   58073   40720   19361   14878    2718
cbc(aes) 128 bit decrypt     60969   55128   38904   18630   14184    2614
cbc(aes) 192 bit encrypt     64211   56854   38787   17793   13571    2468
cbc(aes) 192 bit decrypt     60948   53947   37209   17097   12955    2390
cbc(aes) 256 bit encrypt     63920   55889   37128   16502   12430    2267
cbc(aes) 256 bit decrypt     60680   53174   35787   15819   11961    2200
                              16 B    64 B   256 B  1024 B  1472 B  8192 B
ctr(aes) 128 bit encrypt     64452   58387   40897   19401   14921    2710
ctr(aes) 128 bit decrypt     64425   58244   41016   19433   14747    2710
ctr(aes) 192 bit encrypt     64513   57115   38884   17860   13547    2468
ctr(aes) 192 bit decrypt     64531   57116   39088   17785   13510    2468
ctr(aes) 256 bit encrypt     64284   56094   37254   16524   12411    2267
ctr(aes) 256 bit decrypt     64272   56321   37296   16436   12411    2265
                      16 B    64 B   256 B  1024 B  2048 B  4096 B  8192 B
md5                  47224   44513   39175   25264   17199   10548    5874
sha1                 46389   43578   36878   22501   14890    8796    4835
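
The summary at the top of the benchmark section can be cross-checked against
the tables with a little arithmetic. The following is only a sketch: the
struct and function names are made up for illustration, and the numbers are
the ecb(aes) 128 bit encrypt rows copied from the two tables above.

```c
#include <assert.h>
#include <stddef.h>

/* One table column: operations per second at a given block size. */
struct ops_sample {
	unsigned int block_bytes;
	unsigned long sw_ops;	/* default software algorithm */
	unsigned long hw_ops;	/* module with hardware offloading */
};

/* ecb(aes) 128 bit encrypt, copied from the two tables above. */
static const struct ops_sample ecb_aes128_enc[] = {
	{   16, 513593, 65062 },
	{   64, 165651, 58964 },
	{  256,  44233, 41380 },
	{ 1024,  11264, 19433 },
	{ 1472,   7846, 14884 },
	{ 8192,   1411,  2712 },
};

/* Hardware throughput relative to software; > 1.0 means the engine wins. */
static double hw_speedup(const struct ops_sample *s)
{
	assert(s->sw_ops > 0);
	return (double)s->hw_ops / (double)s->sw_ops;
}

/*
 * Smallest listed block size at which the engine is at least as fast
 * as software, or 0 if it never catches up.
 */
static unsigned int break_even_block(const struct ops_sample *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].hw_ops >= tbl[i].sw_ops)
			return tbl[i].block_bytes;
	return 0;
}
```

For this row the ratio at 256 B is roughly 0.94, so hardware and software are
nearly level there, and 1024 B is the first listed size where the engine is
strictly ahead.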

Markus Stockhausen (6)
  crypto/realtek: header definitions
  crypto/realtek: core functions
  crypto/realtek: hash algorithms
  crypto/realtek: skcipher algorithms
  crypto/realtek: enable module
  crypto/realtek: add devicetree documentation

/devicetree/bindings/crypto/realtek,realtek-crypto.yaml|   51 +
drivers/crypto/Kconfig                                 |   13
drivers/crypto/Makefile                                |    1
drivers/crypto/realtek/Makefile                        |    5
drivers/crypto/realtek/realtek_crypto.c                |  472 ++++++++++
drivers/crypto/realtek/realtek_crypto.h                |  325 ++++++
drivers/crypto/realtek/realtek_crypto_ahash.c          |  406 ++++++++
drivers/crypto/realtek/realtek_crypto_skcipher.c       |  361 +++++++
8 files changed, 1634 insertions(+)

Comments

Markus Stockhausen Dec. 5, 2022, 6:47 p.m. UTC | #1
On Thursday, 13.10.2022 at 20:40 +0200, Markus Stockhausen wrote:

> This driver adds support for the Realtek crypto engine. It provides hardware
> accelerated AES, SHA1 & MD5 algorithms. It is included in SoCs of the RTL838x
> series, such as RTL8380, RTL8381, RTL8382, as well as SoCs from the RTL930x
> series, such as RTL9301, RTL9302 and RTL9303. Some little endian and ARM based
> Realtek SoCs seem to have this engine too. Nevertheless this patch was only
> developed and verified on MIPS big endian devices.
> 
> Module has been successfully tested with
> - lots of module loads/unloads with crypto manager extra tests enabled.
> - openssl devcrypto benchmarking
> - tcrypt.ko benchmarking
> 
> ...
> 
> Markus Stockhausen (6)
>   crypto/realtek: header definitions
>   crypto/realtek: core functions
>   crypto/realtek: hash algorithms
>   crypto/realtek: skcipher algorithms
>   crypto/realtek: enable module
>   crypto/realtek: add devicetree documentation
> 
> /devicetree/bindings/crypto/realtek,realtek-crypto.yaml|   51 +
> drivers/crypto/Kconfig                                 |   13
> drivers/crypto/Makefile                                |    1
> drivers/crypto/realtek/Makefile                        |    5
> drivers/crypto/realtek/realtek_crypto.c                |  472 ++++++++++
> drivers/crypto/realtek/realtek_crypto.h                |  325 ++++++
> drivers/crypto/realtek/realtek_crypto_ahash.c          |  406 ++++++++
> drivers/crypto/realtek/realtek_crypto_skcipher.c       |  361 +++++++
> 8 files changed, 1634 insertions(+)

Hi (Herbert),

as I got neither positive nor negative feedback after your last
question I just want to ask if there is any work for me to do on this
series?

Thanks in advance.

Markus
Herbert Xu Dec. 6, 2022, 4:10 a.m. UTC | #2
On Mon, Dec 05, 2022 at 07:47:59PM +0100, Markus Stockhausen wrote:
>
> as I got neither positive nor negative feedback after your last
> question I just want to ask if there is any work for me to do on this
> series?

Sorry about that.

There is still an issue with your import function.  You dereference
the imported state directly.  That is not allowed because there is
no guarantee that the imported state is aligned for a direct CPU
load.

So you'll either need to copy it somewhere first or use an unaligned
load to access hexp->state.
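
Both options mentioned here can be sketched in plain C. This is a userspace
illustration under stated assumptions, not the driver's code: struct demo_req,
the DEMO_* flags and the function names are hypothetical stand-ins, and
memcpy() plays the role that a helper such as get_unaligned() would play
inside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the request state; not the real layout. */
struct demo_req {
	uint32_t state;
	uint8_t buf[64];
};

#define DEMO_FB_ACT 0x1u	/* illustrative flag values */
#define DEMO_FB_RDY 0x2u

/* Option 1: copy the blob into an aligned object first, then use it. */
static void demo_import_copy(struct demo_req *dst, const void *in)
{
	assert(dst && in);
	memcpy(dst, in, sizeof(*dst));
	if (dst->state & DEMO_FB_ACT)
		dst->state |= DEMO_FB_RDY;
}

/* Option 2: fetch one field with a byte-wise, alignment-safe load. */
static uint32_t demo_load_state(const void *in)
{
	const uint8_t *p = in;
	uint32_t state;

	assert(in);
	memcpy(&state, p + offsetof(struct demo_req, state), sizeof(state));
	return state;
}
```

Neither function ever dereferences the incoming pointer as a struct, so the
buffer may start at any byte address.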

Cheers,
Markus Stockhausen Dec. 6, 2022, 8:59 a.m. UTC | #3
On Tuesday, 06.12.2022 at 12:10 +0800, Herbert Xu wrote:
> On Mon, Dec 05, 2022 at 07:47:59PM +0100, Markus Stockhausen wrote:
> > 
> > as I got neither positive nor negative feedback after your last
> > question I just want to ask if there is any work for me to do on this
> > series?
> 
> Sorry about that.
> 
> There is still an issue with your import function.  You dereference
> the imported state directly.  That is not allowed because there is
> no guarantee that the imported state is aligned for a direct CPU
> load.
> 
> So you'll either need to copy it somewhere first or use an unaligned
> load to access hexp->state.
> 
> Cheers,

No problem,

this is something I can work with. Nevertheless I'm unsure about your
guidance. If I get it right, the state assignment is not ok.

	...
	const struct rtcr_ahash_req *hexp = in;

	hreq->state = hexp->state; << *** maybe unaligned? ***
	if (hreq->state & RTCR_REQ_FB_ACT)
		hreq->state |= RTCR_REQ_FB_RDY;

	if (rtcr_check_fallback(areq))
		return crypto_ahash_import(freq, fexp);

	memcpy(hreq, hexp, sizeof(struct rtcr_ahash_req));
	...

Comparing this to safexcel_ahash_import(), where I got my ideas from,
one sees similar code:

	...
	const struct safexcel_ahash_export_state *export = in;
	int ret;

	ret = crypto_ahash_init(areq);
	if (ret)
		return ret;

	req->len = export->len; << *** same here ***
	req->processed = export->processed;

	req->digest = export->digest;

	memcpy(req->cache, export->cache, HASH_CACHE_SIZE);
	memcpy(req->state, export->state, req->state_sz);
	...

Thanks in advance for your help.
Herbert Xu Dec. 6, 2022, 9:07 a.m. UTC | #4
On Tue, Dec 06, 2022 at 09:59:49AM +0100, Markus Stockhausen wrote:
>
> 	const struct rtcr_ahash_req *hexp = in;
> 
> 	hreq->state = hexp->state; << *** maybe unaligned? ***

Try

	const struct rtcr_ahash_req __packed *hexp = in;

This should tell the compiler to use unaligned accessors.
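
As a rough userspace illustration of the packed idea: with GCC and Clang, a
struct type declared packed has alignment 1, so member reads through a pointer
to it are compiled as alignment-safe accesses. The sketch below declares a
separate packed mirror type, which is a variation on the one-line suggestion
above rather than the same thing; all names and the two-field layout are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export layout; field names are illustrative only. */
struct demo_state {
	uint32_t state;
	uint32_t len;
};

/* Same fields, but the type has alignment 1 (GCC/Clang attribute). */
struct demo_state_unaligned {
	uint32_t state;
	uint32_t len;
} __attribute__((packed));

/* The mirror type must describe exactly the same bytes. */
_Static_assert(sizeof(struct demo_state) ==
	       sizeof(struct demo_state_unaligned),
	       "packed mirror must not change the layout");

/* Read the state field from a buffer that may start at any address. */
static uint32_t demo_read_state(const void *in)
{
	const struct demo_state_unaligned *exp = in;

	assert(in != NULL);
	return exp->state;
}
```

The static assertion guards against the case where packing would remove
padding and silently shift the fields.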

> Comparing this to safexcel_ahash_import(), where I got my ideas from,
> one sees similar code:
> 
> 	...
> 	const struct safexcel_ahash_export_state *export = in;

This is equally broken unless that driver can only be used on
platforms where unaligned access is legal (such as x86).

Cheers,