First run (with cryptodev)
Total change for mtk-aes: 394682
Total change CPU : 2260 + 14616 + 14895 + 22625 = 54396
Second run (without cryptodev)
Total change for mtk-aes: 0
Total change CPU : 1710 + 4332 + 4860 + 4357 = 15259
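The totals above read like differences between two readings of /proc/interrupts, one taken before and one after the test. A minimal sketch of how such deltas could be computed, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ). The helper names `parse_interrupts`, `delta_for` and `read_interrupts` are made up for this illustration.

```python
def parse_interrupts(text):
    """Map each IRQ label to (per-CPU counters, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    rows = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < ncpu or not all(f.isdigit() for f in fields[:ncpu]):
            continue                      # rows such as "Err:" carry fewer columns
        rows[label.strip()] = ([int(f) for f in fields[:ncpu]], " ".join(fields[ncpu:]))
    return rows


def delta_for(before, after, needle):
    """Per-CPU counter growth, summed over every IRQ whose description contains needle."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    per_cpu = None
    for label, (counts, desc) in new.items():
        if needle not in desc:
            continue
        base = old.get(label, ([0] * len(counts), ""))[0]
        diff = [n - o for n, o in zip(counts, base)]
        per_cpu = diff if per_cpu is None else [x + y for x, y in zip(per_cpu, diff)]
    return per_cpu or []


def read_interrupts(path="/proc/interrupts"):
    """Take one snapshot (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

Taking one snapshot with `read_interrupts()` before the benchmark and one after, `sum(delta_for(before, after, "mtk-aes"))` would give a figure comparable to the "Total change for mtk-aes" line.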
This follows another test I did, where the version using cryptodev performed terribly with blocks smaller than 1024 bytes and then took off (with a 16K block size it was around 6 times faster than using the CPU). That could be because of the penalty of triggering so many interrupts for small packets (my opinion).
Yes, currently small chunks are the bottleneck. BTW, if you want to run a throughput test with multiple pairs (e.g. IPSec, MACSec), you can route these interrupts to different CPUs (so-called SMP affinity), since we use several rings for parallel processing.
I hadn’t noticed before that the crypto driver is only using the first CPU, even though its affinity value is F (use all CPUs). So, if I put the “other” technology on a different CPU, will the cryptographic hardware work in parallel with the first one?
If that is the case, then using IPSec will not interfere with openssl … am I right? … A crazy idea … would it be possible to add a copy of cryptodev with a different internal name, bound to a different CPU?
With only cryptodev it is a little complicated to do other tests. But when I desynchronized my 4 tasks using the crypto and the networking on purpose, I was able to achieve around the same throughput even when I let two threads wait for 20 seconds. In that case, that spare capacity could be used by different concurrent tasks.
In general, sequential speed can’t reveal everything a platform like the R2 has to offer. When it is used wisely, we can get some surprises.
These DMA interrupts cause overhead. I guess CPU0 is busy all the time while the others are idle. That’s why I suggest you try binding IRQ 71 to IRQ 74 to other CPUs.
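The binding suggested here is normally done by writing a hexadecimal CPU bitmask to /proc/irq/<n>/smp_affinity. A small sketch that spreads IRQ 71 to 74 over CPU1 to CPU3 round-robin; the IRQ numbers come from the post above, while the choice of target CPUs and the helper names `affinity_plan` and `apply_plan` are assumptions for illustration. It defaults to a dry run, because the real write needs root and a daemon such as irqbalance may later change the masks again.

```python
import os


def affinity_plan(irqs, cpus):
    """Spread IRQs round-robin over CPUs; returns {irq: hex bitmask string}."""
    return {irq: format(1 << cpus[i % len(cpus)], "x") for i, irq in enumerate(irqs)}


def apply_plan(plan, dry_run=True, root="/proc/irq"):
    """Return the (path, mask) pairs and, unless dry_run, write them."""
    actions = []
    for irq, mask in plan.items():
        path = os.path.join(root, str(irq), "smp_affinity")
        actions.append((path, mask))
        if not dry_run:
            with open(path, "w") as fh:   # requires root privileges
                fh.write(mask)
    return actions


# Hypothetical layout: keep CPU0 free and use CPU1..CPU3 for the crypto rings.
plan = affinity_plan([71, 72, 73, 74], cpus=[1, 2, 3])
for path, mask in apply_plan(plan):
    print(f"would write {mask} to {path}")
```

In this notation mask 2 means CPU1, 4 means CPU2 and 8 means CPU3; the value F mentioned earlier in the thread is all four CPUs at once.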
I have been playing with the machine for a while … and there is something related to the encryption that does not seem to work correctly.
When using cryptsetup / LUKS, it is necessary to specify aes as the cipher, and then the operating system uses the hardware extensions to accelerate the processing. This is very good.
However, this does not always work. After rebooting, the machine can no longer mount the partitions, nor create new ones. It behaves as if the SoC had entered some incoherent state and could not work reliably anymore. In fact, the only way out is to apply some kind of “electroshock” by erasing and recreating the physical partition … which obviously is not a valid solution. This happens with the latest Ubuntu image from November. As a reference, I executed the same commands without specifying the cipher, so that cryptsetup uses its default software-based functionality, and the result is reliable encryption before and after rebooting the computer.
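To see which implementation the kernel would hand out for a given cipher, /proc/crypto lists every registered algorithm with its driver name and priority, and a request by algorithm name generally gets the highest-priority entry. A rough sketch of reading that file; the function names are made up here, and which algorithm string cryptsetup ends up requesting depends on the cipher mode chosen, so `cbc(aes)` below is only an example.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto into one dict per blank-line-separated entry."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def implementations(text, algorithm):
    """Entries providing `algorithm`, highest priority first."""
    found = [e for e in parse_proc_crypto(text) if e.get("name") == algorithm]
    return sorted(found, key=lambda e: int(e.get("priority", 0)), reverse=True)


def preferred_driver(algorithm, path="/proc/crypto"):
    """Driver name of the highest-priority implementation, or None."""
    with open(path) as fh:
        ranked = implementations(fh.read(), algorithm)
    return ranked[0]["driver"] if ranked else None
```

Comparing `preferred_driver("cbc(aes)")` in the working and the failing setup would at least show whether the hardware driver or a software one is being selected.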
The performance and code quality of the MT7623 upstream crypto driver are better than those of the MT7621 proprietary driver, so there is no reason to request such a driver for the MT7623, thanks.
We can use the cryptodev module to create /dev/crypto and the openssl command to test performance.
with hw crypto:
root@LEDE:/# openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 103906 aes-128-cbc's in 0.07s
Doing aes-128-cbc for 3s on 64 size blocks: 103955 aes-128-cbc's in 0.09s
Doing aes-128-cbc for 3s on 256 size blocks: 103742 aes-128-cbc's in 0.11s
Doing aes-128-cbc for 3s on 1024 size blocks: 85149 aes-128-cbc's in 0.09s
Doing aes-128-cbc for 3s on 8192 size blocks: 51995 aes-128-cbc's in 0.05s
without hw crypto:
root@LEDE:/# openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 3140090 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 930010 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 244271 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 61847 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 7761 aes-128-cbc's in 3.00s
Finally, if you see many interrupts after testing, the hw crypto works.
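One caveat when reading the two runs above: without `-elapsed`, `openssl speed` divides by user CPU time rather than wall-clock time, which is why the hardware run reports 0.05s to 0.11s for a 3 second measurement. A rough sketch that recomputes throughput against the 3 second window instead, assuming each measurement really ran for the requested duration (the log itself does not confirm that). `parse_speed` and its field names are made up for illustration.

```python
import re

# Matches lines such as:
#   Doing aes-128-cbc for 3s on 8192 size blocks: 51995 aes-128-cbc's in 0.05s
LINE = re.compile(r"Doing (\S+) for (\d+)s on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s")


def parse_speed(text):
    """Return one dict per 'Doing ...' line with two throughput estimates."""
    rows = []
    for m in LINE.finditer(text):
        window, size, count, reported = int(m[2]), int(m[3]), int(m[4]), float(m[5])
        rows.append({
            "size": size,
            "count": count,
            # Assumption: the run lasted the full requested window.
            "wall_mb_s": size * count / window / 1e6,
            # What openssl prints by default: bytes per second of CPU time.
            "cpu_mb_s": size * count / reported / 1e6,
        })
    return rows
```

Under that assumption, the 8192-byte rows quoted above give roughly 142 MB/s with hw crypto against 21 MB/s without, a factor of about 6.7, while for the 16-byte rows the software path comes out about 30 times faster.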
I checked http://cryptodev-linux.org/ and found that it is a general driver, not optimized for a specific SoC. I am curious how it achieves better performance than the mainline kernel.
AF_ALG and Cryptodev give user space access to the cryptographic primitives exposed by the kernel.
Then, when using these APIs, we are calling the kernel. If the kernel has access to cryptographic hardware, these APIs will use it. If the kernel has no access, or the platform has no cryptographic hardware, software will be used instead.
So, when the hardware is there, it will be faster to use Cryptodev or AF_ALG than pure software cryptography such as Botan. And the time difference can be huge.
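To make the "calling the kernel" part concrete, here is a minimal sketch of an AES-128-CBC encryption through AF_ALG, using the AF_ALG support in Python's standard `socket` module on Linux. The function name is made up for this example, it does no padding, and whether the request lands on hardware or on a software implementation is decided by the kernel, not by this code.

```python
import socket


def afalg_aes_cbc_encrypt(key, iv, plaintext):
    """Encrypt with the kernel's cbc(aes); returns None if AF_ALG is unavailable."""
    if len(plaintext) % 16:
        raise ValueError("CBC input must be a multiple of 16 bytes (no padding here)")
    if not hasattr(socket, "AF_ALG"):
        return None                       # not Linux, or Python built without AF_ALG
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("skcipher", "cbc(aes)"))
            alg.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)
            op, _ = alg.accept()          # operation socket for this key
            with op:
                op.sendmsg_afalg([plaintext], op=socket.ALG_OP_ENCRYPT, iv=iv)
                return op.recv(len(plaintext))
    except OSError:
        return None                       # kernel lacks AF_ALG or the algorithm
```

Every call crosses into the kernel, which is also why small blocks are expensive on this path regardless of how fast the engine behind it is.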
# openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 3146569 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 1097400 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 305013 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 78430 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 8192 size blocks: 9888 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 4944 aes-128-cbc's in 3.00s
OpenSSL 1.1.0f 25 May 2017
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/lib/ssl\"" -DENGINESDIR="\"/usr/lib/arm-linux-gnueabihf/engines-1.1\""
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      16781.70k    23411.20k    26027.78k    26860.31k    27000.83k    27000.83k
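The summary row follows directly from the "Doing ..." lines: operations times block size, divided by the reported time, in thousands of bytes per second. A short check of that arithmetic, with the numbers copied from the run above; the helper name `openssl_k` is made up.

```python
def openssl_k(count, block_size, seconds):
    """Thousands of bytes per second, rounded to two decimals."""
    return round(count * block_size / seconds / 1000, 2)


# (block size, operations, reported seconds) from the software-only run above
RUN = [
    (16, 3146569, 3.00),
    (64, 1097400, 3.00),
    (256, 305013, 3.00),
    (1024, 78430, 2.99),
    (8192, 9888, 3.00),
    (16384, 4944, 3.00),
]

for size, count, seconds in RUN:
    print(f"{size:>6} bytes: {openssl_k(count, size, seconds):.2f}k")
```

The last two rows are identical because doubling the block size exactly halved the number of operations, so the software path has reached its plateau of about 27 MB/s.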
Hi, Ryder.Lee … I really gave up on using the hardware crypto extensions for LUKS in this case, as it was unstable. For that I only use the pure software option. Encryption is a very sensitive thing; if your endianness is different, or if a bit gets lost here or there, or whatever, then everything comes out different. OpenSSL was OK.
Right now I am testing other things with AF_ALG, but with a Banana M2+ (no LUKS yet), and it works well in both directions. Maybe I could try that with the R2 later, to see how it works.
I used Debian 9 armv7l, with kernel 4.9 patched from OpenWrt.
I have “<*> Mediatek Random Number Generator support” included in the kernel config, as you can see from the interrupts.
You’re right, I don’t have “ARM Accelerated Cryptographic Algorithms”. Which ones should I enable, all except ARMv8?
I tried again with all crypto options included in the kernel, and got the same result.
# openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 3141900 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 1095190 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 304640 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 78367 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 9879 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 4942 aes-128-cbc's in 3.00s
OpenSSL 1.1.0f 25 May 2017
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/lib/ssl\"" -DENGINESDIR="\"/usr/lib/arm-linux-gnueabihf/engines-1.1\""
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      16756.80k    23364.05k    25995.95k    26749.27k    26976.26k    26989.91k
# openssl speed -evp aes-128-cbc -engine cryptodev
invalid engine "cryptodev"
3069338560:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:../crypto/dso/dso_dlfcn.c:113:filename(/usr/lib/arm-linux-gnueabihf/engines-1.1/cryptodev.so): /usr/lib/arm-linux-gnueabihf/engines-1.1/cryptodev.so: cannot open shared object file: No such file or directory
3069338560:error:25070067:DSO support routines:DSO_load:could not load the shared library:../crypto/dso/dso_lib.c:161:
3069338560:error:260B6084:engine routines:dynamic_load:dso not found:../crypto/engine/eng_dyn.c:414:
3069338560:error:2606A074:engine routines:ENGINE_by_id:no such engine:../crypto/engine/eng_list.c:339:id=cryptodev
3069338560:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:../crypto/dso/dso_dlfcn.c:113:filename(libcryptodev.so): libcryptodev.so: cannot open shared object file: No such file or directory
3069338560:error:25070067:DSO support routines:DSO_load:could not load the shared library:../crypto/dso/dso_lib.c:161:
3069338560:error:260B6084:engine routines:dynamic_load:dso not found:../crypto/engine/eng_dyn.c:414:
...
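The errors above say that no cryptodev.so exists in the engines directory. A small diagnostic sketch that gathers the relevant facts in one place, taking the text printed by `openssl version -a` (or just its `compiler:` line) as input. Two assumptions to flag: the function name and the returned keys are made up, and the `-DHAVE_CRYPTODEV` check relies on my understanding that OpenSSL builds of this generation compile the cryptodev engine in only when that define is present, which is worth verifying against the OpenSSL sources for your version.

```python
import os
import re


def cryptodev_diagnosis(build_info, dev_node="/dev/crypto"):
    """Collect hints about why '-engine cryptodev' might not load."""
    # ENGINESDIR appears with varying amounts of quoting in the compiler line.
    match = re.search(r'-DENGINESDIR=[\\"]*([^\\"\s]+)', build_info)
    engines_dir = match.group(1) if match else None
    engine_file = os.path.join(engines_dir, "cryptodev.so") if engines_dir else None
    return {
        "dev_node_present": os.path.exists(dev_node),
        "engines_dir": engines_dir,
        "engine_file_present": bool(engine_file) and os.path.exists(engine_file),
        # Assumption: this define marks a build with the cryptodev engine compiled in.
        "built_with_HAVE_CRYPTODEV": "-DHAVE_CRYPTODEV" in build_info,
    }
```

Fed with the compiler line quoted earlier in this post, it reports the engines directory from the error message and no `-DHAVE_CRYPTODEV`, which would suggest the distribution's OpenSSL package was built without cryptodev support, independently of whether the kernel module and /dev/crypto are present.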