[BPI-F3] Box64 gains initial support for the RISC-V Vector 1.0 (RVV) extension, with up to 3x performance

Box64 now has initial support for the RISC-V Vector 1.0 (RVV) extension, running up to 3x as fast as the scalar backend. The code is open source and has been merged upstream.

The Box64 RISC-V backend uses scalar instructions to emulate MMX, SSE* and other x86_64 vector extensions. This gives good compatibility on rv64gc, but emulating a single vector instruction often takes a dozen or even dozens of scalar instructions, so x86_64 programs that make heavy use of vector instructions pay a relatively large performance penalty under Box64. Recently, developers from PLCT Lab added preliminary RVV 1.0 support to the Box64 RISC-V backend. They have submitted more than 30 related PRs, and the backend can now translate more than 100 SSE instructions efficiently into RVV instructions.
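To illustrate why scalar emulation is costly, here is a minimal sketch in Python that models the SSE2 `PADDB` instruction (16 independent byte-wide wrapping additions) one lane at a time. It is not Box64 code: the function name `paddb_scalar` and the choice to model a 128-bit register as a Python integer are assumptions made for illustration only.

```python
# Illustrative model only, not taken from Box64.
# A 128-bit XMM register is represented as a Python int.

LANES = 16      # PADDB works on 16 byte-wide lanes
MASK8 = 0xFF    # keeps one 8-bit lane


def paddb_scalar(a: int, b: int) -> int:
    """Emulate PADDB lane by lane, the way a scalar-only backend has to."""
    result = 0
    for lane in range(LANES):
        shift = 8 * lane
        x = (a >> shift) & MASK8   # extract the lane from a
        y = (b >> shift) & MASK8   # extract the lane from b
        s = (x + y) & MASK8        # 8-bit add that wraps, with no carry into the next lane
        result |= s << shift       # put the lane back in place
    return result


if __name__ == "__main__":
    # Lane 0: 0xFF + 0x01 wraps to 0x00; lane 1: 0x01 + 0x02 = 0x03
    print(hex(paddb_scalar(0x01FF, 0x0201)))
```

Every lane needs its own extract, add, mask and insert steps. A vector ISA can express the same operation as one element-wise add over 8-bit elements (RVV has `vadd.vv` for this); whether Box64 emits exactly that instruction for `PADDB` is an assumption here, not something stated above.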

The new backend was tested by running dav1d on the SpacemiT K1 (8 cores, RVV 1.0, VLEN 256). The results are shown below.

dav1d is a well-known cross-platform, open-source AV1 video decoder, and its x86_64 build makes heavy use of SSE instructions. By translating SSE instructions into RVV instructions, Box64 greatly reduces the performance penalty of binary translation. As the figure shows, with V extension support enabled, the performance score reached 70% of the native version and 300% of the scalar version!
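The two percentages can be combined to estimate where the scalar backend stands relative to native. The short calculation below is a sketch that assumes both figures refer to the same performance score; the helper name `scalar_fraction_of_native` is hypothetical.

```python
def scalar_fraction_of_native(rvv_vs_native: float, rvv_vs_scalar: float) -> float:
    """Derive scalar/native from rvv/native and rvv/scalar (same score assumed)."""
    return rvv_vs_native / rvv_vs_scalar


if __name__ == "__main__":
    # 70% of native and 300% of scalar, as reported for dav1d
    fraction = scalar_fraction_of_native(0.70, 3.00)
    print(f"scalar backend is roughly {fraction:.0%} of native")
```

Under that assumption, the scalar backend reaches a little under a quarter of the native score on this workload.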

Note that dav1d's hot code contains a large number of SSE instructions, so the measured improvement is large. It is for reference only and is not universal: the gain differs from program to program and needs to be analyzed case by case. PLCT Lab will continue to improve Box64's RVV 1.0 support. Developers around the world are welcome to help build the RISC-V software ecosystem!