As the discussion in the [BPI-R4] and SFP thread was aimed at [X|G]PON problems, I am creating this thread to publish and discuss test results for 10G fiber SFP+ interfaces on various RSS/LRO kernels on Debian 12. I am testing iperf3 and also NFS performance.
On BPI-R4 I am using:
- sfp module MM Go-Fibereasy 10G SFP+ AOC detected by kernel as OEM SFP-10G-AOC2M rev 1.0 used as iperf --client
- m.2 nvme 1TB KINGSTON SKC2500M81000G
- miniPCIeX to 4x sata3 controller: ASMedia Technology Inc. ASM1064 Serial ATA Controller (rev 02)
- raid0 (3x 2.5" sata3 240GB WDC WDS240G2G0A-00JH30)
The “server” side runs Ubuntu 24 with an Intel X520.
kernel 6.12.32-bpi-r4-main, compiled from the default branch 6.12-main of frank-w/BPI-Router-Linux on GitHub:
- iperf3 --client
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.0 GBytes  9.42 Gbits/sec   33             sender
[  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec                  receiver
loadavg per second: 0.01 0.01 0.17 0.17 0.17 0.17 0.17 0.24 0.24 0.24 0.24
- iperf3 --client --reverse
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.21 GBytes  4.47 Gbits/sec  117             sender
[  5]   0.00-10.00  sec  5.20 GBytes  4.47 Gbits/sec                  receiver
loadavg per second: 0.00 0.00 0.00 0.00 0.24 0.24 0.24 0.24 0.24 0.38 0.38 0.38
- iperf3 --client --bidir
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  6.08 GBytes  5.22 Gbits/sec  223             sender
[  5][TX-C]   0.00-10.00  sec  6.08 GBytes  5.22 Gbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec  2.50 GBytes  2.15 Gbits/sec  761             sender
[  7][RX-C]   0.00-10.00  sec  2.50 GBytes  2.15 Gbits/sec                  receiver
loadavg per second: 0.03 0.03 0.03 0.18 0.18 0.18 0.18 0.18 0.33 0.33 0.33 0.33
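For anyone who wants to reproduce the "loadavg per second" lines: a minimal sketch of how such a line could be collected, assuming it is simply the first field of /proc/loadavg read once per second (the exact method used for the numbers in this post is not stated). The function name sample_loadavg and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Print the 1-minute load average COUNT times, one sample per second,
# on a single line. The second argument (file to read) is optional and
# exists only so the function can be tried against a test file.
sample_loadavg() {
    count=$1
    file=${2:-/proc/loadavg}
    i=0
    out=""
    while [ "$i" -lt "$count" ]; do
        # The first field of /proc/loadavg is the 1-minute average.
        read -r load1 _rest < "$file"
        out="$out${out:+ }$load1"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep 1
        fi
    done
    printf 'loadavg per second: %s\n' "$out"
}
```

Usage would be something like `sample_loadavg 12 &` right before starting iperf3. The kernel recomputes the load average only about every 5 seconds, which is consistent with the runs of identical values in the lines above.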
- nfs write to nvme on BPI-R4:
dd if=/dev/zero of=./test.img bs=10M count=1000 status=progress oflag=dsync
10454302720 bytes (10 GB, 9.7 GiB) copied, 45 s, 232 MB/s
loadavg per second: 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.32 0.32 0.32 0.32 0.32 0.46 0.46 0.46 0.46 0.46 0.66 0.66 0.66 0.66 0.66 0.93 0.93 0.93 0.93 0.93 1.09 1.09 1.09 1.09 1.09 1.41 1.41 1.41 1.41 1.41 1.37 1.37 1.37 1.37 1.37 1.34 1.34
- nfs read from nvme on BPI-R4:
dd if=./test.img of=/dev/null bs=10M status=progress iflag=direct
10297016320 bytes (10 GB, 9.6 GiB) copied, 13 s, 791 MB/s
loadavg per second: 0.02 0.02 0.18 0.18 0.18 0.18 0.18 0.81 0.81 0.81 0.81 0.81 1.39 1.39 1.39 1.39
- nfs write to raid0 on BPI-R4:
dd if=/dev/zero of=./test.img bs=10M count=1000 status=progress oflag=dsync
10412359680 bytes (10 GB, 9.7 GiB) copied, 62 s, 168 MB/s
loadavg per second: 0.01 0.01 0.01 0.01 0.01 0.09 0.09 0.09 0.09 0.09 0.16 0.16 0.16 0.16 0.16 0.23 0.23 0.23 0.23 0.23 0.37 0.37 0.37 0.37 0.37 0.42 0.42 0.42 0.42 0.42 0.55 0.55 0.55 0.55 0.55 0.58 0.58 0.58 0.58 0.62 0.62 0.62 0.62 0.62 0.81 0.81 0.81 0.81 0.81 0.82 0.82 0.82 0.82 0.82 1.08 1.08 1.08 1.08 1.08 1.23 1.23 1.23 1.23 1.23 1.13 1.13
- nfs read from raid0 on BPI-R4:
dd if=./test.img of=/dev/null bs=10M status=progress iflag=direct
10181672960 bytes (10 GB, 9.5 GiB) copied, 15 s, 678 MB/s
loadavg per second: 0.02 0.02 0.02 0.90 0.90 0.90 0.90 0.90 1.31 1.31 1.31 1.31 1.31 2.16 2.16 2.16 2.16
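To compare the dd figures with the iperf3 bitrates it helps to put them in the same unit. A small sketch, assuming dd's MB/s means 10^6 bytes per second (as GNU dd reports it) and ignoring NFS/TCP protocol overhead. mbs_to_gbit is just a helper name chosen here.

```shell
#!/bin/sh
# Convert a dd throughput in MB/s (10^6 bytes/s) to Gbit/s (10^9 bits/s).
mbs_to_gbit() {
    awk -v mbs="$1" 'BEGIN { printf "%.2f\n", mbs * 8 / 1000 }'
}

# dd results from the 6.12.32 runs in this post: "<label> <MB/s>"
for r in "nvme-write 232" "nvme-read 791" "raid0-write 168" "raid0-read 678"; do
    set -- $r    # split into label ($1) and MB/s value ($2)
    printf '%-12s %4s MB/s = %s Gbit/s\n' "$1" "$2" "$(mbs_to_gbit "$2")"
done
```

This gives roughly 1.86, 6.33, 1.34 and 5.42 Gbit/s, so even the fastest NFS read stays below the 9.4 Gbit/s that iperf3 reached in the client direction.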
I am compiling @frank-w's 6.16-rsslro branch right now. Test results will be added today. Any other results would be appreciated.
I did not complete the tests. Based on information from  but it looks like you set it to cpu2 and later to 3 (or vice versa, as I see interrupts on tx for cpu0, 2 and 3)