[BPI-R4] SFP rss/lro performance

Which iperf version do you have? I had limited throughput with default from debian. LRO needed iperf2 in my tests to reach the 9.4 gbit (max)

$ iperf --version: iperf version 2.1.8 (12 August 2022) pthreads

After dealing with the nfs error:

$ journalctl -b -u nfs-server
Jul 22 06:14:45 nas systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
Jul 22 06:14:45 nas systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.

I was re-compile kernel with the nfs directly in the kernel instead of module, and nfs-server started working.


  • dd if=/dev/zero of=/mnt/nvme/test.img bs=10M count=1000 oflag=dsync

10370416640 bytes (10 GB, 9.7 GiB) copied, 39 s, 266 MB/s

9961472000 bytes (10 GB, 9.3 GiB) copied, 14 s, 711 MB/s

  • dd if=/mnt/raid0/test.img of=/dev/null bs=10M iflag=direct

9856614400 bytes (9.9 GB, 9.2 GiB) copied, 13 s, 757 MB/s


Results between 6.12.32 without rss/lro and 6.16.0 with rss/lro looks not as big as expected, but, I thing that in case of using BPI-R4 as a NAS server:

  • writing to NAS means reading data by ethernet/sfp which utilise rss/lro functionality, but is probably limited by
    • write to the NVME and SATA drives is limited because of the technology - these drivers have to clear the space before writing a new data which makes writing slower, and if I remember it correctly, it reads the data to verify succesfull write.
  • reading from NAS drives is much more faster, as expected, and I am little confused here, because I did not set-up smp_affinity for TX frames.

It should make sense, but I have to ask: Does this means, that linux core utilise all cores for TX frames by default?


edit: nfs-server error added

I did test with the 3GB of ram drive:

  • dd if=/dev/zero of=/mnt/ramdrive/test.img bs=10M count=300 oflag=dsync

2883584000 bytes (2.9 GB, 2.7 GiB) copied, 5 s, 575 MB/s

  • dd if=/mnt/ramdrive/test.img of=/dev/null bs=10M iflag=direct

2170552320 bytes (2.2 GB, 2.0 GiB) copied, 2 s, 1.1 GB/s


Why is writing β€œonly” 50% of the maximum speed?