Yes, the patches are only in this 6.18-jumbo branch.
See this patch in the MTK SDK for fixing the DSA warnings:
updated the 6.18-jumbo branch, but it is only compile-tested so far… did a first bootup and got an oops… still looking into why this happens… it seems to happen in probe (I think I found it, but need to think about how to solve it correctly).
edit: I have fixed the crash in 6.19-jumbo (and dropped that update from the 6.18 tree, as it breaks), but jumbo-frame handling is now completely different… I have to set the MTU for eth0 first before I can set the DSA user-port MTU:
#BPI-R4
# ip link set eth0 mtu 9000 up
# ip link set dev lan3 mtu 9000 up
# ip a a 192.168.90.1/24 dev lan3
#other side (laptop):
$ sudo ip link set dev enx00e04c68001b mtu 9000 up
$ sudo ip a a 192.168.90.2/24 dev enx00e04c68001b
$ ping -M do -s 8972 192.168.90.1
PING 192.168.90.1 (192.168.90.1) 8972(9000) bytes of data.
8980 bytes from 192.168.90.1: icmp_seq=1 ttl=64 time=0.738 ms
...
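Side note: the -s 8972 payload above follows from the MTU: 9000 bytes minus the 20-byte IPv4 header and the 8-byte ICMP header (the 8980 bytes in the reply is the payload plus the ICMP header). A tiny helper, purely illustrative, to compute it for any MTU:

```shell
#!/bin/bash
# Largest ICMP echo payload that fits in a given MTU without
# fragmentation: MTU - 20 (IPv4 header) - 8 (ICMP header).
mtu="${1:-9000}"
payload=$((mtu - 20 - 8))
echo "ping -M do -s $payload <target>"
```

For MTU 9000 this prints `ping -M do -s 8972 <target>`.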
mhm, it seems that the mac down/up is still needed, but the mt753x error seems to be gone
root@bpi-r4-v11:~
# ip a a 192.168.90.1/24 dev lan3
root@bpi-r4-v11:~
# ip link set eth0 mtu 9000 up
Error: mtu greater than device maximum.
root@bpi-r4-v11:~
# ip link set eth0 up
[ 51.854102] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 51.862259] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
root@bpi-r4-v11:~
# ip link set eth0 down
[ 54.964902] mtk_soc_eth 15100000.ethernet eth0: Link is Down
root@bpi-r4-v11:~
# ip link set eth0 mtu 9000 up
[ 95.284815] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 95.293009] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
root@bpi-r4-v11:~
# ip link set dev lan3 mtu 9000 up
[ 131.224631] mt7530-mmio 15020000.switch lan3: configuring for phy/internal link mode
[ 131.233282] mt7530-mmio 15020000.switch lan3: Link is Up - 1Gbps/Full - flow control rx/tx
root@bpi-r4-v11:~
#
root@bpi-r4-v11:~
# dmesg | grep mt753
[ 3.359175] mt7530-mmio 15020000.switch: configuring for fixed/internal link mode
[ 3.369729] mt7530-mmio 15020000.switch: Link is Up - 10Gbps/Full - flow control rx/tx
[ 3.370382] mt7530-mmio 15020000.switch wan (uninitialized): PHY [mt7530-0:00] driver [Generic PHY] (irq=POLL)
[ 3.371955] mt7530-mmio 15020000.switch lan1 (uninitialized): PHY [mt7530-0:01] driver [Generic PHY] (irq=POLL)
[ 3.373384] mt7530-mmio 15020000.switch lan2 (uninitialized): PHY [mt7530-0:02] driver [Generic PHY] (irq=POLL)
[ 3.374836] mt7530-mmio 15020000.switch lan3 (uninitialized): PHY [mt7530-0:03] driver [Generic PHY] (irq=POLL)
[ 131.224631] mt7530-mmio 15020000.switch lan3: configuring for phy/internal link mode
[ 131.233282] mt7530-mmio 15020000.switch lan3: Link is Up - 1Gbps/Full - flow control rx/tx
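The down/up dance above can be scripted so the ordering is at least reproducible. A sketch (it only prints the commands by default; the interface names are the ones used in this thread):

```shell
#!/bin/bash
# Sketch: jumbo-frame bring-up order for a DSA user port on mtk_eth_soc.
# The conduit (eth0) has to be opened once so the driver raises its max
# MTU, then cycled down and back up with the new MTU, before the DSA
# user port (lan3) accepts the larger MTU.
mac="${1:-eth0}"; port="${2:-lan3}"; mtu="${3:-9000}"

cmds=(
  "ip link set $mac up"                 # mtk_open raises the device max MTU
  "ip link set $mac down"
  "ip link set $mac mtu $mtu up"
  "ip link set dev $port mtu $mtu up"
)
for c in "${cmds[@]}"; do
  echo "$c"   # replace 'echo' with 'eval' (as root) to actually apply
done
```

Run without arguments it prints the sequence for eth0/lan3 at MTU 9000.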
Thanks! I saw your newest commits and I've been trying to backport them to 6.12 (since I'm running OpenWrt). The mtk_eth_soc driver has changed a lot in the past year or so, so I'll need a while to figure out how.
CMIIW: we're looking at "ci: fix R3 filename (used in images-repo)" … "net: mediatek: fix crash after jumbo-patch", correct?
I reverted my fix and simply moved the mtk_set_max_mtu call further down in mtk_add_mac in a later commit (6.19-jumbo)
so basically you need these commits:
a3207a383794 2025-12-28 net: mtk_eth_soc: try to fix crash the right way Frank Wunderlich (HEAD -> 6.19-jumbo, origin/6.19-jumbo)
ca4c976dce22 2025-11-12 net: ethernet: mtk_eth_soc: add dynamic rx buffer adjustment support Mason Chang
b4c6cb1d5532 2025-10-17 net: ethernet: mtk_eth_soc: add 9k jumbo frame support Mason Chang
92bfdbdd4de1 2025-10-08 net: ethernet: mtk_eth_soc: change default rx buffer length Mason Chang
most changes are due to the RSS/LRO support, but it should be no big problem to adapt the changes for 6.12
i will merge the top 2 into one tomorrow
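In practice the backport boils down to cherry-picking those commits, oldest first, onto the 6.12 tree and resolving the RSS/LRO conflicts by hand. A dry-run sketch (the branch name is made up; the hashes are the ones listed above):

```shell
#!/bin/bash
# Hypothetical backport flow: cherry-pick the jumbo-frame commits
# (oldest first) onto a 6.12-based branch; expect conflicts in the
# RSS/LRO-touched areas of mtk_eth_soc.
commits=(
  92bfdbdd4de1   # change default rx buffer length
  b4c6cb1d5532   # add 9k jumbo frame support
  ca4c976dce22   # add dynamic rx buffer adjustment support
  a3207a383794   # try to fix crash the right way
)
RUN="${RUN:-echo}"   # dry-run by default; set RUN= to execute
$RUN git checkout -b 6.12-jumbo   # hypothetical branch name
for c in "${commits[@]}"; do
  $RUN git cherry-pick -x "$c"
done
```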
Thanks for the update. I've gotten those commits into my build and just managed to compile the kernel after fixing a few merge conflicts (as you mentioned, mostly for RSS and LRO). I'll figure out how to roll it out to test in the next few days.
Unfortunately, when I compiled and ran the updated kernel on my BPI, I could no longer talk to it over Ethernet while plugged into the 1 Gbps eth switch. It booted, though; does the change only work for the 10 Gbps SFP ports, or?
I don't have a UART either, so that may be the next thing I get in order to figure out what's wrong.
Hello everyone, I've committed 4 more commits to my repo (synced with Frank's latest 6.18-jumbo, at least as of a few days ago). The first 3 address the same issue as the
Last one should add some performance.
I didn't test it heavily yet, and didn't test the 4th commit at all.
Feel free to join the testing. I'm planning to do it by the end of the week. Will update with any results
Moderated: added link
Frank, thanks for adding the link, +1 commit
short description: these commits address the following issues:
- Global MTU config issue: changing the MTU on a single port can lead to DMA ring-buffer memory corruption on reads
- Coordinated MTU change on a port: as all ports share the same RX DMA buffers, stopping and re-initializing all ports is required for a safe MTU change
- Performance issues with mixed-MTU ports: after enabling jumbo frames on a single port, the whole system, including MTU-1500 ports, switches to jumbo-frame mode to stay consistent with the DMA ring buffers; this causes cache allocation/cleaning overhead and reduces single-flow RX performance on a 10G port from ~5G to ~2.2G
and
address the performance drop: with these 2 commits, the avg RX speed for a single flow is ~2.7G
before:
[ 5] 20.00-21.00 sec 256 MBytes 2.15 Gbits/sec 1 1.17 MBytes
[ 5] 21.00-22.00 sec 259 MBytes 2.17 Gbits/sec 0 1.33 MBytes
[ 5] 22.00-23.00 sec 260 MBytes 2.18 Gbits/sec 0 1.46 MBytes
[ 5] 23.00-24.00 sec 259 MBytes 2.17 Gbits/sec 0 1.59 MBytes
[ 5] 24.00-25.00 sec 260 MBytes 2.18 Gbits/sec 1 1.25 MBytes
[ 5] 25.00-26.00 sec 261 MBytes 2.19 Gbits/sec 0 1.39 MBytes
[ 5] 26.00-27.00 sec 260 MBytes 2.18 Gbits/sec 0 1.53 MBytes
[ 5] 27.00-28.00 sec 258 MBytes 2.16 Gbits/sec 3 1.17 MBytes
after:
[ 5] 9.00-10.00 sec 321 MBytes 2.69 Gbits/sec 0 1.40 MBytes
[ 5] 10.00-11.00 sec 328 MBytes 2.75 Gbits/sec 0 1.56 MBytes
[ 5] 11.00-12.00 sec 325 MBytes 2.73 Gbits/sec 26 1.24 MBytes
[ 5] 12.00-13.00 sec 318 MBytes 2.66 Gbits/sec 0 1.40 MBytes
[ 5] 13.00-14.00 sec 324 MBytes 2.72 Gbits/sec 0 1.56 MBytes
[ 5] 14.00-15.00 sec 329 MBytes 2.76 Gbits/sec 35 1.25 MBytes
[ 5] 15.00-16.00 sec 326 MBytes 2.74 Gbits/sec 0 1.43 MBytes
CAUTION: changing the MTU under heavy load (iperf) may halt the system
Imho the MTU should be set at system startup, which is currently broken because of the flow of enabling the MAC first (to set the max MTU), then setting the MTU, and then resetting the MAC again.
Could you add a description to the corresponding commit? I saw that you did many cleanups (removing spaces, linebreaks, etc.), so it is hard to find the real changes. Maybe do the cleanup first (separately).
Thanks for working on getting this in better shape!
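Until the driver handles the MTU at probe/startup time, the dance can at least be wired into the boot sequence. A sketch for an ifupdown-style setup (untested; interface names are the ones from this thread, and systemd-networkd or OpenWrt would need the equivalent hooks instead):

```text
# /etc/network/interfaces fragment (sketch, untested)
auto eth0
iface eth0 inet manual
    # workaround: the max MTU is only raised in mtk_open, so cycle
    # the link once before setting the jumbo MTU
    pre-up ip link set eth0 up && ip link set eth0 down
    pre-up ip link set eth0 mtu 9000
    up ip link set dev lan3 mtu 9000 up
```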
yes, it makes sense to split the cleanup and the functional changes
Sure, it will take some time
Consider it a draft version for now
If I understood correctly, the runtime MTU change halts under load due to DMA racing during the reset. We likely need to make sure that all buffers are flushed before changing the MTU.
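Until that race is fixed in the driver, a cautious userspace workaround is to quiesce every port that shares the RX rings before touching the MTU. A dry-run sketch (port names as on the BPI-R4 in this thread):

```shell
#!/bin/bash
# Cautious runtime MTU change: take down every port that shares the RX
# DMA rings, change the MTU on the conduit, then bring everything back
# up with the new MTU.
mac="eth0"
ports=(wan lan1 lan2 lan3)   # DSA user ports on the BPI-R4
mtu="${1:-9000}"
RUN="${RUN:-echo}"           # dry-run by default; set RUN= to execute

for p in "${ports[@]}"; do $RUN ip link set "$p" down; done
$RUN ip link set "$mac" down
$RUN ip link set "$mac" mtu "$mtu" up
for p in "${ports[@]}"; do $RUN ip link set "$p" mtu "$mtu" up; done
```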
+2
commit fc44b11c6870c9a3f8e87d2d8eb2c86105d8de56
commit af1b4adcffa8cfdc6781696a6f9f68f673000a68
These have improved the 10G port throughput from
to:
root@pve1:~# iperf3 -c 10.0.1.1
Connecting to host 10.0.1.1, port 5201
[ 5] local 10.0.1.10 port 52804 connected to 10.0.1.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 386 MBytes 3.24 Gbits/sec 69 1.40 MBytes
[ 5] 1.00-2.00 sec 391 MBytes 3.28 Gbits/sec 0 1.59 MBytes
[ 5] 2.00-3.00 sec 374 MBytes 3.14 Gbits/sec 8 1.31 MBytes
[ 5] 3.00-4.00 sec 362 MBytes 3.04 Gbits/sec 0 1.48 MBytes
[ 5] 4.00-5.00 sec 389 MBytes 3.26 Gbits/sec 15 1.19 MBytes
[ 5] 5.00-6.00 sec 368 MBytes 3.08 Gbits/sec 0 1.39 MBytes
[ 5] 6.00-7.00 sec 370 MBytes 3.10 Gbits/sec 0 1.56 MBytes
[ 5] 7.00-8.00 sec 379 MBytes 3.18 Gbits/sec 18 1.28 MBytes
[ 5] 8.00-9.00 sec 356 MBytes 2.99 Gbits/sec 0 1.44 MBytes
[ 5] 9.00-10.00 sec 381 MBytes 3.20 Gbits/sec 18 1.13 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 3.67 GBytes 3.15 Gbits/sec 128 sender
[ 5] 0.00-10.00 sec 3.66 GBytes 3.15 Gbits/sec receiver
with mixed MTU ports
perf top:
Samples: 421K of event 'cycles:P', 4000 Hz, Event count (approx.): 38034979025 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
23.85% [kernel] [k] __pi_dcache_clean_poc
15.10% [kernel] [k] __arch_copy_to_user
7.90% [kernel] [k] __pi_dcache_inval_poc
6.04% [kernel] [k] page_pool_alloc_pages
3.87% [kernel] [k] _copy_to_iter
3.30% [kernel] [k] gro_receive_skb
2.40% [kernel] [k] default_idle_call
1.95% [kernel] [k] mtk_poll_rx
1.84% [kernel] [k] finish_task_switch.isra.0
It shows that ~24% of the CPU time on RX is still spent in cache cleaning for DMA (__pi_dcache_clean_poc).
changed: typos fixed
Iāve renamed the branch to 6.18-jumbo-sandbox
Iāll continue here.
At this point iāll appreciate any testing. especially on MTU-9000 enabled port, as in my current config iām unable to test it any time soon.
The expected performance for RX should be close to wire-speed.
@frank-w I've committed a few additional patches.
It's still unsafe to change the MTU during high load (on any port, even when the target port is down); everything else works pretty well for me (with the described performance trade-off). Is there a chance you can test it with jumbo-frames traffic?
I hope I can take a look on the weekend… currently working on the R4Pro (preparing the mxl switch PR for OpenWrt)
made a small test, but the annoying configuration steps are still needed
and when sending (only R4 => laptop) there are many retransmits which were imho not there before
root@bpi-r4-v11:~
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
3: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 86:bb:71:05:51:34 brd ff:ff:ff:ff:ff:ff
5: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 06:64:14:99:27:35 brd ff:ff:ff:ff:ff:ff
6: wan@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
7: lan1@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
8: lan2@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
9: lan3@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
root@bpi-r4-v11:~
# ip link set dev lan3 mtu 9000
RTNETLINK answers: Numerical result out of range
root@bpi-r4-v11:~
# ip link set dev eth0 mtu 9000
Error: mtu greater than device maximum.
root@bpi-r4-v11:~
# ip link set eth0 up
[ 102.451800] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 102.459947] mtk_soc_eth 15100000.ethernet eth0: mtk_open: set max-mtu of mac #0 to 9190 (9K+XGMII)
[ 102.459974] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control off
root@bpi-r4-v11:~
# ip link set eth0 down
[ 114.419842] mtk_soc_eth 15100000.ethernet eth0: Link is Down
root@bpi-r4-v11:~
# ip link set dev eth0 mtu 9000
root@bpi-r4-v11:~
# ip link set dev lan3 mtu 9000
root@bpi-r4-v11:~
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
3: eth0: <BROADCAST,MULTICAST> mtu 9004 qdisc mq state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 86:bb:71:05:51:34 brd ff:ff:ff:ff:ff:ff
5: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 06:64:14:99:27:35 brd ff:ff:ff:ff:ff:ff
6: wan@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
7: lan1@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
8: lan2@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
9: lan3@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 9000 qdisc noop state DOWN group default qlen 1000
link/ether 72:d7:04:8f:23:3a brd ff:ff:ff:ff:ff:ff
root@bpi-r4-v11:~
# uname -a
Linux bpi-r4-v11 6.18.0-rc1-bpi-r4-jumbo-sandbox #14 SMP Sat Jan 31 18:02:56 CET 2026 aarch64 GNU/Linux
root@bpi-r4-v11:~
# ip link set dev lan3 mtu 9000 up
[ 247.259304] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 247.267454] mtk_soc_eth 15100000.ethernet eth0: mtk_open: set max-mtu of mac #0 to 9190 (9K+XGMII)
[ 247.267491] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control off
[ 247.276570] mt7530-mmio 15020000.switch lan3: configuring for phy/internal link mode
[ 247.293435] mt7530-mmio 15020000.switch lan3: Link is Up - 1Gbps/Full - flow control rx/tx
root@bpi-r4-v11:~
# ip a a 192.168.90.1/24 dev lan3
root@bpi-r4-v11:~
# ping -M do -s 8972 192.168.90.1
PING 192.168.90.1 (192.168.90.1) 8972(9000) bytes of data.
8980 bytes from 192.168.90.1: icmp_seq=1 ttl=64 time=0.101 ms
8980 bytes from 192.168.90.1: icmp_seq=2 ttl=64 time=0.103 ms
8980 bytes from 192.168.90.1: icmp_seq=3 ttl=64 time=0.097 ms
8980 bytes from 192.168.90.1: icmp_seq=4 ttl=64 time=0.109 ms
8980 bytes from 192.168.90.1: icmp_seq=5 ttl=64 time=0.103 ms
^C
--- 192.168.90.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4155ms
rtt min/avg/max/mdev = 0.097/0.102/0.109/0.004 ms
root@bpi-r4-v11:~
# iperf3 -c 192.168.90.2
Connecting to host 192.168.90.2, port 5201
[ 5] local 192.168.90.1 port 40100 connected to 192.168.90.2 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 117 MBytes 979 Mbits/sec 46 489 KBytes
[ 5] 1.00-2.00 sec 118 MBytes 987 Mbits/sec 121 507 KBytes
[ 5] 2.00-3.00 sec 117 MBytes 983 Mbits/sec 115 524 KBytes
[ 5] 3.00-4.00 sec 116 MBytes 976 Mbits/sec 115 481 KBytes
[ 5] 4.00-5.00 sec 114 MBytes 953 Mbits/sec 94 428 KBytes
[ 5] 5.00-6.00 sec 117 MBytes 986 Mbits/sec 94 454 KBytes
[ 5] 6.00-7.00 sec 118 MBytes 989 Mbits/sec 115 472 KBytes
[ 5] 7.00-8.00 sec 117 MBytes 982 Mbits/sec 93 533 KBytes
[ 5] 8.00-9.00 sec 117 MBytes 986 Mbits/sec 119 227 KBytes
[ 5] 9.00-10.00 sec 117 MBytes 985 Mbits/sec 114 507 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.14 GBytes 980 Mbits/sec 1026 sender
[ 5] 0.00-10.00 sec 1.14 GBytes 979 Mbits/sec receiver
iperf Done.
root@bpi-r4-v11:~
# iperf3 -c 192.168.90.2 -R
Connecting to host 192.168.90.2, port 5201
Reverse mode, remote host 192.168.90.2 is sending
[ 5] local 192.168.90.1 port 51038 connected to 192.168.90.2 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 117 MBytes 984 Mbits/sec
[ 5] 1.00-2.00 sec 118 MBytes 987 Mbits/sec
[ 5] 2.00-3.00 sec 118 MBytes 990 Mbits/sec
[ 5] 3.00-4.00 sec 117 MBytes 983 Mbits/sec
[ 5] 4.00-5.00 sec 117 MBytes 979 Mbits/sec
[ 5] 5.00-6.00 sec 118 MBytes 990 Mbits/sec
[ 5] 6.00-7.00 sec 117 MBytes 982 Mbits/sec
[ 5] 7.00-8.00 sec 118 MBytes 990 Mbits/sec
[ 5] 8.00-9.00 sec 118 MBytes 990 Mbits/sec
[ 5] 9.00-10.00 sec 118 MBytes 990 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.15 GBytes 989 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.15 GBytes 986 Mbits/sec receiver
iperf Done.
Could you try to turn off tso and gso and see if anything changes?
ethtool -K eth0 tso off gso off
Iāve ordered a 10G laptop NIC - will conduct some tests after itās delivered.
UPD:
My test on a 10G mtu-9000 port:
bpi-r4 ~ # ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 172.16.0.2 netmask 255.255.255.0 broadcast 172.16.0.255
inet6 fe80::b8bc:94ff:fe68:76f2 prefixlen 64 scopeid 0x20<link>
ether ba:bc:94:68:76:f2 txqueuelen 1000 (Ethernet)
RX packets 54039502 bytes 307146278665 (286.0 GiB)
RX errors 0 dropped 23 overruns 0 frame 0
TX packets 46428135 bytes 239507656697 (223.0 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 102
bpi-r4 ~ #
bpi-r4 ~ # iperf3 -c 172.16.0.1
^C- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
iperf3: interrupt - the client has terminated by signal Interrupt(2)
bpi-r4 ~ # iperf3 -c 172.16.0.1
Connecting to host 172.16.0.1, port 5201
[ 5] local 172.16.0.2 port 53726 connected to 172.16.0.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 1.15 GBytes 9.84 Gbits/sec 0 2.10 MBytes
[ 5] 1.00-2.00 sec 1.15 GBytes 9.89 Gbits/sec 0 2.10 MBytes
[ 5] 2.00-3.00 sec 1.15 GBytes 9.91 Gbits/sec 0 2.10 MBytes
[ 5] 3.00-4.00 sec 1.15 GBytes 9.89 Gbits/sec 0 2.10 MBytes
[ 5] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec 0 2.10 MBytes
[ 5] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec 0 2.10 MBytes
[ 5] 6.00-7.00 sec 1.15 GBytes 9.89 Gbits/sec 0 2.10 MBytes
[ 5] 7.00-8.00 sec 1.15 GBytes 9.91 Gbits/sec 0 2.10 MBytes
[ 5] 8.00-9.00 sec 1.15 GBytes 9.89 Gbits/sec 0 2.10 MBytes
[ 5] 9.00-10.00 sec 1.15 GBytes 9.89 Gbits/sec 0 2.10 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 11.5 GBytes 9.89 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec receiver
iperf Done.
bpi-r4 ~ # iperf3 -c 172.16.0.1 -R
Connecting to host 172.16.0.1, port 5201
Reverse mode, remote host 172.16.0.1 is sending
[ 5] local 172.16.0.2 port 55804 connected to 172.16.0.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 1.14 GBytes 9.80 Gbits/sec
[ 5] 1.00-2.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 2.00-3.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 3.00-4.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 6.00-7.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 7.00-8.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 8.00-9.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 9.00-10.00 sec 1.15 GBytes 9.90 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec sender
[ 5] 0.00-10.00 sec 11.5 GBytes 9.89 Gbits/sec receiver
iperf Done.
bpi-r4 ~ #
No retransmits on the 10G port; it seems to run at wire speed.
tx perf top:
Samples: 63K of event 'cycles:P', 4000 Hz, Event count (approx.): 10598656413 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
19.88% [kernel] [k] __pi_dcache_clean_poc
17.42% [kernel] [k] __arch_copy_from_user
2.35% [kernel] [k] default_idle_call
2.21% [kernel] [k] handle_softirqs
1.52% [kernel] [k] finish_task_switch.isra.0
1.33% [kernel] [k] dma_map_page_attrs
1.21% [kernel] [k] __free_frozen_pages
1.20% [kernel] [k] mtk_start_xmit
1.04% [kernel] [k] tcp_sendmsg_locked
0.97% [kernel] [k] dma_unmap_phys
0.87% [kernel] [k] fq_codel_dequeue
0.85% [kernel] [k] __pi_dcache_inval_poc
0.81% [kernel] [k] el0_svc
0.78% [kernel] [k] mtk_poll_rx
0.75% [kernel] [k] get_page_from_freelist
0.71% [nf_conntrack] [k] nf_conntrack_tcp_packet
0.63% [kernel]
top:
top - 23:23:50 up 6 days, 23:44, 2 users, load average: 0.67, 0.33, 0.18
Tasks: 126 total, 1 running, 125 sleep, 0 d-sleep, 0 stopped, 0 zombie
%Cpu0 : 1.3 us, 20.9 sy, 0.0 ni, 76.7 id, 0.3 wa, 0.3 hi, 0.3 si, 0.0 st
%Cpu1 : 0.0 us, 19.9 sy, 0.0 ni, 79.5 id, 0.0 wa, 0.3 hi, 0.3 si, 0.0 st
%Cpu2 : 0.3 us, 1.3 sy, 0.0 ni, 58.7 id, 0.0 wa, 1.7 hi, 38.0 si, 0.0 st
%Cpu3 : 0.0 us, 2.7 sy, 0.0 ni, 74.9 id, 0.0 wa, 1.0 hi, 21.4 si, 0.0 st
MiB Mem : 3927.7 total, 726.6 free, 368.1 used, 2888.1 buff/cache
MiB Swap: 8192.0 total, 8189.8 free, 2.2 used. 3559.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4193 root 20 0 41472 3892 3172 S 40.2 0.1 4:14.73 iperf3
1823 root 20 0 15320 6308 4564 S 0.7 0.2 0:00.46 sshd-session
4152 root 20 0 14660 6680 5404 S 0.7 0.2 16:02.51 hostapd
32 root 20 0 0 0
rx perf top:
Samples: 174K of event 'cycles:P', 4000 Hz, Event count (approx.): 12894998253 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
19.93% [kernel] [k] __arch_copy_to_user
12.12% [kernel] [k] __pi_dcache_clean_poc
11.81% [kernel] [k] __pi_dcache_inval_poc
3.29% [kernel] [k] finish_task_switch.isra.0
3.23% [kernel] [k] default_idle_call
2.46% [kernel] [k] mtk_poll_rx
2.13% [kernel] [k] _copy_to_iter
1.67% [kernel] [k] gro_receive_skb
1.18% [kernel] [k] page_pool_alloc_pages
0.84% [kernel] [k] handle_softirqs
0.76% [kernel] [k] el0_svc
0.70% [kernel] [k] __check_object_size
0.62% [kernel] [k] dev_gro_receive
0.61% [kernel] [k] tick_nohz_idle_exit
0.56% [nf_conntrack] [k] nf_conntrack_tcp_packet
0.55% [kernel] [k] page_pool_put_unrefed_netmem
0.47% [kernel] [k] __pi_memset_generic
0.45% [kernel] [k] mtk_start_xmit
0.45% [kernel]
top:
top - 23:23:03 up 6 days, 23:43, 2 users, load average: 0.46, 0.24, 0.14
Tasks: 126 total, 1 running, 125 sleep, 0 d-sleep, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 8.2 sy, 0.0 ni, 91.4 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 5.5 sy, 0.0 ni, 19.2 id, 0.0 wa, 1.0 hi, 74.3 si, 0.0 st
%Cpu2 : 0.7 us, 26.4 sy, 0.0 ni, 67.8 id, 0.0 wa, 2.1 hi, 3.1 si, 0.0 st
%Cpu3 : 1.0 us, 25.3 sy, 0.0 ni, 73.0 id, 0.0 wa, 0.7 hi, 0.0 si, 0.0 st
MiB Mem : 3927.7 total, 722.2 free, 372.6 used, 2888.0 buff/cache
MiB Swap: 8192.0 total, 8189.8 free, 2.2 used. 3555.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4193 root 20 0 41472 3892 3172 S 60.6 0.1 3:48.74 iperf3
22 root 20 0 0 0 0 S 0.7 0.0 0:14.36 ksoftirqd/1
2089 root 20 0 0 0 0 S 0.3 0.0 9:01.65 napi/phy0-0
2090 root 20 0 0 0 0 S 0.3 0.0 10:05.01 napi/phy0-0
4025 named 20 0 986484 35044 6928 S 0.3 0.9 13:28.81 named
4124 root 20 0 14656
Iāve cleaned and refactored the commits from sandbox branch.
Not tested yet, building in progress.
Will update after testing.
P.S. no retransmit issue was addressed yet.
UPD. the kernel seems to be functional, no diiiference with previous tests spotted.
For anyone interested in trying out jumbo frames, here is a test script you can run on a single machine to test a connection using MTU 9000.
You'll need to remove control of the 2 interfaces from any manager like systemd-networkd or NetworkManager… Connect the 2 interfaces with real copper, then edit intf1 and intf2 in the script.
When the first argument is veth, you can test the script using a veth device pair instead (if veth support is enabled in your kernel).
#!/bin/bash
# Run with sudo
mode="$1"   # remember the mode; "$1" inside a function would refer to the function's own args
if [[ "$mode" == "veth" ]]; then
    ip link add name veth1a type veth peer name veth1b
    intf1="veth1a"
    intf2="veth1b"
else
    intf1="eth1"
    intf2="enu1u2c2"
fi
mtu="9000"
cleanup() {
    kill -9 $(pidof iperf3) 2> /dev/null
    if [[ "$mode" == "veth" ]]; then
        ip -net ns1 link set $intf1 down
        ip -net ns2 link set $intf2 down
        ip -net ns1 link del name $intf1
    fi
    ip netns delete ns1
    ip netns delete ns2
    echo
}
trap cleanup EXIT
# NS1
ip netns add ns1
ip link set $intf1 netns ns1
# cleanup from any previous attempt
ip -net ns1 link set $intf1 down
ip -net ns1 link set dev $intf1 mtu 1500
ip -net ns1 link set dev $intf1 nomaster 2> /dev/null
ip -net ns1 route del default 2> /dev/null
ip -net ns1 addr del 192.168.22.1/24 dev $intf1 2> /dev/null
ip -net ns1 link set $intf1 up mtu $mtu
ip -net ns1 addr add 192.168.22.1/24 broadcast 192.168.22.255 dev $intf1
ip -net ns1 route add default via 192.168.22.2 dev $intf1
# NS2
ip netns add ns2
ip link set $intf2 netns ns2
# cleanup from any previous attempt
ip -net ns2 link set $intf2 down
ip -net ns2 link set dev $intf2 mtu 1500
ip -net ns2 link set dev $intf2 nomaster 2> /dev/null
ip -net ns2 route del default 2> /dev/null
ip -net ns2 addr del 192.168.22.2/24 dev $intf2 2> /dev/null
ip -net ns2 link set $intf2 up mtu $mtu
ip -net ns2 addr add 192.168.22.2/24 broadcast 192.168.22.255 dev $intf2
ip -net ns2 route add default via 192.168.22.1 dev $intf2
while ! ip -net ns1 link show dev $intf1 up 2>/dev/null | grep -q "state UP"
do sleep 0.2; done
while ! ip -net ns2 link show dev $intf2 up 2>/dev/null | grep -q "state UP"
do sleep 0.2; done
ip -net ns1 a show dev $intf1
ip -net ns1 r
ip -net ns2 a show dev $intf2
ip -net ns2 r
echo GO!
ip netns exec ns2 ping -l 10 -c 3 192.168.22.1
ip netns exec ns1 iperf3 -s &
ip netns exec ns2 iperf3 -c 192.168.22.1 --bidir -t 2
Result:
#sudo ./mtutest.sh veth
168: veth1a@if167: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether 9e:a9:8a:2a:ba:38 brd ff:ff:ff:ff:ff:ff link-netns ns2
inet 192.168.22.1/24 brd 192.168.22.255 scope global veth1a
valid_lft forever preferred_lft forever
inet6 fe80::9ca9:8aff:fe2a:ba38/64 scope link tentative proto kernel_ll
valid_lft forever preferred_lft forever
default via 192.168.22.2 dev veth1a
192.168.22.0/24 dev veth1a proto kernel scope link src 192.168.22.1
167: veth1b@if168: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether 82:ef:f7:5e:61:66 brd ff:ff:ff:ff:ff:ff link-netns ns1
inet 192.168.22.2/24 brd 192.168.22.255 scope global veth1b
valid_lft forever preferred_lft forever
inet6 fe80::80ef:f7ff:fe5e:6166/64 scope link tentative proto kernel_ll
valid_lft forever preferred_lft forever
default via 192.168.22.1 dev veth1b
192.168.22.0/24 dev veth1b proto kernel scope link src 192.168.22.2
GO!
PING 192.168.22.1 (192.168.22.1) 56(84) bytes of data.
64 bytes from 192.168.22.1: icmp_seq=1 ttl=64 time=0.039 ms
64 bytes from 192.168.22.1: icmp_seq=2 ttl=64 time=0.014 ms
64 bytes from 192.168.22.1: icmp_seq=3 ttl=64 time=0.015 ms
--- 192.168.22.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.014/0.022/0.039/0.011 ms, pipe 3
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Connecting to host 192.168.22.1, port 5201
Accepted connection from 192.168.22.2, port 44594
[ 5] local 192.168.22.1 port 5201 connected to 192.168.22.2 port 44602
[ 5] local 192.168.22.2 port 44602 connected to 192.168.22.1 port 5201
[ 7] local 192.168.22.2 port 44616 connected to 192.168.22.1 port 5201
[ 8] local 192.168.22.1 port 5201 connected to 192.168.22.2 port 44616
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 8.20 GBytes 70.4 Gbits/sec 1 1.60 MBytes
[ 7][RX-C] 0.00-1.03 sec 5.79 GBytes 48.4 Gbits/sec
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][RX-S] 0.00-1.00 sec 8.20 GBytes 70.4 Gbits/sec
[ 8][TX-S] 0.00-1.00 sec 5.66 GBytes 48.6 Gbits/sec 0 1.37 MBytes
[ 5][RX-S] 1.00-2.00 sec 8.14 GBytes 69.9 Gbits/sec
[ 8][TX-S] 1.00-2.00 sec 6.19 GBytes 53.2 Gbits/sec 0 1.37 MBytes
[ 5][RX-S] 2.00-2.00 sec 512 KBytes 6.05 Gbits/sec
[ 8][TX-S] 2.00-2.00 sec 0.00 Bytes 0.00 bits/sec 0 1.37 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][RX-S] 0.00-2.00 sec 16.3 GBytes 70.1 Gbits/sec receiver
[ 8][TX-S] 0.00-2.00 sec 12.4 GBytes 53.2 Gbits/sec 0 sender
[ 5][TX-C] 1.00-2.00 sec 7.89 GBytes 67.7 Gbits/sec 0 1.29 MBytes
[ 7][RX-C] 1.03-2.00 sec 6.61 GBytes 58.3 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-2.00 sec 16.3 GBytes 70.1 Gbits/sec 1 sender
[ 5][TX-C] 0.00-2.00 sec 16.3 GBytes 70.1 Gbits/sec receiver
[ 7][RX-C] 0.00-2.00 sec 12.4 GBytes 53.2 Gbits/sec 0 sender
[ 7][RX-C] 0.00-2.00 sec 12.4 GBytes 53.2 Gbits/sec receiver
iperf Done.
-----------------------------------------------------------
Server listening on 5201 (test #2)
-----------------------------------------------------------
It is a lot easier to test with this script than using 2 clients.
You could use eth1 - eth2, or some USB Ethernet adapter, e.g. enu1u2c2 - eth1.
You could even connect eth1 - eth2 with a DAC cable and compare the speed at MTU 1500 vs 9000.
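When comparing such runs, the iperf3 averages can be reduced to a single ratio. A trivial helper (the default numbers are just the ~2.18 and ~3.15 Gbit/s single-flow figures quoted earlier in this thread, not new measurements):

```shell
#!/bin/bash
# Reduce two iperf3 average bitrates (Gbits/sec) to a speedup ratio,
# e.g. before/after a patch, or MTU 1500 vs 9000.
rate_a="${1:-2.18}"   # first run  (e.g. before / MTU 1500)
rate_b="${2:-3.15}"   # second run (e.g. after  / MTU 9000)
speedup=$(awk -v a="$rate_a" -v b="$rate_b" 'BEGIN { printf "%.2f", b / a }')
echo "speedup: ${speedup}x"
```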