Can you explain how you set up iperf (server config) and your client command?
root@bpi-r2:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.0.42 port 5001 connected with 192.168.0.21 port 50830
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.08 GBytes 931 Mbits/sec
running on client:
frank@frank-N56VZ:~
[19:05:12]$ iperf -c 192.168.0.42
------------------------------------------------------------
Client connecting to 192.168.0.42, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.21 port 50830 connected with 192.168.0.42 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.08 GBytes 931 Mbits/sec
Info on the R2 (kernel + IP address):
root@bpi-r2:~# uname -r
4.19.4-bpi-r2-testing
root@bpi-r2:~# ip addr show wan
5: wan@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 8a:dc:3b:8b:15:ed brd ff:ff:ff:ff:ff:ff
inet 192.168.0.42/24 brd 192.168.0.255 scope global wan
valid_lft forever preferred_lft forever
inet6 fe80::88dc:3bff:fe8b:15ed/64 scope link
valid_lft forever preferred_lft forever
Looks like you've done everything right.
But I'm not sure that both directions are tested; anyway, you can switch client and server side if there are no options for choosing the traffic direction.
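(With iperf3 you can also flip the direction from the client side with -R instead of swapping roles; a minimal sketch using the addresses from above:)
iperf3 -s                    # on the R2 (192.168.0.42)
iperf3 -c 192.168.0.42       # on the client: tests client -> R2
iperf3 -c 192.168.0.42 -R    # on the client: tests R2 -> client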
I have done it in the opposite direction and also over lan0; speed is comparable to the above results.
I never did find a solution to this issue other than to use some external USB NICs, which is not ideal by any stretch and sort of obviates the reason for choosing the R2 in the first place.
I have the same issue: throwing packets at the Pi is fine, but if the Pi is generating packets, even with a simple script that just kicks out zeros, it is terribly slow and the error rate is very high.
Myshob, sunarowicz, malvcr: have you captured packets on both the Pi side and the client side? Do you see a lot of retransmissions?
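(For example, capturing on both ends during an iperf run and comparing afterwards; the client interface name is a placeholder, and port 5001 is the iperf2 default:)
tcpdump -i wan -w r2-side.pcap port 5001       # on the R2
tcpdump -i eth0 -w client-side.pcap port 5001  # on the client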
I also have an issue with the BPi-R2 and network speed.
If I make tests with iperf, everything is fine. If I connect the R2 behind my provider's router and measure with https://fast.com, I reach around 200 Mbit/s (it should be 400 Mbit/s, which I do get when connecting directly to my provider's router, but okay). If I connect more devices to all four ports of the R2, the speed measured with https://fast.com drops below 1 Mbit/s. If I do exactly the same with a BPi-R1, I reach around 200 Mbit/s on https://fast.com even when the other devices are connected. During the tests with the R1 and R2, the other devices were an Odroid-C2 and a Raspberry Pi Zero, both running shairport-sync playing multiroom audio from a MacBook Pro.
Kernel used:
- R2: 4.14.88-bpi-r2-main (frank-w)
- R1: 4.14.55-1-ARCH
Both are running Arch Linux and using arno-iptables-firewall and dnsmasq. All LAN ports are in one bridge.
@frank-w do you use the R2 as router or are you only doing some tests with it?
I use the R2 (1 of 2) as a router, but I have only ADSL2+ with 12 Mbit/s.
Tested again with 4.19 and my gmac patches:
from the R2 the speed is ~940 Mbit/s, to the R2 I get only ~700 Mbit/s.
Now I tested with iperf3 from the R2 (first normal, second with -R; output from my laptop/client):
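(The client side on the R2 was presumably invoked like this for each interface, first normal, then reversed; 192.168.0.21 is the laptop running iperf3 -s:)
iperf3 -c 192.168.0.21
iperf3 -c 192.168.0.21 -R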
wan (eth1):
[18:46:48]$ iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.19, port 49282
[ 5] local 192.168.0.21 port 5201 connected to 192.168.0.19 port 49284
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 107 MBytes 895 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 939 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 939 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 939 Mbits/sec
[ 5] 4.00-5.00 sec 112 MBytes 939 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 939 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 939 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 939 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 939 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 939 Mbits/sec
[ 5] 10.00-10.05 sec 5.32 MBytes 939 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.05 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.05 sec 1.09 GBytes 935 Mbits/sec receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.19, port 49286
[ 5] local 192.168.0.21 port 5201 connected to 192.168.0.19 port 49288
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 5] 0.00-1.00 sec 102 MBytes 851 Mbits/sec 0 576 KBytes
[ 5] 1.00-2.00 sec 111 MBytes 934 Mbits/sec 0 576 KBytes
[ 5] 2.00-3.00 sec 111 MBytes 933 Mbits/sec 0 576 KBytes
[ 5] 3.00-4.00 sec 80.3 MBytes 674 Mbits/sec 0 771 KBytes
[ 5] 4.00-5.00 sec 73.0 MBytes 613 Mbits/sec 0 771 KBytes
[ 5] 5.00-6.00 sec 73.5 MBytes 617 Mbits/sec 0 771 KBytes
[ 5] 6.00-7.00 sec 73.5 MBytes 617 Mbits/sec 0 771 KBytes
[ 5] 7.00-8.00 sec 73.5 MBytes 617 Mbits/sec 0 771 KBytes
[ 5] 8.00-9.00 sec 73.5 MBytes 617 Mbits/sec 0 771 KBytes
[ 5] 9.00-10.00 sec 73.5 MBytes 616 Mbits/sec 0 771 KBytes
[ 5] 10.00-10.04 sec 2.49 MBytes 523 Mbits/sec 0 771 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 5] 0.00-10.04 sec 847 MBytes 708 Mbits/sec 0 sender
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec receiver
lan0 (eth0):
Accepted connection from 192.168.0.19, port 49290
[ 5] local 192.168.0.21 port 5201 connected to 192.168.0.19 port 49292
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 107 MBytes 894 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 939 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 939 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 939 Mbits/sec
[ 5] 4.00-5.00 sec 112 MBytes 939 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 939 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 939 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 939 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 939 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 939 Mbits/sec
[ 5] 10.00-10.05 sec 5.27 MBytes 935 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.05 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.05 sec 1.09 GBytes 935 Mbits/sec receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.19, port 49294
[ 5] local 192.168.0.21 port 5201 connected to 192.168.0.19 port 49296
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 5] 0.00-1.00 sec 106 MBytes 887 Mbits/sec 0 143 KBytes
[ 5] 1.00-2.00 sec 85.0 MBytes 713 Mbits/sec 0 583 KBytes
[ 5] 2.00-3.00 sec 73.5 MBytes 617 Mbits/sec 0 583 KBytes
[ 5] 3.00-4.00 sec 73.5 MBytes 617 Mbits/sec 0 583 KBytes
[ 5] 4.00-5.00 sec 73.5 MBytes 617 Mbits/sec 0 583 KBytes
[ 5] 5.00-6.00 sec 74.0 MBytes 621 Mbits/sec 0 583 KBytes
[ 5] 6.00-7.00 sec 73.3 MBytes 614 Mbits/sec 0 583 KBytes
[ 5] 7.00-8.00 sec 73.5 MBytes 617 Mbits/sec 0 583 KBytes
[ 5] 8.00-9.00 sec 73.4 MBytes 616 Mbits/sec 0 583 KBytes
[ 5] 9.00-10.00 sec 73.4 MBytes 615 Mbits/sec 0 583 KBytes
[ 5] 10.00-10.05 sec 3.01 MBytes 537 Mbits/sec 0 583 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 5] 0.00-10.05 sec 782 MBytes 653 Mbits/sec 0 sender
[ 5] 0.00-10.05 sec 0.00 Bytes 0.00 bits/sec receiver
Today I did more tests.
As long as I connect only 1 device to the LAN ports (no matter which one), I get ~200 Mbit/s with https://fast.com.
If I connect more devices to the other LAN ports, even if these devices do not do any network communication, the data rate goes down to 3 Mbit/s or even only some kbit/s; the more devices I connect, the worse it gets.
-
So I tried not using arno-iptables-firewall and only activated forwarding and a masquerade rule, nothing else: same result.
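(A minimal sketch of what "forwarding and a masquerade rule" typically looks like; the exact rules used here are my assumption:)
sysctl -w net.ipv4.ip_forward=1                       # enable IPv4 forwarding
iptables -P FORWARD ACCEPT                            # allow forwarded traffic
iptables -t nat -A POSTROUTING -o wan -j MASQUERADE   # masquerade outbound on wan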
-
Then I thought maybe the bridge is the problem, so I removed the bridge and configured every LAN port with its own subnet: same result.
Then I thought maybe the MAC addresses cause an issue, so I tried 2 different things (see the sketch after this list):
- Give the same MAC address to eth1 and wan, like it is by default: same result
- Give every interface (eth0, eth1, lan0-lan3, wan) a different MAC address: same result
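(For reference, assigning a per-interface MAC can be done like this; the address shown is a placeholder from the locally administered range:)
ip link set dev lan0 down
ip link set dev lan0 address 02:00:00:00:00:01
ip link set dev lan0 up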
So I have no idea what is causing this. The BPI-R2 is not usable as a router for my personal setup.
EDIT: tested with kernel 4.14.92_main, self-compiled and from frank's GitHub releases.
Important info: this issue only appears if more than 1 LAN port is connected, as my tests were only over one lan/wan port or from lan to wan.
On my main router I also have only 1 LAN port connected, because this LAN is a trunk to a managed switch where my other devices are connected.
@Ryder.Lee / @moore / @linkerosa / @Jackzeng: can the bpi-r2 operate without the dsa-driver in 4.14+ like it works in 4.4 (eth1 fixed to wan, eth0 to the lan ports)?
As I mentioned in the other thread, I'm getting slow speeds. I'll put my network configuration details in here.
My main router is a Netgear WNDR3800 running OpenWRT. I was testing out the Banana Pi R2, hoping it could replace the Netgear. The Netgear has an internal IP address of 192.168.15.1. It connects to my ISP using PPPoE. I posted details of its configuration on the OpenWRT board.
One of the LAN ports of my router is connected to a switch. I am running the test between 2 computers. One of the computers, mihoshi, is connected to that switch and has an IP of 192.168.15.2. Because my main internet connection goes through PPPoE, mihoshi's MTU is 1492.
The wan port of the Banana Pi R2 is connected to the switch. The other computer, ryoko, is connected to lan3 of the Banana Pi R2. The Banana Pi is performing NAT translation between the two computers. I've set ryoko's MTU to 1492 to match mihoshi's.
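(Matching the MTU was presumably done with something like this; enp4s0 is ryoko's interface as shown further below:)
ip link set dev enp4s0 mtu 1492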
Here's how I'm measuring the speed:
$ ssh [email protected] 'cat /dev/zero' | pv >/dev/null
^C02GiB 0:01:29 [37.3MiB/s]
I think mihoshi's and ryoko's network settings are correct, because the same test gets about 90 MiB/s when I use an EspressoBin instead of the Banana Pi.
Here is the configuration of my banana pi:
root@bpi-r2:~# uname -a
Linux bpi-r2 4.14.80-bpi-r2-main #177 SMP Sun Nov 11 10:03:58 CET 2018 armv7l GNU/Linux
root@bpi-r2:~# iptables-save
# Generated by iptables-save v1.6.0 on Sun Jan 20 09:26:23 2019
*filter
:INPUT ACCEPT [8827:730234]
:FORWARD ACCEPT [54709:25786012]
:OUTPUT ACCEPT [9931:1198866]
COMMIT
# Completed on Sun Jan 20 09:26:23 2019
# Generated by iptables-save v1.6.0 on Sun Jan 20 09:26:23 2019
*nat
:PREROUTING ACCEPT [6742:849916]
:INPUT ACCEPT [3444:241724]
:OUTPUT ACCEPT [2492:166578]
:POSTROUTING ACCEPT [11:1615]
-A POSTROUTING -o wan -j MASQUERADE
COMMIT
# Completed on Sun Jan 20 09:26:23 2019
root@bpi-r2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
inet6 fe80::3c8c:11ff:fe69:298e/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:b1:52:1b:25:98 brd ff:ff:ff:ff:ff:ff
inet6 fe80::50b1:52ff:fe1b:2598/64 scope link
valid_lft forever preferred_lft forever
4: wan@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 10000
link/ether 52:b1:52:1b:25:98 brd ff:ff:ff:ff:ff:ff
inet 192.168.15.140/24 brd 192.168.15.255 scope global wan
valid_lft forever preferred_lft forever
inet6 2602:ae:1592:e100:50b1:52ff:fe1b:2598/64 scope global mngtmpaddr dynamic
valid_lft forever preferred_lft forever
inet6 fe80::50b1:52ff:fe1b:2598/64 scope link
valid_lft forever preferred_lft forever
5: lan0@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
6: lan1@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
7: lan2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
8: lan3@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 3e:8c:11:69:29:8e brd ff:ff:ff:ff:ff:ff
inet 192.168.2.1/24 brd 192.168.2.255 scope global br0
valid_lft forever preferred_lft forever
inet6 fe80::3c8c:11ff:fe69:298e/64 scope link
valid_lft forever preferred_lft forever
root@bpi-r2:~# lsmod
Module Size Used by
iptable_filter 16384 0
mtkhnat 24576 0
ipt_MASQUERADE 16384 1
nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE
iptable_nat 16384 1
nf_conntrack_ipv4 16384 2
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_nat_ipv4 16384 1 iptable_nat
nf_nat 32768 2 nf_nat_masquerade_ipv4,nf_nat_ipv4
nf_conntrack 126976 5 nf_conntrack_ipv4,ipt_MASQUERADE,nf_nat_masquerade_ipv4,nf_nat_ipv4,nf_nat
bridge 151552 0
mtk_thermal 16384 0
thermal_sys 61440 1 mtk_thermal
spi_mt65xx 20480 0
pwm_mediatek 16384 0
mt6577_auxadc 16384 0
nvmem_mtk_efuse 16384 0
mtk_pmic_keys 16384 0
rtc_mt6397 16384 1
ip_tables 24576 2 iptable_filter,iptable_nat
x_tables 28672 3 ip_tables,iptable_filter,ipt_MASQUERADE
ipv6 409600 23 bridge
Since you said you had difficulty reproducing this, I ran a speed test to the internet from ryoko. Strangely, I am getting 355Mb/s download and 308Mb/s upload, which is much higher than I expected considering the problems talking to a much closer computer. When ryoko connects directly to the switch, it gets 553Mb/s download and 451Mb/s upload.
netgear (main router): 192.168.15.1, connected to wan, right?
bpi-r2: wan: 192.168.15.140/24, br0 (all lan-ports): 192.168.2.1/24, NAT on wan
mihoshi (client/ssh-target): 192.168.15.2 => seems to be wrong
ryoko (other client): x.x.x.x ?
Your LANs are bridged together to br0 with 192.168.2.1/24, so your mihoshi & ryoko should have 192.168.2.x/24.
From which system do you make the ssh? I guess from the R2 to your client… then the bottleneck could be the R2's CPU.
Ryoko is using DHCP from the Banana Pi. Currently it has address 192.168.2.135. The wan port of the Banana Pi connects to a switch. Mihoshi does not connect to the Banana Pi; it connects directly to the switch.
Ryoko's and mihoshi's networks are not bridged through br0; traffic goes through NAT translation in the Banana Pi.
[kyle@ryoko ~]$ ifconfig
enp0s18f2u4: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 00:24:9b:06:06:9c txqueuelen 1000 (Ethernet)
RX packets 780962 bytes 985254421 (939.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 778884 bytes 704802735 (672.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1492
inet 192.168.2.135 netmask 255.255.255.0 broadcast 192.168.2.255
inet6 fe80::7f9b:6f9d:729d:78c9 prefixlen 64 scopeid 0x20<link>
ether bc:5f:f4:af:1b:83 txqueuelen 10000 (Ethernet)
RX packets 850718674 bytes 1219355660935 (1.1 TiB)
RX errors 4 dropped 60 overruns 0 frame 6
TX packets 247260591 bytes 146336285643 (136.2 GiB)
TX errors 2 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 104706516 bytes 220974209428 (205.7 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 104706516 bytes 220974209428 (205.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[kyle@ryoko ~]$ ssh [email protected] 'dd if=/dev/zero of=/dev/stdout bs=$((1024 * 1024)) count=1000' >/dev/null
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 27.7315 s, 37.8 MB/s
[kyle@ryoko ~]$ dd if=/dev/zero of=/dev/stdout bs=$((1024 * 1024)) count=1000 | ssh [email protected] 'cat >/dev/null'
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 9.3078 s, 113 MB/s
Interesting… that means the problem is not encryption but the generation of data via dd… but in both commands the data is not generated on the R2…
192.168.15.2 seems to have problems generating the data, or it is direction-related… ryoko to 192.168.15.2 seems to work well, the opposite direction is bad.
Create an NFS share, use FTP - you have to rule out encryption overhead. Given that multiple reports seem to indicate faster 'in' than 'out', to my eyes the biggest culprit here is those 4 configurable LAN ports: the way they are exposed as 'virtual' interfaces and their shared lane to the CPU.
Maybe it's the actual hardware - it's likely software - but clearly there's some kind of round-robin algorithm on some level that is causing an N^2 decrease in performance in one direction. That means there's SOMETHING - a process, some code - something that operates per port and causes a delay, per port, that is proportional to the total number of used ports.
for each port do
{PROBLEM IS HERE}
end
I'm working on the lima/mali stuff right now - I'll try to fix this eventually. We'll fix it like you optimize a game engine: we'll profile every single call on the kernel side that is MTK-specific and networking-specific.
Run one interface and benchmark, run two, and so on.
See which calls take up a relatively larger amount of time as you transfer one way, then the other, and with one, two, three, etc. connected machines.
That'll track down this problem - we'll see the culprit code or we'll determine a hardware limitation - but we'll find it.
EDIT: I'm not saying that's easy - but presumably we can generate valgrind data. If we can't? Then we can just do our own timing routines and dump to a log - whatever. N^2 issues will stick out like a sore thumb during profiling.
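(A sketch of how that kernel-side profiling could start, assuming the perf tool is available on the R2:)
perf record -a -g -- sleep 10    # sample all CPUs with call graphs while a transfer runs
perf report --sort symbol        # look for mtk-/ethernet-specific symbols dominating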
It's not due to encryption. I've been trying out different routers. If I replace the Banana Pi with an EspressoBin, the same benchmarks between the two computers get 90 MiB/s.
I did some more tests. As far as I can tell, the router is slow with incompressible data, but that doesn't make any sense. I've never heard of a NAT trying to do compression.
I get fast speeds sending all 0s:
[kyle@mihoshi ~]$ dd if=/dev/zero of=/dev/stdout bs=$((1024 * 1024)) count=1000 | socat -u STDIN TCP4-LISTEN:25566,reuseaddr
[kyle@ryoko ~]$ socat -u TCP4:192.168.15.2:25566 STDOUT | pv >/dev/null
1000MiB 0:00:10 [94.8MiB/s] [
I get fast speeds sending all 1s:
[kyle@mihoshi ~]$ dd if=/dev/zero of=/dev/stdout bs=$((1024 * 1024)) count=1000 | tr '\000' '\377' | socat -u STDIN TCP4-LISTEN:25566,reuseaddr
[kyle@ryoko ~]$ socat -u TCP4:192.168.15.2:25566 STDOUT | pv >/dev/null
1000MiB 0:00:10 [94.0MiB/s] [
However, sending a zip file (incompressible data) is slow:
[kyle@mihoshi ~]$ dd if=/backup/dimension_zero/auto_backup/2018-12-14-13-01-23.zip of=/dev/stdout bs=$((1024 * 1024)) count=1000 | socat -u STDIN TCP4-LISTEN:25566,reuseaddr
[kyle@ryoko ~]$ socat -u TCP4:192.168.15.2:25566 STDOUT | pv >/dev/null
1000MiB 0:00:29 [33.9MiB/s]
This shows that reading the file is not slow (the OS already cached it before I ran this test or the previous test, and I reran the network test afterwards too and got the same number):
[kyle@mihoshi ~]$ dd if=/backup/dimension_zero/auto_backup/2018-12-14-13-01-23.zip of=/dev/stdout bs=$((1024 * 1024)) count=1000 | cat >/dev/null
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.359988 s, 2.9 GB/s
I tried having the router receive the data, which would skip the NAT part. It does better:
[kyle@mihoshi ~]$ dd if=/backup/dimension_zero/auto_backup/2018-12-14-13-01-23.zip of=/dev/stdout bs=$((1024 * 1024)) count=1000 | socat -u STDIN TCP4-LISTEN:25566,reuseaddr
root@bpi-r2:~# socat -u TCP4:192.168.15.2:25566 STDOUT | pv >/dev/null
1000MiB 0:00:14 [68.5MiB/s] [
Probably the TCP buffers aren't tuned well on the router yet (which makes no difference for its ability to NAT). Increasing the socat transfer size worked around this (see the sysctl sketch after the test below). I went back and tried a larger transfer size on ryoko, and it made no difference.
[kyle@mihoshi ~]$ dd if=/backup/dimension_zero/auto_backup/2018-12-14-13-01-23.zip of=/dev/stdout bs=$((1024 * 1024)) count=1000 | socat -u STDIN TCP4-LISTEN:25566,reuseaddr
root@bpi-r2:~# socat -b$((1024 * 1024)) -u TCP4:192.168.15.2:25566 STDOUT | pv >/dev/null
1000MiB 0:00:11 [87.3MiB/s] [
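(If under-tuned TCP buffers on the R2 really are the cause, they could be inspected and raised via sysctl; the values below are illustrative assumptions, not measured recommendations:)
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem          # show current min/default/max buffer sizes
sysctl -w net.ipv4.tcp_rmem='4096 131072 6291456'
sysctl -w net.ipv4.tcp_wmem='4096 131072 6291456'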
I did a tcpdump on ryoko and mihoshi during the speed test. Some packets from mihoshi to ryoko are dropped. Ryoko sees this and reduces the TCP window size. Packets are periodically dropped during the rest of the connection as it tries to zero in on the max speed.
Interestingly, the 4th through 8th packets or so are dropped. I would have expected the router to have a large enough buffer to hold those packets and to drop later packets when the buffer overflows.
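(For anyone repeating this, the captures can be taken and the retransmissions counted roughly like so; port 25566 matches the socat tests above, and the second step assumes tshark is installed:)
tcpdump -i enp4s0 -w transfer.pcap port 25566    # run on each end during the transfer
tshark -r transfer.pcap -Y tcp.analysis.retransmission | wc -l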