Issues with NAT for LAN on 4.14.80 kernel


(Frank W.) #21

bringing the bridge up seems to need this:

auto lan1
allow-hotplug lan1
iface lan1 inet manual
   pre-up   ifconfig $IFACE up
   pre-down ifconfig $IFACE down

auto lan2
allow-hotplug lan2
iface lan2 inet manual
   pre-up   ifconfig $IFACE up
   pre-down ifconfig $IFACE down

auto br0
iface br0 inet static
    address 192.168.0.18
    netmask 255.255.255.0
    bridge_ports lan1 lan2
    bridge_fd 5
    bridge_stp no

https://www.cyberciti.biz/faq/debian-network-interfaces-bridge-eth0-eth1-eth2/ but the “auto …” line is also needed
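
For a quick one-off test, roughly the same bridge can also be brought up by hand with iproute2 (a sketch using the addresses and interface names from the stanza above; run as root):

```shell
# Manual equivalent of the br0 stanza above (iproute2, run as root).
ip link set lan1 up
ip link set lan2 up
# forward_delay is in 1/100 s units here, so 500 matches bridge_fd 5; STP off.
ip link add name br0 type bridge forward_delay 500 stp_state 0
ip link set lan1 master br0
ip link set lan2 master br0
ip addr add 192.168.0.18/24 dev br0
ip link set br0 up
```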

your topic is named “issues with NAT”, so I guess the problem exists only if NAT (masquerading) is used…
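
For context, a minimal masquerading setup usually looks roughly like this (a sketch, not taken from this thread; ppp0 as the WAN interface is an assumption based on the tcpdump later in the thread):

```shell
# Minimal NAT sketch: enable IPv4 forwarding and masquerade outgoing
# traffic on the WAN interface (ppp0 assumed here).
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
```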

ping over NAT (my main router) to Cloudflare:

--- 1.1.1.1 ping statistics ---
2108 packets transmitted, 2094 received, 0% packet loss, time 2110253ms
rtt min/avg/max/mdev = 35.064/44.800/1070.474/65.431 ms, pipe 2

Not much, but 14 packets lost out of ~2100

tcpdump on main-router:

[18:03] frank@bpi-r2-e:/var/lib/tftp$ sudo tcpdump -v icmp -i lan0 >/dev/null
tcpdump: listening on lan0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C4022 packets captured
4022 packets received by filter
0 packets dropped by kernel

[18:37] frank@bpi-r2-e:/var/lib/tftp$ uname -r
4.14.78-bpi-r2-main

did tcpdump on ppp0 (my WAN, where masquerading is set up):

[19:04] frank@bpi-r2-e:/var/lib/tftp$ sudo tcpdump -v icmp -i ppp0 >/dev/null
[sudo] password for frank: 
tcpdump: listening on ppp0, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
^C2645 packets captured
2645 packets received by filter
0 packets dropped by kernel

the client says

--- 1.1.1.1 ping statistics ---
1046 packets transmitted, 1034 received, 1% packet loss, time 1046582ms
rtt min/avg/max/mdev = 36.604/43.582/1066.369/52.799 ms, pipe 2

you can run (maybe prefixed with watch to see the counters changing)

netstat -i

to see where packets are dropped. I indeed have some for lan0, which is my main LAN interface (but there are 25 m of cable between the R2 and the switch, which may cause errors)
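
To spot the dropping interface at a glance, the RX-DRP column can be filtered with awk (a sketch; it assumes the usual net-tools column order Iface, MTU, RX-OK, RX-ERR, RX-DRP, …):

```shell
# Print only interfaces whose RX-DRP counter (column 5) is non-zero.
netstat -i | awk 'NR > 2 && $5 > 0 { print $1 ": " $5 " RX drops" }'
```

To watch the counters move over time, something like `watch -n1 netstat -i` works too.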

I pinged Google DNS overnight and this is the result:

--- 8.8.8.8 ping statistics ---
46758 packets transmitted, 46742 received, +4 errors, 0% packet loss, time 46826538ms
rtt min/avg/max/mdev = 10.029/22.975/1042.775/87.433 ms, pipe 2

16 packets lost out of ~46,700, and these are surely caused by my 24 h connection reset (done by a cronjob)


(singinwhale) #22

Well, it doesn’t seem to be an issue with NAT after all. I tried breaking it down to the simplest setup I could, with everything disabled that wasn’t strictly necessary. So here is the setup again:

  • clean image of frank-w’s BPI-R2 Debian Stretch 4.14.80, booted from SD card

  • installed iperf3

  • no changes to /etc/network/interfaces

  • set all interfaces except lan0 to down (`for iface in br0 lan1 lan2 lan3 wan eth1; do ip link set $iface down; done`)

  • manually configured lan0 using ip

  • direct link between lan0 (192.168.137.71) and a Gigabit Ethernet NIC on Windows (192.168.137.1)

  • iperf3 running on BPI in server mode

  • iperf3 running on Windows in client mode
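
The shutdown and manual lan0 configuration in the steps above amount to something like this (a sketch; addresses as in the `ip a` output, run as root):

```shell
# Take every interface except lan0 down, then configure lan0 by hand.
for iface in br0 lan1 lan2 lan3 wan eth1; do
    ip link set "$iface" down
done
ip addr add 192.168.137.71/24 dev lan0
ip link set lan0 up
```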

      root@bpi-r2:~$ ip a
      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
          inet 127.0.0.1/8 scope host lo
             valid_lft forever preferred_lft forever
          inet6 ::1/128 scope host
             valid_lft forever preferred_lft forever
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
          link/ether ee:06:92:66:df:23 brd ff:ff:ff:ff:ff:ff
          inet6 fe80::ec06:92ff:fe66:df23/64 scope link
             valid_lft forever preferred_lft forever
      3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
          link/ether 32:27:b9:b6:fe:2b brd ff:ff:ff:ff:ff:ff
      4: wan@eth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
          link/ether 32:27:b9:b6:fe:2b brd ff:ff:ff:ff:ff:ff
      5: lan0@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
          link/ether 02:01:02:03:04:00 brd ff:ff:ff:ff:ff:ff
          inet 192.168.137.71/24 brd 192.168.137.255 scope global lan0
             valid_lft forever preferred_lft forever
          inet6 fe80::1:2ff:fe03:400/64 scope link
             valid_lft forever preferred_lft forever
      6: lan1@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue master br0 state DOWN group default qlen 1000
          link/ether ee:06:92:66:df:23 brd ff:ff:ff:ff:ff:ff
      7: lan2@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue master br0 state DOWN group default qlen 1000
          link/ether ee:06:92:66:df:23 brd ff:ff:ff:ff:ff:ff
      8: lan3@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
          link/ether ee:06:92:66:df:23 brd ff:ff:ff:ff:ff:ff
      9: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
          link/ether ee:06:92:66:df:23 brd ff:ff:ff:ff:ff:ff
          inet 192.168.40.1/24 brd 192.168.40.255 scope global br0
             valid_lft forever preferred_lft forever
      10: lan3.60@lan3: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
          link/ether 02:01:02:03:04:03 brd ff:ff:ff:ff:ff:ff
          inet 192.168.60.10/24 brd 192.168.60.255 scope global lan3.60
             valid_lft forever preferred_lft forever
      root@bpi-r2:~$
    

Now the weird stuff begins again:

First iperf3 from Windows to lan0 seems fine:

    λ iperf3.exe -c 192.168.137.71                                                 
    Connecting to host 192.168.137.71, port 5201                                   
    [  4] local 192.168.137.1 port 62252 connected to 192.168.137.71 port 5201     
    [ ID] Interval           Transfer     Bandwidth                                
    [  4]   0.00-1.00   sec   107 MBytes   894 Mbits/sec                           
    [  4]   1.00-2.00   sec   111 MBytes   927 Mbits/sec                           
    [  4]   2.00-3.00   sec  84.8 MBytes   712 Mbits/sec                           
    [  4]   3.00-4.00   sec  85.8 MBytes   718 Mbits/sec                           
    [  4]   4.00-5.00   sec  59.0 MBytes   496 Mbits/sec                           
    [  4]   5.00-6.00   sec  58.4 MBytes   489 Mbits/sec                           
    [  4]   6.00-7.00   sec  83.5 MBytes   701 Mbits/sec                           
    [  4]   7.00-8.00   sec  59.0 MBytes   495 Mbits/sec                           
    [  4]   8.00-9.00   sec  59.4 MBytes   498 Mbits/sec                           
    [  4]   9.00-10.00  sec  85.2 MBytes   716 Mbits/sec                           
    - - - - - - - - - - - - - - - - - - - - - - - - -                              
    [ ID] Interval           Transfer     Bandwidth                                
    [  4]   0.00-10.00  sec   792 MBytes   665 Mbits/sec                  sender   
    [  4]   0.00-10.00  sec   792 MBytes   664 Mbits/sec                  receiver 
                                                                                   
    iperf Done.                                                                                                                      

But now the same in reverse mode, so the BPI sends to the Windows machine:

    λ iperf3.exe -c 192.168.137.71 -R                                                  
    Connecting to host 192.168.137.71, port 5201                                       
    Reverse mode, remote host 192.168.137.71 is sending                                
    [  4] local 192.168.137.1 port 62364 connected to 192.168.137.71 port 5201         
    [ ID] Interval           Transfer     Bandwidth                                    
    [  4]   0.00-1.00   sec  18.5 KBytes   152 Kbits/sec                               
    [  4]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec                                 
    [  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec                                 
    - - - - - - - - - - - - - - - - - - - - - - - - -                                  
    [ ID] Interval           Transfer     Bandwidth       Retr                         
    [  4]   0.00-10.00  sec  77.0 KBytes  63.1 Kbits/sec   22             sender       
    [  4]   0.00-10.00  sec  18.5 KBytes  15.2 Kbits/sec                  receiver     
                                                                                       
    iperf Done.                                                                        

There is almost nothing coming through. The small ‘almost’ is just enough to send pings; that’s why pings to hosts on the Internet work, but the outgoing speed from the BPI is just not enough for more. I tried the official 4.4 image again and have no LAN issues there, so it can’t really be a hardware issue.


(Frank W.) #23

can you try now with the 4.19-main kernel, to determine whether it’s a problem with my ethernet patches?

@moore @ryder.lee @Jackzeng is it possible to disable the DSA driver, to have nearly the same setup as in 4.4 (wan = eth1, lan ports = eth0) without port separation?


(singinwhale) #24

I rebuilt the 4.19-main branch from your git repo, installed the .deb on the BPI and rebooted into the new kernel. I did the same as before to disable all interfaces. Windows to BPI even got faster, but the other way is still broken:

λ iperf3.exe -c 192.168.137.71 -R
Connecting to host 192.168.137.71, port 5201
Reverse mode, remote host 192.168.137.71 is sending
[  4] local 192.168.137.1 port 64512 connected to 192.168.137.71 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  2.85 KBytes  23.4 Kbits/sec
[  4]   1.00-2.00   sec  2.85 KBytes  23.4 Kbits/sec
[  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   3.00-4.00   sec  5.70 KBytes  46.8 Kbits/sec
[  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  77.0 KBytes  63.1 Kbits/sec   11             sender
[  4]   0.00-10.00  sec  11.4 KBytes  9.34 Kbits/sec                  receiver

iperf Done.



λ iperf3.exe -c 192.168.137.71
Connecting to host 192.168.137.71, port 5201
[  4] local 192.168.137.1 port 64515 connected to 192.168.137.71 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   113 MBytes   947 Mbits/sec
[  4]   1.00-2.00   sec   113 MBytes   950 Mbits/sec
[  4]   2.00-3.00   sec   113 MBytes   949 Mbits/sec
[  4]   3.00-4.00   sec   112 MBytes   936 Mbits/sec
[  4]   4.00-5.00   sec   113 MBytes   948 Mbits/sec
[  4]   5.00-6.00   sec   113 MBytes   949 Mbits/sec
[  4]   6.00-7.00   sec   113 MBytes   949 Mbits/sec
[  4]   7.00-8.00   sec   113 MBytes   949 Mbits/sec
[  4]   8.00-9.00   sec   113 MBytes   949 Mbits/sec
[  4]   9.00-10.00  sec   112 MBytes   935 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.10 GBytes   946 Mbits/sec                  sender
[  4]   0.00-10.00  sec  1.10 GBytes   946 Mbits/sec                  receiver

iperf Done.

P.S.: Why is wan now also under eth0, and why is eth1 gone?

4: wan@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 46:81:e9:88:0c:4a brd ff:ff:ff:ff:ff:ff

(Frank W.) #25

4.19 does not have my DSA patches yet, because I am trying to bring them cleanly into mainline (see 4.20-gmac)

4.19 is (regarding ethernet) mainline; only my defconfig, swapped mmc/uart (for compatibility), and the poweroff and wifi drivers are currently merged

so it seems it’s a problem in the mainline code… If we can disable DSA, we can check whether the problem is there or in the ethernet code


(singinwhale) #26

I tried using 4.20-gmac now, but I can’t get the network setup to work. Seems like it’s not far enough along for this yet


(Frank W.) #27

4.20-gmac is for testing only and for posting my patches… The network setup is the same as in 4.14 (eth0/eth1 up); the last commit removes 2 CPU connections, so there are 2 LAN ports on eth0 and 2 on eth1

We now need a way to disable DSA (to get the same setup as on 4.4). IMHO disabling DSA in the kernel config is not enough


(singinwhale) #28

I guessed as much. I just wanted to give it a shot to maybe try it out.

I also tried to simply disable DSA before building 4.19-main, but that experiment resulted in there only being a single eth0 interface that did not even send udev messages when I plugged a cable in. Would have been amazing if it had been that simple x)


(Frank W.) #29

4.19 has only 1 eth (the CPU port) without additional patches. I want to get a state where all ports are simply passed through the switch to eth0 (like a dumb switch connected to eth0). You cannot route between WAN and LAN then, but this is only to find the error.

imho without dsa,switch is not powered on (no leds). You cannot get link-up notifies because only switch recognize that,not cpu-port.

We need some help here from BPI/MTK to bypass the DSA driver. I guess the problem is in the ethernet driver (mtk_eth_soc).