[BPI-R3] weird networking issue (and weirder "solution")

meehien · February 23, 2024, 10:51am

Hi all

The weird problem. For the last year or so I have been having this weird networking issue: large https streams (github, file downloads, online gaming, streaming, etc.) randomly fail with an SSL decryption error and need to be restarted, leading to connection interruptions (see [1] and [2] below). The problem seems to happen only on eth (wired) connections that operate at around 300 Mbps. Faster connections (or different network adapters) do not seem to have this issue. I have been able to reproduce the problem consistently between (1) two BPI-R3s connected with PowerLAN adapters (both Devolo and TP-Link); (2) a BPI-R3 and any PC connected through the PowerLAN adapters; and, most importantly, (3) a BPI-R3 directly connected to a Cable Matters USB to Ethernet adapter [3]. Additional relevant system specs are below at [4].

The weird solution. As I was trying to diagnose this, I “mistakenly” used the following iptables-nft rule (note the missing --tcp-flags option) which seems to “fix” the problem.

iptables-nft -t mangle -A FORWARD -o brlan -p tcp -j TCPMSS --clamp-mss-to-pmtu

As soon as the rule is removed, or --tcp-flags/--syn is added, the problem reappears. The problem also always manifests with native nftables rules (probably because, as far as I can tell, there is no way to create a “partial rule” such as the one above).
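For comparison, the conventional form of this rule restricts the match to SYN packets, because the MSS option is only carried during the TCP handshake. The sketch below prints (without applying) both variants for a given output interface and computes the MSS implied by an MTU for IPv4, assuming 20-byte IP and TCP headers without options. The helper names `mss_for_mtu` and `print_clamp_rules` are made up for illustration and are not part of the original post.

```shell
#!/bin/sh
# mss_for_mtu MTU: IPv4 MSS implied by an MTU, assuming a 20-byte IP
# header and a 20-byte TCP header (no options). Hypothetical helper.
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

# print_clamp_rules IFACE: print, but do not apply, two rule variants.
# Line 1 is the usual SYN-only form; line 2 is the form from the post,
# which matches every TCP packet leaving IFACE.
print_clamp_rules() {
    iface="$1"
    echo "iptables-nft -t mangle -A FORWARD -o $iface -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu"
    echo "iptables-nft -t mangle -A FORWARD -o $iface -p tcp -j TCPMSS --clamp-mss-to-pmtu"
}
```

For example, `mss_for_mtu 1500` prints 1460, and `print_clamp_rules brlan` shows the two rules side by side for the bridge used here.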

Help requested.

Does anyone have any idea/hunch why the above rule addresses the problem?
So far I have found it quite difficult to debug the issue. Can anyone suggest how they would approach debugging this (tried traffic dumps but not sure exactly what to look for)?
At this point I am suspecting it might be a driver related issue. Is anyone aware of any patches that are relevant?

Thanks. I can provide additional details if needed.

Extra details:

[1]

$ wget -O /dev/null "https://software.download.prss.microsoft.com/dbazure/Win11_23H2_EnglishInternational_x64v2.iso?t=<RANDOM_TOKEN>"
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving software.download.prss.microsoft.com (software.download.prss.microsoft.com)... 152.199.21.175, 2606:2800:233:1cb7:261b:1f9c:2074:3c
Connecting to software.download.prss.microsoft.com (software.download.prss.microsoft.com)|152.199.21.175|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6797694976 (6.3G) [application/octet-stream]
Saving to: ‘/dev/null’

/dev/null                       0%[                                                  ]  41.30M  30.1MB/s    in 1.4s

2024-02-23 10:23:29 (30.1 MB/s) - Read error at byte 43302901/6797694976 (Decryption has failed.). Retrying.

[2]

curl -o /dev/null "https://software.download.prss.microsoft.com/dbazure/Win11_23H2_English_x64v2.iso"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 15 6497M   15 1015M    0     0  28.8M      0  0:03:45  0:00:35  0:03:10 31.4M
curl: (56) OpenSSL SSL_read: OpenSSL/3.2.1: error:0A000119:SSL routines::decryption failed or bad record mac, errno 0

[3] https://www.amazon.co.uk/gp/product/B00BBD7NFU

[4] ArchLinuxArm system with various 6.x kernels from:

sparkie · February 23, 2024, 6:41pm

Is the kernel module present at all? I would like to use --clamp-mss-to-pmtu together with ‘iptables’, but in frank-w’s kernel no such module exists:

    # /sbin/iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
    Warning: Extension TCPMSS revision 0 not supported, missing kernel module?
    iptables v1.8.9 (nf_tables):  RULE_APPEND failed (No such file or directory): rule in chain FORWARD

Why do you specify an interface (‘-o brlan’) at all? Simply skip this to clamp the MSS for both forward and backward packet flow.

meehien · February 23, 2024, 7:55pm

Hi @sparkie, please don’t hijack my thread.

That said, yes, on my device the kernel module is present. I have modified the kernel config to include all netfilter/iptables modules. You can probably do the same if you need the MSS functionality. Also, ttbomk, clamping should go in the mangle table, not filter.

Also, in the particular situation I described I want to apply the rule only for the affected interface i.e. brlan.

I am not trying to do MSS clamping, but rather figure out how to prevent the https connections resetting. The “malformed” iptables rule given seems to be the only thing that fixes this (in case anyone visits this forum), but I don’t understand why, nor what the actual cause of the problem is to begin with (and it is not the clamping/MSS, as far as I can tell).

meehien · February 24, 2024, 4:14pm

I have obtained another clue for this. It would seem that whenever the tcp connection reset happens the calculated window size abruptly changes from 68096 to 4175104 (see attached pics). Not sure what to make of this and would appreciate some help.
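For reading these numbers: the calculated window size shown by capture tools is the raw 16-bit window field of the TCP header shifted left by the window-scale factor negotiated in the handshake. The sketch below only does that arithmetic. The shift of 7 is an assumption (the post does not state it); under that assumption the two values correspond to raw fields of 532 and 32618. Because the shift is fixed for the lifetime of a connection, a jump in the calculated value means the raw header field itself changed.

```shell
#!/bin/sh
# calc_window RAW SHIFT: scaled TCP receive window in bytes.
calc_window() {
    echo $(( $1 << $2 ))
}

# raw_window CALC SHIFT: inverse, recovering the 16-bit header field.
raw_window() {
    echo $(( $1 >> $2 ))
}

# Assumed shift of 7 (not stated in the post):
#   calc_window 532 7     -> 68096
#   raw_window 4175104 7  -> 32618
```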

frank-w · February 24, 2024, 4:27pm

Good catch, maybe related…

Such a huge jump looks like some kind of overflow… not of a C integer datatype but of some register in hardware.

Or it is simply a calculation issue based on some wrong value read out of it.

meehien · February 24, 2024, 5:38pm

Any idea where to start looking?

frank-w · February 24, 2024, 5:57pm

Not really… do you have a trace like the one reported on openwrt (transmit queue timeout)? Does this happen in both directions or only one? You can test this using iperf3… Does it happen only with the R3? (Test with a different device on both ends and the same kernel.) Maybe it is an underlying bug and not mediatek/R3 specific.

meehien · February 25, 2024, 12:35pm

This is a bpi-r3 related issue. I am not observing this with any of my other devices. I don’t have any openwrt systems currently at hand but might set up something. Also, the above error might not be related, see below.

In the meantime, I have some new observations.

I have been unable to reproduce the error on a direct iperf3 connection: e.g. device to bpi-r3.
I have been unable to reproduce the error on a remote iperf3 connection: e.g. device to external server e.g. iperf3.moji.fr

So it would seem it has something to do with http/https. To check this I have set up a local https server and retried the wget test. I have not observed the error (mostly; I need some more thorough tests for this). The error however comes up as soon as I try to fetch something external.

I did however notice a new symptom. Whenever the error occurs my ssh connection to the bpi-r3 stalls. To me this seems like a buffer-like problem related to data shuffling between the wan port and the (e.g.) lan0 switch port.

I would like to try and use the sfp2 port, the one connected to the other eth interface, for some more tests. Is it possible to connect a standard copper (RJ45) ethernet sfp module to this one, or does it only support fiber (which I don’t have)? Also, related, are all the possible pairs on the main switch (eth0) equivalent, or is the wan port special in some way?

Also, does anyone know where the mediatek switch is implemented in the kernel, maybe I could get a glimpse of how the shuffling between ports is done from there?

ericwoud · February 25, 2024, 1:24pm

There are several sfp modules. At 1Gbps, you could use a module where it is reported that it supports direct phy access through i2c. These modules usually have hardware inside, that is supported in the kernel. Get one with a 88E1111 inside. At 2.5Gbps, I’ve been working on the rtl8221b modules. There are now 2 known modules. I’m sending something upstream soon.

Other rj45 modules, the phy is not accessible and control over the hardware is very limited. They present themselves as optical modules and are treated as such by the kernel.

meehien · September 8, 2024, 3:59pm

Hi all

I have a quick update on this. The problem went away in kernel 6.9, but unfortunately it resurfaced with kernels 6.10 and 6.11. I was able to do more testing recently and I now believe that the problem is related to checksum offload calculations on the BPI-R3. The iptables “fix” from above seems to have been just a lucky positive interaction.
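One inexpensive cross-check of a checksum hypothesis, offered here only as a sketch, is to watch the kernel's own counter of TCP segments received with a bad checksum on the peer that receives traffic from the BPI-R3. On Linux this is the `InCsumErrors` field of the `Tcp:` lines in `/proc/net/snmp`. The helper name `tcp_csum_errors` is hypothetical; it parses text in that file's format from stdin, so it can be compared before and after a failing transfer. Note that a TLS "bad record mac" error means corrupted data was actually delivered to the application, so an unchanged counter does not rule out corruption that happened before the checksum was computed.

```shell
#!/bin/sh
# tcp_csum_errors: read /proc/net/snmp-formatted text on stdin and print
# the Tcp InCsumErrors counter. The file holds a header line and a value
# line per protocol, both starting with "Tcp:".
tcp_csum_errors() {
    awk '
        $1 == "Tcp:" && !seen {            # first Tcp: line holds field names
            for (i = 2; i <= NF; i++) if ($i == "InCsumErrors") col = i
            seen = 1
            next
        }
        $1 == "Tcp:" && col { print $col; exit }   # second Tcp: line holds values
    '
}

# Usage on a Linux peer, before and after a failing transfer:
#   tcp_csum_errors < /proc/net/snmp
```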

My new way of addressing this issue is to disable tx offloading:

/sbin/ethtool -K eth0 tx off

I haven’t noticed any speed reduction (probably because the bandwidth bottleneck is caused by the PowerLan adapters), but if anyone encounters this problem in the future they might try this.
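Since ethtool can leave a feature enabled even after being asked to turn it off (see the `[requested off]` output further down the thread), it may be worth verifying the result. The following is a minimal sketch: `tx_checksum_state` and `disable_tx_offload` are hypothetical helper names, and the parser assumes the `name: state` layout that `ethtool -k` prints in this thread.

```shell
#!/bin/sh
# tx_checksum_state: read `ethtool -k IFACE` output on stdin and print
# the state ("on" or "off") of the top-level tx-checksumming feature.
tx_checksum_state() {
    awk -F': ' '$1 == "tx-checksumming" { split($2, w, " "); print w[1]; exit }'
}

# disable_tx_offload IFACE: apply the workaround, then succeed only if
# the feature really reads back as off. Needs root and ethtool.
disable_tx_offload() {
    ethtool -K "$1" tx off || return 1
    [ "$(ethtool -k "$1" | tx_checksum_state)" = "off" ]
}

# Example: disable_tx_offload eth0 && echo "tx checksumming is off"
```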

frank-w · September 8, 2024, 4:49pm

Did you have tso set to on? This seems to cause the watchdog tx queue timeouts… I guess this is also turned off when doing tx off.

meehien · September 8, 2024, 6:09pm

Yes, tso gets turned off by tx off, and cannot be turned on. With just tso off I still get the error. The following is the only config that works currently.

 ~ # ethtool -k eth0 | grep ": on"
rx-checksumming: on
scatter-gather: on
   tx-scatter-gather: on
generic-segmentation-offload: on
generic-receive-offload: on
tx-vlan-offload: on
hw-tc-offload: on

I have only tried turning things off from default. Should have also probably mentioned that eth0 is the generic switch (dsa) interface.

frank-w · September 8, 2024, 8:43pm

Do you have a way I can reproduce this behaviour?

sparkie · September 9, 2024, 5:28am

I have also had a weird issue with the BPI-R3 for months. I don’t know if it’s related to the issue discussed here. If it is, I would have a very quick way to reproduce the problem.

Basically my BPI-R3 is running Debian 12.7, configured as a firewall routing packets between different subnets. I can reproduce my issue very quickly. It even happens when running only two ‘rsync’s across ‘lan0’ and ‘lan2’ with 3 machines involved. One machine must use a 100Mbit LAN interface. The others all have 1Gbit. Packets between ‘lan0’ and ‘lan2’ are forwarded/firewalled with iptables rules. Within a few seconds after starting the rsync test I get:

client_loop: send disconnect: Broken pipe
rsync: [sender] write error: Broken pipe (32)
rsync error: error in socket IO (code 10) at io.c(823) [sender=3.2.3]

probably because it gets a tcp-reset. I experimented a lot to work around the issue (with kernels between 6.8 and 6.11) but nothing helped, except replacing ‘lan0’ with an external USB-ethernet adapter.

I tried to check if

/sbin/ethtool -K lan0 tx off

could help me. But the change is not applied:

# ethtool -K lan0 tx off
Actual changes:
tx-checksum-ipv4: on [requested off]
tx-checksum-ipv6: on [requested off]

is this because the settings are all ‘fixed’?

# ethtool -k lan0 | less
Features for lan0:
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ipv4: on [fixed]
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: on [fixed]
        tx-tcp6-segmentation: on [fixed]
[...]
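In `ethtool -k` output, the `[fixed]` tag marks features that cannot be changed on that interface, which is consistent with the `[requested off]` result above. As a rough sketch, a listing like the one above can be filtered down to just the fixed features; `fixed_features` is a hypothetical helper name and it assumes the `name: state [fixed]` layout shown in this post.

```shell
#!/bin/sh
# fixed_features: read `ethtool -k IFACE` output on stdin and print the
# names of all features tagged [fixed], one per line.
fixed_features() {
    awk -F': ' '/\[fixed\]$/ {
        gsub(/^[ \t]+/, "", $1)   # drop the indentation of sub-features
        print $1
    }'
}

# Usage: ethtool -k lan0 | fixed_features
```

Anything not printed by this filter is, in principle, a candidate for toggling with `ethtool -K` on that interface.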

I currently run a kernel (6.10.0-bpi-r3-main) built after:

git clone git@github.com:frank-w/BPI-Router-Linux.git
cd BPI-Router-Linux
git checkout 6.10-main

Any ideas how to turn off tx-checksumming?

frank-w · September 9, 2024, 6:20am

I guess it is a different issue… there were some reports with 100mbit devices for r3/r4

E.g. for r4: [BPI-R4] bad switch performance in upload

I tested with manually setting the speed, which worked, but it could be another issue.

When you say traffic between lan0 and lan2 is forwarded, I expect they are in different lan segments (different subnets), right?

Have you tried disabling autoneg and setting the speed manually?

Maybe it is an autoneg issue.

Similar to this: BPI-R4: 100Mbit broken

sparkie · September 9, 2024, 6:45am

both ‘lan0’ and ‘lan2’ first are connected to simple 1Gbit switches. So basically the BPI-R3 sees only 1Gbit ports.

‘traffic between lan0 and lan2’ means: the BPI-R3 router forwards packets between ‘lan0’ and ‘lan2’ that are located on different subnets (aka 192.168.140.0/24 and 192.168.150.0/24)

there are no ‘dmesg’ messages at all on any machine involved at the time of error. But I will try to disable autoneg.

To ‘turn off tx-checksumming’ would be much more interesting IMHO. It appears not to work on the current kernel/drivers (please see above). Do you know what to do to turn it off?

meehien · September 9, 2024, 7:19am

This is the same error I was having. You need to run the command on the switch (DSA) interface, i.e. the part after the @ in the interface name shown by ip link, like eth0 in lan0@eth0. Really curious to see if it fixes the issue for you too.
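The interface after the @ can also be extracted mechanically. The sketch below assumes the one-line format printed by `ip -o link show`, where a DSA port appears as something like `5: lan0@eth0: <...>`; `dsa_conduit` is a hypothetical helper name. The same `name@parent` notation is used for other stacked interfaces such as VLANs, so the result is only meaningful for switch ports.

```shell
#!/bin/sh
# dsa_conduit: read one `ip -o link show dev PORT` line on stdin and
# print the interface named after the '@'. Prints nothing if the line
# has no '@' part.
dsa_conduit() {
    sed -n 's/^[0-9]*: *[^@:]*@\([^:]*\):.*/\1/p'
}

# Usage (needs root for the ethtool call):
#   conduit=$(ip -o link show dev lan0 | dsa_conduit)
#   ethtool -K "$conduit" tx off
```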

meehien · September 9, 2024, 7:27am

Hmm. I can describe the setup that was failing the most ‘reliably’ for me. Connect two BPI-R3s with an ethernet cable. Use a PC/device to connect to one of the BPI-R3s over wlan. Do some intensive TCP data transfers between the PC and the BPI-R3 you are not directly connected to. I am using the unison tool to back up my files, which uses rsync over ssh. This would fail to finish 99% of the time, with the error @sparkie printed above.

sparkie · September 9, 2024, 7:46am

it’s fantastic!

I can reproduce my issue within seconds in my LAN environment. So I tried to toggle between ‘tx on’ and ‘tx off’ on the interface originally named ‘eth0’ (thanks @meehien for the hint). The actual test runs between interfaces originally named ‘lan0’ and ‘lan2’ though.

My issue no longer appears with ‘tx off’, but it instantly reappears after setting ‘tx on’.

sparkie · September 9, 2024, 8:29am

after lots of experimenting this finally is the easiest setup I found to reproduce the issue within seconds:

hardware setup:

desktop A (Gbit) connected to BPI-R3 (lan0) via Gbit switch A (in 192.168.140.0/24)
desktop B (Gbit) connected to BPI-R3 (lan2) via Gbit switch B (in 192.168.150.0/24)
RaspberryPi (100Mbit) connected to BPI-R3 (lan2) via Gbit switch B (in 192.168.150.0/24)

some illustrating ASCII art:

    ------------------------
    desktop A (debian 11.11)
    ------------------------
                |
           -------------
           Gbit switch A
           -------------
                |
     -------------------------
      lan0 (192.168.140.0/24)
        BPI-R3 (debian 12.7)
      lan2 (192.168.150.0/24)
     -------------------------
                |
           -------------
           Gbit switch B
           -------------
                |        \
                |         \
-------------------------- \
Raspberry Pi (debian 11.9)  \
--------------------------   \
                              \
                             -----------------------
                             desktop B (debian 12.5)
                             -----------------------

software setup:

basically only 2 concurrent ‘rsyncs’ are needed copying some files around. All commands are started from desktop A

desktopA# ssh raspberrypi rm -frv /tmp/YYYYY; rsync --delete -vaX --numeric-ids source_dir raspberrypi:/tmp/YYYYY
desktopA# ssh desktopB rm -frv /tmp/YYYYY; rsync --delete -vaX --numeric-ids source_dir desktopB:/tmp/YYYYY

(started in different shells concurrently)

error symptoms:

in case of error the ‘rsync’ running between ‘desktop A’ and ‘desktop B’ breaks with:

client_loop: send disconnect: Broken pipe
rsync: [sender] write error: Broken pipe (32)
rsync error: unexplained error (code 255) at io.c(823) [sender=3.2.3]

the other ‘rsync’ running between ‘desktop A’ and ‘raspberrypi’ is mostly not affected

successful workaround:

ethtool -K eth0 tx off

thanks to @meehien for providing this

caveats:

setting ‘tx’ to ‘off’ impacts network performance.

with ‘tx on’ (the default) ‘iftop’ utility shows stunning ‘117MB’ when running a simple ‘netcat’ between ‘desktop A’ and ‘desktop B’. Excellent for a truly routing/firewalling device.

alas with ‘tx off’ (workaround) ‘iftop’ utility shows no more than about ‘94MB’ for the same

desktopA# netcat desktopB 9000 < /dev/zero
desktopB# netcat -l 9000 > /dev/zero

without workaround:

desktopA# iftop -B

238MB          477MB         715MB          954MB    1.16GB
└─────────────┴──────────────┴─────────────┴──────────────┴──────────────
desktopA              <=> desktopB                 117MB   117MB   116MB

with workaround:

desktopA# iftop -B

238MB          477MB         715MB          954MB    1.16GB
└─────────────┴──────────────┴─────────────┴──────────────┴──────────────
desktopA              <=> desktopB                94.6MB  94.2MB  93.0MB
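To put a number on the cost of the workaround, the relative drop between the two iftop readings can be computed directly; the result does not depend on which byte unit iftop uses. This is just a small arithmetic sketch with a hypothetical helper name, `drop_pct`, using the first rate column of each run.

```shell
#!/bin/sh
# drop_pct BEFORE AFTER: relative throughput loss in percent, one decimal.
drop_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

# First-column readings from the two iftop runs above:
#   drop_pct 117 94.6
```

With those readings the loss comes out at roughly 19%.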