BPI-R3 wifi problems

Hi, the wifi on the BPI-R3 is unreliable: after a period of use, it stops forwarding packets to some destinations but not others. Workarounds:

  1. disconnect wifi and connect again from the client.
  2. Power cycle the BPI-R3. Note: a soft reboot of the BPI-R3 does not resolve the problem.

My first ask is:

  1. Find a way to make a soft reboot resolve this, without needing a power off. I could then schedule a reboot once a day to mostly work around the problem.

In the long run, I wish to diagnose this problem. I assume the firmware in the MediaTek chip can create logs. How can we retrieve those logs into user space? When it stops forwarding packets, there is not a single line in the syslog or dmesg. Not merely uninteresting logs: no logs at all. The problem is present on OpenWrt and also on a normal Ubuntu installation on the BPI-R3 with kernel 6.9-rc1. I installed Ubuntu and the 6.9 kernel from source, so that it will be easier for me to try fixes. Any ideas on how to diagnose this further?

If you know the non-working targets, I would try to capture those packets with tcpdump. Over what period does this happen (hours, days, weeks)? Forwarding to some targets but not others is strange and might point to some firewalling. As you wrote that you have this issue on Ubuntu too, I guess you have not enabled WED, right? That is still known to be broken in the mainline kernel.

Regarding the non-working targets:

Imagine I am on a 192.168.0.0/24 network, i.e. a private network.

Wifi clients get assigned addresses from 192.168.0.100 to 192.168.0.150.

BPI-R3 on .10

Wifi bridged to LAN, no NAT involved.

Some LAN based hosts on .1 .20 .30

When the wifi client first connects, it can ping .1 .20 .30.

After some time, maybe 24 hours, it can ping .1 and .30 but not .20.

Everything on the LAN can still ping .1 .20 .30

If the wifi client disconnects and reconnects, it can ping .1 .20 .30.

If the BPI-R3 is rebooted, the problem remains, or might work for a bit, but returns quickly, within 10 mins normally.

If the BPI-R3 is powered off, the problem disappears for about another 24 hours.
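To pin down when each target becomes unreachable, the ping test above can be left running on a wifi client. This is only a minimal sketch: it assumes a Linux client with the iputils ping command, and the target list is just the example addresses from this thread. It prints a timestamped line whenever a target changes between reachable and unreachable.

```python
import subprocess
import time
from datetime import datetime

# Example addresses from this thread; replace with the real LAN hosts.
TARGETS = ["192.168.0.1", "192.168.0.20", "192.168.0.30"]


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo to host is answered (iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def detect_changes(previous, current):
    """Return (host, old_state, new_state) for each host whose state changed."""
    return [
        (host, previous.get(host), state)
        for host, state in current.items()
        if previous.get(host) != state
    ]


def monitor(targets, interval_s=60, probe=ping_once):
    """Probe all targets forever; print a timestamped line on every change."""
    previous = {}
    while True:
        current = {host: probe(host) for host in targets}
        for host, old, new in detect_changes(previous, current):
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}: {old} -> {new}", flush=True)
        previous = current
        time.sleep(interval_s)
```

Calling monitor(TARGETS) starts the loop. The first pass reports every target once (old state None), after that only transitions are printed, which gives the time of failure to compare against DHCP lease times or ARP expiry.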

The main symptom (there might be other problems as well) is that ARP requests stop getting through, so once the existing ARP entries expire, the problem presents itself.
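One way to confirm the ARP theory from a wifi client is to look for unresolved neighbour entries at the moment the pings fail. The sketch below assumes a Linux client, where the kernel exposes the ARP table as text in /proc/net/arp; an entry without the "completed" flag (0x2) is one for which no ARP reply has been received. The function names are my own.

```python
ATF_COM = 0x02  # "completed entry" flag, as defined in <linux/if_arp.h>


def parse_arp_table(text):
    """Parse the text of /proc/net/arp into a list of dicts."""
    entries = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        entries.append(
            {"ip": ip, "flags": int(flags, 16), "mac": mac, "device": device}
        )
    return entries


def unresolved(entries):
    """Return the entries that have not been completed by an ARP reply."""
    return [e for e in entries if not e["flags"] & ATF_COM]


def read_arp_table(path="/proc/net/arp"):
    """Read and parse the live ARP table of the machine this runs on."""
    with open(path) as f:
        return parse_arp_table(f.read())
```

If unresolved(read_arp_table()) lists .20 while .1 and .30 are complete, that matches the symptom described above.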

The problem is the same whether openwrt or ubuntu is installed on the BPI-R3.

As an aside, other, older access points with MediaTek chipsets that I installed OpenWrt on had similar problems with some packets not getting through. So I think this has been a very long-standing issue with MediaTek chipsets.

So, I am basically asking for any hooks one can have into the mediatek firmware that will finally diagnose the cause of the problem, so one can work around it.

Diagnostics so far: Only plug one LAN cable into the BPI-R3. The problem remains no matter which LAN or SFP port you plug into.

So, from this I am suspecting the wifi chips / wifi firmware.

For clarification on “not enabled wed”: can you please point me to the exact setting that toggles this? It is a little ambiguous otherwise.

Maybe this is the issue…

My image uses different subnets for LAN and wifi. (I am not sure whether I have already changed everything to systemd. I used dnsmasq before, and still do in my own setups, because of the MAC-to-IP mappings, which systemd did not support the last time I tried.)

So maybe your wifi clients get IPs from the wifi subnet and then cannot reach the LAN clients at the IP layer, because they are on the same LAN segment but in different IP subnets.

The timing would also match the DHCP lease time…

So please check the IP settings of your clients.
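As a quick sketch of that check: given the addresses the clients actually hold, list the ones that fall outside the LAN subnet. The subnet is the example from this thread, the function name is made up for illustration, and the client addresses would have to be collected by hand.

```python
import ipaddress


def outside_subnet(client_ips, lan_cidr):
    """Return the client addresses that are not inside the LAN subnet."""
    network = ipaddress.ip_network(lan_cidr)
    return [ip for ip in client_ips if ipaddress.ip_address(ip) not in network]
```

For example, outside_subnet(["192.168.0.120", "192.168.1.105"], "192.168.0.0/24") flags only the second (hypothetical) address.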

@frank-w The IP addresses are fine. When the wifi client initially connects, it can reach everything, so routing and subnets are fine. It is only after being connected for a while that things start failing. In case it makes a difference, I am not using the BPI-R3 as a DHCP or DNS server for the wifi clients. The DHCP/DNS server is elsewhere on the network.

TODO: I have thought of another test I can do to further diagnose this. But I am waiting for the fault to appear before posting results.

With bridged wifi/LAN I guess you will not see much in tcpdump. On the R2 there was an issue with this setup in the wifi driver. I am not sure it works on the R3 without problems, but that is a completely different driver.

Most debug output will come from the driver itself or via debugfs. You can also print interface statistics with ip or ethtool. Maybe there are some errors or dropped packets.
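The same counters that ip shows are also exposed by the kernel under /sys/class/net/<interface>/statistics/, so they can be snapshotted before and after a failure and compared. A minimal sketch, assuming a Linux system; the interface name in the usage note is only an example.

```python
from pathlib import Path

# Standard per-interface counters exposed by the Linux kernel in sysfs.
COUNTERS = (
    "rx_packets", "tx_packets",
    "rx_errors", "tx_errors",
    "rx_dropped", "tx_dropped",
)


def read_counters(interface, sysfs_root="/sys/class/net"):
    """Return the current value of each counter for one interface."""
    stats_dir = Path(sysfs_root) / interface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTERS}


def counter_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```

Taking read_counters("wlan0") while things work and again while they fail, then passing both to counter_delta, shows whether packets are still counted as received on the wifi side and whether any error or drop counter moved.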

Have you disabled the DHCP parts? If not, you will still have it running in parallel with your external one, which also causes strange errors.

@frank-w DHCP is not running on the R3. As for the interface stats: they do not register any dropped packets when the problem occurs.

The R3 is not running any firewall. You are correct, tcpdump on the R3 might not be capturing all the traffic, even when the R3 is forwarding packets OK, i.e. before the packet loss bug occurs.

A problem did occur today with regards to the BPI-R3 not forwarding wifi packets.

It is probably different from some of the other causes, but this time I have 4 SSIDs:

2g-1

2g-2

5g-3

5g-4

During this problem, 2g-1 failed to forward packets. Any client attempting to connect to it failed because DHCP was failing. If the client was given a static IP address, ARPs were not resolving. If the client was given static ARP entries, IP packets were not forwarded. They appeared in the output of tcpdump -i wlan0, but were not forwarded out of the ethernet ports.

Clients trying to connect to 2g-2, 5g-3 and 5g-4 continued to work. I looked at the bridge status, and the 2g-1 interface had dropped off the list of interfaces on the bridge, while the 2g-2, 5g-3 and 5g-4 interfaces were still there. Restarting hostapd (the instance serving 2g-1 and 2g-2) fixed the problem, and both 2g-1 and 2g-2 then appeared on the bridge. The clients were also able to forward packets through the BPI-R3 again.

So, I have found one bug and it looks to be caused by hostapd removing an interface from the bridge for no apparent reason.
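To catch this particular failure the next time it happens, the bridge membership can be checked periodically. On Linux, the ports of a bridge appear as entries in /sys/class/net/<bridge>/brif/. The sketch below compares that directory against a list of expected ports; the bridge and interface names in the usage note are placeholders, not the real names on my system.

```python
from pathlib import Path


def bridge_ports(bridge, sysfs_root="/sys/class/net"):
    """Return the sorted names of the interfaces currently on the bridge."""
    brif = Path(sysfs_root) / bridge / "brif"
    return sorted(entry.name for entry in brif.iterdir())


def missing_ports(bridge, expected, sysfs_root="/sys/class/net"):
    """Return the expected ports that are no longer attached to the bridge."""
    present = set(bridge_ports(bridge, sysfs_root))
    return sorted(set(expected) - present)
```

For example, missing_ports("br-lan", ["wlan0", "wlan0-1", "wlan1", "wlan1-1"]) would return the names of any wifi interfaces that have dropped off the bridge; run from cron with logging, it would timestamp the moment an interface disappears.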

I now have some better methods to diagnose the problem, so I will look to see if it happens again the next time packets stop forwarding.