Vlan enabled bridge bug?

Now I see: you want to set up a trunk port. You need to set it up on both sides. I have this kind of setup on my current routers (WRT3200ACM lan1 port connected to WRT1900ACS lan1 port).

on both routers:

root@wrouter:~# bridge vlan
port	vlan ids
lan4	 2 PVID Egress Untagged

lan3	 2 PVID Egress Untagged

lan2	 2 PVID Egress Untagged

lan1	 2
	 3

wan	 1 PVID Egress Untagged

aux	 1 PVID Egress Untagged

brlan	 2 PVID Egress Untagged

wlp1s0	 2 PVID Egress Untagged

wlp2s0	 2 PVID Egress Untagged

guest24	 3 PVID Egress Untagged

With lan1 as the trunk port. Tagged packets are sent over this cable between the routers. Neither PVID (ingress) nor Egress Untagged is set on lan1.
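The trunk membership of lan1 in the table above (VLANs 2 and 3, without the PVID and Egress Untagged flags) could be reproduced roughly as follows. This is only a sketch: `trunk_cmds` is a hypothetical helper name, and it prints the `bridge vlan add` commands instead of running them, so they can be reviewed before being applied as root on each router.

```shell
#!/bin/sh
# trunk_cmds PORT VID...: print (not run) the commands that add each VLAN
# to PORT as tagged, i.e. without the "pvid" and "untagged" flags.
# Hypothetical helper, not part of iproute2.
trunk_cmds() {
    port=$1
    shift
    for vid in "$@"; do
        echo "bridge vlan add dev $port vid $vid"
    done
}

# Port name and VLAN ids taken from the table above.
trunk_cmds lan1 2 3
```

If the port still carries the default vid 1 as PVID/untagged (an assumption about the starting state), it would also need `bridge vlan del dev lan1 vid 1` to match the table.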

ok, this seems to work

/sbin/bridge vlan add vid 500 tagged dev wan

The bridge now looks as follows:

root@bpi-r64:~# /sbin/bridge vlan show lanbr0                                   
port    vlan ids                                                                
wan      500                                                                    
                                                                                
lan1     1 PVID Egress Untagged                                                 
                                                                                
lan2     1 PVID Egress Untagged                                                 
                                                                                
br0      1 PVID Egress Untagged                                                 
                                                                                
lanbr0   500 PVID Egress Untagged                                               
                                                                                
root@bpi-r64:~#

changed mac of vlan on my main-router to match yours

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
aa:bb:cc:dd:ee:ff dev wan vlan 500 self                                         
root@bpi-r64:~# 

put vlan remotely down, to ensure no more traffic is reaching r64

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
root@bpi-r64:~#

The self entry was dropped without an mt7530_fdb_write message. After a while it seems to be written down to hardware by the dsa-driver:

[  332.767384] DEBUG: Passed mt7530_fdb_write 366               
[  332.771975] DEBUG: Passed mt7530_fdb_write 381                               
[  332.777816] DEBUG: Passed mt7530_fdb_write 366                               
[  332.782696] DEBUG: Passed mt7530_fdb_write 381                               
[  332.788104] DEBUG: Passed mt7530_fdb_write 366                               
[  332.792985] DEBUG: Passed mt7530_fdb_write 381                               
[  332.798459] DEBUG: Passed mt7530_fdb_write 366                               
[  332.803358] DEBUG: Passed mt7530_fdb_write 381

flush is working

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
root@bpi-r64:~# echo 1 > /sys/class/net/lanbr0/bridge/flush                     
root@bpi-r64:~# [  434.499488] DEBUG: Passed mt7530_fdb_write 366               
[  434.504126] DEBUG: Passed mt7530_fdb_write 381                               
[  434.509221] DEBUG: Passed mt7530_fdb_write 366                               
[  434.513847] DEBUG: Passed mt7530_fdb_write 381                               
[  434.518987] DEBUG: Passed mt7530_fdb_write 366                               
[  434.523566] DEBUG: Passed mt7530_fdb_write 381                               
[  434.528659] DEBUG: Passed mt7530_fdb_write 366                               
[  434.533228] DEBUG: Passed mt7530_fdb_write 381                               
                                                                                
root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
root@bpi-r64:~# 

put vlan up again remotely and i got the 2 entries back (again without dsa handling)

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
aa:bb:cc:dd:ee:ff dev wan vlan 500 self                                         
root@bpi-r64:~# 

mac currently stays in cache, but i cannot delete it

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
aa:bb:cc:dd:ee:ff dev wan vlan 500 self                                         
root@bpi-r64:~# /sbin/bridge fdb del aa:bb:cc:dd:ee:ff dev wan vlan 500 self    
[  615.488224] DEBUG: Passed dsa_switch_event 480                               
[  615.493120] DEBUG: Passed dsa_switch_fdb_del 169                             
[  615.498140] DEBUG: Passed dsa_switch_fdb_del 172                             
[  615.503356] DEBUG: Passed mt7530_port_fdb_del 1346                           
[  615.508585] DEBUG: Passed mt7530_fdb_write 366                               
[  615.513426] DEBUG: Passed mt7530_fdb_write 381                               
[  615.518826] DEBUG: Passed mt7530_port_fdb_del 1351 ret:0                     
[  615.524455] DEBUG: Passed dsa_switch_event 482 err:0                         
root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
aa:bb:cc:dd:ee:ff dev wan vlan 500 self                                         
root@bpi-r64:~# 

It seems DSA can read the mac-entries (it removes an entry from the software list if the switch deletes it by itself), but the dsa-driver cannot trigger a deletion, only a flush.

After some time the mac-entry is dropped by itself (e.g. switch mac timeout):

root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
aa:bb:cc:dd:ee:ff dev wan vlan 500 self                                         
root@bpi-r64:~# [  709.599846] DEBUG: Passed mt7530_fdb_write 366               
[  709.604688] DEBUG: Passed mt7530_fdb_write 381                               
[  709.612852] DEBUG: Passed mt7530_fdb_write 366                               
[  709.618276] DEBUG: Passed mt7530_fdb_write 381                               
[  709.625130] DEBUG: Passed mt7530_fdb_write 366                               
[  709.631539] DEBUG: Passed mt7530_fdb_write 381                               
[  709.637545] DEBUG: Passed mt7530_fdb_write 366                               
[  709.642884] DEBUG: Passed mt7530_fdb_write 381                               
                                                                                
root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
aa:bb:cc:dd:ee:ff dev wan vlan 500 master lanbr0                                
root@bpi-r64:~# [  725.983831] DEBUG: Passed mt7530_fdb_write 366               
[  725.988736] DEBUG: Passed mt7530_fdb_write 381                               
[  725.994868] DEBUG: Passed mt7530_fdb_write 366                               
[  726.001772] DEBUG: Passed mt7530_fdb_write 381                               
[  726.007564] DEBUG: Passed mt7530_fdb_write 366                               
[  726.012358] DEBUG: Passed mt7530_fdb_write 381                               
[  726.017889] DEBUG: Passed mt7530_fdb_write 366                               
[  726.022748] DEBUG: Passed mt7530_fdb_write 381                               
[  726.028195] DEBUG: Passed mt7530_fdb_write 366                               
[  726.033031] DEBUG: Passed mt7530_fdb_write 381                               
[  726.038506] DEBUG: Passed mt7530_fdb_write 366                               
[  726.043437] DEBUG: Passed mt7530_fdb_write 381                               
                                                                                
root@bpi-r64:~# /sbin/bridge fdb | grep aa:bb:cc:dd:ee:ff                       
root@bpi-r64:~#

Deleting the ‘self’ entries is possible when vlan is disabled, so it should also be possible when vlan is enabled. I think it is a bug…

I wonder why read and write use different registers…

reg[i] = mt7530_read(priv, MT7530_TSRA1 + (i * 4));

mt7530_write(priv, MT7530_ATA1 + (i * 4), reg[i]);

But it seems correct to have a separate write register.

https://elixir.bootlin.com/linux/latest/source/drivers/net/dsa/mt7530.h#L76

Flush is a separate access, not a loop deleting all entries.

hi

In most cases it makes no sense to delete the mac addresses from the member interfaces of a bridge.

VLAN filtering on the Linux bridge is rarely necessary.

Check out the documentation about VLAN filtering on the Linux bridge.

holger

Hi Holger,

I would like to use vlan for setting up guest wifi.

I would like to be able to delete fdb self entries so that I can set up a home network entirely at the layer 2 bridge level, between router and access points. The fdb needs to be tidied up to support roaming over this network. After that I would like to set up 802.11r fast roaming. I already have this set up successfully on the Marvell chips of the Linksys routers, but code development is at a full stop for those wifi drivers. Therefore I am migrating to MediaTek.

As long as the fdb self entry stays stuck on lanx to a remote router, it is impossible for a wifi client to connect to local wifi in the same bridge.

See the links below about this roaming issue. It is DSA related, and I’ve experienced that it also happens on the MediaTek DSA switch.

For now I use my own userspace program to tidy up the fdb. I call it bridgefdbd, but it cannot delete fdb self entries, for the same reason iproute2 cannot. I will use the program until the kernel drivers are fixed. The last link looks like a fix for Marvell that automatically deletes the stuck self entries within the kernel.

https://www.spinics.net/lists/netdev/msg642583.html

https://gitlab.nic.cz/turris/os/build/-/issues/165

https://github.com/Chadster766/McDebian/issues/70

https://www.spinics.net/lists/netdev/msg645130.html

GOOD NEWS:

It seems that for Marvell they finally, after years, fixed this in the following commit: yay :blush:

It says: The switch may generate an “ATU violation” warning when a client moves from the CPU port to a switch port because the static ATU entry added by DSA core still points to the CPU port. DSA core will then clear the static entry so it is not fatal. Disable the warning so it will not confuse users.

The commit is titled “dsa roaming fix”, and there was one for mt7530 too (I guess it covers mt7531 as well, as it’s the same driver):

https://patchwork.kernel.org/project/linux-mediatek/patch/[email protected]/

Btw, it was the same author as for Marvell :slight_smile:

Thanks. Have to check this out then. Maybe first try without vlan.

Ok, now I have tested wifi roaming on a home network connected router-lan to bpir64-lan.

It works when vlan is disabled, but it also works if vlan is enabled, but with vlan 1 only!

This is with enabled vlan, vlan 1. First command, phone connected to remote router, second command, phone roamed to BPI-R64.

root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev lan0 vlan 1 master brlan 
aa:bb:cc:dd:ee:ff dev lan0 vlan 1 self 
root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev wlan0 vlan 1 master brlan 

Phone has good internet connectivity :slight_smile:

Now the same setup but with vlan 2

root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 master brlan 
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 self 
root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 self 
aa:bb:cc:dd:ee:ff dev wlan0 vlan 2 master brlan 

This last result leaves the phone unable to reach the dhcp server (on the remote router). No connectivity.
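The stuck state shown above (a ‘self’ entry left on lan0 while the ‘master’ entry has already moved to wlan0) can be spotted automatically. Below is a minimal sketch, assuming the `bridge fdb` output format seen in this thread; `find_stuck_self` is a hypothetical helper name, it only reports and does not delete anything, and it remembers just one ‘self’ port per MAC and VLAN.

```shell
#!/bin/sh
# find_stuck_self: read `bridge fdb` output on stdin and print every
# MAC+VLAN whose hardware ("self") entry sits on a different port than
# the software bridge ("master") entry. Hypothetical helper.
# Usage on the router: bridge fdb | find_stuck_self
find_stuck_self() {
    awk '
    {
        mac = $1; port = ""; vlan = "-"; is_self = 0; is_master = 0
        for (i = 2; i <= NF; i++) {
            if ($i == "dev")    port = $(i + 1)
            if ($i == "vlan")   vlan = $(i + 1)
            if ($i == "self")   is_self = 1
            if ($i == "master") is_master = 1
        }
        if (port == "") next
        key = mac " " vlan
        if (is_self)   self_port[key] = port
        if (is_master) master_port[key] = port
    }
    END {
        # Report only keys that have both entries, on different ports.
        for (key in self_port)
            if (key in master_port && self_port[key] != master_port[key])
                print key, "self=" self_port[key], "master=" master_port[key]
    }'
}
```

Each reported line corresponds to a `bridge fdb del <mac> dev <port> vlan <vid> self` of the kind tried earlier in this thread, which only helps on kernels where that deletion actually reaches the switch.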

The fix ‘net: dsa: mt7530: fix roaming from DSA user ports’ does not work anymore when vlan id does not equal 1.

I guess it suffers from the same bug: fdb ‘self’ entries with VLANs other than 1 cannot be deleted.

And where should I post this bug report? It would be nice if someone could fix it.

You could look up the author/maintainers/reviewers of the driver (get_maintainer.pl) and ask them (better to include the mailing list):

$ scripts/get_maintainer.pl drivers/net/dsa/mt7530.c

Sent, let’s see if someone can help…

Found a fix so that deleting manually works now. However automatic deletion when roaming still does not work.

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=11d8d98cbeef

Now at least I can use my bridgefdbd program. It can now keep the fdb cleaned up.

So I am happy for now…

And a fix has already been applied on top of it: only use the IVL bit for vid larger than 1.

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=7e777021780e9c373fc0c04d40b8407ce8c3b5d5

Is everything working now? I thought you still had problems with vlan 1.

No, only for vid larger than 1. This is now fixed.

I was quite happy, being able to delete the ‘self’ entries manually.

Upgrading to kernel 5.14 just broke that again.

The DSA driver has changed, and manipulation of ‘self’ entries using the bridge command (and also bridgefdbd) is no longer implemented. That should be fine, and wifi roaming on vlan 1 still works great.

However, wifi roaming on vlan greater than 1 is still not working correctly. The automatic deletion of the ‘self’ entry is still not working (as in kernel 5.12), but manual deletion is now also no longer possible…

In 5.14, I guess the intention is that this ‘self’ entry is deleted together with the ‘master static’ entry, but that does not happen…

I guess that’s what Vladimir talks about: https://patchwork.kernel.org/project/linux-mediatek/patch/[email protected]/#24317677

Vladimir found an issue, I need to try it one of these days:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=2b0a5688493a

Can confirm it works now.

Vladimir is working on a major makeover of the coupling between the software bridge’s fdb and the dsa-switch fdb. It is now in the netdev net-next git. It should be included starting from kernel 5.15-rc1.

Only need to add:

	ds->assisted_learning_on_cpu_port = true;

to mt7531_setup() in mt7530.c

Bridge fdb now looks like this, when roaming from router to (local) accesspoint to router:

aa:bb:cc:dd:ee:ff dev lan0 vlan 2 master brlan 
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 self 
root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev eth0 self permanent
aa:bb:cc:dd:ee:ff dev wlan0 vlan 2 offload master brlan 
root@bpi-r64:~# bridge fdb | grep aa:bb:cc:dd:ee:ff
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 master brlan 
aa:bb:cc:dd:ee:ff dev lan0 vlan 2 self 

All done inside the kernel.

There is not even a need to do anything with the fdb in userspace anymore, so my bridgefdbd program is no longer needed for this. Anyway, it is in a major changeover to support 802.11k and 802.11v, since it already communicates with all instances of hostapd on the entire local network.
