[BPI-R3] Crash in sta_set_sinfo+0xa18

Has anyone seen a crash like this?

I’ve got two BPI-R3 boards running the latest OpenWrt snapshots, with a mesh network between them (managed by wpad-mesh-wolfssl; the hostapd in this trace is from that package).

It crashes frequently, I suspect when devices move between the two APs. It’s worse with 802.11r turned on.

[  127.035755] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[  127.044543] Mem abort info:
[  127.047322]   ESR = 0x0000000096000005
[  127.051054]   EC = 0x25: DABT (current EL), IL = 32 bits
[  127.056356]   SET = 0, FnV = 0
[  127.059394]   EA = 0, S1PTW = 0
[  127.062518]   FSC = 0x05: level 1 translation fault
[  127.067387] Data abort info:
[  127.070254]   ISV = 0, ISS = 0x00000005
[  127.074078]   CM = 0, WnR = 0
[  127.077033] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000044ef3000
[  127.083452] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[  127.092139] Internal error: Oops: 96000005 [#1] SMP
[  127.097000] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_meta_bridge nft_masq nft_log nft_limit nft_hash nft_fwd_netdev nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_dup_netdev nft_ct nft_counter nft_compat nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_bridge nf_conntrack mt7915e mt76_connac_lib mt76 mac80211 iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables slhc sfp nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_dup_netdev nf_defrag_ipv6 nf_defrag_ipv4 mdio_i2c libcrc32c crc_ccitt compat crypto_safexcel pwm_fan hwmon i2c_gpio i2c_algo_bit tun sha1_generic seqiv md5 des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd
[  127.097141]  xhci_hcd gpio_button_hotplug usbcore usb_common
[  127.189709] CPU: 3 PID: 1566 Comm: hostapd Not tainted 5.15.104 #0
[  127.195870] Hardware name: Bananapi BPI-R3 (DT)
[  127.200382] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  127.207323] pc : sta_set_sinfo+0xa18/0xbb0 [mac80211]
[  127.212401] lr : sta_set_sinfo+0x5b8/0xbb0 [mac80211]
[  127.217450] sp : ffffffc00a1437a0
[  127.220748] x29: ffffffc00a1437a0 x28: 0000000000000001 x27: ffffffc00a143dd0
[  127.227864] x26: ffffff800007e880 x25: ffffff8006670900 x24: ffffff80049de735
[  127.234978] x23: 000000000000dad4 x22: ffffff8006452d00 x21: 0000058d1783bf82
[  127.242093] x20: ffffff80049de000 x19: ffffff800b4b1b00 x18: 0000000000000000
[  127.249209] x17: 00000000000015e0 x16: ffffffc008f05000 x15: 0000000000000af0
[  127.256324] x14: ffffff80049dedf8 x13: ffffff80049dedf8 x12: 0000000000000000
[  127.263439] x11: 0000000000000000 x10: ffffff80049dee00 x9 : 0000000000000000
[  127.270554] x8 : ffffff800b4b1c00 x7 : 0000000000002314 x6 : ffffff8003af03a0
[  127.277669] x5 : ffffff8003af0880 x4 : 0000000000000003 x3 : ffffff800b4b1b44
[  127.284784] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000004
[  127.291899] Call trace:
[  127.294332]  sta_set_sinfo+0xa18/0xbb0 [mac80211]
[  127.299038]  sta_set_sinfo+0xb10/0xbb0 [mac80211]
[  127.303741]  sta_info_destroy_addr_bss+0x4c/0x70 [mac80211]
[  127.309310]  ieee80211_color_change_finish+0x1bf8/0x1e80 [mac80211]
[  127.315573]  cfg80211_check_station_change+0x1384/0x4720 [cfg80211]
[  127.321834]  genl_family_rcv_msg_doit+0xb4/0x110
[  127.326437]  genl_rcv_msg+0xd0/0x1c0
[  127.329997]  netlink_rcv_skb+0x58/0x120
[  127.333816]  genl_rcv+0x34/0x50
[  127.336942]  netlink_unicast+0x1f0/0x2ec
[  127.340848]  netlink_sendmsg+0x19c/0x3d0
[  127.344753]  ____sys_sendmsg+0x21c/0x260
[  127.348660]  ___sys_sendmsg+0x80/0xf0
[  127.352307]  __sys_sendmsg+0x44/0xa0
[  127.355865]  __arm64_sys_sendmsg+0x20/0x30
[  127.359944]  invoke_syscall.constprop.0+0x4c/0xe0
[  127.364631]  do_el0_svc+0x40/0xd0
[  127.367930]  el0_svc+0x14/0x50
[  127.370973]  el0t_64_sync_handler+0xe0/0x110
[  127.375227]  el0t_64_sync+0x158/0x15c
[  127.378878] Code: d3441c42 12000c00 8b020cc2 f9409c42 (f9400446) 
[  127.384951] ---[ end trace b902af5b08d1a620 ]---
[  127.393978] Kernel panic - not syncing: Oops: Fatal exception
[  127.399708] SMP: stopping secondary CPUs
[  127.403615] Kernel Offset: disabled
[  127.407086] CPU features: 0x00000000,20000802
[  127.411427] Memory Limit: none
[  127.418816] Rebooting in 3 seconds..

While I’m at it, any way to get it to reboot into the same system? It wouldn’t be so bad if I didn’t have to go reboot the device manually afterward.

I don’t know the OpenWrt-specific boot process well enough to make sure it boots back into the same system instead of recovery, but maybe I can help fix the issue itself.

https://elixir.bootlin.com/linux/latest/source/net/mac80211/sta_info.c#L2505

You can try adding some debug output to this function to check which pointer is NULL:

printk(KERN_ALERT "DEBUG: Passed %s %d val:%px\n", __func__, __LINE__, val);

Replace val with each pointer that could be NULL, printing it before the access that dereferences it. %px prints the full pointer value; casting the pointer to unsigned int would truncate it on this 64-bit kernel.

E.g.

printk(KERN_ALERT "DEBUG: Passed %s %d sdata:%px\n", __func__, __LINE__, sdata);
printk(KERN_ALERT "DEBUG: Passed %s %d sdata->local:%px\n", __func__, __LINE__, sdata->local);
sinfo->generation = sdata->local->sta_generation;
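
Before rebuilding, the oops itself already hints at which load hit the NULL pointer. On arm64 the word in parentheses on the Code: line is the instruction at pc, here f9400446. The sketch below is a minimal userspace decoder for just one instruction form, the 64-bit LDR with unsigned immediate offset. The struct and function names are made up for illustration, and it is not a general disassembler.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: fields of a decoded "LDR Xt, [Xn, #imm]". */
struct ldr_uimm64 {
    int valid;        /* 1 if the word is a 64-bit LDR with unsigned offset */
    unsigned rt;      /* destination register number (Xt) */
    unsigned rn;      /* base register number (Xn; 31 would mean SP) */
    unsigned offset;  /* byte offset: imm12 scaled by 8 */
};

/* Decode one 32-bit A64 instruction word taken from the oops "Code:" line. */
struct ldr_uimm64 decode_ldr_uimm64(uint32_t insn)
{
    struct ldr_uimm64 d = {0, 0, 0, 0};

    /* Bits 31:22 == 1111100101 identify this instruction form. */
    if ((insn & 0xFFC00000u) != 0xF9400000u)
        return d;

    d.valid = 1;
    d.rt = insn & 0x1Fu;                      /* bits 4:0   */
    d.rn = (insn >> 5) & 0x1Fu;               /* bits 9:5   */
    d.offset = ((insn >> 10) & 0xFFFu) * 8u;  /* bits 21:10 */
    return d;
}

/* Print the decoded form, or a note if the word is some other instruction. */
void print_insn(uint32_t insn)
{
    struct ldr_uimm64 d = decode_ldr_uimm64(insn);

    if (!d.valid) {
        printf("%08x: not a 64-bit LDR (unsigned offset)\n", (unsigned)insn);
        return;
    }
    if (d.rn == 31)
        printf("%08x: ldr x%u, [sp, #%u]\n", (unsigned)insn, d.rt, d.offset);
    else
        printf("%08x: ldr x%u, [x%u, #%u]\n", (unsigned)insn, d.rt, d.rn, d.offset);
}
```

Calling print_insn(0xf9400446) gives ldr x6, [x2, #8], and x2 is 0 in the register dump, which matches the fault address 0x8. The word before it, f9409c42, decodes as ldr x2, [x2, #312], so the NULL pointer was apparently read from offset 312 of some structure. Which field that is depends on the struct layout of this particular build, so the printk approach is still needed to put a name on it.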

Heh, that’s the easy part (I speak enough C to just do that) but I don’t actually know what the process for building and booting a custom kernel is in a way that would leave my existing configuration likely to work. Any tips?

It should be possible to change it in OpenWrt’s patched source and recompile. But I don’t use OpenWrt, so I don’t know the build tools well enough.

https://openwrt.org/docs/guide-developer/start#using_the_toolchain

OpenWrt uses out-of-tree Wi-Fi drivers built using compat-backport. That allows us to use bleeding-edge Wi-Fi drivers on top of a stable Linux kernel. Hence, if you want to add printk calls anywhere in the Wi-Fi drivers, it has to happen via package/kernel/mac80211/patches/. The easiest and most convenient way is probably to use quilt, which will look roughly like this:

make package/mac80211/{clean,prepare} QUILT=1
cd build_dir/target-*/linux-*/backports-*
quilt push -a
quilt new patches/subsys/999-my-custom-patch.patch
quilt add ${files_to_be_edited}
# now edit files
quilt refresh
cd ../../../..
make package/mac80211/compile V=99
# if it builds successfully, proceed to build the complete image
make -j$(nproc)

Alrighty, let me give it a spin, thank you for the TL;DR version, that’s super helpful.

Another user stumbled over this here: https://github.com/openwrt/openwrt/issues/12143#issuecomment-1740868809

@aredridel have you found out why this happens?

That’s right, and like @aredridel I activated Wi-Fi roaming on my four BPI-R3s. I will try disabling that on Monday.