[BPI-R2] Kernel Development

It works if the kernel modules are enabled as described HERE.

Hi, everyone!

Today my R2 halted on the 5.8 kernel for the first time ever.

  1. First the network went down (wifi; I didn’t test ethernet).
  2. I got a dmesg dump over the serial console: 5.8-dmesg.txt (72.9 KB) (the dump starts from the first trace after reboot; the last one is incomplete).
  3. After a reboot everything seems fine for now; I can’t reproduce the bug.

No 3rd party modules/drivers were used.

uptime 1 day and ~1 hr

[90769.755898] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[90769.755915] rcu:     0-....: (3 GPs behind) idle=962/0/0x1 softirq=2280595/2280595 fqs=3892 
[90769.755922]  (detected by 1, t=8407 jiffies, g=5917193, q=141)
[90769.755930] Sending NMI from CPU 1 to CPUs 0:
[90769.756369] NMI backtrace for cpu 0
[90769.756374] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.8.0-rc6-arm+ #1
[90769.756376] Hardware name: Mediatek Cortex-A7 (Device Tree)
[90769.756378] PC is at mt76_rx_aggr_reorder+0x4/0x2dc [mt76]
[90769.756380] LR is at mt76_rx_poll_complete+0x2e8/0x454 [mt76]
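
A minimal sketch for checking whether every stall in a longer dump points at the same code. The helper below (`stall_pcs` is a made-up name, not an existing tool) reads dmesg text on stdin and counts the function and module named on each `PC is at` line. It assumes the ARM backtrace format shown above.

```sh
# Count which function/module the PC was in for every backtrace in a dump.
# Assumption: lines look like "[ts] PC is at func+0xOFF/0xSIZE [module]".
stall_pcs() {
    awk '/PC is at / {
        sym = ""
        for (i = 1; i < NF; i++)
            if ($i == "at") { sym = $(i + 1); break }
        sub(/\+.*/, "", sym)                       # drop the +offset/size suffix
        mod = ($NF ~ /^\[.*\]$/) ? $NF : "[built-in]"
        count[sym " " mod]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}
```

Usage would be something like `stall_pcs < 5.8-dmesg.txt`. If most entries name an mt76 function, that supports the suspicion about the wifi driver.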

Seems like an rcu-stall bug in the mt76 wifi driver. Some time before it there are warnings in the clock/reset drivers caused by lima_clk_disable:

[45565.315187] WARNING: CPU: 1 PID: 20682 at drivers/reset/core.c:358 reset_control_assert+0x198/0x200

[45565.316034] WARNING: CPU: 1 PID: 20682 at drivers/clk/clk.c:958 clk_core_disable+0xec/0x28c
[45565.316040] g3d_core already disabled

[45565.317631] WARNING: CPU: 1 PID: 20682 at drivers/clk/clk.c:958 clk_core_disable+0xec/0x28c
[45565.317636] mmpll_ck already disabled
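
To see how often each warning site fires across a whole dump, a rough sketch like the one below can count the `file:line` that follows `at` on every WARNING line. The name `warn_sites` is made up for this example, and it assumes the `WARNING: CPU: ... at file:line func` layout seen above.

```sh
# Summarize kernel WARNING sites in a dmesg dump read from stdin.
# Prints "<count> <file:line>", most frequent first.
warn_sites() {
    awk '/WARNING: / {
        for (i = 1; i < NF; i++)
            if ($i == "at") { site[$(i + 1)]++; break }
    }
    END { for (s in site) print site[s], s }' | sort -rn
}
```

For example, `warn_sites < 5.8-dmesg.txt` would list the clk and reset sites with their hit counts.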

I see you have 5.8.0-rc6; could you try the 5.8-main tree?

Sure, I’ll try to update it this coming weekend.

A couple of errors, with a self-reboot, within 2 days: 5.8-rc-fails.txt (23.1 KB)

Now downgraded to 5.5, and 5.8-main compilation is in progress…

P.S. It was not actively used (as a PC ;)), not even logged in; it just acted as a router (3-6 wifi clients on both the internal wifi and mt76), with a serial console attached.

P.P.S. A BPI-MT7615 wifi card (a replacement for the MT7612 wifi card) is on the way, and as I understand it, it uses the same wifi driver as the MT7612. Is that correct?

Right, MT7615 also uses the mt76 wifi driver, but with a separate folder/CONFIG option/module.

I hope the error is fixed in 5.8 final… maybe by this: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/wireless/mediatek/mt76?h=linux-5.8.y&id=4ac668a3b8c9d3477a3fe162c1cfeb867dd65de8

Otherwise it would be interesting to know whether the error is also in 5.7, 5.6, … just to find the breaking commit.

For lima I guess we need the power commit I added to 5.4-main.

I’ve switched to the 5.8.0 release. I’ll report any errors/stability issues.

Yep, or the dts mod from the 4.4 kernel, which is in my 5.5-lima branch. I’m not sure which way is better, but both seem to work.

It is better to find bugs in the most recent version :slight_smile:

5 days of uptime with 5.8-main without any errors/tracebacks/etc., interrupted by an accidentally pulled power cord :slight_smile:

P.S. Without usb-wifi.

good to hear :slight_smile:

And is usb-wifi still buggy? If there is a problem, it should be reported so it gets fixed in mainline.

My LTE modem with wifi hotspot is buggy ;), so I rolled back to a USB LTE modem plugged into the R2; the usb-wifi dongle was used before as a wifi client. I’ll test usb wifi ASAP, but right now I have no use for it.

P.S. After the modem+wifi replacement I’ll be able to place the R2 near the TV, so the hdmi/lima related stuff is going to be back in business soon :slight_smile:

HDMI works in 5.8, but lima (the power changes) is not ported to it.

I started work on 5.9

This commit breaks bootup of r2 and r64: f97dbf48ca43009e8b8bcdf07f47fc9f06149b36 (irqchip/mtk-sysirq: Convert to a platform driver)

I got this fix, which works: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-next&id=7828a3ef8646fb2e69ed45616c8453a037ca7867 but it leaves some errors in the bootlog… It looks like the full irq series will get reverted in one of the next rcs.

Updated the 5.9-rc branch. HDMI and wifi are working so far (5.9-hdmi, 5.9-wifi).

Got a new traceback on 5.8-main: dmesg.txt (7.0 KB)

It appears when copying large files with mc (tens of GB); the presence and amount of swap have no effect.

It doesn’t seem to affect the OS.

Average free mem:

top - 23:16:34 up 1 day,  2:13,  4 users,  load average: 1,26, 0,92, 1,39
Tasks: 152 total,   1 running, 151 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us, 18,3 sy,  0,0 ni, 61,2 id, 18,7 wa,  0,0 hi,  1,3 si,  0,0 st
MiB Mem :   2008,0 total,      6,7 free,    110,4 used,   1890,9 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   1834,7 avail Mem

P.S. I do not remember such messages on earlier kernels, but I can’t rule it out.

P.P.S. Looks like a general kernel issue, not R2-specific.

Looks like a bug in the mt76 driver:

mt76x02_update_beacon_iter [mt76x02_lib]

You can try opening an issue on github.com/openwrt/mt76, but mention that you’re using mainline 5.9-rc1… I don’t know whether they sync mt76 in every merge window.

Hmm… are you copying locally? It should not be affected by an mc file copy… I guess it is caused by some resource limit (cpu/ram), if it is related to the copy at all.

Btw, for git I have created a swap file to extend RAM… otherwise I got OOM reaper messages. I tried different git configs (window size, threads, …), but only the swap file works.
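
For anyone who wants to try the same workaround, here is a minimal sketch of creating a swap file. The path `/swapfile` and the 2 GiB size are assumptions for illustration, not values from this thread, and the `run`/`DRY_RUN` wrapper is a hypothetical helper that only prints the commands so they can be reviewed first. The real commands need root.

```sh
# Illustrative defaults (assumptions): adjust path and size to your board.
SWAPFILE=${SWAPFILE:-/swapfile}
SWAP_MB=${SWAP_MB:-2048}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

make_swapfile() {
    run dd if=/dev/zero of="$SWAPFILE" bs=1M count="$SWAP_MB" &&   # allocate the file
    run chmod 600 "$SWAPFILE" &&                                   # root-only access
    run mkswap "$SWAPFILE" &&                                      # write swap signature
    run swapon "$SWAPFILE"                                         # enable it now
}
```

Calling `make_swapfile` with `DRY_RUN=1` set prints the four commands; without it they are executed. The swap is not persistent across reboots unless an entry is also added to /etc/fstab.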

Dts-patches (hdmi and pause fix) are now merged to linux-next.

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts

The mtk drm patches afaik need to be merged to drm-next first => done, waiting for them to be visible in linux-next.

As the tmds patch is based on the hdmi-phy move, the file drivers/phy/mediatek/phy-mtk-hdmi.c does not exist there yet, so we can use mtk_dpi.c to see the changes made by the commit “Change the getting possible_crtc way”:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/drivers/gpu/drm/mediatek/mtk_dpi.c

Edit: linux-next from 2020-09-29 contains the mtk-drm patches

Hi, I booted 5.10 for the first time after adding the basic patches (build.sh, defconfigs, …) and noticed at least one warning:

[    6.151110] ahci 0000:02:00.0: version 3.0
[    6.151136] ahci 0000:02:00.0: enabling device (0140 -> 0143)
[    6.157056] ------------[ cut here ]------------
[    6.161730] WARNING: CPU: 2 PID: 73 at include/linux/msi.h:213 pci_msi_setup_msi_irqs.constprop.0+0x78/0x80
....
[    6.724607] WARNING: CPU: 2 PID: 73 at include/linux/msi.h:219 free_msi_irqs+

https://elixir.bootlin.com/linux/v5.10-rc1/source/include/linux/msi.h#L219

which comes from commit 077ee78e392869e46ae6bdc6ba2a3c4249d0b5e1 (PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable)

dmesg_5.10.log (47,6 KB)

I sent a patch to fix it… patchwork => The patch only suppresses the warning but does not fix the root cause… see the comment from Thomas Gleixner… I have not found the root cause yet ;( I guess it is the ahci/mtk_pcie driver or a missing option in the dts.

It looks like the pcie driver does not set up an MSI domain for mt7623/mt2701. I reverted my patch and booted on the bpi-r64 and got no warning… this confirms my assumption that the warning occurs only on mt2701/mt7623, where no MSI domain is set up.

The patch from Marc Zyngier seems to work; at least I get no warning.

After getting hnat working on the latest LTS (5.10), I also got the second gmac working with the help of Deng Qingfang and Marek Behún. Switching the CPU port of a port needs a modified ip tool (iproute2):

root@bpi-r2:~# ip link show wan
5: wan@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
root@bpi-r2:~# ip link set wan link eth1
root@bpi-r2:~# ip link show wan
5: wan@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
root@bpi-r2:~#
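
In a script, the switch can be verified by reading the name after the `@` in the first line of the `ip link show` output, as in the session above. A small sketch; `cpu_port_of` is a made-up helper name, and it assumes the `N: port@master: <FLAGS>` layout shown here:

```sh
# Print the master (CPU-port) interface of a DSA port.
# Reads the output of "ip link show <port>" on stdin.
cpu_port_of() {
    awk -F': ' 'NR == 1 {
        n = split($2, a, "@")      # "wan@eth1" -> a[1]="wan", a[2]="eth1"
        if (n == 2) print a[2]
    }'
}
```

For example, `ip link show wan | cpu_port_of` should print `eth1` after the change. Note that `ip link set wan link eth1` itself only works with the patched iproute2 and kernel mentioned in this post.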

references:

modified iproute2-tool can be found here:

The second gmac works on both devices (r2+r64) in a quick test, now also with the default CPU port set via dts; changing the gmac via ip works.

I have uploaded compiled iproute2 binaries here: https://drive.google.com/drive/folders/1Zkxs_gglxxWCvVyY3HwPNNUmhR3DqW_h?usp=sharing

I have uploaded a compiled 5.10 kernel with the hnat+gmac patches here:

https://drive.google.com/drive/folders/17MoFc3vIuGHDEV5SsCmGegJDr009IXls?usp=sharing

compiled nftables can be found here: https://drive.google.com/drive/folders/1hajKvqQa96WRrAy52fQX90i59I1s0h-i?usp=sharing

The second gmac does not seem to be stable… I see massive retransmits when running iperf3. I asked the network specialists from mtk whether they can help.

Does the R64 openwrt source (https://wiki.banana-pi.org/Banana_Pi_BPI-R64) support HW NAT? I can only find it for the R2. How do I add this module? Throughput in my test is not high.

You should use a newer source, not the old vendor openwrt. Afaik hw-nat is in mainline by 5.12. My 5.10-hnat branch has support, but it is not openwrt; you can use its commits to add patches to openwrt… maybe mainline openwrt has already backported it to 5.10.

Btw, this thread is about r2, not r64.