As mentioned earlier, here are my findings on BPI-R2 Ethernet speed and behavior with kernel 4.14 (tested with OpenWRT compiled from source, BananaPI BPI-R2 Openwrt18.06 Demo Image and Source Code Release 2019-03-06, and with frank’s 4.14-main branch). For now I’m leaving performance figures aside, because generating 1Gbps of traffic with iperf or iperf3 from the R2 towards a host connected to any of the lanX ports leads to a network stall accompanied by a NETDEV WATCHDOG message in the kernel log. Here is one from frank’s 4.14-main kernel:
[ 75.991517] ------------[ cut here ]------------
[ 75.996146] WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x244/0x248
[ 76.004376] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out
[ 76.011282] Modules linked in: ipv6 qcserial pppoe ppp_async usb_wwan rndis_host qmi_wwan pppox ppp_generic pl2303 nf_nat_pptp nf_conntrack_pptp lz4 iptable_nat ipt_REJECT ipt_MASQUERADE cdc_ether ax88179_178a asix xt_time xt_tcpudp xt_state xt_recent xt_policy xt_nat xt_multiport xt_mark xt_mac xt_limit xt_helper xt_esp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT usbserial usbnet slhc nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4 nf_log_common nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323
[ 76.081771] nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conntrack lz4_decompress lz4_compress iptable_raw iptable_mangle iptable_filter ipt_ah ip_tables crc_ccitt cdc_wdm fuse xt_set x_tables ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink nfsv4 nfsv3 nfsd nfs ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 rpcsec_gss_krb5 auth_rpcgss oid_registry tun loop af_key xfrm_user xfrm_ipcomp xfrm_algo lockd sunrpc grace dns_resolver raid10 raid1 raid0 md_mod nls_utf8 nls_cp866 nls_cp1251
[ 76.152022] zram zsmalloc md5 echainiv ecb des_generic cts cbc authenc nls_iso8859_1 nls_cp437 uas usb_storage leds_gpio ohci_platform ohci_hcd ehci_pci ehci_platform ehci_hcd mii
[ 76.168053] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.117-4.14-main #2
[ 76.174959] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 76.180508] [<c010ec84>] (unwind_backtrace) from [<c010abf0>] (show_stack+0x10/0x14)
[ 76.188199] [<c010abf0>] (show_stack) from [<c05af784>] (dump_stack+0x78/0x8c)
[ 76.195370] [<c05af784>] (dump_stack) from [<c011738c>] (__warn+0xe8/0x100)
[ 76.202284] [<c011738c>] (__warn) from [<c01173dc>] (warn_slowpath_fmt+0x38/0x48)
[ 76.209714] [<c01173dc>] (warn_slowpath_fmt) from [<c051d240>] (dev_watchdog+0x244/0x248)
[ 76.217837] [<c051d240>] (dev_watchdog) from [<c016af54>] (call_timer_fn.constprop.3+0x28/0x98)
[ 76.226472] [<c016af54>] (call_timer_fn.constprop.3) from [<c016b04c>] (expire_timers+0x88/0x94)
[ 76.235193] [<c016b04c>] (expire_timers) from [<c016b124>] (run_timer_softirq+0xcc/0x194)
[ 76.243314] [<c016b124>] (run_timer_softirq) from [<c0101578>] (__do_softirq+0xe8/0x25c)
[ 76.251345] [<c0101578>] (__do_softirq) from [<c011c00c>] (irq_exit+0xbc/0x104)
[ 76.258604] [<c011c00c>] (irq_exit) from [<c0157410>] (__handle_domain_irq+0x80/0xec)
[ 76.266379] [<c0157410>] (__handle_domain_irq) from [<c010144c>] (gic_handle_irq+0x4c/0x90)
[ 76.274670] [<c010144c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 76.282093] Exception stack(0xdf071f80 to 0xdf071fc8)
[ 76.287104] 1f80: 00000003 c069b9a4 1ef9f000 c0114000 ffffe000 c0903c74 c0903c28 c092809c
[ 76.295217] 1fa0: c069e04c 410fc073 00000000 00000000 00000000 df071fd0 c01082ac c01082b0
[ 76.303327] 1fc0: 60000013 ffffffff
[ 76.306796] [<c010b7cc>] (__irq_svc) from [<c01082b0>] (arch_cpu_idle+0x38/0x3c)
[ 76.314140] [<c01082b0>] (arch_cpu_idle) from [<c014c53c>] (do_idle+0xd0/0x138)
[ 76.321396] [<c014c53c>] (do_idle) from [<c014c84c>] (cpu_startup_entry+0x18/0x1c)
[ 76.328909] [<c014c84c>] (cpu_startup_entry) from [<8010178c>] (0x8010178c)
[ 76.335861] ---[ end trace 6601a547925c5504 ]---
[ 76.340452] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
And here is the same from OpenWRT’s 4.14 18.02.1 kernel:
[35491.038734] ------------[ cut here ]------------
[35491.043342] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x158/0x224
[35491.051569] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out
[35491.058472] Modules linked in: qcserial pppoe ppp_async usb_wwan rndis_host qmi_wwan pppox ppp_generic pl2303 nf_nat_pptp nf_conntrack_pptp mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 lz4 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 cdc_ether ax88179_178a asix xt_time xt_tcpudp xt_state xt_recent xt_policy xt_nat xt_multiport xt_mark xt_mac xt_limit xt_helper xt_esp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT usbserial usbnet ts_fsm ts_bm slhc nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4 nf_log_common nf_flow_table_hw nf_flow_table nf_defrag_ipv4 nf_conntrack_tftp
[35491.129389] nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conntrack lz4_decompress lz4_compress iptable_raw iptable_mangle iptable_filter ipt_ah ip_tables crc_ccitt compat cdc_wdm fuse cryptodev xt_set x_tables ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink nfsv4 nfsv3 nfsd nfs ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 rpcsec_gss_krb5 auth_rpcgss
[35491.199843] oid_registry tun loop af_key xfrm_user xfrm_ipcomp xfrm_algo lockd sunrpc grace dns_resolver raid10 raid1 raid0 md_mod nls_utf8 nls_cp866 nls_cp1251 zram zsmalloc md5 echainiv ecb des_generic cts cbc authenc nls_iso8859_1 nls_cp437 uas usb_storage leds_gpio ohci_platform ohci_hcd ehci_pci ehci_platform ehci_hcd gpio_button_hotplug mii
[35491.230449] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.105 #0
[35491.236489] Hardware name: Mediatek Cortex-A7 (Device Tree)
[35491.242032] [<c010eb14>] (unwind_backtrace) from [<c010ac1c>] (show_stack+0x10/0x14)
[35491.249717] [<c010ac1c>] (show_stack) from [<c05a351c>] (dump_stack+0x78/0x8c)
[35491.256885] [<c05a351c>] (dump_stack) from [<c0117068>] (__warn+0xe4/0x100)
[35491.263793] [<c0117068>] (__warn) from [<c01170bc>] (warn_slowpath_fmt+0x38/0x48)
[35491.271218] [<c01170bc>] (warn_slowpath_fmt) from [<c050c0a8>] (dev_watchdog+0x158/0x224)
[35491.279334] [<c050c0a8>] (dev_watchdog) from [<c01689b8>] (call_timer_fn.constprop.3+0x28/0x94)
[35491.287964] [<c01689b8>] (call_timer_fn.constprop.3) from [<c0168aa0>] (expire_timers+0x7c/0x98)
[35491.296680] [<c0168aa0>] (expire_timers) from [<c0168b38>] (run_timer_softirq+0x7c/0x160)
[35491.304791] [<c0168b38>] (run_timer_softirq) from [<c010155c>] (__do_softirq+0xe4/0x250)
[35491.312818] [<c010155c>] (__do_softirq) from [<c011bc0c>] (irq_exit+0xac/0xf4)
[35491.319984] [<c011bc0c>] (irq_exit) from [<c0155b9c>] (__handle_domain_irq+0xbc/0xe4)
[35491.327752] [<c0155b9c>] (__handle_domain_irq) from [<c0101440>] (gic_handle_irq+0x5c/0x90)
[35491.336035] [<c0101440>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[35491.343454] Exception stack(0xdf06ff88 to 0xdf06ffd0)
[35491.348463] ff80: 00000002 c06cd8b0 1ef95000 c0113d60 ffffe000 c0903c74
[35491.356573] ffa0: c0903c28 c092d610 8000406a 410fc073 00000000 00000000 00000000 df06ffd8
[35491.364682] ffc0: c010834c c0108350 60000013 ffffffff
[35491.369694] [<c010b7cc>] (__irq_svc) from [<c0108350>] (arch_cpu_idle+0x34/0x38)
[35491.377034] [<c0108350>] (arch_cpu_idle) from [<c014af48>] (do_idle+0xa8/0x11c)
[35491.384286] [<c014af48>] (do_idle) from [<c014b240>] (cpu_startup_entry+0x18/0x1c)
[35491.391795] [<c014b240>] (cpu_startup_entry) from [<8010176c>] (0x8010176c)
[35491.398715] ---[ end trace d583d83ddc5b60c6 ]---
[35491.403299] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
What is interesting is that while the actual traffic exchange happens on eth0, the “transmit timed out” message in the logs above refers to eth1. As soon as these messages appear in the kernel log, all network traffic on both wan and lanX stalls, and the only way to access the R2 is the serial console. I tried to “reanimate” the network with /etc/init.d/network restart to no avail, then tried to bring the interfaces down and back up with the iproute2 tools, which didn’t help either. I didn’t try to rmmod/insmod mtk_soc_eth and mt7530, as these were compiled into the kernel.
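To compare which interface the watchdog blames across several saved logs, a small parser can help. This is only a sketch: the function name `find_watchdog_events` is made up for illustration, and the regular expression assumes the exact message layout seen in the two traces above.

```python
import re

# Matches lines such as:
# "[ 76.004376] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out"
# The layout is taken from the traces in this post; other kernels may differ.
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NETDEV WATCHDOG: "
    r"(?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_events(log_text):
    """Return (timestamp, interface, driver, queue) for each watchdog line."""
    events = []
    for line in log_text.splitlines():
        m = WATCHDOG_RE.search(line)
        if m:
            events.append(
                (float(m["ts"]), m["iface"], m["driver"], int(m["queue"]))
            )
    return events
```

Feeding it the saved dmesg output from both kernels should list eth1 and queue 0 in each case, which makes the eth0/eth1 mismatch easy to spot at a glance.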
The issue is pretty easy to reproduce: boot the board, run
iperf -c 192.168.1.2 -i 1 -t600 -p 5001
or
iperf3 -c 192.168.1.2 -i1 -fm -t600 -p 5002
and wait 5-10 seconds. Most of the time the measured bandwidth starts at ~950Mbps and then drops to 0, with the NETDEV WATCHDOG message appearing in the kernel log. Sometimes, if I leave the board running in this stalled state, I start to get new stack traces in the kernel log periodically.
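To pin down the moment the transmit path stops, one option is to sample the interface's TX byte counter from the serial console while iperf runs. The sketch below assumes Python is available on the board (a stock OpenWRT image may not include it) and that the interface carrying the traffic is passed in by the caller, for example eth0. The helper names are hypothetical. The sysfs path `/sys/class/net/<iface>/statistics/tx_bytes` is the standard per-interface counter.

```python
import time


def read_tx_bytes(iface):
    """Read the kernel's cumulative TX byte counter for one interface."""
    with open(f"/sys/class/net/{iface}/statistics/tx_bytes") as f:
        return int(f.read())


def first_stall(counters, min_delta=1):
    """Return the index of the first sample that advanced by less than
    min_delta bytes over the previous one, or None if traffic kept flowing."""
    for i in range(1, len(counters)):
        if counters[i] - counters[i - 1] < min_delta:
            return i
    return None


def watch(iface, seconds, interval=1.0):
    """Sample tx_bytes once per interval, then report where it stopped growing."""
    samples = []
    for _ in range(int(seconds / interval)):
        samples.append(read_tx_bytes(iface))
        time.sleep(interval)
    return first_stall(samples)
```

Calling `watch("eth0", 30)` right after starting iperf would return the sample index (roughly the second) at which the counter stopped advancing, which can then be lined up with the watchdog timestamp in dmesg.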
The first of these periodic stack traces:
[ 108.541232] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 108.546864] 0-...: (1 GPs behind) idle=87e/140000000000002/0 softirq=1347/1347 fqs=1051
[ 108.554971] (detected by 1, t=2103 jiffies, g=405, c=404, q=38)
[ 108.560935] Sending NMI from CPU 1 to CPUs 0:
[ 108.565453] NMI backtrace for cpu 0
[ 108.565457] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 108.565459] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 108.565461] task: c0906c40 task.stack: c0900000
[ 108.565463] PC is at __do_softirq+0x9c/0x25c
[ 108.565465] LR is at __do_softirq+0x8c/0x25c
[ 108.565467] pc : [<c010152c>] lr : [<c010151c>] psr: 20000113
[ 108.565469] sp : c0901eb0 ip : 0d01d83a fp : 00000000
[ 108.565471] r10: df008000 r9 : c069b9a4 r8 : 00000001
[ 108.565474] r7 : ffffe000 r6 : 000000e3 r5 : 00000000 r4 : 00000008
[ 108.565476] r3 : 00000000 r2 : c092dd00 r1 : c06a8078 r0 : 00000000
[ 108.565478] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 108.565481] Control: 10c5387d Table: 9d51006a DAC: 00000051
[ 108.565483] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 108.565485] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 108.565488] [<c010ec84>] (unwind_backtrace) from [<c010abf0>] (show_stack+0x10/0x14)
[ 108.565490] [<c010abf0>] (show_stack) from [<c05af784>] (dump_stack+0x78/0x8c)
[ 108.565493] [<c05af784>] (dump_stack) from [<c05b4eb4>] (nmi_cpu_backtrace+0x6c/0xb4)
[ 108.565496] [<c05b4eb4>] (nmi_cpu_backtrace) from [<c010dbb0>] (handle_IPI+0xec/0x1b0)
[ 108.565498] [<c010dbb0>] (handle_IPI) from [<c010148c>] (gic_handle_irq+0x8c/0x90)
[ 108.565501] [<c010148c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 108.565502] Exception stack(0xc0901e60 to 0xc0901ea8)
[ 108.565505] 1e60: 00000000 c06a8078 c092dd00 00000000 00000008 00000000 000000e3 ffffe000
[ 108.565508] 1e80: 00000001 c069b9a4 df008000 00000000 0d01d83a c0901eb0 c010151c c010152c
[ 108.565509] 1ea0: 20000113 ffffffff
[ 108.565512] [<c010b7cc>] (__irq_svc) from [<c010152c>] (__do_softirq+0x9c/0x25c)
[ 108.565514] [<c010152c>] (__do_softirq) from [<c011c00c>] (irq_exit+0xbc/0x104)
[ 108.565517] [<c011c00c>] (irq_exit) from [<c0157410>] (__handle_domain_irq+0x80/0xec)
[ 108.565520] [<c0157410>] (__handle_domain_irq) from [<c010144c>] (gic_handle_irq+0x4c/0x90)
[ 108.565522] [<c010144c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 108.565524] Exception stack(0xc0901f48 to 0xc0901f90)
[ 108.565527] 1f40: 00000000 00000001 df7a4ac0 ffffa8a3 c08337e0 c092d480
[ 108.565530] 1f60: c0903c00 ffffffff c092d480 c0823a28 e07fcb80 00000000 00000000 c0901f98
[ 108.565531] 1f80: c017bae8 c017c188 20000013 ffffffff
[ 108.565534] [<c010b7cc>] (__irq_svc) from [<c017c188>] (tick_nohz_idle_enter+0x44/0x78)
[ 108.565537] [<c017c188>] (tick_nohz_idle_enter) from [<c014c47c>] (do_idle+0x10/0x138)
[ 108.565539] [<c014c47c>] (do_idle) from [<c014c84c>] (cpu_startup_entry+0x18/0x1c)
[ 108.565542] [<c014c84c>] (cpu_startup_entry) from [<c0800c80>] (start_kernel+0x3c0/0x3cc)
Second one:
[ 171.590892] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 171.596526] 0-...: (1 GPs behind) idle=87e/140000000000001/0 softirq=1347/1347 fqs=4204
[ 171.604633] (detected by 1, t=8408 jiffies, g=405, c=404, q=123)
[ 171.610681] Sending NMI from CPU 1 to CPUs 0:
[ 171.615197] NMI backtrace for cpu 0
[ 171.615201] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 171.615203] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 171.615205] task: c0906c40 task.stack: c0900000
[ 171.615207] PC is at __do_softirq+0x9c/0x25c
[ 171.615209] LR is at __do_softirq+0x8c/0x25c
[ 171.615211] pc : [<c010152c>] lr : [<c010151c>] psr: 20000113
[ 171.615213] sp : c0901eb0 ip : 0d01d83a fp : 00000000
[ 171.615215] r10: df008000 r9 : c069b9a4 r8 : 00000001
[ 171.615217] r7 : ffffe000 r6 : 000000e3 r5 : 00000000 r4 : 00000008
[ 171.615220] r3 : 00000000 r2 : c092dd00 r1 : c06a8078 r0 : 00000000
[ 171.615222] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 171.615224] Control: 10c5387d Table: 9d51006a DAC: 00000051
[ 171.615227] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 171.615229] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 171.615232] [<c010ec84>] (unwind_backtrace) from [<c010abf0>] (show_stack+0x10/0x14)
[ 171.615234] [<c010abf0>] (show_stack) from [<c05af784>] (dump_stack+0x78/0x8c)
[ 171.615237] [<c05af784>] (dump_stack) from [<c05b4eb4>] (nmi_cpu_backtrace+0x6c/0xb4)
[ 171.615239] [<c05b4eb4>] (nmi_cpu_backtrace) from [<c010dbb0>] (handle_IPI+0xec/0x1b0)
[ 171.615242] [<c010dbb0>] (handle_IPI) from [<c010148c>] (gic_handle_irq+0x8c/0x90)
[ 171.615244] [<c010148c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 171.615246] Exception stack(0xc0901e60 to 0xc0901ea8)
[ 171.615249] 1e60: 00000000 c06a8078 c092dd00 00000000 00000008 00000000 000000e3 ffffe000
[ 171.615251] 1e80: 00000001 c069b9a4 df008000 00000000 0d01d83a c0901eb0 c010151c c010152c
[ 171.615253] 1ea0: 20000113 ffffffff
[ 171.615256] [<c010b7cc>] (__irq_svc) from [<c010152c>] (__do_softirq+0x9c/0x25c)
[ 171.615258] [<c010152c>] (__do_softirq) from [<c011c00c>] (irq_exit+0xbc/0x104)
[ 171.615261] [<c011c00c>] (irq_exit) from [<c0157410>] (__handle_domain_irq+0x80/0xec)
[ 171.615263] [<c0157410>] (__handle_domain_irq) from [<c010144c>] (gic_handle_irq+0x4c/0x90)
[ 171.615266] [<c010144c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 171.615268] Exception stack(0xc0901f48 to 0xc0901f90)
[ 171.615270] 1f40: 00000000 00000001 df7a4ac0 ffffa8a3 c08337e0 c092d480
[ 171.615273] 1f60: c0903c00 ffffffff c092d480 c0823a28 e07fcb80 00000000 00000000 c0901f98
[ 171.615275] 1f80: c017bae8 c017c188 20000013 ffffffff
[ 171.615278] [<c010b7cc>] (__irq_svc) from [<c017c188>] (tick_nohz_idle_enter+0x44/0x78)
[ 171.615280] [<c017c188>] (tick_nohz_idle_enter) from [<c014c47c>] (do_idle+0x10/0x138)
[ 171.615283] [<c014c47c>] (do_idle) from [<c014c84c>] (cpu_startup_entry+0x18/0x1c)
[ 171.615285] [<c014c84c>] (cpu_startup_entry) from [<c0800c80>] (start_kernel+0x3c0/0x3cc)
And so on.
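The two reports above show the same grace-period numbers (g=405, c=404), so they appear to describe one ongoing stall on CPU 0 rather than two separate ones. Under that assumption, the `t=` values and the log timestamps give a quick estimate of the kernel tick rate, and from that the stall duration in seconds. A minimal sketch, with hypothetical helper names and a regular expression that assumes the exact report layout shown above:

```python
import re

# Matches e.g. "[ 108.554971] (detected by 1, t=2103 jiffies, g=405, c=404, q=38)"
STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\(detected by \d+, t=(\d+) jiffies")


def stall_reports(log_text):
    """Return (timestamp_seconds, stall_jiffies) for each RCU stall report."""
    return [(float(ts), int(j)) for ts, j in STALL_RE.findall(log_text)]


def estimate_hz(reports):
    """Estimate the tick rate from the first and last report of one stall:
    jiffies elapsed divided by seconds elapsed between the two reports."""
    (t0, j0), (t1, j1) = reports[0], reports[-1]
    return (j1 - j0) / (t1 - t0)
```

With the two reports above, (8408 - 2103) jiffies over (171.604633 - 108.554971) seconds comes out at roughly 100 ticks per second. At that rate the first report's t=2103 corresponds to about 21 seconds of stall and the second's t=8408 to about 84 seconds, so CPU 0 seems to have been stuck in the same place the whole time.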
Another possible but rare case is when the NETDEV WATCHDOG message appears in the kernel log but network traffic doesn’t stall. In this case, if I leave iperf running, a new stack trace appears in the kernel log after some time:
[ 897.112518] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
[ 921.512426] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 921.518062] 0-...: (1 GPs behind) idle=a7e/2/0 softirq=6121/6123 fqs=526
[ 921.524878] (detected by 1, t=2363 jiffies, g=1819, c=1818, q=20)
[ 921.531014] Sending NMI from CPU 1 to CPUs 0:
[ 921.535539] NMI backtrace for cpu 0
[ 921.535543] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 921.535545] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 921.535547] task: c0906c40 task.stack: c0900000
[ 921.535549] PC is at __do_softirq+0x9c/0x25c
[ 921.535551] LR is at __do_softirq+0x8c/0x25c
[ 921.535553] pc : [<c010152c>] lr : [<c010151c>] psr: 20000113
[ 921.535555] sp : c0901eb0 ip : 86f75b2f fp : 00000000
[ 921.535557] r10: df008000 r9 : c069b9a4 r8 : 00000001
[ 921.535560] r7 : ffffe000 r6 : 000000e3 r5 : 00000000 r4 : 00000008
[ 921.535562] r3 : 00000000 r2 : c092dd00 r1 : c06a8078 r0 : 00000000
[ 921.535564] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 921.535566] Control: 10c5387d Table: 9e1a406a DAC: 00000051
[ 921.535569] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.14.117-4.14-main #2
[ 921.535571] Hardware name: Mediatek Cortex-A7 (Device Tree)
[ 921.535574] [<c010ec84>] (unwind_backtrace) from [<c010abf0>] (show_stack+0x10/0x14)
[ 921.535576] [<c010abf0>] (show_stack) from [<c05af784>] (dump_stack+0x78/0x8c)
[ 921.535579] [<c05af784>] (dump_stack) from [<c05b4eb4>] (nmi_cpu_backtrace+0x6c/0xb4)
[ 921.535581] [<c05b4eb4>] (nmi_cpu_backtrace) from [<c010dbb0>] (handle_IPI+0xec/0x1b0)
[ 921.535584] [<c010dbb0>] (handle_IPI) from [<c010148c>] (gic_handle_irq+0x8c/0x90)
[ 921.535586] [<c010148c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 921.535588] Exception stack(0xc0901e60 to 0xc0901ea8)
[ 921.535591] 1e60: 00000000 c06a8078 c092dd00 00000000 00000008 00000000 000000e3 ffffe000
[ 921.535594] 1e80: 00000001 c069b9a4 df008000 00000000 86f75b2f c0901eb0 c010151c c010152c
[ 921.535595] 1ea0: 20000113 ffffffff
[ 921.535598] [<c010b7cc>] (__irq_svc) from [<c010152c>] (__do_softirq+0x9c/0x25c)
[ 921.535600] [<c010152c>] (__do_softirq) from [<c011c00c>] (irq_exit+0xbc/0x104)
[ 921.535603] [<c011c00c>] (irq_exit) from [<c0157410>] (__handle_domain_irq+0x80/0xec)
[ 921.535606] [<c0157410>] (__handle_domain_irq) from [<c010144c>] (gic_handle_irq+0x4c/0x90)
[ 921.535608] [<c010144c>] (gic_handle_irq) from [<c010b7cc>] (__irq_svc+0x6c/0xa8)
[ 921.535610] Exception stack(0xc0901f48 to 0xc0901f90)
[ 921.535613] 1f40: 00000000 c069b9a4 1ef6f000 c0114000 ffffe000 c0903c74
[ 921.535615] 1f60: c0903c28 c092809c c069e04c c0823a28 e07fcb80 00000000 00000000 c0901f98
[ 921.535617] 1f80: c01082ac c01082b0 60000013 ffffffff
[ 921.535620] [<c010b7cc>] (__irq_svc) from [<c01082b0>] (arch_cpu_idle+0x38/0x3c)
[ 921.535622] [<c01082b0>] (arch_cpu_idle) from [<c014c53c>] (do_idle+0xd0/0x138)
[ 921.535624] [<c014c53c>] (do_idle) from [<c014c84c>] (cpu_startup_entry+0x18/0x1c)
[ 921.535627] [<c014c84c>] (cpu_startup_entry) from [<c0800c80>] (start_kernel+0x3c0/0x3cc)
From this stack trace onward the serial console becomes non-responsive but not dead: it echoes back all characters I type and reacts to Ctrl+S/Ctrl+Q, but doesn’t react to Ctrl-C or Ctrl-Z, i.e. the kernel hasn’t hung completely but the system is utterly broken.
I repeated my tests with frank’s 4.9-main kernel and wasn’t able to reproduce the problem. It also seems to be fixed in frank’s 5.1-p5detect2 kernel. I haven’t tested 5.0 or other kernels between 4.14 and 5.1 yet.
Any ideas what might be the cause of the problem here?