[BPI-R2] Porting 2nd gmac to 4.19

It seems that some of the ethernet patches cause this crash and that they are not needed (as I thought). I applied the patch series I posted to the mainline kernel (cherry-picked from 4.20-gmac_test_dsa_only) to 4.19…

After testing with 4.20, I applied it to the (new) 4.19-gmac branch and tested again. It seems to work without problems.

Here are my tests (iperf over wan, lan0, wan again and lan0 again):

log
root@bpi-r2:~# uname -r
4.19.10-bpi-r2-gmac
root@bpi-r2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 9a:da:30:8e:aa:52 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 5e:06:08:25:eb:ef brd ff:ff:ff:ff:ff:ff
4: wan@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5e:06:08:25:eb:ef brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.11/24 brd 192.168.0.255 scope global wan
       valid_lft forever preferred_lft forever
5: lan0@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:da:30:8e:aa:52 brd ff:ff:ff:ff:ff:ff
6: lan1@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:da:30:8e:aa:52 brd ff:ff:ff:ff:ff:ff
7: lan2@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:da:30:8e:aa:52 brd ff:ff:ff:ff:ff:ff
8: lan3@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
    link/ether 9a:da:30:8e:aa:52 brd ff:ff:ff:ff:ff:ff
9: wan.60@wan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5e:06:08:25:eb:ef brd ff:ff:ff:ff:ff:ff
    inet 192.168.60.1/24 brd 192.168.60.255 scope global wan.60
       valid_lft forever preferred_lft forever
10: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 8e:c6:30:ed:4c:43 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.1/24 brd 10.0.3.255 scope global lxcbr0
       valid_lft forever preferred_lft forever
root@bpi-r2:~# iperf -c 192.168.0.21
------------------------------------------------------------
Client connecting to 192.168.0.21, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.11 port 36498 connected with 192.168.0.21 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   940 Mbits/sec
root@bpi-r2:~# ip link set wan down
[  129.232523] mt7530 mdio-bus:00 wan: Link is Down
root@bpi-r2:~# ip addr add 192.168.0.19/24 dev lan0
root@bpi-r2:~# ip link set lan0 up
[  144.462925] mt7530 mdio-bus:00 lan0: configuring for phy/gmii link mode
root@bpi-r2:~# [  150.717531] mt7530 mdio-bus:00 lan0: Link is Up - 1Gbps/Full - flow control off

root@bpi-r2:~# 
root@bpi-r2:~# iperf -c 192.168.0.21
------------------------------------------------------------
Client connecting to 192.168.0.21, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.19 port 38498 connected with 192.168.0.21 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes   942 Mbits/sec
root@bpi-r2:~# ip link set lan0 down                                                                                                            
[  233.152510] mt7530 mdio-bus:00 lan0: Link is Down
root@bpi-r2:~# 
root@bpi-r2:~# 
root@bpi-r2:~# 
root@bpi-r2:~# ip link set wan up
[  289.553227] mt7530 mdio-bus:00 wan: configuring for phy/gmii link mode
root@bpi-r2:~# [  292.717522] mt7530 mdio-bus:00 wan: Link is Up - 1Gbps/Full - flow control off

root@bpi-r2:~# iperf -c 192.168.0.21
------------------------------------------------------------
Client connecting to 192.168.0.21, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.11 port 36502 connected with 192.168.0.21 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   940 Mbits/sec
root@bpi-r2:~# ip link set wan down
[  318.692540] mt7530 mdio-bus:00 wan: Link is Down
root@bpi-r2:~# 
root@bpi-r2:~# ip link set lan0 up
[  352.022970] mt7530 mdio-bus:00 lan0: configuring for phy/gmii link mode
root@bpi-r2:~# [  355.117526] mt7530 mdio-bus:00 lan0: Link is Up - 1Gbps/Full - flow control off

root@bpi-r2:~# iperf -c 192.168.0.21
------------------------------------------------------------
Client connecting to 192.168.0.21, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.19 port 38502 connected with 192.168.0.21 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes   942 Mbits/sec
root@bpi-r2:~#
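
The test rounds from the log can be sketched as a small script. This is only a sketch under assumptions: an iperf server is already listening on 192.168.0.21 and wan and lan0 already carry their addresses, as in the log. The `run` helper, the `DRY_RUN` switch and the 10 second wait are additions for illustration and are not part of the original test.

```shell
#!/bin/sh
# Sketch of the wan/lan0 test rounds shown in the log.
# DRY_RUN=1 (default) only prints the commands; run as root with DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
SERVER="${SERVER:-192.168.0.21}"   # iperf server address taken from the log

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Bring one port up, run iperf once, then take the port down again.
test_port() {
    port="$1"
    run ip link set "$port" up
    run sleep 10                 # assumed wait; the link came up after about 3-6 s in the log
    run iperf -c "$SERVER"
    run ip link set "$port" down
}

# Two rounds, in the same order as in the log: wan, lan0, wan, lan0.
test_rounds() {
    for p in wan lan0 wan lan0; do
        test_port "$p"
    done
}

test_rounds
```

With the default `DRY_RUN=1` the script only lists what it would do, so it can be checked before running it on the board.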

I hope there are some users here to test…

I fixed the crash that occurs if the cpu option is not set and want to post it to mainline, but my mails are currently blocked on infradead (same as with hdmi). I have contacted the postmaster.

If anybody here wants to try it, here is the current patchset:

I contacted Andrew and Florian about how to proceed with getting it into mainline (they want to avoid the dts option and prefer a bridge set up by the user).

I have this problem too:

[Tue Aug 13 13:40:01 2019] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
[Tue Aug 13 13:40:06 2019] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
[Tue Aug 13 13:40:12 2019] mtk_soc_eth 1b100000.ethernet eth0: transmit timed out
[Tue Aug 13 13:40:22 2019] mtk_soc_eth 1b100000.ethernet eth0: transmit timed out
[Tue Aug 13 13:40:32 2019] mtk_soc_eth 1b100000.ethernet eth0: transmit timed out
[Tue Aug 13 13:40:42 2019] mtk_soc_eth 1b100000.ethernet eth0: transmit timed out

Kernel version is

Linux router.shashilx.local 4.19.66-bpi-r2-main #3 SMP Mon Aug 12 13:07:02 EEST 2019 armv7l GNU/Linux

[Mon Aug 12 20:35:01 2019] ------------[ cut here ]------------
[Mon Aug 12 20:35:01 2019] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x27c/0x280
[Mon Aug 12 20:35:01 2019] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
[Mon Aug 12 20:35:01 2019] Modules linked in: ppp_mppe xt_multiport ppp_async crc_ccitt ppp_generic slhc nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_nat_ipv4 xt_nat nf_nat nft_counter ipt_REJECT nf_reject_ipv4 xt_TCPMSS xt_state xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink mt7530 dsa_core phylink bridge mtk_thermal spi_mt65xx thermal_sys pwm_mediatek mtk_pmic_keys mt6577_auxadc nvmem_mtk_efuse ip_tables x_tables ipv6
[Mon Aug 12 20:35:01 2019] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.66-bpi-r2-main #3
[Mon Aug 12 20:35:01 2019] Hardware name: Mediatek Cortex-A7 (Device Tree)
[Mon Aug 12 20:35:01 2019] [<c0114784>] (unwind_backtrace) from [<c010e4a4>] (show_stack+0x20/0x24)
[Mon Aug 12 20:35:01 2019] mtk_soc_eth 1b100000.ethernet eth1: transmit timed out
[Mon Aug 12 20:35:01 2019] [<c010e4a4>] (show_stack) from [<c0b625ac>] (dump_stack+0xb8/0xcc)
[Mon Aug 12 20:35:01 2019] [<c0b625ac>] (dump_stack) from [<c012830c>] (__warn+0x104/0x11c)
[Mon Aug 12 20:35:01 2019] [<c012830c>] (__warn) from [<c012837c>] (warn_slowpath_fmt+0x58/0x74)
[Mon Aug 12 20:35:01 2019] [<c012837c>] (warn_slowpath_fmt) from [<c097623c>] (dev_watchdog+0x27c/0x280)
[Mon Aug 12 20:35:01 2019] [<c097623c>] (dev_watchdog) from [<c01a9368>] (call_timer_fn+0x4c/0x194)
[Mon Aug 12 20:35:01 2019] [<c01a9368>] (call_timer_fn) from [<c01a9598>] (expire_timers+0xe8/0x148)
[Mon Aug 12 20:35:01 2019] [<c01a9598>] (expire_timers) from [<c01a98d0>] (run_timer_softirq+0xb8/0x1e4)
[Mon Aug 12 20:35:01 2019] [<c01a98d0>] (run_timer_softirq) from [<c0102398>] (__do_softirq+0xe8/0x384)
[Mon Aug 12 20:35:01 2019] [<c0102398>] (__do_softirq) from [<c012f150>] (irq_exit+0xd8/0x108)
[Mon Aug 12 20:35:01 2019] [<c012f150>] (irq_exit) from [<c018adb8>] (__handle_domain_irq+0x70/0xc4)
[Mon Aug 12 20:35:01 2019] [<c018adb8>] (__handle_domain_irq) from [<c0102268>] (gic_handle_irq+0x5c/0xa0)
[Mon Aug 12 20:35:01 2019] [<c0102268>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[Mon Aug 12 20:35:01 2019] Exception stack(0xde94bf38 to 0xde94bf80)
[Mon Aug 12 20:35:01 2019] bf20:                                                       00000000 0001845c
[Mon Aug 12 20:35:01 2019] bf40: df5ab408 c0121400 de94a000 c1204c70 c1204cb8 00000002 c12b6083 c0e45be8
[Mon Aug 12 20:35:01 2019] bf60: 00000000 de94bf94 de94bf98 de94bf88 c010a680 c010a684 600f0013 ffffffff
[Mon Aug 12 20:35:01 2019] [<c0101a0c>] (__irq_svc) from [<c010a684>] (arch_cpu_idle+0x48/0x4c)
[Mon Aug 12 20:35:01 2019] [<c010a684>] (arch_cpu_idle) from [<c0b7f940>] (default_idle_call+0x30/0x3c)
[Mon Aug 12 20:35:01 2019] [<c0b7f940>] (default_idle_call) from [<c015cd9c>] (do_idle+0xec/0x16c)
[Mon Aug 12 20:35:01 2019] [<c015cd9c>] (do_idle) from [<c015d0dc>] (cpu_startup_entry+0x28/0x2c)
[Mon Aug 12 20:35:01 2019] [<c015d0dc>] (cpu_startup_entry) from [<c0111e80>] (secondary_start_kernel+0x170/0x194)
[Mon Aug 12 20:35:01 2019] [<c0111e80>] (secondary_start_kernel) from [<801026cc>] (0x801026cc)
[Mon Aug 12 20:35:01 2019] ---[ end trace e56451d93e95659f ]---

I don't do anything special, I just use it as a home router with some services. This is the second time it has happened within 24h.

You could try reverting the changes for the 2nd gmac (or removing them with git rebase if the revert fails).

https://github.com/frank-w/BPI-R2-4.14/commits/4.19-main?after=0a5ab450b676f79a205086cb2c61187b90fc13ba+69&author=frank-w

Starting with net:dsa:

In my tests it looks like this is caused by the additional network patches (qdma, bql, …). After removing them I have not seen this on 4.19 again (I made multiple iperf rounds switching from gmac0 to gmac1 and back). I am hoping phylink gets merged soon and goes into the next LTS. Phylink works better than the mainline driver. I guess this is because of a wrong gmac setup (both need to be set to the same mode, even though gmac1 does not support trgmii).

Can you tell me how to do this, for a newbie?

I have pushed a 4.19 branch without the network patches. Compile it like the other branches using build.sh (importconfig, then without a parameter).
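
For a newbie, the steps could look roughly like the sketch below. Assumptions: the cross-compile toolchain is already set up as for the other branches, the target directory name `BPI-R2-kernel` is arbitrary, and the `run` helper with its `DRY_RUN` switch is only added here so the steps can be printed without executing them.

```shell
#!/bin/sh
# Sketch: fetch and build the branch without the network patches.
# DRY_RUN=1 (default) only prints the steps; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
REPO="https://github.com/frank-w/BPI-R2-4.14"
BRANCH="4.19-without2ndgmac"

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_branch() {
    # Shallow clone of just this branch (directory name is an arbitrary choice).
    run git clone --branch "$BRANCH" --depth 1 "$REPO" BPI-R2-kernel
    run cd BPI-R2-kernel
    run ./build.sh importconfig   # import the default config
    run ./build.sh                # no parameter: start the build
}

build_branch
```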

https://github.com/frank-w/BPI-R2-4.14/tree/4.19-without2ndgmac

I will try it, but of course I will add a persistent MAC address for the wan interface to the dts file (I can't get internet without it). As I understand it, that is the second gmac?

Without my patches you have only one gmac, so you can set only 1 MAC for all ports (maybe change the lan ports later via the ip command).
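
Changing a port's MAC via the ip command could be sketched like this. The helper names `is_mac` and `set_mac_cmd` are made up for illustration, and the address in the usage line is just an example value. The sketch only prints the command; depending on the driver, the port may have to be down while the address is changed.

```shell
# Hypothetical helper: succeed only if the argument looks like a MAC address.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Hypothetical helper: print the ip command that assigns a MAC to a port.
set_mac_cmd() {
    port="$1"
    mac="$2"
    if ! is_mac "$mac"; then
        echo "invalid MAC: $mac" >&2
        return 1
    fi
    echo "ip link set dev $port address $mac"
}

# Example with a made-up, locally administered address:
set_mac_cmd lan0 02:11:22:33:44:55
```

The printed line can then be run as root, or placed wherever the system brings the port up.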

No, I mean what we discussed in this thread: [BPI-R2 new image] debian 10 buster image with Kernel 4.19.62

Right, but in the special branch (without the second gmac) you have only gmac0, where you can set the MAC in the dts.

IMHO you cannot set the MAC on DSA ports (wan, lan0-lan3) in the dts.

https://github.com/frank-w/BPI-R2-4.14/blob/982128e0b41116e49b6e50e36adc9411784624ac/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts#L200

So I need to set the MAC for wan in /etc/network/interfaces?

Normally it should be enough to set the MAC in /etc/network/interfaces for the ports; otherwise change the MAC in my dts.

@sHAsHiLx have you tried without second gmac?

Yes, but without luck. It still loses network ports randomly (I can see the crash in dmesg if the network has recovered by then, but it does not always recover) and this drives me crazy. I have tried 4.14: no errors, but very poor pptp vpn, and I need both (network and pptp) to work. So I just put the device on the shelf and will wait until someone releases a stable kernel :wink:

Which crash do you see? Ethx timeouts?

Nobody will release a “stable” kernel until problems are reported reproducibly :slight_smile:

Btw have you tried phylink? Maybe it fixes your problem (5.3-phylinkX)

I don't quite understand what I need to try. Kernel 5.3?

We need a way to reproduce the error if it also happens with 5.3-phylink-x. So try this version and see if you get the error again. Then think about what you or the system were doing when the error came up and try that again.

I gave this device another try. What I did: I used the latest stable ubuntu 19.04 and compiled the kernel on it with gcc 8.x (before, the kernel was compiled on debian 9 with the default gcc): no error reproduced. Then I installed GCC 9.x and tried again: no driver errors in two days of uptime (before, it was 3-5 times per day). Still watching it.

Which kernel have you tried? I think it does not matter on which system the kernel was compiled.
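
To make such reports easier to compare, the timeouts could be counted per interface from the kernel log. A minimal sketch, assuming the messages look like the "transmit timed out" lines posted earlier in this thread; the function name `count_tx_timeouts` is made up for illustration.

```shell
# Count "transmit timed out" messages per interface from dmesg-style text on stdin.
# Prints one line per interface: "<interface> <count>".
count_tx_timeouts() {
    grep 'transmit timed out' \
        | sed -E 's/.* (eth[0-9]+): transmit timed out.*/\1/' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Usage would be something like `dmesg | count_tx_timeouts`, run once before and once after the activity suspected of triggering the error.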

have you created ubuntu image by yourself?

this one

Linux version 4.19.69-bpi-r2-main (shashilx@ubuntu) (gcc version 9.1.0 (Ubuntu 9.1.0-2ubuntu2~19.04)) #2 SMP Fri Sep 6 11:45:53 EEST 2019

Yes, I think it doesn't matter on which system, but I think the compiler version matters. I don't know if it's the gcc version or the newer kernel version (the hangs were in .66 and this is .69).