Kernel very limited x-table support?

I tried 4.19.21 and got a big oops 5 the one time it had HDMI output. (On the other boots it just showed a blank, white screen and went black after that.) I will try to downgrade the kernel with the SD card mounted locally.

Did you have HDMI running on a recent 4.19 kernel? Can you post the oops?

Got it on the last release, and on the last release of 4.14 as well. I took a picture, but it’s not that sharp. I’m trying to get another one with working HDMI, but it looks like the device only wants to show output on the first boot of a kernel; after that the screen stays blank. (I can’t think of a reason why it shouldn’t show.)

If you have a debug UART you can copy it from minicom… otherwise, if it is only an oops, you can log in on the system and get it from syslog.

Hmmm, it’s an oops5 on the network, it just kept on displaying link up/down messages and didn’t continue. (no login, no network) At least on the times it did give output.

Hmmm, I seem not to be downgrading the kernel correctly, even after untarring the older kernels on the sdcard (BPI-BOOT and BPI-ROOT correctly mounted, tar xvzf in the dir with the BPI dirs) it boots the 4.19.21 kernel. (managed to get extra pics of the OOPS though, they’re at https://jan.huijsmans.nu/BPi-R2/ )

Will have to try again later, as I need to finish a kitchen. (at least, when I want something to eat :wink: )

It depends on your uenv.txt which file is loaded by default… make sure your kernel variable is set to the new filename.
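
To see which kernel entry will actually be used, it can help to list the kernel assignments in the file. The following is only a rough sketch: it assumes the file uses plain `kernel=<filename>` lines, that the last uncommented assignment is the one that takes effect, and that you pass the path of the file on the mounted BPI-BOOT partition yourself. The function name `active_kernel` is made up for this example.

```sh
#!/bin/sh
# Print the kernel filename selected by the last uncommented "kernel=" line.
# Assumption: a later assignment overrides an earlier one.
active_kernel() {
    # $1: path to the uenv.txt file on the mounted BPI-BOOT partition
    grep -E '^[[:space:]]*kernel=' "$1" | tail -n 1 | sed -e 's/^[[:space:]]*kernel=//'
}
```

Call it as `active_kernel /path/to/uenv.txt`; if the output is not the file you expect, an older or newer `kernel=` line further down is probably still in place.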

Do you have VLANs/bridges defined on any of the DSA interfaces (lanx, wan)?

Which was your previous kernel, and does it work without this oops?

I am currently running 4.14.98 on my R2, with no bridge but with VLANs (currently not used, but defined).

I have no oops on 4.19.23, with working fbdev and X server, an activated bridge but no VLAN.

Ah, OK, I downgraded to the 4.14.98 kernel by editing the uenv.txt. It had 2 kernel variables set, with the 4.19.21 as the last one. I use both VLANs and bridges, as I translate the lanx. to vlan and bridge lan2 with lan3.242. Which triggers an aha moment: I’m having issues with IPv6 on 1 VLAN, and that’s the one that is bridged as well (the rest is handled by switches). I just installed the 4.14.101 kernel and that works, at least no oops. I have the following issues left:

  • Connection tracking on IPv6 seems to be ‘sketchy’. Some connections work, some don’t, almost as if some netfilter modules that others depend on are missing. Could you include all netfilter modules in the deb? (xt_multiport and xt_hl are missing in 4.14.101 anyway; I will test 4.19.23 later.)
  • HDMI display not working after a reboot. With a power off/on the display seems to work. After a reboot (no power cycle of the board) the display goes white at reboot and then stays black, whereas after a power-on it shows the console after a few seconds of black.
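
A quick check for the missing netfilter modules could look like the sketch below. It assumes the usual layout of a module directory (`.ko` files, possibly compressed, plus a `modules.builtin` list); the function name `missing_modules` is invented for this example, and it only compares file names, so a module that is packaged under a different file name would be reported as missing.

```sh
#!/bin/sh
# Print every named module that is neither present as a .ko file nor listed
# in modules.builtin under the given module directory.
missing_modules() {
    moddir=$1
    shift
    for mod in "$@"; do
        # Loadable module: xt_hl.ko, xt_hl.ko.gz, xt_hl.ko.xz, ...
        if find "$moddir" -name "$mod.ko*" 2>/dev/null | grep -q .; then
            continue
        fi
        # Built into the kernel image
        if [ -f "$moddir/modules.builtin" ] &&
           grep -q "/$mod\.ko\$" "$moddir/modules.builtin"; then
            continue
        fi
        echo "$mod"
    done
}
```

For the running kernel that would be `missing_modules "/lib/modules/$(uname -r)" xt_multiport xt_hl`; empty output means all named modules were found.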

Edit: the 4.19.23 kernel probably gives an oops as well; power off and on is indeed the only way to get the display working.

It seems that the oops depends on your network config (bridge/VLAN and DSA driver), so please post it.

Try to boot up without bridge/VLAN, set them up manually after boot-up, and share the command log with me, so we can see after which command the oops occurs.

for bridge:

https://wiki.archlinux.org/index.php/Network_bridge#With_bridge-utils

Create a new bridge:
# brctl addbr bridge_name

Add a device to a bridge, for example eth0:
# brctl addif bridge_name eth0

Show current bridges and what interfaces they are connected to:
$ brctl show

Set the bridge device up:
# ip link set dev bridge_name up

Add an IP address:
# ip addr add dev bridge_name 192.168.66.66/24

Delete a bridge, you need to first set it to down:
# ip link set dev bridge_name down
# brctl delbr bridge_name

vlan (replace eth0 with your interface and 5 with your vlan):

# ip link add link eth0 name eth0.5 type vlan id 5
# ip link
# ip -d link show eth0.5

You need to activate and add an IP address to vlan link, type:
# ip addr add 192.168.1.200/24 brd 192.168.1.255 dev eth0.5
# ip link set dev eth0.5 up

Remove VLAN 5:
# ip link set dev eth0.5 down
# ip link delete eth0.5
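
To get a usable command log even when the box locks up, each command can be written to a file and flushed to disk before it runs. This is just a sketch; the wrapper name `run_logged` and the log file name are made up for this example.

```sh
#!/bin/sh
# Append each command to a log file and sync it to disk BEFORE running it,
# so the last line in the log is the command that was running when the
# kernel oopsed.
CMDLOG=${CMDLOG:-./netcmd.log}

run_logged() {
    printf '%s\n' "$*" >> "$CMDLOG"
    sync
    "$@"
}
```

Then prefix the commands above with it, for example `run_logged brctl addbr bridge_name` followed by `run_logged brctl addif bridge_name eth0`, and post the log file after the crash.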

Btw, my deb adds a new kernel line for the installed kernel and removes it on uninstall… so it makes sure the installed kernel is booted and the uninstall is clean (fallback to the old kernel). So if you installed via the deb, uninstall the same way (if you can boot up); otherwise you need to remove the line yourself.

Interfaces:

# interfaces(5) file used by ifup(8) and ifdown(8)
# Include files from /etc/network/interfaces.d:
auto eth0
iface eth0 inet manual
   pre-up ip link set dev $IFACE address de:ad:00:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto eth1
iface eth1 inet manual
   pre-up ip link set dev $IFACE address de:ad:01:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan0
iface lan0 inet manual
   pre-up ip link set dev $IFACE address de:ad:00:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan1
iface lan1 inet manual
   pre-up ip link set dev $IFACE address de:ad:00:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan2
iface lan2 inet manual
   pre-up ip link set dev $IFACE address de:ad:00:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan3
iface lan3 inet manual
   pre-up ip link set dev $IFACE address de:ad:00:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto wan
iface wan inet manual
   pre-up ip link set dev $IFACE address de:ad:01:be:ef:00
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

source-directory /etc/network/interfaces.d

One of the vlans:

# /etc/network/interfaces -- configuration file for ifup(8), ifdown(8)

auto lan3.242
iface lan3.242 inet manual

#auto lan2
#iface lan2 inet manual

auto vlan242
iface vlan242 inet static
        address 10.13.242.1
        netmask 255.255.255.0
        broadcast 10.13.242.255
        bridge_ports lan3.242
        bridge_stp off
        bridge_fd 0

iface vlan242 inet6 static
        address 2001:470:7c29:242::1
        #up /sbin/ip addr add 2001:610:611:242::1/64 dev vlan242
        netmask 64

With this setup I have loads of issues with ipv6 and the outgoing pppoe connection to my provider. (the pppoe connection needs to be reset after boot-up to get it to work)

I’m getting the feeling that stacking eth0, lan3 and a VLAN is giving issues… As I need a working router, I’ll put the R1 back and start messing around with the R2, now that I know how to get the HDMI port working. Will try again next week. (Need to get the kitchen done.)

As I see it, the interface vlan242 is basically a bridge (and should be named as such) and not a VLAN… the real VLAN is lan3.242.

Try to set it up manually as I mentioned above and note when the oops comes up, so I can reproduce it on my test device.

Why do you set MAC addresses for the DSA ports? They inherit the address from eth0/1 if not set manually.

I have reproduced the problem (create a bridge with a VLAN member):

root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5
root@bpi-r2:~# ip link set dev lan0 up
root@bpi-r2:~# ip link set dev lan0.5 up

12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
    link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
    inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
       valid_lft forever preferred_lft forever

root@bpi-r2:~# brctl addbr bridge_name
root@bpi-r2:~# brctl addif bridge_name lan0.5
[  352.057128] bridge_name: port 1(lan0.5) entered blocking state
[  352.063065] bridge_name: port 1(lan0.5) entered disabled state
[  352.069181] device lan0.5 entered promiscuous mode
[  352.074018] device lan0 entered promiscuous mode
[  352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558
...
[  352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
[  352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
[  352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
[  352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
[  352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
[  352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
[  352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
[  352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
[  352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
[  352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
[  352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
[  352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge])
[  352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
[  352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324)
[  352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc)
[  352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580)
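
For anyone who wants to try this on their own board, the steps from the session above can be collected into one function. This is a sketch using the same names as above (lan0, VLAN id 5, bridge_name); the function `repro_oops`, the helper `step` and the `DRY_RUN` switch are additions for this example. With `DRY_RUN=1` (the default here) the commands are only printed. Running them for real needs root and is expected to crash an affected kernel.

```sh
#!/bin/sh
# Reproduce the bridge-with-VLAN-member oops on a DSA port.
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

step() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

repro_oops() {
    port=${1:-lan0}
    vid=${2:-5}
    br=${3:-bridge_name}
    step ip link add link "$port" name "$port.$vid" type vlan id "$vid"
    step ip addr add 192.168.5.200/24 brd 192.168.5.255 dev "$port.$vid"
    step ip link set dev "$port" up
    step ip link set dev "$port.$vid" up
    step brctl addbr "$br"
    # In the session above the oops appeared right after this command
    step brctl addif "$br" "$port.$vid"
}
```

Calling `repro_oops` with no arguments prints the six commands for lan0 and VLAN 5; `repro_oops lan3 242 vlan242` would print the equivalent for the setup posted earlier.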

@ryder.lee / @moore / @linkerosa / @Jackzeng can you please try it on your system and/or give me a hint why this happens?

Regarding HDMI after reboot… it is working for me, but after the crash the R2 does not reboot (it hangs on shutdown)… what makes you think it actually does the reboot?

debug:

./net/bridge/br_vlan.c

bool br_vlan_enabled(const struct net_device *dev)
{
    struct net_bridge *br = netdev_priv(dev);

    return !!br->vlan_enabled;
}

I guess netdev_priv or the access to br causes the oops,

since the crash is initiated from the DSA stack…

./net/dsa/port.c:258

int dsa_port_vlan_add(struct dsa_port *dp,
              const struct switchdev_obj_port_vlan *vlan,
              struct switchdev_trans *trans)
{
    struct dsa_notifier_vlan_info info = {
        .sw_index = dp->ds->index,
        .port = dp->index,
        .trans = trans,
        .vlan = vlan,
    };

    if (netif_is_bridge_master(vlan->obj.orig_dev))
        return -EOPNOTSUPP;

    if (br_vlan_enabled(dp->bridge_dev))
        return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);

    return 0;
}

Tested this with kernel 5.0-rc1 and no crash occurs there… maybe it is related to my DSA patches, but I had not touched the bridge handling…

bridge_dev is assigned to dp in dsa_port_bridge_join, which is called in dsa_slave_changeupper (./net/dsa/slave.c).

This seems to be related: https://www.mail-archive.com/[email protected]/msg281176.html

But it is strange that there is no crash in 5.0-rc1, which was released on Jan 6th…

I have reproduced this with 4.19 without my DSA changes… so the error does not depend on my changes… and I got my debug message, which shows the problem mentioned on the mailing list:

[  135.750400] DEBUG: Passed dsa_port_vlan_add 258 0x0 

printk(KERN_ALERT "DEBUG: Passed %s %d 0x%x \n",__FUNCTION__,__LINE__,(unsigned int)dp->bridge_dev);
if (br_vlan_enabled(dp->bridge_dev))

Strange that 5.0-rc1 does not crash, because these 2 code sections are unchanged: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/dsa/port.c#n255 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_vlan.c#n788

Uploaded a fix for this oops to 4.19-main; Travis is now building it (takes ~20 minutes to compile and create the release). I also reported the bug with this fix to the netdev mailing list, because it was not introduced by my DSA changes.

Nice to see a fix so fast, will test it when I get a free moment. A couple of answers about this setup:

vlan named bridges:

This is the configuration I use on all my systems, and it works perfectly on all Debian and Raspbian based systems. On the ones with KVM it’s easy to link the clients to the VLAN they need access to, and the other systems have tagged VLAN connections, for which this naming scheme is easier to maintain and adds uniformity to my systems.

mac on dsa:

My system sets the ‘hardware’ MAC address 02:02:02:02:02:02 on the eth0 DSA ports and 02:03:03:03:03:03:03 on the eth1 DSA port, even though I set the MAC addresses of eth0 and eth1 differently. I was testing whether this solved a few IPv6 issues I’m having, but nope. (And I haven’t removed them again, as they don’t cause an issue either.)

hdmi after reboot

I know for sure it rebooted as I gave the command reboot, got kicked out, could login again and uptime was about 1m. I really need a power down (leds out) and power up to get hdmi working again, once. Wouldn’t be surprised if it was hardware related, as it’s an oldy by now.

I only fixed the oops itself, but it is still unclear why this function is called with bridge_dev=NULL and why no crash happens in 4.14 and 5.0.

Which issues do you have with IPv6? Are they still there, on both 4.14 and 4.19? How can I reproduce them?

HDMI is strange, because I rebooted (reboot command) multiple times and always got HDMI output, except the one time I had the kernel oops from the VLAN bridge… there the reboot hung on shutdown.

After discussing with Florian (DSA maintainer), he sent a patch different from mine (not return if bridge unset) and said we need to do:

echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
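
As a small convenience, the sysfs write can be wrapped so that it reports a missing bridge instead of failing silently in a script. This is only a sketch; the function name `enable_vlan_filtering` and the optional second argument (an alternative sysfs directory, handy for trying it out without a real bridge) are made up for this example.

```sh
#!/bin/sh
# Enable VLAN filtering on a bridge via sysfs and print the resulting value.
# $1: bridge name
# $2: sysfs net directory (optional, defaults to /sys/class/net)
enable_vlan_filtering() {
    attr="${2:-/sys/class/net}/$1/bridge/vlan_filtering"
    if [ ! -e "$attr" ]; then
        echo "no such bridge attribute: $attr" >&2
        return 1
    fi
    echo 1 > "$attr" && cat "$attr"
}
```

Run as root with `enable_vlan_filtering bridge_name`. With iproute2 the same setting should also be reachable via `ip link set dev bridge_name type bridge vlan_filtering 1`.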

Sorry, way too busy with non-IT stuff. I’ve installed the 4.14.105 kernel, as I’m messing about with bridging; will see if this helps me a bit.