Ubuntu 18.04 with kernel 4.14.48


(Marco Alvarado) #61

Thanks Frank … the R2 is booting from eMMC without problems. However …

It is hard to get the LAN ports to work well with the 4.14.69 kernel (I suppose the problem is on my side, but I can’t find it). The WAN port works flawlessly.

I have been trying a lot of configurations but I still don’t understand what I am doing wrong.

Could you check to see if there is something wrong with my network configuration?

(I can ping, but packets are lost … and all connection attempts are interrupted. Here I am using a bridge, but I also tried the independent LAN ports, without success.)

auto lo
iface lo inet loopback
   dns-nameservers  8.8.8.8

auto eth0
iface eth0 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto lan0
iface lan0 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto lan1
iface lan1 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto lan2
iface lan2 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto lan3
iface lan3 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto br0
iface br0 inet static
  address 192.168.51.115
  netmask 255.255.255.0
  bridge_ports lan0 lan1 lan2 lan3
  bridge_fd 5
  bridge_stp no

auto eth1
iface eth1 inet manual
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

auto wan
iface wan inet dhcp
  pre-up ip link set $IFACE up
  post-down ip link set $IFACE down

(Frank W.) #62

If you are using a bridge, remove the config for the individual ports (lanX).


(Marco Alvarado) #63

Hi …

After checking and re-checking everything 100 times, making all sorts of customizations, and trying different types of cables and the 4.4, 4.9, 4.16 and 4.17 kernels … no way: the lanX ports, alone or in a bridge, basically don’t work.

Then I thought about what you said about the R2 SATA port problems. So I took my last R2 in stock (brand new) and got it working with the SD card. And guess what …

It works!!!

This gives me a very bad feeling about the R2 machines. Not about SINOVOIP, because I have several M2+ and they work well … but it seems that the R2 has serious manufacturing problems. The main problem is that the R2 is specified for heavy-usage scenarios, but the machine does not seem to be on par with real-life requirements :frowning:

P.S. The one that failed is also a new machine. I had just unpacked it to install the software. And the one with SATA “issues” was also new.


(Frank W.) #64

You mean your LAN problem is also a manufacturing issue, not a misconfiguration?

Have you tried one of my images?


(Marco Alvarado) #65

I am using your image with 4.4.69 (downloaded from your Git repository). The same SD card works in one machine and doesn’t work in the other.

On the failing machine it shows more than 40% packet loss, so it can almost never complete a connection (2 out of 30 attempts) … and of these 2 successful connections, only one worked well … at least for the 2 minutes I used it … the other was quickly dropped.


(Marco Alvarado) #66

Just an update.

I have been configuring and recompiling everything for the 18.04 version (as many libraries have changed). It seems to work well … but I still need to return to the “failing machine” to re-check how it behaves.

However, I need to activate the user-space crypto API modules in your imported configuration, because without them AF_ALG is not available for cryptographic processing. Right now, I have the hash functions working on 4.14.71.
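A quick way to check from user space whether AF_ALG hashing is reachable could look like the Python sketch below. It relies on the standard library's `socket.AF_ALG` support (Linux only, Python 3.6 or newer); the function name `af_alg_sha256` is made up for this example, and returning `None` on any failure is a simplification.

```python
import hashlib
import socket


def af_alg_sha256(data):
    """Return the SHA-256 digest computed by the kernel through AF_ALG,
    or None if AF_ALG or its hash interface is not available."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        # This Python build has no AF_ALG support at all.
        return None
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as alg:
            # Select the algorithm type and name on the control socket.
            alg.bind(("hash", "sha256"))
            op, _ = alg.accept()
            with op:
                op.sendall(data)      # feed the message to the kernel
                return op.recv(32)    # a SHA-256 digest is 32 bytes
    except OSError:
        # Typical when the kernel lacks the user-space crypto API.
        return None


if __name__ == "__main__":
    digest = af_alg_sha256(b"abc")
    if digest is None:
        print("AF_ALG hash interface not available")
    else:
        print("kernel digest matches hashlib:",
              digest == hashlib.sha256(b"abc").digest())
```

If the function returns `None` on the R2, that would be consistent with the user-space crypto API options being disabled in the running kernel, although other causes (missing module, restricted environment) are possible.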


(Frank W.) #67

Which config-option exactly is missing?


(Marco Alvarado) #68

Try these…

CONFIG_CRYPTO_USER=m  
CONFIG_CRYPTO_USER_API=m
CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_USER_API_AEAD=m

In this case, I am using them as modules, but that is my decision.
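To see which of these options a given kernel configuration lacks, a small Python sketch like the following may help. The helper names are hypothetical, and reading `/proc/config.gz` assumes the running kernel was built with `CONFIG_IKCONFIG_PROC`; otherwise the text of the `.config` file from the build tree can be passed in instead.

```python
import gzip

# Options listed in the post above.
REQUIRED = [
    "CONFIG_CRYPTO_USER",
    "CONFIG_CRYPTO_USER_API",
    "CONFIG_CRYPTO_USER_API_HASH",
    "CONFIG_CRYPTO_USER_API_SKCIPHER",
    "CONFIG_CRYPTO_USER_API_RNG",
    "CONFIG_CRYPTO_USER_API_AEAD",
]


def parse_kconfig(text):
    """Map option name to value for every 'CONFIG_X=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines such as '# CONFIG_X is not set' start with '#' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are neither built in (y) nor modules (m)."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name) not in ("y", "m")]


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (assumption: CONFIG_IKCONFIG_PROC is enabled)."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

Calling `missing_options(read_running_config())` on the board would then list whatever is still absent; an empty list means all six options are set to `y` or `m`.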


(Maciek Szelągowski) #69

How can I pass the DNS servers that I get from my provider via DHCP on WAN to the DHCP clients on br0 (the bridge)?

Or is there another way for the R2 to serve as DNS server? I am on Ubuntu 18.04 with the 4.14.71 kernel.

This is my /etc/dnsmasq.d/interfaces.conf:

interface=br0
#interface=wlp1s0
#interface=tun0
#interface=lxcbr0
#interface=eth1
#interface=ap0

# DHCP-Server not active for Interface
no-dhcp-interface=ppp0
#no-dhcp-interface=eth0
no-dhcp-interface=eth1
no-dhcp-interface=wlp1s0

#dhcp-authoritative
dhcp-range=br0,192.168.1.100,192.168.1.150,255.255.255.0,48h
dhcp-option=br0,3,192.168.1.1
dhcp-option=option:dns-server,192.168.1.1
#dhcp-range=wlp1s0,192.168.11.100,192.168.11.150,255.255.255.0,48h
#dhcp-option=wlp1s0wlp1s0,3,192.168.11.1

and my /etc/network/interfaces:

# ifupdown has been replaced by netplan(5) on this system.  See
# /etc/netplan for current configuration.
# To re-enable ifupdown on this system, you can run:
#    sudo apt install ifupdown

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan0
iface lan0 inet manual
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan1
iface lan1 inet manual
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan2
iface lan2 inet manual
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto lan3
iface lan3 inet manual
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto eth1
iface eth1 inet dhcp
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto wan
iface wan inet dhcp
#  pre-up ip link set $IFACE up
#  post-down ip link set $IFACE down

auto br0
iface br0 inet static
   address 192.168.1.1
   netmask 255.255.255.0
   network 192.168.1.0
   broadcast 192.168.1.255
   dns-nameservers 192.168.1.1
   bridge_ports lan0 lan1 lan2 lan3
   bridge_fd 5
   bridge_stp no
   bridge_maxwait 0

(Frank W.) #70

Dnsmasq normally also starts a DNS server which receives DNS requests from clients and forwards them to the DNS servers configured on the R2 (from your ISP). Clients see the R2 as their DNS server. I have no other config yet, but maybe it’s possible to hand the DNS server IPs known to the R2 directly to the clients.
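One possible way to hand the upstream DNS servers directly to DHCP clients is to generate the `dhcp-option=option:dns-server,...` line (the same form already used in the dnsmasq config above) from the resolver file on the R2. The Python sketch below only builds that line; the function names are made up for illustration, and which resolver file holds the ISP's servers is an assumption (with systemd-resolved on Ubuntu 18.04 it is usually `/run/systemd/resolve/resolv.conf` rather than `/etc/resolv.conf`).

```python
import ipaddress


def upstream_nameservers(resolv_conf_text):
    """Return the non-loopback nameserver addresses found in resolv.conf text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            try:
                address = ipaddress.ip_address(fields[1])
            except ValueError:
                continue  # skip anything that is not a plain IP address
            # Loopback entries (e.g. a local stub resolver) are useless to clients.
            if not address.is_loopback and str(address) not in servers:
                servers.append(str(address))
    return servers


def dnsmasq_dns_option(servers):
    """Build a dnsmasq line announcing the IPv4 servers to DHCP clients,
    or return None if there is nothing to announce."""
    ipv4 = [s for s in servers if ipaddress.ip_address(s).version == 4]
    if not ipv4:
        return None
    return "dhcp-option=option:dns-server," + ",".join(ipv4)
```

The resulting line would replace the existing `dhcp-option=option:dns-server,192.168.1.1` entry, and dnsmasq would need a restart after each change of the WAN lease. Keeping the R2 itself as the clients' DNS server, as described above, avoids that extra step.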


(Shujaa) #71

Hello Frank,

I’m still trying to make netem (namely “tc qdisc add dev lanX root netem delay 10ms”) work. When using that command on interfaces that have no IP address but are L2 bridged, nothing happens whatsoever (whereas a delay would be introduced on regular x86 platforms).

If the interface happens to have an IP address, I get an immediate kernel panic…

I’ve tried with other USB-based network adapters for which I added kernel driver support, but the issue is the same, so it apparently might not even be linked to the BPI-specific switching chip.

Do you have any idea how I could tackle that? I have no experience in troubleshooting kernel panics… I’m currently trying again with the 4.19 kernel in case this is fixed there. I mainly need this and interface separation.

Thanks (log attached): crash_netem.log (9.3 KB)


(Frank W.) #72

Is it fixed in 4.19?

I would take the last functions (topmost in the log) from the call trace, simply google them together with the error (null pointer), and check whether there is a patch for it on a mailing list.

If it works on 4.19, I would try 4.18/17/16/15 to figure out when it was fixed and then look through the kernel commit log.
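The version-by-version search can be shortened with a bisection, under the assumption that the bug stays fixed once it has been fixed. The Python sketch below only shows the search logic; `is_broken` is a hypothetical callback that stands for "boot this kernel and run the tc command by hand", and the outcome used in the example is invented purely for illustration.

```python
def first_fixed(versions, is_broken):
    """Return the first version (list ordered oldest to newest) for which
    is_broken(version) is False, or None if every version is broken.

    Assumes a single transition from broken to fixed."""
    low, high = 0, len(versions)
    while low < high:
        mid = (low + high) // 2
        if is_broken(versions[mid]):
            low = mid + 1   # the fix must be in a newer version
        else:
            high = mid      # this one works; the fix is here or earlier
    return versions[low] if low < len(versions) else None


if __name__ == "__main__":
    kernels = ["4.14", "4.15", "4.16", "4.17", "4.18", "4.19"]
    # Invented test outcome, NOT a measured result from this thread.
    pretend_broken = {"4.14", "4.15", "4.16"}
    print(first_fixed(kernels, lambda v: v in pretend_broken))
```

With six candidate kernels this needs at most three boots instead of up to six. Within a single release range, `git bisect` on the kernel tree applies the same idea at commit level.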


(Shujaa) #73

Hey Frank,

Making progress: nailed it. I was able to isolate several kernel versions that work well with tc/netem. It’s actually broken only in 4.14 afaik, but some random segfaults tend to appear, for example when bridging VLAN interfaces etc…

My issue however has evolved into something a little more troublesome: the switching chip (MT7530) appears to be discarding ARP replies on its ports. This leads to unusable Open vSwitch configurations, as hosts cannot even discover each other. The other protocols work like a charm afaik…

Would you have any idea how I could work around this? I’m looking for official BPI support, but couldn’t find any. I believe this is a pretty serious issue for a router board.


(Frank W.) #74

Try to find these functions and add printks to them:

[  130.808234] [<c07c4b4c>] (netif_skb_features) from [<c07c4f2c>] (validate_xmit_skb+0x24/0x2f8)
[  130.816789] [<c07c4f2c>] (validate_xmit_skb) from [<c07c5240>] (validate_xmit_skb_list+0x40/0x70)
[  130.825604] [<c07c5240>] (validate_xmit_skb_list) from [<c07f5b44>] (sch_direct_xmit+0x17c/0x1c4)
[  130.834418] [<c07f5b44>] (sch_direct_xmit) from [<c07f5c34>] (__qdisc_run+0xa8/0x36c)
[  130.842196] [<c07f5c34>] (__qdisc_run) from [<c07c1cc0>] (net_tx_action+0x14c/0x254)
[  130.849889] [<c07c1cc0>] (net_tx_action) from [<c01015dc>] (__do_softirq+0xec/0x370)
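To collect the function names from a call trace like the one above without copying them by hand, a short Python sketch along these lines could be used. The regular expression is written against the ARM trace format shown here and may need adjusting for other architectures; the helper names are hypothetical.

```python
import re

# Matches "(callee) from [<address>] (caller+0xOFF/0xSIZE)" as printed above.
TRACE_LINE = re.compile(
    r"\(([\w.]+)\) from \[<[0-9a-f]+>\] \(([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def trace_functions(log_text):
    """Return the function names of a call trace, topmost (innermost) first,
    with duplicates removed."""
    names = []
    for match in TRACE_LINE.finditer(log_text):
        for name in match.groups():
            if name not in names:
                names.append(name)
    return names


def grep_commands(names):
    """Build one recursive grep command per function name."""
    return ["grep -Rn '{}' .".format(name) for name in names]
```

Feeding the six trace lines above into `trace_functions` yields the names from `netif_skb_features` down to `__do_softirq`, and `grep_commands` turns each of them into a command to run from the top of the kernel source tree.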

You can grep for these functions in the source tree of my repo with “grep -Rn 'function' .”; then you can add this line after each existing line:

printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

Recompile, boot and see which is the last line printed. Then examine the source line that follows it, and maybe add the variables accessed on that line to the preceding printk.

The crash itself should happen in validate_xmit_skb when it calls netif_skb_features:

./net/core/dev.c:3041:static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)

3041 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
3042 {
3043     netdev_features_t features;
3044
3045     features = netif_skb_features(skb);
3046     skb = validate_xmit_vlan(skb, features);

It looks like you have to add printks to netif_skb_features, which is in the same file at line 2945.

Maybe you can try a mainline 4.14: swap mmc and move uart2 before the others in the dtsi file … add my defconfig and build script, then recompile.

The kernel failure is a “Fatal exception in interrupt”, so maybe it results from one of my modifications (2nd gmac or internal wifi driver/watchdog).


(Marco Alvarado) #75

There were some discussions about zRAM in the Armbian forum. It seems to be very useful for an R2 machine. Later I will try to make it work with 4.14.71 on an R2 device to check it in real life. If this (the scripting) works well, it seems to be a good improvement for the R2 Ubuntu 18.04 image.


(Shujaa) #76

Thanks Frank,

I’ll have a look into it in a bit, but cannot right now; I’ll keep you updated once I have debugged this. For other users: kernels 4.18 and 4.19 work fine in that respect (i.e. injecting network latency on L3 interfaces).

At the moment I’m being pressed to find a usable network latency injector, and it looks like this ARP issue will discredit the Banana Pi completely. Do you have any idea how I could get support for this particular issue? I’d be OK debugging the MT7530 driver, but would need lots of documentation to do so…

Cheers,


(Frank W.) #77

Are you using my repo or the compiled binaries?

As a quick test you can try a 4.14 from 4.14-main before 2nd gmac (4.14.53+), so please try to reset 4.14-main to https://github.com/frank-w/BPI-R2-4.14/commit/6ef94287ce470f615a17fedb577b607997082c0e, recompile and test again.

Here you can see that 2nd gmac was merged after that: https://github.com/frank-w/BPI-R2-4.14/commits/4.14-main?after=723a96f6eaf45451130c250e0bba73142995b837+1835

You can also use my precompiled kernels on gdrive (e.g. 4.14.48-main):

https://drive.google.com/open?id=1EGN1TvqCpDHdOAS-mjRg9ipi0kahnOUV