Problem with NAT/ip_forward

@sinovoip Could you read this interesting topic and try to answer “what’s going on?” Regards :wink:

Test on the problematic BPI-R2, on which NAT does not work with kernel 4.16.18.

On kernel:

root@bpi-iot-ros-ai:/home/pi# uname -a 
Linux bpi-iot-ros-ai 5.3.0-rc1-bpi-r2-phylink-2.5 #1 SMP Mon Jul 29 11:37:39 CEST 2019 armv7l GNU/Linux

with NAT up:

root@bpi-iot-ros-ai:/home/pi# iptables -t nat -L -vn | grep MASQ
   10   749 MASQUERADE  all  --  *      wan     0.0.0.0/0            0.0.0.0/0 
root@bpi-iot-ros-ai:/home/pi# sysctl -a | grep ip_forward
net.ipv4.ip_forward = 1
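
The two checks above can be rolled into one small helper. This is only a minimal sketch: it assumes the NAT rules have been dumped in `iptables-save` format and that the outgoing interface is called `wan` as in this thread. The function name `nat_ready` and its file arguments are made up for illustration.

```shell
#!/bin/sh
# nat_ready FORWARD_FILE RULES_FILE [IFACE]
# FORWARD_FILE: file holding the ip_forward flag
#               (normally /proc/sys/net/ipv4/ip_forward)
# RULES_FILE:   rules in iptables-save format
#               (e.g. created with: iptables-save -t nat > /tmp/nat.rules)
# Prints "ok" when forwarding is on and a MASQUERADE rule exists for IFACE.
# nat_ready is a hypothetical helper, not part of any package.
nat_ready() {
    fwd_file="$1"
    rules_file="$2"
    iface="${3:-wan}"

    if [ "$(cat "$fwd_file" 2>/dev/null)" != "1" ]; then
        echo "ip_forward is off"
        return 1
    fi
    # "--" stops option parsing because the pattern starts with a dash.
    if ! grep -q -- "-o $iface .*-j MASQUERADE" "$rules_file" 2>/dev/null; then
        echo "no MASQUERADE rule for $iface"
        return 1
    fi
    echo "ok"
}
```

On the board it could be called as `nat_ready /proc/sys/net/ipv4/ip_forward /tmp/nat.rules wan`. It only confirms that the configuration is in place, not that packets are actually forwarded.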

And on the test-PC:

root@slackware:~# wget http://noc.pirx.pl/100mb.bin -O /dev/null
Connecting to noc.pirx.pl (217.73.181.197:80)
null                  61% |*******************************************************************************                                                  | 63418k  0:00:08 ETA^C

@frank-w HDMI doesn’t work on this kernel version 5.3-rc1 (the branch you pointed me to :frowning: )

It is stupid, strange and something else. How is it possible that the same kernel version works well on one BPI-R2 board but does not work on another? How? :rage:

edit2: @frank-w Do you have a branch with phylink in your repository for kernel 4.16.18?

HDMI only works on the 5.3-hdmi branch…

Is it possible to merge these two branches to get working HDMI and NAT on this kernel version (5.3)? I will run more tests on the other problematic BPI-R2 boards. I have a few of them.

Just use hdmi-branch if you want to use mainline net driver

What does “mainline” mean? Is it the same net driver as in the “5.3-phylink-2.5” branch?

Mainline is the default driver from Linux…phylink is a changed network driver

OK, thx for clarification.

Now I have compiled kernel 5.3-rc1 with the phylink driver, and NAT works well on the same board on which it did not work with kernel 4.16. My question was: is it possible to use the phylink driver and have working HDMI too on kernel 5.3-rc1, or does a patch exist for kernel 4.16 to use the phylink driver? Then I would be able to check whether NAT also works on this older kernel version with the phylink driver.

Phylink for mtk does not exist for 4.16 because it is currently in development

To try out phylink and hdmi together you need to merge these 2 branches (see bpi-r2 kernel development thread), but if all works with the mainline net-driver your problem is solved, right?

Yes, of course. So now I must check with the mainline net driver :slight_smile: I’m a little afraid that I will end up in the same situation as with kernel 4.16… Let me check :slight_smile:

Strange. On this problematic BPI-R2 (on which NAT doesn’t work with kernel 4.16), NAT appears to work with kernel 5.3.0-rc1-bpi-r2-hdmi. I took the next board, which also has problems with NAT on kernel 4.16: on kernel 5.3 NAT seems to work, but the speed is very, very slow… Strange, and I’m tired of this unclear situation with these BPI-R2 boards. I have no other idea about the real cause of these problems.

have you done a powercycle between 4.16 and 5.3? Can you look if phylink works better?

What do you mean by “done a powercycle between 4.16 and 5.3”? If it means powering the board off and then on again, then yes, I did that. I will of course test with phylink too.

yes i mean power-off, not only reset between kernels…these issues are really hard to find: if a previous kernel sets a hardware-register/memory to some value and the next kernel does not overwrite it (or misses it), you still have that memory-content from the old kernel and get the same issue (or it works until a power-cycle, and no longer from cold boot-up)

All reboots (kernel version switches) are done as a full power cycle, not a reboot. With kernel

5.3.0-rc1-bpi-r2-phylink-2.5

on the BPI-R2 I’m not able to use the dhcp client on the wan interface:

root@bpi-iot-ros-ai:/home/pi# dhcpcd wan
wan: waiting for carrier
timed out 
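
“waiting for carrier” means dhcpcd does not see the link as up. The kernel’s own view of the link can be read from sysfs, which helps to tell a driver/link problem from a DHCP problem. A minimal sketch follows. The function name `link_state` and its optional second argument are hypothetical and only there to make the check easy to try out.

```shell
#!/bin/sh
# link_state IFACE [SYSFS_NET_DIR]
# Prints "up", "down" or "unknown" based on the kernel's carrier flag.
# SYSFS_NET_DIR defaults to /sys/class/net; the override exists only for testing.
# link_state is a hypothetical helper name.
link_state() {
    iface="$1"
    dir="${2:-/sys/class/net}"
    # Reading "carrier" fails while the interface is administratively down
    # or does not exist, so errors are silenced and reported as "unknown".
    carrier=$(cat "$dir/$iface/carrier" 2>/dev/null || true)
    case "$carrier" in
        1) echo "up" ;;
        0) echo "down" ;;
        *) echo "unknown" ;;
    esac
}
```

Running `link_state wan` right before `dhcpcd wan` would show whether the phylink kernel reports carrier on that port at all.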

The network interface can’t obtain an IP, but we receive DNS settings from the DHCP server:

root@bpi-iot-ros-ai:/home/pi# cat /etc/resolv.conf
# Generated by resolvconf
nameserver 10.10.0.1
nameserver fe80::fad1:11ff:feb0:da6f%wan

Network speed is much better with this phylink driver than with mainline. wget from the test-PC, with the 5.3 kernel with phylink driver on the BPI-R2 and NAT:

root@slackware:~# /usr/bin/wget http://noc.pirx.pl/100mb.bin -O /dev/null 
--2009-03-04 21:35:37--  http://noc.pirx.pl/100mb.bin
Resolving noc.pirx.pl (noc.pirx.pl)... 217.73.181.197
Connecting to noc.pirx.pl (noc.pirx.pl)|217.73.181.197|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: '/dev/null'

/dev/null                                     70%[=================================================================>                            ]  70.56M  7.39MB/s    eta 5s     ^C

wget from test-PC with 5.3.0-rc1-bpi-r2-hdmi with mainline driver on BPI-R2 and NAT:

root@slackware:~# /usr/bin/wget http://noc.pirx.pl/100mb.bin -O /dev/null 
--2009-03-04 21:48:21--  http://noc.pirx.pl/100mb.bin
Resolving noc.pirx.pl (noc.pirx.pl)... 217.73.181.197
Connecting to noc.pirx.pl (noc.pirx.pl)|217.73.181.197|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: '/dev/null'

/dev/null                                     13%[===========>                                                                                  ]  13.66M  1.72MB/s    eta 44s    ^C

dhcp client works well on wan interface with this mainline network driver.
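
To put a number on the difference between the two wget runs above, the reported rates can simply be divided. This is only a rough comparison: both values are momentary rates at the point where the download was interrupted, not averages over a full transfer. `speed_ratio` is a made-up helper name.

```shell
#!/bin/sh
# speed_ratio FAST SLOW
# Prints how many times faster FAST is than SLOW, rounded to one decimal.
# speed_ratio is a hypothetical helper name.
speed_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

# Rates reported by wget in the two runs above (MB/s):
# phylink driver first, mainline driver second.
speed_ratio 7.39 1.72   # prints 4.3
```

So on this board the phylink kernel was roughly four times faster through NAT than the mainline driver in this single test.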

:frowning: :frowning: [Is it not stupid?]

do you see any arp from/to wan in phylink-kernel? have you done speed-test also on wan with phylink?

strange that dhcp does not work on wan but traffic does if manually configured…is dhcp-client working on lan-ports?

  1. What do you mean - tcpdump?
  2. speed-test on wan with phylink - will the wget result be enough?
  3. I have not tested the dhcp-client on lan ports. I set up a dhcp server on one lan port (lan3) and it works. The test-PC gets its IP from it.

tcpdump is an application to show traffic on an interface

sudo tcpdump -i wan #maybe you need to install it first

and then do (on other console) the dhcp-request
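
A plain capture on wan can be noisy. A capture filter that keeps only ARP and DHCP (UDP ports 67/68) makes it easier to see whether the DHCP request leaves the port and whether any answer comes back. The sketch below only prints the command so it can be checked before running it as root. `capture_cmd` is a made-up helper name, and the default interface `wan` is taken from this thread.

```shell
#!/bin/sh
# capture_cmd [IFACE]
# Prints a tcpdump command line that shows only ARP and DHCP (UDP ports 67/68)
# on IFACE, with link-level headers/MAC addresses (-e) and without name
# resolution (-n). capture_cmd is a hypothetical helper name.
capture_cmd() {
    iface="${1:-wan}"
    printf "tcpdump -i %s -n -e 'arp or (udp and (port 67 or port 68))'\n" "$iface"
}

capture_cmd wan
```

Run the printed command with sudo on one console and `dhcpcd wan` on the other.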

wget should be enough for your case…i see no explanation why dhcp should not work if you can make normal traffic…also if link is down when running dhcp-client

i got response from rene (author of phylink-patches)…

if a link is active it works like the normal way…a single protocol should not be affected, if you can make traffic on that interface dhcp should also work…

I’m trying to merge 5.3-hdmi branch and 5.3-phylink-2.5…

root@bpi-r2:/usr/src/frank_5.3-hdmi# git checkout -b 5.3-hdmi-phylink
Switched to a new branch '5.3-hdmi-phylink'
root@bpi-r2:/usr/src/frank_5.3-hdmi# git merge 5.3-hdmi
Auto-merging arch/arm/configs/mt7623n_evb_fwu_defconfig
CONFLICT (add/add): Merge conflict in arch/arm/configs/mt7623n_evb_fwu_defconfig
Auto-merging arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
Auto-merging arch/arm/boot/dts/mt7623.dtsi
Automatic merge failed; fix conflicts and then commit the result.
root@bpi-r2:/usr/src/frank_5.3-hdmi# git branch 
  5.3-hdmi
* 5.3-hdmi-phylink
  5.3-phylink-2.5

How do I fix it?

EDIT: OK, I have it.

nano arch/arm/configs/mt7623n_evb_fwu_defconfig
git add .
git commit -m "Resolved merge conflict"

Have you fixed the merge-conflict (removing at least conflict markers “<<<<”,"====",">>>>" )?

Then only “git add <file>” for each file changed (check with git status) and then “git merge --continue”
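
Before adding a resolved file it is worth checking that no conflict markers were left behind, because `git add` accepts the file either way. A minimal sketch, assuming the default marker style of seven `<`, `=` or `>` characters at the start of a line; `has_conflict_markers` is a made-up helper name.

```shell
#!/bin/sh
# has_conflict_markers FILE
# Prints the line numbers and text of leftover git conflict markers in FILE.
# Exit status is 0 if markers were found and 1 if the file is clean.
# has_conflict_markers is a hypothetical helper name.
has_conflict_markers() {
    grep -nE '^(<<<<<<<|=======|>>>>>>>)( |$)' "$1"
}
```

For the file from this merge that would be `has_conflict_markers arch/arm/configs/mt7623n_evb_fwu_defconfig`. Paths that git itself still considers unmerged can be listed with `git diff --name-only --diff-filter=U`.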