BPI-R2 Kernel 4.14 HNAT


(Frank W.) #41

If I understand @rainfall83 correctly with this:

then this change is not needed, and all traffic should be bound. I only tested the init procedure (which is called when the module is loaded), because I don't know how to debug it any deeper.


(xbgmsharp) #42

OK, then I will revert my change.


(Frank W.) #43

I can confirm that on 4.14-hnat all entries are UNBIND as well:

log
root@bpi-r2-ubuntu:~# cat /sys/kernel/debug/hnat/all_entry
(fb8c5180)0x00146|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:58199=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8c5880)0x00162|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:2885=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000118|info2=0x0|vlan1=0|vlan2=0
(fb8c6a80)0x001aa|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:41761=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8cf800)0x003e0|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:27396=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000016|info2=0x0|vlan1=0|vlan2=0
(fb8d8e00)0x00638|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:47592=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8e1c00)0x00870|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:7884=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000019|info2=0x0|vlan1=0|vlan2=0
(fb8e6580)0x00996|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45668=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000018|info2=0x0|vlan1=0|vlan2=0
(fb8e6780)0x0099e|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45664=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000018|info2=0x0|vlan1=0|vlan2=0
(fb8ee700)0x00b9c|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45665=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000019|info2=0x0|vlan1=0|vlan2=0
(fb8f0300)0x00c0c|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:52466=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8f4d00)0x00d34|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:7278=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8f5300)0x00d4c|state=UNBIND|type=IPV4_HNAPT|192.168.0.10:53->192.168.0.11:52306=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x50000119|info2=0x0|vlan1=0|vlan2=0
(fb8f6680)0x00d9a|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45666=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000119|info2=0x0|vlan1=0|vlan2=0
(fb8fe600)0x00f98|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45667=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000118|info2=0x0|vlan1=0|vlan2=0
(fb8ff800)0x00fe0|state=UNBIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:45663=>0.0.0.0:0->0.0.0.0:0|00:00:00:00:00:00=>00:00:00:00:00:00|etype=0x0000|info1=0x10000118|info2=0x0|vlan1=0|vlan2=0

How can we debug this?

dmesg
[  380.129873] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:252) of_node=hnat
[  380.129900] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:265) res:-559325376
[  380.129966] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:273) host-fe_base:-503775232
[  380.130236] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:281) err (debugfs):0
[  380.133039] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:293) err (hnat_start):0
[  380.133044] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:299) register reached
[  380.323695] [HNAT] hnat_probe: (drivers/net/ethernet/mediatek/mtk_hnat/hnat.c:303) ret (hnat_register):0

As far as I know, traffic is received by the hnat module, but I don't know why it stays unbound. There are several unclear checks in mtk_hnat_nf_post_routing before the bind is set (in skb_to_hnat_info).


(Frank W.) #44

I've added some printk calls to mtk_hnat_nf_post_routing and looked at which of them get printed.

First result: there are fewer than I expected, only 2 entries (I thought this would run for each packet received from the driver).

Jul 18 12:03:25 bpi-r2-ubuntu kernel: [  115.123949] [HNAT] mtk_hnat_nf_post_routing: (drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:171) ct
Jul 18 12:03:25 bpi-r2-ubuntu kernel: [  115.124015] [HNAT] mtk_hnat_nf_post_routing: (drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:178) help

my code:

    ct = nf_ct_get(skb, &ctinfo);
    if (!ct)
        return 0;
    /* debug: packet has a conntrack entry */
    printk(KERN_WARNING "[HNAT] %s: (%s:%i) ct",
           __FUNCTION__, __FILE__, __LINE__);

    /* rcu_read_lock()ed by nf_hook_slow */
    help = nfct_help(ct);
    if (help && rcu_dereference(help->helper))
        return 0;
    /* debug: no conntrack helper is attached */
    printk(KERN_WARNING "[HNAT] %s: (%s:%i) help",
           __FUNCTION__, __FILE__, __LINE__);

    if ((FROM_GE_WAN(skb) || FROM_GE_LAN(skb)) &&
        skb_hnat_is_hashed(skb) &&
        (skb_hnat_reason(skb) == HIT_BIND_KEEPALIVE_DUP_OLD_HDR))
        return -1;
    /* debug: the keepalive check above did not return */
    printk(KERN_WARNING "[HNAT] %s: (%s:%i) from_wan/lan",
           __FUNCTION__, __FILE__, __LINE__);

As you can see, the last message does not come up, so that condition seemed to be the problem. After adding a printk for the individual parts of the check, I get many more messages (including the missing "from_wan/lan"), so it seems the previous test had cached something on the client side.

    printk(KERN_WARNING "[HNAT] %s: (%s:%i) check: from wan: %i,from lan:%i,skb-hashed:%i,skb-reason:%i",
           __FUNCTION__, __FILE__, __LINE__, FROM_GE_WAN(skb), FROM_GE_LAN(skb), skb_hnat_is_hashed(skb), skb_hnat_reason(skb));

What puzzles me: I did not get a single message with "from lan:1".

Every time the "from_wan/lan" message comes up, wan=1. Maybe it is a detection problem with the LAN ports.

Jul 18 12:21:24 bpi-r2-ubuntu kernel: [  440.135087] [HNAT] mtk_hnat_nf_post_routing: (drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:181) check: from wan: 1,from lan:0,skb-hashed:1,skb-reason:14
Jul 18 12:21:24 bpi-r2-ubuntu kernel: [  440.135156] [HNAT] mtk_hnat_nf_post_routing: (drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:189) from_wan/lan

skb-hashed is mostly 1, and skb-reason is 12, 13, 14, 15 or 23 (here all values are 0).

drivers/net/ethernet/mediatek/mtk_hnat/hnat.h:371:#define FROM_GE_LAN(skb)	(HNAT_SKB_CB(skb)->iif == FOE_MAGIC_GE_LAN)
drivers/net/ethernet/mediatek/mtk_hnat/hnat.h:375:#define FOE_MAGIC_GE_LAN	0x7272
drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:223:		HNAT_SKB_CB(skb)->iif = FOE_MAGIC_GE_LAN;

The flag is set in the pre-routing netfilter hook:

    if (IS_WAN(state->in))
        HNAT_SKB_CB(skb)->iif = FOE_MAGIC_GE_WAN;
    else if (IS_LAN(state->in))
        HNAT_SKB_CB(skb)->iif = FOE_MAGIC_GE_LAN;

which leads us to the IS_LAN macro:

drivers/net/ethernet/mediatek/mtk_hnat/hnat.h:402:#define IS_LAN(dev)		(!strncmp(dev->name, "lan", 3))

Lol, after I added a printk to the pre_routing function I see my problem:

Jul 18 12:48:11 bpi-r2-ubuntu kernel: [  117.792015] [HNAT] mtk_hnat_nf_pre_routing: (drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c:221) in-device: ap0

I had only tested over the wifi device ;) because I need my laptop to build the kernel. I think IS_LAN should be changed to include ap0 and wlan* devices, or maybe to mean !wan (but what about the unknown devices, the last case in pre-routing?).
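To make the effect of the 3-character prefix compare visible outside the kernel, here is a small userspace sketch. The helper name name_is_lan is made up for this example; only the strncmp(..., "lan", 3) expression is taken from the IS_LAN macro quoted above.

```c
#include <string.h>

/*
 * Userspace stand-in for IS_LAN(dev): only the first three characters
 * of the interface name are compared, exactly like the macro does.
 * The function name is hypothetical and not part of the driver.
 */
static int name_is_lan(const char *name)
{
    return !strncmp(name, "lan", 3);
}
```

With this, name_is_lan("lan0") and name_is_lan("lanbr0") return 1, while name_is_lan("ap0") and name_is_lan("wlan0") return 0, which matches what the pre_routing printk showed for the wifi test.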

There may be some other problems:

drivers/net/ethernet/mediatek/mtk_hnat/hnat_nf_hook.c

    if (IS_LAN(dev))
        gmac = NR_GMAC1_PORT;
    else if (IS_WAN(dev))
        gmac = NR_GMAC2_PORT;

IMHO I cannot simply add ap0 to IS_LAN, because then traffic would be routed to GMAC1_PORT. Maybe we need an IS_WIFI :)

If I test over lan0, I also get BIND and FIN states :)

cat /sys/kernel/debug/hnat/all_entry | grep -v 'UNBIIND'
(fb8f6300)0x00d8c|state=FIN|type=IPV4_HNAPT|192.168.4.22:54306->78.46.4.143:443=>192.168.0.11:5306->78.46.4.143:443|a6:eb:a4:f6:6f:a1=>08:00:00:00:00:00|etype=0x8100|info1=0x315102e4|info2=0x3f040|vlan1=1|vlan2=0
(fb8f6980)0x00da6|state=UNBIND|type=IPV4_HNAPT|192.168.4.22:33698->13.32.8.17:443=>192.168.0.11:33698->13.32.8.17:443|a6:eb:a4:f6:6f:a1=>08:00:00:00:00:00|etype=0x8100|info1=0x100000e1|info2=0x3f040|vlan1=1|vlan2=0
(fb8f8b80)0x00e2e|state=BIND|type=IPV4_HNAPT|192.168.4.22:58824->162.125.66.3:443=>192.168.0.11:58824->162.125.66.3:443|a6:eb:a4:f6:6f:a1=>08:00:00:00:00:00|etype=0x8100|info1=0x215182e0|info2=0x3f040|vlan1=1|vlan2=0

Also tested with the 4.14-hnat branch: same result.

uname -r
4.14.54-bpi-r2-hnat
cat /sys/kernel/debug/hnat/all_entry | grep -v 'UNBIND'
(fb8e9980)0x00a66|state=BIND|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:58332=>52.88.66.112:80->192.168.4.22:58332|2a:2b:78:8b:8f:d0=>00:13:77:b7:a7:62|etype=0x8100|info1=0x215180fb|info2=0x83f020|vlan1=2|vlan2=0
(fb8e9b80)0x00a6e|state=FIN|type=IPV4_HNAPT|52.88.66.112:80->192.168.0.11:58328=>52.88.66.112:80->192.168.4.22:58328|2a:2b:78:8b:8f:d0=>00:13:77:b7:a7:62|etype=0x8100|info1=0x315180fa|info2=0x83f020|vlan1=2|vlan2=0
(fb8f9d00)0x00e74|state=FIN|type=IPV4_HNAPT|192.168.4.22:58328->52.88.66.112:80=>192.168.0.11:58328->52.88.66.112:80|7e:81:57:f2:23:c0=>08:00:00:00:00:00|etype=0x8100|info1=0x315180fa|info2=0x3f040|vlan1=1|vlan2=0
(fb8f9f00)0x00e7c|state=BIND|type=IPV4_HNAPT|192.168.4.22:58332->52.88.66.112:80=>192.168.0.11:58332->52.88.66.112:80|7e:81:57:f2:23:c0=>08:00:00:00:00:00|etype=0x8100|info1=0x215180fb|info2=0x3f040|vlan1=1|vlan2=0
....

So I assume that hnat works. @rainfall83, am I right? How can we add this behaviour for ap0 and wlanX?

So everything that is not wan (dts setting), lan (first 3 letters of the interface name, so also lanbrX) or a bridge device (first 2 letters "br") gets tagged as invalid. Maybe we can create an additional tag (e.g. FOE_WIFI) and add an IS_WIFI macro (checking whether the device name is ap0 or contains "wlan").
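The classification described above, plus the proposed wifi case, could look roughly like this in userspace. This is only a sketch of the idea: the enum, both function names and the wan_name parameter (standing in for the wan interface from the dts) are assumptions for illustration, not driver code.

```c
#include <string.h>

/* Illustrative tags only; names and values do not come from the driver. */
enum iface_tag { TAG_INVALID, TAG_WAN, TAG_LAN, TAG_BRIDGE, TAG_WIFI };

/* Hypothetical IS_WIFI-style check: name is "ap0" or contains "wlan". */
static int name_is_wifi(const char *name)
{
    return !strcmp(name, "ap0") || strstr(name, "wlan") != NULL;
}

/* wan_name stands in for the wan interface name configured in the dts. */
static enum iface_tag classify_iface(const char *name, const char *wan_name)
{
    if (!strcmp(name, wan_name))
        return TAG_WAN;
    if (!strncmp(name, "lan", 3))   /* also matches lanbrX */
        return TAG_LAN;
    if (!strncmp(name, "br", 2))
        return TAG_BRIDGE;
    if (name_is_wifi(name))         /* the proposed extra case */
        return TAG_WIFI;
    return TAG_INVALID;
}
```

A separate tag keeps wifi apart from lan, so the gmac selection that depends on IS_LAN would not be affected by it.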

Btw., in /proc/interrupts I see no decisive difference between running with and without the mtkhnat module.

OK, it takes some time. I started a download (an Ubuntu image) and after a few seconds the interrupts were counting up at ~4 per second. In debugfs I see more BIND than UNBIND entries.
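To put numbers on "more bind than unbind", the state field of each all_entry line can be counted. The sketch below assumes only the line layout visible in the dumps in this thread ("...|state=XXX|..."); the struct and function names are made up for the example.

```c
#include <stdio.h>
#include <string.h>

struct state_counts { unsigned bind, unbind, fin, other; };

/* Copy the value of the "state=" field of one entry line into out.
 * Returns 0 on success, -1 if the line has no such field. */
static int entry_state(const char *line, char *out, size_t outlen)
{
    const char *p = strstr(line, "state=");
    size_t n;

    if (!p || outlen == 0)
        return -1;
    p += strlen("state=");
    n = strcspn(p, "|\n");          /* field ends at '|' or end of line */
    if (n >= outlen)
        n = outlen - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* Update the counters for one line; lines without a state are ignored. */
static void count_entry(struct state_counts *c, const char *line)
{
    char state[16];

    if (entry_state(line, state, sizeof(state)) != 0)
        return;
    if (!strcmp(state, "BIND"))
        c->bind++;
    else if (!strcmp(state, "UNBIND"))
        c->unbind++;
    else if (!strcmp(state, "FIN"))
        c->fin++;
    else
        c->other++;
}

/* Count all entries from an open stream, e.g. the debugfs file or stdin. */
static struct state_counts count_stream(FILE *fp)
{
    struct state_counts c = {0, 0, 0, 0};
    char line[512];

    while (fgets(line, sizeof(line), fp))
        count_entry(&c, line);
    return c;
}
```

The whole field is compared rather than searching for a substring, because "UNBIND" contains "BIND" and a plain substring match would count both.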


(Frank W.) #45

To get it running for wifi, maybe we can change the code to check only for wan (because traffic should not go from wan to wan, and wan is always part of the check):

if (FROM_GE_WAN(skb) || IS_WAN(out)) {

instead of

if ((FROM_GE_WAN(skb) || FROM_GE_LAN(skb)) &&

and

if (FROM_GE_WAN(skb) || IS_WAN(out)) {

instead of

if ((IS_LAN(out) && FROM_GE_WAN(skb)) || (IS_WAN(out) && FROM_GE_LAN(skb))) {
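To compare the existing wan<->lan condition with the wan-only variant, both can be modelled on a reduced port type. This is a sketch only: enum port and both function names are invented here, and PORT_OTHER stands for anything that is neither wan nor lan (for example ap0).

```c
#include <stdbool.h>

/* Where a packet came from / goes to, reduced to what the checks look at. */
enum port { PORT_WAN, PORT_LAN, PORT_OTHER /* e.g. ap0 */ };

/* Shape of the existing check: only wan<->lan pairs qualify. */
static bool cond_existing(enum port in, enum port out)
{
    return (out == PORT_LAN && in == PORT_WAN) ||
           (out == PORT_WAN && in == PORT_LAN);
}

/* Shape of the proposed check: wan on either side is enough. */
static bool cond_proposed(enum port in, enum port out)
{
    return in == PORT_WAN || out == PORT_WAN;
}
```

In this model cond_proposed also accepts ap0->wan and wan->ap0, which the existing check rejects. It would accept wan->wan as well, so it relies on that case not occurring.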

But first I want to merge the current version, if someone confirms it is working.

Maybe then we can merge the blocks so that the wan port is checked only once, with the additional checks inside.

Checking only for wan does not seem to work, at least not without tagging in mtk_hnat_nf_pre_routing.

This is what I tried (no traffic with this patch on top of the 4.14-hnat branch): hnat_test_wifi.diff (2,5 KB). I get bind entries for the ap0 connection, but traffic seems to be routed wrongly or dropped.


(Frank W.) #46

@jackzeng please use this thread if you are working on hnat (wifi/bridge-support).


(ZB) #47

OK, Frank, I am going to try your repo to look at the xhci problem.


(ZB) #48

Hello Frank, I'm trying to port hnat to kernel 4.4, but after I ported one version of the code, eth gets disabled. Do you have any suggestions?


(Frank W.) #49

The current hnat code depends on the second gmac. I don't know whether that is implemented in 4.4 the same way as in 4.9/4.14. Forwarding also saves the source port, which needs to be named lanX or wan (this needs the DSA driver).

Why do you put such effort into the old 4.4 kernel?

Can you help me with uboot (mmc-offset), bluetooth (also not working in 4.4) and wifi (wmt-tools)?


(Frank W.) #50

What is the status of the official hnat patch (new framework)? The last info I have is that it is currently under review.


(Frank W.) #51

@Ryder.Lee can you give me an update on this?

The framework should be this: https://www.kernel.org/doc/Documentation/networking/nf_flowtable.txt right? How far along is the hnat/hwqos implementation (4.19)?


(moore liu) #52

Per my understanding, the framework supports software acceleration, and the OpenWrt community enhanced it to support hardware acceleration in OpenWrt trunk. We should wait until the upstream framework supports hw acceleration before sending patches, thanks.


(Frank W.) #53

Can i see the progress anywhere?


(moore liu) #54

We are studying the upstream framework and just found that the hnat feature can only be supported in OpenWrt trunk. Once we have an overview of the framework, we will discuss the hw nat plan with the netfilter core developers, and we hope mt7623 can be the first platform to support the upstream hnat feature, thanks.


(Cioby23) #55

Any news about this feature being implemented soon?


(moore liu) #56

We have sent 3 patches to OpenWrt and kernel.org to fix mt7621 hnat issues. All patches were merged, but OpenWrt users reported that the system becomes unstable, and we are trying to find the root cause, thanks.


(Alex) #57

How many connections are supported by hnat? I tested the BPI-R2 with an Ixia; with one UDP flow it works well. But I saw only 4 records in debug/hnat/all_entry when running the test with 128 UDP flows.


(Frank W.) #58

Which kernel have you tested? The old hnat code in 4.14/4.9 from my repo, or the new code in OpenWrt? Remember, hnat only works between lanX and wan: no bridge and no wifi.


(Alexey Loukianov) #59

Em, just to make sure I understood you correctly: are you saying that HW NAT acceleration won't work if a person has the lanX ports bridged together? I mean, it is not an ordinary Linux bridge we get here; it is a DSA-assisted bridge, and essentially we've got an eth0<->eth1 physical interconnection in this case.

My expectation was that a HW NAT implementation works something like this: the control plane is implemented as usual through the netfilter stack, with additional setup that moves data-plane processing either into the switch ASIC or into a specific HW-based accelerator built into the SoC. That accelerator intercepts packets matching the patterns set up by the control plane and performs NAT translation and forwarding between the wan port and the lanX ports without kernel participation.
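The control-plane/data-plane split described here can be illustrated with a toy flow table. This is a generic model, not the MediaTek implementation: every type, name and the table size below are assumptions made up for the sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy 4-tuple; a real table would also carry protocol, MACs, VLANs etc. */
struct flow_key { uint32_t sip, dip; uint16_t sport, dport; };

struct flow_entry {
    struct flow_key match;   /* pattern installed by the control plane */
    struct flow_key rewrite; /* translated tuple applied on a hit */
    int bound;               /* 1 = fast path handles this flow */
};

#define TABLE_SIZE 8
static struct flow_entry table[TABLE_SIZE];

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->sip == b->sip && a->dip == b->dip &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Control plane: install a translation. Returns 0, or -1 if the table is full. */
static int flow_bind(struct flow_key match, struct flow_key rewrite)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (!table[i].bound) {
            table[i].match = match;
            table[i].rewrite = rewrite;
            table[i].bound = 1;
            return 0;
        }
    }
    return -1;
}

/* Data plane: rewrite in place on a hit (returns 1); on a miss the
 * packet is left untouched for the normal kernel path (returns 0). */
static int flow_fast_path(struct flow_key *pkt)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (table[i].bound && key_equal(&table[i].match, pkt)) {
            *pkt = table[i].rewrite;
            return 1;
        }
    }
    return 0;
}
```

In this model the first packet of a flow misses and goes through the kernel, which may then call flow_bind; later packets of the same flow are rewritten by flow_fast_path alone.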


(Frank W.) #60

As far as I understand the 4.14 code, it looks at the device names and activates hnat only for wan->lan and the reverse direction.

See mtk_hnat_nf_post_routing in https://github.com/frank-w/BPI-R2-4.14/commits/4.14-hnat