New netfilter flow table based HNAT

FYR.

The new netfilter flow table based HNAT has already been sent to the netfilter mailing list for upstreaming.

https://patchwork.ozlabs.org/project/netfilter-devel/list/?series=233310&archive=both&state=*


In part 21 it looks like mt7623 is not supported, right?

It can work too; you just have to map it to the corresponding offload version.

Can you tell me the right version and how I can test it?

It’s probably V2, same as MT7622.

Thanks, I’ll try adding mt7623 with version 2, as is done for mt7621 and mt7622.

Can you give an example of configuring nft with HW NAT offload?

Not sure about nftables, I only know the OpenWrt way.

One of these examples should help:

table netdev filter {
    chain ingress {
        type filter hook ingress device eth0 priority 0; flags offload;
        ip daddr 192.168.0.10 tcp dport 22 drop
    }
}

or

table inet flowoffload {
        flowtable f {
                hook ingress priority 0
                devices = { eno1, eno2, ens1f0, ens1f1, virbr0, virbr-cube-kvm }
        }

        chain flowoffload {
                type filter hook forward priority 0; policy accept;
                ip protocol { tcp, udp } flow offload @f
                ip6 nexthdr { tcp, udp } flow offload @f
                counter
                ct state established,related counter accept
                ip protocol { tcp, udp } accept
                ip6 nexthdr { tcp, udp } accept
        }
}

or

table inet x {

    flowtable f {
        hook ingress priority 0; devices = { eth0, eth1 };
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # offload established connections
        ip protocol { tcp, udp } flow offload @f
        ip6 nexthdr { tcp, udp } flow offload @f
        counter packets 0 bytes 0

        # established/related connections
        ct state established,related counter accept

        # allow initial connection
        ip protocol { tcp, udp } accept
        ip6 nexthdr { tcp, udp } accept
    }
}

Hi, I’m trying to add IPv6 support, but with no success. I’ve attached my changes. I suspect my mtk_flow_mangle_ipv6 implementation is wrong; any ideas? @Ryder.Lee

mtk_ppe_offload.c (11.5 KB)

I guess we need to add the flowtable to a NAT chain, something like this:

table ip filter {
        chain input {
                type filter hook input priority 0; policy accept;
        }

        chain output {
                type filter hook output priority 0; policy accept;
        }

        chain forward {
                type filter hook forward priority 0; policy accept;
        }
}
table ip nat {
        flowtable f {
                hook ingress priority 10
                devices = { wan, lan0 }
        }

        chain post {
                type nat hook postrouting priority 0; policy accept;
                oifname "wan" masquerade
                #add flowtable here
        }

        chain pre {
                type nat hook prerouting priority 0; policy accept;
        }
}

but i got

Error: Could not process rule: Operation not supported
add rule nat post ip protocol tcp flow offload @f

@Ryder.Lee any idea? Can I only add the flowtable to the forward chain, as the diagram on this page suggests: https://wiki.nftables.org/wiki-nftables/index.php/Flowtable

loaded modules:

nft_flow_offload       16384  0
nf_flow_table_ipv4     16384  1
nf_flow_table          36864  2 nf_flow_table_ipv4,nft_flow_offload
nf_log_ipv4            16384  0
nf_log_common          16384  1 nf_log_ipv4
nft_log                16384  0
nft_reject_ipv4        16384  0
nf_reject_ipv4         16384  1 nft_reject_ipv4
nft_reject             16384  1 nft_reject_ipv4
nft_limit              16384  0
nft_ct                 20480  0
nft_masq               16384  1
nft_fib_ipv4           16384  0
nft_fib                16384  1 nft_fib_ipv4
nft_nat                16384  0
nft_chain_nat          16384  2
nf_nat                 45056  3 nft_nat,nft_chain_nat,nft_masq
nf_conntrack          135168  6 nft_ct,nft_nat,nf_flow_table,nft_flow_offload,nft_masq,nf_nat
nf_defrag_ipv6         20480  1 nf_conntrack
nf_defrag_ipv4         16384  1 nf_conntrack
nft_counter            16384  0
nf_tables             200704  15 nft_ct,nft_nat,nf_flow_table_ipv4,nft_reject,nft_reject_ipv4,nft_limit,nft_flow_offload,nft_chain_nat,nft_masq,nft_fib,nft_fib_ipv4,nft_counter,nft_log
nfnetlink              20480  1 nf_tables

It seems the series was not sent with the MediaTek mailing list in CC…

I can apply the flow offload rule to the forward chain, but is this right? How can I check whether offloading works? (Watching top while running iperf through the router does not seem like the best method.)

table ip filter {
        flowtable f {
                hook ingress priority 0; devices = { wan, lan0 };
        }
        chain input {
                type filter hook input priority 0; policy accept;
        }

        chain output {
                type filter hook output priority 0; policy accept;
        }

        chain forward {
                type filter hook forward priority 0; policy accept;
                ip protocol { tcp, udp } flow offload @f
        }
}
table ip nat {
        chain post {
                type nat hook postrouting priority 0; policy accept;
                oifname "wan" masquerade
        }

        chain pre {
                type nat hook prerouting priority 0; policy accept;
        }
}

With this config (lan3 added to the flowtable’s devices) I started a test:

192.168.90.122 (client,iperf3 -c) => 192.168.90.1 [lan3] / 192.168.0.12 [wan] (test-r2) => 192.168.0.10 (main-r2,iperf3 -s)

and looked in debugfs:

[email protected]:~# cat /sys/kernel/debug/mtk_ppe/entries
0051e UNB IPv4 5T orig=192.168.0.21:58552->239.255.255.250:1900 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=5000014c ib2=00000000
00742 UNB IPv4 5T orig=192.168.0.10:5201->192.168.0.12:45878 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=10001e49 ib2=00000000
0128c UNB IPv4 5T orig=192.168.90.122:37058->193.227.196.10:22067 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=1000024d ib2=00000000
01516 UNB IPv4 5T orig=192.168.90.122:52304->162.125.7.20:443 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=10000048 ib2=00000000
01672 UNB IPv4 5T orig=193.227.196.10:22067->192.168.0.12:37058 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=1000024d ib2=00000000
01b40 UNB IPv4 5T orig=192.168.0.21:48287->239.255.255.250:1900 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=5000014c ib2=00000000
01f3a UNB IPv4 5T orig=192.168.90.122:45878->192.168.0.10:5201 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=10000049 ib2=00000000
01f3e UNB IPv4 5T orig=192.168.90.122:45876->192.168.0.10:5201 new=0.0.0.0:0->0.0.0.0:0 eth=00:00:00:00:00:00->00:00:00:00:00:00 etype=0000 vlan=0,0 ib1=10000049 ib2=00000000

But I only see UNB (unbound) entries, which keep changing while iperf is running, and no BND (bound) ones. NAT works as expected: the iperf3 server sees the WAN IP of test-r2 instead of the client’s IP.
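Checking the debugfs dump by eye gets tedious once the table fills up. The following is a minimal sketch that tallies entries by state, assuming the line layout shown above (a hex index followed by the state column). The function name `count_ppe_states` and the shortened sample lines are illustrative and not part of the driver.

```python
from collections import Counter

HEX_DIGITS = set("0123456789abcdef")

def count_ppe_states(text):
    """Count PPE entries per state (second column, e.g. UNB or BND)."""
    states = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Entry lines start with a hex table index followed by the state.
        if len(fields) >= 2 and set(fields[0]) <= HEX_DIGITS:
            states[fields[1]] += 1
    return states

# Two entries copied (shortened) from the dump above.
SAMPLE = """\
0051e UNB IPv4 5T orig=192.168.0.21:58552->239.255.255.250:1900
01f3a UNB IPv4 5T orig=192.168.90.122:45878->192.168.0.10:5201
"""

for state, count in sorted(count_ppe_states(SAMPLE).items()):
    print(state, count)
```

On the router, the same function could be fed the live table with `count_ppe_states(open("/sys/kernel/debug/mtk_ppe/entries").read())`; a non-zero BND count would suggest that flows are actually being bound in hardware.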

Accepted connection from 192.168.0.12, port 45876
[  5] local 192.168.0.10 port 5201 connected to 192.168.0.12 port 45878

With and without flow offload I don’t see much CPU load, and throughput is ~93 Mbit/s (the client only has a 100 Mbit card). I added flow offload version 2 for mt7623; maybe this is wrong?

I tried the same on r64 with the same result.

maybe this is related:

[email protected]:~# ethtool -k wan | grep offload
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
macsec-hw-offload: off [fixed]
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]
[email protected]:~# 

The new patches do not add xt_flowoffload like in OpenWrt, and I’m not sure if I need an additional flag (like hw in OpenWrt). I tried to find information on HW offloading but found nothing that helps here.

This probably means HWNAT hooks are set up using the new way (TC).

.ndo_setup_tc = mtk_eth_setup_tc

If you remove ip protocol { tcp, udp } flow offload @f does ethtool show hw-tc-offload off?

seems like this flag (and the others too) is on by default (after bootup without any nftables ruleset configured):

[email protected]:~# ethtool -k wan | grep offload
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
macsec-hw-offload: off [fixed]
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]
[email protected]:~# 
[email protected]:~# nft list ruleset
[email protected]:~#
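To compare feature flags between boots or interfaces without diffing the listings by hand, the `ethtool -k` output can be parsed into a small table. This is a rough sketch assuming the `name: on|off [fixed]` layout shown above; `parse_features` is a hypothetical helper name. It only reports what ethtool prints for the device and says nothing about whether a given flowtable is offloaded.

```python
def parse_features(text):
    """Map feature name -> (enabled, fixed) from `ethtool -k` output."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        words = value.split()
        # Skip shell prompts and header lines that are not "on"/"off" rows.
        if not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = (words[0] == "on", "[fixed]" in words)
    return features

# Two rows copied from the listing above.
SAMPLE = """\
hw-tc-offload: on
rx-vlan-offload: off [fixed]
"""

for name, (enabled, fixed) in parse_features(SAMPLE).items():
    print(name, "on" if enabled else "off", "(fixed)" if fixed else "")
```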

I just tried the new nftables based openwrt firewall and flow offload doesn’t seem to work.

fw4 print:

flowtable f {
	hook ingress priority 0; devices = { eth0, eth1 };
}

chain forward {
	type filter hook forward priority filter; policy drop;
	
	# offload established connections
	ip protocol { tcp, udp } flow offload @f
	ip6 nexthdr { tcp, udp } flow offload @f

	ct state established,related accept comment "!fw4: Allow forwarded established and related flows"



	iifname "br-lan" jump forward_lan comment "!fw4: Handle lan IPv4/IPv6 forward traffic"
	iifname "pppoe-wan" jump forward_wan comment "!fw4: Handle wan IPv4/IPv6 forward traffic"

	jump handle_reject
}

I think I was able to reduce CPU usage by setting flowtable hook like this:

hook ingress priority 0; devices = { pppoe-wan, br-lan };

sirq went from 49% to 29% during a 900 Mbps speedtest.

Still no BND entries in mtk_ppe/entries.

According to this commit we need to specify the HW_OFFLOAD flag, but nft says the syntax is wrong.

It seems that in the old OpenWrt way the flag was hardcoded here, and that nft currently cannot enable flowtable hardware offloading.

I tried adding the flag manually in nf_flow_table_core but then nft complained it cannot read the configuration.

https://lwn.net/Articles/804384/ says

 table inet x {
      flowtable f {
               hook ingress priority 10 devices = { eth0, eth1 }
	       flags offload #<<<<<<<<<<
      }
      chain y {
               type filter hook forward priority 0; policy accept;
               ip protocol tcp flow offload @f
      }
 }
This example above enables the fastpath for TCP traffic between devices
eth0 and eth1. Users can turn on the hardware offload through the
'offload' flag from the flowtable definition.

So I guess we only need flags offload in the flowtable, as in the example above… I hope all kinds of devices work (NIC, DSA, bridge, PPP). And ethtool likely reports whether the flag is supported, not whether it is active.

I see the commit description you’ve posted shows this flag in the example too, but I got an error when adding the flag:
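For trying the same ruleset with different device lists, with and without the flag, a small generator avoids hand-editing the file each time. This is only a sketch: the table and chain names follow the LWN example quoted above, `render_flowtable_ruleset` is a made-up helper, and whether a given nft build accepts `flags offload` inside a flowtable is exactly what is in question in this thread.

```python
def render_flowtable_ruleset(devices, hw_offload=False):
    """Render an nft ruleset with one flowtable over the given devices."""
    lines = [
        "table inet x {",
        "    flowtable f {",
        "        hook ingress priority 0; devices = { %s };" % ", ".join(devices),
    ]
    if hw_offload:
        # Flag taken from the LWN example; may be rejected by older nft.
        lines.append("        flags offload;")
    lines += [
        "    }",
        "    chain y {",
        "        type filter hook forward priority 0; policy accept;",
        "        ip protocol { tcp, udp } flow offload @f",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"

print(render_flowtable_ruleset(["wan", "lan0"], hw_offload=True))
```

The output can be written to a file and checked with `nft -c -f <file>`, which validates the ruleset without applying it.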

nft-nat-flowoffload.nft:6:17-21: Error: syntax error, unexpected flags

(tried with and without semicolon in line before)

Based on https://www.spinics.net/lists/netfilter/msg59253.html I set hw-tc-offload to on again for all interfaces (it was already shown as on before):

[email protected]:~# ethtool -K wan hw-tc-offload on                                 
[email protected]:~# ethtool -K lan3 hw-tc-offload on                                
[email protected]:~# ethtool -K lan0 hw-tc-offload on

I guess nft is too old and does not support the flags:

[email protected]:~# nft --version                                                   
nftables v0.9.0 (Fearless Fosdick)

I already updated to latest nftables but with no luck.

[email protected]:/# nft --version
nftables v0.9.8 (E.D.S.)
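Since two different nft versions are in play here, a scripted check helps when comparing boards. The sketch below parses the `nft --version` banner shown above; `parse_nft_version` and `at_least` are illustrative names, and the minimum version passed in is a placeholder because the thread does not establish which release first accepts the flowtable flag.

```python
import re

def parse_nft_version(output):
    """Extract (major, minor, patch) from `nft --version` output."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no version found in: %r" % output)
    return tuple(int(part) for part in match.groups())

def at_least(output, minimum):
    """True if the reported version is >= the (placeholder) minimum tuple."""
    return parse_nft_version(output) >= minimum

# Banners copied from the two posts above.
print(parse_nft_version("nftables v0.9.0 (Fearless Fosdick)"))
print(at_least("nftables v0.9.8 (E.D.S.)", (0, 9, 0)))
```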

I guess only solution for now is to patch nf_flow_table (or maybe another file) so we set the flag directly in code for any flowtable.

have you added the “flags offload” option to your config? do you get an error?

I cannot get libnftnl/nftables compiled… I tried cross-compiling too.

Yes, still can’t set flags offload.

The latest nftables depends on libnftnl 1.1.9, which is why I could not compile it initially.

It seems there is an option to specify the offload flag for a chain too, not for a flowtable, as this commit suggests: http://git.netfilter.org/nftables/commit/?id=d42bd56cff1a22301703d2b9d6d6fc937ea7cfbd

When cross-compiling I cannot build libnftnl because libmnl is missing:

/usr/lib/gcc-cross/arm-linux-gnueabihf/9/…/…/…/…/arm-linux-gnueabihf/bin/ld: cannot find -lmnl

I can install libmnl-dev, but not for armhf:

sudo apt install asciidoc libmnl-dev libnftnl-dev --no-install-recommends

But the armhf package seems to exist… maybe I need to add an additional repo for it:

https://packages.ubuntu.com/focal/armhf/libmnl-dev/filelist

sudo dpkg --add-architecture armhf
sudo apt update #shows many 404 for armhf...
sudo apt install asciidoc libmnl-dev:armhf --no-install-recommends

On a native build (very slow) I get stuck in the configure step:

./configure: line 4091: syntax error near unexpected token `LIBMNL,'            
./configure: line 4091: `PKG_CHECK_MODULES(LIBMNL, libmnl >= 1.0.4)'

I have installed these:

apt-get install git make gcc dh-autoreconf bison flex asciidoc --no-install-recommends libmnl-dev pkg-config libgmp-dev libreadline-dev

I had to run autogen.sh again… now it gets past this point… but I have no idea about cross-compiling.

The patch you’ve linked is for the nft tool too, so I still need to get libnftnl and nft compiled… but it is strange that you still cannot add the offload flag to the flowtable.

I think flowtable just does not support offload flags for now.

I tried to patch flow_offload_add:

+__set_bit(NF_FLOWTABLE_HW_OFFLOAD ,flow_table->flags);
if (nf_flowtable_hw_offload(flow_table)) {
	__set_bit(NF_FLOW_HW, &flow->flags);
	nf_flow_offload_add(flow_table, flow);
}

But there is still no offload, and the kernel crashes shortly afterwards.

[  507.363674] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G S                5.10.23 #0
[  507.371057] Hardware name: Bananapi BPI-R64 (DT)
[  507.375667] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[  507.381677] pc : flow_offload_add+0x9c/0x254 [nf_flow_table]
[  507.387328] lr : flow_offload_add+0x54/0x254 [nf_flow_table]
[  507.392977] sp : ffffffc010b3b420
[  507.396282] x29: ffffffc010b3b420 x28: ffffff8001e0da50 
[  507.401587] x27: ffffff8002022e90 x26: ffffff8000044300 
[  507.406891] x25: 0000000000000000 x24: ffffff8002f53c00 
[  507.412196] x23: 0000000000000000 x22: ffffff8002f53c60 
[  507.417499] x21: ffffffc0109f6000 x20: ffffff8002f53c50 
[  507.422803] x19: ffffff8002f1a200 x18: 0000000000000000 
[  507.428108] x17: 00001ed380d5b700 x16: 0000000000000001 
[  507.433412] x15: ffffffc010b3b370 x14: ffffffc010b3b368 
[  507.438716] x13: 0000000000000002 x12: ffffffc010b3b360 
[  507.444020] x11: ffffff8000c3a000 x10: 0000000002d64d22 
[  507.449324] x9 : 0000000000000000 x8 : ffffff8002f1a300 
[  507.454629] x7 : 00000000a8139c6a x6 : 0000000008e83a30 
[  507.459933] x5 : 00000000881c520d x4 : ffffff8002f7d4d8 
[  507.465237] x3 : 000000010000ca5e x2 : 0000000001499700 
[  507.470541] x1 : 0000000000000000 x0 : 00000000014a615e 
[  507.475846] Call trace:
[  507.478287]  flow_offload_add+0x9c/0x254 [nf_flow_table]
[  507.483591]  0xffffffc008a46900
[  507.486727]  nft_do_chain+0xc8/0x440 [nf_tables]
[  507.491338]  nft_data_dump+0x1b44/0x2370 [nf_tables]
[  507.496297]  nf_hook_slow+0x48/0xe0
[  507.499780]  ip6_forward+0x454/0x8b0
[  507.503347]  ipv6_rcv+0xb0/0xe0
[  507.506482]  __netif_receive_skb_one_core+0x44/0x50
[  507.511351]  __netif_receive_skb+0x14/0x60
[  507.515438]  netif_receive_skb+0x20/0x94
[  507.519355]  br_pass_frame_up+0x124/0x160
[  507.523355]  br_handle_frame_finish+0x2b8/0x444
[  507.527877]  br_handle_frame+0x35c/0x3ac
[  507.531791]  __netif_receive_skb_core+0x2a0/0xaf0
[  507.536486]  __netif_receive_skb_list_core+0xd0/0x1e4
[  507.541529]  netif_receive_skb_list_internal+0x168/0x260
[  507.546831]  napi_complete_done+0x64/0x1c0
[  507.550921]  mtk_napi_rx+0x56c/0x680
[  507.554487]  __napi_poll+0x34/0x140
[  507.557967]  net_rx_action+0xd4/0x280
[  507.561621]  _stext+0x124/0x28c
[  507.564756]  irq_exit+0xd8/0xf4
[  507.567890]  __handle_domain_irq+0x7c/0xdc
[  507.571980]  gic_handle_irq+0x68/0x8c
[  507.575633]  el1_irq+0xc0/0x180
[  507.578767]  arch_cpu_idle+0x14/0x2c
[  507.582335]  do_idle+0xc4/0x140
[  507.585467]  cpu_startup_entry+0x20/0x50
[  507.589382]  rest_init+0xb8/0xc4
[  507.592601]  arch_call_rest_init+0xc/0x14
[  507.596601]  start_kernel+0x46c/0x484
[  507.600258] Code: 0b020000 b9000820 b9410281 52800017 (f9400020) 
[  507.606342] ---[ end trace f25d78f236d013d9 ]---

Hi, frank.

You can try adding the sid repo to install the latest nftables (and dependent libs). https://packages.debian.org/sid/nftables has nftables 0.9.8.

Short howto:

1. In your /etc/apt/sources.list, change your distro name to sid, i.e.

deb http://deb.debian.org/debian buster main contrib non-free ->
deb http://deb.debian.org/debian sid main contrib non-free

2. Run apt update.

3. Install/update nftables (read carefully which packages are affected, and just cancel if you don’t like it).

4. Change sid back to your distro and run apt update again.
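Switching the suite name there and back can be scripted so that the edit is easy to revert. The following is a hedged sketch: `switch_suite` is a hypothetical helper that works on the file's text, handles the one-line `deb`/`deb-src` format only, and normalizes whitespace on the lines it changes.

```python
def switch_suite(sources_text, old, new):
    """Replace the suite name `old` with `new` on deb/deb-src lines."""
    out = []
    for line in sources_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] in ("deb", "deb-src"):
            i = 1
            # Skip an optional "[ option=value ... ]" block before the URI.
            if fields[i].startswith("["):
                while i < len(fields) and not fields[i].endswith("]"):
                    i += 1
                i += 1
            suite = i + 1  # the URI sits at index i, the suite right after
            if suite < len(fields) and fields[suite] == old:
                fields[suite] = new
                line = " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"

original = "deb http://deb.debian.org/debian buster main contrib non-free\n"
print(switch_suite(original, "buster", "sid"), end="")
```

Calling it again with the arguments swapped (`"sid"`, `"buster"`) gives back the original line, matching the last step of the list.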

Thanks, I’ll try this on r2 then… but first I’ll try to get it working on r64 by compiling from source, as it looks like we need additional patches. I’m currently building natively on r64 because cross-compiling does not work; as a third option I may try an armhf chroot on my laptop, similar to the OpenWrt build process and, as far as I remember, what I did for openssl.

For cross-compiling, maybe I need to cross-compile libmnl too and pass the lib to configure, as is done here: https://stackoverflow.com/questions/63435221/issues-cross-compiling-libnftnl-for-arm but I have not yet found the source for it…

I contacted Pablo Neira Ayuso, the author of the flags patch above, about the missing flags support in the flow offload chain.