nftables (Netfilter) consolidates {ip,ip6,arp}tables into a new kernel-based Linux firewall. Most Linux distributions are shifting from iptables to nftables as their default firewall framework. nftables is now the default in Debian 10, Ubuntu 20.04, RHEL 8, SUSE 15 and Fedora 32. Time to migrate!
This blog post explains how to set up nftables based on a perimeter model, which is visualized metaphorically in Figure 1. Look into a zero-trust network model if you want to fill the gaps of a perimeter-based approach. Also check out PF if you need a robust firewall solution on the edge of your network.
Getting started with nftables
First we install nftables:
$ sudo apt-get install nftables -y
$ nft -v
nftables v0.9.3 (Topsy)
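Next we enable nftables on boot and start the service. It is not a long-running daemon: it loads the ruleset into the kernel and exits, which is why the status below reads active (exited):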
$ sudo systemctl enable nftables
$ sudo systemctl start nftables
$ sudo systemctl status nftables
nftables.service - nftables
Loaded: loaded (/lib/systemd/system/nftables.service; disabled; vendor preset: enabled)
Active: active (exited); 1s ago
Docs: man:nft(8)
http://wiki.nftables.org
Process: 16565 ExecStart=/usr/sbin/nft -f /etc/nftables.conf (code=exited, status=0/SUCCESS)
Main PID: 16565 (code=exited, status=0/SUCCESS)
How does nftables work
nftables is in essence a network filter, also referred to as a network Access Control List (ACL), which allows you to control network data flows. These network filters are hierarchical and order dependent: rules in a chain are evaluated from top to bottom. Figure 1 shows how nftables functions based on the TCP/IP model:
Let's start from the bottom of the TCP/IP model. The data link layer is the point where you can steer traffic to specific (virtual) NICs (Network Interface Cards) based on the incoming vNIC (iifname) and outgoing vNIC (oifname). This way you can segment data traffic (e.g. HTTPS) from management traffic (e.g. SSH or VNC).
At the data link layer, ARP (Address Resolution Protocol) is used to resolve an IP address to a MAC address. During the initial ARP broadcast, a malicious entity could attempt to associate their MAC address with the IP address of the requested host, causing traffic meant for that IP address to be sent to the attacker's host instead. You can control ARP traffic in the arp filter section (1).
Next up is the TCP/IP internetwork and transport layer with ip filter and ip6 filter (2). These filters help us shape network traffic from our host IP (saddr) to network segments or another host (daddr). nftables can filter packets based on network protocol, destination port (dport), source port (sport) and its session state (ct state). Small note - the ICMP protocol is actually a part of the IP protocol and therefore technically operates at the Internetwork layer. Ideally, your ip and ip6 tables should block any network traffic (drop) unless it is explicitly allowed (accept).
nftables is a network filter and not a native Layer 7 (L7) application firewall (3). Network ports are often mistaken for application network controls. Be aware that a malicious actor can tunnel a reverse shell over TCP port 443 (HTTPS) or UDP port 53 (DNS). Application (L7) filtering can fill in these gaps by leveraging a web proxy for HTTPS traffic, and Intrusion Prevention Systems (IPS) for dropping malicious tunneled traffic over other network protocols, even over ICMP. DPI (Deep Packet Inspection) is the keyword here.
How to configure nftables
We will directly edit the /etc/nftables.conf config file instead of using the nft CLI (nft add and nft delete). This config file is loaded by default on boot. You have to be root (sudo) to edit this file and to load or change any ruleset, regardless of the ports involved.
$ sudo cp /etc/nftables.conf /etc/nftables.conf.bak
$ sudo vi /etc/nftables.conf
We first define variables which we can use later on in our ruleset:
define NIC_NAME = "eth0"
define NIC_MAC_GW = "DE:AD:BE:EF:01:01"
define NIC_IP = "192.168.1.12"
define LOCAL_INETW = { 192.168.0.0/16 }
define LOCAL_INETWv6 = { fe80::/10 }
define DNS_SERVERS = { 1.1.1.1, 8.8.8.8 }
define NTP_SERVERS = { time1.google.com, time2.google.com, time3.google.com, time4.google.com }
define DHCP_SERVER = "192.168.1.1"
The next code block shows our ip filter and ip6 filter. We first set a default-deny policy (policy drop;) on the input and output chains. This means all network traffic is dropped unless it is explicitly allowed later on. Next we define the exceptions for the network traffic we want to allow. Loopback network traffic is only allowed on the loopback interface and within the RFC loopback network space.
nftables automatically maps service names to port numbers (e.g. https to 443). In our example, we only allow incoming packets that belong to sessions we initiated (ct state established accept) and that are addressed to our ephemeral ports (dport 32768-65535). Be aware that an app or web server must also allow newly initiated inbound sessions (ct state new).
In the output chain, sessions initiated by this host (ct state new,established accept) are explicitly allowed for a fixed set of destinations and ports. We also allow outgoing ping requests (icmp type echo-request) but do not want others to ping this host, so the input chain only accepts echo replies that belong to an established session (icmp type echo-reply ct state established). More info on states can be found here.
table ip filter {
chain input {
type filter hook input priority 0; policy drop;
iifname "lo" ip saddr != 127.0.0.0/8 drop
iifname "lo" accept
iifname $NIC_NAME ip saddr 0.0.0.0/0 ip daddr $NIC_IP tcp sport { ssh, http, https, http-alt } tcp dport 32768-65535 ct state established accept
iifname $NIC_NAME ip saddr $NTP_SERVERS ip daddr $NIC_IP udp sport ntp udp dport 32768-65535 ct state established accept
iifname $NIC_NAME ip saddr $DHCP_SERVER ip daddr $NIC_IP udp sport bootps udp dport bootpc ct state established log accept
iifname $NIC_NAME ip saddr $DNS_SERVERS ip daddr $NIC_IP udp sport domain udp dport 32768-65535 ct state established accept
iifname $NIC_NAME ip saddr $LOCAL_INETW ip daddr $NIC_IP icmp type echo-reply ct state established accept
}
chain output {
type filter hook output priority 0; policy drop;
oifname "lo" ip daddr != 127.0.0.0/8 drop
oifname "lo" accept
oifname $NIC_NAME ip daddr 0.0.0.0/0 ip saddr $NIC_IP tcp dport { ssh, http, https, http-alt } tcp sport 32768-65535 ct state new,established accept
oifname $NIC_NAME ip daddr $NTP_SERVERS ip saddr $NIC_IP udp dport ntp udp sport 32768-65535 ct state new,established accept
oifname $NIC_NAME ip daddr $DHCP_SERVER ip saddr $NIC_IP udp dport bootps udp sport bootpc ct state new,established log accept
oifname $NIC_NAME ip daddr $DNS_SERVERS ip saddr $NIC_IP udp dport domain udp sport 32768-65535 ct state new,established accept
oifname $NIC_NAME ip daddr $LOCAL_INETW ip saddr $NIC_IP icmp type echo-request ct state new,established accept
}
chain forward {
type filter hook forward priority 0; policy drop;
}
}
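The ruleset above fits a client. For a host that serves traffic, the direction of the state match flips: new sessions arrive on the input chain and only replies leave on the output chain. The following is a minimal sketch under assumptions, not part of the original config: it assumes a web server on the http and https ports, reuses the NIC_NAME and NIC_IP values from earlier, and writes the fragment to a scratch file so it can be checked without touching /etc/nftables.conf.

```shell
#!/bin/sh
# Write a standalone fragment to a scratch file. The quoted 'EOF' keeps the
# shell from expanding nft's own $variables.
FRAGMENT="$(mktemp)"
cat > "$FRAGMENT" <<'EOF'
define NIC_NAME = "eth0"
define NIC_IP = 192.168.1.12

table ip filter {
    chain input {
        type filter hook input priority 0; policy drop;
        # Clients open sessions to us, so "new" must be accepted on the service ports.
        iifname $NIC_NAME ip daddr $NIC_IP tcp dport { http, https } ct state new,established accept
    }
    chain output {
        type filter hook output priority 0; policy drop;
        # Replies only: this host never initiates traffic from its service ports.
        oifname $NIC_NAME ip saddr $NIC_IP tcp sport { http, https } ct state established accept
    }
}
EOF

# Validate without applying (needs root and nft); otherwise just show the fragment.
if [ "$(id -u)" -eq 0 ] && command -v nft >/dev/null 2>&1; then
    nft -c -f "$FRAGMENT" && echo "fragment parses cleanly"
else
    cat "$FRAGMENT"
fi
```

In a real deployment these two rules would be merged into the existing input and output chains rather than loaded as a separate table definition.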
The next code block blocks incoming and outgoing IPv6 traffic, except outgoing ping requests (icmpv6 type echo-request), their replies, ICMPv6 error messages and IPv6 neighbor discovery (nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert). vNICs are often automatically provisioned with IPv6 addresses and left untouched. These interfaces can be abused by malicious entities to tunnel out confidential data or even a shell.
table ip6 filter {
chain input {
type filter hook input priority 0; policy drop;
iifname "lo" ip6 saddr != ::1/128 drop
iifname "lo" accept
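# Echo replies are "established"; ICMPv6 errors are "related" to an existing session
iifname $NIC_NAME ip6 saddr $LOCAL_INETWv6 icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-reply } ct state established,related accept
# Neighbor discovery is not tracked by conntrack, so it cannot be matched on ct state
iifname $NIC_NAME ip6 saddr $LOCAL_INETWv6 icmpv6 type { nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept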
}
chain output {
type filter hook output priority 0; policy drop;
oifname "lo" ip6 daddr != ::1/128 drop
oifname "lo" accept
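oifname $NIC_NAME ip6 daddr $LOCAL_INETWv6 icmpv6 type echo-request ct state new,established accept
# Outgoing neighbor discovery goes to multicast addresses outside fe80::/10, hence no daddr match
oifname $NIC_NAME icmpv6 type { nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert } accept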
}
chain forward {
type filter hook forward priority 0; policy drop;
}
}
The last code block is used for ARP traffic and rate-limits incoming ARP frames. Because the chain policy is accept, the rule has to drop the frames that exceed the limit:
table arp filter {
chain input {
type filter hook input priority 0; policy accept;
iifname $NIC_NAME limit rate over 1/second burst 2 packets drop
}
chain output {
type filter hook output priority 0; policy accept;
}
}
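Since a typo in a default-drop ruleset can lock you out of a remote host, it helps to dry-run the file before loading it. The sketch below relies on nft's -c (--check) flag, which parses and validates a file without applying it. The CONF variable and the check_ruleset helper are names introduced here for illustration.

```shell
#!/bin/sh
# Path used in this post; override CONF to check a different file.
CONF="${CONF:-/etc/nftables.conf}"

# Returns nft's exit status: 0 when the file parses and validates cleanly.
check_ruleset() {
    nft -c -f "$1"
}

# nft needs root even in check mode, so only run it when that is possible.
if [ "$(id -u)" -eq 0 ] && command -v nft >/dev/null 2>&1; then
    if check_ruleset "$CONF"; then
        echo "syntax OK: $CONF"
    else
        echo "errors in $CONF, do not load it yet" >&2
    fi
else
    echo "run as root on a host with nft installed to check $CONF"
fi
```

A check like this only catches syntax and type errors. It cannot tell you whether the rules still allow your own SSH session, so keep a console or a second session open when changing a remote firewall.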
The full nftables config file is available on this GitHub repository. Let's load this new config file into memory:
$ sudo systemctl restart nftables && systemctl status nftables && sudo nft list ruleset
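Once the ruleset is loaded, it is worth probing a few ports from a second machine to confirm the filter behaves as intended. This is a small sketch using nc in scan mode (-z) with a timeout (-w). The probe function, the TARGET default and the port list are assumptions chosen for illustration.

```shell
#!/bin/sh
# Report whether a TCP port accepts connections. With a "drop" policy a
# filtered port shows up as a timeout, hence the short -w limit.
probe() {
    host="$1"; port="$2"
    if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
        echo "open: $host:$port"
    else
        echo "closed or filtered: $host:$port"
    fi
}

command -v nc >/dev/null 2>&1 || echo "nc is not installed; every probe will report closed" >&2

# Hypothetical target: the firewalled host from this post, probed from another machine.
TARGET="${TARGET:-192.168.1.12}"
for port in 22 443; do
    probe "$TARGET" "$port"
done
```

With the client ruleset from this post, both probes should report closed or filtered, because the input chain accepts no new inbound sessions.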
Considerations
Make sure to test that your ports are truly open or closed. You can use nc or telnet to probe them, and tcpdump to watch which packets actually arrive and leave.
Rules that carry a log statement write to the kernel log, which ends up in /var/log/syslog on Debian and Ubuntu. You should leverage rsyslog to forward these logs to your favorite SIEM solution to get better visibility into your network.
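Log lines are easier to find and forward when the rule tags them. The nft log statement accepts a prefix, for example log prefix "nft-input-drop: " drop as the last rule of a chain. The sketch below assumes that hypothetical prefix and the Debian/Ubuntu log path, and the nft_log_lines helper is a name made up for this example.

```shell
#!/bin/sh
# Hypothetical prefix; it would be set in a rule such as:
#   log prefix "nft-input-drop: " drop
PREFIX="nft-input-drop: "

# Print only the log lines written by rules carrying our prefix.
# Reads the given files, or stdin when no file is passed.
nft_log_lines() {
    grep -F -- "$PREFIX" "$@"
}

# /var/log/syslog is the Debian/Ubuntu default; other distributions may differ.
if [ -r /var/log/syslog ]; then
    nft_log_lines /var/log/syslog | tail -n 20
fi
```

The same prefix can serve as the match string in an rsyslog filter, so that only firewall events are forwarded to the SIEM.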
With the right skeleton, nftables makes it easier for DevOps engineers to apply micro-segmentation. DevOps engineers can provision modular firewall rules to Linux-based hosts by using a configuration management tool such as Ansible. These firewall rulesets are then pushed and loaded based on the core function of the VM or container. For example, database servers should only talk with a limited subset of web servers. Provisioning firewall rules at the host level in this way should reduce the attack surface for lateral movement by a malicious actor.
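One way to make rulesets modular is nft's include directive, which accepts shell-style wildcards: a small main file pulls in whichever role files the configuration management tool dropped on the host. The layout below is a sketch built in a scratch directory. The nftables.d directory name, the db.nft role file, the WEB_SERVERS addresses and the PostgreSQL port are all assumptions for illustration, and a real role file would also need rules for management access such as SSH.

```shell
#!/bin/sh
# Build the layout in a scratch directory so nothing under /etc is touched.
ROOT="$(mktemp -d)"
mkdir -p "$ROOT/nftables.d"

# Main file: start from a clean slate, then load every deployed role file.
cat > "$ROOT/nftables.conf" <<EOF
flush ruleset
include "$ROOT/nftables.d/*.nft"
EOF

# Hypothetical role file for a database server: only the web tier may connect.
cat > "$ROOT/nftables.d/db.nft" <<'EOF'
define WEB_SERVERS = { 192.168.1.20, 192.168.1.21 }

table ip filter {
    chain input {
        type filter hook input priority 0; policy drop;
        iifname "lo" accept
        ip saddr $WEB_SERVERS tcp dport 5432 ct state new,established accept
    }
}
EOF

# Validate without applying (needs root and nft); otherwise list what was generated.
if [ "$(id -u)" -eq 0 ] && command -v nft >/dev/null 2>&1; then
    nft -c -f "$ROOT/nftables.conf" && echo "layout parses cleanly"
else
    ls "$ROOT" "$ROOT/nftables.d"
fi
```

With this split, the configuration management tool only has to decide which role files to copy to a host, and the main file stays identical everywhere.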
This blog post merely touched upon the core firewall capabilities of nftables. Check out the nftables wiki page for more firewall techniques. Keep in mind that one firewall solution does not necessarily replace another. Also look into an edge firewall if you do not have a zero-trust network domain.