I have interfaces enp101s0f0u2u{1..3}, on each of which there is a device responding to 192.168.8.1.
I want a local process to be able to reach all of them simultaneously.
This is one process, so network namespaces are not an option.
I am looking for a solution that doesn't use socat or another proxy that can bind an outgoing interface.
I thought of locally making virtual IPs 192.168.8.1{1..3} to point to them.
What I got so far:
- Interface enp101s0f0u2ux has IPv4 address 192.168.8.2x/32.
- ip rule 100x: from all to 192.168.8.1x lookup 20x
- ip route default dev enp101s0f0u2ux table 20x scope link src 192.168.8.2x (this means the interface and src are correct when chosen automatically)
chain output {
type nat hook output priority dstnat; policy accept;
ip daddr 192.168.8.1x meta mark set 20x counter dnat to 192.168.8.1
}
(this means the destination IP is changed to .1; unfortunately, I only found a way to do this before the routing decision is made, so we need the next thing)
- ip rule 110x: from all fwmark 20x lookup 20x (this means that despite dst being 192.168.8.1, it goes to the …ux interface)
Now the hard part:
chain input {
type nat hook input priority filter; policy accept;
ip saddr 192.168.8.1 ip daddr 192.168.8.2x counter snat to 192.168.8.1x
}
(this should restore the src of the return packet to .1x, so the socket and application are not astonished)
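For reference, the x-templated ip settings listed above can be expanded per interface with a small script. This is only a sketch: the function name `ip_commands` is made up for illustration, the exact `ip` invocations are my reading of the rule and route listings above (reading 100x, 110x and 20x as 1001, 1101 and 201 for x=1), and it only prints the commands rather than running them.

```python
# Sketch: expand the "x" placeholders from the setup above into concrete
# iproute2 commands for x = 1..3. Nothing is executed; the commands are
# only printed so they can be reviewed first.

IFACE_PREFIX = "enp101s0f0u2u"  # interface name prefix from the question


def ip_commands(x: int) -> list[str]:
    """Return the ip commands for interface number x (hypothetical helper)."""
    iface = f"{IFACE_PREFIX}{x}"
    table = 200 + x  # "20x" in the question, also used as the fwmark
    return [
        # local address on the interface
        f"ip addr add 192.168.8.2{x}/32 dev {iface}",
        # rule 100x: traffic to the virtual IP uses table 20x
        f"ip rule add priority {1000 + x} to 192.168.8.1{x} lookup {table}",
        # default route in table 20x, with the matching source address
        f"ip route add default dev {iface} table {table} scope link src 192.168.8.2{x}",
        # rule 110x: packets marked 20x (after the DNAT) use table 20x
        f"ip rule add priority {1100 + x} fwmark {table} lookup {table}",
    ]


for x in (1, 2, 3):
    for cmd in ip_commands(x):
        print(cmd)
```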
Unfortunately, at this point if I try to curl, tcpdump sees a 192.168.8.21.11111 > 192.168.8.1.80 (SYN) and multiple 192.168.8.1.80 > 192.168.8.21.11111 (SYN-ACK) attempts, but the input chain counter is not hit.
However, if I add the seemingly useless
chain postrouting {
type nat hook postrouting priority srcnat; policy accept;
ip daddr 192.168.8.1 counter masquerade
}
I get 1 packet hitting the input snat rule, and the application gets some data back! However, all the subsequent packets from 192.168.8.1 in the flow are dropped. Here is a tcpdump and a conntrack
I'm at the end of my rope, been at it for days. There's no firewall/filter happening (which conntrack would be opening for me), I have empty nftables besides the chains I showed here.
I cannot understand why the masquerade makes a difference, and in general what goes on in conntrack. (The entry gets created and destroyed twice, and then an entry starting from outside gets created?)
Of note is that the entries are not symmetrical: they mention both 192.168.8.1 and 192.168.8.12 in each entry for opposite directions.
I especially don't understand how or why, in the absence of masquerade, the returning 192.168.8.1.80 > 192.168.8.21.11111 (SYN-ACK) packets get dropped instead of going to the input chain. Would this happen if the application TCP socket did a connect() and so only wants replies from .11?
But shouldn't input be able to intercept before the socket? And I can't snat in prerouting anyway, so where would this have to be done?
Update:
Adding
type filter hook output priority raw; policy accept;
ip daddr 192.168.8.11 counter notrack
makes it stop hitting this counter too:
type nat hook output priority dstnat; policy accept;
ip daddr 192.168.8.11 meta mark set 201 counter dnat to 192.168.8.1
Does notrack prevent packets from entering the nat chains at all, rather than making all packets (and not just the first) enter them? And so, does it prevent doing -nat actions altogether?
- bind() the socket to the address of the interface you want to use, setting the port to 0, before you connect() to 192.168.8.1 with the target port (or use sendmsg, or whatever you have planned with that socket).
- Intercept the connect syscalls when they happen, and run a bind on the affected socket first.
- Attach gdb to your running process, and catch syscall connect. If that happens rarely enough, you can actually make a commands program in GDB that reacts to that syscall happening, fires off other calls and modifies the arguments of that syscall. When that works in general, I'd replicate the same in bpftrace, and see whether I can add useful data to the socket, to make it easy for nftables to route things.
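The bind-before-connect suggestion could look roughly like the following. It is a minimal sketch, not a drop-in fix: `connect_from` is a hypothetical helper name, and it assumes the kernel picks the egress interface from the bound source address, which would need a source-based policy rule (something like "from 192.168.8.2x lookup 20x") that is not among the rules shown in the question.

```python
import socket


def connect_from(src_ip, dst_ip, dst_port, timeout=5.0):
    """Open a TCP connection to dst_ip:dst_port from a chosen local address.

    Hypothetical helper: binds to (src_ip, 0) so the kernel picks a free
    source port, then connects. Which interface the packets leave on is
    still decided by routing (assumption: a rule keyed on the source
    address exists).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.bind((src_ip, 0))            # port 0: let the kernel choose
        sock.connect((dst_ip, dst_port))
    except OSError:
        sock.close()
        raise
    return sock


# Example use with the addresses from the question (not executed here):
#   s1 = connect_from("192.168.8.21", "192.168.8.1", 80)
#   s2 = connect_from("192.168.8.22", "192.168.8.1", 80)
```

With this approach the three devices are told apart by the local source address rather than by a virtual destination IP, so no DNAT or SNAT is involved. Python's `socket.create_connection()` also accepts a `source_address` argument that performs the same bind step.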