I created a macvlan interface with eth0 as the parent interface. I can see NS messages for eth0 and NA messages from eth0, but none for the macvlan interface. However, when I ping the gateway from the macvlan, NS messages are seen for the macvlan, but the macvlan does not respond with NA. What configuration would help resolve this? I want to see periodic NS and NA messages for the macvlan the way I currently see them for eth0. No namespaces have been created; I am working in the global namespace.

Also, in order to ping the gateway from the macvlan I had to put "iface lo inet6 loopback" in the /etc/network/interfaces file. Otherwise, ping6 kept changing the source interface whenever I explicitly specified the macvlan using -I. Ping6 from macvlan to eth0 is also not working: ping6 always changes the source address, with a message like "Warning: source address might be selected on device other than macvlan1."

To reproduce the issue, see the snippets below:

ubuntu@vm0:~$ uname -r
3.13.0-36-generic
ubuntu@vm0:~$ ip -6 link                                                            
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:ec:39:e5:22:50 brd ff:ff:ff:ff:ff:ff
3: macvlan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/ether ce:99:a8:33:1e:5d brd ff:ff:ff:ff:ff:ff
ubuntu@vm0:~$ ip -6 address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:db8::3/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::ec:39ff:fee5:2250/64 scope link 
       valid_lft forever preferred_lft forever
3: macvlan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 
    inet6 2001:db8::8/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::cc99:a8ff:fe33:1e5d/64 scope link 
       valid_lft forever preferred_lft forever
ubuntu@vm0:~$ ip -6 route
2001:db8::/64 dev macvlan1  proto kernel  metric 256 
2001:db8::/64 dev eth0  proto kernel  metric 256 
2001:db8::/48 dev eth0  proto kernel  metric 256 
2001:db8::/48 dev macvlan1  proto kernel  metric 256 
fe80::/64 dev eth0  proto kernel  metric 256 
fe80::/64 dev macvlan1  proto kernel  metric 256 
default via 2001:db8::1 dev eth0  metric 1 
default via fe80::ec:39ff:fee5:22 dev eth0  metric 1024 
ubuntu@vm0:~$ ip -6 neighbor
2001:db8::1 dev macvlan1 lladdr 00:00:5e:00:01:00 router REACHABLE
2001:db8::1 dev eth0 lladdr 00:00:5e:00:01:00 router STALE
2001:db8::2 dev macvlan1  router FAILED
fe80::5e00:100 dev macvlan1 lladdr 00:00:5e:00:01:00 router STALE
fe80::5e00:100 dev eth0 lladdr 00:00:5e:00:01:00 router STALE
2001:db8::2 dev eth0 lladdr 00:00:5e:00:01:00 router DELAY
2001:db8::9 dev macvlan1 lladdr 6a:25:a8:4e:23:5d STALE

When trying to ping eth0 from macvlan, ping6 changes the source address and prints the warning below:

ubuntu@vm0:~$ ping6 2001:db8::3 -I macvlan1
ping6: Warning: source address might be selected on device other than macvlan1.
PING 2001:db8::3(2001:db8::3) from 2001:db8::3 macvlan1: 56 data bytes
64 bytes from 2001:db8::3: icmp_seq=1 ttl=64 time=4.41 ms
64 bytes from 2001:db8::3: icmp_seq=2 ttl=64 time=0.548 ms
64 bytes from 2001:db8::3: icmp_seq=3 ttl=64 time=0.628 ms
64 bytes from 2001:db8::3: icmp_seq=4 ttl=64 time=0.546 ms
^C
--- 2001:db8::3 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3053ms
rtt min/avg/max/mdev = 0.546/1.534/4.417/1.665 ms

Ping to the gateway works from macvlan:

ubuntu@vm0:~$ ping6 2001:db8::1 -I macvlan1
PING 2001:db8::1(2001:db8::1) from 2001:db8::8 macvlan1: 56 data bytes
64 bytes from 2001:db8::1: icmp_seq=2 ttl=255 time=3.50 ms
64 bytes from 2001:db8::1: icmp_seq=3 ttl=255 time=1.63 ms
64 bytes from 2001:db8::1: icmp_seq=4 ttl=255 time=2.54 ms
64 bytes from 2001:db8::1: icmp_seq=5 ttl=255 time=1.26 ms
^C
--- 2001:db8::1 ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4041ms
rtt min/avg/max/mdev = 1.261/2.237/3.501/0.867 ms

tcpdump on eth0 looks like this

ubuntu@vm0:~$ sudo tcpdump -eni eth0 ip6
11:45:39.281934 00:00:5e:00:01:00 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 110: fe80::5e00:100 > ff02::1: ICMP6, router advertisement, length 56
11:45:40.105204 00:00:5e:00:01:00 > 33:33:ff:00:00:03, ethertype IPv6 (0x86dd), length 86: 2001:db8::2 > ff02::1:ff00:3: ICMP6, neighbor solicitation, who has 2001:db8::3, length 32
11:45:40.105975 02:ec:39:e5:22:50 > 00:00:5e:00:01:00, ethertype IPv6 (0x86dd), length 86: 2001:db8::3 > 2001:db8::2: ICMP6, neighbor advertisement, tgt is 2001:db8::3, length 32

but there is no NS for macvlan:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on macvlan1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:48:09.316865 00:00:5e:00:01:00 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 110: fe80::5e00:100 > ff02::1: ICMP6, router advertisement, length 56
11:48:10.197974 00:00:5e:00:01:00 > 33:33:ff:00:00:03, ethertype IPv6 (0x86dd), length 86: 2001:db8::2 > ff02::1:ff00:3: ICMP6, neighbor solicitation, who has 2001:db8::3, length 32
11:48:20.220753 00:00:5e:00:01:00 > 33:33:ff:00:00:03, ethertype IPv6 (0x86dd), length 86: 2001:db8::2 > ff02::1:ff00:3: ICMP6, neighbor solicitation, who has 2001:db8::3, length 32


  • Working fine here doing the same. When communicating with the gateway, macvlan interface in the same namespace sends NS and receives back NA as well as receives NS and sends back NA. Commented Dec 21, 2020 at 8:17
  • I created macvlan intf and statically added ipv6 address to it. I used following commands "ip link add macvlan1 link eth0 type macvlan", "ip link set dev macvlan1 up" and then "ip -6 addr add <addr from same subnet as eth0's> dev macvlan1" Commented Dec 21, 2020 at 8:17
  • There are no special network settings in place. Also in order to ping to gateway from macvlan I had to put "iface lo inet6 loopback" in /etc/network/interfaces file. Otherwise ping6 kept on changing source intf whenever I tried to explicitly put macvlan using -I in ping6. Ping6 from macvlan to eth0 is also not working. Ping6 always changes the source address with msg like -> "Warning: source address might be selected on device other than macvlan1." Commented Dec 21, 2020 at 8:27
  • The route from one global-scope local IP address to another global-scope local IP address always goes through the lo (loopback) interface. So you're asking about an issue that arises when forcing ping to do something other than expected. Commented Dec 21, 2020 at 11:29
  • I am expecting NS and NA msgs to and from macvlan intf when I bring it up. Are there any settings that need to be changed? Commented Dec 21, 2020 at 11:33

1 Answer

It can't work like this

The issue is a wrong routing assumption combined with how MACVLAN works. MACVLAN's main use case is network namespaces and containers. Even in the container case, it's well known that there's a communication problem between the host and the container when the container is using a MACVLAN interface.

When the MACVLAN interface is created, the parent NIC interface listens for packets on the additional MAC address assigned to the MACVLAN interface, and further on all additional MAC addresses required by the MACVLAN interface (mostly multicast, which is used for IPv6's NDP initial queries to multicast destinations). I don't know how to list the NIC's MAC address filter table, but allocated multicast MAC addresses can be listed using ip maddress. So if you compare ip maddress show dev eth0 before and after bringing up macvlan1 this additional entry will appear:

    link  33:33:ff:33:1e:5d

That's the Ethernet multicast address that will be used for the IPv6 NDP requests about ce:99:a8:33:1e:5d / fe80::cc99:a8ff:fe33:1e5d/64 (an improvement over IPv4 ARP broadcasts): it is 33:33:ff followed by the low 24 bits of the IPv6 address. Adding another IPv6 address, 2001:db8::8/64, requires a new matching Ethernet multicast address for the NIC to listen to:

    link  33:33:ff:00:00:08
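The mapping from an IPv6 unicast address to the multicast entry listed by ip maddress can be reproduced by hand. Here is a minimal bash sketch, assuming a well-formed address with at most one "::"; the function name solicited_node_mac is a hypothetical helper, not part of iproute2:

```bash
# Hypothetical helper: print the Ethernet multicast MAC of the solicited-node
# group for an IPv6 address (33:33:ff + low 24 bits of the address).
solicited_node_mac() {
    local addr=${1%%/*}          # drop an optional /prefix length
    local head tail i missing
    local -a h=() t=() groups=()
    if [[ $addr == *::* ]]; then
        head=${addr%%::*}
        tail=${addr##*::}
        [[ -n $head ]] && IFS=: read -ra h <<< "$head"
        [[ -n $tail ]] && IFS=: read -ra t <<< "$tail"
        # "::" stands for however many zero groups are needed to reach 8
        missing=$(( 8 - ${#h[@]} - ${#t[@]} ))
        groups=("${h[@]}")
        for (( i = 0; i < missing; i++ )); do groups+=(0); done
        groups+=("${t[@]}")
    else
        IFS=: read -ra groups <<< "$addr"
    fi
    # The low 24 bits are the last byte of group 7 plus all of group 8
    local g6=$(( 16#${groups[6]} )) g7=$(( 16#${groups[7]} ))
    printf '33:33:ff:%02x:%02x:%02x\n' $(( g6 & 0xff )) $(( g7 >> 8 )) $(( g7 & 0xff ))
}
```

Calling solicited_node_mac fe80::cc99:a8ff:fe33:1e5d prints 33:33:ff:33:1e:5d and solicited_node_mac 2001:db8::8/64 prints 33:33:ff:00:00:08, matching the two entries shown by ip maddress.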

These multicast entries are just there to show the relation between the interfaces. When a frame with macvlan1's MAC address is emitted by eth0, this packet is not seen by eth0's IPv6 stack: it's injected directly by the MACVLAN driver at the last step. Likewise, when a reply is received on eth0 for any of the MACVLAN's reserved addresses (so far there are at least ce:99:a8:33:1e:5d, which is a local unicast address, and 33:33:ff:33:1e:5d and 33:33:ff:00:00:08, which are Ethernet multicast addresses for IPv6), this Ethernet frame is stolen by the MACVLAN driver and put on the macvlan1 interface instead. So the IPv6 stack behind eth0 will never receive this packet.

So, to summarize what happens when a query intended for eth0 and the network stack behind it is emitted by a MACVLAN interface linked to eth0:

  • frame on behalf of macvlan1 is emitted through eth0 to the LAN
  • frame is never received by eth0
  • no reply ever happens

Imagine such a query was somehow actually received and a reply was somehow made (despite the routing issues here):

  • frame this time really on behalf of eth0 is emitted through eth0 to the LAN
  • frame is never received by eth0
  • frame is thus never stolen by the MACVLAN driver and never received by macvlan1
  • no reply is ever received

For information, that's what would happen if macvlan1 were in a network namespace, set up like this:

ip netns add experiment
ip link set macvlan1 netns experiment

and redoing its configuration because changing namespace resets an interface:

ip -n experiment link set macvlan1 up
ip -n experiment address add 2001:db8::8/64 dev macvlan1

and doing a ping like this (which would still fail as explained above):

ip netns exec experiment ping 2001:db8::3

This would not happen naturally without a network namespace. When two interfaces are on the same host (whatever the kind of interface, it's not relevant), reaching one of the host's own global IPv6 addresses from another of its own global IPv6 addresses does not require Ethernet at all: the traffic is always routed through the lo (loopback) interface:

# ip route get 2001:db8::3 from 2001:db8::8
local 2001:db8::3 from 2001:db8::8 dev lo table local proto kernel src 2001:db8::3 metric 0 pref medium

and only forcing the interface with the ping command will trigger additional routing issues.

That is enough to explain why queries get no answer, how the interface should have been used, and why it would still not work.


How this should be handled

For this to work (in the network namespace case at least, leaving aside the routing issues for local addresses), the external switch has to send frames back to where they came from. Here the switch could be the host system running the VM or a real external switch, depending on the configuration.

If it is impractical to configure such switch settings (VEPA, hairpinning...), one way is to create the MACVLAN interface in bridge mode and add another MACVLAN interface on the host, also in bridge mode, with specific address and routing settings. The MACVLAN driver then does not have to emit anything through eth0: it detects that everything can be handled internally from one virtual interface to the other. Here, from scratch:

ip link add link eth0 name macvlan1 address ce:99:a8:33:1e:5d type macvlan mode bridge
ip netns add experiment
ip link set macvlan1 netns experiment
ip -n experiment link set macvlan1 up
ip -n experiment address add 2001:db8::8/64 dev macvlan1

ip link add link eth0 name macvlanhost up type macvlan mode bridge
ip address add 2001:db8::3/64 dev macvlanhost noprefixroute
ip route add 2001:db8::8/128 dev macvlanhost

The command below, still forcing the address in case privacy extensions are in use (only the route for 2001:db8::8 was added on macvlanhost):

ip netns exec experiment ping -I 2001:db8::8 2001:db8::3

will now work. If you have to check for more NS/NA, first flush NDP entries:

ip -6 neighbour flush dev macvlanhost
ip -n experiment -6 neighbour flush dev macvlan1

with captures on the host:

tcpdump -e -n -s0 -p -i macvlanhost icmp6

or in the namespace:

ip netns exec experiment tcpdump -e -n -s0 -p icmp6
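If the icmp6 captures are too noisy, the filter can be narrowed to NS and NA only. This is a sketch that assumes no IPv6 extension header precedes the ICMPv6 header, so that byte 40 of the IPv6 packet is the ICMPv6 type; ndp_filter and ndp_capture_cmd are hypothetical names, and the helper only prints the command because tcpdump needs root:

```bash
# ICMPv6 type 135 = neighbor solicitation, 136 = neighbor advertisement.
# ip6[40] is the ICMPv6 type byte only when no extension header is present.
ndp_filter='icmp6 and (ip6[40] == 135 or ip6[40] == 136)'

# Hypothetical helper: print (not run) the capture command for an interface,
# reusing the tcpdump options from the captures above.
ndp_capture_cmd() {
    printf "tcpdump -e -n -s0 -p -i %s '%s'\n" "$1" "$ndp_filter"
}

ndp_capture_cmd macvlanhost
```

The printed command can be run as root on the host, or prefixed with ip netns exec experiment (using macvlan1 as the interface) to capture inside the namespace.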
 

Of course, the network namespace experiment can still communicate with anything present on the LAN as before, including the gateway, but not with services on localhost (i.e. using ::1, as its own instance of the lo interface is different from the host's lo interface). It just has to work slightly differently when communicating with the host.
