I operate servers where we use BGP (EVPN - VXLAN). The architecture is similar to the one described by Vincent Bernat. The hypervisors act as routers for the VMs.
When I designed the addressing plan, I used IPv4/IPv6 networks in /24 and /64 respectively, with the gateway VIPs positioned on the first available IP: 10.0.0.1/24 and fdc0:3489:e0fd::/64.
This topology has been working perfectly for 2 years. Recently, we had to deploy a dedicated VM for interconnecting two VRFs. We use iBGP sessions for route exchange between a VM and a hypervisor.
In our case, the session is properly established and the exchanged routes are correct. I enable IPv6 forwarding on the VM (net.ipv6.conf.all.forwarding=1). At this precise moment, the BGP session drops. What happened?
First debugging step, analyzing the daemon logs on the VM:
The error seems quite clear: our BGP neighbor appears to be unreachable.
First clue: the response comes back very quickly, and the reply’s “from” field shows the VM itself.
I decide to check the route used by my kernel to reach this IP:
We notice an anycast route. I’ve never seen this type of anycast route before, but its behavior seems identical to local type routes.
In summary, the kernel adds an anycast route for the first IPv6 address of our network as soon as forwarding is enabled.
While digging through the kernel code, I come across dev_forward_change.
We will use the kernel's function tracer to understand the implementation. I plan to write a separate article about this topic later.
Let’s check if the function tracer is available.
It’s available, so we can enable tracing and activate forwarding.
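As a rough illustration of the tracing steps, here is a minimal Python sketch that drives the tracer through its tracefs files. It is not the exact set of commands I ran: the mount point /sys/kernel/tracing is an assumption (older systems expose it under /sys/kernel/debug/tracing), the helper names are made up for this example, and writing to these files requires root.

```python
from pathlib import Path

# Assumed tracefs mount point; adjust if your system mounts it elsewhere.
TRACEFS = Path("/sys/kernel/tracing")


def function_tracer_available(tracefs: Path = TRACEFS) -> bool:
    """Return True if 'function' is listed in available_tracers."""
    tracers = (tracefs / "available_tracers").read_text().split()
    return "function" in tracers


def trace_only(func_name: str, tracefs: Path = TRACEFS) -> None:
    """Restrict tracing to one kernel function, then start the tracer."""
    (tracefs / "set_ftrace_filter").write_text(func_name)
    (tracefs / "current_tracer").write_text("function")
    (tracefs / "tracing_on").write_text("1")


def was_called(func_name: str, tracefs: Path = TRACEFS) -> bool:
    """Check whether the function shows up in the trace buffer."""
    lines = (tracefs / "trace").read_text().splitlines()
    return any(func_name in line for line in lines if not line.startswith("#"))


# Hypothetical usage, as root:
#   if function_tracer_available():
#       trace_only("dev_forward_change")
#       ... enable net.ipv6.conf.all.forwarding ...
#       print(was_called("dev_forward_change"))
```

Filtering on a single function keeps the trace buffer small, which makes it easy to see whether enabling forwarding really goes through dev_forward_change.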
This confirms that the dev_forward_change function is indeed being used, so we can continue reading. A few lines down, we see the call to addrconf_join_anycast. Here’s the function that decides which IP to steal from our hypervisor:
In this function, we can see that if the address on the interface has a prefix shorter than /127 (that is, anything other than a /127 or a /128), then we call the __ipv6_dev_ac_inc function with the device and the prefix.
The __ipv6_dev_ac_inc function adds this anycast route using the ip6_route_info_create function.
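To make the decision easier to follow, here is a small Python model of that logic as I understand it. It is not kernel code: the function name is my own, and the VM address fdc0:3489:e0fd::10 is a hypothetical example inside our /64.

```python
import ipaddress
from typing import Optional


def subnet_router_anycast(ifaddr: str) -> Optional[ipaddress.IPv6Address]:
    """Model of the check: which anycast address would be joined, if any?"""
    iface = ipaddress.IPv6Interface(ifaddr)

    # /127 and /128 prefixes are skipped (point-to-point links, RFC 6164).
    if iface.network.prefixlen >= 127:
        return None

    # Keep only the prefix bits: the host part becomes all zeros.
    prefix = iface.network.network_address

    # An all-zero prefix (::) is never joined.
    if prefix == ipaddress.IPv6Address("::"):
        return None

    return prefix


print(subnet_router_anycast("fdc0:3489:e0fd::10/64"))  # fdc0:3489:e0fd::
print(subnet_router_anycast("fdc0:3489:e0fd::1/127"))  # None
```

With any address in our /64, the result is fdc0:3489:e0fd::, which is exactly the address I had assigned to the gateway VIP.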
After understanding this process and finding the right keywords, I found a series of articles like Daryll Swer’s and Karl Auer’s. Finally, if you want to learn more, I recommend reading RFC 4291.
In summary, enabling IPv6 forwarding on our VM had an unexpected effect: the kernel automatically added a “Subnet Router” anycast address (thanks to RFC 4291), which nicely short-circuited our BGP session by taking over the address we were using.
To this day, I have never seen anyone intentionally use the properties of this address, and this little-known Linux behavior can be a source of errors.
However, the solution is quite simple: avoid using the first address of the network (::). A specific address like ::1 will work perfectly fine.
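If you want to check an addressing plan for this pitfall, a short Python sketch along these lines could help. The function name is hypothetical, and the /127 cut-off mirrors the Linux behavior described above rather than a strict reading of RFC 4291.

```python
import ipaddress


def is_subnet_router_anycast(addr_with_prefix: str) -> bool:
    """True if the address is the all-zero host address of its prefix."""
    iface = ipaddress.IPv6Interface(addr_with_prefix)
    return (
        iface.network.prefixlen < 127
        and iface.ip == iface.network.network_address
    )


for gateway in ("fdc0:3489:e0fd::/64", "fdc0:3489:e0fd::1/64"):
    if is_subnet_router_anycast(gateway):
        print(f"{gateway}: collides with the Subnet-Router anycast address")
    else:
        print(f"{gateway}: ok")
```

Run against our two candidates, the first gateway is flagged and the second one passes.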