Routing in Azure is one of the most misunderstood topics. If you work in IT, you have probably learned about gateways, switching, VLANs, and the other building blocks of on-premises infrastructure. The most advanced of us may know what BGP means.
We tend to think that cloud networks work the same way. We have a gateway, routing, and so on, and most of the time it looks very similar to what we find on-premises.
In Azure, if you have a small deployment with only a few VNETs and some peering, you do not have to worry about routing. Everything is managed behind the scenes by Azure, and routing should not be an issue.
But when you need to have connectivity to several on-premises networks, firewalls, and network appliances, things can become tougher.
There is an error many IT people make, including me: treating Azure networking as if it were a simple on-premises network. But things that worked with your Cisco or Juniper switches and routers do not work the same way here.
But think about delivering networking across thousands of different tenants and data centers around the world. You can see that classic networking does not apply here. You need a solution that makes sure every VM belonging to the same VNET can see the others, regardless of where they are.
Computing and networking in Azure are based on Hyper-V 2016. Yes, Hyper-V runs Azure. Network flows are managed by the Hyper-V Network Virtualization functionality, the Hyper-V switch, and VXLAN encapsulation. Packets are encapsulated to transit over the Azure underlay network.
Look at one VM: it can ping any other VM in the same VNET. But if you look deeper, even though a gateway appears in the network configuration, it doesn't seem to exist, at least not when you try to ping it.
With Azure network virtualization, the packet is sent to the virtual network interface and encapsulated to transit to its destination. The communication between two VMs goes directly from vNIC to vNIC.
This changes a lot about how you think about networking in Azure compared to on-premises. On-premises, if you want two machines to talk to each other, you need to create a link. You need to make sure that packets can be forwarded over a wired network, which may include a switch and a router. It means that you need a gateway, routers, and a path.
In Azure, none of this is needed. Azure has done this work for you, so the consequences for routing are not the same. Routing is automatically set up when you create a VNET. You have nothing to do. When you peer a VNET to another VNET, Azure automatically updates the route table for all involved subnets.
There is a route table managed by Azure, and you have nothing to worry about unless you need to alter these routes to add a firewall, an NVA, or a connection outside Azure. In this case, you will have to change the default routing behavior.
To understand routing, you need to understand what a routing table is. A routing table is a collection of rules to transfer a packet for an IP prefix to the next point in the network, the next hop.
For example, in a route where 10.0.0.0/24 is the destination and 192.168.0.2 is the next hop, 192.168.0.2 is the first point a packet is sent to on its way to the destination.
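As a minimal sketch, a route entry can be modelled as a prefix plus a next hop. The `Route` class and its field names are illustrative assumptions, not an Azure API; the addresses are the ones from the example above.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One routing rule: traffic for `prefix` is handed to `next_hop`."""
    prefix: str    # destination prefix in CIDR notation
    next_hop: str  # where matching packets are sent next

    def covers(self, destination):
        """Return True if the destination IP falls inside this route's prefix."""
        return ipaddress.ip_address(destination) in ipaddress.ip_network(self.prefix)


# The route from the example: 10.0.0.0/24 is reached through 192.168.0.2.
route = Route(prefix="10.0.0.0/24", next_hop="192.168.0.2")

print(route.covers("10.0.0.25"))  # address inside 10.0.0.0/24
print(route.covers("10.0.1.25"))  # address outside 10.0.0.0/24
```

A routing table is then just a collection of such entries, and routing a packet means finding the entry that covers its destination.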
When you create a VNET, Azure automatically creates a system route table. In short, this route table is relatively simple. It has two important entries.
The VNET prefix, with "Virtual Network" as the next hop, and 0.0.0.0/0, with "Internet" as the next hop. This means that IPs belonging to the VNET are handled by the VNET and all other IPs are routed to the Internet.
No matter what the configuration inside the VM is, packets are routed by the virtual NIC according to these two entries.
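To make the two default entries concrete, here is a small sketch. The function names and the 10.0.0.0/16 VNET prefix are hypothetical, and the other system routes Azure adds are left out.

```python
import ipaddress


def default_system_routes(vnet_prefix):
    """Build the two main system routes described above for one VNET prefix."""
    # The VNET entry is listed first because it is the more specific one.
    return [
        (vnet_prefix, "Virtual Network"),
        ("0.0.0.0/0", "Internet"),
    ]


def system_next_hop(routes, destination):
    """Return the next hop type of the first entry that contains the destination."""
    dest = ipaddress.ip_address(destination)
    for prefix, hop_type in routes:
        if dest in ipaddress.ip_network(prefix):
            return hop_type
    return None


routes = default_system_routes("10.0.0.0/16")  # hypothetical VNET address space
print(system_next_hop(routes, "10.0.3.7"))  # address inside the VNET
print(system_next_hop(routes, "8.8.8.8"))   # any address outside the VNET
```

The first-match loop only works here because the table is ordered from most to least specific.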
Azure has other next hop types, such as Peering, which routes traffic between peered VNETs.
But to understand routing in Azure, two rules matter: longest prefix match and symmetric routing.
The longest prefix match rule is easy to understand. Take a routing table with the routes 10.0.0.0/16, 10.0.0.0/24, and 10.0.0.14/32. If the destination is 10.0.0.14, the chosen next hop is the one associated with 10.0.0.14/32. If the destination is 10.0.0.18, the next hop is the one associated with 10.0.0.0/24. If the destination is 10.0.2.19, the next hop is the one associated with 10.0.0.0/16.
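The three lookups above can be reproduced with Python's standard `ipaddress` module. This is only a sketch of the selection rule, not of how Azure implements it, and `longest_prefix_match` is a name made up for the example.

```python
import ipaddress

# The three prefixes from the example above.
PREFIXES = ["10.0.0.0/16", "10.0.0.0/24", "10.0.0.14/32"]


def longest_prefix_match(prefixes, destination):
    """Return the most specific prefix containing the destination, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        net for net in map(ipaddress.ip_network, prefixes) if dest in net
    ]
    if not candidates:
        return None
    # The longest prefix length is the most specific match.
    return str(max(candidates, key=lambda net: net.prefixlen))


print(longest_prefix_match(PREFIXES, "10.0.0.14"))  # 10.0.0.14/32
print(longest_prefix_match(PREFIXES, "10.0.0.18"))  # 10.0.0.0/24
print(longest_prefix_match(PREFIXES, "10.0.2.19"))  # 10.0.0.0/16
```

A destination that matches none of the prefixes returns `None`; in a real route table the 0.0.0.0/0 entry plays the role of that catch-all.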
The symmetric routing rule is harder to grasp because, in the on-premises world, network routing is symmetrical most of the time: return packets use the same path as the incoming packets. But when working with Azure networking, things are a little more complicated.
Reminder: in Azure, it is the NIC and the software-defined networking that do the routing. The gateway configured in the VM plays no part in it. Once a packet hits the virtual NIC, Azure does the routing.
The virtual NIC uses the route table associated with its subnet.
This route table has 3 sources.
- System routes
- User-defined routes
- BGP routes
The system routes are automatically created by Azure when you create a virtual network. They include an Internet route (0.0.0.0/0, meaning everything outside your VNET, that is, the Internet and the Azure backbone), a virtual network route (to route traffic inside the VNET), and some other Microsoft routes.
When a VNET is peered to another VNET, a VNET peering route is added.
The user-defined route (UDR) is the second routing source. You use a UDR at the subnet level to alter system/default routes. For example, if you want to send Internet traffic to a firewall instead of the default outbound path, you can use a UDR to add the route 0.0.0.0/0 with your firewall as the next hop. But you will need to do that for every subnet in your VNET.
When you add an entry to a user-defined route table, it deactivates any system route using the same prefix, as we saw with the 0.0.0.0/0 example. The default Internet next hop is replaced by the next hop of the new UDR route.
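The override can be pictured as laying the UDR entries over the system routes, prefix by prefix. The sketch below uses plain dictionaries; the VNET prefix and the firewall address 10.0.1.4 are hypothetical, and `effective_routes` is an illustrative helper, not an Azure call.

```python
def effective_routes(system_routes, user_routes):
    """Overlay UDR entries on system routes; a UDR replaces the same prefix."""
    effective = dict(system_routes)  # copy so the system routes stay untouched
    effective.update(user_routes)    # same prefix: the UDR next hop takes over
    return effective


system_routes = {
    "10.0.0.0/16": "Virtual Network",
    "0.0.0.0/0": "Internet",
}

# UDR sending Internet-bound traffic to a firewall (hypothetical address).
user_routes = {
    "0.0.0.0/0": "VirtualAppliance 10.0.1.4",
}

routes = effective_routes(system_routes, user_routes)
for prefix, next_hop in routes.items():
    print(prefix, "->", next_hop)
```

Only the 0.0.0.0/0 entry changes; the VNET route keeps its system next hop because no UDR uses that prefix.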
BGP routes. BGP is the main routing protocol of the Internet. Without BGP? No Internet! You may not have to learn everything about BGP to understand Azure Networking but having some notion is essential.
In Azure, BGP routes are shared via peering and come from an Azure ExpressRoute circuit, a VPN with the BGP option enabled, Azure Route Server connected to an NVA, or Azure Virtual WAN.
In the routing table, BGP routes are labeled as BGP routes, and the next hop is the Virtual Network Gateway, which generally sits in a peered VNET.
The Virtual Network Gateway injects the peered network prefixes into its BGP announcements so that they are advertised to its BGP peers (VPN, ExpressRoute, …).
An ExpressRoute circuit is not the only service that uses BGP; BGP routes can also come from a VPN or from an NVA through Azure Route Server.
Routes learned via BGP are added to VNETs and labeled as BGP. In this case, there is one important thing to remember: for the same prefix, routes learned via BGP are preferred over system/default routes.
In other words, if you have a 0.0.0.0/0 route in the route table (the default Internet route in Azure) and the BGP peer announces the same prefix with a different next hop, the BGP route will be used.
If the same prefix exists as a system route, a static route (UDR), and a BGP route, Azure prefers the UDR first, then the BGP route, and the system route last.
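The 0.0.0.0/0 case above, where a BGP route replaces the system route for the same prefix, can be sketched as a simple ranking. The `source` labels and the `SOURCE_RANK` table are assumptions made for this example, and only the two sources from that case are modelled.

```python
# Lower number = preferred. User-defined routes are left out of this sketch.
SOURCE_RANK = {"BGP": 0, "System": 1}


def pick_route(candidates):
    """Among routes sharing one prefix, return the one from the preferred source."""
    return min(candidates, key=lambda route: SOURCE_RANK[route["source"]])


# Two routes for the same prefix, coming from different sources.
candidates = [
    {"prefix": "0.0.0.0/0", "source": "System", "next_hop": "Internet"},
    {"prefix": "0.0.0.0/0", "source": "BGP", "next_hop": "Virtual Network Gateway"},
]

winner = pick_route(candidates)
print(winner["source"], "->", winner["next_hop"])
```

This tie-break only applies between routes with the same prefix; between different prefixes, the longest prefix match still decides first.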
So, the rules for routing in Azure are simple: the longest prefix wins, and a BGP route wins over a system route with the same prefix. But there is one last important rule: for every route going from IP1 to IP2, there must be a reverse route going from IP2 to IP1 using the same path.
But while it is easy to create a UDR to go one way, the reverse path is less obvious. Say you create a peering between two VNETs and add a route to reach a firewall via the peering. Now you need to create a route for the return traffic. To simplify the problem a little, you decide to use a large prefix (e.g., 10.0.0.0/21) to cover all your peered VNETs. However, doing so makes the return traffic use a different path, because the prefix used for the route is less specific than the peered VNET prefix. The return path will use the peering directly instead of going through the firewall.
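The situation described above can be reproduced with two small route tables and a longest prefix match. Everything here is a hypothetical setup for illustration: VNET A is 10.0.1.0/24, VNET B is 10.0.2.0/24, and the next hop labels are plain strings rather than real Azure objects.

```python
import ipaddress


def next_hop(route_table, destination):
    """Return the next hop of the most specific route containing the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in route_table
        if dest in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Effective routes on a subnet in VNET A: a UDR with the exact prefix of
# VNET B has replaced the peering route and points to the firewall.
routes_a = [
    ("10.0.1.0/24", "Virtual Network"),
    ("10.0.2.0/24", "Firewall"),
    ("0.0.0.0/0", "Internet"),
]

# Effective routes on a subnet in VNET B: the UDR uses the broad 10.0.0.0/21,
# so the more specific peering route for VNET A is still active.
routes_b = [
    ("10.0.2.0/24", "Virtual Network"),
    ("10.0.1.0/24", "VNet Peering"),
    ("10.0.0.0/21", "Firewall"),
    ("0.0.0.0/0", "Internet"),
]

forward = next_hop(routes_a, "10.0.2.5")      # A -> B
return_path = next_hop(routes_b, "10.0.1.5")  # B -> A
print("forward:", forward)
print("return:", return_path)
print("asymmetric:", forward != return_path)
```

In this sketch, giving the return UDR the exact prefix of VNET A (10.0.1.0/24) instead of the /21 would make both directions go through the firewall.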
The best way to avoid errors and misconfigurations in Azure is to always keep the 3 rules in mind: the longest prefix wins; for the same prefix, the BGP route wins over the system route; and you need to design the return path according to the first two rules.