The outage of Facebook and its other applications on the 4th of October 2021 showed how crucial BGP is to the Internet.
But what is BGP, and why should you care about it in Azure (and elsewhere)?
BGP stands for Border Gateway Protocol (not patrol). It is defined by RFC 4271. It is a routing protocol: it lets routers advertise networks to each other.
BGP is built on the notion of Autonomous Systems (AS). An AS is an entity controlling a group of networks, and it is identified by an Autonomous System Number (ASN). An ASN can be a 16-bit number (65,536 values in total, with only around 64,500 of them available for public use) or a 32-bit number (about 4 billion possibilities). An ASN can be public, for exchanging routes between external entities; in that case, the number is registered with an Internet registry.
Private entities do not need to go through an Internet registry; using a private ASN is enough. It can be any number between 64512 and 65534 for a 16-bit ASN, or from 4200000000 to 4294967294 for a 32-bit ASN.
To start exchanging routing information, two systems need to peer: they open a BGP session between them over TCP port 179. The two systems are usually linked directly over a /31 or /30 network, and once the connection is established they can start exchanging information. Each system advertises the list of its network routes. If a system is also connected to other systems, it can advertise the routes it learned from them as well.
You may think that this kind of routing configuration is reserved for hardcore network guys and has nothing to do with Azure. You would be wrong: BGP is everywhere in Azure. For example, ExpressRoute relies on BGP.
There is another situation where two different sets of networks need to be interconnected: a VPN from Azure to on-premises networks or to other cloud providers.
By default, the VPN gateway relies on statically declared routes. You declare on both sides which networks are reachable by listing their prefixes. For example, the Azure side declares 172.24.0.0/24 and the on-premises side 172.24.1.0/24.
In some situations this is not a problem, but in most configurations it is, because things are not static. You may need to add a new network on-premises or new prefixes in an Azure VNet. Moreover, if you use the hub-and-spoke model in Azure with the VPN gateway, several new prefixes are likely to be added over time.
Each new prefix added on the Azure side, whether by adding a new peered VNet or by adding a prefix to the gateway VNet, requires a change in the VPN tunnel configuration. This static-routing-like configuration is difficult to manage and does not scale. Generally, people work around it by declaring a large prefix (/8 to /16) that covers future changes.
This solution is far from perfect. It works if you only have a few VNets and do not need to interconnect several subscriptions. The main difficulty is that you can run out of large prefixes very quickly: the private address space contains only one /8 (10.0.0.0/8).
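To make the "large prefix" workaround concrete, here is a minimal sketch that checks whether a new prefix is already covered by a larger prefix declared on the tunnel. The function names ConvertTo-IPv4Number and Test-PrefixCovered are hypothetical helpers written for this illustration; they are not part of the Az module, and the sketch only handles IPv4.

```powershell
# Hypothetical helper: turn a dotted IPv4 address into a 32-bit number (stored in a long).
function ConvertTo-IPv4Number {
    param([Parameter(Mandatory)][string]$Address)
    [long]$value = 0
    foreach ($byte in [System.Net.IPAddress]::Parse($Address).GetAddressBytes()) {
        $value = $value * 256 + $byte
    }
    return $value
}

# Hypothetical helper: is $Prefix entirely inside $Supernet (both in CIDR notation)?
function Test-PrefixCovered {
    param(
        [Parameter(Mandatory)][string]$Supernet,
        [Parameter(Mandatory)][string]$Prefix
    )
    $superAddress, $superLength = $Supernet -split '/'
    $prefixAddress, $prefixLength = $Prefix -split '/'

    # A shorter (larger) prefix can never fit inside a longer (smaller) one.
    if ([int]$prefixLength -lt [int]$superLength) { return $false }

    # Compare only the bits fixed by the supernet mask.
    $shift = 32 - [int]$superLength
    $superBits = (ConvertTo-IPv4Number -Address $superAddress) -shr $shift
    $prefixBits = (ConvertTo-IPv4Number -Address $prefixAddress) -shr $shift
    return $superBits -eq $prefixBits
}

Test-PrefixCovered -Supernet '172.24.0.0/16' -Prefix '172.24.1.0/24'   # True
Test-PrefixCovered -Supernet '172.24.0.0/16' -Prefix '172.25.0.0/24'   # False
```

A prefix that falls outside the declared range still forces a change on both sides of the tunnel, which is exactly the maintenance burden described above.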
The solution is to enable BGP on your Azure VPN Gateway.
The first thing to do is to choose two ASNs, one for your Azure network and one for the on-premises network. Each can be a 16-bit or a 32-bit number. You can use a public ASN if you have one, but the best practice is to use a private ASN.
With 16 bits, you can only use private numbers between 64512 and 65534, except 65515, 65517, 65518, 65519 and 65520, which are reserved by Microsoft.
With 32 bits, you can use the whole private space, from 4200000000 to 4294967294. To use a 32-bit number, however, you will need PowerShell or another CLI/IaC method instead of the Azure portal.
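The rules above can be sketched as a small check to run before creating the gateway. Test-AzureVpnAsn is a hypothetical helper written for this post, not an Az cmdlet; the ranges and the reserved list are simply the ones quoted in the text.

```powershell
# Hypothetical helper: tell whether an ASN fits the private ranges described above.
function Test-AzureVpnAsn {
    param([Parameter(Mandatory)][long]$Asn)

    # Private ASNs reserved by Microsoft, as listed in the text.
    $reservedByMicrosoft = 65515, 65517, 65518, 65519, 65520

    if ($reservedByMicrosoft -contains $Asn) { return 'Reserved by Microsoft' }
    if ($Asn -ge 64512 -and $Asn -le 65534) { return 'Private 16-bit' }
    if ($Asn -ge 4200000000 -and $Asn -le 4294967294) { return 'Private 32-bit (PowerShell/CLI/IaC only)' }
    return 'Not a private ASN'
}

Test-AzureVpnAsn -Asn 65010        # Private 16-bit
Test-AzureVpnAsn -Asn 65515        # Reserved by Microsoft
Test-AzureVpnAsn -Asn 4200000001   # Private 32-bit (PowerShell/CLI/IaC only)
```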
# Create the public IP for the gateway (New-, not Get-: -Location and -AllocationMethod only apply at creation)
$VPNIP = New-AzPublicIpAddress -Name <PubIPName> -ResourceGroupName <RGName> -Location <AzureLocation> -AllocationMethod Dynamic
# Retrieve the VNet and its GatewaySubnet
$GwVnet = Get-AzVirtualNetwork -Name <VnetName> -ResourceGroupName <ResourceGroupName>
$GwSubnet = Get-AzVirtualNetworkSubnetConfig -Name "GatewaySubnet" -VirtualNetwork $GwVnet
# Bind the subnet and the public IP, then create the route-based gateway with its ASN
$GwIPCfg = New-AzVirtualNetworkGatewayIpConfig -Name <VpnConfigName> -Subnet $GwSubnet -PublicIpAddress $VPNIP
New-AzVirtualNetworkGateway -Name <VPNGwName> -ResourceGroupName <RGName> -Location <AzureLocation> -IpConfigurations $GwIPCfg -GatewayType Vpn -VpnType RouteBased -GatewaySku VpnGw1 -Asn 4200000001
With the VPN gateway configured with its ASN, you need a second element to enable BGP: the IP address the VPN gateway uses to peer with the on-premises device.
In the on-premises world, you would choose a /31 or /30 network to create the peering. With the Azure VPN Gateway, you can only use an IP address from the GatewaySubnet subnet. Azure assigns this IP address automatically. You can get it by running this PowerShell command:
(Get-AzVirtualNetworkGateway -Name <GatewayName> -ResourceGroupName <ResourceGroupName>).BgpSettings.BgpPeeringAddress
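When configuring the on-premises device, it can help to see the peering address together with the gateway ASN. The following is a minimal sketch, assuming the Az.Network module is loaded and you are signed in; the two variables hold placeholder values to replace with your own.

```powershell
# Placeholder values, to be replaced with your own names.
$gatewayName = '<GatewayName>'
$resourceGroup = '<ResourceGroupName>'

# Read the gateway and show the BGP settings needed by the on-premises peer.
$gateway = Get-AzVirtualNetworkGateway -Name $gatewayName -ResourceGroupName $resourceGroup
$gateway.BgpSettings | Select-Object Asn, BgpPeeringAddress, PeerWeight
```

The Asn and BgpPeeringAddress values are what the on-premises device needs in order to declare the Azure gateway as its BGP neighbor.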
Once the on-premises device is configured, you will need to create a Local Network Gateway.
$AzLocalGw = New-AzLocalNetworkGateway -Name <LocalNetworkGatewayName> -ResourceGroupName <ResourceGroupName> -Location <AzureLocation> -GatewayIpAddress <VPNPublicIP> -AddressPrefix <OptionalLocalPrefix> -Asn <VPNDeviceASN> -BgpPeeringAddress <VPNDevicePeerIP>
With this object, we can create a connection object to finalize the VPN configuration.
$AzVPNGw = Get-AzVirtualNetworkGateway -Name <GatewayName> -ResourceGroupName <ResourceGroupName>
New-AzVirtualNetworkGatewayConnection -Name <ConnectionName> -ResourceGroupName <ResourceGroupName> -VirtualNetworkGateway1 $AzVPNGw -LocalNetworkGateway2 $AzLocalGw -Location <AzLocation> -ConnectionType IPsec -SharedKey <PreSharedKey> -EnableBGP $True
A tunnel is now created between the Azure VPN Gateway and the on-premises device, and they can start exchanging routes using BGP.
You can check the routes advertised to the on-premises peer by using:
Get-AzVirtualNetworkGatewayAdvertisedRoute -VirtualNetworkGatewayName <GatewayName> -ResourceGroupName <ResourceGroupName> -Peer <VPNDevicePeerIP>
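Advertised routes are only half of the picture. As a complementary sketch, assuming the same Az.Network module and placeholder names, you can also look at the state of the BGP session and at the routes the gateway learned from its peers:

```powershell
# Placeholder values, to be replaced with your own names.
$gatewayName = '<GatewayName>'
$resourceGroup = '<ResourceGroupName>'

# State of each BGP peer (a healthy session is expected to report Connected).
Get-AzVirtualNetworkGatewayBGPPeerStatus -VirtualNetworkGatewayName $gatewayName -ResourceGroupName $resourceGroup |
    Format-Table Neighbor, Asn, State, RoutesReceived

# Routes the Azure gateway learned, including those received from the on-premises device.
Get-AzVirtualNetworkGatewayLearnedRoute -VirtualNetworkGatewayName $gatewayName -ResourceGroupName $resourceGroup |
    Format-Table Network, NextHop, Origin, AsPath
```

If the on-premises prefixes do not show up in the learned routes, the peering address or the ASN declared in the Local Network Gateway is a good first place to look.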
BGP helps you scale your interconnected architecture without worrying about routing. But BGP can do more for you. You can set up two VPN devices serving the same on-premises network, which provides failover if one device goes offline.