How has the cloud changed networking?
The fundamentals of networking have not changed much since the days of dial-up internet, but the way networking is implemented has changed a lot.
It’s become more reliable and thankfully much faster. Networking is how resources and services communicate with each other. With on-premises networking, we had to deploy and configure routers, switches, and firewalls; applying firmware updates and replacing hardware was a common chore.
But the cloud put an end to those tasks.
Although the hardware is no longer our problem, networking still exists and we’re still responsible for configuring and securing it.
What is a virtual network?
Let’s get started with the basics: the network. Azure, Google Cloud Platform, and Amazon Web Services all provide a virtual network. Think of this as a virtual routing switch hosted in the cloud. It’s what all the services connect to and use to communicate with each other.
In Azure, the virtual network is referred to as a VNet.
For AWS and GCP, the network is referred to as a Virtual Private Cloud or VPC.
In each case, they serve a similar function. They contain one or more subnets and allow communication between the resources in those subnets.
Networking basics
Each cloud provider has the concept of regions. A region is a grouping of one or more data centers. Spreading workloads across multiple regions provides high availability by duplicating services across those regions. We can also place resources closer to customers.
The way networking utilizes the regions is similar between Azure and AWS — and different with GCP.
Microsoft Azure
With Azure, the virtual network, or VNet, exists in one region. Subnets are added to the VNet, and resources are assigned to those subnets. All resources in a VNet can communicate with each other by default, including resources on different subnets.
They also have internet access by default. A resource must be in the same region as the VNet in order to connect to one of its subnets. If redundancy is required for high availability, another VNet is deployed to a different region.
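To make that concrete, here is a minimal sketch of creating a VNet with two subnets using the azure-mgmt-network Python library. The subscription ID, resource group, region, and address ranges are placeholder assumptions, not values from this article.

```python
# Minimal sketch: create an Azure VNet with two subnets using azure-mgmt-network.
# Subscription ID, resource group, region, and address ranges are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"  # assumption: your subscription
client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

# A VNet lives in a single region; subnets carve up its address space.
poller = client.virtual_networks.begin_create_or_update(
    "demo-rg",      # assumption: an existing resource group
    "demo-vnet",
    {
        "location": "eastus",
        "address_space": {"address_prefixes": ["10.0.0.0/16"]},
        "subnets": [
            {"name": "web", "address_prefix": "10.0.1.0/24"},
            {"name": "data", "address_prefix": "10.0.2.0/24"},
        ],
    },
)
vnet = poller.result()
print(f"Created VNet {vnet.name} with {len(vnet.subnets)} subnets")
```

Resources placed in the web and data subnets could then reach each other by default, matching the behavior described above.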
AWS
Like Azure, a VPC in AWS is created in a region. A VPC in AWS uses availability zones, which are distinct locations isolated from failures in other availability zones. Subnets exist inside these availability zones.
In an AWS VPC there are two types of subnets, a public subnet and a private subnet. A public subnet has access to the internet while a private subnet does not.
By default, all resources (or instances, as they’re called in AWS) connected to a VPC can communicate within the VPC.
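Here is a minimal boto3 sketch that shows these pieces: a VPC, one subnet in each of two availability zones, and an internet gateway plus route that is what actually makes a subnet “public.” The region, zones, and CIDR ranges are placeholder assumptions.

```python
# Minimal sketch: an AWS VPC with a public and a private subnet using boto3.
# Region, availability zones, and CIDR ranges are placeholder assumptions.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

vpc_id = ec2.create_vpc(CidrBlock="10.1.0.0/16")["Vpc"]["VpcId"]

# Subnets live inside availability zones within the region.
public_id = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.1.1.0/24", AvailabilityZone="us-east-1a"
)["Subnet"]["SubnetId"]
private_id = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.1.2.0/24", AvailabilityZone="us-east-1b"
)["Subnet"]["SubnetId"]

# A subnet is "public" because its route table sends 0.0.0.0/0 to an internet gateway.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

rt_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rt_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
ec2.associate_route_table(RouteTableId=rt_id, SubnetId=public_id)

# The private subnet keeps the VPC's default route table, so it has no internet route.
print(f"VPC {vpc_id}: public subnet {public_id}, private subnet {private_id}")
```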
Google Cloud (GCP)
Compared to AWS and Azure, GCP takes a different approach to networking with its implementation of a global VPC.
As the name implies, a global VPC spans multiple regions and is not associated with any specific region. It’s a global resource. When a global VPC network is created in auto mode, system-generated subnets are created in each GCP region. The subnets themselves are region-specific.
By default, all instances, virtual machines, for example, can communicate with each other in the network.
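The sketch below illustrates the global-network, regional-subnet split using the google-cloud-compute Python library. It uses custom mode (rather than auto mode) so the regional subnets are created explicitly; the project ID, regions, and ranges are placeholder assumptions.

```python
# Minimal sketch: a GCP custom-mode VPC (a global resource) with subnets in two
# regions, using google-cloud-compute. Project ID, regions, and ranges are placeholders.
from google.cloud import compute_v1

project = "my-project"  # assumption: your GCP project ID

networks = compute_v1.NetworksClient()
subnets = compute_v1.SubnetworksClient()

# The VPC itself is global and has no address range of its own.
network = compute_v1.Network(name="demo-vpc", auto_create_subnetworks=False)
networks.insert(project=project, network_resource=network).result()

network_url = f"projects/{project}/global/networks/demo-vpc"

# Subnets are regional; here we create one in each of two regions.
for region, cidr in [("us-central1", "10.2.1.0/24"), ("europe-west1", "10.2.2.0/24")]:
    subnet = compute_v1.Subnetwork(
        name=f"demo-{region}", network=network_url, ip_cidr_range=cidr
    )
    subnets.insert(project=project, region=region, subnetwork_resource=subnet).result()
```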
Peering and gateways
Now that we understand the networks and subnets for each provider, let’s look at how we can connect them so traffic can flow between the different VPCs and VNets and the services in them. After all, cloud services are not that useful if we can’t communicate with them.
Each provider supports peering between virtual networks.
For example, Azure supports peering between VNets, allowing the peered VNets to communicate.
AWS VPCs also support peering.
And GCP VPCs as well.
But for all three, transitive peering is not supported. So if network A is peered with network B and B with C, network A and network C will not be able to communicate, at least not until we add another peering directly between A and C.
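Here is a minimal boto3 sketch of that situation with AWS VPC peering. The VPC IDs are placeholders, and all three VPCs are assumed to be in the same account and region so the peerings can be accepted immediately.

```python
# Minimal sketch: VPC peering with boto3. Peering A<->B and B<->C does NOT let
# A reach C; a third peering (plus routes) would be required. VPC IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def peer(requester_vpc: str, accepter_vpc: str) -> str:
    """Create and accept a peering connection between two VPCs in this account/region."""
    pcx = ec2.create_vpc_peering_connection(
        VpcId=requester_vpc, PeerVpcId=accepter_vpc
    )["VpcPeeringConnection"]["VpcPeeringConnectionId"]
    ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx)
    return pcx

peer("vpc-aaaa", "vpc-bbbb")  # A <-> B
peer("vpc-bbbb", "vpc-cccc")  # B <-> C
# vpc-aaaa and vpc-cccc still cannot communicate: peering is not transitive.
# Route table entries pointing at each peering connection are also required.
```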
Adding point-to-point peerings like that, however, will not scale well. A full mesh of n networks needs n(n-1)/2 peerings, so going from three networks to four means three more peerings to get them all to communicate, as the quick comparison below shows. This would be hard to manage as the number of networks grows. What we need is a hub-and-spoke solution that allows us to connect multiple networks with a single connection per network. This is accomplished with gateways.
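This small calculation contrasts the number of full-mesh peerings with the number of links needed when every network connects only to a central hub.

```python
# Quick comparison: connections needed for a full mesh of peerings vs. hub-and-spoke.
def full_mesh_peerings(n: int) -> int:
    return n * (n - 1) // 2   # every network peered with every other network

def hub_and_spoke_links(n: int) -> int:
    return n - 1              # each spoke connects only to the hub

for n in (3, 4, 10):
    print(n, full_mesh_peerings(n), hub_and_spoke_links(n))
# 3 networks:  3 peerings vs 2 spoke links
# 4 networks:  6 peerings vs 3 spoke links
# 10 networks: 45 peerings vs 9 spoke links
```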
Microsoft Azure
In Azure, gateways are used in a hub-and-spoke topology: one VNet becomes the hub, peering is used to connect the spoke VNets to the hub, and the hub contains a gateway that routes traffic between the different networks.
The gateway supports transitive connectivity between the VNets. Gateways support connectivity outside of Azure as well. A VPN gateway supports a VPN connection between the gateway and a VPN endpoint, supporting connectivity to your on-premises network, for example.
An ExpressRoute gateway supports connectivity between Azure and an on-premises network with a private ExpressRoute connection. An ExpressRoute connection is a secure, redundant connection over a third-party network.
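To tie the hub-and-spoke pieces together, here is a hedged sketch of the two sides of a hub-spoke peering with gateway transit, using azure-mgmt-network. The resource group and VNet names are placeholder assumptions, and the hub VNet is assumed to already contain a VPN or ExpressRoute gateway.

```python
# Minimal sketch: hub-and-spoke VNet peering in Azure with gateway transit.
# Resource group and VNet names are placeholders; the hub VNet is assumed to
# already contain a VPN or ExpressRoute gateway.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

sub_id = "<subscription-id>"
client = NetworkManagementClient(DefaultAzureCredential(), sub_id)
rg = "demo-rg"
hub_id = (f"/subscriptions/{sub_id}/resourceGroups/{rg}"
          "/providers/Microsoft.Network/virtualNetworks/hub-vnet")
spoke_id = (f"/subscriptions/{sub_id}/resourceGroups/{rg}"
            "/providers/Microsoft.Network/virtualNetworks/spoke-vnet")

# Hub side: allow the spoke to use the hub's gateway.
client.virtual_network_peerings.begin_create_or_update(
    rg, "hub-vnet", "hub-to-spoke",
    {
        "remote_virtual_network": {"id": spoke_id},
        "allow_virtual_network_access": True,
        "allow_forwarded_traffic": True,
        "allow_gateway_transit": True,
    },
).result()

# Spoke side: send on-premises-bound traffic through the hub's gateway.
client.virtual_network_peerings.begin_create_or_update(
    rg, "spoke-vnet", "spoke-to-hub",
    {
        "remote_virtual_network": {"id": hub_id},
        "allow_virtual_network_access": True,
        "use_remote_gateways": True,
    },
).result()
```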
AWS
With AWS, a transit gateway is used to connect multiple VPCs. The transit gateway connects to the VPCs within a region and allows traffic to flow between them.
If multiple regions are involved, inter-region peering connects the transit gateways, providing connectivity between the networks.
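Here is a minimal boto3 sketch of creating a transit gateway and attaching two VPCs to it. The VPC and subnet IDs are placeholder assumptions, and route table entries pointing at the transit gateway would still be needed in each VPC.

```python
# Minimal sketch: an AWS transit gateway attached to two VPCs with boto3.
# VPC and subnet IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

tgw_id = ec2.create_transit_gateway(
    Description="hub for demo VPCs"
)["TransitGateway"]["TransitGatewayId"]

# In practice you would wait for the transit gateway to reach the 'available'
# state before creating attachments.
for vpc_id, subnet_ids in [("vpc-aaaa", ["subnet-a1"]), ("vpc-bbbb", ["subnet-b1"])]:
    ec2.create_transit_gateway_vpc_attachment(
        TransitGatewayId=tgw_id, VpcId=vpc_id, SubnetIds=subnet_ids
    )
```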
Connectivity to a remote network can take place over a VPN with the use of a virtual private gateway.
For a dedicated connection, an AWS Direct Connect gateway is used to provide a private, high-bandwidth, dedicated connection between an on-premises network and the VPC.
A third-party provider is required in this scenario for connectivity between the data center and AWS. These providers are located close to AWS data centers, providing a private, reliable connection between an on-premises network and AWS.
Google Cloud (GCP)
That brings us to Google Cloud services. Remember, a VPC in GCP is cross-region and all subnets can communicate by default.
GCP introduces the concept of a project. A VPC is part of a project; subnets internal to the VPC can communicate, but they can’t communicate with a VPC in another project. For two VPCs to communicate, we need to add VPC peering.
Just like Azure and AWS, peering in GCP is not transitive. Meaning, if we add a third project and VPC and peer it with the VPC in Project 2, Project 1 and Project 2 can communicate, and Project 2 and Project 3 can communicate, but Project 1 and Project 3 can’t communicate unless we add another peering relationship.
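The sketch below shows one side of a GCP VPC peering using the google-cloud-compute library. The project and network names are placeholder assumptions, and the matching call has to be made from the other project’s side before the peering becomes active.

```python
# Minimal sketch: peering two GCP VPCs that live in different projects, using
# google-cloud-compute. Project and network names are placeholders; the same
# call must also be made from the other project's side.
from google.cloud import compute_v1

client = compute_v1.NetworksClient()

peering = compute_v1.NetworkPeering(
    name="proj1-to-proj2",
    network="projects/project-2/global/networks/vpc-2",  # the remote VPC
    exchange_subnet_routes=True,
)
request = compute_v1.NetworksAddPeeringRequest(network_peering=peering)

client.add_peering(
    project="project-1",
    network="vpc-1",
    networks_add_peering_request_resource=request,
).result()
```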
GCP has another feature called a shared VPC. This provides flexibility by allowing multiple projects to leverage a central VPC where connectivity can be controlled and centrally managed.
There are two gateway options for hybrid connectivity to on-premises networks. The first is Cloud VPN, which provides secure connectivity over a public internet connection.
The second is Cloud Interconnect. This, like Azure ExpressRoute and AWS Direct Connect, provides a secure connection over a private, dedicated circuit.
Load balancing
Another important feature of networking is the ability to distribute connections between multiple instances of a service. This is referred to as load balancing.
Not only does load balancing help availability, but it also helps performance by spreading the workload across multiple instances of the same service.
There are many ways to implement load balancing, and Azure, AWS and GCP have different options to meet any load balancing need.
Azure has a transport-level (Layer 4) load balancer called Azure Load Balancer. Load balancing can be extended with an application-level load balancer called Azure Application Gateway. Azure also offers a DNS-based load balancer called Traffic Manager, which uses name-to-IP-address resolution to distribute connections based on rules you define, and a global load balancer called Front Door that supports SSL offloading and routes traffic to the closest resource based on rules you configure.
As you’d expect, AWS offers multiple load balancing solutions as well. The Network Load Balancer in AWS distributes connections at the transport layer (TCP, UDP, and TLS traffic). An Application Load Balancer makes routing decisions at the application layer, directing traffic using path-based routing, for example. AWS also offers DNS-based load balancing through Route 53, which routes traffic based on rules you configure, including health checks for endpoints, geography, and latency-based decisions.
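As an illustration of path-based routing, here is a hedged boto3 sketch of an Application Load Balancer with two target groups, where requests to /api/* go to one group and everything else goes to the other. The subnet, VPC, and target group details are placeholder assumptions.

```python
# Minimal sketch: an AWS Application Load Balancer with path-based routing,
# using boto3's elbv2 client. Subnet, VPC, and target group details are placeholders.
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

lb_arn = elbv2.create_load_balancer(
    Name="demo-alb",
    Subnets=["subnet-a1", "subnet-b1"],   # public subnets in two availability zones
    Type="application",
    Scheme="internet-facing",
)["LoadBalancers"][0]["LoadBalancerArn"]

# Two target groups, one per backend service.
web_tg = elbv2.create_target_group(
    Name="web", Protocol="HTTP", Port=80, VpcId="vpc-aaaa", TargetType="instance"
)["TargetGroups"][0]["TargetGroupArn"]
api_tg = elbv2.create_target_group(
    Name="api", Protocol="HTTP", Port=8080, VpcId="vpc-aaaa", TargetType="instance"
)["TargetGroups"][0]["TargetGroupArn"]

# The default listener action sends traffic to the web target group...
listener_arn = elbv2.create_listener(
    LoadBalancerArn=lb_arn, Protocol="HTTP", Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": web_tg}],
)["Listeners"][0]["ListenerArn"]

# ...and a path-based rule routes /api/* requests to the api target group.
elbv2.create_rule(
    ListenerArn=listener_arn, Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[{"Type": "forward", "TargetGroupArn": api_tg}],
)
```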
And last but not least is load balancing with GCP. Google Cloud is a little different from the others. There are two basic types of load balancers, internal and external. With an internal load balancer, client requests come from within Google Cloud, and the load balancer is regional. It can use TCP or UDP to manage traffic, or it can act as a proxy using HTTP or HTTPS to direct traffic. An external load balancer is used when client connections come from the internet. An external load balancer can be regional or global and can use passthrough or proxy mode to route traffic.
As we compare the options, remember that properly designing a service across multiple regions or geographies and building in redundancy is just as important as the load balancing solution itself when planning for high availability.
For example, if we deploy a load balancer but the resources behind it are all in a single data center, the solution would be at risk should that data center become unavailable. A better approach is to use a combination of global, regional, internal, and external load balancers to design a highly available solution.