devscoach.

Demystifying AWS Networking

2024-04-01

aws

networking

Networking was something that took a long time to click for me. I started my career at Cisco Systems, the king of all network companies, and the amount of knowledge the network engineers had was mind-boggling. But over time the parts started to fit together in my head. Professionally, I have spent the most time dealing with networking in AWS.

This post is to solidify my knowledge of how the pieces of AWS networking work, and hopefully save some time for others in the process.

Each of these sections could be a book of its own, so I will try to keep the explanations concise but useful. At the end we will walk through an example of how it all fits together for an example SaaS application.

The VPC

VPC Diagram

Virtual Private Cloud or VPC is your own little network kingdom inside of AWS. The VPC is the container unit for all the network items that we will discuss later. The great thing about the VPC is it allows you to separate your different services inside of AWS, so they cannot access each other.

This isolation provides security from attackers (if one network becomes compromised the attacker is isolated in that VPC) and internal safeguards (someone can't deploy a service in your staging VPC that will hurt your services in your production VPC).

VPCs are Software-Defined Networks meaning that the structure of the network is defined programmatically. This means that adding or removing networks is done without modifying hardware or physical networks, while providing the same isolation. Amazing.

Each VPC can have a single Internet Gateway (IGW) attached -- AWS loves their acronyms -- to route traffic between the VPC and the internet.

VPCs (and all networks defined inside of them) are region specific.

Subnets & Route Tables

Subnets are groups of IP addresses, and Route Tables define how traffic moves in and out of these groups. The route table entries define what type of subnet it is. We will discuss the two most common types -- public and private subnets. VPCs have an IP address range, and subnets must choose their group of addresses from within the VPC range.

When you add a new EC2 instance into your VPC, you must assign it to a subnet. Unless you request a specific address, it is then assigned an available IP address from the subnet's IP address block. For example, if you add an EC2 instance to the subnet 10.0.0.0/16 it might be assigned 10.0.0.4. (AWS reserves the first four addresses and the last address of every subnet, so 10.0.0.0 through 10.0.0.3 are never handed out.) The range 10.0.0.0/16 encompasses all IP addresses from 10.0.0.0 to 10.0.255.255.
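
To make the range concrete, here is a minimal sketch using Python's standard ipaddress module. It only does the CIDR arithmetic for the example block; the two sample addresses being checked are arbitrary picks of mine for illustration.

```python
import ipaddress

# The example block from the text above.
block = ipaddress.ip_network("10.0.0.0/16")

first_address = str(block.network_address)    # start of the range
last_address = str(block.broadcast_address)   # end of the range
address_count = block.num_addresses           # 2 ** (32 - 16)


def in_block(ip: str) -> bool:
    """Return True if the address falls inside the example block."""
    return ipaddress.ip_address(ip) in block


print(first_address, last_address, address_count)
print(in_block("10.0.200.9"), in_block("10.1.0.5"))
```

The /16 fixes the first 16 bits (the "10.0" part), which leaves 16 bits free and gives 65,536 addresses in the block.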

If you are unfamiliar with CIDR notation for IP addresses, the Wikipedia entry gives a good primer.

Subnets are Availability Zone specific, so you will need a separate subnet for every AZ you want to deploy into.
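
As a rough sketch of what "one subnet per AZ" looks like in practice, the snippet below carves a VPC range into equal blocks and hands one to each zone. The zone names and the /24 subnet size are assumptions I picked for the example, and plan_subnets is a hypothetical helper, not an AWS API.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, zones: list[str], new_prefix: int = 24) -> dict[str, str]:
    """Assign each zone the next free block of size /new_prefix inside the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of non-overlapping blocks
    return {zone: str(block) for zone, block in zip(zones, blocks)}


# Hypothetical zone names, used only for illustration.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])

# Every planned subnet must sit inside the VPC range.
vpc = ipaddress.ip_network("10.0.0.0/16")
all_inside = all(ipaddress.ip_network(cidr).subnet_of(vpc) for cidr in plan.values())

print(plan)
print(all_inside)
```

Because the blocks come from the VPC range itself, they cannot overlap each other or fall outside the VPC.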

Private Subnets

Private Subnets are hidden completely from the internet. Resources that are placed inside a private subnet are not addressable from the outside world. Additionally, the resources cannot connect to the wider internet, only to other local resources inside the VPC. By default, a private subnet's route table contains only the local route and looks like this:

Default Route Image

This means only requests to an IP address within 10.0.0.0/16 will be routed, and then they are only routed locally.

So if you place an EC2 instance inside a private subnet you cannot access it from the internet, SSH into it from your computer, or download any packages onto it. That makes it very secure, but not super useful.

OUTBOUND - NAT GATEWAY

That's where NAT gateways step in. NAT (Network Address Translation) gateways sit inside your VPC and allow resources inside private networks to connect to the public internet. When you create a NAT gateway and connect it to your private subnet, the route table will look like this:

NAT Route Example

For example, your server requests to download a package from the internet. The route table for the subnet your server is in sees it's a non-local request and sends it to the NAT gateway. The NAT gateway translates the server's private IP address to the NAT's public IP address -- you could think of it as lending the IP address to the server. Then the NAT gateway makes the request and sends the response back to the server, shielding it from the internet.

This means that if you want your resources inside a private subnet to be able to access the outside world you need a NAT gateway. But, NAT gateways are outbound only. There is no way for a machine in the outside world to use the NAT gateway to connect to your private resource, keeping it safe and sound inside the VPC.
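
Here is a toy model of that translation, to show why replies get back in but unsolicited traffic does not. It is a deliberately simplified sketch: the NatGateway class, the port numbering, and the example addresses are all made up for illustration, and a real NAT tracks full connection details rather than just a port.

```python
from dataclasses import dataclass, field


@dataclass
class NatGateway:
    public_ip: str
    next_port: int = 1024
    # Maps a public port to the (private ip, private port) that opened it.
    table: dict = field(default_factory=dict)

    def outbound(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Rewrite the source to the NAT's public address and remember the mapping."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return (self.public_ip, public_port)

    def inbound(self, public_port: int):
        """Return the private destination for a reply, or None if nothing asked for it."""
        return self.table.get(public_port)


nat = NatGateway(public_ip="203.0.113.10")           # documentation-range address
seen_by_internet = nat.outbound("10.0.1.25", 51000)  # private server starts a request
reply_target = nat.inbound(seen_by_internet[1])      # the response finds its way back
unsolicited = nat.inbound(4444)                      # nobody inside opened this port

print(seen_by_internet, reply_target, unsolicited)
```

The outside world only ever sees the NAT's address, and an inbound packet with no matching entry has nowhere to go.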

INBOUND

So how do we get other servers to talk to the ones sitting inside a private subnet? Well, they need to be in the same VPC. All servers by default can talk to each other over the local route in the route tables. But if you are not in the VPC you will need to use something like a Jump Box. A jump box is a server inside a public subnet in the same VPC as the private server you want to talk to. As we will see in the next section, resources in the public subnets can be reached from the outside internet. A user can then, for example, SSH into the jump box and then SSH again -- or "jump" -- to the private resource.

This is only one example. There are other setups that can be effective as well, like using VPNs.

Public Subnets

The difference between private and public subnets revolves around how requests are routed via the route table. Private subnets do not have a route to the internet gateway, while public subnets do. This means that when you launch an instance into a public subnet it can send and receive traffic from the outside internet.
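
Put as code, the distinction is a one-line check. This sketch models a route table as a list of (destination, target) pairs and treats any target whose ID starts with "igw-" as an internet gateway; the IDs themselves are placeholders I made up, and is_public is a hypothetical helper.

```python
def is_public(route_table: list[tuple[str, str]]) -> bool:
    """A subnet counts as public if any of its routes targets an internet gateway."""
    return any(target.startswith("igw-") for _destination, target in route_table)


# Placeholder IDs for illustration only.
private_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "nat-example")]
public_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]

print(is_public(private_routes), is_public(public_routes))
```

Note that the private table can still reach the internet through the NAT gateway. It is the direct route to the IGW that makes a subnet public.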

To be reached from the outside internet though, the newly added resource will need a public IP address. On public subnets you can choose to automatically assign IPv4 and/or IPv6 addresses to any resource that is added to that subnet.

Note that AWS charges for public IPv4 addresses, billed by the hour.

If you add a resource to a public subnet, and have an IP address assigned to it, be warned -- it is now reachable from the internet, limited only by its security group. To add more specific rules to individual resources you will need to update the Security Group assigned to that item. Security groups are essentially firewalls.

Route Tables

Route tables can hold much more complicated logic like cutting off access to certain subnets, sending requests through a VPN gateway, or even peering with another VPC. You can think of a route table as a switchboard: it sees the incoming request and looks up where to connect it.

What if you have multiple routes matching a request? Well, route tables use something called Longest Prefix Match routing, which means the most specific route for a given request wins. The linked article goes into more detail.
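
Longest prefix match is easier to see in code than in words. The sketch below is my own minimal model of the lookup, not how AWS implements it: the route targets are placeholder names, and the peering route is an assumption added so there is a third option to choose from.

```python
import ipaddress


def lookup(routes: list[tuple[str, str]], destination: str):
    """Return the target of the most specific matching route, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (largest prefix length) is the most specific route.
    return max(matches, key=lambda match: match[0])[1]


routes = [
    ("10.0.0.0/16", "local"),
    ("172.16.0.0/12", "pcx-example"),  # hypothetical peering connection
    ("0.0.0.0/0", "nat-example"),      # catch-all for everything else
]

print(lookup(routes, "10.0.3.7"))       # matches /16 and /0, the /16 wins
print(lookup(routes, "172.16.5.5"))     # matches /12 and /0, the /12 wins
print(lookup(routes, "93.184.216.34"))  # only the catch-all matches
```

With only the local route in the table, the same lookup returns None for any address outside the VPC, which is exactly the locked-down private subnet described earlier.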

Example Setup

Now that we understand the basics, let's walk through an example of how we might set up a generic SaaS application's networking.

In real life you could run a simple SaaS application on one EC2 instance, in a public subnet and be done, but I want to show a more complicated setup for example purposes.

What we will need:

VPC - our house where all of our networking lives
Load Balancer - an ALB to distribute requests
Web Servers - two EC2 instances to run our application
Database - hosted db with RDS
Private Subnets - one for each web server, and RDS
Public Subnets - at least one for our ALB

The final setup will look like this:

Final Setup Image

Now with our knowledge we should be able to puzzle out how this works. The ALB is our public facing component. It is where we will point our DNS record. For this reason it should be inside a public subnet, because we want our users to be able to access it.

You might then be thinking that the EC2 instances should also be placed inside the public subnet. If we didn't have a load balancer that would be the case. And if we didn't want to use an ALB we could place the web servers inside the public subnet, assign elastic IP addresses to them, and use DNS to load balance between them.

But, in this case we are using an ALB which means our web servers can instead be nestled safely inside our private subnets. The reason I used two separate subnets here is that I want to launch these web servers into different Availability Zones (AZs). It doesn't make much sense from a redundancy standpoint to place them in the same AZ. If an AZ fails then our other server can keep serving requests.

Additionally, we don't want our RDS instance to be reachable from the outside world. It should only be able to be accessed by our web servers. That means it should also be in a private subnet.

The final piece is the Jump Box. Most likely you will want to be able to SSH into your web servers at some point. The jump box allows you to do that. Since it is in the public subnet we can first SSH onto it, and then SSH to the web servers since they are in the same VPC.

I highly suggest updating the jump box to either a) only be accessible from your VPN or b) only be accessible from your home IP address. To do this you can update the security group assigned to the jump box. Security Groups -- as mentioned earlier -- are a firewall where you can pick and choose what traffic is allowed in and out of a specific resource.
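
Option b) can be sketched as a single inbound rule. This is a simplified model of how a security group evaluates traffic (allow rules only, anything unmatched is dropped), not the AWS API; IngressRule and is_allowed are hypothetical names, and the "home" address is a documentation-range placeholder.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    protocol: str
    port: int
    source_cidr: str


def is_allowed(rules: list[IngressRule], protocol: str, port: int, source_ip: str) -> bool:
    """Traffic gets in only if some rule matches; there are no deny rules to write."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and addr in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


# Allow SSH (TCP port 22) from exactly one address: a /32 matches a single IP.
jump_box_rules = [IngressRule("tcp", 22, "198.51.100.7/32")]

print(is_allowed(jump_box_rules, "tcp", 22, "198.51.100.7"))  # from home
print(is_allowed(jump_box_rules, "tcp", 22, "203.0.113.50"))  # from anywhere else
print(is_allowed(jump_box_rules, "tcp", 80, "198.51.100.7"))  # wrong port
```

For option a) the idea is the same, with the VPN's address range in place of the single home address.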

So now we can see how a request will travel through our network. A user makes a request to our application. Our DNS record points at the ALB, so the request is sent to the ALB's address. It enters our VPC through the IGW and hits our ALB, which then chooses one of our servers to send the request to. The ALB sends the request over the local network so that it reaches our private subnet. The server reads the request, queries the database for information -- again traveling over the VPC local network -- and responds back to the ALB, which in turn sends the response back to the user in the outside world.
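
I glossed over how the ALB picks a server. Round robin is the ALB default, so here is a tiny sketch of that idea, assuming two web servers with made-up private addresses. RoundRobinBalancer is a hypothetical class for illustration and ignores health checks entirely.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out targets in order, looping back to the start when the list runs out."""

    def __init__(self, targets: list[str]):
        self._targets = cycle(targets)

    def pick(self) -> str:
        return next(self._targets)


# One web server per private subnet (placeholder addresses).
alb = RoundRobinBalancer(["10.0.1.10", "10.0.2.10"])
picks = [alb.pick() for _ in range(4)]

print(picks)
```

Each server ends up with every other request, which is why losing one AZ still leaves the application serving traffic.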

Conclusion

This only scratches the surface of what you can do with AWS networking. But, like all things with infrastructure, it's important to start as simple as possible. When your needs change, and you need to scale, then add more elements. The more elements, the more surface area for attack and the more complicated the administration.

If I made any errors, or you want to discuss more, please email me.