Another Day, Another Billion Flows - How Does VPC Technology Work?

Misc Notes

  • The public IP of your instance is held on the Internet Gateway array. The IGW array controls all the public IPs; it handles the addressing and translation of packets.

VPC Encapsulation

  • AWS has its own proprietary VPC encapsulation to disambiguate overlapping IP addresses. Which VPC and which ENI a packet belongs to: all this information is carried in the VPC encapsulation
  • Blackfoot Edge Devices: translate packets from the internal VPC encapsulation to normal IP networking on the outside. Each has an embedded router that encapsulates and de-encapsulates, and NATs the internal IPs to public IP addresses.
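The two jobs described above can be sketched in miniature. This is purely an illustrative model with invented field names and IDs, not AWS's actual wire format:

```python
# Toy model (NOT the real AWS format) of VPC encapsulation plus edge NAT.

def encapsulate(packet: dict, vpc_id: str, eni: str) -> dict:
    """Wrap the original packet with VPC metadata so overlapping IPs stay distinct."""
    return {"vpc": vpc_id, "eni": eni, "inner": packet}

def edge_nat(frame: dict, nat_table: dict) -> dict:
    """Blackfoot-style edge step: strip the encapsulation, NAT src to a public IP."""
    inner = dict(frame["inner"])
    inner["src"] = nat_table[(frame["vpc"], inner["src"])]
    return inner

nat_table = {("vpc-aaaa", "10.0.0.5"): "54.1.2.3"}   # private -> public mapping
frame = encapsulate({"src": "10.0.0.5", "dst": "93.184.216.34"}, "vpc-aaaa", "eni-1")
assert edge_nat(frame, nat_table) == {"src": "54.1.2.3", "dst": "93.184.216.34"}
```

The key point is that the NAT table is keyed by (VPC, private IP), so identical private addresses in different VPCs translate independently.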

Encapsulation in Direct Connect

  • It encapsulates the packet to make it ready for VLANs, as a Direct Connect connection is represented as a VLAN on the physical link

S3 / DynamoDB Endpoints

  • Enable you to connect these services directly to your VPC without traversing the internet

Mapping Service

  • Identifies which ENI corresponds to which instance and physical host at any given moment, and also manages the routes. The embedded routers within the physical hosts talk to the Mapping Service. This is not like ARP; it is much more capable, and lookups are really fast. It is a distributed web service that handles the mapping between customers' VPC routes and IPs and the physical destinations on the wire
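A toy model of the lookup the embedded routers ask the Mapping Service to perform, keyed by (VPC, private IP); all IDs below are invented:

```python
# Toy Mapping Service table: (VPC, private IP) -> (ENI, physical host).
mapping = {
    ("vpc-aaaa", "10.0.0.5"): ("eni-123", "host-17"),
    ("vpc-bbbb", "10.0.0.5"): ("eni-456", "host-42"),  # same IP, different VPC
}

def lookup(vpc: str, private_ip: str):
    """Resolve where on the wire a packet for this VPC/IP should be sent."""
    return mapping.get((vpc, private_ip))

# The VPC ID in the encapsulation disambiguates identical private IPs:
assert lookup("vpc-aaaa", "10.0.0.5") == ("eni-123", "host-17")
assert lookup("vpc-bbbb", "10.0.0.5") == ("eni-456", "host-42")
```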

VPC Flow logs

  • Flow log records are data aggregated per ENI

Network Load Balancer

  • It is embedded in the VPC network and can load balance flows natively and transparently within it


HyperPlane

  • Sits on the VPC network along with the EC2 physical hosts and Blackfoot devices. You can make an EC2 instance a HyperPlane node. The nodes do state tracking; as a distributed system, HyperPlane nodes make transactional decisions and share state in tens of microseconds
  • For NAT: HyperPlane guarantees that connections to the same destination IP/destination port pair get a unique source port. An example is software updates: for, say, an Ubuntu update, many instances will want to hit the same IP on the same port at the same time
  • For NLB: when the first packet comes in, HyperPlane selects the target instance or container that should handle the connection, and it does this by keeping per-flow state
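The source-port guarantee for NAT can be sketched as follows. This is a simplified single-node model of what HyperPlane does transactionally across nodes; the destination IP is just an example value:

```python
# Sketch of per-destination source-port allocation: every concurrent
# connection to the same (dst IP, dst port) pair gets a distinct source port.

class NatPortAllocator:
    def __init__(self, low=1024, high=65535):
        self.low, self.high = low, high
        self.used = {}  # (dst_ip, dst_port) -> set of source ports in use

    def allocate(self, dst_ip: str, dst_port: int) -> int:
        used = self.used.setdefault((dst_ip, dst_port), set())
        for port in range(self.low, self.high + 1):
            if port not in used:
                used.add(port)
                return port
        raise RuntimeError("source ports exhausted for this destination")

nat = NatPortAllocator()
# Many instances hitting the same update mirror on port 80:
ports = [nat.allocate("198.51.100.7", 80) for _ in range(1000)]
assert len(set(ports)) == 1000  # all source ports unique for this destination pair
```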

HyperPlane and Shuffle Sharding

  • Let's say you have 8 HyperPlane nodes, all shared by all customers. The problem with that is the noisy neighbour: one busy customer floods all 8 nodes! You could shard them into 4 groups of 2; now only 1/4 of your customers have a bad day, which is better. Shuffle sharding: you give each customer 3 of the 8 nodes chosen at random. Another customer also gets 3 nodes, with perhaps 1 overlapping. The potential overlap between customers shrinks as the number of nodes increases
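The arithmetic behind this: with 8 nodes and shards of 3 there are C(8,3) = 56 possible shards, so the chance that two customers land on the identical shard is only 1/56. A small sketch:

```python
import math
import random

def shard_overlap_probability(nodes: int, shard_size: int) -> float:
    """Probability that two customers' randomly chosen shards are identical."""
    return 1 / math.comb(nodes, shard_size)

def pick_shard(nodes: int, shard_size: int, seed=None) -> set:
    """Assign a customer a random shard of `shard_size` out of `nodes` nodes."""
    rng = random.Random(seed)
    return set(rng.sample(range(nodes), shard_size))

print(math.comb(8, 3))  # 56 possible shards of 3 nodes out of 8
a = pick_shard(8, 3, seed=1)
b = pick_shard(8, 3, seed=2)
print(len(a & b))       # overlap size between two customers' shards
```

With more nodes the denominator grows combinatorially, which is why adding nodes makes full overlap between any two customers vanishingly rare.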

Enhanced Networking

  • A lot of application connections look like request/replies: web service calls to a load balancer, pulling data from an application server. There is a strong correlation between latency and the underlying packet rate you can obtain from the network, so the packet-per-second capability of the instance really matters here. With enhanced networking, AWS uses a technology called SR-IOV (Single Root I/O Virtualisation), which makes it possible to bypass part of the virtualisation layer and have your instance talk directly to the physical NIC of the infrastructure. Your instance needs to run a driver that knows how to use SR-IOV; on the AWS side, a virtual function runs on the physical NIC when you turn on enhanced networking. Enhanced networking also reduces jitter on the network, which further reduces latency.

VPC Peering

  • You can design your architecture so that shared services across your organisation are reached over VPC peering. Use cases include authentication, monitoring, logging, remote administration and scanning
  • Provides infrastructure zoning by segregating Dev, Test and Production VPCs; these VPCs can be identical, literally using the same IP space, so you get the same experience everywhere

VPC peering for VPC-to-VPC connectivity

  • Peering works across accounts; you have to specify the peer owner's account ID
  • The peering ID is used as a handshake to confirm the peering connection between the 2 VPCs. Once it's created, update the route tables on both sides so that traffic between them goes through the peering connection: the VPC peer becomes a route target!
  • Security groups are not supported across the peering: you need to specify security group rules by IP prefix
  • No transit capability for VPN, AWS Direct Connect, or third VPCs: VPC A cannot reach VPC C via VPC B; you need to create a direct peering. Peer VPC address ranges cannot overlap, but you can peer with 2+ VPCs that themselves overlap with each other
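A minimal sketch of the two constraints above, using Python's ipaddress module: peered CIDRs must not overlap, and the peering connection appears as a route target (the pcx ID below is made up):

```python
import ipaddress

def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Peer VPC address ranges must not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

# Hypothetical route table for VPC A: destination CIDR -> target
route_table_a = {
    "10.0.0.0/16": "local",         # VPC A's own range
    "10.1.0.0/16": "pcx-11112222",  # peered VPC B via the peering connection
}

assert can_peer("10.0.0.0/16", "10.1.0.0/16")      # disjoint ranges: OK to peer
assert not can_peer("10.0.0.0/16", "10.0.1.0/24")  # overlapping: peering rejected
```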

Routing and Private Connections

Create a VPC which does not overlap with your existing network on premises

  • VPN Connection: run a few API calls through the AWS CLI, an SDK or your own tools to connect your gateway to the VPC. Define a customer gateway and create a connection between it and the virtual private gateway
  • AWS Direct Connect: either a direct fiber connection or one of the AWS APN partners who provide circuit backhaul from Direct Connect locations. From the AWS standpoint it is again API driven: tell AWS how much bandwidth you need and to where, and you can create a private virtual interface between on premises and AWS. This is optional if you don't need private connectivity; use it particularly for hybrid applications
  • Configuring route tables: AWS route tables are highly available, redundant and abstracted, so there is no second route table to maintain in your VPC

Remote Connectivity Best Practices

  • By default, when you create a VPN connection AWS gives you 2 IPSec tunnels. Use BGP for failure recovery; a BGP-routed tunnel is recommended for high availability. BGP is just the routing protocol spoken between the AWS VPN router and yours, and it provides a hard layer-4 connectivity check: if anything fails, BGP notices and fails over
  • It is better to configure two VPN connections to protect against failure: 4 IPSec tunnels in total. You can also configure a single ingress path if you wish. If you need high confidentiality, go with Direct Connect, which is hard-wired/provisioned for your instances. Direct Connect with a VPN backup provides even more robust availability
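The failover behaviour can be sketched as a toy route selector: routes learned over a tunnel are withdrawn when its BGP session drops, and traffic shifts to a remaining tunnel. Tunnel names are invented, and real BGP path selection is far richer:

```python
# Toy BGP-style failover across redundant IPSec tunnels.

def best_path(prefix, advertisements):
    """Pick a next hop for a prefix among tunnels still advertising it."""
    alive = [a for a in advertisements if a["prefix"] == prefix and a["up"]]
    # Real BGP has an elaborate tie-breaking process; we just take the first live path.
    return alive[0]["tunnel"] if alive else None

ads = [
    {"prefix": "10.50.0.0/16", "tunnel": "tunnel-1", "up": True},
    {"prefix": "10.50.0.0/16", "tunnel": "tunnel-2", "up": True},
]
assert best_path("10.50.0.0/16", ads) == "tunnel-1"
ads[0]["up"] = False  # tunnel-1 keepalives stop: its route is withdrawn
assert best_path("10.50.0.0/16", ads) == "tunnel-2"  # traffic fails over
```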

VPC with Private and Public Connectivity

For public connectivity

  • You need to create an internet gateway and attach it to the VPC. The internet gateway is highly available, with no throughput constraints, no added latency and no single point of failure

For Hybrid App Connectivity

  • If you have a customer-facing app, you need a direct path into your VPC from outside. Create an Internet Gateway for that purpose and point your route at it.

Automatic Route Propagation

  • Any prefixes that AWS receives from your data center, advertised by the customer over BGP, are populated into the VPC route table. Say you have a subnet in your home network and decide to add an IP block: you add that route to your corporate network, and it is automatically populated into the VPC route table as advertised.

Isolating connectivity by subnet

  • If you don't want an AWS subnet to have any connectivity to the home network, create a new route table and associate it with the subnet. As soon as you do, any traffic coming from instances in that subnet is routed according to that route table and no longer uses the default route table.
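A sketch of how an associated route table resolves traffic from a subnet, assuming standard longest-prefix-match semantics; the isolated table below simply has no route back to the home network:

```python
import ipaddress

def resolve(route_table, dst_ip):
    """Longest-prefix-match lookup: the most specific matching route wins."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for cidr, target in route_table.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, target)
    return best[1] if best else None

# Custom table for an isolated subnet: only local VPC traffic, no route home.
isolated = {"10.0.0.0/16": "local"}
assert resolve(isolated, "10.0.4.7") == "local"
assert resolve(isolated, "192.168.1.5") is None  # no path to the home network
```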

Software VPN for VPC-to-VPC Connectivity

  • Useful for connecting VPCs across regions. A software firewall to the internet is conceptually similar: NAT/firewall functionality that routes all traffic from your subnets to the internet via a firewall

NAT Instance & Gateway Best Practices

NAT Instances

  • When creating a NAT instance, disable the source/destination check on the instance
  • The NAT instance must be in a public subnet to have connectivity to the outside world, and there must be a route out of the private subnet to the NAT instance for this to work
  • If the NAT instance is the bottleneck for the amount of traffic you need, increase the instance size
  • You can build high availability using Auto Scaling groups, multiple subnets in different AZs and a script to automate failover. A NAT instance always sits behind a security group
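The failover-script idea can be sketched as a watchdog that repoints the private subnet's default route when the active NAT instance fails health checks. The instance IDs are made up, and a real script would call the EC2 API rather than mutate a dict:

```python
# Toy failover logic for a NAT instance pair.

def failover(route_table, healthy):
    """Return a route table whose default route points at a healthy NAT instance."""
    nat = route_table["0.0.0.0/0"]
    if healthy.get(nat, False):
        return route_table  # active NAT instance is fine; nothing to do
    standby = next((i for i, ok in healthy.items() if ok), None)
    if standby is None:
        raise RuntimeError("no healthy NAT instance available")
    return {**route_table, "0.0.0.0/0": standby}

routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "i-nat-primary"}
health = {"i-nat-primary": False, "i-nat-standby": True}
assert failover(routes, health)["0.0.0.0/0"] == "i-nat-standby"
```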

NAT Gateways

  • Preferred by big organisations as it scales automatically up to 10 Gbps, and there is no need to patch it as it's a managed service
  • It is not associated with security groups
  • Automatically assigned a public IP
  • Remember to update your route tables as per NAT Gateway
  • No need to disable source/destination check
  • The NAT Gateway brings per-flow stateful NAT to the VPC: many instances hiding behind one public IP

Network ACLs

  • Your VPC automatically comes with a default network ACL and by default it allows all outbound and inbound traffic
  • Each subnet in your VPC is automatically associated with the default network ACL unless you explicitly associate it with a custom network ACL
  • You can associate an ACL with multiple subnets; however, a subnet can be associated only with one network ACL
  • Network ACL rules are evaluated in order! Separate inbound and outbound rules! And each rule can either allow or deny traffic
  • Network ACLs are stateless: responses to allowed inbound traffic are still subject to the rules for outbound traffic
  • Block specific IP addresses using network ACLs, not Security Groups
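Rule evaluation can be modelled simply: sort by rule number, first match wins, implicit deny at the end. A minimal sketch with port-based rules only:

```python
# Toy network ACL evaluator: rules are (rule_number, (port_lo, port_hi), action),
# evaluated in ascending rule-number order; the first match decides.

def evaluate_acl(rules, port):
    for number, (lo, hi), action in sorted(rules):
        if lo <= port <= hi:
            return action
    return "deny"  # the implicit "*" rule: deny anything unmatched

inbound = [
    (100, (80, 80), "allow"),   # rule 100: allow HTTP
    (200, (0, 65535), "deny"),  # rule 200: deny everything else
]
assert evaluate_acl(inbound, 80) == "allow"  # rule 100 matched first
assert evaluate_acl(inbound, 22) == "deny"
```

Because evaluation stops at the first match, a low-numbered deny rule for a specific IP takes effect even if a broader allow rule follows it, which is why ACLs (not security groups) are the tool for blocking specific addresses.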

NATs vs Bastion Hosts

  • NAT is used so that instances in a private subnet can reach out to the internet; however, people on the internet cannot SSH into those instances in the private subnet to administer them
  • If you want to do that, use jump boxes. A jump box lets you SSH into your bastion host and initiate another connection to your instances over the private network. The bastion is used only for administration purposes. Harden the bastion server's OS!

High Availability NAT - Squid Proxy Problem

  • Standard NAT inside a VPC is confined to a single instance, which could fail. You also need to perform large puts and gets to Amazon S3 through it. Solution:
  • Run Squid in a proxy configuration in an Auto Scaling group that scales on network I/O. Put a private load balancer in front of this proxy layer, and when you boot your EC2 instances, bootstrap them to proxy all their HTTP/S connections out through that private load balancer. So if one Squid proxy fails, there are many other instances to take over.

Availability pattern for private connections

  • A good pattern is to use BGP for failure recovery, with tight coupling. Every VPN connection gives you IPSec tunnels, terminating in 2 AZs for example
  • A better availability scenario is to have 2 VPN connections from 2 routers at the edge of your network in your DCs: 4 BGP sessions and IPSec tunnels, protecting against customer gateway failure
  • Use Direct Connect to improve consistency in network performance, with VPN as a backup. Direct Connect connections are single physical circuits, so create two of them (e.g. from two DCs) for resiliency

  • Automatic route propagation from the VGW: it listens to your routes and propagates them through the network. Some customers use an MPLS cloud connected to Direct Connect and dynamically advertise prefixes from their DCs; with route propagation there is no extra work
  • When configuring your customer gateway to connect to your VPC, the IKE (Internet Key Exchange) security association is established between your customer gateway and the virtual private gateway using a set of pre-shared keys