Route 53 Deep Dive

Reducing latency for your web application

  • At a minimum, you need a domain and a hosted zone for your web app, plus a load balancer in front of your web servers.
  • Register vs. host domains: You can register a domain directly in Route 53 or transfer one in from another registrar. Registration and DNS hosting are independent: you can register a domain in Route 53 and leave the hosting to another provider, or go fully Route 53 native. If the domain is registered with a 3rd party provider, configure its name servers at that provider to point to Route 53.
  • A hosted zone holds the actual DNS data (the records). Route 53 assigns 4 name servers to the hosted zone.
  • A hosted zone can be public or private. A private hosted zone only answers queries from the VPCs it is associated with.
  • TTL value: tells any other name server or resolver that learns of a record how long it may cache that record set.

Route 53 Record Sets use cases

  • Use an A record if you manage which IP addresses are assigned to a particular machine or if the IPs are fixed (this is the most common case).
  • Use a CNAME record if you want to alias a name to another name and you don’t need other records (such as MX records for email) for the same name. A CNAME cannot be created at the zone apex, only on subdomains.
  • Use an ALIAS record if you are trying to alias the root domain (zone apex), if you need other records for the same name, or if the target is an AWS resource (ELB, CloudFront, S3).
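  • Use a URL record, where your DNS provider offers one, if you want the name to redirect (change address) instead of resolving to a destination. Route 53 has no URL record type, so a redirect there is typically handled by the target, for example an S3 bucket configured to redirect requests.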
  • You can keep a static backup of an EC2-hosted website in S3 and use an alias record to serve its contents.
  • TTL (Time-to-Live): Say a record has a TTL of 3600 seconds. A name server that has cached the record returns a non-authoritative answer from its cache until the TTL expires. If you change a record that has a high TTL, for example the MX record when you migrate your email servers, delivery might be delayed because the old values are still cached. High TTL values suit records that rarely change.
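
The TTL behaviour in the last bullet can be sketched with a toy caching resolver. This is only an illustration, not how any real resolver is implemented: the CachingResolver class, the mail.example.com name and the 192.0.2.x addresses are all made up for the example, and time is passed in by hand instead of read from a clock.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CachedAnswer:
    value: str
    expires_at: float  # time (in seconds) at which the cached answer goes stale


class CachingResolver:
    """Toy resolver that caches each answer for the record's TTL (hypothetical)."""

    def __init__(self, authoritative: dict[str, tuple[str, int]]):
        # Maps name -> (value, ttl_seconds); stands in for the hosted zone.
        self.authoritative = authoritative
        self.cache: dict[str, CachedAnswer] = {}

    def resolve(self, name: str, now: float) -> tuple[str, bool]:
        """Return (value, served_from_cache) for `name` at time `now`."""
        cached = self.cache.get(name)
        if cached is not None and now < cached.expires_at:
            return cached.value, True  # non-authoritative answer from the cache
        value, ttl = self.authoritative[name]
        self.cache[name] = CachedAnswer(value, now + ttl)
        return value, False


zone = {"mail.example.com": ("192.0.2.10", 3600)}
resolver = CachingResolver(zone)

first = resolver.resolve("mail.example.com", now=0)
# The mail server is migrated, but the resolver still holds the old answer.
zone["mail.example.com"] = ("192.0.2.20", 3600)
during_ttl = resolver.resolve("mail.example.com", now=600)
after_ttl = resolver.resolve("mail.example.com", now=3600)

print(first, during_ttl, after_ttl)
```

The query at 600 seconds still gets the old address from the cache; only once the 3600 seconds have passed does the resolver fetch the new one. Lowering the TTL ahead of a planned migration shortens that window.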

DNS Failover

  • Use health checks and associate them with resource record sets
  • Health check results are propagated to the Route 53 DNS servers globally.
  • Set up a Fail Whale (static error page) to be served if the health checks on both of 2 different paths are evaluated as failed.
  • Within a failover setup, the primary record needs to be associated with a health check in Route 53.
  • Geolocation: lets you route by the user’s location when you run in multiple regions, and can also point a location at a specific bucket.
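
As a rough sketch of the failover setup above, the snippet below builds the change batch for a primary/secondary record pair in the shape that the Route 53 ChangeResourceRecordSets API expects. The record name, IP addresses, set identifiers and health check ID are placeholders invented for the example; nothing is sent to AWS.

```python
import json


def failover_record(name, role, ip, set_id, health_check_id=None, ttl=60):
    """Build one failover record set; role is "PRIMARY" or "SECONDARY"."""
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id is not None:
        # Ties the record to a health check so Route 53 can stop returning it.
        record["HealthCheckId"] = health_check_id
    return record


primary = failover_record(
    "www.example.com.", "PRIMARY", "192.0.2.10", "primary",
    health_check_id="11111111-2222-3333-4444-555555555555",  # placeholder ID
)
secondary = failover_record("www.example.com.", "SECONDARY", "192.0.2.20", "secondary")

change_batch = {
    "Comment": "Failover pair (illustrative values only)",
    "Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": primary},
        {"Action": "UPSERT", "ResourceRecordSet": secondary},
    ],
}

print(json.dumps(change_batch, indent=2))
```

With boto3, a dictionary like this would be passed as the ChangeBatch argument of change_resource_record_sets, together with the HostedZoneId of your zone. The secondary could equally be an alias to an S3 bucket holding the Fail Whale page.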

Deployment Strategies

  • DNS Wave Deployments: Each edge location serves exactly 1 stripe, and 1 stripe consists of 4 name servers. A stripe is supported by many edge locations, so your name servers correspond to many edge locations, but 1 edge location supports 1 stripe only.
  • API Deployments: There are 3 different fleets: the Batch fleet (handles small things such as metrics and config), the Operations fleet (processes changes but does not face the customer directly) and the Customer fleet (serves API requests, client facing). In the staging environment you introduce a mixed-mode architecture, with the old and the new version of the software running side by side, in two different waves. Then you take it to production in the same way.
  • Conditional Routing Trees: The idea is to combine alias record sets and health checks into decision trees. For web browser access, a query can return more than one answer. You can also branch on the load of each address: if prod1 is below 50% load, check prod2 as well, and if both conditions hold, return both the prod1 and prod2 addresses. In effect, this shifts which servers are returned depending on load.
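
One possible reading of the conditional routing tree, sketched in plain Python. This is an interpretation of the note, not Route 53's actual evaluation logic: the Endpoint type, the route function, the fallback branch for busy endpoints and the addresses are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    healthy: bool  # result of the endpoint's health check
    load: float    # fraction of capacity in use, 0.0 to 1.0


def route(endpoints: list[Endpoint], fail_whale: str, max_load: float = 0.5) -> list[str]:
    """Walk the decision tree top-down and return the addresses to answer with."""
    # Branch 1: healthy endpoints that are below the load threshold.
    preferred = [e.address for e in endpoints if e.healthy and e.load < max_load]
    if preferred:
        return preferred
    # Branch 2: healthy but busy endpoints still beat an error page.
    busy = [e.address for e in endpoints if e.healthy]
    if busy:
        return busy
    # Leaf: nothing is healthy, so answer with the static fallback.
    return [fail_whale]


FAIL_WHALE = "198.51.100.1"  # hypothetical address of the static error page

both = route(
    [Endpoint("192.0.2.1", True, 0.30), Endpoint("192.0.2.2", True, 0.40)],
    FAIL_WHALE,
)
one_busy = route(
    [Endpoint("192.0.2.1", True, 0.30), Endpoint("192.0.2.2", True, 0.80)],
    FAIL_WHALE,
)
all_down = route(
    [Endpoint("192.0.2.1", False, 0.0), Endpoint("192.0.2.2", False, 0.0)],
    FAIL_WHALE,
)

print(both, one_busy, all_down)
```

When prod1 and prod2 are both healthy and under 50% load, both addresses come back; once prod2 gets busy, only prod1 is returned.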

Going Global with Route 53

  • As usage grows, people start to complain about latency. You would put CloudFront in front, but that might not always be enough. Let’s say you spin up 3 stacks in 3 different regions. You can associate the same private hosted zone with the VPCs in every one of those regions. Afterwards, you apply Latency Based Routing across all three regions. Latency does not always follow geographical distance; it also depends on details such as the user’s ISP.
  • Geo Based Routing is useful for places where AWS doesn’t have regions nearby, for example parts of Africa. Route 53 always uses the most specific location match available. Fail Whale of course works with all these deployments.
  • Reducing failover latency via cache busting: Cache busting is best known from DDoS attacks, where uncached requests force the server to do more work for every response. Here the same idea is applied on purpose: we make sure that our DNS answers are not cached, which reduces our failover time.
  • Route 53 Weighted Round Robin: lets developers specify the frequency (weight) with which different DNS responses are returned to end users. You can use WRR to bring servers into production, perform A/B testing, or balance your traffic across regions or data centres of varying size.
  • An alternative to using Auto Scaling groups and elastic load balancing is using Elastic IP addresses with Route 53 record sets. You might consider this solution when you use third-party load balancers, you need to support custom
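
The weighted round robin behaviour from this section can be sketched as follows: each record is returned with probability weight / sum of all weights. The record names and weights are invented, and the random draw is passed in explicitly so the example stays deterministic; a real caller would use random.random().

```python
from __future__ import annotations

from bisect import bisect_right
from itertools import accumulate


def expected_share(weights: dict[str, int]) -> dict[str, float]:
    """Fraction of responses each record should receive in the long run."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def pick(weights: dict[str, int], draw: float) -> str:
    """Select a record for a uniform draw in [0, 1)."""
    if not 0.0 <= draw < 1.0:
        raise ValueError("draw must be in [0, 1)")
    names = list(weights)
    cumulative = list(accumulate(weights[name] for name in names))
    # Scale the draw to the total weight and find the bucket it falls into.
    return names[bisect_right(cumulative, draw * cumulative[-1])]


# Hypothetical rollout: send about 10% of lookups to the new stack.
weights = {"current-stack": 90, "new-stack": 10}

shares = expected_share(weights)
low_draw = pick(weights, 0.0)
mid_draw = pick(weights, 0.5)
high_draw = pick(weights, 0.95)

print(shares, low_draw, mid_draw, high_draw)
```

Shifting the weights step by step (90/10, 50/50, 0/100) is one way to bring a new stack into production gradually. Because resolvers cache answers, the split observed in real traffic only approximates the weights.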

Healthchecks within Route 53

  • Health checks are used for global resiliency. They monitor endpoints such as an ELB rather than the individual EC2 instances behind it.
  • Configure a health check over HTTP, HTTPS or TCP and Route 53 will monitor the endpoint targets.
  • Here’s what happens when you omit a health check on a non-alias resource record set in this configuration:
    • Amazon Route 53 receives a query for example.com. Based on the latency for the user making the request, Amazon Route 53 selects the latency alias resource record set for the us-east-1 region.
    • Amazon Route 53 looks up the alias target for the latency alias resource record set, and checks the status of the corresponding health checks.
    • The health check for one weighted resource record set failed, so that resource record set is omitted from consideration.
    • The other weighted resource record set in the alias target for the us-east-1 region has no health check. The corresponding resource might or might not be healthy, but without a health check, Amazon Route 53 has no way to know. Amazon Route 53 assumes that the resource is healthy and returns the applicable value in response to the query.
  • You can use Health Checks with TCP for connection, HTTP/HTTPS for web server response and HTTP/HTTPS with String Matching for ensuring that the app is delivering expected content.
  • Health checks can monitor endpoints, CloudWatch alarms, or other health checks (calculated health checks, for overall application health).
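
Two of the rules above, sketched in Python under simplifying assumptions: a record without a health check is treated as healthy, and a calculated health check ("check of checks") is healthy when enough of its child checks are healthy. The function names and the web-a / web-b records are made up for the example.

```python
from __future__ import annotations

from typing import Optional


def considered_healthy(status: Optional[bool]) -> bool:
    """None means no health check is attached, which is treated as healthy."""
    return True if status is None else status


def calculated_check(children: list[bool], health_threshold: int) -> bool:
    """Healthy when at least `health_threshold` child checks are healthy."""
    return sum(children) >= health_threshold


# Weighted records behind the us-east-1 alias in the scenario above:
# web-a has a failing health check, web-b has no health check at all.
weighted_records = {"web-a": False, "web-b": None}

candidates = [
    name for name, status in weighted_records.items() if considered_healthy(status)
]

print(candidates)
print(calculated_check([True, True, False], health_threshold=2))
```

web-b stays in the candidate list even though nothing is actually checking it, which is why attaching a health check to every non-alias record in such a tree is worth the effort.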

Inbound and Outbound Endpoints

  • Endpoints give Route 53 Resolver private IP addresses inside your VPC, so that DNS queries can flow between the VPC and networks outside it (for example on-premises).

Route 53 Resolver

  • It’s essentially the VPC DNS resolver, which has existed for some time as a shared resolver isolated per AZ. What this resolver does is forward queries to the appropriate targets.
  • Resolver can classify the query type by association, i.e. private DNS, VPC DNS (EC2 names) or public DNS. Private DNS takes priority over the others. Private DNS now supports overlapping private hosted zones, e.g. mycompany.com and xyz.mycompany.com in the same VPC.
  • Hybrid Scenarios
    • A DNS forwarder is used to forward requests to/from on-premises servers to resolve names. For failure scenarios, you can configure the timeout and attempts options of the Linux stub resolver (in resolv.conf).
    • Hub and Spoke: if you have multiple teams/VPCs, you can connect them with peering or transit gateways.

Route 53 Resolver Endpoints

  • Inbound endpoints allow on-premises resolvers to query Route 53 Resolver. They create routable ENIs in the VPC that are reachable over DX or VPN.
  • Use multiple ENIs in separate AZs for HA.
  • On-premises, use a DNS resolver that retries failed queries and caches answers.
  • Outbound endpoints provide a path for Route 53 Resolver to query on-premises DNS resolvers.
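
Forwarding through an outbound endpoint is driven by rules keyed on domain names, and when several rules match a query the most specific domain wins. A minimal sketch of that matching, assuming a plain dictionary of hypothetical rules (the domains and target IPs are invented):

```python
from __future__ import annotations

from typing import Optional


def matching_rule(
    query_name: str, rules: dict[str, list[str]]
) -> Optional[tuple[str, list[str]]]:
    """Return (rule_domain, target_ips) for the most specific matching rule."""
    labels = query_name.rstrip(".").lower().split(".")
    # Try the full name first, then drop one leading label at a time.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in rules:
            return candidate, rules[candidate]
    return None  # no forwarding rule applies


# Hypothetical forwarding rules pointing at on-premises DNS servers.
rules = {
    "mydc.com": ["10.0.0.53"],
    "corp.mydc.com": ["10.1.0.53", "10.1.1.53"],
}

specific = matching_rule("db.corp.mydc.com", rules)
general = matching_rule("files.mydc.com", rules)
lookalike = matching_rule("xyzmydc.com", rules)

print(specific, general, lookalike)
```

Matching label by label rather than by raw string suffix keeps xyzmydc.com from being caught by the mydc.com rule. Queries with no matching rule would be resolved as usual instead of being forwarded.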

Misc Notes

  • Alias records can only be used to redirect queries to selected AWS resources, such as:
    • S3 buckets
    • CloudFront distributions
    • Another record in the same Route 53 hosted zone
  • A CNAME record, by contrast, can redirect queries to any DNS record.
  • For EC2 instances, always use a Type A record without an alias. For ELB, CloudFront and S3, always use a Type A record with an alias. Finally, for RDS, always use a CNAME record with no alias.
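
The rules of thumb in the last bullet, plus the apex restriction on CNAMEs, written out as a small lookup. The function names and target labels are just for illustration and only cover the cases listed in these notes.

```python
from __future__ import annotations


def recommended_record(target: str) -> tuple[str, bool]:
    """Return (record_type, use_alias) for a target kind, per the notes above."""
    rules = {
        "ec2": ("A", False),
        "elb": ("A", True),
        "cloudfront": ("A", True),
        "s3": ("A", True),
        "rds": ("CNAME", False),
    }
    try:
        return rules[target.lower()]
    except KeyError:
        raise ValueError(f"no rule of thumb recorded for target kind: {target!r}") from None


def cname_allowed(name: str, zone: str) -> bool:
    """A CNAME cannot sit at the zone apex, only on names below it."""
    return name.rstrip(".").lower() != zone.rstrip(".").lower()


print(recommended_record("ELB"))
print(recommended_record("rds"))
print(cname_allowed("example.com", "example.com."))
print(cname_allowed("www.example.com", "example.com"))
```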

Hybrid DNS Architectures

  • By default, there is a VPC resolver and DNS configured.

  • If an on-premises client wants to reach two.myvpc.com, the query goes to the DC (mydc.com) first. Since the name is not publicly resolvable, we need to create a conditional forwarding rule there that forwards such requests to the VPC. However, the VPC resolver is only reachable from resources inside the VPC. To get around this, Unbound DNS can be run in the VPC as a conditional forwarder.
  • Newer service: Route 53 Resolver Endpoints let you place resolver endpoints in your VPC that are reachable over Direct Connect. You can place several in different AZs as primary, secondary and tertiary.
  • If you need to connect to your VPCs via Direct Connect, you’ll need virtual private gateways attached to private virtual interfaces. For services such as S3 and DynamoDB, you need a public virtual interface, because these services are reached through public endpoints rather than from inside a VPC.