Deep Dive on AWS RDS

How do I decide between GP2 and IO1 storage types for RDS?

  • GP2 is a great choice but be aware of the burst credits on volumes < 1TB
  • Depleting the credits causes IOPS to drop to the baseline rate. Latency and queue depth metrics will spike until the credits are replenished
  • Monitor the BurstBalance metric to see the percentage of I/O credits left in the burst bucket
  • Think of the GP2 burst rate and the stated IO1 IOPS rate as maximum I/O rates. More IOPS is not necessarily better; the workload itself needs to be optimised
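
The burst-credit behaviour above can be worked through with a little arithmetic. This is a minimal sketch, assuming the published gp2 figures (3 IOPS per GiB baseline with a 100 IOPS floor, bursts up to 3,000 IOPS, and a bucket of 5.4 million I/O credits). The function names are made up for illustration and are not part of any AWS SDK.

```python
# Assumed gp2 figures: 3 IOPS/GiB baseline (minimum 100), burst to 3,000 IOPS,
# and a burst bucket holding 5.4 million I/O credits.
BASELINE_IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
BURST_IOPS = 3000
BUCKET_CREDITS = 5_400_000


def baseline_iops(size_gib):
    """Baseline IOPS a gp2 volume of this size earns continuously."""
    return max(MIN_BASELINE_IOPS, BASELINE_IOPS_PER_GIB * size_gib)


def seconds_until_depleted(size_gib, demand_iops, balance_pct=100.0):
    """Seconds of sustained demand before the bucket is empty.

    Returns None when the demand never drains the bucket, which is the
    case once the baseline reaches the burst rate (around 1 TB).
    """
    base = baseline_iops(size_gib)
    # The volume cannot be driven harder than its burst rate (or its
    # baseline, when the baseline is already above the burst rate).
    demand = min(demand_iops, max(BURST_IOPS, base))
    if demand <= base:
        return None
    credits = BUCKET_CREDITS * balance_pct / 100.0
    # Credits drain at (demand - baseline) per second.
    return credits / (demand - base)
```

Under these assumptions, a 100 GiB volume with a full bucket sustains a 3,000 IOPS burst for 5,400,000 / (3,000 - 300) = 2,000 seconds. The `balance_pct` argument plays the role of the BurstBalance percentage.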

What happens during a Multi-AZ failover?

  • The primary and standby EC2 instances running your relational database replicate the storage blocks physically
  • A third-party observer monitors the primary and standby instances. Once that observer sees connectivity between the two is lost, it initiates the failover and the standby becomes the primary
  • The DNS entry is updated to point to the new primary, so when your application is disconnected it queries DNS again and reconnects to the new primary
  • If you are caching DNS data, set the TTL as low as possible so that a failover is picked up quickly
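
The reconnect behaviour described above can be sketched on the application side. This is a minimal illustration, not a prescribed client: `connect_with_retry` is a hypothetical helper, and it assumes the `connect` callable opens a fresh connection by hostname each time, so the endpoint is looked up again on every attempt (subject to whatever DNS caching your runtime or OS does).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts run out.

    connect must open a NEW connection by hostname on each call; reusing
    a cached IP address would keep pointing at the old primary.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except OSError as exc:  # covers refused, reset and timed-out connections
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_s)  # give the failover and DNS update time to land
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

A caller might pass something like `lambda: socket.create_connection((db_host, 3306))`, where `db_host` is a placeholder for your RDS endpoint name. The retry delay is a judgment call: it should be in the same range as the DNS TTL you have configured.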

Why would I use Read Replica?

  • The primary goal is to relieve pressure on your primary database. Offload the app’s read-heavy workloads to Read Replicas, since most apps are read-heavy.
  • You can use cross-region Read Replica.
  • You can also use it for Disaster Recovery purposes
  • You can also upgrade a Read Replica to a new engine version and run tests against it, in case you don’t want to impact your primary database.
  • Replication is asynchronous, and the CloudWatch metric for the resulting lag is ReplicaLag. You may see up to a minute of lag with MySQL, but with Amazon Aurora it is minimal
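
One way to act on ReplicaLag is to route reads only to replicas that are fresh enough. This is a minimal sketch under assumptions: `pick_read_endpoint` is a hypothetical helper, the lag values (in seconds) are assumed to have been fetched from CloudWatch already, and the 5-second threshold is an arbitrary example.

```python
from typing import Mapping, Optional


def pick_read_endpoint(
    primary: str,
    replica_lag_s: Mapping[str, Optional[float]],
    max_lag_s: float = 5.0,
) -> str:
    """Return the least-lagged replica, or the primary if none is fresh.

    replica_lag_s maps a replica endpoint to its latest ReplicaLag in
    seconds; None means no datapoint is available for that replica.
    """
    fresh = {
        endpoint: lag
        for endpoint, lag in replica_lag_s.items()
        if lag is not None and lag <= max_lag_s
    }
    if not fresh:
        # Every replica is stale or unknown: fall back to the primary.
        return primary
    return min(fresh, key=fresh.get)
```

Falling back to the primary trades read offloading for freshness. An application that tolerates stale reads could raise `max_lag_s` instead.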

Backups in RDS

  • Two types: Automatic Backups and Manual Snapshots
  • Transaction logs are shipped to Amazon S3 every 5 minutes to support point-in-time recovery (PITR), so you can restore back to the point in time you want
  • Amazon RDS backups leverage Amazon EBS Snapshots stored in S3 managed by RDS
  • When should I use automated backups as opposed to snapshots?

    Automated Backups                        | Manual Snapshots
    -----------------------------------------+------------------------------------------
    Specify backup retention window per      | Manually created through AWS console,
    instance (7-day default)                 | CLI or RDS API
    Deleted after window                     | Always there
    Supports PITR                            | Restores to snapshot
    Good for disaster recovery               | Use for testing, final copy before
                                             | deleting a database, non-prod/test
                                             | environments
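
To make the PITR window concrete, here is a rough sketch of checking whether a target time is restorable from automated backups. It assumes the window runs from the start of the retention period up to about 5 minutes ago (the log-shipping interval mentioned above). The real bounds are whatever RDS reports for the instance, and the function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: transaction logs are shipped every 5 minutes, so the most
# recent restorable point trails the current time by roughly that much.
LOG_SHIP_INTERVAL = timedelta(minutes=5)


def pitr_window(now, retention_days):
    """Approximate (earliest, latest) restorable times for automated backups."""
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_SHIP_INTERVAL
    return earliest, latest


def can_restore_to(target, now, retention_days):
    """True if target falls inside the approximate PITR window."""
    earliest, latest = pitr_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))
print(can_restore_to(now - timedelta(days=8), now, retention_days=7))
```

A manual snapshot has no such window: it restores only to the moment the snapshot was taken.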

How do I restore a backup?

  • Restoring creates an entirely new database instance while the old one is running
  • New volumes are hydrated from Amazon S3. The volume is made available to the database right away, and blocks are pulled in from S3 as they are needed, so the restored instance can be slow until those blocks have come in.
  • You can use a larger instance for that initial restore

Securing your AWS RDS?

  • Secure by default. Network isolation with VPC. AWS IAM based resource-level permission controls. Encryption at rest using KMS or Oracle/Microsoft TDE. Use SSL protection for data in transit

How can I save money on my RDS Database?

  • AWS Reserved Instances (RIs) give up to a 60% discount in exchange for a 1- or 3-year commitment. It’s just a billing commitment, not literally an instance reserved for you
  • Size flexibility: if you are running an r4.large and want to scale up to an r4.xlarge, AWS counts the Reserved Instance for the large against the usage of the larger type. This flexibility gives you better RI utilisation
  • RI Utilisation report: of the RIs purchased, it shows how many are being used and how heavily. It covers RDS together with EC2!
  • You can also stop and start a database. While it’s not running, you only pay for the storage! This applies to Single-AZ DBs
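
The size-flexibility point can be illustrated with AWS's normalization units, where each step up in size doubles the units (large = 4, xlarge = 8, and so on). This is a minimal sketch, assuming all instances are in the same family and region and that the engine supports size flexibility. `ri_coverage` is a hypothetical helper, not an AWS API.

```python
# Normalization units per instance size (each size up doubles the units).
NORMALIZATION_UNITS = {
    "micro": 0.5,
    "small": 1,
    "medium": 2,
    "large": 4,
    "xlarge": 8,
    "2xlarge": 16,
    "4xlarge": 32,
}


def ri_coverage(reserved_sizes, running_sizes):
    """Fraction of running usage covered by the reserved sizes (0.0 to 1.0)."""
    reserved = sum(NORMALIZATION_UNITS[size] for size in reserved_sizes)
    running = sum(NORMALIZATION_UNITS[size] for size in running_sizes)
    if running == 0:
        return 0.0  # nothing is running, so there is no usage to cover
    return min(1.0, reserved / running)


# The r4.large -> r4.xlarge case from the notes: one large RI covers
# half of an xlarge's usage, and the rest is billed on demand.
print(ri_coverage(["large"], ["xlarge"]))
```

The same arithmetic works in the other direction: one xlarge RI (8 units) fully covers two large instances (4 units each).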

Scalability of RDS

  • Either you increase the instance size for compute purposes
  • Or increase the storage attached to the RDS instance
  • You can increase the storage size but cannot decrease it. A decrease would need defragmentation and similar work, which is not something software can figure out on its own