Tommyasai

November 1, 2023

Key Metrics to Watch in Aurora IO

About Amazon Aurora IOPS

I looked into the IOPS metrics for Amazon Aurora because I found it hard to tell what ReadIOPS/WriteIOPS and VolumeReadIOPS/VolumeWriteIOPS are each for.

TL;DR

What is IOPS?

IOPS (I/O operations per second) measures how actively a database exchanges data with storage. For reads, one I/O is counted each time a database page is read from storage. In Aurora, a page is 16 KB for MySQL and 8 KB for PostgreSQL, as the AWS blog post puts it:

Aurora supports a 16 KB page size for Amazon Aurora MySQL and an 8 KB page size for Amazon Aurora PostgreSQL. The page size is the smallest I/O operation that the database engine performs.
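
To make the page-based counting concrete, here is a minimal sketch that estimates how many read I/Os it takes to pull a given number of bytes from storage when nothing is cached. The function name and the engine keys are my own for illustration, and the result is only a rough lower bound since real access patterns rarely read perfectly packed pages.

```python
# Page sizes as stated above: 16 KB for Aurora MySQL, 8 KB for Aurora PostgreSQL.
PAGE_SIZE_BYTES = {
    "aurora-mysql": 16 * 1024,
    "aurora-postgresql": 8 * 1024,
}


def estimate_read_ios(bytes_read: int, engine: str) -> int:
    """Rough number of page reads needed to fetch `bytes_read` bytes from storage.

    Hypothetical helper: assumes every page is read once and nothing is cached.
    """
    if bytes_read < 0:
        raise ValueError("bytes_read must be non-negative")
    page_size = PAGE_SIZE_BYTES[engine]
    # Integer ceiling division: a partially used page still costs one read.
    return -(-bytes_read // page_size)


one_mib = 1024 * 1024
print(estimate_read_ios(one_mib, "aurora-mysql"))       # 64
print(estimate_read_ios(one_mib, "aurora-postgresql"))  # 128
```

The same amount of data costs twice as many read I/Os on PostgreSQL as on MySQL in this simplified model, purely because of the smaller page.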

For more on planning I/O in Amazon Aurora, see the AWS blog post "Planning I/O in Amazon Aurora".

VolumeRead/WriteIOPS Directly Affect Costs

VolumeReadIOPS/VolumeWriteIOPS are the metrics that track the I/O operations Aurora bills for. They are only available at the cluster level and are reported at five-minute intervals. In the console they may also be listed as [billed] VolumeRead/WriteIOPS.

In services with heavy read/write activity, I/O can account for a significant portion of the Aurora bill, so it is worth setting alarms on these metrics to catch unexpected cost increases.
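
As a rough sketch of such an alarm, the snippet below builds the arguments for CloudWatch's put_metric_alarm call with boto3. The alarm name, threshold, and number of evaluation periods are placeholders I made up. Note that in CloudWatch the metric name is spelled VolumeReadIOPs (lowercase "s") as far as I know, so check the exact spelling in your own console.

```python
def build_volume_read_alarm(cluster_id: str, threshold: float) -> dict:
    """Build put_metric_alarm arguments for a cluster-level read I/O alarm.

    Hypothetical helper: alarm name, threshold, and periods are illustrative.
    """
    return {
        "AlarmName": f"{cluster_id}-volume-read-iops-high",
        "Namespace": "AWS/RDS",
        # Assumed CloudWatch spelling of the metric; verify in your account.
        "MetricName": "VolumeReadIOPs",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Sum",
        "Period": 300,  # the metric is reported every five minutes
        "EvaluationPeriods": 3,  # assumption: alarm after 15 minutes above threshold
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


def create_alarm(params: dict) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("cloudwatch").put_metric_alarm(**params)


params = build_volume_read_alarm("my-aurora-cluster", threshold=5_000_000)
# create_alarm(params)  # uncomment to actually create the alarm
```

Keeping the builder separate from the API call makes it easy to review or unit-test the alarm definition before anything is created.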

At the time of writing, I/O pricing for Aurora MySQL in the Tokyo region is as follows:

$0.24 per 1 million requests
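
To turn that rate into a rough bill, here is a small sketch. It assumes each VolumeReadIOPS/VolumeWriteIOPS datapoint is the number of billed I/Os in its five-minute interval, so summing the datapoints gives the total request count. The function name and the sample numbers are made up for illustration.

```python
from typing import Iterable

# Tokyo-region Aurora MySQL rate quoted above; check current pricing before relying on it.
PRICE_PER_MILLION_IO_USD = 0.24


def estimate_io_cost(
    read_io_counts: Iterable[float],
    write_io_counts: Iterable[float],
    price_per_million: float = PRICE_PER_MILLION_IO_USD,
) -> float:
    """Estimate I/O cost in USD from per-interval billed I/O counts.

    Hypothetical helper: assumes each datapoint is a count of billed I/Os
    for one five-minute interval.
    """
    total_ios = sum(read_io_counts) + sum(write_io_counts)
    return total_ios / 1_000_000 * price_per_million


# Made-up example: 12 intervals (one hour) of steady traffic.
reads = [2_000_000] * 12
writes = [500_000] * 12
print(f"${estimate_io_cost(reads, writes):.2f} for the hour")
```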

If I/O costs are high, the Aurora I/O-Optimized option announced in May 2023 may reduce your bill.
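
One quick way to judge whether the option is worth a closer look is to check how much of the Aurora bill is I/O. As I understand AWS's announcement, I/O-Optimized is aimed at workloads where I/O exceeds roughly 25% of total Aurora spend. Treat that threshold as my assumption and confirm it against current AWS guidance. The function names below are hypothetical.

```python
def io_cost_share(io_cost: float, total_aurora_cost: float) -> float:
    """Fraction of the Aurora bill that comes from I/O charges."""
    if total_aurora_cost <= 0:
        raise ValueError("total_aurora_cost must be positive")
    return io_cost / total_aurora_cost


def worth_evaluating_io_optimized(
    io_cost: float,
    total_aurora_cost: float,
    threshold: float = 0.25,  # assumed rule of thumb, not an official cutoff
) -> bool:
    """True when I/O's share of the bill exceeds the assumed threshold."""
    return io_cost_share(io_cost, total_aurora_cost) > threshold


# Made-up monthly figures in USD.
print(worth_evaluating_io_optimized(io_cost=300.0, total_aurora_cost=1000.0))  # True
print(worth_evaluating_io_optimized(io_cost=100.0, total_aurora_cost=1000.0))  # False
```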

Measuring Inefficient I/O with BufferCacheHitRatio

BufferCacheHitRatio measures the percentage of read requests served from the instance's in-memory buffer cache. A low ratio means more reads go to storage, which likely raises VolumeReadIOPS/ReadIOPS, so this metric is useful for spotting inefficient I/O. A low ratio may indicate that the instance does not have enough RAM to cache the working data, in which case scaling up the instance is worth considering.
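
Conceptually, the ratio works like the sketch below. This is only the textbook definition of a cache hit ratio under my own naming. I have not confirmed exactly how Aurora derives the published metric internally.

```python
def buffer_cache_hit_ratio(cache_hits: int, storage_reads: int) -> float:
    """Percentage of read requests served from memory.

    Hypothetical helper: `cache_hits` are reads served from the buffer cache,
    `storage_reads` are reads that had to go to storage.
    """
    total = cache_hits + storage_reads
    if total == 0:
        # No reads at all: nothing missed the cache, so report 100%.
        return 100.0
    return 100.0 * cache_hits / total


# Made-up example: 10 out of 1,000 reads had to go to storage.
print(buffer_cache_hit_ratio(cache_hits=990, storage_reads=10))  # 99.0
```

Every read that misses the cache becomes a storage read, which is why a falling ratio tends to show up as rising ReadIOPS.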

Per-Instance ReadIOPS

ReadIOPS/WriteIOPS provide more granular IOPS data: they are available per instance, per cluster, and per role (Writer/Reader), at one-minute intervals. Unlike standard RDS, Aurora storage scales automatically with I/O, so an increase in IOPS by itself does not degrade performance. These metrics are most useful for drilling into instance-level I/O after an alert on increased VolumeReadIOPS. For analysis in bytes, ReadThroughput/WriteThroughput are available.
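
For that kind of follow-up, a simple first step is to rank instances by their share of read activity. The sketch below assumes you have already fetched an average ReadIOPS value per instance. The instance names and numbers are made up.

```python
from typing import Dict, List, Tuple


def rank_instances_by_read_iops(
    read_iops: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (instance, share of total ReadIOPS) pairs, busiest first.

    Hypothetical helper: `read_iops` maps instance identifiers to their
    average ReadIOPS over the period being investigated.
    """
    total = sum(read_iops.values())
    if total == 0:
        return [(name, 0.0) for name in read_iops]
    ranked = sorted(read_iops.items(), key=lambda item: item[1], reverse=True)
    return [(name, iops / total) for name, iops in ranked]


sample = {"writer-1": 200.0, "reader-1": 600.0, "reader-2": 200.0}
for name, share in rank_instances_by_read_iops(sample):
    print(f"{name}: {share:.0%} of read I/O")
```

An instance that dominates the ranking is the first place to look for a new query pattern or a cold cache.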

Deciding which metrics to focus on requires a solid understanding of what each one measures. If you have a different view or any advice, please feel free to share!

Yosuke Tommy Asai
© 2024
tommyasai.fly.dev