ebs_cloudwatch_metrics

Posted on 20 November 2019, updated on 10 October 2023.

Amazon EBS is the block storage service designed to be used with Amazon EC2. It supports a broad range of workloads, offers security features for data compliance and virtually unlimited scalability, and is cost-effective.

How can you monitor your EBS volumes efficiently and with which metrics?

Enable Available CloudWatch Metrics

By default, some metrics (such as memory and disk space utilization) are not available in CloudWatch, so we need to enable them.

For Ubuntu EC2, execute the following lines:

More information about the installation on Amazon Linux and Red Hat Enterprise Linux.

Then download, install, and configure the monitoring scripts:

Inside the package, we are interested in the mon-put-instance-data.pl script. It collects memory, swap, and disk space utilization data on the current system and then sends it to CloudWatch. We will mainly use MemoryUtilization and DiskSpaceUtilization.
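To make the DiskSpaceUtilization figure concrete, here is a minimal Python sketch of the same idea: used space divided by total space, expressed as a percentage. It is not the monitoring script's implementation, and the function names are hypothetical; it only relies on the standard library's shutil.disk_usage.

```python
import shutil


def utilization_percent(used: int, total: int) -> float:
    """Return used/total as a percentage (0.0 when total is not positive)."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


def disk_space_utilization(path: str = "/") -> float:
    """Percentage of the filesystem holding `path` that is currently used."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return utilization_percent(usage.used, usage.total)


if __name__ == "__main__":
    print(f"Disk space utilization for /: {disk_space_utilization('/'):.1f}%")
```

The value may differ slightly from what the script reports, because tools do not all account for reserved blocks in the same way.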

Let’s set up a cron job to send our metrics periodically. Add the following line with crontab -e:

Then access your metrics in CloudWatch under the System/Linux namespace.

Monitor your EBS with CloudWatch

There are two distinct EBS volume types:

  • SSD, which delivers a high number of I/O operations per second (IOPS).
  • HDD, which provides excellent data throughput. Individual I/O size is capped at 1024 KiB.

AWS’s stated limits are based on an I/O block size of 16 KiB for SSD volumes and 1024 KiB for HDD volumes.
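The I/O size matters because throughput is the product of IOPS and I/O size. The short sketch below shows the arithmetic; the function name and the sample figures (3,000 IOPS, 500 IOPS) are illustrative assumptions, not measurements.

```python
KIB = 1024
MIB = 1024 * KIB


def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS rate and I/O size in KiB."""
    return iops * io_size_kib * KIB / MIB


if __name__ == "__main__":
    # SSD-style workload: many small 16 KiB operations.
    print(throughput_mib_per_s(3000, 16))
    # HDD-style workload: few large 1024 KiB operations.
    print(throughput_mib_per_s(500, 1024))
```

This is why a volume can hit its throughput ceiling long before its IOPS ceiling when the application issues large I/O requests, and the other way round for small ones.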

You should use SSD volumes for high-transaction databases or for applications where data needs to be accessed or changed frequently. They are available as General Purpose (gp2) volumes which provide the ability to burst IOPS performance for a period of time.

Generally, you should prefer HDD volumes for workloads that need to process large sequential I/O very quickly. HDD volumes come in two types:

  • Throughput Optimized (st1) volumes with higher throughput performance.
  • Cold HDD (sc1) volumes with low cost for less frequently accessed data.

Both of these volume types can burst their throughput.

Let’s focus on the key metrics for your EBS volume.

Throughput

EBSReadBytes

Number of bytes read from all EBS volumes attached to the instance in a specified period of time.

EBS -> Per-Instance Metrics

EBSWriteBytes

Number of bytes written to all EBS volumes attached to the instance in a specified period of time.

EBS -> Per-Instance Metrics
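Both byte metrics are totals over a period, so they need to be divided by the period length to get a rate. The sketch below assumes you have already fetched the Sum statistic of EBSReadBytes and EBSWriteBytes for one period; the function name and sample values are hypothetical.

```python
MIB = 1024 * 1024


def average_throughput_mib_s(read_bytes_sum: float,
                             write_bytes_sum: float,
                             period_seconds: int) -> float:
    """Average combined read + write throughput in MiB/s over one period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_bytes_sum + write_bytes_sum) / period_seconds / MIB


if __name__ == "__main__":
    # Illustrative datapoint: 200 MiB read and 100 MiB written in 300 seconds.
    print(average_throughput_mib_s(200 * MIB, 100 * MIB, 300))
```

Comparing this rate with the throughput limit of your instance and volumes tells you how much headroom is left.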

EBSIOBalance%

Percentage of I/O credits remaining in the burst bucket.

EBS -> Per-Instance Metrics

Disk latency

VolumeQueueLength

Number of read and write operation requests waiting to be completed in a specified period of time.

EBS -> Per-Volume Metrics

VolumeTotalReadTime

Total number of seconds spent by all read operations that completed in a specified period of time.

EBS -> Per-Volume Metrics

VolumeTotalWriteTime

Total number of seconds spent by all write operations that completed in a specified period of time.

EBS -> Per-Volume Metrics
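VolumeTotalReadTime and VolumeTotalWriteTime are totals in seconds, so on their own they do not give a latency per operation. A common approach is to divide them by the number of completed operations in the same period. The sketch below assumes you also collect the VolumeReadOps or VolumeWriteOps metric, which this article does not list; the function name and sample values are hypothetical.

```python
from typing import Optional


def average_latency_ms(total_time_seconds: float, ops: float) -> Optional[float]:
    """Average latency in ms per operation, or None when no operation completed."""
    if ops <= 0:
        return None  # no completed operation, so latency is undefined
    return total_time_seconds * 1000.0 / ops


if __name__ == "__main__":
    # Illustrative datapoint: 1.5 s of total read time across 3,000 read operations.
    print(average_latency_ms(1.5, 3000))
```

The same function works for writes by passing VolumeTotalWriteTime and the matching write operation count.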

Idle Time

VolumeIdleTime

Total number of seconds in a specified period of time during which the volume received no read or write operations.

EBS -> Per-Volume Metrics
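Idle time is easier to read as a share of the period. A minimal sketch, assuming you have the Sum of VolumeIdleTime for one period (the function name and sample values are hypothetical):

```python
def idle_percent(idle_seconds: float, period_seconds: int) -> float:
    """Share of the period (0 to 100) during which the volume was idle."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    percent = 100.0 * idle_seconds / period_seconds
    # Clamp to guard against small reporting overshoots.
    return max(0.0, min(100.0, percent))


if __name__ == "__main__":
    # Illustrative datapoint: idle for 240 seconds of a 300-second period.
    idle = idle_percent(240, 300)
    print(f"idle: {idle:.0f}%, busy: {100 - idle:.0f}%")
```

A volume that is busy most of the time while its queue keeps growing is a candidate for a faster volume type; one that is idle most of the time may be oversized.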

Then correlate your volumes’ queue length with their IOPS and read/write throughput to find the proper balance for your workload.

You can also set up alerts on the BurstBalance metric, which shows the percentage of burst credits still available for your volume. Volumes draw on these credits when they burst for either IOPS (up to 3,000 IOPS for gp2) or throughput (up to 500 MiB/s for st1 and 250 MiB/s for sc1).
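To get a feel for what a given BurstBalance value means, the sketch below estimates how long a gp2 volume could sustain its 3,000 IOPS burst. It relies on figures from the AWS documentation that are not stated in this article: a baseline of 3 IOPS per GiB (minimum 100) and a bucket of 5.4 million I/O credits. The function names are hypothetical, and the result is a rough estimate rather than a guarantee.

```python
from typing import Optional

GP2_BURST_IOPS = 3000             # burst ceiling mentioned in the article
GP2_BUCKET_CREDITS = 5_400_000    # assumption: gp2 credit bucket size per AWS docs
GP2_IOPS_PER_GIB = 3              # assumption: gp2 baseline per AWS docs
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume of the given size."""
    return max(GP2_MIN_IOPS, min(GP2_IOPS_PER_GIB * size_gib, GP2_MAX_IOPS))


def burst_duration_seconds(size_gib: int,
                           burst_balance_percent: float = 100.0) -> Optional[float]:
    """Seconds of sustained 3,000 IOPS burst left, or None if the volume never bursts."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None  # baseline already meets or exceeds the burst rate
    credits = GP2_BUCKET_CREDITS * burst_balance_percent / 100.0
    # Credits drain at (burst - baseline) per second because the baseline refills the bucket.
    return credits / (GP2_BURST_IOPS - baseline)


if __name__ == "__main__":
    print(burst_duration_seconds(100))        # full bucket, 100 GiB volume
    print(burst_duration_seconds(100, 50.0))  # same volume at 50% BurstBalance
```

Small volumes drain their bucket quickly under sustained load, which is why a falling BurstBalance is worth alerting on.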

You can set up these alerts directly in CloudWatch (under Alarms) or with your own monitoring system (for example, Alertmanager with Prometheus and Grafana).
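As an illustration of the CloudWatch option, the sketch below builds the parameters for an alarm that fires when BurstBalance stays low. The alarm name, threshold, period, and volume ID are hypothetical choices; the keys match the parameters of the CloudWatch PutMetricAlarm API, so the dictionary could be passed to boto3's put_metric_alarm if boto3 is installed and credentials are configured.

```python
def burst_balance_alarm_params(volume_id: str,
                               threshold_percent: float = 20.0) -> dict:
    """Build PutMetricAlarm parameters for a low BurstBalance alarm on one volume."""
    return {
        "AlarmName": f"ebs-burst-balance-low-{volume_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "Statistic": "Average",
        "Period": 300,               # seconds per evaluation window (assumed)
        "EvaluationPeriods": 3,      # alarm after 3 consecutive low windows (assumed)
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
    }


if __name__ == "__main__":
    params = burst_balance_alarm_params("vol-0123456789abcdef0")  # placeholder ID
    print(params)
    # With boto3 available, the alarm could be created like this:
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

In practice you would also attach an action (for example an SNS topic) so that someone is notified when the alarm changes state.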

Monitoring your volumes may help you reconsider your choice of EBS volume types for your EC2 instances, especially if it saves you money. Check out what is best for your workflow here.