Monitor your EBS using CloudWatch Metrics

Amazon Elastic Block Store (EBS) is a high-performance, easy-to-use block storage service designed for use with Amazon Elastic Compute Cloud (EC2). It supports a broad range of workloads (relational and non-relational databases, containerized applications, and big data analytics) and offers high availability, crash-consistent snapshots, security features for data compliance, virtually unlimited scalability, and cost-effectiveness.

How can you monitor your EBS volumes efficiently and with which metrics?

Enable Available CloudWatch Metrics  

By default, CloudWatch does not collect some metrics, such as memory and disk space utilization, so we need to enable them.

For Ubuntu EC2, execute the following lines:

More information about the installation on Amazon Linux and Red Hat Enterprise Linux is available here.

Then download, install and configure the monitoring scripts:

Inside the package, we’re interested in the mon-put-instance-data.pl script. It collects memory, swap and disk space utilization data on the current system and sends it to CloudWatch. We’ll mainly use MemoryUtilization and DiskSpaceUtilization.
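To make these two metrics more concrete, here is a minimal Python sketch of how such utilization percentages can be derived. It is not the script's actual implementation: the function names are hypothetical, and using the MemAvailable field of /proc/meminfo as "free" memory is an assumption made for illustration.

```python
import shutil


def memory_utilization(meminfo_text: str) -> float:
    """Return used memory as a percentage, from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            # The first token after the colon is the value (in kB for memory fields).
            fields[name.strip()] = int(rest.split()[0])
    total = fields["MemTotal"]
    available = fields["MemAvailable"]  # assumption: treat this as reclaimable memory
    return 100.0 * (total - available) / total


def disk_space_utilization(path: str = "/") -> float:
    """Return used disk space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Hypothetical sample values, not taken from a real host.
sample = "MemTotal:  8000000 kB\nMemFree: 1000000 kB\nMemAvailable: 2000000 kB\n"
print(memory_utilization(sample))  # 75.0
```

On a real Linux host you would pass the contents of /proc/meminfo instead of the sample string.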

Let’s set up a cron job to send our metrics periodically. Add the following line to your crontab with crontab -e:

Then access your metrics in CloudWatch under the System/Linux namespace.

Monitor your EBS with CloudWatch

There are two distinct families of EBS volume types:

  • SSD-backed volumes, which deliver high levels of I/O operations per second (IOPS).
  • HDD-backed volumes, which provide excellent data throughput. Individual I/O size is capped at 1024 KiB.

AWS’s stated limits are based on an I/O block size of 16 KiB for SSD volumes and 1024 KiB for HDD volumes.
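The I/O size matters because throughput is simply IOPS multiplied by the size of each operation. A small worked example, with hypothetical function names and the figures above as inputs:

```python
KIB = 1024
MIB = 1024 * KIB


def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput reached when every operation has the given I/O size."""
    return iops * io_size_kib * KIB / MIB


def iops_for_throughput(throughput_mib_s: float, io_size_kib: float) -> float:
    """Operations per second needed to reach a throughput at a given I/O size."""
    return throughput_mib_s * MIB / (io_size_kib * KIB)


# 3,000 IOPS at the 16 KiB SSD block size:
print(throughput_mib_per_s(3000, 16))   # 46.875
# 500 MiB/s at the 1024 KiB HDD block size:
print(iops_for_throughput(500, 1024))   # 500.0
```

This is why a volume can hit its throughput limit long before its IOPS limit when the workload issues large operations, and the reverse with small ones.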

You should use SSD volumes for high-transaction databases or for applications where data needs to be accessed or changed frequently. General Purpose (gp2) SSD volumes can also burst their IOPS performance for a period of time.

HDD volumes are generally a better fit for large, sequential I/O where throughput matters more than IOPS. HDD volumes come in two types:

  • Throughput Optimized HDD (st1) volumes, with higher throughput performance.
  • Cold HDD (sc1) volumes, with a lower cost for less frequently accessed data.

Both of these volume types can burst throughput performance.

Let’s focus on the key metrics for your EBS volume

 

Throughput

EBSReadBytes

Bytes read from all EBS volumes attached to the instance in a specified period of time.

EC2 -> Per-Instance Metrics

EBSWriteBytes

Bytes written to all EBS volumes attached to the instance in a specified period of time.

EC2 -> Per-Instance Metrics

EBSIOBalance%

Percentage of I/O credits remaining in the instance’s burst bucket.

EC2 -> Per-Instance Metrics

 

Disk latency

VolumeQueueLength

Number of read and write operation requests waiting to be completed in a specified period of time.

EBS -> Per-Volume Metrics

VolumeTotalReadTime

Total number of seconds spent by all read operations that completed in a specified period of time.

EBS -> Per-Volume Metrics

VolumeTotalWriteTime

Total number of seconds spent by all write operations that completed in a specified period of time.

EBS -> Per-Volume Metrics
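The total read and write times are easier to interpret once they are turned into an average latency per operation. The sketch below assumes you also fetch the operation counts for the same period (the VolumeReadOps and VolumeWriteOps metrics, which are not covered above); the function name and sample values are hypothetical.

```python
def average_latency_ms(total_time_seconds: float, operations: int) -> float:
    """Average latency per operation, in milliseconds, over one period.

    total_time_seconds: sum of VolumeTotalReadTime (or VolumeTotalWriteTime).
    operations: sum of the matching operation count for the same period.
    """
    if operations == 0:
        return 0.0  # no completed operations, so there is no latency to report
    return total_time_seconds * 1000.0 / operations


# Hypothetical 5-minute period: 1.5 s of cumulated read time over 3,000 reads.
print(average_latency_ms(1.5, 3000))  # 0.5
```

A rising average latency together with a growing queue length usually suggests the volume is being asked for more than it can deliver.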

 

Idle Time

VolumeIdleTime

Total number of seconds in a specified period of time during which the volume received no read or write operations.

EBS -> Per-Volume Metrics
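Since the idle time is reported in seconds, it is often more readable as a share of the period. A minimal sketch, with a hypothetical function name and sample values:

```python
def idle_percentage(idle_seconds: float, period_seconds: float) -> float:
    """Share of the period during which the volume saw no I/O, in percent."""
    return 100.0 * idle_seconds / period_seconds


# Hypothetical data point: 240 idle seconds in a 300-second period.
print(idle_percentage(240, 300))  # 80.0
```

A volume that stays close to 100% idle may be a candidate for a cheaper volume type.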

 

Then correlate your volumes’ queue length with their IOPS and read/write throughput to find the proper balance for your workload.

You can also set up alerts on the BurstBalance metric, which reports the percentage of burst credits remaining for your volume. These credits let a volume burst either its IOPS (up to 3,000 IOPS for gp2) or its throughput (up to 500 MiB/s for st1 and 250 MiB/s for sc1).
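To get a feel for what a given balance means in practice, you can estimate how long a gp2 volume can keep bursting. The constants below (a 5.4 million credit bucket, a baseline of 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000) come from AWS’s gp2 documentation as I understand it, not from the figures above, so treat them as assumptions and check them against the current documentation.

```python
GP2_CREDIT_BUCKET = 5_400_000  # assumed size of a full gp2 I/O credit bucket
GP2_BURST_IOPS = 3_000         # burst ceiling


def gp2_baseline_iops(size_gib: int) -> int:
    """Assumed gp2 baseline: 3 IOPS per GiB, between 100 and 16,000."""
    return min(max(100, 3 * size_gib), 16_000)


def gp2_burst_seconds(size_gib: int, burst_balance_percent: float = 100.0) -> float:
    """Seconds a volume can sustain 3,000 IOPS from the given BurstBalance."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return float("inf")  # baseline already meets the burst ceiling
    credits = GP2_CREDIT_BUCKET * burst_balance_percent / 100.0
    # Credits drain at the burst rate while refilling at the baseline rate.
    return credits / (GP2_BURST_IOPS - baseline)


print(gp2_burst_seconds(100))      # 2000.0
print(gp2_burst_seconds(100, 50))  # 1000.0
```

Under these assumptions, a 100 GiB volume with a full balance can burst for roughly 33 minutes, which helps when picking an alert threshold.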

You can set up these alerts directly in CloudWatch (under Alarms) or with your own monitoring system (for example, Alertmanager with Prometheus and Grafana).
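If you go the CloudWatch route, such an alarm can be described with a handful of parameters. The sketch below only builds the parameter dictionary so it runs without AWS credentials; the function name, alarm name, 20% threshold, evaluation settings and volume ID are all hypothetical choices to adapt to your setup.

```python
def burst_balance_alarm_params(volume_id: str, threshold_percent: float = 20.0) -> dict:
    """Build keyword arguments for a CloudWatch alarm on low BurstBalance."""
    return {
        "AlarmName": f"ebs-burst-balance-low-{volume_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per data point
        "EvaluationPeriods": 3,   # assumed: alarm after 3 consecutive low periods
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
    }


params = burst_balance_alarm_params("vol-0123456789abcdef0")  # placeholder volume ID
print(params["MetricName"], params["Threshold"])
# With boto3 installed and credentials configured, this could be sent with:
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

In a real alarm you would also attach a notification target, for example an SNS topic, so that someone is actually told when the balance runs low.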

 

Monitoring your volumes may lead you to reconsider which EBS volume types you use for your EC2 instances, especially if it saves you money. Check out what is best for your workload here.

Matthieu Lanvert

Matthieu is a Site Reliability Engineer at Padok
