Monitor your RDS instances using CloudWatch Metrics

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the AWS Cloud. It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.

How can you monitor your RDS instances, which options should you enable, and which metrics should you focus on for your use case?

RDS Instance Type - General purpose and Memory optimized

RDS instance types come with varying combinations of CPU, memory, storage, and networking capacity. Finding the best instance type can be tricky if you don’t know what each one is designed for.

  • General purpose instances provide a balance of compute, memory, and networking resources, and can be used for a variety of workloads. They are a good choice for many database workloads, including small and mid-size databases.
 

General Purpose

| Type | Description |
| --- | --- |
| t2 (tiny or turbo) | CPU burstable performance; high-frequency Intel Xeon processors |
| m4 (medium) | 2.3 GHz Intel Xeon E5-2686 v4 (Broadwell) processors or 2.4 GHz Intel Xeon E5-2676 v3 (Haswell) processors; EBS-optimized by default at no additional cost; support for Enhanced Networking |

 

You might also look at the latest generation of General Purpose instances:

General Purpose (Latest)

| Type | Description |
| --- | --- |
| t3 (successor to t2) | CPU burstable performance; high-frequency Intel Xeon processors; EBS-optimized; support for Enhanced Networking; powered by the AWS Nitro System; Unlimited mode to ensure performance during peak periods |
| m5 (successor to m4) | 2.5 GHz Intel Xeon Platinum 8175 processors; EBS-optimized by default at no additional cost; support for Enhanced Networking (up to 25 Gbps network bandwidth) |

 

  • Memory optimized instances are designed to deliver fast performance for workloads that process large data sets in memory. They are a good choice for memory-intensive database workloads at a larger scale.
 

Memory Optimized

| Type | Description |
| --- | --- |
| r4 (RAM) | High-frequency Intel Xeon E5-2686 v4 (Broadwell) processors; optimized for memory-intensive applications; EBS-optimized; support for Enhanced Networking |
| x1e (xtreme) | High-frequency Intel Xeon E7-8880 v3 (Haswell) processors; EBS-optimized; support for Enhanced Networking; optimized for high-performance databases, memory-intensive databases, and other memory-intensive applications |

 

You might also look at the latest generation of Memory Optimized instances:

Memory Optimized (Latest)

| Type | Description |
| --- | --- |
| r5 | High-frequency Intel Xeon Platinum 8000 series processors with a sustained all-core Turbo CPU clock speed of up to 3.1 GHz; EBS-optimized; support for Enhanced Networking |
| z1d | High-frequency Intel Xeon processors; EBS-optimized; support for Enhanced Networking; optimized for high-performance databases, memory-intensive databases, and other memory-intensive applications |

AWS RDS Instance Features

For each of your AWS RDS instances, you might want to track metrics around the following topics:

  • CPU / RAM consumption
  • Network traffic
  • Disk space consumption
  • Database connections
  • IOPS metrics

Let’s dig into each of them.

 

CPU / RAM consumption

For your RDS instance:

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| CPUUtilization | Percentage of CPU utilization | Percent | AWS/RDS |
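The CPUUtilization metric above can be retrieved through the CloudWatch API. Here is a minimal sketch, in Python, of how the request parameters might be assembled. The helper name `cpu_utilization_query` and the instance identifier `my-database` are hypothetical; the parameter names follow the CloudWatch GetMetricStatistics request, and the block itself makes no AWS call.

```python
from datetime import datetime, timedelta, timezone


def cpu_utilization_query(db_instance_id, minutes=60, period_seconds=300):
    """Build keyword arguments for a CloudWatch GetMetricStatistics request."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        # RDS metrics are filtered per instance with this dimension.
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
        "Unit": "Percent",
    }


params = cpu_utilization_query("my-database")  # hypothetical identifier
# Assuming boto3 is installed and credentials are configured, the call would be:
#   import boto3
#   boto3.client("cloudwatch").get_metric_statistics(**params)
print(params["MetricName"], params["Period"])
```

Keeping the parameters in a plain dictionary makes it easy to reuse the same query for other metrics by changing only `MetricName` and `Unit`.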

 

For your RDS burstable performance instances:

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| CPUCreditUsage (T2 instances) | Number of CPU credits spent by the instance for CPU utilization | Credits (vCPU-minutes) | AWS/RDS |
| CPUCreditBalance (T2 instances) | Number of earned CPU credits that an instance has accrued since it was launched or started | Credits (vCPU-minutes) | AWS/RDS |
| BurstBalance (gp2) | Percentage of General Purpose SSD (gp2) burst-bucket I/O credits available | Percent | AWS/RDS |
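Since CPU credits are expressed in vCPU-minutes, the balance and usage figures can be combined to estimate how long a burstable instance can keep bursting. The sketch below is a simple linear projection; the function name and the sample numbers are illustrative assumptions, not values taken from a real instance.

```python
def minutes_until_credits_exhausted(credit_balance, credits_spent_per_hour,
                                    credits_earned_per_hour):
    """Estimate the minutes left before CPUCreditBalance reaches zero.

    Returns None when the instance earns at least as many credits as it spends.
    """
    net_burn_per_hour = credits_spent_per_hour - credits_earned_per_hour
    if net_burn_per_hour <= 0:
        return None  # balance is stable or growing
    return credit_balance / net_burn_per_hour * 60


# Illustrative numbers: 144 credits left, spending 36/hour, earning 12/hour.
print(minutes_until_credits_exhausted(144, 36, 12))  # 360.0
```

A projection like this assumes the current usage rate stays constant, so treat the result as an order of magnitude rather than a precise deadline.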

 

Network traffic CloudWatch Metrics

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| NetworkReceiveThroughput | Number of bytes received on all network interfaces by the instance | Bytes/Second | AWS/RDS |
| NetworkTransmitThroughput | Number of bytes sent out on all network interfaces by the instance | Bytes/Second | AWS/RDS |

 

Disk Space Consumption CloudWatch Metrics

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| FreeStorageSpace | The amount of available storage space | Bytes | AWS/RDS |
| WriteIOPS | The average number of disk write I/O operations per second | Count/Second | AWS/RDS |
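FreeStorageSpace is reported in bytes, so it is often more readable as a percentage of the allocated storage. The sketch below converts the value and assembles parameters for a CloudWatch alarm that would fire when free space drops below a chosen percentage. The helper names, the 10% default, and the evaluation settings are assumptions for illustration; the parameter names follow the CloudWatch PutMetricAlarm request, and no AWS call is made here.

```python
GIB = 1024 ** 3  # RDS allocated storage is expressed in GiB


def free_storage_percent(free_storage_bytes, allocated_gib):
    """Convert a FreeStorageSpace datapoint into a percentage of allocated storage."""
    return 100.0 * free_storage_bytes / (allocated_gib * GIB)


def low_storage_alarm(db_instance_id, allocated_gib, min_free_percent=10):
    """Build keyword arguments for a PutMetricAlarm request on FreeStorageSpace."""
    threshold_bytes = allocated_gib * GIB * min_free_percent / 100
    return {
        "AlarmName": f"{db_instance_id}-low-free-storage",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": threshold_bytes,
        "ComparisonOperator": "LessThanThreshold",
    }


print(free_storage_percent(25 * GIB, 100))  # 25.0
alarm = low_storage_alarm("my-database", allocated_gib=100)
# Assuming boto3 is available: boto3.client("cloudwatch").put_metric_alarm(**alarm)
```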

 

Database connections CloudWatch Metrics

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| DatabaseConnections | The number of database connections in use | Count | AWS/RDS |
| FreeableMemory | The amount of available random access memory | Bytes | AWS/RDS |
| NetworkReceiveThroughput | The incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication | Bytes/Second | AWS/RDS |
| NetworkTransmitThroughput | The outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication | Bytes/Second | AWS/RDS |

 

IOPS CloudWatch Metrics

| Name | Description | Unit | CloudWatch Namespace |
| --- | --- | --- | --- |
| ReadIOPS | The average number of disk read I/O operations per second | Count/Second | AWS/RDS |
| WriteIOPS | The average number of disk write I/O operations per second | Count/Second | AWS/RDS |
| DiskQueueDepth | The number of outstanding I/Os (read/write requests) waiting to access the disk | Count | AWS/RDS |
| ReadLatency | The average amount of time taken per disk read I/O operation | Seconds | AWS/RDS |
| WriteLatency | The average amount of time taken per disk write I/O operation | Seconds | AWS/RDS |
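The read and write metrics are easier to interpret when combined. As a rough sketch, the function below sums ReadIOPS and WriteIOPS and computes an IOPS-weighted average latency in milliseconds. The function name and the sample values are hypothetical, and the weighting is just one reasonable way to summarize the four datapoints.

```python
def io_summary(read_iops, write_iops, read_latency_s, write_latency_s):
    """Combine read/write IOPS and latency datapoints into a single summary."""
    total_iops = read_iops + write_iops
    if total_iops == 0:
        return {"total_iops": 0.0, "avg_latency_ms": 0.0}
    # Weight each latency by the number of operations it applies to.
    weighted_latency_s = (
        read_iops * read_latency_s + write_iops * write_latency_s
    ) / total_iops
    return {"total_iops": total_iops, "avg_latency_ms": weighted_latency_s * 1000}


# Illustrative datapoints: 300 reads/s at 2 ms and 100 writes/s at 6 ms.
summary = io_summary(300, 100, 0.002, 0.006)
print(summary["total_iops"], round(summary["avg_latency_ms"], 2))
```

A rising average latency together with a growing DiskQueueDepth usually suggests that the storage, rather than the CPU, is the bottleneck.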

Going deeper with Enhanced Monitoring: monitor your processes

Launched in 2016, RDS Enhanced Monitoring is a feature that enables customers to collect metrics from their database server’s operating system.

By default, CloudWatch gathers CPU utilization metrics from the hypervisor, whereas Enhanced Monitoring gathers its metrics from an agent running directly on the instance. Enhanced Monitoring is particularly useful when you want to see how different processes or threads on your DB instance use the CPU.

You can enable this option directly on your DB instance when:

  1. You create a DB instance → in the Configure Advanced Settings page (with the AWS console) or through parameters (with Terraform).
  2. You create a Read Replica → in the Configure Advanced Settings page (with the AWS console) or through a parameter in your Terraform code.
  3. You modify a DB instance → in the Modify DB Instance page or through your Terraform code (which will update your DB instance).

To set up Enhanced Monitoring, you’ll need to:

  1. Set the Monitoring Role property to the IAM role that you created to permit Amazon RDS to communicate with CloudWatch for you (or choose Default to have RDS create this role for you).
  2. Set the Granularity property to the chosen interval (1, 5, 10, 15, 30, or 60 seconds).
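The two settings above map onto the `MonitoringRoleArn` and `MonitoringInterval` parameters of the RDS ModifyDBInstance API. The sketch below validates the interval and assembles the request parameters; the helper name, the instance identifier, and the role ARN are hypothetical placeholders, and the block does not call AWS.

```python
VALID_INTERVALS = {1, 5, 10, 15, 30, 60}  # granularity values, in seconds


def enhanced_monitoring_settings(db_instance_id, role_arn, interval_seconds=60):
    """Build keyword arguments that would enable Enhanced Monitoring on an instance."""
    if interval_seconds not in VALID_INTERVALS:
        raise ValueError(
            f"interval must be one of {sorted(VALID_INTERVALS)}, got {interval_seconds}"
        )
    return {
        "DBInstanceIdentifier": db_instance_id,
        "MonitoringInterval": interval_seconds,
        "MonitoringRoleArn": role_arn,
        "ApplyImmediately": True,
    }


settings = enhanced_monitoring_settings(
    "my-database",  # hypothetical instance identifier
    "arn:aws:iam::123456789012:role/rds-monitoring-role",  # hypothetical role ARN
    interval_seconds=30,
)
# Assuming boto3 is available: boto3.client("rds").modify_db_instance(**settings)
print(settings["MonitoringInterval"])
```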

Once Enhanced Monitoring is enabled, the process list of your RDS instance shows the following:

  • RDS child processes - A summary of RDS processes that support the DB instance. 
  • RDS processes - A summary of the resources used by the RDS management agent, diagnostics monitoring processes and other AWS processes that are required to support RDS DB instances.
  • OS processes - A summary of the kernel and system processes (which have a low impact on performance). 

These metrics are also available in CloudWatch: in the Logs section of the navigation pane, select RDSOSMetrics from the list of log groups, then select the desired log stream. The data is organized into the following groups:

  • general - Global information about the DB instance.
  • cpuUtilization - CPU usage of the DB instance.
  • fileSys - File system usage.
  • loadAverageMinute - Number of processes requesting CPU time.
  • memory - Memory usage.
  • network - Network interfaces and bytes received/transmitted.
  • processList - CPU and memory used by each process.
  • swap - Swap memory usage.
  • tasks - Task information.
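Each event in the RDSOSMetrics log stream is a JSON document containing the groups listed above. As a minimal sketch, the code below parses one such event and ranks the entries of `processList` by CPU usage. The sample event is invented and heavily trimmed for illustration, and the field names `name` and `cpuUsedPc` are assumed from typical Enhanced Monitoring output; check them against your own log events.

```python
import json

# Invented, trimmed sample event; real events carry many more fields.
SAMPLE_EVENT = """
{
  "instanceID": "my-database",
  "cpuUtilization": {"total": 12.5, "user": 8.0, "system": 3.5, "wait": 1.0},
  "loadAverageMinute": {"one": 0.4, "five": 0.3, "fifteen": 0.2},
  "processList": [
    {"name": "OS processes", "cpuUsedPc": 1.1, "memoryUsedPc": 2.5},
    {"name": "mysqld", "cpuUsedPc": 9.1, "memoryUsedPc": 41.2},
    {"name": "RDS processes", "cpuUsedPc": 2.3, "memoryUsedPc": 6.0}
  ]
}
"""


def top_processes(raw_event, limit=2):
    """Return (name, CPU %) pairs for the most CPU-hungry processes in one event."""
    record = json.loads(raw_event)
    ranked = sorted(
        record.get("processList", []),
        key=lambda process: process["cpuUsedPc"],
        reverse=True,
    )
    return [(process["name"], process["cpuUsedPc"]) for process in ranked[:limit]]


print(top_processes(SAMPLE_EVENT))  # [('mysqld', 9.1), ('RDS processes', 2.3)]
```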

 

Find more information about EC2 CloudWatch Metrics and EBS CloudWatch metrics in those two blog posts.

 

Now that you have a better understanding of RDS instance types and of which metrics are important to monitor, you can decide what is best for your workflow.

Matthieu Lanvert


Matthieu is a Site Reliability Engineer (SRE) at Padok. He specializes in DevOps technologies such as AWS, Kubernetes, Docker, Gitlab, and CloudWatch.
