Optimizing AWS RDS Costs with Python: Gaining Visibility into Unused Resources

In the world of cloud computing, managing costs is just as important as the technical aspects of deploying and maintaining infrastructure. AWS, as one of the leading cloud providers, offers powerful services like Amazon RDS for managing relational databases. However, without proper monitoring and optimization, these services can quickly lead to unexpected bills. In this blog post, I will share a script I wrote for RDS cost-optimization reviews, which can help you gain visibility into unused RDS resources and reduce costs effectively.

The Challenge of Unutilized Resources

Amazon RDS is a managed database service that provides easy scalability and high availability. While this is incredibly valuable, it’s not uncommon for users to provision RDS instances with more resources than they actually need. Over time, these instances can accumulate and become an unnecessary drain on your cloud budget.

One way to tackle this problem is by monitoring your RDS instances for underutilization. You can do this by tracking CPU utilization and storage usage, among other metrics. AWS CloudWatch provides the tools necessary to gather this data. However, manually analyzing CloudWatch metrics for each RDS instance is time-consuming and error-prone, so I decided to automate it for scalability, consistency, and future reuse.

"""Module to calculade RDS utilization"""
import string
from datetime import datetime, timezone, timedelta
import sys
import openpyxl
from openpyxl.styles import PatternFill, Font
import boto3

REGION = sys.argv[1] if len(sys.argv) > 1 else "us-east-1"

EXCEL_ROW_ITER = 2

column_headers = [
    "Instance",
    "Class",
    "Engine",
    "Engine version",
    "MultiAZ",
    "AZ",
    "Storage type",
    "Allocated Storage",
    "cpu < 25",
    "cpu 25-50",
    "cpu 50-75",
    "cpu >= 75",
    "free storage < 25%",
    "free storage 25-50%",
    "free storage 50-75%",
    "free storage >= 75%",
]

red = PatternFill(patternType="solid", fgColor="FC2C03")
orange = PatternFill(patternType="solid", fgColor="E57909")
green = PatternFill(patternType="solid", fgColor="35FC03")
yellow = PatternFill(patternType="solid", fgColor="FCBA03")

wb = openpyxl.Workbook()
ws = wb["Sheet"]


def get_color(usage_type, usage_value):
    """Get color for utilization"""
    if usage_type == "bl_25":
        if 0 <= usage_value <= 25:
            return green
        if 25 <= usage_value <= 50:
            return yellow
        if 50 <= usage_value <= 75:
            return red
        if usage_value >= 75:
            return red
    if usage_type in ["bt_25_49", "bt_50_74"]:
        if 0 <= usage_value <= 25:
            return red
        if 25 <= usage_value <= 50:
            return orange
        if 50 <= usage_value <= 75:
            return yellow
        if usage_value >= 75:
            return green
    if usage_type == "gt_75":
        if 0 <= usage_value <= 25:
            return yellow
        if 25 <= usage_value <= 50:
            return green
        if 50 <= usage_value <= 75:
            return orange
        if usage_value >= 75:
            return red
    return None


def fetch_metrics(db_instance_name, metric_name):
    """Fetch hourly maximum datapoints for a CloudWatch metric over the past 60 days"""
    stats = cw.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName=metric_name,
        Dimensions=[
            {"Name": "DBInstanceIdentifier", "Value": db_instance_name},
        ],
        StartTime=datetime.now(timezone.utc) - timedelta(days=60),
        EndTime=datetime.now(timezone.utc),
        Period=3600,
        Statistics=["Maximum"],
    )
    return stats


def add_row_to_excel(ws, row_data):
    """Write a row to the worksheet, color-coding the CPU bucket columns (I-L)"""
    global EXCEL_ROW_ITER
    for cnt, data in enumerate(row_data):
        cell = ws[f"{string.ascii_uppercase[cnt]}{EXCEL_ROW_ITER}"]
        cell.value = data
        if cnt == 8:
            cell.fill = get_color("bl_25", data)
        elif cnt == 9:
            cell.fill = get_color("bt_25_49", data)
        elif cnt == 10:
            cell.fill = get_color("bt_50_74", data)
        elif cnt == 11:
            cell.fill = get_color("gt_75", data)

    EXCEL_ROW_ITER += 1


# Initialize Boto3 clients
rds = boto3.client("rds", region_name=REGION)
cw = boto3.client("cloudwatch", region_name=REGION)

# Define column headers
for cnt, header in enumerate(column_headers):
    cell_address = f"{string.ascii_uppercase[cnt]}1"
    ws[cell_address] = header
    ws[cell_address].font = Font(bold=True)

# Fetch RDS instances
response = rds.describe_db_instances()
print(f'Found {len(response["DBInstances"])} databases')
for instance_data in response["DBInstances"]:
    db_instance_name = instance_data["DBInstanceIdentifier"]
    db_type = instance_data["DBInstanceClass"]
    db_storage = instance_data["AllocatedStorage"]
    db_engine = instance_data["Engine"]
    db_engine_version = instance_data["EngineVersion"]
    db_multiaz = instance_data["MultiAZ"]
    db_az = instance_data["AvailabilityZone"]
    db_storage_type = instance_data["StorageType"]

    if db_engine == "docdb":
        continue  # Skip docdb instances
    print(f"Pulling information for {db_instance_name}")
    cpu_metrics = fetch_metrics(db_instance_name, "CPUUtilization")

    # Calculate usage percentages
    usage_values = [d["Maximum"] for d in cpu_metrics["Datapoints"]]
    if not usage_values:
        print("\tNo CPU datapoints found, skipping")
        continue
    cpu_usage_percentages = [
        round(
            sum(
                1
                for value in usage_values
                # count samples at exactly 100% in the top bucket as well
                if 25 * i <= value < 25 * (i + 1) or (i == 3 and value >= 100)
            )
            / len(usage_values)
            * 100,
            1,
        )
        for i in range(4)
    ]
    for i, usage in enumerate(cpu_usage_percentages):
        print(f"\tCPU {25 * i}% <= value < {25 * (i + 1)}%: {usage}%")
    storage_usage_percentages = []

    if db_storage_type != "aurora":
        print("\tStorage found")
        storage_metrics = fetch_metrics(db_instance_name, "FreeStorageSpace")
        storage_usage_values = [d["Maximum"] for d in storage_metrics["Datapoints"]]
        if storage_usage_values:
            storage_usage_percentages = [
                round(
                    sum(
                        1
                        for value in storage_usage_values
                        # FreeStorageSpace is reported in bytes; AllocatedStorage is in GiB
                        if 25 * i
                        <= value / (1024 * 1024 * 1024) / db_storage * 100
                        < 25 * (i + 1)
                    )
                    / len(storage_usage_values)
                    * 100,
                    1,
                )
                for i in range(4)
            ]
        for i, usage in enumerate(storage_usage_percentages):
            print(f"\tStorage {25 * i}% <= value < {25 * (i + 1)}%: {usage}%")
    # Add data to Excel worksheet
    row_data = (
        [
            db_instance_name,
            db_type,
            db_engine,
            db_engine_version,
            db_multiaz,
            db_az,
            db_storage_type,
            db_storage,
        ]
        + cpu_usage_percentages
        + storage_usage_percentages
    )
    add_row_to_excel(ws, row_data)

# Save the Excel workbook
wb.save("results.xlsx")

Automating Cost Reduction with Python

To streamline the process of identifying underutilized RDS instances, we can turn to Python, a versatile and widely used programming language. The script above works in four steps.

1. Fetching RDS Instance Data
The script begins by connecting to AWS and fetching data about your RDS instances. It collects crucial information such as instance names, types, storage, and more.
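
The script fetches instances with a single describe_db_instances call, which by default returns at most 100 instances per call. On accounts with larger fleets you may want to use a paginator instead; here is a minimal sketch of that variation (the list_db_instances helper and its name are my own, not part of the original script):

import boto3

def list_db_instances(region="us-east-1"):
    """Yield every RDS instance in the region, following pagination."""
    rds_client = boto3.client("rds", region_name=region)
    paginator = rds_client.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for instance in page["DBInstances"]:
            yield instance

# Example: print the identifier and class of every instance in us-east-2
# for db in list_db_instances("us-east-2"):
#     print(db["DBInstanceIdentifier"], db["DBInstanceClass"])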

2. Gathering Metric Data
The script then queries AWS CloudWatch for CPU utilization and storage usage metrics. These metrics provide insights into how effectively your RDS instances are being utilized.

3. Calculating Utilization Percentages
Using the gathered metrics, the script calculates utilization percentages for both CPU and storage. These percentages are categorized into different usage levels (e.g., low, moderate, high) based on predefined thresholds.

4. Visualizing the Data
The script doesn’t stop at just gathering data; it also creates an Excel spreadsheet with the collected information, using color-coded cells to highlight instances with different utilization levels so you can spot underutilized resources at a glance.

Putting the Script to Use

To make the most of this cost-saving script, follow these steps:

1. Prerequisites

  • Ensure you have Python installed on your system.
  • Install the required Python libraries: boto3 and openpyxl.
pip3 install boto3 openpyxl

2. AWS Configuration

  • Make sure your AWS credentials are properly configured on your machine. Run the get-caller-identity command to confirm you are in the right account with the appropriate permissions.
aws sts get-caller-identity
{
    "UserId": "AIDAV7DHVCA7557LUGTRA",
    "Account": "410386763839",
    "Arn": "arn:aws:iam::410386763839:user/bogdan"
}

3. Run the Script

  • Execute the Python script, providing the desired AWS region as an argument (or use the default ‘us-east-1’).

python3 run.py us-east-2

4. Review the Excel Report

  • The script generates an Excel report named “results.xlsx” that visually highlights underutilized RDS instances. Review this report to identify cost-saving opportunities.
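
If you prefer to filter candidates programmatically instead of scanning the sheet by eye, the report can also be read back with openpyxl. A minimal sketch, assuming the default sheet name "Sheet" used by the script and a 90% threshold picked purely for illustration:

import openpyxl

wb = openpyxl.load_workbook("results.xlsx")
ws = wb["Sheet"]

# Column I (index 8) is the "cpu < 25" bucket written by the script.
for row in ws.iter_rows(min_row=2, values_only=True):
    instance, cpu_below_25 = row[0], row[8]
    if cpu_below_25 is not None and cpu_below_25 >= 90:
        print(f"{instance}: {cpu_below_25}% of sampled hours below 25% CPU")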

5. Take Action

  • Based on the report, consider resizing or terminating underutilized RDS instances to reduce costs.
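
If you decide to downsize an instance, the change itself can also be scripted with boto3. A minimal sketch, using a placeholder instance identifier and target class (verify the change first and prefer scheduling it for a maintenance window rather than applying it immediately):

import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Request a smaller instance class; ApplyImmediately=False defers the change
# to the next maintenance window instead of triggering it right away.
rds.modify_db_instance(
    DBInstanceIdentifier="my-underutilized-db",  # placeholder identifier
    DBInstanceClass="db.t4g.medium",             # placeholder target class
    ApplyImmediately=False,
)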

Interpreting results

The script fetches the maximum CPUUtilization metric over the past 60 days in 1-hour periods and sorts each hourly datapoint into one of four utilization buckets: 0–25, 25–50, 50–75 and 75–100.
It then calculates what percentage of the instance's total sampled hours falls into each bucket.
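
As a toy illustration of that calculation (the hourly maxima below are made up), the same bucketing logic in isolation looks like this:

# Made-up hourly CPU maxima for one instance
samples = [12.0, 18.5, 22.0, 41.0, 78.0, 9.5, 15.0, 95.0]

# Share of samples falling into each 25% bucket
buckets = [
    round(sum(1 for v in samples if 25 * i <= v < 25 * (i + 1)) / len(samples) * 100, 1)
    for i in range(4)
]
print(buckets)  # [62.5, 12.5, 0.0, 25.0]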

In the sample results, we can see that most of the databases spend most of their time below 25% CPU load; these are the first candidates for downsizing, consolidation or termination.

For storage usage, it pulls the FreeStorageSpace metric for the past 60 days in 1-hour periods, calculates its percentage of the total provisioned storage, and sorts it into the four corresponding buckets.
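
For example, with 100 GiB of allocated storage and a FreeStorageSpace datapoint of 85,899,345,920 bytes (80 GiB), the free-storage percentage is 80%, which lands in the top bucket. The conversion the script applies, shown on made-up numbers:

allocated_gib = 100              # AllocatedStorage is reported in GiB
free_bytes = 85_899_345_920      # FreeStorageSpace datapoint, in bytes

free_percent = free_bytes / (1024 ** 3) / allocated_gib * 100
print(round(free_percent, 1))    # 80.0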

Some of the databases use provisioned storage (gp2/gp3), and for several of them 100% of the sampled hours show 75% or more free storage. That is a clear indicator that the allocated storage can be reduced for these databases.

Conclusion

Managing AWS costs is an essential aspect of cloud infrastructure management. This Python script simplifies the process of gaining visibility into underutilized RDS resources, making it easier to identify opportunities for cost reduction. By automating the analysis of CloudWatch metrics and visualizing the results in an Excel report, you can make data-driven decisions to optimize your AWS spending.

Although it cannot be compared with off-the-shelf, ready-to-use monitoring solutions, it gives you the flexibility to slice and dice the data as you wish and to present it the way you want. In the next post in this series I will be adding more cost-related features to the report, so stay tuned.
