On the money: Making DoiT Anomaly Detection more precise and personalized

As companies look to control their cloud spend, many are nudging engineering and product teams toward being more accountable for their portion of cloud costs, and it’s no wonder why.

When engineers and product owners are aware of the cost of their work, they’re more likely to factor in cost when developing features and monitoring post-launch, resulting in a direct, positive impact on the bottom line.

And while real-time reporting remains a top tactic for driving cost awareness among stakeholders, reports don’t necessarily help teams react quickly to anomalous behavior or cost spikes. You also need real-time alerting.

Until now, out-of-the-box anomaly detection systems have typically provided insights on your entire organization's cloud usage with service-level alerts. This broad approach:

  1. Forces you to manually pinpoint the SKU(s) and resource(s) driving the cost anomaly for a service.
  2. Notifies teams about cost anomalies that are not directly relevant to their operations.

That’s why we’re excited to announce SKU-specific alerting for DoiT Anomaly Detection, along with the ability to subscribe to anomaly alerts specific to the parts of infrastructure you’re responsible for.

Let’s explore the benefits of both, and how to set up custom anomaly alerts in the DoiT Console.

Everyone owns their cloud costs

A critical principle of FinOps is “Everyone owns their costs.”

A prerequisite to establishing a culture of cost optimization is getting precise and personalized data to the right people.

As with reports and dashboards, personalized and precise anomaly alerts spark internal conversations that ultimately lead to a better understanding of, and accountability for, costs and the infrastructure decisions behind them.

“Why did our costs spike? Was this expected? What can we do better next time?”

How SKU-level anomaly detection works

Many companies are still looking to automate anomaly alerting. DoiT Anomaly Detection works out of the box: it autonomously monitors for cost spikes and alerts you when anomalous spend is detected, so you can act fast and minimize its impact on your bill. This way, your engineers don’t need to build and maintain an internal anomaly-monitoring tool.

Previously, it observed how your organization consumed cloud resources and defined “normal behavior” for each service across each project/account.

But every minute a cost spike goes undetected and unresolved is like leaving the tap running on your bank account. The longer it goes unnoticed, and the more manual work it takes to identify its source, the greater the financial impact.

With our latest update, DoiT Anomaly Detection now scans for abnormalities for each SKU. Once abnormal behavior is detected, you’ll receive an email (and Slack message if you’ve set it up) that highlights the anomalous SKU.


Anomaly alert sent via email
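To make the idea of scanning each SKU for abnormal behavior more concrete, here is a minimal sketch. It is not DoiT's actual algorithm, which this post does not describe. It assumes a simple baseline (the mean and standard deviation of recent daily costs per service and SKU) and a hypothetical `threshold` parameter. The function name, the data layout, and the second SKU are illustrative assumptions.

```python
from statistics import mean, stdev


def detect_sku_anomalies(daily_costs, latest, threshold=3.0):
    """Flag SKUs whose latest daily cost sits far above their own baseline.

    daily_costs: dict mapping (service, sku) -> list of historical daily costs
    latest:      dict mapping (service, sku) -> most recent daily cost
    Returns a list of (service, sku, latest_cost, baseline) tuples.
    """
    anomalies = []
    for key, history in daily_costs.items():
        if len(history) < 2:
            continue  # not enough history to define "normal behavior"
        baseline = mean(history)
        spread = stdev(history)
        cost = latest.get(key, 0.0)
        # Only upward deviations count as cost spikes; skip flat histories
        # to avoid dividing by zero.
        if spread > 0 and (cost - baseline) / spread > threshold:
            anomalies.append((key[0], key[1], cost, baseline))
    return anomalies


# Illustrative data: one SKU spikes while another in the same service is steady.
history = {
    ("S3", "DataTransfer-Out-Bytes"): [10.0, 11.0, 9.0, 10.0, 10.0, 11.0, 9.0],
    ("S3", "TimedStorage-ByteHrs"): [50.0, 51.0, 49.0, 50.0, 50.0, 51.0, 49.0],
}
today = {
    ("S3", "DataTransfer-Out-Bytes"): 40.0,
    ("S3", "TimedStorage-ByteHrs"): 50.5,
}
flagged = detect_sku_anomalies(history, today)
for service, sku, cost, baseline in flagged:
    print(f"{service} / {sku}: {cost:.2f} vs. baseline {baseline:.2f}")
```

Because each SKU is compared against its own history, the alert can name the SKU directly instead of leaving you to work backward from a service-level total.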

If you want to dig deeper in the DoiT Console, you’ll also see the relevant resources contributing to the cost spike. This helps users pinpoint exactly what caused a cost spike at the most granular level and reduces the mean time to resolve the source of the spike.

In the example below, we see in the DoiT Console that an anomaly was detected for S3, and specifically the “DataTransfer-Out-Bytes” SKU. At the bottom, we can see that three S3 buckets are the primary contributors, with “bucket-1” as the main culprit.



An example cost spike being detected for a SKU in AWS S3
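If you wanted to reproduce this kind of resource breakdown yourself, one simple approach is to rank resources by how much their cost rose above their usual level. The sketch below assumes you already have per-resource baseline and current costs for the anomalous SKU. The function name, the cost figures, and every bucket other than “bucket-1” are hypothetical.

```python
def top_contributors(baseline_by_resource, current_by_resource, n=3):
    """Return the n resources with the largest cost increase over baseline."""
    deltas = {}
    for resource, current in current_by_resource.items():
        # A resource with no baseline is treated as entirely new spend.
        deltas[resource] = current - baseline_by_resource.get(resource, 0.0)
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical daily costs per S3 bucket for the anomalous SKU.
baseline = {"bucket-1": 4.0, "bucket-2": 3.0, "bucket-3": 2.0, "bucket-4": 1.0}
current = {"bucket-1": 25.0, "bucket-2": 8.0, "bucket-3": 6.0, "bucket-4": 1.0}

culprits = top_contributors(baseline, current)
for resource, increase in culprits:
    print(f"{resource}: +{increase:.2f}")
```

Ranking by increase rather than by absolute cost keeps a large but steady resource from masking the one that actually changed.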

In addition, you get an explanation of the anomaly and the SKU, since SKU names aren’t always user-friendly (e.g., “EUN1-LCUUsage”), along with relevant optimization tips.



Detailed explanation of an anomaly detected for S3 Data Transfer costs, along with optimization recommendations

Why you should personalize anomaly alerts to the parts of cloud infrastructure you're responsible for

Targeted anomaly alerting takes SKU-level detection a step further, enabling your teams to fine-tune the anomaly alerts they receive, focusing only on the cloud costs that they’re accountable for. 

Targeted alerts also let you evaluate variance within a single context. For example, organization-wide seasonality can obscure a cost spike in an R&D project that would stand out clearly if that project were monitored on its own.

Here's why this is so important:

  1. Relevance and focus: By tailoring anomaly detection alerts to specific teams, you ensure that your stakeholders are only notified about anomalies that directly impact their operations. This reduces alert fatigue and helps teams concentrate on action items within their control.
  2. Faster response: When your team(s) receive alerts related to their own cloud costs, they can swiftly investigate the source of the issue and address any irregularities. This targeted approach accelerates the incident response time, minimizing the impact a cost spike can have on your monthly bill.
  3. Precise optimization: Anomaly detection provides a lens into areas where cost optimization is needed. When teams are alerted to cost spikes within their domain, they are more motivated to optimize resource usage and implement best practices, resulting in overall cost savings and good financial habits.
  4. Cultivating accountability: With targeted anomaly detection, a sense of ownership and accountability is reinforced. Teams become directly responsible for managing their cloud spend, fostering a culture of financial awareness and prudence.
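The point about seasonality obscuring a spike can be shown with a small numeric sketch. It assumes a simple z-score (distance from the mean in standard deviations) as the measure of how unusual a day is. That measure is an illustrative choice, not a description of how DoiT scores anomalies, and all figures are made up.

```python
from statistics import mean, stdev


def z_score(history, value):
    """How many standard deviations `value` sits above the historical mean."""
    return (value - mean(history)) / stdev(history)


# Hypothetical daily costs: a seasonal production workload and a steady R&D project.
production = [1000.0, 1400.0, 900.0, 1500.0, 1100.0, 1300.0, 1200.0]
rnd = [50.0, 52.0, 48.0, 50.0, 51.0, 49.0, 50.0]
org_total = [p + r for p, r in zip(production, rnd)]

# Today production is at a typical level, but R&D spend triples.
production_today, rnd_today = 1200.0, 150.0

org_z = z_score(org_total, production_today + rnd_today)
rnd_z = z_score(rnd, rnd_today)

print(f"organization-wide z-score: {org_z:.2f}")
print(f"R&D-only z-score:          {rnd_z:.2f}")
```

Measured against the whole organization, the day looks ordinary because production swings are much larger than the R&D increase. Measured against the R&D project alone, the same day is far outside its normal range.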

Setting up personalized anomaly alerting with DoiT

First you need to define the costs that each team or person is responsible for. In the DoiT Console you do this with Attributions.

What are Attributions?

An “Attribution” is a logical grouping of cloud resources that defines a cost category unique to your company.

Attributions help you map your cloud spend to teams, applications, environments — any category or grouping relevant to your business. 

For example, let’s say your company has three different products that you’d like to track costs for. In this case, each product’s resources are spread across a number of AWS accounts.

Using Attributions, we can group together multiple AWS accounts and name that grouping after our Application. Below is an example of “Application A” being defined this way.



Example Attribution for a hypothetical application
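Conceptually, an Attribution like the one above behaves like a named set of accounts that costs are rolled up into. The sketch below illustrates that idea only. It does not use the DoiT API, and the account IDs, field names, and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical Attributions: each name maps to the AWS account IDs it groups.
ATTRIBUTIONS = {
    "Application A": {"111111111111", "222222222222"},
    "Application B": {"333333333333"},
}


def cost_by_attribution(line_items, attributions):
    """Sum the cost of each billing line item into every Attribution it matches."""
    totals = defaultdict(float)
    for item in line_items:
        for name, accounts in attributions.items():
            if item["account_id"] in accounts:
                totals[name] += item["cost"]
    return dict(totals)


# Hypothetical billing line items; the last account belongs to no Attribution.
line_items = [
    {"account_id": "111111111111", "service": "S3", "cost": 12.5},
    {"account_id": "222222222222", "service": "EC2", "cost": 30.0},
    {"account_id": "333333333333", "service": "S3", "cost": 7.25},
    {"account_id": "444444444444", "service": "RDS", "cost": 99.0},
]

totals = cost_by_attribution(line_items, ATTRIBUTIONS)
print(totals)
```

Once costs are grouped this way, anomaly detection can run on each grouping separately, which is what makes team-specific alerts possible.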

Now let’s say you want the engineers and product folks who work on Application A to only get alerted when an anomaly related to this application is detected.

Once an Attribution is created, you just have to toggle on Anomaly Detection for it.



Toggling on Anomaly Detection for an Attribution

Then you can go into the notification settings for the people responsible for Application A and subscribe them to alerts related to that Attribution (or they can do it themselves).



Subscribing to anomaly alerts for a specific Attribution

Additionally, they can send those alerts to a team Slack channel if they have one.


Send detected anomaly alerts for an Attribution to a shared Slack channel

 

When an anomaly alert is sent to Slack, you can either rate it to fine-tune our algorithm, or choose to investigate it further in the DoiT Console.


Anomaly alert sent to a shared Slack channel

When an anomaly is detected for an Attribution, you will see the name of the Attribution appear on the anomaly's page.
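The subscription flow in this section amounts to a routing rule: an anomaly tagged with an Attribution goes only to the people and channels subscribed to that Attribution. Here is a minimal sketch of that rule. The `Subscription` class, the `route_alert` function, and the recipients are all hypothetical and are not part of the DoiT Console or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    recipient: str  # an email address or a Slack channel name
    channel: str    # "email" or "slack"
    attributions: set = field(default_factory=set)


def route_alert(anomaly, subscriptions):
    """Return (channel, recipient) pairs subscribed to the anomaly's Attribution."""
    return [
        (sub.channel, sub.recipient)
        for sub in subscriptions
        if anomaly["attribution"] in sub.attributions
    ]


# Hypothetical subscribers for two applications.
subscriptions = [
    Subscription("alice@example.com", "email", {"Application A"}),
    Subscription("#app-a-costs", "slack", {"Application A"}),
    Subscription("bob@example.com", "email", {"Application B"}),
]

anomaly = {
    "attribution": "Application A",
    "service": "S3",
    "sku": "DataTransfer-Out-Bytes",
}

targets = route_alert(anomaly, subscriptions)
for channel, recipient in targets:
    print(f"notify {recipient} via {channel}")
```

In this example only the Application A subscribers are notified, so the Application B owner is not interrupted by an alert outside their control.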



Conclusion

When it comes to cloud cost management, precise and personalized real-time alerting is crucial for driving cost accountability among your team.

SKU-level alerting in DoiT Anomaly Detection represents a significant stride toward reducing the mean time to resolve cost spikes.

This, combined with more precise and targeted anomaly alerting, helps your company drive a culture of responsible cloud spending, enabling teams to make informed decisions independently, respond fast to any irregularities, and proactively contribute to the overall success of your cloud initiatives.

As your company adopts FinOps practices, these capabilities will become invaluable tools for enhancing cost visibility, optimizing resource allocation, and ultimately achieving financial excellence in the cloud.

If you're a DoiT customer, you can start subscribing yourself and stakeholders to anomaly alerts specific to the parts of infrastructure they own in the DoiT Console. Not a DoiT customer but interested in leveraging this along with DoiT's larger product portfolio? Get in touch with us here.
