In his article “Stop Chasing Idle Servers: Intent-Aware FinOps for the Real World,” DoiT CEO Vadim Solovey highlighted an important reality of cloud computing: An “illusion of efficiency” can mask significant waste. While idle resources are easy to spot, some of the most costly inefficiencies hide within your engineering organization or code — places where cloud reconfiguration alone cannot solve the problem.

This article focuses on the cases where code itself is at fault. Taking an investigation from the cloud level down to the code level is not always easy, so below I present a methodology for doing it.
A Systematic Approach to Cloud Cost Optimization
Step 1: Identify Your Biggest Cost Centers
Cloud cost optimization follows the Pareto (80/20) Principle — a small percentage of your resources typically account for most of your costs. Start by using tools like DoiT Cloud Intelligence™ to identify your largest expense areas.
Step 2: Take Quick Administrative Wins
Before diving into code optimizations, tackle the simpler cloud administration adjustments first:
- Right-sizing underutilized instances
- Removing orphaned resources
- Implementing appropriate storage tiers
- Adjusting autoscaling parameters
These adjustments cost far less engineering effort than diving into code.
Step 3: Look for Code-Level Inefficiency Indicators
Once administrative optimizations are complete, examine high-cost resources for potential code-driven inefficiencies.
Due to the “illusion of efficiency,” these indicators can be subtle. The real-world scenarios discussed later in this article illustrate common telltale signs to watch for, often found using standard cloud monitoring tools in the GCP, AWS, and Azure Consoles.
Step 4: Profile and Analyze
Move from the cloud level to the code level using execution-profiling tools that can run in the cloud:
- For databases: Cloud Console query analyzers and performance insights
- For applications: Language-specific profilers and memory analyzers
- For data pipelines: Execution graphs and distribution metrics

Implementing this can be easy, as with SQL, where query-analysis tools are built into the cloud consoles, or difficult, as with memory profiling of Python code in distributed applications running in managed environments.
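When a full profiler is hard to attach, even lightweight instrumentation can expose a trend. The following is a minimal sketch, assuming a JVM service whose logs land in Cloud Logging or CloudWatch; the class and label names are my own. It samples heap usage around a suspect code path using the JDK's built-in MemoryMXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Minimal sketch: log heap usage around a suspect code path so the trend
// shows up in your cloud logs without attaching a full profiler.
public class HeapLogger {

    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    public static void logHeap(String label) {
        MemoryUsage heap = MEMORY.getHeapMemoryUsage();
        // Structured enough to chart later; a real service would use its logging framework.
        System.out.printf("heap-sample label=%s usedMB=%d committedMB=%d%n",
                label, heap.getUsed() / (1024 * 1024), heap.getCommitted() / (1024 * 1024));
    }

    public static void main(String[] args) {
        logHeap("before-batch");
        // ... run the suspect workload here ...
        logHeap("after-batch");
    }
}
```

Charting these samples over time is often enough to reveal the sawtooth or super-linear growth patterns described in the scenarios below.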
Step 5: Implement, Measure, Validate
Fix the identified issues, redeploy, and measure the technical improvements in the AWS, GCP, and Azure Cloud Consoles; then review DoiT Cloud Intelligence™ cost reports to validate the savings.
Real-World Scenarios and Solutions
Scenario 1: Java Microservice with Memory Leaks
What Looked Efficient: A Java Lambda microservice maintaining 70–100% memory utilization, seemingly maximizing resource allocation.
The Reality: The application suffered from memory leaks where a global object held reference chains retaining objects across invocations. Occasional crashes and instance replacements were infrequent enough to escape SRE attention.
The Clue: Monitoring revealed a sawtooth memory usage pattern over time. Investigating the drops in that sawtooth led to CloudWatch logs showing periodic crashes.
Investigation: CodeGuru Profiler was activated; it revealed increasing memory usage over time. Offline investigations with a JVM profiler identified unexpected object retention.
Solution: Modified code to release object references at the end of each web request.
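The leak followed a common pattern, sketched below purely for illustration (the class, map, and buffer sizes are hypothetical, not the customer's code): a static field outlives each Lambda invocation because the execution environment is reused, so anything it references is retained until the instance eventually crashes. Releasing the reference at the end of the request is the fix.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative reconstruction of the leak pattern, not the original code.
public class ReportHandler {

    // Static state survives across Lambda invocations because the JVM is reused,
    // so anything referenced here is retained from one request to the next.
    private static final Map<String, List<byte[]>> REQUEST_SCRATCH = new HashMap<>();

    public String handleRequest(String requestId) {
        List<byte[]> buffers =
                REQUEST_SCRATCH.computeIfAbsent(requestId, k -> new ArrayList<>());
        buffers.add(new byte[1024 * 1024]); // per-request working data

        try {
            return "processed " + requestId;
        } finally {
            // The fix: drop the reference chain when the request ends so the
            // garbage collector can reclaim the per-request objects.
            REQUEST_SCRATCH.remove(requestId);
        }
    }
}
```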
Result: Stable memory usage, fewer instance replacements, and lower resource costs.
Scenario 2: Java Data Processing Pipeline with Inefficient Data Structures
What Looked Efficient: A Dataflow pipeline with custom Java containers processing millions of records daily with consistently high CPU utilization.
The Reality: The code used inefficient data structures, including maps carrying unnecessary per-object data, and performed string concatenation in tight loops, creating excessive garbage-collection overhead.
The Clue: Persistently high CPU utilization suggested the need for deeper investigation.
Investigation: GCP Cloud Profiler was added to the container. This showed super-linear scaling of time and memory usage with larger datasets.
Solution:
- Replaced maps with custom objects holding just the information needed.
- Implemented proper string joining instead of repeated concatenation.
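The sketch below illustrates both fixes under assumed field and class names; the "before" method shows the per-record map plus repeated concatenation that drove the garbage-collection overhead, and the "after" method shows the typed object and single-buffer string joining.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Simplified illustration of the two fixes; field and class names are hypothetical.
public class RecordFormatter {

    // Before: one Map per record plus repeated String concatenation in a tight loop.
    // Every "+=" allocates a new String, which creates heavy garbage-collection pressure.
    static String formatSlow(List<Map<String, String>> records) {
        String out = "";
        for (Map<String, String> row : records) {
            out += row.get("id") + "," + row.get("value") + "\n";
        }
        return out;
    }

    // After: a small typed object holds only the fields actually needed,
    // and a StringJoiner builds the output in a single buffer.
    record Entry(String id, String value) {}

    static String formatFast(List<Entry> records) {
        StringJoiner out = new StringJoiner("\n");
        for (Entry row : records) {
            out.add(row.id() + "," + row.value());
        }
        return out.toString();
    }
}
```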
Result: 50% less memory usage and a 70% reduction in processing time, allowing for smaller worker machines and fewer instances.
Scenario 3: In-Memory Data Structures Causing Oversized VMs
What Looked Efficient: Large-memory VMs appeared cost-effective compared to horizontally scaled solutions, with in-memory Python data structures offering algorithmic speed advantages over database queries.
The Reality: This approach created multiple inefficiencies:
- Cloud providers enforce minimum CPU-to-memory ratios, resulting in expensive CPU capacity that goes unused.
- Memory allocations come in predefined increments, requiring payment for unused buffer capacity.
- Long initialization times necessitated keeping multiple expensive instances running simultaneously for robustness.
The Clue: DoiT Cloud Intelligence™ showed a large fraction of total expense coming from ultra-large VMs — typically an indicator of problematic statefulness in cloud architectures.
Investigation: Deep analysis of the algorithms revealed opportunities to refactor so that data could be stored outside application memory.
Solution:
- Refactored algorithms to work with partial datasets queried from databases as needed
- Implemented a NoSQL database with Redis as an in-memory cache
- Where full data preloading was needed for key reference data, optimized data structures in Redis allowed a smaller memory footprint than objects in application memory
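As a rough illustration of the cache-aside pattern: the production system was Python, but the sketch below uses Java with the Jedis client to keep this article's examples in one language; the host, key layout, and loadFromDatabase helper are assumptions.

```java
import redis.clients.jedis.Jedis;

// Minimal cache-aside sketch. Reference data lives in Redis hashes rather than
// a giant in-process dictionary, so the application VM can stay small.
public class ReferenceDataCache {

    private final Jedis redis = new Jedis("redis.internal", 6379); // hypothetical host

    public String lookup(String entityId, String field) {
        // Try the Redis hash for this entity first.
        String cached = redis.hget("entity:" + entityId, field);
        if (cached != null) {
            return cached;
        }
        // On a miss, load from the backing store and populate the cache.
        String value = loadFromDatabase(entityId, field);
        redis.hset("entity:" + entityId, field, value);
        redis.expire("entity:" + entityId, 3600); // keep the cache bounded
        return value;
    }

    private String loadFromDatabase(String entityId, String field) {
        // Placeholder for the query against the backing NoSQL store.
        return "value-for-" + entityId + ":" + field;
    }
}
```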
Result: This architectural change significantly reduced VM sizes, allowing horizontal scaling and radically reducing costs, though it required substantial engineering effort.
Reviewing Earlier Cases
Let’s go back to two more examples from the article that I mentioned earlier, “Stop Chasing Idle Servers”, and see how they fit into this framework.
Scenario 4: Database Pushing 85% IOPS
What Looked Efficient: The RDS instance appeared fully utilized, suggesting optimal resource allocation.
The Reality: Every query was performing full-table scans because two critical indexes were missing, dramatically increasing resource requirements.
The Clue: Since most SQL queries shouldn’t require high resource usage (except in highly tuned batch processes), a high-utilization pattern indicated optimization opportunities.
Investigation: Identified the problematic queries and missing indexes using Amazon RDS Performance Insights, which is available out of the box in the AWS Console. (GCP offers the similar Cloud SQL Query Insights.)
Solution: Added the missing indexes.
Result: 10x reduction in query latency and the ability to downsize the database by two tiers.
Scenario 5: Spark Job at 70% CPU for Four Hours Nightly
What Looked Efficient: The cluster maintained high CPU utilization, suggesting appropriate resource allocation.
The Reality: 80% of data was concentrated on a single skewed key, creating straggler tasks that extended processing time significantly.
The Clue: The issue began at a specific point in time with no other clear cause. (This later turned out to correspond to the arrival of new data with the “hot key”.)
Investigation: Spark code runs in a highly distributed environment, which makes it difficult to use the kind of profiler you might attach to an ordinary application; this is a good reason to keep pipeline logic focused on simple transformations rather than complex business logic. However, analyzing task distribution across stages in the Spark UI identified the stragglers, and Bigtable monitoring revealed “hot keys” in the database being processed.
Solution: Repartitioned and salted the problematic key to distribute the workload more evenly.
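Salting an aggregation looks roughly like the sketch below, written against the Spark Java Dataset API; the column names, input and output paths, and salt factor are illustrative rather than taken from the actual job. Each row gets a random salt appended to its key, a first aggregation spreads the hot key across many tasks, and a second, much cheaper aggregation removes the salt.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// Illustrative salting sketch; column names, paths, and the salt factor are assumptions.
public class SaltedAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("salted-agg").getOrCreate();
        Dataset<Row> events = spark.read().parquet("gs://bucket/events/"); // hypothetical input

        int saltBuckets = 32; // assumed degree of spreading for the hot key
        Dataset<Row> result = events
                // Append a random salt so rows with the hot key land in many partitions.
                .withColumn("salt", floor(rand().multiply(lit(saltBuckets))))
                .withColumn("salted_key", concat_ws("#", col("key"), col("salt").cast("string")))
                // First-level aggregation runs in parallel across the salted keys.
                .groupBy("salted_key", "key")
                .agg(sum("value").alias("partial_sum"))
                // A cheap second aggregation folds the partial results per real key.
                .groupBy("key")
                .agg(sum("partial_sum").alias("total"));

        result.write().mode("overwrite").parquet("gs://bucket/event-totals/"); // hypothetical output
        spark.stop();
    }
}
```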
Result: Job completion time decreased from 4 hours to 45 minutes, and the necessary cluster size was reduced by two-thirds.
Conclusion
Achieving true cloud efficiency sometimes requires going beyond cloud configurations and also addressing inefficiencies at the code level. When code is the root cause of excess cloud costs, infrastructure tweaks alone won’t fix the problem.
By partnering your development team with FinOps and SRE, you can identify and resolve these hidden inefficiencies through a systematic approach:
- Start with the biggest expense areas shown in cost analytics.
- Before touching code, address quick wins at the cloud configuration level.
- In the Cloud Consoles, look for telltale clues that suggest deeper investigation.
- Use appropriate profiling tools, preferably in the cloud but offline if needed, to pinpoint inefficiencies.
- Fix the code, redeploy, and validate cost improvements.
This collaborative approach not only reduces costs, but often improves application performance and reliability as well — a win for both your budget and your users.
In the DoiT Customer Reliability Engineering team, I guide organizations through the entire optimization journey. Leveraging DoiT Cloud Intelligence™ and decades of experience, we help identify potential wins, describe cloud-level fixes, uncover illusions of efficiency, and validate the impact of code-level improvements. Get in touch at doit.com/services.