Shortly after deploying PerfectScale by DoiT, Serdze and the team gained the cost visibility they were missing. Across Trax’s environment of more than 200 microservices, they could see what resources each service needed and identify the most significant opportunities to eliminate waste.
The platform’s AI-guided intelligence let the team start reducing costs quickly. By weighing cost savings against overall resilience, they could adjust resources safely and efficiently without compromising performance.
“The cost optimization recommendations were key for us, telling us what actions to take with a clear understanding of the impact each change would have,” Serdze explained. “In one of our clusters, we were able to reduce cost by 75%, saving us over a hundred thousand dollars in yearly expenses.”
Furthermore, Trax was impressed by the comprehensive data and intelligence the solution provided across its entire environment. This implementation allowed it to upgrade its cost visibility toolset without impacting its budget.
“We were able to replace a FinOps tool we were using that didn’t provide granular cost details or offer guidance on how to optimize our environment,” explained Serdze. “PerfectScale is a tool built for the engineering teams, not just for finance, which made it easier for us to make the cost impacts we wanted.”
Kubernetes optimization to improve business metrics
After eliminating the wasted resources, Trax focused on identifying additional opportunities for cost optimization. The team delved into PerfectScale’s data, seeking ways to meaningfully impact their cost-centric business metrics.
“A key metric for us is ‘cost per processing,’ which is heavily affected by our Kubernetes efficiency,” said Serdze. “If it gets over a certain amount, we are under a lot of pressure to figure out why and to take actions to reduce it.”
PerfectScale has a unique feature that consolidates every replica of a service into a single view, giving a clear picture of utilization trends across all replicas. This is especially useful for ephemeral workloads like Spark or Flink jobs. Trax used this capability to better understand the heterogeneous utilization across the replicas of several of its heavily used services. This level of visibility helped the team rearchitect some of these services to drive additional cost savings without impacting their resilience or availability.
“We were able to build multiple flavors of the service with different levels of resources and route the incoming requests to the proper service based on the size of the data,” explained Serdze. “This made a big impact on our ‘cost per processing’ metric. PerfectScale surfaced this data instantly, and without them, we would have spent countless hours evaluating hundreds of replicas to generate the same results.”
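The size-based routing Serdze describes could look roughly like the sketch below. It is only an illustration under stated assumptions, not Trax's actual implementation: the flavor names, size thresholds, and resource values are all hypothetical, and the choice to send oversized payloads to the largest flavor is one possible policy among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavor:
    """One deployment variant of the same service (all values hypothetical)."""
    name: str
    max_payload_bytes: int  # largest input this flavor is sized to handle
    cpu_request: str        # Kubernetes-style CPU request, e.g. "500m"
    memory_request: str     # Kubernetes-style memory request, e.g. "1Gi"


# Hypothetical flavors; real thresholds would come from observed utilization.
FLAVORS = [
    Flavor("processor-small", 10 * 1024**2, "500m", "1Gi"),
    Flavor("processor-medium", 100 * 1024**2, "2", "4Gi"),
    Flavor("processor-large", 1024**3, "8", "16Gi"),
]


def pick_flavor(payload_bytes, flavors=FLAVORS):
    """Return the smallest flavor whose size limit covers the payload."""
    if payload_bytes < 0:
        raise ValueError("payload_bytes must be non-negative")
    ordered = sorted(flavors, key=lambda f: f.max_payload_bytes)
    for flavor in ordered:
        if payload_bytes <= flavor.max_payload_bytes:
            return flavor
    # Assumption: payloads above every limit go to the largest flavor.
    return ordered[-1]


if __name__ == "__main__":
    for size in (2 * 1024**2, 50 * 1024**2, 5 * 1024**3):
        print(size, "->", pick_flavor(size).name)
```

Routing small inputs to a lightly provisioned flavor means most requests no longer reserve the resources that only the largest inputs need, which is the kind of change that would lower a per-request cost metric.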