Cloud Masters Episode #105
Five data pipeline transgressions costing you money
Covering five costly mistakes data engineers make when building their data pipelines, and what you should be doing instead.



Episode notes

Jon Osborn, Field CTO at Ascend.io, joined us to share some of the costliest mistakes he sees data engineers making when building their cloud data pipelines.

Key Moments

3:46: [Transgression #1] Overpaying for data ingestion
11:13: [Transgression #2] Using Spark with Snowflake, rather than Snowpark
15:10: [Transgression #3] Not using partitioning
24:34: [Transgression #4] Re-running the whole pipeline every time
28:13: [Transgression #5] Using the same-sized warehouse for every workload
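Transgressions #3 and #4 above (skipping partitioning, and re-running the whole pipeline every time) can be sketched in plain Python. This is an illustrative sketch, not code from the episode: the function names and record shape are assumptions, but the idea matches the discussion — group data into partitions and reprocess only the partitions you haven't already handled.

```python
# Illustrative sketch (not from the episode): partition records by date and
# reprocess only new partitions, instead of re-running the whole pipeline.
from collections import defaultdict


def partition_by_date(records):
    """Group records into partitions keyed by their date field."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[rec["date"]].append(rec)
    return dict(partitions)


def incremental_run(partitions, processed_dates, transform):
    """Process only partitions not seen in a previous run."""
    results = {}
    for date, recs in partitions.items():
        if date in processed_dates:
            continue  # skip work already done -- no full-pipeline re-run
        results[date] = [transform(r) for r in recs]
        processed_dates.add(date)
    return results


records = [
    {"date": "2024-01-01", "amount": 10},
    {"date": "2024-01-01", "amount": 5},
    {"date": "2024-01-02", "amount": 7},
]
partitions = partition_by_date(records)
processed = {"2024-01-01"}  # partition already handled in a previous run
new_results = incremental_run(partitions, processed, lambda r: r["amount"] * 2)
print(new_results)  # only the 2024-01-02 partition is processed
```

In a real warehouse the same pattern shows up as partition pruning plus incremental models, where the "processed" set is tracked by the orchestration layer rather than an in-memory set.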

About the guests

Jon Osborn
Jon brings over 20 years of experience in data and technology to his role as Field CTO at Ascend.io. Drawing on deep executive and enterprise architecture experience, he works with customers across the healthcare, insurance, and retail sectors to automate data processing in new ways. He collaborates with the engineering team to build a data platform that takes on the hard infrastructure and process work so engineering teams can focus on the code that matters most, and he contributes to the product vision, roadmap, and backlog, ensuring that customer feedback becomes actual features.
Matthew Richardson
Matthew Richardson is a Senior Cloud Architect at DoiT International specializing in Data & Analytics, with certifications across all major cloud providers (Google Cloud, AWS, Azure) as well as Snowflake, dbt, Python, Tableau, and SAS. He has deep experience with a wide range of data engineering, ETL, BI reporting, and programming tool sets, including Matillion, Talend, Cognos, Teradata, and Business Objects. Matthew currently helps customers optimize their data modelling strategies, particularly in BigQuery, Redshift, and Snowflake, ensuring they get the most out of their data on a consistent basis.

Related content

Cloud Data Pipeline Bake-Off: Ascend.io versus dbt
Evaluating two data transformation tools used to build cloud data pipelines, head to head.
Leverage Malloy and Looker for a Unified, Future-Proof Data Warehouse
SQL has downsides that limit collaboration around analyzing complex datasets. Here’s how Malloy addresses SQL’s faults to help you operate at scale.
BigQuery — keep data fresh while avoiding large-scale mutations
We demonstrate how to keep your data fresh and updated while avoiding large mutations.