TL;DR Google is introducing a new method to autoscale slots, along with a new pricing plan to accommodate it. We have created a calculator, located here, that you can use to estimate potential costs and cost savings when switching.
BigQuery slot autoscaling was just announced by Google as a public preview feature and it’s a pretty massive change from the pricing models that have existed for most, if not all, of its lifetime.
Before continuing on here I want to put in a quick primer on some new and old terms used throughout this article so everybody is on the same page.
Key BigQuery Terms to Know
If you are already familiar with BigQuery, most of these terms will be known to you; if not, the definitions below should help before you continue.
- Job
An action that runs on BigQuery to perform work such as querying, loading data, copying data, etc. The most common type of job is a query, and the two terms are often used interchangeably.
- Slot
The basic compute construct used in BigQuery to do work for a job. Think of this as a “mini-virtual machine” that joins with many others to perform the actual work for a job.
- Reservation
A construct that reserves a set of slots to be utilized by the zero or more projects that are added to it.
- Commitment
A fixed number of slots that are reserved for a committed duration of time, usually one month or one year, in exchange for a cheaper price.
- Admin Project
A project that is designated as the “manager” of BigQuery reservations and commitments. Generally one organization will have a single (or a select few) of these to allow a “single pane-of-glass” for managing the BigQuery resources of the organization. For flat-rate and autoscaling projects the reservation billing SKUs across the organization will be associated with this project.
- Flex Slots
Slots that can be added to a reservation for a short term burst in slot counts. Note they are billed by the second with a minimum billing period of 60 seconds.
- Idle Slot
A slot that is currently not being utilized in a job.
- Workload
A set of jobs that are run inside of a single project.
- Slot/Hour
The basic billing metric for BigQuery Autoscaling projects. It is defined as a single slot being utilized in a job for one hour.
- Autoscaling Slots
New construct that scales the number of slots up from a baseline slots value to a maximum slots value as needed for a running workload.
- Baseline Slots
The minimum number of slots kept “hot” and active in a BigQuery Autoscaling project. These will be utilized first before autoscaling occurs and are billed as long as the reservation is active.
Slot Autoscaling? What is this madness?
Google has just put a long-awaited feature into public preview: BigQuery slot autoscaling for when workloads need more slots than the purchased capacity allows for.
In rough terms it allows BigQuery to autoscale up and down the slots needed by workloads without any manual intervention such as a Cloud Function that adjusts a Flex Slots reservation when certain usage thresholds are met.
Since the billing model for BigQuery is well over 10 years old and is built around “pay-for-data-scanned” and “pay-for-capacity” models, it doesn’t jibe very well with autoscaling of compute capacity. To accommodate this, Google has introduced a more modern billing model based on a “pay-for-what-you-use” approach for this new feature.
Overview of BigQuery Autoscaling Slots
This feature is exactly what the name says: autoscaling slots. When a workload needs additional slots, BigQuery will now scale the number of slots inside of the reservation up and down as needed for the jobs running.
Along with this, as I mentioned above, comes a new billing model that bills by a construct called “slot/hour”, which is just the price for using a single slot for an hour, billed by the second.
*Note that this change affects only the compute/analysis pricing side of BigQuery; storage costs are not affected at all. Keep that in mind as you read through the rest of this article.
This is immensely powerful if you currently use flat-rate billing and need more slots than you have reserved. Flex Slots gave you an approach to getting more slots, but they were not automatic and required additional tooling, usually in the form of a Cloud Function running on an interval that checked the used slots metric and added or removed Flex Slots to a reservation as needed. A good example of this is located here (not affiliated with DoiT). Unfortunately if more slots were needed immediately you were out of luck, unless you built and paid for additional infrastructure to do this.
Before looking too deeply into this, note that this new billing model can be applied to a single reservation which may contain zero or more projects. So this means you can mix-and-match on-demand, flat-rate, and autoscaling as you see fit across reservations and projects for optimal usage or costs.
In practice, when you have a project with a baseline value set, jobs will first use the baseline slots, and scaling begins once those have been used up. Note that these slots are a shared pool across all jobs running in projects inside of a reservation, which is the same as a flat-rate reservation. When the autoscaled slots are no longer being used they are scaled back down so you aren’t billed for them, which is the differentiator from Flex Slots, which are billed until they are removed.
Now with that said, there is a 60-second window for scaling where BigQuery determines how it will scale. This prevents short-run queries from scaling up to your max slots amount and overbilling you for slots used when they were not needed.
Should I Switch My Workloads to Autoscaling?
As happens WAY too often in the world of cloud, the answer is it depends. That answer sucks, so let’s dive in and show a few scenarios where this new autoscaling model would be ideal:
Let’s say you have a workload that is run only a few times a day for a short period of time, but it uses a massive amount of slots (think a couple thousand slots). Flex Slots can be used here, but oftentimes this job runs in under a minute so you are overpaying for the minimum billing period of 60 seconds of Flex Slots when it’s not used for that period.
And if you have workloads that use massive amounts of slots (over 2k to match the on-demand slot count) and are very read-heavy, which makes on-demand billing not make sense, then this is a perfect candidate for Autoscaling. In addition many times these workloads may be started by a customer action so there isn’t time to create a Flex Slots reservation to run it in.
Finally, if your workloads are very spiky, you can compare their slot-usage in our BigQuery Autoscaling Slots calculator to your existing setup and see if it’s a better fit.
Reservations under this new autoscaling model are mostly the same as previous reservations for flat-rate billing, but they do have some new concepts associated with them along with a few differences.
They still are created inside of a management project and inside of each reservation are zero or more projects to which slots are allocated for use.
Since this introduces a new concept of scaling there is now a minimum scaling value, called “Baseline Slots”, and a maximum scaling value, called the “maximum reservation size” or “max slots.”
The max slots value is defined as the sum of the Baseline Slots value and the number of autoscaled slots set for the entire reservation.
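As a rough illustration of how the baseline and max slots values interact, the sketch below models a reservation as two numbers and clamps a workload's slot demand between them. The class and method names are hypothetical (this is not a BigQuery API object), and the model deliberately ignores the 60-second scaling window and whatever increments BigQuery actually scales in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscalingReservation:
    """Hypothetical model of an autoscaling reservation (not a BigQuery API type)."""

    baseline_slots: int   # kept "hot" and billed while the reservation is active
    autoscale_slots: int  # additional slots that may be added on demand

    @property
    def max_slots(self) -> int:
        # Max slots = Baseline Slots + autoscaled slots for the reservation.
        return self.baseline_slots + self.autoscale_slots

    def allocated_slots(self, demand: int) -> int:
        """Slots allocated for a given demand, clamped to [baseline, max]."""
        return max(self.baseline_slots, min(demand, self.max_slots))


reservation = AutoscalingReservation(baseline_slots=100, autoscale_slots=400)
print(reservation.max_slots)             # 500
print(reservation.allocated_slots(50))   # 100 (baseline is always there)
print(reservation.allocated_slots(300))  # 300 (scaled up past the baseline)
print(reservation.allocated_slots(900))  # 500 (capped at max slots)
```

Demand above the max slots value is simply capped in this model; borrowing from other reservations is not represented.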
By default, if a job or set of jobs exceeds the max slots value for its reservation, it can borrow idle slots from other reservations in the same management project. A checkbox called “Ignore idle slots” on the “Create Reservation” screen allows this behavior to be turned off.
How much will switching to Autoscaling cost me?
Pricing is completely different from the traditional pricing models BigQuery has had for a while now. The new pricing model is based upon a concept called slot/hour, which is a charge per slot for an hour of usage, billed by the second.
Autoscaling bills your usage based upon how long you use a slot for. Note this is billed by the second, but uses the hour as a nice rounding point on the numbers. So if you use a single slot for 900 seconds (1/4th of an hour in seconds) you will be billed for 0.25 slot/hours. For 3,600 seconds (1 hour in seconds) you will be billed for 1.0 slot/hours. Hopefully you will be using more than a single slot for a job, so multiplying this by the number of slots used will tell you how many slot/hours you are billed for.
The missing piece to determine price here is the cost per slot/hour, which for the public preview is set at $0.06 USD per slot/hour. Note that this may change due to this feature being in preview.
So the overall formula for determining the cost for a job is:
Price = 0.06 * (<Slots used> * (<Seconds used>/3600))
Not the easiest to read, but essentially you are multiplying the slots used by the hours it was used (seconds used / 3600 seconds). It looks more difficult than it is.
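The formula above can be sketched as a pair of small Python functions. The function names are my own, and the default rate is the $0.06 public-preview price quoted in this article, which may change.

```python
SECONDS_PER_HOUR = 3600
# Public-preview price quoted in this article; treat it as subject to change.
PRICE_PER_SLOT_HOUR_USD = 0.06


def slot_hours(slots_used: float, seconds_used: float) -> float:
    """Slot/hours consumed: slots multiplied by hours of usage."""
    return slots_used * (seconds_used / SECONDS_PER_HOUR)


def job_cost_usd(
    slots_used: float,
    seconds_used: float,
    price_per_slot_hour: float = PRICE_PER_SLOT_HOUR_USD,
) -> float:
    """Price = rate * (slots used * (seconds used / 3600))."""
    return price_per_slot_hour * slot_hours(slots_used, seconds_used)


print(slot_hours(1, 900))                # 0.25
print(slot_hours(1, 3600))               # 1.0
print(round(job_cost_usd(2000, 45), 2))  # 1.5
```

The last line is a made-up example: a job holding 2,000 slots for 45 seconds uses 25 slot/hours, which comes to $1.50 at the preview rate.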
That’s why we created a calculator to make it easier for you to determine pricing and compare. Note it’s read-only, so create a copy of it into your Workspace environment first for best usage.
Like in flat-rate pricing Autoscaling has a commitment feature where you can commit to using a set number of slots, in sets of 100, for a specified amount of time in exchange for a cheaper price. These work the same as in a flat-rate commitment, except in Autoscaling they only allow annual commitments. They are on a per-region or per-multi-region basis just like in a flat-rate commitment, meaning you must utilize these in the particular region or multi-region they are purchased in to realize these savings.
*Note when performing a commitment you will be charged for the entire commit value much as you would for a flat-rate commitment. These are treated just like flat-rate commitments in terms of billing.
The pricing for an annual commitment is roughly $0.05 USD per slot/hour, purchased in sets of 100. This translates to approximately $3,504 USD per month for 100 slots (which works out to $0.048 per slot/hour before rounding) versus approximately $4,380 USD per month for 100 slots run for an entire month without a commitment, assuming 730 hours per month in both cases.
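To check the monthly figures yourself, here is a minimal sketch of the arithmetic, assuming 730 hours per month as above. The function names are hypothetical. Working backwards from the $3,504 figure gives a rate of $0.048 per slot/hour, slightly below the rounded $0.05.

```python
HOURS_PER_MONTH = 730  # the month length assumed in this article


def monthly_cost_usd(slots: int, rate_per_slot_hour: float,
                     hours: int = HOURS_PER_MONTH) -> float:
    """Cost of keeping `slots` running for a whole month at a given rate."""
    return slots * hours * rate_per_slot_hour


def implied_rate_usd(monthly_cost: float, slots: int,
                     hours: int = HOURS_PER_MONTH) -> float:
    """Work backwards from a monthly bill to the per-slot/hour rate."""
    return monthly_cost / (slots * hours)


print(round(monthly_cost_usd(100, 0.06), 2))   # 4380.0
print(round(implied_rate_usd(4380, 100), 4))   # 0.06
print(round(implied_rate_usd(3504, 100), 4))   # 0.048
```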
Switching to BigQuery Autoscaling
The easiest method to switch from flat-rate billing to Autoscaling is just to edit the reservation from flat-rate to Autoscaling. Google has made it easy to swap them out and doesn’t require any deleting and recreating reservations.
Switching from on-demand to Autoscaling (or flat-rate) is a bit more of a process. I would highly recommend reading Google’s documentation on the subject here before switching, and then following the procedures to create a reservation, selecting Autoscaling instead of flat-rate in the menu.
How Many Slots Do I Use Currently?
Lastly, this question is probably being asked because many BigQuery users have no idea how many slots they use. Don’t worry, we have you covered here!
In a previous series of articles on BigQuery cost optimization, I published a GitHub repository containing multiple queries to assist with this. Before running anything in this repository, watch your scanned data quantity, as these queries can easily process huge amounts of data (>2 TB on some of our internal datasets).
While this topic is extremely large and is well outside the scope of this article I will recommend running the following queries:
- Slots by Minute
This query will create a time series over the timeframe and show the number of slots used during that minute period.
- Top Complex Queries
This query will return the most complex (in this case defined as most slots used) queries over the specified timeframe.
- Job Information
This query will return some statistics about a Job ID, including the slot information. This is very useful when you determine a job is hitting performance issues and you would like to see if it needs additional slots to execute faster.
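If you want a feel for the arithmetic behind a slots-by-minute view, here is a minimal Python sketch. It is not one of the queries from the repository. It assumes you already have rows of (period start, slot-milliseconds) in hand, a hypothetical shape loosely modeled on the `period_start` and `period_slot_ms` columns of BigQuery's `INFORMATION_SCHEMA.JOBS_TIMELINE` view. Dividing the slot-milliseconds consumed in a minute by the 60,000 milliseconds in that minute gives the average number of slots in use.

```python
from collections import defaultdict
from datetime import datetime

MS_PER_MINUTE = 60_000


def average_slots_by_minute(rows):
    """Average slots in use per minute.

    rows: iterable of (period_start: datetime, period_slot_ms: int) tuples.
    This row shape is an assumption made for illustration.
    """
    slot_ms = defaultdict(int)
    for period_start, period_slot_ms in rows:
        # Truncate each timestamp to the start of its minute.
        minute = period_start.replace(second=0, microsecond=0)
        slot_ms[minute] += period_slot_ms
    return {
        minute: total / MS_PER_MINUTE
        for minute, total in sorted(slot_ms.items())
    }


# Made-up sample data, not real job statistics.
rows = [
    (datetime(2023, 4, 1, 12, 0, 5), 120_000),
    (datetime(2023, 4, 1, 12, 0, 40), 60_000),
    (datetime(2023, 4, 1, 12, 1, 10), 30_000),
]
for minute, slots in average_slots_by_minute(rows).items():
    print(minute.isoformat(), slots)
# 2023-04-01T12:00:00 3.0
# 2023-04-01T12:01:00 0.5
```

Because this is an average over each minute, short bursts to a much higher slot count will be smoothed out; keep that in mind when comparing against a max slots value.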