There is a feature in GCP that most people see when creating Compute Engine instances and other resources but rarely ever use: labels.
Until the past year or so, GCP made little use of labels, in stark contrast to AWS, which has had them for quite a while and uses them extensively, especially for understanding billing data.
What are these label things anyways?
The GCP documentation defines labels as a “lightweight way to group together resources that are related or associated with each other.”
That does a good job of summing up what they are, but a better question is, “What are they used for?”
The primary purpose of putting a label on a resource is to easily track how much a specific resource costs in your billing data. By default, each service in your bill is grouped together as a single line item.
For instance, a line item from a bill for running a Compute Engine instance might appear something like this: “Compute Engine Small Instance with 1 VCPU running in Americas: 15342.56 Hours.”
Just reading that seems simple enough until you realize that the average month has 730 hours in it. So that means there were multiple instances running in that month, but how many? Also which ones ran for how long in that month and what project were they in? Looking at the billing report in the GCP console doesn’t tell you much more either.
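As a rough sanity check on that line item, dividing the billed hours by the hours in a month gives the equivalent number of always-on instances. This is only a sketch that assumes the 730-hour average month mentioned above; the variable names are illustrative.

```python
import math

# Figures taken from the example line item above.
billed_hours = 15342.56
hours_per_month = 730  # average month, as assumed in the text

# Equivalent number of instances running for the entire month.
full_time_equivalents = billed_hours / hours_per_month

# Under the 730-hour assumption no single instance can account for more
# than one month of hours, so this is a lower bound on the instance count.
min_instances = math.ceil(full_time_equivalents)

print(f"{full_time_equivalents:.2f} instance-months, at least {min_instances} instances")
```

The arithmetic only gives a lower bound. It cannot say which instances ran, for how long, or in which project.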
This data is almost impossible to work out from the figures Google gives in the bill and console. This is where labels come into play.
When you use labels in your project, your billing report looks something like this:
Notice how it is broken out by label, which makes it much easier to read. I will cover the iris_name part of the labels later in this article.
Labels vs Tags
One distinction that first needs to be made is the difference between labels and tags inside of GCP.
Until recently, they had a pseudo-1-to-1 relationship, but as GCP has evolved they have become separate entities. Tags are now a way to group resources so that a network policy can be applied to them, whereas labels group resources for billing and searching purposes (and possibly more in the future).
For example, if you create a firewall rule and add a target tag to it, the rule applies to any resource that has the same network tag, such as a Compute Engine instance. The main difference most people will notice is the wording: GCP now uses tags to mean network tags (or target tags, as firewall rules call them) and labels to mean labels, whereas in the past both were referred to collectively as tags.
It may take a while for the new terminology to catch on, so when you hear of something being tagged in GCP, do a double-take to see whether it is a tag or a label. After many years of talking about tagging things, it is going to be a hard habit to break when actually referring to labels.
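To make the distinction concrete, here is a minimal sketch of how a firewall rule's target tags relate to an instance's network tags, while labels stay out of the picture. It is a simplified model, not Google's implementation: the function name and the instance dictionary are hypothetical, and it ignores service-account targets.

```python
def firewall_rule_applies(target_tags, instance_network_tags):
    """Return True if a rule with these target tags would cover the instance.

    Simplified model: a rule with no target tags is treated as applying to
    every instance in the network; otherwise one shared tag is enough.
    """
    if not target_tags:
        return True
    return bool(set(target_tags) & set(instance_network_tags))


# Hypothetical instance: network tags drive firewall matching,
# labels are key-value metadata used for billing and search.
instance = {
    "name": "web-1",
    "network_tags": ["http-server"],
    "labels": {"env": "production", "app": "storefront"},
}

print(firewall_rule_applies(["http-server"], instance["network_tags"]))  # True
print(firewall_rule_applies(["db-server"], instance["network_tags"]))    # False
```

Note that the labels on the instance play no part in the match. Only the network tags do.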
One of the best open secrets out there for managing billing and costs while using GCP is reOptimize, a cost discovery and optimization tool provided by DoiT International to the public.
With reOptimize Reports, you can get instant insights on your Google Cloud Platform billing, manage budgets, set up cost allocations, receive alerts for budgets and cost anomalies, and explore different cost optimization strategies.
CMP Cloud Reports
With Cloud Reports you get even more visibility into your Google Cloud costs with a host of improvements:
- Up to 36 months of historical data (vs. 3–6 months)
- 100x faster Reports load & refresh time
- Unlimited number of user and system labels (vs. only one or the other)
- Support for reports on credits such as SUDs or CUDs
- Built-in reports
- Far more chart types
- Mobile-friendly reports
- Get regularly scheduled report updates via email or Slack
Out of the box, we provide a set of predefined reports you can use right away. These reports are shared across your entire organization so you and your team/s can be on the same page effortlessly.
Cloud Reports work like a pivot table, allowing you to create highly configurable reports across any number of dimensions and measurements:
If you’re labeling your resources or use Iris3 to auto-generate labels for you, you can extract powerful insights from your billing information such as Google Cloud Storage cost per bucket:
Unlike reOptimize or Google Cloud Billing reports, CMP Cloud Reports offer hourly metrics of your billing data so it’s easy to spot billing issues or spikes derived from a recent deployment or change in configuration.
In a few weeks, we will provide you with an easy way to push annotations to your reports and integration with popular CI/CD systems such as CircleCI, Codefresh, Jenkins, and a few others.
If you and your team are interested in gaining access to Cloud Reports (and a host of other tools) at no additional cost, get in touch with us to learn more about becoming a DoiT International customer.
Best Practices for Labels
Below is a set of best practices that have been compiled by the DoiT International Customer Reliability Engineering team on using labels inside of GCP.
Before setting up labels, it is best to know the restrictions on them and what they can be applied to. I recommend reading Google’s documentation on the subject first, especially since it is bound to change after the time of this writing.
Not every product has label support yet, but many that do are not listed in the documentation. Checking the create and edit pages for a service is the best way to determine whether it supports labels.
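As an illustration of the kind of restrictions involved, the sketch below checks a set of labels against the rules as I understand them from Google's documentation: at most 64 labels per resource, keys of 1 to 63 characters starting with a lowercase letter, values of up to 63 characters, and only lowercase letters, digits, underscores, and dashes. It is an ASCII-only simplification with hypothetical names, so verify the limits against the current documentation before relying on it.

```python
import re

# Assumed limits (ASCII-only simplification of the documented rules).
MAX_LABELS_PER_RESOURCE = 64
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_-]{0,62}")
VALUE_PATTERN = re.compile(r"[a-z0-9_-]{0,63}")


def validate_labels(labels):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if len(labels) > MAX_LABELS_PER_RESOURCE:
        problems.append(f"too many labels: {len(labels)}")
    for key, value in labels.items():
        if not KEY_PATTERN.fullmatch(key):
            problems.append(f"invalid key: {key!r}")
        if not VALUE_PATTERN.fullmatch(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems


print(validate_labels({"env": "production", "cost_code": "cc-1042"}))
print(validate_labels({"Env": "Prod"}))
```

Running a check like this before creating resources catches uppercase letters and spaces, which are the most common mistakes.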
Always use labels
Sometimes when creating a new Compute Engine instance or an image based on one we forget to put a label on there, and then it gets forgotten about till accounting comes back asking about some expense on your bill.
To avoid this, just make sure it’s part of your workflow to always use a label when creating a resource and to ensure that it’s done on any automated creation of resources.
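One way to build this into an automated workflow is to refuse to create a resource unless a required set of labels is present. The following is a minimal sketch. The required keys and both function names are a hypothetical policy, and the actual resource creation call is left out.

```python
# Hypothetical policy: every resource must carry these label keys.
REQUIRED_LABEL_KEYS = {"env", "app", "owner"}


def missing_required_labels(labels, required=REQUIRED_LABEL_KEYS):
    """Return the required keys that are absent or empty, sorted for stable output."""
    return sorted(key for key in required if not labels.get(key))


def ensure_labeled(resource_name, labels):
    """Raise ValueError if the resource is about to be created without required labels."""
    missing = missing_required_labels(labels)
    if missing:
        raise ValueError(f"{resource_name} is missing labels: {', '.join(missing)}")


# Call this guard in deployment scripts just before the create call.
ensure_labeled("web-1", {"env": "production", "app": "storefront", "owner": "platform"})
print(missing_required_labels({"env": "dev"}))
```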
Use an automated resource labeling tool
Leading on from the previous tip, Iris3 is a newly rewritten open-source tool developed by us at DoiT to automatically label resources as they are created. This ensures you never miss applying a label to a resource.
It can be customized to work with additional resource types if the type you want to use is not supported yet.
Label all you need
Labels are key-value pairs, so use that to your advantage to add as much data as you need to them.
Here are some examples of labels to use:
- Environment names
Labeling a resource as belonging to development, staging, testing, production, etc. is always a safe choice so searching for all resources in production later for a billing report is much easier.
- Resource role or function
If you have a set of Compute Engine resources acting as web servers, another set in a GKE cluster, and one for a database server, label them as such.
- Application name
Using the application name as a label makes it easier to group resources by the business project or application they are associated with, and to look at costs per application.
- Region name
If your application or project spans multiple regions, adding the region value will help sort resources later. This can be the GCP region name, a logical region you have set up in your application, or both, with a label for each.
- Resource Creator
Putting the name of the creator in a label will help sort out who created what resource later instead of having to dig through audit logs that may or may not have been retained.
- Owner or maintainer
Labeling a resource with the owner or maintainer’s name shows who to contact in the case of an issue or question later. A common example of this would be the team that owns or maintains the resource.
- Cost, billing, or budget code
Some organizations have codes for different expenditures or budgets. Adding these into the label makes it easier for billing administrators or auditors to track these.
- Department
If the resource belongs to a certain department then adding it as a label makes it easier to track down a department later.
- Bucket name (Google Cloud Storage only)
If you have ever had to look at a GCP billing report or invoice you realize it groups all of the storage buckets together into one line item. This is a nightmare if you want to know what each bucket costs, so adding the bucket name as a label to each bucket will allow this to be broken out into separate line items.
- Associated resource name
If you have a resource that’s tied to another resource it’s a good idea to add a label naming that resource to it. Examples of this might be a persistent disk tied to a Compute Engine instance or a Dataproc cluster. An external IP that is tied to a Compute Engine managed instance group is a great example of this as well.
- Data classification
This is a broad labeling use case, but if you have any sort of data that needs to be labeled in a bucket or BigQuery dataset then add a label for it to denote this. Examples might be data that falls under regulatory compliance such as HIPAA or PCI or encrypted data. It’s best to label these resources with this so when a C-level executive asks how much money is spent on storing PHI (protected health information) per month it can be pulled up very quickly.
- Resource state
If a resource is active, pending deletion, disabled, etc., then labeling it as such makes it very easy to see how much is being billed to resources in a particular state.
- Folder or organization name
If you utilize folders or different organizational units in your organizational structure, make a label for them to be able to see what each organization or folder costs you later.
For a very good discussion on this topic and organizational structure for GCP please read this great article by my colleague Mike Sparr.
Create a standard set of labels you will apply to each resource and stick with it. It may make sense to have a standard set of labels to apply for each resource type. This makes searching for resources much easier and also makes tracking sets of resources on a billing report that much easier as well.
Once again, using Iris3 to do this automatically makes it much easier to keep the standard applied.
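If you are not using a tool for this, a small helper can keep a standard set consistent across scripts. The sketch below assumes a hypothetical standard of four keys (env, app, owner, cost_code) and normalizes free-form input into characters that labels accept. It is not how Iris3 works, just an illustration of the idea.

```python
import re


def to_label_value(text):
    """Lowercase the text and replace characters labels do not allow with dashes."""
    cleaned = re.sub(r"[^a-z0-9_-]", "-", text.strip().lower())
    return cleaned[:63]  # label values are limited to 63 characters


def standard_labels(env, app, owner, cost_code):
    """Build the (hypothetical) standard label set applied to every resource."""
    return {
        "env": to_label_value(env),
        "app": to_label_value(app),
        "owner": to_label_value(owner),
        "cost_code": to_label_value(cost_code),
    }


print(standard_labels("Production", "Store Front", "Platform Team", "CC-1042"))
```

Generating the labels in one place means a typo such as "prod" versus "production" cannot split one environment across two rows in a billing report.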
Search for resources using filters and labels
In filter bars throughout the console, if you start typing the word labels, a selection for it pops up in a dropdown. After selecting it, you can start typing the label name after the colon, and it will autocomplete so you can filter by that particular label.
Note that at the time of writing this article this doesn’t work on the Cloud Storage page yet.
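The same kind of label filter can be used outside the console. As a sketch, the helper below builds a `labels.KEY=VALUE` expression of the form accepted by the `--filter` flag on `gcloud` list commands. The helper name and the label values are hypothetical.

```python
def label_filter(**labels):
    """Build a filter expression matching resources that carry all given labels."""
    terms = [f"labels.{key}={value}" for key, value in sorted(labels.items())]
    return " AND ".join(terms)


expression = label_filter(env="production", app="storefront")

# Print a command line that lists only the matching Compute Engine instances.
print(f'gcloud compute instances list --filter="{expression}"')
```

Keys that contain a dash cannot be written as keyword arguments, but can be passed as `label_filter(**{"cost-code": "cc-1042"})`.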
Use Kubernetes labels in GKE
GKE gives you the ability to label each node in a cluster with the Kubernetes labels function (located on the Metadata page when creating a cluster).
Tip: Inside of Kubernetes you can use a selector to select these nodes. Note this only applies to using GKE and not standalone clusters.
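For example, a Pod can be pinned to labeled nodes through the `nodeSelector` field of its spec. The sketch below builds such a manifest as a Python dictionary and prints it as JSON. The function name, the `workload: web` node label, and the image are assumptions for illustration.

```python
import json


def pod_manifest(name, image, node_labels):
    """Build a Pod manifest that only schedules onto nodes carrying node_labels."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Every key-value pair here must match a label on the node.
            "nodeSelector": dict(node_labels),
            "containers": [{"name": name, "image": image}],
        },
    }


manifest = pod_manifest("web", "nginx", {"workload": "web"})
print(json.dumps(manifest, indent=2))
```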
Use labels in your BigQuery billing queries
If you have ever used BigQuery as your billing data sink, then you have probably seen a labels record in the tables it creates. Every label you add to a resource appears in that record on the rows for that resource. Using this data allows much better exploration of your billing data in BigQuery.
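As a sketch of what such a query can look like, the helper below builds a Standard SQL statement that sums cost per value of one label key. It assumes the usual billing export layout, with a repeated `labels` record containing `key` and `value` fields and a `cost` column. The table name is a placeholder to replace with your own export table, and the function name is hypothetical.

```python
def cost_by_label_query(table, label_key):
    """Return a Standard SQL query summing cost per value of one label key."""
    # Reject anything that does not look like a plain label key, since the
    # key is interpolated into the SQL text.
    if not label_key.replace("_", "").replace("-", "").isalnum():
        raise ValueError(f"unexpected label key: {label_key!r}")
    return (
        "SELECT l.value AS label_value, SUM(cost) AS total_cost\n"
        f"FROM `{table}`, UNNEST(labels) AS l\n"
        f"WHERE l.key = '{label_key}'\n"
        "GROUP BY label_value\n"
        "ORDER BY total_cost DESC"
    )


# Placeholder table name; substitute your own billing export table.
print(cost_by_label_query("my-project.billing.gcp_billing_export", "env"))
```

The resulting string can be pasted into the BigQuery console or passed to whichever client you use to run queries.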
Use labels on your Queues
If you use Pub/Sub for queueing or as part of a workflow, create a label on your topic to mark the application or workflow it belongs to. This is an often overlooked cost, especially for heavy users of Pub/Sub. Note that as of the time of this writing Pub/Sub Lite doesn’t support labels.
Keep your label policy up-to-date
When updating or rolling out a new project, or using a new service in GCP, check to see if it supports labels. If so, update your label policy to include it, or add it to Iris3 so it is labeled automatically.