Streamlining Kubernetes Resource Validation with Validating Admission Policies and CEL
In-cluster validation offers many practical uses, such as preventing accidental or malicious deletion of resources, limiting the number of replicas a deployment can have for better resource management, and requiring certain annotations, labels, or environment variables to be present (or absent), among other things. The process of validating resources within a Kubernetes cluster is simple to understand: every time a matching request is made to the API, a set of policies is run against it to determine whether it’s allowed or denied.
This can be incredibly valuable for cluster operators, as it helps enforce specific standards and rules on cluster users, such as developers. It’s no wonder that this has become a hot topic for many of our customers, and numerous products and companies have emerged to simplify the process of creating and running validation for Kubernetes resources.
This article explains the role of Dynamic Admission Control (DAC) in the validation process and the challenges it poses for cluster operators, so that readers can better appreciate the simplicity and benefits of Validating Admission Policies (VAPs), the Common Expression Language (CEL), and their RBAC-like syntax. This context should help you make more informed decisions about adopting these new features in your Kubernetes environments.
If you’re here for the actual code and examples, skip ahead to the Validating Admission Policies section. For more advanced use cases, take a look at the follow-up post: Validating Admission Policies in Kubernetes: Advanced Use Cases.
Challenges with implementing Dynamic Admission Control
Figure: Admission Controller Phases
Implementing request validation within your cluster isn’t always a walk in the park. It involves using DAC — a powerful but not always user-friendly tool for cluster operators.
DAC is a critical component in the Kubernetes validation process. It allows cluster administrators to manage the admission of resources into the cluster by intercepting API requests and either allowing or rejecting them based on a set of predefined policies. This offers fine-grained control over the resources being created, updated, or deleted in the cluster, ensuring that they adhere to specific standards and rules.
However, using DAC currently can be challenging for cluster operators due to its complexity. In short, the process entails writing an admission webhook server, creating the webhook resource, configuring authentication to the API server if needed, and managing the certificate used for TLS. Furthermore, operators need to be well-versed in Kubernetes concepts and security best practices and be proficient in programming languages to create and maintain the validating logic.
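To make the amount of moving parts concrete, here is a minimal sketch of only the registration step for a validating webhook. Every name, namespace, and path in it is a hypothetical placeholder, and the `caBundle` value must be replaced with a real base64-encoded CA certificate. The webhook server itself, its Service, and its TLS certificate still have to be written, deployed, and rotated separately.

```yaml
# Hypothetical webhook registration; all names, the namespace, and the path
# are illustrative assumptions, not taken from a real deployment.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: replica-limit.example.com
webhooks:
- name: replica-limit.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments"]
  clientConfig:
    # The API server calls this in-cluster Service over TLS, so the server
    # behind it has to be deployed and operated by you.
    service:
      namespace: webhooks
      name: replica-limit-webhook
      path: /validate
      port: 443
    # Placeholder: CA bundle used to verify the webhook's serving certificate.
    caBundle: <base64-encoded-CA-certificate>
```

Note that this manifest contains no validation logic at all; that logic lives in the code of the webhook server.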
Although tools like OPA Gatekeeper can streamline this process, cluster operators still need to deploy and manage the workload running in their cluster. But fear not! Kubernetes has introduced a new resource type called Validating Admission Policies (VAPs), available in alpha starting with v1.26, which offers a simpler solution. Their simplicity, which we demonstrate in the next sections, will likely make VAPs the de facto standard for performing in-cluster validation.
Introducing VAPs — a cluster operator’s best friend for resource validation
VAPs leverage Google’s CEL to define validation expressions that will be run against matching API requests. Because CEL expressions are evaluated directly in the API server, there’s no need to write or deploy custom workloads to evaluate your policies. This makes policy creation incredibly straightforward and accessible compared to alternative methods, such as the previously discussed Validating Admission Webhooks.
If you’re concerned about overloading the API and impacting other processes, don’t worry — CEL comes with resource constraints to keep things in check. It’s worth learning more about these safeguards by reading about CEL resource constraints in the official Kubernetes documentation.
CEL support was first introduced in Kubernetes v1.23 for inline validation of Custom Resource Definitions (CRDs), a feature that has been in beta since v1.25. This has opened up numerous possibilities for CEL use cases within Kubernetes. For a deeper dive into the origins of CEL in Kubernetes and its potential, we recommend watching Cici Huang’s talk, “The Path to Self Contained CRDs,” which inspired our exploration of Validating Admission Policies.
Harnessing VAPs and CEL for Efficient Validation
One of the main advantages of using CEL and VAPs is their simplicity. Since the validation logic is evaluated directly within the API server, there’s no additional overhead from custom workloads, making it easier to scale the validation process as the cluster grows. Additionally, CEL provides a lightweight and efficient way to express validation rules, improving performance and reducing the risk of overloading the API server. Built-in resource constraints further ensure the stability and efficiency of the validation process.
By leveraging Validating Admission Policies and CEL, cluster operators can use a more straightforward and scalable validation process. This allows them to enforce standards and rules within the cluster more efficiently and effectively than with other methods. The ease of use, scalability, and performance benefits of VAPs and CEL make them a compelling alternative for cluster operators looking to streamline Kubernetes in-cluster validation.
Validating Admission Policies
Using this feature requires two main components:
ValidatingAdmissionPolicy: defines the failure policy, the request matching, and the CEL validation expressions. In other words, the policy itself.
ValidatingAdmissionPolicyBinding: defines the scope of the policy by binding it to a set of matched resources.
Even though this feature has been in alpha since v1.26, some capabilities are only present starting with v1.27 (audit annotations, validation actions), so all of the following manifests were tested on a Kubernetes cluster running v1.27.1 (with alpha features and the ValidatingAdmissionPolicy feature gate enabled).
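If you want a throwaway cluster with the alpha feature turned on, one option is a local kind cluster. This is an assumption on our side, since nothing above depends on kind specifically. A minimal sketch of a kind configuration that enables both the feature gate and the alpha API group could look like this:

```yaml
# Sketch of a kind cluster config (kind is an assumed tool choice here).
# Pin a node image matching the Kubernetes version you want to test against.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  # Enables the alpha admission plugin logic in the API server.
  ValidatingAdmissionPolicy: true
runtimeConfig:
  # Serves the alpha API group that holds the policy and binding resources.
  "admissionregistration.k8s.io/v1alpha1": "true"
```

On other distributions, the equivalent is usually the API server's feature-gate and runtime-config settings; check your provider's documentation for how alpha features are exposed.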
All examples in this post can also be found in a GitHub repository created for this purpose.
The first example comes straight from the documentation. We’ll create and use the demo namespace for running all the examples in this post:
$ echo 'apiVersion: v1
kind: Namespace
metadata:
  labels:
    environment: demo
  name: demo' | k apply -f-
namespace/demo created
$ k config set-context --current --namespace demo
Next, we will create a policy that prevents Deployments from having more than five replicas:
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5"
The failurePolicy field can be set to one of the following values:
Fail means that an error calling the ValidatingAdmissionPolicy causes the admission to fail and the API request to be rejected.
Ignore means that an error calling the ValidatingAdmissionPolicy is ignored and the API request is allowed to continue.
The matchConstraints field is used to match incoming requests. Here it is configured so that the policy only applies to API requests for Deployments in the apps/v1 API, and only to requests that CREATE or UPDATE them.
The validations field contains the actual CEL expressions that are run against the matched API requests. All expressions must evaluate to true for the request to be admitted.
We’ll create the following binding to complete the configuration:
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "demo-binding-test.example.com"
spec:
  policyName: "demo-policy.example.com"
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: demo
We’re binding the policy created earlier to any namespace that has the label environment=demo set. This is useful for configuring where we want to enforce the policy validation.
There are more options for matching namespaces, such as matchExpressions for more granular matching, as well as other configuration options: excludeResourceRules to exclude certain resources, and objectSelector to match certain objects (discouraged, since developers can omit a label to avoid auditing). I’ll cover the more interesting options in later examples.
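As a rough illustration of the more granular matching mentioned above, the following sketch binds the same policy to namespaces whose environment label is either demo or staging. The binding name and the staging value are hypothetical. It also uses the Warn and Audit validation actions available from v1.27, so violations are surfaced without rejecting the request.

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "demo-binding-staging.example.com"   # hypothetical name
spec:
  policyName: "demo-policy.example.com"
  # Warn returns a warning to the client and Audit records the failure in the
  # audit event for the request; neither rejects the request.
  validationActions: [Warn, Audit]
  matchResources:
    namespaceSelector:
      # Standard label selector syntax: match namespaces whose "environment"
      # label has any of the listed values ("staging" is an assumed example).
      matchExpressions:
      - key: environment
        operator: In
        values: ["demo", "staging"]
```

A non-blocking binding like this can be a gentle way to roll out a new policy: watch the warnings and audit events first, then switch the action to Deny.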
Once these manifests are applied to the cluster, trying to create a new Deployment that violates the policy will throw an error, as expected:
$ k create deployment nginx --image=nginx --replicas=10
error: failed to create deployment: deployments.apps "nginx" is forbidden: ValidatingAdmissionPolicy 'demo-policy.example.com' with binding 'demo-binding-test.example.com' denied request: failed expression: object.spec.replicas <= 5
This also works when trying to diff the resource:
$ k create deployment nginx --image=nginx --replicas=10 --dry-run=client -oyaml | k diff -f -
The deployments "nginx" is invalid: : ValidatingAdmissionPolicy 'demo-policy.example.com' with binding 'demo-binding-test.example.com' denied request: failed expression: object.spec.replicas <= 5
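For contrast, here is a sketch of a Deployment that should satisfy the policy, assuming the policy and binding from above are the only ones in effect. The app label is an arbitrary choice; the replica count sits exactly at the limit, so the expression object.spec.replicas <= 5 evaluates to true.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: demo        # the namespace labeled environment=demo
spec:
  replicas: 5            # at the limit, so the CEL expression holds
  selector:
    matchLabels:
      app: nginx         # arbitrary label chosen for this example
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
```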
Multiple validation expressions
Sometimes you’ll want to perform multiple validations on a resource. Consider the following validation expressions for Deployments:
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  # Deployments can't have more than 3 replicas
  - expression: "object.spec.replicas <= 3"
  # Deployment containers must be using images hosted in europe-west1 Artifact Registry in project test-eyal
  - expression: "object.spec.template.spec.containers.all(c, c.image.startsWith('europe-west1-docker.pkg.dev/test-eyal/'))"
  # Deployment cannot use emptyDir volumes
  - expression: "!has(object.spec.template.spec.volumes) || object.spec.template.spec.volumes.all(v, !has(v.emptyDir))"
A Deployment that is matched by this policy needs all expressions to evaluate to true in order to be admitted to the cluster.
One thing to note is that the validations are executed sequentially. If one expression fails, it’ll return that failure to the client immediately — this means that if your resource violates more than one of the validation expressions, it will be an iterative process of getting denied, fixing the violation and trying again.
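To see that iterative loop in practice, you could apply a Deployment that breaks every rule at once. The manifest below is a hypothetical example built for that purpose. Based on the behavior described above, we would expect each attempt to report a single violation until all three are fixed.

```yaml
# Hypothetical Deployment that violates all three validations.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: demo
spec:
  replicas: 10                # violates: object.spec.replicas <= 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx          # violates: image is not under europe-west1-docker.pkg.dev/test-eyal/
        volumeMounts:
        - name: test
          mountPath: /cache
      volumes:
      - name: test
        emptyDir: {}          # violates: emptyDir volumes are not allowed
```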
Customize the validation message
It is also possible to supply a custom message to be returned to the client in case of a failed validation. You can even perform interpolation if necessary by using a messageExpression. A message expression has access to the same variables as the validation expression, such as object.
Let’s update our last policy with custom messages rather than comments:
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 3"
    messageExpression: "'Deployments cannot have more than 3 replicas, this one has ' + string(object.spec.replicas)"
  - expression: "object.spec.template.spec.containers.all(c, c.image.startsWith('europe-west1-docker.pkg.dev/test-eyal/'))"
    message: "Deployment containers must be using images hosted in europe-west1 Artifact Registry in project test-eyal"
  - expression: "!has(object.spec.template.spec.volumes) || object.spec.template.spec.volumes.all(v, !has(v.emptyDir))"
    # join() turns the list of volume names into a single string so it can be concatenated
    messageExpression: "'Deployment cannot use emptyDir volumes, change the following volumes: ' + object.spec.template.spec.volumes.filter(v, has(v.emptyDir)).map(v, v.name).join(', ')"
It’s important to note that any interpolated values in the messageExpression field need to be of type string, otherwise the message will return an error for the failed evaluation instead.
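The two conversions you are most likely to need are string() for numbers and join() for lists of strings. The policy below is a hypothetical sketch (its name and the two-container limit are made up for illustration) that uses both in a single message.

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-message-types.example.com"   # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  # Illustrative rule: allow at most 2 containers per Deployment.
  - expression: "object.spec.template.spec.containers.size() <= 2"
    # size() returns an int, so it is wrapped in string(); the list of
    # container names is flattened into one string with join().
    messageExpression: "'Deployments can run at most 2 containers, this one has ' + string(object.spec.template.spec.containers.size()) + ': ' + object.spec.template.spec.containers.map(c, c.name).join(', ')"
```

Concatenating the int or the list directly to the string prefix would not produce a string, which is the failure mode described above.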
Without setting a custom message, a deployment with an emptyDir volume would have failed with the following:
The deployments "nginx" is invalid: : ValidatingAdmissionPolicy 'demo-policy.example.com' with binding 'demo-binding-test.example.com' denied request: failed expression: !has(object.spec.template.spec.volumes) || object.spec.template.spec.volumes.all(v, !has(v.emptyDir))
This might be somewhat hard to understand from a client's perspective. With the custom message expression, we would get the following:
The deployments "nginx" is invalid: : ValidatingAdmissionPolicy 'demo-policy.example.com' with binding 'demo-binding-test.example.com' denied request: Deployment cannot use emptyDir volumes, change the following volumes: test
Note that I’m using a messageExpression rather than a normal message for this validation in order to demonstrate the power of CEL. The fact that you can do something doesn’t always mean that you should!
As Kubernetes continues to evolve, CEL support is set to introduce more exciting capabilities in future releases. We encourage you to explore Validating Admission Policies and become acquainted with this powerful feature, as it’s likely to become a go-to tool for cluster operators.
Remember that the Validating Admission Policy feature gate is currently in Alpha, which means changes and improvements are expected as it progresses toward General Availability. Stay informed by following updates in the Kubernetes changelog and documentation to ensure you’re up to speed with the latest developments.
Given the ongoing nature of this feature, the content of this article may need to be updated over time. If you come across an example that no longer works or a statement that is no longer accurate, please reach out to us, and we’ll review and update the information accordingly.