
eBPF, Cilium, Dataplane V2 and All That Buzz (Part 2)

I hope you enjoyed reading about eBPF in Part 1! Now let’s examine Cilium as a popular K8s eBPF solution and find out how it relates to Dataplane V2.

Cilium 🐝 is a “hot” technology that’s powered by eBPF. It’s often the first thing mentioned when eBPF comes up, especially in the context of K8s. Cilium is open-source software that acts as a CNI plugin for Kubernetes. It provides eBPF-based networking, observability, and security, built for scale and performance, for platform teams operating Kubernetes environments in the cloud and on-premises.

 

Taking advantage of Extended Berkeley Packet Filter (eBPF), Cilium brings a number of interesting features to K8s. Let’s take a closer look.


Network Policies 🕸

It’s good practice to apply least-privilege security to the way your K8s pods communicate with each other. The basic K8s Network Policies (which operate at L3/L4) do a good job, but you can build upon them with Cilium Network Policies (which operate at L3-L7).

This is very useful in the world of K8s and microservices, because examining and controlling network traffic using only network-level metadata (e.g., IPs and ports) doesn’t provide much value. IPs and ports change all the time as services come and go. With Cilium, you can also control the traffic with Pod, HTTP, gRPC, Kafka, DNS, and other metadata.

For example, you can define HTTP rules that allow a certain pod to make only a specific API call, matched by path, header, and request method. Another example is defining DNS rules based on FQDN, so that only queries to a specific domain are allowed. This lets us define security policies that are more valuable and usable in real-world use cases.
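To make those two examples concrete, here is a minimal sketch in Python that builds the manifests as plain dicts and prints them as JSON (which kubectl accepts, e.g. via kubectl apply -f -). The field layout follows the CiliumNetworkPolicy format as I understand it; the labels (app=frontend, app=backend), the path /api/v1/orders, the domain api.example.com, and the helper names are all made up for illustration, so check them against the Cilium docs for your version before using anything like this.

```python
import json


def http_policy(name, target_app, client_app, method, path, port="80"):
    """L7 rule: pods labelled app=client_app may only call `method path`
    on pods labelled app=target_app (labels are hypothetical)."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "endpointSelector": {"matchLabels": {"app": target_app}},
            "ingress": [{
                "fromEndpoints": [{"matchLabels": {"app": client_app}}],
                "toPorts": [{
                    "ports": [{"port": port, "protocol": "TCP"}],
                    "rules": {"http": [{"method": method, "path": path}]},
                }],
            }],
        },
    }


def fqdn_policy(name, client_app, domain):
    """DNS/FQDN rule: pods labelled app=client_app may only resolve and
    reach `domain` (assumes kube-dns runs in kube-system)."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "endpointSelector": {"matchLabels": {"app": client_app}},
            "egress": [
                # Allow DNS lookups, but only for the given name.
                {
                    "toEndpoints": [{"matchLabels": {
                        "k8s:io.kubernetes.pod.namespace": "kube-system",
                        "k8s:k8s-app": "kube-dns",
                    }}],
                    "toPorts": [{
                        "ports": [{"port": "53", "protocol": "ANY"}],
                        "rules": {"dns": [{"matchName": domain}]},
                    }],
                },
                # Allow traffic to whatever IPs that name resolved to.
                {"toFQDNs": [{"matchName": domain}]},
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(
        http_policy("allow-get-orders", "backend", "frontend",
                    "GET", "/api/v1/orders"), indent=2))
    print(json.dumps(
        fqdn_policy("allow-example-api", "frontend", "api.example.com"),
        indent=2))
```

Notice that neither policy mentions a pod IP: the selectors are labels and names, which is exactly what keeps the rules valid as pods come and go.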

Multi-Cluster Connectivity 🔗

By using a cluster mesh, Cilium allows K8s pods to communicate and be discovered across K8s clusters. Use cases include High Availability and Multi-Cloud (connecting K8s clusters across cloud providers).

Load Balancing ⚖️

Cilium can replace kube-proxy with an eBPF implementation. kube-proxy uses iptables, which Cilium swaps out for eBPF. This change can dramatically improve performance, especially in clusters with many services.
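One way to build intuition for the performance difference: iptables walks its rules one after another, while eBPF programs can look a service up in a hash map. The toy Python sketch below only models that difference in lookup cost. It is not how either kube-proxy or Cilium is implemented, and the service table, addresses, and function names are invented for the illustration.

```python
# Hypothetical service table: (virtual IP, port) -> backend pod address.
SERVICES = {
    (f"10.0.{i // 256}.{i % 256}", 80): f"10.1.{i // 256}.{i % 256}:8080"
    for i in range(5000)
}

# The same services as an ordered rule list, in insertion order.
RULES = list(SERVICES.items())


def lookup_linear(vip, port):
    """Rule-list style: scan until a rule matches.
    Returns (backend, number of rules examined)."""
    for checked, ((rule_ip, rule_port), backend) in enumerate(RULES, start=1):
        if (rule_ip, rule_port) == (vip, port):
            return backend, checked
    return None, len(RULES)


def lookup_hashed(vip, port):
    """Hash-map style: a single keyed lookup, however many services exist."""
    return SERVICES.get((vip, port))


if __name__ == "__main__":
    vip = "10.0.19.135"  # the last service added to the table
    backend, examined = lookup_linear(vip, 80)
    print(f"linear scan: {backend} after examining {examined} rules")
    print(f"hash lookup: {lookup_hashed(vip, 80)} in one step")
```

In the list version, the cost of finding a service grows with the number of services. In the map version it stays roughly constant.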

Additional Features

A few words on the alternatives: there are other capable CNI plugins on the market. Here’s a comparison (it might be opinionated). Calico has also recently introduced an eBPF dataplane.

Dataplane V2 ✈

Google Cloud Platform is keeping GKE (Google Kubernetes Engine) in the loop by building Cilium into its own mechanism, Dataplane V2. But is Dataplane V2 really a Google-managed Cilium for GKE? We do love our managed services, right? This calls for a closer inspection.

Google’s documentation of Dataplane V2 concepts makes no reference to the Cilium project (at the time of writing this blog post). However, the official blog post and some other documentation do contain a few minor references.

Dataplane V2 control plane is deployed as a K8s DaemonSet called anetd. A quick kubectl describe daemonsets.apps -n kube-system anetd reveals that it is using the image gke.gcr.io/cilium/cilium:v1.9.4-gke.17.

So, is that really Cilium? Let’s run kubectl exec -n kube-system -ti ds/anetd -- cilium version. Here is the output:

Client: 1.9.4 609a63dfb 2021-04-12T15:01:54-07:00 go version go1.15.7 linux/amd64

Yes! It’s indeed Cilium 1.9.4! However, upon comparing it with the official Cilium v1.9.4 image, we get a slightly different result:

Client: 1.9.4 07b62884c 2021-02-03T11:45:44-08:00 go version go1.15.7 linux/amd64
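If you want to compare the two outputs programmatically rather than by eye, a small sketch like the following would do. It assumes the whitespace-separated field order seen in the two outputs above (version, commit, build timestamp, Go version, platform); the function name and dict keys are my own.

```python
from datetime import datetime


def parse_cilium_version(line):
    """Split a `cilium version` client line into its fields.
    Assumes the layout: Client: <version> <commit> <built> go version <go> <platform>."""
    fields = line.split()
    return {
        "version": fields[1],
        "commit": fields[2],
        "built": datetime.fromisoformat(fields[3]),
        "go": fields[6],
        "platform": fields[7],
    }


GKE = "Client: 1.9.4 609a63dfb 2021-04-12T15:01:54-07:00 go version go1.15.7 linux/amd64"
UPSTREAM = "Client: 1.9.4 07b62884c 2021-02-03T11:45:44-08:00 go version go1.15.7 linux/amd64"

if __name__ == "__main__":
    gke, upstream = parse_cilium_version(GKE), parse_cilium_version(UPSTREAM)
    for key in ("version", "commit", "go", "platform"):
        status = "same" if gke[key] == upstream[key] else "DIFFERENT"
        print(f"{key:10} {gke[key]:14} {upstream[key]:14} {status}")
    print("GKE build is newer by", gke["built"] - upstream["built"])
```

The version, Go toolchain, and platform match. The commit hash differs and the GKE image was built roughly two months after the upstream one, which suggests a separate build of the same release rather than the upstream image being used as-is.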

Now let’s compare the Docker images gke.gcr.io/cilium/cilium:v1.9.4-gke.17 and quay.io/cilium/cilium:v1.9.4 with a tool like Dive. There are some changes in the layers, but it’s hard for me to tell whether there is any major logic change between them when it comes to Cilium’s offerings.

It’s also worth mentioning that this blog post claims Google stepped in and contributed a number of meaningful features to the Cilium project. I think that shows a certain degree of commitment.

So, Dataplane V2 = Managed Cilium?

So far, I cannot conclude that Dataplane V2 is a managed Cilium for GKE. Without an official statement to that effect, we can say that, at least as a product, Dataplane V2 ≠ Cilium. It looks like Cilium is being used under the hood. Google’s docs simply don’t mention Cilium’s docs or point you to them. It’s a completely different offering.

From the testing I have done, some Cilium features seem to work on Dataplane V2. However, there’s no official Google support for them. Needless to say, “uncharted” Cilium features on Dataplane V2 could work today but break unexpectedly at any time. Hence, it’s best not to enter uncharted waters. Just follow the official docs to be on the safe side.

Vanilla Cilium or Dataplane V2? 🤔

Here’s a feature comparison:

https://gist.github.com/yarelm/a9e9b5ab51ee2b8a79b8c16acb4444ba

The features currently common to both Cilium and Dataplane V2 are:

  • K8s Network Policies (not CiliumNetworkPolicy, although Dataplane V2 doesn’t seem to reject it at the moment),
  • Replacing kube-proxy with eBPF,
  • Network Policy Logging. It’s not really a Cilium feature, but it is based on Cilium. It lets you monitor which connections your network policies allowed or denied.

My Opinion 💭

I always try to opt for the simplest solution possible. If it’s managed, that’s great! It saves precious time for everyone involved. Dataplane V2 seems like the simpler, easier managed solution if all you need is K8s Network Policies, an eBPF replacement for kube-proxy for performance and scale, and easy-to-use logging of network policy outcomes.

Just make sure you are aware of its limitations. If you need additional Cilium features such as Hubble, Cilium L7 Network Policies, or Cluster Mesh, or if you run self-hosted or on a different cloud provider, you’ll probably want to go with vanilla Cilium instead.

Dataplane V2: The Pros 👍

  • Firstly, it’s easy to install. Just add the --enable-dataplane-v2 flag when creating a new GKE cluster via gcloud container clusters create.
  • It is based on the open-source Cilium project.
  • Dataplane V2 acts as the foundation for Network Policy Logging in GKE. This is a nifty feature that creates logs when a connection is allowed or denied by a network policy.
  • The anetd plugin that is part of Dataplane V2 (based on Cilium) is managed by Google and is currently in GA (General Availability). That means it is ready for production workloads, with ongoing updates and support.
  • It’s reasonable to assume Google will add more integration with GKE-native features, and perhaps with Cilium-native features as well. This makes the choice more compelling if you are looking at the long term.

Dataplane V2: The Cons 👎

  • As of this moment, I couldn’t find a way to use Hubble with Dataplane V2. Hubble is a very nice observability tool by Cilium that provides important visibility by building on Cilium’s eBPF dataplane. You can track it here.
  • Officially, Dataplane V2 is not a managed Cilium solution. This means you can’t rely on Cilium features that happen to work for you on Dataplane V2 today. They may break in the future.
  • You can only enable it on a newly created GKE cluster. This means you can’t use it on your existing GKE clusters.
  • There are some other limitations you should be aware of.

Getting Started 🏃🏽‍♀️

To get started with Cilium on K8s: Click here.

To get started with Dataplane V2 on GKE: Click here.

Pro Tip: When you are writing or planning your K8s/Cilium Network Policy manifests, use the Cilium Editor for a fun and safe experience.

The Future of eBPF

As groundbreaking as this technology is, I predict we will continue to see many more solutions and interesting developments built on eBPF.

One such potential area of influence is the service mesh world. Most of the existing service mesh solutions (e.g., Istio, Linkerd) rely on sidecar proxies attached to your pods. This impacts performance, adds complexity, and introduces additional points of failure. eBPF has the potential to provide service mesh capabilities by replacing the sidecar proxies with eBPF logic, potentially making service mesh accessible for additional use cases.

Stay tuned!


Thanks for reading! To stay connected, follow us on the DoiT Engineering Blog, DoiT LinkedIn Channel, and DoiT Twitter Channel. To explore career opportunities, visit https://careers.doit.com.
