If you work with Linux or networking, you have probably heard of eBPF as a promising technology for observability, security and networking. Open source projects such as Cilium and Calico have gained a lot of traction thanks to their eBPF-based tooling. So what exactly is eBPF? How does it work? And how can we use it? To answer those questions, let's start by looking at how Linux works and how it runs user-generated code.
## First - User space vs Kernel space
Linux separates memory into two distinct areas: user space and kernel space. In the diagram above, you can see what processes run in each space. Users can be unpredictable in how they use the system and may run potentially dangerous software, so Linux protects critical system processes and functions by splitting the available memory into user space and kernel space. Code in user space has access to only a limited part of memory and can reach the kernel only through a narrow interface that the kernel exposes: system calls. Kernel space, on the other hand, is where the operating system kernel runs. It has direct access to system resources and provides services such as process scheduling, memory management, and device drivers to user-level processes.
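To make the system call boundary a bit more concrete, here is a minimal Python sketch. The function name `roundtrip_through_kernel` is made up for this post. It assumes a Unix-like system where Python's `os.pipe`, `os.write` and `os.read` are thin wrappers over the corresponding system calls. The user-space program never touches the kernel's buffer directly; it only asks the kernel to do the work.

```python
import os


def roundtrip_through_kernel(message: bytes) -> bytes:
    """Send bytes through a kernel-managed pipe and read them back."""
    # pipe(2): the kernel creates the pipe and hands us two integer file
    # descriptors. We never get direct access to the kernel's buffer.
    read_fd, write_fd = os.pipe()
    try:
        # write(2): the kernel copies our bytes into its own buffer.
        os.write(write_fd, message)
        # read(2): the kernel copies the bytes back into user-space memory.
        return os.read(read_fd, len(message))
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(roundtrip_through_kernel(b"hello from user space"))
```

Every line that touches the pipe crosses from user space into kernel space and back. This is the narrow interface that ordinary programs are limited to.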
So why is this important to understanding eBPF?
eBPF builds on this separation by giving developers a way to safely execute code in kernel space. A developer can write a program that safely observes what is happening inside the kernel. This matters because it significantly expands what you can do as a developer working on a Linux machine. You can collect granular security data that was once only available to the kernel and use it to analyze and improve your application security, trace applications to get insights for performance troubleshooting, and much more.
So what is eBPF?
eBPF (or extended Berkeley Packet Filter) allows developers to insert code that runs safely inside a sandbox in kernel space. This sandbox is a virtual machine inside the kernel with privileged access to kernel data. eBPF programs are compiled to bytecode, which the kernel then compiles just-in-time (JIT) into native instructions. Tooling exists for writing and loading programs from languages such as C, Go, Rust and more. The result is a way to add custom functionality to the operating system without changing kernel code or loading a kernel module.
eBPF programs are event-driven: they run when the kernel or an application passes a hook point. Pre-defined hooks include:
- system calls
- function entry/exit
- kernel tracepoints
- network events
Once execution reaches a hook point, the eBPF program attached to it is executed.
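The attach-then-fire model is easier to see in code. The sketch below is a pure-Python toy, not a real eBPF API: `HookRegistry`, `attach`, `fire` and `record_open` are all names invented for illustration, and the hook names are just string labels. It only models the idea that a program sits idle until its hook is hit.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class HookRegistry:
    """Toy stand-in for the kernel's hook points (not a real eBPF API)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def attach(self, hook: str, handler: Handler) -> None:
        """Register a handler to run whenever `hook` is hit."""
        self._handlers[hook].append(handler)

    def fire(self, hook: str, event: dict) -> int:
        """Run every handler attached to `hook`; return how many ran."""
        handlers = self._handlers.get(hook, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


opened_files: List[str] = []


def record_open(event: dict) -> None:
    # Hypothetical "program": remember which file was opened.
    opened_files.append(event["filename"])


registry = HookRegistry()
registry.attach("sys_enter_openat", record_open)

# Only the first event matches a hook with an attached handler.
registry.fire("sys_enter_openat", {"filename": "/etc/hosts"})
registry.fire("sys_enter_write", {"fd": 1})
```

In a real system the kernel does the firing, and the handler is verified eBPF bytecode rather than a Python function, but the control flow is the same: no event, no execution.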
Verifying the eBPF Program
As described above, one of the main reasons this works well is that the virtual machine lets developers run their code safely. When you're writing an eBPF program, most of the time you'll use a project like Cilium, bcc or bpftrace, which provide layers of abstraction on top of eBPF and make it easier to write programs. These projects load the eBPF program you've written into the kernel using the bpf system call. The program then goes through several verification and validation steps which ensure that it is safe to run. Some of these include:
- The process loading the eBPF program holds the required capabilities (privileges). Unless unprivileged eBPF is enabled, only privileged processes can load eBPF programs.
- The program does not crash or otherwise harm the system.
- The program always runs to completion (i.e. the program does not sit in a loop forever, holding up further processing).
The component that runs these checks is called the verifier. Once a program passes verification, it goes to the JIT compiler, which translates it into the machine-specific instruction set that the kernel can execute.
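To give a feel for the "always runs to completion" check, here is a heavily simplified toy verifier in Python. The three-opcode instruction format (`add`, `jmp`, `exit`), the `MAX_INSTRUCTIONS` limit and the `verify` function are all assumptions made up for this sketch. The real kernel verifier is far more sophisticated: it tracks register state and memory access, and it can accept some bounded loops. The toy simply rejects any backward jump, which is a blunt but sufficient way to guarantee termination.

```python
from typing import List, Tuple

Instruction = tuple  # e.g. ("add", 1), ("jmp", 2), ("exit",)
MAX_INSTRUCTIONS = 64  # arbitrary limit chosen for this toy model


def verify(program: List[Instruction]) -> Tuple[bool, str]:
    """Return (accepted, reason) for a toy instruction list."""
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSTRUCTIONS:
        return False, "program too large"
    for pc, insn in enumerate(program):
        op = insn[0]
        if op == "jmp":
            offset = insn[1]
            # Jumps are relative to the next instruction.
            target = pc + 1 + offset
            if offset < 0:
                return False, f"backward jump at {pc} (possible loop)"
            if target >= len(program):
                return False, f"jump out of bounds at {pc}"
        elif op not in ("add", "exit"):
            return False, f"unknown opcode {op!r} at {pc}"
    # With forward-only, in-bounds jumps, every path keeps moving toward
    # the end, so a final "exit" guarantees the program terminates.
    if program[-1][0] != "exit":
        return False, "program can fall off the end without exiting"
    return True, "ok"


if __name__ == "__main__":
    print(verify([("add", 1), ("exit",)]))
    print(verify([("add", 1), ("jmp", -2), ("exit",)]))
```

The second program jumps back to its first instruction forever, so the toy verifier refuses to accept it, just as a real program that cannot be proven to terminate is refused at load time.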
Projects using eBPF
There is a whole ecosystem growing around eBPF that I suspect will continue to grow. Below, I've listed some of the more popular projects. For a much more complete list, check out this link.
- Calico - Pluggable eBPF-based networking and security for containers and Kubernetes
- Cilium - eBPF-based Networking, Security and Observability
- bpftrace - High-level tracing language for Linux eBPF
- Pixie - Scriptable observability for Kubernetes
- Falco - Cloud native runtime security
eBPF in Kubernetes
We mentioned above a few projects that are bringing eBPF to Kubernetes, and I want to expand a little on those use cases. eBPF has gained popularity in the Kubernetes ecosystem as a way to add custom networking and security functionality to clusters. eBPF-based projects plug into the Kubernetes CNI (Container Network Interface) plugin architecture to implement networking features such as load balancing, traffic shaping, and network policy enforcement. Projects like Istio are even introducing new eBPF-based solutions.
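To use eBPF in Kubernetes, you'll need to install a CNI plugin that supports eBPF, such as Cilium or Calico. These plugins deploy and manage the eBPF programs within your Kubernetes cluster for you. Once a plugin is installed, you typically don't write eBPF code yourself. Instead, you describe the behavior you want, such as network policies, in YAML manifests, and the plugin enforces it using its eBPF programs.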
This was a short introduction to eBPF and how it can be used for security, networking, observability and more. As the eBPF ecosystem continues to grow, I'm sure more projects will come out that push forward what eBPF can do. At Nucleus, we're particularly interested in how eBPF will be applied to Kubernetes. We'll definitely be keeping an eye on it!