
eBPF for Linux Sysadmins: Real-Time Observability Without Agents or Code Changes

πŸ“‘ Table of Contents

If you’ve spent any time troubleshooting a production Linux system, you know the drill: you notice latency spikes, dropped packets, or mysterious CPU saturation, and then you start bolting on monitoring agents, enabling verbose logging, and restarting services just to get visibility. That approach has worked for years, but it comes with real costs β€” agent overhead, configuration drift, code changes, and sometimes a reboot window you don’t have. eBPF changes that equation entirely. It gives you deep, real-time insight into the Linux kernel and user-space applications with virtually zero overhead and zero code changes to the applications you’re watching.

In 2026, eBPF is no longer an experimental curiosity. It is the foundational technology behind some of the most widely deployed observability, security, and networking tools in the Linux ecosystem. If you’re a sysadmin or DevOps engineer and you haven’t gotten hands-on with it yet, this guide is your starting point.

What eBPF Actually Is

eBPF stands for extended Berkeley Packet Filter. The original BPF was introduced in the early 1990s as an efficient in-kernel mechanism for filtering network packets. The “extended” version, which landed in Linux 3.18 (released in 2014) and matured significantly through the 4.x and 5.x kernel series, turned that narrow packet-filtering mechanism into a general-purpose, safe, sandboxed virtual machine running inside the kernel.

The core concept is this: you write a small program, compile it to eBPF bytecode, and load it into the kernel. The kernel verifies the program before execution β€” it checks for infinite loops, out-of-bounds memory access, and other dangerous patterns. If it passes verification, the just-in-time (JIT) compiler turns it into native machine code, and your program runs inside the kernel with access to kernel data structures, hardware counters, and network events.

What makes this powerful for observability is the attachment points. You can attach eBPF programs to:

  • Kernel functions and system calls (kprobes, tracepoints)
  • Network packet processing (XDP, TC hooks)
  • User-space function entry and exit (uprobes)
  • Scheduler events
  • Memory subsystem events
  • Hardware performance counters

You attach a probe, collect data into eBPF maps (key-value stores shared between kernel and user space), and a user-space program reads those maps and presents the data. No kernel module. No recompilation. No application restart.
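The in-kernel side of this pattern is usually just aggregation into a map. As a rough illustration in plain Python (not eBPF — the function names here are mine), this is the log2-bucket histogram pattern tools like biolatency use, with a dict standing in for the eBPF map shared with user space:

```python
from collections import defaultdict

def log2_bucket(value):
    """Bucket index ~ floor(log2(value)) + 1, the power-of-two binning hist() uses."""
    return value.bit_length()  # 0 for 0, 1 for 1, 2 for 2-3, 3 for 4-7, ...

def aggregate(latencies_us, histogram=None):
    """Increment one bucket per event, like an eBPF program updating a map."""
    histogram = defaultdict(int) if histogram is None else histogram
    for v in latencies_us:
        histogram[log2_bucket(v)] += 1
    return histogram

# The user-space half reads the shared map and renders it:
hist = aggregate([3, 5, 6, 120, 130, 2000])
for bucket in sorted(hist):
    lo, hi = (0, 0) if bucket == 0 else (1 << (bucket - 1), (1 << bucket) - 1)
    print(f"[{lo}, {hi}] us: {'*' * hist[bucket]}")
```

The important property is that only the small fixed-size histogram crosses the kernel/user-space boundary, never the raw event stream — which is why the overhead stays low.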

Why eBPF Matters More Than Ever in 2026

The shift to containerized, microservices-based architectures has made traditional monitoring harder. You have hundreds of short-lived processes, dynamic networking, and workloads that may exist for seconds. Deploying a metrics agent per container is expensive and impractical. Modifying application code to emit telemetry takes time and requires coordination across teams.

eBPF solves this at the infrastructure layer. A single eBPF program running on a node can observe every process, every network connection, every syscall β€” without touching the applications themselves. This is why tools like Cilium (networking), Falco (security), Pixie (observability), and Parca (continuous profiling) have all converged on eBPF as their core mechanism.

Kernel support has also matured significantly. Modern distributions ship kernels well above 5.15, and features like BTF (BPF Type Format) and CO-RE (Compile Once, Run Everywhere) mean you can write portable eBPF programs that work across kernel versions without recompilation. That was a serious pain point two or three years ago. It isn’t anymore.
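Before reaching for CO-RE tooling, you can check whether a host actually exposes BTF. A minimal pre-flight sketch (the paths are the standard locations, but verify on your distribution):

```python
import os
import platform

def ebpf_preflight():
    """Report kernel version and whether BTF type info is exposed."""
    return {
        "kernel": platform.release(),
        # CO-RE tools read kernel type info from this pseudo-file when present.
        "btf": os.path.exists("/sys/kernel/btf/vmlinux"),
    }

info = ebpf_preflight()
print(f"kernel: {info['kernel']}")
print("BTF: available" if info["btf"] else "BTF: missing (BCC will need matching linux-headers)")
```

If BTF is missing, BCC falls back to compiling against installed kernel headers, which is why the install commands above pull in linux-headers / kernel-devel.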

The Primary eBPF Toolkits

BCC (BPF Compiler Collection)

BCC is the original high-level eBPF toolkit. It ships with a large library of ready-to-run tools and lets you write eBPF programs using Python or Lua as the front-end language, with C for the kernel-side code. If you’re on Ubuntu or Debian, installation is straightforward:

apt install bpfcc-tools linux-headers-$(uname -r)

On RHEL/Rocky Linux:

dnf install bcc-tools kernel-devel

BCC ships dozens of production-ready tools. Some immediately useful ones:

  • execsnoop β€” traces all new process executions system-wide
  • opensnoop β€” traces file opens, showing which process opens which file
  • biolatency β€” shows block I/O latency as a histogram
  • tcpconnect β€” traces outbound TCP connections
  • runqlat β€” measures scheduler run queue latency

Run them directly:

/usr/share/bcc/tools/execsnoop
/usr/share/bcc/tools/opensnoop -p 1234
/usr/share/bcc/tools/biolatency -D

bpftrace

bpftrace is the more modern, lightweight alternative. Think of it like awk or sed, but for kernel events. It has a clean scripting language that lets you write one-liners or short scripts against kernel tracepoints, kprobes, and uprobes.

Install on Ubuntu:

apt install bpftrace

Install on RHEL/Rocky:

dnf install bpftrace

The bpftrace one-liner syntax is compact and readable. Here are examples you can run on any modern Linux system right now:

Trace all files opened system-wide, showing PID, process name, and filename:

bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%d %s %s\n", pid, comm, str(args->filename)); }'

Show a histogram of read() syscall latency:

bpftrace -e 'tracepoint:syscalls:sys_enter_read { @start[tid] = nsecs; }
tracepoint:syscalls:sys_exit_read /@start[tid]/ {
  @latency = hist(nsecs - @start[tid]);
  delete(@start[tid]);
}'

Count syscalls by process name every 5 seconds:

bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); } interval:s:5 { print(@); clear(@); }'

Trace TCP connection attempts, showing the process and destination address:

bpftrace -e 'kprobe:tcp_connect { printf("%s -> %s\n", comm, ntop(((struct sock *)arg0)->__sk_common.skc_daddr)); }'

Here ntop() infers IPv4 from the 4-byte address, so no AF_INET constant needs to be defined in the one-liner.

These one-liners give you insight that would have previously required instrumenting the kernel or adding application-level logging.
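The printed map output from one-liners like the syscall counter is also easy to post-process. A sketch of parsing `@[comm]: count` lines into a sorted top-N list (the input format matches what bpftrace's print() emits, but check your version's output):

```python
import re

# Matches printed map lines like: @[sshd]: 42
MAP_LINE = re.compile(r"@\[(?P<key>[^\]]+)\]:\s*(?P<count>\d+)")

def top_n(bpftrace_output, n=5):
    """Return the n busiest keys from printed bpftrace count() map output."""
    counts = {}
    for m in MAP_LINE.finditer(bpftrace_output):
        counts[m.group("key")] = counts.get(m.group("key"), 0) + int(m.group("count"))
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]

sample = "@[sshd]: 42\n@[bash]: 7\n@[nginx]: 105\n"
print(top_n(sample, 2))  # busiest two processes first
```

Pipe `bpftrace ... | python3 script.py` and you have a crude top(1) for any kernel event in a dozen lines.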

Pixie

Pixie is an open-source observability platform built entirely on eBPF. It runs as a DaemonSet in Kubernetes and automatically captures full-body HTTP/gRPC requests, database queries, and latency data β€” no SDK, no instrumentation, no code changes.

Deploy it with:

px deploy

Once deployed, you use PxL (Pixie Language) scripts to query your cluster’s telemetry. For example, to see slow HTTP requests across all services:

px run px/http_data -- -start_time '-5m'

Pixie is particularly useful in microservices environments where you want distributed tracing without modifying every service to emit traces manually.

Cilium

Cilium uses eBPF for Kubernetes networking and security. It replaces iptables-based kube-proxy with an eBPF-native data plane, provides network policy enforcement at the kernel level, and ships Hubble for network flow visibility. We cover Cilium in depth in a separate article, but from an observability standpoint, Hubble gives you a real-time network flow graph of your entire cluster β€” powered entirely by eBPF probes on every node.

Real Use Cases

Diagnosing Slow System Calls

One of the most practical eBPF use cases is finding which syscalls are taking longer than expected. Application latency often traces back to a slow read(), write(), or futex() call, but application-level profilers don’t always surface this cleanly.

With bpftrace, you can find the slowest syscalls for a specific process:

bpftrace -e '
tracepoint:syscalls:sys_enter_read /pid == 12345/ { @ts[tid] = nsecs; }
tracepoint:syscalls:sys_exit_read /@ts[tid]/ {
  $delta = nsecs - @ts[tid];
  if ($delta > 1000000) {
    printf("slow read: %d us\n", $delta / 1000);
  }
  delete(@ts[tid]);
}'

This prints a line every time process 12345 does a read() that takes more than 1ms. You can swap in any syscall. The output appears immediately, in real time, with no agents and no application restart.

Network Debugging Without tcpdump Overhead

tcpdump is excellent but copies packet data to user space, which adds overhead. With eBPF, you can count, sample, or filter at the kernel level and only surface aggregated data or specific events.

Count inbound connections by destination port:

bpftrace -e 'kretprobe:inet_csk_accept { @[((struct sock *)retval)->__sk_common.skc_num] = count(); }'

Note the kretprobe: the accepted socket is the function's return value, and retval is only available on function exit.

Detect SYN flood patterns by counting half-open connections:

/usr/share/bcc/tools/tcpsynbl

Track per-process TCP throughput:

/usr/share/bcc/tools/tcptop

CPU Performance Profiling

Profile CPU usage stack traces across all processes using bpftrace’s sampling mode:

bpftrace -e 'profile:hz:99 { @[kstack] = count(); } interval:s:10 { print(@); exit(); }'

This samples kernel stacks at 99Hz for 10 seconds and prints a frequency map. Pipe the output to a flamegraph tool to produce a visual profile without any instrumentation on the running workload.
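Flamegraph tooling expects "folded" stacks: one semicolon-joined stack per line, root frame first, count last. A sketch that converts bpftrace's printed @[kstack] map output into that format (the input layout here matches what bpftrace prints, leaf frame first, but check your version):

```python
def fold_stacks(bpftrace_output):
    """Convert bpftrace '@[ ...frames... ]: N' stack maps to folded lines."""
    folded, frames = [], None
    for line in bpftrace_output.splitlines():
        line = line.strip()
        if line.startswith("@["):
            frames = []
        elif line.startswith("]:") and frames is not None:
            count = int(line[2:].strip())
            # bpftrace prints the leaf frame first; flamegraphs want root first.
            folded.append(";".join(reversed(frames)) + f" {count}")
            frames = None
        elif frames is not None and line:
            frames.append(line.split("+")[0])  # drop the +offset suffix
    return folded

sample = "@[\n    vfs_read+0x10\n    ksys_read+0x20\n]: 7\n"
print(fold_stacks(sample))
```

Feed the folded lines to flamegraph.pl (or any SVG flamegraph renderer) to get the visual profile.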

For user-space profiling, include user stacks:

bpftrace -e 'profile:hz:49 /pid == 5678/ { @[ustack] = count(); }'

Security Monitoring

eBPF is increasingly used for runtime security. Falco, a CNCF project, uses eBPF to detect suspicious behavior β€” unexpected outbound connections, privilege escalation attempts, sensitive file reads. Because the detection happens in the kernel, it’s harder to evade than user-space monitoring.

A simple bpftrace example to detect any process trying to open /etc/shadow:

bpftrace -e 'tracepoint:syscalls:sys_enter_openat /str(args->filename) == "/etc/shadow"/ {
  printf("ALERT: %s (pid %d) opened /etc/shadow\n", comm, pid);
}'

eBPF vs. Traditional Methods

Traditional observability relies on one of three approaches: agents that run alongside your application, instrumentation you add to application code, or kernel modules that expose additional metrics. Each has drawbacks.

Agents consume memory and CPU, need to be deployed and updated, and can fail independently of your application. Application instrumentation requires code changes and redeployment β€” and if you don’t own the code, you’re stuck. Kernel modules are powerful but dangerous; a bug in a module can crash the entire host, they tie you to specific kernel versions, and they require signing on modern systems.

eBPF avoids all three problems. The verifier ensures safety. Programs run inside the kernel without a separate process. You attach and detach probes without restarting anything. The overhead is typically measured in single-digit percentage points of CPU for heavy profiling workloads, and well under 1% for targeted tracing.

The one limitation worth knowing: eBPF requires a reasonably modern kernel. Most production systems running Ubuntu 22.04+, RHEL 9+, Debian 12+, or any Fedora release from the last three years are well within the range where the full eBPF feature set is available.

Getting Started Checklist

  1. Check your kernel version: uname -r β€” you want 5.15 or higher for the best experience
  2. Install bpftrace and bcc-tools on your system
  3. Run bpftrace -l 'tracepoint:syscalls:*' to list available tracepoints
  4. Try the execsnoop one-liner to see every new process on the system
  5. Explore the BCC tools at /usr/share/bcc/tools/ β€” each one is documented in its source
  6. For Kubernetes environments, evaluate Pixie or Cilium+Hubble for cluster-wide observability

Key Takeaways

  • eBPF lets you observe the Linux kernel and any application running on it without agents, code changes, or restarts
  • bpftrace provides an awk-like scripting interface for writing kernel probes in minutes
  • BCC ships production-ready tools for disk I/O, network, CPU, and syscall analysis
  • Pixie brings eBPF-powered observability to Kubernetes with zero instrumentation
  • Cilium uses eBPF for both networking and security, replacing iptables at the kernel level
  • eBPF programs are verified for safety before execution — a program that could crash the kernel is rejected at load time
  • The performance overhead is minimal, making eBPF suitable for continuous production use, not just debugging sessions
  • In 2026, any serious observability or security tooling decision in the Linux/Kubernetes space should start with eBPF

About Ramesh Sundararamaiah

Red Hat Certified Architect

Expert in Linux system administration, DevOps automation, and cloud infrastructure. Specializing in Red Hat Enterprise Linux, CentOS, Ubuntu, Docker, Ansible, and enterprise IT solutions.
