ADR-0013: Container network interface

Status

proposed

Date

2026-03-09

Group

networking

Depends-on

ADR-0004, ADR-0006

Context

With Kubernetes on bare metal (ADR-0007) and a spine-leaf BGP/EVPN underlay (ADR-0004), we need a CNI plugin that handles pod networking, enforces network policy, and integrates with the physical network. The CNI choice affects performance, observability, and security capabilities across all tenant clusters. metal-stack currently ships with Calico, but this does not preclude choosing a different CNI.

Options

Option 1: Cilium

  • Pros: eBPF-based data plane with high performance and no iptables overhead; built-in network policy, observability (Hubble), encryption (WireGuard), and load balancing; large CNCF community; BGP support for bare-metal integration; service mesh capabilities without sidecars

  • Cons: Isovalent (main contributor) was acquired by Cisco, so the long-term sovereignty implications are unclear; eBPF requires recent kernel versions; metal-stack default is Calico, so integration requires additional work; steep learning curve for eBPF debugging

Option 2: Calico

  • Pros: metal-stack default with proven integration; mature and battle-tested; supports BGP natively; well-understood operational model; eBPF data plane mode available as optional upgrade path

  • Cons: iptables-based data plane has performance limitations at scale; less built-in observability; Tigera (main contributor) is also a commercial entity; eBPF mode is less mature than Cilium’s native eBPF implementation

Option 3: Antrea

  • Pros: OVS-based with eBPF support; CNCF project; strong multi-cluster networking; good VMware/Broadcom ecosystem integration

  • Cons: smaller community than Cilium and Calico; less bare-metal focus (origins in VMware); Broadcom’s acquisition of VMware adds vendor uncertainty

Option 4: Flannel

  • Pros: simplest CNI to operate; well-understood; minimal resource overhead

  • Cons: no network policy enforcement (requires a separate policy engine); limited observability; no BGP support; too simple for a multi-tenant platform

Decision

Cilium. The eBPF-based architecture provides superior performance, observability, and security features at scale. Calico is the metal-stack default and a credible alternative, but its iptables data plane is a scaling concern and its eBPF mode is less mature than Cilium’s. Antrea has a smaller community and less bare-metal focus. Flannel lacks network policy enforcement entirely. The sovereignty concern around Cisco’s acquisition of Isovalent is noted but mitigated by Cilium’s Apache 2.0 license and large open-source community. This decision should be revisited if Cilium’s open-source governance changes materially.

Consequences

  • metal-stack’s default Calico must be replaced with Cilium in Gardener shoot cluster provisioning

  • Bare-metal nodes must run a kernel version that meets Cilium’s eBPF requirements

  • Hubble provides built-in network observability per tenant cluster

  • Network policy enforcement uses Cilium’s policy engine

  • WireGuard encryption is available for cross-node pod traffic

  • The bare-metal load balancing ADR should evaluate Cilium’s built-in LB capabilities
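
The kernel and WireGuard consequences above could be checked per node before rollout. The following is a minimal sketch, not part of the ADR's tooling: the function names are hypothetical, and `ASSUMED_CILIUM_MIN_KERNEL` is a placeholder that should be replaced with the value from the system requirements of the Cilium release actually deployed. The WireGuard threshold reflects the fact that WireGuard has been in the mainline Linux kernel since 5.6; distributions that backport it to older kernels would need a different check.

```python
import re

# ASSUMPTION: placeholder minimum; confirm against the system requirements
# of the Cilium release in use before relying on it.
ASSUMED_CILIUM_MIN_KERNEL = (5, 10)

# WireGuard was merged into the mainline Linux kernel in 5.6.
WIREGUARD_MAINLINE_KERNEL = (5, 6)


def parse_kernel_release(release: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a `uname -r` style string.

    Distribution suffixes such as "-91-generic" are ignored, and a missing
    patch component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def meets_minimum(release: str, minimum: tuple[int, ...]) -> bool:
    """Return True if the release is at least `minimum`.

    Only as many components as `minimum` provides are compared, so a
    minimum of (5, 6) accepts any 5.6.x or newer kernel.
    """
    version = parse_kernel_release(release)
    return version[: len(minimum)] >= minimum


def check_node(release: str) -> dict[str, bool]:
    """Summarise whether a node's kernel satisfies both thresholds."""
    return {
        "cilium": meets_minimum(release, ASSUMED_CILIUM_MIN_KERNEL),
        "wireguard": meets_minimum(release, WIREGUARD_MAINLINE_KERNEL),
    }
```

On a node, the release string could be taken from `platform.release()` in the standard library or from `uname -r`, and the result of `check_node` used to gate provisioning. A version comparison like this only covers the kernel release number; it does not confirm that the required kernel configuration options are enabled.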