ADR-0004: Datacenter network topology

Status

proposed

Date

2026-03-09

Group

networking

Depends-on

ADR-0002, ADR-0003

Context

At 50,000 physical servers across 3+ sites, the physical network architecture must be explicitly chosen. The topology determines scalability, fault domains, and whether network provisioning can be fully automated alongside compute provisioning.

Options

Option 1: Spine-leaf with BGP/EVPN

  • Pros: industry standard for large-scale DCs; horizontal scaling by adding leaves; well-understood fault domains; automated via open tooling (SONiC, FRR)

  • Cons: opinionated, requiring compatible switch hardware; less flexible for legacy network designs

Option 2: Traditional three-tier (core/distribution/access)

  • Pros: familiar to most network teams; works with any vendor; large installed base in existing government DCs

  • Cons: spanning tree limits scale; poor east-west performance; difficult to automate

Option 3: SDN overlay (e.g. NSX, Contrail)

  • Pros: abstraction from physical topology; multi-tenant network virtualization built in

  • Cons: proprietary vendor dependency; additional software layer; licensing costs at scale

Decision

Spine-leaf with BGP/EVPN. At our target scale (ADR-0002), three-tier topology hits fundamental scaling limits due to spanning tree, and its east-west performance is insufficient for distributed systems like Ceph and etcd. SDN overlays add a proprietary dependency that conflicts with sovereignty requirements. Spine-leaf with BGP/EVPN is the only topology proven at 50,000+ servers with full automation, and aligns with the automation-first requirement from ADR-0002.
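To make the scale argument concrete, a rough fabric-sizing sketch for one site. All port counts, link speeds, and switch models are illustrative assumptions, not figures from this ADR:

```python
import math

# Illustrative sizing for a single-site spine-leaf fabric.
# Switch port counts and speeds below are assumptions for the sketch.
SERVERS = 50_000          # target from the Context section
SITES = 3                 # servers spread across 3+ sites
LEAF_DOWNLINKS = 48       # assumed server-facing ports per leaf (e.g. 48x25G)
LEAF_UPLINKS = 6          # assumed spine-facing ports per leaf (e.g. 6x100G)
SPINE_PORTS = 64          # assumed ports per spine switch

servers_per_site = math.ceil(SERVERS / SITES)
leaves = math.ceil(servers_per_site / LEAF_DOWNLINKS)

# With more leaves than a single spine has ports, the fabric splits into
# parallel spine planes: each leaf uplink feeds a different plane, and
# each plane needs enough spines to terminate one link per leaf.
spines_per_plane = math.ceil(leaves / SPINE_PORTS)
spines = LEAF_UPLINKS * spines_per_plane

# Leaf oversubscription: server-facing bandwidth vs uplink bandwidth.
oversub = (LEAF_DOWNLINKS * 25) / (LEAF_UPLINKS * 100)

print(f"{leaves} leaves, {spines} spines per site, "
      f"{oversub:.1f}:1 oversubscription")
# → 348 leaves, 36 spines per site, 2.0:1 oversubscription
```

The key property is in the arithmetic: capacity grows by adding leaves (and planes), with no spanning-tree or L2 domain size in the way.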

Consequences

  • Switch hardware must support BGP/EVPN (e.g. Edgecore with SONiC)

  • ODCs with legacy network designs must migrate to spine-leaf

  • Network provisioning can be integrated with compute provisioning
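As an illustration of the automation surface this decision opens up, a minimal leaf-switch fragment in FRR configuration syntax. All ASNs, router IDs, and interface names are placeholders, not values from this ADR:

```
! Illustrative FRR config for one leaf (placeholder values throughout)
router bgp 65101
 bgp router-id 10.0.0.11
 ! eBGP unnumbered sessions to the spines
 neighbor SPINES peer-group
 neighbor SPINES remote-as external
 neighbor swp49 interface peer-group SPINES
 neighbor swp50 interface peer-group SPINES
 !
 address-family l2vpn evpn
  neighbor SPINES activate
  advertise-all-vni
 exit-address-family
```

Because every leaf differs only in its ASN, router ID, and VNIs, fragments like this are straightforward to template and roll out from the same pipeline that provisions compute.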