Virtual Control Planes, Not Virtual Clusters
Most Kubernetes multi-tenancy solutions duplicate infrastructure per tenant — a separate API server, a separate datastore, a syncer to bridge the gap. kplane takes a different approach: one shared API server serves all control planes, with isolation enforced at the storage and policy layer.
What runs
- Management plane: a single Kubernetes API server backed by etcd or Spanner
- Virtual control planes: logical partitions within the shared API server, each with its own API endpoint, RBAC bindings, CRD schemas, and namespace tree
- Node infrastructure: standard Kubernetes nodes, allocated per policy — shared, dedicated, or GPU-pinned
What doesn't run
- No per-cluster API server process
- No per-cluster etcd or datastore instance
- No syncer or resource translation layer
- No per-cluster controller manager — informers and caches are shared
This is what makes high tenant density possible: a syncer-based virtual cluster runs 3-5 extra processes per tenant, while kplane runs zero.
Isolation model
Each virtual control plane is path-isolated within the shared API server's storage layer. Requests are routed to the correct partition using the cluster identity in the request path.
- Independent RBAC: policies, roles, and bindings are scoped per control plane
- Independent CRDs: each control plane can install its own custom resource definitions without affecting others
- Independent namespaces: the namespace hierarchy is per control plane
- Policy-driven node access: control which node pools each control plane can schedule onto
There is no proxy rewriting traffic between the client and the API server. kubectl sends standard Kubernetes API requests, and the API server processes them natively: no translation, no rewriting, no sync lag.
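Path isolation can be sketched as a key-prefix mapping: the cluster identity is parsed from the request path and becomes a storage prefix, so each plane's objects live under a distinct key range in the shared datastore. The `/planes/<name>/` path scheme and the `/registry/` prefix below are illustrative assumptions, not kplane's actual wire format.

```go
package main

import (
	"fmt"
	"strings"
)

// storagePrefix maps an incoming request path to the datastore key prefix
// for the virtual control plane named in the path. The path scheme and
// registry prefix are illustrative, not kplane's actual layout.
func storagePrefix(requestPath string) (prefix, rest string, err error) {
	const marker = "/planes/"
	if !strings.HasPrefix(requestPath, marker) {
		return "", "", fmt.Errorf("no control plane in path: %s", requestPath)
	}
	name, tail, found := strings.Cut(strings.TrimPrefix(requestPath, marker), "/")
	if !found || name == "" {
		return "", "", fmt.Errorf("malformed path: %s", requestPath)
	}
	// Each plane's objects live under a distinct key prefix, so tenants
	// are path-isolated inside the shared storage layer.
	return "/registry/" + name, "/" + tail, nil
}

func main() {
	prefix, rest, _ := storagePrefix("/planes/team-a/api/v1/namespaces/default/pods")
	fmt.Println(prefix, rest) // /registry/team-a /api/v1/namespaces/default/pods
}
```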
Vertical scaling
The shared API server scales vertically. Informer caches are shared across control planes — when a new control plane is added, the marginal cost is the metadata for that plane's resources (~3 MB), not a full duplicate of every controller's watch cache.
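A minimal sketch of the shared-cache idea, under stated assumptions: objects are stored once in a global map, and each control plane carries only a lightweight per-plane key index, so adding a plane costs index metadata rather than a duplicated cache. The `SharedCache` type and its methods are illustrative, not kplane's internals.

```go
package main

import "fmt"

// SharedCache stores every object exactly once; each virtual control
// plane keeps only a small index of the keys visible to it.
type SharedCache struct {
	objects map[string]string          // global store: key -> object
	index   map[string]map[string]bool // per-plane view: plane -> key set
}

func NewSharedCache() *SharedCache {
	return &SharedCache{
		objects: map[string]string{},
		index:   map[string]map[string]bool{},
	}
}

// Put stores the object once and records membership for one plane.
func (c *SharedCache) Put(plane, key, obj string) {
	c.objects[key] = obj
	if c.index[plane] == nil {
		c.index[plane] = map[string]bool{}
	}
	c.index[plane][key] = true
}

// List returns only the objects visible to the given plane; the marginal
// cost of a new plane is its index entries, not a second object store.
func (c *SharedCache) List(plane string) []string {
	var out []string
	for key := range c.index[plane] {
		out = append(out, c.objects[key])
	}
	return out
}

func main() {
	c := NewSharedCache()
	c.Put("team-a", "pods/web", "web-pod")
	c.Put("team-b", "pods/db", "db-pod")
	fmt.Println(len(c.List("team-a"))) // 1: each plane sees only its own keys
}
```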
Horizontal scaling
Multiple API server replicas split ownership of control planes using rendezvous hashing (HRW). Each replica serves a subset of control planes. When a replica joins or leaves, ownership rebalances automatically with in-flight request draining to prevent dropped connections.
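Rendezvous hashing is simple to sketch: score every (replica, plane) pair with a hash and pick the replica with the highest score. The key property is minimal disruption: when a replica leaves, only the planes it owned are reassigned. The FNV hash and replica names below are illustrative choices, not kplane's actual scheme.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// owner picks the replica responsible for a control plane using
// rendezvous (highest-random-weight) hashing: hash every
// (replica, plane) pair and take the replica with the highest score.
func owner(replicas []string, plane string) string {
	var best string
	var bestScore uint64
	for _, r := range replicas {
		h := fnv.New64a()
		h.Write([]byte(r))
		h.Write([]byte{0}) // separator so "ab"+"c" != "a"+"bc"
		h.Write([]byte(plane))
		if s := h.Sum64(); best == "" || s > bestScore {
			best, bestScore = r, s
		}
	}
	return best
}

func main() {
	replicas := []string{"apiserver-0", "apiserver-1", "apiserver-2"}
	fmt.Println(owner(replicas, "team-a"))
	// Removing a replica only reassigns the planes that replica owned;
	// every other plane keeps its current owner.
}
```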
Horizontal scaling details
- Peer discovery via Coordination Leases
- Fence leases for single-writer guarantees during ownership transitions
- Reverse proxy forwarding for requests that land on the wrong replica