How Upsun built stateless mesh networking for high-density containers
When you deploy across the Upsun grid with our distributed infrastructure - capable of running tens of thousands of containers simultaneously - networking becomes one of your biggest challenges. The grid consists of three key components: orchestration machines that manage deployments, compute nodes where your application containers run, and a regional storage cluster that houses all project data including files and databases.
At this scale, traditional networking approaches start to break down. Containers are created and destroyed constantly, often moving between hosts for efficiency or availability. Centralized systems quickly become bottlenecks or single points of failure, increasing the risk of downtime.
To solve this, we built a stateless mesh networking system that embeds routing information directly into IP addresses, eliminating the need for databases while supporting up to 4,096 containers per VM across the entire grid.
The high-density challenge
Upsun’s infrastructure operates at high density by design. Your preview environments, production deployments, and service containers all share the same network space for maximum efficiency. This creates a complex networking scenario:
- Containers constantly appear and disappear as you push code
- Preview environments are created and destroyed dynamically
- Containers move between VMs for capacity planning and load balancing
- Maintenance rollovers relocate workloads during infrastructure upgrades
- Incident response triggers automatic failover to healthy nodes
- Service discovery must work instantly across the entire grid
Traditional networking approaches break down in this environment. A central router holding all routes would create an obvious bottleneck and single point of failure. You need direct VM-to-VM routing: a mesh network.
Why we built our own solution
Most mesh networking solutions rely on external databases or control planes to track container locations. This approach introduces latency, complexity, and potential failure points. We asked ourselves: what if we could make routing completely stateless?
The answer came from an unexpected source: IPv6’s concept of IPv4-mapped addresses, where IPv4 addresses get embedded within IPv6 space. We realized we could embed VM addresses directly into container IP addresses.
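As a quick illustration of that mechanism (standard library behavior, nothing Upsun-specific), Python’s ipaddress module can unwrap an IPv4-mapped address:

```python
import ipaddress

# An IPv4-mapped IPv6 address carries a full IPv4 address
# in its low 32 bits (the ::ffff:0:0/96 range).
addr = ipaddress.ip_address("::ffff:192.168.16.32")
print(addr.ipv4_mapped)  # -> 192.168.16.32
```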
The magic of address mapping
Here’s how our stateless routing works. Consider these network ranges:
- Physical network: 192.168.0.0/16
- Overlay network: 10.0.0.0/8
When a container gets the overlay IP 10.16.32.91, our system can instantly determine:
- The VM’s physical IP is 192.168.16.32 (extracted from the overlay address)
- This is the 90th container on that VM (the first IP of the subnet is for the VM itself)
No database lookups. No external queries. The routing information lives in the address itself.
```
Overlay IP:             10 . 16 . 32 . 91
                             |    |    |
                             |    |    +-- container # + 1
                             +----+------- VM IP
                             |    |
Physical VM IP: 192 . 168 . 16 . 32
```
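In code, the derivation is only a few lines. Here’s a minimal sketch assuming the example ranges above; the function is illustrative, not our production implementation:

```python
def resolve_overlay(overlay_ip: str) -> tuple[str, int]:
    """Derive the hosting VM's physical IP and the container index
    from an overlay address alone -- no lookup table needed."""
    _, o2, o3, o4 = (int(octet) for octet in overlay_ip.split("."))
    vm_ip = f"192.168.{o2}.{o3}"  # middle octets map onto the physical /16
    container = o4 - 1            # first address of each /24 is the VM itself
    return vm_ip, container

print(resolve_overlay("10.16.32.91"))  # ('192.168.16.32', 90)
```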
The technical implementation
When a packet needs routing between VMs, our custom ARP daemon:
- Extracts the target VM’s IP from the destination overlay address
- Establishes a GRE tunnel to that VM if one doesn’t exist
- Forwards the encapsulated packet directly
This happens using rtnetlink to define network neighbors, creating direct point-to-point connections between VMs without routing through central infrastructure.
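A rough sketch of those steps, using the pyroute2 library (which speaks rtnetlink from Python), might look like this; the interface naming and addresses are illustrative rather than our production daemon:

```python
from pyroute2 import IPRoute

def ensure_tunnel(local_ip: str, remote_vm_ip: str, overlay_subnet: str):
    """Sketch: set up a GRE tunnel to a peer VM and route its
    overlay subnet through it, all over rtnetlink."""
    ipr = IPRoute()
    # The physical network is a /16, so the last two octets identify the VM
    # (and keep the interface name under the kernel's 15-character limit).
    ifname = "gre-" + "-".join(remote_vm_ip.split(".")[2:])  # e.g. "gre-16-32"
    if not ipr.link_lookup(ifname=ifname):  # skip if the tunnel already exists
        ipr.link("add", ifname=ifname, kind="gre",
                 gre_local=local_ip, gre_remote=remote_vm_ip, gre_ttl=64)
    idx = ipr.link_lookup(ifname=ifname)[0]
    ipr.link("set", index=idx, state="up")
    # Route the peer's overlay /24 through the point-to-point tunnel.
    ipr.route("replace", dst=overlay_subnet, oif=idx)
    ipr.close()

# e.g. ensure_tunnel("192.168.1.1", "192.168.16.32", "10.16.32.0/24")
```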
Scaling beyond standard private networks
Initially, we considered using the standard 10.0.0.0/8 private network for our overlay. But once 16 of its 24 free bits are spent mapping the physical /16, only 8 bits remain per VM: 256 addresses, insufficient for high-density deployments.

We found a creative solution in 240.0.0.0/4, the “reserved for future use” IPv4 space that became available when IPv6 development made it redundant. This gives us:
- 4 bits for subnet identification
- 16 bits for physical IP mapping
- 12 bits for container addresses (4,096 per VM)
This addressing scheme provides massive scalability while maintaining the stateless routing benefits.
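To make the layout concrete, here is one plausible packing in code, treating the fixed 240/4 prefix as the subnet-identification bits; the exact field order is an assumption for illustration:

```python
def pack_overlay(vm_low16: int, container: int) -> str:
    """Build a 240.0.0.0/4 overlay address from the low 16 bits of a
    VM's physical IP and a 12-bit container index (illustrative layout)."""
    assert 0 <= container < 4096, "12 bits allow 4,096 container slots per VM"
    word = (0xF << 28) | (vm_low16 << 12) | container  # 1111 | VM bits | container
    return ".".join(str((word >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def unpack_overlay(overlay_ip: str) -> tuple[int, int]:
    """Reverse the packing: recover the VM bits and the container index."""
    word = 0
    for octet in overlay_ip.split("."):
        word = (word << 8) | int(octet)
    return (word >> 12) & 0xFFFF, word & 0xFFF

# A VM at 192.168.16.32 contributes its low 16 bits, (16 << 8) | 32:
print(pack_overlay((16 << 8) | 32, 91))  # -> 241.2.0.91
print(unpack_overlay("241.2.0.91"))      # -> (4128, 91)
```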
Real-world benefits
This architecture delivers concrete advantages for your applications:
- Instant service discovery: new containers become routable immediately without propagation delays
- Zero network bottlenecks: direct VM-to-VM communication eliminates central routing points
- Predictable performance: network latency remains consistent regardless of infrastructure scale
The result: invisible infrastructure
Most container orchestrators require complex service meshes with sidecar proxies, control planes, and configuration management. Upsun’s approach eliminates this overhead entirely.
When you create a preview environment, the networking configuration is instant. Your containers can communicate immediately without waiting for service discovery updates or configuration propagation.
This networking setup is part of why Upsun can spin up complete production replicas in minutes. The network configuration happens automatically, transparently, and instantly.
You get the benefits - fast preview environments, reliable service communication, and predictable performance - without managing any of the underlying complexity.
Your applications connect as if they’re running on a single machine, while actually being distributed across a high-availability mesh that spans multiple servers.
Conclusion: scaling without bottlenecks
Our stateless mesh networking system lets us scale effortlessly: no databases, no bottlenecks, and no drama. It’s a big part of how the Upsun grid remains fast, reliable, and always ready to handle tens of thousands of containers without breaking a sweat.
If you’re curious about how it all works under the hood, join us on our Discord for a good infrastructure chat. You can also schedule a demo to see our platform in action, or, better yet, create a free Upsun account and experience it for yourself.