Neoclouds: The Next Layer of Compute Gravity
The term “neocloud” gets used loosely these days, often as a catch-all for infrastructure that does not fit traditional classifications. Conventional cloud infrastructure was designed for general-purpose workloads: elasticity, multi-tenancy, and a broad mix of applications that can tolerate abstraction. Neoclouds depart from that model. They are purpose-built for high-density compute, driven primarily by artificial intelligence (AI) and graphics processing unit (GPU) workloads.
At a practical level, a neocloud is not just “another cloud provider.” It is an infrastructure stack designed around GPU clusters, high-throughput storage, and extremely fast east–west networking. The economics are different, the traffic patterns are different, and the failure domains matter more. You are not spinning up a few VMs. You are coordinating thousands of GPUs that need to behave like a single system.
This is where operations diverge from conventional network engineering practice. Traditional cloud traffic was predominantly north–south: user requests, API responses, and caching layers that optimized delivery. Neocloud environments are dominated by east–west flows. Training jobs move large datasets across clusters, synchronize state continuously, and generate sustained high-throughput traffic that looks nothing like typical web traffic.
From a network design standpoint, that changes everything. Latency still matters, but consistency matters more. Packet loss that would go unnoticed in a web application can undermine the efficiency of distributed training. Microbursts are not edge cases; they are the norm. Oversubscription ratios that worked fine in enterprise or even hyperscale environments start to fall apart under AI load.
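To make the oversubscription point concrete, here is a minimal back-of-envelope sketch. The port counts and speeds are illustrative assumptions, not vendor specs, but they show why a ratio that is acceptable for an enterprise leaf falls apart when every GPU can drive its NIC at line rate simultaneously.

```python
# Back-of-envelope leaf-switch oversubscription check.
# All port counts and speeds below are illustrative assumptions.

def oversubscription_ratio(server_ports, server_gbps, uplinks, uplink_gbps):
    """Downlink capacity divided by uplink capacity for one leaf switch."""
    downlink = server_ports * server_gbps
    uplink = uplinks * uplink_gbps
    return downlink / uplink

# An enterprise-style leaf: 48x25G down, 6x100G up -> 2:1 oversubscribed.
enterprise = oversubscription_ratio(48, 25, 6, 100)

# An AI-fabric leaf: 32x400G down, 16x800G up -> 1:1 (non-blocking).
ai_fabric = oversubscription_ratio(32, 400, 16, 800)

print(f"enterprise leaf: {enterprise:.1f}:1")
print(f"AI fabric leaf:  {ai_fabric:.1f}:1")
```

A 2:1 ratio works when flows are bursty and uncorrelated; under synchronized AI traffic, anything above 1:1 becomes a bottleneck that every GPU in the job waits on.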
Where neoclouds get built also sets them apart. These facilities go wherever power availability, scalable cooling, and high fiber density can support dense interconnectivity. The priority is not proximity to end users, but proximity to other compute, storage, and exchange points that can move data fast enough to keep GPUs utilized.
There are a few traits that show up consistently across neocloud designs:
- Dense GPU clusters equipped with high-speed interconnects such as InfiniBand or 400G/800G Ethernet.
- Flat or near-flat network fabrics designed to minimize hop count and jitter.
- Predominance of east–west traffic with sustained throughput, as opposed to bursty, user-driven flows.
- Tight coupling between compute, storage, and network design decisions.
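The sustained east–west traffic in that list is easy to underestimate. A rough sketch, using a ring all-reduce model of gradient synchronization (each GPU sends roughly 2·(N−1)/N times the gradient buffer per sync), with an entirely hypothetical job size and step time:

```python
# Rough estimate of sustained east-west traffic generated by gradient
# synchronization in data-parallel training. Job size and step time
# are assumptions for illustration only.

def allreduce_bytes_per_gpu(param_bytes, num_gpus):
    """Ring all-reduce: each GPU sends ~2*(N-1)/N times the buffer size."""
    return 2 * (num_gpus - 1) / num_gpus * param_bytes

# Hypothetical job: 70B parameters in fp16 (2 bytes each), 1024 GPUs,
# one gradient synchronization overlapping each 5-second compute step.
grad_bytes = 70e9 * 2
per_gpu = allreduce_bytes_per_gpu(grad_bytes, 1024)
gbps_per_gpu = per_gpu * 8 / 5 / 1e9  # sustained Gb/s per GPU

print(f"{per_gpu / 1e9:.0f} GB sent per GPU per sync")
print(f"~{gbps_per_gpu:.0f} Gb/s sustained per GPU")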
What makes this interesting is how it reshapes the interconnection strategy. Traditional peering and transit models assumed traffic diversity and statistical multiplexing. Neoclouds do not behave that way. When a training job runs, it consumes capacity in a highly deterministic, large-scale manner. That means interconnection has to be engineered more like a backbone than a best-effort exchange.
This is where you start to see the concept of “AI gravity” take hold. Data, models, and compute clusters gravitate toward each other. Once a large dataset and a trained model exist in one location, it becomes inefficient to move them. Instead, more compute and more connectivity move toward that center. Neoclouds become anchors, and everything else starts orbiting them.
For network engineers, this changes how we think about scale. It is no longer about how many customers you can serve or how many prefixes you can carry. It is about how much sustained throughput you can deliver between specific points with predictable performance. Route policy still matters, but physical topology and capacity planning matter more.
Neoclouds do not replace traditional cloud infrastructure; rather, they are layered atop existing systems, targeting a specific class of workloads that challenge previous assumptions. Treating neoclouds as conventional tenants or peers can lead to significant operational challenges. However, designing for density, determinism, and data locality enables the model to function effectively.