Power, water, and the limits on inference scaling in 2026
The constraint on inference is no longer GPU supply. It is power at the building level and water at the regional level. Both are showing up in our pricing.
For most of 2024 the bottleneck on AI infrastructure was GPU supply. By late 2025 it was rack-level networking. In 2026 it is the building itself: power delivery, water for cooling, and the regulatory environment that gates both.
None of this limits what inference customers can run today. It does set the trajectory for the next 18-24 months, and it is the reason capacity in some pools is meaningfully more expensive than capacity in others.
Why power is hard
An H100 SXM module pulls up to 700W. A rack of them pulls 35-50kW. A row of those racks pulls a megawatt or more, and a hall full of rows pulls tens of megawatts. The local utility connection that supplies that power was typically sized for a building that pulled tens of kilowatts, not tens of megawatts. Upgrading the connection requires a substation, which requires permitting, which means waiting in a queue.
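As a rough sanity check on those figures, here is a minimal sketch of the arithmetic in Python. Only the 700W module figure comes from the text above. The servers-per-rack count, the overhead multiplier for CPUs, networking, fans, and power conversion, and the racks-per-row count are illustrative assumptions, not measurements from any specific facility.

```python
# Back-of-the-envelope rack and row power estimate.
# Only GPU_WATTS comes from the text; every other constant is an
# illustrative assumption.

GPU_WATTS = 700          # H100 SXM module draw, from the text
GPUS_PER_SERVER = 8      # assumption: 8-GPU servers
SERVERS_PER_RACK = 4     # assumption
OVERHEAD_FACTOR = 1.8    # assumption: CPUs, NICs, fans, power conversion
RACKS_PER_ROW = 30       # assumption


def rack_kw(servers_per_rack: int = SERVERS_PER_RACK,
            overhead_factor: float = OVERHEAD_FACTOR) -> float:
    """Estimated draw of one rack, in kilowatts."""
    gpu_watts = GPU_WATTS * GPUS_PER_SERVER * servers_per_rack
    return gpu_watts * overhead_factor / 1000


def row_mw(racks_per_row: int = RACKS_PER_ROW) -> float:
    """Estimated draw of one row of racks, in megawatts."""
    return rack_kw() * racks_per_row / 1000


if __name__ == "__main__":
    print(f"rack: {rack_kw():.1f} kW")
    print(f"row:  {row_mw():.2f} MW")
```

Under these assumptions a rack lands at roughly 40kW, inside the 35-50kW range quoted above, and a 30-rack row comes out a little over 1MW. Changing the assumed constants moves the result, which is the point: the totals scale linearly with density, and the utility connection does not.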
Power queue times in major US metros now average 3-5 years. Operators with energized capacity have a moat that did not exist three years ago.
Why water is harder
Liquid cooling is the only thermally viable option at current rack densities. Liquid cooling needs a water source. In some metros — Phoenix, Las Vegas, parts of Texas — water rights are now the gating factor, not power. We have seen multi-hundred-million-dollar projects pause because the water permit did not clear.
The regulatory question is whether AI workloads are 'productive use' of municipal water. The answer varies by jurisdiction. We expect more litigation here, not less.
How it shows up in our pricing
Three of our pools (North America, Europe, and APAC) have meaningfully different price floors. North America is the cheapest because of historical buildout and abundant power in specific regions (Virginia, Texas, the Pacific Northwest). Europe is more expensive because power is more expensive almost everywhere. APAC varies widely because it spans Tokyo, Singapore, and Sydney, three very different infrastructure stories.
Customers who care about price more than latency should default to North America. Customers who care about regulatory compliance more than price should accept the European premium. Customers who care about latency to APAC end users have to pay APAC pricing.
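The guidance above reduces to a small lookup. The sketch below is illustrative only: the Priority enum, the pool identifiers, and the choose_pool helper are hypothetical names invented for this example, not part of any real API or configuration.

```python
from enum import Enum


class Priority(Enum):
    """What the customer cares about most (hypothetical categories)."""
    PRICE = "price"
    COMPLIANCE = "compliance"
    APAC_LATENCY = "apac_latency"


# Hypothetical pool identifiers; the mapping mirrors the guidance in the text.
DEFAULT_POOL = {
    Priority.PRICE: "north-america",    # cheapest price floor
    Priority.COMPLIANCE: "europe",      # accept the European premium
    Priority.APAC_LATENCY: "apac",      # pay APAC pricing for proximity
}


def choose_pool(priority: Priority) -> str:
    """Return the default pool for a customer's top priority."""
    return DEFAULT_POOL[priority]
```

A customer with more than one hard requirement would need a richer rule than a single lookup. This sketch only covers the single-priority case described above.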
What we are watching
Two things. First, modular nuclear — small reactors are starting to show up in serious operator buildout plans. Whether they materialize on schedule is one of the bigger open questions for late-2027 capacity. Second, immersion cooling at scale — if a major operator proves immersion cooling works in production at rack density, water-constrained metros open back up.
Neither is a 2026 story. Both are reasons to be optimistic about price floors a few years out.