AWS and Google Point to a New Business: Selling Physical Racks Outside the Cloud, and On-Prem Reconsidered for the AI Era

Watching the recent moves of hyperscalers and data center companies around AI infrastructure, I sense a new kind of business opportunity starting to emerge out of the cloud business, one a little different from what we have seen before.

The big advantage of cloud services is that you can use compute resources like CPUs, GPUs, and storage only when you need them, and only as much as you need.

Users can rent compute power without thinking about racks, cabling, or cooling.

That has been the big difference from on-prem, and for a long time it has been the core strength of the cloud.

But in the AI era, this assumption is starting to change. Compute demand is rising sharply, and at the same time physical constraints — power, cooling, land, chip supply — are getting tighter. In this environment, what is gaining value is not only the abstracted cloud service, but the physical infrastructure itself, optimized for AI.

Along these lines, a new business model may be opening up for hyperscalers: selling AI-optimized racks, including their own chips, to outside customers. I think it is more interesting to see this not as a simple return to on-prem, but as hyperscalers moving the design thinking cultivated in the cloud into the customer’s physical space.

A Serious Shortage of AI Compute

Looking at just the past few months, large AI players like Meta and Anthropic have been signing big contracts not only with existing clouds, but also with AI-infrastructure-focused companies like CoreWeave and Nebius.

In April 2026, Meta expanded its CoreWeave contract by another $21 billion and also signed a deal with Nebius worth up to $27 billion.

Anthropic also announced a multi-year agreement with CoreWeave, which shows that leading AI companies are in a hurry to secure compute capacity.

What matters here is the reality these deals reflect: enough physical capacity has become genuinely hard to secure.

Providers that can deliver not only GPUs, but also power, cooling, buildings, and networks, are becoming more valuable.

AWS Hints at Selling Chips and Racks to Third Parties

In his 2025 letter to shareholders, Amazon CEO Andy Jassy explained that the annualized revenue run rate of Amazon’s chip business (including Graviton, Trainium, and Nitro) has reached over $20 billion.

He also said that if this business had been deployed not just for AWS internal use but more broadly to third parties, it could have reached around $50 billion.

Furthermore, because demand is so strong, he mentioned the possibility of selling racks equipped with those chips to third parties in the future.

I think this is quite important, because Amazon itself is acknowledging that there is room to sell its work not only inside the cloud, but also as physical infrastructure to outside customers.

Google May Also Rent TPUs Externally and Sell Them Into Customer Data Centers

Google is moving in a similar direction. Reuters reported that Meta signed a multi-year, multi-billion-dollar contract to rent Google’s TPUs. The same report also said that Meta and Google are discussing the possibility that Meta will buy TPUs for its own data centers from 2027 onward.

In other words, Google too is signaling that TPUs may not stay confined to Google Cloud: they could be rented to outside customers and eventually placed in customers’ own physical data centers.

Looking this far, the competitive axis of hyperscalers seems to be shifting from simple cloud usage fees to how far they can extend their own architecture into the outside physical world.

Why Racks Instead of Just Chips?

Here one technical question comes up: “Why sell by the rack, instead of just selling chips?”

AI chips like Trainium and Inferentia cannot be handled the same way as plugging an Intel CPU into a general-purpose motherboard. There are three main reasons why selling by the rack makes sense.

1. Proprietary High-Speed Interconnect (EFA)

Trainium is designed on the assumption that it will work in clusters of thousands of chips, not as a single unit. It needs Amazon’s own high-speed communication technology (EFA: Elastic Fabric Adapter) and ultra-fast chip-to-chip interconnects, so a general-purpose chassis from another vendor cannot get the most out of it.

2. Vertical Integration of Power and Cooling

The latest AI chips are extremely dense, and their power consumption and heat output are significant. Without the liquid cooling system and dedicated power units that AWS has refined in its own data centers, it is physically hard to keep them running stably. You really need “the rack” itself.

3. Tight Coupling with the Software Stack (Neuron SDK)

The hardware and its driver and compiler stack are tightly integrated, so customers struggle when they get only the hardware. This is the same logic as NVIDIA selling not just GPUs, but the “DGX” as a complete server system.

Three Differentiators in Amazon’s External Sales Strategy

When Amazon enters the market as a hardware vendor, what advantages can it have against existing chip makers?

1. Packaged operational know-how optimized for AI: Amazon’s racks are not just a list of specs. They come as a package that includes the field knowledge gained from running the world’s largest cloud, such as configurations that are hard to break and easy to manage.

2. Strong margins and cost competitiveness: In-house designs like Trainium do not require paying an “NVIDIA tax” the way other vendors do. Because the IP is built internally, Amazon can keep margins that others cannot match in external sales, while still offering competitive prices to customers.

3. Human support and lifecycle management: It is not a “sell and forget” business. AWS can also package the human know-how of operating and maintaining large-scale infrastructure that it has built up over many years.

The Cloud Is Moving from “Invisible Compute” Toward “a Ready-Made AI Factory”

What I see in these moves is that the center of AI infrastructure value may shift from abstracted virtual resources toward optimized physical packages.

The traditional cloud created value by abstracting the physical layer away and giving customers on-demand access to compute resources.

But in AI, the difference in overall optimization — covering chips, memory, network, power density, cooling, and the software stack — directly becomes a difference in performance and cost.

When that is the case, what customers want is a ready-made physical environment, optimized for AI, that works immediately.

In that sense, what hyperscalers can sell outside is not just a rack. It could be a small AI factory that includes chips, servers, network, operational software, supply capacity, and design thinking.

Why External Rack Sales Could Be a Big Opportunity

The reason I feel strong upside here is that hyperscalers have already finished a huge amount of internal optimization for themselves.

They have designed chips, built servers, put together networks, and tuned cooling and power design. If they can take that output and, instead of only renting it internally or as a cloud service, productize it directly for outside customers, that is a pretty strong business.

From the customer’s point of view, it is faster to install a native hyperscaler rack than to design and procure AI infrastructure from scratch.

On top of that, it is not just a mix of parts. It comes with a configuration that has been hardened in real operations from day one. The time saving is big, and the cost of mistakes should also come down.

From the seller’s side, this is not just an extra product. It is a chance to pull forward, as a physical infrastructure sale, the value that used to be recovered slowly as cloud usage fees.

Profit Simulation: The Impact of External Sales

This part matters from an investor perspective, so I will deliberately simplify it. But the following is only my own estimate, not a company plan. It is a rough calculation that does not consider logistics costs, support costs, SG&A, maintenance, yield, discounts, utilization, tax effects, or accounting differences.

Also, this estimate does not assume that “AWS will redirect its existing cloud supply into external sales.” If hyperscalers sell too many racks outside, it would obviously reduce the supply they have for their own cloud and could hurt AWS’s own growth.

So the external sales room here only makes sense if additional physical capacity can be secured, or if a separate supply allocation can be built for external sales.

According to Amazon’s own explanation, the current annualized revenue of the chip business is over $20 billion, while with external sales included it could be in the range of $50 billion. If we take these numbers straight, the additional revenue room is:

ΔRevenue = $50B − $20B = $30B

This $30 billion can be seen as the potential additional revenue that AWS could pull out, through the external sales route, from the value that is currently locked inside internal AWS use. But again, this assumes enough additional facilities, power, and manufacturing capacity, and the risk of cannibalizing the existing cloud business cannot be ignored.

If we assume the gross margin of this rack external sales business is 60%, the incremental gross profit looks like this:

Incremental Gross Profit = $30B × 0.60 = $18B

More conservatively, at a 50% gross margin:

$30B × 0.50 = $15B

Even at a 40% gross margin:

$30B × 0.40 = $12B

So even with this very simple calculation, the picture is roughly $12B–18B of incremental gross profit on $30B of incremental revenue. From an investor’s viewpoint, these are large numbers.
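The arithmetic above can be packed into a few lines of code. This is a minimal sketch of my back-of-the-envelope model, using only the figures cited earlier (Jassy’s roughly $20 billion internal run rate versus roughly $50 billion with third-party sales); the gross-margin scenarios are my own assumptions, not company guidance.

```python
# Back-of-the-envelope model of AWS external rack sales.
# Figures are annualized revenue in billions of dollars, as cited above.
internal_run_rate_b = 20          # current chip business run rate (internal use)
potential_with_external_b = 50    # potential run rate including third parties

# Incremental revenue room if external sales are additive, not cannibalizing
delta_revenue_b = potential_with_external_b - internal_run_rate_b  # 30

# Gross-margin scenarios (assumed; Amazon does not disclose rack margins)
for margin in (0.40, 0.50, 0.60):
    gross_profit_b = delta_revenue_b * margin
    print(f"margin {margin:.0%}: incremental gross profit ~${gross_profit_b:.0f}B")
```

Changing the two input figures or the margin tuple is enough to re-run the scenario under different assumptions, which is all this model is meant for.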

What I find interesting is that this looks like “just a hardware sale,” but it really is not. If AWS actually moves into external rack sales, what it is selling is not metal boxes. It is the whole architecture centered on Trainium and Graviton.

That means not only revenue, but also a long tail of opportunity that includes software, SDKs, operational thinking, and future upgrade demand.

Concerns

I want to look at the risks calmly too.

First, customers will become more dependent on a specific architecture. With Amazon it will be Neuron, with Google it will be the TPU stack.

That convenience comes at the cost of flexibility.

Second, hardware becomes obsolete. Hardware optimized for AI — where technology moves quickly — will depreciate faster.

In the cloud, you can escape obsolescence by migrating to newer instances. But a rack that has been bought sits on the books as a depreciating asset. In a fast-moving AI market, that is not a small issue.

Third, installation, replacement, maintenance, and failure response all require an operational muscle different from providing a cloud.

If external sales really take off, hyperscalers will need to be not only “service companies” but also stronger as large-scale infrastructure manufacturers.

Conclusion

I think hyperscalers in the AI era have a significant new opportunity, beyond renting compute, in selling physical racks that include their own chips.

The reason is that the stronger AI demand becomes, the more value shifts away from the simple right to use a cloud, toward the ability to supply optimized physical infrastructure quickly.

And the players with the most of that ability are the hyperscalers. They are not just software companies. They are companies that can bundle land, power, cooling, chips, servers, networks, logistics, and operations — with both physical assets and know-how.

In other words, they are giant tech companies that can compete on physical things.

What I find interesting is that this flow is not simply an extension of “cloud expansion.” In a sense, it also looks like on-prem being reconsidered for the AI era. Of course, the old “build everything yourself” on-prem is not coming back as-is. But in AI, performance, power, cooling, and network optimization matter a lot, so there is growing value in placing a ready-made physical rack, designed with cloud thinking, on the customer’s side.

In that sense, the on-prem of the near future may be re-evaluated not as a regression to the old model, but as a new form in which hyperscaler-designed infrastructure is brought into the customer’s physical space.

The numbers Amazon showed suggest that when that physical advantage is exposed to the outside, the upside for revenue and profit may be larger than expected.

And if Google is also looking at renting TPUs outside and eventually selling them to customer data centers, then this is worth watching as a candidate for the next business model of hyperscalers overall.

AI-era infrastructure may no longer be a binary choice between cloud and on-prem. It may be entering an era where the compute platform designed by hyperscalers extends all the way into the customer’s physical space.
