Global smartphone shipments dropped 13% last quarter. Memory prices are up. The average phone now costs $523. The cheap Android segment — the one that put a general-purpose computer in billions of pockets — is becoming economically unviable.
This matters more than it looks like it matters.
We’ve been here before. The history of computing is a pendulum between centralized and distributed intelligence:
- Mainframes → dumb terminals.
- PCs → intelligence in the device.
- Web → partial re-centralization through the browser.
- Smartphones → compute democratized again, a supercomputer in every pocket.
Now the pendulum swings back. Not because anyone decided it should, but because the economics demand it. Memory gets expensive. Low-end devices die. Users migrate to cheaper hardware that offloads compute to the cloud. The intelligence layer moves server-side. The device becomes a terminal again.
The AI era accelerates this. Your phone runs apps; apps call frontier models; frontier models run on GPU clusters that cost billions to build and are owned by a handful of companies. NVIDIA’s $68 billion quarter isn’t a product success story — it’s a toll booth on the road to intelligence.
The Democratic Disguise
Here’s what makes this cycle different from the mainframe era: it looks democratic. API access is cheap or free. The UX is consumer-grade. Anyone with a browser can talk to a frontier model. The access layer has never been more open.
But access isn’t ownership. The actual compute — the model weights, the training runs, the inference clusters — sits behind capital expenditure that maybe ten entities on Earth can sustain. You’re renting intelligence from a landlord who can change the terms.
Horizontal adoption, vertical control. AI spreads everywhere. The substrate stays concentrated. The thin client future makes this structure explicit again, after decades of it being hidden inside the device.
The Hardware Trap
There’s an obvious objection: open-source models. Alibaba’s Qwen3.5 matches Claude Sonnet on benchmarks and fits on a 32GB GPU. Run it locally. Break the dependency. The counter-ratchet is already underway.
Except: have you priced a 32GB GPU lately?
We’re talking an RTX 5090 at minimum — around $2,000 at list price, more if you’re shopping right now; the 24GB RTX 4090 doesn’t even clear the bar. And the same memory shortage killing cheap Androids is killing affordable GPU memory. AI training and inference demand has sent VRAM prices upward for years. The components that would let you escape cloud dependency are getting more expensive for exactly the same reason cloud dependency is growing.
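Why is 32GB the threshold? The back-of-envelope arithmetic is simple: weight memory is parameter count times bits per parameter. A minimal sketch, using a hypothetical 32-billion-parameter model for illustration (an assumption, not a claim about Qwen3.5’s actual size), shows why quantization is what makes local inference possible at all:

```python
def weight_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate VRAM needed just to hold the model weights.

    Ignores KV cache and activation overhead, which add more on top.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# A hypothetical 32B-parameter model at different quantization levels:
print(weight_memory_gb(32, 16))  # fp16: 64.0 GB -- won't fit on a 32GB card
print(weight_memory_gb(32, 8))   # int8: 32.0 GB -- borderline, no room for KV cache
print(weight_memory_gb(32, 4))   # int4: 16.0 GB -- fits, with headroom
```

The binding constraint, in other words, is memory capacity, not raw compute — which is exactly why a memory shortage hits this escape route so hard.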
The counter-ratchet doesn’t liberate you from the capital gate. It relocates it. Instead of paying monthly to OpenAI, you pay $2,000 upfront for the hardware. The landlord changes; the rent doesn’t disappear. And for most people — the billions whose primary computing device is a phone — even the expensive hardware option isn’t really an option.
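The rent-versus-buy tradeoff reduces to a breakeven calculation. A minimal sketch, assuming a $20/month subscription as the cloud baseline (an illustrative figure, not any specific provider’s price):

```python
def breakeven_months(hardware_cost: float, monthly_cloud_cost: float) -> float:
    """Months of cloud subscription that equal the upfront hardware price.

    Ignores electricity, depreciation, and the cloud model improving under you.
    """
    return hardware_cost / monthly_cloud_cost

# $2,000 GPU vs. a hypothetical $20/month frontier-model subscription:
print(breakeven_months(2000, 20))  # 100 months -- over eight years
```

Even on favorable assumptions, the upfront path only pays off on a multi-year horizon — a horizon over which the hardware depreciates and the hosted models keep improving. The capital gate holds either way.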
This is the shape of it: capital at the inference layer, or capital at the hardware layer. The ratchet goes all the way down.
What This Actually Means
The cheap Android smartphone was how most of the world got access to computation. Not a PC. Not a data center. A phone. If that segment collapses — and the economics suggest it will — billions of people’s relationship to intelligence changes. Not because they chose it, but because memory prices chose it for them.
Chor Pharn wrote this week that “disruption is not a property of the model — it’s a property of the surface it lands on.” The US is maximally exposed because its economy is already legible and standardized, already software-eaten, already bitable. The thin client ratchet lands hardest where the surface is smoothest.
The question I keep coming back to isn’t technical. It’s: who governs the substrate? The access layer is open. The infrastructure layer is not. Every prior infrastructure revolution — railroads, electricity, telephony — ended up requiring public governance of some kind, because a critical substrate in private hands doesn’t stay benign indefinitely.
We haven’t had that conversation about compute yet. We’re still debating the applications while the infrastructure concentrates underneath us.
The ratchet doesn’t care.