atman

Field note · 04 June 2026

The hardware that makes on-device AI inevitable.

For years, “AI on your device” was a privacy footnote: trade capability for control. Last week, NVIDIA announced the chip that ends the trade.

What was announced

A laptop chip that runs a 200-billion-parameter model.

At Computex 2026, NVIDIA unveiled the RTX Spark Superchip: an Arm CPU paired with a Blackwell GPU, with up to 128 GB of unified memory and 1 petaflop of FP4 AI compute on a machine you can close and put in a backpack.

It runs models of up to 200 billion parameters, with context windows of up to 1 million tokens, entirely on the device. Ships in Surface Ultra, Dell, HP, Lenovo, ASUS, and MSI hardware this fall. The roadmap goes three generations deep: Rubin next, then Rosa Feynman.

This is not a refresh. It is a platform commitment.


            ┌──────────────────────────────────┐
   128 GB → │ ████████████████████████████████ │
            └──────────────────────────────────┘

   model size                fits in 128 GB?
   ─────────────────────────────────────────────
   7 B      │ █                       │   ✦ yes
   70 B     │ ████████                │   ✦ yes
   200 B    │ ████████████████████    │   ✦ yes
   1 T      │ █████████████████████ × │   ─ no
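
The table above can be sanity-checked with a few lines of Python. This is a rough sketch, not a sizing tool. It assumes weights are stored at 4 bits per parameter (matching the FP4 figure quoted above) and counts weight storage only. KV cache, activations, and operating-system overhead are ignored, so real headroom is smaller. The function names are illustrative.

```python
# Back-of-the-envelope check: do a model's weights fit in unified memory?
# Assumption: 4 bits per parameter (FP4), weights only, 1 GB = 1e9 bytes.

def weight_gigabytes(params_billions: float, bits_per_param: int = 4) -> float:
    """Approximate weight storage in GB for a model of the given size."""
    bytes_per_param = bits_per_param / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bytes_per_param


def fits(params_billions: float, memory_gb: float = 128,
         bits_per_param: int = 4) -> bool:
    """True if the weights alone fit within memory_gb."""
    return weight_gigabytes(params_billions, bits_per_param) <= memory_gb


# Same rows as the table: 7 B, 70 B, 200 B, 1 T (= 1000 B).
for size in (7, 70, 200, 1000):
    verdict = "yes" if fits(size) else "no"
    print(f"{size:>5} B  {weight_gigabytes(size):>6.1f} GB  {verdict}")
```

Under these assumptions a 200B model needs about 100 GB for weights and fits, while a 1T model needs about 500 GB and does not. At 8 bits per parameter the 200B model would need about 200 GB, so the "yes" in the table depends on 4-bit weights.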

   yesterday
   ─────────
   prompt ──→ network ──→ cloud GPU ──→ tokens ──→ you

   today
   ─────
   prompt ──────────────→ your machine ──→ tokens

The inflection

Three things had to land at once. They just did.

Models that fit

The 2025–26 wave (MiniMax M3, Llama 4, Qwen 3) made sub-200B-parameter models genuinely competitive. Capability stopped being a frontier-only property.

Memory + bandwidth

Apple Silicon proved the unified-memory thesis. Spark brings it to CUDA, where the weights, kernels, and tooling already live.

An OS that means it

Microsoft is shipping Windows on Arm as an agentic OS — agents as first-class processes, not API calls behind a browser tab.

This is the first moment all three come together on a machine you will be able to buy at Best Buy. The substrate is no longer the bottleneck.

For builders

Develop and deploy on the same machine.

For users

A companion that actually remembers you.

Because the memory never crosses a wire, it can be honest about what it knows. Tools work on a plane. “Your data never leaves the device” stops being marketing and becomes a verifiable property of the system. The conversation is yours again.

Why we’re building what we’re building

We weren’t waiting for permission. We were waiting for the hardware to catch up.

Masi redacts PDFs on WebGPU in your browser. Maya is a Hinglish companion that runs entirely on your device. Drik walks your localhost the way a user would. All three are bets on the same substrate.

Spark is the substrate growing up. Good.

What’s still hard

The post that ages well says this part out loud.

✦ ✦ ✦

The cloud trained the models.
The Spark runs them.
The user owns the conversation.