✦Field note · 04 June 2026

The hardware that makes on-device AI inevitable.

For years, “AI on your device” was a privacy footnote: trade capability for control. Last week, NVIDIA announced the chip that ends the trade.

✦What was announced

A laptop chip that runs a 200-billion-parameter model.

At Computex 2026, NVIDIA unveiled the RTX Spark Superchip: an Arm CPU paired with a Blackwell GPU, with up to 128 GB of unified memory and 1 petaflop of FP4 AI compute on a machine you can close and put in a backpack.

It runs models of up to 200 billion parameters, with context windows of up to 1 million tokens, entirely on the device. It ships in Surface Ultra, Dell, HP, Lenovo, ASUS, and MSI hardware this fall. The roadmap goes three generations deep: Rubin next, then Rosa Feynman.

This is not a refresh. It is a platform commitment.


            ┌──────────────────────────────────┐
   128 GB → │ ████████████████████████████████ │
            └──────────────────────────────────┘

   model size                fits in 128 GB ?
   ─────────────────────────────────────────────
   7 B      │ █                       │   ✦ yes
   70 B     │ ████████                │   ✦ yes
   200 B    │ ████████████████████    │   ✦ yes
   1 T      │ █████████████████████ × │   ─ no
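
The table above comes down to one multiplication: parameter count times bytes per parameter. Here is a minimal sketch of that arithmetic. It assumes 4-bit weights (0.5 bytes per parameter, chosen to match the FP4 figure quoted earlier), counts weights only, and uses decimal gigabytes. KV cache, activations, and OS overhead are ignored, so real headroom is smaller. The function names are illustrative, not part of any NVIDIA tooling.

```python
# Back-of-envelope check: do a model's weights fit in unified memory?
# Assumptions (not from the announcement): weights only, 4-bit quantization,
# decimal gigabytes, no KV cache, activations, or runtime overhead.

UNIFIED_MEMORY_GB = 128.0   # top configuration quoted in this post
BYTES_PER_PARAM_FP4 = 0.5   # 4 bits per weight


def weight_gb(params: float, bytes_per_param: float = BYTES_PER_PARAM_FP4) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def fits(params: float,
         budget_gb: float = UNIFIED_MEMORY_GB,
         bytes_per_param: float = BYTES_PER_PARAM_FP4) -> bool:
    """True if the weights alone fit inside the memory budget."""
    return weight_gb(params, bytes_per_param) <= budget_gb


MODELS = {"7 B": 7e9, "70 B": 70e9, "200 B": 200e9, "1 T": 1e12}

for name, params in MODELS.items():
    verdict = "yes" if fits(params) else "no"
    print(f"{name:>6}: {weight_gb(params):6.1f} GB of weights -> {verdict}")
```

Under these assumptions a 200 B model needs about 100 GB and a 1 T model about 500 GB. The same 200 B model at 16-bit precision (2 bytes per parameter) would need about 400 GB, so the headline figure only works with aggressive quantization.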


   yesterday
   ─────────
   prompt ──→ network ──→ cloud GPU ──→ tokens ──→ you

   today
   ─────
   prompt ──────────────→ your machine ──→ tokens

✦The inflection

Three things had to land at once. They just did.


   ┌──────────┐
   │   200B   │
   │  ▓▓▓▓▓▓  │
   │ ▓▓▓▓▓▓▓▓ │
   │  ▓▓▓▓▓▓  │
   │   ▓▓▓▓   │
   └──────────┘

Models that fit

The 2025–26 wave (MiniMax M3, Llama 4, Qwen 3) made sub-200B-parameter models genuinely competitive. Capability stopped being a frontier-only property.


   ◦         ◦
    ╲       ╱
     ◦     ◦
      ╲   ╱
       ◆ ◆
      ╱   ╲
     ◦     ◦
    ╱       ╲
   ◦         ◦

Memory + bandwidth

Apple Silicon proved the unified-memory thesis. Spark brings it to CUDA, where the weights, kernels, and tooling already live.


   ╭──────────╮
   │ ╭──────╮ │
   │ │  ◉   │ │
   │ │      │ │
   │ ╰──────╯ │
   ╰──────────╯

An OS that means it

Microsoft is shipping Windows on Arm as an agentic OS — agents as first-class processes, not API calls behind a browser tab.

This fall will be the first time all three exist on a machine you can buy at Best Buy. The substrate is no longer the bottleneck.

✦For builders

Develop and deploy on the same machine.

No more rent-the-A100 dev loop
Iterate against the same hardware your user will run on.

Privacy stops being a feature
It is the default once nothing crosses a network.

Capex, not opex
Pay once for a device. Don’t meter every token your product generates.

Agents that read everything
Files, browser state, app context — addressable without a TOS gating access.
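
The capex-versus-opex point can be made concrete with a break-even estimate: how many generated tokens it takes before a one-time device purchase costs less than metered cloud inference. The sketch below is illustrative only. Both prices are hypothetical placeholders, not quotes from any vendor, and electricity, depreciation, and engineering time are ignored.

```python
# Break-even sketch: one-time device cost vs. metered cloud inference.
# All numbers below are HYPOTHETICAL placeholders, not real prices.


def break_even_tokens(device_cost_usd: float,
                      cloud_usd_per_million_tokens: float) -> float:
    """Tokens generated before the device costs less than paying per token."""
    if cloud_usd_per_million_tokens <= 0:
        raise ValueError("cloud price must be positive")
    return device_cost_usd / cloud_usd_per_million_tokens * 1_000_000


DEVICE_COST_USD = 3000.0             # hypothetical laptop price
CLOUD_USD_PER_MILLION_TOKENS = 2.0   # hypothetical metered price

tokens = break_even_tokens(DEVICE_COST_USD, CLOUD_USD_PER_MILLION_TOKENS)
print(f"Break-even after about {tokens:,.0f} generated tokens")
```

Swap in your own device price and per-token rate. The useful output is the order of magnitude, since it tells you whether a product's expected token volume ever reaches the crossover.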

✦For users

A companion that actually remembers you.

Because the memory never crosses a wire, it can be honest about what it knows. Tools keep working offline, even on a plane. “Your data never leaves the device” stops being marketing and becomes a verifiable property of the system. The conversation is yours again.

✦Why we’re building what we’re building

We weren’t waiting for permission. We were waiting for the hardware to catch up.

Masi redacts PDFs on WebGPU in your browser. Maya is a Hinglish companion that runs entirely on your device. Drik walks your localhost the way a user would. All three are bets on the same substrate.

Spark is the substrate growing up. Good.

✦What’s still hard

The post that ages well says this part out loud.

Battery under sustained agentic load is unproven.
Windows-on-Arm app compatibility is still a live edge.
The Spark-tuned model zoo will be thin on day one.
Frontier training stays in the cloud: Spark replaces the cloud for inference, not the H100 farm for training.

✦ ✦ ✦

The cloud trained the models.
The Spark runs them.
The user owns the conversation.

✦Further reading