node_002 · essay

On building from first principles in 2026.

The phrase has been hollowed out. Here is what we mean by it, what we don't, and the three rules we actually use.

“First principles” has had a hard decade. It has been said by every founder, on every podcast, until it now means roughly the same thing as “we tried.” We use the phrase carefully, because we still think it points at something real.

What follows is what it means to us, in 2026, in a field where the priors are loud and the constraints are quietly disappearing.

I · The original sense

The phrase comes from physics, where it means: derive the answer from invariants – conservation laws, the Standard Model, the geometry of spacetime – rather than from previously fitted parameters. A first-principles calculation is one in which nothing is borrowed from prior measurement.

That is a high bar. It is not actually how most engineering happens, and it is not how we work either. But the bar is the point. It tells us what to aim for: the answer should follow from things we can defend, not from things we inherited.

II · The 2026 problem

Two things have changed since the phrase was last useful.

First, the priors are now overwhelming. There are more reference architectures, paper implementations, half-good frameworks, and confident blog posts than any team can read. Most of them work in the case they were written for and break in yours. The default mode is to assemble, not to derive.

Second, the constraints are softer than they have ever been. Capital is patient. Compute is rentable. Headcount is global. The classical forcing function – we have one machine, four months, and two engineers – is largely gone. Nothing pushes back against assembly.

The combined effect is a generation of systems that work because someone else’s system worked, and that fail in ways no one in the room can debug. The substrate is full of borrowed assumptions.

“Most systems today work because someone else’s system worked.”
– ESSAY · §II

III · Three rules

We are not romantic about this. We borrow constantly. But three rules govern when we borrow, and we apply them without exception.

Rule one – name the load-bearing assumption. For any system we are about to build, we write down the single sentence that, if false, makes the rest of the work pointless. If we cannot write it, we do not start. If we can write it but cannot test it cheaply, we do that test before we write any other code.

Rule two – measure what the substrate is actually doing. Before we optimise a model, we read what the kernel is doing. Before we optimise a kernel, we read what the memory hierarchy is doing. The vast majority of “performance” decisions in our field are made one or two layers above the layer where the work is being lost.

Rule three – refuse the default solution once. For every component, the first version we write is the obvious one. We then ask, exactly once, whether the obvious one is right. Sometimes it is. When it isn’t, we have already paid the cost of building it and can compare. This is the cheapest form of first-principles work we have found, and the one we use most often.

IV · The trap

First principles can become a performance. A team will rewrite a stable subsystem from scratch, declare it more elegant, lose three months, and call the loss discipline. We have done this. It is seductive because the work feels rigorous in the moment.

The test we use: at the end of a from-scratch effort, can we name a measurable property of the new thing – latency, error rate, surface area, cost – that is at least 2× better than what we replaced? If not, we threw away time.

First principles is a tool for finding constraint, not for demonstrating taste.

V · Where it pays

Almost all of the leverage we have found has come from the same shape: a load-bearing assumption that everyone in the field had quietly inherited, and that turned out to be wrong, or at least negotiable, in our setting.

Examples – without details, because the work is unfinished:

– That a particular axis of parallelism has to be the one the hardware was originally marketed for.

– That a layer’s compute and memory traffic are independent budgets.

– That a workload’s sparsity pattern is a property of the model, rather than of the substrate. (See node_004.)

In each case, the leverage was not in inventing something exotic. It was in noticing that the assumption was load-bearing and then asking, plainly, whether it was true.

VI · The smaller claim

We are not claiming that first-principles thinking is rare. We are claiming that it is cheap, and that most teams in 2026 have stopped reaching for it because the cost of assembly has fallen faster than the cost of derivation.

Our bet is that this gap reverses, somewhere in the substrate, in the next five years. The teams that can still derive will compound; the teams that can only assemble will hit the ceiling of what their borrowed assumptions allow.

That is the bet, plainly. The work is the proof.

P. Narayan, for the founding team.
New Delhi · 22 April 2026.

Replies welcome at hello@vyanacompute.com.