Shipping a feature in two hours is no longer a meaningful achievement. Last week I wrote up a feature idea, handed it to Claude, built it, tested it, deployed it. Done before lunch.
But I noticed something: the quality of the output depended entirely on a few paragraphs I wrote beforehand.
The story usually told about AI goes like this: "Anyone can build products now, so what do PMs even do?" Wrong question. The right one: what can't AI do?
The answer is clear. It's bad at generating ideas. Bad at prioritization. And most critically — it's terrible at deciding what "good" looks like.
a16z observed something worth sitting with: AI models can now design and run their own A/B tests. But the ideas they generate are "bland, derivative, and generally lacking spark." The execution is there. The taste isn't.
That's everything for a PM.
Writing specs used to be an administrative task. You'd throw something into Jira, tell the developer to ask if confused, then spend the review meeting discussing what went sideways. The spec wasn't a precise expression of intent — it was a formality.
Now the spec is the product.
Tell an AI agent "resolve appointment conflicts" and you get something vague. Tell it "if two appointments overlap, cancel the second one created, notify the clinician, send the patient an SMS without mentioning the cancellation reason" — you get exactly what you wanted.
The difference isn't execution capability. The difference is spec quality.
I see this concretely working on DentalBulut. Writing "show communication history on the patient card" versus writing "show the last 5 communications on the patient card — type (SMS/WhatsApp/call), date, and a content summary, sorted newest to oldest, with full content expanding on click" — both are specs, but one produces something, the other produces a conversation.
A large portion of what I actually do now is this: know the system well enough to describe what I want precisely. And be able to say what "sufficient" means versus what "good" means.
Some people call this "taste." I think it's more accurate to call it: contextual evaluation capacity.
So what does this change?
First: tolerating ambiguity is no longer a virtue.
"User experience will be improved" might have passed as a spec in 2019. Not anymore. AI gives vague output for vague specs. And fixing vague output takes longer than writing a clear spec in the first place.
Second: the gap between "knowing what you want" and "being able to describe what you want" has widened.
Most PMs know what they want — or at least feel it. But writing that down in precise, testable, measurable language is a different skill. And that skill is currently worth at least as much as technical knowledge.
The evals question connects here too.
AI built something. Who decides whether it built it correctly? Answering that question requires having defined what "correct" looks like beforehand. Which means: spec, again.
Writing test criteria, defining quality thresholds, anticipating edge cases before they appear — these are now PM work. Or at minimum, the knowledge a PM needs to hand off to an AI. Nobody else carries that domain context.
"Vibe coding" was a moment. Building from feeling. Nice idea, but vibes aren't enough in product management. You need the vibe made concrete.
Maybe the new PM skill is "vibe spec" — being able to externalize your intuition. Write down what you want, why you want it, what it should look like, and how you'll know if it's working.
Execution got cheap. The ability to describe got valuable.