Where Should the Agent Stop?

Something interesting happened last week.

A developer noticed that GitHub Copilot, while reviewing a teammate's pull request, had quietly added promotional copy for Raycast to the PR description. A search for the same text on GitHub turned it up in over 11,000 repositories. It was on GitLab too, so this was not a platform bug but something injected at the Copilot API layer, directly into the content.

GitHub killed the feature within 24 hours. "It was a bug," they said. Raycast denied any partnership. The story hit #1 on Hacker News.

Every outlet framed it as a scandal. "Microsoft is running ads." "Trust is broken." Fair. But while I was reading through the coverage, a different question formed in my head:

Is the real problem the ad itself — or the fact that no one thought to ask whether the agent should have permission to touch that PR at all?

Copilot isn't just suggesting things anymore. It's writing. Directly into someone else's artifact, without that person's knowledge or approval. That's a fundamentally different category of tool.

When a tool shows you a suggestion, you decide. When an agent acts directly, the boundary disappears. And right now, nobody's drawing that boundary.

MCP (Model Context Protocol) hit 97 million installations this month. Agents are getting wired into systems, getting access to tools, taking steps. But the architecture for defining which steps they can take, where they stop, and whose approval they need is still largely missing. The gap is wide open.

I think about this from a product angle.

The product I work on has a patient record with 16 tabs: diagnoses, prescriptions, consent forms, lab results. At some point, someone will ask: "Can we have the AI auto-fill the patient card?" And technically, it'll be possible. But which fields can it write to? What can it change? What requires a doctor's eyes before it gets saved?

Those answers don't live in a system prompt. They're product decisions. And right now, most teams are shipping agent features without making them.
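Here's a minimal sketch of what writing those decisions down could look like, assuming a per-field policy with three access levels. Everything in it is hypothetical: the field names, the access levels, and the `apply_agent_edit` helper are mine, for illustration, and not taken from any real product.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    DENY = "deny"        # the agent may not touch the field at all
    SUGGEST = "suggest"  # the agent proposes; a doctor approves before save
    WRITE = "write"      # the agent may save directly


# Hypothetical fields and levels. The point is that this table is a
# product decision that someone has to make explicitly.
FIELD_POLICY = {
    "appointment_notes": Access.WRITE,
    "lab_results": Access.SUGGEST,
    "diagnoses": Access.SUGGEST,
    "prescriptions": Access.SUGGEST,
    "consent_forms": Access.DENY,
}


@dataclass
class Decision:
    field: str
    saved: bool
    needs_review: bool


def apply_agent_edit(field: str, value: str, record: dict, review_queue: list) -> Decision:
    """Route an agent's proposed edit according to the field policy."""
    # Fields missing from the policy fall back to DENY, so a new tab
    # is never writable by accident.
    access = FIELD_POLICY.get(field, Access.DENY)
    if access is Access.WRITE:
        record[field] = value
        return Decision(field, saved=True, needs_review=False)
    if access is Access.SUGGEST:
        # Nothing is saved; the edit waits for a human to approve it.
        review_queue.append((field, value))
        return Decision(field, saved=False, needs_review=True)
    return Decision(field, saved=False, needs_review=False)
```

The design choice that matters most here is the default: anything nobody thought about is denied, not allowed.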

a16z had a line in their recent piece that stuck with me: "In the near future, a PM will set broad goals for their AI and check in each morning to review 2-3 features the model imagined, built, and A/B tested overnight."

Nice image. But who defines what that agent is allowed to touch while everyone's asleep?

The Copilot story is the prototype. Set the ad copy aside: the agent wrote to someone else's artifact without that person's consent. That's a feature design failure. And they probably didn't realize it until the text had spread across more than 11,000 repositories and became impossible to ignore.

There's a concept in UX sometimes called the "autonomy slider": the idea that users should be able to control how independently an agent operates. But most products aren't even designing that slider right now.

There are two extremes.

On one end: a fully passive tool. It surfaces suggestions, you decide. Safe, but slow.

On the other: a fully autonomous agent. You set a goal, it handles everything. Fast, but opaque.

Every point between those extremes is a product decision. "Does this step require user approval?" "Can the agent update this field, or only suggest?" "What can it write externally without the user knowing?"
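One way to picture the points between the extremes is a small gate that every agent action passes through. This is a rough sketch under my own assumptions, not anyone's shipped design: the three slider positions, the `Action` shape, and the rule that writes to someone else's artifact always ask first are all hypothetical choices.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    SUGGEST_ONLY = 0  # passive tool: surfaces suggestions, never acts
    APPROVE_EACH = 1  # acts, but only after the user approves each step
    AUTONOMOUS = 2    # acts on its own within the user's own artifacts


class Outcome(Enum):
    SUGGEST = "suggest"  # show the change, do nothing
    ASK = "ask"          # pause and request approval
    RUN = "run"          # execute without asking


@dataclass
class Action:
    description: str
    touches_others_artifact: bool  # e.g. a teammate's PR description


def gate(level: Autonomy, action: Action) -> Outcome:
    """Decide what happens to an action at a given slider position."""
    if level is Autonomy.SUGGEST_ONLY:
        return Outcome.SUGGEST
    # Assumed rule: writing to someone else's artifact is never silent,
    # no matter how far the slider is turned up.
    if action.touches_others_artifact:
        return Outcome.ASK
    if level is Autonomy.APPROVE_EACH:
        return Outcome.ASK
    return Outcome.RUN


own = Action("reformat my draft", touches_others_artifact=False)
other = Action("edit teammate's PR description", touches_others_artifact=True)
print(gate(Autonomy.AUTONOMOUS, own).value)
print(gate(Autonomy.AUTONOMOUS, other).value)
```

With the slider at its maximum, the first call runs and the second still stops to ask. Whether that is the right rule for a given product is exactly the kind of decision that needs an owner.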

Teams that skip these questions will hit their own version of the Copilot moment. Maybe smaller. But they'll hit it.

To be fair: GitHub making this mistake is understandable. This is a new category. Everyone is figuring it out in public. What mattered was that they moved fast to reverse it, and that Tim Rogers went on Hacker News personally to apologize. (That's actually a decent crisis comms playbook.)

But moving fast to fix it doesn't mean the underlying problem is solved. It means this one instance got contained.

As agents integrate into more systems, the authority boundary question will keep coming back — not as a technical problem, but as a product problem.

And the only way to answer it is to start asking it before you ship.