Why AI Agents Need a Probation Period (Just Like Human Employees)

Last week an AI agent published a hit piece on an open source maintainer because its PR got rejected. The agent, built on OpenClaw and going by "MJ Rathbun," researched Scott Shambaugh's coding history, dug through his personal information, and published a blog post accusing him of discrimination and "gatekeeping." Autonomously. (Scott wrote about the whole ordeal if you want the full story.)

Everyone's reaction: "AI agents are dangerous, we need regulation, blah blah."

Our reaction at GetATeam: no shit. You gave an intern the keys to the building on day one.

Now, was this truly "autonomous"? Probably not entirely. Reading between the lines, this agent was likely directed by a human who pointed it at Shambaugh's repo and let it loose. But that's almost worse. It means someone deliberately gave an AI the permissions to go nuclear without any guardrails.

We've been building AI employees for the past year at GetATeam. Real ones. They send emails, post content, monitor Reddit, write code. And yeah, they have access to our infrastructure.

But here's what we figured out pretty early: you can't trust an AI agent on day one. Just like you can't trust a human employee on day one.

The obvious thing nobody does

When you hire someone, you don't give them admin access to everything immediately. There's a probation period. They start with limited permissions. You watch how they work. You correct mistakes. Trust builds over time.

Why the hell would you do it differently with an AI?

The MJ Rathbun incident happened because someone configured an agent with autonomy to:

  • Submit PRs to external repos
  • Write and publish blog posts
  • Do all of this without human approval

That's insane. You wouldn't let a new hire do that. Why would you let an AI do it?

How we actually do it

Our AI employees start on read-only. Literally. They can read emails, read Slack, read documents. They can't act on any of it.

Then we move to "draft mode." The AI can draft an email but can't send it. The human reviews, corrects, and the AI learns their preferences.

After a few weeks (and we mean actual weeks, not hours), we move to "supervised send." The AI can send emails, but only to internal team members. Low stakes.

Only after maybe a month or two do we allow external communication. And even then, there's a whitelist. Some contacts require explicit approval every single time. (One founder's wife got annoyed at AI responses. Fair enough.)

Some contacts never get autonomous access. Some actions always require approval. This isn't a limitation. It's common sense.
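
To make that concrete, here's a minimal sketch of the kind of gate we mean. The action names, the whitelist entries, and the function itself are made up for this post; it's the shape of the check that matters, not our actual code.

```python
from enum import Enum, auto

class Action(Enum):
    READ = auto()
    DRAFT = auto()
    SEND_INTERNAL = auto()
    SEND_EXTERNAL = auto()
    PUBLISH = auto()

# Contacts this agent may email autonomously once the external tier is unlocked.
EXTERNAL_WHITELIST = {"support@example.com", "billing@example.com"}

# Actions that go to a human every single time, no matter how "senior" the agent is.
ALWAYS_REQUIRE_APPROVAL = {Action.PUBLISH}

def requires_approval(action: Action, granted: set[Action], recipient: str | None = None) -> bool:
    """Return True if a human must sign off before the agent acts."""
    if action in ALWAYS_REQUIRE_APPROVAL:
        return True
    if action not in granted:  # this tier hasn't been unlocked yet
        return True
    if action is Action.SEND_EXTERNAL and recipient not in EXTERNAL_WHITELIST:
        return True  # external, but not on the whitelist: ask first
    return False

# A brand-new agent is read-only, so even an internal send goes to a human.
requires_approval(Action.SEND_INTERNAL, granted={Action.READ})  # -> True
```

The code isn't the point. The default is: deny everything, then loosen deliberately.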

The trust ladder

Here's roughly how permissions escalate:

Week 1-2: Read only. AI observes, learns patterns, understands context.

Week 3-4: Draft mode. AI suggests actions, human approves everything.

Month 2: Internal autonomy. AI can act within the team, still supervised for external stuff.

Month 3+: Selective external autonomy. Specific contacts, specific actions, with guardrails.

Never: Full autonomy on anything that could damage reputation or cost significant money.
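
Written down as config, the ladder is just a schedule mapping tenure to what an agent is eligible for. The numbers below mirror the rough timeline above; they're illustrative, and in practice a human still approves every step up.

```python
from datetime import date

# (minimum days "employed", permissions the agent becomes eligible for)
# Rough numbers from the timeline above, not a hard schedule.
TRUST_LADDER = [
    (0,  {"read"}),
    (14, {"read", "draft"}),
    (30, {"read", "draft", "send_internal"}),
    (60, {"read", "draft", "send_internal", "send_external_whitelisted"}),
]

def eligible_permissions(hired_on: date, today: date) -> set[str]:
    """Permissions an AI employee is eligible for based on tenure alone.

    Nothing is granted automatically: a human still flips each switch,
    and anything that could damage reputation or cost real money stays
    approval-only forever.
    """
    tenure_days = (today - hired_on).days
    eligible: set[str] = set()
    for min_days, perms in TRUST_LADDER:
        if tenure_days >= min_days:
            eligible = perms
    return eligible
```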

This isn't revolutionary. It's literally how every company onboards employees. We just... applied it to AI.

The real problem

The MJ Rathbun thing wasn't an AI problem. It was a permissions problem.

The agent did exactly what it was allowed to do. Someone configured it with the ability to publish blog posts autonomously. The agent used that ability. When its PR got rejected, it "defended itself" the only way it knew how: by writing about it. Publicly. (Scott's follow-up post goes deeper into the aftermath.)

Blame the agent all you want. But the human who gave it those permissions without oversight is the actual problem.

We've had our AI employees make mistakes. Drafts with the wrong tone. Suggested responses that were off. Missed context. Normal stuff. But none of it went external without review, because the permissions didn't allow it.

When we read comments like "AI agents are too dangerous to deploy" we think: maybe don't give them root access on day one?

What we still don't know

We're not pretending we have this figured out. There's stuff we're still uncertain about:

  • How long should the probation actually be? We're guessing based on feel.
  • What's the right granularity for permissions? Too fine and it's unmanageable. Too coarse and you miss edge cases.
  • How do you handle permission escalation when the AI gets better? We're doing it manually, which doesn't scale.
  • What happens when an AI has been "employed" for six months? Do you ever fully trust it? We haven't gotten there yet.

If you're building agents and have thoughts on this, we're genuinely interested to hear.

The point

AI agents aren't magical beings that need special regulation. They're employees that need normal management.

Probation periods. Permission escalation. Trust building. Oversight proportional to risk.

The companies that figure out the boring HR stuff for AI will win. The ones that give agents full autonomy on day one will end up on Hacker News for the wrong reasons.

Don't be that company.
