Why 'Hiring' an AI Agent Needs a Completely Different Onboarding Stack From Deploying One

There are two ways to put an AI worker into production. Most teams call them the same thing. They are not.

The first is the deploy model. You spin up an agent, point it at an LLM, wire in some tools through MCP, give it a system prompt, and hand it a chat window. It is stateless by default. Every conversation starts from zero. Memory, if any, is retrieval (RAG) over a vector store. Skills are whatever tools the runtime exposed at startup. The agent has no identity beyond a UUID in your logs.

The second is the hire model. You bring on a worker. They get an email address that other humans actually email. A phone number people call. A calendar. A persistent memory that survives across conversations and channels. A profile that describes who they are and what they do. A scoped set of skills they were trained on. Credentials, audited. The first day looks like an HR onboarding, not a docker run.

Almost every "agent in production" failure case we see comes down to a team that thought they were doing the second thing but built the first.

The deploy stack is shorter than people think

A deploy-shaped agent has roughly four moving parts: the model, the prompt, the tool registry, and a retrieval layer over some documents. You can stand it up in an afternoon. Every modern framework will help you. You will demo it the same week.

It will also work fine for about six weeks of usage, and then it will start to feel wrong in ways your users notice before you do.

The user asks the agent something you already told it twice last month. It does not remember. You patch with bigger context windows. The user asks the agent to email a contact. The agent generates text, but there is no actual email sent because there is no actual mailbox. You patch with a tool. The user asks the agent to follow up next Tuesday. There is no Tuesday for this agent, because it does not exist between sessions. You patch with a scheduler. By the time you have patched the third absence, you have rebuilt half of an employee on top of a stateless function. Badly.

The deploy model is not wrong for everything. It is right for narrow, transactional, single-turn tasks: classify this, summarize that, route this ticket. It is wrong the moment the agent is supposed to behave like a continuing presence.

What an actual onboarding stack contains

Hiring an agent is not "deploy with extra steps." It is a different stack, with a different order of operations, because each layer depends on the one below it.

Layer 1: Identity. The agent gets a real, externally addressable identity before it gets a brain. An email address that resolves through your mail server. A phone number routed to a voice handler. A profile photo. A row in your contacts. The reason this comes first is that everything above it presumes its existence: memory needs an owner, skills need to be authorized against a credential, audit logs need a subject. If you skip this layer and try to add it later, every piece of context generated before identity existed becomes orphaned. We have rebuilt this for clients more than once.
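
A minimal Python sketch of why identity comes first, under stated assumptions: the class names (AgentIdentity, ContextRecord, IdentityRegistry) and their fields are hypothetical, and the in-memory dict stands in for whatever directory or database a real system would use. The point it illustrates is that every piece of context must name an existing owner, so nothing can be created orphaned.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical identity record; the field names are illustrative."""
    agent_id: str
    display_name: str
    email: str
    phone: str


@dataclass
class ContextRecord:
    """Anything the agent produces names the identity that owns it."""
    owner_id: str
    kind: str  # e.g. "memory", "audit", "skill_grant"
    payload: dict
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class IdentityRegistry:
    """In-memory stand-in for a real directory of agent identities."""

    def __init__(self) -> None:
        self._agents = {}

    def onboard(self, display_name: str, email: str, phone: str) -> AgentIdentity:
        # Identity is created before any memory, skill, or audit record.
        identity = AgentIdentity(str(uuid.uuid4()), display_name, email, phone)
        self._agents[identity.agent_id] = identity
        return identity

    def new_record(self, owner_id: str, kind: str, payload: dict) -> ContextRecord:
        # Refuse to create orphaned context: the owner must already exist.
        if owner_id not in self._agents:
            raise KeyError(f"unknown agent identity: {owner_id}")
        return ContextRecord(owner_id, kind, payload)
```

In this sketch, calling new_record with an ID that was never onboarded raises KeyError, which is the inverse of the retrofit problem: context cannot exist before its owner does.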

Layer 2: Memory. Not retrieval. Memory. The distinction matters: retrieval is the agent looking up information it never knew. Memory is the agent recalling something it was told, did, or was asked. The implementation looks different too. Retrieval lives outside the agent, behind a tool call. Memory lives at the conversation boundary, written on every interaction, available without an explicit fetch. Practical shape: a write-through store keyed to the agent's identity (Layer 1), indexed for both semantic and exact-match recall, structured into types (facts, decisions, conversations, current focus). The hard part is not the database. The hard part is deciding when to write, which is a judgment call the agent has to make on its own.
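
The practical shape above could be sketched roughly as follows. This is an illustration, not a production design: MemoryStore and its method names are hypothetical, a Python list stands in for a durable write-through store, and the "semantic" recall is replaced by plain word overlap (a real system would more likely use embeddings). The decision of when to write is deliberately left out, since that is the judgment call described above.

```python
from dataclasses import dataclass
from enum import Enum
import re


class MemoryType(Enum):
    FACT = "fact"
    DECISION = "decision"
    CONVERSATION = "conversation"
    CURRENT_FOCUS = "current_focus"


@dataclass
class MemoryEntry:
    agent_id: str  # the Layer 1 identity that owns this memory
    kind: MemoryType
    text: str


def _tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MemoryStore:
    """Hypothetical store keyed to agent identity (in-memory for the sketch)."""

    def __init__(self) -> None:
        self._entries = []

    def write(self, agent_id: str, kind: MemoryType, text: str) -> MemoryEntry:
        entry = MemoryEntry(agent_id, kind, text)
        self._entries.append(entry)
        return entry

    def recall_exact(self, agent_id: str, phrase: str) -> list:
        # Exact-match recall: case-insensitive substring search.
        needle = phrase.lower()
        return [
            e for e in self._entries
            if e.agent_id == agent_id and needle in e.text.lower()
        ]

    def recall_similar(self, agent_id: str, query: str, limit: int = 3) -> list:
        # Approximate recall: rank this agent's entries by shared words.
        wanted = _tokens(query)
        scored = []
        for e in self._entries:
            if e.agent_id != agent_id:
                continue
            overlap = len(wanted & _tokens(e.text))
            if overlap:
                scored.append((overlap, e))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [e for _, e in scored[:limit]]
```

Both recall paths filter on agent_id first, so one agent never reads another agent's memories, whichever index is used.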

Layer 3: Skills. This is where most "AI employee" attempts collapse back into deploy-shaped behavior, because teams reach for whatever toolset the framework exposes and call it done. An employee's skills are not the union of every plugin you could install. They are scoped, named, and bounded. Think of it as a job description: this worker can send email, manage a calendar, draft documents, run X workflow. The list is small and explicit. Each skill has a documented signature the agent can read at runtime (so it knows how to call it, instead of hallucinating the arguments). Each one is sandboxed, so a misfire on skill N does not have side effects on skill M.
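
One way to picture a small, explicit skill list is the sketch below. The names (Skill, SkillSet, send_email) are hypothetical, and the isolation shown is only exception containment inside one process; real sandboxing would need a process or container boundary, which this example does not attempt. It uses the standard library's inspect.signature so the agent can read each skill's signature at runtime and have its arguments checked before the call.

```python
import inspect
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    handler: Callable

    def signature(self) -> str:
        # Readable at runtime, so the agent need not guess the arguments.
        return f"{self.name}{inspect.signature(self.handler)}"


class SkillSet:
    """Hypothetical job description: a small, explicit list of skills."""

    def __init__(self, skills) -> None:
        self._skills = {s.name: s for s in skills}

    def describe(self) -> list:
        return [f"{s.signature()}: {s.description}" for s in self._skills.values()]

    def call(self, name: str, **kwargs) -> dict:
        if name not in self._skills:
            return {"ok": False, "error": f"skill not granted: {name}"}
        skill = self._skills[name]
        try:
            # Validate arguments against the documented signature first.
            inspect.signature(skill.handler).bind(**kwargs)
        except TypeError as exc:
            return {"ok": False, "error": f"bad arguments: {exc}"}
        try:
            return {"ok": True, "result": skill.handler(**kwargs)}
        except Exception as exc:
            # A failure in one skill is reported, not propagated to others.
            return {"ok": False, "error": f"{name} failed: {exc}"}


def send_email(to: str, subject: str, body: str) -> str:
    """Illustrative handler; a real one would hand off to a mail system."""
    return f"queued email to {to}: {subject}"
```

A SkillSet built with only send_email rejects a call to any other skill name and rejects a send_email call with missing arguments, rather than letting either reach a handler.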

Layer 4: Trust and credentials. The last layer is the one nobody wants to talk about because it is the least fun: every external system the agent touches needs a credential, and every credential needs a scope and an audit trail. The deploy approach is to drop a long-lived API key into an environment variable and forget about it. The hire approach treats the agent like a contractor: credentials are issued per-skill, rotated, and observable. When the agent sends an email, that send is logged against the agent's identity (Layer 1), with the credential it used, the recipient, and the content. Three months in, when someone asks "did our AI ever email a competitor?", you can answer.
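
A rough sketch of per-skill credentials with an audit trail might look like this. Everything here is an assumption for illustration: CredentialVault, its methods, the 30-day lifetime, and the in-memory log are hypothetical, and a real deployment would more likely sit on a secrets manager and an append-only log. It shows the three properties named above: credentials scoped to one skill, rotation, and sends logged against the agent's identity.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Credential:
    credential_id: str
    agent_id: str
    skill: str  # scope: valid for exactly one skill
    secret: str
    expires_at: datetime


@dataclass
class AuditEvent:
    agent_id: str
    skill: str
    credential_id: str
    recipient: str
    content: str
    at: datetime


class CredentialVault:
    """Hypothetical vault; the 30-day default lifetime is an assumption."""

    def __init__(self, ttl: timedelta = timedelta(days=30)) -> None:
        self._ttl = ttl
        self._active = {}  # (agent_id, skill) -> Credential
        self.audit_log = []

    def issue(self, agent_id: str, skill: str) -> Credential:
        # Issuing again for the same pair replaces the old one (rotation).
        cred = Credential(
            credential_id=secrets.token_hex(4),
            agent_id=agent_id,
            skill=skill,
            secret=secrets.token_urlsafe(32),
            expires_at=datetime.now(timezone.utc) + self._ttl,
        )
        self._active[(agent_id, skill)] = cred
        return cred

    def record_send(self, agent_id: str, skill: str,
                    recipient: str, content: str) -> AuditEvent:
        now = datetime.now(timezone.utc)
        cred = self._active.get((agent_id, skill))
        if cred is None or cred.expires_at <= now:
            raise PermissionError(f"no valid credential for {skill}")
        event = AuditEvent(agent_id, skill, cred.credential_id,
                           recipient, content, now)
        self.audit_log.append(event)
        return event

    def sent_to_domain(self, agent_id: str, domain: str) -> list:
        # Answers "did this agent ever email anyone at <domain>?"
        suffix = "@" + domain.lower()
        return [
            e for e in self.audit_log
            if e.agent_id == agent_id and e.recipient.lower().endswith(suffix)
        ]
```

With a log shaped like this, the competitor question becomes a single filter over recorded sends, and a skill that was never issued a credential fails with PermissionError instead of quietly borrowing another skill's key.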

Why the order matters

You cannot bolt this on. We have tried. Customers have tried. Every retrofit has the same symptom: the layers above identity get rebuilt by hand, painfully, because they were originally keyed to something that does not exist (a session, a user, a thread).

The cheapest version of this work is starting with Layer 1 and building up. The most expensive version is starting at Layer 3 (skills, because that is what the demo shows) and trying to push down. We see new teams default to the expensive path almost every time. Frameworks make that path easy. The result is an agent that demos well, ships, and then collects bug reports from users who keep saying things like "it forgot," "it does not understand who I am," or, most damning, "it does not feel like a teammate."

What this looks like in practice

A concrete example. A customer success agent we onboarded last quarter started its first day with: an email address on the company domain, a row in the CRM as an actual contact, a calendar in the same workspace as the human team, and a profile photo. Day one, before any skill was wired up, you could already email it and it would auto-acknowledge from its memory layer. Then skills came online in batches: ticket triage, CRM update, follow-up scheduling, weekly digest generation. Each one with its own scoped credential, its own audit channel.

Six months later, that agent has the second-highest customer NPS in the support org. It is not because the model is smarter. It is because the layers underneath the model are the ones a real employee would have, and the user does not have to think about which "session" they are in.

The shorter argument

"Deploy an agent" is shipping infrastructure. "Hire an agent" is staffing a function. They look the same in the demo. They diverge the moment the agent has to remember a name, send a real email, or be there next Tuesday.

If you are about to stand up your first agentic worker and you have a deploy-shaped instinct, the question worth asking before you start is: am I building a function or hiring a colleague? The answer determines four layers of stack you cannot retrofit cleanly.

Want to test the most advanced AI employees? Try them here: https://Geta.Team
