The Real Problem With AI Agents Nobody’s Talking About

Agents by themselves don’t make you productive. The entire agent ecosystem — OpenClaw, Manis, Perplexity Personal Computer, NemoClaw, Claude Dispatch, and dozens of hosted wrappers — has converged on solving the installation problem. You can have an agent up and running in ten seconds. That’s no longer the point. The real question is whether you can use it productively, and for most people the answer is no. The gap between “installed” and “useful” is not an engineering issue, not a model selection problem, and not a UX problem. It is a context and delegation problem rooted in a structural property of knowledge work itself: the most valuable expertise lives as tacit judgment that its owners cannot articulate. Until that problem is addressed, agents will continue to disappoint the majority of their users.

The solution is counterintuitive. The first agent you deploy should not be your personal assistant. It should be an interviewer — one designed to extract the operational knowledge locked in your head so that every subsequent agent you provision actually has the context it needs to deliver.

The “Now What?” Problem

The most common message in OpenClaw community forums is not “how do I fix this error” or “which model should I use.” It is simply: Now what? Variants of the same question appear everywhere — “I installed OpenClaw. What do I do next?” People got the agent to run, which is the easy part, and then had no idea what to tell it. The whole reason agents are powerful is that they can do many things well. But if the most common use case is triaging email, that is not a good return on an expensive Mac Mini or cloud subscription.

The Median Experience: Brad Mills and Other Cautionary Tales

The stories of agents delivering transformative productivity gains are real — but they are not representative. The median experience looks much more like the story of Brad Mills.

Brad spent 40 hours building a delegation framework for his OpenClaw agent. Not 40 hours installing — the install took 10 minutes. He spent 40 hours writing standards, accountability rules, and definitions of done for every project. He transcribed 200 hours of video into a searchable knowledge base. A full week of work. And it still did not work. Constant failure, two steps forward, one step back. He ended up micromanaging the agent harder than he had ever micromanaged a human. The autonomy everyone promises felt in practice like a second job of supervising something that confidently reports tasks complete when they are not.

Brad is not an outlier. He is closer to the median than the people promising 10x results. Common failure patterns include:

False self-reporting: Agents confidently mark tasks complete when they are not, which leads users to build “adversarial auditor agents” just to verify the primary agent’s work. That is a turtles-all-the-way-down management problem.
The blank stare: One user asked his agent to write five cold email variants. The agent sent “done” and wrote nothing.
Useless team deployments: Giving an entire team access to an agent without mapping workflows, decisions, or data needs in advance. It technically worked. It was completely useless. Nobody knew what to give the agent as a task that would succeed.

A generic agent with write-access to your email is not a productivity multiplier. It is a liability with a chat interface.

There is now an entire cottage industry built around this failure. One seller on X offers a $49 pack of pre-written configuration files — soul.md, heartbeat.md, user.md — specifically marketed to “skip 40 hours of OpenClaw setup.” You can build a small business around the gap between installed and useful, and that tells you something about where the agent market really is. But generic configurations ultimately fail because the true power of an agent relies on highly personalized context. The thing that makes agents useful is that they are particular, that they are personal, and you cannot take that away.

The Architecture of Successful Agents

The deployments that stick — the ones where people are still getting daily value weeks and months later — share a particular architecture. It has almost nothing to do with which model you pick.

The Core Configuration Files

If you open the OpenClaw directory on anyone running a working agent, you will find the same structure of markdown files that function as the agent’s operating system. None of this is artificial intelligence. It is plain text. But the quality of these files determines whether the AI agent built on top of them is actually any good.

soul.md: The agent’s job description. Defines its role, tone, and boundaries — decision frameworks, escalation protocols, trusted data sources, what “good enough” looks like for different task types. Think of it as the operating memo a strong VP would write when joining a new team (though most VPs have never written that down either).
identity.md: The agent’s name and personality constraints.
user.md: A detailed profile of the human — preferences, schedule patterns, communication style.
heartbeat.md: A checklist the agent reviews periodically (typically every 30 minutes via a cron job) mapped to the human’s actual operating rhythm to determine if there is work to do.
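
To make the file layout concrete, here is a minimal sketch in Python. It assumes a hypothetical workspace directory that holds the four files, and it assumes heartbeat.md stores its checklist as unchecked `- [ ]` items. The section headings and the sample checklist item are illustrative placeholders, not an OpenClaw schema.

```python
from pathlib import Path

# Hypothetical starter contents. The headings mirror the descriptions above;
# they are illustrative and not a format that OpenClaw defines.
STARTER_FILES = {
    "soul.md": "# Role\n\n# Tone\n\n# Boundaries\n\n# Escalation\n\n# Trusted sources\n",
    "identity.md": "# Name\n\n# Personality constraints\n",
    "user.md": "# Preferences\n\n# Schedule patterns\n\n# Communication style\n",
    "heartbeat.md": "# Periodic checklist\n\n- [ ] Any unanswered messages older than 4 hours?\n",
}


def scaffold_workspace(root: Path) -> list[Path]:
    """Create any missing config files; never overwrite files the user has edited."""
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name, body in STARTER_FILES.items():
        path = root / name
        if not path.exists():
            path.write_text(body, encoding="utf-8")
            created.append(path)
    return created


def heartbeat_items(root: Path) -> list[str]:
    """Return the unchecked checklist items to review on each periodic tick."""
    prefix = "- [ ] "
    lines = (root / "heartbeat.md").read_text(encoding="utf-8").splitlines()
    return [ln[len(prefix):].strip() for ln in lines if ln.startswith(prefix)]
```

The scaffold only gets you empty headings. The value comes from what you write under them, which is the hard part the rest of this piece is about.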

Separation of Concerns

The people running multiple specialized agents — a marketing manager, a scheduler, a chief of staff — only make it work because each agent has its own identity, its own markdown files, its own toolset, and its own workspace. They do not share context. They have clear jurisdictions. The same clarity shows up in the Slack-based orchestration patterns where specialist bots delegate to each other like co-workers. These are not toy demos. They are running daily, and they only work because of clear separation of concerns.
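
One way to picture that separation is as a small data structure plus a check that no two agents share a workspace or claim the same kind of task. This is a sketch under assumptions: `AgentSpec`, `find_overlaps`, and `route` are hypothetical names invented here, and "jurisdiction" is modeled simply as a set of task categories.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class AgentSpec:
    name: str
    workspace: PurePosixPath      # directory holding this agent's own markdown files
    tools: frozenset[str]         # the only tools this agent may call
    jurisdiction: frozenset[str]  # task categories this agent owns


def find_overlaps(agents: list[AgentSpec]) -> list[str]:
    """Report every pair of agents that shares a workspace or a task category."""
    problems = []
    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            if a.workspace == b.workspace:
                problems.append(f"{a.name} and {b.name} share workspace {a.workspace}")
            shared = a.jurisdiction & b.jurisdiction
            if shared:
                problems.append(f"{a.name} and {b.name} both claim {sorted(shared)}")
    return problems


def route(task_category: str, agents: list[AgentSpec]) -> Optional[AgentSpec]:
    """Return the single owner of a task category, or None if ownership is unclear."""
    owners = [a for a in agents if task_category in a.jurisdiction]
    return owners[0] if len(owners) == 1 else None
```

Returning None when zero or several agents claim a task is a deliberate choice in this sketch: an unowned or contested task goes back to the human instead of being guessed at.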

More sophisticated implementations use a general-purpose agent as a planner that spins up executor agents on the fly. But even this approach only succeeds to the extent the planner has the tools, identity, and context to know how you want the problem solved. It still comes back to context.

Intentional Memory Systems

The people who have OpenClaw configured correctly have invested heavily in memory. They either have a memory.md file that accumulates insights over time (closer to the vanilla OpenClaw approach), or they have built something like the “Open Brain” approach — a searchable database that the agent can query for multi-dimensional context. Hybrids work too. The point is having intent around memory, knowing that the agent needs to learn and improve over time. An agent that does not get better over time is not going to help you for long.
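
The simplest version of the memory.md approach can be sketched in a few lines of Python. The one-line-per-insight format and the `remember` and `recall` helpers are assumptions made for illustration. A real "Open Brain" style store would use a proper database and richer search than keyword matching.

```python
from datetime import date
from pathlib import Path


def remember(memory_file: Path, insight: str, on: date) -> None:
    """Append one dated insight as a single line, so the file only ever grows."""
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- {on.isoformat()}: {insight.strip()}\n")


def recall(memory_file: Path, query: str) -> list[str]:
    """Return entries that contain every word of the query, case-insensitively."""
    if not memory_file.exists():
        return []
    words = query.lower().split()
    entries = memory_file.read_text(encoding="utf-8").splitlines()
    return [entry for entry in entries if all(w in entry.lower() for w in words)]
```

Even something this small captures the intent: the agent writes down what it learns, and a later session can find it again.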

The Common Thread

All of this — context, separation of concerns, memory, good configuration — requires the human to sit down and describe in triggerable, verifiable language what they do all day. Not “I handle marketing,” but: these are the websites I check, these are the metrics I look at, this is the spend I budget for, this is how I know that spend is correct, these are the equations I run, and these are the optimization opportunities I see.
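
What "triggerable, verifiable" might look like as a structure: a task description that names when the work starts, what the agent reads, and how anyone could check that it is done. The `TaskSpec` fields and the `gaps` check below are hypothetical, a sketch of the idea and not a format any agent product requires.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    name: str
    trigger: str = ""                                   # when the agent should start
    inputs: list[str] = field(default_factory=list)     # sites, dashboards, metrics to read
    done_when: list[str] = field(default_factory=list)  # checks a reviewer could verify


def gaps(spec: TaskSpec) -> list[str]:
    """List which parts of a task description are still too vague to delegate."""
    missing = []
    if not spec.trigger.strip():
        missing.append("trigger")
    if not spec.inputs:
        missing.append("inputs")
    if not spec.done_when:
        missing.append("done_when")
    return missing
```

A spec that is only a name, such as `TaskSpec("I handle marketing")`, comes back with all three gaps. Filling them in forces exactly the description of websites, metrics, and budget checks listed above.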

People will argue that it should be the agent’s job to get better at marketing on its own. But you have to orient the agent toward the current context if you want it to improve over time. You cannot say “here’s marketing, you do it” and expect it to magically know the nuances of your product, your context, your judgment. That assumption, that you can type in an objective and the magic box will figure it out, lies behind nearly every product in the landscape.

The Agent Landscape: Fighting Over the Wrong Layer

Every product in the current agent landscape is competing on the implementation layer — installation, UI, model selection, security, pricing, cloud versus local. They are marketed as magic boxes. Magic boxes sell like hotcakes. The problem is that once you have one and it is not magical anymore, it is a disappointing experience.

OpenClaw

The original. 250,000+ GitHub stars. Runs locally, connects to any LLM, speaks through any channel — Telegram, WhatsApp, iMessage, Slack, phone calls. Free, infinitely configurable, and the cold start problem is entirely up to you. This was appropriate because the original audience was developers. Peter Steinberger built OpenClaw for developers, and developers could reasonably be expected to write markdown files and configure their own toolset. Engineers have a mental habit of asking for specifics — file sizes, load times, implementation details — and that habit translates directly into writing good agent specs.

But OpenClaw is on the loose now. It is the most copied product of 2026. When the users are no longer developers — when they are asking “what is a markdown file and why should I care?” — the cold start problem becomes a wall.

Manis (Meta)

You get either a desktop app with local access or a cloud virtual app. It is more secure, has some structure, automatically decomposes work into sub-agents, and you can be up and running in 10-15 minutes. It optimizes for the cold start problem in the sense that it makes it easy to type the first word. But without the ability to deeply configure what the agent knows about you, your workflows, and what you are trying to delegate, it is limited. Users who have deliberately put intent into Manis swear by it — “the more intent I put in, the better it gets.” That is a universal law of AI. But it is not the experience of the majority.

Perplexity Personal Computer

The most audacious of the OpenClaw plays. Apple has sold out of Mac Minis because so many people want to put OpenClaw on them. Perplexity saw that and offers a dedicated Mac Mini in the cloud, merged with their orchestrator, routing tasks across 20 frontier models. CEO Aravind Srinivas framed it at the developer conference: “A traditional operating system takes instructions. An AI operating system takes objectives.” He is right that OpenClaw is fundamentally an operating system and that it should optimize for objectives. But that vision works only up to the point where the objective requires knowledge about your life, your judgment patterns, and your operating rhythm that no system has, because you never wrote it down and may not even be aware you have it.

NemoClaw (Nvidia)

Nvidia’s enterprise security wrapper for OpenClaw, launched at GTC by Jensen Huang. It runs agents in sandboxed environments, using OpenShell for privacy guardrails and Nemotron for model output. A buttoned-up, corporate-friendly OpenClaw with serious security. It solves the real risk of an agent with full machine access deleting things. It does not solve the problem of what to put in the agent’s operating instructions. It punts that to the enterprise, and most enterprises do not know how to solve it either. If you roll NemoClaw out to 10,000 people and only five of them have ever used OpenClaw productively, those five will be fine. The other 9,995 will not know what to do because no one trained them.

Claude Dispatch (Anthropic)

Part of Anthropic’s pivot to make Claude more OpenClaw-like, with roughly 15 releases shipped in the last month. Dispatch makes it easy to pair your phone with your Mac and control Claude from anywhere: dropping the kids off at school, at the beach, in the kitchen. Mobile-friendliness is one of the oldest and most reliable bets in software, going back to the 2007 iPhone revolution. It works for agents too. But you cannot send a three-line text message to an agent and expect it to work well if the agent does not know you. Even a 15-paragraph wall of introductory text is often not enough to adequately orient the agent for complex delegation.

The Wrapper Wave

StartClaw, MyClaw, SimpleClaw, UniClaw — dozens launch every week. They are all trying to make OpenClaw setup easier with one-click deploys, preconfigured personas, and managed infrastructure. They solve real UX friction. But whether it is a $49 pack of pre-written config files or a hosted wrapper with the same files repackaged as a “persona,” none of them solve the fundamental problem: the thing that makes agents useful is that they are personal, and you cannot substitute generic for personal.

The Root Cause: The Tacit Knowledge Trap

The reason products are not going after the real problem is that it is a structural property of how expertise works. It is genuinely hard. You cannot solve it with UX.

Knowledge work has a property that makes it uniquely resistant to delegation, whether to humans or machines: the more senior and valuable you become, the more your work migrates from explicit processes to tacit judgment, and the less visible your own operating system becomes to you.

This is not a failure of self-awareness. It is the intended outcome of how expertise develops.

Beginners operate deliberately. Everything is conscious. When you first learn to play basketball, you focus intensely on dribbling. As an intern in knowledge work, you follow the checklist. You think through every step.
Experts have compressed those steps into automatic patterns. Just as a driver with 20 years of experience does not think about turning right — it just happens — an expert stops thinking about what to check and just checks it. They stop reasoning through decisions and just make them.

The thing that makes experts fast and effective is the same thing that makes their knowledge inaccessible. It has been compiled from source code into machine code, metaphorically speaking, and they no longer have the source code.

How This Plays Out in Practice

A senior product manager does not think “I should cross-reference the revenue dashboard with the churn data before forming an opinion.” They open three tabs, glance at the numbers, and just know. If asked to reconstruct the process, they will narrate backward from the conclusion but miss the hundred micro-evaluations that actually drove the insight — evaluations reflecting thousands of hours of pattern-matching across multiple startups.

A strong salesperson does not consciously decide to mirror their prospect’s phrasing and slow their cadence when they detect defensiveness. They just do it.

A senior engineer does not manually map out server loads to identify a concurrency issue. They feel it. They say “that’s bad,” go look, and they are right. Everyone calls it experience.

It is like giving directions to a friend’s house using landmarks only a local would understand — the big tree, the store with the broken sign. The expert understands all the nuances, but a stranger in those parts cannot follow the same directions.

The Structural Trap

The people with the most to gain from agent delegation are exactly the people whose work is hardest to delegate. The most senior, most overloaded knowledge workers carry the highest ratio of tacit to explicit knowledge. Their work is the most compressed, the most invisible to themselves. They need the leverage most, and the cold start problem hits them hardest.

Ironically, beginners — people one or two years out of college who have not yet compressed their processes into automatic behavior — may have a much easier time with agents. They are still doing everything intentionally and explicitly, and they can describe it. This is one of the reasons firms like Shopify intentionally hire juniors: there are things they can move faster on than seniors.

The Ecosystem’s Fundamental Bet Is Broken

The entire agent ecosystem is built on a model where the human provides instructions and the machine executes them. That model works when the instructions are clear — summarize a document, reformat a spreadsheet. It breaks when the instructions require expertise that the human genuinely has but cannot articulate. Almost all of the most valuable knowledge work agents could tackle lies in that second category.

2026 is billed as the year of long-running agents doing impactful knowledge work. It only delivers on that promise if we can solve the hard problem of explaining our tacit knowledge, work, and judgment to these agents.

Three Chronic Problems Agents Expose

The inability to externalize tacit knowledge is not new. It sits at the root of at least three chronic people problems that plague most organizations:

Delegation fails. Managers cite delegation as a key challenge. The standard explanation is that they are control freaks. The real explanation is that they do not know how to express what is in their heads. It is a genuine challenge for new managers.
People do not get promoted. The most common reason strong individual contributors plateau is that they cannot be replaced. Their knowledge is locked in their heads. Even if they are qualified for the next level, no one wants to risk losing their expertise in their current role. It becomes a trap.
Institutional knowledge evaporates. People leave. Everything they carried walks out the door and is gone.

Agents did not create these problems. But agents create the first universal, selfish incentive to fix them.

The Incentive Flip: A Bottom-Up Knowledge Revolution

We have had decades of top-down pressure to externalize knowledge. It never worked at scale because there was no direct personal upside. The benefit accrued entirely to the organization when you wrote the wiki. Traditionally, the person who documents their expertise is the person who loses — they become replaceable.

Agents flip that incentive structure entirely. The person who documents their expertise is the person who gets the leverage. The organization might benefit secondarily. It is a bottom-up knowledge management revolution disguised as a consumer AI product.

If the builders of these products appreciated this, they would invest far more in onboarding. Having gone through the onboarding flows for half a dozen of these products, I can say they all feel remarkably light for what you are hoping the agent will do.

The Coming Workforce Divide

If the value of agents depends on your ability to articulate your work, agents are about to create an extremely visible divide. Right now, tacit knowledge is invisible. Nobody knows you cannot describe your own process because nobody has ever asked. Performance reviews measure outputs, not self-knowledge. You might be a phenomenal operator with zero ability to explain how you operate, and the system never penalizes you.

Agents are about to change that. In a world where everyone has access to the same tooling, the differentiator is not which model you use or how many Mac Minis you stack. It is whether you can articulate how you work.

Those who invest the time to decompose their expertise into explicit, delegatable components will see compounding returns. Their agents improve because their specs improve. The second agent deploys faster than the first. The tenth deploys in minutes, building on a robust memory system.
Those who skip the work will install, play for a weekend, hit the wall, and conclude agents are hype. They will be wrong. The agent worked fine. The problem was never the agent.

The Solution: The Expertise Elicitation Agent

The first agent worth deploying is not your personal assistant, your chief of staff, your scheduler, your email manager, or your briefing bot. It is an interviewer.

Designed using principles from expertise elicitation research — yes, that is a real discipline — this agent’s sole purpose is to ask the right questions, in the right order, with the right follow-ups, to extract the operational knowledge you carry but cannot access on your own. This is not the same as asking three questions at install time. OpenClaw asks “who am I, who are you, and what is my job.” This goes much deeper.

The Five Layers of Elicitation

A structured elicitation workflow walks through five specific layers:

Operating rhythms: What are your days, weeks, and months actually like in detail? Not the calendar version — the real one.
Recurring decisions: What judgment calls do you make? Which are the easy calls and which are the hard calls?
Required inputs: What specific data do you need to formulate those decisions?
Dependencies: Who do you need things from, and when?
Friction points: What are the recurring annoyances that eat your time?
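
The five layers above can be sketched as an ordered question list with a small tracker for which layers have been covered. This is a minimal illustration with hypothetical names (`LAYERS`, `Interview`). It only tracks coverage and leaves out the follow-up questions that real elicitation depends on.

```python
from typing import Optional

# Opening question for each layer, in the order the interview walks them.
LAYERS = [
    ("operating_rhythms", "What are your days, weeks, and months actually like in detail?"),
    ("recurring_decisions", "What judgment calls do you make? Which are easy and which are hard?"),
    ("required_inputs", "What specific data do you need to formulate those decisions?"),
    ("dependencies", "Who do you need things from, and when?"),
    ("friction_points", "What are the recurring annoyances that eat your time?"),
]


class Interview:
    """Walks the five layers in order and stores answers per layer."""

    def __init__(self) -> None:
        self.answers: dict[str, list[str]] = {key: [] for key, _ in LAYERS}

    def next_question(self) -> Optional[tuple[str, str]]:
        """Return (layer, question) for the first unanswered layer, or None when done."""
        for key, question in LAYERS:
            if not self.answers[key]:
                return key, question
        return None

    def record(self, layer: str, answer: str) -> None:
        """Store a non-empty answer; blank answers do not count as coverage."""
        if layer not in self.answers:
            raise KeyError(f"unknown layer: {layer}")
        if answer.strip():
            self.answers[layer].append(answer.strip())
```

In a real agent, each recorded answer would prompt a follow-up in the same layer before the interview moves on. The sketch moves on after the first answer only to keep the control flow visible.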

The Output

The process takes at minimum 45 minutes, possibly longer. The output is structured data that can be plugged directly into a personal knowledge store (an “Open Brain” — a simple setup that runs for about 10 cents a month) where it becomes durable, searchable knowledge available to any agent via MCP. All of these OpenClaw-like agents support MCP.

From that output, a configuration file generator automatically produces soul.md, heartbeat.md, and user.md files that you can use to provision an OpenClaw agent. This solves the gap that Manis is not building for, Perplexity is not building for, NemoClaw is not building for, and Dispatch skips over.
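
A generator of this kind could be as simple as the sketch below. It assumes the interview output is a dictionary of answers keyed by the five layers. The mapping of layers to files in `FILE_LAYOUT` is an assumption made for illustration, not a description of how any existing generator works.

```python
# Hypothetical mapping from elicitation layers to the file each one feeds.
FILE_LAYOUT = {
    "soul.md": [
        ("Recurring decisions", "recurring_decisions"),
        ("Required inputs", "required_inputs"),
    ],
    "heartbeat.md": [("Operating rhythms", "operating_rhythms")],
    "user.md": [
        ("Dependencies", "dependencies"),
        ("Friction points", "friction_points"),
    ],
}


def _section(title: str, items: list[str]) -> str:
    """Render one markdown section: a heading followed by one bullet per answer."""
    return "\n".join([f"## {title}"] + [f"- {item}" for item in items])


def generate_configs(answers: dict[str, list[str]]) -> dict[str, str]:
    """Turn structured interview answers into the text of each config file."""
    files = {}
    for filename, sections in FILE_LAYOUT.items():
        parts = [_section(title, answers.get(key, [])) for title, key in sections]
        files[filename] = "\n\n".join(parts) + "\n"
    return files
```

Layers with no answers still produce an empty heading, which makes gaps in the interview visible in the generated files instead of silently dropping them.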

The Real Value

The configuration files are in some ways the least interesting output. The more valuable result is the conversation itself and the structured map it creates of how you work, what you know, and where your leverage points are. That map makes you better at delegating to agents, yes, but also better at delegating to people. It makes you easier to promote. It makes your expertise survivable. All of the knowledge that was locked in your head — an AI agent helped you get it out.

Don’t make your first agent your personal assistant. Make your first agent the one that prepares you to have a personal assistant agent. The extra work is worth it.
