Shopify Made 5,938 People Better at AI. Not With Training. By Watching.

When Shopify CEO Toby Lütke posted about the company’s internal coding agent, River, the numbers got all the attention: nearly 6,000 employees, thousands of Slack channels, one in eight merged pull requests. But the numbers aren’t the story. The story is a single design choice underneath them. River doesn’t work in private. Every conversation an engineer has with it happens in a public Slack channel where anyone can scroll back and watch how the work got done. That, and not the raw usage, is the part nobody is copying.

Most companies have a hidden AI problem, and it has nothing to do with tooling. Employees are already using AI constantly, but almost all of it happens in private windows. The good prompt vanishes into one person’s chat history. The clever correction never leaves a single browser tab. The workflow that worked last week gets rebuilt from scratch by the next person who never knew it existed. Individuals are getting smarter while the organization stays exactly where it was. This is what Nate calls the apprenticeship gap, and it widens every quarter that more of a team’s real thinking disappears into chats nobody else can see.

The fix is to create declared public spaces where real work, especially senior work, can be watched and learned from. Not surveillance of every private chat, but intentional channels with clear boundaries, modeled from the top, and reinforced with creative constraints. Shopify’s most underrated move is the simplest one: you cannot talk to River in a DM at all. That single binding constraint forces collaboration and makes learning visible by default. Companies that build this capacity compound their intelligence as an organization, instead of just accelerating isolated individuals.

The Hidden AI Problem Is Visibility, Not Tooling

Most companies have a hidden AI problem, and it has nothing to do with tooling. Employees are using AI all day long. They’re asking ChatGPT to rewrite emails. They’re using Claude to reason through a tricky customer issue. They’re running coding agents to inspect repos. They’re getting Copilot to summarize 40-page docs in two minutes. They’re quietly building small workflows that save them hours every week.

Almost all of it happens in private: private software, private windows. This isn’t a conversation about whether that’s secure. It’s about the fact that it isn’t shared:

  • The good prompt disappears into one person’s chat history.
  • The clever correction stays inside one employee’s browser tab.
  • The workflow that worked yesterday gets rediscovered next week by the next person who builds the same thing from scratch, because nobody told them it existed.

That last point is real, not hypothetical. Nate notes that he has talked to Amazonians who say there are six, eight, even ten different vibe-coded tools inside the company solving the exact same problem. Individuals are getting smarter. The company is not. That is the gap.

Most companies have already bought the tools they need, so the bottleneck at this point isn’t tooling per se. It’s visibility. What gets rewarded now is the comprehension layer (a shared understanding of how the work got done), not the output layer.

Shopify’s River: The Design Choice Is the Story

Shopify’s internal coding agent is named River, and the usage numbers are substantial:

  • Employees: 5,938 used River in one 30-day stretch this spring.
  • Reach: across more than 4,400 Slack channels.
  • Output: 1,800 pull requests opened in Shopify’s main monorepo in a single week.
  • Share of merged code: about 1 in every 8 merged pull requests at Shopify comes from River today.

Those are the numbers people grabbed onto when Toby Lütke posted about it. Underneath them is the design choice that’s actually the story: River doesn’t work in private.

Every conversation an engineer has with River happens in a public Slack channel. Other engineers can scroll back through the thread and see how a senior engineer scoped the task, what context she loaded, where the agent got stuck, what she rejected, and what she kept. That’s the part nobody is copying.

The Apprenticeship Gap

For most of human history, the way we learned skilled work was by being near skilled workers, and that hasn’t changed. You watched how the senior person framed the problem, what they noticed, what they ignored. You picked up the bits that never showed up in any training manual. You learned the craft from the process as much as from the finished product.

Now think about what happens when most of the actual thinking happens in a private window. The junior employee never sees how the senior person instructs their agents. The new manager never watches an experienced operator verify an answer. The correction that made a workflow reusable stays invisible to everyone except the person who wrote it. Everyone is alone with their model, which means everyone has to rediscover the same lessons from scratch.

That is the apprenticeship gap, and it gets wider every single quarter, because more of a team’s actual thinking is happening inside chat windows nobody can see.

Lessons From Manufacturing: Polanyi’s Paradox

One of the clearest examples of how hard it is to get implicit knowledge into digital systems comes from the manufacturing era, specifically from John Deere and similar companies that build complex physical machine tooling.

There is a whole generation of American workers nearing retirement who are extraordinarily skilled at complex tooling and manufacturing. They know things in their fingertips, literally, that they cannot speak or express. This goes back to Polanyi’s Paradox: we know more than we can tell. PMs have been racing to capture that knowledge and turn it into machine learning algorithms before these workers retire, because there are fewer and fewer people stepping into factories to carry it forward.

What’s striking, talking to a PM who has actually done this, is how hard it is to get right. There is no way to fully capture the felt experience of shaping a particular piece of steel and turn that experience into an algorithm. You can approximate it; you can’t perfect it. That’s why key bottlenecks in the supply chain are driven by single individuals with extraordinary ability in their fingertips:

  • The one person who knows how to paint the racing stripes on a Rolls-Royce.
  • The one machinist, somewhere in Oregon, who knows how to test the quality of a particular type of Boeing screw.

(As an aside: if you think these people retiring is the issue at Boeing, it isn’t. That’s a much longer conversation, and they are not the problem.)

That physical knowledge is hard to speak and hard to communicate, and software has exactly the same problem. The implicit knowledge that strong operators use to interact with AI is just as hard to articulate, which is why it’s worth learning from our physical engineering counterparts when we try to solve the software version of this problem.

What Public AI Work Actually Looks Like

Dumping every chat transcript into a Slack channel is not the answer. That just pollutes the channel. What you want to make visible is four parts of the work:

  1. The task. What was the person actually trying to get done?
  2. The context. What did they tell the model? What did they paste in? What did they leave out?
  3. The interaction. How did they prompt? What did the first answer look like? How did they push back? What did they ask the model to redo?
  4. The review. What did the human accept? What did they reject? What did they verify manually, what did they rewrite, and why?

If you only share your final answer, the team learns almost nothing. If you share all four parts, the team starts to build a sense of shared taste, which is one of the tremendous bottlenecks in AI adoption.
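The four parts can be sketched as a small record type. This is only an illustration, not a description of how River or Shopify stores anything: the class name `PublicWorkRecord`, its fields, and the sample values are all hypothetical. The point it shows is that a post with only a final answer is missing three of the four parts.

```python
from dataclasses import dataclass, field


@dataclass
class PublicWorkRecord:
    """Hypothetical record of one shared piece of AI work (the four parts)."""

    task: str  # what the person was trying to get done
    context: list[str] = field(default_factory=list)      # what the model was given or not given
    interaction: list[str] = field(default_factory=list)  # prompts, pushback, redo requests
    review: list[str] = field(default_factory=list)       # what was accepted, rejected, verified

    def missing_parts(self) -> list[str]:
        """Return the names of the parts that are still empty, in order."""
        parts = {
            "task": self.task.strip(),
            "context": self.context,
            "interaction": self.interaction,
            "review": self.review,
        }
        return [name for name, value in parts.items() if not value]

    def is_teachable(self) -> bool:
        """Treat a record as teachable only when all four parts are present."""
        return not self.missing_parts()


# Made-up sample data for illustration only.
final_answer_only = PublicWorkRecord(task="Draft the launch email")
full_record = PublicWorkRecord(
    task="Draft the launch email",
    context=["Pasted the launch brief", "Left out pricing because it is not final"],
    interaction=["First draft was too formal", "Asked for a shorter second draft"],
    review=["Rejected the subject line: wrong tone for our customer",
            "Rewrote the closing by hand"],
)

print(final_answer_only.missing_parts())
print(full_record.is_teachable())
```

A check like `missing_parts()` could back a gentle reminder in a declared channel, though whether to automate that at all is a judgment call for each team.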

A prompt library doesn’t fix this. It captures static instructions but misses the messy context, the revisions, and the moment where the model produced something plausible and the human said “no, that’s wrong for our customer,” or “no, that violates our tone,” or “no, that analysis skipped the constraint that actually matters here.”

Nate’s own commentary here is instructive: when people watch him use AI, the thing they notice is that he says no to the model a lot, says it very quickly, and bases it on a rapid assessment of the quality of what the model is producing. That tends to surprise people, and it’s exactly the kind of thing that’s worth making visible. The most valuable part of AI work is rarely the prompt. The prompt is the easy part to copy. The surrounding habit is what actually teaches.

Privacy: Declared Spaces, Declared Rules

The first objection to all of this is privacy, and it’s a serious one. Employees should not assume their private AI chats are going to become company property. On paper, most people have agreements saying whatever they type into the company AI belongs to the company. In reality, most people don’t act that way, and if you declared every AI chat default-public, a lot of people would simply stop using AI. You don’t want to push good work underground.

What’s being described here is the opposite of surveillance: declared spaces and declared rules. That’s the beauty of the Shopify example. Senior people run real work where the team can watch. Toby does this himself with River in a public channel. The point of the channel is to make learning visible, full stop.

In practice:

  • Create declared channels. A product team gets an AI workbench channel. A sales team gets a sanitized customer research workflow channel. A finance team gets a read-only analysis pattern channel. Engineering gets public agent channels for certain classes of non-sensitive tasks.
  • Make the boundary explicit. The team needs to know exactly what belongs in the public channel and what does not. Customer data stays private. HR stays private. Legal strategy stays private.
  • Get creative in regulated environments. With effort, you can build a workflow that puts clinical decision support, anonymized patient records, and treatment reasoning into a public space so the team can watch the agent operate, without disclosing PII or violating HIPAA. It isn’t easy, but the alternative is letting a law meant to protect patient privacy quietly become a constraint on AI learning.

The takeaway is not “make regulated work public in a non-compliant way.” It’s “create a safe public surface for the parts of AI work that can teach, without exposing protected information,” and lean into that as far as you can.
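One way to make the boundary explicit is to write it down as data that a person or a tool can check before posting. The sketch below assumes each post is tagged by its author with category labels; the label names and function names are hypothetical, and the three private categories simply mirror the ones listed above. It does not attempt to detect sensitive content automatically.

```python
# Hypothetical category labels. The private set mirrors the boundary in the
# text: customer data, HR, and legal strategy stay out of public channels.
PRIVATE_CATEGORIES = {"customer_data", "hr", "legal_strategy"}


def can_post_publicly(categories: set[str]) -> bool:
    """Allow a post only if it touches none of the private categories."""
    return PRIVATE_CATEGORIES.isdisjoint(categories)


def blocked_categories(categories: set[str]) -> list[str]:
    """List, alphabetically, the private categories that block a post."""
    return sorted(PRIVATE_CATEGORIES & categories)


# Made-up examples: the author supplies the tags.
print(can_post_publicly({"prompt_revision", "reusable_workflow"}))
print(blocked_categories({"reusable_workflow", "hr", "customer_data"}))
```

Keeping the rule this small is deliberate: the team should be able to read the whole boundary at a glance, and anything ambiguous should default to staying private.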

Senior People Have the Most Valuable Judgment and the Least Visible Process

This is where it gets uncomfortable. The most important public AI work in your company has to come from senior people, and in most companies senior people have the most valuable judgment paired with the least visible process. They write the final memo, but you don’t know how they did it. They make the decision, but don’t tell you why. They edit the strategy deck and approve the customer plan, and all the thinking happens offstage.

With AI, that offstage thinking can get even more hidden. A senior leader can use an agent to pressure-test a plan, rewrite a board update, compare scenarios, identify risks in a roadmap, or critique a launch narrative, and never share any of it. If all of that stays private, the organization never sees how a strong operator actually uses AI.

The fix is to ask senior people to run some non-sensitive work in public, and to equip them to do it as easily as River makes it at Shopify. Real work with real stakes:

  • A leader asking an agent to critique a launch plan in a team channel.
  • A senior engineer investigating a low-risk bug with an agent while narrating the review out loud.
  • A sales leader showing how they turn account notes into a call-prep brief, with customer-sensitive details stripped out.
  • A product leader asking AI to find weak assumptions in a roadmap narrative.

The junior person is no longer just copying a prompt. They see the judgment in action: how senior people frame ambiguity, how much context is enough, how often the first answer is wrong, how a good operator pushes back. They learn that using AI well is active supervision, not passive consumption. Most AI training never gets close to this, because training only tells people what the tool can do. Public senior workflows show people how individuals at the top of their craft actually use AI.

This is exactly what Toby is modeling at Shopify. As CEO, he also considers himself an individual contributor, and he deliberately puts his work in a public channel, letting other people ask questions of his agent and critique his choices while he shapes results with it. Is it a little chaotic? Yes. Is he still the one telling the agent what to do? Also yes. But that open-room format lets him teach and socialize what he wants to drive through the company in a way nothing else does. It’s also what he concluded when he wrote out his full reflection on apprenticeship in the age of AI: doing this multiplies your time and impact in ways you don’t expect.

Measuring What Matters: Learning and Reuse

Token volume has its place. Task and workflow counts are useful. But the metrics that matter here are about learning and reuse:

  • How many reusable workflows did the team create in the last month from a public agent channel?
  • How many got adopted by another person or another team?
  • How many examples got pinned because they changed how somebody works?
  • How often did a public workflow prevent duplicated effort somewhere else? (Hard to measure, but worth trying.)
  • How many stale examples got retired?
  • How many failures became better review rules?

The best signal is sometimes not “AI usage is up.” Sometimes the best signal is that a given mistake is happening less often on the team. That’s what organizational learning actually looks like. It’s hard to measure, but it reflects reality better than raw usage.
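As a rough illustration of the learning-and-reuse metrics above, the sketch below tallies one channel's month and compares mistake counts across two periods. Everything here is an assumption made for the example: the `ChannelMonth` fields, the function names, and the numbers are invented, and a real team would have to decide how each count is collected.

```python
from dataclasses import dataclass


@dataclass
class ChannelMonth:
    """Hypothetical monthly tallies for one public agent channel."""

    workflows_created: int
    workflows_adopted_elsewhere: int
    examples_pinned: int
    stale_examples_retired: int
    failures_turned_into_rules: int


def adoption_rate(month: ChannelMonth) -> float:
    """Share of this month's workflows picked up by another person or team."""
    if month.workflows_created == 0:
        return 0.0
    return month.workflows_adopted_elsewhere / month.workflows_created


def mistake_trend(previous_count: int, current_count: int) -> str:
    """Compare how often a known mistake occurred in two periods."""
    if current_count < previous_count:
        return "improving"
    if current_count > previous_count:
        return "worsening"
    return "flat"


# Made-up numbers for illustration only.
example = ChannelMonth(
    workflows_created=12,
    workflows_adopted_elsewhere=3,
    examples_pinned=5,
    stale_examples_retired=2,
    failures_turned_into_rules=4,
)

print(adoption_rate(example))
print(mistake_trend(previous_count=9, current_count=4))
```

The adoption ratio is deliberately crude, since a workflow created one month may be adopted in another. It is meant to start a conversation about reuse, not to serve as a precise score.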

The practical question for leaders isn’t whether your team is using AI. Most teams already are, a lot. The practical question is: what AI work inside your company is making one person better while everyone else falls behind? If that work stays private, you’re paying for the same lesson twice, three times, ten times. Either your senior people start running real work where the team can watch, or every individual gets faster while the company stays where it is. The companies that choose to learn from one another are the ones that compound as organizations.

The Real Takeaway: The Power of Constraints

The biggest thing to take away from this is the power of constraints to drive collaboration. It’s underrated how much of what works at Shopify is facilitated by one simple rule: agents never run in DMs in Slack. You cannot interact with River in a DM. It’s not possible.

DMs are wildly popular in Slack, and Slack has been fighting them at the product level for a while precisely because they’re demonstrably bad for teamwork even though individuals love them. By insisting agents only work in public channels, you put a binding constraint in place in favor of collaboration and learning. It can feel restrictive, and you have to be willing to comply with it, but that’s the point.
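How River enforces the rule is not described, so the following is only a minimal sketch of what a no-DM guard might look like. It assumes the agent receives message events as dictionaries with a `channel_type` field, in the style of Slack's message events, where `"channel"` marks a public channel and `"im"` a direct message. The function names and the refusal wording are hypothetical.

```python
# Assumption: events carry a "channel_type" field in the style of Slack's
# message events ("channel" for public channels, "im" for direct messages).
PUBLIC_CHANNEL_TYPE = "channel"

# Hypothetical refusal text, not anything Shopify is known to use.
REFUSAL = "I only work in public channels. Please ask again in a team channel."


def should_respond(event: dict) -> bool:
    """Do the work only when the message arrived in a public channel."""
    return event.get("channel_type") == PUBLIC_CHANNEL_TYPE


def handle_message(event: dict) -> str:
    """Return the agent's reply, refusing anything outside a public channel."""
    if not should_respond(event):
        return REFUSAL
    return f"Working on it in the open: {event.get('text', '')}"


# Made-up events for illustration.
print(handle_message({"channel_type": "im", "text": "fix the flaky test"}))
print(handle_message({"channel_type": "channel", "text": "fix the flaky test"}))
```

The guard allows only the one permitted channel type rather than listing the forbidden ones, so private groups, multi-person DMs, and events with no type at all are refused by default.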

The larger lesson isn’t about Slack. It’s that creative, careful constraints shape incentives toward learning. The exercise for any leader is to audit your environment and ask: where are we putting intentional constraints that individuals may sometimes find frustrating, but that on the whole promote collective, public learning for AI?

To get the flywheel started: stand up one declared channel per team, write a pinned message at the top stating what the channel is for (reusable workflows, useful failures, prompt revisions), and make public-by-default the norm with constraints like Shopify’s no-DM rule. As patterns repeat, and they will, turn them into playbooks, skills, or inputs for the next challenge. You can even point AI at the channel to gather the lessons learned. It’s one of the fastest ways to socialize real AI usage there is, and it’s how a whole team, junior folks included, actually starts to get smarter together.

Meta

Added: 2026-05-26