12 Agents, One CEO, and the Trust Boundary That Holds It Together


Part 2 of How I Ship Side Projects. Part 1 covers planning. Part 3 covers the execution loop.

The first article in this series ended where most agent content starts: with a Linear project full of well-scoped milestones and issues, ready for someone to pick up.

This is the part where the agents come in. What works for me is something like a tiny org chart: a small company of narrow agents, each pinned to one job, sharing Linear as their workplace.

Currently there are twelve agents. They report up to a CEO agent who reports to me. The CEO produces a daily summary that lands in my Discord at 06:00 Vienna time (a webhook for messages is easy to set up), so I have it before I sit down to work at 08:00. The product agents work in two shifts (a 06:30 morning batch and a 21:00 evening batch), and I sync with them around noon to clear blockers and adjust priorities.

The most important piece of the whole setup isn't the agents themselves. It's a single Linear label that defines the trust boundary between work that's safe to delegate and work that isn't. I'll come back to that, but first the org chart.

The org chart

Twelve agents, organized in roughly the shape a real small company would take:

The CEO runs once daily at 06:00 Vienna, before I'm up. It reviews state across the three product PMs, surfaces ideas that bubbled up overnight, sets priorities for the day, and writes a single concise summary that lands on my Discord. The system prompt for this agent literally says: "Lead with what needs founder attention (decisions, approvals, blockers). Follow with progress updates and metrics. End with recommended next actions." That's the message format I read at 08:00.

The product PMs (BillMyCommits, Encroach, HappyClients) are the workhorses. Each one runs twice daily (06:30 and 21:00 Vienna) and owns the backlog for its product. They pick up issues, delegate to their engineer and marketing agents, update Linear state, and post status back to Paperclip and Discord. Each PM reports to the CEO. They never report directly to me, which is by design. The CEO consolidates.

The engineers and marketers (one of each per product) report to their respective PM. The engineers do the code; the marketers do the content. Both push deliverables to a shared GitHub repo (covered in part 3) so I can review from anywhere without touching the machine the agents run on.

Two cross-company agents report to the CEO directly: a Content Strategist whose job is mining git history and PRs for stories worth writing about, and a Talk Researcher that drafts conference proposals and outlines. These exist outside the per-product structure because their work cuts across all three products. (The Content Strategist has an explicit "Building with AI" series mandate, which means this article series you're reading is, in a way, an attempt by me to do a job one of my agents was supposed to do. I notice the irony.)

The whole structure lives in version-controlled markdown. Each agent has its own AGENTS.md file with its role, its skills, its boundaries, and its reporting lines. Twelve files. Adding, removing, or modifying an agent is a commit. The whole company is a git repo.

A real AGENTS.md, abridged

Here's the actual top of the Encroach PM's role definition, cleaned up slightly for the article:

---
name: "Encroach PM"
title: "Product Manager: Encroach"
reportsTo: "ceo"
skills:
  - "linear-integration"
  - "discord-notification"
  - "gh-cli-skill"
---

You are the Product Manager for Encroach, a running game app where 
runners sync data from Strava and compete to capture squares on a 
map of their city. You report to the CEO.

## Linear Integration

- Project ID: `1234` (team: SQBR)
- Use the `linear` CLI to read milestones and issues
- Pick highest-priority issues from the current milestone
- Only pick up issues labeled `paperclip-ai`. The rest require 
  founder/human attention or are not ready for you and must not 
  be picked up by agents
- State transitions:
  - Pick up issue: `Ready for Progress` → `Started`
  - Work complete: keep `Started`, link PR or draft in comment
  - Founder approves: move to `Done`

## Decision Authority

- You CAN: pick up paperclip-ai issues, delegate to your team, 
  draft specs, update Linear state, write code if needed
- You CANNOT: merge PRs, publish marketing content, or change 
  strategic direction
- Escalate interesting ideas to the CEO, not directly to the founder

The pattern is the same across all twelve agents. Frontmatter for the structural details (name, title, who they report to, what skills they can call), then a free-form description of responsibilities, boundaries, and how they work. The agent reads this on every task.
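Because every agent file follows the same shape, the reporting lines can be checked mechanically. What follows is a minimal sketch, not the tooling I actually run: it assumes the frontmatter only ever contains scalar keys and one-level lists, as in the excerpt above, and the agent id `encroach-pm` and the top-level id `founder` are hypothetical names chosen for illustration.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse the simple frontmatter shape shown above: scalars and one-level lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    meta: dict = {}
    list_key = None  # key of the list currently being filled, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter: frontmatter is complete
        if not stripped:
            continue
        if stripped.startswith("- "):
            if list_key is None:
                raise ValueError(f"list item without a key: {line!r}")
            meta[list_key].append(stripped[2:].strip().strip('"'))
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value.strip('"')
            list_key = None
        else:
            meta[key] = []
            list_key = key
    raise ValueError("missing closing '---'")


def unknown_managers(agents: dict) -> list:
    """Return ids of agents whose reportsTo names neither an agent nor the founder."""
    known = set(agents) | {"founder"}  # "founder" is a hypothetical id for me
    return sorted(a for a, meta in agents.items() if meta.get("reportsTo") not in known)


ENCROACH_PM = '''---
name: "Encroach PM"
title: "Product Manager: Encroach"
reportsTo: "ceo"
skills:
  - "linear-integration"
  - "discord-notification"
  - "gh-cli-skill"
---
You are the Product Manager for Encroach.
'''

agents = {
    "ceo": {"name": "CEO", "reportsTo": "founder"},  # assumed, not from a real file
    "encroach-pm": parse_frontmatter(ENCROACH_PM),   # hypothetical agent id
}
```

Running `unknown_managers(agents)` over all twelve files would flag any agent whose `reportsTo` points at an id that no longer exists, which is the kind of drift a commit-based org chart can pick up silently.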

The trust boundary: the paperclip-ai label

This is the most important section in this article.

In the Linear backlog there are issues the agents can pick up and issues they can't. The mechanism that distinguishes them is one label: paperclip-ai.

Agents are told, in their system prompts, to pick up only issues with this label. Anything in the backlog without it must not be touched. Two different kinds of issues end up unlabeled: issues that are mine (decisions, sensitive client comms, things I want to handle for skill reasons), and issues that are not yet ready for anyone (under-specified, missing context, waiting on an upstream call). The agents can't tell the two apart, which turns out to be the right property. From the agent's perspective they are the same (no label, don't touch), and that's exactly the rule I want.

This sounds like a small distinction. It's not. The point is that the default is "this is my work or it's not ready," and the label is an explicit act of delegation. Every time I add paperclip-ai to an issue, I'm making a deliberate decision that the work is well-scoped enough, low-blast-radius enough, and obvious enough in intent that I'm willing to let an agent do it without further review of the task itself. The PR or the draft still gets reviewed. The decision to delegate gets made once, by me, when I apply the label.

The label is gated to me. The PMs can suggest that an issue is ready for the label (in a comment), but they cannot apply it themselves. Only I can. This protects the boundary from drifting toward whatever the PM finds easiest to delegate.

In practice the label goes on issues like:

  • "Add a CSV export button to the billing history page"

  • "Draft a LinkedIn post about the new BMC pricing tier" (drafted by marketer, reviewed by me)

  • "Fix flash notification colors for dark theme on HappyClients"

  • "Write the weekly Plausible analytics digest for Encroach"

And stays off issues like:

  • "Decide on the Encroach pricing tier structure"

  • "Review the new Strava OAuth scopes before we ask for them"

  • "Migrate the H3 grid resolution: DB rewrite required"

  • "First conversation with the freelance designer about the BMC homepage"

The mix is fluid. Some weeks I'm tightening the boundary because an agent surprised me. Other weeks I'm loosening it because I'm trusting them more. The label is the dial.
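The rule the agents follow is small enough to write down as code. This is an illustrative sketch only: in my setup the rule lives in the system prompt, not in a filter function. The `Issue` class, the `ELB-n` identifiers, and the "lower number means more urgent" priority convention are assumptions made for the example.

```python
from dataclasses import dataclass, field

DELEGATION_LABEL = "paperclip-ai"


@dataclass
class Issue:
    identifier: str          # hypothetical identifiers, e.g. "ELB-1"
    title: str
    priority: int            # assumption: lower number = more urgent
    labels: set = field(default_factory=set)


def pickable(issues):
    """Return only explicitly delegated issues, most urgent first.

    Unlabeled issues are dropped without asking why they are unlabeled:
    "mine" and "not ready yet" look identical from here, on purpose.
    """
    delegated = [i for i in issues if DELEGATION_LABEL in i.labels]
    return sorted(delegated, key=lambda i: i.priority)


backlog = [
    Issue("ELB-1", "Decide on the Encroach pricing tier structure", 1),
    Issue("ELB-2", "Add a CSV export button to the billing history page", 3,
          {DELEGATION_LABEL}),
    Issue("ELB-3", "Fix flash notification colors for dark theme on HappyClients", 2,
          {DELEGATION_LABEL, "bug"}),
]
```

Note that `ELB-1` is the most urgent item in this backlog and still never reaches an agent. Priority orders the delegated work; it does not widen the boundary.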

Linear as the workplace

The agents don't share a chat room. They don't have a group thread. They don't ping each other on Slack.

They share Linear, and Linear is their workplace.

Concretely:

  • The PM picks up an issue labeled paperclip-ai from unstarted, transitions it to started, and either does the work itself or delegates a Paperclip subtask to its engineer or marketer.

  • The engineer's PR link goes in the issue as a comment.

  • The marketer's drafts go in the deliverables repo (covered in part 3) with a link in the issue.

  • All status flows through Linear states: triage → backlog → unstarted → started → completed. Plus canceled for things that go away.

Without a shared workplace, multi-agent setups end up passing context through summaries and ad-hoc handoffs, and every handoff loses information. The Linear ticket is the single source of truth. Every agent appends to it instead of recreating context elsewhere.
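The state flow can be sketched as a tiny state machine. The state names come from the list above; the rule that issues only move forward one step at a time is my simplifying assumption for the sketch, and Linear itself does not enforce it. The agent-side restriction mirrors the AGENTS.md excerpt: agents move an issue into `started`, and only I move it to `completed`.

```python
FLOW = ["triage", "backlog", "unstarted", "started", "completed"]
TERMINAL = {"completed", "canceled"}


def can_transition(current: str, target: str) -> bool:
    """Assumed rule: one step forward along FLOW, or cancel from any open state."""
    if current not in FLOW or current in TERMINAL:
        return False
    if target == "canceled":
        return True
    return target in FLOW and FLOW.index(target) == FLOW.index(current) + 1


def agent_may_transition(current: str, target: str) -> bool:
    """Agents only pick work up; finished work stays in 'started' until I approve."""
    return (current, target) == ("unstarted", "started")
```

For example, `can_transition("started", "completed")` holds, but `agent_may_transition("started", "completed")` does not: the agent links its PR or draft in a comment and leaves the state alone.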

The Linear team for all three products is ELB (ElixirBytes, my pre-Square-Bracket umbrella). One team, three projects, twelve agents. The CEO sees all three projects. Each PM sees one.

What the agents are allowed to break

Each agent's AGENTS.md carries explicit "you can / you cannot" lists. The shape is consistent across the company:

  • PMs can update issue state, delegate, draft specs, write code if needed. They cannot merge PRs, publish marketing content, or change strategic direction.

  • Engineers can write code, run mix test and mix format, create PRs via the gh CLI, link to PRs in Paperclip and Linear. They cannot merge their own PRs, modify production environment variables, run migrations against production, or touch deploy configuration.

  • Marketers can draft and commit content to the deliverables repo, draft analytics digests, propose nurture-sequence updates. They cannot publish anything externally. Drafts are surfaced for me to review, and I publish.

  • The CEO can reprioritize tasks, delegate to PMs, and propose new initiatives. It cannot ship code, publish content, spend budget, or hire new agents without my approval.

Some of these are enforced by the agent's tools (no production credentials), some by me being the only person with merge rights, some by the system prompt and crossed fingers. The defence is layered, not airtight, and I'm honest with myself about which is which.
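One way to keep the lists honest is to hold them as data and lint them. The sketch below is an assumption about how that could look, not something the agents consult at runtime: the action slugs are hypothetical names I made up for the items in the bullets above, and the default-deny behaviour for unlisted actions is a design choice of the sketch.

```python
PERMISSIONS = {
    "pm": {
        "can": {"update_issue_state", "delegate", "draft_spec", "write_code"},
        "cannot": {"merge_pr", "publish_content", "change_strategy"},
    },
    "engineer": {
        "can": {"write_code", "run_tests", "open_pr", "link_pr"},
        "cannot": {"merge_pr", "edit_prod_env", "migrate_prod", "edit_deploy_config"},
    },
    "marketer": {
        "can": {"draft_content", "commit_draft", "draft_digest", "propose_nurture_update"},
        "cannot": {"publish_content"},
    },
    "ceo": {
        "can": {"reprioritize", "delegate", "propose_initiative"},
        "cannot": {"ship_code", "publish_content", "spend_budget", "hire_agent"},
    },
}


def is_allowed(role: str, action: str) -> bool:
    """Default deny: unknown roles and unlisted actions are both refused."""
    rules = PERMISSIONS.get(role)
    return rules is not None and action in rules["can"]


def contradictions() -> dict:
    """Return actions that a role is both allowed and forbidden to do."""
    return {
        role: rules["can"] & rules["cannot"]
        for role, rules in PERMISSIONS.items()
        if rules["can"] & rules["cannot"]
    }
```

A table like this documents intent. It does not replace the real enforcement layers: missing production credentials and my exclusive merge rights still do the heavy lifting.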

A consistent line in the engineer agents' prompts that does more work than it looks like:

Keep the work moving until it's done. If you need QA to review it, ask them. If you need your boss to review it, ask them. If someone needs to unblock you, assign them the ticket with a comment asking for what you need. Don't let work just sit here. You must always update your task with a comment.

Three subtle effects of that one paragraph: agents delegate up and across instead of stopping silently, blocked tickets get visibly reassigned with a comment so I can see them in summaries, and there's no "I tried for a while and gave up" state. The agent always either makes progress or asks for help in writing.

The routines that surface only what needs my attention

The crew runs continuously on a Mac mini in a closet. The schedule is the part that keeps it manageable:

| Time (Vienna) | What runs | What lands |
| --- | --- | --- |
| 06:00 | CEO morning review | Discord summary at 06:00 |
| 06:30 | Three PM morning batches | Issue picks, delegations, PRs |
| ~08:00 | I read the CEO summary | I plan my day |
| ~12:00 | Founder sync | I clear blockers, adjust priorities |
| 21:00 | Three PM evening batches | Iterate on noon feedback |
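The automated rows of the schedule can be sketched as a lookup. This is an illustration under stated assumptions rather than the scheduler I run: the times are Vienna local time taken from the schedule above, the two founder touch points are left out because nothing triggers them, and `next_run` is a hypothetical helper name.

```python
from datetime import time

# Automated runs only, in Vienna local time, sorted by time of day.
SCHEDULE = [
    (time(6, 0), "CEO morning review"),
    (time(6, 30), "Three PM morning batches"),
    (time(21, 0), "Three PM evening batches"),
]


def next_run(now: time):
    """Return the next (time, job) strictly after `now`, wrapping to tomorrow."""
    for at, job in SCHEDULE:
        if at > now:
            return at, job
    return SCHEDULE[0]  # nothing left today: first job of the next day
```

Asking at noon gives the 21:00 evening batches, which is why feedback from the founder sync has roughly nine hours to land in Linear before anyone acts on it.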

Discord is the only push-notification channel for the system. The webhook lives in a Paperclip secret called DISCORD_WEBHOOK_URL, and any agent can post to it via the discord-notification skill. The skill defines exactly four message formats: PR ready for review, marketing draft ready, CEO daily summary, and agent blocked. That's it. The constraint matters. Without it the channel becomes noisy, and I stop reading.

The Discord notification call, in full:

curl -H "Content-Type: application/json" \
  -d '{"content": "🔀 **PR Ready for Review**\n\n**Issue:** ELB-XXXX: Issue title\n**PR:** https://github.com/owner/repo/pull/XX\n\nScreenshots attached to the PR."}' \
  "$DISCORD_WEBHOOK_URL"

Three lines of bash. Not glamorous. Doing the actual work of keeping me in the loop.
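One weakness of a hand-written `-d` string is that an issue title containing a double quote breaks the JSON. A minimal sketch of building the same payload with a JSON encoder follows. It is not the skill's actual implementation: the kind slugs and function names are hypothetical, and only the PR-ready template is taken from the call above. The `content` field is the one the webhook call above already uses.

```python
import json

# Hypothetical slugs for the four formats the skill allows.
ALLOWED_KINDS = {"pr_ready", "marketing_draft", "ceo_summary", "agent_blocked"}


def check_kind(kind: str) -> str:
    """Refuse anything outside the four formats, to keep the channel quiet."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported message kind: {kind!r}")
    return kind


def pr_ready_payload(issue_id: str, title: str, pr_url: str) -> str:
    """Build the JSON body for a 'PR ready' message; the encoder handles escaping."""
    content = (
        "🔀 **PR Ready for Review**\n\n"
        f"**Issue:** {issue_id}: {title}\n"
        f"**PR:** {pr_url}\n\n"
        "Screenshots attached to the PR."
    )
    # ensure_ascii=False keeps the emoji readable in the encoded body.
    return json.dumps({"content": content}, ensure_ascii=False)
```

The resulting string can be passed to the same `curl` call in place of the inline JSON, so the posting mechanism itself stays unchanged.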

I deliberately did not build an agent that asks me clarifying questions in real time. That sounds nice, but in practice it would mean my phone interrupts me throughout the day, which is exactly what I'm trying to avoid. Clarifying questions go into the issue as comments. I see them at noon when I sync, or in the next CEO summary. The system batches my involvement instead of fragmenting it.

What's next

The CEO summary lands at 06:00 Vienna, the PM morning batches run at 06:30, and by the time I'm at my desk I have a Discord message telling me what to look at first.

What's not yet covered: how the actual deliverables move from agent to me to production. The engineer opens PRs. Where do they go? How do I review them without spending all day reading diffs? How does the marketer push landing-page changes if its machine isn't on the public internet?

That's part 3: the execution loop. A shared deliverables repo on GitHub, Coolify previews for the engineering PRs, and the Playwright screenshot pattern that means I can review most UI changes without even clicking through to the staging URL.
