Agentic Engineering: Agent Design
Specialization Over Intelligence
The Permission Problem
Your agent is supposed to be a senior engineer. You gave it the spec. You gave it the codebase. You gave it full access.
Then it asks: “Should I proceed with this approach?”
Or worse: it doesn’t ask. It violates boundaries. Uses tools it shouldn’t. Modifies files outside its scope. Changes requirements instead of implementing them.
Most teams experience this pattern:
Typical agent workflow:
Agent asks for permission (5 minutes lost)
You clarify what you already specified (10 minutes)
Agent proceeds, violates a boundary (writes to wrong directory)
You correct it (5 minutes)
Agent asks if correction looks good (another 5 minutes)
Total: 25 minutes of back-and-forth for work that should be autonomous.
After building dozens of specialized agents, the pattern became clear: Agents act uncertain not because the models are weak, but because the prompts are weak.
What If Your Agents Had Senior-Level Judgment?
Not perfect judgment—what judgment is perfect? But senior-level judgment:
Knows when to ask vs when to proceed
Respects boundaries without being told repeatedly
Makes decisions within their authority
Follows conventions without explicit instruction
Communicates only when genuinely necessary
Most agents lack judgment not because of model capability, but because of missing identity clarity.
We write prompts like job descriptions:
“You write code following best practices”
“Use appropriate tools”
“Ask if you need clarification”
When you hire a senior engineer, you don’t say “write code following best practices.” You say:
“You’re a Staff Rails Engineer with 20 years experience. You follow the Rails way—convention over configuration. When you see a routing question, you know RESTful patterns. When you see database work, you prevent N+1 queries by reflex. You don’t ask permission to apply what you know.”
Same principle applies to AI agents. Give them identity, not just instructions.
The Judgment Gap
Most agents ask too many questions or violate too many boundaries because:
No clear identity: Generic role (“helpful assistant”), no expertise level, no guiding philosophy. Agent doesn’t know what “senior judgment” looks like in this domain.
Vague boundaries: “Use tools as needed” or “Follow the plan” without explicit ALLOWED/FORBIDDEN lists. Agent either over-asks (safe) or over-reaches (fast but dangerous).
Unclear authority: What’s fixed vs flexible? What can the agent decide vs what must it respect? What counts as a requirement vs an implementation detail? The agent either changes things it shouldn’t (scope creep) or asks about things it should decide (implementation details).
The solution isn’t better models. It’s better agent design.
The Six Components of Strong Agents
After building numerous specialized agents, a pattern emerged. Agents that acted with senior-level judgment shared the same structure:
┌─────────────────────────────────────────────┐
│ 1. STRONG IDENTITY │
│ Role, expertise, philosophy, seniority │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 2. TOOL RESTRICTIONS │
│ Explicit ALLOWED/FORBIDDEN with rationale│
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 3. AUTHORITY BOUNDARIES │
│ INPUT (fixed) vs OUTPUT (your decision) │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 4. WORKFLOW INTEGRATION │
│ Numbered steps with success criteria │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 5. QUALITY STANDARDS │
│ Sacred Rules (must) + Taste (should) │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 6. COMMUNICATION GUIDELINES │
│ When to ask vs when to proceed │
└─────────────────────────────────────────────┘

Autonomy Emerges From Constraint
Unclear roles create hesitation. An agent without explicit expertise doesn’t know when its judgment applies. It defaults to asking permission rather than risk exceeding unclear bounds.
Vague authority creates insecurity. Without knowing what’s fixed versus flexible, agents either violate scope boundaries or seek validation for decisions within their authority. Both waste time.
Unlimited freedom creates chaos. An agent with no explicit constraints has no framework for judgment. It tries everything, fails repeatedly, and learns nothing transferable between tasks.
Explicit boundaries enable autonomy. When an agent knows precisely what it cannot change, it moves confidently within what it can. When it knows which tools are forbidden, it uses allowed tools without trial-and-error. When it knows when to ask versus proceed, it asks only when necessary.
This applies equally to humans and AI agents. Senior engineers are effective not despite constraints, but because of them. Rails conventions don’t limit DHH—they enable him to build faster by eliminating low-value decisions. The same mechanism works for agents.
The six components that follow formalize this principle into practice.
Component 1: Strong Identity
Most agent prompts start weak:
❌ “You are a helpful assistant that writes code.”
❌ “You are a marketing agent.”
These create uncertain agents. No expertise level. No philosophy. No cultural grounding.
The Four-Part Identity Pattern
Part 1: Role + Seniority + Experience
You are a **Senior Rails Engineering Agent** with 37signals/DHH-level expertise.
**Role:** Staff/Principal Rails Engineer (20+ years experience)

“Senior” creates confidence. “37signals/DHH-level” grounds in specific philosophy—public figures the LLM knows. “20+ years” primes deep pattern knowledge.
Part 2: Expertise Areas
**Expertise:** Ruby on Rails, Hotwire (Turbo + Stimulus), PostgreSQL

Specific domains prime relevant knowledge. Agent knows what it’s expert in. Boundaries clear.
Part 3: Guiding Philosophy
**Philosophy:** “The Rails way” - Convention over Configuration, YAGNI

Provides a decision framework. Tie-breaker when multiple approaches are valid. Creates consistency (“What would the 37signals team do?”).
Part 4: Technology Stack
**Technology Stack:** Rails 8+, Hotwire, PostgreSQL, Solid Queue

Primes specific tool knowledge. Grounds recommendations in real constraints. Makes decisions actionable.
Complete Identity Example:
You are a **Senior Rails Engineering Agent** with 37signals/DHH-level expertise.
**Role:** Staff/Principal Rails Engineer (20+ years experience)
**Expertise:** Ruby on Rails, Hotwire (Turbo + Stimulus), PostgreSQL
**Philosophy:** “The Rails way” - Convention over Configuration, YAGNI
**Core Workflow:** Red-Green-Refactor (TDD always)
**Technology Stack:** Rails 8+, Hotwire, PostgreSQL, Solid Queue

This agent won’t ask “Should I use Turbo?” It knows its stack. It knows its philosophy. It proceeds with confidence.
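The four identity parts are mechanical enough to assemble programmatically. A minimal sketch, assuming a hypothetical `build_identity` helper (the function name and field layout are illustrative, not a prescribed API):

```python
# Sketch: assembling the four-part identity into a system-prompt preamble.
# All names and values here are illustrative.

def build_identity(role: str, expertise: list[str],
                   philosophy: str, stack: list[str]) -> str:
    """Render the four identity parts as a prompt preamble."""
    return "\n".join([
        f"You are a **{role}**.",
        f"**Expertise:** {', '.join(expertise)}",
        f"**Philosophy:** {philosophy}",
        f"**Technology Stack:** {', '.join(stack)}",
    ])

prompt = build_identity(
    role="Senior Rails Engineering Agent (20+ years experience)",
    expertise=["Ruby on Rails", "Hotwire (Turbo + Stimulus)", "PostgreSQL"],
    philosophy="The Rails way - Convention over Configuration, YAGNI",
    stack=["Rails 8+", "Hotwire", "PostgreSQL", "Solid Queue"],
)
print(prompt)
```

Templating the identity this way keeps the four parts mandatory: an agent definition without a philosophy or stack simply fails to build.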
Component 2: Tool Restrictions
Vague tool guidance creates boundary violations:
❌ “Use appropriate tools as needed”
The agent tries tools. Fails. Tries again. Eventually asks: “Which tools should I use?”
Pattern: Explicit ALLOWED/FORBIDDEN
### Tool Restrictions
**ALLOWED:**
- [Tool] - [Purpose and when to use]
- [Tool] - [Purpose and when to use]
**FORBIDDEN:**
- [Tool] - [Rationale for prohibition]
- [Tool] - [Rationale for prohibition]

Planning Agent Example (Read-Only)
**ALLOWED:**
- Read - Feature specs, existing documentation, reference materials
- Glob - Find related files for context
- Grep - Search for patterns and examples
- Write - ONLY for creating plan documents in plans/ directory
**FORBIDDEN:**
- Edit - Cannot modify existing documents (plans are new, not edits)
- Bash - Not needed for planning (use Read/Glob/Grep)

Clear what’s allowed (four tools, one Write boundary). Clear what’s forbidden (Edit, Bash). Rationale prevents confusion (“not needed” vs “not allowed”).
Execution Agent Example (Full Access)
**ALLOWED:**
- Read, Write, Edit, Bash, Glob, Grep - Full implementation access
**FORBIDDEN:**
- (None - full access for implementation)

Explicit “full access” statement. Still lists what’s available. No forbidden tools = maximum autonomy.
Validation Agent Example (Read-Only + Report)
**ALLOWED:**
- Read - Implementation artifacts, plans, specifications
- Glob - Find all files to validate
- Grep - Search for patterns and violations
- Bash - Run tests, linters (read-only commands)
- Write - ONLY for validation reports in reports/ directory
**FORBIDDEN:**
- Edit - Cannot modify code (validation only, not correction)
- Write to code directories - Reports go in reports/ only

Can run tests (Bash allowed). Cannot fix issues (Edit forbidden). Single Write permission (reports only). Role boundary enforced through tools.
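ALLOWED/FORBIDDEN lists can also be enforced in the harness, not just stated in the prompt. A sketch of a deny-by-default tool gate; the `ToolPolicy` class and `check` method are hypothetical names, not part of any agent framework:

```python
# Sketch: enforcing explicit ALLOWED/FORBIDDEN tool lists before
# dispatching a tool call. Deny-by-default for unlisted tools.

from dataclasses import dataclass, field

@dataclass
class ToolPolicy:
    allowed: dict[str, str]                                   # tool -> purpose
    forbidden: dict[str, str] = field(default_factory=dict)   # tool -> rationale

    def check(self, tool: str) -> tuple[bool, str]:
        """Return (permitted, reason). Unknown tools are denied."""
        if tool in self.forbidden:
            return False, f"FORBIDDEN: {self.forbidden[tool]}"
        if tool in self.allowed:
            return True, self.allowed[tool]
        return False, "not in ALLOWED list (deny by default)"

# Validation-agent policy from the example above
policy = ToolPolicy(
    allowed={
        "Read": "implementation artifacts, plans, specifications",
        "Glob": "find all files to validate",
        "Grep": "search for patterns and violations",
        "Bash": "run tests and linters (read-only commands)",
        "Write": "validation reports in reports/ only",
    },
    forbidden={"Edit": "validation only, not correction"},
)

print(policy.check("Bash"))
print(policy.check("Edit"))
```

Carrying the rationale alongside each entry means a denial can be explained back to the agent, which is exactly the “not needed vs not allowed” distinction the prompt version makes.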
Component 3: Authority Boundaries
Most agents either ask too much or change too much because they don’t know what’s fixed versus flexible.
Given a feature spec, should the agent change the requirements? (No—that’s scope creep.) Choose the data model? (Yes—that’s architectural decision.) Modify acceptance criteria? (No—those define success.) Pick implementation patterns? (Yes—that’s technical choice.)
Without clear boundaries, agents either ask about implementation details they should decide or change requirements they should respect.
The INPUT/OUTPUT Pattern
### Authority Boundaries
**INPUT (What You Receive) - AUTHORITATIVE:**
- [What comes from upstream - you cannot change this]
- [What’s fixed by specifications]
**OUTPUT (What You Produce) - YOUR AUTHORITY:**
- [Technical decisions you own]
- [Approach choices within scope]
**Examples of INPUT (fixed):**
- ❌ [Thing you cannot change]
**Examples of OUTPUT (your decision):**
- ✅ [Thing you decide]

Software Architect Example
**INPUT (Feature Spec) - AUTHORITATIVE:**
- Business requirements - you CANNOT change these
- Acceptance criteria - these become test scenarios
- Feature scope - fixed by product decisions
**OUTPUT (Architecture) - YOUR AUTHORITY:**
- Data model design (JSONB vs relational, indexes)
- API design (endpoints, parameters, responses)
- Frontend patterns (which UI framework patterns to use)
- Performance optimizations (caching, query optimization)
- Task ordering (which work happens in which sequence)
**Examples of INPUT (fixed):**
- ❌ “This feature should track fewer fields”
- ❌ “We don’t need approval timestamps”
**Examples of OUTPUT (your decision):**
- ✅ “Use JSONB for flexible state storage”
- ✅ “Extract approval logic to Approval model”

Agent knows what not to change (requirements). Agent knows what to decide (technical approach). Concrete examples prevent confusion. Scope creep prevented. Permission-seeking reduced.
Marketing Strategist Example
**INPUT (Campaign Brief) - AUTHORITATIVE:**
- Campaign goals and KPIs - you CANNOT change these
- Budget allocation - fixed by finance approval
- Brand guidelines - non-negotiable standards
- Timeline constraints - fixed by launch date
**OUTPUT (Strategy) - YOUR AUTHORITY:**
- Channel selection (paid social, email, content, etc.)
- Audience segmentation approach
- Messaging hierarchy and positioning
- Creative direction and tone
- A/B test design and hypothesis
**Examples of INPUT (fixed):**
- ❌ “We should increase the budget”
- ❌ “Let’s extend the launch date”
**Examples of OUTPUT (your decision):**
- ✅ “Focus budget on Instagram and TikTok for Gen Z audience”
- ✅ “Use storytelling format with customer testimonials”

Agent stops asking “Can I use this channel?” and starts deciding based on expertise.
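The INPUT/OUTPUT split can also be encoded as data the orchestrator checks before accepting a proposed change. A minimal sketch, assuming hypothetical category labels and a `within_authority` helper of my own naming:

```python
# Sketch: INPUT (fixed) vs OUTPUT (agent's decision) as a checkable
# structure. Category labels are illustrative, not a fixed taxonomy.

AUTHORITY = {
    "input": {"campaign goals", "budget", "brand guidelines", "timeline"},
    "output": {"channel selection", "audience segmentation",
               "messaging", "creative direction", "ab tests"},
}

def within_authority(category: str) -> bool:
    """Agent may decide OUTPUT categories; INPUT categories are fixed."""
    if category in AUTHORITY["input"]:
        return False          # scope creep: escalate, don't change
    return category in AUTHORITY["output"]

print(within_authority("budget"))             # False: fixed upstream
print(within_authority("channel selection"))  # True: agent's call
```

Categories outside both lists fail the check too, which mirrors the deny-by-default stance of tool restrictions: anything not explicitly granted is escalated.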
Component 4: Workflow Integration
Vague workflows create inconsistent execution:
❌ “Implement the feature following best practices”
❌ “Create a marketing strategy”
No clear steps. No validation points. Agent wings it.
Pattern: Numbered Steps with Checkpoints
## Core Workflow
**IMPORTANT:** Follow these [N] steps for EVERY [task/artifact].
### 1. [STEP NAME]
- [What to do]
- [Success criteria]
- [Output/checkpoint]
### 2. [STEP NAME]
- [What to do]
- [Success criteria]
- [Output/checkpoint]

Software Engineer Example (TDD)
## Core Workflow: Red-Green-Refactor
**IMPORTANT:** Repeat these 7 steps for EVERY task, one at a time.
### 1. UNDERSTAND
- Read task carefully
- Identify files involved
- Identify which loaded rules apply
- Plan unhappy path tests
### 2. CREATE BRANCH
- Create feature branch: `feature/[id]-[name]`
- Confirm branch created: `git branch --show-current`
### 3. UPDATE CHANGELOG
- Add task to CHANGELOG.md “In Progress” section
- Create subtask checklist if complex
### 4. RED (Write Tests)
- Write failing test for happy path
- Write failing tests for unhappy paths (validation, auth, 404, edge cases)
- Run test suite → Confirm all new tests FAIL
- Commit: `[ID] [Component] Test - Description`
### 5. GREEN (Make It Pass)
- Implement minimum code to pass tests
- Load and apply Sacred Rules (technical correctness)
- Run test suite → Confirm all tests PASS
- Commit: `[ID] [Component] Add - Description`
### 6. REFACTOR (Apply Taste)
- Load and apply Sacred Taste (code quality)
- Improve code within modified files only
- Run test suite → Confirm still green
- Commit: `[ID] [Component] Refactor - Description`
### 7. VALIDATE
- Self-check against loaded rules
- Security checklist
- Verify strings localized
- Update CHANGELOG.md to “Completed”
- Mark task complete

Numbered sequence (no skipping). Success criteria per step (“confirm tests FAIL”). Tool usage specified (git commands, test suite). Commit format enforced through examples. Skill loading integrated (step 5 = rules, step 6 = taste).
Agent follows the workflow. Every time. No asking “What should I do next?”
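The “numbered steps with checkpoints” shape can be sketched as a runner that refuses to advance past a failed success criterion. The `run_workflow` function and the stubbed checks are illustrative; real checkpoints would shell out to git and the test suite:

```python
# Sketch: a numbered workflow where each step carries a success check,
# so the agent validates before advancing. Checks here are stubs.

from typing import Callable

Step = tuple[str, Callable[[], bool]]  # (name, success criterion)

def run_workflow(steps: list[Step]) -> list[str]:
    completed = []
    for i, (name, check) in enumerate(steps, 1):
        if not check():
            raise RuntimeError(f"Step {i} '{name}' failed its checkpoint")
        completed.append(f"{i}. {name}")
    return completed

# Stubbed Red-Green-Refactor checkpoints
steps: list[Step] = [
    ("UNDERSTAND", lambda: True),
    ("CREATE BRANCH", lambda: True),
    ("RED: new tests fail", lambda: True),
    ("GREEN: all tests pass", lambda: True),
    ("REFACTOR: still green", lambda: True),
]
print(run_workflow(steps))
```

The point of raising instead of continuing: a skipped checkpoint is a bug in the run, not a judgment call the agent gets to make.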
Component 5: Quality Standards
“Follow best practices” is too vague. Which practices? Says who?
Two-Tier Quality System
Sacred Rules - Technical Correctness (MUST follow)
Non-negotiable standards. Violations cause failures or errors.
Software Development:
BR-01: Use `params.expect()` not `params.require()`
BR-08: Prevent N+1 queries with eager loading
FR-01: Use `dom_id()` helpers for element IDs
FR-07: Semantic HTML required
Marketing:
MR-01: All campaigns must define measurable KPIs
MR-02: Target audience validated against data
MR-04: Brand voice guidelines compliance
MR-06: UTM parameters configured for all links
Sacred Taste - Quality Preferences (SHOULD follow)
Maintainability guidelines. Not breaking, but better.
Software Development:
BT-01: Methods ≤15 lines
BT-03: Controller logic minimal
FT-01: Semantic CSS classes
FT-03: UI components ≤50 lines
Marketing:
MT-01: Headlines ≤10 words for digital
MT-02: Active voice in CTAs
MT-04: Tone consistency across channels
Loading Strategy (Progressive Disclosure)
Don’t dump all rules at once. Load just-in-time:
**Before work:**
- Load [domain-skill]/SKILL.md (navigation file, ~80 lines)
**During correctness phase:**
- Load specific Sacred Rule reference files as needed
- Example: Working on params? Load BR-01-params-expect.md
**During quality phase:**
- Load specific Sacred Taste reference files as needed
- Example: Improving methods? Load BT-01-method-length.md

This is the Skills layer from Article 1. Agents have access to institutional knowledge without drowning in it. More on this in the upcoming Skills article.
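Just-in-time loading is straightforward to implement: read the small navigation file up front and pull individual rule files only when a step needs them. A sketch assuming the skill-directory layout described above (the `load_skill_context` name is mine):

```python
# Sketch: progressive disclosure. Load the ~80-line navigation file,
# then only the rule files the current step references.

from pathlib import Path

def load_skill_context(skill_dir: str, rule_ids: list[str]) -> str:
    """Concatenate SKILL.md plus only the referenced rule files."""
    base = Path(skill_dir)
    parts = [(base / "SKILL.md").read_text()]     # navigation file
    for rule_id in rule_ids:                      # e.g. "BR-01-params-expect"
        parts.append((base / f"{rule_id}.md").read_text())
    return "\n\n".join(parts)
```

A params task would call `load_skill_context("rails-skill", ["BR-01-params-expect"])` and never pay the context cost of the other rules.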
Component 6: Communication Guidelines
Weak agents ask too much:
“Should I proceed?”
“Does this look good?”
“Is this the right approach?”
Every question costs time. Most are unnecessary.
Pattern: When to Ask vs When to Proceed
## Communication Guidelines
### When to Ask for Clarification
**ASK when:**
- [Scenario requiring genuine clarification]
- [Scenario with major trade-offs]
- [Scenario with missing critical information]
**DO NOT ask:**
- “Should I proceed?” - Always proceed with documented assumptions
- “Is this the right approach?” - Trust your expertise and guidelines
- “Does this look good?” - Apply quality verification checklist

Software Architect Example
### When to Ask for Clarification
**ASK when:**
- Feature spec has genuine ambiguity that cannot be resolved by conventions
- Multiple valid architectural approaches exist with major trade-offs
(document options with pros/cons before asking)
- Critical information missing that prevents architectural decision
(e.g., external service required but not specified)
**DO NOT ask:**
- “Should I proceed?” - Always proceed with documented assumptions
- “Is this the right approach?” - Trust Rails conventions and your expertise
- “Does this look good?” - Apply quality verification checklist
### Clarification Format
**Question:**
[Clear, specific question]
**Context:**
[Why this matters for the architecture]
**Options Considered:**
1. [Option A]: [Pros/Cons]
2. [Option B]: [Pros/Cons]
**Recommended Approach:**
[Your recommendation with rationale]
**Impact if Wrong:**
[What happens if we choose wrong]

Agent knows when asking is appropriate (genuine ambiguity). Agent knows when to proceed (implementation details). When asking, provides options and recommendation (not just question).
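The clarification format above can be treated as a structured message rather than free text, so every question the agent does ask arrives with options and a recommendation. A sketch; the `Clarification` class and its field names mirror the template but are not a fixed schema:

```python
# Sketch: the clarification template as structured data, rendered in
# the same markdown shape the prompt specifies.

from dataclasses import dataclass

@dataclass
class Clarification:
    question: str
    context: str
    options: list[str]        # each entry includes pros/cons
    recommendation: str
    impact_if_wrong: str

    def render(self) -> str:
        opts = "\n".join(f"{i}. {o}" for i, o in enumerate(self.options, 1))
        return (f"**Question:** {self.question}\n"
                f"**Context:** {self.context}\n"
                f"**Options Considered:**\n{opts}\n"
                f"**Recommended Approach:** {self.recommendation}\n"
                f"**Impact if Wrong:** {self.impact_if_wrong}")
```

Making the options and recommendation required fields enforces the rule in code: an agent cannot emit a bare “Should I proceed?” through this channel.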
Real-World Results
My reference implementation: visionaire-rails-team
Five specialized agents, each with the six components:
1. Architect Agent (Planning Phase)
Identity: Senior Rails Technical Architect
Tools: Read-only + Write plans
Authority: Cannot change requirements, decides all technical approach
Workflow: 6-step architecture process
2. Engineer Agent (Execution Phase)
Identity: Senior Rails Engineer, 20+ years TDD
Tools: Full access (Read, Write, Edit, Bash)
Authority: Cannot change architecture, decides implementation details
Workflow: 7-step Red-Green-Refactor
3. Feature Validator (Compliance Phase)
Identity: Senior Quality Analyst, compliance expert
Tools: Read-only + Bash (tests) + Write reports
Authority: Cannot change code, validates spec compliance
Workflow: 5-step compliance verification
4. Code Reviewer (Quality Phase)
Identity: Senior Code Reviewer, patterns expert
Tools: Read-only + Write reports
Authority: Cannot change code, assesses quality
Workflow: 6-step quality assessment
5. Spec Validator (Requirements Phase)
Identity: Senior Requirements Analyst
Tools: Read-only + Write reports
Authority: Cannot change implementation, validates requirements met
Workflow: 4-step requirements verification
Behavioral Changes After Implementing Six Components:
Before the six components, agents were uncertain. They asked permission for implementation details, violated tool boundaries, changed requirements instead of implementing them, followed no consistent workflow, and applied vague “best practices.”
With the six components, agents act with judgment. They proceed autonomously within authority, respect boundaries by design, implement requirements as specified, follow consistent workflows, and apply explicit quality standards.
The behavioral shift: agents stopped seeking validation for decisions within their authority and stopped violating boundaries outside it. Permission requests per feature dropped because agents knew when asking was appropriate. Boundary violations per feature dropped because tool restrictions were explicit.
The mechanism: unclear boundaries create uncertainty, which creates either over-asking (safe but slow) or over-reaching (fast but chaotic). Clear boundaries create confidence, which creates autonomous execution within defined scope.
Beyond Software: Judgment in Any Domain
The same six components work for any domain requiring autonomous expertise.
Legal Contract Analysis:
Six-component agent:
Identity: Senior Counsel, 12+ years commercial agreements
Tools: Read contracts, Write reports only
Authority: Cannot change contracts, categorizes risk
Workflow: 6-step Scan-Categorize-Analyze-Report
Quality: Legal Sacred Rules for flagging
Communication: Escalate only critical items
Result: Autonomous risk assessment with clear escalation boundaries. Agent stops asking “Should I flag this clause?” and starts applying Legal Sacred Rules to determine flagging criteria.
Marketing Campaign Strategy:
Six-component agent:
Identity: Director-level Strategist, data-driven storytelling
Tools: Read briefs, Web research, Write strategies
Authority: Cannot change budget, decides channels
Workflow: 8-step Research-Segment-Strategy-Measure
Quality: Marketing Sacred Rules for compliance
Communication: Ask only when goals conflict
Result: Autonomous strategy development with data-backed decisions. Agent stops asking “Is this channel appropriate?” and starts evaluating channels against audience fit and budget constraints.
The Five Principles
1. Identity Creates Confidence
Give seniority, expertise, philosophy, and stack. Not “helpful assistant” but “Senior Expert with [specific grounding].” Agent thinks from experience, not uncertainty.
2. Restrictions Enable Autonomy
Explicit ALLOWED/FORBIDDEN prevents trial-and-error. Agent knows boundaries before acting. More restrictions = less asking.
3. Authority Prevents Scope Creep
Clear INPUT (fixed) vs OUTPUT (flexible) boundaries. Agent implements, not redefines. Respects requirements, decides approach.
4. Workflows Create Consistency
Numbered steps with checkpoints. Same process every time. No improvisation, no asking “what next?”
5. Standards Replace “Best Practices”
Sacred Rules (must follow) + Sacred Taste (should follow). Concrete, verifiable, domain-specific. Not vague “quality.”
Implementation Path
You don’t need to redesign all your agents at once. Start with one. Apply the six components. Measure the behavioral change.
Day 1: Pick Your Weakest Agent (1 hour)
Which agent asks the most questions? Violates the most boundaries? That’s your starting point.
Day 2: Add Strong Identity (30 minutes)
You are a **[Senior] [Role]** with [expertise grounding].
**Role:** [Seniority level] ([years] experience)
**Expertise:** [Specific domains]
**Philosophy:** [Guiding principles]
**Technology Stack:** [Specific tools]

Day 3: Add Tool Restrictions (30 minutes)
**ALLOWED:**
- [Tool 1] - [When/why to use]
**FORBIDDEN:**
- [Tool 1] - [Rationale]

Day 4: Add Authority Boundaries (1 hour)
**INPUT (What You Receive) - AUTHORITATIVE:**
- [Fixed requirement 1]
**OUTPUT (What You Produce) - YOUR AUTHORITY:**
- [Decision 1]
**Examples of INPUT (fixed):**
- ❌ [Cannot change this]
**Examples of OUTPUT (your decision):**
- ✅ [You decide this]

Week 2: Add Workflow Steps (2 hours)
## Core Workflow
**IMPORTANT:** Follow these [N] steps for EVERY [task].
### 1. [STEP]
- [What to do]
- [Success criteria]

Week 3: Add Quality Standards (2-3 hours)
**Sacred Rules ([PREFIX]-*)** - MUST follow:
- [RULE-01]: [Non-negotiable standard]
**Sacred Taste ([PREFIX]-*)** - SHOULD follow:
- [TASTE-01]: [Quality preference]

Week 4: Add Communication Guidelines (1 hour)
**ASK when:**
- [Genuine ambiguity]
**DO NOT ask:**
- “Should I proceed?” - Always proceed

Test the Agent:
Run it on a task you’ve done before. Compare:
Questions asked: Before vs After
Boundary violations: Before vs After
Rework cycles: Before vs After
The judgment improvement will be measurable.
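The before/after comparison is easy to keep honest if you record the three metrics per run. A minimal sketch with made-up numbers; `compare_runs` is a hypothetical helper, and the figures shown are illustrative, not measured results:

```python
# Sketch: tracking the three judgment metrics across runs so the
# before/after comparison is concrete. Numbers are illustrative only.

from collections import Counter

def compare_runs(before: Counter, after: Counter) -> dict[str, int]:
    """Delta per metric; negative values mean improvement."""
    return {k: after[k] - before[k] for k in before}

before = Counter(questions=6, boundary_violations=2, rework_cycles=3)
after = Counter(questions=1, boundary_violations=0, rework_cycles=1)
print(compare_runs(before, after))
```

Counting per feature (not per session) keeps the metric comparable as task sizes vary.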
What’s Coming Next
This article covered agent design—the six components that create senior-level judgment.
Article 3: “Skills - Institutional Knowledge for AI Teams”
How to structure Sacred Rules and Sacred Taste
Progressive disclosure patterns
Building reference documentation agents actually use
Article 4: “Orchestration - Coordinating Specialists”
Revision loop patterns when agents need second attempts
Escalation to humans when automation isn’t enough
Batch vs interactive execution modes
Article 5: “Metadata - The Learning Layer”
Quality metrics that actually matter
Learning from patterns across executions
Continuous improvement cycles
Summary
Building autonomous agents isn’t about better models or longer prompts. It’s about explicit identity, boundaries, and authority.
Six components—Identity, Tool Restrictions, Authority Boundaries, Workflow Integration, Quality Standards, Communication Guidelines—formalize how constraint enables autonomy.
The agents I build now proceed autonomously, respect boundaries, and communicate only when necessary. Not because the models improved. Because the agent design improved.
Quick Reference
The Six Components:
Strong Identity - Role + seniority + expertise + philosophy + stack
Tool Restrictions - Explicit ALLOWED/FORBIDDEN with rationale
Authority Boundaries - INPUT (fixed) vs OUTPUT (your decision)
Workflow Integration - Numbered steps with success criteria
Quality Standards - Sacred Rules (must) + Sacred Taste (should)
Communication Guidelines - When to ask vs when to proceed
Key Patterns:
Seniority creates confidence (“Senior” not “helpful”)
Restrictions enable autonomy (clear boundaries = less asking)
Authority prevents scope creep (input fixed, output flexible)
Workflows create consistency (same steps every time)
Standards replace vagueness (Sacred Rules not “best practices”)
Guidelines reduce questions (explicit when to ask)
Judgment Indicators:
Questions only for genuine ambiguity
Boundary violations near zero
Consistent workflow execution
Quality standards self-applied
Autonomous decision-making within authority
Communication only when necessary
Start Here:
Pick your weakest agent (asks most questions)
Add strong identity (seniority + expertise + philosophy)
Define tool restrictions (ALLOWED/FORBIDDEN)
Clarify authority boundaries (INPUT/OUTPUT)
Number workflow steps (with success criteria)
Set communication guidelines (when to ask vs proceed)

