Skip to main content
L4inner-loop

CLI Platform

When I need to manage projects, communicate with engineering, audit system health, or measure outcomes — drmg is the single control surface that works for both human operators and AI agents.

1,200
Priority Score
Pain × Demand × Edge × Trend × Conversion
Customer Journey

Why should I care?

Five cards that sell the dream

1Why

Five CLIs, one binary.

What if every CLI command shared one platform library?

The friction: Five independent CLIs means five arg parsers, five env loaders, five output formats. Every new script reinvents the wheel.

The desire: One binary (drmg) that routes to all commands. Shared platform lib handles parsing, output, env, and DB context.

The proof: 9 namespaces, 60+ commands, zero duplicate infrastructure.

Picture
Five terminal windows merging into one — cables converging on a single dark console. Cinematic, 16:9
1 / 5

Same five positions. Different seat.

The operator sees one binary. The agent sees runtime introspection. The contributor sees enforced boundaries. The platform sees compounding extraction.

Feature Dev Journey

How did this get built?

Five cards that show the process

1Job

The job: extract, don't invent.

What proven patterns already exist across 5 CLIs?

Situation: Five CLIs built independently. 4 duplicate arg parsers, 4 env loaders, 2+ direct SQL scripts. Every session pays the duplication tax.

Intention: 8 shared concerns extracted into one platform lib. Every "Custom" or "Direct" cell in the capability matrix is a duplication cost.

Obstacle: The 3,826-line monolith. Decomposing it without breaking 40+ daily commands.

Picture
A capability matrix — 8 rows (concerns) x 5 columns (CLIs). Custom/Direct cells highlighted in red, shared cells in green. Cinematic, 16:9
1 / 5

The pitch is the shape. The flow diagrams prove the thinking. The VV stories validate the value.

Problem

Situation

Dream team and agents interface with the stackmates system through the drmg CLI daily — plan dashboard, comms read/post, audit, measure, data gaps. The CLI works but doesn't report what operators need: plans drift without stale warnings, dashboards show plan IDs not project names, merged PRs leave plan DB out of sync, session boundaries aren't queryable, and error messages assume you already know the fix.

Intention

An agent-grade CLI control surface scored 16/20 on the 10-dimension CLI Standard — structured I/O, runtime introspection, context discipline, input hardening, safety rails, response safety, packaged guidance, multi-surface coherence, headless auth, and failure design.

Obstacle

The CLI was built to solve engineering problems (monolith decomposition, boundary enforcement). Operator and agent needs were bolted on, not designed in. Dashboard output is plan-centric, not project-centric. Error recovery requires tribal knowledge. No batch operations. No staleness detection.

Hardest Thing

Reframing a working CLI from 'does it run commands' to 'does it give operators and agents the information they need to make decisions' — without breaking 60+ commands agents depend on daily.

Scorecard

Priority (5P)

5/5
Pain
4/5
Demand
4/5
Edge
5/5
Trend
3/5
Convert

Readiness (5R)

Principles5 / 5
Performance4 / 5
Platform5 / 5
Process4 / 5
Players3 / 5

What Exists

ComponentState
drmg unified binaryWorking
libs/cli-platformWorking
Plan CLI (decomposed)Working
ETL CLI (absorbed)Working
Comms CLI (absorbed)Working
ESLint boundary ruleWorking
drmg describeWorking

Relationships

PRDContributes
Agent PlatformParent platform this instrument serves. Agents use drmg for all CLI operations.
Project ManagementPlan-cli is the most-used CLI. Decomposition was driven by plan-cli's 3,826-line monolith.
ETL Data ToolETL CLI absorbed into drmg. 8 live pipelines share platform lib.
Kill Signal

CLI Standard score drops below 12/20 for two consecutive sprints, or agents abandon drmg for direct script calls. Neither has triggered.

Questions

If the CLI is the control surface for the entire system, what's the cost of every operator workaround that should be a command?

  • How many decisions per session are delayed because the dashboard requires 4 commands instead of 1?
  • What's the cost of plan-DB drift — measured in wrong priorities acted on, not minutes to sync?
  • If an agent can't distinguish 'no messages' from 'connection broken', what decisions does it make on false data?