# Workflow Engine
You have AI. You have business processes. What you don't have is a brain that decides which AI does what, how confident it is, and when to call a human.
That's the gap. Not more AI models. Not better prompts. A decision engine that turns business tasks into work receipts — automatically, measurably, and with an audit trail that proves the output is trustworthy.
## The Job
When a venture needs to automate structured business operations — generating proposals, processing applications, routing decisions, producing reports — help it execute those tasks with AI at 70%+ automation rate, measurable confidence, and clear escalation paths for the 30% that still needs a human.
| Trigger Event | Current Failure | Desired Progress |
|---|---|---|
| New business task arrives | Human reads task, decides who handles it, manually delegates | Engine reads task, matches capabilities, executes or escalates |
| AI generates output | No confidence score — "is this good enough?" is a gut call | Every output carries a confidence score. Below threshold = human review. |
| SME gets pulled into routine work | Subject matter experts spend 80% of time on tasks AI could handle | 80% reduction in SME intervention on structured operations |
| Audit request | No trail — who did what, when, why, and how confident were they? | Full audit trail per execution: skill used, confidence score, escalation decision |
| New capability added | Requires developer to wire it into every workflow | Register the skill. The engine discovers and matches it automatically. |
The job: "Execute structured business tasks with AI — and prove the output is trustworthy."
The hidden objection: "What if the AI gets it wrong and nobody catches it?" The answer: every execution has a confidence score. Below threshold, a human reviews. The engine doesn't guess — it measures.
## Five Questions
Every business task hits the same five questions. The engine answers all of them before a single token is generated.
| # | Question | Component | What It Does |
|---|---|---|---|
| 1 | What can we do? | Skill Registry | Database of registered capabilities — what the system knows how to execute |
| 2 | What does this task need? | Capability Matcher | Analyses the incoming task and maps it to registered skills |
| 3 | How do we execute? | Orchestrator | Sequences the matched skills into an execution plan |
| 4 | How sure are we? | Confidence Scorer | Scores every output. 85%+ = auto-approve. Below = escalate. |
| 5 | When to involve humans? | Escalation Router | Routes low-confidence or high-stakes tasks to the right human |
Most automation tools answer question 3 and skip the rest. That's why they break. Orchestration without confidence scoring is a liability. Confidence scoring without escalation is a false promise. All five questions. Every time.
### Why Five, Not Three
Skip the Skill Registry and you hard-code capabilities — every new skill requires a developer. Skip the Confidence Scorer and you can't prove quality. Skip the Escalation Router and you automate failures at scale. The five components aren't features. They're a trust architecture.
## The Economics
| Metric | Value |
|---|---|
| Value per automated execution | ~$250 (SME time saved + speed + consistency) |
| Monthly value at 100 executions | $25,000 |
| Platform cost | $199/month |
| Break-even | ~1 execution/month ($250 value vs $199 cost) |
| ROI at 100 executions | 125x |
| SME time recovered | 80% reduction in intervention on structured tasks |
The value isn't in the software. It's in the SME hours that come back. Every execution the engine handles is an hour a subject matter expert spends on work that actually requires expertise.
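The table's arithmetic, made explicit as a back-of-envelope check (figures taken straight from the rows above):

```python
# Back-of-envelope check of the economics table.
value_per_execution = 250    # $ per automated execution (SME time + speed + consistency)
executions_per_month = 100
platform_cost = 199          # $ per month

monthly_value = value_per_execution * executions_per_month  # $25,000
roi_multiple = monthly_value // platform_cost               # 125x
```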
## Feature / Function / Outcome
| # | Feature | Function | Outcome | Status |
|---|---|---|---|---|
| 1 | Skill Registry | Register, discover, and version AI capabilities in a database | New skills available without code changes — configure, don't build | Build ~80% |
| 2 | Capability Matcher | Analyse task requirements and map to registered skills | Right skill for the right task, every time | Build ~80% |
| 3 | Orchestrator | Sequence matched skills into execution plans with dependency resolution | Complex multi-step tasks run automatically | Build ~80% |
| 4 | Confidence Scorer | Score every output on a 0-100 scale with configurable thresholds | Quality is measured, not assumed. 85%+ target. | Build ~70% |
| 5 | Escalation Router | Route low-confidence or high-stakes outputs to designated humans | Humans handle exceptions. AI handles the volume. | Build ~60% |
| 6 | Audit Trail | Log every execution: input, skill used, confidence, escalation decision, output | Full provenance. Answer "why did the system do this?" in seconds. | Build ~70% |
| 7 | Work Receipt Generator | Produce structured output documents per execution | Every task produces a receipt — the proof of work | Build ~75% |
| 8 | Threshold Configuration | Business users set confidence thresholds per task type | Risk tolerance is a business decision, not an engineering one | Gap |
| 9 | Skill Performance Analytics | Track skill accuracy over time, flag degradation | Skills that get worse get caught before they damage trust | Gap |
| 10 | Workflow Builder integration | Visual interface for configuring engine behaviour | Business users create and manage workflows without code | Gap — UI layer |
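Features 6 and 7 meet in one artifact: every execution emits a structured receipt, and the receipt doubles as the audit trail entry. A minimal sketch of what that record could look like — the field names are assumptions, not the engine's actual schema.

```python
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class WorkReceipt:
    """One receipt per execution — the proof of work."""
    task_id: str
    skill_used: str
    confidence: float    # 0-100 scale
    escalated: bool      # True if routed to a human
    output_summary: str
    timestamp: float = field(default_factory=time.time)

    def to_audit_line(self) -> str:
        """One JSON line per execution — the audit trail entry."""
        return json.dumps(asdict(self))

# Example: an auto-approved proposal draft.
receipt = WorkReceipt(
    task_id="task-0042",
    skill_used="draft_proposal",
    confidence=91.5,
    escalated=False,
    output_summary="Proposal draft, 3 sections",
)
line = receipt.to_audit_line()
```

Append-only JSON lines keep "why did the system do this?" answerable with `grep`, not a database migration.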
## Business Dev
This is the backend brain. It doesn't have a UI — it powers everything that does.
| Layer | Decision | Current Hypothesis | Validation Signal |
|---|---|---|---|
| ICP | Who benefits first? | Stackmates ventures with 50+ structured tasks/month | 5 task types reach 70%+ automation in first cycle |
| Offer | What does this enable? | "Your AI handles the volume. Your experts handle the exceptions." | One venture reports 80% reduction in SME time on automated task types |
| Channel | How does adoption happen? | Internal-first — every venture's automation runs through this engine | New task types registered via Skill Registry, not custom code |
| Conversion | What proves it works? | Confidence scores above 85% on automated outputs | Client accepts auto-approved outputs without manual review |
| Retention | Why does it stick? | More skills registered = more tasks automated = harder to replace | Skill registry grows monthly. Automation rate trends up quarter-over-quarter. |
| Expansion | How does it compound? | Cross-venture skill sharing — a skill built for PrettyMint works for HowzUs | One skill registered by one venture is reused by another without modification |
## Commissioning
| Component | Schema | API | UI | Tests | Status |
|---|---|---|---|---|---|
| Skill Registry | Done | Done | Pending | Partial | 80% |
| Capability Matcher | Done | Done | Pending | Partial | 80% |
| Orchestrator | Done | Done | Pending | Partial | 75% |
| Confidence Scorer | Done | Partial | Pending | Partial | 70% |
| Escalation Router | Done | Partial | Pending | Pending | 60% |
| Audit Trail | Done | Done | Pending | Partial | 70% |
| Work Receipt Generator | Done | Done | Pending | Partial | 75% |
| Threshold Configuration | Pending | Pending | Pending | Pending | 0% |
| Skill Performance Analytics | Pending | Pending | Pending | Pending | 0% |
Status: Backend ~80% complete in engineering repo. Core engine runs. UI layer is 0% — all power currently accessible only through API. The Workflow Builder is the missing interface.
## Metrics
| Metric | Target | Why It Matters |
|---|---|---|
| Automation rate | 70%+ of structured operations | North star — the engine is doing its job |
| Confidence scores | 85%+ on auto-approved outputs | Quality threshold — below this, humans review |
| SME intervention reduction | 80% | The economic case — experts on expert work |
| Skill registry growth | +3 skills/month | Platform compounding — more skills, more automation |
| Escalation accuracy | 95%+ (right human, right time) | Trust — escalations that matter, not noise |
| Audit trail completeness | 100% of executions logged | Compliance — every decision is traceable |
## Risks + Kill Signal
| Risk | Mitigation |
|---|---|
| Confidence scorer is inaccurate — high scores on bad output | Calibration loop: human review of auto-approved outputs. Score accuracy tracked as a metric. |
| Skill registry becomes a dumping ground — too many low-quality skills | Quality gate: every skill must pass validation before registry entry. Performance tracking removes degraded skills. |
| Engine works but nobody trusts it | Audit trail is the trust mechanism. If people can't see why the engine decided what it decided, trust won't form. |
| Framework lock-in — tied to one AI provider | Framework-agnostic design. The engine orchestrates capabilities, not models. Swap the model, keep the skill. |
| Over-automation — engine handles tasks it shouldn't | Escalation Router exists for this. The confidence threshold is the safety valve. |
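The first mitigation — the calibration loop — is worth making concrete: sample a fraction of auto-approved outputs back into human review and track how often the scorer's verdict survives contact with a human. A sketch under assumed names; the sampling rate and record shape are illustrative.

```python
import random

def sample_for_review(approved: list[dict], rate: float = 0.1, seed: int = 0) -> list[dict]:
    """Pull a random ~rate fraction of auto-approved outputs back for human review."""
    rng = random.Random(seed)
    k = max(1, int(len(approved) * rate))
    return rng.sample(approved, k)

def score_accuracy(reviewed: list[dict]) -> float:
    """Fraction of sampled outputs the human also judged acceptable.

    Tracked as a metric: a falling value means the scorer is handing out
    high scores on bad output — the exact risk the calibration loop exists for.
    """
    if not reviewed:
        return 0.0
    agreed = sum(1 for r in reviewed if r["human_verdict"] == "good")
    return agreed / len(reviewed)
```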
Kill signal: If automation rate stays below 30% after 90 days, the engine isn't learning. Either the Skill Registry is too thin, the Capability Matcher can't map tasks to skills, or the Confidence Scorer is too conservative. Below 30% means humans are doing the work anyway and the engine is overhead. Kill and simplify.
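The kill signal is cheap to compute — it's one ratio and one date check. A sketch with the thresholds from the text; the function name and counters are illustrative.

```python
def kill_signal(automated: int, total: int, days_live: int) -> bool:
    """True when automation rate is below 30% after 90 days live.

    automated: executions completed without human intervention
    total:     all structured tasks that entered the engine
    """
    if days_live < 90 or total == 0:
        return False  # too early to judge, or nothing measured yet
    return automated / total < 0.30
```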
## Mycelium Capability
This is the decision engine — the backend brain that powers AI automation for any venture in the network. It doesn't face users directly. It powers every tool that does.
| Venture | Uses The Engine For | Dependency |
|---|---|---|
| Stackmates | Proposal generation, RFP responses, client reporting | Primary — first consumer |
| Dreamineering | Content pipeline automation, research synthesis | Planned |
| HowzUs | Property assessment reports, compliance checks | Planned |
| PrettyMint | Product descriptions, inventory analysis | Planned |
| BerleyTrails | Trail recommendation generation, data quality tasks | Planned |
| TouchForFun | Session planning, prompt generation | Planned |
| BetterPractice | Protocol recommendations, progress reports | Planned |
When the engine works, every venture gets AI automation without building its own decision layer. When it doesn't, every venture hard-codes AI calls and hopes for the best.
The Workflow Builder is the visual interface that makes this engine accessible to business users. Engine without Builder = power without access. Builder without Engine = interface without intelligence.
## Context
- Phygital Mycelium — The capability network this engine belongs to
- Workflow Builder — The visual frontend that makes this engine usable
- ETL Data Tool — Upstream dependency: engine needs clean data to reason over
- Data Interface — How the engine accesses structured data
- Commissioning State Machine — Progressive maturity model this engine follows
- Sales CRM & RFP — First consumer: CRM workflows are the proving ground
- Mushroom Caps — The ventures that consume this automation primitive
- Jobs To Be Done — Demand-side thinking: what progress, not what features
- Flow Engineering — Maps that produce code artifacts
- Standards — Where proven automation patterns graduate to