Workflow Engine

You have AI. You have business processes. What you don't have is a brain that decides which AI does what, how confident it is, and when to call a human.

That's the gap. Not more AI models. Not better prompts. A decision engine that turns business tasks into work receipts — automatically, measurably, and with an audit trail that proves the output is trustworthy.


The Job

When a venture needs to automate structured business operations — generating proposals, processing applications, routing decisions, producing reports — help it execute those tasks with AI at a 70%+ automation rate, with measurable confidence and clear escalation paths for the 30% that still needs a human.

| Trigger Event | Current Failure | Desired Progress |
| --- | --- | --- |
| New business task arrives | Human reads the task, decides who handles it, manually delegates | Engine reads the task, matches capabilities, executes or escalates |
| AI generates output | No confidence score — "is this good enough?" is a gut call | Every output carries a confidence score. Below threshold = human review. |
| SME gets pulled into routine work | Subject matter experts spend 80% of their time on tasks AI could handle | 80% reduction in SME intervention on structured operations |
| Audit request | No trail — who did what, when, why, and how confident were they? | Full audit trail per execution: skill used, confidence score, escalation decision |
| New capability added | Requires a developer to wire it into every workflow | Register the skill. The engine discovers and matches it automatically. |

The job: "Execute structured business tasks with AI — and prove the output is trustworthy."

The hidden objection: "What if the AI gets it wrong and nobody catches it?" The answer: every execution has a confidence score. Below threshold, a human reviews. The engine doesn't guess — it measures.
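The threshold logic above can be sketched in a few lines. This is an illustrative sketch, not the engine's actual API — the function name, signature, and 0.0-1.0 confidence scale are assumptions:

```python
def route(output: str, confidence: float, threshold: float = 0.85) -> str:
    """Decide whether an output ships or escalates.

    The engine doesn't guess: every output carries a measured
    confidence score, and anything below threshold goes to a human.
    """
    if confidence >= threshold:
        return "auto-approve"
    return "escalate"  # a human reviews before anything ships

# A 0.91 score clears the 0.85 bar; a 0.72 score does not.
print(route("Draft proposal for Acme", 0.91))  # auto-approve
print(route("Draft proposal for Acme", 0.72))  # escalate
```

The key design point: the boundary is a number, not a judgment call, so "is this good enough?" has the same answer every time.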


Five Questions

Every business task hits the same five questions. The engine answers all of them before a single token is generated.

| # | Question | Component | What It Does |
| --- | --- | --- | --- |
| 1 | What can we do? | Skill Registry | Database of registered capabilities — what the system knows how to execute |
| 2 | What does this task need? | Capability Matcher | Analyses the incoming task and maps it to registered skills |
| 3 | How do we execute? | Orchestrator | Sequences the matched skills into an execution plan |
| 4 | How sure are we? | Confidence Scorer | Scores every output. 85%+ = auto-approve. Below = escalate. |
| 5 | When to involve humans? | Escalation Router | Routes low-confidence or high-stakes tasks to the right human |

Most automation tools answer question 3 and skip the rest. That's why they break. Orchestration without confidence scoring is a liability. Confidence scoring without escalation is a false promise. All five questions. Every time.

Why Five, Not Three

Skip the Skill Registry and you hard-code capabilities — every new skill requires a developer. Skip the Confidence Scorer and you can't prove quality. Skip the Escalation Router and you automate failures at scale. The five components aren't features. They're a trust architecture.
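The five components can be sketched end to end as a single flow. Everything here is illustrative — the registry shape, function names, and the stubbed scorer are assumptions, not the engine's real implementation:

```python
REGISTRY = {                       # 1. Skill Registry: what we can do
    "proposal": lambda task: f"Proposal draft for: {task}",
    "report":   lambda task: f"Report draft for: {task}",
}

def match(task_type):              # 2. Capability Matcher: task -> skill
    return REGISTRY.get(task_type)

def score(output):                 # 4. Confidence Scorer (stubbed here;
    return 0.9 if output else 0.0  #    real scoring is model-based)

def execute(task_type, task, threshold=0.85):
    skill = match(task_type)
    if skill is None:                    # 5. Escalation Router: unknown capability
        return ("escalate", None)
    output = skill(task)                 # 3. Orchestrator runs the plan
    conf = score(output)                 # 4. every output gets a score
    if conf < threshold:                 # 5. low confidence -> human review
        return ("escalate", output)
    return ("auto-approve", output)
```

Note that registering a new skill is a dictionary entry, not a code change to `execute` — that is the "configure, don't build" property the registry exists to provide.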


The Economics

| Metric | Value |
| --- | --- |
| Value per automated execution | ~$250 (SME time saved + speed + consistency) |
| Monthly value at 100 executions | $25,000 |
| Platform cost | $199/month |
| Break-even | 10 executions/month |
| ROI at 100 executions | 125x |
| SME time recovered | 80% reduction in intervention on structured tasks |

The value isn't in the software. It's in the SME hours that come back. Every execution the engine handles is an hour a subject matter expert spends on work that actually requires expertise.
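The headline numbers are back-of-envelope arithmetic on the table's own figures — these are the table's assumptions, not measured data:

```python
value_per_execution = 250   # USD: SME time saved + speed + consistency
executions = 100            # executions per month
platform_cost = 199         # USD per month

monthly_value = value_per_execution * executions
roi = monthly_value // platform_cost   # value returned per dollar of platform cost

print(monthly_value)   # 25000
print(roi)             # 125
```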


Feature / Function / Outcome

| # | Feature | Function | Outcome | Status |
| --- | --- | --- | --- | --- |
| 1 | Skill Registry | Register, discover, and version AI capabilities in a database | New skills available without code changes — configure, don't build | Build ~80% |
| 2 | Capability Matcher | Analyse task requirements and map to registered skills | Right skill for the right task, every time | Build ~80% |
| 3 | Orchestrator | Sequence matched skills into execution plans with dependency resolution | Complex multi-step tasks run automatically | Build ~80% |
| 4 | Confidence Scorer | Score every output on a 0-100 scale with configurable thresholds | Quality is measured, not assumed. 85%+ target. | Build ~70% |
| 5 | Escalation Router | Route low-confidence or high-stakes outputs to designated humans | Humans handle exceptions. AI handles the volume. | Build ~60% |
| 6 | Audit Trail | Log every execution: input, skill used, confidence, escalation decision, output | Full provenance. Answer "why did the system do this?" in seconds. | Build ~70% |
| 7 | Work Receipt Generator | Produce structured output documents per execution | Every task produces a receipt — the proof of work | Build ~75% |
| 8 | Threshold Configuration | Business users set confidence thresholds per task type | Risk tolerance is a business decision, not an engineering one | Gap |
| 9 | Skill Performance Analytics | Track skill accuracy over time, flag degradation | Skills that get worse get caught before they damage trust | Gap |
| 10 | Workflow Builder integration | Visual interface for configuring engine behaviour | Business users create and manage workflows without code | Gap — UI layer |
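The Audit Trail and Work Receipt Generator rows imply a per-execution record. A sketch of what such a record might look like — every field name here is hypothetical, not the engine's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class WorkReceipt:
    """One execution's proof of work: input, skill, score, decision, output."""
    task: str
    skill: str          # which registered skill (and version) executed
    confidence: float   # from the Confidence Scorer, 0.0-1.0 here
    escalated: bool     # did the Escalation Router send this to a human?
    output: str
    executed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

receipt = WorkReceipt(
    task="Generate RFP response",
    skill="proposal-writer@1.2.0",   # illustrative skill identifier
    confidence=0.91,
    escalated=False,
    output="...draft...",
)
```

A record like this is what lets the audit question — "why did the system do this?" — be answered in seconds rather than reconstructed from logs.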

Business Dev

This is the backend brain. It doesn't have a UI — it powers everything that does.

| Layer | Decision | Current Hypothesis | Validation Signal |
| --- | --- | --- | --- |
| ICP | Who benefits first? | Stackmates ventures with 50+ structured tasks/month | 5 task types reach 70%+ automation in first cycle |
| Offer | What does this enable? | "Your AI handles the volume. Your experts handle the exceptions." | One venture reports 80% reduction in SME time on automated task types |
| Channel | How does adoption happen? | Internal-first — every venture's automation runs through this engine | New task types registered via Skill Registry, not custom code |
| Conversion | What proves it works? | Confidence scores above 85% on automated outputs | Client accepts auto-approved outputs without manual review |
| Retention | Why does it stick? | More skills registered = more tasks automated = harder to replace | Skill registry grows monthly. Automation rate trends up quarter-over-quarter. |
| Expansion | How does it compound? | Cross-venture skill sharing — a skill built for PrettyMint works for HowzUs | One skill registered by one venture is reused by another without modification |

Commissioning

| Component | Schema | API | UI | Tests | Status |
| --- | --- | --- | --- | --- | --- |
| Skill Registry | Done | Done | Pending | Partial | 80% |
| Capability Matcher | Done | Done | Pending | Partial | 80% |
| Orchestrator | Done | Done | Pending | Partial | 75% |
| Confidence Scorer | Done | Partial | Pending | Partial | 70% |
| Escalation Router | Done | Partial | Pending | Pending | 60% |
| Audit Trail | Done | Done | Pending | Partial | 70% |
| Work Receipt Generator | Done | Done | Pending | Partial | 75% |
| Threshold Configuration | Pending | Pending | Pending | Pending | 0% |
| Skill Performance Analytics | Pending | Pending | Pending | Pending | 0% |

Status: Backend ~80% complete in engineering repo. Core engine runs. UI layer is 0% — all power currently accessible only through API. The Workflow Builder is the missing interface.


Metrics

| Metric | Target | Why It Matters |
| --- | --- | --- |
| Automation rate | 70%+ of structured operations | North star — the engine is doing its job |
| Confidence scores | 85%+ on auto-approved outputs | Quality threshold — below this, humans review |
| SME intervention reduction | 80% | The economic case — experts on expert work |
| Skill registry growth | +3 skills/month | Platform compounding — more skills, more automation |
| Escalation accuracy | 95%+ (right human, right time) | Trust — escalations that matter, not noise |
| Audit trail completeness | 100% of executions logged | Compliance — every decision is traceable |

Risks + Kill Signal

| Risk | Mitigation |
| --- | --- |
| Confidence scorer is inaccurate — high scores on bad output | Calibration loop: human review of auto-approved outputs. Score accuracy tracked as a metric. |
| Skill registry becomes a dumping ground — too many low-quality skills | Quality gate: every skill must pass validation before registry entry. Performance tracking removes degraded skills. |
| Engine works but nobody trusts it | Audit trail is the trust mechanism. If people can't see why the engine decided what it decided, trust won't form. |
| Framework lock-in — tied to one AI provider | Framework-agnostic design. The engine orchestrates capabilities, not models. Swap the model, keep the skill. |
| Over-automation — engine handles tasks it shouldn't | Escalation Router exists for this. The confidence threshold is the safety valve. |

Kill signal: If automation rate stays below 30% after 90 days, the engine isn't learning. Either the Skill Registry is too thin, the Capability Matcher can't map tasks to skills, or the Confidence Scorer is too conservative. Below 30% means humans are doing the work anyway and the engine is overhead. Kill and simplify.
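The kill signal reduces to one check over a 90-day window. A minimal sketch, assuming hypothetical inputs (automated count, total count, days elapsed):

```python
def kill_signal(automated: int, total: int, days: int) -> bool:
    """True when the engine is overhead: 90+ days in and under 30% automated."""
    if total == 0 or days < 90:
        return False               # too early (or no data) to judge
    return automated / total < 0.30

print(kill_signal(automated=25, total=100, days=90))  # True — kill and simplify
print(kill_signal(automated=72, total=100, days=90))  # False — engine is earning its keep
```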


Mycelium Capability

This is the decision engine — the backend brain that powers AI automation for any venture in the network. It doesn't face users directly. It powers every tool that does.

| Venture | Uses The Engine For | Dependency |
| --- | --- | --- |
| Stackmates | Proposal generation, RFP responses, client reporting | Primary — first consumer |
| Dreamineering | Content pipeline automation, research synthesis | Planned |
| HowzUs | Property assessment reports, compliance checks | Planned |
| PrettyMint | Product descriptions, inventory analysis | Planned |
| BerleyTrails | Trail recommendation generation, data quality tasks | Planned |
| TouchForFun | Session planning, prompt generation | Planned |
| BetterPractice | Protocol recommendations, progress reports | Planned |

When the engine works, every venture gets AI automation without building its own decision layer. When it doesn't, every venture hard-codes AI calls and hopes for the best.

The Workflow Builder is the visual interface that makes this engine accessible to business users. Engine without Builder = power without access. Builder without Engine = interface without intelligence.
