
ETL Data Tool — Capability Map

What can we actually do?

Capability Assessment

Capability	Have	Gap	Build Ratio
PostgreSQL schemas (73 domain types)	Yes	—	0% new
Drizzle ORM repositories	Yes	—	0% new
`agent-etl-cli.ts` (DDD load order)	Yes	—	0% new
ETL pipeline dashboard (8 pipelines)	Yes	—	0% new
File source connectors	Yes (90%)	10% edge cases	10% new
API source connectors	Yes (85%)	15% edge cases	15% new
Database connectors	Partial (75%)	25%	25% new
NZBN API wrapper	No	Full gap	100% new
Companies Office integration	No	Full gap	100% new
Crawl4AI enrichment pipeline	No	Full gap	100% new
Trust scoring engine	No	Full gap	100% new
ANZSIC classification	No	Full gap	100% new
Scheduled extraction (cron)	No	Full gap	100% new

Aggregate build ratio: ~60% composition (reuse of existing infrastructure), ~40% new code.
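
As a sanity check on the aggregate, the per-row "% new" values from the table can be averaged directly. The sketch below is illustrative only: the function name `aggregateNewPercent` and the optional `weights` parameter are assumptions, not part of the existing tooling. With every row weighted equally the mean comes out at 50% new, so the ~40% figure above presumably weights rows by effort or size, which the table does not show.

```typescript
// Rows copied from the Capability Assessment table: [capability, % new code].
const capabilities: Array<[string, number]> = [
  ["PostgreSQL schemas", 0],
  ["Drizzle ORM repositories", 0],
  ["agent-etl-cli.ts", 0],
  ["ETL pipeline dashboard", 0],
  ["File source connectors", 10],
  ["API source connectors", 15],
  ["Database connectors", 25],
  ["NZBN API wrapper", 100],
  ["Companies Office integration", 100],
  ["Crawl4AI enrichment pipeline", 100],
  ["Trust scoring engine", 100],
  ["ANZSIC classification", 100],
  ["Scheduled extraction (cron)", 100],
];

// Weighted mean of "% new". Weights default to 1, so every row counts equally.
// Passing effort estimates as weights is a hypothetical extension.
function aggregateNewPercent(
  rows: Array<[string, number]>,
  weights: number[] = rows.map(() => 1),
): number {
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  const weighted = rows.reduce((sum, [, pct], i) => sum + pct * (weights[i] ?? 1), 0);
  return weighted / totalWeight;
}

const unweightedNew = aggregateNewPercent(capabilities);
console.log(`new: ${unweightedNew}%, composition: ${100 - unweightedNew}%`);
```

Supplying per-row effort estimates as `weights` would show whether a particular weighting reproduces the 60/40 split.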

Skill Gaps

Skill	Available	Gap
TypeScript ETL development	Yes — existing pipeline patterns	—
REST API integration	Yes — existing connector patterns	—
Drizzle ORM	Yes — 73 schemas defined	—
Web scraping (Crawl4AI)	Partial — research done, not implemented	Docker config + LLM extraction
NZ government APIs	No — APIs documented but no wrapper	API key registration + TypeScript client
Trust scoring algorithms	No — formula designed, not implemented	Pure function, no external deps
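
The "pure function, no external deps" shape noted for trust scoring could look roughly like the sketch below. The designed formula is not given in this document, so the signal names (`registryMatch`, `completeness`, `freshness`) and the weights are placeholders chosen for illustration. Only the structure is the point: a deterministic function of its inputs with no I/O.

```typescript
// Hypothetical input signals, each expected in the range 0..1.
interface TrustSignals {
  registryMatch: number; // agreement with an official register (assumed signal)
  completeness: number;  // share of expected fields that are populated (assumed signal)
  freshness: number;     // how recently the record was verified (assumed signal)
}

// Illustrative weights only; these are NOT the designed formula.
const WEIGHTS: TrustSignals = { registryMatch: 0.5, completeness: 0.3, freshness: 0.2 };

// Clamp a value into 0..1 so out-of-range inputs cannot skew the score.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Pure function: the same input always yields the same score, with no I/O.
function trustScore(s: TrustSignals): number {
  const raw =
    WEIGHTS.registryMatch * clamp01(s.registryMatch) +
    WEIGHTS.completeness * clamp01(s.completeness) +
    WEIGHTS.freshness * clamp01(s.freshness);
  return Math.round(raw * 100); // integer score from 0 to 100
}

console.log(trustScore({ registryMatch: 1, completeness: 0.5, freshness: 0 }));
```

Keeping the function free of database or network access means it can be unit-tested without any pipeline infrastructure and called from the existing quality metrics code once the real formula is substituted.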

Reuse Inventory

Existing Asset	Reuse For
`agent-etl-cli.ts`	Load pattern for all new data
File source connector	NZBN bulk data (JSON/CSV dumps)
API source connector	NZBN REST API + Companies Office API
Quality metrics engine	Trust scoring input validation
Type-safe loader	All PostgreSQL inserts with rollback
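
One validation step that could sit between the file source connector and the loader when ingesting NZBN bulk data is a structural check on the identifier itself. To my understanding an NZBN is a 13-digit GS1 Global Location Number, so its last digit should be a GS1 check digit. The sketch below assumes that and is not taken from the existing quality metrics engine; the function name `isValidNzbn` is hypothetical and the sample values are synthetic, not real business numbers.

```typescript
// Returns true if `value` is exactly 13 digits and its final digit matches
// the GS1 check digit computed over the first 12 digits.
function isValidNzbn(value: string): boolean {
  if (!/^\d{13}$/.test(value)) return false;
  const digits = Array.from(value, Number);
  // GS1 weights for a 13-digit key: 1, 3, 1, 3, ... across the first 12 digits.
  const sum = digits
    .slice(0, 12)
    .reduce((acc, d, i) => acc + d * (i % 2 === 0 ? 1 : 3), 0);
  const check = (10 - (sum % 10)) % 10;
  return check === digits[12];
}

// Synthetic examples: the first has a consistent check digit, the second does not.
console.log(isValidNzbn("9429000000000"), isValidNzbn("9429000000001"));
```

A check like this only catches malformed or mistyped identifiers; whether the number belongs to a registered entity still requires a lookup against the NZBN API.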

Context

Dependency Map — What must happen first
Agent & Instrument Diagram — How agents orchestrate