ETL Data Tool — Capability Map

What can we actually do?

Capability Assessment

| Capability | Have | Gap | Build Ratio |
|---|---|---|---|
| PostgreSQL schemas (73 domain types) | Yes | | 0% new |
| Drizzle ORM repositories | Yes | | 0% new |
| agent-etl-cli.ts (DDD load order) | Yes | | 0% new |
| ETL pipeline dashboard (8 pipelines) | Yes | | 0% new |
| File source connectors | Yes (90%) | 10% edge cases | 10% new |
| API source connectors | Yes (85%) | 15% edge cases | 15% new |
| Database connectors | Partial (75%) | 25% | 25% new |
| NZBN API wrapper | No | Full gap | 100% new |
| Companies Office integration | No | Full gap | 100% new |
| Crawl4AI enrichment pipeline | No | Full gap | 100% new |
| Trust scoring engine | No | Full gap | 100% new |
| ANZSIC classification | No | Full gap | 100% new |
| Scheduled extraction (cron) | No | Full gap | 100% new |

Aggregate build ratio: ~60% composition (existing infra), ~40% new code.
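
The agent-etl-cli.ts row mentions a DDD load order. If that order means loading parent records before the records that reference them (an assumption; the source does not define it), it could be derived with a topological sort. The sketch below is illustrative only: the `loadOrder` function, the `DependencyMap` type, and the table names are hypothetical and not taken from agent-etl-cli.ts.

```typescript
// Hypothetical dependency map: each table lists the tables it references.
type DependencyMap = Record<string, string[]>;

// Depth-first topological sort. Returns tables so that every dependency
// appears before the table that depends on it. Throws if a cycle is found.
function loadOrder(deps: DependencyMap): string[] {
  const ordered: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (table: string): void => {
    const current = state.get(table);
    if (current === "done") return;
    if (current === "visiting") {
      throw new Error(`Dependency cycle at ${table}`);
    }
    state.set(table, "visiting");
    for (const dep of deps[table] ?? []) visit(dep);
    state.set(table, "done");
    ordered.push(table);
  };

  for (const table of Object.keys(deps)) visit(table);
  return ordered;
}

// Illustrative table names only (not the real schema).
const exampleDeps: DependencyMap = {
  trust_scores: ["companies"],
  companies: ["industries"],
  industries: [],
};
const order = loadOrder(exampleDeps);
```

With the example map, `industries` is loaded first and `trust_scores` last, so no insert references a row that does not exist yet.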

Skill Gaps

| Skill | Available | Gap |
|---|---|---|
| TypeScript ETL development | Yes: existing pipeline patterns | |
| REST API integration | Yes: existing connector patterns | |
| Drizzle ORM | Yes: 73 schemas defined | |
| Web scraping (Crawl4AI) | Partial: research done, not implemented | Docker config + LLM extraction |
| NZ government APIs | No: APIs documented but no wrapper | API key registration + TypeScript client |
| Trust scoring algorithms | No: formula designed, not implemented | Pure function, no external deps |
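
The trust scoring gap is described as a pure function with no external dependencies. The sketch below shows what that shape might look like. It does not reproduce the designed formula, which is not given here: the signal names, the weights, and the weighted-average calculation are all placeholder assumptions chosen only to illustrate the pure-function structure.

```typescript
// Hypothetical signal set, each value expected in the range 0..1.
// These names are placeholders, not the designed formula's inputs.
interface TrustSignals {
  registryMatch: number;
  dataCompleteness: number;
  sourceFreshness: number;
}

type TrustWeights = Record<keyof TrustSignals, number>;

// Clamp a value into the 0..1 range.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Pure function: no I/O, no clock, no shared state. The same inputs
// always produce the same score. Returns an integer from 0 to 100.
function trustScore(signals: TrustSignals, weights: TrustWeights): number {
  const keys = Object.keys(weights) as (keyof TrustSignals)[];
  const totalWeight = keys.reduce((sum, k) => sum + weights[k], 0);
  if (totalWeight <= 0) {
    throw new Error("Weights must sum to a positive value");
  }
  const weighted = keys.reduce(
    (sum, k) => sum + weights[k] * clamp01(signals[k]),
    0,
  );
  return Math.round((weighted / totalWeight) * 100);
}

// Placeholder weights for illustration only.
const placeholderWeights: TrustWeights = {
  registryMatch: 2,
  dataCompleteness: 1,
  sourceFreshness: 1,
};

const score = trustScore(
  { registryMatch: 1, dataCompleteness: 0.5, sourceFreshness: 0.5 },
  placeholderWeights,
); // (2*1 + 1*0.5 + 1*0.5) / 4 = 0.75, so 75
```

Keeping the weights as a parameter rather than a constant means the real formula can be swapped in later without changing callers, and the function can be unit-tested without a database.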

Reuse Inventory

| Existing Asset | Reuse For |
|---|---|
| agent-etl-cli.ts | Load pattern for all new data |
| File source connector | NZBN bulk data (JSON/CSV dumps) |
| API source connector | NZBN REST API + Companies Office API |
| Quality metrics engine | Trust scoring input validation |
| Type-safe loader | All PostgreSQL inserts with rollback |
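
To illustrate how the API source connector pattern might be reused for the NZBN REST API, here is a minimal sketch of a thin TypeScript client. Everything named in it is an assumption: the base URL, the API key header name, the `/entities/{nzbn}` path, and the `NzbnEntity` shape are placeholders to be replaced with values from the official API documentation and the issued key. The HTTP call is injected so the wrapper can be tested without network access.

```typescript
// Minimal HTTP GET signature, injected so tests can run without a network.
type HttpGet = (
  url: string,
  headers: Record<string, string>,
) => Promise<{ status: number; body: unknown }>;

// All three values are assumptions supplied by the caller.
interface NzbnClientConfig {
  baseUrl: string;      // from the API provider's documentation
  apiKeyHeader: string; // header name given at API key registration
  apiKey: string;
}

// Hypothetical minimal entity shape; the real payload has more fields.
interface NzbnEntity {
  nzbn: string;
  entityName: string;
}

// An NZBN is a 13-digit number.
function isValidNzbn(value: string): boolean {
  return /^\d{13}$/.test(value);
}

function createNzbnClient(config: NzbnClientConfig, httpGet: HttpGet) {
  return {
    // Returns the entity, or null when the API reports 404.
    async getEntity(nzbn: string): Promise<NzbnEntity | null> {
      if (!isValidNzbn(nzbn)) throw new Error(`Invalid NZBN: ${nzbn}`);
      // Assumed path layout; check against the official API reference.
      const url = `${config.baseUrl}/entities/${nzbn}`;
      const res = await httpGet(url, { [config.apiKeyHeader]: config.apiKey });
      if (res.status === 404) return null;
      if (res.status !== 200) {
        throw new Error(`NZBN lookup failed with status ${res.status}`);
      }
      const body = (res.body ?? {}) as Partial<NzbnEntity>;
      if (typeof body.nzbn !== "string" || typeof body.entityName !== "string") {
        throw new Error("Unexpected response shape");
      }
      return { nzbn: body.nzbn, entityName: body.entityName };
    },
  };
}
```

In production the `httpGet` argument could be a small adapter over `fetch`; in tests it can be a stub that returns canned responses, which keeps the wrapper in line with the existing connector patterns without committing to a specific HTTP library.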

Context