ETL Data Tool — Capability Map
What can we actually do?
Capability Assessment
| Capability | Have | Gap | Build Ratio |
|---|---|---|---|
| PostgreSQL schemas (73 domain types) | Yes | — | 0% new |
| Drizzle ORM repositories | Yes | — | 0% new |
| agent-etl-cli.ts (DDD load order) | Yes | — | 0% new |
| ETL pipeline dashboard (8 pipelines) | Yes | — | 0% new |
| File source connectors | Yes (90%) | 10% edge cases | 10% new |
| API source connectors | Yes (85%) | 15% edge cases | 15% new |
| Database connectors | Partial (75%) | 25% | 25% new |
| NZBN API wrapper | No | Full gap | 100% new |
| Companies Office integration | No | Full gap | 100% new |
| Crawl4AI enrichment pipeline | No | Full gap | 100% new |
| Trust scoring engine | No | Full gap | 100% new |
| ANZSIC classification | No | Full gap | 100% new |
| Scheduled extraction (cron) | No | Full gap | 100% new |
Aggregate build ratio: ~60% composition (existing infra), ~40% new code.
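The aggregate can be sanity-checked against the Build Ratio column. The sketch below takes an unweighted mean of that column. Equal weighting is an assumption, since the table does not record the relative size or effort of each capability, and the shortened capability names are for illustration only.

```typescript
// Build ratios ("% new") copied from the Capability Assessment table.
const buildRatios: Record<string, number> = {
  "PostgreSQL schemas": 0,
  "Drizzle ORM repositories": 0,
  "agent-etl-cli.ts": 0,
  "ETL pipeline dashboard": 0,
  "File source connectors": 10,
  "API source connectors": 15,
  "Database connectors": 25,
  "NZBN API wrapper": 100,
  "Companies Office integration": 100,
  "Crawl4AI enrichment pipeline": 100,
  "Trust scoring engine": 100,
  "ANZSIC classification": 100,
  "Scheduled extraction (cron)": 100,
};

// Unweighted mean: every capability counts equally (an assumption).
function unweightedNewPercent(ratios: Record<string, number>): number {
  const values = Object.values(ratios);
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const newPercent = unweightedNewPercent(buildRatios);
console.log(`new: ${newPercent}%, composition: ${100 - newPercent}%`);
```

With equal weights the split comes out at 50% new and 50% composition, so the ~40% figure presumably weights the existing infrastructure more heavily than the six full-gap items.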
Skill Gaps
| Skill | Available | Gap |
|---|---|---|
| TypeScript ETL development | Yes — existing pipeline patterns | — |
| REST API integration | Yes — existing connector patterns | — |
| Drizzle ORM | Yes — 73 schemas defined | — |
| Web scraping (Crawl4AI) | Partial — research done, not implemented | Docker config + LLM extraction |
| NZ government APIs | No — APIs documented but no wrapper | API key registration + TypeScript client |
| Trust scoring algorithms | No — formula designed, not implemented | Pure function, no external deps |
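To illustrate what "pure function, no external deps" could look like for the trust scoring gap, here is a minimal sketch. The designed formula is not reproduced in this document, so the signal names (`nzbnVerified`, `companiesOfficeActive`, `fieldCompleteness`) and the weights are hypothetical placeholders, not the actual design.

```typescript
// Hypothetical input signals; the real formula's inputs are not given here.
interface TrustSignals {
  nzbnVerified: boolean;          // record matched against the NZBN register
  companiesOfficeActive: boolean; // entity is active at the Companies Office
  fieldCompleteness: number;      // share of populated fields, 0..1
}

// Hypothetical weights in whole points so the maximum score is exactly 100.
const WEIGHTS = {
  nzbnVerified: 40,
  companiesOfficeActive: 40,
  fieldCompleteness: 20,
} as const;

// Pure function: same input gives the same output, with no I/O and no
// shared state, so it can be unit tested without a database or network.
function trustScore(signals: TrustSignals): number {
  const c = signals.fieldCompleteness;
  if (!(c >= 0 && c <= 1)) {
    throw new RangeError("fieldCompleteness must be between 0 and 1");
  }
  const points =
    (signals.nzbnVerified ? WEIGHTS.nzbnVerified : 0) +
    (signals.companiesOfficeActive ? WEIGHTS.companiesOfficeActive : 0) +
    c * WEIGHTS.fieldCompleteness;
  return Math.round(points);
}

console.log(
  trustScore({ nzbnVerified: true, companiesOfficeActive: false, fieldCompleteness: 0.5 }),
);
```

Keeping the weights in one constant means the designed formula can replace the placeholder values without touching callers.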
Reuse Inventory
| Existing Asset | Reuse For |
|---|---|
| agent-etl-cli.ts | Load pattern for all new data |
| File source connector | NZBN bulk data (JSON/CSV dumps) |
| API source connector | NZBN REST API + Companies Office API |
| Quality metrics engine | Trust scoring input validation |
| Type-safe loader | All PostgreSQL inserts with rollback |
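The load pattern reused from agent-etl-cli.ts is described as a DDD load order. Assuming that means parent tables are loaded before the tables that reference them, the ordering step can be sketched as a topological sort. The table names and the `loadOrder` helper are hypothetical and do not come from the actual CLI.

```typescript
// Maps each table to the tables that must be loaded before it.
type DependencyGraph = Record<string, string[]>;

// Returns table names in an order where every dependency precedes its
// dependents (Kahn-style topological sort, ties broken alphabetically).
function loadOrder(graph: DependencyGraph): string[] {
  const remaining = new Map<string, Set<string>>();
  for (const [table, deps] of Object.entries(graph)) {
    remaining.set(table, new Set(deps));
  }
  for (const [table, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      if (!remaining.has(dep)) {
        throw new Error(`Unknown dependency "${dep}" for "${table}"`);
      }
    }
  }

  const order: string[] = [];
  while (remaining.size > 0) {
    // Tables whose dependencies have all been loaded already.
    const ready = Array.from(remaining.keys())
      .filter((table) => remaining.get(table)!.size === 0)
      .sort();
    if (ready.length === 0) {
      throw new Error("Circular dependency between tables");
    }
    for (const table of ready) {
      order.push(table);
      remaining.delete(table);
    }
    // Mark the tables just loaded as satisfied for everything still waiting.
    for (const deps of Array.from(remaining.values())) {
      for (const table of ready) deps.delete(table);
    }
  }
  return order;
}

// Hypothetical domain tables for illustration only.
const graph: DependencyGraph = {
  trust_score: ["organisation", "address"],
  address: ["organisation"],
  organisation: [],
};

console.log(loadOrder(graph)); // ["organisation", "address", "trust_score"]
```

Failing fast on unknown or circular dependencies keeps a bad load plan from reaching the database, where it would otherwise surface as a foreign-key error partway through a load.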
Context
- Dependency Map — What must happen first
- Agent & Instrument Diagram — How agents orchestrate