# ETL Data Tool
What's the point of building schemas if nothing flows through them?
## Scorecard
| Dimension | Score | Evidence |
|---|---|---|
| Pain | 5/5 | CRM = 0 businesses. Nowcast = 0 signals. Every data consumer blocked. |
| Demand | 4/5 | Internal dependency for 5+ PRDs. No external demand yet. |
| Edge | 4/5 | 73 domain types + free NZ govt APIs + existing Drizzle repos. 6+ months to replicate. |
| Trend | 5/5 | 73% of AI projects fail on data (Gartner). MCP scraping tools exploding. Crawl4AI at 58K GitHub stars. |
| Conversion | 3/5 | Clear internal path. External pricing untested. |
| Composite | 1200 | 5 × 4 × 4 × 5 × 3 |
Kill signal: Data loads but nobody queries it within 14 days = extraction theater.
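The kill signal above is mechanically checkable. A minimal sketch, assuming we record a `loadedAt` timestamp per dataset and a `lastQueriedAt` timestamp (null if never queried) — neither field name comes from this PRD:

```typescript
// Hypothetical sketch: flag "extraction theater" per the kill signal above.
// Field names (loadedAt, lastQueriedAt) are assumptions, not PRD schema.

const KILL_SIGNAL_DAYS = 14;

function isExtractionTheater(
  loadedAt: Date,
  lastQueriedAt: Date | null,
  now: Date = new Date(),
): boolean {
  const daysSinceLoad =
    (now.getTime() - loadedAt.getTime()) / (1000 * 60 * 60 * 24);
  // Data has sat for 14+ days and no consumer has ever queried it.
  return daysSinceLoad >= KILL_SIGNAL_DAYS && lastQueriedAt === null;
}
```

A nightly job running this per dataset would turn the kill signal from a judgment call into an alert.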
## Issues
| # | Severity | What Happens | Fix |
|---|---|---|---|
| 18 | MEDIUM | /settings/etl returns 404. Settings sidebar "ETL Pipelines" links to a missing page. | Create the route or point the sidebar link at the correct path. |
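Issue 18 is the kind of drift a CI link check catches. A sketch under stated assumptions — `SIDEBAR_LINKS` and `KNOWN_ROUTES` are stand-ins; a real check would read the sidebar config and the app's route manifest:

```typescript
// Hypothetical link check that would have caught issue 18.
// Route and link data below are illustrative, not the app's actual config.

const KNOWN_ROUTES = new Set(["/settings", "/settings/profile"]);

const SIDEBAR_LINKS = [
  { label: "Profile", href: "/settings/profile" },
  { label: "ETL Pipelines", href: "/settings/etl" }, // 404s today
];

function findBrokenLinks(
  links: { label: string; href: string }[],
  routes: Set<string>,
): string[] {
  return links.filter((l) => !routes.has(l.href)).map((l) => l.href);
}
```

Failing the build when `findBrokenLinks` returns anything keeps sidebar entries and routes from diverging again.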
## Context
- Sales CRM & RFP — Primary consumer: needs business profiles and contacts
- Sales Dev Agent — Consumer: needs qualified leads from classified businesses
- Pipeline Nowcast — Consumer: needs business signals for variance prediction
- Data Interface — Child: access layer that sits on ETL output
- AI Data Industry — Market context and competitive landscape
## Questions
- What happens to every downstream PRD if this pipeline stays empty for another month?
- If NZBN gives us 700K businesses for free, what's our excuse for having zero in the CRM?
- At what trust score threshold does scraped data become more dangerous than no data?
- When the first venture queries ETL-loaded data without writing extraction code, does that prove the Mycelium thesis?
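The trust-score question implies a gate rather than a binary load/skip decision. A minimal sketch, assuming a 0–1 `trustScore` per scraped record and a 0.7 cut-off — both the field and the threshold are illustrative assumptions, not PRD values:

```typescript
// Hypothetical trust-score gate for scraped records. trustScore semantics
// and the 0.7 threshold are assumed, not specified in this document.

interface ScrapedRecord {
  id: string;
  trustScore: number; // 0 (unverified scrape) .. 1 (e.g. matched to a registry record)
}

const TRUST_THRESHOLD = 0.7; // assumed cut-off; below this, quarantine rather than serve

function gateByTrust(records: ScrapedRecord[]): {
  accepted: ScrapedRecord[];
  quarantined: ScrapedRecord[];
} {
  const accepted = records.filter((r) => r.trustScore >= TRUST_THRESHOLD);
  const quarantined = records.filter((r) => r.trustScore < TRUST_THRESHOLD);
  return { accepted, quarantined };
}
```

Quarantining instead of dropping keeps low-trust records available for review while answering the question above: below the threshold, scraped data never reaches a consumer.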