Extract, Transform, Load
Data is the lifeblood of modern civilisation.
Data migration from one data source to another.
Process
Treat the data source as immutable: never change the data as it was provided by the raw source.
Use scripts to clean the data, creating a central repository for any system to consume.
Use additional scripts to transform the data so it can be consumed by target systems.
Collection:
- Extract
- Transform
Manipulation:
- Target Transformation
- Load
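The two phases above can be sketched as four small functions. This is a minimal illustration using in-memory dicts as the data format; a real pipeline would read from and write to files or databases, and all names here are assumptions rather than a prescribed API.

```python
def extract(source_rows):
    """Collection step 1: pull rows from the source, untouched."""
    return list(source_rows)  # copy, so the raw source data stays immutable

def transform(raw_rows):
    """Collection step 2: clean rows into the central repository shape
    (here, just normalising column names)."""
    return [{k.strip().upper(): v for k, v in row.items()} for row in raw_rows]

def target_transform(clean_rows, wanted_keys):
    """Manipulation step 1: reshape cleaned data for one target system."""
    return [{k: row[k] for k in wanted_keys if k in row} for row in clean_rows]

def load(rows, sink):
    """Manipulation step 2: hand rows to the target system."""
    sink.extend(rows)

source = [{" country ": "NZ", "sales": 10}]
dashboard = []
load(target_transform(transform(extract(source)), ["COUNTRY"]), dashboard)
```

Note that the raw `source` list is never modified; each stage produces a new collection, which keeps the immutability rule above easy to enforce.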
Data Collection
Scripts and processes to extract data from its source.
Store data locally exactly as it was provided by the source. Do not alter it.
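One way to keep the raw copy untouched is to write it to a timestamped file that is never opened for writing again. The `raw/` directory layout and function name here are assumptions for illustration:

```python
import datetime
import pathlib

def store_raw(payload: bytes, name: str, root: str = "raw") -> pathlib.Path:
    """Write the payload byte-for-byte to a timestamped file under root.

    The file is write-once: later cleaning scripts only ever read it.
    """
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%S")
    path = pathlib.Path(root) / f"{name}_{stamp}.dat"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)  # exact copy of what the source sent
    return path
```

Keeping the timestamp in the filename also gives a cheap audit trail of when each extract ran.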
Use a script to create a new, cleaned-up instance of the data that matches the local system's data-integrity rules.
Suffix local column names with _LOCAL to avoid name collisions.
For example, for data with country names in a COUNTRY column, create a COUNTRY_LOCAL column.
Use logic to transform the data to meet local requirements.
For example, if the data was provided as NZ but the local system requires the full country name New Zealand, run a script to populate the local column.
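The country-name clean-up above might look like the sketch below. `COUNTRY_TO_NAME` is an assumed lookup table; unknown codes are passed through unchanged so a gap in the table stays visible instead of being silently guessed.

```python
# Assumed lookup table mapping source codes to local full names.
COUNTRY_TO_NAME = {"NZ": "New Zealand", "AU": "Australia"}

def add_country_local(rows):
    """Populate COUNTRY_LOCAL without touching the original COUNTRY column."""
    for row in rows:
        code = row.get("COUNTRY")
        # Fall back to the raw code if it is not in the lookup table.
        row["COUNTRY_LOCAL"] = COUNTRY_TO_NAME.get(code, code)
    return rows

rows = add_country_local([{"COUNTRY": "NZ"}, {"COUNTRY": "XX"}])
```

Leaving the source COUNTRY column in place follows the rule above: the raw value is preserved, and the local requirement is met by the suffixed column.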
Manipulation
Use scripts as required to split the cleaned source data into data pools for systems to load.
Apply system-specific business rules, then aggregate the data if required to match the purpose of the target system.
Load data into external systems for:
- Flow Control Dashboards
- Strategic Analysis
- Algorithmic Decision Making
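A sketch of this splitting step, under an assumed business rule: the dashboard pool wants per-country totals while the analysis pool wants the full detail. The pool names and row shapes are illustrative, not a fixed schema.

```python
from collections import defaultdict

def build_pools(clean_rows):
    """Split cleaned rows into per-system pools, aggregating where the
    target system needs summaries rather than detail."""
    pools = {"dashboard": [], "analysis": []}
    totals = defaultdict(int)
    for row in clean_rows:
        pools["analysis"].append(row)            # full detail, unaggregated
        totals[row["COUNTRY_LOCAL"]] += row["SALES"]
    # Dashboard pool: one aggregated row per country.
    pools["dashboard"] = [
        {"COUNTRY_LOCAL": c, "SALES_TOTAL": t} for c, t in sorted(totals.items())
    ]
    return pools

pools = build_pools([
    {"COUNTRY_LOCAL": "New Zealand", "SALES": 10},
    {"COUNTRY_LOCAL": "New Zealand", "SALES": 5},
])
```

Each pool is then handed to its own load script, so a change in one target system's rules never disturbs the others.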