Data ETL

Extract, transform, load is the general procedure of copying data from one or more sources into a destination system which represents the data differently from the source(s) or in a different context than the source(s).

  • Extract, extracting the data from the source system(s).
  • Transform, a series of rules or functions are applied to the extracted data in order to prepare it for loading into the end target.
  • Load, loads the data into the end target, which can be any data store including a simple delimited flat file or a data warehouse.

The typical real-life ETL cycle consists of the following execution steps:

  • Cycle initiation
  • Build reference data
  • Extract (from sources)
  • Validate
  • Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
  • Stage (load into staging tables, if used)
  • Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair)
  • Publish (to target tables)
  • Archive

Tools

Top 12 Free and Open Source ETL Tools for Data Integration