Tuesday, December 9, 2014

[DI] ETL versus ELT

When does ETL win?

  • Ordered transformations, where each row depends on the rows processed before it, are not well suited to set processing.
  • Third-party software tools (e.g., name and address standardization utilities) are best integrated and managed by Informatica outside of the RDBMS.
  • Multi-step transformations that do not require access to large volumes of historical or lookup data can run almost entirely in memory (note: caching plays a role).
  • Streaming data loads that use message-based feeds for "real-time" data acquisition.
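
To make the "ordered transformation" case concrete, here is a minimal Python sketch of a row-by-row pipeline that runs entirely in memory on a stream of records. It is an illustration only, not how Informatica implements anything: the function names (standardize, sessionize), the record fields (user, ts), and the 30-minute gap rule are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator

def standardize(events: Iterable[dict]) -> Iterator[dict]:
    """Step 1: clean up the user key (stand-in for a standardization utility)."""
    for ev in events:
        yield {**ev, "user": ev["user"].strip().lower()}

def sessionize(events: Iterable[dict], gap_seconds: int = 1800) -> Iterator[dict]:
    """Step 2: number sessions per user; each row depends on the previous row."""
    last_seen: Dict[str, int] = {}   # user -> timestamp of that user's previous event
    session_no: Dict[str, int] = {}  # user -> current session counter
    for ev in events:
        user, ts = ev["user"], ev["ts"]
        prev = last_seen.get(user)
        # Start a new session on the first event or after a long enough gap.
        if prev is None or ts - prev > gap_seconds:
            session_no[user] = session_no.get(user, 0) + 1
        last_seen[user] = ts
        yield {**ev, "session": session_no[user]}

# Hypothetical message feed, already in arrival order (ts in seconds).
feed = [
    {"user": " Alice ", "ts": 0},
    {"user": "alice", "ts": 600},
    {"user": "BOB", "ts": 700},
    {"user": "alice", "ts": 5000},
]

# The steps are chained as generators, so rows flow through one at a time
# and nothing is staged in a database between steps.
result = list(sessionize(standardize(feed)))
```

Because the session number for a row depends on state carried over from earlier rows, the logic reads naturally as a loop. Expressing the same rule as a set operation is possible in many databases, but it tends to be less direct.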

When does ELT win?

  • Running transformations on a high-performance DW platform reduces capacity requirements on the ETL servers. This is especially useful when peak requirements for data integration fall in a different window than peak requirements for data warehouse analytics.
  • Transformations that require access to historical data or large-cardinality lookup data already in the data warehouse incur significantly less data retrieval overhead.
  • Batch or mini-batch loads with reasonably large data sets, especially when pre-existing indices can be leveraged for processing.
  • Large-scale operations that are well suited to set processing, such as complex joins and large-cardinality aggregations, perform better inside the database.
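
For contrast, here is a minimal sketch of the ELT pattern: load the raw rows into a staging table first, then let the database do the join and aggregation as one set operation. SQLite (via Python's built-in sqlite3 module) stands in for the data warehouse here purely so the example runs anywhere. The table and column names (stg_sales, dim_customer, fact_sales_by_region) are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE stg_sales (customer_id INTEGER, amount REAL);
CREATE TABLE fact_sales_by_region (region TEXT PRIMARY KEY, total REAL);
""")

# Lookup data that already lives in the warehouse.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(1, "east"), (2, "west"), (3, "east")],
)

# Extract + Load: raw rows go into staging untransformed.
conn.executemany(
    "INSERT INTO stg_sales VALUES (?, ?)",
    [(1, 10.0), (2, 5.0), (3, 2.5), (1, 4.0)],
)

# Transform: a single set-based statement executed inside the database.
# The lookup data never leaves the warehouse, and the primary-key index
# on dim_customer is available to the join.
conn.execute("""
INSERT INTO fact_sales_by_region (region, total)
SELECT d.region, SUM(s.amount)
FROM stg_sales AS s
JOIN dim_customer AS d ON d.customer_id = s.customer_id
GROUP BY d.region
""")

totals = dict(conn.execute("SELECT region, total FROM fact_sales_by_region"))
```

The only data that crosses the wire is the raw staging load. The join against the lookup table and the aggregation both run where the data already sits, which is the retrieval-overhead saving described above.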
