Data Warehouse: Concepts on ETL Processes (Transforming)










Data Warehouse: Four (4) typical transformation and loading scenarios

(a) Key Lookup Scenario
(b) Business Rule Violation Scenario
(c) Data error scenario
(d) Pivoting Scenario



(a) key Lookup Scenario
  • You need to load sales transaction data into a retail Data Warehouse
  • In the operational system, each product is identified by Uniform Price Codes (UPCs);
  • In the Data Warehouse, product is identified by product_id.
  • You need to
(1) Map UPC --> PRODUCT_ID; and
(2)As you load the source data, lookup the map table and transform the data accordingly



(b) Business Rule Violation Scenario
  • Some UPC Codes are invalid
  • In which ways can you implement the transformation and handle the invalid UPCs?
  • i.e: new sales data might not have valid UPC codes
(1) use Create Table As Select (CTAS) to identify and log invalid UPCs
(2) use an outer join (+) and convert invalid UPCs to NULL
(3) use multi-table INSERT and separate the invalid UPCs to a different table

(1) use CTAS to identify and log invalid UPCs

(2) use an outer join (+) and convert invalid UPCs to NULL

(3)use multi-table INSERT and separate the invalid UPCs to a different table



(c) Data Error Scenarios
  • If the quality of the data is unknown, you may face unexpected errors, such as a a failed datatype conversion, or a constraint violation.
  • Such errors can happen independently of whether the UPC for the particular record is or not valid.
  • Here you can use a DML error logging table to capture unexpected errors.

2 comments:

  1. This is just so timely and valuable info, thanks so much for sharing this with me Informatica Online Training

    ReplyDelete
  2. With blogs like this around I don't even need website anymore.
    I can just visit here and see all the latest happenings in the world.

    ReplyDelete