Data Warehouse: Four (4) typical transformation and loading scenarios
(a) Key Lookup Scenario
(b) Business Rule Violation Scenario
(c) Data error scenario
(d) Pivoting Scenario
(a) key Lookup Scenario
- You need to load sales transaction data into a retail Data Warehouse
- In the operational system, each product is identified by Uniform Price Codes (UPCs);
- In the Data Warehouse, product is identified by product_id.
- You need to
(1) Map UPC --> PRODUCT_ID; and
(2)As you load the source data, lookup the map table and transform the data accordingly
(b) Business Rule Violation Scenario
- Some UPC Codes are invalid
- In which ways can you implement the transformation and handle the invalid UPCs?
- i.e: new sales data might not have valid UPC codes
(1) useCreate Table As Select (CTAS)
to identify and log invalid UPCs
(2) use an outer join (+) and convert invalid UPCs to NULL
(3) use multi-table INSERT and separate the invalid UPCs to a different table
(1) use CTAS to identify and log invalid UPCs
(2) use an outer join (+) and convert invalid UPCs to NULL
(3)use multi-table INSERT and separate the invalid UPCs to a different table
(c) Data Error Scenarios
- If the quality of the data is unknown, you may face unexpected errors, such as a a failed datatype conversion, or a constraint violation.
- Such errors can happen independently of whether the UPC for the particular record is or not valid.
- Here you can use a DML error logging table to capture unexpected errors.