Showing posts with label Partitioning. Show all posts
Showing posts with label Partitioning. Show all posts

Q&As: Data Warehousing and Partitioning (II)



(Q) What is a Partition-Wise Join? How can SQL statements benefit from Full Partition-Wise Joins?

  • A P-W Join divides a large join into smaller joins between a pair of partitions from the two joined tables.
  • To use P-W Joins:
    • (a) equipartition BOTH tables on their JOIN keys or
    • (b) use Reference partitioning
  • P-W Joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel
  • Full P-W Joins: Divide a large join into smaller joins between a pair of partitions from the two joined tables
  • Tables MUST be equipartitioned on their join keys, or use reference partitioning.


(Q) What are the two most common types of indexes in Data Warehouse environments?

  • B-tree and bitmap indexes
  • Both can be created as local indexes on a partitioned table, inheriting the table's partitining strategy
  • Local nonprefixed indexes can improve performance because many index partitions can be scanned in parallel by range queries on the index key.
SQL> create index cust_last_name_ix ON customers(last_name) LOCAL PARALLEL NOLOGGING;


(Q) How can bitmap index improve performance in a Data Warehouse environment?

  • Bitmap indexes use a very efficient storage mechanism for low cardinality columns.
  • Fact table foreign key column are ideal candidates for bitmap indexes, since generally there are few distinct values relative to the total number of rows in the fact table. (Because dimension tables should not have many rows)


(Q) What is Partition Exchange Load (PEL)? When is it used? What benefits does it provides?

  • Oracle strongly recommends that the larger tables or fact tables in a data warehouse are partitioned.
  • with a partitioned table, you can load data quickly and easily with minimal impact on business users by using the exchange partition command.
  • With the exchange partition command you can swap the data in a non-partitioned table into a particular partition in your partitioned table.
  • NO physical data movement occurs with the exchange partition command.
  • Instead the data dictionary is updated to exchange a pointer from the partition to the table and vice versa.
  • Because there is no physical movement of data, an exchange does not generate redo and undo, making it a VERY fast operation.
  • You can convert by exchanging their data segments:
    • (a) a partition (or subpartition) into a nonpartitioned table and
    • (b) a nonpartitioned table into a partition (or subpartition) of a partitioned table
  • In DWH environments, exchanging partitions facilitates high-speed data loading of new, incremental data into an existing partitioned table.
  • OLTP and DWH also can benefit from exchanging old data partitions out of a partitioned table.
  • In the latter case, the data is purged from the partitioned table without being deleted and can be archived separately afterwards.

(1) create a separate table that looks exactly like a single partition in the existing table (including indexes, constraints and partition schema)
(2) Load the data in the new table, build indexes, enable constraints. This will not impact current users of the existing table
(3) Exchange the existing table partition with the newly created table
i.e. (a) create new partition on existing table 
 SQL> Alter table sales add partition p_sales_jun_2007 
                                                    values less than (to_date('01-JUL-2007', 'DD-MON-YYYY');

(b) create temp table that looks like existing partition 
 SQL> create table sales_jun_2007 COMPRESS FOR OLTP as select * from sales where 1=0;

(c) Populate table and create indexes and constraints 
 SQL> create bitmap index time_id_jun_2007_bix ON sales_jun_2007(time_id) nologging;
 SQL> create bitmap index cust_id_jun_2007_bix ON sales_jun_2007(cust_id) nologging;
 SQL> create bitmap index prod_id_jun_2007_bix ON sales_jun_2007(prod_id) nologging;
 SQL> create bitmap index promo_id_jun_2007_bix ON sales_jun_2007(promo_id) nologging;
 SQL> create bitmap index channel_id_jun_2007_bix ON sales_jun_2007(channel_id) nologging;
 
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT prod_id_fk 
                                            FOREIGN KEY (prod_id) REFERENCES products(prod_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT cust_id_fk 
                                            FOREIGN KEY (cust_id) REFERENCES customers(cust_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT promo_id_fk 
                                            FOREIGN KEY (promo_id) REFERENCES promotions(promo_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT time_id_fk 
                                            FOREIGN KEY (time_id) REFERENCES times(time_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT channel_id_fk 
                                            FOREIGN KEY (channel_id) REFERENCES channels(channel_id);

(d) Exchange partition 
 SQL> Alter table sales EXCHANGE PARTITION p_sales_jun_2007 WITH TABLE sales_jun_2007
   including indexes;

Partition Exchange Load (PEL)



Q&As: Data Warehosing and Partitioning (I)




(Q) In which ways partitions facilitate administrative operations?

  • Partitioning enable administrative operations to work on subsets of data.
  • You can add, organize or drop a partition with minimal or zero interruption to read-only applications
  • Adding a partition: you can defer segment creation, which is useful when the segment is very large
  • Changing partition granularity: You can split or merge partitions.
  • Swapping partitions: you can easily add, remove, or swap a large amount of data quickly
  • Partitioning enhances data access and improve overall application performance:
    • More users can query the system, since each SQL is likely to access smaller amounts of data
    • Complex queries run faster. Less IO access


(Q) List three advantages of using partitioned objects

(1) Backup and recovery can be performed on a lower level of granularity (at the partition level)
(2) Part of the database objects can be placed in compressed storage, while others remain uncompressed
(3) You can transparently store data on different storage tiers to lower costs


(Q) How can the database take advantage of partitioning when executing an SQL?

  • Using parallel executions to speed up queries, DML and DDL.
  • Individual parallel execution servers can work on their own data sets, identified by partition boundaries


(Q) what is Partition pruning?

  • When executing a SQL statement, the Optimizer analyzes FROM and WHERE clauses to eliminate unneeded partitions from the execution plan's access list.
  • In the example below, the database scans only four partitions when executing the query. In an non-partitioned table, all data would have to be scanned.


(Q) What is the difference between static pruning and dynamic pruning?

  • Static pruning - occurs at compile-time.
    • example: SQL stmt with WHERE clause with a constant literal on the partition key column
  • Dynamic pruning - occurs at run-time, when the exact partitions to be accessed are not known before hand
    • example: SQL stmt with operator or function in the WHERE condition


(Q) How can you identify when static pruning is used in an execution plan?

  • In the execution plan, the OPERATION column shows PARTITION RANGE SINGLE, indicating that only a single (or range of) partition(s) is being accessed.
  • If OPERATION = PARTITION RANGE ALL, then ALL partitions are being accessed
select * from sales where time_id = to_date('01-jan-2001', 'dd-mon-yyyy');
... Returns 766 rows

explain plan for select * from sales where time_id = to_date('01-jan-2001', 'dd-mon-yyyy');
plan FOR succeeded

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------
Plan hash value: 3733446669

---------------------------------------------------------------------------------------------------------------------  
| Id | Operation                          | Name           | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------- 
|  0 | SELECT STATEMENT                   |                |  766 | 22980 |    27   (0)| 00:00:01 |       |       |
|  1 |  PARTITION RANGE SINGLE            |                |  766 | 22980 |    27   (0)| 00:00:01 |    17 |    17 |
|  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |  766 | 22980 |    27   (0)| 00:00:01 |    17 |    17 |
|  3 |    BITMAP CONVERSION TO ROWIDS     |                |      |       |            |          |       |       |
|* 4 |     BITMAP INDEX SINGLE VALUE      | SALES_TIME_BIX |      |       |            |          |    17 |    17 |
--------------------------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):                
----------------------------------------
   4 - access("TIME_ID"=TO_DATE(' 2001-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))      

 16 rows selected 


(Q) What types of SQL statements are candidates for partition pruning?

  • On Range or List partitioned columns: When you use range, LIKE, equality, and IN-list predicates
  • On Hash Partitioned columns: When you use equlity and IN-list predicates


(Q) How can datatype conversions affect the type of partition pruning that may be used?

  • Datatype conversion typically lead to dynamic, when static pruning would have otherwise been possible
  • Static pruning provides better performance than dynamic pruning
  • example: Proper use of TO_DATE function guarantees that the database can uniquely determine the date value and thus the partition that should be accessed
Compare the following queries:
SELECT SUM(amount_sold) total_revenue FROM sh.sales
(clause 1)   WHERE time_id between '01-JAN-00' and '31-DEC-00' ;

(clause 2)   WHERE time_id between '01-JAN-2000' and '31-DEC-2000' ;

(clause 3)   WHERE time_id between to_date('01-JAN-2000', 'dd-MON-yyyy') and 
                                   to_date('31-DEC-2000', 'dd-MON-yyyy');

(Clause 1:) 
Leads to dynamic pruning, because date needs to be implicitly converted and does not match nls_date_format

SQL> explain plan for 
 SELECT SUM(amount_sold) total_revenue 
 FROM sales
 WHERE time_id between '01-JAN-00' and '31-DEC-00';

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------
Plan hash value: 3772476237   

---------------------------------------------------------------------------------------------------- 
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
---------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT           |       |     1 |    13 |   592  (19)| 00:00:08 |       |       | 
|   1 |  SORT AGGREGATE            |       |     1 |    13 |            |          |       |       | 
|*  2 |   FILTER                   |       |       |       |            |          |       |       | 
|   3 |    PARTITION RANGE ITERATOR|       |   230K|  2932K|   592  (19)| 00:00:08 |   KEY |   KEY | 
|*  4 |     TABLE ACCESS FULL      | SALES |   230K|  2932K|   592  (19)| 00:00:08 |   KEY |   KEY | 
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
                                                  
   2 - filter(TO_DATE('01-JAN-00')<=TO_DATE('31-DEC-00'))
   4 - filter("TIME_ID">='01-JAN-00' AND "TIME_ID"<='31-DEC-00')

 17 rows selected 


(Clause 2:) 
Uses STATIC pruning, because although date needs to be implicitly converted is DOES match nls_date_format

SQL> explain plan for 
 SELECT SUM(amount_sold) total_revenue 
 FROM sales
 WHERE time_id between '01-JAN-2000' and '31-DEC-2000';

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT 
-------------------------------------------------------------------------------------------------
Plan hash value: 1500327972 

--------------------------------------------------------------------------------------------------- 
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
--------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT          |       |     1 |    13 |   136  (11)| 00:00:02 |       |       | 
|   1 |  SORT AGGREGATE           |       |     1 |    13 |            |          |       |       | 
|   2 |   PARTITION RANGE ITERATOR|       |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 | 
|*  3 |    TABLE ACCESS FULL      | SALES |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 | 
--------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):  
---------------------------------------------------  

   3 - filter("TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

 15 rows selected 

(Clause 3:) 
Uses STATIC pruning, because although date is explicitly converted

SQL> Explain plan for
SELECT SUM(amount_sold) total_revenue 
FROM sales
   WHERE time_id between to_date('01-JAN-2000', 'dd-MON-yyyy') and 
                         to_date('31-DEC-2000', 'dd-MON-yyyy');

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT 
-------------------------------------------
Plan hash value: 1500327972

---------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |     1 |    13 |   136  (11)| 00:00:02 |       |       |
|   1 |  SORT AGGREGATE           |       |     1 |    13 |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR|       |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 |
|*  3 |    TABLE ACCESS FULL      | SALES |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 |
---------------------------------------------------------------------------------------------------
                                                                                                  
Predicate Information (identified by operation id): 
--------------------------------------------------- 

   3 - filter("TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

 15 rows selected 

Q&As: Partitioning




(Q) Which of the following sentences about partitioning and data compression are true?

Basic data distribution methods: Range, Hash, and List
(a) Single-Level Partitioning
  • Range partitioning
  • Hash Partitioning
  • List Partitioning
(b) Composite Partitioning
  • Range-Range
  • Range-Hash
  • Range-List
  • List-Range
  • List-Hash
  • List-List
(c) Interval Partitioning
(d) System Partitioning
(e) Reference Partitioning
(f) Virtual column-based partitioning


(Q) Which of the following sentences about partitioning and data compression are true?

( ) Data in a partitioned table can be compressed on a partition-by-partition basis
( ) When you alter a partition to enable compression, compression is applied only to future data to be inserted
( ) When you enable compression on a partition, all data in the partition, including previously inserted data, is compressed
( ) Before adding a compressed partition to a partitioned table, you must either drop or make UNUSABLE all existing bitmap indexes
( ) You DO NOT have to drop or make unusable b-tree indexes in a table before adding a compressed partition