
Using Oracle External Tables



  • External tables do not reside in the database, and can be in any format for which an access driver is provided.
  • You can select, join, or sort external table data.
  • You can also create views and synonyms for external tables.
  • You CANNOT execute DML operations (UPDATE, INSERT, or DELETE) against external tables, and CANNOT create indexes on them.
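
For example, once an external table exists (such as EXT_CUSTOMERS1, created in Scenario 1 below), it can be queried and wrapped like any read-only table; the filter value and view/synonym names below are illustrative:

SQL> SELECT cust_id, cust_last_name FROM ext_customers1 WHERE cust_year_of_birth > 1980;

SQL> CREATE OR REPLACE VIEW v_ext_customers AS
     SELECT cust_id, cust_first_name, cust_last_name, cust_city
     FROM   ext_customers1;

SQL> CREATE SYNONYM ext_cust FOR ext_customers1;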

External tables and Data Warehouse
  • Exporting data from a database:
    • External tables provide a framework to unload the result of an arbitrary SELECT statement into a platform-independent Oracle-proprietary format that can be used by Oracle Data Pump.
  • External tables provide a valuable means for performing basic extraction, transformation, and loading (ETL) tasks that are common for data warehousing.

Creating External tables
  • CREATE TABLE...ORGANIZATION EXTERNAL statement.
  • This statement creates only metadata in the data dictionary.
  • External tables can be thought of as views that allow running any SQL query against external data without requiring that the external data first be loaded into the database.
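
A minimal skeleton of the statement (table, column and file names here are illustrative; a complete example appears in Scenario 1 below):

CREATE TABLE ext_demo
(
   id    NUMBER,
   name  VARCHAR2(40)
)
ORGANIZATION EXTERNAL
(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY data_pump_dir
   ACCESS PARAMETERS
   (
      records delimited by newline
      fields terminated by '|'
      missing field values are null
   )
   LOCATION ('demo.dat')
)
REJECT LIMIT UNLIMITED;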

Accessing External Data
  • An access driver is the actual mechanism used to read the external data in the table.
  • When you use external tables to unload data, the metadata is automatically created based on the data types in the SELECT statement.

Oracle Database provides two access drivers for external tables:
  • ORACLE_LOADER driver (default)
    • allows the reading of data from external files using the Oracle loader technology.
    • provides data mapping capabilities which are a subset of the control file syntax of SQL*Loader utility.
  • ORACLE_DATAPUMP driver
    • lets you unload data (read data from the database and insert it into an external table, represented by one or more external files) and then reload it into an Oracle Database.








Scenario 1:
Unload data to flat files and create External Table.
ORACLE_LOADER driver









Step 1 - Export data from table sh.customers

/* Step 1 - Export data from table sh.customers 
   Procedure UNLOAD_DATA receives table_name and table_owner as parameters and 
     unloads the table data to a flat file */

set serveroutput on

DECLARE
  procedure unload_data( 
       p_table_name in varchar2,
       p_owner  in varchar2)
  IS
   sqltext varchar2(1000);
   v_textrecord varchar2(2000);
   cv sys_refcursor;

   fdest utl_file.file_type;
   type FileAttrRec is Record (
                      vfilexists BOOLEAN,
                      vfilelength number,
                      vblocksize  binary_integer);
   vfilerec fileattrrec;
   vfilename varchar2(30);

    procedure get_sql (
       p_sqltext in out varchar2,
       p_table_name in varchar2,
       p_owner in varchar2) 
    IS
      cursor c is
        select column_name
        from dba_tab_columns
        where table_name = p_table_name and owner = p_owner
        order by column_id;
      i integer;
      sql_stmt varchar2(2000);
      from_clause varchar2(100);
    begin
      i :=0;
      sql_stmt := 'SELECT ';
      from_clause :=  'FROM ' || p_owner ||'.' || p_table_name;
      for r in c loop
        if i = 0 then
          sql_stmt := sql_stmt || ' ' || r.column_name;
          i := 1;
        else
          sql_stmt := sql_stmt || '|| ''|''|| ' || r.column_name;
        end if;
      end loop;
      sql_stmt := sql_stmt || ' as qrystr ' || chr(10) || from_clause;
      p_sqltext := sql_stmt;
      -- dbms_output.put_line( sql_stmt);
    end get_sql; 

  begin
  -- 1. Get Query SQL
   dbms_output.enable(20000);
   sqltext := '';  
   get_sql(sqltext, p_table_name, p_owner);
   --dbms_output.put_line('Select stmt:' || sqltext); 

  -- 2. Open cursor for query. Write each row in the OS file
       vfilename := 'unload_' || p_owner ||'_'|| p_table_name || '.dat';
       utl_file.fgetattr('DATA_PUMP_DIR', vfilename, vfilerec.vfilexists, 
                              vfilerec.vfilelength, vfilerec.vblocksize);
       if vfilerec.vfilexists then 
          fdest := utl_file.fopen('DATA_PUMP_DIR', vfilename, 'a', 2048);
          dbms_output.put_line('Destination file exists. Appending..'); 
       else
          fdest := utl_file.fopen('DATA_PUMP_DIR', vfilename, 'w', 2048);
       end if;
  
   open cv for sqltext;
   LOOP
       fetch cv into v_textrecord;
       EXIT WHEN cv%NOTFOUND;
       utl_file.put_line(fdest, v_textrecord, true);
   END LOOP;
   utl_file.fclose(fdest);
   close cv; 
  END unload_data;

BEGIN
 unload_data('CUSTOMERS' , 'SH');
END;
/


Step 2 - Split unloaded file and create external table using the resulting files

$ ls -ltr
total 482928
...
-rw-r--r-- 1 oracle oinstall  10661903 2012-03-20 22:01 unload_SH_CUSTOMERS.dat

-- splitting the file into two smaller files
$ split -30000 unload_SH_CUSTOMERS.dat unload_SH_CUSTOMERS.dat
$ ls -ltr
total 503760
...
-rw-r--r-- 1 oracle oinstall  10661903 2012-03-20 22:01 unload_SH_CUSTOMERS.dat
-rw-r--r-- 1 oracle oinstall   5768182 2012-03-20 22:15 unload_SH_CUSTOMERS.dataa
-rw-r--r-- 1 oracle oinstall   4893721 2012-03-20 22:15 unload_SH_CUSTOMERS.datab

 Step 3 - Create external table using the unloaded file(s)
 
 -- create the external table
SQL>  Create table ext_customers1
     ( 
         "CUST_ID"                NUMBER,
         "CUST_FIRST_NAME"        VARCHAR2(20 BYTE),
         "CUST_LAST_NAME"         VARCHAR2(40 BYTE),
         "CUST_GENDER"            CHAR(1 BYTE),
         "CUST_YEAR_OF_BIRTH"     NUMBER(4,0),
         "CUST_MARITAL_STATUS"    VARCHAR2(20 BYTE),
         "CUST_STREET_ADDRESS"    VARCHAR2(40 BYTE),
         "CUST_POSTAL_CODE"       VARCHAR2(10 BYTE),
         "CUST_CITY"              VARCHAR2(30 BYTE),
         "CUST_CITY_ID"           NUMBER,
         "CUST_STATE_PROVINCE"    VARCHAR2(40 BYTE),
         "CUST_STATE_PROVINCE_ID" NUMBER,
         "COUNTRY_ID"             NUMBER,
         "CUST_MAIN_PHONE_NUMBER" VARCHAR2(25 BYTE),
         "CUST_INCOME_LEVEL"      VARCHAR2(30 BYTE),
         "CUST_CREDIT_LIMIT"      NUMBER,
         "CUST_EMAIL"             VARCHAR2(30 BYTE),
         "CUST_TOTAL"             VARCHAR2(14 BYTE),
         "CUST_TOTAL_ID"          NUMBER,
         "CUST_SRC_ID"            NUMBER,
         "CUST_EFF_FROM" DATE,
         "CUST_EFF_TO" DATE,
         "CUST_VALID" VARCHAR2(1 BYTE)
       )
  ORGANIZATION EXTERNAL 
  (
     TYPE ORACLE_LOADER
     DEFAULT DIRECTORY data_pump_dir
     ACCESS PARAMETERS
      (
          records delimited by newline
          badfile data_pump_dir:'cust1ext%a_%p.bad' 
          logfile data_pump_dir:'cust1ext%a_%p.log'
          fields terminated by '|'
          missing field values are null
      )
     LOCATION ('unload_SH_CUSTOMERS.dataa', 'unload_SH_CUSTOMERS.datab')
  )
  PARALLEL
  REJECT LIMIT UNLIMITED;

table EXT_CUSTOMERS1 created.

SQL> select table_name, type_name, default_directory_name, reject_limit, access_parameters, property 
 from dba_external_tables;
 
TABLE_NAME      TYPE_NAME       DEFAULT_DIRECTORY_NAME   REJECT_LIMIT    ACCESS_PARAMETERS                       
--------------- --------------- ------------------------ --------------- ------------------------------------------
EXT_CUSTOMERS1   ORACLE_LOADER   DATA_PUMP_DIR            UNLIMITED       records delimited by newline             
                                                                          badfile data_pump_dir:'cust1ext%a_%p.bad'
                                                                          logfile data_pump_dir:'cust1ext%a_%p.log' 
                                                                          fields terminated by '|'                  
                                                                          missing field values are null             
 

SQL> select count(*) from ext_customers1;

  COUNT(*)
----------
     55500









Scenario 2: Unloading and Loading Data with the ORACLE_DATAPUMP Access Driver






Step 1
  • (a) Use CREATE TABLE ... ORGANIZATION EXTERNAL ... AS SELECT ...
  • This will create a file with the data resulting from the specified query.
  • The file is created in a binary format that can only be read by the ORACLE_DATAPUMP access driver.
  • The example below creates an external table and populates the dump file for the external table with the data from table sh.customers.

connect sh/**;

SQL> create table customers1_xt
  organization external
  ( 
    type oracle_datapump
    default directory data_pump_dir
    location ('customers1_xt.dmp')
  )
  as select * from sh.customers;

table created.

SQL> 


(b) Check the OS files created: 

$ ls -ltr
total 60672
-rw-r----- 1 oracle oinstall 10334208 2012-03-21 00:33 customers1_xt.dmp
-rw-r--r-- 1 oracle oinstall      123 2012-03-21 00:36 CUSTOMERS1_XT_7165.log

(c) Compare descriptions of customers and customers1_xt tables.

SQL> desc customers;

Name                   Null     Type         
---------------------- -------- ------------ 
CUST_ID                NOT NULL NUMBER       
CUST_FIRST_NAME        NOT NULL VARCHAR2(20) 
CUST_LAST_NAME         NOT NULL VARCHAR2(40) 
CUST_GENDER            NOT NULL CHAR(1)      
CUST_YEAR_OF_BIRTH     NOT NULL NUMBER(4)    
CUST_MARITAL_STATUS             VARCHAR2(20) 
CUST_STREET_ADDRESS    NOT NULL VARCHAR2(40) 
CUST_POSTAL_CODE       NOT NULL VARCHAR2(10) 
CUST_CITY              NOT NULL VARCHAR2(30) 
CUST_CITY_ID           NOT NULL NUMBER       
CUST_STATE_PROVINCE    NOT NULL VARCHAR2(40) 
CUST_STATE_PROVINCE_ID NOT NULL NUMBER       
COUNTRY_ID             NOT NULL NUMBER       
CUST_MAIN_PHONE_NUMBER NOT NULL VARCHAR2(25) 
CUST_INCOME_LEVEL               VARCHAR2(30) 
CUST_CREDIT_LIMIT               NUMBER       
CUST_EMAIL                      VARCHAR2(30) 
CUST_TOTAL             NOT NULL VARCHAR2(14) 
CUST_TOTAL_ID          NOT NULL NUMBER       
CUST_SRC_ID                     NUMBER       
CUST_EFF_FROM                   DATE         
CUST_EFF_TO                     DATE         
CUST_VALID                      VARCHAR2(1)


SQL> desc customers1_xt;

Name                   Null     Type         
---------------------- -------- ------------ 
CUST_ID                NOT NULL NUMBER       
CUST_FIRST_NAME        NOT NULL VARCHAR2(20) 
CUST_LAST_NAME         NOT NULL VARCHAR2(40) 
CUST_GENDER            NOT NULL CHAR(1)      
CUST_YEAR_OF_BIRTH     NOT NULL NUMBER(4)    
CUST_MARITAL_STATUS             VARCHAR2(20) 
CUST_STREET_ADDRESS    NOT NULL VARCHAR2(40) 
CUST_POSTAL_CODE       NOT NULL VARCHAR2(10) 
CUST_CITY              NOT NULL VARCHAR2(30) 
CUST_CITY_ID           NOT NULL NUMBER       
CUST_STATE_PROVINCE    NOT NULL VARCHAR2(40) 
CUST_STATE_PROVINCE_ID NOT NULL NUMBER       
COUNTRY_ID             NOT NULL NUMBER       
CUST_MAIN_PHONE_NUMBER NOT NULL VARCHAR2(25) 
CUST_INCOME_LEVEL               VARCHAR2(30) 
CUST_CREDIT_LIMIT               NUMBER       
CUST_EMAIL                      VARCHAR2(30) 
CUST_TOTAL             NOT NULL VARCHAR2(14) 
CUST_TOTAL_ID          NOT NULL NUMBER       
CUST_SRC_ID                     NUMBER       
CUST_EFF_FROM                   DATE         
CUST_EFF_TO                     DATE         
CUST_VALID                      VARCHAR2(1)  



(d) Check data on customers1_xt and compare with original table

SQL> select count(*) from customers1_xt;

COUNT(*)               
---------------------- 
55500

SQL> select * from customers minus select * from customers1_xt;

no rows selected


(e) You CANNOT perform DML on the external table:

SQL> delete from customers1_xt where cust_id = 100055;
delete from customers1_xt where cust_id = 100055
            *
ERROR at line 1:
ORA-30657: operation not supported on external organized table


(f) You can also create an external table with multiple dump files and with a degree of parallelism higher than one.

SQL> create table customers1_xtp3
 organization external
 (
   type oracle_datapump
   default directory data_pump_dir
   location('customers1_xtp31.dmp' ,'customers1_xtp32.dmp' ,'customers1_xtp33.dmp' )
 )
 parallel 3
 as select * from sh.customers;

$ ls -ltr
total 60672
-rw-r----- 1 oracle oinstall 10334208 2012-03-21 00:33 customers1_xt.dmp
-rw-r--r-- 1 oracle oinstall      123 2012-03-21 00:36 CUSTOMERS1_XT_7165.log
-rw-r--r-- 1 oracle oinstall       41 2012-03-21 10:13 CUSTOMERS1_XTP3_2690.log
-rw-r--r-- 1 oracle oinstall       41 2012-03-21 10:13 CUSTOMERS1_XTP3_2737.log
-rw-r--r-- 1 oracle oinstall       41 2012-03-21 10:13 CUSTOMERS1_XTP3_2739.log
-rw-r--r-- 1 oracle oinstall       41 2012-03-21 10:13 CUSTOMERS1_XTP3_2741.log
-rw-r----- 1 oracle oinstall  2990080 2012-03-21 10:13 customers1_xtp32.dmp
-rw-r----- 1 oracle oinstall  3792896 2012-03-21 10:13 customers1_xtp33.dmp
-rw-r----- 1 oracle oinstall  3592192 2012-03-21 10:13 customers1_xtp31.dmp


(g) You can transfer or copy the dump file(s) and use them for another external table either in the same or in a different database. 

SQL> Create table customers2_xt
     ( 
         "CUST_ID"                NUMBER,
         "CUST_FIRST_NAME"        VARCHAR2(20 BYTE),
         "CUST_LAST_NAME"         VARCHAR2(40 BYTE),
         "CUST_GENDER"            CHAR(1 BYTE),
         "CUST_YEAR_OF_BIRTH"     NUMBER(4,0),
         "CUST_MARITAL_STATUS"    VARCHAR2(20 BYTE),
         "CUST_STREET_ADDRESS"    VARCHAR2(40 BYTE),
         "CUST_POSTAL_CODE"       VARCHAR2(10 BYTE),
         "CUST_CITY"              VARCHAR2(30 BYTE),
         "CUST_CITY_ID"           NUMBER,
         "CUST_STATE_PROVINCE"    VARCHAR2(40 BYTE),
         "CUST_STATE_PROVINCE_ID" NUMBER,
         "COUNTRY_ID"             NUMBER,
         "CUST_MAIN_PHONE_NUMBER" VARCHAR2(25 BYTE),
         "CUST_INCOME_LEVEL"      VARCHAR2(30 BYTE),
         "CUST_CREDIT_LIMIT"      NUMBER,
         "CUST_EMAIL"             VARCHAR2(30 BYTE),
         "CUST_TOTAL"             VARCHAR2(14 BYTE),
         "CUST_TOTAL_ID"          NUMBER,
         "CUST_SRC_ID"            NUMBER,
         "CUST_EFF_FROM" DATE,
         "CUST_EFF_TO" DATE,
         "CUST_VALID" VARCHAR2(1 BYTE)
       )
  ORGANIZATION EXTERNAL 
  (
     TYPE ORACLE_DATAPUMP
     DEFAULT DIRECTORY data_pump_dir
     LOCATION ('customers1_xtp31.dmp' ,'customers1_xtp32.dmp' ,'customers1_xtp33.dmp')
  );

SQL> select count(*) from customers2_xt;

COUNT(*)               
---------------------- 
55500   

SQL> Select * from sh.customers minus select * from customers2_xt;

no rows selected.

Data Warehouse: Basic Concepts (I)



Common requirements of Data Warehouse systems

  • Must provide easy access to the organization's information
  • Show consistency in the display of information
  • Must be adaptive and resilient to change
  • Must safely keep information
  • Must serve as a foundation for improved decision making
  • Must be accepted by the business community in the enterprise


Common components of a Data Warehouse (Kimball)

Four components:
(1) Operational Source System
(2) Data Staging Area
(3) Data Presentation Area
(4) Data Access Tools

"One of the biggest threats to DWH success is confusing the components' roles and functions."
e.g., don't give end users access to the Staging Area, and don't build the DWH presentation area on a normalized design.




Characteristics of source operational systems

  • Main priorities of the source operational system are processing performance and availability
  • Queries against source systems are narrow, one-record-at-a-time queries, and are severely restricted in their demands on the operational system
  • Source systems are not queried in broad and unexpected ways: queries are tuned, specific, and known beforehand.
  • Source systems maintain little historical data



(Q) What are two key architectural requirements for the staging area?

  • (a) Staging area is off-limits to business users
  • (b) Does not provide query and presentation services
  • (c) Does not (should not) need to have a normalized structure to hold data


(Q) What are some common characteristics of the staging area?

  • Dominated by tasks such as sorting and sequential processing
  • Often not based on relational technology but instead on many flat files
  • Usually most of the tasks of extraction, transformation and loading (ETL) of data from operational systems into the data warehouse are concentrated in the staging area.
  • Tables in the staging area should be segregated from the "live" data warehouse, i.e., they should NOT be available for end users' queries
  • A basic implementation here is to have an identical schema to the one that exists in the source operational system(s) but with some structural changes to the tables, such as range partitioning.
  • Alternatively, in some implementations all data transformation processing is done “on the fly” as data is extracted from the source system before it is inserted directly into the Presentation area.


(Q) What are four common transformations done (usually in the staging area) to the extracted data during the ETL process?

  • (a) Cleansing (correct misspellings, resolve domain conflicts, deal with missing data, reformatting)
  • (b) Combining data from multiple sources
  • (c) Deduplicating data
  • (d) Assigning warehouse keys


(Q) What are some performance considerations when planning the loading stage of the ETL process?

  • The goal in this phase is to load the data into the warehouse in the most expedient manner.
  • Careful consideration should be given to where the data being loaded resides and how you load it into the database.
  • For example, do not use a serial database link or a single JDBC connection to move large volumes of data.
  • Flat files are the most common and preferred mechanism for loading large volumes of data.

  • The overall speed of your load will be determined by
    • (A) how quickly the raw data can be read from the staging area and
    • (B) how fast it can be processed and inserted into the database.
  • You should stage the raw data across as many physical disks as possible to ensure that reading it does not become a bottleneck during the load.


(Q) What tasks are usually part of the loading of the Presentation area?

  • Load each data Mart
  • Index new data for query performance
  • Supply/compute appropriate aggregates (i.e. build or refresh Materialized Views)
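
As an illustration of the last point, a sketch of one such pre-computed aggregate (the materialized view name and refresh options are illustrative, not from the original post):

CREATE MATERIALIZED VIEW sales_by_month_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
  SELECT t.calendar_month_desc,
         p.prod_category,
         SUM(s.amount_sold) AS amount_sold,
         COUNT(*)           AS cnt
  FROM   sh.sales s, sh.times t, sh.products p
  WHERE  s.time_id = t.time_id
  AND    s.prod_id = p.prod_id
  GROUP  BY t.calendar_month_desc, p.prod_category;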


(Q) What is a Data Mart? How it relates to the Presentation Area and to the Data Warehouse?

  • A Data Mart is a section of the overall presentation area, which usually consists of a series of integrated data marts.
  • At its simplest form, it presents the data from a single business process
  • Modern Data Marts may be updated. Changes in labels, hierarchies, status, and corporate ownership may trigger changes
  • These changes are managed-load updates, however.
  • The Data Warehouse presentation area in a large enterprise may consist of 20 or more similar-looking data marts.
  • Each data mart is built on similar dimensional models.
  • Each data mart contains various fact tables, each with 5-15 dimension tables that are often shared among the various fact tables.


(Q) What is dimensional modeling? How does it relate to 3NF models?

  • The key difference between dimensional and 3NF models is the degree of normalization.
  • Both types can be represented in E-R diagrams, but 3NF models are intended to reduce data redundancy.
  • Dimensional model contains the same information as a normalized model.
  • Dimensional model's goals are understandability, query performance and resilience to change.
  • 3NF modeling boosts performance of operational systems: an update or insert transaction only needs to touch the database in one place.
  • UPDATES or INSERTS do not have to touch (read/write) many different structures in the database.
    3NF models are too complex for DWH queries.
  • Dimensional modeling is applicable to both relational and multidimensional databases.


(Q) What are some key goals of dimensional modeling?

Dimensional model's goals are (a) understandability, (b) query performance and (c) resilience to change.


(Q) What does the concept of metadata represent in a Data Warehouse context?

Metadata is all the information in the Data Warehouse environment that is NOT the actual data itself


(Q) What types of metadata are there?

  • (1) Operational source system metadata - Source schemas, copybooks that facilitate the extraction process
  • (2) Staging area metadata - Used to guide the transformation and loading processes. Includes
    • staging file layouts
    • Target table layouts
    • transformation and cleaning rules
    • conformed dimension and fact definitions
    • aggregation definitions
    • ETL transmission schedules and run-log results
  • (3) DBMS metadata - System tables, partition settings, indexes and view definitions, security privileges and grants
  • (4) data access tools metadata - Identify (a) business names and definitions for the presentation area's tables and columns; (b) application template specifications; (c) access and usage statistics
  • (5) Security Settings metadata - From Source transactional data all the way to user desktops



(Q) Consider the vocabulary of dimensional modeling: What are Fact tables?

  • Fact tables are the primary tables in a dimensional model, and store numerical performance measurements of the business
  • Measurement data resulting from a business process should be stored in a single data mart.
  • A Fact represents a business measure.
  • A measurement is taken at the intersection of all the dimensions (day, product, store, channel), which defines the grain of the measurement
  • All measurements in a fact table should have the same grain.
  • The example below shows the SALES fact table and its associated dimension tables.

(Q) What types of facts can be stored in a fact table?
  • Additive - can be rolled up (i.e. amount_sold, quantity_sold)
  • Semiadditive - can be added only along some of the dimensions
  • Nonadditive - cannot be added; averages and counts are possible.

(Q) What are the categories into which fact table grains fall?
  • Transaction - (i.e. amount_sold and quantity_sold)
  • Periodic Snapshot
  • Accumulating Snapshot
  • In the image below, quantity and amount are transactional additive facts

(Q) What are Dimension tables?
  • Contain the textual descriptors of the business
  • Tables should have as many meaningful columns as possible
  • sometimes 50-100 attributes
  • attributes here are the primary source of query constraints, groupings, and report labels
  • Attributes often represent hierarchical relationships (i.e. product -> brand -> Category)
  • Dimension tables are typically highly denormalized
  • Check the attributes of TIMES and CUSTOMERS in the example below. Denormalization is extensive.

(Q) How do dimension tables improve the quality of a Data Warehouse?
  • Dimension tables should have many and robust attributes, all relevant to the business.
  • Robust dimension attributes deliver robust analytic slicing and dicing capabilities
  • Dimensions implement the user interface to the Data Warehouse

(Q) What characteristics should Dimension Attributes have?
  • Attributes should consist of real words rather than cryptic abbreviations

(Q) When designing a Data Warehouse, how can you decide whether a data field extracted from an operational system belongs to a Fact table or to a Dimension table?
  • Ask whether
    • (a) the field is a measurement that takes on lots of values and participates in calculations (FACT)
    • (b) the field is a discretely valued description that is more or less constant and participates in constraints (DIMENSION)

Q&As: Data Warehousing and Partitioning (II)



(Q) What is a Partition-Wise Join? How can SQL statements benefit from Full Partition-Wise Joins?

  • A P-W Join divides a large join into smaller joins between a pair of partitions from the two joined tables.
  • To use P-W Joins:
    • (a) equipartition BOTH tables on their JOIN keys or
    • (b) use Reference partitioning
  • P-W Joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel
  • Full P-W Joins: Divide a large join into smaller joins between a pair of partitions from the two joined tables
  • Tables MUST be equipartitioned on their join keys, or use reference partitioning.
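
A minimal sketch of the equipartitioning requirement (hypothetical tables; both are hash partitioned on the join key, so each pair of matching partitions can be joined independently):

CREATE TABLE cust_pwj (
    cust_id   NUMBER,
    cust_name VARCHAR2(40)
)
PARTITION BY HASH (cust_id) PARTITIONS 8;

CREATE TABLE sales_pwj (
    sale_id     NUMBER,
    cust_id     NUMBER,
    amount_sold NUMBER
)
PARTITION BY HASH (cust_id) PARTITIONS 8;

-- A join on the partitioning key can now be executed as a full partition-wise join
SELECT c.cust_name, SUM(s.amount_sold)
FROM   cust_pwj c, sales_pwj s
WHERE  c.cust_id = s.cust_id
GROUP  BY c.cust_name;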


(Q) What are the two most common types of indexes in Data Warehouse environments?

  • B-tree and bitmap indexes
  • Both can be created as local indexes on a partitioned table, inheriting the table's partitioning strategy
  • Local nonprefixed indexes can improve performance because many index partitions can be scanned in parallel by range queries on the index key.
SQL> create index cust_last_name_ix ON customers(last_name) LOCAL PARALLEL NOLOGGING;


(Q) How can bitmap index improve performance in a Data Warehouse environment?

  • Bitmap indexes use a very efficient storage mechanism for low cardinality columns.
  • Fact table foreign key columns are ideal candidates for bitmap indexes, since there are generally few distinct values relative to the total number of rows in the fact table. (Because dimension tables should not have many rows.) For example:
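Local bitmap indexes on the foreign key columns of sh.sales (a sketch, assuming the fact table is partitioned so the indexes can be LOCAL):

SQL> create bitmap index sales_cust_bix ON sales(cust_id) LOCAL NOLOGGING;
SQL> create bitmap index sales_prod_bix ON sales(prod_id) LOCAL NOLOGGING;
SQL> create bitmap index sales_time_bix ON sales(time_id) LOCAL NOLOGGING;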


(Q) What is Partition Exchange Load (PEL)? When is it used? What benefits does it provide?

  • Oracle strongly recommends that the larger tables or fact tables in a data warehouse be partitioned.
  • With a partitioned table, you can load data quickly and easily with minimal impact on business users by using the exchange partition command.
  • With the exchange partition command you can swap the data in a non-partitioned table into a particular partition in your partitioned table.
  • NO physical data movement occurs with the exchange partition command.
  • Instead, the data dictionary is updated so that the pointers to the partition's and the table's data segments are swapped.
  • Because there is no physical movement of data, an exchange does not generate redo and undo, making it a VERY fast operation.
  • You can convert by exchanging their data segments:
    • (a) a partition (or subpartition) into a nonpartitioned table and
    • (b) a nonpartitioned table into a partition (or subpartition) of a partitioned table
  • In DWH environments, exchanging partitions facilitates high-speed data loading of new, incremental data into an existing partitioned table.
  • OLTP and DWH also can benefit from exchanging old data partitions out of a partitioned table.
  • In the latter case, the data is purged from the partitioned table without being deleted and can be archived separately afterwards.

(1) create a separate table that looks exactly like a single partition in the existing table (including indexes, constraints and partition schema)
(2) Load the data in the new table, build indexes, enable constraints. This will not impact current users of the existing table
(3) Exchange the existing table partition with the newly created table
i.e. (a) create new partition on existing table 
 SQL> Alter table sales add partition p_sales_jun_2007 
                                                    values less than (to_date('01-JUL-2007', 'DD-MON-YYYY'));

(b) create temp table that looks like existing partition 
 SQL> create table sales_jun_2007 COMPRESS FOR OLTP as select * from sales where 1=0;

(c) Populate table and create indexes and constraints 
 SQL> create bitmap index time_id_jun_2007_bix ON sales_jun_2007(time_id) nologging;
 SQL> create bitmap index cust_id_jun_2007_bix ON sales_jun_2007(cust_id) nologging;
 SQL> create bitmap index prod_id_jun_2007_bix ON sales_jun_2007(prod_id) nologging;
 SQL> create bitmap index promo_id_jun_2007_bix ON sales_jun_2007(promo_id) nologging;
 SQL> create bitmap index channel_id_jun_2007_bix ON sales_jun_2007(channel_id) nologging;
 
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT prod_id_fk 
                                            FOREIGN KEY (prod_id) REFERENCES products(prod_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT cust_id_fk 
                                            FOREIGN KEY (cust_id) REFERENCES customers(cust_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT promo_id_fk 
                                            FOREIGN KEY (promo_id) REFERENCES promotions(promo_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT time_id_fk 
                                            FOREIGN KEY (time_id) REFERENCES times(time_id);
 SQL> ALTER TABLE sales_jun_2007 ADD CONSTRAINT channel_id_fk 
                                            FOREIGN KEY (channel_id) REFERENCES channels(channel_id);

(d) Exchange partition 
 SQL> Alter table sales EXCHANGE PARTITION p_sales_jun_2007 WITH TABLE sales_jun_2007
   including indexes;

Partition Exchange Load (PEL)



Data Guard: Switchover to a Physical Standby Database




(Step 1) Verify that the primary database can be switched to a physical standby database

  • A value of TO STANDBY or SESSIONS ACTIVE indicates that the primary can be switched to a standby role.
SQL> select name, db_unique_name, switchover_status from v$database;

NAME   DB_UNIQUE_NAME   SWITCHOVER_STATUS
--------- ------------------------------ --------------------
DG11G   atlanta    TO STANDBY


(Step 2) Initiate the switchover of the primary database to a physical standby database

  • After completion of this statement, the Data Guard configuration will have two physical standby databases and no primary.
SQL> alter database commit to switchover to physical standby with session shutdown;

Database altered.


(Step 3) Shutdown and then mount the former primary database

  • After completion of this statement, the original primary is a standby database.
SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
...
Database mounted.

SQL> select name, db_unique_name, switchover_status, database_role from v$database;

NAME   DB_UNIQUE_NAME   SWITCHOVER_STATUS    DATABASE_ROLE
--------- ------------------------------ -------------------- ----------------
DG11G   atlanta    TO PRIMARY       PHYSICAL STANDBY


(Step 4) Check that the switchover target is ready to be switched to the primary role

  • A SWITCHOVER_STATUS of TO PRIMARY or SESSIONS ACTIVE indicates that the standby is ready to be switched to the primary role.
SQL> select name, db_unique_name, switchover_status, database_role from v$database;


NAME   DB_UNIQUE_NAME   SWITCHOVER_STATUS    DATABASE_ROLE
--------- ------------------------------ -------------------- ----------------
DG11G   fresno    TO PRIMARY       PHYSICAL STANDBY


(Step 5) Switch the target physical standby database role to the primary role

SQL> alter database commit to switchover to primary with session shutdown;

Database altered.

SQL> select name, db_unique_name, switchover_status, database_role from v$database;

NAME   DB_UNIQUE_NAME   SWITCHOVER_STATUS    DATABASE_ROLE
--------- ------------------------------ -------------------- ----------------
DG11G   fresno    NOT ALLOWED       PRIMARY


(Step 6) Open the new primary database

SQL> alter database open;

Database altered.


(Step 7) Start Redo Apply on the new physical standby database

SQL> select name, db_unique_name, switchover_status, database_role from v$database;

NAME   DB_UNIQUE_NAME   SWITCHOVER_STATUS    DATABASE_ROLE
--------- ------------------------------ -------------------- ----------------
DG11G   atlanta    RECOVERY NEEDED      PHYSICAL STANDBY


SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.



Q&As: Data Warehousing and Partitioning (I)




(Q) In which ways partitions facilitate administrative operations?

  • Partitioning enables administrative operations to work on subsets of data.
  • You can add, reorganize, or drop a partition with minimal or zero interruption to read-only applications
  • Adding a partition: you can defer segment creation, which is useful when the segment is very large
  • Changing partition granularity: You can split or merge partitions (see the example after this list).
  • Swapping partitions: you can easily add, remove, or swap a large amount of data quickly
  • Partitioning enhances data access and improves overall application performance:
    • More users can query the system, since each SQL statement is likely to access smaller amounts of data
    • Complex queries run faster, since less I/O is needed
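
A hedged example of the split and merge maintenance operations mentioned above (partition names and boundary dates are illustrative):

-- Split one range partition into two at a given boundary
ALTER TABLE sales SPLIT PARTITION sales_q1_2000
      AT (TO_DATE('15-FEB-2000','DD-MON-YYYY'))
      INTO (PARTITION sales_jan_feb_2000, PARTITION sales_feb_mar_2000)
      UPDATE INDEXES;

-- Merge the two partitions back into one
ALTER TABLE sales MERGE PARTITIONS sales_jan_feb_2000, sales_feb_mar_2000
      INTO PARTITION sales_q1_2000
      UPDATE INDEXES;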


(Q) List three advantages of using partitioned objects

(1) Backup and recovery can be performed on a lower level of granularity (at the partition level)
(2) Part of the database objects can be placed in compressed storage, while others remain uncompressed
(3) You can transparently store data on different storage tiers to lower costs


(Q) How can the database take advantage of partitioning when executing an SQL?

  • Using parallel executions to speed up queries, DML and DDL.
  • Individual parallel execution servers can work on their own data sets, identified by partition boundaries


(Q) what is Partition pruning?

  • When executing a SQL statement, the Optimizer analyzes FROM and WHERE clauses to eliminate unneeded partitions from the execution plan's access list.
  • In the example below, the database scans only four partitions when executing the query. In a non-partitioned table, all data would have to be scanned.


(Q) What is the difference between static pruning and dynamic pruning?

  • Static pruning - occurs at compile-time.
    • example: SQL stmt with WHERE clause with a constant literal on the partition key column
  • Dynamic pruning - occurs at run-time, when the exact partitions to be accessed are not known beforehand
    • example: SQL stmt with operator or function in the WHERE condition


(Q) How can you identify when static pruning is used in an execution plan?

  • In the execution plan, the OPERATION column shows PARTITION RANGE SINGLE, indicating that only a single (or range of) partition(s) is being accessed.
  • If OPERATION = PARTITION RANGE ALL, then ALL partitions are being accessed
select * from sales where time_id = to_date('01-jan-2001', 'dd-mon-yyyy');
... Returns 766 rows

explain plan for select * from sales where time_id = to_date('01-jan-2001', 'dd-mon-yyyy');
plan FOR succeeded

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------
Plan hash value: 3733446669

---------------------------------------------------------------------------------------------------------------------  
| Id | Operation                          | Name           | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------- 
|  0 | SELECT STATEMENT                   |                |  766 | 22980 |    27   (0)| 00:00:01 |       |       |
|  1 |  PARTITION RANGE SINGLE            |                |  766 | 22980 |    27   (0)| 00:00:01 |    17 |    17 |
|  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |  766 | 22980 |    27   (0)| 00:00:01 |    17 |    17 |
|  3 |    BITMAP CONVERSION TO ROWIDS     |                |      |       |            |          |       |       |
|* 4 |     BITMAP INDEX SINGLE VALUE      | SALES_TIME_BIX |      |       |            |          |    17 |    17 |
--------------------------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):                
----------------------------------------
   4 - access("TIME_ID"=TO_DATE(' 2001-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))      

 16 rows selected 


(Q) What types of SQL statements are candidates for partition pruning?

  • On Range or List partitioned columns: When you use range, LIKE, equality, and IN-list predicates
  • On Hash Partitioned columns: When you use equality and IN-list predicates


(Q) How can datatype conversions affect the type of partition pruning that may be used?

  • Datatype conversions typically lead to dynamic pruning when static pruning would have otherwise been possible
  • Static pruning provides better performance than dynamic pruning
  • example: Proper use of TO_DATE function guarantees that the database can uniquely determine the date value and thus the partition that should be accessed
Compare the following queries:
SELECT SUM(amount_sold) total_revenue FROM sh.sales
(clause 1)   WHERE time_id between '01-JAN-00' and '31-DEC-00' ;

(clause 2)   WHERE time_id between '01-JAN-2000' and '31-DEC-2000' ;

(clause 3)   WHERE time_id between to_date('01-JAN-2000', 'dd-MON-yyyy') and 
                                   to_date('31-DEC-2000', 'dd-MON-yyyy');

(Clause 1:) 
Leads to dynamic pruning, because the date literals need to be implicitly converted and do not match the NLS_DATE_FORMAT

SQL> explain plan for 
 SELECT SUM(amount_sold) total_revenue 
 FROM sales
 WHERE time_id between '01-JAN-00' and '31-DEC-00';

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------
Plan hash value: 3772476237   

---------------------------------------------------------------------------------------------------- 
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
---------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT           |       |     1 |    13 |   592  (19)| 00:00:08 |       |       | 
|   1 |  SORT AGGREGATE            |       |     1 |    13 |            |          |       |       | 
|*  2 |   FILTER                   |       |       |       |            |          |       |       | 
|   3 |    PARTITION RANGE ITERATOR|       |   230K|  2932K|   592  (19)| 00:00:08 |   KEY |   KEY | 
|*  4 |     TABLE ACCESS FULL      | SALES |   230K|  2932K|   592  (19)| 00:00:08 |   KEY |   KEY | 
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
                                                  
   2 - filter(TO_DATE('01-JAN-00')<=TO_DATE('31-DEC-00'))
   4 - filter("TIME_ID">='01-JAN-00' AND "TIME_ID"<='31-DEC-00')

 17 rows selected 


(Clause 2:) 
Uses STATIC pruning: although the date literals still need to be implicitly converted, they DO match the NLS_DATE_FORMAT

SQL> explain plan for 
 SELECT SUM(amount_sold) total_revenue 
 FROM sales
 WHERE time_id between '01-JAN-2000' and '31-DEC-2000';

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT 
-------------------------------------------------------------------------------------------------
Plan hash value: 1500327972 

--------------------------------------------------------------------------------------------------- 
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
--------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT          |       |     1 |    13 |   136  (11)| 00:00:02 |       |       | 
|   1 |  SORT AGGREGATE           |       |     1 |    13 |            |          |       |       | 
|   2 |   PARTITION RANGE ITERATOR|       |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 | 
|*  3 |    TABLE ACCESS FULL      | SALES |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 | 
--------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):  
---------------------------------------------------  

   3 - filter("TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

 15 rows selected 

(Clause 3:) 
Uses STATIC pruning, because the dates are explicitly converted with TO_DATE

SQL> Explain plan for
SELECT SUM(amount_sold) total_revenue 
FROM sales
   WHERE time_id between to_date('01-JAN-2000', 'dd-MON-yyyy') and 
                         to_date('31-DEC-2000', 'dd-MON-yyyy');

plan FOR succeeded.

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT 
-------------------------------------------
Plan hash value: 1500327972

---------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |     1 |    13 |   136  (11)| 00:00:02 |       |       |
|   1 |  SORT AGGREGATE           |       |     1 |    13 |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR|       |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 |
|*  3 |    TABLE ACCESS FULL      | SALES |   230K|  2932K|   136  (11)| 00:00:02 |    13 |    16 |
---------------------------------------------------------------------------------------------------
                                                                                                  
Predicate Information (identified by operation id): 
--------------------------------------------------- 

   3 - filter("TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

 15 rows selected 

Data Guard: Switchover to a Physical Standby (II)









Data Warehouse: Concepts on ETL Processes (Extraction)







(Q) Why can defining and creating the extraction process be a very time-consuming stage in the ETL process?

  • There are often multiple, independent sources, which may be poorly documented; 3rd party applications may not be modifiable.
  • Business requirements on the source may not accept the extra overhead of the extraction process, nor accept changes in its availability.
  • Often you cannot add additional logic to the source systems to support incremental extraction of data.


(Q) What are the two Logical extraction methods?

  • Full Extraction and Incremental Extraction
  • Full Extraction:
    • ALL data is extracted from the source system. Here you don't need to keep track of what has changed since the last extraction.
    • However, it is more costly in space, since the staging area might need to be much larger
  • Incremental Extraction:
    • Here you need to keep track of all data that has changed since the last extraction or some complex business event like the last booking day of a fiscal period.
    • This information may be provided by the source system (i.e. an application column, or a change table). In this case the source system would likely have to be modified to incorporate the change-tracking functionality.
    • Alternatively, Oracle provides a Change Data Capture (CDC) mechanism (with Oracle GoldenGate)


(Q) What two methods of physical extraction are there?

  • Online extraction:
    • Data is extracted directly from the source itself. Extraction process connects directly to the source.
  • Offline extraction:
    • Data is staged explicitly outside the original source system.
    • In this case, the data either has an existing structure (i.e. redo logs, archive logs, or transportable tablespaces) or was created by an extraction routine (i.e. stored in flat files).
    • The alternatives are: flat files, dump files (imp/exp), redo and archived logs (how to read these?), or transportable tablespaces.


(Q) What is the physical extraction technique recommended by Oracle most of the time? Why is it recommended?

  • Oracle recommends the use of transportable tablespaces.
  • (For this, either the source system must be an Oracle database or an extraction routine must be used to copy the data to a "staging" transportable tablespace.)
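
A hedged sketch of the transportable tablespace flow (the directory, tablespace, datafile path and user names below are illustrative, not from the original post):

-- On the source database: make the tablespace read only and export its metadata
SQL> ALTER TABLESPACE sales_stage READ ONLY;

$ expdp system DIRECTORY=data_pump_dir DUMPFILE=sales_stage.dmp TRANSPORT_TABLESPACES=sales_stage

-- Copy the datafile(s) and the dump file to the target system, then plug the tablespace in
$ impdp system DIRECTORY=data_pump_dir DUMPFILE=sales_stage.dmp TRANSPORT_DATAFILES='/u01/app/oracle/oradata/dwh/sales_stage01.dbf'

SQL> ALTER TABLESPACE sales_stage READ WRITE;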


(Q) What is the key-enabling technology for providing near real-time, or on-time, data warehousing?

  • Change Data Capture mechanism
  • Used for incremental extractions (i.e., nightly extractions, which need to extract only data modified since the last extraction).


(Q) What usually is the most challenging technical issue in data extraction?

  • Devising a mechanism of Change Data Capture.

(Q) What are some techniques for implementing a change capture mechanism on Oracle source databases?
  • You can use timestamps, partitioning or triggers. Each can be used whether the data is accessed directly through distributed queries or exported to flat files.
  • Timestamp columns are not always available in an operational source system.
  • In these cases, how could a timestamp solution for change capture be implemented?
    • You would have to add a new column in every system table and write triggers to update the new column each time an operation modifies the row data.
    • Note that trigger-based CDC mechanisms can affect performance of the source system.
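
A minimal sketch of such a trigger-based timestamp mechanism (the ORDERS table, LAST_MODIFIED column and bind variable are hypothetical, for illustration only):

ALTER TABLE orders ADD (last_modified DATE);

CREATE OR REPLACE TRIGGER orders_set_last_modified
BEFORE INSERT OR UPDATE ON orders
FOR EACH ROW
BEGIN
  :NEW.last_modified := SYSDATE;   -- record when the row was last touched
END;
/

-- The incremental extraction can then select only the changed rows:
SELECT * FROM orders WHERE last_modified > :last_extraction_date;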

(Q) What Oracle feature uses an internalized trigger-based mechanism to capture changes?
  • Materialized view logs use an internalized trigger-based mechanism.
  • These logs are used by materialized views to identify changed data.
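
For example, a sketch of creating such a log (the ORDERS table is hypothetical; a fast-refreshable materialized view on it would read this log to find changed rows):

SQL> CREATE MATERIALIZED VIEW LOG ON orders WITH PRIMARY KEY, SEQUENCE;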

(Q) What are the alternatives available for extracting data into files if the source system is an Oracle database?
  • You can use SQL*Plus,
  • Use OCI or a Pro*C program,
  • use the Export utility,
  • use External tables.

(Q) You want to extract data using SQL*Plus from a single partitioned table. How can you parallelize this process?
If the table is range partitioned, you can initiate one SQL*Plus session for each partition.
SPOOL order_jan.dat 
Select * from orders PARTITION (orders_jan1998); 
SPOOL OFF; 

(Q) You want to extract data using SQL*Plus from a single non-partitioned table. What two criteria can you use to parallelize this process?
Logical or physical criteria.
  • Logical method: based on logical ranges of column values.
Select … where order_date 
between to_date('01-JAN-99') and to_date('31-JAN-99'); 
  • Physical method: based on ranges of physical storage (e.g., rowid ranges).
  • Check the data dictionary and find the set of data blocks that make up the table.
  • From here, derive a set of rowid-range queries.
Select … where rowid between v1 and v2;

(Q) What are the differences between extracting data using the Data Pump Export utility and the Data Pump external table mechanism?
  • Export utility:
    • Metadata + data are exported. Output must be processed with the Import utility
  • External table data pump utility:
    • Results of a SELECT can be stored in an oracle-internal data pump format and processed as a regular external table.
    • Metadata is NOT exported. Results can be stored in multiple extraction files.
    • If metadata is required, need to use the DBMS_METADATA package.
    Set LONG 2000 
    Select dbms_metadata.get_ddl('TABLE', 'EXTRACT_CUST') from dual; 
    
    (Q) What is an important scalability limitation to extraction through distributed operations?
    • The data is transported through a single Oracle Net connection.
    • For large amounts of data, file-based data extraction and transportation techniques are often more scalable.

    Snapshot Standby: Role Transition cycle





    Create, test, resynchronize with production, create again




    Q&As: Data Guard, Role Transitions (I)



    (Q) Which 6 steps must be performed to prepare for a role transition?


    (1) Verify that each database is properly configured for the role that it is about to assume

    • Check that necessary initialization parameters are properly configured in each database
      • FAL_SERVER, DB_FILE_NAME_CONVERT, LOG_FILE_NAME_CONVERT
      • LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n
    • Check that ONLINE REDO LOGs and STANDBY REDO LOGs exist and are properly configured on all databases
    SQL> select status, type, member, group#
      2* from v$logfile;
    
    STATUS TYPE MEMBER         GROUP#
    ------- ------- ------------------------------------------------------------ ----------
     ONLINE /u01/app/oracle/oradata/fresno/redo03.log         3
     ONLINE /u01/app/oracle/oradata/fresno/redo02.log         2
     ONLINE /u01/app/oracle/oradata/fresno/redo01.log         1
     STANDBY /u01/app/oracle/oradata/fresno/standby_redo01.log        4
     STANDBY /u01/app/oracle/oradata/fresno/standby_redo02.log        5
     STANDBY /u01/app/oracle/oradata/fresno/standby_redo03.log        6
     STANDBY /u01/app/oracle/oradata/fresno/standby_redo04.log        7
    
    7 rows selected.
    


    (2) Verify that there are no Redo Transport errors or Redo Gaps at the standby database

    • For LOG_ARCHIVE_DEST_2, STATUS should be VALID and GAP_STATUS should be NO GAP
    (On the Physical Standby:)
    SQL> select dest_name, status, type, gap_status
      2  from v$archive_dest_status;
    
    DEST_NAME            STATUS    TYPE            GAP_STATUS
    ------------------- --------- --------------- ----------
    LOG_ARCHIVE_DEST_1   VALID     LOCAL
    LOG_ARCHIVE_DEST_2   VALID     UNKNOWN         NO GAP
    ...
    STANDBY_ARCHIVE_DEST VALID     UNKNOWN         NO GAP
    
    (On the primary database)
    SQL> select dest_name, status, type, gap_status
      2* from v$archive_dest_status;
    
    DEST_NAME            STATUS    TYPE           GAP_STATUS
    -------------------- --------- -------------- ------------------------
    LOG_ARCHIVE_DEST_1   VALID     LOCAL
    LOG_ARCHIVE_DEST_2   VALID     PHYSICAL       NO GAP
    LOG_ARCHIVE_DEST_3   INACTIVE  LOCAL
    


    (3) Verify that temporary files exist on the Standby that match the temporary files on the primary database
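    One simple way to compare them (a sketch; run the query on both the primary and the standby and compare the results):

    SQL> select name, bytes/1024/1024 as size_mb from v$tempfile;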



    (4) Remove any delay in applying redo that may be in effect on the standby database that will become the new primary database

    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;


    (5) Before performing a switchover from an Oracle RAC primary database to a physical standby, shut down all but one primary db instance



    (6) If the physical standby that will become primary on the switchover is in real-time query mode, consider bringing all instances of the standby database to the mounted but not open state. This will allow:

    • (a) the fastest possible role transition, and
    • (b) to cleanly terminate any user sessions connected to the physical standby prior to the role transition


    Q&As: Fast Recovery Area (III)



    (Q) What types of database files can be stored in the FAST RECOVERY AREA?

    • control file, online redo logs, archived redo logs, flashback logs, and RMAN backups


    (Q) What two types of files, with respect to their use by the database, are kept in the FAST RECOVERY AREA?

    • Permanent files – Active files used by the database instance (control file, online redo logs)
    • Transient files – backups that may be deleted according to the retention policy
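
    A quick way to see what is stored there and how much of the space is reclaimable (a sketch; V$FLASH_RECOVERY_AREA_USAGE reports usage per file type):

    SQL> select file_type, percent_space_used, percent_space_reclaimable, number_of_files
      2  from v$flash_recovery_area_usage;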



    (Q) What happens if the instance cannot write to a multiplexed copy of the control file stored in the fast recovery area?

    • The instance fails.
    • Failure occurs EVEN if multiplexed copies outside the recovery area remain accessible.


    (Q) What is the difference between FLASH RECOVERY AREA and FAST RECOVERY AREA?

    • Just the name. FLASH became FAST with 11g


    (Q) When configuring the FAST RECOVERY AREA, what happens if you specify DB_RECOVERY_FILE_DEST but DO NOT specify DB_RECOVERY_FILE_DEST_SIZE?

    • Specifying DB_RECOVERY_FILE_DEST without specifying DB_RECOVERY_FILE_DEST_SIZE is not allowed.
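
    So the size must be set before (or together with) the location; for example (values are illustrative):

    SQL> ALTER SYSTEM SET db_recovery_file_dest_size = 10G SCOPE=BOTH;
    SQL> ALTER SYSTEM SET db_recovery_file_dest = '/u01/app/oracle/flash_recovery_area' SCOPE=BOTH;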





    DataGuard: Configuring a Physical Standby


    Configure Primary Server





    Primary Server Setup


    (1) Is database in Archivelog Mode?
    Home -> Availability -> Recovery Settings
    SQL> SELECT log_mode FROM v$database;
    
    LOG_MODE
    ------------
    ARCHIVELOG
    
    If database is in Noarchivelog mode, you need to switch it to Archivelog mode.
    
    SQL> SHUTDOWN IMMEDIATE;
    SQL> STARTUP MOUNT;
    SQL> ALTER DATABASE ARCHIVELOG;
    SQL> ALTER DATABASE OPEN;


    (2) Is Force Logging enabled?
    SQL> select force_logging from v$database;
    FOR
    ---
    NO
    
    SQL> alter database force logging;
    Database altered.
    


    (3) Edit initialization parameters for the primary role
    db_name db_unique_name log_archive_config log_archive_dest_1
    log_archive_dest_2 log_archive_dest_state_1 log_archive_dest_state_2 fal_server
    db_file_name_convert log_file_name_convert remote_login_passwordfile

    Home -> Server -> Initialization Parameters
    SQL> select name, db_unique_name from v$database;
    
    NAME   DB_UNIQUE_NAME
    --------- ------------------------------
    DG11G   atlanta
    
    SQL> show parameter db_name;
    
    NAME         TYPE  VALUE
    ------------------------------------ ----------- ------------------------------
    db_name         string  dg11g
    
    SQL> show parameter db_unique_name;
    
    NAME         TYPE  VALUE
    ------------------------------------ ----------- ------------------------------
    db_unique_name        string  atlanta
    

    SQL> alter system set log_archive_config='dg_config=(atlanta, fresno)';  
    System altered.    
        
    SQL> show parameter log_archive_config    
        
    NAME                         TYPE    VALUE    
    ---------------------------- ------- ------------------------------    
    log_archive_config           string  dg_config=(atlanta, fresno)    
        
    Note:
       In this configuration, primary => atlanta; standby => fresno
    


    SQL> show parameter log_archive_dest_
    
    NAME        TYPE  VALUE
    ---------------------------- ----------- ----------------------------
    log_archive_dest_            string  location=use_db_recovery_file_
                 dest valid_for=(all_logfiles,
          all_roles) db_unique_name=atla
          nta
    ...
    NAME              TYPE    VALUE
    --------------------------- ----------- ------------------------------
    log_archive_dest_2          string  service=fresno async valid_for
          =(online_logfiles, primary_rol
          e) db_unique_name=fresno
    ...
    
    Home -> Server -> Initialization Parameters


    SQL> show parameter fal_server 
     
    NAME         TYPE  VALUE 
    ------------------------------------ ----------- ------------------------------ 
    fal_server        string  fresno 
    
    Home -> Server -> Initialization Parameters


    SQL> show parameter remote_login_pa
    
    NAME         TYPE  VALUE
    ------------------------------------ ----------- ------------------------------
    remote_login_passwordfile      string  EXCLUSIVE
    


    SQL> alter system set db_file_name_convert='fresno', 'atlanta' scope=spfile; 
    System altered. 
    
    SQL> alter system set log_file_name_convert='fresno', 'atlanta' scope=spfile; 
    System altered. 
    
    SQL> alter system set standby_file_management=auto; 
    System altered. 
    
    Home -> Server -> Initialization Parameters -> Standby Database


    (4) Configure Oracle Net for Redo Transport
    Redo Transport moves archived redo logs from the primary to the standby database.
    The two databases need to be connected through Oracle Net.
    For this you can edit the tnsnames.ora files on each server, e.g.:
    # Generated by Oracle configuration tools.
    
    ATLANTA =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = oe5srv.localdomain)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SID = DG11G)
        )
      )
    
    FRESNO =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = oe5clt.localdomain)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SID = DG11G)
        )
      )
    


    (5) Create a backup copy of the primary database datafiles
    Using RMAN:

    Home -> Server -> Availability -> Schedule Backup
    • Select a whole Database backup and enter host credentials

    Home -> Server -> Availability -> Schedule Customized Backup -> Options

    Home -> Server -> Availability -> Schedule Customized Backup -> Settings

    Home -> Server -> Availability -> Schedule Backup -> Schedule

    Home -> Server -> Availability -> Schedule Backup -> Review

    Home -> Server -> Availability -> Schedule Backup -> Summary


    (6) Create a Control file for the Physical Standby Database
    SQL> alter database create standby controlfile as '/tmp/fresno_stdby.ctl';
    
    Database altered.
    


    (7) Create a Parameter File for the Physical Standby Database.
    Edit the Parameter file to configure the settings for the Standby
    (7.1) Create a PFILE from the SPFILE used by the Primary DB
    
    SQL> create pfile= '/tmp/initfresno.ora' from spfile;
    File created.
    
    (7.2) Modify the PFILE to be used on a Physical Standby
    $ cat initfresno.ora 
    dg11g.__db_cache_size=67108864
    dg11g.__java_pool_size=4194304
    dg11g.__large_pool_size=4194304
    dg11g.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
    dg11g.__pga_aggregate_target=150994944
    dg11g.__sga_target=205520896
    dg11g.__shared_io_pool_size=0
    dg11g.__shared_pool_size=117440512
    dg11g.__streams_pool_size=4194304
    *.audit_file_dest='/u01/app/oracle/admin/fresno/adump'
    *.audit_trail='db'
    *.compatible='11.2.0.0.0'
    *.control_files='/u01/app/oracle/oradata/fresno/control01.ctl',
                    '/u01/app/oracle/flash_recovery_area/fresno/control02.ctl'
    *.db_block_size=8192
    *.db_domain=''
    *.db_file_name_convert='atlanta', 'fresno'
    *.db_name='dg11g'
    *.db_recovery_file_dest='/u01/app/oracle/flash_recovery_area'
    *.db_recovery_file_dest_size=4039114752
    *.db_unique_name='fresno'
    *.diagnostic_dest='/u01/app/oracle'
    *.dispatchers='(PROTOCOL=TCP) (SERVICE=dg11gXDB)'
    *.fal_server='atlanta'#In case of a switchover, fetch records from the fal_server
    *.log_archive_config='dg_config=(atlanta, fresno)'#db_unique_names of the databases in the DG configuration
    *.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles, all_roles) 
                                db_unique_name=fresno'#configure local redo log archiving. Use Fast recovery Area
    *.log_archive_dest_2='service=atlanta async valid_for=(online_logfiles, primary_role) 
            db_unique_name=atlanta'#configure remote log archiving. Configure redo transport to physical standby
    *.log_archive_format='%t_%s_%r.dbf'
    *.log_file_name_convert='atlanta', 'fresno'
    *.memory_target=356515840
    *.open_cursors=300
    *.processes=150
    *.remote_login_passwordfile='EXCLUSIVE'
    *.standby_file_management='AUTO'
    *.undo_tablespace='UNDOTBS1'
    


    Standby Server Setup


    (1) Create appropriate directories in the Standby server. Copy files from the primary to the standby system.
    (1.1) Create directories for fresno database files
    $ mkdir -p oradata/fresno
    $ mkdir -p flash_recovery_area/fresno
    $ mkdir -p admin/fresno/adump
    
    (1.2) Copy the standby control files to all locations (in the same server)
    $ scp oracle@oe5srv:/tmp/fresno_stdby.ctl /u01/app/oracle/oradata/fresno/control01.ctl
    $ cp /u01/app/oracle/oradata/fresno/control01.ctl /u01/app/oracle/flash_recovery_area/fresno/control02.ctl
    
    (1.3) Copy Archivelogs and backups (in the same server)
    $ scp -r oracle@oe5srv:/u01/app/oracle/flash_recovery_area/ATLANTA/archivelog  /u01/app/oracle/flash_recovery_area/fresno
    $ scp -r oracle@oe5srv:/u01/app/oracle/flash_recovery_area/ATLANTA/backupset  /u01/app/oracle/flash_recovery_area/fresno
    
    (1.4) Copy password file and parameter file
    $ scp -r oracle@oe5srv:$ORACLE_HOME/dbs/orapwdg11g $ORACLE_HOME/dbs
    $ scp -r oracle@oe5srv:/tmp/initfresno.ora /tmp/initfresno.ora

    (2) Create SPFILE from the edited parameter file
    oracle@oe5clt:$ echo $ORACLE_SID  
    dg11g
    
    oracle@oe5clt:$ sqlplus / as sysdba 
    SQL*Plus: Release 11.2.0.1.0 Production on Thu Mar 8 00:37:44 2012
    Copyright (c) 1982, 2009, Oracle.  All rights reserved.  
      
    Connected to an idle instance.  
    
    SQL> create spfile from pfile='/tmp/initfresno.ora';
      
    File created.  
    

    (3) Restore backup
    RMAN> startup mount; 
     
    Oracle instance started 
    database mounted 
    
    RMAN> restore database; 
     
    Starting restore at 08-MAR-12 
    using channel ORA_DISK_1 
     
    channel ORA_DISK_1: starting datafile backup set restore 
    channel ORA_DISK_1: specifying datafile(s) to restore from backup set 
    channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/fresno/system01.dbf 
    channel ORA_DISK_1: restoring datafile 00002 to /u01/app/oracle/oradata/fresno/sysaux01.dbf 
    channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/fresno/undotbs01.dbf 
    channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/fresno/users01.dbf 
    channel ORA_DISK_1: reading from backup piece /u01/app/oracle/flash_recovery_area/FRESNO/backupset/o1_mf_nnndf_BACKUP_ATLANTA_000_0_7ojnc7g1_.bkp 
    channel ORA_DISK_1: piece handle=/u01/app/oracle/flash_recovery_area/FRESNO/backupset/o1_mf_nnndf_BACKUP_ATLANTA_000_0_7ojnc7g1_.bkp tag=BACKUP_ATLANTA_000_030712102725 
    channel ORA_DISK_1: restored backup piece 1 
    channel ORA_DISK_1: restore complete, elapsed time: 00:02:37 
    Finished restore at 08-MAR-12 
    

    (4) Create Online and Standby Redo logs on the Standby Database
    SQL> alter system set standby_file_management=manual;
    
    System altered.
    
    SQL> alter database add logfile('/u01/app/oracle/oradata/fresno/online_redo01.log') size 50m;
    
    Database altered.
    
    SQL> alter database add logfile('/u01/app/oracle/oradata/fresno/online_redo02.log') size 50m;
    
    Database altered.
    
    SQL> alter database add logfile('/u01/app/oracle/oradata/fresno/online_redo03.log') size 50m;
    
    Database altered.
    
    SQL> alter system set standby_file_management=auto;
    
    System altered.
    
    
    SQL> alter database add standby logfile('/u01/app/oracle/oradata/fresno/standby_redo01.log') size 50m;
    
    Database altered.
    
    SQL> alter database add standby logfile('/u01/app/oracle/oradata/fresno/standby_redo02.log') size 50m;
    
    Database altered.
    
    SQL> alter database add standby logfile('/u01/app/oracle/oradata/fresno/standby_redo03.log') size 50m;
    
    Database altered.
    
    SQL> alter database add standby logfile('/u01/app/oracle/oradata/fresno/standby_redo04.log') size 50m;
    
    Database altered.
    


    Start Apply Service

    SQL> Alter database recover managed standby database disconnect from session;
    
    Database altered.