Q&As: Backup with Recovery Manager



(Q) Which statements about using RMAN to perform backups are correct?


  • ( ) You CAN perform incremental backups with RMAN
  • ( ) RMAN detects corrupt blocks and logs them in V$DATABASE_BLOCK_CORRUPTION
  • ( ) Automatically establishes the names and locations of all files to be backed up
  • ( ) Backups are recorded in the CONTROL FILE, which is the main repository of RMAN metadata.
  • ( ) Optionally, you CAN store backup metadata in a recovery catalog, which should be a schema in a database different from the one being backed up (see the sketch after this list).
  • ( ) The primary storage for RMAN repository information for a database is always in the control file of the database. 
  • ( ) RMAN supports proxy copy, a feature that allows a media manager to completely manage the transfer of data between disk and backup media
  • ( ) RMAN backs up the server parameter file (SPFILE)
  • ( ) RMAN DOES NOT back up password and networking files.
  • ( ) RMAN supports a platform-independent language for backups
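
For example, a minimal RMAN session that records its backups both in the control file and in a recovery catalog might look like the following sketch (the catalog user, password, and connect string are assumptions for illustration):

$ rman
RMAN> CONNECT TARGET /
RMAN> CONNECT CATALOG rmancat/rmancat@catdb
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;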

Q&As: Incremental Backup



(Q) What are the characteristics of Incremental Backups?


  • Store only blocks changed since a previous incremental (Level 0 or Level 1) backup
  • Provide more compact backups and faster recovery
  • Fewer redo logs need to be applied during recovery
  • If you enable block change tracking, RMAN can skip full scans of the input datafiles during an incremental backup and read only the changed blocks (see the commands below)
  • You can take a Level 0 or Level 1 Incremental Backup

$ rman
RMAN> CONNECT TARGET /
RMAN> BACKUP
       INCREMENTAL LEVEL 1 CUMULATIVE
       SKIP INACCESSIBLE
       DATABASE;

The example above:
  • Backs up all blocks changed since the most recent LEVEL 0 incremental backup.
  • If no LEVEL 0 backup exists, RMAN makes a LEVEL 0 automatically.
  • Inaccessible files are skipped
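
To create the level 0 baseline explicitly, and to enable the block change tracking mentioned above, you can use commands like the following (the tracking file location is an assumption for illustration):

RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
     USING FILE '/u01/app/oracle/oradata/orcl/change_tracking.f';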

Oracle Scheduler: Use credentials for a detached job that performs cold backups



Here are the steps to create a daily detached job that performs a cold backup of an Oracle 11g database.
This example is a bit more detailed than the one provided in the Administrator's Guide.

Steps:
(1) Create the shell script that invokes RMAN.
  • This script will be executed by the detached job.
  • The script should be executable by the OS user who has backup privileges (usually oracle).
(2) Create the RMAN script.
(3) Write a PL/SQL block that performs the following steps:
3.1 Create a credential object with the username/password of the OS user who has permission to run the shell script.
3.2 Create a program object, which uses the credential just created to run the shell script.
3.3 Alter the program object, setting its detached attribute to TRUE.
3.4 Create a job object and enable it.


(1) Create the script that invokes RMAN
  • Create a shell script that calls an RMAN script to perform a cold backup.
  • The shell script is located in $ORACLE_BASE/scripts/coldbackup.sh.
  • It must be executable by the user who installed Oracle Database (typically the user oracle).
#!/bin/sh
# Environment for the target database
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
export ORACLE_SID=orcl
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib
# Launch RMAN in the background and return immediately; the detached job
# ends only when the RMAN script calls END_DETACHED_JOB_RUN.
$ORACLE_HOME/bin/rman TARGET / @$ORACLE_HOME/scripts/coldbackup.rman trace /u01/app/oracle/backup/coldbackup.out &
exit 0

(2) Create the RMAN Script
  • Create an RMAN script that performs the cold backup and then ends the job.
  • At the end, the script calls the Scheduler routine END_DETACHED_JOB_RUN to report the job's completion status.
  • The script is located in $ORACLE_BASE/scripts/coldbackup.rman.
run {
# Shut down database for backups and put into MOUNT mode
shutdown immediate
startup mount

# Perform full database backup
backup full format "/u01/app/oracle/backup/%d_FULL_%U" (database) ;

# Open database after backup
alter database open;

# Call notification routine to indicate job completed successfully
sql " BEGIN DBMS_SCHEDULER.END_DETACHED_JOB_RUN(''sys.backup_job'', 0, null); END; ";
}

(3) Create the credential, detached program, and job with an inline schedule
Submit the following PL/SQL block:
-- 3.1 create a credential for the job to use
-- 3.2 create the program object, specifying the shell script created earlier as the external executable, and set its detached attribute
-- 3.3 create the job, specifying the credential object created earlier
begin
  dbms_scheduler.create_credential('credrman', 'oracle', '');

  dbms_scheduler.create_program (
    program_name   => 'sys.backup_program',
    program_type   => 'executable',
    program_action => '/u01/app/oracle/scripts/coldbackup.sh',
    enabled        => FALSE);

  -- mark the program as detached, then enable it
  dbms_scheduler.set_attribute('sys.backup_program', 'detached', TRUE);
  dbms_scheduler.enable('sys.backup_program');

  dbms_scheduler.create_job (
    job_name        => 'sys.backup_job',
    program_name    => 'sys.backup_program',
    repeat_interval => 'FREQ=DAILY; BYHOUR=13; BYMINUTE=20',
    credential_name => 'credrman');

  dbms_scheduler.enable('sys.backup_job');
end;
/
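
To test the job immediately, without waiting for the daily schedule, you can run it manually (use_current_session => false makes it run through the job coordinator, as a scheduled run would):

SQL> exec dbms_scheduler.run_job('sys.backup_job', use_current_session => false);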

You can check the status of the created job with the query below:
SQL> select owner, job_name, job_style, program_name, schedule_type, 
       to_char(start_date, 'yyyy-mm-dd hh:mi:ss') start_date_, enabled, state, 
       run_count runs, failure_count failures 
from dba_scheduler_jobs
where job_name = 'BACKUP_JOB';

JOB_NAME     JOB_STYLE   PROGRAM_NAME     SCHEDULE_TYPE  START_DATE_         ENABLED STATE     RUNS FAILURES
------------ ----------- ---------------- ------------- -------------------- ------- --------- ---- --------
BACKUP_JOB   REGULAR     BACKUP_PROGRAM   CALENDAR       2011-12-09 01:12:49 TRUE    SCHEDULED 0    0       


Some troubleshooting:
  • If a credential object is not created and assigned to the job, the job will fail.
  • The result will be an error entry in the alert log and a generated trace file:
-//- alert.log  -//-
(...)
Fri Dec 09 14:25:51 2011
Errors in file /u01/app/oracle/diag/rdbms/orcl11r2/orcl11r2/trace/orcl11r2_j000_20165.trc:
ORA-12012: error on auto execute of job 12693
ORA-27369: job of type EXECUTABLE failed with exit code: 274670
(...)

  • If a credential object is not created, Oracle tries to run the external program using the credentials specified in the $ORACLE_HOME/rdbms/admin/externaljob.ora file.
  • The credentials specified in that file (on a Unix/Linux installation) are user: nobody, group: nobody.
  • The consequence is that the job will likely fail with the error 'ORA-27369: job of type EXECUTABLE failed with exit code: 274670.'
  • This somewhat cryptic error message suggests (as discussed here) that the problem is due to OS permissions on the execution of the external program.



SQL> select status, error#, run_duration, credential_name
     from dba_scheduler_job_run_details 
     where job_name='BACKUP_JOB';
STATUS                         ERROR#                 RUN_DURATION CREDENTIAL_NAME
------------------------------ ---------------------- ------------ ------------------ 
FAILED                         274670                 0 0:0:2.0 
SUCCEEDED                      0                      0 0:7:26.0   CREDRMAN                                                                                                                                                                                           

SQL> select log_id, operation, status, credential_name 
     from dba_scheduler_job_log 
     where job_name='BACKUP_JOB';
LOG_ID                 OPERATION                      STATUS     CREDENTIAL_NAME                                                   
---------------------- ------------------------------ ---------- ------------------ 
192                    RUN                            FAILED
194                    RUN                            SUCCEEDED  CREDRMAN                                                          

More on Oracle 11g ASM and Grid Configuration

"Como a veces no puede dormir, en vez de contar corderitos contesta mentalmente la correspondencia atrasada, porque su mala conciencia tiene tanto insomnio como él."
Un tal Lucas (Cortázar)




Oracle Grid Infrastructure and Oracle ASM: Configuration changes in 11g R2



  • Oracle 11g Release 2 introduced the Oracle Grid Infrastructure installation.
  • Prior to 11g R2, Oracle Automatic Storage Management (ASM) software was automatically installed when you installed the Oracle database software.
  • Since 11g R2, if you need to use Oracle ASM you must first install the Oracle Grid Infrastructure software.

In a single instance database environment
  • An Oracle Grid Infrastructure installation includes:
  1. Oracle Automatic Storage Management (ASM),
  2. the Listener and
  3. Oracle Restart


  • A new method of installing Automatic Storage Management (ASM) with Oracle 11g R2:
    • In a cluster configuration: Oracle ASM shares an Oracle home with Oracle Clusterware
    • In a single instance database configuration: Oracle ASM shares an Oracle home with Oracle Restart
  • To upgrade an existing Oracle ASM installation:
    • Upgrade Oracle ASM by running an Oracle Grid Infrastructure upgrade

For a clustered environment
  • An Oracle Grid Infrastructure installation includes:
    1. Oracle Clusterware
    2. Oracle Automatic Storage Management (ASM), and
    3. the Listener



About Oracle ASM:
  • Oracle ASM is a volume manager and file system.
  • Like other volume managers, Oracle ASM groups disks into one or more disk groups.
  • While the administrator manages the disk groups, they operate like black boxes: the placement of datafiles within each disk group is managed automatically by Oracle (see the example below).
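
For example, a disk group can be created from raw devices (in the ASM instance, with SYSASM) and then used transparently as a file destination from the database instance. A minimal sketch; the device paths and names are assumptions for illustration:

SQL> CREATE DISKGROUP data NORMAL REDUNDANCY
     DISK '/dev/sdb1', '/dev/sdc1';

SQL> CREATE TABLESPACE users2 DATAFILE '+DATA';   -- Oracle places and names the datafile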

  • Separation between database and ASM administration:
    • Oracle ASM administration requires SYSASM privilege.
    • Besides creating a division of responsibilities between ASM and database administration, this also helps prevent databases that share the same storage from accidentally overwriting each other's files.

  • Starting in Oracle 11g R2, Oracle ASM can also store Oracle Cluster Registry and voting disk files.
  • Oracle ASM Cluster File System (ACFS) is a new file system and storage management design that extends Oracle ASM technology to support data that cannot be stored in Oracle ASM (both in single instance and cluster configurations).
  • ACFS is installed with Oracle Grid Infrastructure.


MongoDB with Python: Data manipulation



Here are some basic steps for data manipulation in MongoDB using Python.


  • Manipulating data in MongoDB with CRUD operations: Create, Retrieve, Update, Delete
  • These are method calls equivalent to DML statements in relational databases:
    • (Insert, Select, Update, Delete).
  • CRUD operations are atomic operations.
  • An atomic operation has all-or-nothing semantics. This means, for example, that an update affecting several document attributes must succeed for all of them or fail as a whole.
  • However, MongoDB does not support traditional locking and complex transactions.
  • Unlike traditional relational databases, Insert, Update and Delete operations in MongoDB work in a "fire and forget" model:
    • The client issues the method call but does not wait for confirmation on whether or when the statement was actually executed in the database. Once the call is placed over the network, the application returns.
    • You may need to explicitly check the status of the operation with another call, getLastErrorObj() (see the example below).
    • When a getLastErrorObj() command is issued, it forces the data to be sent over the wire to the server and returns an error document to the client if the last operation failed.
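
In PyMongo (2.x) the same check looks roughly like this; a sketch, assuming the blogdb database used in the sessions below (getlasterror is the server-side command, and safe writes are the driver's shortcut for it):

>>> db.posts.insert({'author': 'joe'})              # fire and forget by default
>>> db.command('getlasterror')                      # force a round trip; 'err' is None on success
{u'connectionId': 1, u'err': None, u'n': 0, u'ok': 1.0}
>>> db.posts.insert({'author': 'joe'}, safe=True)   # or make the write itself wait for the check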

  • Comparing data manipulating operations in a relational table and in a MongoDB collection:
Relational Database: table BLOG (author, post, tags, date)
MongoDB: collection blog (columns not statically defined)

INSERT statement:
SQL> INSERT INTO blog VALUES ('joe', v_post, 'MongoDB, Python', sysdate);
>>> post = { "author": "joe",
        "text": "Blogging about MongoDB",
        "tags": ["MongoDB", "Python"],
        "date": datetime.datetime.utcnow()}
>>> db.blog.insert(post)

SELECT statement:
SQL> SELECT * FROM blog WHERE author = 'joe';
>>> db.blog.find({"author": "joe"})

UPDATE statement:
SQL> UPDATE blog SET tags = 'MongoDB, Python' WHERE author = 'joe';
>>> db.blog.update({"author": "joe"},
        { "$set": {"tags": ["MongoDB", "Python"]}})

DELETE statement:
SQL> DELETE FROM blog WHERE author = 'joe';
>>> db.blog.remove({"author": "joe"})

Creating a new collection
Databases and Collections in MongoDB are created only when the first data is inserted.
$ ipython
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
Type "copyright", "credits" or "license" for more information.
...
In [2]: import pymongo                  --- import pymongo package
In [3]: from pymongo import Connection
In [4]: from bson import ObjectId

In [5]: connection = Connection()
In [6]: connection
Out[6]: Connection('localhost', 27017)  --- connected to localhost, in the default TCP port
In [7]: connection.database_names()     --- list existing databases
Out[7]: [u'test', u'local']

In [8]: db = connection['blogdb']       --- connect to a new database. 
                                        --- It will be created when the first object is inserted.

In [9]: post = { "author": "John", 
...          "text": "Blogging about MongoDB"};

In [9]: db.posts.insert(post);                      --- The first insert creates the new collection 'posts'
Out[9]: ObjectId('...')
In [10]: db.collection_names()
[u'system.indexes', u'posts']
Notes:
  • Within the database, there are several namespaces.
  • In fact, each collection is one namespace, and there are also some system namespaces (such as $freelist and system.indexes).
  • You can further organize collections within a single namespace by prepending each collection name with a common namespace prefix, using dot notation (see the example below).
  • For example, the collections book.info and book.authors are in the same namespace book.
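
A quick illustration in PyMongo (the collection names are just examples, and the exact listing order may differ):

In [11]: db['book.info'].insert({'title': 'Python Notes'})
In [12]: db['book.authors'].insert({'name': 'Joann'})
In [13]: db.collection_names()
Out[13]: [u'system.indexes', u'posts', u'book.info', u'book.authors']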

Inserting a document in a collection
  • In MongoDB, documents within a collection do not all have to have the same number and type of fields ("columns"). In other words, schemas in MongoDB are dynamic and can vary within one collection.
  • PyMongo uses dictionary objects to represent JSON-style documents.
  • To add a new document to a collection, using ipython:
In [9]: post = { 
   ...:     'author': 'Joann',
   ...:     'text': 'Just finished reading the Book of Nights'}

In [10]: db.posts.insert(post)        --- Method call to create a new document (post)
Out[10]: ObjectId('4eb99ad5a9e15833b1000000')

In [17]: for post in db.posts.find():   --- listing all documents in the posts collection.
             post
   ....:     
   ....:     
Out[18]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
  • Note that you don't need to specify the "_id" field when inserting a new document into a collection.
  • The document identifier is automatically generated by the database and is unique across the collection.

  • Every MongoDB object has an "_id" attribute.
  • If it is not explicitly assigned when the object is created, it is implicitly created by the MongoDB client driver at object creation.
  • The "_id" attribute is:
    • Intended to be globally unique (though this is not guaranteed)
    • Required to be unique within a collection
    • Indexed by default and
    • Immutable
  • The ObjectID is 12 bytes long, with the first 4 bytes representing the object creation timestamp.
ObjectID("4bface1a2231316e04f3c434") 

  4bface1a -- timestamp of object creation (4 bytes)
  223131   -- derived from client machine's IP (3 bytes) 
  6e04     -- derived from client's process id (2 bytes)  
  f3c434   -- counter (3 bytes)
  • To get the object creation timestamp:
In [1]: post = {'author': 'David',
   ....:         'text' : "David's Blog"}

In [2]: db.posts.insert(post)
Out[2]: ObjectId('4eb9bcada9e15809f3000000')

In [3]: print post["_id"].generation_time
2011-11-08 23:35:09+00:00


  • You can also execute bulk inserts:
In [13]: many_posts = [{'author': 'David',
   ....:                'text' : "David's Blog"},
   ....:               {'author': 'Monique',
   ....:                'text' : 'My photo blog'}]

In [14]: db.posts.insert(many_posts)
Out[14]: [ObjectId('4eb9bcada9e15809f3000000'), ObjectId('4eb9bcada9e15809f3000001')]

In [15]: for post in db.posts.find():
   ....:     post
   ....:     
   ....:     
Out[15]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
Out[15]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000000'),
 u'author': u'David',
 u'text': u"David's Blog"}
Out[15]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique',
 u'text': u'My photo blog'}


Selecting (reading) documents inside collections
  • Data in MongoDB is represented by structures of key-value pairs, using JSON-style documents.
  • The query language used by MongoDB is essentially "example-based," plus some modifiers.
  • Query results are delivered via cursors.
  • -- The find method below (mongo shell syntax) returns a cursor in variable c.
    -- It matches all documents in which attribute x equals 20 and returns 10 of them, starting with the 11th.
    var c = db.collection.find({x:20}).skip(10).limit(10)
    c.next()                                 -- iterate through the cursor with .next()
    c.next()
    
  • Let's query the collection "things" and ask for ONE document in that collection. Use the find_one() method.
In [48]: db.posts.find_one()                  --- returns the first document in the collection
Out[48]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}

We can also define criteria for the query. For example,
(a) return one document with field "author" equal to "Joann"
In [49]: db.posts.find_one({'author':'Joann'})
Out[49]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}

Querying more than one document
A query returns a cursor pointing to all the documents that matched the query criteria.
To see these documents you need to iterate through the cursor elements:
In [51]: for post in db.posts.find():
    post
   ....:     
   ....:     
Out[52]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
Out[52]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000000'),
 u'author': u'David',
 u'text': u"David's Blog"}
Out[52]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique',
 u'text': u'My photo blog'}

-- Alternatively, you can explicitly define a cursor variable: 
In [53]: cursor = db.posts.find()
In [54]: for x in cursor:
   ....:     x
   ....:     
   ....:     
Out[54]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
Out[54]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000000'),
 u'author': u'David',
 u'text': u"David's Blog"}
Out[54]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique',
 u'text': u'My photo blog'}


You can also return only some of the document fields. (Similar to a SQL query that returns only a subset of the table columns).
In [64]: for post in db.posts.find({}, {'text': 1}):
    post
   ....:     
   ....:     
Out[65]: {u'_id': ObjectId('...'), u'text': u'Just finished reading the Book of Nights'}
Out[65]: {u'_id': ObjectId('...'), u'text': u"David's Blog"}
Out[65]: {u'_id': ObjectId('...'), u'text': u'My photo blog'}

Updating documents in collections
  • MongoDB supports atomic updates in document fields as well as more traditional updates for replacing an entire document.
  • Use the update() method to entirely replace the document matching criteria with a new document.
  • If you want to modify only some attributes of a document, you need to use an update modifier such as $set.
  • update() usually takes two parameters:
    • the first selects the documents to be updated (similar to the WHERE clause in SQL);
    • the second contains the new values for the document attributes.

Now, update the post where the author was Monique.
(1) substitute the document for an entirely new document
In [68]: post = db.posts.find_one({'author': 'Monique'})
In [69]: post
Out[69]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique',
 u'text': u'My photo blog'}

In [70]: post['author'] += ' Berger'         # expand the contents of 'author'
In [71]: post['edited'] = True               # add a new field

In [72]: post                                # the variable now has an extra field
Out[72]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 'edited': True,
 u'text': u'My photo blog'}

In [73]: db.posts.save(post)                  # update the entire document.
Out[73]: ObjectId('4eb9bcada9e15809f3000001') 

(2) Another way to update only some fields of a document is to use the $set update modifier.
  • The $set modifier works like the SET clause of an SQL UPDATE statement, with which you can specify the columns to be updated
In [95]: db.posts.update({'author': 'Monique Berger'}, { '$set': {'text': 'My GREAT photo blog'}})  # updating
In [96]: post = db.posts.find_one({'author': 'Monique Berger'})
In [97]: post
Out[97]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 u'edited': True,
 u'text': u'My GREAT photo blog'}

You can append values to an array field with $push and maintain a counter with $inc.
Below, a comment and a reply are added to the document.
In [18]: comment = { '_id' : ObjectId(),
   ....:             'by' : 'David',
   ....:             'text' : 'Post some photos, then'
   ....: }

In [19]: comment
Out[19]: 
{'_id': ObjectId('4eba1d3ba9e1580f1c000000'),
 'by': 'David',
 'text': 'Post some photos, then'}

In [23]: db.posts.update({'_id': post['_id']}, 
                {'$push' :  {'comments': comment}, '$inc' : {'nComments' : 1}})

In [29]: reply = { 
   ....:   '_id': ObjectId(),
   ....:  'inReplyTo': comment['_id'],
   ....:  'by' : 'Marcel',
   ....:  'text' : 'Yes, post some photos!'
   ....: }

In [30]: db.posts.update({'_id' : post['_id']},
   ....:           {'$push': {'comments': reply}, '$inc': {'nComments': 1}})

In [31]: post = db.posts.find_one({'comments.by' : 'David'})
In [32]: post
Out[32]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 u'comments': [{u'_id': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'by': u'David',
                u'text': u'Post some photos, then'},
               {u'_id': ObjectId('4eba1ea7a9e1580f1c000001'),
                u'by': u'Marcel',
                u'inReplyTo': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'text': u'Yes, post some photos!'}],
 u'edited': True,
 u'nComments': 2,
 u'text': u'My GREAT photo blog'}
Note that $push appends a value to an existing array.
If the array field does not exist, $push creates the field as a one-element array.
In this case, $push created the field "comments" and inserted a comment.
Subsequently, a reply comment was also pushed onto the field, and the post's comment counter was incremented with $inc.

# Updating the comment:
In [34]: db.posts.update({'_id': post['_id'], 'comments._id': comment['_id']}, 
   ....:  { '$set' : {'comments.$.text': 'Post some photos and write about them!'}})

In [35]: post = db.posts.find_one({'comments.by' : 'David'})
In [36]: post
Out[36]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 u'comments': [{u'_id': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'by': u'David',
                u'text': u'Post some photos and write about them!'},
               {u'_id': ObjectId('4eba1ea7a9e1580f1c000001'),
                u'by': u'Marcel',
                u'inReplyTo': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'text': u'Yes, post some photos!'}],
 u'edited': True,
 u'nComments': 2,
 u'text': u'My GREAT photo blog'}


# Now you may want to remove the reply to the comment on Monique's post:
In [38]: db.posts.update({'_id': post['_id']},
   ....:               {'$pull': {'comments' : {'_id': reply['_id']}}, '$inc': {'nComments': -1}})

In [39]: post = db.posts.find_one({'comments.by' : 'David'})

In [40]: post
Out[40]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 u'comments': [{u'_id': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'by': u'David',
                u'text': u'Post some photos and write about them!'}],
 u'edited': True,
 u'nComments': 1,
 u'text': u'My GREAT photo blog'}

# Now updating tags and tag statistics in the post:
# adding tags to the post; creating a tags collection; maintaining tag counts;
# upsert option: update the record if a match already exists; if not, create the object and then apply the update.
In [41]: for tag in ['photos', 'python', 'mongodb']:
   ....:     db.posts.update({'_id': post['_id']}, { '$addToSet': { 'tags': tag}})
   ....:     db.tags.update({'_id': tag}, { '$inc':{'count': 1}}, upsert=True)
   ....:     
   ....:     
In [42]: post = db.posts.find_one({'tags' : 'photos'})
In [43]: post
Out[43]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique Berger',
 u'comments': [{u'_id': ObjectId('4eba1d3ba9e1580f1c000000'),
                u'by': u'David',
                u'text': u'Post some photos and write about them!'}],
 u'edited': True,
 u'nComments': 1,
 u'tags': [u'photos', u'python', u'mongodb'],
 u'text': u'My GREAT photo blog'}

# listing the tags collection: 
In [45]: for tag in db.tags.find():
    tag
   ....:     
   ....:     
Out[46]: {u'_id': u'photos', u'count': 1}
Out[46]: {u'_id': u'python', u'count': 1}
Out[46]: {u'_id': u'mongodb', u'count': 1}


Deleting documents from collections
To delete a document from a collection, use the remove method, passing as parameter a document field that either (a) uniquely identifies the document you want to delete or (b) identifies the set of documents you want to delete.
>>> db.blog.remove({"author":"Monique"})
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(2011, 11, 7, 22, 10, 43, 77000), u'text': u'Blogging about MongoDB', u'_id': ObjectId('4eb857b3a9e158609c000004'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
>>>


On Atomic Operations
  • In MongoDB, only operations executed against a single document can be performed as atomic operations.
  • An atomic operation is a set of operations that can be combined in such a way that the entire set appears to be a single operation to the rest of the system. An operation is atomic when:
    • 1. No other process knows about the changes being made until the entire set of operations has completed.
    • 2. If one of the operations fails, the entire set fails.

  • In traditional RDBMSs, atomicity is guaranteed through locks.
    During an update operation, for example, all rows that need to be modified are locked, and either all modifications succeed or none do.
  • No other transaction can see the locked rows while the update is being performed.
  • MongoDB, however, does not support locking or transactions (begin-commit-rollback).
  • So while individual atomic operations are guaranteed to succeed, MongoDB cannot prevent race conditions in which data is changed by user A while user B's update of the same data is in flight (see the sketch below).
  • The result in this case is that, although user B's atomic update completes successfully, it ultimately modifies the wrong data: it overwrites the value set by user A, not the value user B had initially read.
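
A common defense against this race is the "update if current" idiom: include the previously read value in the update's query document, so the update applies only if nothing changed in between. A minimal PyMongo sketch, reusing the nComments counter from the example above:

>>> doc = db.posts.find_one({'author': 'Monique Berger'})
>>> result = db.posts.update(
...     {'_id': doc['_id'], 'nComments': doc['nComments']},  # match what we read
...     {'$inc': {'nComments': 1}},
...     safe=True)
>>> result['n']    # 0 means another writer got there first: re-read and retry
1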

Q&As: Parallelism and Advanced Compression




(Q) How can you manually set the degree of parallelism at object level?

  • ALTER TABLE sales PARALLEL 8;
  • You can set a fixed DOP at a table or index level


(Q) Which of the operations below can make use of parallel execution?

(1) When accessing objects: table scans, index fast full scans, partitioned index range scans
(2) Joins: nested loops, sort merge, hash joins, star transformations
(3) DDL statements: CTAS, Create Index, Rebuild Index, Rebuild Index Partition, Move/Split/Coalesce Partition
(4) DML statements
(5) Parallel Query
(6) Other SQL operations: group by, not in, select distinct, union, union all, cube, and rollup, as well as aggregate and table functions
(7) SQL*Loader, e.g. $ sqlldr CONTROL=load1.ctl DIRECT=true PARALLEL=true
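
For example, parallel query can be requested at the statement level with a hint (a sketch; the sales table is assumed to exist):

SQL> SELECT /*+ PARALLEL(s, 8) */ COUNT(*)
     FROM sales s;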


(Q) On which types of objects can parallel execution NOT be used?

Parallel DDL cannot be used on tables with object or LOB columns


(Q) How can you gather I/O Calibration statistics? How often should it be done?

  • Use the DBMS_RESOURCE_MANAGER.CALIBRATE_IO procedure (see the example below)
  • I/O calibration is a one-time action if the physical hardware does not change.
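
A minimal sketch of a calibration run; the disk count and maximum latency are assumptions to adjust for your storage:

SQL> set serveroutput on
DECLARE
  l_max_iops PLS_INTEGER;
  l_max_mbps PLS_INTEGER;
  l_latency  PLS_INTEGER;
BEGIN
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(
    num_physical_disks => 4,     -- assumption: number of physical disks
    max_latency        => 10,    -- assumption: max tolerated latency (ms)
    max_iops           => l_max_iops,
    max_mbps           => l_max_mbps,
    actual_latency     => l_latency);
  DBMS_OUTPUT.PUT_LINE('max_iops = ' || l_max_iops || ', max_mbps = ' || l_max_mbps);
END;
/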

Advanced Compression


(Q) What is Advanced Compression?

Introduced in 11g, Advanced Compression includes compression for:
  • structured data (numbers, chars)
  • Unstructured data (documents, images, etc)
  • backups (RMAN and Data Pump) and
  • Network transport (redo log transport during Data Guard gap resolution)



(Q) What are the benefits of Advanced Compression?

(a) storage reduction – compression of all types
(b) performance improvement – compressed blocks result in higher I/O throughput
(c) Memory efficiency – oracle keeps data compressed in memory
(d) backups – enhanced compression capabilities
(e) Data Guard – allows faster synchronization of databases during gap resolution process.



(Q) What improvement does Advanced Compression bring to the table compression feature introduced in Oracle 9i?

With the table compression feature, data could be compressed ONLY during bulk load operations.
With Advanced Compression, data is also compressed during Inserts and Updates. It also adds compression and deduplication of SecureFiles.


(Q) Does table data in compressed tables get decompressed before it is read?

No. Oracle reads directly from compressed blocks in memory.


(Q) What features are included in the Advanced Compression option?

  • OLTP table compression – improved query performance with minimal write performance overhead
  • SecureFiles – SecureFiles compression for any unstructured content; deduplication to reduce redundancy
  • RMAN – multiple backup compression levels (trading backup speed against compression ratio)
  • Data Pump compression – exports can be compressed
  • Data Guard – can compress redo data (reduced network traffic, faster gap resolution)


(Q) What types of backup compression can be done with RMAN (using the Advanced Compression Option)?

  • HIGH – good for backups over slower networks
  • MEDIUM – recommended for most environments (about the same as regular compression)
  • LOW – least effect on backup throughput


(Q) How to enable Advanced Compression option?

  • Set parameter enable_option_advanced_compression = TRUE
  • With Advanced compression option enabled, you can:
    • RMAN> CONFIGURE COMPRESSION ALGORITHM 'HIGH' | 'MEDIUM' | 'LOW';
  • V$RMAN_COMPRESSION_ALGORITHM describes supported algorithms
(Q) How can the various features under Advanced Compression be turned on?
For table compression – methods of table compression in 11g R2:
Basic compression – direct-path load only
  • e.g. CREATE/ALTER TABLE … COMPRESS [BASIC] – direct-path only
OLTP compression – compression during DML operations
  • e.g. CREATE/ALTER TABLE … COMPRESS FOR OLTP
Warehouse compression (Hybrid Columnar Compression)
Online archival compression (Hybrid Columnar Compression)
For SecureFiles –
e.g. CREATE TABLE t1 (a CLOB) LOB(a) STORE AS SECUREFILE( COMPRESS LOW [MEDIUM|HIGH] DEDUPLICATE [KEEP_DUPLICATES] )
For RMAN –
RMAN> BACKUP AS COMPRESSED BACKUPSET DATABASE PLUS ARCHIVELOG;
or
RMAN> CONFIGURE DEVICE TYPE [DISK | TAPE] BACKUP TYPE TO COMPRESSED BACKUPSET;
For Data Pump –
COMPRESSION = [ALL|DATA_ONLY|METADATA_ONLY|NONE]
  • ALL and DATA_ONLY require the Advanced Compression Option.
  • e.g. expdp hr DIRECTORY=dpump_dir1 DUMPFILE=hr_comp.dmp COMPRESSION=DATA_ONLY

MongoDB with Python: a quick introduction (I)



Here are some basic steps for data manipulation in MongoDB using Python.

Download pymongo
pymongo is a native Python driver for MongoDB.
The PyMongo distribution contains tools for working with MongoDB.

(1) Installing PyMongo is very simple if you have setuptools installed. To install setuptools you need to:
(a) Download the egg file for your version of Python (get it here).
(b) After downloading, execute the egg as if it were a shell script:
$ sudo sh setuptools-0.6c11-py2.6.egg

(2) With setuptools installed, you can install pymongo using:
$ sudo easy_install pymongo
Searching for pymongo
Best match: pymongo 2.0.1
Processing pymongo-2.0.1-py2.6-linux-i686.egg
pymongo 2.0.1 is already the active version in easy-install.pth

Using /usr/local/lib/python2.6/dist-packages/pymongo-2.0.1-py2.6-linux-i686.egg
Processing dependencies for pymongo
Finished processing dependencies for pymongo

(3) Alternatively, you can install from source:
$ git clone git://github.com/mongodb/mongo-python-driver.git pymongo
$ cd pymongo/
$ python setup.py install

To test whether the installation was successful, try importing the pymongo package into Python; it should not raise an exception:
jdoe@lambda:$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import pymongo
>>>

Connect to the MongoDB server and check that you're connected to the local host on the default port.
>>> from pymongo import Connection
>>> connection = Connection()            -- create a connection with the default server/port
>>> connection                           -- print connection details
Connection('localhost', 27017)

-- You can explicitly specify the host and TCP port of the mongoDB server you want to connect to.
>>> connection = Connection('192.117.47.23', 20120)
>>>

Connect to a database
Once connected to the database server, you need to connect to a specific MongoDB database.
>>> connection.database_names()       --- list the available databases in the server
[u'mynewdb', u'local', u'test']
>>>
>>> db = connection['mynewdb']        --- connects to 'mynewdb'
>>>  
>>> db.name                           --- list name of database you're connected to
u'mynewdb'
>>>

Access database collections
Collections can be thought of as analogous to tables in relational databases. To see the existing collections in the database:
>>> db.collection_names()           --- list existing collections
[u'mycollection', u'system.indexes', u'things', u'comments']
>>>
>>> things = db['things']
>>>
>>> things.name                     --- print collection name
u'things'
>>>
>>> things.database                 --- database that holds the collection
Database(Connection('localhost', 27017), u'mynewdb')
>>>
>>> things.count()                  --- get the number of existing documents in the collection
5


  • Manipulating data in MongoDB with CRUD operations: Create, Retrieve, Update, Delete
  • These are the atomic operations used to manipulate data.
  • These are method calls equivalent to DML statements in relational databases (Insert, Select, Update, Delete).
  • Comparing data manipulating operations in a relational table and in a MongoDB collection:
Relational Database: table BLOG (author, post, tags, date)
MongoDB: collection blog (columns not statically defined)

INSERT statement:
SQL> INSERT INTO blog VALUES ('joe', v_post, 'MongoDB, Python', sysdate);
>>> post = { "author": "joe",
        "text": "Blogging about MongoDB",
        "tags": ["MongoDB", "Python"],
        "date": datetime.datetime.utcnow()}
>>> db.blog.insert(post)

SELECT statement:
SQL> SELECT * FROM blog WHERE author = 'joe';
>>> db.blog.find({"author": "joe"})

UPDATE statement:
SQL> UPDATE blog SET tags = 'MongoDB, Python' WHERE author = 'joe';
>>> db.blog.update({"author": "joe"},
        { "$set": {"tags": ["MongoDB", "Python"]}})

DELETE statement:
SQL> DELETE FROM blog WHERE author = 'joe';
>>> db.blog.remove({"author": "joe"})

Creating a new collection
Databases and Collections in MongoDB are created only when the first data is inserted.
$ ipython
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
Type "copyright", "credits" or "license" for more information.
...
In [2]: import pymongo                  --- import pymongo package
In [3]: from pymongo import Connection
In [4]: from bson import ObjectId

In [5]: connection = Connection()
In [6]: connection
Out[6]: Connection('localhost', 27017)  --- connected to localhost, in the default TCP port
In [7]: connection.database_names()     --- list existing databases
Out[7]: [u'test', u'local']

In [8]: db = connection['blogdb']       --- connect to a new database. 
                                        --- It will be created when the first object is inserted.

In [9]: post = { "author": "John", 
...          "text": "Blogging about MongoDB"};

In [9]: db.posts.insert(post);                      --- The first insert creates the new collection 'posts'
Out[9]: ObjectId('...')
In [10]: db.collection_names()
[u'system.indexes', u'posts']


Note: Collections can also be organized in namespaces, defined using dot notation. For example, you could create two collections named book.info and book.authors.

Inserting a document in a collection
  • In MongoDB, documents within a collection do not all have to have the same number and type of fields ("columns"). In other words, schemas in MongoDB are dynamic and can vary within one collection.
  • PyMongo uses dictionary objects to represent JSON-style documents.
  • To add a new document to a collection, using ipython:
In [9]: post = { 
   ...:     'author': 'Joann',
   ...:     'text': 'Just finished reading the Book of Nights'}

In [10]: db.posts.insert(post)        --- Method call to create a new document (post)
Out[10]: ObjectId('4eb99ad5a9e15833b1000000')

In [17]: for post in db.posts.find():   --- listing all documents in the posts collection.
             post
   ....:     
   ....:     
Out[18]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
  • Note that you don't need to specify the "_id" field when inserting a new document into a collection.
  • The document identifier is automatically generated by the database and is unique across the collection.
  • You can also execute bulk inserts:
In [13]: many_posts = [{'author': 'David',
   ....:                'text' : "David's Blog"},
   ....:               {'author': 'Monique',
   ....:                'text' : 'My photo blog'}]

In [14]: db.posts.insert(many_posts)
Out[14]: [ObjectId('4eb9bcada9e15809f3000000'), ObjectId('4eb9bcada9e15809f3000001')]

In [15]: for post in db.posts.find():
   ....:     post
   ....:     
   ....:     
Out[15]: 
{u'_id': ObjectId('4eb99ad5a9e15833b1000000'),
 u'author': u'Joann',
 u'text': u'Just finished reading the Book of Nights'}
Out[15]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000000'),
 u'author': u'David',
 u'text': u"David's Blog"}
Out[15]: 
{u'_id': ObjectId('4eb9bcada9e15809f3000001'),
 u'author': u'Monique',
 u'text': u'My photo blog'}

Selecting (reading) documents inside collections
  • Data in MongoDB is represented by structures of key-value pairs, using JSON-style documents.
  • Let's query the collection "things" and ask for ONE document in that collection. Use the find_one() method.
>>> things.find_one()                             --- returns the first document in the collection
{u'_id': ObjectId('4eb787821b02fd09c403b219'), u'name': u'mongo'}

Here it returned a document containing two fields (key-value pairs): 
  "_id": ObjectId('4eb787821b02fd09c403b219')  --- (an identifier for the document), and 
  "name": 'mongo'                              --- a "column" "name" with its associated value, the string 'mongo'.

We can also define criteria for the query. For example,
(a) return one document with field "name" equal to "mongo"
>>> things.find_one({"name":"mongo"});
{u'_id': ObjectId('4eb787821b02fd09c403b219'), u'name': u'mongo'}
>>>

(b) return one document with field "name" equal to "book"
>>> things.find_one({"name":"book"});
{u'keywords': [u'NoSQL', u'MongoDB', u'PyMongo'], u'date': datetime.datetime(2011, 11, 7, 19, 47, 44, 722000), u'_id': ObjectId('...'), u'name': u'book', u'title': u'Mastering MongoDB'}

Note: The dynamic nature of MongoDB database schemas can be seen in the results of the queries above. Here the collection "things" has two documents with a different number of fields ("columns") and datatypes: 
 {"name": "mongo"}
 {"name": "book", "title": "Mastering MongoDB", "keywords": ["NoSQL", "MongoDB", "PyMongo"], "date": datetime.datetime(2011, 11, 7, 19, 47, 44, 722000)}

Querying more than one document
A query returns a cursor pointing to all the documents that matched the query criteria.
To see these documents you need to iterate through the cursor elements:
>>> for thing in things.find():
...     thing
... 
{u'_id': ObjectId('...'), u'name': u'mongo'}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 1.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 2.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 3.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 4.0}
{u'keywords': [u'NoSQL', u'MongoDB', u'PyMongo'], u'date': datetime.datetime(...), u'_id': ObjectId('...'), u'name': u'book', u'title': u'Mastering MongoDB'}
{u'keywords': [u'programming', u'Python', u'MongoDB'], u'date': datetime.datetime(...), u'_id': ObjectId('...'), u'name': u'book', u'title': u'Python and MongoDB'}
{u'name': u'book', u'title': u'Python Notes', u'keywords': [u'programming', u'Python'], u'year': 2011, u'date': datetime.datetime(...), u'_id': ObjectId('4...')}

-- Alternatively, you can explicitly define a cursor variable: 
>>> cursor = things.find()
>>> for x in cursor:
...     x
... 
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 1.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 2.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 3.0}
{u'x': 4.0, u'_id': ObjectId('...'), u'j': 4.0}
{u'keywords': [u'NoSQL', u'MongoDB', u'PyMongo'], u'date': datetime.datetime(...), u'_id': ObjectId('...'), u'name': u'book', u'title': u'Mastering MongoDB'}
{u'keywords': [u'programming', u'Python', u'MongoDB'], u'date': datetime.datetime(...), u'_id': ObjectId('...'), u'name': u'book', u'title': u'Python and MongoDB'}
{u'name': u'book', u'title': u'Python Notes', u'keywords': [u'programming', u'Python'], u'year': 2011, u'date': datetime.datetime(...), u'_id': ObjectId('...')}
>>> 


You can also return only some of the document fields. (Similar to a SQL query that returns only a subset of the table columns).
>>> for thing in things.find({"name":"book"}, {"keywords": 1}):
...     thing
... 
{u'keywords': [u'NoSQL', u'MongoDB', u'PyMongo'], u'_id': ObjectId('...')}
{u'keywords': [u'programming', u'Python', u'MongoDB'], u'_id': ObjectId('...')}
{u'keywords': [u'programming', u'Python'], u'_id': ObjectId('...')}
>>> 

Updating documents in collections
  • MongoDB supports atomic updates in document fields as well as more traditional updates for replacing an entire document.
  • Use the update() method to entirely replace the document matching criteria with a new document.
  • If you want to modify only some attributes of a document, you need to use an update modifier such as $set.
  • update() usually takes two parameters:
    • the first selects the documents to be updated (similar to the WHERE clause in SQL);
    • the second contains the new values for the document attributes.


Example: insert a new document in the blog collection, and update the tag values.
(1) Insert a new document in the blog collection

>>> new_post = { "author": "Monique", 
...       "text": "Sharding in MongoDB",
...       "tags": ["MongoDB"],
...       "date": datetime.datetime.utcnow()};
>>>
>>> db.blog.insert(new_post)
ObjectId('...')
>>> 

(2) list documents in the collection
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(2011, 11, 7, 22, 10, 43, 77000), u'text': u'Blogging about MongoDB', u'_id': ObjectId('...'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
{u'date': datetime.datetime(2011, 11, 8, 1, 5, 32, 604000), u'text': u'Sharding in MongoDB', u'_id': ObjectId('...'), u'author': u'Monique', u'tags': [u'MongoDB']}
>>> 

Now, update the post where the author was Monique.
(1) substitute the document for an entirely new document
>>> db.blog.update({"author":"Monique"}, { "author": "Monique", "text": "Sharding in MongoDB", "tags": ["MongoDB", "scalability"], "date": datetime.datetime.utcnow()});
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(2011, 11, 7, 22, 10, 43, 77000), u'text': u'Blogging about MongoDB', u'_id': ObjectId('...'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
{u'date': datetime.datetime(2011, 11, 8, 1, 8, 43, 416000), u'text': u'Sharding in MongoDB', u'_id': ObjectId('...'), u'author': u'Monique', u'tags': [u'MongoDB', u'scalability']}
>>> 

Note that the previous update replaced the document entirely, even though all you needed to do was add one new tag to the tags field. If you call the update method and pass only the new values for the tags attribute, the resulting document will be incorrect:
>>> db.blog.update({"author":"Monique"}, { "tags": ["MongoDB", "scalability"]});
>>>
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(...), u'text': u'Blogging about MongoDB', u'_id': ObjectId('...'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
{u'_id': ObjectId('...'), u'tags': [u'MongoDB', u'scalability']}                  --- updated document: all other fields are gone
>>> 

(2) Another way to update only some fields of a document is to use the $set update modifier.
  • The $set modifier works like the SET clause of an SQL UPDATE statement, with which you can specify the columns to be updated
>>> db.blog.update({"author":"Monique"}, { "$set": {"tags": ["MongoDB","Scalability"]}});
>>>
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(...), u'text': u'Blogging about MongoDB', u'_id': ObjectId('...'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
{u'date': datetime.datetime(...), u'text': u'Sharding in MongoDB', u'_id': ObjectId('...'), u'tags': [u'MongoDB', u'Scalability'], u'author': u'Monique'}
>>> 

(3) Since the "tags"field is an array, you can more efficiently use the $push update modifier.
  • $push appends value to field, if field is an existing array, otherwise sets field to the array [value] if field is not present.
>>> db.blog.update({"author":"Monique"}, { "$push": {"tags":"Python"}});
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(2011, 11, 7, 22, 10, 43, 77000), u'text': u'Blogging about MongoDB', u'_id': ObjectId('4eb857b3a9e158609c000004'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
{u'date': datetime.datetime(2011, 11, 8, 1, 5, 32, 604000), u'text': u'Sharding in MongoDB', u'_id': ObjectId('4eb88081a9e158609c000005'), u'tags': [u'MongoDB', u'Scalability', u'Python'], u'author': u'Monique'}
>>> 


Deleting documents from collections
To delete a document from a collection, use the remove method, passing as parameter a document field that either (a) uniquely identifies the document you want to delete or (b) identifies the set of documents you want to delete.
>>> db.blog.remove({"author":"Monique"})
>>> for post in db.blog.find():
...     post
... 
{u'date': datetime.datetime(2011, 11, 7, 22, 10, 43, 77000), u'text': u'Blogging about MongoDB', u'_id': ObjectId('4eb857b3a9e158609c000004'), u'author': u'John', u'tags': [u'MongoDB', u'NoSQL', u'Python']}
>>>


Installing MongoDB on Ubuntu


Recently I have been trying MongoDB and other NoSQL databases for a project, so I decided to compile some of my notes in posts here.

MongoDB: Download, Install and Configuration

Installing MongoDB on Ubuntu
To install MongoDB on Ubuntu, you can use the packages made available by 10gen, following the steps below:

(1) add a line to your /etc/apt/sources.list
deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen

(2) Add the 10gen GPG key, or apt will disable the repository (apt uses encryption keys to verify if the repository is trusted and disables untrusted ones).
jdoe@quark:/etc/apt$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
Executing: gpg --ignore-time-conflict --no-options --no-default-keyring --secret-keyring /etc/apt/secring.gpg --trustdb-name /etc/apt/trustdb.gpg --keyring /etc/apt/trusted.gpg --primary-keyring /etc/apt/trusted.gpg --keyserver keyserver.ubuntu.com --recv 7F0CEB10
gpg: requesting key 7F0CEB10 from hkp server keyserver.ubuntu.com
gpg: key 7F0CEB10: public key "Richard Kreuter " imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
jdoe@quark:/etc/apt$ 

(3) To install the package, update the sources and then install:
$ sudo apt-get update
$ sudo apt-get install mongodb-10gen

(4) Create directory for datafiles and database logs.
MongoDB by default tries to store datafiles in /data/db.
If the directory does not exist, the server will fail to start unless you explicitly assign a different, existing location for the datafiles.
For security reasons, make sure the directory is created by a non-root user.
(a) You can create the default directory: 
$ sudo mkdir -p  /data/db/
$ sudo chown `id -u` /data/db

(b) Or you can choose to store the datafiles somewhere else. If you do, make sure to specify the datafile location with the --dbpath option when starting the MongoDB server.
$ sudo mkdir -p  
$ sudo chown `id -u` 

(5) You can test the installation by starting the mongodb shell:
jdoe@quark:~$ mongo
MongoDB shell version: 2.0.1
connecting to: test
> 

The installation creates the mongodb user and installs files according to the default configuration below:
Installed architecture
  • binaries installed on /usr/bin
jdoe@quark:/usr/bin$ ls -l mongo*
-rwx... mongo            -- database shell
-rwx... mongod           -- mongodb daemon. This is the core database process
-rwx... mongodump        -- hot backups: creates a binary representation of the entire database, collections, or collection objects
-rwx... mongoexport      -- exports a collection to JSON or CSV
-rwx... mongofiles       -- tool for using GridFS, a mechanism for manipulating large files in MongoDB
-rwx... mongoimport      -- imports a JSON/CSV/TSV file into a MongoDB
-rwx... mongorestore     -- restores the output of mongodump
-rwx... mongos           -- sharding controller. Provides automatic load balancing and partitioning
-rwx... mongostat        -- shows usage statistics (counts and percentages) on a running mongodb instance 
-rwx... mongotop         -- provide read/write statistics on collections and namespaces in a mongodb instance
  • Configuration file installed on /etc/mongodb.conf
  • database files will be created in: dbpath=/var/lib/mongodb
  • log files will be created in : logpath=/var/log/mongodb/mongodb.log



Starting up and Stopping MongoDB
  • mongod is the MongoDB core database process. It can be started manually to run in the foreground or as a daemon.
  • MongoDB is a database server: it runs in the foreground or background and waits for connections from the user.
  • There are a number of options with which mongod can be initialized.
  • The startup options fall into general, replication, master/slave, replica set and sharding categories. Some of the startup options are:
--port      num   TCP port which mongodb will use
--maxConns  num   max # of simultaneous connections
--logpath   path  log file path
--logappend       append instead of overwrite log file
--fork            fork server process (daemon)
--auth            authenticate users
--dbpath    path  directory for datafiles
--directoryperdb  each database will be stored in its own directory
--shutdown        shuts down the server

(a) start mongodb running in the foreground in a terminal. Data stored in /mongodb/data. mongodb uses default port 27017.
(You need to create the /mongodb/data directory first.)
jdoe@quark:~$ mkdir -p /mongodb/data
jdoe@quark:~$ mongod --dbpath /mongodb/data  
...
Sun Nov  6 19:05:09 [initandlisten] options: { dbpath: "/mongodb/data" }
Sun Nov  6 19:05:09 [websvr] admin web console waiting for connections on port 28017
Sun Nov  6 19:05:09 [initandlisten] waiting for connections on port 27017
...

jdoe@quark:~$ ps -ef | grep mongo
jdoe   20519 16142  0 19:05 pts/1    00:00:10 mongod --dbpath /mongodb/data
jdoe   20566 20034  0 19:49 pts/2    00:00:00 grep mongo

jdoe@quark:~$ ls -l /mongodb/data
total 4
-rwxr-xr-x 1 jdoe jdoe 6 2011-11-06 19:05 mongod.lock

(b) start mongodb as a daemon, running on TCP port 20012. Data stored in /mongodb/data. Logs on /mongodb/log.
jdoe@quark:~$ mongod --fork --port 20012 --dbpath /mongodb/data/ --logpath /mongodb/logs/mongodblog --logappend 
forked process: 20655
jdoe@quark:~$ all output going to: /mongodb/logs/mongodblog

Alternatively, you can start and stop mongoDB through the Upstart service:
jdoe@quark:~$ sudo start mongodb
mongodb start/running, process 2824

jdoe@quark:~$ sudo stop mongodb
mongodb stop/waiting

Stopping MongoDB

  1. Control-C will do it, if the server is running in the foreground. Mongo waits until all ongoing operations complete and then exits.
  2. Alternatively:
(a) call mongod with --shutdown option
jdoe@quark:~$ mongod --dbpath /mongodb/data --shutdown
killing process with pid: 20746
or
(b) use database shell (mongo)
(Here, note the confusing output of the db.shutdownServer() call. Although the messages suggest failure, the database is shut down as expected.)
jdoe@quark:~$ mongo quark:20012
MongoDB shell version: 2.0.1
connecting to: quark:20012/test
> use admin
switched to db admin
> db.shutdownServer()
Sun Nov  6 20:23:59 DBClientCursor::init call() failed
Sun Nov  6 20:23:59 query failed : admin.$cmd { shutdown: 1.0 } to: quark:20012
server should be down...

Oracle Flashback Technology (III) - Flashback Data Archive



Oracle Flashback technology comprises logical flashback features (which do not depend on RMAN and rely on undo data) and physical flashback. Flashback Data Archive is new in Oracle 11g.



Using Flashback Data Archive (Oracle Total Recall)
  • With Flashback Data Archive you can store and track transactional changes to a record over its lifetime.
  • It permanently stores undo information in flashback archives, allowing you to keep the transactional history of an object since its creation.
  • Flashback archives are enabled on individual tables and are located in tablespaces, and have a name, a specified retention period and a space quota on the tablespace.
  • A database can have multiple flashback archives.
  • When a DML transaction commits on a flashback archive enabled table, the Flashback Data Archiver (FBDA) process stores the pre-image of the rows in a flashback archive.
  • FBDA also manages the data within the flashback archives (purging data beyond the retention period).
  • Historical data can be queried using the Flashback Query AS OF clause (see the query example at the end of this section).
  • Useful for compliance with record retention policies and audit requirements.


To enable flashback archiving for a table:
  • You need the FLASHBACK ARCHIVE privilege on a flashback data archive
  • The table cannot be clustered, nested, temporary, remote or external
  • The table cannot have LONG or nested table columns


Create a Flashback Data Archive
(1) Create a new tablespace (you may also use an existing one)

SQL> create tablespace fda_ts
   datafile '/u01/app/oracle/oradata/test112/fda1_01.dbf'
   size 1m autoextend on next 1m;

SQL> select tablespace_name, status, contents, retention
  from dba_tablespaces
  where tablespace_name ='FDA_TS';

TABLESPACE_NAME                STATUS    CONTENTS  RETENTION   
------------------------------ --------- --------- ----------- 
FDA_TS                         ONLINE    PERMANENT NOT APPLY   

(2) Create flashback archives:

SQL> create flashback archive default fda_1m tablespace fda_ts   -- Must be SYSDBA to create DEFAULT FDA
  quota 1G retention 1 month;                                -- To change use ALTER FLASHBACK ARCHIVE...SET DEFAULT

SQL> create flashback archive fda_2yr tablespace fda_ts retention 2 year;
  
SQL> create flashback archive fda_10d tablespace fda_ts retention 10 day;

Managing Flashback Data Archives:
(1) Manage FDA tablespaces:

ALTER FLASHBACK ARCHIVE...
   ...SET DEFAULT;
   ... ADD TABLESPACE... QUOTA...;
   ... MODIFY TABLESPACE...
   ... REMOVE TABLESPACE...

(2) Manage retention period:

ALTER FLASHBACK ARCHIVE fda_name MODIFY RETENTION n [Year | Month | day ];

(3) Purge historical data

ALTER FLASHBACK ARCHIVE...
   ...PURGE ALL;                          -- Purge ALL historical data
   ...PURGE BEFORE TIMESTAMP (SYSTIMESTAMP - INTERVAL 'n' DAY);
   ...PURGE BEFORE SCN scn_num;

(4) Drop FDA:

DROP FLASHBACK ARCHIVE fda_name;           -- Drops FDA. Keeps tablespace.

Enabling FDAs on objects:
  • Flashback archiving is disabled by default
  • A user needs the FLASHBACK ARCHIVE privilege on an archive to enable flashback archiving on an object.
SQL> conn sys/pwd as sysdba;

SQL> grant flashback archive on fda_1m to userA;

SQL> conn userA/pwd;

SQL> Create table emp 
  (empno number primary key,
   ename varchar2(20),
   salary number) 
  flashback archive fda_1m;


-- To Disable Flashback archive on table
SQL> ALTER TABLE emp NO flashback archive;
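
Flashback archiving can also be switched on for a table that already exists, using ALTER TABLE (assuming the same emp table and the fda_1m archive):

SQL> ALTER TABLE emp FLASHBACK ARCHIVE fda_1m;

-- Note: disabling flashback archiving (NO FLASHBACK ARCHIVE, above) discards the table's archived history.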

Information about Flashback Data Archives is available in the dictionary views
DBA_FLASHBACK_ARCHIVE, DBA_FLASHBACK_ARCHIVE_TS and DBA_FLASHBACK_ARCHIVE_TABLES:
SQL> select owner_name, flashback_archive_name, retention_in_days, status, 
       to_char(last_purge_time, 'dd-mon-yy hh24:mi:ss')
from dba_flashback_archive;

OWNER_NAME    FLASHBACK_ARCHIVE_NAME  RETENTION_IN_DAYS      STATUS  LAST_PURGE_TIME           
------------- ----------------------- ---------------------- ------- ------------------------- 
SYS           FDA_1M                  30                     DEFAULT 25-oct-11 13:34:14 
SYS           FDA_2YR                 730                            25-oct-11 13:34:54 
SYSTEM        FDA_10D                 10                             25-oct-11 13:38:05

SQL> select * from dba_flashback_archive_ts;

FLASHBACK_ARCHIVE_NAME  FLASHBACK_ARCHIVE#     TABLESPACE_NAME                QUOTA_IN_MB 
----------------------- ---------------------- ------------------------------ ------------
FDA_1M                  1                      FDA_TS                         1024        
FDA_2YR                 2                      FDA_TS                                     
FDA_10D                 3                      FDA_TS                                     


SQL> select * from dba_flashback_archive_tables;

TABLE_NAME  OWNER_NAME                     FLASHBACK_ARCHIVE_NAME ARCHIVE_TABLE_NAME   STATUS   
----------- ------------------------------ ---------------------- ------------------- -------- 
EMP         SYSTEM                         FDA_1M                 SYS_FBA_HIST_75434  ENABLED  


Example: Viewing table history.
(1) Insert data into emp
(2) Keep a record of some points in time (timestamps and SCNs)
(3) Query the historical data on emp
SQL> select to_char(systimestamp, 'dd-mon-yy hh24:mi:ss') start_time, 
       current_scn start_scn
     from v$database;

START_TIME         START_SCN              
------------------ ---------------------- 
25-oct-11 14:22:25 1498655       

SQL> select * from emp;

EMPNO                  ENAME                SALARY                 
---------------------- -------------------- ---------------------- 

-- PL/SQL block performs a number of DMLs on emp and prints timestamps
set serveroutput on
declare
 procedure get_timestamp
 is
   v_time varchar2(25);
   v_scn  integer;
 begin
   select to_char(systimestamp, 'dd-mon-yy hh24:mi:ss') start_time, 
        current_scn start_scn into v_time, v_scn
   from v$database;
   dbms_output.put_line('timestamp: ' || v_time);
   dbms_output.put_line('SCN:       ' || v_scn);
end;
 
begin
  insert into emp values (1, 'John', 2000);
  commit;
  dbms_lock.sleep(2);
  get_timestamp;   -- note: PL/SQL parameterless calls take no parentheses
  for i in 1 .. 10 
  loop
   update emp set salary = salary*1.05 where empno = 1;
   commit;
   dbms_lock.sleep(2);
   if i=5 then
     insert into emp values (2, 'Mary', 3000);
     update emp set salary = 2500 where empno =1;
     commit;
     dbms_lock.sleep(2);
     update emp set ename = initcap(ename);
     commit;
     insert into emp values (3, 'Gary', 1500);
     delete from emp where empno=2;
     commit;
     get_timestamp;
   end if;
  end loop;
  dbms_lock.sleep(2);
  get_timestamp;
end;
/

anonymous block completed
timestamp: 25-oct-11 14:22:27
SCN:       1498659
timestamp: 25-oct-11 14:22:39
SCN:       1498683
timestamp: 25-oct-11 14:22:51
SCN:       1498700

SQL> select * from emp;

EMPNO                  ENAME                SALARY                 
---------------------- -------------------- ---------------------- 
1                      John                 3190.70390625          
3                      Gary                 1500         

SQL> select to_char(systimestamp, 'dd-mon-yy hh24:mi:ss') end_time, 
       current_scn end_scn
     from v$database;
END_TIME           END_SCN                
------------------ ---------------------- 
25-oct-11 14:22:51 1498701


(a) Select data at a point in time
SQL> select *  from emp as of scn 1498683;

EMPNO                  ENAME                SALARY                 
---------------------- -------------------- ---------------------- 
1                      John                 2500                   
3                      Gary                 1500       

SQL> select * 
     from emp as of timestamp to_timestamp('25-oct-11 14:22:51', 'dd-mon-yy hh24:mi:ss');
EMPNO                  ENAME                SALARY                 
---------------------- -------------------- ---------------------- 
1                      John                 3190.70390625          
3                      Gary                 1500  


(b) Select all versions of a row between two points in time

SQL> select *
     from emp
       versions between scn 1498659 and 1498700
     where empno =1;
EMPNO                  ENAME                SALARY                 
---------------------- -------------------- ---------------------- 
1                      John                 3190.70390625          
1                      John                 2000                   
1                      John                 2100                   
1                      John                 2205                   
1                      John                 2315.25                
1                      John                 2431.0125              
1                      John                 2552.563125            
1                      John                 2500                   
1                      John                 2500                   
1                      John                 2625                   
1                      John                 2756.25                
1                      John                 2894.0625              
1                      John                 3038.765625            
1                      John                 3190.70390625          

 14 rows selected 


SQL> select versions_xid xid, versions_startscn start_scn,
            versions_endscn end_scn, versions_operation operation,
            empno, ename, salary
     from emp
        versions between scn 1498659 and 1498700
     where empno =1;

XID              START_SCN   END_SCN                OPERATION EMPNO   ENAME                SALARY                 
---------------- ----------- ---------------------- --------- ------- -------------------- ---------------------- 
03000F008B040000 1498633     1498674                I         1       John                 3190.70390625          
05001F00AA040000 1498657     1498661                I         1       John                 2000                   
030003008B040000 1498661     1498664                U         1       John                 2100                   
02000A007E040000 1498664     1498667                U         1       John                 2205                   
01000D003A030000 1498667     1498670                U         1       John                 2315.25                
0400090075030000 1498670     1498672                U         1       John                 2431.0125              
06000B0094040000 1498672     1498674                U         1       John                 2552.563125            
0900080096040000 1498674     1498678                U         1       John                 2500                   
03001F008B040000 1498678     1498685                U         1       John                 2500                   
09001F0097040000 1498685     1498688                U         1       John                 2625                   
080010006C050000 1498688     1498691                U         1       John                 2756.25                
0700190078030000 1498691     1498694                U         1       John                 2894.0625              
03001A008B040000 1498694     1498697                U         1       John                 3038.765625            
05001E00AB040000 1498697                            U         1       John                 3190.70390625          

 14 rows selected
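
The same row history can also be retrieved by timestamp instead of SCN, using the start and end times captured earlier (a sketch; output omitted):

SQL> select versions_startscn, versions_operation, empno, ename, salary
     from emp
       versions between timestamp
         to_timestamp('25-oct-11 14:22:27', 'dd-mon-yy hh24:mi:ss')
         and to_timestamp('25-oct-11 14:22:51', 'dd-mon-yy hh24:mi:ss')
     where empno = 1;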