Saturday 10 September 2016

Datapump Architecture: What is the Master Table in Datapump?



Master Table:

The Master Table is created in the schema of the user running the Data Pump export or import, and it keeps track of a great deal of detailed information.

The Master Table is used to track the detailed progress information of a Data Pump job.

It stores the following information:
·         The status of every worker process involved in the operation.
·         The current set of dump files involved.
·         The job’s user-supplied parameters.
·         The current job state and restart information.
·         The current state of every object exported or imported and their locations in the dump file set.

Note: The Master Table is the key to Data Pump’s restart capability in the event of a planned or unplanned job stoppage.
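While a job is running, both the job and its Master Table can be seen from the dictionary. A minimal example, assuming DBA privileges (the view is standard, though its exact column list can vary slightly by release):

-- List active Data Pump jobs; the Master Table carries the same name as JOB_NAME
SELECT owner_name, job_name, operation, job_mode, state, attached_sessions
FROM   dba_datapump_jobs;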

Behaviour of Master Table:
This table is created at the beginning of a Data Pump operation and is dropped upon its successful completion. The Master Table is also dropped if the job is killed using the KILL_JOB interactive command. If the job is stopped using the STOP_JOB interactive command, or if it terminates unexpectedly, the Master Table is retained.

The KEEP_MASTER parameter can be set to YES to retain the Master Table at the end of a successful job for debugging purposes.
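The restart behaviour described above can also be exercised through the DBMS_DATAPUMP API. A minimal sketch, assuming a stopped export job named DEMO_EXP_JOB owned by the current user (the job name is hypothetical):

DECLARE
  h NUMBER;
BEGIN
  -- Re-attach to the stopped job; Data Pump reads its state from the Master Table
  h := DBMS_DATAPUMP.ATTACH(job_name => 'DEMO_EXP_JOB', job_owner => USER);
  -- Resume the job from where it left off
  DBMS_DATAPUMP.START_JOB(h);
  -- Detach and leave the job running in the background
  DBMS_DATAPUMP.DETACH(h);
END;
/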


The Master Table has the same name as the Data Pump job, and its columns can be listed by describing it:

SQL> DESC <job_name>;
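For illustration only, a hedged peek inside a Master Table (SYS_EXPORT_SCHEMA_01 is simply the default job name for a schema export, and the column names shown here can differ between releases):

-- Each object handled by the job has a row recording its type, name and state
SELECT process_order, object_type, object_schema, object_name, processing_state
FROM   sys_export_schema_01
WHERE  object_name IS NOT NULL
ORDER  BY process_order;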



Processes in the Data Pump Architecture


Master Control Process:
·         Maintains job state, job description, restart, and dump file set information in the Master Table.
·         This process controls the execution and sequencing of a Data Pump job.
·         The master control process has two main functions:
1.       To divide the loading and unloading of data and metadata into tasks and hand them to the worker processes;
2.       To manage the information in the Master Table and record job activities in the log file.
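A hedged way to see the master control process (and any workers) for a running job is to map Data Pump sessions to database sessions; the SESSION_TYPE values noted in the comment are assumptions that may vary by release:

-- Show which database sessions belong to a Data Pump job
SELECT s.sid, s.serial#, d.session_type, d.job_name
FROM   dba_datapump_sessions d
       JOIN v$session s ON s.saddr = d.saddr
ORDER  BY d.job_name, d.session_type;   -- session_type is e.g. MASTER or WORKER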


Worker Process:
·         This process handles the requests assigned by the master control process and maintains the current status of that work, such as ‘pending’, ‘completed’, or ‘failed’.
·         The worker process is responsible for loading and unloading data and metadata.
·         The number of worker processes is controlled by the PARALLEL parameter, as sketched below.
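A minimal DBMS_DATAPUMP sketch showing how the PARALLEL setting maps to worker processes. It assumes a directory object named DPUMP_DIR and sufficient export privileges; the job and file names are hypothetical:

DECLARE
  h NUMBER;
BEGIN
  h := DBMS_DATAPUMP.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA',
                          job_name  => 'DEMO_EXP_JOB');
  -- %U lets Data Pump create one dump file per parallel stream
  DBMS_DATAPUMP.ADD_FILE(h, filename => 'demo_%U.dmp', directory => 'DPUMP_DIR');
  DBMS_DATAPUMP.ADD_FILE(h, filename => 'demo.log', directory => 'DPUMP_DIR',
                         filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE);
  -- Two worker processes will be started for this job
  DBMS_DATAPUMP.SET_PARALLEL(h, degree => 2);
  DBMS_DATAPUMP.START_JOB(h);
  DBMS_DATAPUMP.DETACH(h);
END;
/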



Parallel Query Process:
·         These processes are used when Data Pump chooses the External Table API as the data access method for loading and unloading data.
·         The worker process that uses the External Table API creates multiple parallel query processes for data movement, with the worker process acting as the query coordinator.
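One hedged way to observe this is to look for parallel query slaves whose query coordinator is a Data Pump worker session (the 'WORKER' session type value is an assumption and may differ by release):

-- PQ slaves report their coordinator's SID in QCSID
SELECT p.sid AS slave_sid, p.qcsid AS coordinator_sid, p.degree
FROM   v$px_session p
WHERE  p.qcsid IN (SELECT s.sid
                   FROM   dba_datapump_sessions d
                          JOIN v$session s ON s.saddr = d.saddr
                   WHERE  d.session_type = 'WORKER');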


Shadow Process:
·         This process is created when a client logs into the Oracle server. 
·         The shadow process creates a job, which primarily consists of creating the Master Table, creating the queues in Advanced Queues (AQ) used for communication among the various processes, and creating the master control process.
·         Once a job is running, the shadow process’ main job is to check the job status for the client process.  If the client process detaches, the shadow process goes away; however, the remaining Data Pump job processes are still active.
·         Another client process can create a new shadow process and attach to the existing job, as sketched below.
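As a sketch of that last point, a second client can attach to a running job purely to monitor it; here the (hypothetical) job DEMO_EXP_JOB is watched until it completes:

DECLARE
  h     NUMBER;
  state VARCHAR2(30);
BEGIN
  -- Attaching from a new session creates a new shadow process for this client
  h := DBMS_DATAPUMP.ATTACH(job_name => 'DEMO_EXP_JOB', job_owner => USER);
  -- Block until the job finishes, then report its final state
  DBMS_DATAPUMP.WAIT_FOR_JOB(h, state);
  DBMS_OUTPUT.PUT_LINE('Job finished with state: ' || state);
END;
/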
