Infolinks

Sunday, 24 June 2012

SQL*Loader Environment

SQL*Loader Environment


SQL*Loader Environment

SQL*Loader takes as input a control file, which controls the behavior of SQL*Loader, and one or more datafiles. Output of the SQL*Loader is an Oracle database (where the data is loaded), a log file, a bad file, and potentially a discard file.

Execution Steps:

Step1    Executes from the command prompt or parameter file.
Step2    Reads the control file.
Step3    Reads the data from the control file or from one or more datafiles.
Step4    Loads the data in the database and logs the information in the log file.
Step5    Places the rejected records in the bad file.
Step6    Places the discarded records in the discard file.

1] The parameter file:

The command line information can be saved in a parameter file. This parameter file is executed from the command prompt. A parameter file has a .par extension. Following is a sample parameter file.
Steps to create a parameter file:
  • Open a text editor.
  • Type in each parameter with its corresponding value as shown in the parameter file.
  • Save the file with the .par extension.
How do you execute the parameter file?
At the command prompt type sqlldr PARFILE = <parfile name>.

2] The Control File:

  • The control file is a text file written in a language that SQL*Loader understands.
  • The control file describes the task that the SQL*Loader is to carry out. The control file tells SQL*Loader where to find the data, how to parse and interpret the data, where to insert the data, and more.
  • It also contains the names and locations of the bad file and the discard file.
  • Some of above information (such as name and location of the input file) can also be passed to SQL*Loader as command-line parameters.
  • It’s also possible for the control file to contain the actual data to be loaded. This is sometimes done when small amounts of data need to be distributed to many sites, because it reduces (to just one file) the number of files that need to be passed around.
 A sample control file is given below:
In general, the control file has three main sections, in the following order:
A] Session-wide information:
The session-wide information contains the names of the input/output files for the data load session. Apart from this, other SQL*Loader parameters can also be listed in this section.
  • The LOAD DATA statement is required at the beginning of the control file.
  • INFILE * specifies that the data is found in the control file and not in an external data file.
  • BADFILE ‘example1.bad’ indicates that all erroneous records must be stored in the file example1.bad.
  • DISCARDFILE ‘example1.dsc’ indicates that all discarded records must be stored in the file example1.dsc.
B] Table and Field_List Information:
The INTO TABLE statement specifies the table into which data should be loaded. In this case it is the dept table. By default, SQL*Loader requires the table to be empty before it inserts any records.
FIELDS TERMINATED BY specifies that the data is terminated by commas, but can also be enclosed by quotation marks. Data types for all fields default to CHAR.
The names of columns to load are enclosed in parentheses. Because no data type or length is specified, the default is type CHAR with a maximum length of 255.
C] Input Data:
BEGINDATA specifies the beginning of the data. The data to be loaded is present below the BEGINDATA command.

3] Input Datafiles:

  • The data to be loaded is contained in one or more datafiles if it is not contained in the control file.
  • The data in the datafile can be in the fixed length format, variable length format, or in the stream record format.
A] Fixed Length Format:
A file is in the fixed record format when all the records in the datafile have the same byte length. This format is not flexible but offers very good performance.then the syntax for the INFILE command is – INFILE student.dat “fix 15″
The syntax for letting SQL*Loader know that the data is in the fixed length format is:
INFILE datafile_name “fix n”
Here INFILE datafile_name refers to the file that contains the data to be loaded. “fix n” implies that each record in the datafile has a fixed byte length of n.
For example if the name of the following datafile is student.dat and the byte length of a record is 15 bytes
0001, —–Rina, 0002, —-Harry, 0003,—–Sudha
B] Variable Length Format:
A file is in the variable record format when the length of each record varies. The length of each record is included at the beginning of the record in the datafile. This format provides some added flexibility over the fixed record format and a performance advantage over the stream record format.
For example, you can specify a datafile that is to be interpreted as being in variable record format as follows:
INFILE “datafile_name” “var n”
Here n specifies the number of bytes in the record length field. If n is not specified, SQL*Loader assumes a length of 5 bytes. If n is specified larger than 40 it results in an error. The following datafile is random.dat and the value for n is 3.
009hello,cd,010world,im,
012my,name is,
SQL*Loader reads the first 3 bytes to gather the length of the record. Here the first record is 9 bytes long. After SQL*Loader has read 9 bytes, it reads the next 3 bytes to find the size of this record which is 10 bytes long. It reads the next 10 bytes of the record and then finds the third record is 12 bytes long and so on.
C] Stream Record Format:
A file is in the stream record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator. Stream record format is the most flexible format, but there can be a negative effect on performance. The syntax for specifying the stream record format is as follows:
INFILE datafile_name ["str terminator_string"]
The terminator_string can be a ‘char_string’  which is a string of characters enclosed in single or double quotation marks or a ‘hex_string’ which is a byte string in hexadecimal format.

4] The Log File:

The log file is a record of SQL*Loader’s activities during a load session. It contains information such as the following:
  • The names of the control file, log file, bad file, discard file, and data file
  • The values of several command-line parameters
  • A detailed breakdown of the fields and datatypes in the data file that was loaded
  • Error messages for records that cause errors
  • Messages indicating when records have been discarded
  • A summary of the load that includes the number of logical records read from the data file, the number of rows rejected because of errors, the number of rows discarded because of selection criteria, and the elapsed time of the load
Always review the log file after a load to be sure that no errors occurred, or at least that no unexpected errors occurred. This type of information is written to the log file, but is not displayed on the terminal screen.

5] The Bad File:

Whenever you insert data into a database, you run the risk of that insert failing because of some types of error. Integrity constraint violations undoubtedly represent the most common type of error. However, other problems, such as the lack of free space in a tablespace, can also cause insert operations to fail. Whenever SQL*Loader encounters a database error while trying to load a record, it writes that record to a file known as the bad file.
  • If one or more records are rejected, the bad file is created and the rejected records are logged.
  • If no records are rejected, then the bad file is not created.

6] The Discard File:

While SQL*Loader is being executed it creates a discard file for records that do not meet any of the loading criteria. The records contained in this file are called discarded records. Discarded records do not satisfy any of the WHEN clauses specified in the control file. These records differ from rejected records. Discarded records do not necessarily have any bad data. A discarded record is never inserted into the Oracle table.
A discard file is created according to the following rules:
  • You have specified a discard filename and one or more records fail to satisfy all of the WHEN clauses specified in the control file. (If the discard file is created, it overwrites any existing file with the same name, so be sure that you do not overwrite any files that you want to retain.)
  • If no records are discarded, then a discard file is not created.

No comments:

Post a Comment