Apps Tuning
Tuning SQL
The following tips will help you write better-performing SQL.
EXISTS & IN
Original:
SELECT name
FROM employee
WHERE empno IN (SELECT empno
FROM manager)
The sub-query retrieves all the records in the manager table for each employee record and then compares the data.
Modified:
SELECT name
FROM employee
WHERE EXISTS (SELECT manager.empno
FROM manager
WHERE manager.empno = employee.empno)
For each employee record, the sub-query only verifies whether there is a matching record in the manager table and comes out after the first successful match.
Original:
SELECT orderno
FROM orders
WHERE orderno NOT IN (SELECT orderno
FROM pendingorders)
The sub-query retrieves all the records in the pendingorders table for each record in the orders table and then compares the data.
Modified:
SELECT orderno
FROM orders
WHERE NOT EXISTS (SELECT pendingorders.orderno
FROM pendingorders
WHERE pendingorders.orderno = orders.orderno)
For each order record, the sub-query verifies whether there is a matching record in the pendingorders table and comes out at the first successful match (the condition returns FALSE in this case).
Using Functions on Indexed Columns
For e.g., the following SQL constructs prevent the use of the index on empno:
SELECT name
FROM employee
WHERE NVL(empno, ‘INF000’) = ‘INF001’
SELECT name
FROM employee
WHERE UPPER(empno) = ‘INF001’
SELECT name
FROM employee
WHERE empno IS NOT NULL
SELECT name
FROM employee
WHERE LOWER(empno) = ‘inf001’
Note: an ‘IS NULL’ clause on an indexed column cannot use the index either, since NULL values are not stored in a single-column B-tree index.
SELECT name
FROM employee
WHERE TO_CHAR(empno) = ‘001’
Like the functions shown above, use of the SUBSTR function on an indexed column will suppress the index. But there are cases where you can substitute SUBSTR with LIKE and still use the index!
For e.g., suppose your condition is that the first 2 characters of dept should be “ES” (RBO case only):
Index will not be used
SELECT name
FROM employee
WHERE SUBSTR(dept,1,2) = ‘ES’
Index will be used
SELECT name
FROM employee
WHERE dept LIKE ‘ES%’
Note: the index will be used with ‘LIKE’ only if the wildcard ‘%’ appears at the end of the pattern. If it appears at the beginning, or if ‘NOT LIKE’ is used, the index will not be used.
Do not concatenate indexed columns with constant strings, and do not add or subtract numbers on indexed columns. Both of these stop the optimizer from using the index.
For e.g., the following SQL construct prevents the use of index on lastname.
SELECT dept
FROM employee
WHERE lastname || ‘ Query’ = ‘SQL Query’
In the following SQL, adding a number to the salary column suppresses the use of the index.
SELECT name
FROM employee
WHERE salary+1000 = <calc_salary>
The above SQL is better written as follows:
SELECT name
FROM employee
WHERE salary = <calc_salary> - 1000 => This will use the index on salary
If an index is composed of multiple columns, functions applied to the non-leading columns will not prevent the optimizer from using that index, as long as no function is applied to the leading column.
Suppose a unique index on the employee table consists of the columns empid, effective_start_date and effective_end_date:
Index will be used
SELECT name
FROM employee
WHERE empid = <emp_id>
AND trunc(effective_start_date) >= sysdate-365
AND trunc(effective_end_date) < sysdate => function not used on leading index column
Index will not be used
SELECT name
FROM employee
WHERE TO_CHAR(empid) = <emp_id>
AND trunc(effective_start_date) >= sysdate-365
AND trunc(effective_end_date) < sysdate => function used on leading index column
Joins & Sub-queries
Using a single join query instead of a number of individual SQL’s has the following benefits:
- Execution of a number of individual SQL’s is prevented
- Redundant fetches of values already fetched are avoided
- Parsing and execution time, bookkeeping, optimizing, memory and SGA/PGA overheads are avoided
- Joins are faster than PL/SQL
Incorrect Query
SELECT p.product_code, p.product_version, r.text
FROM products p, problem_reports r
WHERE r.product_code = p.product_code
AND p.status = ‘Testing’
Correct Query
SELECT p.product_code, p.product_version, r.text
FROM products p, problem_reports r
WHERE r.product_code = p.product_code
AND r.product_version = p.product_version
AND p.status = ‘Testing’
There are cases where a sub-query can be used with a join. For e.g., if you want to select the latest contact of a customer, where a customer can have multiple contacts, you may use the following workaround:
SELECT c.customer_name, ct1.last_name
FROM customers c, contacts ct1
WHERE c.customer_id = ct1.customer_id
AND ct1.contact_id = (SELECT max(contact_id)
FROM contacts ct2
WHERE ct2.customer_id = c.customer_id)
Outer Join
When you want to retrieve data even if there are no matching records in the join, use an outer join. The correct syntax for an outer join is:
Table1.Column1 (+) = Table2.Column1 or Table1.Column1 = Table2.Column1 (+)
The (+) sign follows the column of the table which has, or may have, the missing rows.
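For e.g., to list every department along with its employees, including departments that currently have no employees (the dept table and its columns here are illustrative):
SELECT d.dept_name, e.name
FROM dept d, employee e
WHERE e.dept_id (+) = d.dept_id
The (+) is placed on the employee side, since employee is the table that may be missing rows.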
Optimizing joins
- The more tables are involved, the more rows have to be read, so the join is usually more expensive.
- Retrieve the smallest result set first when joining more than 3 tables, so that the total read from the database reduces.
- Make sure that the joined columns are indexed for fast retrieval; avoid the use of functions on joined columns.
- When you create a new index, make sure that it does not adversely affect the performance of existing joins.
Usually developer efforts are concentrated on tuning the SQL’s used by the PL/SQL program for performance improvement. But since PL/SQL is a procedural language, there can be situations where excessive CPU usage arises out of the PL/SQL code itself, even though there are no database accesses.
Here different areas of PL/SQL code are categorized and some of the basic principles of optimizing are discussed.
UNION ALL instead of UNION
When using UNION to retrieve the records from two queries, the output will be the unique records from both queries, in sorted order. This means the individual outputs from each query are compared, sorted and the duplicates removed. With UNION ALL, Oracle does not do the sorting; if there are duplicate records, they will appear in the output.
So if you are sure that your output will not contain any duplicate records and there is no need to sort the data, always use UNION ALL instead of UNION.
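For e.g., assuming hypothetical current_orders and archived_orders tables whose order numbers never overlap:
SELECT orderno FROM current_orders
UNION ALL
SELECT orderno FROM archived_orders
Since an order number cannot appear in both tables, the output is the same as with UNION, but Oracle skips the sort and duplicate elimination.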
Order By
Use an ‘Order By’ clause only if it is required to satisfy your program logic or business needs. It works on the output of the query, hence Oracle does double processing when ‘Order By’ is used. If you do use ‘Order By’, order the columns in the select list in the way you want them to be sorted; the exception is when you want any of the columns sorted in DESC order.
There is no need to use ‘Order By’ with the UNION, MINUS, SELECT DISTINCT and GROUP BY features in a query. These queries are automatically sorted by Oracle to provide the necessary feature.
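For e.g., if the output must be sorted by dept and then by name, keep the same order in the select list:
SELECT dept, name
FROM employee
ORDER BY dept, name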
Query Structuring & Driving Table
Based on the way a query is written, its execution varies even though all the conditions and tables are the same. Suppose there are five conditions in your WHERE clause; you may find that, based on their order, the execution time varies for the same output (rule based). Or when you change the order of the tables in the FROM clause, in some cases the execution time varies, for both rule-based and cost-based optimization.
Rule Based Optimizer case only: Oracle parses all SQL from the bottom up and from right to left, i.e. backward from the way it is written, when it executes. So it is better to re-arrange the WHERE clause for better performance.
Order of execution of a query
Oracle follows the below mentioned steps while executing a query.
1. If the query contains UNION or UNION ALL, repeat steps 2 to 7 for each individual query.
2. Find the Cartesian product of the join (the group of tables mentioned in the FROM clause). If there is only one table, that is the product.
3. Apply the conditions in the WHERE clause (if present), bottom up and from right to left. Retain the rows from the product for which the result is TRUE.
4. If there is a GROUP BY clause, group the rows obtained above such that all rows in each group have the same values for the columns in the GROUP BY clause.
5. For each row (or each group, if GROUP BY is used), calculate the desired results in the SELECT list (SUM, MAX, MIN, etc.).
6. Eliminate duplicates if DISTINCT is specified.
7. Merge the results if the query has UNION. Eliminate the duplicates if it is not ‘UNION ALL’.
8. If ORDER BY is specified, sort the result accordingly.
In order to tune a query, first check your WHERE clause and make sure that you have arranged the conditions in the best possible manner. Consider the following e.g.:
SELECT order_no, qty
FROM orders
WHERE sales_person = ‘INFY’
AND order_no IN (SELECT order_no
FROM pending_orders
WHERE qty < 1000
AND dest = ‘Mangalore’)
AND category = ‘Normal’
Say < sales_person = ‘INFY’ > returns 200 records from orders, the sub-query returns 900 records and < category = ‘Normal’ > returns 15000 records. Oracle will parse the query through the 15000 records first, then pass those through the 900 records, then pass the remainder through the 200 records.
It is much more efficient to have Oracle parse through the 200 records first, then pass those results through the 900 possible rows and the remainder through the possible 15000 records. Thus re-writing the above query in the following way will result in a faster query.
SELECT order_no, qty
FROM orders
WHERE category = ‘Normal’
AND order_no IN (SELECT order_no
FROM pending_orders
WHERE qty < 1000
AND dest = ‘Mangalore’)
AND sales_person = ‘INFY’
Now, all other things being equal in the WHERE clause, the order of the tables in the FROM clause decides the execution path. The table listed last in the FROM clause acts as the ‘driving table’ for the query under the rule based optimizer. (Note: the order of tables can change the execution path in some cases under the cost based optimizer also.)
Choose the table which has the least number of rows as the driving table when you want to select all the records from all the tables in the join. When you are selecting only some records by applying conditions, choose the table with the fewest records after the conditions are applied as the driving table. In the case of joins of 3 or more tables, the table that has the maximum joins becomes the driving table. The driving table is the one which is read into memory first, so the query will be faster and more efficient if the driving table is small in size.
Following e.g. illustrates driving table:
(Assumption: There is no index in any of the table.)
Table       No of records   Condition            Records after condition applied
Customers   1500            -                    1500
Items       12000           Category = ‘PC’      1000
Orders      25000           Type = ‘External’    1200
Case1: Selecting all records from all tables.
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
items itm,
orders ord
WHERE ord.item_id = itm.item_id
AND ord.customer_id = cus.customer_id
AND itm.category = ‘PC’
In this case the orders table will act as the driving table. This means the query first selects all the records from the orders table, then matches them with the items table, and then the merged output is compared with the customers table. The above query will work faster if written as follows:
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
orders ord,
items itm => order changed, items act as the driving table
WHERE ord.item_id = itm.item_id
AND ord.customer_id = cus.customer_id
AND itm.category = ‘PC’
Now, say a non-unique index on the column ‘type’ is added on the orders table, and we want to add the condition < ord.type = ‘External’ > to the query. We know that with this condition the number of records fetched from the orders table reduces to 1200, so it is better to make orders the driving table. (Even though the number of records fetched from the items table is less than from the orders table, we would have to do a FULL TABLE SCAN on items, so the orders table is preferred.)
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
items itm,
orders ord
WHERE cus.customer_id = ord.customer_id
AND itm.category = ‘PC’
AND itm.item_id = ord.item_id
AND ord.type = ‘External’
So the available indexes, the number of records fetched and the sizes of the tables in the join decide how you should write the query for better performance.
Cost Based Optimizer – Special Note:
When using the cost based optimizer, ensure that tables are analyzed regularly by the DBA. A standard concurrent program is available for this in Oracle Applications. Beyond this, to tune queries in a cost based environment we can use HINTS. Hints instruct Oracle to use a particular index, not to use a particular index, or to scan a particular table in full.
In some cases you might want to disable an index so that the optimizer uses an alternative index on the table. To disable an index, you can use an operator on the column or concatenate it with NULL. The following SQL illustrates this.
SELECT amount
FROM ap_invoice_distributions_all
WHERE dist_code_combination_id = <code_combination_id>
AND invoice_id = <invoice_id>
Say there is unique index AP_INVOICE_DISTRIBUTIONS_U1 on <invoice_id> and non-unique index AP_INVOICE_DISTRIBUTIONS_N3 on <dist_code_combination_id>. Execution for the above query is as follows:
SELECT STATEMENT Hint=HINT: RULE
TABLE ACCESS BY INDEX ROWID AP_INVOICE_DISTRIBUTIONS_ALL
INDEX RANGE SCAN AP_INVOICE_DISTRIBUTIONS_N3
To disable the index on <dist_code_combination_id> and make use of the unique index on invoice_id, you can modify the SQL as follows
SELECT amount
FROM ap_invoice_distributions_all
WHERE dist_code_combination_id + 0 = <code_combination_id>
AND invoice_id = <invoice_id>
Modified execution path is
SELECT STATEMENT Hint=HINT: RULE
TABLE ACCESS BY INDEX ROWID AP_INVOICE_DISTRIBUTIONS_ALL
INDEX RANGE SCAN AP_INVOICE_DISTRIBUTIONS_U1
To disable index on last_name (Character field) column, you can:
WHERE last_name || ‘’ = ‘Johns’ or WHERE last_name || ‘ TEST’ = ‘Johns’ || ‘ TEST’
Instead of
WHERE last_name = ‘Johns’
Use operations like + or – for NUMBER columns or concatenate with a string/null for CHARACTER columns.
General Tips
In the case of a composite index (an index created on a combination of multiple columns), it is mandatory to specify the leading column of the index in the WHERE clause for that index to be used.
If your query fetches more than 10% of the total records using an index, then it may be better to do a FULL table scan rather than an indexed search. Retrieving the data via the index might be the more costly operation in this case.
Use a table alias for every column in the query when specifying more than one table. This reduces the parsing time while executing the query. It also protects the design: if a column with the same name is added to another table in the future, the column reference does not become ambiguous.
Choose the index with the best selectivity when you have a choice of multiple indexes. The selectivity of an index is the number of distinct values in the column divided by the total number of values in the column.
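The selectivity of a column can be estimated with a simple query; for e.g., on the employee table used in the earlier examples:
SELECT COUNT(DISTINCT dept) / COUNT(*)
FROM employee
The closer the result is to 1, the more selective an index on that column is, and the more useful it will be to the optimizer.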
If a variable is not necessary, why should it be defined and initialized? Due to multiple design or requirement changes, we might end up with a lot of unused variables, and there may be some costly assignment statements to initialize them. All of this adds to the cost of your program.
After the completion of the code, search for unused variables, if any, and remove them from your code. In some cases, you might define constraints like NOT NULL for a variable. Cross-check the requirement and make sure that the constraint is really required. For e.g.:
DECLARE
l_chr_empno employees.empno%type NOT NULL;
BEGIN
….
SELECT empno
INTO l_chr_empno
FROM employees
WHERE …….
.....
END;
If empno is the primary key of the employees table, we know that there will not be any record with a NULL value, so the constraint defined during the variable declaration is not required at all.
Defer execution until it is required. There might be declaration statements where the value is assigned through a costly process. We can place such declarations where they are actually required. For e.g.:
DECLARE
l_chr_var1 VARCHAR2(15) := takes_five_minutes(…..);
….
BEGIN
IF criteria1 THEN
use_the_defined_variable(l_chr_var1);
ELSE => say 90% of the cases follow this path
Normal_code;
END IF;
END;
In the above case, we are declaring a variable and assigning its value using a function which takes around 5 minutes of elapsed time to execute, and later this variable is used in only 10% of the execution flows. We can re-write the code so that the variable is declared and assigned only where it is required.
DECLARE
….
BEGIN
IF criteria1 THEN
DECLARE
l_chr_var1 VARCHAR2(15) := takes_five_minutes(…..);
BEGIN
use_the_defined_variable(l_chr_var1);
END;
ELSE
Normal_code;
END IF;
END;
Code within a LOOP executes more than once; any inefficiency in a loop thus has a multiplied effect. In general, the following two steps help to optimize your loop statements:
- Minimize the number of iterations of the loop. As soon as the code has done the required job, EXIT the loop, using a condition like EXIT WHEN variable = <required condition>.
- Check whether any statements inside the loop can be executed outside it. If a statement does not refer to or depend on loop variables, it may be possible to execute it outside the loop, only once, rather than many times.
For e.g.:
BEGIN
FOR temp_rec IN temp_cur
LOOP
Process_new_procedure(UPPER(variable1));
END LOOP;
END;
In the above case, the UPPER function is applied to variable1 on each iteration of the loop. Since the value of variable1 does not depend on the loop execution, it is possible to execute this outside the loop as follows:
BEGIN
variable2 := UPPER(variable1);
FOR temp_rec IN temp_cur
LOOP
Process_new_procedure(variable2);
END LOOP;
END;
- Use of SYSDATE. While updating or inserting a record, we might want to keep the date information, or there may be multiple instances of comparing or retrieving data based on the date. If the time component is not important (most of the cases), it is better to define a variable, assign the SYSDATE value to it once, and use this variable, instead of retrieving the date value from the database each time (which happens whenever we use SYSDATE).
See the code below:
CURSOR emp_cur IS
SELECT emp_name, emp_no, TO_CHAR(SYSDATE,’DD-MON-YYYY’)
FROM employee;
For each record selected in the cursor, the system date value is unnecessarily fetched and converted. Without a time component it is going to give the same value for every record, so is there any need for the SYSDATE selection in this case?
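A minimal sketch of the fix (variable and cursor names are illustrative): fetch and format the date once, before the loop, and reuse the value.
DECLARE
l_chr_today VARCHAR2(11) := TO_CHAR(SYSDATE, ‘DD-MON-YYYY’);
CURSOR emp_cur IS
SELECT emp_name, emp_no
FROM employee;
BEGIN
FOR emp_rec IN emp_cur
LOOP
--- use l_chr_today wherever the formatted date is required
NULL;
END LOOP;
END;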
- Loop within a loop – one common place where there is a possibility of unnecessary code execution. Unknowingly, code that belongs to the outer loop gets executed inside the inner loop.
FOR variable1 IN 1..100 LOOP
FOR variable2 IN 1..100 LOOP
variable3 := SQRT(variable1); => this should be executed in the outer loop, before the inner loop starts
….
Optimize the IF statements
When processing an IF statement, the PL/SQL engine evaluates each condition until one returns TRUE. Once the match is found, it does not evaluate the remaining conditions of the IF statement. Therefore, if the most probable condition is placed first, the processing time of the PL/SQL engine comes down, improving the overall performance of the code. For e.g.:
FOR counter1 IN 1..10000 LOOP
IF counter1 < 10 THEN
--- do some process
ELSIF counter1 >= 10 AND counter1 < 20 THEN
--- do some process
ELSIF counter1 >= 20 AND counter1 < 30 THEN
--- do some process
ELSIF counter1 >= 30 AND counter1 < 40 THEN
--- do some process
ELSIF counter1 >= 90 THEN
--- do some process
END IF;
END LOOP;
The first condition < counter1 < 10 > is TRUE only 9 times out of 10000, whereas the last condition < counter1 >= 90 > is TRUE 9911 times out of 10000. So the above code will be much faster if the last condition is placed first.
FOR counter1 IN 1..10000 LOOP
IF counter1 >= 90 THEN => made the first condition
--- do some process
ELSIF counter1 < 10 THEN
--- do some process
ELSIF counter1 >= 10 AND counter1 < 20 THEN
--- do some process
ELSIF counter1 >= 20 AND counter1 < 30 THEN
--- do some process
ELSIF counter1 >= 30 AND counter1 < 40 THEN
--- do some process
END IF;
END LOOP;
There are cases where an IF statement contains a compound condition using the AND or OR operators. With an OR condition, if either of the conditions returns TRUE the IF statement returns TRUE, whereas with an AND condition both need to be TRUE for the statement to return TRUE. So place the less costly condition first for better performance. E.g.:
IF get_dept_takes_two_minutes(emp_no) = ‘ES’ OR salary < 10000 THEN
--- do the processing
END IF;
As explained above, the statement returns TRUE as soon as one of the OR conditions returns TRUE, so it is better to place the cheap salary condition before the get_dept_takes_two_minutes function.
IF salary < 10000 OR get_dept_takes_two_minutes(emp_no) = ‘ES’ THEN
--- do the processing
END IF;
Note: this placement of conditions for AND or OR is valid when used in normal SQL statements also. As a thumb rule, always place the most probable condition first in a segment of multiple IF constructs, or within the IF conditions.
Use of CURSOR’s
Select only the required columns in a cursor, instead of fetching entire rows:
- Network traffic is reduced by selecting only the required values.
- The code is easier to understand.
- It takes care of future changes, if any.
In some cases it is better to split the SQL statement in a cursor into multiple SQL’s. For e.g., say you want to retrieve, for each department, the employee who has the maximum salary.
CURSOR emp_cur IS
SELECT emp1.emp_no, emp1.dept_id
FROM employee emp1
WHERE emp1.salary = (SELECT MAX(emp2.salary)
FROM employee emp2
WHERE emp2.dept_id = emp1.dept_id);
The problem with the above query is that the maximum salary for a department will be recomputed multiple times, once for each employee in that department. It is better to re-write this query as nested LOOP statements in PL/SQL.
DECLARE
CURSOR dept_cur IS
SELECT dept_id, MAX(salary) maxsal
FROM employee
GROUP BY dept_id; => select department id and maximum salary for that dept.
CURSOR emp_cur (deptid IN departments.dept_id%type, maxsal IN employee.salary%type) IS
SELECT emp_no, dept_id
FROM employee
WHERE salary = maxsal
AND dept_id = deptid;
BEGIN
FOR dept_rec IN Dept_cur
LOOP
FOR emp_rec in emp_cur ( dept_rec.dept_id, dept_rec.maxsal)
LOOP
--- Do the processing
END LOOP;
END LOOP;
END;
But very soon the database began to grow; by this point:
oe_order_lines_all had about 46000 lines
oe_order_headers_all had about 16000 lines
ra_customer_trx_all had about 45000 lines
ra_customer_trx_lines_all had about 177000 lines
At this stage a massive tuning exercise was undertaken, wherein it was decided to tune all these forms one by one. All the other code was also re-analyzed to check whether performance could become a constraint elsewhere. The tuning work was carried out partly offshore and partly onsite.
Challenges
The data in the production instance and the development instance at offshore were out of sync; offshore had only about 10-20% of the data in the production instance, due to which:
1) It was difficult to judge the effectiveness of tuning: for example, one of the forms which did not fetch data even after 25 minutes in the production instance took less than 15-20 seconds at offshore. Hence it was very difficult to check whether any tuning done had actually made a difference.
2) Some of the screens had very obvious tuning issues, like improper joins and redundant code, and were easy to tune in the first round itself. In other cases the problems were related to the size of the data, for example an index not getting used, and were more cryptic: in tables with less data at offshore and more data at onsite, using the index actually deteriorated the performance of the SQL in the offshore instance, but when the same was done onsite the performance actually improved.
Tuning strategies/techniques
While a lot of information is available in many places about PL/SQL tuning, most of what we found was not really very useful, and what really helped was not documented properly. Hence I have made an attempt here to document some of the tuning strategies which were found to be very useful in my project.
1) Before we actually get into ‘how to tune’, we need to know ‘what to tune’. Usually fewer than 5 SQL’s take up more than 95% of the time; these are the SQL’s to be attacked. Tkprof is a tool which can do a detailed analysis of the form/report/program and check how much time each and every SQL takes and how it gets executed. The input to this analysis is a ‘trace file’, and the tkprof output can help us in identifying the SQL’s to tune.
How to create a trace file:
(i) Remove all trace files from the ..../udump directory (optional, for easy identification of the new trace file).
(ii) For a form, open the form and go to Help > Diagnostics > Trace On.
(iii) For a concurrent program or a report, go to the program registration screen (Application Developer > Concurrent > Program) and check the check box ‘Enable Trace’.
(iv) Query the customer/field which is taking time, or run the report/concurrent program.
(v) The trace file will be generated; run the following command on it:
tkprof request_id.trc output.tkp explain=apps/<password> sys=no sort=fchela
Tkprof sorts the SQL’s by execution time and presents them along with the explain plan and the execution details.
2) Once the SQL’s which need to be tuned are identified, the next step is to check how each SQL is getting executed. For this, ‘Explain Plan’ can be used. ‘Explain Plan’ gives the execution plan of your SQL; using this you can see how your SQL is actually getting executed. If the optimizer is in cost mode then the cost of the SQL is also shown, and we should work towards minimizing this cost. To use explain plan, a special PLAN_TABLE table must exist in the user’s schema.
EXPLAIN PLAN
SET STATEMENT_ID = 'stmt_id'
FOR
SELECT *
FROM oe_order_lines_all
WHERE line_id = '12345';
Ctrl+E is the shortcut in Toad to give the explain plan for the Select statement in the sql Editor window of Toad.
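After the EXPLAIN PLAN statement has been executed, the plan can be read back from PLAN_TABLE with a hierarchical query; the following is a common sketch (column names as in the standard PLAN_TABLE):
SELECT LPAD(' ', 2*LEVEL) || operation || ' ' || options || ' ' || object_name plan_line
FROM plan_table
START WITH id = 0 AND statement_id = 'stmt_id'
CONNECT BY PRIOR id = parent_id AND statement_id = 'stmt_id'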
SELECT
NVL(SUM(NVL(moq.transaction_quantity,0) *
(NVL(cic.item_cost,0) )),0)
FROM
mtl_onhand_quantities moq,
cst_item_costs cic,
te_inv_org_operating_units_v inv_ou
WHERE
cic.inventory_item_id = moq.inventory_item_id
AND cic.cost_type_id = :g_num_cost_type_id
AND cic.organization_id = moq.organization_id
AND moq.inventory_item_id = :in_num_inventory_item_id
AND inv_ou.org_id = NVL(:in_num_operating_unit, inv_ou.org_id)
AND inv_ou.organization_id = moq.organization_id
For example for the SQL above the explain plan would be as shown below:
SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=47)
SORT (AGGREGATE)
NESTED LOOPS (Cost=5 Card=1 Bytes=47)
NESTED LOOPS (Cost=5 Card=1 Bytes=44)
NESTED LOOPS (Cost=4 Card=1 Bytes=30)
NESTED LOOPS (Cost=2 Card=1 Bytes=19)
TABLE ACCESS (FULL) OF HR_ORGANIZATION_INFORMATION (Cost=2 Card=1
Bytes=16)
INDEX (UNIQUE SCAN) OF MTL_PARAMETERS_U1 (UNIQUE)
TABLE ACCESS (BY INDEX ROWID) OF MTL_ONHAND_QUANTITIES (Cost=2
Card=25 Bytes=275)
INDEX (RANGE SCAN) OF MTL_ONHAND_QUANTITIES_N1 (NON-UNIQUE)
(Cost=1 Card=25)
TABLE ACCESS (BY INDEX ROWID) OF CST_ITEM_COSTS (Cost=1 Card=3
Bytes=42)
INDEX (UNIQUE SCAN) OF CST_ITEM_COSTS_U1 (UNIQUE)
INDEX (UNIQUE SCAN) OF HR_ORGANIZATION_UNITS_PK (UNIQUE)
3) Now that we know what to tune and how it is actually getting executed, try to understand the SQL. Check whether it is too complex: is there a simpler way to do the same? Mistakes are often committed in the design phase, even before a single SQL is written. It is prudent to analyze the design and check if there is a better way to provide the same functionality; this might save a lot of processing time. In the following SQL the requirement is to get the latest transaction number, date and type for a given order.
SELECT unique
rct.trx_number,
rct.trx_date,
rctt.type
FROM
oe_order_headers_all oha,
ra_customer_trx_lines_all rctl,
ra_customer_trx_all rct,
ra_cust_trx_types_all rctt
WHERE
rctl.interface_line_context = 'ORDER ENTRY'
AND rctl.interface_line_attribute1 = oha.order_number
AND rctl.customer_trx_id = rct.customer_trx_id
AND rctt.cust_trx_type_id = rct.cust_trx_type_id
AND oha.header_id = :in_num_header_id
AND rct.trx_date = (
SELECT
MAX(TRX_DATE)
FROM
ra_customer_trx_lines_all rctla,
ra_customer_trx_all rcta
WHERE
rctla.interface_line_context = 'ORDER ENTRY'
AND rctla.interface_line_attribute1 = oha.order_number
AND rcta.customer_trx_id = rctla.customer_trx_id )
AND rct.trx_number = (
SELECT
MAX(TRX_number)
FROM
ra_customer_trx_lines_all rctla,
ra_customer_trx_all rcta
WHERE
rctla.interface_line_context = 'ORDER ENTRY'
AND rctla.interface_line_attribute1 = oha.order_number
AND rcta.customer_trx_id = rctla.customer_trx_id )
The same functionality can be provided by re-writing the SQL as follows; in the process the cost gets reduced to almost half that of the first query.
SELECT
max(rct.trx_number),
max(rct.trx_date),
max(rctt.type)
FROM
oe_order_headers_all oha,
ra_customer_trx_all rct,
ra_cust_trx_types_all rctt
WHERE
rct.interface_header_context = 'ORDER ENTRY'
AND rct.interface_header_attribute1 = oha.order_number
AND rctt.cust_trx_type_id = rct.cust_trx_type_id
AND oha.header_id = :in_num_header_id
Apps Tuining
Tuning SQL
Following tips will help you to write better performing SQL's.
EXISTS & IN
Original
|
SELECT name
FROM employee
WHERE empno IN (SELECT empno
FROM manager)
|
Sub-query retrieves all the records in the manager table every time for each employee record and then compares the data.
|
Modified
| |||
SELECT name
FROM employee
|
WHERE EXISTS (
SELECT manager.empno
FROM manager
WHERE manger.empno = employee.empno)
|
For
each employee record, sub-query verifies if there is matching record in
the manager table and comes out after the first occurrence of the
successful matching.
|
SELECT name
FROM employee
Original
|
SELECT orderno
FROM orders
WHERE orderno NOT IN (SELECT orderno
FROM pendingorders)
|
Sub-query
retrieves all the records in the pendingorders table every time for
each record in the orders table and then compare the data.
| ||||
Modified
| ||||||
WHERE NOT EXISTS (
SELECT manager.empno
FROM manager
WHERE manger.empno = employee.empno)
|
For
each employee record, sub-query verifies if there is matching record in
the manager table and comes out after the first occurrence of the
successful matching (return false in this case).
|
Using Functions on Indexed Columns
For e.g., the following SQL constructs prevent the use of index on empno:
SELECT name SELECT name
FROM employee FROM employee
WHERE NVL(empno, ‘INF000’) = ‘INF001’ WHERE UPPER(empno) = ‘INF001’
SELECT name SELECT name
FROM employee FROM employee
WHERE empno IS NOT NULL WHERE LOWER(empno) = ‘inf001’
Note: ‘IS NULL’ clause on indexed column will use the respective index.
SELECT name
FROM employee
WHERE TO_CHAR(empno) = ‘001’
As
mentioned above, use of SUBSTR function on index column will suppress
the index. But there are cases where you can substitute SUBSTR with LIKE
and still use the index!
For e.g., your condition is that first 2 characters of the dept should be “ES” (RBO case only):
Index will not be used
SELECT name
FROM employee
WHERE SUBSTR(dept,1,2) = ‘ES’
Index will be used
SELECT name
FROM employee
WHERE dept LIKE ‘ES%’
Note: If the wild character ‘%’ mentioned only in the last, index will be used in case of the function LIKE. If it comes in the beginning for ‘LIKE’, or in any case with ‘NOT LIKE’ index will not be used.
Do
not concatenate the index columns with constant strings or add/subtract
numbers on indexed columns. Both of these would stop optimizer from
using index.
For e.g., the following SQL construct prevents the use of index on lastname.
SELECT dept
FROM employee
WHERE lastname || ‘ Query’ = ‘SQL Query’
In the following SQL by adding a number on salary column, use of index is suppressed.
SELECT name
FROM employee
WHERE salary+1000 = <calc_salary>
Above SQL can be better written as follows
SELECT name
FROM employee
WHERE salary = <calc_salary> - 1000 => This will use the index on salary
If an index is composed of multiple columns and no function is used on the leading column, functions on the other columns will not prevent the optimizer from using that index.
Suppose a unique index on the employee table consists of the columns empid, effective_start_date and effective_end_date:
Index will be used
SELECT name
FROM employee
WHERE empid = <emp_id>
AND trunc(effective_start_date) >= sysdate-365
AND trunc(effective_end_date) < sysdate => function not used on leading index column
Index will not be used
SELECT name
FROM employee
WHERE TO_CHAR(empid) = <emp_id>
AND trunc(effective_start_date) >= sysdate-365
AND trunc(effective_end_date) < sysdate => function used on leading index column
Joins & Sub-queries
Prefer joins over issuing separate queries where possible:
- Execution of a number of individual SQLs is prevented
- Redundant fetches of values already fetched are avoided
- Parsing and execution time, bookkeeping and optimizing overheads, and memory/SGA/PGA overheads are avoided
- Joins are faster than PL/SQL
Incorrect Query (the join on product_version is missing, producing a partial Cartesian product):
SELECT p.product_code, p.product_version, r.text
FROM products p, problem_reports r
WHERE r.product_code = p.product_code
AND p.status = 'Testing'

Correct Query:
SELECT p.product_code, p.product_version, r.text
FROM products p, problem_reports r
WHERE r.product_code = p.product_code
AND r.product_version = p.product_version
AND p.status = 'Testing'
There are cases where a sub-query can be used with a join. For example, if you want to select the latest contact of a customer, where a customer can have multiple contacts, you may use the following workaround:
SELECT c.customer_name, ct1.last_name
FROM customers c, contacts ct1
WHERE c.customer_id = ct1.customer_id
AND ct1.contact_id = (SELECT max(contact_id)
FROM contacts ct2
WHERE ct2.customer_id = c.customer_id)
Outer Join
When you want to retrieve data even if there are no matching records in the join, use an outer join. The correct syntax for an outer join is:
Table1.Column1 (+) = Table2.Column1 or Table1.Column1 = Table2.Column1 (+)
The (+) sign follows the column of the table which has, or may have, the missing rows.
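For example, using the employee and manager tables from earlier, the following lists every employee whether or not a matching manager row exists; the (+) marks manager as the side that may be missing rows:

SELECT e.name, m.empno
FROM employee e, manager m
WHERE e.empno = m.empno (+)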
Optimizing joins
- The more tables involved, the more rows have to be read, so the join is usually more expensive.
- Retrieve the smallest result set first when joining more than 3 tables, so that the total read from the database reduces.
- Make sure that joined columns are indexed for fast retrieval; prevent the use of functions on joined columns.
- When you create a new index, check whether it affects the performance of existing joins.
Usually developer efforts are concentrated on tuning the SQLs used by the PL/SQL program for performance improvement. But PL/SQL is a procedural language, and there can be situations where excessive CPU usage arises from the PL/SQL code itself even though there are no database accesses.
Here different areas of PL/SQL code are categorized and some basic principles of optimization are discussed.
UNION ALL instead of UNION
When using UNION to combine the records from two queries, the output is the set of unique records from both queries, in sorted order. This means the individual outputs are compared, sorted and duplicates removed. With UNION ALL, Oracle does no sorting, and duplicate records, if any, appear in the output.
So if you are sure that your output will not contain duplicate records and there is no need to sort the data, always use UNION ALL instead of UNION.
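A small sketch (the archived_orders table is assumed for illustration): both queries combine current and archived order numbers, but the UNION ALL version skips the duplicate-elimination sort:

SELECT orderno FROM orders
UNION
SELECT orderno FROM archived_orders => sorts and removes duplicates

SELECT orderno FROM orders
UNION ALL
SELECT orderno FROM archived_orders => no sort; duplicates retained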
Order By
Use an 'Order By' clause only if it is required by your program logic or business needs. It works on the output of the query, so Oracle does double processing when 'Order By' is used. If you do use 'Order By', order the columns in the select list in the way you want them sorted. The exception is when you want any of the columns sorted in DESC order.
Queries using UNION, MINUS, SELECT DISTINCT or GROUP BY sort their output internally to provide those features, so an extra 'Order By' often adds no value; note, however, that this ordering is a side effect and is not guaranteed, so if the order matters to your logic, state it explicitly.
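For instance, to read employees sorted by department and then by name, list the columns in the select list in that same order:

SELECT dept, name
FROM employee
ORDER BY dept, name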
Query Structuring & Driving Table
Based on the way a query is written, its execution varies even though all the conditions and tables are the same. If there are five conditions in your WHERE clause, you may find that the execution time varies with their order for the same output (rule based). Or, when you change the order of tables in the FROM clause, in some cases the execution time varies, for both rule based and cost based optimizers.
Rule Based Optimizer case only: Oracle parses SQL from the bottom up and from right to left, i.e. backward from the way it is written, when it executes. So it is better to re-arrange the WHERE clause for better performance.
Order of execution of a query
Oracle follows the steps below while executing a query:
1. If the query contains UNION or UNION ALL, repeat steps 2 to 6 for each individual query.
2. Form the Cartesian product of the tables mentioned in the FROM clause. If there is only one table, that is the product.
3. Apply the conditions in the WHERE clause (if present), bottom up and from right to left, retaining the rows of the product for which the result is TRUE.
4. If there is a GROUP BY clause, group the rows obtained above so that all rows in each group have the same values for the columns in the GROUP BY clause.
5. For each group, calculate the desired results in the SELECT list (SUM, MAX, MIN, etc.).
6. Eliminate duplicates if DISTINCT is specified.
7. Merge the results if the query has UNION; eliminate duplicates unless it is UNION ALL.
8. If ORDER BY is specified, sort the result accordingly.
To tune a query, first check your WHERE clause and make sure the conditions are arranged in the best possible manner. Consider the following example:
SELECT order_no, qty
FROM orders
WHERE sales_person = 'INFY'
AND order_no IN (SELECT order_no
FROM pending_orders
WHERE qty < 1000
AND dest = 'Mangalore')
AND category = 'Normal'
Say <sales_person = 'INFY'> returns 200 records from orders, the sub-query returns 900 records and <category = 'Normal'> returns 15000 records. Oracle will parse the query through the 15000 records first, then pass those through the 900 records, then pass the remainder through the 200 records.
It is much more efficient to have Oracle parse through the 200 records first, then pass those results through the 900 possible rows and the remainder through the possible 15000 records. Thus re-writing the above query in the following way will result in a faster query:
SELECT order_no, qty
FROM orders
WHERE category = 'Normal'
AND order_no IN (SELECT order_no
FROM pending_orders
WHERE qty < 1000
AND dest = 'Mangalore')
AND sales_person = 'INFY'
Now, all other things being equal in the WHERE clause, the order of the tables in the FROM clause decides the execution path. The table listed last in the FROM clause acts as the 'driving table' for the query under the rule based optimizer. (Note: the order of tables can change the execution path in some cases under the cost based optimizer also.)
Choose the table with the least number of rows as the driving table when you want to select all records from all tables in the join. When you are selecting only some records by applying conditions, choose the table with the fewest records after the conditions are applied. In joins of 3 or more tables, the table with the maximum joins becomes the driving table. The driving table is the one read into memory first, so the query will be faster and more efficient if the driving table is small.
The following example illustrates the driving table (assumption: there is no index on any of the tables):

Table       No of records   Condition           Records after condition applied
Customers   1500            -                   1500
Items       12000           Category = 'PC'     1000
Orders      25000           Type = 'External'   1200
Case 1: Selecting all records from all tables.
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
items itm,
orders ord
WHERE ord.item_id = itm.item_id
AND ord.customer_id = cus.customer_id
AND itm.category = 'PC'
In this case the orders table acts as the driving table: the query first selects all records from orders, then matches them with items, and then compares the merged output with customers. The above query will work faster if written as follows:
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
orders ord,
items itm => order changed, items acts as the driving table
WHERE ord.item_id = itm.item_id
AND ord.customer_id = cus.customer_id
AND itm.category = 'PC'
Now, say a non-unique index on the column 'type' is added to the orders table, and we want to add the condition <ord.type = 'External'> to the query. Since this condition reduces the number of records fetched from orders to 1200, it is better to make orders the driving table (even though fewer records are fetched from items, items would require a FULL TABLE SCAN, so orders is preferred).
SELECT cus.customer_name, itm.item_name, ord.qty
FROM customers cus,
items itm,
orders ord
WHERE cus.customer_id = ord.customer_id
AND itm.category = 'PC'
AND itm.item_id = ord.item_id
AND ord.type = 'External'
So the available indexes, the number of records fetched and the size of the tables in the join decide how the query should be written for better performance.
Cost Based Optimizer - Special Note:
When using the cost based optimizer, ensure that tables are analyzed regularly by the DBA (a standard concurrent program is available for this in Oracle Applications). Beyond this, to tune queries in a cost based environment we can use HINTS. Hints instruct Oracle to use a particular index, to avoid a particular index, or to scan a particular table full.
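A sketch of the syntax (the index name orders_n1 is hypothetical): the hint is written in a comment immediately after the SELECT keyword.

SELECT /*+ FULL(ord) */ order_no, qty
FROM orders ord
WHERE category = 'Normal' => forces a full scan of orders

SELECT /*+ INDEX(ord orders_n1) */ order_no, qty
FROM orders ord
WHERE category = 'Normal' => asks the optimizer to use index orders_n1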
In some cases you might want to disable an index so that the optimizer uses an alternative index on the table. To disable an index, you can apply an operation to the column or concatenate it with NULL. The following SQL illustrates this.
SELECT amount
FROM ap_invoice_distributions_all
WHERE dist_code_combination_id = <code_combination_id>
AND invoice_id = <invoice_id>
Say there is a unique index AP_INVOICE_DISTRIBUTIONS_U1 on <invoice_id> and a non-unique index AP_INVOICE_DISTRIBUTIONS_N3 on <dist_code_combination_id>. The execution plan for the above query is as follows:
SELECT STATEMENT Hint=HINT: RULE
TABLE ACCESS BY INDEX ROWID AP_INVOICE_DISTRIBUTIONS_ALL
INDEX RANGE SCAN AP_INVOICE_DISTRIBUTIONS_N3
To disable the index on <dist_code_combination_id> and make use of the unique index on invoice_id, you can modify the SQL as follows:
SELECT amount
FROM ap_invoice_distributions_all
WHERE dist_code_combination_id + 0 = <code_combination_id>
AND invoice_id = <invoice_id>
The modified execution path is:
SELECT STATEMENT Hint=HINT: RULE
TABLE ACCESS BY INDEX ROWID AP_INVOICE_DISTRIBUTIONS_ALL
INDEX RANGE SCAN AP_INVOICE_DISTRIBUTIONS_U1
To disable the index on the last_name (character) column, you can write:
WHERE last_name || '' = 'Johns' or WHERE last_name || ' TEST' = 'Johns' || ' TEST'
instead of
WHERE last_name = 'Johns'
Use operations like + or - for NUMBER columns, and concatenate with a string or NULL for CHARACTER columns.
General Tips
In the case of a composite index (an index created on a combination of columns), it is mandatory to specify the leading column of the index in the WHERE clause for that index to be used.
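A sketch using the composite employee index from the earlier example (empid is the leading column):

SELECT name
FROM employee
WHERE empid = <emp_id>
AND effective_start_date >= SYSDATE-365 => leading column specified, index can be used

SELECT name
FROM employee
WHERE effective_start_date >= SYSDATE-365 => leading column missing, index is not used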
If your query fetches more than 10% of the total records through an index, a FULL table scan may be better than an indexed search; retrieving the data through the index can be the more costly operation in this case.
Use a table alias for every column in the query when more than one table is specified. This reduces parsing time when executing the query. It also protects the design: if a column with the same name is later added to another table, the column reference does not become ambiguous.
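For example (the departments table and dept_name column are illustrative):

SELECT emp.name, dpt.dept_name
FROM employee emp, departments dpt
WHERE emp.dept_id = dpt.dept_id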
Choose the index with the best selectivity when you have a choice of multiple indexes. The selectivity of an index is the number of distinct values in the column divided by the total number of rows.
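Selectivity can be estimated with a query of this shape (a value closer to 1 means a more selective column):

SELECT COUNT(DISTINCT dept) / COUNT(*) selectivity
FROM employee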
If a variable is not necessary, why define and execute it? Due to multiple design or requirement changes, we may end up defining many unused variables, some with costly assignment statements to initialize them. All of this adds to the cost of execution.
After completing the code, search for unused variables and remove them from your code. In some cases you might define constraints such as NOT NULL on a variable; cross-check the requirement and make sure it is really needed. For example:
DECLARE
l_chr_empno employees.empno%type NOT NULL := 'X'; => a NOT NULL variable must also be initialized (value illustrative)
BEGIN
….
SELECT empno
INTO l_chr_empno
FROM employees
WHERE …….
.....
END;
If empno is the primary key of the employees table, we know there will never be a record with a NULL value, so the constraint defined in the variable declaration is not required at all.
Defer execution until it is required. There may be declaration statements where the value is assigned through a costly process; define them where they are actually required. For example:
DECLARE
l_chr_var1 VARCHAR2(15) := takes_five_minutes(…..);
….
BEGIN
IF criteria1 THEN
use_the_defined_variable(l_chr_var1);
ELSE => say 90% of the cases follow this
Normal_code;
END IF;
END;
In the above case we define a variable and assign its value using a function which takes around 5 minutes of elapsed time, yet the variable is used in only 10% of the execution flows. We can re-write the code so that the variable is declared and assigned only where it is required.
DECLARE
….
BEGIN
IF criteria1 THEN
DECLARE
l_chr_var1 VARCHAR2(15) := takes_five_minutes(…..);
BEGIN
use_the_defined_variable(l_chr_var1);
END;
ELSE
Normal_code;
END IF;
END;
Code within a LOOP executes more than once, so any inefficiency in a loop has a multiplied effect. In general, the following two steps help to optimize your loop statements:
- Minimize the number of iterations inside the loop. As soon as the code has done the required job, EXIT the loop, using a condition like EXIT WHEN variable = <required condition>.
- Make sure that there are no statements inside the loop that could be executed outside it. If a statement does not refer to or depend on loop variables, it may be possible to execute it outside the loop, once, rather than many times.
For example:
BEGIN
FOR temp_rec IN temp_cur
LOOP
Process_new_procedure(UPPER(variable1));
END LOOP;
END;
In the above case the UPPER function is applied to variable1 on each iteration of the loop. Since the value of variable1 does not depend on the loop execution, it can be computed outside the loop as follows:
BEGIN
variable2 := UPPER(variable1);
FOR temp_rec IN temp_cur
LOOP
Process_new_procedure(variable2);
END LOOP;
END;
- Use of SYSDATE. While updating or inserting a record we might want to keep the date information, or there may be multiple places comparing or retrieving data based on the date. If the time component is not important (as in most cases), it is better to assign the SYSDATE value to a variable once and use that variable, instead of retrieving the date from the database (which happens each time SYSDATE is used).
See the code below:
CURSOR emp_cur IS
SELECT emp_name, emp_no, TO_CHAR(SYSDATE,'DD-MON-YYYY')
FROM employee;
For each record selected by the cursor, the system date is unnecessarily fetched and converted without a time component, so it yields the same value for every record. Is there any need to select SYSDATE in this case?
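A sketch of the fix (the variable name l_chr_today is illustrative): format the date once at declaration and drop SYSDATE from the cursor.

DECLARE
CURSOR emp_cur IS
SELECT emp_name, emp_no
FROM employee;
l_chr_today VARCHAR2(11) := TO_CHAR(SYSDATE, 'DD-MON-YYYY'); => fetched once, reused for every record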
- Loop within a loop: one common place where unnecessary code execution is possible. Unknowingly, code that belongs to the outer loop gets executed inside the inner loop.
FOR variable1 IN 1..100 LOOP
FOR variable2 IN 1..100 LOOP
variable3 := SQRT(variable1); => This should be executed outside the inner loop
….
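The corrected sketch moves the square root to the outer loop, where it is computed 100 times instead of 10000:

FOR variable1 IN 1..100 LOOP
variable3 := SQRT(variable1); => computed once per outer iteration
FOR variable2 IN 1..100 LOOP
….
END LOOP;
END LOOP;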
Optimize the IF statements
When processing IF statements, the PL/SQL engine evaluates each condition until one returns TRUE. Once a match is found, it does not evaluate the remaining conditions of the IF statement. Therefore, placing the most probable condition first reduces the processing time and improves the overall performance of the code.
For example:
FOR counter1 IN 1..10000 LOOP
IF counter1 < 10 THEN
--- do some process
ELSIF counter1 >= 10 AND counter1 < 20 THEN
--- do some process
ELSIF counter1 >= 20 AND counter1 < 30 THEN
--- do some process
ELSIF counter1 >= 30 AND counter1 < 40 THEN
--- do some process
ELSIF counter1 >= 90 THEN
--- do some process
END IF;
END LOOP;
The first condition <counter1 < 10> is TRUE only 9 times out of 10000, whereas the last condition <counter1 >= 90> is TRUE 9911 times out of 10000. So the above code will be much faster if the last condition is placed first:
FOR counter1 IN 1..10000 LOOP
IF counter1 >= 90 THEN => made the first condition
--- do some process
ELSIF counter1 < 10 THEN
--- do some process
ELSIF counter1 >= 10 AND counter1 < 20 THEN
--- do some process
ELSIF counter1 >= 20 AND counter1 < 30 THEN
--- do some process
ELSIF counter1 >= 30 AND counter1 < 40 THEN
--- do some process
END IF;
END LOOP;
There are cases where an IF statement contains a compound condition using the AND or OR operators. With OR, the statement returns TRUE if either condition returns TRUE, whereas with AND both must be TRUE for the statement to return TRUE. So place the less costly condition first for better performance. E.g.:
IF get_dept_takes_two_minutes(emp_no) = 'ES' OR salary < 10000 THEN
--- do the processing
END IF;
With an OR condition, evaluation stops as soon as one condition returns TRUE, so it is better to place the cheap salary condition before the get_dept_takes_two_minutes function; in most cases the expensive call is then skipped.
IF salary < 10000 OR get_dept_takes_two_minutes(emp_no) = 'ES' THEN
--- do the processing
END IF;
Note: the placement of conditions within AND or OR applies to normal SQL statements as well. As a thumb rule, always place the most probable condition first, both in a segment of multiple IF constructs and within IF conditions.
Use of CURSORs
- Network traffic is reduced by selecting only the required values.
- Code is easier to understand.
- Takes care of future changes, if any.
In some cases it is better to split the SQL statement in a cursor into multiple SQLs. For example, say you want to retrieve, for each department, the employee who has the maximum salary:
CURSOR emp_cur IS
SELECT emp1.emp_no, emp1.dept_id
FROM employee emp1
WHERE emp1.salary = (SELECT MAX(emp2.salary)
FROM employee emp2
WHERE emp2.dept_id = emp1.dept_id);
The problem with the above query is that the maximum salary for a department is recomputed multiple times, once per employee in that department. It is better to re-write this query as nested LOOP statements in PL/SQL.
DECLARE
CURSOR dept_cur IS
SELECT dept_id, MAX(salary) maxsal
FROM employee
GROUP BY dept_id; => select department id and maximum salary for that dept.
CURSOR emp_cur (deptid IN departments.dept_id%type, maxsal IN employee.salary%type) IS
SELECT emp_no, dept_id
FROM employee
WHERE salary = maxsal
AND dept_id = deptid;
BEGIN
FOR dept_rec IN dept_cur
LOOP
FOR emp_rec in emp_cur ( dept_rec.dept_id, dept_rec.maxsal)
LOOP
--- Do the processing
END LOOP;
END LOOP;
END;
But very soon the database began to grow; by now:
oe_order_lines_all had about 46000 lines
oe_order_headers_all had about 16000 lines
ra_customer_trx_all had about 45000 lines
ra_customer_trx_lines_all had about 177000 lines
At this stage a massive tuning exercise was undertaken, in which it was decided to tune these forms one by one; all the other code was also re-analyzed to check whether performance could become a constraint elsewhere. The tuning work was carried out partly offshore and partly onsite.
Challenges
The data in the production instance and the development instance at offshore were out of sync; offshore had only about 10-20% of the data in the production instance, due to which:
1) It was difficult to judge the effectiveness of tuning: for example, one of the forms which did not fetch data even after 25 minutes in the production instance took less than 15-20 seconds at offshore, hence it was very difficult to check whether any tuning done had actually made a difference.
2) Some of the screens had very obvious tuning issues, like improper joins and redundant code, and were easy to tune in the first round itself. Other cases, where the problems were related to the size of the data (for example an index not being used), were more cryptic: with less data at offshore and more data onsite, using the index actually deteriorated performance in the offshore instance, whereas the same change onsite actually improved it.
Tuning strategies/techniques
While a lot of information about PL/SQL tuning is available in many places, most of what we found was not really very useful, and what really helped was not documented properly. Hence I have made an attempt here to document some of the tuning strategies which were found to be very useful in my project.
1) Before we actually get into 'how to tune', we need to know 'what to tune'. Usually fewer than 5 SQLs take up more than 95% of the time; these are the SQLs to be attacked. Tkprof is a tool which gives a detailed analysis of a form/report/program, showing how much time each SQL takes and how it is executed. Tkprof works on a 'trace file' produced by the traced session, and its output helps us identify the SQLs to tune.
How to create a trace file:
(i) Remove all trace files from the ..../udump directory (optional, for easy identification of the new trace file).
(ii) For a form, open the form and go to Help → Diagnostics → Trace on.
(iii) For a concurrent program or a report, go to the program registration screen (Application Developer → Concurrent → Program) and check the 'Enable Trace' check box.
(iv) Query the customer/field which is taking time, or run the report / concurrent program.
(v) A trace file is generated; run the following command on it:
tkprof request_id.trc output.tkp explain=apps/<password> sys=no sort=fchela
Tkprof sorts the SQLs by execution time and presents them along with the explain plan and the execution details.
2) Once the SQLs to be tuned are identified, the next step is to check how each SQL is executed; for this 'Explain Plan' can be used. 'Explain Plan' gives the execution plan of your SQL, showing how it is actually executed; if the optimizer is in cost mode, the cost of the SQL is also shown, and we should work towards minimizing this cost. To use explain plan, a special PLAN_TABLE table must exist in the user's schema.
EXPLAIN PLAN
SET STATEMENT_ID = 'stmt_id'
FOR
SELECT *
FROM oe_order_lines_all
WHERE line_id = '12345';
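Once the statement is explained, the plan can be read back from PLAN_TABLE with the classic hierarchical query below (a sketch; formatting of the output varies):

SELECT LPAD(' ', 2*(LEVEL-1)) || operation || ' ' || options || ' ' || object_name plan
FROM plan_table
START WITH id = 0 AND statement_id = 'stmt_id'
CONNECT BY PRIOR id = parent_id AND statement_id = 'stmt_id'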
In Toad, Ctrl+E is the shortcut to show the explain plan for the SELECT statement in the SQL editor window.
SELECT
NVL(SUM(NVL(moq.transaction_quantity,0) *
(NVL(cic.item_cost,0) )),0)
FROM
mtl_onhand_quantities moq,
cst_item_costs cic,
te_inv_org_operating_units_v inv_ou
WHERE
cic.inventory_item_id = moq.inventory_item_id
AND cic.cost_type_id = :g_num_cost_type_id
AND cic.organization_id = moq.organization_id
AND moq.inventory_item_id = :in_num_inventory_item_id
AND inv_ou.org_id = NVL(:in_num_operating_unit,
inv_ou.org_id)
AND inv_ou.organization_id = moq.organization_id
For example, for the SQL above the explain plan would be as shown below:
SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=47)
SORT (AGGREGATE)
NESTED LOOPS (Cost=5 Card=1 Bytes=47)
NESTED LOOPS (Cost=5 Card=1 Bytes=44)
NESTED LOOPS (Cost=4 Card=1 Bytes=30)
NESTED LOOPS (Cost=2 Card=1 Bytes=19)
TABLE ACCESS (FULL) OF HR_ORGANIZATION_INFORMATION (Cost=2 Card=1
Bytes=16)
INDEX (UNIQUE SCAN) OF MTL_PARAMETERS_U1 (UNIQUE)
TABLE ACCESS (BY INDEX ROWID) OF MTL_ONHAND_QUANTITIES (Cost=2
Card=25 Bytes=275)
INDEX (RANGE SCAN) OF MTL_ONHAND_QUANTITIES_N1 (NON-UNIQUE)
(Cost=1 Card=25)
TABLE ACCESS (BY INDEX ROWID) OF CST_ITEM_COSTS (Cost=1 Card=3
Bytes=42)
INDEX (UNIQUE SCAN) OF CST_ITEM_COSTS_U1 (UNIQUE)
INDEX (UNIQUE SCAN) OF HR_ORGANIZATION_UNITS_PK (UNIQUE)
3) Now that we know what to tune and how it is actually executed, try to understand the SQL. Is it too complex? Is there a simpler way to do the same thing? Mistakes are often committed in the design phase, even before a single SQL is written. It is prudent to analyze the design and check whether there is a better way to provide the same functionality; this might save a lot of processing time. In the following SQL the requirement is to get the latest transaction number, date and type for a given order.
SELECT unique
rct.trx_number,
rct.trx_date,
rctt.type
FROM
oe_order_headers_all oha,
ra_customer_trx_lines_all rctl,
ra_customer_trx_all rct,
ra_cust_trx_types_all rctt
WHERE
rctl.interface_line_context = 'ORDER ENTRY'
AND rctl.interface_line_attribute1 = oha.order_number
AND rctl.customer_trx_id = rct.customer_trx_id
AND rctt.cust_trx_type_id = rct.cust_trx_type_id
AND oha.header_id = :in_num_header_id
AND rct.trx_date = (SELECT MAX(trx_date)
FROM ra_customer_trx_lines_all rctla,
ra_customer_trx_all rcta
WHERE rctla.interface_line_context = 'ORDER ENTRY'
AND rctla.interface_line_attribute1 = oha.order_number
AND rcta.customer_trx_id = rctla.customer_trx_id)
AND rct.trx_number = (SELECT MAX(trx_number)
FROM ra_customer_trx_lines_all rctla,
ra_customer_trx_all rcta
WHERE rctla.interface_line_context = 'ORDER ENTRY'
AND rctla.interface_line_attribute1 = oha.order_number
AND rcta.customer_trx_id = rctla.customer_trx_id)
The same functionality can be provided by re-writing the SQL as follows; in the process the cost is reduced to almost half that of the first query.
SELECT
max(rct.trx_number),
max(rct.trx_date),
max(rctt.type)
FROM
oe_order_headers_all oha,
ra_customer_trx_all rct,
ra_cust_trx_types_all rctt
WHERE
rct.interface_header_context = 'ORDER ENTRY'
AND rct.interface_header_attribute1 = oha.order_number
AND rctt.cust_trx_type_id = rct.cust_trx_type_id
AND oha.header_id = :in_num_header_id