Infolinks

Friday 29 June 2012

Clusters and Hash Clusters

Clusters and Hash Clusters

Clusters are groups of one or more tables physically stored together because they share common columns and are often used together. This improves disk access time.
The related columns of the tables in a cluster are called the cluster key. The cluster key is indexed so that rows of the cluster can be retrieved with a minimum amount of I/O.
No matter how many tables within the cluster contain the cluster key value, it is stored only once each in the cluster and the cluster index. Therefore, less storage is required.
Whether or not a table is part of a cluster is transparent to users and to applications. Data stored in a clustered table is accessed by SQL in the same way as data stored in a nonclustered table.
Hash Clusters cluster table data in a manner similar to normal, index clusters.

To find or store a row in a hash cluster, Oracle applies a hash function to the row's cluster key value. The resulting hash value corresponds to a data block in the cluster.
All rows with the same key value are stored together on disk.
Hash clusters are a better choice than using an indexed table or index cluster when a table is often queried with equality queries (for example, WHERE product_id=123). For such queries, the specified cluster key value is hashed. The resulting hash key value points directly to the area on disk that stores the rows.
This reduces the amount of I/Os that must be performed to locate and read/write a row of data.

"The continued use of cluster bombs has cost thousands of civilian lives, denied land to the poor and disenfranchised and is now costing the international community millions to eradicate the unexploded submunitions… - Rae McGrath

Related

CREATE CLUSTER

Create a cluster of tables.

Syntax:
   CREATE CLUSTER schema.cluster(col1 type, col2 type…)
            options [ROWDEPENDENCIES|NOROWDEPENDENCIES] [CACHE|NOCACHE] ;
Options:
The options used with this command can be any combination of the following
    PCTUSED int
    PCTFREE int
    INITRANS int
    MAXTRANS int
    SIZE int K | M
    TABLESPACE tablespace_name
    STORAGE storage_clause
    INDEX | [SINGLE TABLE] HASHKEYS int [HASH IS expression]
    PARALLEL parallel_clause 
If the PARALLEL clause is used it should be the last option.
SIZE dictates the amount of space (in bytes) reserved to store all rows with the same cluster key value or the same hash value.
Specify INDEX to create an indexed cluster.
Specify the HASHKEYS clause to create a hash cluster and specify the number of
hash values for a hash cluster.
Specify CACHE if you want the blocks retrieved for this cluster to be placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This may improve performance for small lookup tables.
The ROWDEPENDENCIES setting is primarily used to allow parallel propagation in a replicated database.

ALTER CLUSTER

Change the storage properties of a cluster.

Syntax:
   ALTER CLUSTER schema.cluster options
The options used with this command can be any combination of the following:
    PCTUSED int
    PCTFREE int
    INITRANS int
    MAXTRANS int
    SIZE int K | M
    TABLESPACE tablespace_name
    STORAGE storage_clause
    CACHE int | NOCACHE
    ALLOCATE EXTENT
    ALLOCATE EXTENT SIZE int K | M
    ALLOCATE EXTENT DATAFILE 'filename'
    ALLOCATE EXTENT INSTANCE int
    DEALLOCATE UNUSED
    DEALLOCATE UNUSED KEEP int K | M
    INSTANCE instance_no
    PARALLEL parallel_clause
If the PARALLEL clause is used it should be the last option.
Specify CACHE if you want the blocks retrieved for this cluster to be placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This may improve performance for small lookup tables.
When you explicitly ALLOCATE an extent, Oracle will ignore the cluster’s storage parameters when allocating space for a new extent. Therefore, specify SIZE only if you do not want to use the default value.

ANALYZE Statement

Update statistics.
Now superceeded by GATHER_SCHEMA_STATS

Syntax:
   ANALYZE TABLE tablename COMPUTE|ESTIMATE|DELETE STATISTICS ptnOption options
   ANALYZE INDEX indexname COMPUTE|ESTIMATE|DELETE STATISTICS ptnOption options
   ANALYZE CLUSTER clustername COMPUTE|ESTIMATE|DELETE STATISTICS options

ptnOption
   PARTITION (partion)
   SUBPARTITION (subpartition)

options
   VALIDATE STRUCTURE [CASCADE] [INTO tablename]
   LIST CHAINED ROWS [INTO tablename]
   COMPUTE|ESTIMATE STATISTICS FOR TABLE
   COMPUTE|ESTIMATE STATISTICS FOR ALL COLUMNS
   COMPUTE|ESTIMATE STATISTICS FOR ALL INDEXED COLUMNS
   COMPUTE|ESTIMATE STATISTICS FOR COLUMNS [SIZE int] column [SIZE int]

   When Estimating statistics you can optionally 
   specify a sample:
    …ESTIMATE STATISTICS SAMPLE n ROWS
    …ESTIMATE STATISTICS SAMPLE n PERCENT
Validate structure will perform an integrity check - and will therefore lock the table/index/cluster while it is running.

If the INTO clause is used to store a list of chained rows in a table - the default tablename is CHAINED_ROWS

UTLCHAIN.sql - creates the chained_rows table

The ANALYZE command is available for all versions of Oracle, however to obtain faster and better statistics use the procedures supplied - in 7.3.4 and 8.0 DBMS_UTILITY.ANALYZE_SCHEMA, and in 8i and above - DBMS_STATS.GATHER_SCHEMA_STATS


No comments:

Post a Comment