Monday, November 30, 2009

Data Guard - Logical Standby using Hotbackup

Data Guard is a technology used to minimize downtime in case of database failures and user mistakes. There are other solutions, such as RAC and Volume Managers, that protect the database and provide high availability by adding redundancy against single-instance failure. However, while these are good for high availability and scalability, they do not protect against user mistakes, data corruption and other disasters that may destroy the database itself.

In the following paragraphs I am going to describe how to create a Logical standby database using HOT BACKUPS. Before we go into the creation of the database, let us look at the benefits of using a Logical Standby database.

Benefits of Logical Standby database:- As we know, Data Guard can be configured in two ways - Physical Standby Database and Logical Standby Database. The biggest advantage over a Physical standby is that a Logical standby database can be kept in open mode and simultaneously used for reporting. This is possible because, in a Logical standby, SQL statements generated from the archived logs transferred from the Primary database are applied to the tables, and applying SQL requires the database to be in Open mode.
The second important benefit is resource utilization: a Logical Standby database can have extra indexes and materialized views which might be required for reporting purposes, since it is not mandatory to have the same physical structure as with a Physical standby.

=====================================================================================

PRE REQUISITES:- Having looked at the advantages, the primary database must be checked for the following points, which determine how well the standby database will support log apply services:

1. Primary database should be in Archive log mode.
2. Primary database should be checked for objects which are good candidates for the log apply service. If a few objects are found not to be candidates for the log apply service, we can make them candidates by creating a primary key / unique index.
3. Ensure Supplemental logging is enabled for the Primary database.

Below are the commands to check the above requirements:-

Log in as sysdba into the primary database then issue the commands.

1. ARCHIVE LOG LIST;
This will show whether the database is in Archive log mode. If the database is not in Archive log mode, go ahead and put it in archive log mode.
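A minimal sketch of enabling archive log mode (this requires a short outage of the primary database):

SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
ARCHIVE LOG LIST;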

2. SELECT * FROM dba_logstdby_unsupported;

This will show whether the primary database contains tables and datatypes that are not supported by a logical standby database. If the primary database contains unsupported tables, log apply services will exclude those tables when applying changes to the logical standby database.

3. SELECT owner, table_name, bad_column FROM dba_logstdby_not_unique;

OWNER TABLE_NAME B
VCSUSER VCS N

A bad_column value of 'N' indicates that the table contains enough column information to maintain the table in the logical standby database, whereas 'Y' indicates that a table column is defined using an unbounded data type, such as LONG.

Add a primary key to the tables that do not have one, to improve performance.

If the table has a primary key or a unique index with a non-null column, the amount of information added to the redo log is minimal.
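For a table that has no suitable key, one commonly used approach is to add a disabled RELY primary key so that enough column information is logged without the constraint being enforced on the primary. The column name below is purely illustrative:

ALTER TABLE vcsuser.vcs ADD PRIMARY KEY (id) RELY DISABLE;  -- "id" is an example column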

4. Ensure supplemental logging is enabled and log parallelism is enabled on the primary database. Supplemental logging must be enabled because log apply services cannot combine archived redo logs that contain supplemental log data with archived redo logs that do not.

SELECT supplemental_log_data_pk,supplemental_log_data_ui FROM v$database;


SUP SUP
--- ---
YES YES

If supplemental logging is not enabled, the database must be put into supplemental logging mode. To do that, execute the following:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY,UNIQUE INDEX) COLUMNS;

ALTER SYSTEM SWITCH LOGFILE;


If log parallelism is not enabled, execute the following:

ALTER SYSTEM SET LOG_PARALLELISM=1 SCOPE=BOTH;

Since we are going to create the Logical Standby using hot backups, we need to start the Resource Manager. If you do not have a resource_manager plan, you can use one of the system-defined plans and restart the primary database to make sure it is using the defined plan.

ALTER SYSTEM SET RESOURCE_MANAGER_PLAN=SYSTEM_PLAN SCOPE=BOTH;

SHUTDOWN

STARTUP
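A quick way to confirm which plan the instance came back up with:

SHOW PARAMETER resource_manager_plan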


====================================================================================

Once the database has been configured to support the Logical standby, use the following steps to create the Logical standby database.

1. Take a hotbackup of the primary database. Bring a tablespace to backup mode, copy the datafiles of that tablespace using an OS command, and bring the tablespace back online.

SELECT file_name, tablespace_name FROM dba_data_files;

This will give all the tablespace names and the datafiles which we need to back up; this can be done using RMAN or the command line.

Below is an example of how the SYSTEM tablespace can be backed up using the command line. Follow the same steps for all tablespaces.

ALTER TABLESPACE SYSTEM BEGIN BACKUP;

! cp /opt/oracle/oradata/PRIMEDB/system01.dbf /backup/PRIMEDB/HOT/


ALTER TABLESPACE SYSTEM END BACKUP;
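Instead of typing the BEGIN/END BACKUP pair for every tablespace by hand, a small SQL*Plus sketch like the one below can generate the BEGIN BACKUP commands (the spool file name is only an example; a similar query produces the END BACKUP commands):

SET PAGESIZE 0 FEEDBACK OFF
SPOOL /tmp/begin_backup.sql
SELECT 'ALTER TABLESPACE ' || tablespace_name || ' BEGIN BACKUP;'
FROM dba_tablespaces
WHERE contents <> 'TEMPORARY';
SPOOL OFF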

2. Create the backup control file.

ALTER DATABASE BACKUP CONTROLFILE TO '/backup/PRIMEDB/HOT/PRIMEDB_backup.ctl';

3. Bring the database to a QUIESCED State and configure Database Resource Manager for the same.

ALTER SYSTEM QUIESCE RESTRICTED;

The above task can be skipped only if you are using 10g Release 2.

4. Build the LogMiner dictionary.

EXECUTE DBMS_LOGSTDBY.BUILD;

5. Identify the archived redo log that contains the LogMiner dictionary and the starting SCN.

You need to switch the log file, to create the archived log, and then query from V$ARCHIVED_LOG.

ALTER SYSTEM ARCHIVE LOG CURRENT;

SELECT NAME FROM V$ARCHIVED_LOG
WHERE (SEQUENCE#= (SELECT MAX(SEQUENCE#) FROM V$ARCHIVED_LOG
WHERE DICTIONARY_BEGIN = 'YES' AND STANDBY_DEST='NO')
);


NAME
-----------------------------------------------------------
/opt/oracle/ARC/MYFN/MYFN_0001_0000000005.arc

SELECT MAX(FIRST_CHANGE#) FROM V$ARCHIVED_LOG
WHERE DICTIONARY_BEGIN='YES';


MAX(FIRST_CHANGE#)
------------------
2856516

Note: Remember to record the name of the archived redo log for use later in the creation process.

6. Bring the database back to normal (UNQUIESCED).

ALTER SYSTEM UNQUIESCE;

ALTER SYSTEM SWITCH LOGFILE;

7. Create the parameter file from spfile in the Primary database. The pfile created will be used to create the pfile of the standby database.

CREATE PFILE='/backup/PRIMEDB/HOT/Primary_init.ora' FROM SPFILE;

8. Copy Files from the Primary Database Location to the Standby Location.

Use an operating system copy utility to copy the following binary files from the primary database site to the standby site:

i) Backup data files
ii) Backup of control files
iii) Latest archived redo log
iv) init.ora file
v) password file


9. Set the init.ora Parameters on the Logical standby site.

DB_NAME='PRIMEDB'
INSTANCE_NAME='PRIMEDB_H'
LOG_ARCHIVE_DEST_1='LOCATION=/opt/oracle/ARC/PRIMEDB MANDATORY'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_FORMAT='PRIMEDB_%T_%S.ARC'
REMOTE_ARCHIVE_ENABLE=RECEIVE
LOG_ARCHIVE_START=TRUE
LOG_PARALLELISM=1
PARALLEL_MAX_SERVERS=9
STANDBY_FILE_MANAGEMENT='AUTO'
STANDBY_ARCHIVE_DEST='/opt/oracle/ARC/PRIMEDB/STDBY'

# The following parameter is required only if the primary and standby databases
# are located on the same system.

LOCK_NAME_SPACE=PRIMEDB_H

10. Configure the Listener for Both the Primary and Standby Databases and Restart/reload the listener(s).

11. Because the online logs were not copied from the primary system, the redo logs will have to be cleared. To clear the redo logs, use the following statement for all the groups:

ALTER DATABASE CLEAR LOGFILE GROUP 1;

ALTER DATABASE CLEAR LOGFILE GROUP 2;


12. Recover the Logical Standby Database up to the SCN recorded in step 5 above.

Use the command below to recover the database:

ALTER DATABASE RECOVER AUTOMATIC FROM '/opt/oracle/ARC/PRIMEDB_H/'
UNTIL CHANGE 2856516 USING BACKUP CONTROLFILE;


If the error 'ORA-279: change %s generated at %s needed for thread %s' appears, the automatic recovery will have to be cancelled and the remaining logs applied manually using the following command:

ALTER DATABASE RECOVER LOGFILE '/opt/oracle/ARC/PRIMEDB_H/MYDB_0001_0000000004.arc';

13. Turn on Data Guard on the Logical Standby Database.

ALTER DATABASE GUARD ALL;

14. Open the Logical Standby Database:

ALTER DATABASE OPEN RESETLOGS;

15. Reset the Database Name of the Logical Standby Database:

Run the Oracle DBNEWID (nid) utility to change the database name of the logical standby database. This will change the database name in the control file. Until now, the dbname in the control file is the primary db name.

SHUTDOWN IMMEDIATE;

STARTUP MOUNT PFILE=''

EXIT


$> export ORACLE_SID=MYDB_H
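At this point the DBNEWID utility itself is run from the operating system prompt while the database is mounted; a typical invocation (the SYS password shown is only a placeholder) looks like:

$> nid TARGET=SYS/password DBNAME=MYDB_H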

Modify the init.ora file and set parameter DB_NAME=MYDB_H, and if password file is used then delete and recreate the password file.

SHUTDOWN IMMEDIATE;

16. Create a server parameter file for the standby database:

CREATE SPFILE FROM PFILE=;

17. Restart the Logical Standby Database:

STARTUP MOUNT;
ALTER DATABASE OPEN RESETLOGS;


18. Create a New Temporary File for the Logical Standby Database:

The temporary files are not included as part of the backup operation, so the temporary file will have to be recreated manually.

To create a Temporary file:

ALTER TABLESPACE TEMP ADD TEMPFILE '/opt/oracle/oradata/PRIMEDB_H/temp01.dbf' SIZE 40M REUSE;

19. Register the Archived Redo Log and Start SQL Apply Operations:

To register the most recently archived redo log and begin applying data from the redo logs to the standby database, perform the following steps:

ALTER DATABASE REGISTER LOGICAL LOGFILE
'/opt/oracle/ARC/MYDB_H/MYDB_0001_0000000005.arc';


Specify the following SQL statement to begin applying redo logs to the logical standby database using the SCN number:

ALTER DATABASE START LOGICAL STANDBY APPLY INITIAL 2856516;

Note: The INITIAL keyword has to be used only for the first time. To apply redo logs thereafter, the following statements should be used.

ALTER DATABASE STOP LOGICAL STANDBY APPLY;

ALTER DATABASE START LOGICAL STANDBY APPLY;
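Once logs are being applied, progress of SQL Apply can be monitored on the standby using the DBA_LOGSTDBY_PROGRESS view, for example:

SELECT applied_scn, newest_scn FROM dba_logstdby_progress;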


20. Enable Archiving in Logical Standby Database:

This step has to be performed in the Primary Database to enable archiving to the Logical Standby Database. Use the following statements to start archiving and remote archiving:

ALTER SYSTEM SET LOG_ARCHIVE_DEST_3='SERVICE=payroll3 LGWR NOAFFIRM' SCOPE=BOTH;

ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_3=ENABLE SCOPE=BOTH;


To start remote archiving either of the following statements can be used:

ALTER SYSTEM ARCHIVE LOG CURRENT;

ALTER SYSTEM SWITCH LOGFILE;
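On the standby, the registered logs can be listed to confirm that redo is arriving, for example:

SELECT sequence#, first_change#, next_change# FROM dba_logstdby_log ORDER BY sequence#;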


This completes the configuration of Logical Standby Database from Hot Backup.

Finally, SQL Apply with a Logical Standby Database is a viable option for customers who need to implement a disaster recovery or high availability solution and use the same resources for reporting and decision support operations. Success in creating a Logical Standby Database depends a lot on how the tasks are executed and on the version being used. It is very important, before starting the creation of a Logical Standby Database, to make sure that all the initialization parameters are set correctly, that all the steps are followed in the correct order and that the appropriate parameters are used. If everything is done properly, you should be able to do a clean configuration of the Logical Standby Database in the first go.

Thursday, November 12, 2009

Database Migration to ASM

Since ASM files cannot be accessed via the operating system, we need to use RMAN to move database objects from a non-ASM disk location to an ASM disk group.
Following are the steps to move objects:-
1. Note the filenames of the control files and the online redo log files.
2. Shut down the database NORMAL, IMMEDIATE or TRANSACTIONAL.
3. Backup the database.
4. Edit the SPFILE to use OMF for all the file destinations (a sample of the parameter changes is shown after the RMAN script below).
5. Edit the SPFILE to remove the CONTROL_FILES parameter.
6. Run the following RMAN script, substituting your specific filenames as needed:-

STARTUP NOMOUNT;
RESTORE CONTROLFILE FROM '';
ALTER DATABASE MOUNT;
BACKUP AS COPY DATABASE FORMAT '+disk group destination';
SWITCH DATABASE TO COPY;
SQL "ALTER DATABASE RENAME TO '+disk group destination>' ";
ALTER DATABASE OPEN RESETLOGS;
Delete or archive the old database files.
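For steps 4 and 5 above, the SPFILE changes might look like the sketch below; the disk group names +DATA and +FRA are only examples and should be replaced with your own:

ALTER SYSTEM SET DB_CREATE_FILE_DEST='+DATA' SCOPE=SPFILE;
ALTER SYSTEM SET DB_CREATE_ONLINE_LOG_DEST_1='+FRA' SCOPE=SPFILE;
ALTER SYSTEM RESET CONTROL_FILES SCOPE=SPFILE SID='*';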

Wednesday, November 11, 2009

Database BACK UP and RECOVERY

Below is the pictorial for BACKUP and RECOVERY..

Let me know if you have any questions..

Tuesday, November 10, 2009

ASM - Oracle 10g New Feature

ASM - Automatic Storage Management
ASM is one of the most valuable features in Oracle 10g; it helps DBAs manage thousands of datafiles easily. The main feature of ASM is the creation of Disk Groups, composed of disk devices and the files that reside on those devices, managed as a logical unit.
ASM builds on OMF and combines it with mirroring and striping features to provide a robust file system and logical volume manager, which is also supported with RAC. There is no need for a third-party Logical Volume Manager now. ASM stripes files, not logical volumes, which enhances performance by automatically spreading database objects over multiple devices and increases availability by allowing new disks to be added without shutting down the database. It automatically rebalances the distribution of files with minimal intervention.

Lets see how it works:-
ASM Architecture
As we know, database objects are stored in extents; ASM divides these extents across the disks in the disk groups to enhance performance and reliability. ASM also mirrors the database objects instead of mirroring entire disk volumes, providing the flexibility to mirror or stripe each database object differently depending on its type; mirroring can optionally be skipped if the underlying hardware is already RAID enabled.
Automatic rebalancing is yet another feature of ASM. When it is required to increase the disk space, disks can be added to the disk group and ASM moves a proportional number of files from one or more existing disks to the new disks to maintain the overall I/O balance across all disks. This all happens while the disk files are online and available to the users.

ASM requires a special type of instance to provide the interface between a traditional instance and the file system. It can be chosen while installing the database, when choosing the storage type. Choosing ASM does not mean that we cannot use the traditional ways of storage, e.g. raw devices or file system files. However, it gives us the ease of converting an existing filesystem to ASM disk groups.


There are three new background processes which support ASM Instances: RBAL, OSMB and ORBn.
RBAL:- RBAL coordinates the disk activity for disk groups. It also opens and closes the disks in the disk group on behalf of the database.
OSMB:- It performs the communication between the database and the ASM Instance
ORBn:- Performs actual extent movement between disks in the disk groups. Where n can be from 0 to 9.

Creation of the ASM Instance can be checked against the database installation guidelines provided with the installation guide. We are not discussing that here. Let's see how to access, start and stop an ASM instance, and the disk group related features.

Accessing an ASM Instance:-
Since an ASM instance does not have a data dictionary, access to the ASM instance is restricted to users who can authenticate with the operating system, connecting as SYSDBA or SYSOPER users that are in the dba group.

SYSDBA has all the privileges as usual; however, SYSOPER has a much more limited set of commands available in an ASM instance. Following is the list of operations available to SYSOPER:-
i) Starting and Shutting down an Instance.
ii) Mounting or dismounting a disk group
iii) Rebalancing a disk group
iv) Altering a disk group’s disk status to ONLINE or OFFLINE
v) Performing an integrity check of a disk group
vi) Accessing V$ASM_* dynamic performance Views.

ASM Initialization Parameters:-
A number of initialization parameters are specific to an ASM instance. An spfile is recommended over an initialization parameter file for an ASM Instance; e.g. ASM_DISKGROUPS will then be maintained automatically when a disk group is added or dropped, freeing the DBA from manually changing the value.
Following are the Initialization parameters:-
INSTANCE_TYPE:- INSTANCE_TYPE has a value of ASM for an ASM Instance; the default for traditional instances is RDBMS.
DB_UNIQUE_NAME:- The default value for the ASM instance is +ASM and it must be unique for the group of ASM Instances within a cluster or on a single node. The default value has to be changed only when more than one ASM Instance is running on a single node.
ASM_POWER_LIMIT:- ASM_POWER_LIMIT controls how fast rebalance operations occur. The value ranges from 1 to 11, with 11 being the highest possible value; 1 is the default. Since this is a dynamic parameter, one can set it to 1 during the day when database usage is high and set it higher overnight whenever a disk rebalancing operation must occur.
ASM_DISKSTRING:- ASM_DISKSTRING specifies one or more strings, which are OS dependent, to limit the disk devices that can be used to create disk groups. If the value is NULL, all disks visible to the ASM instance are potential candidates for creating disk groups.

SQL> Show parameter asm_diskstring

ASM_DISKGROUPS:- ASM_DISKGROUPS specifies a list containing the names of the disk groups to be automatically mounted by the ASM instance at startup or by the ALTER DISKGROUP ALL MOUNT command.
LARGE_POOL_SIZE:- All internal packages are executed from this pool, so the parameter should be set to at least 8MB. This parameter is used in both kinds of instances; however, it is used differently for ASM instances.
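Putting these together, a minimal init.ora for an ASM instance might look like the sketch below; the disk group names and the discovery string are only examples:

INSTANCE_TYPE=ASM
DB_UNIQUE_NAME=+ASM
ASM_DISKGROUPS='DATA','FRA'      # example disk group names
ASM_DISKSTRING='/dev/raw/raw*'   # example discovery string
ASM_POWER_LIMIT=1
LARGE_POOL_SIZE=12M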
ASM Instance STARTUP and SHUTDOWN:-
At startup, an ASM instance is only MOUNTED, never OPEN. Since there is no control file, database or data dictionary to mount, ASM disk groups are mounted instead of a database. STARTUP NOMOUNT starts up the instance but does not mount the ASM disks. We can also use STARTUP RESTRICT to temporarily prevent database instances from connecting to the ASM instance to mount disk groups.
Issuing a SHUTDOWN command on an ASM Instance performs the same kind of SHUTDOWN, but before it finishes the SHUTDOWN operation it waits for all dependent databases to shut down. The only exception is SHUTDOWN ABORT.
For multiple ASM instances sharing disk groups, such as in RAC environment, the failure of an ASM instance does not cause the database instances to fail. Instead another ASM Instance performs a recovery operation for the failed instance.
ASM Dynamic Performance Views:-
The ASM instance exposes its metadata through a set of V$ASM_* views such as V$ASM_DISKGROUP, V$ASM_DISK, V$ASM_FILE, V$ASM_ALIAS, V$ASM_TEMPLATE, V$ASM_CLIENT and V$ASM_OPERATION.
ASM Filenames
All ASM files are OMF, so the details of the actual filename within the disk group are not needed for most administrative functions. When an object in an ASM disk group is dropped, the file is automatically deleted. V$DATAFILE shows the actual filenames within each disk group.

SQL > Select file#, name, blocks from v$datafile;

File# NAME BLOCKS
--------- ----------------------------- --------------------
1 +DATA1/rac0/datafile/system.256.1 57600
2 +DATA1/rac0/datafile/sysaux.257.1 44800
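A similar quick look at the disk groups themselves can be taken from the ASM instance:

SQL> SELECT name, type, total_mb, free_mb FROM v$asm_diskgroup;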

ASM filenames can take one of six different formats; the most commonly used are:
Fully qualified names (+group/dbname/file type/tag.file.incarnation)
Numeric names (+DATA2.256.1)
Alias with Template names - we can use an alias with a template only when creating a new ASM file.

SQL> create tablespace users4 datafile '+data2/uspare(datafile)';

Incomplete names – we can use incomplete file name format either for single file or multiple file creation operation.

SQL> create tablespace users5 datafile '+data1';

ASM File types and Templates:-ASM supports all types of files used by the database except for files such as operating system executables.
CONTROLFILE, DATAFILE, ONLINELOG, ARCHIVELOG, TEMPFILE, BACKUPSET, PARAMETERFILE, DATAGUARDCONFIG, FLASHBACK, CHANGETRACKING, AUTOBACKUP, DUMPSET AND XTRANSPORT(Cross platform datafiles)

Administering ASM Disk Groups
As mentioned earlier, disk groups benefit us in a number of ways: I/O performance is improved, availability is increased, and the ease with which one can add a disk to a disk group, or add an entirely new disk group, enables you to manage many more databases in the same amount of time.

Disk Group Architecture:-

Disk Group is collection of physical disks managed as a unit. Every ASM disk, as part of a disk group has an ASM disk name that is either assigned by the DBA or automatically assigned when it is assigned to the disk group.

Files in the disk group are striped on the disks using either coarse striping or fine striping. Coarse striping spreads files in units of 1MB each across all disks. Coarse striping is appropriate for a system with a high degree of concurrent small I/O requests, such as an OLTP env. Alternatively fine striping spreads files in units of 128KB and is appropriate for traditional data warehouse environments or OLTP systems with low concurrency and maximizes response time for individual I/O requests.

Failure Groups and Disk Group Mirroring:-

Before defining the type of mirroring with in a disk group, one must group disks into failure groups. A failure group is one or more disks within a disk group that share a common resource, such as a disk controller, whose failure would cause the entire set of disks to be unavailable to the group. In most cases, an ASM instance does not know the hardware and software dependencies for a given disk. Therefore, unless you specifically assign a disk to a failure group, each disk in a disk group is assigned to its own failure group.
The type of mirroring depends on the number of failure groups available within the disk group. Three types of mirroring are available:

External Redundancy: External redundancy requires only one failure group and assumes that the disk is not critical to the ongoing operation of the database or that the disk is managed externally with high availability hardware such as a RAID controller.
Normal Redundancy: Normal redundancy provides two way mirroring and requires at least two failure groups within a disk group. The failure of one of the disks in a failure group does not cause any downtime for the disk group or any data loss other than a slight performance hit for queries against objects in the disk group.
High Redundancy: High redundancy provides three way mirroring and requires at least three failure groups within a disk group. The failure of disks in two out of the three failure groups is for the most part transparent to the database users as in normal redundancy mirroring.

Mirroring is managed at a very low level: extents, not disks are mirrored, each disk will have a mixture of both primary and mirrored (secondary and tertiary) extents on each disk. While there is a slight overhead incurred for managing mirroring at the extent level, it provides the advantage of spreading the load from the failed disk to all other disks instead of a single disk.
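Putting failure groups and redundancy together, a normal redundancy disk group could be created from the ASM instance roughly as follows; the disk group name, failure group names and device paths are only examples:

CREATE DISKGROUP data NORMAL REDUNDANCY
FAILGROUP ctlr1 DISK '/dev/raw/raw1', '/dev/raw/raw2'
FAILGROUP ctlr2 DISK '/dev/raw/raw3', '/dev/raw/raw4';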


More to come in next post......

Display the Amount of Physical Memory and Swap Space on UNIX Systems

Configuring and managing memory on the Oracle server is one of the critical tasks while installing and managing databases. One question always comes to mind:

How Much Swap Space Should Be Configured?
Unless otherwise specified, the most sensible rule of thumb is to have enough swap allocated to equal 2-3 times your amount of physical memory. If you have more than 1 GB of memory, this rule can be reduced to swap equaling 1.5 times the amount of physical memory. Considering the relatively inexpensive prices of physical storage versus physical memory, there really isn't any justification for not having adequate swap space configured on a UNIX system.

OS Specific Commands:
======================================================================
AIX:
/usr/sbin/lsattr -E -l sys0 -a realmem
/usr/sbin/lsps -s
HP-UX:
grep Physical /var/adm/syslog/syslog.log
/usr/sbin/swapinfo -t
Linux:
cat /proc/meminfo | grep MemTotal
/sbin/swapon -s
Solaris:
/usr/sbin/prtconf | grep "Memory size"
/usr/sbin/swap -s
Tru64:
vmstat -P | grep -i "Total Physical Memory ="
/sbin/swapon -s

Monday, November 9, 2009

AWK Help

UNIX Utilities - awk
The awk command line is:
awk [program | -f programfile] [flags/variables] [files]
Command line flags
-f file -- Read the awk script from the specified file rather than the command line
-F re -- Use the given regular expression re as the field separator rather than the default "white space"
variable=value -- Initialize the awk variable with the specified value
An awk program consists of one or more awk commands separated by either \n or semicolons.
The structure of awk commands
Each awk command consists of a selector and/or an action; both may not be omitted in the same command. Braces surround the action.
selector only -- the default action is print
{action} only -- the selector is every line
selector {action} -- perform action on each line where selector is true
Each action may have multiple statements separated from each other by semicolons or \n
Line selection
A selector is either zero, one, or two selection criteria; in the latter case the criteria are separated by commas
A selection criterion may be either an RE or a boolean expression (BE) which evaluates to true or false
Commands which have no selection criteria are applied to each line of the input data set
Commands which have one selection criterion are applied to every line which matches or makes true the criterion depending upon whether the criterion is an RE or a BE
Commands which have two selection criteria are applied to the first line which matches the first criterion, the next line which matches the second criterion and all the lines between them.
Unless a prior applied command has a next in it, every selector is tested against every line of the input data set.
Processing
The BEGIN block(s) is(are) run (mawk's -v runs first)
Command line variables are assigned. For each line in the input data set: it is read and NR, NF, $1, etc. are set; for each command, its criteria are evaluated; if the criteria are true/match, the command is executed. After the input data set is exhausted, the END block(s) is(are) run.
Elementary awk programming
Constants
Strings are enclosed in quotes (")
Numbers are written in the usual decimal way;
non-integer values are indicated by including a period (.) in the representation.
REs are delimited by /
Variables
Need not be declared
May contain any type of data, their data type may change over the life of the program
Are named as any token beginning with a letter and continuing with letters, digits and underscores
As in C, case matters; since all the built-in variables are all uppercase, avoid this form.
Some of the commonly used built-in variables are:
NR -- The current line's sequential number
NF -- The number of fields in the current line
FS -- The input field separator;
defaults to whitespace and is reset by the -F command line parameter
Fields - Each record is separated into fields named $1, $2, etc
$0 is the entire record
NF contains the number of fields in the current line
FS contains the field separator RE; it defaults to the white space RE, /[]*/
Fields may be accessed either by $n or by $var where var contains a value between 0 and NF
print/printf
print prints each of the values of $1 through $NF separated by OFS then prints a \n onto stdout; the default value of OFS is a blank
print value value ... prints the value(s) in order and then puts out a \n onto stdout;
printf(format,value,value,...) prints the value(s) using the format supplied onto stdout, just like C. There is no default \n for each printf so multiples can be used to build a line. There must be as many values in the list as there are item descriptors in format.
Values in print or printf may be constants, variables, or expressions in any order
Operators - awk has many of the same operators as C, excepting the bit operators. It also adds some text processing operators.
Built-in functions
substr(s,p,l) -- The substring of s starting at p and continuing for l characters
index(s1,s2) -- The first location of s2 within s1; 0 if not found
length(e) -- The length of e, converted to character string if necessary, in bytes
sin, cos, tan -- Standard C trig functions
atan2(x,y) -- Standard quadrant oriented arctangent function
exp, log -- Standard C exponential functions
srand(s), rand() -- Random number seed and access functions
Elementary examples and uses
length($0)>72 -- print all of the lines whose length exceeds 72 bytes
{$2="";print} -- remove the second field from each line
{print $2} -- print only the second field of each line
/Ucast/{print $1 "=" $NF} -- for each line which contains the string 'Ucast' print the first variable, an equal sign and the last variable (awk code to create awk code; a common trick)
BEGIN{FS="/"};NF<4 1="=" ct="1;pr" name="ap" next="next" printf="printf" prv="prv" v="$0}">
Advanced awk programming
Program structure (if, for, while, etc.)
if(boolean) statement1 else statement2 -- if the boolean expression evaluates to true, execute statement1, otherwise execute statement2
for(v=init;boolean;v change) statement -- standard C for loop: assigns v the value of init, then while the boolean expression is true executes the statement and then the v change
for(v in array) statement -- assigns to v each of the values of the subscripts of array, not in any particular order, then executes statement
while(boolean) statement -- while the boolean expression is true, execute the statement
do statement while(boolean) -- execute statement, evaluate the boolean expression and, if true, repeat
statement in any of the above constructs may be either a simple statement or a series of statements enclosed in {}, again like C; a further requirement is that the opening { must be on the line with the beginning keyword (if, for, while, do) either physically or logically via \ .
break -- exit from an enclosing for or while loop
continue -- restart the enclosing for or while loop from the top
next -- stop processing the current record, read the next record and begin processing with the first command
exit -- terminate all input processing and, if present, execute the END command
Arrays
There are two types of arrays in awk - standard and generalized
Standard arrays take the usual integer subscripts, starting at 0 and going up; multidimensional arrays are allowed and behave as expected
Generalized arrays take any type of variable(s) as subscripts, but the subscript(s) are treated as one long string expression.
The use of for(a in x) on a generalized array will return all of the valid subscripts in some order, not necessarily the one you wished.
The subscript separator is called SUBSEP and has a default value of comma (,)
Elements can be deleted from an array via the delete(array[subscript]) statement
Built-in variables
FILENAME -- The name of the file currently being processed
OFS -- Output Field Separator default ' '
RS -- Input Record Separator default \n
ORS -- Output Record Separator default \n
FNR -- Current line's number with respect to the current file
OFMT -- Output format for printed numbers default %.6g
RSTART -- The location of the data matched using the match built-in function
RLENGTH -- The length of the data matched using the match built-in function
Built-in functions
gsub(re,sub,str) -- replace, in str, each occurrence of the regular expression re with sub; return the number of substitutions performed
int(expr) -- return the value of expr with all fractional parts removed
match(str,re) -- return the location in str where the regular expression re occurs and set RSTART and RLENGTH; if re is not found return 0
split(str,arrname,sep) -- split str into pieces using sep as the separator and assign the pieces in order to the elements from 1 up of arrname; use FS if sep is not given
sprintf(format,value,value,...) -- write the values, as the format indicates, into a string and return that string
sub(re,sub,str) -- replace, in str, the first occurrence of the regular expression re with sub; return 1 if successful, 0 otherwise
system(command) -- pass command to the local operating system to execute and return the exit status code returned by the operating system
tolower(str) -- return a string similar to str with all capital letters changed to lower case
Other file I/O
print and printf may have > (or >>) filename or command appended and the output will be sent to the named file or command; once a file is opened, it remains open until explicitly closed
getline var < "file" -- read the next line from the named file into var
Advanced examples and uses
{ split($1,t,":")
$1 = (t[1]*60+t[2])*60+t[3]
print
}
Replaces an HH:MM:SS time stamp in the first field with a seconds since midnight value which can be more easily plotted, computed with, etc.
{ for(i = 1; i<=NF; i++) ct[$i] += 1 }
END { for(w in ct) { printf("%6d %s\n",ct[w],w) } }

This reads a file of text and creates a list containing each unique word along with the number of occurrences of the word in the text.

NR==1 { t0=$1; tp=$1; for(i=1;i<=nv;i++) dp[i]=$(i+1); next }
{ dt=$1-tp; tp=$1
printf("%d ",$1-t0)
for(i=1;i<=nv;i++) { printf("%d ",($(i+1)-dp[i])/dt); dp[i]=$(i+1) }
printf("\n")
}

Take a set of time stamped data and convert the data from absolute time and counts to relative time and average counts. The data is presumed to be all amenable to treatment as integers. If not, formats better than %d must be used.

BEGIN { printf("set term postscript\n") > "plots"
printf("set output 'lpr -Php'\n") > "plots" }
{ if(system("test -s " $1 ".r")) {
print "process1 " $1 ".r " $2
printf("plot '%s.data' using 2:5 title '%s'\n",\
$1,$3) >> "plots"
}
}
END { print "gnuplot < plots" }

BEGIN { if(ll==0) ll=72; b="" }
NF==0 { if(b!="") print b; print ""; b=""; next }
{ if(b=="") b=$0; else b=b " " $0
while(length(b)>ll) {
i = ll
while(substr(b,i,1)!=" ") i--
print substr(b,1,i-1)
b = substr(b,i+1)
}
}
END { print b; print "" }

This will take an arbitrary stream of text (where paragraphs are indicated by consecutive \n) and make all the lines approximately the same length. The default output line length is 72, but it may be set via a parameter on the awk command line. Both long and short lines are taken care of but extra spaces/tabs within the text are not correctly handled.

BEGIN { FS = "\t" # make tab the field separator
printf("%10s %6s %5s %s\n\n",
"COUNTRY", "AREA", "POP", "CONTINENT")
}
{ printf("%10s %6d %5d %s\n", $1, $2, $3, $4)
area = area +$2
pop = pop + $3
}
END { printf("\n%10s %6d %5d\n", "TOTAL", area, pop) }

This will take a variable width table of data with four tab separated fields and print it as a fixed length table with headings and totals.
Important things which will bite you
$1 inside the awk script is not $1 of the shell script; use variable assignment on the command line to move data from the shell to the awk script,
Actions are within {}, not selections
Every selection is applied to each input line after the previously selected actions have occurred; this means that a previous action can cause unexpected selections or selection misses.
Operators " " The blank is the concatenation operator
+ - * / % All of the usual C arithmetic operators, add, subtract, multiply, divide and mod.
== != < <= > >= All of the usual C relational operators, equal, not equal, less than, less than or equal and greater than, greater than or equal
&& || The C boolean operators and and or
= += -= *= /= %= The C assignment operators
~ !~ Matches and doesn't match
?: C conditional value operator
^ Exponentiation
++ -- Variable increment/decrement
Note the absence of the C bit operators &, |, ^, ~, << and >>
[s]printf format items
Format strings in the printf statement and sprintf function consist of three different type of items: literal characters, escaped literal characters and format items. Literal characters are just that: characters which will print as themselves. Escaped literal characters begin with a backslash (\) and are used to represent control characters; the common ones are: \n for new line, \t for tab and \r for return. Format items are used to describe how program variables are to be printed.
All format items begin with a percent sign (%). The next part is an optional length and precision field. The length is an integer indicating the minimum field width of the item, negative if the data is to be left-justified within the field. If the length field begins with a zero (0), then instead of padding the value with leading blanks, the item will be padded with leading 0s. The precision is a decimal point followed by the number of decimal digits to be displayed for various floating point representations. Next is an optional source field size modifier, usually 'l' (ell). The last item is the actual source data type, commonly one of the list below:
d Integer
f Floating point in fixed point format
e Floating point in exponential format
g Floating point in "best fit" format;
integer, fixed point, or exponential; depending on exact value
s Character string
c Integer to be interpreted as a character
x Integer to be printed as hexadecimal
Examples: %-20s Print a string in the left portion of a 20 character field
%d Print an integer in however many spaces it takes
%6d Print an integer in at least 6 spaces; used to format pretty output
%9ld Print a long integer in at least 9 spaces
%09ld Print a long integer in at least 9 spaces with leading 0s, not blanks
%.6f Print a float with 6 digits after the decimal and as many before it as needed
%10.6f Print a float in a 10 space field with 6 digits after the decimal
Awk is a very nice language with a very strange name. In this first article of a three-part series, Daniel Robbins will quickly get your awk programming skills up to speed. As the series progresses, more advanced topics will be covered, culminating with an advanced real-world awk application demo.
");
}
}
}
//-->
In defense of awk
In this series of articles, I'm going to turn you into a proficient awk coder. I'll admit, awk doesn't have a very pretty or particularly "hip" name, and the GNU version of awk, called gawk, sounds downright weird. Those unfamiliar with the language may hear "awk" and think of a mess of code so backwards and antiquated that it's capable of driving even the most knowledgeable UNIX guru to the brink of insanity (causing him to repeatedly yelp "kill -9!" as he runs for coffee machine).
Sure, awk doesn't have a great name. But it is a great language. Awk is geared toward text processing and report generation, yet includes many well-designed features that allow for serious programming. And, unlike some languages, awk's syntax is familiar, and borrows some of the best parts of languages like C, python, and bash (although, technically, awk was created before both python and bash). Awk is one of those languages that, once learned, will become a key part of your strategic coding arsenal.

The first awk
$ awk '{ print }' /etc/passwd

You should see the contents of your /etc/passwd file appear before your eyes. Now, for an explanation of what awk did. When we called awk, we specified /etc/passwd as our input file. When we executed awk, it evaluated the print command for each line in /etc/passwd, in order. All output is sent to stdout, and we get a result identical to catting /etc/passwd.
Now, for an explanation of the { print } code block. In awk, curly braces are used to group blocks of code together, similar to C. Inside our block of code, we have a single print command. In awk, when a print command appears by itself, the full contents of the current line are printed.
$ awk '{ print $0 }' /etc/passwd

In awk, the $0 variable represents the entire current line, so print and print $0 do exactly the same thing.
$ awk '{ print "" }' /etc/passwd

$ awk '{ print "hiya" }' /etc/passwd

Running this script will fill your screen with hiya's. :)
Multiple fields
$ awk -F":" '{ print $1 }' /etc/passwd

Passing -F":" makes awk treat the colon as the field separator, so the command above prints the first field (the username) of every line. Printing two fields together simply concatenates them:

$ awk -F":" '{ print $1 $3 }' /etc/passwd

halt7operator11root0shutdown6sync5bin1....etc.

To keep print $1 $3 readable, add an explicit space between the fields, and literal strings can be mixed in as well:

$ awk -F":" '{ print $1 " " $3 }' /etc/passwd

$ awk -F":" '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd

username: halt uid:7username: operator uid:11username: root uid:0username: shutdown uid:6username: sync uid:5username: bin uid:1....etc.
External scripts
BEGIN { FS=":"}{ print $1 }

The difference between these two methods has to do with how we set the field separator. In this script, the field separator is specified within the code itself (by setting the FS variable), while our previous example set FS by passing the -F":" option to awk on the command line. It's generally best to set the field separator inside the script itself, simply because it means you have one less command line argument to remember to type. We'll cover the FS variable in more detail later in this article.
The BEGIN and END blocks
Normally, awk executes each block of your script's code once for each input line. However, there are many programming situations where you may need to execute initialization code before awk begins processing the text from the input file. For such situations, awk allows you to define a BEGIN block. We used a BEGIN block in the previous example. Because the BEGIN block is evaluated before awk starts processing the input file, it's an excellent place to initialize the FS (field separator) variable, print a heading, or initialize other global variables that you'll reference later in the program.
Awk also provides another special block, called the END block. Awk executes this block after all lines in the input file have been processed. Typically, the END block is used to perform final calculations or print summaries that should appear at the end of the output stream.
Regular expressions and blocks
/foo/ { print }

/[0-9]+\.[0-9]*/ { print }
Expressions and blocks
A selector can also be a boolean expression. This one prints the third field of every line whose first field is exactly "fred":

$1 == "fred" { print $3 }

And this one prints the third field of every line whose fifth field contains the string "root":

$5 ~ /root/ { print $3 }
Conditional statements
Awk also offers a C-like if statement; this script is equivalent to the previous example:

{ if ( $5 ~ /root/ ) { print $3 } }

Both scripts function identically. In the first example, the boolean expression is placed outside the block, while in the second example, the block is executed for every input line, and we selectively perform the print command by using an if statement. Both methods are available, and you can choose the one that best meshes with the other parts of your script.
if statements can also be nested:
{ if ( $1 == "foo" ) { if ( $2 == "foo" )
{ print "uno" }
else { print "one" }
}
else if ($1 == "bar" )
{ print "two" }
else { print "three" }}
The "!" (not) operator can be used either in the selector or inside an if:
! /matchme/ { print $1 $3 $4 }
{ if ( $0 !~ /matchme/ ) { print $1 $3 $4 }}

Both scripts will output only those lines that don't contain a matchme character sequence. Again, you can choose the method that works best for your code. They both do the same thing.
( $1 == "foo" ) && ( $2 == "bar" ) { print }

This example will print only those lines where field one equals foo and field two equals bar.
Numeric variables!
BEGIN { x=0 }
/^$/  { x=x+1 }
END   { print "I found " x " blank lines. :)" }

In the BEGIN block, we initialize our integer variable x to zero. Then, each time awk encounters a blank line, awk will execute the x=x+1 statement, incrementing x. After all the lines have been processed, the END block will execute, and awk will print out a final summary, specifying the number of blank lines it found.
Stringy variables
Awk doesn't have separate numeric and string types; values are stored as strings, but awk converts them as needed for math, so adding 1 to a field containing 1.01 gives 2.01. The same conversion happens in an expression like:

{ print ($1^2)+1 }

If you do a little experimenting, you'll find that if a particular variable doesn't contain a valid number, awk will treat that variable as a numerical zero when it evaluates your mathematical expression.
Lots of operators
Another nice thing about awk is its full complement of mathematical operators. In addition to standard addition, subtraction, multiplication, and division, awk allows us to use the previously demonstrated exponent operator "^", the modulo (remainder) operator "%", and a bunch of other handy assignment operators borrowed from C.
These include pre- and post-increment/decrement ( i++, --foo ), add/sub/mult/div assign operators ( a+=3, b*=2, c/=2.2, d-=6.2 ). But that's not all -- we also get handy modulo/exponent assign ops as well ( a^=2, b%=4 ).
Field separators
Awk has its own complement of special variables. Some of them allow you to fine-tune how awk functions, while others can be read to glean valuable information about the input. We've already touched on one of these special variables, FS. As mentioned earlier, this variable allows you to set the character sequence that awk expects to find between fields. When we were using /etc/passwd as input, FS was set to ":". While this did the trick, FS allows us even more flexibility.
FS="\t+"

Above, we use the special "+" regular expression character, which means "one or more of the previous character".
FS="[[:space:]+]"

While this assignment will do the trick, it's not necessary. Why? Because by default, FS is set to a single space character, which awk interprets to mean "one or more spaces or tabs." In this particular example, the default FS setting was exactly what you wanted in the first place!
FS="foo[0-9][0-9][0-9]"
Number of fields
{if ( NF > 2 ) { print $1 " " $2 ":" $3 }}

Record number
{ # skip header
  if ( NR > 10 ) { print "ok, now for the real information!" }
}

Awk provides additional variables that can be used for a variety of purposes. We'll cover more of these variables in later articles.
We've come to the end of our initial exploration of awk. As the series continues, I'll demonstrate more advanced awk functionality, and we'll end the series with a real-world awk application. In the meantime, if you're eager to learn more, check out the resources listed below.

Multi-line records
Awk is an excellent tool for reading in and processing structured data, such as the system's /etc/passwd file. /etc/passwd is the UNIX user database, and is a colon-delimited text file, containing a lot of important information, including all existing user accounts and user IDs, among other things. I showed you how awk could easily parse this file. All we had to do was to set the FS (field separator) variable to ":".
By setting the FS variable correctly, awk can be configured to parse almost any kind of structured data, as long as there is one record per line. However, just setting FS won't do us any good if we want to parse a record that exists over multiple lines. In these situations, we also need to modify the RS record separator variable. The RS variable tells awk when the current record ends and a new record begins.
As an example, let's look at how we'd handle the task of processing an address list of Federal Witness Protection Program participants:
Jimmy the Weasel
100 Pleasant Drive
San Francisco, CA 12345

Big Tony
200 Incognito Ave.
Suburbia, WA 67890

Ideally, we'd like awk to recognize each 3-line address as an individual record, rather than as three separate records. It would make our code a lot simpler if awk would recognize the first line of the address as the first field ($1), the street address as the second field ($2), and the city, state, and zip code as field $3. The following code will do just what we want:
BEGIN { FS="\n" RS=""}

Above, setting FS to "\n" tells awk that each field appears on its own line. By setting RS to "", we also tell awk that each address record is separated by a blank line. Once awk knows how the input is formatted, it can do all the parsing work for us, and the rest of the script is simple. Let's look at a complete script that will parse this address list and print out each address record on a single line, separating each field with a comma.
BEGIN { FS="\n" RS=""}{ print $1 ", " $2 ", " $3}

If this script is saved as address.awk, and the address data is stored in a file called address.txt, you can execute this script by typing "awk -f address.awk address.txt". This code produces the following output:
Jimmy the Weasel, 100 Pleasant Drive, San Francisco, CA 12345
Big Tony, 200 Incognito Ave., Suburbia, WA 67890
OFS and ORS
In address.awk's print statement, you can see that awk concatenates (joins) strings that are placed next to each other on a line. We used this feature to insert a comma and a space (", ") between the three address fields that appeared on the line. While this method works, it's a bit ugly looking. Rather than inserting literal ", " strings between our fields, we can have awk do it for us by setting a special awk variable called OFS. Take a look at this code snippet.
print "Hello", "there", "Jim!"

The commas on this line are not part of the actual literal strings. Instead, they tell awk that "Hello", "there", and "Jim!" are separate fields, and that the OFS variable should be printed between each string. By default, awk produces the following output:
Hello there Jim!

This shows us that by default, OFS is set to " ", a single space. However, we can easily redefine OFS so that awk will insert our favorite field separator. Here's a revised version of our original address.awk program that uses OFS to output those intermediate ", " strings:
BEGIN { FS="\n" RS="" OFS=", "}{ print $1, $2, $3}

Awk also has a special variable called ORS, called the "output record separator". By setting ORS, which defaults to a newline ("\n"), we can control the character that's automatically printed at the end of a print statement. The default ORS value causes awk to output each new print statement on a new line. If we wanted to make the output double-spaced, we would set ORS to "\n\n". Or, if we wanted records to be separated by a single space (and no newline), we would set ORS to " ".
Multi-line to tabbed
Let's say that we wrote a script that converted our address list to a single-line per record, tab-delimited format for import into a spreadsheet. After using a slightly modified version of address.awk, it would become clear that our program only works for three-line addresses. If awk encountered the following address, the fourth line would be thrown away and not printed:
Cousin Vinnie
Vinnie's Auto Shop
300 City Alley
Sosueme, OR 76543

To handle situations like this, it would be good if our code took the number of records per field into account, printing each one in order. Right now, the code only prints the first three fields of the address. Here's some code that does what we want:
BEGIN { FS="\n" RS="" ORS="" }
{ x=1
while ( xfor ( initial assignment; comparison; increment ) {code block}

Here's a quick example:
for ( x = 1; x <= 4; x++ ) { print "iteration",x }

This snippet will print:
iteration 1
iteration 2
iteration 3
iteration 4

Break and continue
Again, just like C, awk provides break and continue statements. These statements provide better control over awk's various looping constructs. Here's a code snippet that desperately needs a break statement:

while (1) { print "forever and ever..." }

Because 1 is always true, this while loop runs forever. Here's a loop that only executes ten times:

x=1
while (1) {
  print "iteration",x
  if ( x == 10 ) { break }
  x++
}
Here, the break statement is used to "break out" of the innermost loop. "break" causes the loop to immediately terminate and execution to continue at the line after the loop's code block.
The continue statement complements break, and works like this:

x=1
while (1) {
  if ( x == 4 ) {
    x++
    continue
  }
  print "iteration",x
  if ( x > 20 ) { break }
  x++
}
This code will print "iteration 1" through "iteration 21", except for "iteration 4". If iteration equals 4, x is incremented and the continue statement is called, which immediately causes awk to start to the next loop iteration without executing the rest of the code block. The continue statement works for every kind of awk iterative loop, just as break does. When used in the body of a for loop, continue will cause the loop control variable to be automatically incremented. Here's an equivalent for loop:
for ( x=1; x<=21; x++ ) {
  if ( x == 4 ) { continue }
  print "iteration",x
}

It wasn't necessary to increment x just before calling continue as it was in our while loop, since the for loop increments x automatically.

Arrays
You'll be pleased to know that awk has arrays. However, under awk, it's customary to start array indices at 1, rather than 0:
myarray[1]="jim"myarray[2]=456
When awk encounters the first assignment, myarray is created and the element myarray[1] is set to "jim". After the second assignment is evaluated, the array has two elements.
Iterating over arrays
Once defined, awk has a handy mechanism to iterate over the elements of an array, as follows:
for ( x in myarray ) {print myarray[x]}

This code will print out every element in the array myarray. When you use this special "in" form of a for loop, awk will assign every existing index of myarray to x (the loop control variable) in turn, executing the loop's code block once after each assignment. While this is a very handy awk feature, it does have one drawback -- when awk cycles through the array indices, it doesn't follow any particular order. That means that there's no way for us to know whether the output of above code will be:
jim456
or
456jim

To loosely paraphrase Forrest Gump, iterating over the contents of an array is like a box of chocolates -- you never know what you're going to get. This has something to do with the "stringiness" of awk arrays, which we'll now take a look at.
Array index stringiness
As you know, awk actually stores numeric values in a string format. While awk performs the necessary conversions to make this work, it does open the door for some odd-looking code:
a="1"b="2"c=a+b+3
After this code executes, c is equal to 6. Since awk is "stringy", adding strings "1" and "2" is functionally no different than adding the numbers 1 and 2. In both cases, awk will successfully perform the math. Awk's "stringy" nature is pretty intriguing -- you may wonder what happens if we use string indexes for arrays. For instance, take the following code:
myarr["1"]="Mr. Whipple"print myarr["1"]
As you might expect, this code will print "Mr. Whipple". But how about if we drop the quotes around the second "1" index?

myarr["1"]="Mr. Whipple"print myarr[1]

Guessing the result of this code snippet is a bit more difficult. Does awk consider myarr["1"] and myarr[1] to be two separate elements of the array, or do they refer to the same element? The answer is that they refer to the same element, and awk will print "Mr. Whipple", just as in the first code snippet. Although it may seem strange, behind the scenes awk has been using string indexes for its arrays all this time!
After learning this strange fact, some of us may be tempted to execute some wacky code that looks like this:

myarr["name"]="Mr. Whipple"print myarr["name"]

Not only does this code not raise an error, but it's functionally identical to our previous examples, and will print "Mr. Whipple" just as before! As you can see, awk doesn't limit us to using pure integer indexes; we can use string indexes if we want to, without creating any problems. Whenever we use non-integer array indices like myarr["name"], we're using associative arrays. Technically, awk isn't doing anything different behind the scenes than when we use a string index (since even if you use an "integer" index, awk still treats it as a string). However, you should still call 'em associative arrays -- it sounds cool and will impress your boss. The stringy index thing will be our little secret. ;)

Array tools
When it comes to arrays, awk gives us a lot of flexibility. We can use string indexes, and we aren't required to have a continuous numeric sequence of indices (for example, we can define myarr[1] and myarr[1000], but leave all other elements undefined). While all this can be very helpful, in some circumstances it can create confusion. Fortunately, awk offers a couple of handy features to help make arrays more manageable.
First, we can delete array elements. If you want to delete element 1 of your array fooarray, type:

delete fooarray[1]

And, if you want to see if a particular array element exists, you can use the special "in" boolean operator as follows:

if ( 1 in fooarray ) {print "Ayep! It's there."} else {print "Nope! Can't find it."}

Resources
If you'd like a good old-fashioned book, O'Reilly's sed & awk, 2nd Edition is a wonderful choice.
Be sure to check out the comp.lang.awk FAQ. It also contains lots of additional awk links.
Patrick Hartigan's awk tutorial is packed with handy awk scripts.

Raw Devices in Oracle

Raw Devices and Oracle
*********************
1. What is a raw device?
A raw device, also known as a raw partition, is a disk partition that is not mounted and written to via the UNIX filesystem, but is accessed via a character-special device driver; it is up to the application how the data is written, since there is no filesystem to do this on the application's behalf.
2. How can a raw device be recognised?
In the /dev directory, there are essentially two types of files: block special and character special. Block special files are used when data is transferred to or from a device in fixed size amounts (blocks), whereas character special files are used when data is transferred in varying size amounts.
Raw devices use character special files; a long listing of the /dev directory shows them with a 'c' at the leftmost position of the permissions field,
e.g. crw-rw-rw- 1 root system 15, 0 Mar 12 09:45 rfd0
In addition, character special files usually have names beginning with an 'r', as shown in the above example. Some devices, principally disks, have both a block special device and a character special device associated with them; for the floppy diskette shown above, there is also a device
brw-rw-rw- 1 root system 15, 0 Apr 16 15:42 /dev/fd0
So the presence of a 'c' in a device does NOT necessarily mean this is a raw device suitable for use by Oracle (or another application). Generally, a raw device needs to be created and set aside for Oracle (or whatever application is going to use it) when the UNIX system is set up - therefore, this needs to be done with close co-operation between the DBA and UNIX system administrator. Once a raw device is in use by Oracle, it must be owned by the oracle account, and may be identified in this way.
3. What are the benefits of raw devices?
There can be a performance benefit from using raw devices, since a write to a raw device bypasses the UNIX buffer cache; the data is transferred directly from the Oracle buffer cache to the disk. This is not guaranteed, though; if there is no I/O bottleneck, raw devices will not help. Where there is a bottleneck, the performance benefit can vary from a few percent to something like 40%. Note that the overall amount of I/O is not reduced; it is just done more efficiently. Another, lesser, benefit of raw devices is that no filesystem overhead is incurred, in terms of inode allocation and maintenance, or free block allocation and maintenance.
4. How can I tell if I will benefit from using raw devices?
There are two distinct parts to this: first, the Oracle database and application should be examined and tuned as necessary, using one or both of the following: Server Manager or SQLDBA "monitor fileio", and the UTLBstat/UTLEstat utilities (in $ORACLE_HOME/rdbms/admin).
After checking your Oracle database and application, the next stage is to identify UNIX-level I/O bottlenecks. This can be done using a UNIX utility such as sar or vmstat. See the relevant manual pages for details. If you identify that there is a UNIX-level problem with I/O, now is the time to start using raw devices. This may well require reorganisation of the entire UNIX system (assuming there are no spare partitions available).
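For example, the classic BSTAT/ESTAT snapshot pair can be run from SQL*Plus as sketched below; the script names are the standard ones shipped in $ORACLE_HOME/rdbms/admin, and the resulting report is spooled to report.txt in the current directory.

REM Take the beginning snapshot, let a representative workload run, then take the ending snapshot
@?/rdbms/admin/utlbstat.sql
REM ... representative workload runs here ...
@?/rdbms/admin/utlestat.sql
REM The statistics for the interval are written to report.txt in the current directory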
5. Are there circumstances when raw devices have to be used?
Yes. If you are using the Oracle Parallel Server, all data files, control files and redo log files must be placed on raw partitions so they can be shared between nodes. This is a limitation with the UNIX operating system. Also, if you wish to use List I/O or Asynchronous I/O, some versions of UNIX require the data files and control files to be on raw devices for this to work. Consult your platform-specific documentation for details.
6. Can I use the entire raw partition for Oracle?
No. You should specify a tablespace slightly smaller in size than the raw partition size, specifically at least two Oracle block sizes smaller.
7. Can I use the first partition of a disk for a raw device?
This is not recommended. On older versions of UNIX, the first partition contained such information as the disk partition table or logical volume control information, which if overwritten could render the disk useless. More recent UNIX versions do not have this problem, as disk management is done in a more sophisticated manner; consult your operating system vendor for more details, but if in any doubt do not use the first partition.
8. Who should own the raw device?
You will need to create the raw devices as root, but the ownership should be changed to the oracle account afterwards. The group must also be changed to the dba group (usually called dba).
9. How do I specify a raw device in Oracle commands?
Enclose the full pathname in single quotes, e.g. if there are two raw devices, each 30Mb in size, and the database has a 4K block size, the relevant command would look like this:
create tablespace raw_tabspace datafile '/dev/raw1' size 30712K, '/dev/raw2' size 30712K;
10. Does the Oracle block size have any relevance on a raw device?
It is of less importance than for a UNIX file; the size of the Oracle block can be changed, but it must be a multiple of the physical block size, as it is only possible to seek to physical block boundaries, and hence write only in multiples of the physical block size.
11. How can I back up my database files if they are on raw devices?
You cannot use utilities such as tar or cpio, which expect a filesystem to be present. You must use the dd command, as follows:
dd if=/dev/raw1 of=/dev/rmt0 bs=16k
See the UNIX man page on dd for further details. It is also possible to copy the raw device file (using dd) to a normal UNIX file, and then use a utility such as tar or cpio, but this requires more disk space and has a greater administrative overhead.
12. Providing I am not using Parallel Server, can I use a mixture of raw partitions and filesystem files for my tablespace locations?
Yes. The drawback is that this makes your backup strategy more complicated.
13. Should I store my redo log files on raw partitions?
Redo logs are particularly suitable candidates for being located on raw partitions, as they are write-intensive and in addition are written to sequentially. If Parallel Server is being used, redo logs must be stored on raw partitions.
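For illustration only, adding a redo log group on a hypothetical raw device might look like the command below; the device name and size are assumptions, and (as with datafiles in question 6) the log should be sized slightly smaller than the raw partition.

ALTER DATABASE ADD LOGFILE GROUP 4 ('/dev/rlog1') SIZE 30700K;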
14. Can I use raw partitions for archive logs?
No. Archive logs must be stored on a partition with a UNIX filesystem.
15. Can I have more than one data file on a raw partition?
No. This means you should be careful when setting up the raw partition: too small a size will necessitate reorganisation when you run out of space, whereas too large a size will waste any space the file does not use.
16. Should my raw partitions be on the same disk device?
This is inadvisable, as there is likely to be contention. You should place raw devices on different disks, which should also be on different controllers.
17. Do I need to make my raw partitions all the same size?
This is not essential, but it provides flexibility in the event of having to change the database configuration.
18. Do I need to change any UNIX kernel parameters if I decide to use raw devices?
No, but you may wish to reduce the size of the UNIX buffer cache if no other applications are using the machine.
19. What other UNIX-level changes could help to improve I/O performance?
RAID and disk mirroring can be beneficial, depending on the application characteristics, especially whether it is read or write-intensive, or a mixture.
20. How can I gain further performance benefits, after considering all the above?
You will need to buy more disk drives and controllers for your system, to spread the I/O load between devices.

Capacity Planning

Capacity planning involves two main subject areas, current storage space usage and potential growth. Both current storage and potential growth of a database can be assessed using the same basic tools.
What are those basic tools?
Simple tools for capacity planning can best be explained by starting with the following question. How do we accurately assess the current physical size of an Oracle database? There are various methods. Some methods are highly accurate, but that accuracy will hurt performance. Other methods are extremely fast but under some circumstances can be so inaccurate as to be rendered useless. So how do we assess the current physical size of an Oracle database? The various methods are listed below.
Datafile sizes. Get the sizes of all data-containing datafiles from the operating system. This method could be the most inaccurate of all.
Datafile content sizes. This method requires a join on the DBA_DATA_FILES and DBA_FREE_SPACE performance views. This method accumulates values based on what Oracle calls extents. Extents are the physical chunks by which an Oracle datafile is increased when more space is required. This method is totally inappropriate for an Oracle database with poor physical storage organization, particularly with respect to the storage parameter PCTINCREASE. Most commercial Oracle databases are not properly structured in this respect. This method is more accurate than retrieval of datafile sizes from the operating system but can still produce very unreliable figures.
DBMS_SPACE. This Oracle provided package is utilized to sum up space used and space free at the block level, by database object name. This method is more accurate than both of the datafile methods listed above. However, if many blocks are only partially filled, or block usage parameters are left at Oracle installation default values, the result could be off by as much as 50% in either direction. For a small database this is not an issue, but a small database does not really need capacity planning.
Statistics. Internal Oracle optimization statistics can be generated with either a minimal or a fairly heavy performance impact; the heavier the performance impact, the better the quality of the statistics. Generation of sizing statistics for indexes places an extra burden on performance. Use of statistics is probably the all-round most effective method for capacity planning. However, some older versions of Oracle databases and applications will use rule-based rather than cost-based optimization, and the simple existence of statistics can cause performance problems for rule-based tuned applications. The OPTIMIZER_MODE parameter can resolve these issues at the database level (see the short example after this list of methods). If multiple applications use the same database and they are a mix of rule-based and cost-based applications, then the rule-based applications could have performance problems.
Exact column data lengths. All column lengths in all rows in all tables in a database can be counted using length functions for string fields plus fixed known lengths for all other basic data types. Object data types could be an issue using this method. This method would be the most accurate for a purely relational database (no object data types at all). This method will also affect performance in the extreme.
Oracle Enterprise Manager Capacity Planner Package. The Capacity Planner package in Oracle Enterprise Manager does a lot of the work for you. It also provides lots of very nice graphical representation, automated warnings and other bells and whistles. Be warned though! This package, as with Oracle Enterprise Manager in general, has had many problems in the past. However, the word on the street is that Oracle Enterprise Manager as released with Oracle9i (9.2) is now in excellent condition. Oracle Enterprise Manager will not be covered in this paper.
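Here is the OPTIMIZER_MODE example promised in the Statistics item above. This is only a sketch; whether RULE is the right value depends entirely on the application, and any change should be tested first.

REM For the whole database, in the init.ora (or spfile):
REM   OPTIMIZER_MODE = RULE
REM Or per session, for an individual rule-based application:
ALTER SESSION SET OPTIMIZER_MODE = RULE;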

Now let’s go through each of the above listed methods for capacity assessment and planning in detail.
Datafile Sizes
On Solaris this would require the use of the df -k command, and on NT/2000 either the dir command plus a Perl script, or just a Perl script using a Perl package that allows disk reads from within Perl. The script shown in Figure 1 could be used to validate that an acceptable level of disk space is available on a Solaris box. This script will indicate when disk space is reaching critical levels. Some simple adjustments to this script could be made to compare the current disk space figures in the file diskspace.log with those from previous executions stored in older log files, thus estimating growth rate over a period of time.

Figure 1: A partial script for extracting available disk space
Datafile Content Sizes
The query shown in Figure 2 is a partial script showing a join between the Oracle performance views DBA_DATA_FILES and DBA_FREE_SPACE. Note that this script will not amalgamate datafiles into tablespaces but will show space used for each datafile (temporary datafiles excluded). Effectively this query will assess database size in terms of datafile extents. Whenever an autoextensible datafile runs out of space a new extent is automatically allocated to that datafile. For a non-autoextensible datafile new extents have to be added manually by resizing the datafile using the ALTER DATABASE command. The DBA_DATA_FILES column BYTES shows the total size of a datafile in bytes. The DBA_FREE_SPACE column BYTES shows the byte size of all free extents in a datafile within a tablespace.
The consistency of extent sizes is largely dependent on settings for the storage parameter PCTINCREASE, for tablespaces and database objects such as tables and indexes. The important point to note about the query in Figure 2 is that the result of byte values in the DBA_FREE_SPACE view can be very inaccurate if PCTINCREASE is set anywhere in the database at greater than 0%. Why? A static database, which has never grown in size, would not be affected by PCTINCREASE if PCTINCREASE has never been applied by the creation of a new extent. This is probably very unlikely. If many new extents are added to a datafile it is quite possible that a new extent could be much larger than expected. The other issue with setting PCTINCREASE greater than 0% is that empty extents, left behind by deletions, will not be reused, since new extents are created larger than old extents and therefore cannot fit into them. Coalescence can help alleviate this problem, but coalescence only manages to join extents that are physically next to each other.
Note: In Oracle9i the PCTINCREASE parameter is largely irrelevant. The default for the CREATE TABLESPACE command is to create a locally managed tablespace, and PCTINCREASE is not set for locally managed tablespaces. The Oracle configuration parameter COMPATIBLE must be set to at least 9.0.0 for this locally managed tablespace default to take effect.
Since many existing Oracle databases are pre-Oracle9i, this script is still relevant.


Figure 2: A partial script for datafile extent sizes
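The Figure 2 script itself is not reproduced here. A minimal sketch of such a join, with column aliases of my own choosing, might look like the query below; datafiles with no free extents at all would drop out of this inner join, so an outer join would be needed to retain them.

SELECT df.TABLESPACE_NAME, df.FILE_NAME,
       ROUND(df.BYTES/1024/1024) AS MB_TOTAL,
       ROUND(SUM(fs.BYTES)/1024/1024) AS MB_FREE,
       ROUND((df.BYTES - SUM(fs.BYTES))/1024/1024) AS MB_USED
FROM DBA_DATA_FILES df, DBA_FREE_SPACE fs
WHERE df.FILE_ID = fs.FILE_ID
GROUP BY df.TABLESPACE_NAME, df.FILE_NAME, df.BYTES
ORDER BY df.TABLESPACE_NAME, df.FILE_NAME;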

DBMS_SPACE
DBMS_SPACE is an Oracle provided package capable of summing all blocks for each database object, namely tables, indexes and clusters. Something like the stored PL/SQL code shown in Figure 3 could be used to assess database space usage for capacity planning, based on used and unused blocks.

The problem with this method is that the settings of block storage parameters such as PCTUSED could give an unrealistic picture of the actual size of the data. PCTUSED defaults to 40% for all tables and indexes. If rows have been deleted from a block, the block will not be reused for new rows until it falls below 40% full. If a database has a lot of delete activity, whether in many tables or a few large ones, this method could give very misleading results.



Figure 3: A PL/SQL script calling DBMS_SPACE.UNUSED_SPACE (untested)
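The Figure 3 script is likewise not reproduced here. A minimal sketch (untested, like the original) that loops over the current schema's tables and calls DBMS_SPACE.UNUSED_SPACE might look like this:

SET SERVEROUTPUT ON SIZE 100000
DECLARE
  v_total_blocks   NUMBER;
  v_total_bytes    NUMBER;
  v_unused_blocks  NUMBER;
  v_unused_bytes   NUMBER;
  v_last_file_id   NUMBER;
  v_last_block_id  NUMBER;
  v_last_block     NUMBER;
BEGIN
  FOR t IN (SELECT table_name FROM user_tables) LOOP
    DBMS_SPACE.UNUSED_SPACE(
      segment_owner             => USER,
      segment_name              => t.table_name,
      segment_type              => 'TABLE',
      total_blocks              => v_total_blocks,
      total_bytes               => v_total_bytes,
      unused_blocks             => v_unused_blocks,
      unused_bytes              => v_unused_bytes,
      last_used_extent_file_id  => v_last_file_id,
      last_used_extent_block_id => v_last_block_id,
      last_used_block           => v_last_block);
    -- Used space = total allocated bytes minus bytes above the high water mark
    DBMS_OUTPUT.PUT_LINE(t.table_name || ': ' ||
      (v_total_bytes - v_unused_bytes) || ' bytes used');
  END LOOP;
END;
/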

Statistics
Before discussing how we can use statistics to capacity plan, let’s go over how we can gather statistics. There are two methods of gathering Oracle database statistics:
The ANALYZE command. Can be used to collect non-optimizer statistics and will be deprecated in a future version of Oracle.

The DBMS_STATS Oracle provided package. Collects optimizer statistics and can be executed in parallel on a multiple CPU system for better performance.
The ANALYZE Command
The command shown below will create non-optimizer statistics for all rows in a specified table.

ANALYZE TABLE table_name COMPUTE STATISTICS;

A more efficient but less accurate form of the same command estimates the values for statistics by sampling 1064 rows from the specified table. Note the optional SAMPLE clause allowing specification of percentage or number of rows.

ANALYZE TABLE table_name ESTIMATE STATISTICS [SAMPLE n {ROWS | PERCENT}];

Estimating statistics is much better for performance but potentially much less accurate than computing all the statistics. Accuracy of estimating statistics depends largely on the size of the database object.
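For example, estimating statistics for a hypothetical table named ORDERS by sampling 20 percent of its rows would look like this:

ANALYZE TABLE orders ESTIMATE STATISTICS SAMPLE 20 PERCENT;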

Note: The ANALYZE command can be used to generate statistics for tables, indexes and clusters.
The DBMS_STATS Package
Statistics can be generated and have all sorts of other things done with them, in various ways, using the DBMS_STATS package. For the purposes of capacity planning the most important aspect of the DBMS_STATS package is that of gathering statistics. Statistics can be gathered for indexes and tables and even for a whole schema or an entire database. The commands below can be used to gather statistics for a single table and a single index.
EXEC DBMS_STATS.GATHER_TABLE_STATS('owner', 'table_name');
EXEC DBMS_STATS.GATHER_INDEX_STATS('owner', 'index_name');
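The same package can also gather statistics for a whole schema or an entire database in one call, for example:

EXEC DBMS_STATS.GATHER_SCHEMA_STATS('owner');
EXEC DBMS_STATS.GATHER_DATABASE_STATS;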
Note: DBMS_STATS is faster at generating statistics than the ANALYZE command.
Using Statistics for Capacity Planning
Calculating the number of blocks or byte size for a table using statistics is simple. After statistics have been generated, query the USER_TABLES Oracle performance view using the query shown below.

SELECT TABLE_NAME, NUM_ROWS,BLOCKS,BLOCKS*db_block_size AS BYTES
FROM USER_TABLES;

Find the db_block_size (database block size), when logged in as SYS or SYSTEM, by executing the query shown below.

SELECT VALUE FROM V$PARAMETER WHERE NAME='db_block_size';

You could then add up the size of all tables in a specified schema, assuming you have generated statistics for all tables in that schema, by executing the following query.

SELECT SUM(BLOCKS*db_block_size) AS BYTES
,SUM(BLOCKS*db_block_size)/1024/1024 AS MB
FROM USER_TABLES;
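In the two queries above, db_block_size is a placeholder to be replaced with the value returned by the V$PARAMETER query. If the schema has SELECT access on V$PARAMETER, a variant like the following picks the block size up in the same statement:

SELECT t.TABLE_NAME, t.NUM_ROWS, t.BLOCKS,
       t.BLOCKS * TO_NUMBER(p.VALUE) AS BYTES
FROM USER_TABLES t, V$PARAMETER p
WHERE p.NAME = 'db_block_size';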

So now we know how to utilize statistics to find the size of all our data. Other objects that could be analyzed and summed up include indexes and clusters. Since clusters are rarely used we will ignore them. Indexes, on the other hand, are generally more numerous than tables. It is often found that the total byte size of all indexes in a schema can be larger than that of all tables. We could estimate the size of indexes based on an assumed table-to-index ratio, but that would be inaccurate.

The INDEX_STATS Oracle performance view is the only view that can be used to calculate an exact size for an index. Generating data into the INDEX_STATS view requires use of the ANALYZE INDEX index_name VALIDATE STRUCTURE; command. This command can cause problems (it locks the index while it runs), and the view contains statistics for only a single index at a time, the one most recently validated.

A simple method of assessing index size is to use index statistics generated by the ANALYZE INDEX command or the DBMS_STATS.GATHER_INDEX_STATS procedure; the USER_INDEXES view will contain the resulting statistics. The query shown below will retrieve a good estimate of size for an index.

SELECT INDEX_NAME,LEAF_BLOCKS*db_block_size/DECODE(BLEVEL,0,1,BLEVEL) AS BYTES
FROM USER_INDEXES;

As with the USER_TABLES view, obtain the total size of all indexes in a schema by executing a query like this one.

SELECT SUM(LEAF_BLOCKS*db_block_size/DECODE(BLEVEL,0,1,BLEVEL)) AS BYTES
,SUM(LEAF_BLOCKS*db_block_size/DECODE(BLEVEL,0,1,BLEVEL))/1024/1024 AS MB
FROM USER_INDEXES;

Dividing the number of leaf blocks in an index by its branch level value is a bit of a guess as far as assessing size goes (the DECODE guards against a branch level of zero). For very large indexes it would be sensible to validate this by executing the ANALYZE INDEX index_name VALIDATE STRUCTURE; command for that particular index and then executing the query shown below to get a better perspective on the actual size of that index.

SELECT NAME,BR_BLKS+LF_BLKS,USED_SPACE AS BYTES FROM INDEX_STATS;

Four queries are shown below in succession. Each query progressively produces a summary of current database storage capacity for objects within a schema. All four queries list byte, megabyte and gigabyte sizes, and assume an 8K (8192-byte) database block size.

The first query lists sizes for all tables in a schema.

SELECT TABLE_NAME
,BLOCKS*8192 AS BYTES
,ROUND(BLOCKS*8192/1024/1024) AS MB
,ROUND(BLOCKS*8192/1024/1024/1024,1) AS GB
FROM USER_TABLES;

The second query shows sizes for all indexes in a schema for each table.

SELECT TABLE_NAME
,SUM(LEAF_BLOCKS*8192/DECODE(BLEVEL,0,1,BLEVEL)) AS BYTES
,ROUND(SUM(LEAF_BLOCKS*8192/DECODE(BLEVEL,0,1,BLEVEL))/1024/1024) AS MB
,ROUND(SUM(LEAF_BLOCKS*8192/DECODE(BLEVEL,0,1,BLEVEL))
/1024/1024/1024,1) AS GB
FROM USER_INDEXES GROUP BY TABLE_NAME;

The third query shows sizes for all tables in a schema plus sizes of indexes created for each of those tables.

SELECT t.TABLE_NAME
,(t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL)))
*8192 AS BYTES
,ROUND((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL)))
*8192/1024/1024) AS MB
,ROUND((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL)))
*8192/1024/1024/1024,1) AS GB
FROM USER_TABLES t, USER_INDEXES i
WHERE t.TABLE_NAME = i.TABLE_NAME
GROUP BY t.TABLE_NAME,t.BLOCKS;

The last query of the four is the same as the query used in the second half of the script sample shown in Figure 4. This query sums total physical size of all tables and indexes in a schema.

SELECT SUM((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL))))
*8192 AS BYTES
,ROUND(SUM((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL)))
*8192/1024/1024)) AS MB
,ROUND(SUM((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL)))
*8192/1024/1024/1024),1) AS GB
FROM USER_TABLES t, USER_INDEXES i
WHERE t.TABLE_NAME = i.TABLE_NAME
GROUP BY t.TABLE_NAME,t.BLOCKS;
The script shown in Figure 4 could be used, in part or in whole, to generate statistics and calculate the size of all tables and indexes in a schema.


Figure 4: Using statistics for capacity planning
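The Figure 4 script is not included here. A minimal sketch along the lines the text describes, gathering schema statistics and then summing table and index blocks (again assuming an 8K block size), might be:

REM Generate statistics for all tables and indexes in the current schema
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(USER);

REM Sum total physical size of all tables and indexes in the schema
SELECT SUM((t.BLOCKS+SUM(i.LEAF_BLOCKS/DECODE(i.BLEVEL,0,1,i.BLEVEL))))
       *8192 AS BYTES
FROM USER_TABLES t, USER_INDEXES i
WHERE t.TABLE_NAME = i.TABLE_NAME
GROUP BY t.TABLE_NAME, t.BLOCKS;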

Exact Column Data Lengths
This method of calculating the physical size of a database involves measuring the lengths of all columns, of all rows in all tables and indexes in the database. This method should be the most accurate but a real killer for performance. How accurate do you want to get? This method does border on the ridiculous. However, when doing a migration from a non-Oracle database such as PostGres, and your database is very small, how else would you be able to predict expected capacities? The simple fact is the smaller your database is, and the larger you expect it to get in the future, the more accurate your capacity planning has to be. This type of scenario is common in startup companies where an initial database is miniscule. These types of environments usually expect unimaginable growth. Sometimes these companies are correct. Admittedly not very often but it does happen. In a situation like this a method such as this would apply.

In describing this approach to capacity planning it is best to keep things simple, so we will stick to simple datatypes: no binary objects or other such nasty, unpredictable things. If non-simple datatypes have to be included then they have to be included, and factored into the calculations.

The simple datatypes in an Oracle database have well-defined lengths. Note that some datatypes make things simple for us because their actual physical lengths are fixed. Also, some datatypes are automatically converted to more general datatypes when applied to columns in tables. Not very efficient use of space, but that is something invented by relational database vendors such as Oracle many years ago.

This method may seem complicated but is actually very simple. The only variable-length simple datatype in Oracle is VARCHAR2. Thus VARCHAR2 columns are the only columns that require applying the LENGTH function to find the length of the data stored in them. All other datatypes have fixed lengths.

If you want, you can execute the following scripts in a tool such as SQL*Plus Worksheet and you will see what I mean. Create the table shown below and insert a single row as shown. This table is deliberately created with a very large mix of simple datatypes.

CREATE TABLE ATABLE(
rnum INTEGER,
vc1 VARCHAR2(4000) DEFAULT '0123456789',
vc2 VARCHAR2(4000) DEFAULT '012',
c1 CHAR(2) DEFAULT 'ab',
c2 CHAR(10) DEFAULT 'abc',
n1 NUMBER DEFAULT 100,
n2 NUMBER(10) DEFAULT 101,
n3 NUMBER(12,4) DEFAULT 103.1234,
f1 FLOAT DEFAULT 1.23334,
si1 SMALLINT DEFAULT 0,
si2 SMALLINT DEFAULT 100,
i1 INTEGER DEFAULT 0,
i2 INTEGER DEFAULT 222,
d1 DATE DEFAULT SYSDATE,
t1 TIMESTAMP DEFAULT SYSDATE
);
INSERT INTO ATABLE(rnum) VALUES(1);
COMMIT;

Now let's examine the column length specifications of the table we have just created by selecting the appropriate columns from the USER_TAB_COLUMNS Oracle performance view. Execute the query shown in Figure 5; I have used SQL*Plus Worksheet because the output formats nicely there.

In Figure 5 all the data lengths are shown for each datatype contained in the table created by the script shown above. Once again, the only variable-length datatype in the table is VARCHAR2, so it is the only datatype whose values need to be measured with the LENGTH function.

The resulting calculation simply multiplies the number of rows in each table by the sum of its column lengths, as shown in Figure 5, adjusting for VARCHAR2 columns by applying the LENGTH function to the column values. This is why this method can seriously hurt performance. Index sizes can be assessed in the same way, by multiplying table rows by the sum of the lengths of the columns in each index created on the table. Index estimates can be thrown off because, in some cases, null values are not stored in indexes. However, most modern databases use object front-end applications. These types of applications tend to avoid composite keys for anything but many-to-many join resolution entities, and those entities usually contain integer key values which are never null.

Figure 5: Querying the USER_TAB_COLUMNS view
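The Figure 5 output is not reproduced here, but the underlying query would be something along these lines:

SELECT COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, DATA_SCALE
FROM USER_TAB_COLUMNS
WHERE TABLE_NAME = 'ATABLE'
ORDER BY COLUMN_ID;

On that basis, a rough per-table size estimate for ATABLE could treat the non-VARCHAR2 columns as fixed (using the DATA_LENGTH values returned above) and measure the VARCHAR2 columns with LENGTH; the constant below is only a placeholder for whatever the fixed-length columns add up to in your own table.

SELECT COUNT(*) * 60  /* placeholder: sum of the fixed-length column sizes */
       + SUM(NVL(LENGTH(vc1),0) + NVL(LENGTH(vc2),0)) AS APPROX_BYTES
FROM ATABLE;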