Monthly Archives: November 2012

Plotting AWR database metrics using R

In a previous post i showed how you can connect from R to the oracle database using the R driver. In this post i will explain how we can run queries against the AWR history tables and gather data that can be plotted using ggplot.

When you install R on linux, like i outlined in the above post, you get an executable named Rscript. Rscript is a NonInteractive variant of the R command, so you can run a R batch file from the linux shell (Like running a bash shell script). I am using Rscript as the interpreter in my script (First line).

ggplot2 is a R library that can be used for plotting in R programs. There is native plotting capability in R and there is another library named lattice. ggplot2 is much more robust and is based on the grammar of graphics. You have to install ggplot2 (install.packages(“ggplot2”)) in R before you can use this.

#!/usr/bin/Rscript
library(ROracle)
library(ggplot2)

 

Process command line arguments. This script expects 3 commandline arguments. Copy each argument to a R variable.

args <- commandArgs(TRUE)
l_dbid <- as.double(args[1])
l_bsnap <- as.double(args[2])
l_esnap <- as.double(args[3])

Connect to Oracle

drv <- dbDriver(“Oracle”)
con <- dbConnect(drv,username=”system”,password=”manager”,dbname=”burl5vb1:1521/rk01″)

Popluate a data frame with the values you will need for bind variables in the query you will be submitting.

my.data = data.frame(dbid = l_dbid, bsnap =l_bsnap,esnap=l_esnap)

Prepare and Execute the query

res <- dbSendQuery(con,”select dhss.instance_number,dhss.snap_id,dhs.end_interval_time et,
round(sum(decode(dhss.metric_name,’User Transaction Per Sec’,dhss.average,0))) utps,
round(sum(decode(dhss.metric_name,’Average Active Sessions’,dhss.average,0))) aas,
round(sum(decode(dhss.metric_name,’Host CPU Utilization (%)’,dhss.average,0))) hcpu,
round(sum(decode(dhss.metric_name,’Buffer Cache Hit Ratio’,dhss.average,0))) bchr,
round(sum(decode(dhss.metric_name,’Logical Reads Per Sec’,dhss.average,0))) lr,
round(sum(decode(dhss.metric_name,’I/O Megabytes per Second’,dhss.average,0))) iombps,
round(sum(decode(dhss.metric_name,’I/O Requests per Second’,dhss.average,0))) iops,
round(sum(decode(dhss.metric_name,’Redo Generated Per Sec’,dhss.average,0))) rg,
round(sum(decode(dhss.metric_name,’Temp Space Used’,dhss.average,0))) ts,
round(sum(decode(dhss.metric_name,’Physical Write Total IO Requests Per Sec’,dhss.average,0))) pw,
round(sum(decode(dhss.metric_name,’Physical Read Total IO Requests Per Sec’,dhss.average,0))) pr
from dba_hist_sysmetric_summary dhss,dba_hist_snapshot dhs
where
dhss.dbid = :1
and dhss.snap_id between :2 and :3
and dhss.metric_name in (
‘User Transaction Per Sec’,
‘Average Active Sessions’,
‘Host CPU Utilization (%)’,
‘Buffer Cache Hit Ratio’,
‘Logical Reads Per Sec’,
‘I/O Megabytes per Second’,
‘I/O Requests per Second’,
‘Redo Generated Per Sec’,
‘Temp Space Used’,
‘Physical Write Total IO Requests Per Sec’,
‘Physical Read Total IO Requests Per Sec’)
and dhss.dbid = dhs.dbid
and dhs.instance_number=1
and dhss.snap_id = dhs.snap_id
group by dhss.instance_number,dhss.snap_id,dhs.end_interval_time
order by 1,2″,data=my.data
)

Fetch the rows, and disconnect from the db.

data <- fetch(res)
dbDisconnect(con)

Open a pdf file to save the graphs to.
Generate the graphs using ggplot.
print the graphs to the pdf file
Close the pdf file.

In the ggplot function call, ET and INSTANCE_NUMBER represent the End Snap Time and Instance Number columns output from the query, and AAS, UTPS, HCPU, PW and PR represent the AverageActiveSessions, UserTransactionPerSecond, HostCpu, PhysicalWrites and PhysicalReads columns from the query.

pdf(“plotstat.pdf”, onefile = TRUE)
p1<-ggplot(data,aes(strptime(ET,format=”%Y-%m-%d %H:%M:%S”),AAS,group=INSTANCE_NUMBER,color=INSTANCE_NUMBER))+geom_point()+geom_line()+ggtitle(“Average Active S
essions”)+labs(x=”Time of Day”,y=”Average Active Sessions”)
p2<-ggplot(data,aes(strptime(ET,format=”%Y-%m-%d %H:%M:%S”),UTPS,group=INSTANCE_NUMBER,color=INSTANCE_NUMBER))+geom_point()+geom_line()+ggtitle(“Transactions Pe
r Second”)+labs(x=”Time of Day”,y=”Transactions Per Second”)
p3<-ggplot(data,aes(strptime(ET,format=”%Y-%m-%d %H:%M:%S”),HCPU,group=INSTANCE_NUMBER,color=INSTANCE_NUMBER))+geom_point()+geom_line()+ggtitle(“CPU Usage”)+lab
s(x=”Time of Day”,y=”Cpu Usage”)
p4<-ggplot(data,aes(strptime(ET,format=”%Y-%m-%d %H:%M:%S”),PW,group=INSTANCE_NUMBER,color=INSTANCE_NUMBER))+geom_point()+geom_line()+ggtitle(“Physical Writes”)
+labs(x=”Time of Day”,y=”Phywical Writes”)
p5<-ggplot(data,aes(strptime(ET,format=”%Y-%m-%d %H:%M:%S”),PR,group=INSTANCE_NUMBER,color=INSTANCE_NUMBER))+geom_point()+geom_line()+ggtitle(“Physical Reads”)+
labs(x=”Time of Day”,y=”Physical Reads”)
print(p1)
print(p2)
print(p3)
print(p4)
print(p5)
dev.off()

You can run this script as follows from the Linux Command Line. The first argument is the dbid, the second argument is the begin snap id and the last argument is the end snap id.

./plotstat.R 220594996 5205 5217

You will then see a pdf document named plotstat.pdf in your directory that has 5 separate graphs in it.
Click on the link below to see a sample file. This is plotting awr data from a 4 node Oracle Rac Database.

plotstat

Click Here to download the whole script, plotstat.R

ggplot2 : Elegant Graphics for Data Analysis is a great book to learn about ggplot2.

Shell script to create a tar archive of oracle trace files.

Whenever you have an oracle database problem and Oracle support asks you to upload the related trace files, the best option is to use the oracle Incident Packaging service to create an archive file that has all the necessary info to be uploaded to oracle.

If you just want to upload all the .trc files generated in the diagnostics trace directory (including but not limited to pmon traces), you can use the following script to generate such an archive file.

The following script accepts

  • The directory name (The location of your trace files)
  • The backup destination directory (The directory where you want the archive to be created. Ensure you have enough space here)
  • The date of the trace files (DD-MON-YYYY)
  • The begin time (HH24MI)
  • The end time (HH24MI)

Then it finds all .trc files that falls in between those begin and end times for the date you specified, from the directory you specified and creates a tar.gz archive file in the destination directory you specified. It creates a directory named trcbakMonDD in your destination directory and places the file in that dir. You can download this file and upload it to oracle.

Usage Example :. /backtraces.sh /u01/11gr2/diag/rdbms/rk01/rk01/trace /tmp ’11-Sep-2012′ 1315 1340

The abov ecommand will backup all .trc files, from the directory  /u01/11gr2/diag/rdbms/rk01/rk01/trace, that have a timestamp between 13:15 and 13:40 on 11th Sep 2012 to a tar Archive in the directory /tmp

I have only tested it on Oracle Enterprise Linux 5. (It is likely that the syntax for the Tar and date commands might be different on different platforms)

Find the script code below

 

#!/bin/bash
#This script can be used to create a tar archive of trace files created in 
#The database diagnostics trace directory between a given time period
#Author : Rajeev Ramdas

if [ $# != 5 ]
then
   echo ./backtraces.sh tracefiledir backupdir DD-Mon-YYYY HH24MI HH24MI
   echo ./backtraces.sh /u01/Rk/Docs/11g/Scripts2 /tmp '09-Nov-2012' 0900 1332
   exit
fi

l_backup_base=$2
l_backdir=trcbak`date --date=${3} +%b%d`
l_backdest=${l_backup_base}/${l_backdir}
l_startdate=`date --date=${3} +%Y%m%d`
l_enddate=`date --date=${3} +%Y%m%d`
l_starttime="${l_startdate}${4}"
l_endtime="${l_enddate}${5}"
l_backfile="${l_backdest}/tracebak-${l_starttime}-${l_endtime}.tar.gz"

if [ ! -d ${1} ]
then
   echo Wrong Backup Dir
   exit 1
fi

if [ ! -d ${2} ]
then
   echo Wrong Backup Dest
   exit 1
fi

if [ -d ${l_backdest} ]
then
   echo Directory Exists
else
   mkdir ${l_backdest}
fi

if [ -f ${l_backfile} ]
then
   rm ${l_backfile}
fi

touch -t "$l_starttime" /tmp/tmpoldfile
touch -t "$l_endtime" /tmp/tmpnewfile

find $1 -type f -newer /tmp/tmpoldfile ! -newer /tmp/tmpnewfile -name '*.trc' |  xargs tar -czvf - | cat > ${l_backfile}

echo Your backup file is ${l_backfile}

Using the R Language with an Oracle Database.

R Programming Language Connectivity to Oracle.

R is an open source programming language and software environment for statistical computing and graphics. It is a fully functional programming language, widely used by statisticians to perform data analysis. It can also be a neat tool for Oracle DBA’s to graph and analyse database performance metrics. If you intend to embark on developing a sizable R+Oracle project, i’d encourage you to use Oracle Enterprise R and/or the Oracle Advanced Analytics.

Below are the steps on how to install and configure the R language on Ubuntu Linux with connectivity to Oracle.

These steps assume that you have an already installed and running Oracle 11gR2 database.

The high level steps are as follows

1) Install the R programming language environment
2) Download and install the oracle instant client
3) Download and install the following R packages
– DBI
– ROracle
4) Start using R with Oracle.

Install the R programming language environment

Refer to the installation instructions at www.r-project.org for your platform.
If you are installing this on Ubuntu Linux (As I have on Ubuntu 12.10), open the “Ubuntu Software Center” and install the following packages.
– R-base
– R-base-dev

Download and install the oracle instant Client

As your regular o/s user, download and install (Installation is nothing other than unzipping the downloaded file) the oracle instant client.
Download The instant client for your o/s platform from http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html.
You need to download
– Instant Client package – Basic
– Instant Client package – SDK

For the purpose of this installation, we are going to assume that the instant client has been installed into /u01/Rk/Apps/oracle/instantclient_11_2.

 Download and install the R packages

DBI

– Download DBI from http://cran.r-project.org/web/packages/DBI/index.html. (Download the package source)
– sudo su –
– cd <To the directory where DBI_0.2-5.tar.gz>

root# R CMD INSTALL DBI_0.2-5.tar.gz
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘DBI’ ...
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘summary’ from package ‘base’ in package ‘DBI’
** help
*** installing help indices
** building package indices
** installing vignettes
‘DBI.Rnw’
** testing if installed package can be loaded

* DONE (DBI)

ROracle

– Download the ROracle source from http://cran.r-project.org/web/packages/ROracle/index.html
– sudo su –
– cd

– Set the following environment variables

root# export OCI_LIB=/u01/Rk/Apps/oracle/instantclient_11_2
root# export LD_LIBRARY_PATH=/u01/Rk/Apps/oracle/instantclient_11_2:$LD_LIBRARY_PATH
root# R CMD INSTALL ROracle_1.1-5.tar.gz
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘ROracle’ ...
** package ‘ROracle’ successfully unpacked and MD5 sums checked
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/u01/Rk/Apps/oracle/instantclient_11_2/sdk/include -fpic -O2 -pipe -g -c rodbi.c -o rodbi.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/u01/Rk/Apps/oracle/instantclient_11_2/sdk/include -fpic -O2 -pipe -g -c rooci.c -o rooci.o
gcc -std=gnu99 -shared -o ROracle.so rodbi.o rooci.o -L/u01/Rk/Apps/oracle/instantclient_11_2 -lclntsh -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/ROracle/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded

* DONE (ROracle)

Using The R Language with Oracle

Now you are ready to Run your first R program, Run a query against the database, and plot the output on a graph.

Invoke the R language command line by typing in the following

$ R

From the R command line use the following commands. (The formatting is a bit messed up, click on “view code” to see the actual commands)

> library(ROracle)
> drv <- dbDriver("Oracle")
> con <- dbConnect(drv,username="sh",password="sh",dbname="burl5vb1:1521/rk01")
> res <- dbSendQuery(con,"select time_id,sum(quantity_sold) from sales
+ where time_id > to_date('20-DEC-2001','DD-MON-RR')
+ group by time_id")
> data <- fetch(res)
> data
               TIME_ID SUM(QUANTITY_SOLD)
1  2001-12-20 23:00:00                473
2  2001-12-21 23:00:00                374
3  2001-12-22 23:00:00               1034
4  2001-12-23 23:00:00               1662
5  2001-12-24 23:00:00                470
6  2001-12-25 23:00:00                289
7  2001-12-26 23:00:00               1076
8  2001-12-27 23:00:00               1196
9  2001-12-28 23:00:00                232
10 2001-12-29 23:00:00                758
11 2001-12-30 23:00:00                786

> plot(data)

You will see a plot like the one below

Happy R scripting.

If you want to learn the R Language, i would recommend the book  The Art of R programming.