Monthly Archives: September 2011

Monitoring Exadata database machine with Oracle Enterprise Manager 11g

Oracle Enterprise manager Grid control, is hands down the best monitoring and management tool, for the oracle exadata database machine. It comes with plugins to monitor all the hardware components of the database machine, and sensible, preset thresholds for proactive monitoring.

Update (Nov 2011) : Enterprise manager 12c is now available, and Certified to be used with exadata. The master MOS note 1110675.1 covers the installation and configuration details.

Some key points
  • You should use 11gR1 enterprise manager grid control for monitoring.
  • You should use 11gR1 enterprise manager agents, to monitor the targets on the database machine.
  • If you use enterprise wide monitoring tools like tivoli, openview or netcool, use snmp traps from oracle enterprise manager, to notify these monitoring tools (ie dont try to directly use snmp to monitor the exadata components. You could do this but it will be too time consuming).
  • You could potentially use 10.2.0.5 Oem, with 11g agents to monitor the dbmachine, but this is not recommended as a stable/long term solution.
  • The following components (And more) can be monitored using Enterprise Manager
    • Databases hosts
    • Exadata Cells
    • Cisco switch
    • KVM (Keyboard, Video, Mouse)
    • ILOM Monitoring
    • Infiniband switch
    • Power distribution unit (PDU)
You have 3 possible options to configure enterprise manager
  • If you have an existing 11gR1 enterprise manager grid control envrionment, you can patch it with the recommended patches and use that for monitoring the dbmachine targets.
  • You can setup and configure a brand new 11gR1 enterprise manager grid control environment (On a separate server) and configure it to monitor the dbmachine targets.
    • Download the required software
      • Weblogic server 10.3.2 (MOS Note 1106105.1, 1063112.1)
      • Jdk 64 bit (Mos Note 1063587.1)
      • 11gR1 Oms from download.oracle.com
    • Install Java and Web Logic Server (Wls)
      • MOS Note 1063762.1
    • Patch Web Logic Server
      • MOS Note 1072763.1
    • Install 11gR1 Enterprise manager Oracle Management Server (OMS)
      • Install/Create a 11gR2 database to serve as the Enterprise Manager Repository
      • Database pre-reqs for 11.1.0.1 repository (Mos Note 1064441.1)
      • Install/Configure Oms (Mos Notes 1130958.1, 1059516.1)
    • Patch OMS with the required patches to enable database machine monitoring
      • Mos Note 1323298.1
  • You can use an easy install option to setup and configure an enterprise manager environment and configure the plugins.
    • The easy install is delivered as a  patch 11852882 (EMGC setup automation kit)
    • The configuration worksheet has to be filled out properly (Before the installation) and the em.param file has to be generated.
    • Follow the instructions in the readme to do a quick install of a fully configured 11gR1 Enterprise manager installation.
    • This method helps you install/patch  and configure the full 11gR1 oms in just an few steps and is a huge time saver.
Download the required plugins to monitor the following components
Download the plugins from the enterprise manager extensions exchange
http://www.oracle.com/technetwork/database/exadata/index.html#plug-in (Exadata cell plugin)
http://www.oracle.com/technetwork/oem/grid-control/exadata-plug-in-bundle-188771.html (All the rest of the plugins)
Install and Configure the Agent and the Plugins
       Additional tutorials with screenshots on configuring the plugins can be found below
Sending SNMP traps to 3rd party monitoring tools.
  • Get the Mib (Management Information Base) file from your enterprise manager management server and send it to the 3rd party tool administrator (eg: openview or netcool). Follow MOS note 389585.1, to get this MIB file.
  • Then configure your notification methods and rules to send the required snmp traps to the 3rd party tool.

Shell script to generate awr diff reports

Awr diff reports (awrddrpt.sql) are a great tool for comparing database performance characteristics from different time periods. Lot of times, the details in the report can point you to differences in workload profiles, or sql execution differences, helping you to narrow down problem areas.

Awr diff report, can compare the awr data for different time periods

– Period A v/s Period B for the same instance in the same database

– Period A in instance X v/s Period B in instance Y in the same database

– Period A in instance X in database P v/s Period B in instance Y in database Q

I routinely export awr data from different databases (awrextr.sql) and import them to a central database (awrload.sql). However one problem with this is that once it is in the central database, in 11gr2 there is no way of viewing this data via Enterprise Manager. So i end up writing sql statements to compare the data.

Below is a handy shell script that helps you compare two different time periods.

You can run the script using the syntax below.

./genawrd.sh dbid1 inst1 startsnapid1 endsnapid1 dbid2 inst2 startsnapid2 endsnapid2 credentials

eg:
./genawrd.sh 3401191500 3 1111 1112 1346415800 2 757 758 system/manager@rk01
#!/bin/bash

if [ $# != 9 ]
then
echo Syntax  genawrd.sh dbid inst1 startsnapid1 endsnapid1 dbid2 inst2 startsnapid2 endsnapid2 credentials
exit 1
fi

l_dbid=$1
l_instid=$2
l_start_snapid=$3
l_end_snapid=$4

l_dbid2=$5
l_instid2=$6
l_start_snapid2=$7
l_end_snapid2=$8
l_credentials=$9

l_logdir='/u01/Temp/awrs'

l_awr_log_file=${l_logdir}/awrdrpt_${l_instid}_${l_instid2}_${l_start_snapid}_${l_end_snapid}_${l_start_snapid2}_${l_end_snapid2}.html

sqlplus -s $l_credentials << EOC  > /dev/null
set head off
set pages 0
set lines 132
set echo off
set feedback off
set termout off
define inst_num=${l_instid};
define inst_num2=${l_instid2};
define dbid=${l_dbid};
define dbid2=${l_dbid2};
define report_type='html';
define report_name=${l_awr_log_file};
define begin_snap=${l_start_snapid};
define end_snap=${l_end_snapid};
define begin_snap2=${l_start_snapid2};
define end_snap2=${l_end_snapid2};
@$ORACLE_HOME/rdbms/admin/awrddrpi
EOC

Shell script to generate ash report

From time to time i get questions like “Batch program A was running in X minutes on Day Y, Same program ran for 5X minutes on Day Y+1, can you figure out why ? I have found that collecting and analyzing Ash reports for the two given time periods is a good way to start looking at such issues.

The first step is to determine which was the database session that was running the batch program. I step through the following process to identify the database session.

– Find a sql that the batch program executes

– Identify the sqlid for that sql statement

– Query dba_hist_active_sess_history to find the session_id’s executing the sql in a given time period.

select session_id from
dba_hist_active_sess_history
where
sql_id = '&sqlid'
and instance_number = &inst
and snap_id between &Begin and &End
/

Then I run the following shell script to generate the ash report for the given SID and time period.

The syntax for running the script is

./genash.sh dbid 'MM/DD/YY HH24:MI' duration credentials inst sid

eg:
./genashsid.sh 131424457 '01/30/11 11:00' 10 system/manager@rk01 1 1021

 

The actual script is below

#!/bin/bash

if [ $# != 6 ]
then
echo Syntax  genash.sh dbid '01/30/11 11:00' duration credentials inst sid
exit 1
fi

l_dbid=$1
l_start_time=$2
l_starttime_log=`echo $l_start_time|sed 's////g' | sed 's/ //'|sed 's/://'`
l_duration=$3
l_credentials=$4
l_inst=$5
l_sid=$6

l_logdir='/u01/Rk/Docs/ashs'

l_ash_log_file=${l_logdir}/ashrpt_${l_inst}_${l_sid}_${l_starttime_log}.html

sqlplus -s $l_credentials < /dev/null

set head off
set pages 0
set lines 132
set echo off
set feedback off
set termout off

define inst_num=${l_inst};
define dbid=${l_dbid};
define report_type='html';
define report_name=${l_ash_log_file};
define begin_time="${l_start_time}";
define duration=${l_duration};
define target_session_id   = ${l_sid};

define slot_width  = '';
define target_sql_id       = '';
define target_wait_class   = '';
define target_service_hash = '';
define target_module_name  = '';
define target_action_name  = '';
define target_client_id    = '';
define target_plsql_entry  = '';

@$ORACLE_HOME/rdbms/admin/ashrpti
EOC

You can modify the script fairly easily to run it for
– All sessions
– For a specific module name
– A specific sqlid etc etc.

AWR display more than 10 sql’s.

By default AWR display the top 10 sql statements in the “SQL Statistics” section of the report, under the various categories (Elapsed, CPU, gets etc). Sometimes you might want to display more than the top 10 statements, for eg: let us say the top 50. If you want to accomplish this.

In the same session you are executing awrrpti.sql from (or awrrpt.sql) execute the following before you run awrrpti.

 

exec dbms_workload_repository.awr_set_report_thresholds(top_n_sql=>50);