Monthly Archives: January 2009

Finding the Difference in Table Stats

In my previous post I have explained how to capture different statistics from the same table into temporary stattab’s for further analysis (Or to provide to support). In 10.2.0.4 and 11g, dbms_stats has some procedures that can help you get a report on the difference between the two sets of statistics.

DBMS_STATS.DIFF_TABLE_STATS_IN_STATTAB is one such procedure. Continuing from my last example where we created two separate stats tables stattab_old and stattab_new, you can now check the difference  between the two sets of stats by running the following sql statement

set long 500000 longchunksize 500000

select report, maxdiffpct from
table(DBMS_STATS.DIFF_TABLE_STATS_IN_STATTAB(null,’DEPT’,’STATTAB_OLD’,’STATTAB_NEW’));

You will see an output similar to the one below (The stats are not really what you will see on a dept table, but you get the general idea)

###############################################################################

STATISTICS DIFFERENCE REPORT FOR:
.................................

TABLE	      : DEPT
OWNER	      : SCOTT
SOURCE A      : User statistics table STATTAB_OLD
	      : Statid	   :
	      : Owner	   : SCOTT
SOURCE B      : User statistics table STATTAB_NEW
	      : Statid	   :
	      : Owner	   : SCOTT
PCTTHRESHOLD  : 10
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NO DIFFERENCE IN TABLE / (SUB)PARTITION STATISTICS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

COLUMN STATISTICS DIFFERENCE:
.............................

COLUMN_NAME	SRC NDV     DENSITY    HIST NULLS   LEN  MIN   MAX   SAMPSIZ
...............................................................................

DEPTNO		A   1	    .000000203 YES  0	    3	 C105  C105  9363
		B   9	    .000000341 YES  0	    3	 C102  C10A  1465265
DNAME		A   21	    .000000203 YES  0	    6	 C4032 C4032 9363
		B   315     .003967048 YES  0	    6	 C4032 C4032 1465265
LOC		A   25110   .000039824 NO   0	    5	 C2061 C32D1 47114
		B   161368  .000006197 NO   0	    5	 C114  C33C3 1465265
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NO DIFFERENCE IN INDEX / (SUB)PARTITION STATISTICS
###############################################################################

NDV – Number of distinct values

Density – 1/NDV

Query Execution Plan Changing due to New Statistics

You might encounter scenarios where your queries execution plan changed. Eg : On Monday your query was running in less than a minute and on Wednesday your query started running for an hour (The query was exactly the same, query criteria (Including bind variables) were the same and the table data has not changed too much). One of the reasons for the query performing badly now, could be that the execution plan has changed.

One of the reasons that the execution plan has changed, could be that the statistics on the table has changed (Between monday and wednesday, gather stats could have run on the table). Wont it be nice if you were able to restore the old stats on the table, which was helping the query run faster ?

There might be a way in Oracle 10g and Higher.

Whenever oracle collects stats on a table using gather_table_stats, oracle stores away the existing stats on the table before updating the table with the newly collected stats. So there is also a mechanism to restore this stats that oracle backed up.

We can use the following steps to restore this stats (Which was good)

  • Let us say that SCOTT.DEPT is the table in question
  • First of all, find out, when the table stats were modified
    select stats_update_time from user_tab_stats_history where table_name = ‘DEPT’;
  • Create a stats table in the schema of the table owner (This will serve as the temporary holder of the current statistics, if you ever have to restore this)
    exec dbms_stats.create_stat_table ( –
    ‘SCOTT’, –
    ‘stattab_new’);
  • Export the existing table statistics to this temporary table (stattab_new)
    exec dbms_stats.export_table_stats ( –
    ‘SCOTT’, –
    ‘DEPT’, –
    null, –
    ‘stattab_new’, –
    null, –
    true, –
    ‘SCOTT’);
  • Restore the old stats, which used to give you a better execution plan
    exec dbms_stats.restore_table_stats ( –
    ‘SCOTT’, –
    ‘DEPT’, –
    ’21-JAN-09 11.00.00.000000 AM -05:00′);

The third argument you give is the time upto which you want to restore the stats. Once you query the table user_tab_stats_history, determine a time when the stats would have been the good statistics (So pick a time in between the analyze which had the good stats and the analyze that had the bad stats). Use that time as the third argument.

Now if you get the query to reparse it should pick up the updated good statistics, use the good execution plan and execute with good performance.

Note : Try and perfect this technique on test databases before running this on production.

Notes on Oracle Parallel Query – Part I

For a good primer on parallel query in oracle please read the white paper, Oracle Sql Parallel Execution

  • The degree of parallelism for a query can be specified
    • By specifying the parallel clause during table/index creation (Or later using alter)
    • Using a parallel hint in the sql statement
    • Using ‘alter session force parallel query parallel integer’ statement
  • If One of the options in the above statement has not been specified (To enable parallelism) then, irrespective of whether the table or index is partitioned or not, oracle does not use parallelism in the query
  • When using The dedicated server connection model (As opposed to the shared server model), the sessions shadow process acts as the query co-ordinator.
  • The query uses as many “parallel execution servers” as determined by the query optimizer, to execute the query (This is based on a varitey of factors)
  • When there is No parallel degree specified at the table level, in the query hint or at the session level, but parallelism is enabled at the session level (using alter session  force parallel query) then the optimizer uses the default degree of parallelism.
  • The Default Degree of Parellism (DOP) is determined by cpu_count x parallel_threads_per_cpu
  • If you specified parallelism by issueing ‘alter session force parallel query parallel integer’ statement, then the value used for integer is used for the degree of parallelism (If a parallel hint is not specified in the query).
  • All of the above statements assume that there are enough query servers available in the pool and parallel_adaptive_multi_user does not reduce the number of parallel execution servers.
  • If interoperation parallelism can be used then you could end up using double the number of parallel execution servers as the degree of parallelism
  • If for some reason your query cannot be allocated enough parallel execution servers (typically when there are multiple sessions and all the parallel execution servers (gated by parallel_max_servers) are currently in use) and parallel_min_percent > 0 (0 is the default so your query will still get run without parallelism), your query will get an error and you can try again later.

Ok enough basics.

Once a query is running, How do you determine what it is doing ? Things like, is it using parallel query ? What is the degree of parallelism it is using ?

As the query you want to diagnose is still executing, you can run the following query from another session connected with DBA privileges.

SELECT qcsid,
sid,
NVL(server_group,0) server_group,
server_set,
degree,
req_degree
FROM SYS.V_$PX_SESSION
ORDER BY qcsid,
NVL(server_group,0),
server_set;

You will get  an output similar to the table below

QCSID	  SID SERVER_GROUP SERVER_SET	  DEGREE REQ_DEGREE
--------- ---------- ------------ ---------- ---------- ----------
170	  170		 0
170	  136		 1	    1	       5	  5
170	  129		 1	    1	       5	  5
170	  134		 1	    1	       5	  5
170	  130		 1	    1	       5	  5
170	  138		 1	    1	       5	  5
170	  137		 1	    2	       5	  5
170	  133		 1	    2	       5	  5
170	  135		 1	    2	       5	  5
170	  139		 1	    2	       5	  5
170	  131		 1	    2	       5	  5

How do you check how many parallel execution servers are currently in use by running ?

select
statistic,
value
from
v$pq_sysstat
where
statistic like ‘Servers Busy%’;

You will get an output similar to the table below

STATISTIC			    VALUE
------------------------------ ----------
Servers Busy			       10

How do you check, which parallel execution servers are in use, by which session ?

select * from v$px_process;

You will see a result similar to the table below

SERV STATUS	      PID SPID		      SID    SERIAL#
---- --------- ---------- ------------ ---------- ----------
P000 IN USE	       27 9555		      129	  47
P001 IN USE	       29 9557		      131	  14
P003 IN USE	       31 9561		      132	 119
P009 IN USE	       37 9573		      133	  13
P008 IN USE	       36 9571		      134	  10
P004 IN USE	       32 9563		      135	   7
P007 IN USE	       35 9569		      136	   9
P002 IN USE	       30 9559		      137	  10
P005 IN USE	       33 9565		      138	   9
P006 IN USE	       34 9567		      139	   6

How can you monitor the progress of the query ?

SELECT   a.qcsid,
a.sid,
a.server_group,
a.server_set,
substr(b.name,1,20) operation,
a.value
FROM     v$px_sesstat a,
v$statname b
WHERE    a.statistic# = b.statistic#
AND      UPPER(b.name) = ‘PHYSICAL READS’
ORDER BY a.qcsid,
a.server_group,
a.server_set;

You will see results similar to the Table below

     QCSID  SID SERVER_GROUP SERVER_SET OPERATION		  VALUE
---------- ---- ------------ ---------- -------------------- ----------
       170  139 	   1	      1 physical reads		      0
       170  130 	   1	      1 physical reads		      0
       170  134 	   1	      1 physical reads		      0
       170  135 	   1	      1 physical reads		      0
       170  129 	   1	      1 physical reads		      0
       170  131 	   1	      2 physical reads		   6980
       170  136 	   1	      2 physical reads		   7136
       170  137 	   1	      2 physical reads		   6404
       170  133 	   1	      2 physical reads		   6436
       170  138 	   1	      2 physical reads		   5988
       170  170 			physical reads		 766852

How do you check, what the parallel execution servers, are waiting on ?

SELECT px.SID “SID”, p.PID, p.SPID “SPID”, px.INST_ID “Inst”,
px.SERVER_GROUP “Group”, px.SERVER_SET “Set”,
px.DEGREE “Degree”, px.REQ_DEGREE “Req Degree”, w.event “Wait Event”
FROM GV$SESSION s, GV$PX_SESSION px, GV$PROCESS p, GV$SESSION_WAIT w
WHERE s.sid (+) = px.sid AND s.inst_id (+) = px.inst_id AND
s.sid = w.sid (+) AND s.inst_id = w.inst_id (+) AND
s.paddr = p.addr (+) AND s.inst_id = p.inst_id (+)
ORDER BY DECODE(px.QCINST_ID,  NULL, px.INST_ID,  px.QCINST_ID), px.QCSID,
DECODE(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP), px.SERVER_SET, px.INST_ID;

You will see results like in the table below

 SID  PID SPID	 Inst Group	   Set Degree Req Degree Wait Event
---- ---- ------ ---- ----- ---------- ------ ---------- ------------------------------
 170   24 9473	    1					 PX Deq: Execute Reply
 131   27 10236     1	  1	     1	    5	       5 PX Deq: Execution Msg
 132   29 10238     1	  1	     1	    5	       5 PX Deq: Execution Msg
 129   20 10234     1	  1	     1	    5	       5 PX Deq: Execution Msg
 134   31 10242     1	  1	     1	    5	       5 PX Deq: Execution Msg
 139   30 10240     1	  1	     1	    5	       5 PX Deq: Execution Msg
 135   36 10252     1	  1	     2	    5	       5 resmgr:cpu quantum
 136   34 10248     1	  1	     2	    5	       5 resmgr:cpu quantum
 137   35 10250     1	  1	     2	    5	       5 direct path read
 130   33 10246     1	  1	     2	    5	       5 resmgr:cpu quantum
 133   32 10244     1	  1	     2	    5	       5 direct path read

If you start seeing a lot of waits on the event “PX Deq Credit: send blkd”, you can start troubleshooting this by identifying the consumers who are blocking the send (Once you identify them you can probably drill down into that session using ash reports). Metalink Note 304317.1 has a query that can be used to identify the blocking consumers.

Once a statement has finished execution, how do you check if it used parallel query, and how was the statement parellelized ?

Issue the following statement from the same session that the query executed

break on tq_id on server_type
SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
FROM v$pq_tqstat
ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

You will see results similar to the table below

DFO_NUMBER	TQ_ID SERVER_TYP PROCESS      NUM_ROWS	    BYTES
---------- ---------- ---------- ---------- ---------- ----------
	 1	    0 Producer	 P005		   287	     2582
	 1			 P006		   287	     2582
	 1			 P007		   287	     2582
	 1			 P008		   287	     2582
	 1			 P009		   287	     2582
	 1	      Consumer	 P000		   315	     2825
	 1			 P001		   340	     3035
	 1			 P002		   270	     2450
	 1			 P003		   250	     2280
	 1			 P004		   260	     2320
	 1	    1 Producer	 P000		     1	       32
	 1			 P001		     1	       32
	 1			 P002		     1	       32
	 1			 P003		     1	       32
	 1			 P004		     1	       32
	 1	      Consumer	 QC		     5	      160

How do you review all the parallel activity happening in your environment ?

select name,value from v$sysstat
where upper(name) like ‘%PARALLEL OPERATIONS%’
or upper(name) like ‘%PARALLELIZED%’ or upper(name) like ‘%PX%’;

You will see results similar to the table below

NAME								      VALUE
---------------------------------------------------------------- ----------
queries parallelized							 10
DML statements parallelized						  0
DDL statements parallelized						  0
DFO trees parallelized							 10
Parallel operations not downgraded					 10
Parallel operations downgraded to serial				  0
Parallel operations downgraded 75 to 99 pct				  0
Parallel operations downgraded 50 to 75 pct				  0
Parallel operations downgraded 25 to 50 pct				  0
Parallel operations downgraded 1 to 25 pct				  0
PX local messages sent						       2350
PX local messages recv'd					       2350
PX remote messages sent 						  0
PX remote messages recv'd						  0

This should be a good starting point to start analyzing parallel query behaviour

Controlling Parallel query in 11g Rac

            In a Rac environment, when you execute a parallel query, it is very likely that the parallel execution servers, get executed on all nodes in a Rac cluster. In releases prior to Oracle 11g, if you wanted to ensure that all the parallel execution servers for a single query get executed on the same node (or a group of nodes) you had to set the initialization parameter parallel_instance_groups

           Starting with Oracle 11g, you can just create services which are only active on certain nodes in the Rac cluster . So when you login to the database using such a service name, your parallel query will only spawn parallel execution servers on the nodes where the service is active. Oracle automatically adjusts the values for parallel_instance_groups (Without you having to explicitly set it) based on your service name you used to connect. Simplifies our life.

11g copy files from asm to cooked file system

If you were using 10g ASM, you had to use the dbms_file_transfer package to copy files from an asm disk group to a cooked file system and vica versa. 11g ASM makes this process simpler. In 11g, the asmcmd command now has a cp command that can copy ASM files to a cooked file system and files from a cooked file system back to ASM.

You can find the details and syntax for this cp comand  here.

Large SGA’s and ORA-2710 on Linux X86_64

Sometimes when you try to enable large SGA’s on X86_64 linux you might run into ora-2710 errors when starting up the database. This could be because of low value being set for the linux kernel parameter shmall.

You should set the value for shmall as follows.

– Total up the sizes for SGA’s for all the databases you are going to run on the machine
– run the linux command getconf PAGE_SIZE to get your linux page size
– Set shmall equal to the sum of all the SGA’s on the system, divided by the page size

Review metalink note 301830.1 for more info.

Using ASM and OCFS2 for database storage with Oracle RAC

Even though i always prefer to use ASM for all aspects of database storage (database files, flash recovery area etc) when using oracle real application clusters, there might be rare cases where you want to mix and match ASM and ocfs2 for the same Rac database. A specific case that i came across was that the customer’s backup software did not support direct backups from rman to tape (Hence they need to backup to a file system and then backup the filesystem to tape).

It is supported to have the oracle database files on ASM and the oracle flash recovery area on OCFS2, if you have a requirement to do so. DBCA supports the creation of such a database (Where the database storage is on a ASM disk group and the Flash recovery area is on OCFS2). In the DBCA screen where you choose the location of the Flash recovery area, you will not be able to click on the <BROWSE> button and pick an OCFS2 mount point (because by default it only shows asm disk groups). However you can type the OCFS2 mount point into the field and dbca accepts it and configures it correctly. DBCA will actually keep (As it does with ASM) one copy of the control file and the second member of the redo log groups on the OCFS2 mount point.

I am not advocating that customers use this configuration, because of the obvious difference of performance charachteristics between ASM and OCFS2. If one wants to use such a configuration then they should consider creating additional asm disk groups to keep copies of the control files and online redo log’s (As opposed to keeping them on OCFS2).