
Perl and database resident connection pooling

If you use Perl with Oracle 11g databases, you should consider using database resident connection pooling (DRCP) to reduce the overhead of connecting to and disconnecting from Oracle. Much has been written about how PHP applications benefit from database resident connection pooling (because PHP, unlike Java, does not have a connection pooling mechanism of its own). Perl applications can derive similar benefits.

Most Perl 5 applications use DBI and DBD::Oracle to interact with Oracle databases. Since DBD::Oracle uses OCI to communicate with the Oracle database, it can benefit from database resident connection pooling.

When the database is configured for database resident connection pooling, the Oracle database creates and maintains a pool of database connections. These connections are then shared by the applications connecting to the database. The advantage is that the connections are already created, so you do not incur the overhead of establishing a brand new connection; you simply reuse an existing one. This is especially helpful if you have an application that connects to and disconnects from the database very frequently.

A connection pool can be configured and started in the database as follows:

SQL> execute dbms_connection_pool.configure_pool(null,minsize=>2,maxsize=>4);

SQL> execute dbms_connection_pool.start_pool;

A connect string can be configured in tnsnames.ora to connect to this connection pool, using the following syntax:

RK01POOL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rramdas-us)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = POOLED)
      (SERVICE_NAME = RK01)
    )
  )

The Perl program can then establish a connection to the database using this connect string from tnsnames.ora:

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

# Connect through the DRCP pool via the RK01POOL alias defined in tnsnames.ora
my $dbh = DBI->connect( 'dbi:Oracle:RK01POOL',
                        'scott',
                        'tiger',
) || die "Database connection not made: $DBI::errstr";

That's all it takes, and now you can reap the benefits of Oracle database resident connection pooling from Perl.

You can use dbms_connection_pool.stop_pool to stop the connection pool in the database.
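
For example, from SQL*Plus, mirroring the start_pool call shown earlier:

SQL> execute dbms_connection_pool.stop_pool;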

You can use the data dictionary view dba_cpool_info and the dynamic performance views v$cpool_cc_info, v$cpool_cc_stats and v$cpool_stats to monitor database resident connection pools.
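
A rough monitoring sketch; the columns picked here are illustrative, so check the view definitions in your release:

SQL> select connection_pool, status, minsize, maxsize from dba_cpool_info;

SQL> select num_open_servers, num_busy_servers, num_requests, num_hits, num_misses from v$cpool_stats;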

Notes on Oracle Parallel Query – Part 0

Oracle introduced “Parallel Query” in version 7.1 in 1994. A decade and a half later, it still trips up some unsuspecting customers when they

  • Turn parallelism on via a hint in all queries

Or

  • Turn parallelism on at every table and index level

Doing either of these, coupled with high concurrent usage of the database, can lead to a lot of query server processes being spawned, which in turn leads to very high CPU usage, high load averages, a high number of processes waiting on the CPU run queue and a high volume of I/O requests. Such situations may result in overall degraded performance. Below are some concepts to note when using parallel query. Sometimes too much of a good thing is indeed a bad thing.

Parallel query (and/or parallel DML) is a mechanism to leverage the CPU and I/O resources available on an SMP system by breaking a database operation (query, insert, update, delete, merge) into multiple small operations, executing those small operations concurrently on the system, and thereby completing the database operation significantly faster than if it had executed serially. Database operations that include large table scans, large index scans or large joins usually benefit greatly from parallel query.

Oracle does not turn this feature on by default for all queries. There are specific steps to perform to turn this on as discussed in my previous post.
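
As a quick illustration of the two approaches mentioned at the top of this post (the table name and degree below are just placeholders, not a recommendation), parallelism is typically requested either per statement via a hint, or persistently at the object level:

SQL> select /*+ parallel(s, 8) */ count(*) from sales s;

SQL> alter table sales parallel 8;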

The total number of parallel query servers that can be running at any given time is determined by the initialization parameter parallel_max_servers. This parameter defaults to cpu_count * parallel_threads_per_cpu * (2 if pga_aggregate_target > 0, else 1) * 5.
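
For example, on a hypothetical host with cpu_count = 16, parallel_threads_per_cpu = 2 and a non-zero pga_aggregate_target, the default works out to 16 * 2 * 2 * 5 = 320. You can check the values in effect from SQL*Plus:

SQL> show parameter parallel_max_servers
SQL> show parameter parallel_threads_per_cpu
SQL> show parameter cpu_count
SQL> show parameter pga_aggregate_target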

Oracle’s recommendation is that you set parallel_max_servers to 2 * Degree of Parallelism * Number of concurrent users

The reason for the 2 * DOP factor in the recommendation is that, if the operation can use “Inter Operation” parallelism, it could end up using 2 sets of query server processes at the same time.

“Intra Operation” parallelism means that one operation is executed in parallel using multiple query server processes (e.g. a full table scan). “Inter Operation” parallelism means that the output of one parallel operation is pipelined to another parallel operation (e.g. two tables being full table scanned and then hash joined: one set of query servers scans a table while a second set consumes those rows and builds the hash table). So there are circumstances where an operation ends up using two sets of query server processes at once, hence the recommendation of 2 * DOP.

You ideally want to keep CPU usage under 70%, with load averages no higher than your number of CPUs and average run queues as close to 0 as possible. The idea is to work the server efficiently, but not work it to death.

Keep in mind that on systems like Sun's T2000 and T5240, each hardware thread shows up as a CPU in cpu_count.

You can control the use of parallel query servers with the initialization parameter parallel_adaptive_multiuser (it defaults to true). When enabled, it activates an adaptive algorithm that reduces the degree of parallelism used by queries based on the database load.
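
You can verify the setting from SQL*Plus, and set it explicitly if you want, though it is already the default:

SQL> show parameter parallel_adaptive_multiuser

SQL> alter system set parallel_adaptive_multiuser = true;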

Another common mistake: for statements like insert /*+ append */ ….select….. users forget to enable parallel DML. When run normally, the select part of the insert gets parallelized (as long as a DOP is set), but the insert itself runs serially. Once you enable parallel DML, the insert is parallelized as well.

You have to explicitly issue the command ‘alter session enable parallel dml’ to enable parallel DML. Avoid turning parallel DML on for all sessions and all transactions; turn it on selectively for the sessions that need it. There are some restrictions on using parallel DML.
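
A minimal sketch of the pattern (target and source are hypothetical table names, and the degree of 4 is arbitrary):

SQL> alter session enable parallel dml;

SQL> insert /*+ append parallel(target, 4) */ into target
     select /*+ parallel(source, 4) */ * from source;

SQL> commit;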

Consider increasing parallel_execution_message_size to 4K or 8K from the default of 2K, for improved parallel execution performance.
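
For example, assuming you run from an spfile (the parameter is static, so an instance restart is needed for it to take effect; the value is in bytes):

SQL> alter system set parallel_execution_message_size = 8192 scope = spfile;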

Determining the best degree of parallelism for tables is a tough problem. You need to consider the following factors.

  • What is the required query response time for the end user?
  • How many parallel query servers does it take to achieve that query response time?
  • How many such large queries will execute concurrently on the server (consider both your application generated queries and ad hoc queries)?
  • How many CPUs do I have on the system?
  • How much I/O volume can the system handle?

Once you have the above information, you need to determine a degree of parallelism that gives a reasonable query response time when your average number of large queries executes concurrently, without clobbering your CPU, memory and I/O subsystems (alternatively, you could just determine the number of CPUs, amount of memory, I/O channels and physical disks you need to sustain the above workload, and go buy and use that hardware :-)).
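
While you experiment, you can get a rough feel for how many query servers a candidate degree of parallelism actually consumes under load. The statistic names below are the usual ones in v$pq_sysstat, but verify them in your release:

SQL> select statistic, value from v$pq_sysstat
     where statistic in ('Servers Busy', 'Servers Idle', 'Servers Started');

SQL> select count(*) from v$px_process;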

Also see my previous post on enabling and monitoring parallel query for more parallel query info.