DBCA Templates and the Dangerous “General Purpose” Template

Hi,

I still see many DBAs creating their databases with the “General Purpose or Transaction Processing” template.


Why shouldn’t you use it? Because it is way too general! If you click “View Details” on the right:

Almost all of the “Components” are set to true, so it installs everything whether you use it or not. Is that a bad thing? Yes, because:

  1. Some of them require licenses (which means extra cost). You are installing them anyway, and if someone starts using them, you have to pay Oracle. As a small note, you cannot remove them during the installation (I will explain why in a minute).
  2. After you create your database, your upgrade/patching times will increase dramatically. When you upgrade the database, Oracle upgrades everything in it, including the components, whether you use them or not! This means more downtime (depending on your upgrade/patching method).
  3. This database is created with a predefined setup, like 200 MB redo log files. Of course you can change them, but you might miss something, and that means extra work after database creation. By the way, you cannot change some of them at all, for example the default block size! (I will explain why.)
  4. You cannot use the latest installed versions of “some” options, for example the time zone file! Even if your database home is already patched to the latest time zone file version, if you create the database with the “General Purpose” template you will be surprised when you open it: an old version of the time zone file will be in use! (Again, I am about to explain why.)

Why? Because this is a “used” database! This database is not “new”. The General Purpose template makes things easy for you and speeds up the installation process a little, and the reason is that this database has already been created! Go to your ORACLE_HOME directory (for example /u01/app/oracle/product/19c/dbhome_1) and then to assistants/dbca/templates. The templates that you see in DBCA are here, and so is the seed database. If you use General Purpose, Oracle will RESTORE a database from the Seed_Database.dfb file (and, if this is a container database, pdbseed.dfb too). This is a pre-created database.

This is why, if you use this template, you cannot change the block size: a pre-created database is used, and you cannot change the default block size after creation. This is why your database will have an old time zone file: it is already installed, and you must update the time zone file yourself. And this is why you cannot remove options during the installation: they are already installed.

So, don’t use the General Purpose template. Create your own, or at least use the “Custom Database” template. That way you will have full control over your database, and it is really easy. You could also go to the templates directory I mentioned above, copy General_Purpose.dbc under a new name, and then open it and change whatever you want, but I don’t recommend that, because the General Purpose template has some extra sections and is missing others, since it restores a seed database.

I won’t go into every detail of the template file, but here is the head of it:

Here are the “Components”, for example:

OMS: Oracle Label Security
JSERVER: Java Virtual Machine (JVM)
SPATIAL: Spatial
IMEDIA: Oracle Multimedia (it is deprecated now, don’t install it)
ORACLE_TEXT: Oracle Text search
CWMLITE: OLAP
APEX: APEX, but don’t install it this way; it has a separate installation.
DV: Database Vault
NET_EXTENSIONS: .NET extensions (Windows only)

Some components have dependencies; for example, if you want to install Database Vault, you must install Oracle Label Security too. You can check these dependencies on Oracle Support for your specific version. Be careful: in pre-18c versions Spatial depends on Multimedia (IMEDIA), but as of 18c Multimedia is deprecated and Spatial no longer depends on it. So the version is important.

If you go down in the template file, you will see database parameters. You can set your defaults in that section. For example, I use unified auditing, so I disable traditional auditing by setting audit_trail to NONE.

You can create a template via DBCA (on the first screen choose “Manage Templates”). You can also create a template from an existing database via the DBCA command line, as in the sketch below:
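Something along these lines should work; this is only a sketch (the database name MYDB, the template file name and the password are placeholders), so check dbca -help for the exact parameters of your version:

dbca -silent -createTemplateFromDB -sourceDB MYDB -templateName my_custom_template.dbc -sysDBAUserName sys -sysDBAPassword mypassword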

Use your own templates and remove unnecessary components from your databases. It will be a relief for you and your databases 🙂

Wish you all healthy happy days.


lsnrctl or New DB Connection Is Too Slow

Hello everyone,

I know this is not new, but I hadn’t seen this problem for a long time and it took me a while to figure it out.

A team leader (who is like an elder brother to me) from the company where I started my first job called me and asked about a problem at one of their customers. The customer complained that database connections were too slow, but there was no problem after the connection was established. First things first, I wanted to check database availability, but because of some PATH problems (this was a Windows server) it took some time. After fixing the environment variables, I finally reached the database via a local SYSDBA connection: the database was up and running, and that connection was fast. By the way, the database was version 11.2, which is quite old.

So, I started to focus on the LISTENER. I first checked the status, but the “lsnrctl status” command was quite slow. Then I stopped and started it again, but these two operations were also slow. I checked the log file under $ORACLE_BASE/diag/tnslsnr/<SERVER_NAME>/<LISTENER_NAME>/trace, but everything seemed OK! When I tried to log on through the listener on the database server (using the net service name), it took around 70 seconds.

I started a listener trace (adding tracing parameters to the listener configuration and restarting the listener) and tried to make a connection again. As soon as I issued my connection command (sqlplus username/pass@db), the trace file was created. This means the network connection was fast, but while reading the trace file I saw that, after receiving the connection information, the listener waited for around 60-70 seconds.

I understood that the network had no issue: the connection arrived immediately, yet database connections and lsnrctl commands were slow. As I said, I had already checked the listener trace directory, because I know that if the listener.log file under the trace directory grows very large it can slow down listener operations, but it was only a few megabytes. I checked a few more things and finally realized that I had never checked the “alert” directory under diag ($ORACLE_BASE/diag/tnslsnr/<SERVER_NAME>/<LISTENER_NAME>/alert). When I opened it, there were over 1900 log.xml files, which are the XML version of the listener.log file. I deleted them all and everything worked like a charm. The listener was checking all of those XML files to decide which one it should write its log data into. That was the cause, and we were able to solve the problem.
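By the way, if you prefer not to delete those XML files by hand, the ADRCI tool can purge them. A minimal sketch (the home path is the listener’s ADR home mentioned above; the age is in minutes, so 10080 keeps only the last 7 days):

adrci> set homepath diag/tnslsnr/<SERVER_NAME>/<LISTENER_NAME>
adrci> purge -age 10080 -type ALERT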

Always check your listener log files and keep their size and number as small as possible.

Wish you all healthy days.

SESSION_CACHED_CURSORS Parameter and How It Can Affect the Database

Hello,

Let’s talk about the SESSION_CACHED_CURSORS parameter. It might have a significant effect on your system, so understanding how it works and what it causes is important.

This parameter limits the number of cached cursors in a session. As you know, there are two kinds of parse: SOFT and HARD. A hard parse creates the cursor from scratch: checking grants, syntax and semantics, creating execution plans, etc. A soft parse reuses a previously created cursor. So, if Oracle has already calculated all the information for a query (SQL statement, cursor), it stores it in the shared pool and can use it over and over again. Not calculating everything from scratch is a big performance gain.

Even in a soft parse, Oracle still needs to search through the library cache to find the cursor first. SESSION_CACHED_CURSORS allows you to cache a cursor in your session’s PGA (it still lives in the SGA as well). This way, if you run a query over and over, it will be found and executed even faster. Basically, SESSION_CACHED_CURSORS enables a kind of “softer” parse. Starting with 11g its default value is 50, and in 21c it is still the same.

You can see cached cursors via the V$OPEN_CURSOR view. Let’s open a new SQL*Plus session and check its cached information. To check the cache information I will use a separate monitoring session, because I don’t want to affect the main session. Here is the main session:
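Finding the SID of the main session is enough for the demo; a small sketch (the value 1470 used in the rest of the post is simply the SID returned here):

select sys_context('userenv', 'sid') as sid from dual;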

So my main session’s SID is 1470. From the monitoring session, I will check the cached objects of that session:
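A query along these lines from the monitoring session shows what is cached for that SID (a sketch; 1470 is the SID from above):

select cursor_type, sql_id, substr(sql_text, 1, 60) as sql_text
from   v$open_cursor
where  sid = 1470
order  by cursor_type;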

I won’t paste the whole output because it will get much bigger in a minute. A simple explanation of the cursor types:

OPEN is the currently open (last run) SQL statement.
OPEN-PL/SQL is the currently open cursors in PL/SQL. There can be more than one; remember, you can open many cursors.
OPEN-RECURSIVE is the recursive SQL statements behind your actual statement (for example, Oracle needs to check whether you have privileges on the objects, so it runs a query to find that out).
SESSION CURSOR CACHED is half of what we are looking for here: SQL statements run directly in this session and cached in the session PGA.
PL/SQL CURSOR CACHED is the other half of what we are looking for here: SQL/PL-SQL statements run inside a PL/SQL object called in this session.

There are a few more types, but I won’t go into all of them right now. If the session runs a SQL statement directly (select, insert, update, delete, merge, etc.), it will be cached as SESSION CURSOR CACHED. If the session calls a PL/SQL object, all SQL statements in it will be cached as PL/SQL CURSOR CACHED. Remember, to cache a SQL statement it must be run at least 3 times; only then is it cached. By “run” I mean the cursor must be opened and closed. If you just run a query 3 times, you will not see it in the cache, because the last execution is still open; run something else and then it will appear in the cache.

You can see session cached cursor counts via session statistics:
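For example, from the monitoring session (a sketch):

select sn.name, st.value
from   v$sesstat st
       join v$statname sn on sn.statistic# = st.statistic#
where  st.sid = 1470
and    sn.name in ('session cursor cache hits', 'session cursor cache count');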

“session cursor cache hits” is how many times a statement has been found in the session cursor cache. If you run a query 100 times repeatedly, then after the first 3 executions (after 3 runs it will be cached) the next 97 will be read from the cache, and session cursor cache hits will increase by 97. “session cursor cache count” is the current number of cached objects (all SQL, PL/SQL and dictionary object queries).

As I said, by default SESSION_CACHED_CURSORS is 50, so my session can cache up to 50 different statements. What if there are more? Oracle uses a Least Recently Used (LRU) algorithm to keep the most used (and most recent) queries; older statements are removed from the session cache (not from the library cache).

Let’s start some tests and run 3000 different queries 3 times each (so they can be cached). Of course, instead of writing 3000 different queries by hand, I will use a simple PL/SQL block; those queries will be cached as PL/SQL CURSOR CACHED. Also, I will create a tmp table as a copy of dba_objects for my tests:
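A sketch of the setup and the driver block (the table name and loop bounds are illustrative; the dynamic SQL produces a different statement text for every object_id, and each one is executed 3 times):

create table t as select * from dba_objects;

declare
  l_cnt number;
begin
  for r in 1 .. 3 loop                  -- run every statement 3 times so it can be cached
    for i in 1 .. 3000 loop             -- 3000 different statements
      execute immediate 'select count(*) from t where object_id = ' || i into l_cnt;
    end loop;
  end loop;
end;
/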

Before continuing, you might wonder why I used dynamic SQL. If I had used static SQL like “select count(*) from t where object_id = i”, then i would be a bind variable and every query I run would be the same statement, so we couldn’t see the difference.

From the monitoring session, let’s check V$OPEN_CURSOR for the main session (a somewhat long output):

As you see, 50 different select statements have been cached, with IDs from 2951 to 3000: our last 50 queries. Also, the OPEN cursor is our initial PL/SQL block.

So what would happen if we increase the cache? Is it a good thing or a bad thing? Well, it depends. It is a good thing because you will be doing a “softer” parse: your session will access repeated queries faster. But since this information is stored in the PGA (session-specific memory area), it will increase PGA usage. If you don’t have enough PGA, it could be a pain for your database. The PGA is automatically managed by Oracle; it grows or shrinks based on session needs. So you might end up with too many PGA resize operations and might not even be able to cache that many queries at all.

Let’s check the current PGA size of our main session now, from the monitoring session:
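A sketch (V$SESSION is joined to V$PROCESS to read the PGA figures of SID 1470):

select round(p.pga_used_mem  / 1024 / 1024) as used_mb,
       round(p.pga_alloc_mem / 1024 / 1024) as alloc_mb
from   v$session s
       join v$process p on p.addr = s.paddr
where  s.sid = 1470;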

So our main (testing) session is using about 3 MB of PGA. Now I will increase session_cached_cursors to 1000 and run the previous 3000 queries again.
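In the main session (the parameter is session-modifiable), something like:

alter session set session_cached_cursors = 1000;
-- then run the same PL/SQL driver block as before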

Check the PGA size now:

Now it uses 8 MB of PGA. The numbers may look small, but think of it this way: we are using about 2.5 times more PGA than before. If you have 10,000 active sessions, this could be a problem; it is a high usage. If you have enough memory then there is nothing to be afraid of, or at least you should adjust your memory components accordingly. There is no universally correct size or ratio, because this depends heavily on your application code. You must test the values: if your application reuses the same queries during a business job, you can use a smaller session_cached_cursors, but if not, you must increase it based on your available memory.

How about the good side? Let’s measure that. Assume we have a business process that runs 400 different queries on every run, and we run this process 1000 times a day. How would increasing or decreasing the parameter affect performance? I will start with 100.
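A sketch of such a test (400 distinct statements per pass, 1000 passes, timed with DBMS_UTILITY.GET_TIME; all numbers are illustrative):

alter session set session_cached_cursors = 100;

declare
  l_cnt   number;
  l_start pls_integer := dbms_utility.get_time;
begin
  for r in 1 .. 1000 loop
    for i in 1 .. 400 loop
      execute immediate 'select count(*) from t where object_id = ' || i into l_cnt;
    end loop;
  end loop;
  dbms_output.put_line((dbms_utility.get_time - l_start) / 100 || ' seconds');
end;
/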

It took 29.38 seconds to run all the queries. Now let’s increase the parameter value to 400 (which is almost sufficient to store all the queries I run):

24.63 seconds. That’s about 5 seconds shorter, which means around 17 percent faster; not bad at all. Imagine your system running 17 percent faster without doing any complex SQL tuning or changing the hardware. If you set the session_cached_cursors parameter somewhere between 100 and 400, the elapsed time would fall between 24 and 29 seconds accordingly, and so would the PGA usage.

Finally, you can also query how many times your session used the session cursor cache (the same V$SESSTAT query shown earlier):

As you see, currently 399 different queries (SQL statements) have been cached in the session, and those queries have been accessed 2,011,646 times.

This is quite a nice trick in my opinion, but it must be used cautiously because of the increase in PGA usage.

I hope this helps you increase your database’s efficiency.

Wish you all healthy days.

Hierarchical Blocking Session List

Hi,

Today I needed to list all blocking sessions hierarchically. As you know, all information about sessions (and blocking sessions) is in the gv$session view (or just v$session if you are using a single instance). I wanted to see who is blocking whom, with the blocker at the top. You can write a simple CONNECT BY (or WITH clause) query, but it will list blockers at the bottom, so I developed it a little further. If I remember correctly there was a small script under rdbms/admin, but it was creating some tables, etc. I just wanted to get the list using a single simple SQL statement.

Here is the SQL:
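A minimal sketch along these lines (it relies on the BLOCKING_INSTANCE/BLOCKING_SESSION columns of GV$SESSION and pulls the first 40 characters of the current statement from GV$SQLAREA; adapt it to your needs):

with blockers as (
  -- step 1: every session that is currently blocking someone
  select distinct blocking_instance as inst_id, blocking_session as sid
  from   gv$session
  where  blocking_session is not null
), blocker_hierarchy as (
  -- step 2: build the tree starting from every blocker
  select s.inst_id, s.sid, s.serial#, s.username, s.sql_id,
         connect_by_root s.blocking_session as root_blocking_session,
         lpad(' ', (level - 1) * 4) || s.sid as blocking_tree,
         rownum as rn                          -- keep the hierarchical order
  from   gv$session s
  start with (s.inst_id, s.sid) in (select inst_id, sid from blockers)
  connect by s.blocking_instance = prior s.inst_id
         and s.blocking_session  = prior s.sid
)
-- step 3: drop trees whose root blocker is itself blocked (it is already listed under its own blocker)
select h.blocking_tree, h.inst_id, h.sid, h.serial#, h.username,
       (select substr(q.sql_text, 1, 40)
        from   gv$sqlarea q
        where  q.inst_id = h.inst_id
        and    q.sql_id  = h.sql_id
        and    rownum = 1) as current_sql
from   blocker_hierarchy h
where  h.root_blocking_session is null
order  by h.rn;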

First, I get all the blockers’ SIDs (the “blockers” data source in the query); then I use them to generate the hierarchy (the “blocker_hierarchy” data source); then I remove some blockers from the hierarchy, because they are also blocked by other sessions and will already be listed under them.

For example, let’s say SID 10 is blocking two different sessions (SIDs 20 and 30), and SID 30 is in turn blocking another session (SID 40). The query will produce a result like this:
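The blocking tree column would look something like this (indentation shows who blocks whom):

10
    20
    30
        40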

You can see the hierarchy in a visual way, along with the first 40 characters of each session’s currently running SQL statement.

Hope this will be useful for you.

Wish you healthy days.


Subquery Caching

Hi,

I want to talk about a very sweet performance “trick”. I love it when I do something small and gain a lot 🙂 Today I will demonstrate subquery caching in Oracle and try to explain how it works and its limitations.

You are probably using some PL/SQL functions in your select (or DML) statements. There is always a package that contains functions which calculate stuff or return constant strings like a “base URL” or “environment name”, etc. Some of those functions have parameters too, and we call them from SQL. Calling a user-defined PL/SQL function from a SQL statement is a performance problem because it causes a lot of context switches. Let’s say you have a function that returns a constant string. To make the performance difference visible, I will add a sleep call to the function:
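A sketch of such a function (the returned URL is just a made-up constant; DBMS_LOCK.SLEEP pauses one second to simulate expensive work):

create or replace function sf_base_url return varchar2 is
begin
  dbms_lock.sleep(1);              -- simulate an expensive calculation
  return 'https://my.base.url';    -- made-up constant value
end;
/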

This is our function. You need a grant on the DBMS_LOCK package to create it. Now we will call this function from a SQL statement. Let’s create a table and then call the function:
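Something along these lines (a sketch; the table t with four one-letter rows is an assumption made just for this demo):

create table t as
select chr(64 + rownum) as letter       -- A, B, C, D
from   dual
connect by rownum <= 4;

set timing on
select letter, sf_base_url as base_url from t where rownum <= 2;   -- ~2 seconds
select letter, sf_base_url as base_url from t where rownum <= 4;   -- ~4 seconds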

When I call the sf_base_url function in the first select statement (2 rows) it takes 2 seconds, and in the second select statement (4 rows) it takes 4 seconds. This proves that our function is called as many times as the number of rows returned by the query. Did you see the problem here? This function always returns the same result, but it is called more than once; calling it just once would be enough. Think about it on your databases: you have a business function that takes 10 seconds to complete, and it is called in a SQL statement which returns 1000 rows. You will lose 9990 seconds (we must call it at least once, so 10 seconds must be spent) calling your function unnecessarily.

This is where subquery caching comes in. We can tell Oracle to cache this function result and use that value from then on. How we do it is a little bit interesting to me 🙂 We just use a subquery to call the function:
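The cached version, as a sketch:

select letter,
       (select sf_base_url from dual) as base_url
from   t
where  rownum <= 4;                     -- still 4 rows, but only ~1 second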

As you see, I changed the sf_base_url call to a subquery and selected the function value from dual. The main query returns 4 rows (rownum <= 4) but it takes only 1 second! Our function is called just once and its result is reused for the subsequent rows. That is a tiny change with a very nice performance gain.

So far so good. How about a function with parameters? Let’s change the function a little and give it one parameter, appending the parameter to the end of the base URL:
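A sketch of the parameterized version:

create or replace function sf_base_url(p_suffix in varchar2) return varchar2 is
begin
  dbms_lock.sleep(1);                            -- still one second per call
  return 'https://my.base.url/' || p_suffix;     -- made-up base URL plus the parameter
end;
/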

Let’s call this from SQL, passing a column value as the parameter:
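For example:

select letter, sf_base_url(letter) as my_string from t;            -- 4 rows, ~4 seconds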

It takes 4 seconds, as expected. Now, subquery caching:
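The cached form, as a sketch:

select letter,
       (select sf_base_url(letter) from dual) as my_string
from   t;                                                           -- still ~4 seconds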

It also takes 4 seconds. Didn’t subquery caching work? Of course it did, but this time our function is cached together with its parameter values. Since every parameter is different, the function is called once for each distinct parameter, and every row has a different my_string value. We would see the performance gain if we had repeating values in our table. Let’s do that:
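A sketch that adds two more copies of each letter, so the table ends up with three of each:

insert into t
select chr(64 + mod(rownum - 1, 4) + 1)
from   dual
connect by rownum <= 8;                 -- adds A, B, C, D twice more
commit;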

Now our table has three copies of each letter. Let’s run the subquery-cached query again.

Now it returns 12 rows (3 of each letter), but it still takes 4 seconds, because our function is called only once per distinct parameter and repeated parameters are read from the cache. Works like a charm.

Is there a limit to this? How many different parameter values can be cached, and how does it work? Let’s test.
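A sketch of the setup for this test (256 distinct values, then every row inserted again, then the cached query):

drop table t purge;
create table t as
select rownum as id
from   dual
connect by rownum <= 256;

insert into t select * from t;          -- double every row: 512 rows in total
commit;

select id, (select sf_base_url(id) from dual) as my_string from t;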

Now our table has 256 distinct values, and I have doubled it by inserting all rows again, so we have 512 rows in total. Run our subquery-cached select statement again.

An interesting result. This table has 256 distinct values and 512 rows in total. Every value is doubled, but the subquery-cached and normal calls take almost the same time. Didn’t subquery caching work? Actually it worked, but it can only cache a limited number of entries. If there are more distinct values than can be cached, new parameter values overwrite old cached values. Our table rows arrive like this: 1,2,3,4…256,1,2,3,4…256. It cached some values, but newly arriving rows eventually overwrote the cache, and we couldn’t take advantage of subquery caching. Let’s order those rows:
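A sketch of the ordered copy and the same query on it:

create table t2 as select * from t order by id;

select id, (select sf_base_url(id) from dual) as my_string from t2;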

t2 has the same data as table t, but the rows are stored in order. When we run the subquery caching example on t2 it takes 270 seconds, almost half of the previous run on table t. Now it was able to use the cached values before they were overwritten.

Edit:

I originally wrote that my tests indicated only 8 different values were cached, not 255, but my further tests show that this is not true. I assume Oracle uses a hash function to store the cached entries, and while hashing our function and parameter values some collisions occur; that is why some of the function results are overwritten frequently. This is not something we can change. Of course, consecutive identical calls are cached and do take advantage of subquery caching.

Thank you very much for reading. Any comments are welcome.

Wish you all healthy days.

Direct Path Insert &/vs Conventional Insert

Hi there,

I couldn’t write anything for a while; those were busy days. Today I would like to talk about direct path insert versus conventional insert. Many people think they are using “direct path insert” when they use the sqlldr utility, but they are not. It is easy to confuse batch insert with direct path insert. Of course this is a big subject, but I want to cover the basics and use sqlldr to demonstrate it.

First of all, let’s discuss the basics.

  • High Water Mark: This is a pointer to the end of the table, the boundary of used space. When you create a table and start to insert into it, Oracle allocates new blocks and puts your data into them. Oracle already knows the starting block, because it is fixed when the segment is created, but the end of the table (the last block) is not fixed; it changes as new data arrives. So Oracle uses a pointer to the end of the table called the High Water Mark (HWM). As a simple example, a “full table scan” means searching all blocks from the starting block to the HWM. If you truncate a table, that means moving the HWM back to the starting block; that is why truncate is really fast. It doesn’t delete rows, it moves the HWM back to the starting point, and since that marks the end of the table you remove all rows automatically. Of course the rows are physically still there, but they no longer belong to this table; the data is obsolete and can be overwritten.
  • Insert & Delete operations: When you insert a new row using an “insert into … values (…)” command, Oracle searches for “empty space” in the table blocks. If it is found, the row is inserted into that block. Oracle always tries to use space efficiently; the motto is: if there is empty space, use it. This is valid for the “insert into … values (…)” command; sqlldr, for example, can act differently, and I will discuss that. A delete command marks rows as “deleted”. That means the space is now considered “empty” again, and if a new row arrives (insert into values) it can be written into that “empty” space. If you delete all rows from a billion-row table and then select from it, your select will still take a huge amount of time. Why? Deleting does not remove blocks from the table, it just marks rows as deleted; if your table has 1 million blocks, after deleting all rows you still have 1 million blocks. If you run “select * from table” at that point, as explained above, a full table scan is done and all blocks from the starting block to the HWM are searched. As a result, your select scans 1 million blocks but finds no non-deleted rows; that is why it takes so long. If you truncate the table instead, since truncate moves the HWM back to the starting point, it completes really fast, and a subsequent “select * from table” completes in milliseconds, because there are no longer a million blocks to scan. In conclusion, delete leaves empty space behind, and when an insert statement is issued Oracle wants to reuse that empty space to avoid unnecessary growth.

At this point you might say that insert statements could take more time because Oracle searches for “empty space”. You are right! It takes some time, though not much, of course; free space is tracked in a bitmap, so it is easy to find, but there is still a search.

What I described here is a “conventional insert”: search for empty space and insert into it. “Direct path insert”, as the name suggests, does not check for empty space; it just adds the new data to the end of the table. “Add data to the end of the table” means the new data won’t touch the existing table blocks directly: Oracle uses new blocks for it. Those new blocks are free blocks, not yet related to any segment (table, index, cluster, etc.). The data is written directly into those free blocks, so Oracle doesn’t look for empty space inside the table, and the new blocks are appended to the “end of the table”. This also means the HWM is moved to the end of those newly added blocks. This operation is of course much faster, especially if you are loading a huge amount of data. You can “tell” Oracle to do this with hints like APPEND: insert /*+ append */ into table select * from ….; this data will be inserted using direct path insert and added directly to the end of the table (it won’t use any empty space inside the table). Even if the whole table is empty (all rows deleted with a delete command), Oracle will add new blocks and the table size will increase.

I wanted to show you how the sqlldr tool does this and the performance difference. Here is my setup:
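A sketch of the setup (the row counts will differ on your system; on mine the table ended up with 582,616 rows):

create table tmp as select * from dba_objects;

insert into tmp select * from tmp;      -- repeat a few times until the table is big enough
commit;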

This added 582,616 rows to the tmp table on my local DB. After inserting those rows, I exported the data as CSV (using SQL Developer) and saved it as the object.csv file.

As you see, the table has 12,288 blocks containing 582,616 rows. Now I am deleting all rows:
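A sketch of the delete and the size check (block counts come from USER_SEGMENTS):

delete from tmp;
commit;

select blocks, round(bytes / 1024 / 1024) as mb
from   user_segments
where  segment_name = 'TMP';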

Of course the table still has 12,288 blocks, because we just marked the rows as deleted!

Let’s load the previously exported object.csv file into this table using sqlldr. Here is the sqlldr.ctl file:
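A trimmed sketch of such a control file; the real file has to list every column of the table in the same order as the CSV, and date columns need the format mask your export used, so only a few columns are shown here as an illustration:

LOAD DATA
INFILE 'object.csv'
APPEND
INTO TABLE TMP
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
( OWNER,
  OBJECT_NAME,
  OBJECT_ID,
  OBJECT_TYPE,
  CREATED DATE "YYYY-MM-DD HH24:MI:SS"
)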

My sqlldr command is like this (I am using Windows 10 for this demonstration):
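Something along these lines (the connect string is a placeholder):

sqlldr userid=myuser/mypassword@mydb control=sqlldr.ctl log=conventional.log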

All rows are inserted. First, let’s check the table size:

There is no change in the table size, because this is a “conventional” load. Please check the first lines of the sqlldr output just before the “commit point reached” lines; it says: “Path used: Conventional”. This means: look for empty space and insert the new data into it, as we discussed earlier. Since I deleted all rows before the load, the whole table was empty and had enough space to hold the same rows again.

When I check my log file (conventional.log), sqlldr took 11.20 seconds to insert all rows.

Let’s do it again, this time with a direct path load. Delete the rows and check the size:

All rows are deleted and the table still has the same size. Here is the direct path load:
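The same command with the direct=true parameter (again, placeholder credentials):

sqlldr userid=myuser/mypassword@mydb control=sqlldr.ctl log=direct.log direct=true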

As you see, now “Path used” says Direct. I did this by using the “direct=true” parameter of the sqlldr tool, and the same number of rows was inserted. Let’s check the table size now:

It almost doubled in size! Why? Because the new data was added to the end of the table; the empty space was not checked! We still have the same number of rows in the table, but we also have a huge amount of empty space in it. So what is the benefit here? Let’s check the log file (direct.log) and see how long it took to complete:

Elapsed time was: 00:00:02.83

2.83 seconds; it is almost 4 times faster than the conventional load! Here is the benefit.

sqlldr is a great tool if you use it correctly. Of course it has many more options, like parallelism, array size, etc. (I hope I will write about them too), but if you have a huge amount of data to load into a table and empty space is not an issue, you can use a direct path load. The only caveat is the “logging” option. If the object is in NOLOGGING mode, the load will run even faster, but no redo will be generated; this means you cannot recover this data from your backups! If you have a failure after a NOLOGGING direct path insert, you cannot use your backups for that data; you must take a backup immediately. Of course, your database administrator might have put the database into FORCE LOGGING mode; then it doesn’t matter whether your object is in LOGGING or NOLOGGING mode, the database will log it anyway and your data will be safe in backups.

So use it carefully. I hope this helps.

Wish you healthy, corona-free days.


Oracle Table Compression Part 1

Hi,

I have been dealing with table compression a lot during my development work, and I still do. I wanted to share some information about table compression, but I believe this will be a long explanation, so I will write it in parts; this is part 1.

Table compression is a very useful feature of Enterprise Edition (unfortunately it cannot be used in Standard Edition). Table compression compresses your data and reduces the space requirement of the table, not just on disk but also in memory. When you compress a table, Oracle always stores it compressed, so it consumes less memory as well, and I think this is very important for large databases with relatively small amounts of memory.

Of course, compression has a disadvantage, and that’s why it is not the default option for a table. As you can guess, it has a “CPU overhead”, which means your CPUs will be doing extra work compressing and decompressing. Often this overhead is acceptable because you gain disk and memory space. And don’t think of this as just a space gain: while reading data from disk (and loading it into memory) you will read much less data, which can significantly improve query performance thanks to less disk I/O.

Basically, there are 3 compression methods for tables:

  1. Basic
  2. Advanced (if I am not wrong, it was called “COMPRESS FOR OLTP” in earlier releases)
  3. Hybrid Columnar (or Column Store)

There are important things to know about those compression methods. The most important one is that ADVANCED compression requires the “Advanced Compression” license, so it will cost you a lot! You can find the details of the Advanced Compression license at this link: https://www.oracle.com/technetwork/database/options/compression/advanced-compression-datasheet-134474.pdf

The other two, Basic and Hybrid Columnar compression, do not require a license, but Hybrid Columnar Compression (also called HCC or Column Store compression) requires special storage systems; your storage must support HCC.

Basic Compression

You can compress a table with an ALTER TABLE command or create the table with a compression option. There are a few important points. I’ve seen some clients “mark” a table as compressed without knowing that the existing rows are not compressed! Here is an example:
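A sketch of that situation (TMP is a dummy copy of dba_objects; DBMS_COMPRESSION.GET_COMPRESSION_TYPE returns 1, which is COMP_NOCOMPRESS, for an uncompressed row):

create table tmp as select * from dba_objects;

select round(bytes / 1024 / 1024) as mb
from   user_segments
where  segment_name = 'TMP';

alter table tmp compress;               -- only "marks" the table as compressed

select round(bytes / 1024 / 1024) as mb
from   user_segments
where  segment_name = 'TMP';            -- size did not change

select dbms_compression.get_compression_type(user, 'TMP', rowid) as comp_type
from   tmp
where  rownum = 1;                      -- returns 1 = no compression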

I created a dummy table and “marked” it as compressed, but the size didn’t change and the rows are not compressed! I know that because DBMS_COMPRESSION.GET_COMPRESSION_TYPE returned 1 for a random row, and 1 means no compression.

Marking a table as “compress” means subsequent rows will be compressed, but not the existing ones. I used the ROW STORE COMPRESS BASIC method above (or just “compress”), and only some specific operations produce compressed data with it. This is the most basic compression, but it has a HIGH compression ratio!
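A bulk (direct path) insert is one of the operations that does get compressed; a sketch:

insert /*+ append */ into tmp select * from tmp;    -- bulk insert, doubles the row count
commit;

select round(bytes / 1024 / 1024) as mb
from   user_segments
where  segment_name = 'TMP';                        -- grew far less than the uncompressed size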

I just bulk inserted the same rows (basically doubling the number of rows), but the table size increased by only 4 MB, not 12 MB.

So, if you want to compress the existing rows, you must MOVE the table:
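A sketch:

alter table tmp move;                   -- rewrites the segment and compresses the existing rows

select round(bytes / 1024 / 1024) as mb
from   user_segments
where  segment_name = 'TMP';

select dbms_compression.get_compression_type(user, 'TMP', rowid) as comp_type
from   tmp
where  rownum = 1;                      -- no longer returns 1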

Here it is: the existing rows are also compressed. You can combine these operations with ALTER TABLE TMP MOVE COMPRESS; it will compress the currently existing rows. Of course, these operations will invalidate indexes! You can use the “update indexes” clause, but it will increase the compression time. I will talk about this in later posts.

You can use different syntax variations:

Alter table tmp compress;
Alter table tmp row store compress;
Alter table tmp row store compress basic;

All the commands above are the same: compress the table with the BASIC method. BASIC compression will compress new rows only if the data comes in as a bulk (direct path) insert or an array insert; row-by-row inserts (insert into … values (…)), for example, won’t be compressed. I mostly use this compression for previous months’ data, which is not frequently accessed and, when accessed, is only queried, with no DML, because updated rows also won’t stay compressed.

Advanced Compression

Don’t let the naming mislead you: it is not more “advanced” than BASIC compression. It actually has almost the same compression level (HIGH) as basic compression, but Advanced table compression has some advantages. The most important one is that it can be used for OLTP systems, meaning it can be used for active data, not just last month’s data. The performance overhead for ongoing DML operations is lower than with BASIC compression, and it can even compress row-by-row inserts.

Compression is not done right after a row is processed (inserted or updated); when a block reaches a threshold, Oracle compresses that block in the background. So your operations run at full speed, and even newly inserted/updated rows end up compressed. BUT it requires the Advanced Compression license, as I said before.
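The syntax, for completeness (remember the license requirement):

alter table tmp row store compress advanced;        -- "compress for oltp" in older releases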

Hybrid Columnar Compression

This is the king of compression. It will compress data much, much more than the first two, and as you can guess, it has a disadvantage: HCC is a much more CPU-intensive compression. It requires far more CPU than basic and advanced compression, and it is mostly used for rarely accessed rows. It has 2 methods, each with 2 sub-methods:

  1. Query Low (Lowest Compression Ratio, Lowest Cpu Usage)
  2. Query High
  3. Archive Low
  4. Archive High (Highest Compression Ratio, Highest Cpu Usage)

From 1 to 4, the compression ratio increases, but so does the CPU usage.

Basically, it uses a different compression logic. The first two methods compress rows within the same block: they write repeating data once and use references to the actual places. Column store compression compresses columns, not rows, which makes sense because repetitive data mostly lives in the same column, for example gender, customer id or product code.
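The corresponding syntax, as a sketch (these statements fail unless your storage supports HCC, for example Exadata):

alter table tmp move column store compress for query low;
alter table tmp move column store compress for query high;
alter table tmp move column store compress for archive low;
alter table tmp move column store compress for archive high;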

In this post I also wanted to share a real-life example with compression ratios and timings. Of course I cannot provide the actual data here, but I used a table with real data, copied it, and then compressed the same table with the different methods. Here are the results:
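A sketch of the comparison script (it moves the TMP table with each method, times it with DBMS_UTILITY.GET_TIME and reports the segment size; enable server output, and note that the HCC lines need HCC-capable storage):

declare
  type t_methods is table of varchar2(100);
  l_methods t_methods := t_methods(
      'NOCOMPRESS',
      'ROW STORE COMPRESS BASIC',
      'ROW STORE COMPRESS ADVANCED',
      'COLUMN STORE COMPRESS FOR QUERY LOW',
      'COLUMN STORE COMPRESS FOR QUERY HIGH',
      'COLUMN STORE COMPRESS FOR ARCHIVE LOW',
      'COLUMN STORE COMPRESS FOR ARCHIVE HIGH');
  l_start pls_integer;
  l_mb    number;
begin
  for i in 1 .. l_methods.count loop
    l_start := dbms_utility.get_time;
    execute immediate 'alter table tmp move ' || l_methods(i);
    select round(bytes / 1024 / 1024) into l_mb
    from   user_segments
    where  segment_name = 'TMP';
    dbms_output.put_line(rpad(l_methods(i), 42) || lpad(l_mb, 6) || ' MB  ' ||
                         (dbms_utility.get_time - l_start) / 100 || ' sec');
  end loop;
end;
/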

This small piece of code compresses the TMP table with the different methods and reports the timing and size. The first one is NOCOMPRESS, so we see the size of the table without compression, and then the results of the other methods:

Without compression this table’s size is 1408 MB. As you see, BASIC and ADVANCED compression reduce the size to 280 MB, but they take 37 and 31 seconds respectively to compress.

Interestingly, hybrid (column store) compression with “query low” takes less time than the others, just 20 seconds, and the size drops to 176 MB, almost half of the first two. You can also see that the size keeps decreasing from “Query Low” down to “Archive High”. The king is Archive High: the table size is 72 MB. Roughly 20x compression is very impressive, but as you can see, it takes 2 minutes to compress. Also, querying data compressed with those methods will take significantly more time.

Compression is very useful if you use it appropriately. This was just a demonstration of compression levels and the basics of compression. I will also write about the query performance of compressed tables, CLOB compression, CLOB vs HCC compression, etc. in later posts.

That’s all for now. Wish you all a healthy day.

Redo Log Switch Order

Hi,

I always had the impression that redo log switches occur in group number order, but a small test showed me I was wrong.

As you know, the redo log is one of the most important parts of an Oracle database. It protects almost everything: your data in memory, data files, your transactions, etc. Whenever you make a change in the database, a redo entry (maybe two) is written to the redo log files. So the redo log files are among the slowest components of an Oracle database, because they always require disk I/O. As a result, we mostly place redo log files on the fastest disks so that our TPS (transactions per second) can be at its maximum.

Lately I was tuning a client database and realized their redo log files and data files were on the same disk! This is already an availability issue: if something happens to that disk, you lose your data files and your redo log files. One thing is guaranteed in Oracle: if you lose an active redo log file, you will lose some data. We always have more than one redo log member (file) in a redo log group and put those files on separate disks; that way, if something happens to one of the disks, we can still use the redo log file on the other disk. Of course you can use an ASM disk group and reduce the risk, but if you don’t have ASM then you must put your redo log members on different disks. Anyway, I asked the hosting company to add 4 new disks. I had 4 redo log groups with 2 members each. I put the first and third redo log groups’ members on disk1 and disk2, and the second and fourth groups’ members on disk3 and disk4.

While Oracle is writing to redo log group 1, disk1 and disk2 are being written, and when it switches to redo log group 2, disk3 and disk4 are written. The reason I did this is that these disk drives have limits: while you are writing data they run at full speed, but when you start reading another file at the same time, the write speed drops. As you can guess, since our database is in archivelog mode, when a redo log switch occurs Oracle starts archiving the previous redo log file, and that means reading it! So I don’t want Oracle to write and read the same disk at the same time. When redo log group 1 is full and a switch occurs, disk1 will be read (to archive the redo log file) while redo log group 2, which resides on disk3 and disk4, starts being written. This way none of the disks is read and written at the same time, and we get maximum performance. At least that’s what I thought!

While testing that structure, I realized that performance decreased dramatically after a switch occurred. When I checked, I saw that after redo log group 1 was full, it did not switch to redo log group 2! It switched to redo log group 3, which is on the same disks (disk1 and disk2).

As I said at the beginning, I always had the impression that redo log switches occur in group number order, but they do not. While moving the redo log files to those disks, I did not shut down the database and move the files; I added some new redo log groups and dropped the old ones, so at some point some of the redo log groups were in the UNUSED state (never used), and when a switch was required, Oracle switched to the lowest-numbered UNUSED redo log group.
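You can watch the switch order with something like this (a small sketch; run it a few times while forcing switches):

select group#, thread#, sequence#, status
from   v$log
order  by group#;

alter system switch logfile;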

As you can see, the switch goes from group 1 to group 3, and Oracle remembers that order and keeps switching in it. So if you clear a logfile or add a new log group, keep in mind that the next switch may go to the newly added or cleared log group.

PS: by changing the disk structure like this, TPS (transactions per second) increased from 15,000 to 45,000! Small changes can produce big results.



Timestamp Arithmetic and Well Known Mistakes

Hi,

Lately I have started to see many developers using timestamp types (like TIMESTAMP or TIMESTAMP WITH TIME ZONE) incorrectly, especially in date/time arithmetic.

So, as a short reminder, adding numbers to or subtracting numbers from a DATE value adds or subtracts that number of days:
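For example (a sketch):

select sysdate          as now,
       sysdate + 1      as tomorrow,
       sysdate - 1      as yesterday,
       sysdate + 1 / 24 as one_hour_later
from   dual;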

Basically, that is all there is to adding and subtracting; of course we also have the “INTERVAL” data types, which I am about to explain.

This usage is not wrong if you work with DATE, but if you are using TIMESTAMP types there is a small side effect:
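For example (a sketch):

select systimestamp     as now,
       systimestamp + 1 as tomorrow,
       systimestamp - 1 as yesterday
from   dual;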

So adding and subtracting did their job, but if you look carefully there are no fractional seconds! That is because the results are not TIMESTAMPs anymore, they are DATEs: adding and subtracting numbers on timestamp types converts the result into a DATE. Let’s check whether they are really DATEs:
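Using the DUMP function (a sketch):

select dump(systimestamp + 1) as plus_one,
       dump(systimestamp - 1) as minus_one
from   dual;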

DUMP is a function that shows how Oracle stores the data internally. In this example you can see both Typ numbers are 13, and 13 is one of the date types Oracle uses; yes, there is more than one date type in Oracle, but they are not visible to us. For example, the return type of the SYSDATE function and a table DATE column have different type numbers:
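A sketch comparing the two (USER_OBJECTS.CREATED is just a convenient stored DATE column):

select dump(sysdate) as sysdate_dump,       -- Typ=13
       dump(created) as stored_date_dump    -- Typ=12
from   user_objects
where  rownum = 1;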

As you see, the table DATE column’s type number is 12, not 13. So they are different, but that is not relevant to us right now.

PS: just in case you wonder how to read this dump as a date:

Typ=13 Len=8: 228,7,7,28,14,27,25,0

(228 + (256 * 7)) = 2020 is the year, and the rest is just the month, day, hour, minute and second.

Returning to the main subject: adding a number to a timestamp type converts it to a DATE, and if you use it in your SQL like this:
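For example (a sketch; orders and created_timestamp are the table and column mentioned below):

select *
from   orders
where  created_timestamp >= systimestamp - 1;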

you will be comparing a TIMESTAMP (the created_timestamp column in the orders table) with a DATE (systimestamp - 1). Of course these are convertible types, but Oracle still needs to do a conversion. To avoid that, you can use the INTERVAL data types.

I find INTERVAL types clearer myself:
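For example (a sketch):

select systimestamp                         as now,
       systimestamp + interval '1' day      as tomorrow,
       systimestamp - interval '3' hour     as three_hours_ago,
       systimestamp + interval '30' minute  as half_an_hour_later
from   dual;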

As you can see, interval types are simple and clean. We just write what we want to add or subtract, and the result keeps the base type we passed in (TIMESTAMP WITH TIME ZONE in my example).

This kind of interval literal has one fault, in my opinion: you cannot pass the number of days/hours/… as a parameter. This means you cannot write “INTERVAL v_number_of_days DAY”, for example. But of course there is a solution: the interval functions NumToDSinterval and NumToYMinterval.

NumToDSinterval = Number To DaySecond Interval

NumToYMinterval = Number To Year Month Interval

So you can use:
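For example (a sketch; the numeric arguments could of course be PL/SQL variables or bind values, which is the whole point):

select systimestamp + numtodsinterval(5, 'DAY')   as five_days_later,
       systimestamp + numtodsinterval(3, 'HOUR')  as three_hours_later,
       sysdate      + numtoyminterval(2, 'MONTH') as two_months_later
from   dual;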

and the result will be the same as with INTERVAL ‘1’ DAY/HOUR…. Also, remember that you can use interval types with DATEs too.

Hope this helps you in your coding.

Wish you a great day.

Performance Gain of NOCOPY Parameters

Hi,

Almost every Oracle database uses PL/SQL code: packages, procedures, functions, etc. Complex systems with complex code pass many arguments (parameters) many times between PL/SQL objects. Passing parameters might be one of the most frequently executed operations in a database. So how well do you pass your parameters?

I assume you know about parameter modes like IN and OUT, so I won’t explain them all over again. Basically, OUT lets you modify the parameter inside a PL/SQL block and return that value back to the caller’s variable. You might think this is a kind of pointer operation, but it is not: when you modify a parameter inside the block, the new value does not affect the caller’s variable directly. It is stored in a temporary variable, and after the PL/SQL block completes, the modified value is assigned back to the caller’s variable.

Adding NOCOPY to your IN OUT parameter actually makes it a kind of pointer (a hint to pass by reference), and whenever you modify the parameter in the PL/SQL block it affects the caller’s variable, even if the block raises an exception! Without NOCOPY, since the data is copied to a new variable on every procedure call, memory consumption and elapsed time increase, especially if you are using big variables like CLOBs or PL/SQL collections (arrays).

So let’s test different variable types, pass them many times and see whether there is a difference when we use NOCOPY.

I didn’t put much thought into the sample code, so it is a little messy. Basically, I am just using overloaded procedures for the different variable types and passing parameters to them many times.

PS: I am setting plsql_optimize_level to 1 to prevent Oracle from optimizing my sloppy code away, since I don’t actually run any business logic and Oracle could remove much of it, like the assignment to a dummy variable that is never used.

This is my package code:
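A trimmed sketch of the idea (only the VARCHAR2 and CLOB overloads; the package name, procedure names and the dummy work inside them are made up for illustration):

alter session set plsql_optimize_level = 1;

create or replace package pkg_nocopy_test as
  procedure sp_main(p_value in out varchar2, p_mode in varchar2, p_call_count in pls_integer);
  procedure sp_main(p_value in out clob,     p_mode in varchar2, p_call_count in pls_integer);
end pkg_nocopy_test;
/

create or replace package body pkg_nocopy_test as

  -- callee procedures: a plain IN OUT (COPY) and a NOCOPY version per type
  procedure sp_copy  (p in out varchar2)        is v varchar2(100); begin v := substr(p, 1, 100); end;
  procedure sp_nocopy(p in out nocopy varchar2) is v varchar2(100); begin v := substr(p, 1, 100); end;
  procedure sp_copy  (p in out clob)            is v varchar2(100); begin v := substr(p, 1, 100); end;
  procedure sp_nocopy(p in out nocopy clob)     is v varchar2(100); begin v := substr(p, 1, 100); end;

  procedure sp_main(p_value in out varchar2, p_mode in varchar2, p_call_count in pls_integer) is
    l_start pls_integer := dbms_utility.get_time;
  begin
    for i in 1 .. p_call_count loop
      if p_mode = 'NOCOPY' then sp_nocopy(p_value); else sp_copy(p_value); end if;
    end loop;
    dbms_output.put_line('VARCHAR2 ' || p_mode || ': ' ||
                         (dbms_utility.get_time - l_start) / 100 || ' sec');
  end;

  procedure sp_main(p_value in out clob, p_mode in varchar2, p_call_count in pls_integer) is
    l_start pls_integer := dbms_utility.get_time;
  begin
    for i in 1 .. p_call_count loop
      if p_mode = 'NOCOPY' then sp_nocopy(p_value); else sp_copy(p_value); end if;
    end loop;
    dbms_output.put_line('CLOB     ' || p_mode || ': ' ||
                         (dbms_utility.get_time - l_start) / 100 || ' sec');
  end;

end pkg_nocopy_test;
/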

I will test VARCHAR2, DATE, NUMBER, CLOB and a collection of VARCHAR2, calling each of them with and without NOCOPY and checking the timings.

My sp_main procedure takes 3 arguments: the first is the parameter of the tested type, the second is the calling mode (COPY or NOCOPY), and the third is the number of calls.
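A sketch of how the trimmed package above could be driven (values and call counts are made up; enable server output to see the timings):

set serveroutput on

declare
  l_vc  varchar2(32767) := rpad('x', 32000, 'x');
  l_lob clob            := rpad('x', 32000, 'x');
begin
  pkg_nocopy_test.sp_main(l_vc,  'COPY',   100000);
  pkg_nocopy_test.sp_main(l_vc,  'NOCOPY', 100000);
  pkg_nocopy_test.sp_main(l_lob, 'COPY',   100000);
  pkg_nocopy_test.sp_main(l_lob, 'NOCOPY', 100000);
end;
/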

So, big types like CLOB and the collection show the biggest difference. What surprised me were the NUMBER and DATE types: they show almost no difference when NOCOPY is used.

Here are the results:

As you can see, VARCHAR2 shows almost a two-fold difference, and please consider that when you work with bigger data and more than one parameter, the difference will be much larger than this.

As I said before, CLOB and the collection have the biggest difference. I mostly use NOCOPY with my collection and CLOB parameters even if I won’t be modifying them. Of course, that requires a little more attention: modifying them by mistake can cause headaches.

Using NOCOPY for those parameters definitely increases your performance, but it can also cause some inconsistent situations. Let’s say you pass a collection to a procedure using NOCOPY and start to modify it in the procedure (in a loop, maybe), and after doing some work your PL/SQL raises an unexpected error. In this case your variable will be left partially modified, which is not something we want. You must be careful about this kind of situation when using NOCOPY. When you don’t use NOCOPY, you won’t have to deal with this, and your collection (or other variable) stays consistent.

I hope this helps you increase your code’s performance.

I wish you healthy days.


Edit:

Since NUMBER is one of the smallest types, it is not really possible to demonstrate the performance gain with just one parameter. I added 15 more NUMBER parameters to my test procedures, and the result is like this: