Indexes with NOSEGMENT a.k.a Virtual Index

Hi,

Today, an old trainee has contacted with my on linkedin and we talk about some performance issues about a query. During the conversation I suggest to use “virtual indexes” to simulate the index so I wanted to write about virtual indexes and their benefits in this post.

if you don’t work on a large database (or a small database server) then, you probably don’t get bored while creating an index but if you do then, index creation can be a pain. Let’s say you examined a query and decided to create an index based on some columns could get increase the performance but will it? if your table is a large one then, creating an index could take serious amount of time. During that time, your server will have extra work load and if that index does not work as you assumed then all those time and work load will be for nothing! Also, you will have generated too many archivelogs. Those archivelogs could increase your recovery time in case of a failure.

So what can we do? You can use “virtual index”. Virtual index is an index without a segment which means this index has no data, does not index anything at all! So, what good come out of it? it’s creation time almost instantaneous and no work load on your server, no archivelog generation, no performance impact in any way. So, why do we create it? Because you can check your execution plan and see the cost and other metric values so that you can decide whether this index should be created or not!

This will save you from wasting too much time and give an initial idea.

Here is an example:

I have 72.544 rows in my dba_objects view. By using this view, I created a table:

I created a BIG table which contains 1.1 Billion rows in it. Those rows are copies of dba_objects view and it takes 35 minutes to create it in parallel (I enabled parallel dml and ddl operations to achieve that). Table size is around 185GB.

Now, Let’s create indexes on it:

Unique index took almost 18 minutes to complete and non unique took 10 minutes (no parallel). I must say that, my laptop is a monster with 10th gen i9 (8core), 64GB ram and nvme ssds. those timings could be double or triple on a regular db server. As I said, after all this time, if indexes aren’t useful then you drop them.

Instead of that, we can use NOSEGMENT clause while creating the index (virtual index)

As you see, it is fast because it just created an index definition into oracle data dictionary. Literally, there is no such index. check dba_indexes:

How Oracle can use an index if it is not actually created? Answer is STATISTICS. To use virtual index and get a consistent result from it, your statistics must be up to date. Of course, result will not be certain but close enough to give us an idea. By default Oracle will not use virtual indexes in your execution plan so you must set a hidden parameter, then your explained plans will have virtual indexes:

after setting hidden parameter _use_nosegment_indexes optimizer will create an execution plan with considering virtual indexes and I can see that the cost of the query has decreased to 2 from 6.450.000. This saves so much time. after checking your execution plans, if you are ok with them you can actually create the index.

EDIT: There is 2 things that I forgot to mention. after creating virtual index run DBMS_STATS.GENERATE_STATS procedure to generate some statistics of your virtual indexes.

and the second thing is, this is can be used only in enterprise edition not standard edition. Standard edition has no deferred segment creation so you cannot create a data object without data segment.

To create this virtual index as a real one, you must drop it and re-create it without nosegment clause.

Time is the most valuable thing for us. Don’t waste your time carelessly.

Wish you all healthy, beautiful days.

How to install Oracle Database Options like Spatial, Label Security etc After DB Creation

Hi,

yes, I am aware that title is a little bit long and too informative but I believe it is necessary because terms are complicated according to me.

When you say “option” it could be many thinks like “advanced compression” option in EE or Spatial option or Partitioning option and not all of these are the same “option”. they have different backgrounds and structures. some times they called “components” too but in the documentation they always mentioned as options. I want to write about “options” like Spatial, Jserver, Label Security, Data Vault, Olap… Do you see where I am going to? These are the “options” in dbca utility that you choose while installing the database (only if you choose “custom database” not general purposes or dw as db type) but also I will talk about partitioning and RAT (real application testing) option because their namespaces intersect in some “options” (ok, I will stop putting double quotes around the option).

I didn’t care about options much until now and I wanted to learn how to install/uninstall after db creation or what options do we have. Started to dig the documentations and I realized that it is a little bit more complicated than I thought. What I focused was the options on dbca as I said:

here they are. So instead searching docs and google first, I use a small trick and check the option “Generate Script” at the last page of dbca and I looked for the scripts. fortunately, Oracle puts every options creation into a script so it is really easy to find them but before talking about the scripts I want to talk about the option types.

Shared Objects & DLLs

Some options are made of just simple scripts but some of them are deeper! For example, if you want to install Spatial option (to use geographic coordinates in your db) you just need to run an sql script but for OLAP, you need to “link” some objects into Oracle kernel and then run some scripts. What does it mean? some options are not just tables/packages etc, they need some special C functions in oracle kernel but not everything is in the kernel. Shared Objects in Linux and (mostly known) DLL files in windows are widely used to add or remove functionalities to your application. In the end Oracle is an application and uses those dynamic libraries. I am not a software engineer so this part is a little bit above my paygrade but I will try to explain. In Linux if you want to add a functionality to your application via shared objects (dlls in windows os) you must “link” them to your application by “make” command. For example, to use OLAP it must be linked to your oracle binaries in Linux, as such necessary DLLs must be available in windows. Remember this operation is not per database but Oracle Home. When you do this, all current and future databases in this Oracle Home will be or will not be able to use these options.

Of course, it is not that easy 🙂 some options must be linked to Oracle binaries but they don’t need to run an sql script. So, if we group the options by their installation types, we have 3 different types:

  1. SQL Scripts (for ex: spatial)
  2. Linking Binaries (for ex: unified auditing)
  3. Linking Binaries & SQL Scripts (for ex: OLAP)

To install Spatial you just need to run $ORACLE_HOME/md/admin/mdinst.sql script as a sysdba user. That is it. This script will create MDSYS user and necessary objects to use spatial option.

To enable “unified audit” option in database you must just link some binaries but you don’t need to know everything of course. you can check this post of mine: http://mustafakalayci.me/2021/09/08/enable-unified-audit-on-linux-and-windows/

spoiler:
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk uniaud_on ioracle

and To install OLAP, first you must link OLAP module first and then then run $ORACLE_HOME/olap/admin/olap.sql script as sysdba.

Linking Binaries & chopt

for linking operation Oracle provides as a simple tool called chopt short for Change Option. Instead of running “make” command you can use chopt and disable and enable “some” options from Oracle Kernel. it is basically just calling “make” command itself. It does support very few option:

as you see, you can enable/disable OLAP, Advanced Analytics, Partitioning and RAT.  I want to repeat again, when you use chopt, you are linking/unlinking some binaries. So, this will add or remove some components from Oracle kernel and currently available and future databases will be or will not be able to use these options. usege of chopt is very easy:

if you run it on Windows, it will just rename a DLL (to enable option, it will rename it to correct name and to disable, rename it to something else)

chopt has different options from version to version. for example in old version there a module called DM (data mining) but in 19c it is under the Advanced Analytics so if you want to enable/disable data mining you must use chopt enable/disable oaa command. Also, you can see that “partitioning” is an option too. if you don’t have a partitioning license you can disable partitioning option so that non of developers create a partitioned table and caused (maybe) thousands of dollars cost.

Only use the options you need! Nothing more. Consider that unnecessary options will increase db upgrade timings too. Also, as in partitioning example, some options are licensed options which means you have to pay to use them. if you don’t have the license then disable it.

As far as I understand, there is a historical development in this. for example, in old versions, Spatial was an option that needed to be linked too. sdo_on and sdo_off was used to enable/disable the Spatial.

as you see, when you try to turn it off now, warns us about “sdo_off is disable”. it is always on anymore.

here is some examples to enable/disable options in Oracle kernel but some of them (like Label security) is deprecated.

Product/Component Enable Switch Disable Switch
Automated Storage Management asm_on asm_off
Oracle Data Mining dm_on dm_off
Database Vault dv_on dv_off
Oracle OLAP olap_on olap_off
Oracle Label Security lbac_on lbac_off
Oracle Partitioning part_on part_off
Real Application Cluster rac_on rac_off
Real Application Testing rat_on rat_off

Also, please consider that if you disable an options like OLAP using chopt (or make command) it will be disabled in dbca too. you won’t be able to choose it:

WARNING: before use chopt or make commands, always close all services in the oracle home first (database, listener etc). for windows servers, also stop the related Oracle Services in Windows Services (like OracleServiceOrcl…)

Dbca Options

Options that you can install into a database (might required binary linking). you saw them at the screenshot above. Every each of them has some unique features (I won’t go detail the purposes of the options) and some of them depends to others. So, basic hierarchy is like this:

  1. Oracle JVM
    1. OLAP
    2. Multimedia
  2. Oracle Text
  3. Spatial
  4. APEX
  5. Label Security
    1. Database Vault
  6. Extension for .NET

to be able to install Multimedia you must have Oracle JVM first or you can install APEX or Label security without installing anything else. What are the scripts to install? you should always check the documentation or support to achieve this but as an example I wanted to write them in 19.13 version. DO NOT run them directly! Always check the documentation first, scripts might be changed! those samples are taken from dbca generated scripts.

  1. Oracle JVM
    $ORACLE_HOME/javavm/install/initjvm.sql;
    $ORACLE_HOME/xdk/admin/initxml.sql;
    $ORACLE_HOME/xdk/admin/xmlja.sql;
    $ORACLE_HOME/rdbms/admin/catjava.sql;
    connect “SYS”/”&&sysPassword” as SYSDBA
    $ORACLE_HOME/rdbms/admin/catxdbj.sql;
  2. OLAP
    $ORACLE_HOME/olap/admin/olap.sql SYSAUX TEMP;
  3. Multimedia
    $ORACLE_HOME/ord/im/admin/iminst.sql;
  4. Oracle Text
    $ORACLE_HOME/ctx/admin/catctx.sql Xbkfsdcdf1ggh_123 SYSAUX TEMP LOCK;
    alter user CTXSYS account unlock identified by “CTXSYS”;
    connect “CTXSYS”/”CTXSYS”
    $ORACLE_HOME/ctx/admin/defaults/dr0defin.sql “AMERICAN”;
    connect “SYS”/”&&sysPassword” as SYSDBA
    alter user CTXSYS password expire account lock;
    $ORACLE_HOME/rdbms/admin/dbmsxdbt.sql;
  5. Spatial
    $ORACLE_HOME/md/admin/mdinst.sql;
  6. Apex
    $ORACLE_HOME/apex/catapx.sql Xbkfsdcdf1ggh_123 SYSAUX SYSAUX TEMP /i/ NONE;
  7. Label Security
    $ORACLE_HOME/rdbms/admin/catols.sql;
  8. Database Vault
    $ORACLE_HOME/rdbms/admin/catmac.sql SYSAUX TEMP;
  9. Extension for .NET
    $ORACLE_HOME/rdbms/admin/dbmsclr.plb (in windows only)

How to Install Dbca Options

Well, I pointed out the scripts and you can run them (for the same version, and always check the documentations first) but of course it is not that simple 🙂

This is very important! Installation method depends on whether you are using a Container database or Non Container database! if you are not using container database (old design which is not supported anymore by starting 21c), you can run the scripts by connecting a sysdba user but for a container database it is more complicated.

for a container database, most of those options must be run in cdb$root and then all the pdbs but this is a hard job and opens for mistakes. instead of this, use catcon.pl perl file to automate this. spatial sample creation script taken from dbca generated scripts:

it seems complicated but mostly paths makes it longer. simply call a perl script and as a parameter it passes the original script of the necessary option. Always use catcon.pl to install/uninstall an option in a container database.

How to Check Which Options are Installed

This is (as always) not that simple 🙂 Some options are “registered options” so you can list them via dba_registry data dictionary view:

These rows do not match to an option exactly. OLAP option is listed as 2 rows, “Oracle OLAP API” and “OLAP Analytic Workspace”. Also, some options are installed by default like “Oracle Workspace Manager”.

is that all? of course not. these are registered options (components) but also there are options in Oracle kernel like unified auditing. it is not depend on an sql script. for all options in the kernel use V$OPTION view:

I just listed some of the rows because it is a long list.

Uninstall an Option

Uninstalling and option is not that easy because some of the options create specific objects in different schemas (like sys) and to uninstall it either you must find every object that used by the option or use some scripts again that either Oracle Support provide or exist under the oracle home directory.

To Uninstall Spatial, for instance, you can simply just drop MDSYS schema but there might be some tables which uses spatial data types, indexes etc. you must find and drop them first. even if it seems easy there is always a catch. So, be careful and always follow support documents about it.

In Conclusion

So, it is a deep topic and probably there are much more. I said this before but again; remove all unused options from your database. They will be a burden to your system. So far these are what I understand and learn… I hope, these are useful and helpful to you while understanding the db options.

As always, wish you healthy days, NO WAR in anywhere, all life in this planet matters.

V$DIAG_ALERT_EXT and equality conditions

Hello,

I don’t know if this is a bug but something small took my attention. Codes below are run on 19.8 and 19.12 versions. Lately, I was working on alert log messages and to do that I use V$DIAG_ALERT_EXT view which is a row by row representation of alert log file. while I was looking for values in MESSAGE_GROUP first I got distinct values :

seems fine until now, then I wanted to list everything for message group value is “startup”

at first I though I type “startup” incorrectly but it was correct. then I though there might be some invisiable charanter at the end but there were none any! So I tried a “like” search:

so rows are there conditions are true but result is missing with an equality search.

then I tested this on others columns like host_id, organization_id etc and same behavior! I decided to create a table using CTAS and test it on it if there is something different on the data:

it worked! so there is nothing wrong with data. equality search is working on dummy table. Also if you check column data types you can see that they are VARCHAR2(67). As a last step I wanted to look at the data using DUMP function:

so data is exactly the same. either this is a small bug on V$DIAG_ALERT_EXT or it is a feature 🙂 I am not sure but just in case consider searching v$diag_Alert_ext with like conditions (using % sign at the end) not an equality.

wish you a healthy days.

 

Edit 2021-11-18 : I confirm that equality condition works properly on 19.13

ORA-24244: invalid host or port for access control list (ACL) assignment

Hello everyone,

So you wanted to migrate your ACLs to another database or you are trying to add a new acl entity but you got ORA-24244. There are many reasons for that error unfortunately. I wish Oracle provides us more detail about what is the error but I wanted to share what I have seen so far and how to generate acl records. What I am about to write is for 19c databases.

  1. First of all, of course, you can get this error by setting an invalid port like -123 or 999999999. check them first.
  2. secondly, port range you use might intersect with another definition. for example, there is an acl entry for www.example.com for 3000 and 3010 ports. you can not add a new port range for same host, same principal and same privileges with port range 3010-3015. since 3010 is already defined in first entry you can not use it again, second entry must start with 3011 at least.
  3. thirdly, you might be trying to insert an entry for RESOLVE privilege with port definition! this is not also allowed. for resolve privilege, lower and upper ports must be null.

these are my basic steps to check. even expdp utility with “include=NETWORK_ACL” option might failed while importing them because of these 3 steps. Lately I was copying acls from a 12c db to 19c and after getting too much error I decided to generate a simple select statements to generate acls append command. you can either copy, paste and run the result of statement or run it in a loop and run the command with dynamic sql.

Please don’t judge for not using match_recognize. I just wrote it fastest way I can think of. script is first dividing port ranges to row by row single ports and them regroup them (some old entries had intersected port ranges). also uses NULL for resolve privileges. I didn’t add all options of course. I hope this script helps you.

 

wish you healthy days.

Raid 5 and XOR

Hi,

Today while talking with a friend he mentioned about using a RAID 5 on disks. Since I am not experienced with RAID structures, after the conversation, I started to read about RAID 5. of course I know what RAID is and basically how it works but don’t have information about all types. so while looking at RAID 5 I saw that term “parity”.

Basically, a data block A wil be written to the disks if you have a RAID 5 configuration with 4 disk then the data will be dived into 3 pieces (a1,a2,a3) and each part will be written on a disk. this will improve your performance of course but on the last disk there is a part called “parity”. what is that? for example on ASM if we are using our disk groups with NORMAL redundancy, every piece written on a disk will be also written to another disk so data will be protected against disk corruption or lost. RAID 5 configuration is also protect your data against disk lost etc but it does not copy every data piece (a1,a2,a3) to other disks. it is just using this “parity”.

At first I thought it is a combination of all pieces (which means whole data) but this would be ridicules because you would be loose all performance gain and I started digging. What I found is so simple (and probably many of you already knew about this) and clever, also new to me. So I wanted to share.

Parity is a result of XOR Gate and produces brilliant results. Let’s assume our data pieces (a1,a2,a3) as simple binary data like:

a1: 1001
a2: 0111
a3: 1011

as a short explanation of XOR gate, it will produce 1 if inputs are different and 0 if they are same. so;

1 XOR 1 = 0
0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1

so XOR of a1 and a2 is:

1001
0111
——- XOR
1110

parity is the combination of three pieces (in my example) with an XOR gate

a1              a2              a3
————————————–
1001 XOR 0111 XOR 1011

so order of execution is (a1 XOR a2) XOR a3. Result of this operation is 0101 (you can use online xor calculators like: https://toolslick.com/math/bitwise/xor-calculator) and remember leading zeros will be suppressed so if you see 101 as result, it is same with 0101.  result of 2 xor operation is 0101 which is our “parity”. this data is written to last disk but how does this protect us from a data loss? This is the part where we show the power of XOR. Let’s say we lost disk2 (a2 is missing), all we have a1, a3 and parity which are:

a1            a3            parity
1001        1011       0101

so let’s put them into another XOR gate:

1001 XOR 1011 XOR 0101 = 0111 (a2)

we have our a2 piece back. this will work any of missing pieces and it is really efficient but of course real data blocks are much more bigger and they have many bits to pass through XOR gate of course this will decrease your performance a little bit but still will be faster than just one disk.

Sequential ID column without gap!

Hello,

A customer is asking to you generate order numbers one by one and without gap! What an unpleasant request isn’t it? if you are not familiar to databases then you might think that this request is logical but it is not.

So your customer asked for this and you decided to develop your code based on this. How this can be achieved? You can count the number of current orders and then increase it by one and use this number as new order number but after a while you will realize that customer is complaining about a “unique constraint” error (I hope they will otherwise this means you don’t have a unique key on order number and you will create duplicate orders with same number).

When you check the situation you will find out that these orders’ insert times are very close to each other. Basic problem here is when 2 people tries to insert a new order at the same time, you will get count of the rows (at the same time) and both session will get same number of rows and increase it by one and use it as order number but whichever session inserts the order row, the other one will get and unique constraint error (hopefully). so getting count of rows and increasing it does not guarantee that you will get a unique ID!

After that point some people understand what they are trying to do and make some research and stop themselves. Some other people will try to overcome this “obstacle” and come up with new ideas(!). one of them is adding a “for update” clause to select count(*)… statement so no one can run this select statement at the same time. This terrible idea will destroy your db performance and cause too much locks, wait events etc because for update will lock all rows that you select (in this case all rows in table) so it won’t be able to insert more than one order at the same time and also, a user who is trying to update an old row will also wait for completion of new order’s insert operation. so this is the worst solution. (it is not even a solution!)

a Second group of people are create a ORDER_NUMBER table and insert a rows in it to keep order number.

again they have to use a “For Update” clause to make sure no one is reading the table at the same time and every number will be unique! This solution (?!) will save you locking from all orders in orders table so the sessions which are trying to update a row will continue their job but sessions which are trying to insert orders at the same time have to wait each other. if this is a busy system (Let’s say 100.000 order per day) then wait on Order_Numbers table will be huge and your customer(s) will be complaining about slow processing.

Then what is the solution here? Solution is giving up to generate gapless sequential numbers. Gapless ID columns are nice for human eyes but certainly not needed for a software system. Since we have a limited capacity to remember, we want things in some specific order so we can track it but computer systems are capable of tracking billions of data without “forgetting” staff. Simply just use a SEQUENCE for Oracle (every db has an object for this purpose). Sequence is an object that you can query simultaneously and it guarantees that every number will be unique (unless you told it to do otherwise) whoever you are and whenever you asked for it. if so, can not we use sequences for our gapless ID column? not exactly.

Mostly we create sequences with a CACHE option. whenever you ask a new number from a sequence it will generate 20 numbers by default. this will increase the performance of the sequence but if you shutdown your database cached numbers will be gone! after opening database you won’t get those numbers back. so there will be a gap! Secondly, let’s say you get a number from sequence but before inserting your row you got an error and your transaction rollbacked. That number is also lost! again there is a gap.

As I said earlier, gapless (and fully sequential ) order numbers are a fantasy (mostly!) if my customer(s) force me to do this I would design my system with IDs which are generated from sequences and simply add a “rownum” or “row_number…” function while showing those rows to my customer so they can see their rows with and “Ordered” ID. of course that will bring some extra coding to you because when customer clicked a row to see the details you must be holding real ID and fetch data accordingly but this will save you from so much headache (locks, wait events etc).

In this case you might ask that what will happen if someone deletes a row from the table! Let’s say I have 5 rows and I used a row_number analytic function to generate fake IDs and then someone delete the second row so on the next time I will be showing 4 rows and 5th row with fake ID will be shown as 4 now. Please don’t forget for this kind of problems you wouldn’t be using DELETE anyway. even if you generate your IDs ordered and gapless, and if someone deletes a row the you will have a gap which you already don’t want that. so no one should be running delete otherwise there is no point requesting “gapless sequential ID”.

I defined this request as “fantasy” before but this is not entirely true. There might be some legal issues here for example your government might force you to keep Invoice Numbers ordered and fully sequential. So you might not have any option but accept this request. Or your customer might declared that they won’t work with you unless you do that! So to keep your customer you must do it and since you already developed your system with sequences you might not have enough time or budget to convert your software to a “rownum or row_number” fake id solutions.

Note: by saying “fully sequential”, an order’s number can not be bigger from the order that inserted after that row. you have 5 rows and you started to insert them (one row for every minute), their numbers must be: 1,2,3,4 and 5. they can not be 1,2,3,5 and 4 both gapless but not fully sequential.

One of my customer asked me a solution about that kind of problem and they asked it with worse conditions: Every customer’s order number must be gapless and fully sequential. As this condition is not enough, these numbers must be reset to 1 for every year. they want to show order numbers as:

2020-1
2020-2
2020-3
….
2021-1
2021-2 etc

Since they have more than one customer, they want these conditions per customer. every customers’ order number must be start with 1 and it must be reset every year. How nice…

still I don’t want to use “storing number into a table and selecting it with for update” solution. So I come up with a sequence solution has a little bit low performance (I won’t use sequence caching) and relatively complicated.

My package generates IDs for your customers. Sf_Get_Id function gets the customer ID as parameter and check for a specific sequence name pattern: SEQ_CUSTOMER_ID_<YEAR>_<CUSTOMER_ID> if it is not exist (this might be the first time you use the package or it might be the first call of the new year) it creates that sequence. as a con if you have many customers there will be many sequences and still there is a gap probability because you might get the ID but not use it ( you might get an error before insert your ID) but if some small gaps (a few per year) is acceptable then I would go with this solution (?!).

As I said a real solution would be using sequence (with caching and all the staff) and generate a fake ID per rows. I wouldn’t go with any of those unless I have to do it because of the law or something.

PS: whoever user will run this code have to have “create sequence” privilege directly not over a role or something.

thanks.