Insert multi line data using sqlldr

Hello Everyone,

I just answered an OTN question about inserting multi line data using sqlldr and simply wanted to share it here (https://community.oracle.com/thread/4278952)

so if have a data like this in your text file:

1,”this is

two line data”, 10

2,”this is

three line

data”, 11

3,”this is one line data”, 12

4,this, 13

you need to do some extra work to accomplish this problem. In my example I have 4 rows but 2 of them includes multi line character data. By default sqlldr uses “newline” character (actually maybe saying enter character could be more appropriate) and every new line will be accepted as new row! To solve this problem you need to change your “row delimiter/terminator” but this will bring some extra burden because your current data file (csv, txt etc) is needed to be changed and your “new” row delimiter must be added to end of every row!

this is my table:

This row delimiter could be more than one character and it should be because row delimiter must not be seen in your original data. for example if you have comma in your data you can not use comma as row delimiter because every comma will be count as end of row. In my example I will use $%&/| as row delimiter.

First change your data file and add new delimiter to your data:

1,”this is

two line data”, 10$%&/|

2,”this is

three line

data”, 11$%&/|

3,”this is one line data”, 12$%&/|

4,this, 13$%&/|

secondly, we need to specify this new delimiter in our control file:

probably you already saw “STR X’2425262F7C'”. this is specifying new row delimiter. X means this is an hex code and 2425262F7C is the hex code of my delimiter $%&/|

how to produce this:

select UTL_RAW.cast_to_raw(‘$%&/|’) from dual;
————————
2425262F7C

so you can use any combination for your delimiter and just simply use utl_raw to generate hex code and use this in controlfile.

rest is same. just run sqlldr command with this control file and it will load the rows as expected.

select * from tmp_x;

please consider that your data must be in another file than control file! you can not use “infile *” and BEGINDATA in this structure.

I hope this helps.

Using Temp instead of Undo for GTTs

Hello everyone,

with 12c we have a very nice new option; Temp Undo for GTTs (Global Temporary Tables). as you know temporary tables are used for storing temporary data. those tables data are stored in TEMP tablespace so they are not vital. Temporary tables can generates very little redo data so they are  faster than heap tables. if you experience an instance crash those data will be lost but as I said those data is not vital.

even we say gtts generates less redo, they still cause to generate redo because of UNDO! temporary table data does not need to recover after an instance crash but you might need to use a “rollback”. whenever you run a DML on a table Oracle will copy the original data to UNDO tablespace in case you need to rollback your operation and if you run a rollback, original data will be copied back to table from UNDO tablespace. By the way, that is why commit is so fast and rollback takes as much as the dml operation itself.

So, whenever you run a DML, oracle genrates 2 basic group of redo data. one for the change of table blocks itself and one for the change of undo blocks. GTTs does not generate (at least very little) redo data so they will have a better performance but your dml will cause original data to copy to UNDO and that will cause an UNDO block change and that will cause to redo log data. GTTs cause to generate redo log data indirectly. After your dml on the GTTs if you issue a rollback original data will be copied from undo to GTT.

With 12c, we have an option to choose where undo data to be written and we can choose to write undo data to temporary tablespace itself. Point is almost there won’t be any redo log data and that will make our GTTs much more faster. To achieve that you need to set a new database parameter called TEMP_UNDO_ENABLED. by default this parameter’s value is false and you can set in system or session.

Let’s demonstrate this but I would like to add a note before we go! if you use a GTT on a session, parameter’s value change won’t affect the result. before using the GTT you must set temp_undo_enabled parameter (or simply set in system level)

first to check the redo size I wrote a small package that will show us used redo size difference between the previous call of this package and current.

whenever you call pkg_get_stats.sp_get_redo_size procedure, it will write the difference yo dbms_output.

so Let’s start with a new fresh sqlplus session (always use sqlplus for tests because UIs might do some extra jobs which can affect your results). My test case will show us three things.

1- How much redo generated for dml operations (including table create command )
2- How much temp space is being used by that session
3- How much time required to complete the dmls

session1 without using temp undo:

I suppressed dml results. as you can see job completed in 24 seconds, generated ~733 KB redo and used 559KB temp space. Total generated data is around 1292KB.

Now let’s run the same code after enabling temp undo:

session 2 with temp undo:

as you can see total run time decreased to 15 from 24 which is a huge gain for performance. almost no redo generated but of course temp usage increased even the total amount (1118KB) is less than previous (1292 KB).

PROS:
1- You will probably get faster execution time! I say probably because if your undo tablespace has much more faster disk structures than temp then you might experience a performance loss! on my system they have same disk structure and disk performance.
2- You will generate less redo which is so much important. Don’t think this is just a performance issue. if you have a data guard on a distance location, every redo log data will be shipped to that location over network. decreasing the generated redo size will decrease your network traffic also the amount of job on data guard db because unnecessary temp table insert/update/delete operations won’t be run. Again a performance gain for everything.
3- You will need less UNDO space since your GTTs won’t use it anymore.

CONS:

1- You will need more space on TEMPORARY tablespace because it will use for undo data of GTTs
2- your TEMPORARY tablespaces should be on fast disks so you can get the performance gain.

In conclusion, I can’t wait to see the active usage results and I will active this on a customer’s db these days. I will also share the information about active usage.

thanks to all.

Edit: I already used this new feature on one of my customer’s report and got very good results. Performance of the report increased around %10 and redo size increased significantly but I want to be cautious before set this parameter database level and I use it on session level for specific operations.

Commit Time for an Inserted Row in Oracle

Hello everyone,

This week, one of my customer have experienced a problem. They provide some Web Service to their suppliers and one of those web services creates some rows and then returns the response as successful operation but just after that, supplier queries the newly created those rows but they get “no rows”! When analyze and development team came to me I simply look at the logs and fortunately they are storing their responses in a table with timestamp values too. After checking some tables, we find out that rows are created first and then responses are sent to caller. Also business logic is in that way. there is no way to return the response before the rows created.

So simply I asked about “commit time” because this is the only things that we can not measure in this case. rows might be created before sending the response but they might not committed yet so caller can not see them when querying with another session. Development team can not answer my question exactly because of very complicated framework logic. Somehow I need to solve this.

Basically I needed the commit time of that rows. while inserting a row you might use “sysdate” to understand when this row inserted but actually this is not enough and accurate because I might insert the row at 22:00:00 and write this date along with the row but might commit it at 22:01:00! Commit is 1 minute later. so when I check the row few hours later, I will “think” that this row is exist at 22:00:25 but actually it is not because it wasn’t committed yet but I don’t know that.

How can we check that? you might heard about “ora_rowscn” if not you can find some details here: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ORA_ROWSCN-Pseudocolumn.html#GUID-8071AAB0-F656-4C93-B926-0BCE1439F121

This is a pseudo column which means it is stored without your action in your table. ora_rowscn stores scn (system change number) information and by default table stores this data per your table blocks not per row! that means a bunch of rows has same ora_rowscn value because they are in the same block but you can change this behave and force Oracle to store ora_rowscn per row and this will give you an idea about the “commit time” of every row. let’s demonstrate:

I created a table with “ROWDEPENDENCIES” clause which allows Oracle to store ora_rowscn per row. then I inserted and committed 2 rows but after inserting the second row I am waiting for 10 seconds. when we check the rows they both have same inserted_date but they have different ora_rowscn values and ora_rowscn almost shows the “commit time”.

I said almost because Oracle does not guarantee that. ora_rowscn will show at least the “commit time” not before that but might show a little bit after the exact commit time. even tough, it gave us a clue. in my example I am inserting second row at 22:48:26 but commit has arrived at 22:48:36.

as a con, “row dependency” property can not altered after the creation of the table which means you can not change your existing table you need to create a new table. As you guess my customer’s table doesn’t store ora_rowscn per row so rhat did I do to solve it? Simply created a new table with the Id of base table and with rowdependencies, then added a trigger to base table. Whenever an insert runs against the table, I also inserted it to my new table with the ID of the row and then check the ora_rowscn from new table.

Also a reminder. storing ora_rowscn per row requires more space than per block(by default).

I hope that helps.

How to Get Your SID? USERENV vs V$MYSTAT!

Hello everyone,

This is an easy question. How to find your own session SID and when you ask Google for it, you will find many answers. While reading some documents, I realized that some Gurus are using V$MYSTAT to get current SID:

I always prefer USERENV(‘SID’) context to get SID and curious about which one is faster and consume less cpu resources. So I created this sample code:

as you see, USERENV(‘SID’) is much faster and consume less cpu resource than v$mystat method. so I will keep using it and suggest you do the same.

Goodbye Log Triggers Welcome Flashback Data Archive

Hello,

I would like to talk about Flashback Data Archive (or Flashback Archive – FBA) in 12c. FBA was introduced in 11g. It is not new but it has very important new features that allow us to use FBA very efficiently and for free.  I would like to talk about new features more than what it does and and how it works but let’s give a quick look.

What is FBA?

Basically FBA is a module that let you store historical information about data in your table. When you enable FBA for a table then Oracle will start to watch this table and store every change on the table. this is a large definition and it is not wrong because all the DML changes will be started to log but also physical changes will be recorded too. you will be able to flashback your table before truncate or any other table alter. if you dropped a column, you can take it back or vice versa.

What is new in 12c?

First of all, and I believe the most important one, it is free anymore! in 11g FBA was using compressed tables by default which requires “Advanced Compression License” and that means additional costs. in 12c this is an optional features. By default Oracle does not create those table as compressed so you don’t have to pay anything unless you don’t want to use compression option.

Secondly, FBA can store context information along the data changes anymore which I needed most and couldn’t use it in 11g just because of that. if you have a web application then probably  application will be using a common user and managing users itself. this causes you not to identify sessions because they are all same users but if you have a good developer team then you can ask them to set some context information like client_identifier. This data can be used to separate sessions and identify real users for example. With 12c FBA is able to store those information with changes and when we check historical data we can see all context information too.

Is FBA better than Log Triggers?

In my opinion, YES! of course there are many things to check but I will try to make a demonstration about performance of FBA.

This is my test case:

So we have two tables, TMP and TMP_FBA. I created a logging trigger on TMP and write every DML into TMP_LOG table with some context information like, client identifier, os_user, terminal etc. In this point you can see that my trigger is a for each row trigger and it will be writing every change one by one to log table. Some might use a compound trigger and store changed rows into a collection write it to log table at after statement section. This can optimize your logging while using Bulk DMLs but if your DMLs change too much rows then this can cause you consume too much PGA and memory problems. So I didn’t use it in my example. By the way to provide stability I created T_Base_Data table and I will use this to insert my original test tables.

My FBA is not compressed one (I didn’t use “optimize data” clause) as well as my Tmp_Log table too. I will do some DML and compare the performance. Also I want to compare size of the tables it will give useful information too. First I will insert some data with “insert select” statement then insert same data row by row using a for loop.

When we check timings, we see unbelievable  difference:

Trigger Bulk Insert: 22.13 seconds
FBA Bulk Insert       : 00.07 seconds

Trigger Row By Row Insert: 33.62 seconds
FBA Row By Row Insert       : 05.12 seconds

so for performance of our Insert statements, winner is definitely FBA. if we check log sizes, Our log table which is inserted by trigger, has reach to 112MB but FBA related objects are 35MB. One of the best things about FBA is it does not generate much INSERT log records because the original data is already in our table. This feature has already given us a lot of space. So we can say that about logging size FBA is winner again!

PS: While running my codes, I want you to know that only FBA table is Tmp_FBA so while checking size of FBA related object I used “object_name like ‘SYS_FBA%” condition. I will explain those objects at the end.

Let’s run some UPDATE:

Trigger Update : 53.38 seconds
FBA Update        : 02.25 seconds

Trigger Log Size : 96 MB (208 – 112)
FBA Log Size        : 91 MB (126 – 35)

Winner is still FBA.

Delete:

Trigger Delete : 48.24 seconds
FBA Delete        : 01.92 seconds

Trigger Log Size : 104 MB (312-208)
FBA Log Size        : 48 MB (174-126)

and winner is again FBA!

Everything is awesome but how can we see our logs in FBA? where are those logs? of course we can check the tables that FBA created automatically but there is a better way to see logs, Flashback Query:

you don’t even need to find the log table, just a flashback query will be enough to see historical data.

In my example I inserted 2 times all dba_objects into t_base_data table and I used this table to insert 2 times again into Tmp_FBA, that is why you see 4 COL$ tables.

finally, if you want to see FBA tables:

SYS_FBA_DDL_COLMAP_nnnnn is used to store column changes.
SYS_FBA_TCRV_nnnnn is used to store transaction informations.
SYS_FBA_HIST_nnnnn is used to store data changes.

also there are 2 default indexes on those tables.

Why FBA is so much faster?

trigger logging cause 2 actions. first calling a trigger which is a plsql object and then running an another insert statement. That means too much job to complete and context switch between sql and plsql.

FBA is using UNDO segments so basically it does no extra job! whenever you run a DML statement, Oracle copies all data which you are about to change to undo segments. if you commit, undo segments become obsolete (unless there is no select actively running) but if you rollback then all data in undo segments copied back to original table blocks. that is why commit is too fast but rollback is slow. Anyway, FBA reads undo segments which means your DML already generated undo blocks and FBA just read and save them. That’s all.

How about the Security?

One more time, FBA is the winner! You can not modify FBA related tables and by saying modify we mean any DML or DDL. even if SYS user can not drop or delete FBA related tables:

any user with drop any table or delete any table can delete your trigger base logging table but not with FBA! that brings a huge security advantage. of course a user who has flashback archive administer privilege can remove FBA from your table but this will be an obvious action because previous data will be also lost!

In Conclusion

Based on results of my test case I decided to convert all my log structure to FBA but there are a few more tests that I must complete first like checking PMOs (partition management operation), compression on FBA (since I have advanced compression license) etc.

thanks for reading.

How to call a Web Soap Service or API from PLSQL

Hello Everyone,

I would like to share one of my codes which allows you to call a web service (soap service) or API through PLSQL. I modified it time to time whenever something new comes up and I still do. I hope it helps you too.

 

Useful 12c new features for developers

Hello Everyone,

I just upgraded one of my customers database to 12c R2 and it has really nice features so I wanted to share some of them for developers.

  1. Top N rows:
    this is a late feature but finally arrived. Before 12c if we want to first N rows from an ordered dataset we would have to write something like this:

    with 12c we can finally can get rid of that subquery with new FETCH clause. Equivalent version of sql’s above in 12c is:

    if you check execution plan for this new query you will see that it uses Row_Number() analytic function to fetch rows:
  2. Table Column Default On Null:
    Before 12c we can assign a default value to a column but this will work only if user does not use that column in insert script. Also if user sends NULL as value, your default value won’t be use again. after 12c we are able to define a value even user sends NULL in insert script:

    if you are someone like me who almost hates NULL values, this is a great feature.
  3. Identity Column and Sequence Nextval as Default Column Value
    First, you can assign a sequence next value to a column default value.

    so this is not just a shortcut for writing insert statements. many companies (unfortunately) use triggers to assign id column value which is a complete disaster!
    Rule of Thumb: Triggers are evil! if you have option not to use trigger, do not use trigger!
    Let’s check performance of sequence default value and using trigger for that job.

    so it is around 7 times faster then trigger! Yupp!

    Also you can create identity columns anymore and don’t have to deal with sequences:

    this is not magical of course, let’s check which objects we just created:

    as you can see there is a new sequence called ISEQ$$_76120. This is created for our identity column. In the end, identity data is generated from a sequence. check default data for identity column:

    Oracle just created a sequence and assign it as default value of that column.

  4. Creating PLSql Functions in SQL:
    This is a great feature. It is probably one of the worst thing to use a plsql function in Sql! this will cause context switch and decrease your Sql Performance.

     

There are many different new features of course but those are my favorite ones. Hope you like it.

See you soon.