Guid vs Sequences Mustafa, 2023-12-29 Hi Everyone, I haven’t written any posts lately because there is so much going on in my life and biggest change is I am moving to the Netherlands. it is a big change for me and my family but I hope everything will be fine (if we can find a house to rent because it is really a problem 🙂 ). Before the end of 2023, I would like to write about something small. Sequences vs Guid on database. I discussed this topic on oracle community many years ago and saw it lately and wanted to point a few things. SYS_GUID() function generates guid on Oracle database and it might be quite fast, even faster than sequences since sequences run internal commits to provide consistency but it has some disadvantages. 1- Space usage Guids are 16 bytes data. smallest number to use 16 bytes should be 10000000000000000000000000001. you can use vsize to check the sizes: select vsize(sys_guid()) , vsize(10000000000000000000000000001)from dual ; VSIZE(SYS_GUID()) VSIZE(99999999999999999999999999999) ----------------- ------------------------------------ 16 16 1234 select vsize(sys_guid()) , vsize(10000000000000000000000000001)from dual ;VSIZE(SYS_GUID()) VSIZE(99999999999999999999999999999)----------------- ------------------------------------ 16 16 this is a huge number, probably there won’t be much system which use so many numbers as keys. that means if you start from 1 on your sequence, for billion of rows, it will require relatively small amount of space to store numbers but guids will require much more: create table tmp_guid as select sys_guid() id from dual connect by level <= 1000000; create sequence seq cache 100000; create table tmp_seq as select seq.nextval id from dual connect by level <= 1000000; select segment_name, round(bytes/1024/1024) size_in_KB from user_Segments where segment_name in ('TMP_SEQ', 'TMP_GUID'); SEGMENT_NAME SIZE_IN_KB ------------- ---------- TMP_GUID 25 TMP_SEQ 13 12345678910 create table tmp_guid as select sys_guid() id from dual connect by level <= 1000000; create sequence seq cache 100000;create table tmp_seq as select seq.nextval id from dual connect by level <= 1000000; select segment_name, round(bytes/1024/1024) size_in_KB from user_Segments where segment_name in ('TMP_SEQ', 'TMP_GUID');SEGMENT_NAME SIZE_IN_KB------------- ----------TMP_GUID 25TMP_SEQ 13 it almost double the size. 2- Performance This one is tricky. Let’s do a test: Oracle PL/SQL drop sequence seq; create sequence seq cache 100000; set timing on declare rr raw(32767); begin for i in 1..100000 loop rr := sys_guid(); end loop; end; / Elapsed: 00:00:00.125 declare rr simple_integer := 0; begin for i in 1..100000 loop rr := seq.nextval; end loop; end; / Elapsed: 00:00:00.567 1234567891011121314151617181920212223242526 drop sequence seq; create sequence seq cache 100000; set timing on declare rr raw(32767);begin for i in 1..100000 loop rr := sys_guid(); end loop;end;/ Elapsed: 00:00:00.125 declare rr simple_integer := 0;begin for i in 1..100000 loop rr := seq.nextval; end loop;end;/ Elapsed: 00:00:00.567 for 100.000 generation, guid takes 0.1 second and sequence takes 0.5 second. ratio is big but time difference is not much significant. still guid is 5 times faster or it seems so 🙂 Guid generation in oracle relies on operating system. sys_guid() function is actually calling OS guid generation function. the example above is from a database which resides on Linux host. here is the results from a db which resides on Windows 11 host: set timing on declare rr raw(32767); begin for i in 1..100000 loop rr := sys_guid(); end loop; end; / Elapsed: 00:00:07.778 declare rr simple_integer := 0; begin for i in 1..100000 loop rr := seq.nextval; end loop; end; / Elapsed: 00:00:01.000 12345678910111213141516171819202122 set timing on declare rr raw(32767);begin for i in 1..100000 loop rr := sys_guid(); end loop;end;/ Elapsed: 00:00:07.778 declare rr simple_integer := 0;begin for i in 1..100000 loop rr := seq.nextval; end loop;end;/Elapsed: 00:00:01.000 suddenly, guid took 7.7 seconds. almost 80 times slower on Windows OS. the reason of this can be seen by querying data: select sys_guid() from dual connect by level <=5; Output On Linux: 0DA41BD1A0D<strong>3</strong>621DE0634406C40A46BD 0DA41BD1A0D<strong>4</strong>621DE0634406C40A46BD 0DA41BD1A0D<strong>5</strong>621DE0634406C40A46BD 0DA41BD1A0D<strong>6</strong>621DE0634406C40A46BD 0DA41BD1A0D<strong>7</strong>621DE0634406C40A46BD Output On Windows: 22C5DAC9C6D646CF938C3669429D338A A9287E15E31F42308823656212D0E49B 154F601EFD244B3886D07CF5F95DA799 97C2D3861FA7493A8CAEEF530D17DC2F 8234A4410800413A9C9C90AC2CF201E7 123456789101112131415 select sys_guid() from dual connect by level <=5; Output On Linux:0DA41BD1A0D<strong>3</strong>621DE0634406C40A46BD0DA41BD1A0D<strong>4</strong>621DE0634406C40A46BD0DA41BD1A0D<strong>5</strong>621DE0634406C40A46BD0DA41BD1A0D<strong>6</strong>621DE0634406C40A46BD0DA41BD1A0D<strong>7</strong>621DE0634406C40A46BD Output On Windows:22C5DAC9C6D646CF938C3669429D338AA9287E15E31F42308823656212D0E49B154F601EFD244B3886D07CF5F95DA79997C2D3861FA7493A8CAEEF530D17DC2F8234A4410800413A9C9C90AC2CF201E7 Linux generates guid in an order (which is better by the way, you can use sorting in sql) but windows guid is more randomize. because of this randomization, windows guid generation takes much more time. So be careful on this one. Guid is really a good option and much faster then sequences but will require more disk space. Also, it is not good looking to human eye but of course this shouldn’t be a reason. Also, sys_guid() funciton almost has no concurrency problems. if you start to call a sequence from different sessions so many times (even the cache size is big) there will be some library cache: mutex X Concurrency. This will not happen on guids. here is an example: truncate table tmp_guid; begin for i in 1..5 loop dbms_scheduler.create_job('aj_test'||i, job_type=>'PLSQL_BLOCK', job_action=> 'begin insert into tmp_guid select sys_guid() from dual connect by level < 10000000; end;', start_date=>sysdate, enabled=> true); end loop; end; / select run_duration from user_scheduler_job_run_Details where job_name like 'AJ_TEST%' order by log_Date desc; --------------------------------------------------------- +00 00:00:07.000000 +00 00:00:07.000000 +00 00:00:07.000000 +00 00:00:06.000000 +00 00:00:06.000000 /*********************************************************/ drop sequence seq; create sequence seq cache 1000000; truncate table tmp_seq; begin for i in 1..5 loop dbms_scheduler.create_job('aj_test_seq'||i, job_type=>'PLSQL_BLOCK', job_action=> 'begin insert into tmp_seq select seq.nextval from dual connect by level < 10000000; end;', start_date=>sysdate, enabled=> true); end loop; end; / select run_duration from user_scheduler_job_run_Details where job_name like 'AJ_TEST_SEQ%' order by log_Date desc; --------------------------------------------------------- +00 00:01:21.000000 +00 00:01:21.000000 +00 00:01:20.000000 +00 00:01:20.000000 +00 00:01:18.000000 123456789101112131415161718192021222324252627282930313233343536373839404142434445 truncate table tmp_guid; begin for i in 1..5 loop dbms_scheduler.create_job('aj_test'||i, job_type=>'PLSQL_BLOCK', job_action=> 'begin insert into tmp_guid select sys_guid() from dual connect by level < 10000000; end;', start_date=>sysdate, enabled=> true); end loop;end;/ select run_duration from user_scheduler_job_run_Details where job_name like 'AJ_TEST%' order by log_Date desc;---------------------------------------------------------+00 00:00:07.000000+00 00:00:07.000000+00 00:00:07.000000+00 00:00:06.000000+00 00:00:06.000000 /*********************************************************/ drop sequence seq;create sequence seq cache 1000000; truncate table tmp_seq; begin for i in 1..5 loop dbms_scheduler.create_job('aj_test_seq'||i, job_type=>'PLSQL_BLOCK', job_action=> 'begin insert into tmp_seq select seq.nextval from dual connect by level < 10000000; end;', start_date=>sysdate, enabled=> true); end loop;end;/ select run_duration from user_scheduler_job_run_Details where job_name like 'AJ_TEST_SEQ%' order by log_Date desc;--------------------------------------------------------- +00 00:01:21.000000+00 00:01:21.000000+00 00:01:20.000000+00 00:01:20.000000+00 00:01:18.000000 as you see 5 concurrent sessions took at most 7 seconds to complete 10 million row insertion but for sequences it takes over 80 seconds. Also, as a reminder, I used 1 million cache size on sequence to increase the speed. That is all for now. I wish you all a happy new year. I hope the world found peace this year. Of course this is what everyone wishes every year and wishes are ideal but history is full of violence (this was from a movie if I am not wrong). Still, let’s hope the best. 11g 12c 18c 19c 21c 23ai Administration Development Performance guid performancesequence vs sys_guidsequnce vs guidsys_guid