I have been dealing with table compression a lot during my development period and I still do. I wanted to share some information about table information but I believe this will be a long explanation so I will write them in parts and this is part1.
Table compression is a very useful feature of Enterprise Edition (unfortunately cannot be used in Standard Edition). Table compression will compress your data and it will reduce space requirement for the table. Not just on disk also in memory. When you compress a table, Oracle will store it compressed always so it will consume less memory space and I think this is very important for large databases with relatively small memory amounts.
Of course compression has a disadvantage and that’s why it is not default option for a table. as you can guess it has a “cpu overhead” which means your cpu will be doing some extra job about compressing and decompressing. Many times this overhead could be acceptable because you will have extra disk and memory space. Do not think this as just a space gain. while reading a data from disk (and loading into memory) you will read much more less data from disk. this will significantly increase your query performance because of less disk IO.
Basically there are 3 compression algorithm for tables:
- Advanced (if I am not wrong, it was called “FOR OLTP” compression at earlier relases)
- Hybrid Columnar (or Column Store)
there are important things about those compression methods. Most important one is ADVANCED compression requires “Advanced Compression License” so it will cost you a lot! you can find detail of advanced compression license detail from this link: https://www.oracle.com/technetwork/database/options/compression/advanced-compression-datasheet-134474.pdf
Other two, Basic and Hybrid compression does not require a license but Hybrid Columnar Compression (also called as HCC or Column Store compression) requires special storage systems. your storage must be support HCC.
you can compress a table with an Alter Table command or create table with a compression option. there are few important points. I’ve seen some clients that “mark” table as a compressed table but they don’t know that previous rows are not compressed! here is an example:
I created a dummy table and “mark” it as “compress” but size didn’t change and rows are not compressed! I know that because DBMS_Compression.get_compression_type returned 1 for a random row. 1 means no compression.
marking a table as “compress” means “consequent rows” will be compressed but not the existing ones. I used ROW STORE COMPRESS BASIC method above (or just “compress”) and only some specific operations can be compressed at that operation. This is the most basic compression but it has a HIGH compression ratio!
I just bulk inserted same rows (basically I doubled the number of rows) but table size increased only 4 MB not 12 MB.
So, if you want to compress existing rows, you must MOVE the table:
here it is, previous rows are also compressed. you can combine these operations with: ALTER TABLE TMP MOVE COMPRESS; it will compress currently existing rows. of course these operations will be invalidate indexes! you can use “update indexes” clause but it will increase compression time. I will talk about this at my later posts.
you can use different syntax:
Alter table tmp compress;
Alter table tmp row store compress;
Alter table tmp row store compress basic;
all commands above are the same. compress table with BASIC method. BASIC compression will compress new rows only if data is coming as BULK insert or an ARRAY insert. row by row inserts (insert into values (…) ) for example won’t be compressed. I mostly use this compression for previous months data which are not frequently accessed and when accessed, only queried not any DML operation because update also won’t be compressed.
Don’t let the naming mislead you. it is not more “advanced” than BASIC compression. actually it almost has the same compression level (HIGH) with basic compression but Advanced Table compression has some advantages. Most important one is it can be used for OLTP systems and by saying that can be used for active data not previous month data etc. performance overhead for currently DML operations are less than BASIC compression and it can even compress row by row inserts.
compression will not be done after a row processed (insert or update). when Block has reached to a threshold, Oracle will compress that block at the background. so your operations will run at fastest speed and even newly inserted / updated rows will be compressed. BUT it requires advanced compression license as I said before.
Hybrid Columnar Compression
this is the king of the compression’s. it will compress data much much much much more than first two and as you can guess it has a disadvantage. HCC is much more cpu intensive compression. it requires cpu usage far more than basic and advanced compression. Mostly used for less accessed rows. it has 2 methods which have 2 sub methods:
- Query Low (Lowest Compression Ratio, Lowest Cpu Usage)
- Query High
- Archive Low
- Archive High (Highest Compression Ratio, Highest Cpu Usage)
from 1 to 4, compression ratio increased but also cpu usage.
Basically, it uses a different logic in compression method. first two compression, compresses rows in the same blocks. writes same data once and uses a reference for actual places. Column store compression compresses columns not rows in a block and it is more logical because repetitive data mostly exists in same column, for example gender or customer id or product code etc.
So, at this post I wanted to share a real life example with compression ratios and speeds. of course I can not provide main data in here but I used a table with real data and copied it then compress the same table with different methods. Here is the results:
this small code, will compress table TMP with different methods and reports about timing and size. first one is NOCOMPRESS so we will see size of table without compression and then other methods results:
without compression this table’s size is 1408 MB. as you see BASIC and ADVANCED compression reduce the size to 280 MB but it takes 37-31 seconds to compress.
interestingly, Hybrid compression (column store) “query low” takes less than others, just 20 seconds and size reduced to 176 MB almost half of first two. You can also see that size is decreasing from “Query Low” to “Query High”. king is Query High, table size is 72 MB. 20x compression is very impressive but also as you can see, it takes 2 minutes to compress. Also, querying those compression methods will take significantly amount of time.
Compression is very useful if you use it accordingly. This is just a demonstration about compression level and basics of compression. I will mentioned about query performance of compression, CLOB compression, CLOB vs HCC compression etc at my later posts too.
that’s all for now. wish you all a health day.