Today, while I was talking with a friend, he mentioned using RAID 5 on disks. Since I am not experienced with RAID structures, I started reading about RAID 5 after the conversation. Of course I know what RAID is and basically how it works, but I don’t know the details of every RAID level. While looking at RAID 5, I came across the term “parity”.
Basically, when a data block A is written to the disks in a RAID 5 configuration with 4 disks, the data is divided into 3 pieces (a1, a2, a3) and each piece is written to a different disk. This improves your performance, of course, but on the last disk there is a piece called “parity”. (RAID 5 actually rotates which disk holds the parity from stripe to stripe, but for a single stripe you can think of it as the last disk.) What is that? For example, in ASM, if we are using our disk groups with NORMAL redundancy, every piece written to a disk is also written to another disk, so the data is protected against disk corruption or loss. A RAID 5 configuration also protects your data against the loss of a disk, but it does not copy every data piece (a1, a2, a3) to other disks. It just uses this “parity”.
At first I thought it was a combination of all the pieces (which means the whole data), but that would be ridiculous because you would lose all the performance gain, so I started digging. What I found is simple and clever (probably many of you already knew about it), and it was new to me, so I wanted to share.
Parity is the result of an XOR gate, and it produces brilliant results. Let’s assume our data pieces are simple binary values: a1 = 1001, a2 = 0111 and a3 = 1011.
As a short explanation of the XOR gate: it produces 1 if the inputs are different and 0 if they are the same. So:
1 XOR 1 = 0
0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
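If you want to check the table yourself, here is a minimal sketch in Python, where the `^` operator is bitwise XOR. The function name `xor_truth_table` is just something I made up for this illustration.

```python
# XOR truth table using Python's bitwise ^ operator.
# xor_truth_table is a hypothetical helper name used only for this example.
def xor_truth_table():
    # Each entry is (first input, second input, XOR result).
    return [(a, b, a ^ b) for a in (1, 0) for b in (1, 0)]

for a, b, result in xor_truth_table():
    print(f"{a} XOR {b} = {result}")
```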
So the XOR of a1 and a2 is 1001 XOR 0111 = 1110.
Parity is the combination of the three pieces (in my example) with an XOR gate:
a1 a2 a3
1001 XOR 0111 XOR 1011
So the order of execution is (a1 XOR a2) XOR a3. The result of this operation is 0101 (you can use an online XOR calculator like https://toolslick.com/math/bitwise/xor-calculator). Remember that leading zeros may be suppressed, so if you see 101 as the result, it is the same as 0101. The result of the two XOR operations, 0101, is our “parity”. This value is written to the last disk, but how does it protect us from data loss? This is the part where XOR shows its power. Let’s say we lost disk 2 (a2 is missing). All we have is a1, a3 and the parity, which are:
a1 a3 parity
1001 1011 0101
So let’s put them into another XOR gate:
1001 XOR 1011 XOR 0101 = 0111 (a2)
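The same arithmetic can be sketched in a few lines of Python instead of an online calculator. This is only an illustration with the 4-bit values from my example. The helper name `show` is my own, and it pads the output so the leading zeros are not suppressed.

```python
# The three data pieces from the example, written as binary literals.
a1, a2, a3 = 0b1001, 0b0111, 0b1011

def show(value, width=4):
    # Format as binary, zero-padded so leading zeros stay visible.
    return format(value, f"0{width}b")

# Parity is (a1 XOR a2) XOR a3.
parity = (a1 ^ a2) ^ a3

# Pretend disk 2 is lost: rebuild a2 from the survivors and the parity.
recovered_a2 = a1 ^ a3 ^ parity

print(show(parity))        # 0101
print(show(recovered_a2))  # 0111
```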
We have our a2 piece back. This works for any one missing piece, because XORing a value with itself gives 0: a1 and a3 cancel out of the parity and only a2 is left. It is really efficient, but of course real data blocks are much bigger and have many more bits to pass through the XOR gate. This will decrease your performance a little bit, but it will typically still be faster than just one disk.
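To get a feeling for how this looks with bigger blocks, here is a rough sketch that XORs whole byte blocks instead of 4-bit numbers. It is not how a real RAID controller is implemented. The function name `xor_blocks` and the sample block contents are assumptions of mine for the demo, and I assume all blocks in a stripe have the same size.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Three made-up data blocks standing in for a1, a2 and a3.
stripe = [b"RAID", b"five", b"demo"]
parity = xor_blocks(stripe)

# Lose each block in turn and rebuild it from the survivors plus parity.
for missing in range(len(stripe)):
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    rebuilt = xor_blocks(survivors + [parity])
    assert rebuilt == stripe[missing]
```

Whichever single block you drop, XORing the remaining blocks with the parity gives it back. If two blocks are lost at once, there is not enough information left to rebuild them.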