RAID5 as an Error-correcting Code

五 13 九月 2024

As a researcher in the field of quantum error correction, I have cited it in the application section of my slides many times, that error correcting codes are everywhere, including memories like RAID and teleportation procotol like 5G and 6G. But it is only recently I learned how eaactly the error correcting codes are used in RAID5, when I planned to but a NAS storage for myself and study how it works.

As seen in this images for a NAS with 4 disks, all storages have been divided into strips. The strips names with numbers like A1,A2,B1,..., are normal data storages. The stipes named with index p are parity computed from stipes with same alphabet name. For example, \(A_p = A1+A2+A3 mod 2\). Equivalently, \(A1+A2+A3+A_p=0 mod 2\). With this extra parity, if any one of them is lost, it can be recovered from the parity of the other three.

From the view of error correcting codes, this protocal encodes three bits of (logical) information into four (physical) bits. The encoded messaged, \(m=[A1 A2 A3 A_p]\) as binary vector form, should satisfy the parity check condition \(Hm^T=0\), where \(H = [1 1 1 1]\). This error correcting code has parameters \([n,k,d]=[4,3,2]\), and it is indeed the dual code of repetition code. Normally a code with distance \(d=2\) could only detect single error but not correcting them. But in NAS storage, it is not an unknown erroneous bit, but actually an erasure channel. The issue is about a lost disk and we know which one it is. Distsance 2 is enough to recover single erasure.

For now we have seen how the error correcting code is used in RAID5, with only the A sectors. Then why there are B, C, Ds in the scheme? This results from balancing reading and writing speed. When writing new data, the processor need to finish the following: write data into one disk; read other two disks; compute the parity; write parity into the fourth disk. Ignore the delay of computing parity, it read two disks and write two disks. Normally, a disk can do read and write at the same time, and read is much faster than writing. Hence, to read peak performance, it is better to distribute reading and writing tasks evenly among all disks. That why we see strips B,C,D with parity strip in different location. In RAID5, the strip length is 64k (65536 * 8 = 524288 bits ).

RAID5 can recover a lost disk among any number of disks, just by extending the alphabet list of A,B,C,D,E,.... Then a question would be, how can one recover from two lost disks.

Say if A1 and B1 are lost, based on RAID5, one can recover A1+B1. In order to recover both of them, one should have A1 or B1 contained in an extra parity bit. To be more general, in order to recover any two disks, then any two disks should been computed from at least one different parity bit. To achieve this, one can add unique index \(i,j\) for each disk and assign index \(l\) for parity bits, then let parity \(P_l\) computing parity for all \(i=l\) and \(j=l\). This ensure each disk have parity info in two different parity bits.

P o o
o P o
o o P
# 2D array to recover from any two lost disks
# P for parity disk
# o for data disk

Let me explain the procotal again. We are actually distribute the disks in a 2D array. The disagnal terms are for parity computations, which compute all bits in its row and column.

This procotal is enough, but may not be the most efficient one. Since to recover x,y from x+y, one only need to know x or y, but not both x and y. I will leave this for future explorations.

Category: blog Tagged: raid5 QEC error-correcting-code Published by Weilei Zeng


LLM and GPT

一 06 五月 2024

When GPT3 comes out, I have no access since one have to pass hte GFW and pay for it. I tried out a few open source platforms, inlcuding

  • minGPT4
  • stable difussion web
  • nanoGPT

The model behind it is llama and vicuna

Category: blog Tagged: gpt Published by Weilei Zeng

Read More

Update my website again

日 05 五月 2024

I haven't update my website since 2021, during which I lost the domain "weileizeng.com" as well. I was looking for a Chinese service provider to host my site, but gitee pages have been down for a while, and Aliyun doesn't provide a good domain. Luckily the GitHub pages still …

Category: blog Tagged: website python Published by Weilei Zeng

Read More

CCF CQCC 2024 to be hosted at Changsha

六 04 五月 2024

CCF量子计算大会将于2024年8月3日-4日在长沙举行,主题聚焦“量子计算与计算机学科的交融共进”。大会由量子计算和计算机学科的13位院士和4位权威专家领衔组成指导委员会,邀请到量子计算机实现的判定依据 …

Category: news Tagged: CCF CQCC Changsha Published by manager

Read More

Publications of Weilei Zeng

六 04 五月 2024
[1] Weilei Zeng and Leonid P Pryadko. Higher-dimensional quantum hypergraph-product codes with finite rates. Physical Review Letters, 122(23):230501, 2019. [ DOI ]
[2] W. Zeng, A. Ashikhmin, M. Woolls, and L. P. Pryadko. Quantum convolutional data-syndrome codes. In 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications …

Category: about Published by ZWL

Read More

"In China, GitHub Is a Free Speech Zone for Covid Information"

五 01 一月 2021

WHEN THE CORONAVIRUS first spread through China in January, Chinese PhD student Weilei Zeng watched the pandemic unfold online from his apartment in Riverside, California. Thousands of miles from home, he frantically tried to keep up with news of the crisis, following the rare outpouring of discontent that flooded Chinese …

Category: alembic Published by Yi-Ling Liu

Read More
Page 1 of 4

Next »