Updated Erasure coding for warm storage (markdown)

Chris Lu 2019-06-06 00:59:13 -07:00
parent 43420f2eac
commit 3a427c8a8e

@ -19,14 +19,16 @@ The scripts has 3 steps.
The default command is `ec.encode -fullPercent=95 -quietFor=1h`. It finds volumes that are at least 95% of the maximum volume size (usually 30GB) and that have had no updates for 1 hour.
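The selection rule described above can be sketched as a simple filter. This is only an illustration, not the actual implementation: the `Volume` fields, the function name, and the use of `>=` for both thresholds are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    # Hypothetical fields, for illustration only.
    id: int
    size_bytes: int
    seconds_since_last_write: float


def ec_encode_candidates(volumes, max_volume_bytes,
                         full_percent=95.0, quiet_for_seconds=3600):
    """Return ids of volumes that are full enough and quiet long enough."""
    size_threshold = max_volume_bytes * full_percent / 100.0
    return [
        v.id
        for v in volumes
        if v.size_bytes >= size_threshold
        and v.seconds_since_last_write >= quiet_for_seconds
    ]


GIB = 1024 ** 3
volumes = [
    Volume(id=1, size_bytes=29 * GIB, seconds_since_last_write=7200),  # full and quiet
    Volume(id=2, size_bytes=20 * GIB, seconds_since_last_write=7200),  # not full enough
    Volume(id=3, size_bytes=30 * GIB, seconds_since_last_write=60),    # still being written
]
candidates = ec_encode_candidates(volumes, max_volume_bytes=30 * GIB)
```

Here only volume 1 passes both checks: it is above 95% of the 30GB limit and has been idle for more than an hour.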
### Data Repair
The default command is `ec.rebuild -force`. If a disk or server fails, some data shards are lost. With erasure coding, we can recover the lost data shards from the remaining data shards.
If a disk or server fails, some data shards are lost. With erasure coding, we can recover the lost data shards from the remaining data shards.
Data repair happens for the whole volume instead of one file at a time, which is much more efficient.
The default command is `ec.rebuild -force`.
Data repair happens for the whole volume instead of one small file at a time, so reconstructing the missing data shards is much more efficient and faster.
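To make the idea of rebuilding a lost shard concrete, here is a minimal sketch that uses a single XOR parity shard. This is a deliberate simplification and an assumption for illustration: production erasure codes such as Reed-Solomon can tolerate several lost shards, while this toy scheme tolerates only one. The function names are hypothetical.

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


def encode(data_shards):
    """Append one parity shard (the XOR of all data shards)."""
    return list(data_shards) + [xor_blocks(data_shards)]


def rebuild(shards):
    """Rebuild at most one missing shard (marked as None) from the rest."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can only repair one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


shards = encode([b"abcd", b"efgh", b"ijkl"])
damaged = list(shards)
damaged[1] = None  # simulate a failed disk holding shard 1
repaired = rebuild(damaged)
```

The repair works on whole shards, not on individual files inside them, which mirrors why a whole-volume rebuild is cheaper than repairing many small files one by one.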
### EC data balancing
With servers added or removed, some data shards may not be laid out optimally. For example, one volume's 5 data shards could be on the same server. If that server goes down, the volume could become unrepairable and part of its data lost permanently.
The default command is `ec.balance -force`. It tries to spread the data shards to minimize the risk of data loss.
The default command is `ec.balance -force`. It tries to spread the data shards evenly to minimize the risk of losing data shards.
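One way to picture "spreading evenly" is a greedy placement that always puts the next shard on the server currently holding the fewest shards of that volume. This is a sketch under stated assumptions, not the algorithm `ec.balance` actually uses; the function and server names are hypothetical.

```python
from collections import Counter


def spread_shards(shard_ids, servers):
    """Greedily place each shard on the server with the fewest shards so far."""
    load = {server: 0 for server in servers}
    placement = {}
    for shard in shard_ids:
        # min() returns the first server on ties, so the result is deterministic.
        target = min(servers, key=lambda s: load[s])
        placement[shard] = target
        load[target] += 1
    return placement


# Five shards of one volume, spread over three servers instead of sitting on one.
placement = spread_shards(range(5), ["server-a", "server-b", "server-c"])
per_server = Counter(placement.values())
```

With this layout no server holds more than two of the five shards, so losing any single server removes far fewer shards than in the all-on-one-server case.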
## How does a read work?
When all data shards are online, each read is randomly assigned to a volume server (A) that holds at least one data shard. Server A reads its copy of the index file, locates the volume server (B) that holds the file, and reads the file from server B.
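The read path above can be sketched as follows. The data structures are assumptions made for illustration: `index` stands in for the index file copy (file id to shard id, offset, and size), `shard_locations` maps each shard to the server holding it, and `shard_data` holds the shard bytes per server. The sketch also assumes a file lies entirely within one shard.

```python
import random


def read_file(file_id, index, shard_locations, shard_data, rng=random):
    """Return (server_a, server_b, file_bytes) for a healthy-cluster read."""
    # Server A: any server that holds at least one data shard, chosen at random.
    server_a = rng.choice(sorted(set(shard_locations.values())))
    # Server A consults its copy of the index to find where the file lives.
    shard_id, offset, size = index[file_id]
    # Server B: the server that holds the shard containing the file.
    server_b = shard_locations[shard_id]
    data = shard_data[server_b][shard_id][offset:offset + size]
    return server_a, server_b, data


index = {"file-1": (0, 0, 5), "file-2": (1, 6, 5)}
shard_locations = {0: "server-x", 1: "server-y"}
shard_data = {
    "server-x": {0: b"hello there"},
    "server-y": {1: b"hello world"},
}
server_a, server_b, data = read_file("file-2", index, shard_locations, shard_data)
```

Server A and server B can be the same machine when the randomly chosen server happens to hold the needed shard, in which case the second hop is a local read.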