Updated Erasure Coding for warm storage (markdown)

Chris Lu 2019-06-10 21:46:27 -07:00
parent 12ba22aa79
commit aae5e9c854

@@ -22,17 +22,17 @@ The scripts have 3 steps related to erasure coding.
### Erasure Encode Sealed Data
The `ec.encode` command will find volumes that are almost full and have been stale for a period of time.
The default command is `ec.encode -fullPercent=95 -quietFor=1h`. It will find volumes that are at least 95% of the maximum volume size, which is usually 30GB, and have had no updates for 1 hour.
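For example, assuming the commands are issued from `weed shell`, you could keep the default for most data but encode a test collection sooner; both invocations below use only flags already shown on this page:
```
> ec.encode -fullPercent=95 -quietFor=1h
> ec.encode -collection t -quietFor 1s
```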
### Data Repair
If disks fail or servers fail, some data shards are lost. With erasure coding, we can recover the lost data shards from the remaining data shards.
The default command is `ec.rebuild -force`.
The data repair happens for the whole volume, instead of one small file at a time. Reconstructing the missing data shards this way is much faster and more efficient than processing each file individually.
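Under the hood this is plain Reed-Solomon recovery: each volume is split into 10 data shards plus 4 parity shards, so any 10 surviving shards can rebuild the missing ones, and losing more than 4 shards (for example the 5-shards-on-one-server case in the next section) makes the volume unrepairable. Below is a minimal, illustrative sketch using the `github.com/klauspost/reedsolomon` Go library that SeaweedFS builds on; the shard counts match, but the sample data and failure pattern are made up.

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/klauspost/reedsolomon"
)

func main() {
	// 10 data shards + 4 parity shards, the layout described on this page.
	enc, err := reedsolomon.New(10, 4)
	if err != nil {
		log.Fatal(err)
	}

	// Stand-in for one sealed volume's content.
	volume := bytes.Repeat([]byte("seaweedfs "), 1000)

	// Split into 10 data shards, then compute the 4 parity shards.
	shards, err := enc.Split(volume)
	if err != nil {
		log.Fatal(err)
	}
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Simulate a failed disk or server: up to 4 shards may disappear.
	shards[0], shards[5], shards[11] = nil, nil, nil

	// Any 10 surviving shards are enough to rebuild the missing ones.
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}
	ok, err := enc.Verify(shards)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("all 14 shards consistent after rebuild:", ok)
}
```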
### EC data balancing
With servers added or removed, some data shards may not be laid out optimally. For example, 5 of one volume's data shards could be on the same server. If that server goes down, the volume would be unrepairable, or part of the data would be lost permanently.
The default command is `ec.balance -force`. It will try to spread the data shards evenly to minimize the risk of losing data shards.
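These three steps are typically chained into a single periodic maintenance script. A minimal sketch, assuming `weed shell` reads newline-separated commands from stdin and is pointed at your master (adjust with `-master` if it is not local):
```
printf 'ec.encode -fullPercent=95 -quietFor=1h\nec.rebuild -force\nec.balance -force\n' | weed shell
```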
@@ -42,11 +42,11 @@ When all data shards are online, the read for one file key are assigned to one v
For example, one read request for 3,0144448888 will:
1. Ask the master server to locate the EC shards for volume 3. The result is usually a list of volume servers.
2. The read client can randomly pick one of the returned volume servers, A.
3. Server A will read its local index file and find the volume server B that has the file content. Sometimes it may have to contact additional servers if the file is split across multiple blocks, but since a data shard usually uses a 1GB block size, this does not happen often.
In normal operations, there is usually one extra network hop compared to normal volume reads.
In case of missing data shards or read failures from server B, server A will try to collect as many pieces of data as possible from the remaining servers, and reconstruct the requested data.
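Putting the steps together, the read path looks roughly like the hypothetical Go sketch below. None of the names (`Cluster`, `LookupEcShards`, `LookupNeedle`, `ReadShardRange`, `ReadEcFile`) are real SeaweedFS APIs; they only mirror the numbered steps above.

```go
// Hypothetical sketch only: none of these names are real SeaweedFS APIs,
// they just mirror the numbered steps above.
package ecread

import "errors"

type Cluster interface {
	// Step 1: ask the master which volume servers hold volume vid's EC shards.
	LookupEcShards(vid uint32) (servers []string, err error)
	// Step 3a: server A checks its local index file to learn which server B
	// holds the needle's bytes, and at what offset/size inside B's data shard.
	LookupNeedle(serverA string, vid uint32, needleID uint64) (serverB string, offset, size int64, err error)
	// Step 3b: read that byte range from server B.
	ReadShardRange(serverB string, vid uint32, offset, size int64) ([]byte, error)
}

// ReadEcFile reads one file key, e.g. key 0144448888 in volume 3.
func ReadEcFile(c Cluster, vid uint32, needleID uint64) ([]byte, error) {
	servers, err := c.LookupEcShards(vid)
	if err != nil || len(servers) == 0 {
		return nil, errors.New("no EC shard locations for volume")
	}
	serverA := servers[0] // Step 2: pick any one of the returned servers.
	serverB, offset, size, err := c.LookupNeedle(serverA, vid, needleID)
	if err != nil {
		return nil, err
	}
	return c.ReadShardRange(serverB, vid, offset, size)
}
```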
## Read Performance
@@ -86,7 +86,7 @@ Then I force to erasure encode the volumes by `ec.encode -collection t -quietFor
Here is the normal EC read performance by `weed benchmark -master localhost:9334 -n 102040 -collection=t -write=false`.
You may need to run it twice because of some one-time read for the volume version. The EC read performance is about half of the normal volume read performance, because of the extra network hop.
```
------------ Randomly Reading Benchmark ----------