Updated Erasure Coding for warm storage (markdown)

Chris Lu 2019-06-18 12:45:28 -07:00
parent 3c26f11946
commit 93e1bc51ff

@@ -14,6 +14,12 @@ However, data can become warm or cold after a period of time. They are accessed
* **Rack-Aware** data placement to minimize impact of volume server and rack failures.
* **Flexible Server Layout** There is no minimum number of servers or racks. SeaweedFS manages erasure-coded data via volumes. With fewer than 4 servers, EC can protect against hard drive failures. With 4 or more servers, EC can protect against server failures. With more than 4 racks, EC can protect against rack failures.
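
The server threshold above can be checked with a little arithmetic. The following is a minimal sketch, not SeaweedFS code: it assumes the 14 shards of the 10.4 scheme are spread as evenly as possible across servers and that any 4 shards can be lost. The function names are hypothetical.

```python
import math

TOTAL_SHARDS = 14     # 10 data + 4 parity shards per EC volume
MAX_LOST_SHARDS = 4   # Reed-Solomon 10.4 can rebuild from any 10 of the 14 shards


def max_shards_per_server(num_servers: int) -> int:
    """Largest shard count on any one server, assuming an even spread (an assumption)."""
    return math.ceil(TOTAL_SHARDS / num_servers)


def survives_one_server_failure(num_servers: int) -> bool:
    """True if losing the most heavily loaded server still leaves at least 10 shards."""
    return max_shards_per_server(num_servers) <= MAX_LOST_SHARDS


for n in (1, 3, 4, 7):
    print(n, max_shards_per_server(n), survives_one_server_failure(n))
```

With 3 servers, one server holds at least 5 shards, so losing it exceeds the 4-shard budget. With 4 servers, the worst case is 4 shards, which is still recoverable.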
The downsides:
* If some EC shards are missing, fetching data stored on those shards is slower.
* Reconstructing missing EC shards requires transmitting the whole volume's data.
* EC volumes do not currently support blob deletion. Blob deletion is a work in progress.
* Compaction requires transmitting the whole volume's data.
## Architecture
SeaweedFS implements 10.4 Reed-Solomon Erasure Coding (EC): 10 data shards plus 4 parity shards. Large volumes are split into chunks of 1 GB, and every 10 data chunks are encoded into 4 additional parity chunks. So a 30 GB data volume is encoded into 14 EC shards, each 3 GB in size and containing 3 EC blocks.
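
The 30 GB example can be worked through as a small calculation. This is an illustrative sketch only: `ec_layout` is a hypothetical helper, and it simplifies by assuming the volume is padded up to a whole number of 10 GB rows, which the text above does not specify.

```python
import math

DATA_SHARDS = 10     # data chunks per row
PARITY_SHARDS = 4    # parity chunks computed per row
BLOCK_SIZE_GB = 1    # size of one EC block


def ec_layout(volume_size_gb: float) -> dict:
    """Return the shard layout for a volume, assuming padding to whole rows."""
    row_size_gb = DATA_SHARDS * BLOCK_SIZE_GB
    rows = math.ceil(volume_size_gb / row_size_gb)  # one block per shard per row
    return {
        "total_shards": DATA_SHARDS + PARITY_SHARDS,
        "blocks_per_shard": rows,
        "shard_size_gb": rows * BLOCK_SIZE_GB,
        "storage_overhead": (DATA_SHARDS + PARITY_SHARDS) / DATA_SHARDS,
    }


print(ec_layout(30))
```

For a 30 GB volume this gives 14 shards of 3 GB each, with 3 blocks per shard. The total stored is 42 GB, a 1.4x overhead compared with the original data.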