Updated Erasure Coding for warm storage (markdown)

Chris Lu 2019-06-10 23:59:57 -07:00
parent 1dd4e96b01
commit 713221908f

@@ -1,6 +1,6 @@
Data is usually hot when it is fresh, and it is accessed very often. SeaweedFS normal volumes try hard to minimize disk operations, but this comes at the cost of loading indexes into memory.
However, data can become warm or cold after a period of time. They are accessed much less often. The high cost of memory is not cost-efficient for warm storage. To store them more efficiently, you can "seal" the data and enable erasure coding.
However, data can become warm or cold after a period of time, when it is accessed much less often. The high cost of memory is not cost-efficient for warm storage. To store such data more efficiently, you can "seal" the data and enable erasure coding (EC).
## Benefit
* **Storage Efficiency**: SeaweedFS implements RS(10,4), which tolerates the loss of any 4 shards while storing 1.4x the data size. Compared to keeping 5 replicas of the data to achieve the same robustness, it saves disk space equal to 3.6x the data size.
@@ -10,7 +10,16 @@ However, data can become warm or cold after a period of time. They are accessed
* **Memory Efficiency** Minimum memory usage. The volume server does not load index data into memory.
* **Fast Startup** Startup time is much shorter because index data is not loaded into memory.
* **Rack-Aware** data placement to minimize impact of volume server and rack failures.
* **No Minimum Server Limit** No requirement for large amount of servers. SeaweedFS manage erasure coding data via volumes. If the number of servers is less than 4, this can protect against hard drive failures. If the number of servers is greater than 4, this can protect against server failures. If the number of racks is greater than 4, this can protect against rack failures.
* **Flexible Server Layout** There is no minimum number of servers or racks. SeaweedFS manages erasure coding data via volumes. If the number of servers is less than 4, EC can protect against hard drive failures. If the number of servers is greater than or equal to 4, EC can protect against server failures. If the number of racks is greater than 4, EC can protect against rack failures.
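The arithmetic behind the Storage Efficiency bullet can be checked with a small sketch. The function names here are hypothetical and exist only for illustration; they are not part of SeaweedFS.

```python
def ec_overhead(data_shards: int, parity_shards: int) -> float:
    """Stored bytes per byte of user data for an RS(data, parity) code."""
    return (data_shards + parity_shards) / data_shards


def replication_overhead(tolerated_losses: int) -> int:
    """Copies needed so that losing `tolerated_losses` copies still leaves one."""
    return tolerated_losses + 1


# RS(10,4): 14 shards are stored for every 10 shards of data.
ec = ec_overhead(10, 4)
# Surviving the loss of 4 copies by plain replication needs 5 copies.
rep = replication_overhead(4)
# Disk space saved, as a multiple of the original data size.
saved = rep - ec

print(f"EC overhead: {ec:.1f}x, replication: {rep}x, saved: {saved:.1f}x")
```

For RS(10,4) this gives 1.4x for erasure coding against 5x for replication, a difference of 3.6x the data size.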
## Architecture
SeaweedFS implements RS(10,4) Reed-Solomon Erasure Coding (EC). Large volumes are split into chunks of 1GB, and every 10 data chunks are encoded into 4 parity chunks. So a 30 GB data volume will be encoded into 14 EC shards, where each shard is 3 GB in size and holds 3 EC blocks.
Since the data is split into 1GB chunks, a small file is usually contained in one shard, or possibly two shards in edge cases. So most reads still cost only O(1) disk reads.
For volumes smaller than 10GB, and for edge cases, the volume is split into smaller 1MB chunks.
The 14 EC shards should be spread across disks, volume servers, and racks as evenly as possible, to protect against data loss caused by hardware failures.
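To make the chunk layout concrete, here is a minimal sketch of how a byte offset in the original volume could map to a data shard and an offset inside that shard. It assumes that chunks are assigned to the 10 data shards round-robin, and that any tail that does not fill a full row of ten 1GB chunks is stored in rows of 1MB chunks after the large blocks. The `locate` function and the constant names are hypothetical, and the actual SeaweedFS implementation may differ.

```python
LARGE_BLOCK = 1024 ** 3  # 1GB chunk, as described above
SMALL_BLOCK = 1024 ** 2  # 1MB chunk for small volumes and the tail (assumed layout)
DATA_SHARDS = 10


def locate(offset: int, volume_size: int) -> tuple[int, int]:
    """Return (data_shard_id, offset_in_shard) for a byte offset in the volume."""
    large_row = LARGE_BLOCK * DATA_SHARDS
    n_large_rows = volume_size // large_row  # full rows of ten 1GB chunks
    large_area = n_large_rows * large_row

    if offset < large_area:
        # The offset falls inside a 1GB chunk.
        row, in_row = divmod(offset, large_row)
        shard, in_block = divmod(in_row, LARGE_BLOCK)
        return shard, row * LARGE_BLOCK + in_block

    # The offset falls in the tail, which is laid out in 1MB chunks.
    small_row = SMALL_BLOCK * DATA_SHARDS
    row, in_row = divmod(offset - large_area, small_row)
    shard, in_block = divmod(in_row, SMALL_BLOCK)
    return shard, n_large_rows * LARGE_BLOCK + row * SMALL_BLOCK + in_block


GB = 1024 ** 3
# In a 30GB volume, the byte at 25GB sits in the third row (index 2), sixth chunk (shard 5).
print(locate(25 * GB, 30 * GB))
```

Under these assumptions a read only needs to touch a second shard when the requested range crosses a chunk boundary, which matches the "one shard, or possibly two" behavior described above.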
## How to enable it?
Run `weed scaffold -conf=master` to generate a `master.toml` file, and put it in the current directory, `~/.seaweedfs/`, or `/etc/seaweedfs/`.