mirror of https://github.com/seaweedfs/seaweedfs.git, synced 2024-01-19 02:48:24 +00:00
Created Cloud Tier (markdown)
parent
aa5069fd17
commit
013d3d89ad
Cloud-Tier.md: 51 lines, normal file
## Motivation

Cloud storage is an ideal place to back up warm data: it is scalable, and its cost is usually low compared to on-premise storage servers. Uploading to the cloud is usually free. However, cloud storage access is usually slow and not free.

SeaweedFS is fast. However, its capacity is limited by the number of available volume servers.

One good way to get the best of both is to combine SeaweedFS with cloud storage.

Assuming hot data is 20% and warm data is 80%, we can move the warm data to cloud storage. Access to the warm data will be slower, but this frees up 80% of the servers, or lets them be repurposed for fast local access instead of merely storing warm data that is rarely read. The integration is completely transparent to SeaweedFS users.

This transparent cloud integration gives SeaweedFS effectively unlimited capacity in addition to its speed. Just add more local SeaweedFS volume servers to increase throughput.

## Design

If one volume is tiered to the cloud:

* The volume is marked as read-only.
* The index file is still kept locally.
* The `.dat` file is moved to the cloud.
* The same O(1) disk read applies to the remote file: when a file entry is requested, a single range request retrieves the entry's content.

## Usage

1. Use `weed scaffold -conf=master` to generate `master.toml`, tweak its `[storage.backend]` section, and start the master server with that `master.toml`.
2. Use `volume.tier` in `weed shell` to move volumes to the cloud.

## Configuring Storage Backend

(Currently only `s3` is implemented. More backends are coming soon.)

```toml
[storage.backend]
[storage.backend.s3.default]
enabled = true
aws_access_key_id = ""     # if empty, loads from the shared credentials file (~/.aws/credentials)
aws_secret_access_key = "" # if empty, loads from the shared credentials file (~/.aws/credentials)
region = "us-west-1"
bucket = "one_bucket"      # an existing bucket
```

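The backend name after `s3.` is how `volume.tier -dest` addresses it (e.g. `s3.default`). As a hypothetical sketch, a second named backend could sit alongside the default one; the `archive` name, region, and bucket below are assumptions for illustration, not from the original document.

```toml
[storage.backend.s3.archive]  # hypothetical second backend, addressed as s3.archive
enabled = true
region = "us-east-1"          # assumed region
bucket = "my_archive_bucket"  # an existing bucket; the name is an assumption
```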
After this is configured, you can use these commands in `weed shell`:

```
# move the volume 37.dat file to the s3 cloud
volume.tier -dest=s3 -collection=benchmark -volumeId=37

# or address the backend by its full name
volume.tier -dest=s3.default -collection=benchmark -volumeId=37
```

## Data Layout

The `.dat` file on the cloud is laid out following cloud-storage best practices. In particular, the object name is a randomized UUID, so the `.dat` files are spread evenly across the storage backend's key space.

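Randomized naming can be sketched like this in Go. `randomObjectKey` is a hypothetical helper for illustration, not SeaweedFS's actual naming code; it generates a UUID-sized random hex prefix so object keys have no shared, sortable prefix.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomObjectKey builds a randomized object name for an uploaded .dat file.
// A random prefix spreads objects evenly across the bucket's key space,
// avoiding hot spots caused by sequential or shared-prefix names.
func randomObjectKey() string {
	b := make([]byte, 16) // 128 random bits, UUID-sized
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b) + ".dat"
}

func main() {
	// Each call yields a new 32-hex-character name ending in ".dat".
	fmt.Println(randomObjectKey())
}
```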