mirror of https://github.com/seaweedfs/seaweedfs.git, synced 2024-01-19 02:48:24 +00:00
Created Cloud Tier (markdown)
parent
aa5069fd17
commit
013d3d89ad
Cloud-Tier.md: 51 lines, normal file
## Motivation

Cloud storage is an ideal place to back up warm data: it is scalable, and its cost is usually low compared to on-premise storage servers. Uploading to the cloud is usually free. However, cloud storage access is usually slow and not free.

SeaweedFS is fast. However, its capacity is limited by the number of available volume servers.

One good way to get the best of both is to combine SeaweedFS with cloud storage.

Assuming hot data is 20% and warm data is 80%, we can move the warm data to cloud storage. Access to the warm data will be slower, but this frees up 80% of the servers, or lets them be repurposed for fast local access instead of merely storing warm data that is rarely read. The integration is completely transparent to SeaweedFS users.

This transparent cloud integration gives SeaweedFS effectively unlimited capacity in addition to its speed. Just add more local SeaweedFS volume servers to increase throughput.

## Design

If one volume is tiered to the cloud:

* The volume is marked as read-only.
* The index file is still kept locally.
* The `.dat` file is moved to the cloud.
* The same O(1) disk read applies to the remote file: when a file entry is requested, a single range request retrieves the entry's content.

## Usage

1. Use `weed scaffold -conf=master` to generate `master.toml`, tweak its `[storage.backend]` section, and start the master server with that `master.toml`.
2. Use `volume.tier` in `weed shell` to move volumes to the cloud.

## Configuring Storage Backend

(Currently only `s3` is implemented. More backends are coming soon.)

```toml
[storage.backend]
[storage.backend.s3.default]
enabled = true
aws_access_key_id = ""     # if empty, loads from the shared credentials file (~/.aws/credentials)
aws_secret_access_key = "" # if empty, loads from the shared credentials file (~/.aws/credentials)
region = "us-west-1"
bucket = "one_bucket"      # an existing bucket
```

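The backend name after `s3.` is how `volume.tier -dest` addresses it (e.g. `s3.default`). As a hypothetical sketch, a second named backend could sit alongside the default one; the `archive` name, region, and bucket below are assumptions for illustration, not from the original document.

```toml
[storage.backend.s3.archive]  # hypothetical second backend, addressed as s3.archive
enabled = true
region = "us-east-1"          # assumed region
bucket = "my_archive_bucket"  # an existing bucket; the name is an assumption
```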
After this is configured, you can use these commands in `weed shell`:

```
# move the volume 37.dat file to the s3 cloud
volume.tier -dest=s3 -collection=benchmark -volumeId=37

# or address the backend by its full name
volume.tier -dest=s3.default -collection=benchmark -volumeId=37
```

## Data Layout

The `.dat` file on the cloud is laid out following cloud-storage best practices. In particular, the object name is a randomized UUID, so the `.dat` files are spread evenly across the storage backend's key space.

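Randomized naming can be sketched like this in Go. `randomObjectKey` is a hypothetical helper for illustration, not SeaweedFS's actual naming code; it generates a UUID-sized random hex prefix so object keys have no shared, sortable prefix.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomObjectKey builds a randomized object name for an uploaded .dat file.
// A random prefix spreads objects evenly across the bucket's key space,
// avoiding hot spots caused by sequential or shared-prefix names.
func randomObjectKey() string {
	b := make([]byte, 16) // 128 random bits, UUID-sized
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b) + ".dat"
}

func main() {
	// Each call yields a new 32-hex-character name ending in ".dat".
	fmt.Println(randomObjectKey())
}
```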