Updated Remote Storage Architecture (markdown)

Chris Lu 2021-08-10 00:46:35 -07:00
parent fdd8b0d86e
commit 366262d2ca

@ -33,6 +33,8 @@ With this feature, SeaweedFS can cache data that is on cloud. It can cache metad
[HDFS|Mount|HTTP|S3|WebDAV] ==> Filer(metadata cache) ==> Volume Servers (data cache) ==> `weed filer.remote.sync` ==> Cloud
```
There are no format changes to the files on the cloud.
## Mount Remote Storage
The remote storage, e.g., AWS S3, can be [[configured|Configure Remote Storage]] and [[mounted|Mount Remote Storage]] directly to an empty folder in SeaweedFS.
@ -60,3 +62,9 @@ The cache is write back by the `weed filer.remote.sync` process.
If `weed filer.remote.sync` is not running, the cache is effectively read-only with respect to the cloud: local changes are not forbidden, but they will not be propagated back to the cloud.
The write back is asynchronous, so it does not slow down local operations.
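
The write-back behavior described above can be illustrated with a small toy model. This is only a conceptual sketch, not SeaweedFS code: the `WriteBackCache` class, its method names, and the use of plain dictionaries for the local cluster and the cloud are all assumptions made for illustration. `start_sync` loosely plays the role of the `weed filer.remote.sync` process.

```python
import queue
import threading


class WriteBackCache:
    """Toy model of a write-back cache (hypothetical, not SeaweedFS code)."""

    def __init__(self, cloud):
        self.cloud = cloud             # dict standing in for the remote storage
        self.local = {}                # dict standing in for the local cluster
        self.pending = queue.Queue()   # changes waiting to be written back
        self._worker = None

    def read(self, path):
        # Serve from the local copy if present; otherwise fetch and cache it.
        if path not in self.local:
            self.local[path] = self.cloud[path]
        return self.local[path]

    def write(self, path, data):
        # Local writes return immediately; the upload is only queued here.
        self.local[path] = data
        self.pending.put((path, data))

    def start_sync(self):
        # Start the background write-back worker.
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        # Apply queued changes to the cloud one at a time, forever.
        while True:
            path, data = self.pending.get()
            self.cloud[path] = data
            self.pending.task_done()

    def flush(self):
        # Block until every queued change has reached the cloud.
        # This only returns if start_sync() has been called.
        self.pending.join()
```

In this model, a `write` made before `start_sync()` is visible locally but absent from `cloud`, which mirrors the case where the sync process is not running. Once the worker is started, queued changes reach the cloud without the writer ever waiting on the upload.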
# Possible Use Cases
* Machine learning training jobs that repeatedly read a large set of files. Caching the files locally increases training speed and reduces API and network costs.
* Saving data files. With cloud capacity and storage tiering, saving data files to the cloud may be a good idea. The cache saves the programming effort of talking to the cloud storage directly.
* Multiple access methods, HDFS/HTTP/S3/WebDAV/Mount, to the same remote storage. There is no need to commit to one specific way of accessing it.