From 366262d2ca4a32973a01d330b2bcc8a13f00cbdf Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Tue, 10 Aug 2021 00:46:35 -0700
Subject: [PATCH] Updated Remote Storage Architecture (markdown)

---
 Remote-Storage-Architecture.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Remote-Storage-Architecture.md b/Remote-Storage-Architecture.md
index 5ed067e..172c2b4 100644
--- a/Remote-Storage-Architecture.md
+++ b/Remote-Storage-Architecture.md
@@ -33,6 +33,8 @@ With this feature, SeaweedFS can cache data that is on cloud. It can cache metad
 [HDFS|Mount|HTTP|S3|WebDAV] ==> Filer(metadata cache) ==> Volume Servers (data cache) ==> `weed filer.remote.sync` ==> Cloud
 ```
 
+There are no format changes to the files on the cloud.
+
 ## Mount Remote Storage
 
 The remote storage, e.g., AWS S3, can be [[configured|Configure Remote Storage]] and [[mounted|Mount Remote Storage]] directly to an empty folder in SeaweedFS.
@@ -60,3 +62,9 @@ The cache is write back by the `weed filer.remote.sync` process.
 If not starting `weed filer.remote.sync`, the cache will be just read only. Changes are not forbidden, but any data changes will not be propagated back to the cloud.
 
 The asynchronous write back will not slow down any local operations.
+
+# Possible Use Cases
+
+* Machine learning training jobs need to repeatedly visit a large set of files. Increase training speed and reduce API cost and network cost.
+* Saving data files. With cloud capacity and storage tiering, saving data files there may be a good idea. The cache can save the programming effort.
+* Multiple access methods, HDFS/HTTP/S3/WebDav/Mount, to access remote storage. No need to use one specific way to access remote storage.