diff --git a/Cloud-Cache-Benefits.md b/Cloud-Cache-Benefits.md
index 07ebfa5..c46f006 100644
--- a/Cloud-Cache-Benefits.md
+++ b/Cloud-Cache-Benefits.md
@@ -68,15 +68,18 @@ However, how to make SeaweedFS work with data already on cloud?
 * Users can also choose to never uncache, basically treating cloud copy as a backup.
 * Big Data
 * Problem
- * Run MapReduce, Spark, and Flink jobs on mounted folders for faster computation.
+ * Run MapReduce, Spark, and Flink jobs on cloud data is slow due to metadata operations.
+ * Repeated data access increases unnecessary cost.
+ * May need to work with the cloud ecosystem.
 * With SeaweedFS Cloud Cache
 * Avoiding slow cloud storage metadata access.
- * Large amount of data access will not increase cost.
+ * Access data only once.
 * Write back data to work with cloud ecosystems.
 * Cloud Storage Vendor Agnostic
 * Problem
- * Different datasets may need to be on different vendors, based on access pattern, latency, cost, etc.
- * Transparently switch to from one vendor to another.
+ * Different datasets may need to be on different vendors, based on access pattern, latency, cost, etc.
+ * With SeaweedFS Cloud Cache
+ * Transparently switch to from one vendor to another.
 * Move Off Cloud
 * Problem
 * Cloud storage is costly!