From e7824b1758b9781d92985d3cc3b2c42823a2e576 Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Wed, 20 Oct 2021 11:20:47 -0700
Subject: [PATCH] Updated Words from SeaweedFS Users (markdown)

---
 Words-from-SeaweedFS-Users.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Words-from-SeaweedFS-Users.md b/Words-from-SeaweedFS-Users.md
index 0c8f8ec..a5b3d42 100644
--- a/Words-from-SeaweedFS-Users.md
+++ b/Words-from-SeaweedFS-Users.md
@@ -6,7 +6,7 @@
 | we use seaweedfs embedded in our AI products that are deployed on client site (usually AirGapped because of the sensitivity of the data)| clusters ranging from 3-10 servers (and now starting to get bigger and bigger), usually retaining 7-14 days video and 30-60 days of thumbnails | we comared CEPH & Minio, we checked deployment procedure & maintenance and especially performance of writes and especially single server performance and easy scale out. we went and found that seaweedfs always won. we mainly write intensive and rarely read (usually reading as soon as write, so no real disk access) and 95% of the data is not missing critical, so the easiness of seaweedfs and the amazing performance (all writes are sequential as possible) |
 | [Holding lots of files](https://hypixel.net/threads/dev-blog-5-storing-your-skyblock-island.2190753/) | We've had to develop our own backup script and monitoring, interfacing with SeaweedFS. Backups of the whole dataset are done twice a day and stored in S3 for a few weeks. We run SeaweedFS across 3 volume servers which all use very low resources, always replicating volumes on the 3 servers for availability and peace of mind. The Seaweed FID are stored in Mongo. | It is basically Amazon S3, but self-hosted.|
 | [InternetArchive compares SeaweedFS with Minio](https://github.com/internetarchive/sandcrawler/blob/master/proposals/2020_seaweed_s3.md) | SeaweedFS 200M object upload via Python script sucessfully in about 6 days, memory usage was at a moderate 400M (~10% of RAM). Relatively constant performance at about 400 PutObject requests/s (over 5 threads, each thread was around 80 requests/s; then testing with 4 threads, each thread got to around 100 requests/s) | Problem: minio inserts slowed down after inserting 80M or more objects. |
-| Key-Value Store | Internet Archive built scholar citation graph with SeaweedFS as a key-value store accessible with an S3 API. | we use it as component in our infrastructure for https://scholar.archive.org/ (serving e.g. thumbnails, cached compute results, etc; we've just recently published a tech report on a sub-project: https://arxiv.org/pdf/2110.06595.pdf |
+| Key-Value Store | Internet Archive built scholar citation graph with SeaweedFS as a key-value store accessible with an S3 API. [Crawler Readme](https://github.com/internetarchive/sandcrawler/blob/master/blobs/README.md) | we use it as component in our infrastructure for https://scholar.archive.org/ (serving e.g. thumbnails, cached compute results, etc; we've just recently published a tech report on a sub-project: https://arxiv.org/pdf/2110.06595.pdf. minio was used initially, but did not scale well in number of files. |
 | Store images | Evercam has used Seaweed for a few years. We've 1344TB of mostly jpegs and use the filer for folder structure. It's worked well for us, especially with low cost Hetzner SX boxes. | Question: What about your recovery times when a server fails on 1Gbs port? Answer: Also in almost 5 years we only had one server crash which was due to file-system corruption, and we overcome that as well, it was a few leveldb files which got corrupt due to which the whole XFS file-system was went down, but we recovered it. Just one drawback was: We never used the same filer for saving files, and Get speed was also quite slow on that one, but with time, the volume compaction and vacuum, everything works fine on GET requests. |
 | We've been running SeaweedFS in production serving images and other small files. | We're not using Filer functionality just the underlying volume storage. We wrote our own asynchronous replication on top of the volume servers since we couldn't rely on synchronous replication across datacenters. | The maintainer is super responsive and is quick to review our PRs. |
 | It is archiving and serving more than 40,000 images on a webapp I built for the small team I work with. | I am not a large user whatsoever but I've been using SeaweedFS for a few years now. I run SeaweedFS on two machines and it serves all images I host. | It has been simple, reliable, and robust. I really like it and hope if one of my side projects ever take off at some point, I get to test it with a much bigger load.|