Updated Distributed Filer (markdown)

2024-01-19 02:48:24 +00:00 · 2018-06-07 21:02:24 -07:00 · 2018-06-07 21:02:24 -07:00 · 16212899e2
parent 8bfe42da4f
commit 16212899e2
1 changed files with 15 additions and 62 deletions
--- a/Distributed-Filer.md
+++ b/Distributed-Filer.md
@@ -1,17 +1,10 @@
-The default weed filer is in standalone mode, storing file metadata on disk.
+The default weed filer is in standalone mode, storing file metadata on local LevelDB.
 It is quite efficient at traversing deep directory paths and can handle
 millions of files.

 However, no SPOF is a must-have requirement for many projects.

-Luckily, SeaweedFS is so flexible that we can use a completely different way
-to manage file metadata.
-
-This distributed filer uses Redis or Cassandra to store the metadata.
-
-## Redis Setup
-
-No setup required.
+SeaweedFS can use an existing, familiar data store, e.g., Cassandra, MySQL, Postgres, or Redis, to store the filer metadata.

 ## Cassandra Setup

@@ -35,21 +28,24 @@ CREATE TABLE seaweed_files (
 );
 ```

-## Sample usage
+## Create a filer.toml

-To start a weed filer in distributed mode with Redis:
+Run ```weed filer -h``` to see an example filer.toml file. The file should be placed in one of the following folders: the current directory, $HOME/.seaweedfs/, or /etc/seaweedfs/.
+
+Here is the shortest example for Cassandra:

 ```bash
-# assuming you already started weed master and weed volume
-weed filer -redis.server=localhost:6379
+[cassandra]
+enabled = true
+keyspace="seaweedfs"
+hosts=[
+    "localhost:9042",
+]
 ```

-To start a weed filer in distributed mode with Cassandra:
+With the filer.toml file created, you can start ```weed filer```.

-```bash
-# assuming you already started weed master and weed volume
-weed filer -cassandra.server=localhost
-```
+## See it in action

 Now you can add/delete files

@@ -62,33 +58,7 @@ curl -F "filename=@Makefile" "http://localhost:8888/path/to/sources/new_name"
 curl "http://localhost:8888/path/to/sources/new_name"
 ```

-## Limitation
-
-Listing sub folders and files is not supported because Redis and Cassandra
-do not support prefix search.
-
-## Flat Namespace Design
-
-Instead of using both directory and file metadata, this implementation uses
-a flat namespace.
-
-If storing each directory metadata separately, there would be multiple
-network round trips to fetch directory information for deep directories,
-impeding system performance.
-
-A flat namespace would take more space because the parent directories are
-repeatedly stored. But disk space is a lesser concern especially for
-distributed systems.
-
-So either Redis or Cassandra is a simple file_full_path ~ file_id mapping.
-(Actually Cassandra is a file_full_path ~ list_of_file_ids mapping
-with the hope to support easy file appending for streaming files.)
-
-## Complexity
-
-For one file retrieval, the full_filename=>file_id lookup will be O(logN)
-using Redis or Cassandra. But very likely the one additional network hop would
-take longer than the actual lookup.
+Or you can visit ```http://localhost:8888/``` to see the files and click around.

 ## Deployment Notes

@@ -100,20 +70,3 @@ Replication is controlled by the client side. The filer's default replication is

 The same setting on master server would not take effect since filer will always use the specified or filer's default replication to write.

-## Use Cases
-
-Clients can access one "weed filer" via HTTP, create files via HTTP POST,
-read files via HTTP GET directly.
-
-## Future
-
-SeaweedFS can support other distributed databases. It will be better
-if that database can support prefix search, in order to list files
-under a directory.
-
-## Helps Wanted
-
-Please implement your preferred metadata store!
-
-Just follow the cassandra_store/cassandra_store.go file and send me a pull
-request. I will handle the rest.
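
As a companion to the minimal Cassandra filer.toml added in this change, here is a small sketch that renders the same file from Python and writes it into a chosen folder. The function names `render_cassandra_filer_toml` and `write_filer_toml` are hypothetical helpers for illustration only and are not part of SeaweedFS; the keys and default values are copied from the example in the diff.

```python
from pathlib import Path


def render_cassandra_filer_toml(keyspace="seaweedfs", hosts=("localhost:9042",)):
    """Return the text of a minimal filer.toml with the [cassandra] store enabled.

    Hypothetical helper: it only mirrors the keys shown in the wiki example
    (enabled, keyspace, hosts) and knows nothing about other filer options.
    """
    host_lines = "\n".join(f'    "{host}",' for host in hosts)
    return (
        "[cassandra]\n"
        "enabled = true\n"
        f'keyspace = "{keyspace}"\n'
        "hosts = [\n"
        f"{host_lines}\n"
        "]\n"
    )


def write_filer_toml(directory=".", **kwargs):
    """Write filer.toml into `directory` and return the path that was written."""
    path = Path(directory) / "filer.toml"
    path.write_text(render_cassandra_filer_toml(**kwargs))
    return path


if __name__ == "__main__":
    # Print the default configuration instead of touching the file system.
    print(render_cassandra_filer_toml())
```

Calling `write_filer_toml(".")` places the file in the current directory, which is one of the locations the page lists; after that, `weed filer` can be started as described above.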