From 34941cec8017917ee52c70600c60ef0c6a65e037 Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Sat, 2 Jun 2018 00:28:48 -0700
Subject: [PATCH] update for filer2

---
 .gitignore                       |  2 +
 Change-List.md                   | 21 ++++++++-
 Directories-and-Files.md         | 73 +++++++++-----------------------
 Filer-Commands-and-Operations.md | 19 ++-------
 Filer.md                         | 69 ++++++++++++++++++++++--------
 5 files changed, 99 insertions(+), 85 deletions(-)
 create mode 100644 .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..432d52d
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,2 @@
+
+seaweedfs.wiki.iml
diff --git a/Change-List.md b/Change-List.md
index bf80419..05ad3e2 100644
--- a/Change-List.md
+++ b/Change-List.md
@@ -2,6 +2,25 @@
 
 This file contains list of recent changes, important features, usage changes, data format changes, etc. Do read this if you upgrade.
 
+## v0.90
+The changes are mostly for a new filer implementation.
+1. much simpler to implement for different storage
+2. added with more meta data on files and directories
+3. added directory listing in both Cassandra and Redis
+4. “weed mount” can work both read and write, with local caching, chunking
+5. improved file listing web UI
+
+Filer breaking changes:
+1. the filer storage will not be compatible with old filer store.
+2. configuration is moved to a "filer.toml" file under current directory, or "$HOME/.seaweedfs/", or "/etc/seaweedfs/" folder.
+
+Other:
+1. "weed filer" and "weed volume" need to specify a complete list of masters in multi-master setup.
+2. fix bugs copying 0 byte files.
+
+This is a breaking change and content in old filer is not compatible. A separate "filer1_maintenance_branch" is created.
+Migrating data is possible, but difficult since exporting old data is different for different storage options.
+
 ## v0.76
 1. Support btree mode, in addition to in-memory/leveldb/boltdb modes, for less memory with customized id.
@@ -252,4 +271,4 @@ weed volume -dir="/tmp" -volumes=0-4 -mserver="localhost:9333" -port=8080 -publi
 
 And more new commands, in addition to "server","volume","fix", etc, will be added.
 
-This provides a simple deliverable file, and the file size is much smaller since Go language statically compile the commands. Combining commands into one file would avoid lots of duplication.
\ No newline at end of file
+This provides a simple deliverable file, and the file size is much smaller since Go language statically compile the commands. Combining commands into one file would avoid lots of duplication.
diff --git a/Directories-and-Files.md b/Directories-and-Files.md
index 4852b51..953d842 100644
--- a/Directories-and-Files.md
+++ b/Directories-and-Files.md
@@ -28,10 +28,10 @@ curl -F "filename=@Makefile" "http://localhost:8888/path/to/sources/new_name"
 curl "http://localhost:8888/path/to/sources/new_name"
 
 # list sub folders and files
-curl "http://localhost:8888/path/to/sources/?pretty=y"
+visit "http://localhost:8888/path/to/sources/"
 
 # if lots of files under this folder, here is a way to efficiently paginate through all of them
-curl "http://localhost:8888/path/to/sources/?lastFileName=abc.txt&limit=50&pretty=y"
+visit "http://localhost:8888/path/to/sources/?lastFileName=abc.txt&limit=50"
 ```
 
 ### Design
@@ -42,59 +42,30 @@ SeaweedFS wants to make as small number of disk access as possible, yet still be
 
 We can take the following steps to map a full file path to the actual data block:
 
-1. file_parent_directory => directory_id
-2. directory_id+fileName => file_id
-3. file_id => data_block
-
-Because default SeaweedFS only provides file_id=>data_block mapping, only the first 2 steps need to be implemented.
-
-There are several data features I noticed:
-
-1. the number of directories usually is small, or very small
-2. the number of files can be small, medium, large, or very large
-
-This leads to a novel (as far as I know now) approach to organize the meta data for the directories and files separately.
-
-A "weed filer" server is to provide these two missing parent_directory=>directory_id, and directory_id+filename=>file_id mappings, completing the "common" file storage interface.
-
-#### Assumptions
-
-I believe these are reasonable assumptions:
-
-1. The number of directories are smaller than the number of files by one or more magnitudes.
-2. Very likely for big systems, the number of files under one particular directory can be very high, ideally unlimited, far exceeding the number of directories.
-3. Directory meta data is accessed very often.
-
-#### Data Structure
-
-This assumed differences between directories and files lead to the design that the metadata for directories and files should have different data structure.
-
-1. Store directories in memory
-    1. all of directories hopefully all be in memory
-    2. efficient to move/rename/list_directories
-2. Store files in a sorted string table in format
-    1. efficient to list_files, just simple iterator
-    2. efficient to locate files, binary search
+1. (file_parent_directory, fileName) => meta data of attributes and a list of file_id. Implemented by "weed filer" server.
+2. file_id => data_block. Implemented by default SeaweedFS master and volume servers.
 
 #### Complexity
 
-For one file retrieval, if the parent directory includes n folders, then it will take n steps to navigate from root to the file folder. However, this O(n) step is all in memory. So in practice, it will be very fast.
+For one file retrieval, the (file_parent_directory, fileName)=>meta data lookup will be O(logN) for LSM tree or Btree implementations, where N is number of existing entries, or O(1) for Redis.
 
-For one file retrieval, the dir_id+filename=>file_id lookup will be O(logN) using LevelDB, a log-structured-merge (LSM) tree implementation. The complexity is the same as B-Tree.
+For file listing under a particular directory, the listing is just a simple scan for LSM tree or Btree, or O(1) for Redis.
 
-For file listing under a particular directory, the listing in LevelDB is just a simple scan, since the record in LevelDB is already sorted. For B-Tree, this may involves multiple disk seeks to jump through.
+For adding one file, the parent directories will be recursively created if they do not exist. And then the file entry will be created.
 
-For directory renaming, it's just trivially change the name or parent of the directory. Since the directory_id stays the same, there are no change to files metadata.
+### Comparing Storage Options
 
-For file renaming, it's just trivially delete and then add a row in leveldb.
+Here is a comparison of different filer store options.
 
-### Details
+1. "memory" : only for testing/example purpose.
+2. "leveldb": simple, single machine, fast, scalable, but no failover.
+3. "mysql"/"postgres": robust and common, fast enough for most cases, scalable.
+4. "cassandra": robust and common, fast, scalable.
+5. "redis": very fast, scalable with clustering, need to enable persistent storage, file listing is limited as one directory's sub file names are stored in one set of a key.
 
-In the current first version, the path_to_file=>file_id mapping is stored with an efficient embedded leveldb. Being embedded, it runs on single machine. So it's not linearly scalable yet. However, it can handle LOTS AND LOTS of files on SeaweedFS on other master/volume servers.
+### Extending Storage Options
 
-Switching from the embedded leveldb to an external distributed database is very feasible. Your contribution is welcome!
-
-The in-memory directory structure can improve on memory efficiency. Current simple map in memory works when the number of directories is less than 1 million, which will use about 500MB memory. But I would expect common use case would have a few, not even more than 100 directories.
+For any new storage option, please check the FilerStore interface. It should be fairly straightforward to implement. Welcome to contribute back.
 
 ### Use Cases
 
@@ -111,14 +82,12 @@ This uses bazil.org/fuse, which enables writing FUSE file systems on Linux and O
 weed mount -filer=localhost:8888 -dir=/some/existing/dir
 ```
 
-Now you can browse/delete directories and files, and read file as in local file system. For efficiency, only no more than 100 sub directories and files under the same directory will be listed. To unmount, just shut it down.
+Now you can browse/delete directories and files, and read and write files as in a local file system. For efficiency, no more than 1000 sub directories and files under the same directory will be listed. To unmount, just shut it down.
 
-### Future
+### Filer Scalability
 
-Later, FUSE or HCFS plugins will be created, to really integrate SeaweedFS to existing systems.
+Filer has two use cases.
 
-### Helps Wanted
+When filer is used directly to upload and download files, in addition to file meta data, the filer also needs to process the file content during read and write. So it's a good idea to add multiple filer servers, with an nginx server in front of them to load balance the requests.
 
-This is a big step towards more interesting SeaweedFS usage and integration with existing systems.
-
-Help on FUSE is needed.
\ No newline at end of file
+When filer is used to support "weed mount", the filer only provides file meta data retrieval. The actual file content is read and written directly between "weed mount" and "weed volume" servers. So the filer is limited only by the filer storage capability.
diff --git a/Filer-Commands-and-Operations.md b/Filer-Commands-and-Operations.md
index 5296050..fe4c893 100644
--- a/Filer-Commands-and-Operations.md
+++ b/Filer-Commands-and-Operations.md
@@ -19,25 +19,14 @@ Copy ./weed/command/compact.go => http://localhost:8888/github/./weed/command/co
 ...
 ```
 
-The above `weed copy` command is very efficient. It will contact the master server for a fileId, and submit the file content to volume servers, then just register the (path, fileId) pair on filer. Also, the file copying will also split large files into trunks automatically.
+The above `weed copy` command is very efficient. It will contact the master server for a fileId, and submit the file content to volume servers, then just create the entry on filer. Also, the file copying will split large files into chunks automatically.
 
 This put very little loads on filer and the master server. Data is only transmitted between the local machine and the volume server.
 
 ## Register a file on Filer
 
-As mentioned above, the (path, fileId) can be registered on filer with this http operation.
+As mentioned above, the (path, fileId, fileSize) can be registered on filer with this gRPC call.
 
-```
+    filer_pb.SeaweedFilerClient.CreateEntry()
 
-curl --data "path=/path/to/your/file&fildId=3,01637037d6" http://localhost:8888/admin/register
-
-```
-
-## Move a Folder
-
-Moving a folder is a very lightweight operation for embedded filer. Not implemented for flat namespace filer implementation since it is not efficient.
-
-```
-curl --data "from=/path/to/your/fileOrDir&to=/path/to/new/folder/" http://localhost:8888/admin/mv
-curl --data "from=/path/to/your/fileOrDir&to=/path/to/new/file" http://localhost:8888/admin/mv
-```
\ No newline at end of file
+The code example can be found in filer_copy.go file.
diff --git a/Filer.md b/Filer.md
index bf7347f..a06ed6c 100644
--- a/Filer.md
+++ b/Filer.md
@@ -2,7 +2,7 @@ This page aims to consolidate the pages on the [[single-node filer|Directories a
 
 ## Background
 
-SeaweedFS comes with a lightweight "filer" server, which provides a RESTful wrapper around SeaweedFS's arbitrary blob API, mapping content to a traditional file directory of paths.
+SeaweedFS comes with a lightweight "filer" server, which provides a RESTful wrapper around SeaweedFS's blob API, mapping content to a traditional file directory of paths. The files in filer can also be mounted to Linux or Mac with FUSE support.
 
 ## Backends
 
@@ -28,33 +28,53 @@ While the table name and field structure must match what is written here, you ar
 if there are at least 3 Cassandra servers.
 
 ```cql
-create keyspace seaweed WITH replication = {
+create keyspace seaweedfs WITH replication = {
   'class':'SimpleStrategy',
   'replication_factor':1
 };
 
-use seaweed;
+use seaweedfs;
+
+  CREATE TABLE filemeta (
+    directory varchar,
+    name varchar,
+    meta blob,
+    PRIMARY KEY (directory, name)
+  ) WITH CLUSTERING ORDER BY (name ASC);
+```
+
+## Create a filer.toml file
+
+Please create a filer.toml file in current directory, or "$HOME/.seaweedfs/", or "/etc/seaweedfs/".
+
+Just run "weed filer -h" to see an up-to-date example. Here is one simpler copy. Remember to set enabled=true to pick one option.
+
+```
+[leveldb]
+enabled = false
+dir = "."  # directory to store level db files
+
+[cassandra]
+enabled = false
+keyspace="seaweedfs"
+hosts=[
+  "localhost:9042",
+]
+
+[redis]
+enabled = true
+address = "localhost:6379"
+password = ""
+db = 0
 
-CREATE TABLE seaweed_files (
-  path varchar,
-  fids list,
-  PRIMARY KEY (path)
-);
 ```
 
 ## Starting the Filer
 
-To start the filer, after you have started the master and volume servers (with `weed server`, or `weed master` and `weed volume` respectively), you can start a filer server with `weed filer`, providing backing store location options to use the Redis or Cassandra backends:
+To start the filer, after you have started the master and volume servers (with `weed server`, or `weed master` and `weed volume` respectively), you can start a filer server with `weed filer`:
 
 ```bash
-# to use the default LevelDB backend:
 weed filer
-
-# to use the Redis backend:
-weed filer -redis.server=localhost:6379
-
-# to use the Cassandra backend:
-weed filer -cassandra.server=localhost
 ```
 
 Alternatively, to start all servers in one shot, you can start a filer server alongside a master server and volume server with the `-filer` option to `weed server`:
@@ -88,4 +108,19 @@ curl "http://localhost:8888/path/to/sources/?pretty=y"
 curl "http://localhost:8888/path/to/sources/?lastFileName=abc.txt&limit=50&pretty=y"
 ```
 
-The Redis and Cassandra backends are currently implemented as ["flat namespace" stores](https://github.com/chrislusf/seaweedfs/blob/master/weed/filer/flat_namespace/flat_namespace_filer.go), so filers using them may not perform directory listings at this time.
\ No newline at end of file
+## Mount the filer
+
+On Mac, you would need to install [FUSE for macOS](https://osxfuse.github.io/)
+
+```
+weed mount -dir some_directory
+```
+
+
+## Upgrading from previous Filer storage
+Upgrading is complicated since the storage format is very different.
+
+Here are the basic steps:
+1. Export all files from existing storage, including the full path, and fileId.
+2. For each fileId, find out the size and mime type.
+3. Register the file in the new filer, via SeaweedFiler CreateEntry() gRPC API. See [Filer Commands and Operations]