mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2024-01-19 02:48:24 +00:00
adjust filer doc
parent
04723d71f4
commit
7d404749bb
|
@ -6,6 +6,14 @@ When talking about file systems, many people would assume directories, list file
|
|||
|
||||
First, run ```weed filer -h``` to see an example ```filer.toml``` file. Copy it out, read it, and create the data store if needed.
|
||||
|
||||
The simplest filer.toml can be:
|
||||
```
|
||||
[leveldb]
|
||||
enabled = true
|
||||
dir = "." # directory to store level db files
|
||||
```
|
||||
|
||||
Two ways to start a weed filer
|
||||
|
||||
```bash
|
||||
|
@ -53,11 +61,15 @@ For reads:
|
|||
1. Client Read File Metadata => Weed Filer => Weed Filer database (LevelDB, Cassandra, Redis, Mysql, Postgres, etc)
|
||||
2. Client Read File Chunks => Weed Volume Servers
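The two read steps above can be sketched roughly as follows. This is an illustration only, not SeaweedFS code: `Chunk`, `lookup_metadata`, and `fetch_chunk` are hypothetical names standing in for the filer metadata lookup and the volume server reads.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    file_id: str  # hypothetical identifier of the chunk on a volume server
    offset: int   # byte offset of this chunk within the whole file
    size: int     # number of bytes in this chunk


def read_file(
    path: str,
    lookup_metadata: Callable[[str], List[Chunk]],
    fetch_chunk: Callable[[str], bytes],
) -> bytes:
    """Step 1: get the chunk list from the filer. Step 2: read each chunk."""
    # Metadata lookup goes to the filer (and its backing database).
    chunks = sorted(lookup_metadata(path), key=lambda c: c.offset)
    # Chunk content is read from the volume servers, not from the filer.
    return b"".join(fetch_chunk(c.file_id) for c in chunks)
```

The point of the split is that the filer only answers the small metadata request, while the bulk data transfer happens against the volume servers.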
|
||||
|
||||
![](FilerRead.png)
|
||||
|
||||
For writes:
|
||||
1. Client streams files to the Filer
|
||||
2. Filer uploads data to Weed Volume Servers, breaking large files into chunks.
|
||||
3. Filer writes the metadata and chunk information into Filer database.
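As a rough sketch of the write steps above, assuming a fixed chunk size: `upload_chunk` and `metadata_db` are hypothetical stand-ins for the volume server upload and the filer database, and the chunk size of 8 bytes is chosen only to keep the example small.

```python
from typing import Callable, Dict, List, Tuple


def write_file(
    path: str,
    data: bytes,
    upload_chunk: Callable[[bytes], str],
    metadata_db: Dict[str, List[Tuple[str, int, int]]],
    chunk_size: int = 8,
) -> None:
    """Split data into chunks, upload each one, then record the chunk list."""
    chunks = []
    for offset in range(0, len(data), chunk_size):
        piece = data[offset:offset + chunk_size]
        # Step 2: each chunk is uploaded and identified by the returned id.
        file_id = upload_chunk(piece)
        chunks.append((file_id, offset, len(piece)))
    # Step 3: metadata and chunk information are written last.
    metadata_db[path] = chunks
```

Writing the metadata only after all chunks are uploaded means a reader never sees an entry that points at missing chunks.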
|
||||
|
||||
## Filer Store
|
||||
|
||||
#### Complexity
|
||||
|
||||
For one file retrieval, the (file_parent_directory, fileName) => metadata lookup is O(logN) for LSM tree or B-tree implementations, where N is the number of existing entries, or O(1) for Redis.
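To make the O(logN) case concrete, here is a toy sketch of a store that keeps (directory, name) keys in sorted order and finds an entry by binary search. `SortedFilerStore` is a made-up name, and a sorted Python list is only a stand-in for an LSM tree or B-tree, not how any real backend is implemented.

```python
import bisect
from typing import List, Optional, Tuple


class SortedFilerStore:
    """Toy store: keys are (directory, name) tuples kept in sorted order."""

    def __init__(self) -> None:
        self._keys: List[Tuple[str, str]] = []
        self._metas: List[bytes] = []

    def insert(self, directory: str, name: str, meta: bytes) -> None:
        key = (directory, name)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._metas[i] = meta  # overwrite an existing entry
        else:
            # List insertion is O(N); only the lookup below is O(logN).
            self._keys.insert(i, key)
            self._metas.insert(i, meta)

    def find(self, directory: str, name: str) -> Optional[bytes]:
        key = (directory, name)
        i = bisect.bisect_left(self._keys, key)  # binary search, O(logN)
        if i < len(self._keys) and self._keys[i] == key:
            return self._metas[i]
        return None
```

A hash-based store would replace the binary search in `find` with a single key lookup, which is where the O(1) figure for Redis comes from.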
|
||||
|
@ -72,13 +84,16 @@ For directory renaming, it will be O(N) operations, with N as the number of file
|
|||
|
||||
### Comparing Storage Options
|
||||
|
||||
Here is a comparison of different filer store options.
|
||||
The Filer Store persists all file metadata and directory information.
|
||||
|
||||
1. "memory" : only for testing/example purpose.
|
||||
2. "leveldb": simple, single machine, fast, scalable, but no failover.
|
||||
3. "mysql"/"postgres": robust and well-understood, fast enough for most cases, scalable.
|
||||
4. "cassandra": robust and well-understood, fast, scalable.
|
||||
5. "redis": very fast, scalable with clustering, need to enable persistent storage, file listing is limited because one directory's sub file names are stored in one key~value entry.
|
||||
| Filer Store Name | Lookup | Number of entries in a folder | Scalability | Note |
| ---------------- | ------ | ----------------------------- | ----------- | ---- |
| memory | O(1) | limited by memory | Local, Fast | for testing only, no persistent storage |
| LevelDB | O(logN) | unlimited | Local, Very Fast | Default, fairly scalable |
| Redis | O(1) | limited | Local or Distributed, Fastest | one directory's sub file names are stored in one key~value entry |
| Cassandra | O(logN) | unlimited | Local or Distributed, Very Fast | |
| MySql | O(logN) | unlimited | Local or Distributed, Fast | Easy to manage, export |
| PostGres | O(logN) | unlimited | Local or Distributed, Fast | Easy to manage, export |
|
||||
|
||||
### Extending Storage Options
|
||||
|
||||
|
@ -97,3 +112,13 @@ Filer has two use cases.
|
|||
When the filer is used directly to upload and download files, it must process the file content during reads and writes in addition to the file metadata. So it is a good idea to run multiple filer servers, with an nginx server in front of them to load balance the requests.
|
||||
|
||||
When the filer is used to support "weed mount", the filer only provides file metadata retrieval. The actual file content is read and written directly between "weed mount" and the "weed volume" servers. So the filer is limited only by the filer storage capability.
|
||||
|
||||
|
||||
|
||||
## Upgrading from previous Filer storage
|
||||
Upgrading is complicated since the storage format is very different.
|
||||
|
||||
Here are the basic steps:
|
||||
1. Export all files from existing storage, including the full path, and fileId.
|
||||
2. For each fileId, find out the size and MIME type.
|
||||
3. Register the file in the new filer via the SeaweedFiler CreateEntry() gRPC API. See [[Filer Commands and Operations]]
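The steps above amount to a simple loop, sketched below under stated assumptions. `exported`, `describe`, and `create_entry` are hypothetical placeholders: in particular, `create_entry` stands for whatever wrapper you write around the CreateEntry() gRPC call, and its argument list here is not the real API signature.

```python
from typing import Callable, Iterable, Tuple


def migrate(
    exported: Iterable[Tuple[str, str]],
    describe: Callable[[str], Tuple[int, str]],
    create_entry: Callable[[str, str, int, str], None],
) -> int:
    """Register every exported (full_path, file_id) pair in the new filer."""
    count = 0
    for full_path, file_id in exported:   # step 1: exported path and fileId
        size, mime = describe(file_id)    # step 2: size and MIME type
        create_entry(full_path, file_id, size, mime)  # step 3: register
        count += 1
    return count
```

Returning the count makes it easy to compare against the number of exported entries after a run.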
|
||||
|
|
|
@ -1,76 +0,0 @@
|
|||
The default weed filer is in standalone mode, storing file metadata on local LevelDB.
|
||||
It traverses deep directory paths efficiently and can handle millions of files.
|
||||
|
||||
However, having no single point of failure (SPOF) is a must-have requirement for many projects.
|
||||
|
||||
SeaweedFS can use existing, familiar data stores, e.g., Cassandra, MySQL, Postgres, Redis, to store the filer metadata.
|
||||
|
||||
The following takes Cassandra as an example.
|
||||
|
||||
## Cassandra Setup
|
||||
|
||||
Here is the CQL to create the table used by the Cassandra store.
|
||||
Optionally you can adjust the keyspace name and replication settings.
|
||||
For production, you would want to set replication_factor to 3 if there are at least 3 Cassandra servers.
|
||||
|
||||
```cql
|
||||
create keyspace seaweedfs WITH replication = {
|
||||
'class':'SimpleStrategy',
|
||||
'replication_factor':1
|
||||
};
|
||||
|
||||
use seaweedfs;
|
||||
|
||||
CREATE TABLE filemeta (
|
||||
directory varchar,
|
||||
name varchar,
|
||||
meta blob,
|
||||
PRIMARY KEY (directory, name)
|
||||
) WITH CLUSTERING ORDER BY (name ASC);
|
||||
|
||||
```
|
||||
|
||||
## Create a filer.toml
|
||||
|
||||
Run ```weed filer -h``` to see an example filer.toml file. The file should be placed in the current directory, $HOME/.seaweedfs/, or /etc/seaweedfs/.
|
||||
|
||||
Here is the shortest example for Cassandra
|
||||
|
||||
```toml
|
||||
[cassandra]
|
||||
enabled = true
|
||||
keyspace="seaweedfs"
|
||||
hosts=[
|
||||
"localhost:9042",
|
||||
]
|
||||
```
|
||||
|
||||
With the filer.toml file created, you can start ```weed filer```.
|
||||
|
||||
## See it in action
|
||||
|
||||
Now you can add/delete files
|
||||
|
||||
```bash
|
||||
# POST a file and read it back
|
||||
curl -F "filename=@README.md" "http://localhost:8888/path/to/sources/"
|
||||
curl "http://localhost:8888/path/to/sources/README.md"
|
||||
# POST a file with a new name and read it back
|
||||
curl -F "filename=@Makefile" "http://localhost:8888/path/to/sources/new_name"
|
||||
curl "http://localhost:8888/path/to/sources/new_name"
|
||||
```
|
||||
|
||||
Or you can visit ```http://localhost:8888/``` to see the files and click around.
|
||||
|
||||
## Deployment Notes
|
||||
|
||||
Replication is controlled by the client side. The filer's default replication is "000". To enable replication, start the filer with an option like this:
|
||||
|
||||
```bash
|
||||
-defaultReplicaPlacement=001
|
||||
```
|
||||
|
||||
The same setting on the master server would not take effect, since the filer always writes using either the specified replication or the filer's own default.
|
||||
|
49
Filer-Cassandra-Setup.md
Normal file
|
@ -0,0 +1,49 @@
|
|||
SeaweedFS can use existing, familiar data stores, e.g., Cassandra, MySQL, Postgres, Redis, to store the filer metadata.
|
||||
|
||||
The following takes Cassandra as an example.
|
||||
|
||||
## Cassandra Setup
|
||||
|
||||
Here is the CQL to create the table used by the Cassandra store.
|
||||
Optionally you can adjust the keyspace name and replication settings.
|
||||
For production, you would want to set replication_factor to 3 if there are at least 3 Cassandra servers.
|
||||
|
||||
```cql
|
||||
create keyspace seaweedfs WITH replication = {
|
||||
'class':'SimpleStrategy',
|
||||
'replication_factor':1
|
||||
};
|
||||
|
||||
use seaweedfs;
|
||||
|
||||
CREATE TABLE filemeta (
|
||||
directory varchar,
|
||||
name varchar,
|
||||
meta blob,
|
||||
PRIMARY KEY (directory, name)
|
||||
) WITH CLUSTERING ORDER BY (name ASC);
|
||||
|
||||
```
|
||||
|
||||
## Create a filer.toml
|
||||
|
||||
Run ```weed filer -h``` to see an example filer.toml file. The file should be placed in the current directory, $HOME/.seaweedfs/, or /etc/seaweedfs/.
|
||||
|
||||
Here is the shortest example for Cassandra
|
||||
|
||||
```toml
|
||||
[cassandra]
|
||||
enabled = true
|
||||
keyspace="seaweedfs"
|
||||
hosts=[
|
||||
"localhost:9042",
|
||||
]
|
||||
```
|
||||
|
||||
|
||||
## Starting the Filer
|
||||
|
||||
```bash
|
||||
weed filer
|
||||
```
|
117
Filer.md
|
@ -1,117 +0,0 @@
|
|||
This page aims to consolidate the pages on the [[single-node filer|Directories and Files]] and [[distributed filer]] into one.
|
||||
|
||||
## Background
|
||||
|
||||
SeaweedFS comes with a lightweight "filer" server, which provides a RESTful wrapper around SeaweedFS's blob API, mapping content to a traditional file directory of paths. The files in filer can also be mounted to Linux or Mac with FUSE support.
|
||||
|
||||
## Backends
|
||||
|
||||
SeaweedFS's built-in filer supports three different backends (although pull requests to add more are always welcome).
|
||||
|
||||
The default backend, LevelDB, is for simple, non-distributed single nodes.
|
||||
|
||||
The other backends, Redis and Cassandra, are for clustering backing stores that can be distributed across several nodes at high scale.
|
||||
|
||||
The LevelDB backend is very capable and efficient; the main disadvantage it has, relative to the distributed backends, is that it presents a single point of failure. In "[pets vs. cattle][pvc]" terms, the LevelDB backend is only suitable for "pet" servers, while the Redis and Cassandra backends are suitable for "cattle" servers.
|
||||
|
||||
[pvc]: https://blog.engineyard.com/2014/pets-vs-cattle
|
||||
|
||||
## Initialization
|
||||
|
||||
The LevelDB and Redis backends need no initialization.
|
||||
|
||||
### Initializing the Cassandra backend
|
||||
|
||||
Here is the CQL to create the table used by SeaweedFS's Cassandra store, as well as a keyspace for specifying the replication strategy to use.
|
||||
|
||||
While the table name and field structure must match what is written here, you are free to rename the keyspace and use whatever replication settings you wish. For production, you would want to set replication_factor to 3 if there are at least 3 Cassandra servers.
|
||||
|
||||
```cql
|
||||
create keyspace seaweedfs WITH replication = {
|
||||
'class':'SimpleStrategy',
|
||||
'replication_factor':1
|
||||
};
|
||||
|
||||
use seaweedfs;
|
||||
|
||||
CREATE TABLE filemeta (
|
||||
directory varchar,
|
||||
name varchar,
|
||||
meta blob,
|
||||
PRIMARY KEY (directory, name)
|
||||
) WITH CLUSTERING ORDER BY (name ASC);
|
||||
```
|
||||
|
||||
## Create a filer.toml file
|
||||
|
||||
Please create a filer.toml file in the current directory, "$HOME/.seaweedfs/", or "/etc/seaweedfs/".
|
||||
|
||||
Just run "weed filer -h" to see an up-to-date example. Here is a simpler copy. Remember to set enabled=true on the one option you want to use.
|
||||
|
||||
```
|
||||
[leveldb]
|
||||
enabled = false
|
||||
dir = "." # directory to store level db files
|
||||
|
||||
[cassandra]
|
||||
enabled = false
|
||||
keyspace="seaweedfs"
|
||||
hosts=[
|
||||
"localhost:9042",
|
||||
]
|
||||
|
||||
[redis]
|
||||
enabled = true
|
||||
address = "localhost:6379"
|
||||
password = ""
|
||||
db = 0
|
||||
|
||||
```
|
||||
|
||||
## Starting the Filer
|
||||
|
||||
After you have started the master and volume servers (with `weed server`, or `weed master` and `weed volume` respectively), start a filer server with `weed filer`:
|
||||
|
||||
```bash
|
||||
weed filer
|
||||
```
|
||||
|
||||
Alternatively, to start all servers in one shot, you can start a filer server alongside a master server and volume server with the `-filer` option to `weed server`:
|
||||
|
||||
```
|
||||
# this is equivalent to `weed master`, `weed volume`, and `weed filer` together
|
||||
weed server -filer
|
||||
```
|
||||
|
||||
## Using the Filer
|
||||
|
||||
The filer provides a simple RESTful interface, where POST requests to a path upload the file content for that path, and GET requests retrieve the content for that path.
|
||||
|
||||
```
|
||||
# POST a file and read it back
|
||||
curl -F "filename=@README.md" "http://localhost:8888/path/to/sources/"
|
||||
curl "http://localhost:8888/path/to/sources/README.md"
|
||||
|
||||
# POST a file with a new name and read it back
|
||||
curl -F "filename=@Makefile" "http://localhost:8888/path/to/sources/new_name"
|
||||
curl "http://localhost:8888/path/to/sources/new_name"
|
||||
```
|
||||
|
||||
You may also request a "listing" for a directory:
|
||||
|
||||
```
|
||||
# list sub folders and files
|
||||
curl "http://localhost:8888/path/to/sources/?pretty=y"
|
||||
|
||||
# if there are many files under this folder, here is a way to efficiently paginate through all of them
|
||||
curl "http://localhost:8888/path/to/sources/?lastFileName=abc.txt&limit=50&pretty=y"
|
||||
```
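The pagination pattern in the second request can be sketched as a loop that passes the last name of each page as `lastFileName` for the next one. This is a sketch under assumptions: `fetch_page` is a hypothetical callable that performs the HTTP request and returns the file names of one page, and the listing is assumed to resume after `lastFileName` rather than including it.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode


def listing_url(base: str, last_file_name: str, limit: int) -> str:
    """Build a listing URL with the query parameters used in the curl example."""
    query = {"limit": limit, "pretty": "y"}
    if last_file_name:
        query = {"lastFileName": last_file_name, **query}
    return base + "?" + urlencode(query)


def iter_names(
    fetch_page: Callable[[str, int], List[str]], limit: int = 50
) -> Iterator[str]:
    """Yield every name in a directory, one page of `limit` names at a time."""
    last = ""
    while True:
        names = fetch_page(last, limit)
        if not names:
            return
        yield from names
        if len(names) < limit:
            return  # a short page means there is nothing left
        last = names[-1]  # the next page starts after this name
```

Keying each page on the last seen name, instead of a numeric offset, lets the store seek directly to the next entry rather than skipping over everything already returned.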
|
||||
|
||||
## Upgrading from previous Filer storage
|
||||
Upgrading is complicated since the storage format is very different.
|
||||
|
||||
Here are the basic steps:
|
||||
1. Export all files from existing storage, including the full path, and fileId.
|
||||
2. For each fileId, find out the size, mime type.
|
||||
3. Register the file in the new filer, via SeaweedFiler CreateEntry() gRpc API. See [[Filer Commands and Operations]]
|
|
@ -13,7 +13,7 @@
|
|||
* [[Failover Master Server]]
|
||||
* Filer
|
||||
* [[Directories and Files]]
|
||||
* [[Distributed Filer]]
|
||||
* [[Filer Cassandra Setup]]
|
||||
* [[Filer Commands and Operations]]
|
||||
* [[Mount]]
|
||||
* [[Customize Filer Store]]
|
||||
|
|