mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2024-01-19 02:48:24 +00:00
Updated Filer Store Replication (markdown)
parent
b4966f6a01
commit
f30e537390
|
@ -1,12 +1,27 @@
|
||||||
Here we talk about using `weed filer -peers=...`, which achieves two purposes:
|
# Parallel filers with embedded filer store
|
||||||
|
|
||||||
|
If one filer is not enough, you can add more filers. This seems easy with shared filer stores, such as Redis, MySQL, Postgres, Cassandra, HBase, etc.
|
||||||
|
|
||||||
|
But did you notice this also works for embedded filer stores, such as LevelDB, RocksDB, SQLite, etc?
|
||||||
|
|
||||||
|
How is it possible?
|
||||||
|
|
||||||
|
# Automatic Peer Discovery
|
||||||
|
|
||||||
|
When a filer starts up, it will report itself to the master. So the master knows all the filers. It will keep each filer updated about its peers (Since version 2.77).
|
||||||
|
|
||||||
|
# Metadata synchronization
|
||||||
|
|
||||||
|
Knowing all the peers, one filer will keep its own metadata updated:
|
||||||
|
|
||||||
1. Aggregate filer meta data changes from peers
|
1. Aggregate filer meta data changes from peers
|
||||||
2. Replay filer meta data changes to local filer store
|
2. Replay filer meta data changes to local filer store, if it is an embedded store.
|
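The two steps above can be sketched in a few lines of Python. This is only an illustration, not SeaweedFS code: `MetaEvent`, `aggregate`, `replay`, and the plain dict standing in for a local filer store are all hypothetical names, and the event fields are assumptions.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class MetaEvent:
    """One metadata change (hypothetical shape, for illustration only)."""
    ts_ns: int                 # when the change happened
    path: str                  # full path of the affected entry
    new_entry: Optional[str]   # None means the entry was deleted


def aggregate(peer_logs: Iterable[List[MetaEvent]]) -> List[MetaEvent]:
    """Step 1: merge the time-ordered change logs of all peers into one stream."""
    return list(heapq.merge(*peer_logs, key=lambda e: e.ts_ns))


def replay(events: Iterable[MetaEvent], local_store: Dict[str, str], embedded: bool) -> None:
    """Step 2: apply the changes to the local store, only if it is an embedded store."""
    if not embedded:
        return  # a shared store already contains these changes
    for event in events:
        if event.new_entry is None:
            local_store.pop(event.path, None)
        else:
            local_store[event.path] = event.new_entry
```

For example, merging a log from `filer2` and a log from `filer3` and replaying the result into an empty dict leaves only the entries that still exist after the last change to each path.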
||||||
|
|
||||||
# FUSE mount with multiple filers
|
## Aggregate metadata updates
|
||||||
|
|
||||||
The first point is tightly related to FUSE Mount, which streams filer meta data changes from one filer.
|
This is tightly related to FUSE Mount, which streams filer meta data changes from one filer. When using multiple filers but without peer file metadata updates, a FUSE mount can only see the changes applied to the connected filer.
|
||||||
|
|
||||||
So when using multiple filers, this `-peers=xxx` option is needed. If not, a FUSE mount can only see the changes applied to the connected filer. This is required when the filers are using either shared or dedicated filer stores.
|
So aggregating metadata updates from its peers is required when the filers are using either shared or dedicated filer stores.
|
||||||
|
|
||||||
```
|
```
|
||||||
FUSE mount <----> filer1 -- filer2
|
FUSE mount <----> filer1 -- filer2
|
||||||
|
@ -15,39 +30,11 @@ So when using multiple filers, this `-peers=xxx` option is needed. If not, a FUS
|
||||||
filer3
|
filer3
|
||||||
```
|
```
|
||||||
|
|
||||||
# File Store Replication
|
# Persist metadata changes to local embedded store
|
||||||
|
|
||||||
The second point is about metadata replication.
|
If the filer is running on an embedded store, the metadata updates from its peers are saved locally.
|
||||||
|
|
||||||
This `-peers=...` can synchronize the meta data in the filer stores. If filers are using shared filer stores, this is optional.
|
|
||||||
|
|
||||||
It can also enables Active-Active or one-directional replication.
|
|
||||||
|
|
||||||
## Use Cases
|
|
||||||
|
|
||||||
For filer stores using shared filer stores, such as shared Mysql/Postgres/Cassandra/Redis/Sqlite/ElasticSearch/etc in [[Filer-Stores]], this is not really needed, since all filers are stateless, and there are no need to replicate the meta data back to the same filer store.
|
|
||||||
|
|
||||||
But if each filer has its own filer store, usually with the default local Leveldb, or even with a dedicated Mysql/Postgres/Cassandra/Redis/Sqlite/etc store, this would be very useful.
|
|
||||||
|
|
||||||
Sometimes you may want to replicate the existing store to a new filer store, or move to a new filer store, this would also be useful.
|
|
||||||
|
|
||||||
### One-Directional Replication
|
|
||||||
|
|
||||||
When starting a filer, set the `-peers` option, to receive updates from the peers.
|
|
||||||
|
|
||||||
Assuming there is a separate filer.toml for each filer, and a filer is already running at `localhost:8888`, this command will replicate metadata in `localhost:8888` to `localhost:8889`.
|
|
||||||
|
|
||||||
```
|
|
||||||
weed filer -port=8889 -peers=localhost:8888
|
|
||||||
```
|
|
||||||
|
|
||||||
### Active-Active Replication
|
|
||||||
|
|
||||||
```
|
|
||||||
weed filer -port=8888 -peers=localhost:8888,localhost:8889
|
|
||||||
weed filer -port=8889 -peers=localhost:8888,localhost:8889
|
|
||||||
```
|
|
||||||
|
|
||||||
|
This basically synchronizes the metadata across all the filer stores. If filers are using shared filer stores, this is optional.
|
||||||
|
|
||||||
# Example Topologies
|
# Example Topologies
|
||||||
|
|
||||||
|
@ -56,41 +43,35 @@ weed filer -port=8889 -peers=localhost:8888,localhost:8889
|
||||||
```
|
```
|
||||||
filer1(leveldb) <-> filer2(leveldb) <-> filer3(leveldb)
|
filer1(leveldb) <-> filer2(leveldb) <-> filer3(leveldb)
|
||||||
|
|
||||||
weed filer -peers=<filer1:port1>,<filer2:port2>,<filer3:port3>
|
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
* Two filers are fine. There is no requirement on the number of filers.
|
* Two filers are fine. There is no requirement on the number of filers.
|
||||||
|
|
||||||
```
|
```
|
||||||
filer1(leveldb) <-> filer2(leveldb)
|
filer1(leveldb) <-> filer2(leveldb)
|
||||||
|
|
||||||
weed filer -peers=<filer1:port1>,<filer2:port2>
|
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
* Two filers with different stores are also fine. Of course, you will need a different `filer.toml`.
|
* Two filers with different embedded stores are also fine. Of course, you will need a different `filer.toml`.
|
||||||
|
|
||||||
```
|
```
|
||||||
filer1(leveldb) <-> filer2(elastic search)
|
filer1(leveldb) <-> filer2(rocksdb)
|
||||||
|
|
||||||
weed filer -peers=<filer1:port1>,<filer2:port2>
|
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
* Master-Slave mode for filers with different stores.
|
* Two filers with one shared store are fine.
|
||||||
|
|
||||||
```
|
```
|
||||||
filer1(leveldb) --> filer2(elastic search)
|
filer1(mysql) <-> filer2(mysql)
|
||||||
|
```
|
||||||
|
|
||||||
# start filer2 as this.
|
* Two filers with a shared store and an embedded store are NOT fine.
|
||||||
weed filer -peers=<filer1:port1>
|
|
||||||
|
|
||||||
|
```
|
||||||
|
filer1(leveldb) <--XX NOT WORKING XX---> filer2(mysql)
|
||||||
```
|
```
|
||||||
|
|
||||||
# How is it implemented?
|
# How is it implemented?
|
||||||
|
|
||||||
Each filer has a local meta data change log. When starting with `-peers` setting, each filer will subscribe to meta data changes from its peers and apply to local filer store.
|
Each filer has a local meta data change log. When starting, each filer will subscribe to meta data changes from its peers and apply them to the local filer store.
|
||||||
|
|
||||||
Each filer store will auto-generate a unique `filer.store.id`. So for shared filer stores, such as mysql/postgres/redis, there is no need to set up peers because the `filer.store.id` will be the same.
|
Each filer store will auto-generate a unique `filer.store.id`. So for shared filer stores, such as mysql/postgres/redis, there is no need to set up peers because the `filer.store.id` will be the same.
|
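The role of `filer.store.id` can be illustrated with a small Python sketch. It is not the actual implementation: `FilerStore`, `Filer`, and `needs_replay` are hypothetical names, and a random UUID is only an assumed stand-in for however the real id is generated.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FilerStore:
    """A filer store with an auto-generated id (uuid is an assumed stand-in)."""
    name: str
    store_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Filer:
    address: str
    store: FilerStore


def needs_replay(local: Filer, peer: Filer) -> bool:
    """Changes from a peer need to be applied locally only when the peer
    writes to a different store, i.e. the store ids differ."""
    return local.store.store_id != peer.store.store_id
```

With this sketch, two filers that point at the same `FilerStore` object (a shared mysql store, say) get `needs_replay(...) == False`, while two filers that each own a leveldb store get `True`.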
||||||
|
|
||||||
|
@ -100,4 +81,4 @@ It is actually OK if you need to change filer IP or port. The replication can st
|
||||||
|
|
||||||
# Limitation
|
# Limitation
|
||||||
|
|
||||||
Multiple filers with local leveldb filer stores can work well with the `-peers` configured. However, this layout does not work well with `weed filer.sync` cross data center replication as of now. This is because currently `weed filer.sync` use `filer.store.id` to identify data that needs to be replicated. Having multiple `filer.store.id` will confuse the `weed filer.sync`.
|
Multiple filers with local leveldb filer stores can work well. However, this layout does not work well with `weed filer.sync` cross data center replication as of now. This is because `weed filer.sync` currently uses `filer.store.id` to identify data that needs to be replicated. Having multiple `filer.store.id` values will confuse `weed filer.sync`.
|
||||||
|
|