Created Async Backup (markdown)
Cloud storage services such as Amazon S3, Google Cloud Storage, Azure, and Backblaze B2 are well suited for backups.

For example, uploads to Amazon S3 are free. You only pay for the storage.
So you have the benefit of:
* Extremely fast access to the local SeaweedFS Filer
* Near-real-time backup to Amazon S3 with zero-cost upload network traffic

Of course, you can also back up to local disks on another machine.

# Architecture
```
Filer --> Metadata Change Logs --> `weed filer.backup` --> AWS S3
                                                   |
                                                   +-----> GCP
                                                   |
                                                   +-----> Azure
                                                   |
                                                   +-----> Backblaze B2
                                                   |
                                                   +-----> Local Disk
```
All file metadata changes in the Filer are saved in logs that can be subscribed to. See [[Filer Change Data Capture]].
A `weed filer.backup` process subscribes to these change logs, reads the actual file content, and sends each update to the cloud or local disk sinks.

* Sinks can be: AWS S3, Google Cloud Storage, Microsoft Azure, Backblaze B2, or Local Disk.
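
The flow described above (subscribe to metadata changes, read the file content, forward it to each sink) can be sketched conceptually in Python. This is only an illustration, not SeaweedFS code: `ChangeEvent`, `MemorySink`, `run_backup`, and `demo_changes` are hypothetical names, the sink is an in-memory dictionary standing in for S3 or a local disk, and it assumes that deletions are mirrored to the sink when incremental mode is off.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeEvent:
    """One metadata change (hypothetical shape, not the SeaweedFS format)."""
    op: str    # "create", "update", or "delete"
    path: str


@dataclass
class MemorySink:
    """In-memory stand-in for a cloud or local disk sink."""
    objects: dict = field(default_factory=dict)

    def put(self, path, data):
        self.objects[path] = data

    def delete(self, path):
        self.objects.pop(path, None)


def run_backup(events, read_content, sinks):
    """Forward each change to every sink, reading content only when needed."""
    for event in events:
        for sink in sinks:
            if event.op == "delete":
                # Assumption: deletions are mirrored in non-incremental mode.
                sink.delete(event.path)
            else:
                sink.put(event.path, read_content(event.path))


def demo_changes(source):
    """Mutate a fake source store and yield the matching change events."""
    source["/dir1/file1"] = b"v1"
    yield ChangeEvent("create", "/dir1/file1")
    source["/dir1/file2"] = b"x"
    yield ChangeEvent("create", "/dir1/file2")
    source["/dir1/file1"] = b"v2"
    yield ChangeEvent("update", "/dir1/file1")
    del source["/dir1/file2"]
    yield ChangeEvent("delete", "/dir1/file2")


if __name__ == "__main__":
    source = {}
    sink = MemorySink()
    run_backup(demo_changes(source), lambda path: source[path], [sink])
    print(sink.objects)
```

Because the events are consumed as they are produced, the sink ends up with the same content as the source once the stream is drained, which is the near-real-time behavior the architecture aims for.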
# Configuration

This command replaces the previous `weed filer.replicate`, which requires an external message queue.
The configuration is the same, though: run `weed scaffold -config=replication` to generate a `replication.toml` file, and keep only the lines for the sinks that you want to use.
```
[sink.s3]
# read credentials doc at https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/sessions.html
# default loads credentials from the shared credentials file (~/.aws/credentials).
enabled = false # set to true to use this sink
aws_access_key_id     = ""     # if empty, loads from the shared credentials file (~/.aws/credentials).
aws_secret_access_key = ""     # if empty, loads from the shared credentials file (~/.aws/credentials).
region = "us-east-2"
bucket = "backupbucket"    # an existing bucket
directory = "/"            # destination directory
endpoint = "http://localhost:8334"
is_incremental = false
```
# Running Backup
1. Make sure the `replication.toml` file is in place.
1. Start the backup by running `weed filer.backup`.
# Incremental Mode
If `is_incremental = true`, all files are backed up under `YYYY-MM-DD` directories, where the date is based on each file's modified time.
As a result:
* Each date directory contains all files that were new or updated on that date.
* Files deleted in the source filer are not deleted from the backup.

For example, suppose that on `2021-03-01` these files are created in the source:
```
/dir1/file1
/dir1/file2
/dir1/file3
```
and on `2021-03-02`, these files are created, modified, or deleted in the source:
```
/dir1/file1 // modified
/dir1/file2 // not changed
/dir1/file3 // deleted
/dir1/file4 // created
```
The backup destination will then have the following directory structure:
```
/2021-03-01/dir1/file1
/2021-03-01/dir1/file2
/2021-03-01/dir1/file3
/2021-03-02/dir1/file1
/2021-03-02/dir1/file4
```
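To make the mapping concrete, here is a minimal Python sketch that replays the example above and derives the destination paths. It is an illustration rather than the actual `weed filer.backup` logic: `incremental_key`, `apply_events`, and `ts` are hypothetical helpers, and the date is computed in UTC as an assumption, since the time zone used for the `YYYY-MM-DD` directories is not stated here.

```python
from datetime import datetime, timezone


def incremental_key(path, mtime):
    """Prefix the source path with the YYYY-MM-DD of its modified time.

    Assumption: the date is taken in UTC.
    """
    day = datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"/{day}{path}"


def apply_events(events):
    """Replay (operation, path, mtime) events into a set of backup paths."""
    backup = set()
    for op, path, mtime in events:
        if op == "delete":
            continue  # deletions are not propagated in incremental mode
        backup.add(incremental_key(path, mtime))
    return backup


def ts(day):
    """Noon UTC on the given YYYY-MM-DD day, as a Unix timestamp."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    return parsed.replace(hour=12, tzinfo=timezone.utc).timestamp()


EVENTS = [
    ("create", "/dir1/file1", ts("2021-03-01")),
    ("create", "/dir1/file2", ts("2021-03-01")),
    ("create", "/dir1/file3", ts("2021-03-01")),
    ("update", "/dir1/file1", ts("2021-03-02")),
    ("delete", "/dir1/file3", ts("2021-03-02")),
    ("create", "/dir1/file4", ts("2021-03-02")),
]

if __name__ == "__main__":
    for key in sorted(apply_events(EVENTS)):
        print(key)
```

Running it prints the same five paths as the directory listing above: `file2` appears only under `2021-03-01` because it did not change, and `file3` remains in the backup even though it was deleted in the source.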