Cloud storage options, such as Amazon S3, Google Cloud Storage, Azure, Backblaze B2, etc., are ideal for backup purposes.

For example, with Amazon S3, uploads are free; you only pay for the storage.

So you have the benefits of:

* Extremely fast access to the local SeaweedFS Filer
* Near-real-time backup to Amazon S3 with zero-cost upload network traffic

Of course, you can also back up to local disks on another machine.

# Architecture

```
Filer --> Metadata Change Logs --> `weed filer.backup` --> AWS S3
                                                   |
                                                   +-----> GCP
                                                   |
                                                   +-----> Azure
                                                   |
                                                   +-----> Backblaze B2
                                                   |
                                                   +-----> Local Disk
```

All file metadata changes in the Filer are saved in the logs and can be subscribed to. See [[Filer Change Data Capture]].

A `weed filer.backup` process subscribes to this change log, reads the actual file content, and sends the updates to the cloud sinks or local disk sinks.

* Sinks can be: AWS S3, Google Cloud Storage, Microsoft Azure, Backblaze B2, or Local Disk.

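If you want to inspect the change stream that `weed filer.backup` consumes, newer SeaweedFS releases include a `weed filer.meta.tail` command. A minimal sketch, assuming a filer at the default `localhost:8888` (check `weed filer.meta.tail -h` for the exact flags in your version):

```
# print filer metadata change events as they happen (for inspection only)
weed filer.meta.tail -filer=localhost:8888
```
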
# Configuration

This command replaces the previous `weed filer.replicate`, which required an external message queue.

The configuration is the same: use `weed scaffold -config=replication` to generate a `replication.toml` file, and keep only the sections for the sinks that you want to use.

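For example, the template can be generated and then trimmed down; `weed filer.backup` reads `replication.toml` from the standard SeaweedFS configuration locations, such as the current working directory:

```
# generate the full template, then edit it and keep only the sinks you need
weed scaffold -config=replication > replication.toml
```

The `[sink.s3]` section of the generated file looks like this:
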
```
[sink.s3]
# read credentials doc at https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/sessions.html
# default loads credentials from the shared credentials file (~/.aws/credentials).
enabled = false
aws_access_key_id = ""         # if empty, loads from the shared credentials file (~/.aws/credentials).
aws_secret_access_key = ""     # if empty, loads from the shared credentials file (~/.aws/credentials).
region = "us-east-2"
bucket = "backupbucket"        # an existing bucket
directory = "/"                # destination directory
endpoint = "http://localhost:8334"
is_incremental = false
```
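
To back up to a local disk instead (or in addition), keep the local sink section. A minimal sketch, assuming the option names from the generated `replication.toml` (verify against your scaffold output; the directory path is just an illustration):

```
[sink.local]
enabled = true
directory = "/data/backup"     # destination directory on the machine running `weed filer.backup`
is_incremental = false
```
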

# Running Backup

1. Make sure the `replication.toml` is in place.
1. Start the backup by running `weed filer.backup` (see the example below).

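A minimal invocation, assuming a filer at the default `localhost:8888` and `replication.toml` in the current working directory (run `weed filer.backup -h` to see all flags for your version):

```
# connect to the filer and continuously back up changes to the enabled sinks
weed filer.backup -filer=localhost:8888
```
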
# Incremental Mode

If `is_incremental = true`, all the files are backed up under `YYYY-MM-DD` directories, where the dates are based on the files' modification times.

So:

* Each date directory contains all new and updated files.
* Files deleted from the source filer are not deleted from the backup.

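To turn this on for the S3 sink configured above, set the flag in `replication.toml`:

```
[sink.s3]
# ... keep the other settings from the section above ...
is_incremental = true
```
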
For example, if on `2021-03-01` these files are created in the source:

```
/dir1/file1
/dir1/file2
/dir1/file3
```

and on `2021-03-02` these files are created, modified, or deleted in the source:

```
/dir1/file1 // modified
/dir1/file2 // not changed
/dir1/file3 // deleted
/dir1/file4 // created
```

then the backup destination will have the following directory structure:

```
/2021-03-01/dir1/file1
/2021-03-01/dir1/file2
/2021-03-01/dir1/file3
/2021-03-02/dir1/file1
/2021-03-02/dir1/file4
```

Note that under `2021-03-02`, `file2` (unchanged) and `file3` (deleted) do not appear; only new and updated files are copied.