chrislu 2023-08-17 08:50:17 -07:00
commit 960bcacb21
9 changed files with 125 additions and 6 deletions

@ -33,6 +33,7 @@ Name | Author | Language
[Erlang SeaweedFS Client](https://github.com/Neurotec/seaweedfs.erl) | Neurotec | Erlang
[Julia SeaweedFS Client](https://github.com/lawless-m/SeaweedFSClient.jl) | Lawless-m | Julia
[Rust SeaweedFS Client](https://github.com/kerzeld/rusty_weed) | kerzeld | Rust
[Openresty Client](https://github.com/cooperlyt/lua-resty-seaweedfs) | Cooper | Lua
## GRPC APIs
SeaweedFS uses gRPC internally, and you can use these APIs too. See https://github.com/seaweedfs/seaweedfs/tree/master/weed/pb for the proto files.

@ -50,7 +50,8 @@ In `weed shell`:
"s3SecretKey": "***",
"s3Region": "us-east-2"
}
# For aliyun OSS
> remote.configure -name=s5 -type=aliyun -aliyun.access_key=xxx -aliyun.secret_key=yyy -aliyun.endpoint=http://oss-cn-wulanchabu-internal.aliyuncs.com -aliyun.region=oss-cn-wulanchabu -s3.storage_class=STANDARD -s3.support_tagging=false
```
# Mount Remote Storage

@ -50,9 +50,14 @@ Compared to direct S3 storage, this is both faster and cheaper!
1. Use `volume.tier.download` in `weed shell` to move volumes to the local cluster.
## Configuring Storage Backend
(Currently only s3 is developed. More is coming soon.)
Multiple s3 buckets are supported. Usually you just need to configure one backend.
The following storage backends are currently supported:
- S3
- Rclone
### S3
Multiple S3 buckets are supported. Usually you just need to configure one backend.
```
# The storage backends are configured on the master, inside master.toml
@ -75,6 +80,37 @@ Multiple s3 buckets are supported. Usually you just need to configure one backen
```
### Rclone
Here's an example of a configuration for an Rclone storage backend:
```
[storage.backend]
[storage.backend.rclone.default]
enabled = true
remote_name = "my-onedrive"
key_template = "SeaweedFS/{{ slice . 0 2 }}/{{ slice . 2 4 }}/{{ slice . 4 }}" # optional
```
Where `my-onedrive` must correspond to the remote name configured in `~/.config/rclone/rclone.conf`:
```
[my-onedrive]
type = onedrive
...
```
The optional `key_template` property makes it possible to choose how the individual volume files are eventually named in the storage backend.
The template has to follow the [text/template](https://pkg.go.dev/text/template) syntax, where `.` is a UUID string representing the volume ID.
In this example, files would be named as follows in OneDrive:
```
SeaweedFS/
5f/
e5/
03d8-0aba-4e42-b1c8-6116bf862f71
```
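To see how such a template expands, here is a minimal Go sketch that renders the example `key_template` with the standard `text/template` package. It only illustrates the template syntax and is not SeaweedFS's own code: the helper `renderKey` is a hypothetical name, and the key value is simply the UUID reassembled from the listing above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderKey is a hypothetical helper (not part of SeaweedFS): it parses a
// key_template-style template and executes it with the key string as ".".
func renderKey(tmpl, key string) (string, error) {
	t, err := template.New("key").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, key); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Template copied from the example configuration above.
	const tmpl = "SeaweedFS/{{ slice . 0 2 }}/{{ slice . 2 4 }}/{{ slice . 4 }}"
	// Assumed key: the UUID pieced together from the OneDrive listing above.
	const key = "5fe503d8-0aba-4e42-b1c8-6116bf862f71"

	out, err := renderKey(tmpl, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // SeaweedFS/5f/e5/03d8-0aba-4e42-b1c8-6116bf862f71
}
```

The built-in `slice` function takes the string and one or two byte offsets, so `slice . 0 2` and `slice . 2 4` produce the two directory levels and `slice . 4` keeps the rest of the key as the file name.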
### Configuration
After changing the master config, you need to **restart the volume servers** so that they pull the updated configuration
from the master. If you forget to do this, you'll get an error message about no storage backends being configured.

@ -104,6 +104,9 @@ This is based on Filer gRPC API. You should be able to easily implement it in yo
https://github.com/seaweedfs/seaweedfs/blob/master/weed/pb/filer.proto#L52
A Golang example:
https://github.com/tuxmart/seawolf
# Possible Use Cases
This is basically stream processing or event processing for files. The possible use cases are all up to your imagination.

@ -81,12 +81,12 @@ When a write is made to the filer, there is an additional step before step 1. an
## Fix replication
If one replica is missing, there are no automatic repair right away. This is to prevent over replication due to transient volume sever failures or disconnections. In stead, the volume will just become readonly. For any new writes, just assign a different file id to a different volume.
If one replica is missing, there is no automatic repair right away. This is to prevent over-replication due to transient volume server failures or disconnections. Instead, the volume will just become read-only. For any new writes, just assign a different file id to a different volume.
To repair the missing replicas, you can use `volume.fix.replication` in `weed shell`.
### Replicate without deleting
In certain circumstances—like adding/removing/altering replication settings of volumes or servers—the best strategy is to only repair under-replicated volumes and not delete any while working on volume and server modifications, in this citation use the flag `noDelete`:
In certain circumstances, such as adding, removing, or altering replication settings of volumes or servers, the best strategy is to only repair under-replicated volumes and not delete any while working on volume and server modifications. In this situation, use the flag `noDelete`:
`volume.fix.replication -noDelete`

@ -5,6 +5,9 @@ It's a common concept to put a proxy in front of S3 that handles requests. Nginx
```
upstream seaweedfs { server localhost:8333 fail_timeout=0; keepalive 20;}
## You can also use a Unix domain socket instead for better performance:
# upstream seaweedfs { server unix:/tmp/seaweedfs-s3-8333.sock; keepalive 20;}
server {
listen 443 ssl;
server_name ~^(?<subdomain>[^.]+).yourdomain.com;

@ -13,7 +13,7 @@ Since data that are hot, warm, and cold, it would be cost-efficient to place dat
To write any data, the volume index needs to append one entry. To read any data, a volume index lookup is required. The volume index can be in memory mode or a LevelDB instance. The amount of index content is small, but it is accessed frequently.
You can run `weed volume -dir.idx=/fast/disk/dir` or `weed server -volume.dir.idx=/vast/disk/dir` to ensure the volume index is located on a fast disk, e.g., a fast SSD mount.
You can run `weed volume -dir.idx=/fast/disk/dir` or `weed server -volume.dir.idx=/fast/disk/dir` to ensure the volume index is located on a fast disk, e.g., a fast SSD mount.
If the volume server already has some existing data, you can just stop the volume server, move the `.idx` files to the index folder, and restart the volume server.

@ -53,6 +53,7 @@
* [[Amazon S3 API]]
* [[AWS CLI with SeaweedFS]]
* [[s3cmd with SeaweedFS]]
* [[rclone with SeaweedFS]]
* [[restic with SeaweedFS]]
* [[nodejs with Seaweed S3]]
* [[S3 API Benchmark]]

74
rclone-with-SeaweedFS.md Normal file

@ -0,0 +1,74 @@
### Installation
See https://rclone.org/install/
On macOS: `brew install rclone`
### Configuration
See https://rclone.org/s3/
Set the configuration in `~/.config/rclone/rclone.conf`:
```
[swfs]
type = s3
provider = Other
access_key_id = any-key-id
secret_access_key = any-access-key
endpoint = http://localhost:8333
upload_cutoff = 50Mi
chunk_size = 50Mi
force_path_style = true
```
### Execute commands
Copy files:
```
rclone --log-level INFO copy --checksum --fast-list /Users/kmlebedev/files swfs:/bucket-name/files
```
## Client-side encryption
### Install the local KMS API
```
git clone https://github.com/kmlebedev/local-kms.git
cd local-kms
go install
```
Run local-kms:
```
local-kms
INFO[2023-07-24 15:34:23.876] Local KMS Version Unknown (Commit Hash Unknown)
INFO[2023-07-24 15:34:23.992] No file found at path /init/seed.yaml; skipping seeding.
INFO[2023-07-24 15:34:23.992] Data will be stored in /tmp/local-kms
INFO[2023-07-24 15:34:23.992] Local KMS started on 0.0.0.0:8080
```
Create a master key:
```
aws kms create-key --endpoint-url=http://localhost:8080
{
"KeyMetadata": {
"AWSAccountId": "111122223333",
"KeyId": "5beb0309-d1ec-45ea-895a-52bbecbc8bde",
"Arn": "arn:aws:kms:eu-west-2:111122223333:key/5beb0309-d1ec-45ea-895a-52bbecbc8bde",
"CreationDate": "2023-07-03T14:24:36+05:00",
"Enabled": true,
"KeyUsage": "ENCRYPT_DECRYPT",
"KeyState": "Enabled",
"Origin": "AWS_KMS",
"KeyManager": "CUSTOMER",
"CustomerMasterKeySpec": "SYMMETRIC_DEFAULT",
"KeySpec": "SYMMETRIC_DEFAULT",
"EncryptionAlgorithms": [
"SYMMETRIC_DEFAULT"
]
}
}
```
### Copy files with encryption
```
rclone --log-level INFO copy --cse-kms-master-key-id 5beb0309-d1ec-45ea-895a-52bbecbc8bde --kms-endpoint http://localhost:8080 --ignore-size --checksum --fast-list /Users/kmlebedev/files swfs:/bucket-name/files
```