move to github.com/seaweedfs/seaweedfs

chrislu 2022-07-29 01:40:03 -07:00
parent 140455fa85
commit 0d8bd95030
32 changed files with 77 additions and 80 deletions

@@ -54,7 +54,7 @@ remove_bucket: newbucket3
# Presigned URL
If [authentication](https://github.com/seaweedfs/seaweedfs/wiki/Amazon-S3-API#authentication) is enabled, the URL is not accessible without proper credentials, but you can presign a URL and access it.
```
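# A hedged example (not from the original snippet): presigning with the AWS CLI.
# The bucket, key, and default S3 port 8333 are assumptions; adjust to your setup.
aws --endpoint-url http://localhost:8333 s3 presign s3://newbucket3/test.txt --expires-in 3600
# The printed time-limited URL can then be fetched without credentials:
curl "<presigned-url-from-above>"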

@@ -1,6 +1,6 @@
# Installation
See [AWS-CLI-with-SeaweedFS](https://github.com/seaweedfs/seaweedfs/wiki/AWS-CLI-with-SeaweedFS#installation)
# Execute commands
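As a quick sanity check, a minimal sketch of pointing the AWS CLI at a local SeaweedFS S3 gateway (the `localhost:8333` endpoint is an assumption; credentials come from your `aws configure` setup):
```
aws --endpoint-url http://localhost:8333 s3 mb s3://newbucket3
aws --endpoint-url http://localhost:8333 s3 ls
```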

@@ -26,7 +26,7 @@ This will add 1 physical volume when existing volumes are full. If using replica
fs.configure -locationPrefix=/buckets/ -replication=001 -volumeGrowthCount=2 -apply
```
See https://github.com/seaweedfs/seaweedfs/wiki/Path-Specific-Configuration
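These commands run inside the weed shell; a hedged sketch (the master address is an assumption, and a bare `fs.configure` should print the current path-specific configuration):
```
$ weed shell -master=localhost:9333
> fs.configure -locationPrefix=/buckets/ -replication=001 -volumeGrowthCount=2 -apply
> fs.configure
```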
# Supported APIs

@@ -1,6 +1,6 @@
# Deprecated
This feature is replaced by https://github.com/seaweedfs/seaweedfs/wiki/Filer-Active-Active-cross-cluster-continuous-synchronization
# Architecture

@@ -34,11 +34,11 @@ Name | Author | Language
[Julia SeaweedFS Client](https://github.com/lawless-m/SeaweedFSClient.jl) | Lawless-m | Julia
## GRPC APIs
SeaweedFS uses GRPC internally, and you can use the same APIs. See https://github.com/seaweedfs/seaweedfs/tree/master/weed/pb for the proto files.
For the HDFS-compatible file system, which allows replacing HDFS with SeaweedFS, a Java implementation of the GRPC client API has been developed.
* Java GRPC client Source code: https://github.com/seaweedfs/seaweedfs/tree/master/other/java/client
* Java GRPC client Maven Repo: https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-client
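For example, a sketch of grabbing the proto files to generate client stubs for your own language (the listed file names reflect the current repo layout):
```
git clone https://github.com/seaweedfs/seaweedfs.git
ls seaweedfs/weed/pb/*.proto
# filer.proto, master.proto, volume_server.proto, ...
```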
## Projects using SeaweedFS

@@ -8,9 +8,9 @@
> file in SeaweedFS will also become a file in the cloud storage provider.
> - This is useful in case you want to use the files inside the cloud provider's infrastructure.
> - However, this does **not support file encryption** in any way (obviously), as the files are put to Cloud Storage as is.
> 2) [Tiered Storage with Cloud Tier](https://github.com/seaweedfs/seaweedfs/wiki/Cloud-Tier)
> - In this mode, SeaweedFS **moves full volume files to the cloud storage provider**, i.e. files that are 1 GB (in our case) in size.
> - This mode supports [Filer Data Encryption](https://github.com/seaweedfs/seaweedfs/wiki/Filer-Data-Encryption) transparently.
> - The chunk files uploaded to the cloud provider are not usable outside of SeaweedFS.
@@ -141,4 +141,3 @@ SeaweedFS Cloud Drive has these unique characteristics:
* You may need to access cloud data by HDFS, or HTTP, or S3 API, or WebDav, or FUSE Mount.
* With SeaweedFS Cloud Drive
* Multiple ways to access remote storage.

@@ -1,7 +1,7 @@
## Motivation
> NOTE: SeaweedFS provides **two mechanisms** to use cloud storage:
> 1) [SeaweedFS Cloud Drive](https://github.com/seaweedfs/seaweedfs/wiki/Cloud-Drive-Benefits)
> - in this case, you can **mount** an S3 bucket to the SeaweedFS file system (in the filer), and access the remote files
> through SeaweedFS. Effectively, SeaweedFS caches the files from the cloud storage.
> - In this mode, the file structure in the cloud store exactly matches the SeaweedFS structure - so every
@@ -10,7 +10,7 @@
> - However, this does **not support file encryption** in any way (obviously), as the files are put to Cloud Storage as is.
> 2) **Tiered Storage with Cloud Tier** (**<== You are here**)
> - In this mode, SeaweedFS **moves full volume files to the cloud storage provider**, i.e. files that are 1 GB (in our case) in size.
> - This mode supports [Filer Data Encryption](https://github.com/seaweedfs/seaweedfs/wiki/Filer-Data-Encryption) transparently.
> - The chunk files uploaded to the cloud provider are not usable outside of SeaweedFS.

@@ -3,7 +3,7 @@ It is fairly easy if you need to store filer metadata with another data store.
Let's use "yourstore" as the chosen name.
Here are the steps:
1. Add a package under github.com/seaweedfs/seaweedfs/weed/filer/yourstore
2. Implement the filer.FilerStore interface
```
package filer
@@ -45,21 +45,21 @@ func init() {
  filer2.Stores = append(filer2.Stores, &YourStore{})
}
```
4. Load yourstore. Just import it in github.com/seaweedfs/seaweedfs/weed/server/filer_server.go
```
import (
  "net/http"
  "strconv"
  "github.com/seaweedfs/seaweedfs/weed/filer"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/cassandra"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/leveldb"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/mysql"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/postgres"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/redis"
  _ "github.com/seaweedfs/seaweedfs/weed/filer/yourstore"
  // ^^ add here
  "github.com/seaweedfs/seaweedfs/weed/security"
  "github.com/seaweedfs/seaweedfs/weed/glog"
)
```
5. Send a pull request!
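As a quick check that the new store compiles and registers, a minimal sketch (assuming a local checkout; `yourstore` is the hypothetical package from step 1):
```
cd seaweedfs
go build ./weed
go test ./weed/filer/...
```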

@@ -10,7 +10,7 @@ This is a really good guideline on how to setup minikube: https://thenewstack.io
# Steps
The default docker image is based on Alpine Linux, which causes issues with the DNS addon in Minikube (see https://github.com/seaweedfs/seaweedfs/issues/474). To get around this problem, I rebuilt the docker image from "scratch".
To do this I added (or modified) the following files to the local repository:
@@ -40,8 +40,8 @@ The script uses the docker file to build my image. Note the following:
```
#!/bin/sh
go get github.com/seaweedfs/seaweedfs/weed/...
CGO_ENABLED=0 GOOS=linux go build github.com/seaweedfs/seaweedfs/weed
docker build -t weed:latest -f ./Dockerfile .
docker tag weed:latest 192.168.42.23:80/weed:latest
docker push 192.168.42.23:80/weed:latest

@@ -22,7 +22,7 @@ The downside:
Side Note:
* The 10+4 can be easily adjusted via `DataShardsCount` and `ParityShardsCount` in
https://github.com/seaweedfs/seaweedfs/blob/master/weed/storage/erasure_coding/ec_encoder.go#L17
* If you are considering these enterprise-level customizations, please consider supporting SeaweedFS first.
## Architecture

FAQ.md

@@ -6,7 +6,7 @@ SeaweedFS has web dashboards for its different services:
* Volume server dashboards can be accessed on `http://hostname:port/ui/index.html`.
For example: `http://localhost:8080/ui/index.html`
Also see [#275](https://github.com/seaweedfs/seaweedfs/issues/275).
### Does it support xxx language?
If using `weed filer`, just send one HTTP POST to write, or one HTTP GET to read.
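For example, a sketch against a local filer (port 8888 is the filer default; the path and file name are placeholders):
```
# write a file
curl -F file=@report.pdf "http://localhost:8888/docs/"
# read it back
curl "http://localhost:8888/docs/report.pdf"
```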
@@ -54,7 +54,7 @@ It is also important to leave some disk space for a couple of volume size, so th
If one volume has a large number of small files, the memory usage would be high in order to keep each entry in memory or in leveldb.
To reduce memory usage, one way is to convert the older volumes into [Erasure-Coded volumes](https://github.com/seaweedfs/seaweedfs/wiki/Erasure-Coding-for-warm-storage), which are read only. The volume server will sort the index and store it as a sorted index file (with extension `.sdx`). Looking up one entry then costs a binary search within the sorted index file, instead of an O(1) memory lookup.
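The conversion is typically run from the weed shell; a hedged sketch (the threshold flags are assumptions, check `ec.encode -h`):
```
$ weed shell
> ec.encode -fullPercent=95 -quietFor=1h
```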
### How to configure volumes larger than 30GB?

@@ -87,7 +87,7 @@ filer ---- mount1
#### Mount directory on host from docker-compose
If docker compose is being used to manage the server (e.g. https://github.com/seaweedfs/seaweedfs/wiki/Getting-Started#with-compose)
it's possible to mount a directory on the host with docker privileged mode like so:
```
mount_1:
@@ -280,7 +280,7 @@ From https://github.com/osxfuse/osxfuse/issues/358
> FUSE needs to register a virtual device for exchanging messages between the kernel and the actual file system implementation running in user space. The number of available device slots is limited by macOS. So if you are using other software like VMware, VirtualBox, TunTap, Intel HAXM, ..., that eat up all free device slots, FUSE will not be able to register its virtual device.
### Samba share mounted folder ###
From https://github.com/seaweedfs/seaweedfs/issues/936
The issue is with samba.conf. If you see the NT_STATUS_ACCESS_DENIED error, try to add `force user` and `force group` to your samba.conf file.
```
[profiles]

@@ -81,7 +81,7 @@ The gRPC API is also open to public and can support many other languages.
Here is an example, in Java:
https://github.com/seaweedfs/seaweedfs/blob/master/other/java/examples/src/main/java/com/seaweedfs/examples/WatchFiles.java
To subscribe to the metadata changes:
| Parameter | Meaning |
@@ -102,7 +102,7 @@ Basically there are four types of events to handle:
This is based on the Filer gRPC API. You should be able to easily implement it in your own language.
https://github.com/seaweedfs/seaweedfs/blob/master/weed/pb/filer.proto#L52
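A built-in command wraps this subscription API; a hedged sketch (the filer address and path prefix are assumptions):
```
weed filer.meta.tail -filer=localhost:8888 -pathPrefix=/buckets/
```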
# Possible Use Cases

@@ -21,5 +21,4 @@ Another side is, with GDPR, companies are required to "forget" customer data aft
### Encryption Algorithm
The encryption is through AES256-GCM: https://en.wikipedia.org/wiki/Galois/Counter_Mode
There is one randomly generated cipher key of 256 bits for each file chunk. The cipher code is here: https://github.com/seaweedfs/seaweedfs/blob/master/weed/util/cipher.go
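Encryption is switched on at the filer; a minimal sketch, assuming the `-encryptVolumeData` option described in the Filer Data Encryption docs:
```
weed filer -encryptVolumeData
```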

@@ -3,7 +3,7 @@ On filer, there is a `/topics/.system/log` folder; it stores all filer metadata
## Metadata Event Format
The events are stored in files organized by timestamp, `yyyy-MM-dd/hh-mm.segment`.
The events are encoded by protobuf, defined in https://github.com/seaweedfs/seaweedfs/blob/master/weed/pb/filer.proto. The related sections are:
```
service SeaweedFiler {
  rpc SubscribeMetadata (SubscribeMetadataRequest) returns (stream SubscribeMetadataResponse) {
@@ -41,7 +41,7 @@ The ondisk file is repeated bytes of the following format:
The `LogEntry.data` stores a serialized `SubscribeMetadataResponse`.
## Read Metadata Events
The events can be read by any program as files. One example is here: https://github.com/seaweedfs/seaweedfs/blob/master/unmaintained/see_log_entry/see_log_entry.go
## Subscribe to Metadata

@@ -270,7 +270,7 @@ The patterns are case-sensitive and support wildcard characters '*' and '?'.
// recursively delete everything, ignoring any recursive error
> curl -X DELETE "http://localhost:8888/path/to/dir?recursive=true&ignoreRecursiveError=true"
// For Experts Only: remove filer directories only, without removing data chunks.
// see https://github.com/seaweedfs/seaweedfs/pull/1153
> curl -X DELETE "http://localhost:8888/path/to?recursive=true&skipChunkDeletion=true"
```
| Parameter | Description | Default |

@@ -2,7 +2,7 @@
The Filer Store persists all file metadata and directory information.
| Filer Store Name | Lookup | number of entries in a folder | Scalability | Directory Renaming | TTL | [Fast Bucket Deletion](https://github.com/seaweedfs/seaweedfs/wiki/S3-API-FAQ#how-to-speed-up-bucket-deletion) | Note |
| ---------------- | -- | -- | -- | -- | -- | -- | -- |
| memory | O(1) | limited by memory | Local, Fast | | Yes | | for testing only, no persistent storage |
| leveldb | O(logN) | unlimited | Local, Very Fast | | Yes | | Default, fairly scalable |
@@ -13,8 +13,8 @@ The Filer Store persists all file metadata and directory information.
| Mongodb | O(logN) | unlimited | Local or Distributed, Fast | | Yes | | Easy to manage |
| Arangodb | O(logN) | unlimited | Local or Distributed, Fast | | Native | Yes | Easy to manage; Scalable |
| YDB | O(logN) | unlimited | Local or Distributed, Fast | Atomic | Native | Yes | Easy to manage; True elastic Scalability; High Availability. Need to manually build. |
| [Redis2](https://github.com/seaweedfs/seaweedfs/wiki/Filer-Redis-Setup) | O(1) | limited | Local or Distributed, Fastest | | Native | | one directory's children are stored in one key~value entry |
| [Redis3](https://github.com/seaweedfs/seaweedfs/wiki/Filer-Redis-Setup) | O(1) | unlimited | Local or Distributed, Fastest | | Native | | one directory's children are spread into multiple key~value entries |
| Cassandra | O(logN) | unlimited | Local or Distributed, Very Fast | | Native | | |
| MySql | O(logN) | unlimited | Local or Distributed, Fast | Atomic | Yes | | Easy to manage |
| MySql2 | O(logN) | unlimited | Local or Distributed, Fast | Atomic | Yes | Yes | Easy to manage |

@@ -2,7 +2,7 @@
## Installing SeaweedFS
Download the latest official release from https://github.com/seaweedfs/seaweedfs/releases.
Decompress the downloaded file. You will only find one executable file, either "weed" on most systems or "weed.exe" on Windows.
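For example, on Linux (the archive name is an assumption; pick the release matching your platform):
```
tar -xzf linux_amd64.tar.gz
./weed version
```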
@@ -92,7 +92,7 @@ docker-compose -f docker/seaweedfs-compose.yml -p seaweedfs up
You can use the image "chrislusf/seaweedfs" or build your own with the [dockerfile][] in the root of the repo.
[dockerfile]: https://github.com/seaweedfs/seaweedfs/blob/master/docker/Dockerfile
### Using pre-built Docker image
@@ -118,7 +118,7 @@ curl "http://$IP:9333/cluster/status?pretty=y"
Make a local copy of seaweedfs from GitHub:
```bash
git clone https://github.com/seaweedfs/seaweedfs.git
```
Minimal Image (~19.6 MB)

@@ -4,16 +4,16 @@ SeaweedFS excels on small files and has no issue to store large files. Now it is
# Build SeaweedFS Hadoop Client Jar
```
$ cd $GOPATH/src/github.com/seaweedfs/seaweedfs/other/java/client
$ mvn install
# build for hadoop2
$ cd $GOPATH/src/github.com/seaweedfs/seaweedfs/other/java/hdfs2
$ mvn package
$ ls -al target/seaweedfs-hadoop2-client-3.13.jar
# build for hadoop3
$ cd $GOPATH/src/github.com/seaweedfs/seaweedfs/other/java/hdfs3
$ mvn package
$ ls -al target/seaweedfs-hadoop3-client-3.13.jar

@@ -8,7 +8,7 @@ To support large files, SeaweedFS supports these two kinds of files:
This piece of code shows the JSON file structure:
https://github.com/seaweedfs/seaweedfs/blob/master/weed/operation/chunked_file.go#L24
```
type ChunkInfo struct {

@@ -134,7 +134,7 @@ will also generate a "pictures" collection and a "documents" collection if they
The actual data files have the collection name as a prefix, e.g., "pictures_1.dat", "documents_3.dat".
If you need to delete them later, see https://github.com/seaweedfs/seaweedfs/wiki/Master-Server-API#delete-collection
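A sketch of the delete-collection call against the master (default port 9333; the collection name follows the example above):
```
curl "http://localhost:9333/col/delete?collection=pictures&pretty=y"
```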
## Logging

@@ -14,7 +14,7 @@ We will use 2 servers. Server 1 will host master, 2x volumes (2 disks, one volum
# todo: use 2 step build process, copy over weed binary to fresh container (do not need curl and tar at runtime)
FROM alpine
RUN apk update && apk add wget tar
RUN wget https://github.com/seaweedfs/seaweedfs/releases/download/3.13/linux_amd64_large_disk.tar.gz
RUN tar -xf linux_amd64_large_disk.tar.gz
RUN chmod +x weed
RUN mv weed /usr/bin/
@@ -52,7 +52,7 @@ services:
```
5. `docker-compose up` on both servers and check that the master sees the volume
6. Follow the [security guide](https://github.com/seaweedfs/seaweedfs/wiki/Security-Configuration) to add secrets and certs. Scaffold the `security.toml` file and generate certs; in this example, all certs are in the `certs/` folder. Update the `docker-compose.yml` of the master server:
```yml
version: '3.7'
services:

@@ -5,7 +5,7 @@ Here are some tests on one single computer. However, if you need more performanc
# Test with Warp
https://github.com/minio/warp
Warp is a more complete test suite. It needs identity access management, so you would need to configure authentication info with `weed s3 -conf=...` or `weed filer -s3 -s3.conf=...`. See https://github.com/seaweedfs/seaweedfs/wiki/Amazon-S3-API#authentication
Here are the results from my local laptop written to an external SSD via USB 3.1:
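A sketch of a warp run against a local S3 gateway (the endpoint and keys are assumptions; use the credentials from your `s3.conf`):
```
warp mixed --host=localhost:8333 --access-key=some_access_key1 --secret-key=some_secret_key1
```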

@@ -2,7 +2,7 @@
## Can not upload due to "no free volumes left"
The symptom is similar to https://github.com/seaweedfs/seaweedfs/issues/1631 where the logs show
```
Nov 20 18:49:37 s2375.j weed[31818]: E1120 18:49:37 31818 filer_server_handlers_write.go:42] failing to assign a file id: rpc error: code = Unknown desc = No free volumes left!
Nov 20 18:49:37 s2375.j weed[31818]: I1120 18:49:37 31818 common.go:53] response method:PUT URL:/buckets/dev-passport-video-recordings/02342a46-7435-b698-2437-c778db34ef59.mp4 with httpStatus:500 and JSON:{"error":"rpc error: code = Unknown desc = No free volumes left!"}
@@ -24,14 +24,14 @@ This will add 1 physical volume when existing volumes are full. If using replica
fs.configure -locationPrefix=/buckets/ -replication=001 -volumeGrowthCount=2 -apply
```
See https://github.com/seaweedfs/seaweedfs/wiki/Path-Specific-Configuration
## How to speed up bucket deletion?
One common unexpected problem is that deletion can be slow. To delete a file, we need to delete the file content on the volume servers and delete the file entry from the filer store. It is almost the same amount of work as adding a file. If there are millions of files, it can take a long time to delete.
When you need to create large buckets and delete them often, you may choose `leveldb3` as the filer store, or any other store that supports **Fast Bucket Deletion** in https://github.com/seaweedfs/seaweedfs/wiki/Filer-Stores
`leveldb3` can automatically create a separate LevelDB instance for each bucket.
So bucket deletion is as simple as deleting the LevelDB instance files and the collection of volume files.

@@ -2,12 +2,12 @@ There are a few external Java libraries available. But actually SeaweedFS alread
Here is a SeaweedFS Java API implementation refactored out of the existing code.
https://github.com/seaweedfs/seaweedfs/tree/master/other/java/examples/src/main/java/com/seaweedfs/examples
# Build Java Client Jar
```
$ cd $GOPATH/src/github.com/seaweedfs/seaweedfs/other/java/client
$ mvn install
```

@@ -82,8 +82,8 @@ To enable JWT-based access control for the Filer,
If `jwt.filer_signing.key` is configured: When sending upload/update/delete HTTP operations to a filer server, the request header `Authorization` should be the JWT string (`Authorization: Bearer [JwtToken]`). The operation is authorized after the filer validates the JWT with `jwt.filer_signing.key`.
The JwtToken can be generated by calling `security.GenJwtForFilerServer(signingKey SigningKey, expiresAfterSec int)` in the `github.com/seaweedfs/seaweedfs/weed/security` package.
https://github.com/seaweedfs/seaweedfs/blob/9b941773805400c520558d83aed633adc821988c/weed/security/jwt.go#L53
If `jwt.filer_signing.read.key` is configured: When sending GET or HEAD requests to a filer server, the request header `Authorization` should be the JWT string (`Authorization: Bearer [JwtToken]`). The operation is authorized after the filer validates the JWT with `jwt.filer_signing.read.key`.
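For example, a sketch of an authorized write once a token has been generated (the target path and file are placeholders):
```
JWT=...   # a token generated with security.GenJwtForFilerServer in your own tooling
curl -H "Authorization: Bearer $JWT" -F file=@photo.jpg "http://localhost:8888/path/to/"
```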

@@ -41,7 +41,7 @@ And then you can configure your Prometheus to crawl them periodically.
# Dashboard
The dashboard is shared at https://github.com/seaweedfs/seaweedfs/blob/master/other/metrics/grafana_seaweedfs.json
If you modify the dashboard, please share your revisions.
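To expose the metrics endpoints for scraping, a hedged sketch (the `-metricsPort` flag name is an assumption; check `weed master -h` for your version):
```
weed master -metricsPort=9324
curl http://localhost:9324/metrics
```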

@ -35,7 +35,7 @@ Example (within the weed shell):
fs.configure -locationPrefix=/buckets/ssd_ -disk=ssd -apply fs.configure -locationPrefix=/buckets/ssd_ -disk=ssd -apply
``` ```
https://github.com/chrislusf/seaweedfs/wiki/Path-Specific-Configuration https://github.com/seaweedfs/seaweedfs/wiki/Path-Specific-Configuration
# Custom Tags # Custom Tags

@@ -39,7 +39,7 @@ curl -H "Content-Type:image/png" -F file=@myImage.png http://127.0.0.1:8080/5,27
The simple way is to front all master and volume servers with a firewall.
The following white list option is deprecated. Please follow https://github.com/seaweedfs/seaweedfs/wiki/Security-Overview
A white list option can be used. Only traffic from the white list IP addresses has write permission.

@@ -29,7 +29,7 @@ curl -F file=@/home/chris/myphoto.jpg http://127.0.0.1:8080/3,01637037d6
{"size": 43234}
```
The size returned is the size stored on SeaweedFS; sometimes the file is automatically gzipped based on the file extension or mime type [(see when compression will be applied automatically)](https://github.com/seaweedfs/seaweedfs/blob/c42b95c596f762dcca2bc9c7e7a918ab8ca8b206/weed/util/compression.go#L111).
| URL Parameter | Description | Default |
| ---- | -- | -- |
@@ -150,4 +150,3 @@ curl "http://localhost:8080/status?pretty=y"
]
}
```

@@ -37,8 +37,8 @@ And modify the configuration at runtime:
1. change the spark-defaults.conf
```
spark.driver.extraClassPath=/Users/chris/go/src/github.com/seaweedfs/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-3.13.jar
spark.executor.extraClassPath=/Users/chris/go/src/github.com/seaweedfs/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-3.13.jar
spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem
```
@@ -81,8 +81,8 @@ spark.history.fs.cleaner.enabled=true
spark.history.fs.logDirectory=seaweedfs://localhost:8888/spark2-history/
spark.eventLog.dir=seaweedfs://localhost:8888/spark2-history/
spark.driver.extraClassPath=/Users/chris/go/src/github.com/seaweedfs/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-3.13.jar
spark.executor.extraClassPath=/Users/chris/go/src/github.com/seaweedfs/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-3.13.jar
spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem
spark.hadoop.fs.defaultFS=seaweedfs://localhost:8888
```

@@ -127,6 +127,6 @@ docker run \
"fs.configure -locationPrefix=/buckets/foo -volumeGrowthCount=3 -replication=002 -apply"
```
Here `shell` selects the [Docker image entrypoint](https://github.com/seaweedfs/seaweedfs/blob/master/docker/entrypoint.sh#L60-L64).
The arguments are `fs.configure -locationPrefix=/buckets/foo -volumeGrowthCount=3 -replication=002 -apply`.