FiLeStore
fls is a tool for easily, efficiently, and reliably storing your files across a pool of multiple disks, servers, racks, zones, regions, or even datacenters.
What is the state of the project?
This project is very early in its development and is no more than an experiment at this point. It lacks many features needed to make it useful, and many more that would make it "good".
Do not use it to store any data you care about.
TODO
- Chunk file validation
- Chunk file repair/rebuilding
- Input file reconstruction (with data validation)
- Input file reconstruction (with missing chunk files/shards, without rebuilding)
- Networking features
  - Chunk storage
  - Tracking health of stored chunks
  - Rebuilding lost chunks
  - Balancing of chunks
- Filesystem features
  - FUSE mount of network filesystem
  - Chunk storage
- Properly organize code, unify logic
IN-PROGRESS
- Networking features
- Chunk storage
- Basic functionality (mostly working)
- Chunk storage
DONE
- Chunk file generation (data + parity)
- Input file reconstruction (requires all data chunks, does not validate reconstructed data)
How does it work?
Files are striped (with a configurable stripe width, 10MiB by default) across a configurable number of data chunks (10 by default). Parity chunks (4 by default) are then generated with Reed-Solomon erasure coding. Chunks can be stored anywhere you can put a file. If the chunks are distributed across enough disks/servers/whatever, it is possible to recover from the loss of as many chunk files as there are parity chunks: by default, any 4 of the 14 data or parity chunk files can be lost while maintaining data availability.
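The arithmetic behind those defaults can be sketched in a few lines of Go. This is only an illustration of the general erasure-coding property (any loss of up to the parity count is survivable), not code from fls; the `Layout` type and its methods are hypothetical names invented for this example, and padding and metadata overhead are ignored.

```go
package main

import "fmt"

// Layout is a hypothetical description of an erasure-coding configuration.
// The type and field names are illustrative and are not taken from fls.
type Layout struct {
	DataChunks   int // number of data chunks (10 by default per this README)
	ParityChunks int // number of parity chunks (4 by default per this README)
}

// TotalChunks returns how many chunk files are produced per input file.
func (l Layout) TotalChunks() int {
	return l.DataChunks + l.ParityChunks
}

// Overhead returns stored bytes divided by input bytes,
// ignoring any padding or metadata.
func (l Layout) Overhead() float64 {
	return float64(l.TotalChunks()) / float64(l.DataChunks)
}

// Recoverable reports whether the input can still be reconstructed when
// `lost` chunk files (data or parity, in any combination) are missing.
func (l Layout) Recoverable(lost int) bool {
	return lost >= 0 && lost <= l.ParityChunks
}

func main() {
	l := Layout{DataChunks: 10, ParityChunks: 4}
	fmt.Printf("chunks=%d overhead=%.2fx\n", l.TotalChunks(), l.Overhead())
	for lost := 0; lost <= 5; lost++ {
		fmt.Printf("lost=%d recoverable=%t\n", lost, l.Recoverable(lost))
	}
}
```

With the defaults, this works out to 14 chunk files and roughly 1.4x the input size on disk, compared with 2x or more for full file replication.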
Why?
For fun, and to solve a specific problem I have with existing options for distributed replicated file systems. Some are overly complex. Some are difficult to administer. Some scale poorly. Some don't have adequate data integrity features. Some require full file replication. The primary goal of this project is reliable file storage; hopefully all of these shortcomings and more will be addressed for this specific problem space.
Notes
Deps
- protoc
Go executable deps (installed into local/bin)
GOBIN=`pwd`/local/bin go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
GOBIN=`pwd`/local/bin go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest