Kegan Myers 327ee10d39 Add initial implementation of store server		2022-08-28 01:04:36 -05:00
cmd	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
local	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
pkg	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
proto	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
.gitignore	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
go.mod	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
go.sum	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
main.go	initial commit	2022-08-23 21:56:32 -05:00
Makefile	Add initial implementation of store server	2022-08-28 01:04:36 -05:00
README.md	Add initial implementation of store server	2022-08-28 01:04:36 -05:00

README.md

FiLeStore

fls is a tool for easily, efficiently, and reliably storing your files across a pool of multiple disks, servers, racks, zones, regions, or even datacenters.

What is the state of the project?

This project is very early in its development and is no more than an experiment at this point. It lacks many features needed to make it useful, and many more that would make it "good".

Do not use it to store any data you care about.

TODO

Chunk file validation
Chunk file repair/rebuilding
Input file reconstruction (with data validation)
Input file reconstruction (with missing chunk files/shards, without rebuilding)
Networking features
- Chunk storage
  - Tracking health of stored chunks
  - Rebuilding lost chunks
  - Balancing of chunks
- Filesystem features
- FUSE mount of network filesystem
Properly organize code, unify logic

IN-PROGRESS

Networking features
- Chunk storage
  - Basic functionality (mostly working)

DONE

Chunk file generation (data + parity)
Input file reconstruction (requires all data chunks, does not validate reconstructed data)

How does it work?

Files are striped (with a configurable stripe width, 10 MiB by default) across a configurable number of data chunks (10 by default). Parity chunks (4 by default) are generated with Reed-Solomon erasure coding. Chunks can be stored anywhere you can put a file. If the chunks are distributed across enough disks, servers, or other failure domains, it is possible to recover from the loss of as many chunk files as there are parity chunks: by default, any 4 data or parity chunk files can be lost while maintaining data availability.
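The arithmetic behind those defaults can be sketched in Go. This is an illustration only, not code from fls: the Layout type, its field names, and the round-robin placement of stripe units are assumptions made for the example, and the real on-disk layout may differ.

```go
package main

import "fmt"

// Layout is a hypothetical description of an erasure-coded layout.
// The names are illustrative and are not taken from the fls source.
type Layout struct {
	DataChunks   int   // number of data chunks (README default: 10)
	ParityChunks int   // number of parity chunks (README default: 4)
	StripeWidth  int64 // bytes per stripe unit (README default: 10 MiB)
}

// Overhead returns stored bytes divided by original bytes,
// ignoring padding of the final stripe.
func (l Layout) Overhead() float64 {
	return float64(l.DataChunks+l.ParityChunks) / float64(l.DataChunks)
}

// Recoverable reports whether the data stays available after losing
// `lost` chunk files (data or parity, in any combination).
func (l Layout) Recoverable(lost int) bool {
	return lost >= 0 && lost <= l.ParityChunks
}

// Locate maps a byte offset in the input file to a data chunk index and an
// offset inside that chunk, ASSUMING stripe units are dealt out round-robin.
func (l Layout) Locate(offset int64) (chunk int, chunkOffset int64) {
	unit := offset / l.StripeWidth         // which stripe unit holds the byte
	chunk = int(unit % int64(l.DataChunks)) // chunk that unit lands in
	row := unit / int64(l.DataChunks)       // full rows written before it
	chunkOffset = row*l.StripeWidth + offset%l.StripeWidth
	return chunk, chunkOffset
}

func main() {
	l := Layout{DataChunks: 10, ParityChunks: 4, StripeWidth: 10 << 20}
	fmt.Printf("storage overhead: %.1fx\n", l.Overhead())
	fmt.Println("survives 4 lost chunks:", l.Recoverable(4))
	fmt.Println("survives 5 lost chunks:", l.Recoverable(5))
	chunk, off := l.Locate(100<<20 + 5)
	fmt.Printf("byte at 100 MiB + 5 -> chunk %d, offset %d\n", chunk, off)
}
```

With the default 10 data and 4 parity chunks, the stored size is 14/10 of the input, compared with 2x or more for full file replication offering less loss tolerance.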

Why?

For fun, and to solve a specific problem I have with existing options for distributed replicated file systems. Some are overly complex. Some are difficult to administer. Some scale poorly. Some don't have adequate data integrity features. Some require full file replication. The primary goal of this project is reliable file storage; hopefully all of these shortcomings and more will be addressed for this specific problem space.

Notes

Deps

protoc

Go executable dependencies

GOBIN=`pwd`/local/bin go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
GOBIN=`pwd`/local/bin go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest