FiLeStore

fls is a tool for easily, efficiently, and reliably storing your files across a pool of multiple disks, servers, racks, zones, regions, or even datacenters.

What is the state of the project?

This project is very early in its development and is no more than an experiment at this point. It lacks many features needed to make it useful, and many more that would make it "good".

Do not use it to store any data you care about.

TODO

  • Chunk file validation
  • Chunk file repair/rebuilding
  • Input file reconstruction (with data validation)
  • Input file reconstruction (with missing chunk files/shards, without rebuilding)
  • Networking features
    • Chunk storage
      • Tracking health of stored chunks
      • Rebuilding lost chunks
      • Balancing of chunks
    • Filesystem features
    • FUSE mount of network filesystem
  • Properly organize code, unify logic

IN-PROGRESS

  • Networking features
    • Chunk storage
      • Basic functionality (mostly working)

DONE

  • Chunk file generation (data + parity)
  • Input file reconstruction (requires all data chunks, does not validate reconstructed data)

How does it work?

Files are striped, with a configurable stripe width (10 MiB by default), across a configurable number of data chunks (10 by default), and parity chunks (4 by default) are generated with Reed-Solomon erasure coding. Chunks can be stored anywhere you can put a file. If the shards are distributed across enough disks/servers/whatever, it is possible to recover from the loss of as many chunk files as there are parity chunks (with the defaults, you can lose any 4 data or parity chunk files while maintaining data availability).
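To make the scheme concrete, here is a minimal sketch of the split-encode-reconstruct cycle for a single stripe. It is illustrative only, not the fls implementation: it assumes the widely used github.com/klauspost/reedsolomon package and a hypothetical chunk-file naming scheme, and it reads the whole input at once rather than streaming 10 MiB stripes.

// erasure_sketch.go — illustrative only; not how fls itself is implemented.
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/klauspost/reedsolomon"
)

const (
	dataChunks   = 10 // default number of data chunks
	parityChunks = 4  // default number of parity chunks
)

func main() {
	// Read one stripe's worth of input (a real implementation would stream
	// the file stripe by stripe).
	input, err := os.ReadFile("input.bin")
	if err != nil {
		log.Fatal(err)
	}

	enc, err := reedsolomon.New(dataChunks, parityChunks)
	if err != nil {
		log.Fatal(err)
	}

	// Split the stripe into 10 equally sized data shards (the last shard is
	// zero-padded), then compute the 4 parity shards.
	shards, err := enc.Split(input)
	if err != nil {
		log.Fatal(err)
	}
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Each shard can now be written anywhere a file can live.
	for i, shard := range shards {
		name := fmt.Sprintf("chunk-%02d.bin", i) // hypothetical naming scheme
		if err := os.WriteFile(name, shard, 0o644); err != nil {
			log.Fatal(err)
		}
	}

	// Later: with up to 4 shards lost (represented here by nil entries),
	// Reconstruct rebuilds them from the surviving shards.
	shards[0], shards[3], shards[7], shards[12] = nil, nil, nil, nil
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}
	ok, err := enc.Verify(shards)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("all shards verified:", ok)
}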

Why?

For fun. To solve a specific problem I have with existing options for distributed replicated file systems. Some are overly complex. Some are difficult to administer. Some scale poorly. Some don't have adequate data integrity features. Some require full file replication. The primary goal of this project is reliable file storage; hopefully all of these shortcomings and more will be addressed for this specific problem space.

Notes

Deps

  • protoc

go exe deps

GOBIN=`pwd`/local/bin go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
GOBIN=`pwd`/local/bin go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
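Code generation is driven by the Makefile; for reference, a typical standalone protoc invocation with the locally installed plugins would look something like the following (the proto file paths are illustrative, not the exact Makefile targets):

PATH=`pwd`/local/bin:$PATH protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative proto/*.proto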