Updated Super Large Directories (markdown)

Chris Lu 2020-12-22 01:54:58 -08:00
parent 91251b7d3e
commit 79ee7e9c41

@@ -4,7 +4,9 @@ If one super large directory has way too many files or sub folders, the file lis
For example, for Cassandra filer store, all the children in a directory are physically stored in one Cassandra data node. This is fine for most cases. However, if there are billions of child entries for one directory, the data node would not be able to query or even store the child list.
-This is actually a common case when user name, id, or UUID are used as sub folder or file names. We need a way to spread the data to all data nodes.
+This is actually a common case when user name, id, or UUID are used as sub folder or file names. Usually a separate index is built to translate names to file id, and use file id to access data directory, giving up all the convenience from a file system.
+We need a way to spread the data to all data nodes, without sacrificing too much.
# How it works?