Updated Super Large Directories (markdown)

Chris Lu 2020-12-22 02:54:12 -08:00
parent ce08c9ef61
commit 2844d09e6f

@@ -11,7 +11,7 @@ For example, for Cassandra filer store, each entry has this schema:
PRIMARY KEY (directory, name)
) WITH CLUSTERING ORDER BY (name ASC);
```
The directory is the partitioning key, so entries with the same directory are partitioned to the same data node. This is fine for most cases. However, if there are billions of child entries for one directory, the data node would not perform well.
The directory is the partitioning key, so entries with the same directory are partitioned to the same data node. This is fine for most cases. However, if there are billions of direct child entries under one directory, the data node would not perform well.
This is actually a common case when user names, IDs, or UUIDs are used as child entries. Usually a separate index is built to translate names to file ids, and the file id is then used to access the data directly, giving up all the convenience of a file system.
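
To illustrate why a single huge directory concentrates load, here is a minimal sketch of partitioning by the `directory` column. It is not Cassandra's actual token ring: `NUM_NODES`, `node_for`, and `placement` are hypothetical names, and an MD5-based hash stands in for the real partitioner.

```python
import hashlib
from collections import Counter

NUM_NODES = 4  # hypothetical cluster size, chosen only for illustration


def node_for(partition_key: str) -> int:
    """Map a partition key to a node index with a stable hash.

    This is a simplified stand-in for a real partitioner.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


def placement(entries):
    """Count entries per node when `directory` alone is the partition key."""
    return Counter(node_for(directory) for directory, _name in entries)


# Every child of "/users" shares the same partition key, so all of them
# land on one node no matter how many entries there are.
entries = [("/users", f"user-{i}") for i in range(1000)]
counts = placement(entries)
print(counts)
```

Because the node is chosen from the directory alone, the child name never influences placement, so adding more nodes does not spread out the entries of one directory.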