Updated Super Large Directories (markdown)

Chris Lu 2020-12-29 15:00:26 -08:00
parent 4deb66fd0e
commit ea0dcd5f2e

@@ -1,8 +1,10 @@
# Why do you need a super large directory?
This is actually a common case. For example, entity ids such as user names, user ids, IP addresses, URLs, or UUIDs can be used as sub directory names, and the number of entity ids can be very large. Under each sub directory, more unstructured data can be colocated, such as user avatars, uploaded files, access logs, URL text, images, audio, and video.
You can manually translate the entity id to a file id with a separate lookup, and then use the file id to access the data. This is exactly what SeaweedFS does internally. However, the manual approach not only reinvents the wheel, but also gives up the conveniences of a file system, such as deeper directories.
Assume you are bootstrapping a startup with potentially millions of users, but currently only a few test accounts. You need to spend your time on meeting user requirements, not on designing data structures and schemas to store customer data for different cases. Instead of optimizing early, you can start with a folder for each account and move on. SeaweedFS can make this simple approach future-proof.
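
To make the contrast concrete, here is a minimal sketch of the two approaches in Python. It is only an illustration: `ManualLookupStore` and `entity_path` are hypothetical names, the in-memory dictionaries stand in for whatever lookup table and blob store you would otherwise build, and the sketch assumes each entity id is already safe to use as a single directory name. It does not call any SeaweedFS API.

```python
import posixpath


class ManualLookupStore:
    """Hypothetical manual approach: a separate lookup maps entity id to file id."""

    def __init__(self):
        self.index = {}    # (entity_id, name) -> file_id
        self.blobs = {}    # file_id -> bytes
        self.next_id = 1

    def put(self, entity_id, name, data):
        # Allocate a file id, store the data, and record the mapping yourself.
        file_id = str(self.next_id)
        self.next_id += 1
        self.blobs[file_id] = data
        self.index[(entity_id, name)] = file_id
        return file_id

    def get(self, entity_id, name):
        # Two steps: look up the file id first, then fetch the data.
        return self.blobs[self.index[(entity_id, name)]]


def entity_path(root, entity_id, *parts):
    """Folder-per-entity approach: the path itself is the lookup."""
    return posixpath.join(root, entity_id, *parts)


store = ManualLookupStore()
store.put("alice", "avatar.png", b"...")
print(store.get("alice", "avatar.png"))

# Deeper directories come for free with plain paths.
print(entity_path("/users", "alice", "uploads", "2020", "report.pdf"))
```

With the folder-per-entity layout there is no index to design or keep consistent. The trade-off is that the parent directory, `/users` in this sketch, may eventually hold a very large number of sub directories.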
# Why is a super large directory challenging?