diff --git a/Filer-Data-Encryption.md b/Filer-Data-Encryption.md
index 6d6800d..c858c25 100644
--- a/Filer-Data-Encryption.md
+++ b/Filer-Data-Encryption.md
@@ -6,9 +6,14 @@ For filer,
 However, there could be many volume servers. And the volumes may be tiered to the cloud. What if there are some security breach?
 
 ### Encrypt data on volume servers
-`weed filer -encryptVolumeData` is an option to encrypt the data on volume servers. The encryption key is randomly generated during write time, and is different for different files. The encryption key is stored as metadata in filer store.
+`weed filer -encryptVolumeData` is an option to encrypt the data on volume servers.
 
-So the volume data on the volume servers are encrypted and should be safe. As long as the filer store is not exposed, it is nearly impossible to guess the encryption keys for each file.
+The encryption keys are randomly generated during write time, and are different for different files. The encryption keys are stored as metadata in filer store.
+
+So the volume data on the volume servers are encrypted. As long as the filer store is not exposed, it is nearly impossible to guess the encryption keys for all the files.
+
+### Safe Data Storage
+Actually the volume servers do not have any concept of encryption. With the file content encrypted, it is safe to put volume servers anywhere you want. The volume servers are not visible to any unencrypted data, for either storage or transmission.
 
 ### Safely Forget Data
 Another side is, with GDPR, companies are required to "forget" customer data after some time. If the volume data is stored on a glacial storage system, it is cumbersome to dig them out and destroy them. It is much easier to just delete the metadata, and the volume data is automatically "destroyed".
@@ -16,7 +21,7 @@ Another side is, with GDPR, companies are required to "forget" customer data aft
 ### Encryption Algorithm
 The encryption is through GCM https://en.wikipedia.org/wiki/Galois/Counter_Mode
 
-There is one randomly generated cipher key of 256 bits for each file chunk. One file has one or many file chunks. By default the chunk size is 32MB. The cipher code is here https://github.com/chrislusf/seaweedfs/blob/master/weed/util/cipher.go
+There is one randomly generated cipher key of 256 bits for each file chunk. The cipher code is here https://github.com/chrislusf/seaweedfs/blob/master/weed/util/cipher.go
 
 ### Note
 The volume servers are agnostic to encryption. There are no encryption if you only use master and volume servers as an object store.