Commit graph

186 commits

Author SHA1 Message Date
Andrew Garrett 082f67bfe9
Fix filer.backup local sink to propagate file mode changes (#4896) 2023-10-06 05:40:20 -07:00
Konstantin Lebedev 44906f1f3b
fix: avoid error file name too long when writing a file (#4876) 2023-09-27 05:40:51 -07:00
Lars Lehtonen 28a3a31b27
weed/replication/sub: fix dropped error (#4865) 2023-09-25 07:33:18 -07:00
chrislu 81fdf3651b grpc connection to filer add sw-client-id header 2023-01-20 01:48:12 -08:00
Chris Lu d4566d4aaa
more solid weed mount (#4089)
* compare chunks by timestamp

* fix slab clearing error

* fix test compilation

* move oldest chunk to sealed, instead of by fullness

* lock on fh.entryViewCache

* remove verbose logs

* revert slab clearing

* less logs

* less logs

* track write and read by timestamp

* remove useless logic

* add entry lock on file handle release

* use mem chunk only, swap file chunk has problems

* comment out code that may be used later

* add debug mode to compare data read and write

* more efficient readResolvedChunks with linked list

* small optimization

* fix test compilation

* minor fix on writer

* add SeparateGarbageChunks

* group chunks into sections

* turn off debug mode

* fix tests

* fix tests

* tmp enable swap file chunk

* Revert "tmp enable swap file chunk"

This reverts commit 985137ec47.

* simple refactoring

* simple refactoring

* do not re-use swap file chunk. Sealed chunks should not be re-used.

* comment out debugging facilities

* either mem chunk or swap file chunk is fine now

* remove orderedMutex as *semaphore.Weighted

not found impactful

* optimize size calculation for changing large files

* optimize performance to avoid going through the long list of chunks

* still problems with swap file chunk

* rename

* tiny optimization

* swap file chunk save only successfully read data

* fix

* enable both mem and swap file chunk

* resolve chunks with range

* rename

* fix chunk interval list

* also change file handle chunk group when adding chunks

* pick in-active chunk with time-decayed counter

* fix compilation

* avoid nil with empty fh.entry

* refactoring

* rename

* rename

* refactor visible intervals to *list.List

* refactor chunkViews to *list.List

* add IntervalList for generic interval list

* change visible interval to use IntervalList in generics

* change chunkViews to *IntervalList[*ChunkView]

* use NewFileChunkSection to create

* rename variables

* refactor

* fix renaming leftover

* renaming

* renaming

* add insert interval

* interval list adds lock

* incrementally add chunks to readers

Fixes:
1. set start and stop offset for the value object
2. clone the value object
3. use pointer instead of copy-by-value when passing to interval.Value
4. use insert interval since adding chunk could be out of order

* fix tests compilation

* fix tests compilation
2023-01-02 23:20:45 -08:00
chrislu 6ede19e825 add a simple file replication progress bar 2022-12-20 19:47:21 -08:00
chrislu 6c7fe40305 filer sink retries reading file chunks, skipping missing chunks
if the file chunk is not available during replication time, the file is skipped
2022-12-19 11:31:58 -08:00
chrislu 70a4c98b00 refactor filer_pb.Entry and filer.Entry to use GetChunks()
for later locking on reading chunks
2022-11-15 06:33:36 -08:00
chrislu ea2637734a refactor filer proto chunk variable from mtime to modified_ts_ns 2022-10-28 12:53:19 -07:00
chrislu 0d817bc347 fix invalid memory address or nil pointer dereference on filer.sync
fix https://github.com/seaweedfs/seaweedfs/issues/3826
2022-10-11 21:58:17 -07:00
chrislu ea271600ec fix parameters 2022-10-04 12:36:05 -07:00
chrislu 0452ae6a6c filer.sync: limit concurrency when fetching file chunks
fix https://github.com/seaweedfs/seaweedfs/issues/3787
2022-10-04 11:35:07 -07:00
chrislu b463ca1a2f filer replication: compare content changes directly
Fix https://github.com/seaweedfs/seaweedfs/issues/3714

The destination chunks may be empty. For example, the file is updated and the volume is vacuumed. In this case, the sync would miss the old chunks. This is fine. However, the entry would have correct metadata but missing chunks.

For this case, the simple metadata comparison would be wrongly skipping data changes, and the file will stay empty unless file content md5 is changed.
2022-09-20 08:35:10 -07:00
Ryan Russell d54eb9966f
refactor: Directory readability (#3665) 2022-09-14 10:11:31 -07:00
Ryan Russell d734fff322
docs: replicte -> replicate (#3664) 2022-09-14 10:01:18 -07:00
Ryan Russell dfaa602661
refactor(notification_kafka): parition -> partition (#3663) 2022-09-14 09:15:21 -07:00
chrislu cb6cf331ca filer.backup and filer.sync: include headers during backup and sync
fix https://github.com/seaweedfs/seaweedfs/issues/3532
2022-09-04 18:26:36 -07:00
dependabot[bot] 97d69d5336
Bump gocloud.dev/pubsub/rabbitpubsub from 0.25.0 to 0.26.0 (#3541)
* Bump gocloud.dev/pubsub/rabbitpubsub from 0.25.0 to 0.26.0

Bumps [gocloud.dev/pubsub/rabbitpubsub](https://github.com/google/go-cloud) from 0.25.0 to 0.26.0.
- [Release notes](https://github.com/google/go-cloud/releases)
- [Commits](https://github.com/google/go-cloud/compare/v0.25.0...v0.26.0)

---
updated-dependencies:
- dependency-name: gocloud.dev/pubsub/rabbitpubsub
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* update code

* more code fix

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2022-08-31 10:16:49 -07:00
chrislu cc0c8c5f81 simplify 2022-08-27 00:21:57 -07:00
chrislu 87b70a6809 clean up 2022-08-27 00:09:04 -07:00
chrislu c839ce1b19 s3 sink use s3 upload manager
fix https://github.com/seaweedfs/seaweedfs/issues/3531
2022-08-26 23:47:12 -07:00
askeipx 2e78a522ab
remove old raft servers if they don't answer to pings for too long (#3398)
* remove old raft servers if they don't answer to pings for too long

add ping durations as options

rename ping fields

fix some todos

get masters through masterclient

raft remove server from leader

use raft servers to ping them

CheckMastersAlive for hashicorp raft only

* prepare blocking ping

* pass waitForReady as param

* pass waitForReady through all functions

* waitForReady works

* refactor

* remove unneeded params

* rollback unneeded changes

* fix
2022-08-23 23:18:21 -07:00
chrislu 4081d50607 filer sink: retryable data chunk uploading 2022-08-20 19:09:15 -07:00
chrislu e3f40d538d cleaner code 2022-08-20 17:51:30 -07:00
chrislu 11f99836c3 filer.backup: backup small files if the file is saved in filer (saveToFilerLimit > 0)
fix https://github.com/seaweedfs/seaweedfs/issues/3468
2022-08-19 23:00:56 -07:00
chrislu 2b580a7566 also migrate jsonpb 2022-08-17 12:42:03 -07:00
chrislu eaeb141b09 move proto package 2022-08-17 12:05:07 -07:00
Konstantin Lebedev 4d08393b7c
filer prefer volume server in same data center (#3405)
* initial prefer same data center
https://github.com/seaweedfs/seaweedfs/issues/3404

* GetDataCenter

* prefer same data center for ReplicationSource

* GetDataCenterId

* remove glog
2022-08-04 17:35:00 -07:00
chrislu 26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
Konstantin Lebedev 7e09a548a6 exclude directories to sync on filer 2022-07-27 19:22:57 +05:00
Konstantin Lebedev 785223e587 rabbitpubsub enable durable 2022-07-06 10:05:29 +05:00
Konstantin Lebedev bcbdc4cb37 use const multipart uploads folder
avoid error bucket NotEmpty if multipart uploads folder exist
2022-06-29 16:21:16 +05:00
chrislu bff1ccc1de fix compilation 2022-05-11 00:52:15 -07:00
chrislu 139e039c44 filer.sync: pass attributes for mount
fix https://github.com/chrislusf/seaweedfs/issues/3012
2022-05-06 03:54:12 -07:00
chrislu 3885374edf conditionally build elastic, gocdk to reduce binary size 2022-04-21 01:10:46 -07:00
chrislu 80c017907b filer.backup: fix backing up encrypted chunks
I have done filer.backup test:
replication.toml:
[sink.local]
enabled = true
directory = "/srv/test"
___
system@dat1:/srv/test$ weed filer.backup -filer=app1:8888 -filerProxy
I0228 12:39:28 19571 filer_replication.go:129] Configure sink to local
I0228 12:39:28 19571 filer_backup.go:98] resuming from 2022-02-28 12:04:20.210984693 +0100 CET
I0228 12:39:29 19571 filer_backup.go:113] backup app1:8888 progressed to 2022-02-28 12:04:20.211726749 +0100 CET 0.33/sec

system@dat1:/srv/test$ ls -l
total 16
drwxr-xr-x 2 system system 4096 Feb 28 12:39 a
-rw-r--r-- 1 system system   48 Feb 28 12:39 fu.txt
-rw-r--r-- 1 system system   32 Feb 28 12:39 _index.html
-rw-r--r-- 1 system system   68 Feb 28 12:39 index.php
system@dat1:/srv/test$ cat fu.txt
?	?=?^??`?f^};?{4?Z%?X0=??rV????|"?1??踪~??
system@dat1:/srv/test$
On the active mount on the target server it's:
system@app1:/srv/app$ ls -l
total 2
drwxrwxr-x 1 system system  0 Feb 28 12:04 a
-rw-r--r-- 1 system system 20 Feb 28 12:04 fu.txt
-rw-r--r-- 1 system system  4 Feb 28 12:04 _index.html
-rw-r--r-- 1 system system 40 Feb 28 12:04 index.php
system@app1:/srv/app$ cat fu.txt
This is static boy!
Filer was started with: weed filer master="app1:9333,app2:9333,app3:9333" -encryptVolumeData
It seems like it's still encrypted?
2022-02-28 10:07:06 -08:00
elee 881a0fe806 ensure compatibility 2022-02-27 04:50:59 -06:00
elee 954ad98e0d set canned acl on replication create 2022-02-27 04:49:31 -06:00
chrislu 9405eaefdb filer.sync: fix replicating partially updated file
Run two servers with volumes and filers:
server -dir=Server1alpha -master.port=11000 -filer -filer.port=11001 -volume.port=11002
server -dir=Server1sigma -master.port=11006 -filer -filer.port=11007 -volume.port=11008

Run Active-Passive filer.sync:
filer.sync -a localhost:11007 -b localhost:11001 -isActivePassive

Upload file to 11007 port:
curl -F file=@/Desktop/9.xml "http://localhost:11007/testFacebook/"

If we request a file on two servers now, everything will be correct, even if we add data to the file and upload it again:
curl "http://localhost:11007/testFacebook/9.xml"
EQUALS
curl "http://localhost:11001/testFacebook/9.xml"

However, if we change data that already exists in the file (for example, by changing the first line and reducing its length), the file on the second server becomes invalid and no longer matches the first file.

Screenshot 2022-02-07 at 14 21 11

This problem occurs on line 202 in the filer_sink.go file. In particular, this is due to incorrect mapping of chunk names in the DoMinusChunks function. The names of deletedChunks do not match the chunks of existingEntry.Chunks, since the first chunks come from another server and have a different addressing (name) compared to the addressing on the server where the file is being overwritten.

Deleted chunks are not actually deleted on the server to which the file is replicated.
2022-02-07 03:46:28 -08:00
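The mismatch described in this report can be reproduced in miniature. The sketch below uses a simplified, hypothetical `chunk` type and made-up file IDs; it is not the SeaweedFS `DoMinusChunks` code. It shows that matching by file ID finds nothing when the two lists come from different servers. `matchByRange` is included only as an illustrative alternative key and is not necessarily how the commit fixed the problem.

```go
package main

import "fmt"

// chunk is a hypothetical, simplified stand-in for a file chunk record.
type chunk struct {
	FileID string // storage address, assigned independently by each server
	Offset int64
	Size   int64
}

// span identifies a chunk by the byte range it covers in the file.
type span struct{ Offset, Size int64 }

// matchByFileID returns the chunks in existing whose FileID appears in deleted.
func matchByFileID(existing, deleted []chunk) []chunk {
	ids := make(map[string]bool, len(deleted))
	for _, d := range deleted {
		ids[d.FileID] = true
	}
	var out []chunk
	for _, e := range existing {
		if ids[e.FileID] {
			out = append(out, e)
		}
	}
	return out
}

// matchByRange returns the chunks in existing that cover the same byte range
// as some chunk in deleted, ignoring the server-specific FileID.
func matchByRange(existing, deleted []chunk) []chunk {
	ranges := make(map[span]bool, len(deleted))
	for _, d := range deleted {
		ranges[span{d.Offset, d.Size}] = true
	}
	var out []chunk
	for _, e := range existing {
		if ranges[span{e.Offset, e.Size}] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Deleted chunk as named on the source server (made-up ID).
	sourceDeleted := []chunk{{FileID: "3,01ab", Offset: 0, Size: 100}}
	// The same data as stored on the target server under different IDs.
	targetExisting := []chunk{
		{FileID: "7,09cd", Offset: 0, Size: 100},
		{FileID: "7,09ce", Offset: 100, Size: 50},
	}
	fmt.Println(len(matchByFileID(targetExisting, sourceDeleted))) // 0: nothing gets deleted
	fmt.Println(len(matchByRange(targetExisting, sourceDeleted)))  // 1: the stale chunk is found
}
```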
chrislu affe3c2c12 change to util.WriteFile 2022-02-04 21:32:27 -08:00
chrislu 9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
Chris Lu ce2af0082e revert 2021-11-28 23:35:22 -08:00
Chris Lu 1c9f3c7ac0 read deleted chunks when replicating data 2021-11-28 23:34:34 -08:00
Eng Zer Jun a23bcbb7ec
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-14 12:27:58 +08:00
Chris Lu e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu 6923af7280 refactoring 2021-09-06 16:20:49 -07:00
Chris Lu 7ce97b59d8 go fmt 2021-09-01 02:45:42 -07:00
Chris Lu c08ac536ed cloud drive: add support for Wasabi
* disable md5, sha256 checking to avoid reading one chunk twice
* single threaded upload to avoid chunk swapping (to be enhanced later)
2021-08-25 17:34:29 -07:00
Chris Lu 7c39a18ba5 update azure library 2021-08-24 00:32:35 -07:00
Chris Lu 00c4e06caa cloud drive: s3 configurable force path style 2021-08-23 03:30:41 -07:00