Commit graph

285 commits

Author SHA1 Message Date
Chris Lu a87f57e47c
Merge pull request #2868 from kmlebedev/hashicorp_raft
hashicorp raft
2022-04-10 23:00:05 -07:00
Konstantin Lebedev f5246b748d Merge branch 'new_master' into hashicorp_raft
# Conflicts:
#	weed/pb/master_pb/master.pb.go
2022-04-07 18:50:27 +05:00
shibinbin c20e1edd99 fix: master lose some volumes 2022-04-07 15:18:28 +08:00
chrislu bc888226fc erasure coding: tracking encoded/decoded volumes
If an EC shard is created but not spread to other servers, the masterclient would think this shard is not located here.
2022-04-05 19:03:02 -07:00
Konstantin Lebedev 14dd971890 hashicorp raft with state machine 2022-04-04 17:51:51 +05:00
Konstantin Lebedev c514710b7b initial add hashicorp raft 2022-04-04 13:50:56 +05:00
chrislu ae558fa073 log reasons volumes became unwritable 2022-03-21 00:41:44 -07:00
chrislu 57c6eddd22 avoid possible deadlock if volume layout is used in some logs 2022-03-21 00:04:01 -07:00
Konstantin Lebedev 9ea09cc41c healthz check to avoid drain pod with last replicas 2022-02-16 14:18:36 +05:00
Konstantin Lebedev 0ed76a0556 clearly 2022-02-14 14:10:06 +05:00
Konstantin Lebedev 36013f63ed https://github.com/chrislusf/seaweedfs/issues/2648 2022-02-14 13:59:12 +05:00
chrislu 433fde4b18 move error to a separate file
This file contains metric names for all errors
The naming convention is ErrorSomeThing = "error.some.thing"
2022-02-04 22:57:51 -08:00
Konstantin Lebedev 3f4e17aa24 error metrics for filer and store 2022-02-04 14:07:14 +05:00
chrislu 9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
Chris Lu b0665a15f4
Merge pull request #2527 from banjiaojuhao/master-assign-by-datanode 2021-12-21 08:56:51 -08:00
banjiaojuhao dda6b90d25 assign fileId according to DataNode with empty DataCenter and Rack 2021-12-21 17:28:33 +08:00
chrislu 5eacff9d4f log message adds server name
address https://github.com/chrislusf/seaweedfs/issues/2514#issuecomment-995925733
2021-12-16 10:46:26 -08:00
Chris Lu 3be3c17f59 volume vacuum: avoid timeout with streaming progress report
fix https://github.com/chrislusf/seaweedfs/issues/2396
2021-10-24 01:55:34 -07:00
Chris Lu e4830bd93d go fmt 2021-10-07 21:13:31 -07:00
Chris Lu 332d49432d reduce concurrent volume grow requests 2021-10-05 01:58:30 -07:00
Chris Lu 96119eab00 refactor 2021-10-05 00:40:04 -07:00
Chris Lu 8a66306064 calculate disk usage in case of race condition
related to https://github.com/chrislusf/seaweedfs/issues/2357
2021-10-04 23:32:07 -07:00
Chris Lu a067deaabc avoid possible modified location list
fix issue 1 of https://github.com/chrislusf/seaweedfs/issues/2345
2021-09-28 16:54:18 -07:00
Chris Lu 2789d10342 go fmt 2021-09-14 10:37:06 -07:00
Chris Lu e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu 574485ec69 better IP v6 support 2021-09-07 19:29:42 -07:00
Chris Lu 6923af7280 refactoring 2021-09-06 16:20:49 -07:00
Chris Lu e93d4935e3 add other replica locations when assigning volumes 2021-09-05 23:32:25 -07:00
Chris Lu 7a13816e94 refactor 2021-09-05 23:17:15 -07:00
Chris Lu 65af3cf4df master: disconnect only the phantom volume server
fix https://github.com/chrislusf/seaweedfs/issues/2311
2021-09-05 15:20:03 -07:00
Chris Lu 78e8ddf910 Only when tailing volume, the zero-ed cookie should skip checking.
This only happens when checkCookie == false and fsync == false.
2021-08-13 02:09:35 -07:00
Chris Lu d1d1fc772c move some volume lookup operations to grpc
jwt related lookup will come in next commit
2021-08-12 20:33:00 -07:00
Chris Lu 01336d71eb minor 2021-08-10 13:04:33 -07:00
Chris Lu eed26af266 Merge branch 'master' into add_remote_storage 2021-08-08 15:48:04 -07:00
Chris Lu 4370a4db63 use int64 for volume count in case of negative overflow 2021-08-08 15:19:39 -07:00
Chris Lu cb1dbd3135 refactor 2021-08-01 11:53:46 -07:00
Chris Lu b624090398 go fmt 2021-07-01 01:21:14 -07:00
Chris Lu d474ce6fe3 master: avoid repeated leader redirection
fix https://github.com/chrislusf/seaweedfs/issues/2146
2021-06-21 22:56:07 -07:00
Chris Lu 87a32bfef4 avoid possible nil when node is disconnected from its parent
fix https://github.com/chrislusf/seaweedfs/issues/2073
2021-05-19 10:02:01 -07:00
Chris Lu d2d36a3f9d master: avoid creating too many volumes
fix https://github.com/chrislusf/seaweedfs/issues/2062
2021-05-11 10:05:31 -07:00
Chris Lu 9a6aa00e9d avoid nil locations
fix https://github.com/chrislusf/seaweedfs/issues/2059
2021-05-10 02:39:52 -07:00
qieqieplus c4d32f6937 ahead of time volume assignment 2021-05-06 18:55:44 +08:00
Patrick Schmidt 7413d59750 Fix EC shard count logic
This fixes the calculation of the amount of EC shards a node holds.
Previously a global counter was increased, but also used inside the
loop to apply disk usage deltas. This led to wrong absolute numbers.
The fix is to apply only deltas of single EC shards per iteration.
2021-03-05 12:50:58 +01:00
Patrick Schmidt 5f7b024891 Show the real disk usage in stats calls
Currently the file size of only one volume location is taken into
account in the stats. This commit multiplies the disk usages by the
amount of nodes holding a replica of the volume.
This will yield the expected amount of disk usage and matches the
total size calculations from before.
2021-02-26 13:58:40 +01:00
Chris Lu 2270737344 volume: avoid fixed vacuum timeout for large volumes
1GB for 3 minutes, about 5.7MB/s
2021-02-22 12:52:37 -08:00
Chris Lu 565f7a6e72 Update data_node.go 2021-02-19 14:22:36 -08:00
Chris Lu a37473ae60 add back volume ids
address https://github.com/chrislusf/seaweedfs/issues/1792#issuecomment-782339576
2021-02-19 14:22:12 -08:00
Chris Lu c576ad04ac fix volume server display for volumes 2021-02-19 01:38:56 -08:00
Chris Lu 73958e357d add descriptive error if no free volumes 2021-02-18 19:10:20 -08:00
Chris Lu 3575d41009 go fmt 2021-02-17 20:57:08 -08:00
Chris Lu 6daa932f5c refactoring to get master function, instead of passing master values directly
this will enable retrying later
2021-02-17 20:55:55 -08:00
Chris Lu 68775d29e3 fix tests 2021-02-16 10:51:03 -08:00
Chris Lu b314d78e97 fix print 2021-02-16 10:48:28 -08:00
Chris Lu 53ca7e66ef avoid dead lock 2021-02-16 10:48:16 -08:00
Chris Lu 3097b9a9b7 fix existence checking 2021-02-16 05:59:43 -08:00
Chris Lu cb9cc29518 volume.list display; fix updating maxVolumeCount for disk 2021-02-16 03:55:24 -08:00
Chris Lu 3fe628f04e use hdd instead of empty string 2021-02-16 03:03:00 -08:00
Chris Lu f8446b42ab this can compile now!!! 2021-02-16 02:47:02 -08:00
Chris Lu 4bd8a692d8 disk type can be generic tags 2021-02-13 13:50:14 -08:00
Chris Lu 821c46edf1 Merge branch 'master' into support_ssd_volume 2021-02-09 11:37:07 -08:00
Chris Lu 1102ae32c4 fix concurrent map reads 2021-01-31 18:26:26 -08:00
Chris Lu 9c9ba3c209 nil related
related to https://github.com/chrislusf/seaweedfs/issues/1676
2021-01-03 12:25:58 -08:00
Chris Lu d9e8479c06 adjust UI max count 2020-12-17 13:47:51 -08:00
Chris Lu 3cdf5945a2 adjust UI 2020-12-17 13:37:00 -08:00
Chris Lu f696a2b2a7 assign volumes based on disk type 2020-12-17 13:25:05 -08:00
Chris Lu 1bf22c0b5b go fmt 2020-12-16 09:14:05 -08:00
Chris Lu 94525aa0fd allocate volume by disk type 2020-12-13 23:08:21 -08:00
Chris Lu a9db24cd05 master allocate volumes if ssd type runs out 2020-12-13 19:44:57 -08:00
Chris Lu 0d2ec832e2 rename from volumeType to diskType 2020-12-13 11:59:32 -08:00
Chris Lu 715b199eeb fix tests 2020-12-13 04:14:50 -08:00
Chris Lu d156c74ec0 volume server set volume type and heartbeat to the master 2020-12-13 03:11:24 -08:00
Chris Lu e9cd798bd3 adding volume type 2020-12-13 00:58:58 -08:00
Chris Lu 003b6245e7 fix nil 2020-12-02 00:09:19 -08:00
Chris Lu 965413c21b shell: add volume.vacuum command 2020-11-28 23:18:02 -08:00
Chris Lu c7ebadc25d avoid possible concurrent access inside ensureCorrectWritables() 2020-11-22 17:15:59 -08:00
Chris Lu 8cb8cd4cc5 add locks 2020-11-17 16:59:48 -08:00
Chris Lu c6bd244ebd add TODO 2020-11-11 12:51:27 -08:00
Chris Lu e0002f8dd7 check existing volumes for writable status 2020-10-24 01:34:31 -07:00
Chris Lu 720b1d9b88 adding locking to avoid nil VolumeLocationList
fix panic: runtime error: invalid memory address or nil pointer dereference
Oct 22 00:53:44 bedb-master1 weed[8055]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x17658da]
Oct 22 00:53:44 bedb-master1 weed[8055]: goroutine 310 [running]:
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/topology.(*VolumeLocationList).Length(...)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/topology/volume_location_list.go:35
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/topology.(*VolumeLayout).enoughCopies(...)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/topology/volume_layout.go:376
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/topology.(*VolumeLayout).ensureCorrectWritables(0xc000111d50, 0xc000b55438)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/topology/volume_layout.go:202 +0x5a
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/topology.(*Topology).SyncDataNodeRegistration(0xc00042ac60, 0xc001454d30, 0x1, 0x1, 0xc0005fc000, 0xc00135de40, 0x4, 0xc00135de50, 0x10, 0x10d, ...)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/topology/topology.go:224 +0x616
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/server.(*MasterServer).SendHeartbeat(0xc000162700, 0x23b97c0, 0xc000ae2c90, 0x0, 0x0)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/server/master_grpc_server.go:106 +0x325
Oct 22 00:53:44 bedb-master1 weed[8055]: github.com/chrislusf/seaweedfs/weed/pb/master_pb._Seaweed_SendHeartbeat_Handler(0x1f8e7c0, 0xc000162700, 0x23b0a60, 0xc00024b440, 0x3172c38, 0xc000ab7100)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/seaweedfs/weed/pb/master_pb/master.pb.go:4250 +0xad
Oct 22 00:53:44 bedb-master1 weed[8055]: google.golang.org/grpc.(*Server).processStreamingRPC(0xc0001f31e0, 0x23bb800, 0xc000ac5500, 0xc000ab7100, 0xc0001fea80, 0x311fec0, 0x0, 0x0, 0x0)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:1329 +0xcd8
Oct 22 00:53:44 bedb-master1 weed[8055]: google.golang.org/grpc.(*Server).handleStream(0xc0001f31e0, 0x23bb800, 0xc000ac5500, 0xc000ab7100, 0x0)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:1409 +0xc5c
Oct 22 00:53:44 bedb-master1 weed[8055]: google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0001ce8b0, 0xc0001f31e0, 0x23bb800, 0xc000ac5500, 0xc000ab7100)
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:746 +0xa5
Oct 22 00:53:44 bedb-master1 weed[8055]: created by google.golang.org/grpc.(*Server).serveStreams.func1
Oct 22 00:53:44 bedb-master1 weed[8055]: #011/root/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:744 +0xa5
Oct 22 00:53:44 bedb-master1 systemd[1]: weedmaster.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 22 00:53:44 bedb-master1 systemd[1]: weedmaster.service: Failed with result 'exit-code'.
2020-10-21 23:15:48 -07:00
Chris Lu 410b818aa7 master: avoid timer leakage 2020-10-19 14:24:57 -07:00
Chris Lu c7d7b1a0f6
Merge pull request #1485 from LIBA-S/fix_oversized
Correct the oversized state of volume after compaction
2020-09-23 19:24:30 -07:00
LIBA-S eecd6b5d35 Fix a race condition when handle VolumeLocationList 2020-09-23 20:56:51 +08:00
LIBA-S 0157798ebf Correct the oversized state of volume after compaction 2020-09-23 20:27:42 +08:00
Chris Lu 289e62a305 master: better locking of in memory volume data
related to https://github.com/chrislusf/seaweedfs/issues/1436#issuecomment-695880135
2020-09-20 23:07:55 -07:00
Chris Lu 6a92f0bc7a refactoring to typed Size
Go is amazing with refactoring!
2020-08-18 17:04:28 -07:00
Chris Lu 152a6cbc2b minor adjustments 2020-08-10 20:42:27 -07:00
cheng.li01 25fbff5d52 fix bug: two same volumeId in different collections
1, there will be two leader when master server startup in a few seconds
2, raft server will get a leader even there is only one master, so there is no need to do hard code to set the server to be leader
2020-08-10 16:37:47 +08:00
cheng.li01 dad1161c70 fix dn.volumes Iterate when write issue 2020-07-08 19:57:19 +08:00
Chris Lu e912fd15e3 renaming 2020-06-19 22:45:27 -07:00
Evgenii Kozlov 0e0db70f55 Set volumes ReadOnly if low free disk space 2020-06-05 18:18:15 +03:00
bingoohuang 1f8782a1ed try showing the first 100 volume ids and an extra ... 2020-05-29 16:15:33 +08:00
bingoohuang 1a642b9876 add Volume Ids column only for max 100 volumes for convenience in the master ui. 2020-05-29 15:37:58 +08:00
Chris Lu e4af63a721 volume server: accept fsync=true in write requests 2020-04-11 21:39:16 -07:00
James Hartig eae3f27c80 Added treat_replication_as_minimums master toml option 2020-04-01 19:08:48 -04:00
Chris Lu e39e78ea8d remove println 2020-03-22 18:37:12 -07:00
Chris Lu 35208711e5 logging 2020-03-22 18:32:49 -07:00
Chris Lu c3cb6fa1d7 volume: compaction can cause readonly volumes
address https://github.com/chrislusf/seaweedfs/issues/1233
2020-03-17 09:43:57 -07:00
Chris Lu 560df51def refactoring 2020-03-15 03:11:26 -07:00
Chris Lu 7edbee6f57 volume: proxy writes to remote volume server, with replication or not
the panic is triggered by uploading a file to a volume server not holding the designated replica.
2020-03-15 10:20:14.365488 I | http: panic serving 127.0.0.1:57124: runtime error: invalid memory address or nil pointer dereference
goroutine 119 [running]:
net/http.(*conn).serve.func1(0xc0001a8000)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:1772 +0x139
panic(0x2316fe0, 0x3662900)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/runtime/panic.go:973 +0x396
github.com/chrislusf/seaweedfs/weed/topology.getWritableRemoteReplications(0xc00009c000, 0x2, 0x7ffeefbffbd2, 0xe, 0x0, 0xa, 0x0, 0x0, 0xbb4bf1f7)
	/home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/topology/store_replicate.go:157 +0x53
github.com/chrislusf/seaweedfs/weed/topology.ReplicatedWrite(0x7ffeefbffbd2, 0xe, 0xc00009c000, 0xc000000002, 0xc000472750, 0xc0001b2200, 0x0, 0x1, 0x0)
	/home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/topology/store_replicate.go:29 +0xc7
github.com/chrislusf/seaweedfs/weed/server.(*VolumeServer).PostHandler(0xc0001513f0, 0x292bde0, 0xc0001fe2a0, 0xc0001b2200)
	/home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/server/volume_server_handlers_write.go:52 +0x56f
github.com/chrislusf/seaweedfs/weed/server.(*VolumeServer).privateStoreHandler(0xc0001513f0, 0x292bde0, 0xc0001fe2a0, 0xc0001b2200)
	/home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/server/volume_server_handlers.go:37 +0x21f
net/http.HandlerFunc.ServeHTTP(0xc0004420e0, 0x292bde0, 0xc0001fe2a0, 0xc0001b2200)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:2012 +0x44
net/http.(*ServeMux).ServeHTTP(0xc0001fc800, 0x292bde0, 0xc0001fe2a0, 0xc0001b2200)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:2387 +0x1a5
net/http.serverHandler.ServeHTTP(0xc0001781c0, 0x292bde0, 0xc0001fe2a0, 0xc0001b2200)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc0001a8000, 0x2934420, 0xc000212400)
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
	/home/travis/.gimme/versions/go1.14.linux.amd64/src/net/http/server.go:2933 +0x35c
Eg:
server A (datacenter 1) and server B (datacenter 2) hold replica (100) for volume 1.
If you upload a file with a key 1,xxxxx to server C (datacenter 3) will trigger the panic on server C.
The server C should either proxy upload file to the correct volume server or should return an HTTP error code and not panic.
2020-03-15 02:50:42 -07:00
Chris Lu d022b6bc0e fix compilation 2020-03-14 16:32:16 -07:00