Hi! I have a 4-node Ceph cluster. After installing the latest patch, I rebooted all nodes one at a time while Ceph was rebalancing (status green). My Ceph version is 19.2.2 (stable). But after rebooting the last node, all disks became unavailable.
```
ceph -v
ceph version 19.2.2 (72a09a98429da13daae8e462abda408dc163ff75) squid (stable)
ceph -s
cluster:
id: 71bc13b0-e73f-4db8-8d09-d68ffdd1c306
health: HEALTH_WARN
4 osd down
1 host (4 osds) down
Degraded data redundancy: 434207/1792488 objects degraded (24.224%), 186 pgs degraded, 186 pgs undersized
services:
mon: 4 daemons, quorum pkx1,pkx2,pkx3,pkx4 (age 12m)
mgr: pkx1(active, since 90m), standbys: pkx4, pkx2, pkx3
osd: 16 osds: 12 up (since 7m), 16 in (since 44m); 2 remapped pgs
data:
pools: 2 pools, 257 pgs
objects: 597.50k objects, 2.3 TiB
usage: 6.7 TiB used, 21 TiB / 28 TiB avail
pgs: 434207/1792488 objects degraded (24.224%)
2285/1792488 objects misplaced (0.127%)
186 active+undersized+degraded
69 active+clean
2 active+clean+remapped
io:
client: 582 MiB/s rd, 11 MiB/s wr, 217 op/s rd, 1.02k op/s wr
ceph health detail
HEALTH_WARN 4 osds down; 1 host (4 osds) down; Degraded data redundancy: 434208/1792491 objects degraded (24.224%), 186 pgs degraded, 186 pgs undersized
[WRN] OSD_DOWN: 4 osds down
osd.15 (root=default,host=pkx3) is down
osd.16 (root=default,host=pkx3) is down
osd.17 (root=default,host=pkx3) is down
osd.18 (root=default,host=pkx3) is down
[WRN] OSD_HOST_DOWN: 1 host (4 osds) down
host pkx3 (root=default) (4 osds) is down
[WRN] PG_DEGRADED: Degraded data redundancy: 434208/1792491 objects degraded (24.224%), 186 pgs degraded, 186 pgs undersized
pg 3.b7 is active+undersized+degraded, acting [20,3]
pg 3.b8 is stuck undersized for 8m, current state active+undersized+degraded, last acting [19,5]
pg 3.b9 is stuck undersized for 8m, current state active+undersized+degraded, last acting [7,20]
pg 3.ba is stuck undersized for 8m, current state active+undersized+degraded, last acting [1,20]
pg 3.bb is stuck undersized for 8m, current state active+undersized+degraded, last acting [20,4]
pg 3.bc is stuck undersized for 8m, current state active+undersized+degraded, last acting [20,0]
pg 3.bd is stuck undersized for 8m, current state active+undersized+degraded, last acting [5,21]
pg 3.be is stuck undersized for 8m, current state active+undersized+degraded, last acting [2,5]
pg 3.bf is stuck undersized for 8m, current state active+undersized+degraded, last acting [2,4]
pg 3.c0 is stuck undersized for 8m, current state active+undersized+degraded, last acting [21,1]
pg 3.c3 is stuck undersized for 8m, current state active+undersized+degraded, last acting [21,1]
pg 3.c4 is stuck undersized for 8m, current state active+undersized+degraded, last acting [22,4]
pg 3.c5 is stuck undersized for 8m, current state active+undersized+degraded, last acting [5,22]
pg 3.e6 is stuck undersized for 8m, current state active+undersized+degraded, last acting [6,0]
pg 3.f8 is stuck undersized for 8m, current state active+undersized+degraded, last acting [6,20]
pg 3.fd is stuck undersized for 8m, current state active+undersized+degraded, last acting [6,19]
```
What is going on?
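To make the output above easier to read at a glance, here is a rough sketch that groups the down OSDs by host from `ceph health detail`-style text. It only reads text on stdin and relies on the `osd.N (root=...,host=...) is down` line format shown above. `summarize_down_osds` is a name I made up for illustration, not a Ceph command.

```shell
# Hypothetical helper: group "osd.N (...,host=X) is down" lines by host.
# Reads `ceph health detail`-style text on stdin; it does not call ceph itself.
summarize_down_osds() {
  awk '
    /^ *osd\.[0-9]+ .*is down/ {
      # Pull the host name out of the "host=..." part of the CRUSH location.
      if (match($0, /host=[^,)]+/)) {
        host = substr($0, RSTART + 5, RLENGTH - 5)
        count[host]++
        list[host] = (host in list ? list[host] " " : "") $1
      }
    }
    END {
      for (h in count) printf "%s: %d down (%s)\n", h, count[h], list[h]
    }
  '
}

# Usage on a live cluster (assumes the ceph CLI and an admin keyring are available):
#   ceph health detail | summarize_down_osds
```

With the output I pasted, this should report all four down OSDs under the single host pkx3.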
