Ceph storage interruption cpk
Just before 23:09, a large number of Ceph storage nodes became unreachable. This seems to have been caused by one of the redundant links between two datacenter locations failing for about 4 seconds. The link failure led to many Ceph OSD processes being killed and not starting again. A generic configuration change made for all our servers had created an extra interface, which confused some of the OSD processes (depending on interface ordering) when starting up....