Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons every 6 seconds by default. You can change this heartbeat interval by adding an "osd_heartbeat_interval" setting under the [osd] section of the Ceph configuration file, or by setting the value at runtime.
If a neighboring Ceph OSD Daemon doesn't send a heartbeat reply within a grace period (20 seconds by default), the checking Ceph OSD Daemon considers the neighboring OSD "down" and reports it to a Ceph Monitor. You can change this grace period by adding an "osd_heartbeat_grace" setting under the [osd] section of the Ceph configuration file, or by setting the value at runtime. By default, once the unresponsive OSD has been reported down three times, the Ceph Monitor acknowledges the reports and marks that OSD as down.
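As a sketch, both timers could be adjusted together in ceph.conf (the values shown here are the defaults mentioned above; pick values appropriate for your network):

```ini
[osd]
# Interval (seconds) between heartbeat pings to peer OSDs
osd_heartbeat_interval = 6
# How long (seconds) to wait for a reply before reporting a peer as down
osd_heartbeat_grace = 20
```

At runtime, the same options can typically be changed without a restart, e.g. with the Jewel-era injectargs mechanism: ceph tell osd.* injectargs '--osd_heartbeat_grace 25' (newer releases also offer the ceph config set command).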
The Ceph monitor will update the cluster map and send it to all participating nodes in the cluster.
When an OSD can’t reach another OSD for a heartbeat, it reports the following in the OSD logs:
osd.15 1497 heartbeat_check: no reply from osd.14 since back 2016-02-28 17:29:44.013402
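The detection logic behind that log line can be illustrated with a small sketch (this is not Ceph source code, just a hypothetical model of the check): each OSD remembers when a peer last replied, and any peer silent for longer than the grace period is flagged for reporting.

```python
import time

# Default grace period in seconds, as described above.
OSD_HEARTBEAT_GRACE = 20

def check_peers(last_reply, now, grace=OSD_HEARTBEAT_GRACE):
    """Return peers whose last heartbeat reply is older than the grace period.

    last_reply maps a peer OSD id to the timestamp of its last reply.
    """
    return [peer for peer, t in last_reply.items() if now - t > grace]

# Example: osd.14 last replied 25 s ago (past grace), osd.16 only 5 s ago.
now = time.time()
last_reply = {"osd.14": now - 25, "osd.16": now - 5}
print(check_peers(last_reply, now))  # → ['osd.14']
```

A peer returned by this check would be reported to the monitors, which is what the "no reply from osd.14" log entry above corresponds to.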
Note: Starting with the Ceph Jewel release, the Ceph monitors require, by default, that at least 2 OSDs on different nodes in different CRUSH subtrees report a specific OSD as down before it is actually marked "down". This behavior is controlled by the following configuration flags:
- mon_osd_min_down_reporters: the number of OSDs from different subtrees that need to report a down OSD for it to count.
- mon_osd_reporter_subtree_level: the level of the parent CRUSH bucket at which the reporters are counted.
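Assuming the Jewel-era defaults (2 reporters, counted at the "host" level of the CRUSH hierarchy), a ceph.conf fragment tuning these flags might look like:

```ini
[mon]
# Require reports from at least 2 distinct subtrees before marking an OSD down
mon_osd_min_down_reporters = 2
# Count reporters as distinct at the "host" bucket level of the CRUSH map
mon_osd_reporter_subtree_level = host
```

Raising the subtree level (e.g. to "rack") makes the monitors demand reports from more widely separated parts of the cluster before acting, which guards against a single failing node falsely reporting its peers down.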