Monitoring DO network stability
Every once in a while, either because of unforeseen networking issues or because of DO maintenance that I forgot to schedule as downtime in Nagios, I get a blast of alerts as many connectivity checks time out.
Is there a good endpoint to monitor within DO’s ownership that can serve as a way to keep Nagios noise down?
My current plan is to make all of my monitored hosts dependent on an upstream DO “host”, so that if the top-level one fails, I get one notification that my droplet endpoints may be impacted, but no separate notifications for each individual endpoint being unreachable.
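A minimal sketch of what I have in mind, using the Nagios `parents` directive. The host names and addresses are placeholders (the addresses come from the 192.0.2.0/24 documentation range), the `generic-host` template is assumed to exist as in the sample configs, and the upstream address is exactly the piece I don't have yet:

```cfg
# Hypothetical upstream "host" standing in for the DO region.
# The address is a placeholder; the real endpoint is what I'm asking about.
define host {
    use         generic-host        ; assumes the template from the sample configs
    host_name   do-upstream         ; placeholder name
    alias       DO regional upstream
    address     192.0.2.1           ; placeholder, not a real DO endpoint
}

# A droplet that sits "behind" the upstream host.
define host {
    use                   generic-host
    host_name             droplet-web-01   ; placeholder name
    address               192.0.2.10       ; placeholder address
    parents               do-upstream      ; a failed check becomes UNREACHABLE, not DOWN, while the parent is down
    notification_options  d,r              ; notify on DOWN and RECOVERY only; omitting "u" skips UNREACHABLE
}
```

With this layout, an upstream outage should produce one DOWN notification for `do-upstream`, while the droplets behind it go UNREACHABLE without notifying.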
This problem happens both when monitoring from the inside out and from the outside in. Cross-region checks against droplets can report them as “down” because of cascading failures, and the monitoring droplet's checks against non-DO-hosted targets also fail because of network turbulence.
Is there any DO-level endpoint I can monitor to work as an overall “regional status health” item, that can inform individual DO connectivity statuses?