Every once in a while, either because of unforeseen networking issues or because of DO maintenance that I forgot to schedule as downtime in Nagios, I get a blast of alerts as a large number of connectivity checks time out.
Is there a good endpoint to monitor within DO’s ownership that can serve as a way to keep Nagios noise down?
My current plan is to make all of my monitored hosts dependent on the upstream DO “host” so that if the top level one fails, I get one notification that my droplet endpoints may be impacted, but none of the failures of the individual endpoints’ reachability.
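To make the intended behavior concrete, here is a minimal Python sketch of the suppression logic described above. It is not Nagios code: the function `hosts_to_notify` and the host names are hypothetical, chosen only for illustration. In Nagios itself the usual way to get this effect is the `parents` directive on a host definition, which marks hosts behind a failed parent as UNREACHABLE rather than DOWN.

```python
from typing import Dict, List, Optional


def hosts_to_notify(
    parents: Dict[str, Optional[str]], is_up: Dict[str, bool]
) -> List[str]:
    """Return the down hosts that have no down ancestor.

    `parents` maps each host to its upstream host (None for the top level).
    `is_up` maps each host to the result of its latest reachability check.
    Both names are illustrative assumptions, not part of any Nagios API.
    """
    notify = []
    for host, up in is_up.items():
        if up:
            continue
        blocked = False
        seen = set()  # guards against accidental cycles in the parent map
        parent = parents.get(host)
        while parent is not None and parent not in seen:
            seen.add(parent)
            # An unknown parent is treated as up, so it never hides a failure.
            if not is_up.get(parent, True):
                blocked = True
                break
            parent = parents.get(parent)
        if not blocked:
            notify.append(host)
    return sorted(notify)
```

With a hypothetical top-level host `do-upstream` as the parent of every droplet, a failure of `do-upstream` yields a single notification, while a droplet that fails on its own still alerts normally.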
This problem happens when monitoring both from the inside out and from the outside in. Cross-region checks against droplets can list them as “down” due to subsequent failures, and the monitoring droplet's checks against non-DO-hosted items also fail due to network turbulence.
Is there any DO-level endpoint I can monitor to work as an overall “regional status health” item, that can inform individual DO connectivity statuses?
These answers are provided by our Community. If you find them useful, show some love by clicking the heart. If you run into issues leave a comment, or add your own answer to help others.
I think you should be able to accomplish that by running a ping check against the network gateway that is set on the droplet.
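As a rough sketch of that gateway check, the Python below finds the droplet's default gateway and pings it once. It assumes a Linux droplet where the routing table is readable at `/proc/net/route` and where the `ping` binary accepts `-c` and `-W`; the function names are made up for this example, and in practice an existing Nagios ping plugin pointed at the gateway address would do the same job.

```python
import socket
import struct
import subprocess
from typing import Optional


def parse_default_gateway(route_table: str) -> Optional[str]:
    """Extract the default gateway from /proc/net/route style text.

    Addresses in that file are little-endian hexadecimal, so "0101A8C0"
    corresponds to 192.168.1.1.
    """
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # The default route has destination 0.0.0.0 and the gateway flag (0x2).
        if destination == "00000000" and flags & 0x2:
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def gateway_reachable(gateway: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if ping exits successfully."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), gateway],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

To use it, read `/proc/net/route`, pass the text to `parse_default_gateway`, and hand the result to `gateway_reachable`. A wrapper script could then exit with 0 or 2, which Nagios plugins conventionally use for OK and CRITICAL.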
While you could monitor a higher level “network” abstraction, it wouldn’t necessarily solve your problem, because the issue could be isolated to a hypervisor, network switch, or some other network device between you and the final router.
So if you were monitoring, say, connectivity at AMS, but the issue was with the hypervisor that your droplet was on, the check would come back as “Network OK” while the droplet could still be affected.