FIDO: process watchdog

On the campus FIDO instance, the non-blocking call from fido_snmp to Net::SNMP::snmp_dispatcher() has been observed on multiple occasions not to return.  In 2013, I introduced the non-standard module AnyEvent::SNMP, which replaces Net::SNMP's event loop.  The problem stayed away for many months, but it eventually recurred.

On the campus FIDO instance, the root user's crontab contains this entry:

*/5 * * * * /usr/local/fido/bin/ >> /home/net/fido/logs/fido_watchdog.log 2>&1

The fido_watchdog opens the latest FIDO status file and looks for tests that have gone STALE and that have restart instructions.  If both conditions are met, the watchdog attempts to restart the stalled processes.
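
The selection logic described above can be sketched as follows.  This is only an illustration, not the actual fido_watchdog code: the status file format (one "name|state|restart command" entry per line) and the names TestStatus, parse_status, restart_candidates and restart_stalled are all assumptions made for the example.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TestStatus:
    name: str
    state: str
    restart_cmd: Optional[str]  # None when the test has no restart instructions


def parse_status(text: str) -> List[TestStatus]:
    """Parse the assumed 'name|state|restart command' format (hypothetical)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [field.strip() for field in line.split("|", 2)]
        fields += [""] * (3 - len(fields))  # tolerate missing trailing fields
        name, state, restart = fields
        entries.append(TestStatus(name, state, restart or None))
    return entries


def restart_candidates(entries: List[TestStatus]) -> List[TestStatus]:
    """Keep only tests that are STALE and also have restart instructions."""
    return [e for e in entries if e.state == "STALE" and e.restart_cmd]


def restart_stalled(entries: List[TestStatus],
                    run: Callable = subprocess.run) -> List[str]:
    """Run each candidate's restart command; return names that exited with 0."""
    restarted = []
    for entry in restart_candidates(entries):
        result = run(shlex.split(entry.restart_cmd), check=False)
        if result.returncode == 0:
            restarted.append(entry.name)
    return restarted
```

A test that is STALE but has no restart command is left alone, which matches the two conditions above.  The run parameter exists only so the restart step can be exercised without starting real processes.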

As of 2017/08, restart instructions are in place for several tests.

Keywords: FIDO: process watchdog
Doc ID: 38261
Owner: Michael H.
Group: Network Services
Created: 2014-03-08 17:53 CDT
Updated: 2017-08-01 09:47 CDT
Sites: Network Services, Systems & Network Control Center