There are two kinds of watchdog pings:
- Check web server
- Check CGI server
In the watchdog failure log, each error indicates whether the failed ping was for the web server or the CGI server.
This is how it works when the watchdog pings the CGI server:
- The watchdog sends a special URL (login.cgi) to the web server.
- The web server forwards the request to the CGI server.
- The CGI server picks a child CGI process to service the request.
- The child CGI process recognizes the special URL and immediately sends an HTTP 200 response to the web server.
- The web server gets the response and replies to the watchdog.
In this chain of events, if the CGI servers are busy servicing auth requests or are blocked by the back-end auth server, the watchdog ping request is queued in the parent CGI server until a child process is available to service it. There is a timeout between the web server and the CGI server for the response to come back to the web server.
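The queueing behavior can be illustrated with a small model. This is only a sketch under stated assumptions: the function name, the PingResult type, and the timeout value are hypothetical, and the model assumes the ping is the only request waiting in the parent's queue. It is not the product's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PingResult:
    waited_seconds: float  # time the ping spent waiting before an outcome
    timed_out: bool        # True if the web server gave up first


def simulate_cgi_ping(child_busy_seconds: List[float],
                      web_to_cgi_timeout: float) -> PingResult:
    """Model one watchdog ping against a pool of child CGI processes.

    child_busy_seconds[i] is how much longer child i stays busy
    (0 means idle). The ping is answered as soon as any child frees
    up, because the child replies immediately to the special URL.
    Assumption: no other request is queued ahead of the ping.
    """
    if not child_busy_seconds:
        raise ValueError("need at least one child process")
    wait = min(child_busy_seconds)
    if wait >= web_to_cgi_timeout:
        # The web server drops the request; the watchdog sees a failure.
        return PingResult(waited_seconds=web_to_cgi_timeout, timed_out=True)
    return PingResult(waited_seconds=wait, timed_out=False)


if __name__ == "__main__":
    # One idle child: the ping is serviced right away.
    print(simulate_cgi_ping([0.0, 12.0], web_to_cgi_timeout=10.0))
    # Every child blocked longer than the timeout: the ping fails.
    print(simulate_cgi_ping([30.0, 45.0], web_to_cgi_timeout=10.0))
```

The point of the model is that a ping can fail even though every process is alive: it is enough for all child processes to stay blocked longer than the timeout.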
If the web server does not receive the response within that timeout, the request is dropped and the watchdog logs an event. After the first failure, the watchdog pings more frequently. If the next ping succeeds, the watchdog returns to the normal ping interval. Otherwise, if three consecutive pings fail, the watchdog generates a system snapshot and restarts the services.
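The failure handling described above can be sketched as a small state machine. This is an illustration only: the class and action names are hypothetical, the interval values are placeholders rather than the product's real settings, and resetting the failure counter after a restart is an assumption.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    LOG_FAILURE = "log_failure"
    SNAPSHOT_AND_RESTART = "snapshot_and_restart"


class WatchdogState:
    # Placeholder values; the real intervals are not given in this note.
    NORMAL_INTERVAL = 60.0
    FAST_INTERVAL = 10.0
    MAX_CONSECUTIVE_FAILURES = 3

    def __init__(self) -> None:
        self.consecutive_failures = 0

    @property
    def interval(self) -> float:
        """Seconds until the next ping: shorter while a failure is pending."""
        if self.consecutive_failures > 0:
            return self.FAST_INTERVAL
        return self.NORMAL_INTERVAL

    def record(self, ping_ok: bool) -> Action:
        """Update the state with one ping result and return the action to take."""
        if ping_ok:
            # Any success returns the watchdog to the normal interval.
            self.consecutive_failures = 0
            return Action.NONE
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.MAX_CONSECUTIVE_FAILURES:
            # Assumption: the counter starts over once services restart.
            self.consecutive_failures = 0
            return Action.SNAPSHOT_AND_RESTART
        return Action.LOG_FAILURE


if __name__ == "__main__":
    wd = WatchdogState()
    for ok in (False, True, False, False, False):
        action = wd.record(ok)
        print(ok, action.name, wd.interval)
```

Keeping the counter and the interval in one object makes the rule easy to check: a single success clears the failure count, so only three failures in a row trigger the snapshot and restart.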