
Server health monitoring in AI solutions

Any business solution, including an AI-based one, relies heavily on backend processing. The servers need to operate 24/7 with the lowest possible latency. A tool that can check server health on demand and report latency and uptime statistics can be crucial for measuring the efficiency of the solution. The DeepX AI solution for business includes such a tool. Let’s look at the features of server health monitoring and the benefits it brings to measuring the efficiency of a business that relies on AI processing.

Agents for constant server health monitoring

Architecture of DeepX AI Solutions for server health monitoring

To carry out the main function of the server health monitoring system, we introduced agents that track the status of the servers on a schedule as well as on demand. The results of the regular checks are written to a database and later aggregated so that server statistics can be displayed in the corresponding section of the frontend.
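The scheduled part of this flow can be sketched as follows. This is a minimal illustration, not the DeepX implementation: the table layout, the function names (`init_db`, `run_check`, `run_scheduled`), and the use of SQLite are all assumptions, and the `probe` callable stands in for whatever request the real agents send.

```python
import sqlite3
import time
from typing import Callable, Iterable, Optional

# Hypothetical schema: one row per check and per server.
SCHEMA = """
CREATE TABLE IF NOT EXISTS health_checks (
    server     TEXT    NOT NULL,
    checked_at REAL    NOT NULL,
    is_up      INTEGER NOT NULL,
    latency_ms REAL
)
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the results table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def run_check(conn: sqlite3.Connection, server: str,
              probe: Callable[[str], float],
              now: Optional[float] = None) -> bool:
    """Probe one server and store the outcome.

    `probe` is assumed to return the latency in milliseconds
    and to raise an exception when the server is unreachable.
    """
    checked_at = time.time() if now is None else now
    try:
        latency_ms: Optional[float] = probe(server)
        is_up = True
    except Exception:
        latency_ms = None  # no latency is recorded for a failed check
        is_up = False
    conn.execute(
        "INSERT INTO health_checks VALUES (?, ?, ?, ?)",
        (server, checked_at, int(is_up), latency_ms),
    )
    conn.commit()
    return is_up


def run_scheduled(conn: sqlite3.Connection, servers: Iterable[str],
                  probe: Callable[[str], float],
                  interval_s: float, rounds: int) -> None:
    """Check every server once per round, sleeping between rounds."""
    servers = list(servers)
    for i in range(rounds):
        for server in servers:
            run_check(conn, server, probe)
        if i < rounds - 1:
            time.sleep(interval_s)
```

An on-demand check would call `run_check` directly instead of waiting for the next round.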

Real time server health check

Accessing the server maintenance section of the user interface immediately sends a request for the health status of the servers and gets a response from the agents. As we can see from the interface screenshot, we get the status for each server and an error counter. We can also press a refresh button to check the status of the servers once more.
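A refresh of this kind might look roughly like the sketch below. It assumes that the error counter is a running count of failed checks per server, which the article does not specify; the class and field names are hypothetical, and `probe` is a placeholder for the real request to a server.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class ServerStatus:
    server: str
    is_up: bool
    error_count: int  # assumed meaning: failed checks seen so far


class OnDemandChecker:
    """Checks all servers at once when the user opens or refreshes the page."""

    def __init__(self, servers: Iterable[str], probe: Callable[[str], object]):
        self.servers: List[str] = list(servers)
        self.probe = probe
        self.error_counts: Dict[str, int] = {s: 0 for s in self.servers}

    def _check_one(self, server: str) -> bool:
        # A probe that raises is treated as "server is down".
        try:
            self.probe(server)
            return True
        except Exception:
            return False

    def refresh(self) -> List[ServerStatus]:
        # Probe the servers in parallel so one slow server does not delay the rest.
        workers = max(1, len(self.servers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(self._check_one, self.servers))
        statuses = []
        for server, is_up in zip(self.servers, results):
            if not is_up:
                self.error_counts[server] += 1
            statuses.append(ServerStatus(server, is_up, self.error_counts[server]))
        return statuses
```

Each call to `refresh()` corresponds to one press of the refresh button and returns one status entry per server.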

Multiple server latency measurement

The user interface also includes a section for server latency statistics. We can choose a period of time and the servers for which the statistics are to be gathered. The request is sent to the corresponding handler and aggregator; the information for the given period is gathered, sent to the frontend, and displayed in the latency chart. Hovering over a point shows the median aggregated latency for the corresponding time division.
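The aggregation step can be illustrated with a short sketch. It assumes that a time division is a fixed-width bucket and that each sample is a `(timestamp, latency_ms)` pair; the function name and parameters are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Optional, Tuple


def median_latency_by_bucket(
    samples: Iterable[Tuple[float, Optional[float]]],
    start: float,
    end: float,
    bucket_s: float,
) -> Dict[float, float]:
    """Group latency samples into fixed-width buckets and take the median of each.

    Samples outside [start, end) and samples without a latency
    (failed checks) are ignored. Keys are bucket start times.
    """
    buckets: Dict[float, List[float]] = defaultdict(list)
    for ts, latency_ms in samples:
        if latency_ms is None or not (start <= ts < end):
            continue
        bucket_start = start + ((ts - start) // bucket_s) * bucket_s
        buckets[bucket_start].append(latency_ms)
    return {b: median(values) for b, values in sorted(buckets.items())}
```

The median is less sensitive to a single slow response than the mean, which fits a chart meant to show typical latency per time division.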

Server uptime monitoring

Server uptime statistics are among the most important factors in determining the overall reliability of the solution. As with latency statistics, we can choose the period of time and the servers for which to get the stats. Hovering over a specific bar shows the uptime value for that time division and server.
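One way to compute such a value is sketched below. It assumes that uptime for a time division is the percentage of successful checks recorded in it, which the article does not state explicitly; the function name and the `(server, timestamp, is_up)` record shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def uptime_by_bucket(
    checks: Iterable[Tuple[str, float, bool]],
    start: float,
    end: float,
    bucket_s: float,
) -> Dict[Tuple[str, float], float]:
    """Return the uptime percentage per (server, bucket start).

    Uptime is taken here as successful checks divided by all checks
    in the bucket. Checks outside [start, end) are ignored.
    """
    totals: Dict[Tuple[str, float], List[int]] = defaultdict(lambda: [0, 0])
    for server, ts, is_up in checks:
        if not (start <= ts < end):
            continue
        bucket_start = start + ((ts - start) // bucket_s) * bucket_s
        counts = totals[(server, bucket_start)]  # [successful, total]
        if is_up:
            counts[0] += 1
        counts[1] += 1
    return {
        key: 100.0 * up / total
        for key, (up, total) in sorted(totals.items())
    }
```

Each entry in the result corresponds to one bar in the uptime chart.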

Notifications when server is down

The server health check system also sends Slack notifications whenever any of the agents discovers that one of the servers is down. This makes it possible to act immediately to get the problematic server back up and running.
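A notification step along these lines could be built on a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below is an assumption about how such a step might be wired, not the actual DeepX code: the function names, the message wording, and the webhook URL passed by the caller are all hypothetical.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def build_slack_message(server: str, checked_at: str) -> Dict[str, str]:
    """Build the JSON payload for a Slack incoming webhook."""
    return {"text": f"Server {server} is down (checked at {checked_at})."}


def post_to_slack(webhook_url: str, message: Dict[str, str],
                  timeout_s: float = 5.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status


def notify_down_servers(statuses: Dict[str, bool], checked_at: str,
                        send: Callable[[Dict[str, str]], object]) -> List[str]:
    """Call `send` once for every server whose status is down.

    `send` is injected so the agent can pass a real webhook sender
    in production and a stub in tests.
    """
    notified = []
    for server, is_up in statuses.items():
        if not is_up:
            send(build_slack_message(server, checked_at))
            notified.append(server)
    return notified
```

In production the agent would pass something like `lambda msg: post_to_slack(webhook_url, msg)` as `send`, where `webhook_url` comes from the team's own Slack configuration.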

Server health monitoring for business

Having a convenient server health monitoring tool is crucial for the following:

– Server maintenance

– Gathering and viewing the statistics of server uptime and latency

– Determining the reliability and efficiency of the solution provided

– Taking immediate action thanks to Slack notifications as soon as something goes wrong
