Troubleshooting Common Zabbix Agent Issues

A Sysadmin's Guide to Diagnosing and Fixing Agent Problems

The Zabbix Agent is generally stable, but problems still occur. When your monitoring data stops flowing, a systematic approach to troubleshooting saves time. This guide covers the most common problems encountered with the Zabbix Agent on Windows and other operating systems, with concrete steps to diagnose and resolve each one.

Issue 1: Agent is Unreachable (Availability Icon is Red)

One of the most frequent issues is when the Zabbix server cannot communicate with the agent. In the Zabbix UI, the host's availability icon will be red, and you'll see an error message like "Get value from agent failed: ZBX_TCP_READ() failed: [104] Connection reset by peer".

Troubleshooting Steps:

  1. Check if the Agent Service is Running: On the monitored host, verify that the Zabbix Agent service is actually running. On Windows, open services.msc and look for "Zabbix Agent". On Linux, use a command like systemctl status zabbix-agent. If it's stopped, try starting it. Check the agent's log file (specified by the LogFile parameter in the config) for any startup errors.
  2. Firewall Issues: This is the most common culprit. The Zabbix server must be able to reach the agent on its listening port (default TCP 10050) for passive checks. Ensure that the firewall on the agent host has a rule allowing incoming connections on this port from the Zabbix server's IP address. Also, check any network firewalls that may exist between the server and the agent.
  3. Incorrect Server Parameter: Open the agent's configuration file (zabbix_agentd.conf). Double-check the Server parameter. It should contain the IP address of your Zabbix server or proxy. The agent will only accept connections from the IPs listed here. If it's incorrect, the agent will reject the connection.
  4. Network Connectivity: From the Zabbix server, try to establish a manual connection to the agent using telnet or nc, passing the agent's address and port. For example: telnet <agent-ip> 10050. A refused connection usually means nothing is listening on that port or a firewall is actively rejecting it; a timeout usually means packets are being dropped somewhere along the path.
  5. Check Host Interface in Zabbix UI: In the Zabbix web interface, go to the host's configuration and verify that the IP address or DNS name in the agent interface is correct and reachable from the server.
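
The connectivity test from step 4 can also be scripted, which helps when telnet or nc is not installed on the Zabbix server. The following is a minimal sketch using only the Python standard library; the function name probe_tcp and its return strings are illustrative choices, not part of Zabbix.

```python
import socket


def probe_tcp(host: str, port: int = 10050, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout; packets are probably being dropped.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"
```

Calling probe_tcp("<agent-ip>") from the Zabbix server checks the default passive port 10050. Note that "open" only proves the TCP port is reachable; the agent can still drop the connection afterwards if the server's address is missing from its Server parameter.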

Issue 2: Items Become "Not supported"

Sometimes, specific items will turn red in the "Latest data" view with a status of "Not supported". This means the agent was able to communicate but could not retrieve the value for that specific item key.

Troubleshooting Steps:

  1. Check the Error Message: Hover over the red "i" icon next to the item in the Zabbix UI. This will often give you a detailed error message explaining why the item is not supported.
  2. Invalid Item Key: The most common reason is a typo or an incorrect format in the item key itself. For example, using proc.num[httpd] instead of proc.num[httpd.exe] on Windows. Carefully check the Zabbix documentation for the correct syntax of the item key you are using.
  3. Permissions Issues (Especially for UserParameters): If the unsupported item is a UserParameter that runs a script, the problem is often related to permissions. The Zabbix Agent service runs as a specific user (e.g., 'zabbix' on Linux, 'SYSTEM' on Windows). This user might not have the necessary permissions to execute your script or access the files/resources the script needs. Try running the script manually as the same user to replicate the issue.
  4. Timeout Exceeded: If a check takes longer to execute than the Timeout value specified in the agent's configuration (default 3 seconds), the agent will terminate it, and the item will become unsupported. You can either optimize your script to run faster or cautiously increase the Timeout value.
  5. Use zabbix_agentd -t: A powerful diagnostic tool is the agent itself. On the monitored host, you can run the agent in test mode to check a specific key. For example: zabbix_agentd.exe -t "system.cpu.load[all,avg1]". This will immediately tell you if the agent can collect the metric and what value it returns.
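
For the timeout case in step 4, it helps to measure how long a UserParameter command actually takes. The sketch below runs a command and reports whether it finished within a given limit; the function name time_command is hypothetical, and the default of 3 seconds mirrors the agent's default Timeout mentioned above. It runs the command as whoever starts the script, so to reproduce permission problems as well, run it under the agent's account.

```python
import subprocess
import time


def time_command(cmd: list[str], agent_timeout: float = 3.0) -> tuple[bool, float]:
    """Run cmd and return (finished_within_timeout, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it is still running after agent_timeout seconds.
        subprocess.run(cmd, capture_output=True, timeout=agent_timeout, check=False)
        within = True
    except subprocess.TimeoutExpired:
        within = False
    return within, time.monotonic() - start
```

If the elapsed time is close to the limit even on a quiet system, the item is likely to flip between supported and unsupported under load, which is a hint to optimize the script before raising Timeout.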

Issue 3: Active Checks Are Not Working

If you have configured active checks but no data is appearing in Zabbix, the agent is usually either failing to connect to the server or the server is rejecting what the agent sends.

Troubleshooting Steps:

  1. Check ServerActive Parameter: Ensure the ServerActive parameter in zabbix_agentd.conf is correctly set to the IP address or DNS name of your Zabbix server. This is the address the agent will try to connect to.
  2. Check Hostname Parameter: The Hostname in the agent's configuration file must be exactly the same (case-sensitive) as the "Host name" configured in the Zabbix web interface. If they do not match, the server will reject the incoming data from the agent, as it doesn't know which host it belongs to. Check the Zabbix server log for messages such as: cannot send list of active checks to "<agent-ip>": host [<Hostname>] not found.
  3. Firewall on Server Side: For active checks, the agent initiates the connection to the Zabbix server on TCP port 10051. Ensure the firewall on the Zabbix server itself allows incoming connections on this port.
  4. Network Connectivity (Agent to Server): From the agent machine, try to connect to the Zabbix server on port 10051 using telnet or nc. For example: telnet <server-ip> 10051. If this fails, there is a network or firewall issue preventing the agent from reaching the server.
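
The configuration checks in steps 1 and 2 are easy to get wrong by eye, especially a Hostname that differs only in letter case. The following sketch parses the text of zabbix_agentd.conf and compares it with the host name you copy from the web interface. The function names are hypothetical, and the parser is deliberately simplified: it ignores Include directives and keeps only the last occurrence of each parameter.

```python
def parse_agent_conf(text: str) -> dict[str, str]:
    """Parse Key=Value lines, skipping blank lines and '#' comments."""
    params: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            params[key.strip()] = value.strip()
    return params


def check_active_settings(params: dict[str, str], frontend_host_name: str) -> list[str]:
    """Return a list of problems that would break active checks."""
    problems = []
    if not params.get("ServerActive"):
        problems.append("ServerActive is not set")
    hostname = params.get("Hostname")
    if hostname is None:
        # Without Hostname the agent derives a name from the system instead.
        problems.append("Hostname is not set in this file")
    elif hostname != frontend_host_name:
        if hostname.lower() == frontend_host_name.lower():
            problems.append("Hostname differs only in letter case")
        else:
            problems.append("Hostname does not match the frontend host name")
    return problems
```

An empty list means both parameters look consistent; anything else points at the step to revisit before looking at firewalls.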

Troubleshooting the Zabbix Agent is a process of elimination. By systematically checking the service status, configuration files, firewalls, and network connectivity, you can solve the vast majority of common issues. The agent and server log files are your best friends: they often contain the exact error message you need to pinpoint the problem. If you need to start from scratch, you can always download the Zabbix Agent again and follow a fresh installation guide.