Using Nagios to Monitor Server Status

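Nagios can monitor the current state of a server, for example the system load, logged-in users, ping connectivity, root partition usage, swap usage, and the total number of processes. It can also check a variety of services, such as SSH, FTP, HTTP, LDAP, and DNS.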
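Each of these checks is carried out by a plugin, and by convention a plugin reports its result through its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The following is a minimal sketch of that mapping. The function name `state_name` is made up for illustration, and the plugin path in the usage comment is an assumption that varies by distribution.

```shell
# Map a plugin exit code to the state name it conventionally stands for.
# state_name is a hypothetical helper, not part of Nagios.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;   # outside the conventional 0-3 range
    esac
}

# Possible usage: run a plugin by hand, then translate its exit code.
# The path below is an assumption; adjust it to the local install.
#   /usr/lib64/nagios/plugins/check_ssh localhost
#   state_name $?
```

Running a plugin by hand like this can be a quick way to see what Nagios would report before the service is added to the configuration.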

The following uses the local machine (localhost) as an example:
# vim /etc/nagios/objects/localhost.cfg
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################

# Ping connectivity status
# Define a service to "ping" the local machine

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Root partition usage
# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Logged-in users
# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Number of processes
# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# System load
# Define a service to check the load on the local machine.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Swap space usage
# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }
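
In each check_command value above, the "!" character separates the command name from its arguments, which Nagios exposes to the command definition as $ARG1$, $ARG2$, and so on. In the sample commands.cfg shipped with Nagios, check_ping passes $ARG1$ to the plugin's -w (warning) option and $ARG2$ to -c (critical), so 100.0,20% means "warn at a round-trip average of 100 ms or 20% packet loss". The sketch below only illustrates the splitting. The function name `split_check_command` is hypothetical and this is not how Nagios itself is implemented.

```shell
# Split a check_command value on "!" into the command name and its arguments.
# split_check_command is a hypothetical helper for illustration only.
split_check_command() {
    old_ifs=$IFS          # remember IFS so it can be restored afterwards
    IFS='!'
    set -f                # disable globbing while the value is split
    set -- $1             # positional parameters become the "!"-separated fields
    set +f
    IFS=$old_ifs
    echo "command=$1"
    [ "$#" -gt 0 ] && shift
    n=1
    for arg in "$@"; do
        echo "ARG$n=$arg"
        n=$((n + 1))
    done
}

split_check_command 'check_ping!100.0,20%!500.0,60%'
# prints:
# command=check_ping
# ARG1=100.0,20%
# ARG2=500.0,60%
```

The same reading applies to the other services: check_local_disk!20%!10%!/ carries three arguments, the warning threshold, the critical threshold, and the partition to check.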

After editing the file, restart Nagios:
# service nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.

If no error messages appear, the configuration is valid.
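
If the check does fail, the configuration can also be verified by hand with the -v option of the nagios binary, which prints the errors it finds. The following is a small sketch that restarts only when verification succeeds. The function name `restart_if_valid` is made up, and the paths in the usage comment are assumptions for a packaged install.

```shell
# Run the restart command only if the verification command succeeds.
# Both arguments are plain command strings split on whitespace, so this
# sketch does not support arguments that themselves contain spaces.
restart_if_valid() {
    verify_cmd=$1
    restart_cmd=$2
    if $verify_cmd; then
        $restart_cmd
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Possible usage (assumed binary name and config path; adjust as needed):
#   restart_if_valid "nagios -v /etc/nagios/nagios.cfg" "service nagios restart"
```

Keeping the two steps together means a typo in localhost.cfg is reported before the running service is touched.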