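Monitoring MySQL Server with Nagios

For security reasons, most MySQL servers either do not listen on a network port at all or accept only local connections, not remote ones, so monitoring MySQL Server from another host is often less practical. In addition, the check passes the password in clear text, which is a security concern.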

1. Check whether the check_mysql plugin is installed
# locate check_mysql
/usr/lib64/nagios/plugins/check_mysql
/usr/lib64/nagios/plugins/check_mysql_query

2. Look at the options check_mysql accepts
# /usr/lib64/nagios/plugins/check_mysql -h
Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See http://nagiosplugins.org/extra-opts
    for usage and examples.
 -H, --hostname=ADDRESS
    Host name, IP Address, or unix socket (must be an absolute path)
 -P, --port=INTEGER
    Port number (default: 3306)
 -s, --socket=STRING
    Use the specified socket (has no effect if -H is used)
 -d, --database=STRING
    Check database with indicated name
 -u, --username=STRING
    Connect using the indicated username
 -p, --password=STRING
    Use the indicated password to authenticate the connection
    ==> IMPORTANT: THIS FORM OF AUTHENTICATION IS NOT SECURE!!! <==
    Your clear-text password could be visible as a process table entry
 -S, --check-slave
    Check if the slave thread is running properly.
 -w, --warning
    Exit with WARNING status if slave server is more than INTEGER seconds
    behind master
 -c, --critical
    Exit with CRITICAL status if slave server is more then INTEGER seconds
    behind master
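
The help text above warns that -p exposes the password in the process table, and it also lists --extra-opts, which reads options from an ini file. The following is a minimal sketch of that alternative, not a tested recipe: it assumes the installed plugins were built with extra-opts support, and the function name, file path, and section name are made up for illustration.

```shell
# Write a check_mysql option file that only its owner can read.
# Function name, path, and section name are illustrative assumptions.
write_mysql_extra_opts() {
    opts_file=$1
    db_user=$2
    db_password=$3
    (
        umask 077   # new files get no group/other permissions
        {
            printf '[check_mysql]\n'
            printf 'username=%s\n' "$db_user"
            printf 'password=%s\n' "$db_password"
        } > "$opts_file"
    )
    chmod 600 "$opts_file"   # also tighten a file that already existed
}

# Example (not run here): create the file, then point the plugin at it.
#   write_mysql_extra_opts /etc/nagios/private/mysql.ini ntest 123test
#   /usr/lib64/nagios/plugins/check_mysql -H 127.0.0.1 -d nagiostest \
#       --extra-opts=check_mysql@/etc/nagios/private/mysql.ini
```

The file has to be readable by the account that runs the Nagios checks, so its ownership would need to be adjusted accordingly.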
3. Create a database, a database user, and a password
# /usr/local/bin/mysqladmin -u root -p create nagiostest
# /usr/bin/mysql -u root -p -e "GRANT ALL PRIVILEGES ON nagiostest.* TO 'ntest'@'localhost' IDENTIFIED BY '123test';"

4. Test it
# /usr/lib64/nagios/plugins/check_mysql -H 127.0.0.1 -P 3306 -u ntest -d nagiostest -p 123test
Uptime: 178009  Threads: 1  Questions: 348  Slow queries: 0  Opens: 16  Flush tables: 1  Open tables: 9  Queries per second avg: 0.1

5. Add the following to /etc/nagios/objects/commands.cfg

# 'check_mysql' command definition
define command{
        command_name    check_mysql
        command_line    $USER1$/check_mysql -H $ARG1$ -P $ARG2$ -u $ARG3$ -d $ARG4$ -p $ARG5$
        }

6. Add the service to the configuration of the MySQL Server host to be monitored
# vim /etc/nagios/objects/localhost.cfg
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             MySQL
        check_command                   check_mysql!127.0.0.1!3306!ntest!nagiostest!123test
        notifications_enabled           1
        }
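
In a check_command value, the fields separated by ! are the command name followed by $ARG1$, $ARG2$, and so on, in order. The sketch below only illustrates that mapping; the function name is hypothetical and the code is not part of Nagios.

```shell
# Print how a check_command string maps onto the $ARGn$ macros.
# Runs in a subshell so its variables do not leak into the caller.
split_check_command() (
    rest=$1
    printf 'command_name=%s\n' "${rest%%!*}"
    i=1
    # Keep going while there is still a "!" left to consume.
    while [ "$rest" != "${rest#*!}" ]; do
        rest=${rest#*!}
        printf 'ARG%d=%s\n' "$i" "${rest%%!*}"
        i=$((i + 1))
    done
)

split_check_command 'check_mysql!127.0.0.1!3306!ntest!nagiostest!123test'
```

With the service definition above, $ARG1$ is the address, $ARG2$ the port, $ARG3$ the user, $ARG4$ the database, and $ARG5$ the password.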

Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

If no error message appears, the configuration is valid.

The MySQL service should now appear in the Nagios web interface.

Monitoring the SNMP Service with Nagios

1. Check whether the check_snmp plugin is installed
# locate check_snmp
/usr/lib64/nagios/plugins/check_snmp

2. Look at the options check_snmp accepts
# /usr/lib64/nagios/plugins/check_snmp -h
 -H, --hostname=ADDRESS
    Host name, IP Address, or unix socket (must be an absolute path)
 -p, --port=INTEGER
    Port number (default: 161)
 -n, --next
    Use SNMP GETNEXT instead of SNMP GET
 -P, --protocol=[1|2c|3]
    SNMP protocol version
 -L, --seclevel=[noAuthNoPriv|authNoPriv|authPriv]
    SNMPv3 securityLevel
 -a, --authproto=[MD5|SHA]
    SNMPv3 auth proto
 -x, --privproto=[DES|AES]
    SNMPv3 priv proto (default DES)
 -C, --community=STRING
    Optional community string for SNMP communication (default is "public")
 -U, --secname=USERNAME
    SNMPv3 username
 -A, --authpassword=PASSWORD
    SNMPv3 authentication password
 -X, --privpasswd=PASSWORD
    SNMPv3 privacy password
 -o, --oid=OID(s)
    Object identifier(s) or SNMP variables whose value you wish to query

3. Test it (replace SNMP_SERVER_IP with the address of the SNMP server)
# /usr/lib64/nagios/plugins/check_snmp -H SNMP_SERVER_IP -o .1.3.6.1.2.1.1.3.0
SNMP OK - Timeticks: (41154894) 4 days, 18:19:08.94 |
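
The number in parentheses is the raw uptime in SNMP TimeTicks, which count hundredths of a second. As a quick sanity check, the conversion to the human-readable form can be sketched in shell arithmetic; the function name is hypothetical and the argument is assumed to be a plain decimal integer.

```shell
# Convert SNMP TimeTicks (1/100 s) to "D days, H:MM:SS.hh".
timeticks_to_uptime() {
    ticks=$1
    total_seconds=$((ticks / 100))
    hundredths=$((ticks % 100))
    days=$((total_seconds / 86400))
    hours=$((total_seconds % 86400 / 3600))
    minutes=$((total_seconds % 3600 / 60))
    seconds=$((total_seconds % 60))
    printf '%d days, %d:%02d:%02d.%02d\n' \
        "$days" "$hours" "$minutes" "$seconds" "$hundredths"
}

timeticks_to_uptime 41154894   # prints: 4 days, 18:19:08.94
```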

4. Add the following to /etc/nagios/objects/commands.cfg

# 'check_snmp' command definition
define command{
        command_name    check_snmp
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -o $ARG1$
        }

5. Add the service to the configuration of the SNMP host to be monitored
# vim /etc/nagios/servers/snmp.cfg
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       snmp.test.ilc.edu.tw
        service_description             SNMP
        check_command                   check_snmp!.1.3.6.1.2.1.1.3.0
        notifications_enabled           1
        }

Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

If no error message appears, the configuration is valid.

The SNMP service should now appear in the Nagios web interface.

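Monitoring LDAP Server with Nagios

1. Check whether the check_ldap plugin is installed
There are two plugins: check_ldap uses simple authentication, and check_ldaps supports encrypted transport.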
# locate check_ldap
/usr/lib64/nagios/plugins/check_ldap
/usr/lib64/nagios/plugins/check_ldaps

2. Look at the options check_ldap accepts
# /usr/lib64/nagios/plugins/check_ldap -h
Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See http://nagiosplugins.org/extra-opts
    for usage and examples.
 -H, --hostname=ADDRESS
    Host name, IP Address, or unix socket (must be an absolute path)
 -p, --port=INTEGER
    Port number (default: 389)
 -4, --use-ipv4
    Use IPv4 connection
 -6, --use-ipv6
    Use IPv6 connection
 -a [--attr]
    ldap attribute to search (default: "(objectclass=*)"
 -b [--base]
    ldap base (eg. ou=my unit, o=my org, c=at
 -D [--bind]
    ldap bind DN (if required)
 -P [--pass]
    ldap password (if required)
 -T [--starttls]
    use starttls mechanism introduced in protocol version 3
 -S [--ssl]
    use ldaps (ldap v2 ssl method). this also sets the default port to 636
 -2 [--ver2]
    use ldap protocol version 2
 -3 [--ver3]
    use ldap protocol version 3
    (default protocol version: 2)
 -w, --warning=DOUBLE
    Response time to result in warning status (seconds)
 -c, --critical=DOUBLE
    Response time to result in critical status (seconds)
 -t, --timeout=INTEGER
    Seconds before connection times out (default: 10)
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)
3. Test it
# /usr/lib64/nagios/plugins/check_ldap -H ldap.test.ilc.edu.tw -b dc=ldap,dc=test.ilc.edu.tw -p 389
LDAP OK - 0.026 seconds response time|time=0.026050s;;;0.000000
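
Everything after the | in plugin output is performance data, written as label=value followed by optional warn;crit;min;max fields. A small sketch of pulling one value out of such a line, assuming labels without spaces; the function name is hypothetical.

```shell
# Print the value recorded for one perfdata label, e.g. "time".
# Runs in a subshell so "set -f" and the variables stay local.
perfdata_value() (
    output=$1
    label=$2
    case $output in
        *'|'*) perf=${output#*'|'} ;;   # keep only the part after "|"
        *) exit 1 ;;                    # no perfdata at all
    esac
    set -f                              # no filename expansion on $perf
    for item in $perf; do
        case $item in
            "$label"=*)
                value=${item#*=}
                printf '%s\n' "${value%%;*}"   # drop warn;crit;min;max
                exit 0
                ;;
        esac
    done
    exit 1
)

perfdata_value 'LDAP OK - 0.026 seconds response time|time=0.026050s;;;0.000000' time
```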

4. Add the following to /etc/nagios/objects/commands.cfg

# 'check_ldap' command definition
define command{
        command_name    check_ldap
        command_line    $USER1$/check_ldap -H $HOSTADDRESS$ -b $ARG1$ -p $ARG2$
        }

5. Add the service to the configuration of the LDAP host to be monitored
# vim /etc/nagios/servers/ldap.cfg
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       ldap.test.ilc.edu.tw
        service_description             LDAP
        check_command                   check_ldap!dc=ldap,dc=test.ilc.edu.tw!389
        notifications_enabled           1
        }

Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

If no error message appears, the configuration is valid.

The LDAP service should now appear in the Nagios web interface.

Monitoring DNS Server with Nagios

1. Check whether the check_dns plugin is installed
# locate check_dns
/usr/lib64/nagios/plugins/check_dns

2. Look at the options check_dns accepts
# /usr/lib64/nagios/plugins/check_dns -h
Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See http://nagiosplugins.org/extra-opts
    for usage and examples.
 -H, --hostname=HOST
    The name or address you want to query
 -s, --server=HOST
    Optional DNS server you want to use for the lookup
 -a, --expected-address=IP-ADDRESS|HOST
    Optional IP-ADDRESS you expect the DNS server to return. HOST must end with
    a dot (.). This option can be repeated multiple times (Returns OK if any
    value match). If multiple addresses are returned at once, you have to match
    the whole string of addresses separated with commas (sorted alphabetically).
 -A, --expect-authority
    Optionally expect the DNS server to be authoritative for the lookup
 -w, --warning=seconds
    Return warning if elapsed time exceeds value. Default off
 -c, --critical=seconds
    Return critical if elapsed time exceeds value. Default off
 -t, --timeout=INTEGER
    Seconds before connection times out (default: 10)
3. Test it (replace DNS_SERVER_IP with the address of the DNS server)
# /usr/lib64/nagios/plugins/check_dns -H www.ilc.edu.tw -s DNS_SERVER_IP
DNS OK: 0.012 seconds response time. www.ilc.edu.tw returns 140.111.66.96|time=0.012217s;;;0.000000

4. Add the following to /etc/nagios/objects/commands.cfg

# 'check_dns' command definition
define command{
        command_name    check_dns
        command_line    $USER1$/check_dns -H $ARG1$ -s $HOSTADDRESS$
        }

5. Add the service to the configuration of the DNS host to be monitored; dns.test.ilc.edu.tw is the DNS host being monitored
# vim /etc/nagios/servers/dns.cfg
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       dns.test.ilc.edu.tw
        service_description             DNS
        check_command                   check_dns!www.ilc.edu.tw
        notifications_enabled           1
        }

Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

If no error message appears, the configuration is valid.

The DNS service should now appear in the Nagios web interface.

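Removing the Icon Shown Next to a Service Status in Nagios

When Nagios monitors a host's services, an icon appears in front of the OK status.

For the HTTP service, clicking the icon opens a details page.

It appears to mean that notifications are disabled for the service, so no alert is sent when the service has a problem; the service itself is still running normally. If the icon bothers you, here is the fix:

Edit the existing host configuration file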

notifications_enabled           0
change every occurrence to
notifications_enabled           1

# vim /etc/nagios/servers/www.cfg
define host {
        use                     linux-server
        host_name               www.test.ilc.edu.tw
        alias                   www.test.ilc.edu.tw
        address                 192.168.1.1
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           1
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             FTP
        check_command                   check_ftp
        notifications_enabled           1
        }
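
When a file contains many service definitions, the same change can be scripted instead of edited by hand. A minimal sketch using awk as a filter; the function name is hypothetical, and it only touches lines that consist of exactly the directive and the value 0.

```shell
# Read a Nagios object file on stdin and write it to stdout with
# "notifications_enabled 0" switched to "notifications_enabled 1".
enable_notifications() {
    awk '
        NF == 2 && $1 == "notifications_enabled" && $2 == "0" {
            sub(/0[ \t]*$/, "1")   # replace only the trailing value
        }
        { print }
    '
}

# Example (not run here): write the result to a new file and review it
# before replacing the original.
#   enable_notifications < /etc/nagios/servers/www.cfg > /tmp/www.cfg.new
```

Writing to a separate file first leaves the original untouched until the result has been checked.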

Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

The icon no longer appears.

Monitoring FTP / SSH / HTTP Service Status with Nagios

First, find the corresponding commands in /etc/nagios/objects/commands.cfg
# vim /etc/nagios/objects/commands.cfg
# 'check_ftp' command definition
define command{
        command_name    check_ftp
        command_line    $USER1$/check_ftp -H $HOSTADDRESS$
        }

# 'check_http' command definition
define command{
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

# 'check_ssh' command definition
define command{
        command_name    check_ssh
        command_line    $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
        }

Create a configuration file for the host to be monitored
# vim /etc/nagios/servers/www.cfg
define host {
        use                     linux-server
        host_name               www.test.ilc.edu.tw
        alias                   www.test.ilc.edu.tw
        address                 192.168.1.1
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       www.test.ilc.edu.tw
        service_description             FTP
        check_command                   check_ftp
        notifications_enabled           0
        }

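Restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.

If no error message appears, the configuration is valid.

The SSH, HTTP, and FTP services should now appear in the Nagios web interface.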

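Translating the Nagios vshell Web Interface into Chinese

Under /usr/local/vshell there is a locale directory, so vshell appears to support multiple languages; it just does not ship with Traditional Chinese. The English strings looked manageable and there are not many of them, so I went ahead and translated them.

Create the directory
# mkdir -p /usr/local/vshell/locale/zh_TW/LC_MESSAGES
Copy the English locale files as a starting point
# cp /usr/local/vshell/locale/en_EN/en_EN/LC_MESSAGES/en_EN.po /usr/local/vshell/locale/zh_TW/LC_MESSAGES/zh_TW.po
# cp /usr/local/vshell/locale/en_EN/en_EN/LC_MESSAGES/en_EN.mo /usr/local/vshell/locale/zh_TW/LC_MESSAGES/zh_TW.mo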

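Edit zh_TW.po and translate the English entries into Chinese, then copy the updated zh_TW.mo into /usr/local/vshell/locale/zh_TW/LC_MESSAGES. The .mo file is the compiled form of the .po file, so it has to be regenerated after the .po file is edited.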
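
The original steps do not say how the .mo file was produced. One common way, assuming the GNU gettext tools are installed, is msgfmt; the wrapper function below is a hypothetical convenience, not something vshell provides.

```shell
# Compile a gettext .po catalog into the .mo file next to it.
compile_catalog() {
    po_file=$1
    mo_file=${po_file%.po}.mo
    if ! command -v msgfmt >/dev/null 2>&1; then
        echo "msgfmt not found; install the gettext tools first" >&2
        return 127
    fi
    msgfmt -o "$mo_file" "$po_file"
}

# Example (not run here):
#   compile_catalog /usr/local/vshell/locale/zh_TW/LC_MESSAGES/zh_TW.po
```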

Edit the /etc/vshell.conf configuration file and change LANG = "en_GB" to LANG = "zh_TW"
# sed -i 's/LANG = "en_GB"/LANG = "zh_TW"/' /etc/vshell.conf

The translation is not great, but it is available for download for anyone who needs it.

A Web Interface for Nagios: vshell

Nagios comes with its own web interface, but its screens are somewhat cluttered. vshell is an alternative web interface written in PHP that is simpler and more intuitive.
Nagios vshell download site: Here
The latest version at the time of writing is 1.9.1; it works with Nagios 3.x and Nagios XI.

Installation steps:
1. Download vshell
# wget http://assets.nagios.com/downloads/exchange/nagiosvshell/vshell.tar.gz

2. Extract the archive
# tar xvzf vshell.tar.gz

3. Move the directory
# mv vshell /usr/local

4. Copy the Apache configuration file into /etc/httpd/conf.d
# cp /usr/local/vshell/config/vshell_apache.conf /etc/httpd/conf.d

5. Edit /etc/httpd/conf.d/vshell_apache.conf
# vim /etc/httpd/conf.d/vshell_apache.conf

#modify this file to fit your apache configuration

Alias /vshell "/usr/local/vshell"

<Directory "/usr/local/vshell">
#  SSLRequireSSL
   Options None
   AllowOverride None
#   Order allow,deny
#   Allow from all
   Order deny,allow
   Deny from all
   Allow from 127.0.0.1 192.168.1.0/24
   Allow from ::1

#  Allow from 127.0.0.1

#use the below lines for Nagios XI
 # AuthName "Nagios Monitor XI"
 #  AuthType Basic
 # AuthUserFile /usr/local/nagiosxi/etc/htpasswd.users

#Use the below lines for a typical Nagios Core installation
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /etc/nagios/passwd

   Require valid-user
</Directory>

6. Edit the /usr/local/vshell/config/vshell.conf configuration file
# vim /usr/local/vshell/config/vshell.conf

; Full filesystem path to the Nagios status file
STATUSFILE = "/usr/local/nagios/var/status.dat"

; Full filesystem path to the Nagios object cache file
OBJECTSFILE = "/usr/local/nagios/var/objects.cache"

; Full filesystem path to the Nagios CGI permissions configuration file
CGICFG = "/usr/local/nagios/etc/cgi.cfg"

; Full filesystem path to the Nagios command pipe
NAGCMD = "/usr/local/nagios/var/rw/nagios.cmd"

Change these to
; Full filesystem path to the Nagios status file
STATUSFILE = "/var/log/nagios/status.dat"

; Full filesystem path to the Nagios object cache file
OBJECTSFILE = "/var/log/nagios/objects.cache"

; Full filesystem path to the Nagios CGI permissions configuration file
CGICFG = "/etc/nagios/cgi.cfg"

; Full filesystem path to the Nagios command pipe
NAGCMD = "/var/spool/nagios/cmd/nagios.cmd"
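
Since the default paths differ from the packaged layout used here, it may help to confirm that each configured path actually exists before restarting Apache. A minimal sketch; the function name is hypothetical.

```shell
# Print every path that does not exist; return non-zero if any is missing.
report_missing_paths() {
    status=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            printf 'missing: %s\n' "$path"
            status=1
        fi
    done
    return "$status"
}

# Example (not run here), using the paths from the edited vshell.conf:
#   report_missing_paths /var/log/nagios/status.dat \
#       /var/log/nagios/objects.cache /etc/nagios/cgi.cfg \
#       /var/spool/nagios/cmd/nagios.cmd || echo "check vshell.conf"
```

The -e test is true for any file type, so it also accepts the command pipe.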

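7. Restart the Apache web server
# service httpd restart

8. If it does not work, copy the configuration file to /etc
# cp /usr/local/vshell/config/vshell.conf /etc

9. Everything is set up.

Postscript: I later found that vshell can also be installed from the browser by opening http://SERVER_IP/vshell/install.php (replace SERVER_IP with the server's address). It did not seem to work when I tried it, so I completed the installation manually instead.

Finally, remember to delete install.php from the installation directory
# rm -rf /usr/local/vshell/install.php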

The Nagios Exchange Site

Nagios Exchange download site: http://exchange.nagios.org/

On CentOS 6.5 x86_64, for example, the check commands provided by Nagios Plugins are all located in /usr/lib64/nagios/plugins/
# cd /usr/lib64/nagios/plugins
# ls
check_breeze*    check_game*       check_mrtgtraf*     check_overcr*   check_swap*
check_by_ssh*    check_hpjd*       check_mysql*        check_pgsql*    check_tcp*
check_clamd@     check_http*       check_mysql_query*  check_ping*     check_time*
check_cluster*   check_icmp*       check_nagios*       check_pop@      check_udp@
check_dhcp*      check_ide_smart*  check_nntp@         check_procs*    check_ups*
check_dig*       check_imap@       check_nntps@        check_real*     check_users*
check_disk*      check_ircd*       check_nrpe*         check_rpc*      check_wave*
check_disk_smb*  check_jabber@     check_nt*           check_sensors*  eventhandlers/
check_dns*       check_ldap*       check_ntp*          check_simap@    negate*
check_dummy*     check_ldaps@      check_ntp_peer*     check_smtp*     urlize*
check_file_age*  check_load*       check_ntp.pl*       check_snmp*     utils.pm
check_flexlm*    check_log*        check_ntp_time*     check_spop@     utils.sh*
check_fping*     check_mailq*      check_nwstat*       check_ssh*
check_ftp@       check_mrtg*       check_oracle*       check_ssmtp@

If these are not enough, you can download more from the Nagios Exchange site mentioned above, or write your own in Bash, Perl, PHP, or a similar language.
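
A plugin is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below shows that convention for a single integer reading; the function name and message format are illustrative choices, not taken from any existing plugin.

```shell
# Map a numeric reading to a Nagios status line and return code.
check_threshold() {
    value=$1
    warn=$2
    crit=$3
    case $value in
        ''|*[!0-9]*)
            echo "UNKNOWN - not a non-negative integer: $value"
            return 3
            ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value|value=$value;$warn;$crit"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value|value=$value;$warn;$crit"
        return 1
    fi
    echo "OK - value is $value|value=$value;$warn;$crit"
    return 0
}

# In a standalone plugin script the return code becomes the exit code:
#   check_threshold "$(who | wc -l)" 20 50
#   exit $?
```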

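Monitoring Server Status with Nagios

Nagios can monitor the current state of a server, such as system load, logged-in users, ping connectivity, root partition usage, swap usage, and the total number of processes. It can also monitor a wide range of services, such as SSH, FTP, HTTP, LDAP, and DNS.

Using the local machine (localhost) as an example:
# vim /etc/nagios/objects/localhost.cfg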
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################

# Ping connectivity
# Define a service to "ping" the local machine

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Root partition usage
# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Logged-in users
# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Number of processes
# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 users.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# System load
# Define a service to check the load on the local machine.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Swap usage
# Define a service to check the swap usage the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

When the changes are done, restart Nagios
# service nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.

If no error message appears, the settings are valid.