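This script monitors VPS server load and the memory and CPU usage of web services. When the system load or a service's memory usage reaches a preset threshold, the script restarts that service; when a single php-cgi process uses too much CPU, the script kills that process directly. The goal is to reduce the risk of unexpected downtime caused by resource exhaustion.
That's right: this script is an updated version of the earlier v1. Since it will probably be updated again, it has been moved to a GitHub gist for simple version control.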
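The decision rule described above can be sketched as a small helper. This is only an illustration, not the script's actual code: the function name `exceeds` and the sample values are assumptions. It uses awk for the comparison because bash's `[` only compares integers, while load averages are fractional.

```shell
#!/bin/bash
# exceeds VALUE LIMIT: succeed (exit 0) when VALUE > LIMIT.
# awk does the comparison, so fractional values such as a load of 6.53 work.
exceeds() {
    awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value > limit) }'
}

# Hypothetical sample values; the real script reads them from uptime and ps.
SYS_LOAD="6.53"
SYS_LOAD_MAX="6"

if exceeds "$SYS_LOAD" "$SYS_LOAD_MAX"; then
    echo "load $SYS_LOAD is above $SYS_LOAD_MAX: restart the service"
fi
```

Returning an exit status instead of printing 0 or 1 lets the helper sit directly in an `if`, without an extra string comparison.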
I. Usage:

    git clone git://gist.github.com/1216837.git gist-1216837
    vim gist-1216837/sys-mon.sh   # adjust the preset memory, CPU and load thresholds
    mkdir /var/script
    mv gist-1216837/sys-mon.sh /var/script

Set it to run once a minute:

    crontab -e
    # add the following line in the editor:
    * * * * * /bin/bash /var/script/sys-mon.sh
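Because the script sleeps several times while restarting services, a single run can in principle last longer than the one-minute cron interval. One way to stop two runs from overlapping is a simple lock, sketched below. The lock path and the function names `acquire_lock` and `release_lock` are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical lock location; any path writable by the cron user works.
LOCK_DIR="${LOCK_DIR:-/tmp/sys-mon.lock}"

# mkdir is atomic: it fails if the directory already exists,
# so only one caller can hold the lock at a time.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Usage sketch (commented out so that sourcing this file has no side effects):
# acquire_lock || exit 0     # a previous run is still active, skip this one
# trap release_lock EXIT     # release the lock however the script ends
# ... monitoring logic ...
```

If a run is killed before the trap fires, the lock directory stays behind and must be removed by hand; that is the trade-off of this simple approach.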
II. Shell script contents
It is best to open the URL below to see the latest version.
https://gist.github.com/1216837
    #! /bin/bash
    #====================================================================
    # sys-mon.sh
    #
    # Copyright (c) 2011, WangYan <webmaster@wangyan.org>
    # All rights reserved.
    # Distributed under the GNU General Public License, version 3.0.
    #
    # Monitor system mem and load, if too high, restart some service.
    #
    # See: http://wangyan.org/blog/sys-mon-shell-script.html
    #
    # V 0.5, Date: 2011-12-08
    #====================================================================

    # Names of the services to monitor
    # Each one must exist in the /etc/init.d folder
    NAME_LIST="httpd nginx mysql"

    # Maximum CPU usage allowed for a single process (%)
    PID_CPU_MAX="25"

    # Maximum total memory usage allowed per service (%)
    PID_MEM_SUM_MAX="95"

    # Maximum allowed system load
    SYS_LOAD_MAX="6"

    # Log path settings
    LOG_PATH="/var/log/sys-mon.log"

    # Date time format setting
    DATA_TIME=$(date +"%y-%m-%d %H:%M:%S")

    # Your email address
    EMAIL="webmaster@example.com"

    # Your website url
    MY_URL="http://106.187.38.210/p.php"

    #====================================================================

    for NAME in $NAME_LIST
    do
        PID_CPU_SUM="0";PID_MEM_SUM="0"
        PID_LIST=`ps aux | grep $NAME | grep -v root`

        # Split the ps output on newlines so that each PID holds one full line
        IFS_TMP="$IFS";IFS=$'\n'
        for PID in $PID_LIST
        do
            PID_NUM=`echo $PID | awk '{print $2}'`
            PID_CPU=`echo $PID | awk '{print $3}'`
            PID_MEM=`echo $PID | awk '{print $4}'`
            # echo "$NAME: PID_NUM($PID_NUM) PID_CPU($PID_CPU) PID_MEM($PID_MEM)"

            PID_CPU_SUM=`echo "$PID_CPU_SUM + $PID_CPU" | bc`
            PID_MEM_SUM=`echo "$PID_MEM_SUM + $PID_MEM" | bc`

            if [ `echo "$PID_CPU >= $PID_CPU_MAX" | bc` -eq 1 ];then
                if [[ "$NAME" = "php-fpm" || "$NAME" = "httpd" ]];then
                    sleep 5
                    # Re-sample the CPU usage after the pause (the original
                    # re-tested the old value); empty if the process has exited
                    PID_CPU=`ps -p $PID_NUM -o %cpu= 2>/dev/null | tr -d ' '`
                    if [ -n "$PID_CPU" ] && [ `echo "$PID_CPU >= $PID_CPU_MAX" | bc` -eq 1 ];then
                        echo "${DATA_TIME}: kill ${NAME}($PID_NUM) successful (CPU:$PID_CPU)" | tee -a $LOG_PATH
                        kill $PID_NUM
                    fi
                else
                    echo "${DATA_TIME}: [WARNING!] ${NAME}($PID_NUM) cpu usage is too high! (CPU:$PID_CPU)" | tee -a $LOG_PATH
                fi
            fi
        done
        IFS="$IFS_TMP"

        SYS_LOAD=`uptime | awk '{print $(NF-2)}' | sed 's/,//'`
        SYS_MON="CPU:$PID_CPU_SUM MEM:$PID_MEM_SUM LOAD:$SYS_LOAD"
        # echo -e "$NAME: $SYS_MON\n"

        # awk handles the floating-point comparisons; prints 1 when too high
        SYS_LOAD_TOO_HIGH=`awk 'BEGIN{print('$SYS_LOAD'>'$SYS_LOAD_MAX')}'`
        PID_MEM_SUM_TOO_HIGH=`awk 'BEGIN{print('$PID_MEM_SUM'>'$PID_MEM_SUM_MAX')}'`

        if [[ "$SYS_LOAD_TOO_HIGH" = "1" || "$PID_MEM_SUM_TOO_HIGH" = "1" ]];then
            /etc/init.d/$NAME stop
            sleep 5
            for ((i=1;i<4;i++))
            do
                if [ `pgrep $NAME | wc -l` = "0" ];then
                    echo "$DATA_TIME: Stop $NAME successful! ($SYS_MON)" | tee -a $LOG_PATH
                    break
                else
                    echo "${DATA_TIME}: [WARNING!] Stop $NAME failed[$i]! ($SYS_MON)" | tee -a $LOG_PATH
                    pkill $NAME && killall $NAME
                fi
            done

            /etc/init.d/$NAME start
            sleep 5
            for ((ii=1;ii<4;ii++))
            do
                if [ `pgrep $NAME | wc -l` != "0" ];then
                    echo "$DATA_TIME: Start $NAME successful!" | tee -a $LOG_PATH
                    break
                else
                    echo "${DATA_TIME}: [WARNING!] Start $NAME failed[$ii]! ($SYS_MON)" | tee -a $LOG_PATH
                    /etc/init.d/$NAME start
                    sleep 5
                fi
            done

            # Send mail only if the service is still not running
            # (the original tested != "0", which mailed on success)
            if [ `pgrep $NAME | wc -l` = "0" ];then
                echo "${DATA_TIME}: [ERROR!] Start $NAME failed! ($SYS_MON)" | mail -s "Start $NAME failed" $EMAIL
            fi
        fi
    done

    # Check that the website answers with HTTP 200; retry once before alerting
    STATUS_CODE=`curl -o /dev/null -s -w %{http_code} $MY_URL`
    #echo -e "STATUS CODE: $STATUS_CODE\n"
    if [ "$STATUS_CODE" != "200" ];then
        sleep 3
        STATUS_CODE=`curl -o /dev/null -s -w %{http_code} $MY_URL`
        if [ "$STATUS_CODE" != "200" ];then
            echo "${DATA_TIME}: [WARNING!] Website Downtime! ($SYS_MON)" | tee -a $LOG_PATH
            echo "${DATA_TIME}: [WARNING!] Website Downtime! ($SYS_MON)" | mail -s "Website downtime" $EMAIL
        fi
    fi
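The script is not hard to follow; for an explanation of how it works, see the article 《Linux 进程自动监控shell脚本》 (an automatic Linux process-monitoring shell script).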
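The per-service totals that the script builds with a shell loop and `bc` could also be computed in a single awk pass. The sketch below assumes the usual `ps aux` column layout (USER, PID, %CPU, %MEM, ...); the function name `sum_cpu_mem` is hypothetical. It reads ps-style lines from stdin so it can be tried on sample data, and it filters on the USER column, which is slightly stricter than the script's `grep -v root`.

```shell
#!/bin/bash
# sum_cpu_mem NAME: read `ps aux`-style lines on stdin and print
# "CPU_SUM MEM_SUM" for lines that mention NAME and are not owned by root.
# Columns assumed: $1=USER, $3=%CPU, $4=%MEM.
sum_cpu_mem() {
    awk -v name="$1" '
        $1 != "root" && index($0, name) { cpu += $3; mem += $4 }
        END { printf "%.1f %.1f\n", cpu, mem }
    '
}

# Real use (hypothetical service name): ps aux | sum_cpu_mem nginx
```

As with the `grep` pipeline in the script, the filtering process itself can match NAME when it does not run as root, so this is best run from root's crontab.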
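III. Notes
1. Every service listed in NAME_LIST must have a script in the /etc/init.d directory that supports the stop and start actions.
2. PID_CPU_MAX is the CPU limit for a single process. Only php-fpm and httpd processes are killed when they exceed it; for any other service the script just logs a warning.
3. PID_MEM_SUM_MAX is the limit on the combined memory usage of all processes belonging to that service, not on total system memory usage.
4. EMAIL: you receive a notification email only when a service fails to start again or when the website check does not return HTTP 200.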
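The first note can be checked before the monitor is enabled. The sketch below lists the names that have no executable script in the init directory; the function names and the overridable `INIT_DIR` variable are assumptions for illustration. It does not verify that a script really supports stop and start.

```shell
#!/bin/bash
# Directory holding the init scripts; overridable so the check can be tried safely.
INIT_DIR="${INIT_DIR:-/etc/init.d}"

# has_init_script NAME: succeed if NAME is an executable file in INIT_DIR.
has_init_script() {
    [ -f "$INIT_DIR/$1" ] && [ -x "$INIT_DIR/$1" ]
}

# missing_services NAME...: print every name without a usable init script.
missing_services() {
    local name
    for name in "$@"; do
        has_init_script "$name" || echo "$name"
    done
}

# Usage sketch with the NAME_LIST from the script:
# NAME_LIST="httpd nginx mysql"
# missing_services $NAME_LIST
```

Any name printed by `missing_services` should be removed from NAME_LIST or given an init script first, since the monitor calls /etc/init.d/$NAME directly.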
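IV. Update history:
2011.11.28: Removed the nginx 502 status monitoring, improved per-process CPU monitoring, and fixed issues such as inaccurate data.
2011.12.07: Further fixes for incorrect CPU monitoring; added email notification after downtime.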
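This site is licensed under: Creative Commons Attribution-NonCommercial-ShareAlike 3.0
Copyright notice: when reposting an original article, be sure to credit the original source with a hyperlink.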