Frustration with my home network’s WAN connection randomly flapping, combined with being somewhat cheap, led me to see if I could get paged without having to worry about email relays or paying for a traditional notification service à la PagerDuty (pricey) or VictorOps (unreliable).
The first order of business was to write a simple script to perform some checks against my home network. It’s a shell script run through cron that maintains a running log of check attempts and results. Each time it runs, it reads the last result, performs new checks, and, if the state has changed since the previous attempt, sends out an alert.
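The state-tracking idea described above can be sketched on its own, separately from the network checks. This is a minimal illustration, not the script itself: the function names (`last_status`, `status_changed`, `record_status`) are hypothetical, and it assumes each log line has the form "Mon DD HH:MM:SS STATUS details", so the status is the fourth field.

```shell
#!/bin/bash
# Sketch: alert only when the status changes between runs.
# Function names and the log layout are assumptions for illustration.

# Print the status field (4th word) of the last log line.
# Prints nothing if the log does not exist yet or is empty.
last_status() {
    local log_file=$1
    [ -f "${log_file}" ] || return 0
    tail -n 1 "${log_file}" | awk '{ print $4 }'
}

# Succeed (exit 0) when the new status differs from the last logged one.
status_changed() {
    local log_file=$1 new_status=$2
    [ "$(last_status "${log_file}")" != "${new_status}" ]
}

# Append a timestamped result line to the log.
record_status() {
    local log_file=$1 new_status=$2 details=$3
    echo "$(date +"%h %d %T") ${new_status} ${details}" >> "${log_file}"
}
```

A run would call `status_changed` before `record_status` and send a notification only when it succeeds. Note that with this sketch the very first run counts as a change, because there is no previous status to compare against.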
The code for the script is below; it is also available on GitHub.
#!/bin/bash
## Very basic check to see if home network is up.
## First attempt SSH connection to server.
## If this fails, then attempt to ping router.
## Record result of check to logfile.
## If the current check status is different from the last check status,
## send a message to Slack via webhook.
## Host and SSH port to check
hName="[HostName]"
port="[Port]"
## Slack Hook and Channel
slackWebHook="https://hooks.slack.com/services/[WebHookLink]"
slackChannel="@[SlackUserName]"
## LogFile to maintain check state.
logFile="/[someLogDirectory]/check-home.log"
## Send a color coded message to Slack
function notifySlack () {
    local status=$1
    local hostname=$2
    local result=$3
    local color payload
    case "${status}" in
        OK)
            color="good"
            ;;
        WARNING)
            color="warning"
            ;;
        CRITICAL)
            color="danger"
            ;;
        *)
            color="#909090"
            ;;
    esac
    payload="\"attachments\": [{ \"title\": \"${hostname} status is ${status}\", \"text\": \"${result}\", \"color\": \"${color}\" }]"
    if [ -n "${slackChannel}" ]; then
        curl -s -XPOST --data-urlencode "payload={ \"channel\": \"${slackChannel}\", ${payload} }" "${slackWebHook}" > /dev/null 2>&1
    else
        curl -s -XPOST --data-urlencode "payload={ ${payload} }" "${slackWebHook}" > /dev/null 2>&1
    fi
}
## Get previous status (empty if the log file does not exist yet)
lastEvent=$(tail -n 1 "${logFile}" 2>/dev/null)
lastStatus=$(echo "${lastEvent}" | awk '{ print $4 }')
### First check whether SSH on the host is up.
results=$(echo QUIT | nc -v -w 5 "${hName}" "${port}" 2>&1 | grep -v mismatch)
## ${results} is left unquoted on purpose so multi-line output collapses to one line
res=$(echo ${results} | awk '{print $5}')
## Check connection status, set to WARNING if this fails
if [ "${res}" != "open" ]; then
    echo $(date +"%h %d %T") WARNING $results >> "${logFile}"
    status="WARNING"
    ## Try pinging host
    results2=$(ping -c 5 "${hName}" | grep packets)
    ## Pull out the integer loss percentage wherever it appears in the summary
    percent=$(echo ${results2} | sed -n 's/.* \([0-9][0-9]*\)[.0-9]*% packet loss.*/\1/p')
    ## No ping summary at all (e.g. name resolution failed) counts as total loss
    if [ "${percent:-100}" -gt 0 ]; then
        echo $(date +"%h %d %T") CRITICAL $results2 >> "${logFile}"
        status="CRITICAL"
    fi
else
    ## Otherwise we are okay
    echo $(date +"%h %d %T") OK $results >> "${logFile}"
    status="OK"
fi
## Do we notify Slack?
if [ "${status}" != "${lastStatus}" ]; then
    results=$(tail -n 1 "${logFile}")
    notifySlack "${status}" "${hName}" "${results}"
fi
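The severity decision in the script (OK, WARNING, or CRITICAL) can also be pulled out into small functions that are easy to test without touching the network. The following is only a sketch under stated assumptions: `parse_loss` and `classify` are hypothetical helper names, and the parsing assumes the ping summary contains a figure followed by "% packet loss", as in the Linux and BSD/macOS formats I am aware of. Treating a missing summary as total loss is a choice made for this sketch.

```shell
#!/bin/bash
# Sketch: map check results to a severity without running any checks.
# parse_loss and classify are hypothetical helper names.

# Extract the integer packet-loss percentage from a ping summary line.
# Prints nothing if no "% packet loss" figure is found.
parse_loss() {
    echo "$1" | sed -n 's/.* \([0-9][0-9]*\)[.0-9]*% packet loss.*/\1/p'
}

# Decide the severity:
#   OK       - the port check reported "open"
#   WARNING  - the port check failed but ping lost no packets
#   CRITICAL - the port check failed and ping lost packets or gave no summary
classify() {
    local port_state=$1 ping_summary=$2
    if [ "${port_state}" = "open" ]; then
        echo "OK"
        return
    fi
    local loss
    loss=$(parse_loss "${ping_summary}")
    if [ "${loss:-100}" -gt 0 ]; then
        echo "CRITICAL"
    else
        echo "WARNING"
    fi
}
```

For example, `classify refused "5 packets transmitted, 5 received, 0% packet loss, time 4005ms"` prints `WARNING`, while `classify refused ""` prints `CRITICAL`.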
The script runs from cron every 2 minutes on a machine outside my home network. I don’t have the script post to a channel; instead, it sends me a direct message. Then, in my Slack notification settings, I make sure push notifications are enabled for direct messages and for certain words/phrases.
In the alert above, the system alerted when its SSH check against my home machine failed. I only consider this a warning, as the overall network could still be okay. The logic dictates that if the SSH check fails, the script then attempts to ping the router. The one below shows an alert and recovery related to the WAN interface dropping off the Internet.
Overall, it works pretty well. A script and cron make for pretty rudimentary monitoring, but they cover the basic need. These alerts have allowed me to narrow my search through my home router’s logs to track down its problems.