Nagios Little Shell Script

I used to work for a very strong company in São Paulo, focused on digital marketing (email marketing), where I was responsible, among other things, for monitoring servers and services.
One day we identified the need for some specific DNS monitoring.
So I assembled this little shell script for Nagios. It tries to resolve a predefined list of domain names within a given amount of time. When a lookup fails, Nagios gets the alert. Otherwise Nagios gets an ‘OK, we’re alright’.
It was very useful for us to know when our email delivery was being hurt by a network issue on the receiving mail servers' side: it allowed us to slow down or stop sending emails to those servers until they were back online. In the end it saved us money, time, memory and server processing on our end.
All that thanks to a simple shell script.

Simple as it is, I am posting it here because I always enjoy assembling small, simple and useful shell scripts. Linux is fun anytime.

Shebang and variable declaration:

#!/bin/bash

# Nagios plugin exit codes, used by the exit calls further down
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Domains to query and the label shown for each one in the output
HOSTS[0]="hotmail.com"		DOMAIN[0]="Hotmail"
HOSTS[1]="hotmail.com.br"	DOMAIN[1]="Hotmail BR"
HOSTS[2]="gmail.com"		DOMAIN[2]="Gmail"
HOSTS[3]="yahoo.com"		DOMAIN[3]="Yahoo!"
HOSTS[4]="yahoo.com.br"		DOMAIN[4]="Yahoo! Br"
HOSTS[5]="uol.com.br"		DOMAIN[5]="UOL"
HOSTS[6]="bol.com.br"		DOMAIN[6]="BOL"
HOSTS[7]="terra.com.br"		DOMAIN[7]="Terra"
HOSTS[8]="ig.com.br"		DOMAIN[8]="iG"
HOSTS[9]="globo.com"		DOMAIN[9]="Globo"

Here I just check if the script can run the necessary utility:

err=""
# command -v is a shell builtin, so this works even where `which` is not installed
host=$(command -v host)	|| err="host"

if [ -n "$err" ] ; then
	echo "Error. Command not found: $err"
	exit $STATE_CRITICAL
fi
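
If the script ever grows to depend on more than one utility, the same check can be wrapped in a small function. This is only a sketch: `require_cmds` is a name I made up for the illustration, it is not part of the script, and returning 2 assumes the usual Nagios meaning of CRITICAL.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): reports every
# missing command and returns non-zero if at least one is missing.
require_cmds() {
	missing=""
	for cmd in "$@"; do
		# command -v returns 0 only when the command can be found
		command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
	done
	if [ -n "$missing" ]; then
		echo "Error. Command not found:$missing"
		return 2	# assumed: 2 means CRITICAL for Nagios
	fi
	return 0
}
```

It could then be called near the top of the script as `require_cmds host || exit $?`.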

This function just prints the usage message. It gets called when a required parameter is missing or an unknown option is passed. The -t parameter is the DNS query type (A, MX, NS and so on) and -W is the timeout in seconds; both are handed straight to host.

help(){
	echo "Usage: $0 -t {query type} -W {time out seconds}"
}

Here the options are parsed and the script checks that neither parameter is empty (the parameters are the query type and the timeout value):

while getopts t:W: OPTION ; do
	case "$OPTION" in
		t)
			t=$OPTARG
		;;
		W)
			W=$OPTARG
		;;
		*)
			help
		;;
	esac
done

if [ -z "$t" ] || [ -z "$W" ] ; then
	help;
	exit $STATE_WARNING
fi
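
To see how getopts behaves without running the real check, the parsing can be tried in isolation. A minimal sketch, assuming a made-up function name `parse_args` that mirrors the -t and -W options above and fails when either one is missing:

```shell
#!/bin/bash
# Hypothetical wrapper around the same getopts loop, for experimenting.
parse_args() {
	t=""; W=""
	OPTIND=1	# reset so the function can be called more than once
	while getopts t:W: OPTION; do
		case "$OPTION" in
			t) t=$OPTARG ;;
			W) W=$OPTARG ;;
			*) return 1 ;;	# unknown option or missing argument
		esac
	done
	# both values are mandatory; the status of this test is the return value
	[ -n "$t" ] && [ -n "$W" ]
}

parse_args -t MX -W 5 && echo "type=$t timeout=$W"	# prints: type=MX timeout=5
parse_args -t MX || echo "missing parameter"		# prints: missing parameter
```

Inside a function getopts reads the function's own arguments, which is why the calls above can pass the options directly.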

And here is where the action happens. Query is made, timeout is set and DNS is resolved. Or not.

x=0
while [ "$x" -lt "${#HOSTS[@]}" ] ; do

	# host exits non-zero when the lookup fails, which includes
	# (but is not limited to) hitting the -W timeout
	if host -t "$t" -W "$W" "${HOSTS[$x]}" >/dev/null ; then
		RETURN[${#RETURN[@]}]="$t for ${DOMAIN[$x]} successfully resolved."
	else
		RETURN[${#RETURN[@]}]="ERROR: Timeout when resolving $t for ${DOMAIN[$x]}."
	fi

	x=$((x + 1))

done
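
The loop can also be exercised offline by swapping host for a stand-in. In the sketch below `fake_host` and `RESOLVER` are hypothetical names that exist only for this illustration, the two-entry arrays are a cut-down version of the list above, and the stub simply pretends that every name ending in .br fails to resolve.

```shell
#!/bin/bash
HOSTS=("gmail.com" "uol.com.br")
DOMAIN=("Gmail" "UOL")
t="MX"; W=5

# Stand-in for host: succeeds unless the name (last argument) ends in .br
fake_host() {
	case "${@: -1}" in
		*.br) return 1 ;;
		*)    return 0 ;;
	esac
}
RESOLVER=fake_host	# set to "host" to run real queries

RETURN=()
# ${!HOSTS[@]} expands to the array indexes, so both arrays stay in step
for i in "${!HOSTS[@]}"; do
	if "$RESOLVER" -t "$t" -W "$W" "${HOSTS[$i]}" >/dev/null; then
		RETURN+=("$t for ${DOMAIN[$i]} successfully resolved.")
	else
		RETURN+=("ERROR: Timeout when resolving $t for ${DOMAIN[$i]}.")
	fi
done

printf '%s\n' "${RETURN[@]}"
# prints:
# MX for Gmail successfully resolved.
# ERROR: Timeout when resolving MX for UOL.
```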

Once it finishes collecting the DNS check results (stored in an array), it sends them to Nagios.

OUTPUT=`echo "${RETURN[@]}" | sed -r "s/\. /\n/g" | grep "ERROR"`

if [ ! -z "$OUTPUT" ] ; then
	echo "$OUTPUT"
	exit 1
else
	echo "OK"
	exit 0
fi
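
As a variation on this reporting step, here is a minimal sketch that filters the messages without the sed round trip and returns 2, which Nagios treats as CRITICAL (0 is OK, 1 is WARNING). The function name `report` and the choice of CRITICAL for failures are my assumptions; the script above exits with 1.

```shell
#!/bin/bash
# Hypothetical variation: takes the collected messages as arguments,
# prints the failures (or OK) and returns the matching plugin status.
report() {
	# one message per line, keep only the failures;
	# "|| true" because grep exits 1 when nothing matches
	errors=$(printf '%s\n' "$@" | grep '^ERROR' || true)
	if [ -n "$errors" ]; then
		echo "$errors"
		return 2	# assumed: CRITICAL
	fi
	echo "OK"
	return 0
}

report "A for Gmail successfully resolved." \
       "ERROR: Timeout when resolving A for UOL." || echo "exit status would be $?"
```

In the script it would be called as `report "${RETURN[@]}"; exit $?`.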

That’s it. A simple script in a simple language for a simple purpose that solved complicated problems.
