Wednesday, 25 March 2015

WHITE-BOX APPLICATION MONITORING WITH DOCKER AND BOSUN - AT HOME!

In my previous post I showed you how to perform black-box monitoring of your web application.
Today I am going to show you how to perform white-box monitoring using Bosun. Bosun is an advanced, open-source monitoring and alerting system created by Stack Exchange. We are going to use it to perform some white-box monitoring on the car service application I used in the previous post.

The first thing to do is to start that Docker image again and start JBoss, if it is not already running:


/opt/wildfly/bin/standalone.sh -c standalone-full.xml -b 0.0.0.0

Now we can pull the Bosun Docker image from Docker Hub with the following command:

docker pull stackexchange/bosun

Once this image has been downloaded we can run it with:

docker run -d -p 4242:4242 -p 8070:8070 stackexchange/bosun

Bosun should now be accessible in your browser at 192.168.59.103:8070, where 192.168.59.103 is the IP address of the Boot2Docker host.
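Before opening the browser, it can help to confirm from a terminal that the published port answers. The following is a minimal shell sketch, assuming curl is installed; the helper names bosun_url and check_bosun are made up for illustration and are not part of Bosun or Docker.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Bosun or Docker).

# Build the web UI URL from a host and a port.
bosun_url() {
    printf 'http://%s:%s/' "$1" "$2"
}

# Print the HTTP status code returned by the Bosun web UI.
# curl prints 000 when the connection fails or times out.
check_bosun() {
    curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 "$(bosun_url "$1" "$2")"
}

# Usage (200 means the UI is answering):
#   check_bosun 192.168.59.103 8070
```

If the check prints 000, the container is probably not running or the port mapping from the docker run command is missing.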


Let’s go back for a moment to our web application Docker container. Here we have to download the Bosun collector (scollector) and start it:


./scollector-linux-amd64 -h 192.168.59.103:8070

Here 192.168.59.103 is the IP address of the Bosun machine.

You may have to first make the collector executable with:

chmod +x scollector-linux-amd64 

The collector was giving me some problems, so I had to install postfix first with:

apt-get install postfix

Now your www docker container is sending data to the Bosun docker container.

Let’s check the Bosun graphical interface in the browser.

Go to http://192.168.59.103:8070/ again and choose your www docker container from the “hosts” tab:



This data is being collected and aggregated by Bosun.




You can see CPU, memory, network and disk space usage.
There are many other features. Let’s explore the expression tab.

In this tab you can write an expression to keep a specific metric of interest under control.

I wrote an expression to check that the average CPU load over the last 5 minutes did not go above a threshold of 80%.
In the result tab you can see that this expression has not been satisfied, which means that over the last 5 minutes the average CPU load never went over 80%.
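The expression itself is not reproduced in the text, so what follows is a sketch of what such a check can look like rather than a copy of it. It assumes the CPU figure comes from scollector's os.cpu counter and that the container reports itself under the host tag www; both are assumptions, and cpu_expr is a made-up helper that only prints the expression so it can be pasted into the expression tab.

```shell
#!/bin/sh
# Hypothetical helper: prints a Bosun expression that is true when the
# 5-minute average of the os.cpu rate on one host exceeds a threshold.
# Assumptions: the metric is os.cpu (a counter, hence rate{counter,,1})
# and $1 is the value of the host tag; $2 is the threshold.
cpu_expr() {
    printf 'avg(q("avg:rate{counter,,1}:os.cpu{host=%s}", "5m", "")) > %s\n' "$1" "$2"
}

cpu_expr www 80
```

This prints avg(q("avg:rate{counter,,1}:os.cpu{host=www}", "5m", "")) > 80. The inner q(...) fetches the series for the last 5 minutes, avg(...) reduces it to a single number, and the comparison yields 0 while the threshold is not exceeded.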

A specific metric can also be selected in the Available metric tab:



Here I selected the metric “linux.loadavg_1_min”.
  



For this metric too I set a threshold of 80%, which has not been reached in the last minute, so under the normals tab you can read 1, as shown in the following image:


If I set the threshold to 0, we can see that one Critical is triggered. When this happens, an email is sent. The email template can be fully customised, as can be seen in the image above.
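To make the alert and its email more concrete, here is a hedged sketch of a Bosun configuration fragment, written out with a shell here-document. The names load.high, load and email_me, the address you@example.com and the file name load-alert.conf are placeholders chosen for this example, and the fragment assumes that the mail settings (smtpHost and emailFrom) are already present in the main Bosun configuration.

```shell
#!/bin/sh
# Write an example alert definition to a file (the path is a placeholder).
CONF_FILE="${CONF_FILE:-./load-alert.conf}"

# The quoted 'EOF' keeps the shell from expanding $q inside the fragment.
cat > "$CONF_FILE" <<'EOF'
notification email_me {
    email = you@example.com
}

template load {
    subject = {{.Last.Status}}: {{.Alert.Name}} on {{.Group.host}}
    body = Load alert {{.Alert.Name}} fired for host {{.Group.host}}
}

alert load.high {
    template = load
    $q = avg(q("avg:linux.loadavg_1_min{host=*}", "1m", ""))
    crit = $q > 80
    critNotification = email_me
}
EOF

echo "wrote $CONF_FILE"
```

Changing the crit line to $q > 0 reproduces the forced Critical described above, which is a quick way to check that the notification is actually delivered.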

Finally I executed a load test against the web application with:




It is clear from the image below that the average load started very low because there was no processing in the web application in the www docker image. The very first peak is the first execution of the load test. Then I waited 30 seconds, so the average load went down again, and it finally went up again when I executed the test a second time.
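The load test command itself is not reproduced in the text. As a stand-in, the pattern described above (a burst of requests, a 30 second pause, then a second burst) can be sketched with a plain curl loop. The helper name run_load, the URL and the request count below are hypothetical, and a dedicated load-testing tool would generate far more realistic traffic.

```shell
#!/bin/sh
# Hypothetical helper: send COUNT sequential GET requests to URL and
# print how many were attempted. Failed requests are ignored on purpose.
run_load() {
    url="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        curl -s -o /dev/null --max-time 5 "$url" || true
        i=$((i + 1))
    done
    echo "$i"
}

# Usage, with a placeholder URL for the web application:
#   run_load http://192.168.59.103:8080/ 500
#   sleep 30
#   run_load http://192.168.59.103:8080/ 500
```

Because the requests are sequential, this only produces a modest load; it is enough to see a bump in linux.loadavg_1_min, not to stress the application.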







