Data collection (for everything but the speedtests) is still done by the wonderful MRTG.
Graphing is RRDtool.
Latency/loss is handled by the mrtg-ping-probe add-on for MRTG.
The DNS lookup monitoring is of my own creation - I use 'dig' to check lookups at various DNS servers and a VBScript to turn the data into something MRTG can handle.
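As a rough illustration of that conversion step (this is not the actual VBScript), here is a minimal Python sketch. It assumes the saved 'dig' output contains the usual ';; Query time: N msec' line, and that MRTG is reading an external script, which is expected to print four lines: two values, an uptime string and a target name. The function names are made up for the example.

```python
import re


def parse_dig_query_time(dig_output):
    """Return the 'Query time' from dig's text output in ms, or None if absent.

    Hypothetical helper: assumes the standard ';; Query time: N msec' line.
    """
    match = re.search(r"Query time:\s*(\d+)\s*msec", dig_output)
    return int(match.group(1)) if match else None


def mrtg_lines(value_in, value_out, uptime="", name=""):
    """Format two values as the four lines MRTG reads from an external script.

    Line 1 and 2 are the two plotted values, line 3 is an uptime string,
    line 4 is the target name. Both may be left empty.
    """
    return "\n".join([str(value_in), str(value_out), uptime, name])
```

With the text of one 'dig' run in a variable, `print(mrtg_lines(ms, ms, "", "dns"))` would give MRTG the same lookup time for both plotted values, where `ms` is the result of `parse_dig_query_time`.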
My existing HTTP charts (which test page load times) are handled by http://www.webinject.org - which can natively output data in MRTG format.
I've got a script to measure HTTP response time, but it's not passing the data over into MRTG for some reason. It's a VBScript that uses 'wget' (GNU Wget 1.10.2) and 'ntimer' (from the Windows Server 2003 Resource Kit).
e.g.
ntimer -1 wget --quiet --spider http://www.bbc.co.uk
'--spider' means that wget only checks that the page is there and doesn't download it. So, it's as close to an 'HTTP ping' as I can get.
This returns:
0:00:04.031 0:00:00.031 0:00:00.046 0:00:05.546
The first value is elapsed time, the second is user time.
I would use the elapsed time in my reporting.
Therefore, the value from above would be 4031ms - which is very poor!!
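For reference, turning the first field of that output into the millisecond figure quoted above can be sketched like this in Python. It is only an illustration, assuming the one-line 'H:MM:SS.mmm' layout shown in the sample. The function names are hypothetical and this is not the VBScript itself.

```python
def ntimer_field_to_ms(field):
    """Convert one 'H:MM:SS.mmm' field to whole milliseconds.

    Assumes the layout from the sample output above. Integer arithmetic
    is used throughout to avoid floating-point rounding.
    """
    hours, minutes, seconds = field.split(":")
    whole, _, fraction = seconds.partition(".")
    millis = int((fraction + "000")[:3])  # pad or trim to 3 digits
    return (int(hours) * 3600 + int(minutes) * 60 + int(whole)) * 1000 + millis


def elapsed_ms(ntimer_line):
    """Return the first field (elapsed time) of an 'ntimer -1' line in ms."""
    return ntimer_field_to_ms(ntimer_line.split()[0])
```

Calling `elapsed_ms("0:00:04.031 0:00:00.031 0:00:00.046 0:00:05.546")` returns 4031.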
During testing the other night, I was getting 100-150ms.
But, the speed test is something else....
I downloaded the 'Speed Tester' application from BroadbandChoices.co.uk - it is configured to run a speed test every hour.
I don't think it's particularly accurate, or the best tester around. But, it solved a problem. It writes its data to a Microsoft Jet/Access DB, which unfortunately has a password on it.
I emailed them, told them what I was trying to do, and asked if I could have the password, so I may add it into my graphing/monitoring. They refused.
So, for legal reasons, I can't comment further on that!
Suffice to say, there is a scheduled task that runs every 20 minutes and updates the RRD used by the speedtest.
I'd rather be using speedtest.net for my speed testing. I've emailed them, and asked for a command-line test - or, something I can 'wget' to emulate a test. They can't help at present, but, may have something to cover this in the future.
It was the speedtest.net site that gave me the idea for the 'HTTP latency' monitoring - they don't use ping/ICMP for the latency figure shown on a speed test; they measure the HTTP response time instead.
When VM (badly) implemented the traffic shaping/prioritisation/management they were clearly prioritising ping/ICMP traffic higher than other traffic - presumably, in an attempt to fool people (that were running ping tests) into thinking the network was faster (and therefore, less problematic) than it really was.
That's why I was eager to get a 'real world latency' test done. Something to measure response, that wasn't based upon ICMP/ping.
If I wasn't such a Windows-person, and had it all running under Linux, I'd probably switch the ping-probe over to using something like TCP Ping.
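To give an idea of what a TCP-based probe measures, here is a minimal Python sketch that times a TCP connect instead of an ICMP echo. It is not the TCP Ping tool itself, just an illustration of the idea. The function name, default port and timeout are assumptions for the example.

```python
import socket
import time


def tcp_ping(host, port=80, timeout=5.0):
    """Return the time taken to open a TCP connection, in milliseconds.

    Returns None if the connection fails or times out. This times the
    TCP handshake only, so no HTTP request is sent.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
    except OSError:
        return None
    return elapsed * 1000.0
```

Something like `tcp_ping("www.bbc.co.uk", 80)` would then give a latency figure that doesn't depend on how ICMP is being prioritised.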