Showing posts from September, 2012

What I want in a monitoring tool

I started a new job a few weeks ago, and I'm now at a point where I'm investigating monitoring options. At past jobs I used Nagios, which I know will work, but I would like to look into other more modern tools. I am aware that #monitoringsucks, and I am pretty sure people have hashed these topics before, but here are some of the things I want from a modern monitoring tool:
- Ideally open source, or, if not, affordable per-host-per-month pricing (we already signed up as a paying customer of Boundary, for example)
- Installation and configuration should be easily scriptable
  - server installation, as well as addition/modification of clients, should be easily automated so it can be done with Puppet/Chef
  - API would be ideal
- Robust notifications/alerting rules
  - escalations
  - service dependencies
  - event handler scripts
  - alerts based on subsets of hosts/services
    - for example, alert me only when 2+ servers of the same type are down
- Out-of-the-box plugins
  - database-specific checks, for example
- Scalability
  - the monitor…
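To make the "alert me only when 2+ servers of the same type are down" requirement concrete, here is a minimal Python sketch of such a rule. It is not tied to any particular monitoring tool: the host dictionaries, the `type` and `up` fields, and the `groups_to_alert` function are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


def groups_to_alert(hosts, min_down=2):
    """Return the sorted server types that have at least `min_down` hosts down.

    `hosts` is a list of dicts with hypothetical keys "type" and "up";
    a real tool would supply this state from its own checks.
    """
    down_by_type = defaultdict(int)
    for host in hosts:
        if not host["up"]:
            down_by_type[host["type"]] += 1
    return sorted(t for t, count in down_by_type.items() if count >= min_down)


# Example state: two of three web servers are down, one of two db servers is down.
hosts = [
    {"name": "web1", "type": "web", "up": False},
    {"name": "web2", "type": "web", "up": False},
    {"name": "web3", "type": "web", "up": True},
    {"name": "db1", "type": "db", "up": False},
    {"name": "db2", "type": "db", "up": True},
]

print(groups_to_alert(hosts))  # ['web']
```

With the default threshold, only the web group triggers an alert; the single failed database host stays below it. Lowering `min_down` to 1 would make the rule behave like a conventional per-host alert.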

3 things to know when starting out with cloud computing

In the same vein as my previous post, I want to mention some of the basic but important things that someone starting out with cloud computing needs to know. Many times people see 'the cloud' as something magical, as the silver bullet that will solve all their scalability and performance problems. These people are in for a rude awakening if they don't pay attention to the following points.

Expect failure at any time

There are no guarantees in the cloud. Failures can and will happen, suddenly and mercilessly. Their frequency will increase as you increase the number of instances that you run in the cloud. It's a sickening feeling to realize that one of your database instances is gone, and there's not much you can do to bring it back. At that point, you need to rely on your disaster recovery plan (and you have a DR plan, don't you?) and either launch a new instance from scratch, or, in the case of a MySQL master server for example, promote a slave to a master. The s…