Wednesday, October 15, 2014

Testing CDN and geolocation with WebPageTest

Assume you want to migrate to a new CDN provider. Eventually you'll have to point your site's hostname as a CNAME to a domain name handled by the CDN provider. To test this setup before you put it in production, the usual way is to get an IP address corresponding to that CDN-handled domain name, then associate your site's hostname with that IP address in your local /etc/hosts file.

This works well for testing most of the functionality of your web site, but it doesn't work when you want to test geolocation-specific features such as displaying the currency based on the user's country of origin. For this, you can use a nifty feature from the amazing free service WebPageTest.

On the main page of WebPageTest, you can specify the test location from the dropdown. It contains a generous list of locations across the globe. To fake your DNS setting and point your site's hostname at the CDN provider's domain name, you can specify something like this in the Script tab:


This will effectively associate the page you want to test with the CDN provider-specified URL, so you will hit the CDN first from the location you chose.

Monday, October 13, 2014

Watch the open files limit when running Riak

I was close to expressing my unbridled joy at how little hand-holding our Riak cluster needs, when we started to see strangely increased latencies when hitting the cluster, on calls that should have been very fast. Meanwhile, the health of the Riak nodes seemed fine in terms of CPU, memory and disk. As usual, our good old friend the error log file pointed us towards the solution. We saw entries like this in /var/log/riak/error.log:

2014-10-11 03:22:40.565 UTC [error] <0.12830.4607> CRASH REPORT Process <0.12830.4607> with 0 neighbours exited with reason: {error,accept_failed} in mochiweb_acceptor:init/3 line 34
2014-10-11 03:22:40.619 UTC [error] <0.168.0> {mochiweb_socket_server,310,{acceptor_error,{error,accept_failed}}}
2014-10-11 03:22:40.619 UTC [error] <0.12831.4607> application: mochiweb, "Accept failed error", "{error,emfile}"

A Google search revealed that a possible cause of these errors is the dreaded open file descriptor limit, which is 1024 by default in Ubuntu.

To be perfectly honest, we had done almost no tuning on our Riak cluster, because it had been running so smoothly. But recently we started to throw more traffic at it, hence issues with open file descriptors made sense. To fix it, we followed the advice in this Riak doc and created /etc/default/riak with the contents:

ulimit -n 65536
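
To confirm that a new limit is actually in effect, a quick check from Python might look like the sketch below. It is only an illustration: it reports the limit of the process that runs it, so it would have to be started from the same environment as Riak, and the 65536 threshold is simply the value used above.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open file descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_at_least(minimum=65536):
    """True if the soft limit is unlimited or at least `minimum`."""
    soft, _hard = nofile_limits()
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft=%s hard=%s ok=%s" % (soft, hard, soft_limit_at_least()))
```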

We also took the opportunity to apply the networking-related kernel tuning recommendations from this other Riak tuning doc and added these lines to /etc/sysctl.conf:

net.ipv4.tcp_max_syn_backlog = 40000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_tw_reuse = 1

Then we ran sysctl -p to update the above values in the kernel. Finally we restarted our Riak nodes one at a time.
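
If you want to double check that the running kernel picked up the values, something along these lines could work. This is a sketch, not part of our original setup: it assumes a Linux box where sysctl keys are exposed under /proc/sys, and it silently skips keys it cannot read.

```python
import os

# The settings listed above, in sysctl.conf syntax.
WANTED = """
net.ipv4.tcp_max_syn_backlog = 40000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_tw_reuse = 1
"""


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return "/proc/sys/" + key.replace(".", "/")


def mismatches(settings):
    """Return {key: (wanted, actual)} for readable keys whose live value differs."""
    diffs = {}
    for key, wanted in settings.items():
        try:
            with open(proc_path(key)) as f:
                actual = f.read().strip()
        except OSError:
            continue  # not Linux, or the key is not available here
        if actual != wanted:
            diffs[key] = (wanted, actual)
    return diffs


if __name__ == "__main__":
    print(mismatches(parse_sysctl(WANTED)))
```

An empty dictionary means every readable key already has the wanted value.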

I am happy to report that ever since, we've had absolutely no issues with our Riak cluster. I should also say we are running Riak 1.3, and I understand that Riak 2.0 has better tests in place for avoiding this issue.

I do want to give kudos to Basho for an amazingly robust piece of technology, whose only fault is that it gets you into the habit of ignoring it because it just works!

Thursday, October 02, 2014

A quick note on haproxy acl rules

I blogged in the past about haproxy acl rules we used for geolocation detection purposes. In that post, I referenced acl conditions that were met when traffic was coming from a non-US IP address. In that case, we were using a different haproxy backend. We had an issue recently when trying to introduce yet another backend for a given country. We added these acl conditions:

       acl acl_geoloc_akamai_true_client_ip_some_country req.hdr(X-Country-Akamai) -m str -i SOME_COUNTRY_CODE
       acl acl_geoloc_src_some_country req.hdr(X-Country-Src) -m str -i SOME_COUNTRY_CODE

We also added this use_backend rule:

      use_backend www_some_country-backend if acl_akamai_true_client_ip_header_exists acl_geoloc_akamai_true_client_ip_some_country or acl_geoloc_src_some_country

However, the backend www_some_country-backend was never chosen by haproxy, even though we could see traffic coming from IP addresses in SOME_COUNTRY_CODE.

The cause of this issue was that another use_backend rule (for non-US traffic) was firing before the new rule we added. I believe this is because this rule is more generic:

       use_backend www_row-backend if acl_akamai_true_client_ip_header_exists !acl_geoloc_akamai_true_client_ip_us or !acl_geoloc_src_us

The solution was to modify the use_backend rule for non-US traffic to fire only when the SOME_COUNTRY acl condition isn't met:

       use_backend www_row-backend if acl_akamai_true_client_ip_header_exists !acl_geoloc_akamai_true_client_ip_us !acl_geoloc_akamai_true_client_ip_some_country or !acl_geoloc_src_us !acl_geoloc_src_some_country

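Another solution would probably be to change the order of the use_backend rules, so that the more specific rule for SOME_COUNTRY comes before the generic non-US rule. I couldn't find any good documentation on this at the time, but the haproxy configuration manual does state that use_backend rules are evaluated in their declaration order and that the first one that matches assigns the backend.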
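
To make the interaction between the rules easier to reason about, here is a small Python model of the behavior. It is a sketch under assumptions: it relies on haproxy evaluating use_backend rules in declaration order with the first match winning, "XX" is a placeholder for SOME_COUNTRY_CODE, and the default backend name is made up.

```python
def choose_backend(rules, facts, default="www-backend"):
    """Return the backend of the first rule whose condition matches.

    rules is a list of (backend, condition) pairs in declaration order.
    """
    for backend, condition in rules:
        if condition(facts):
            return backend
    return default


# In haproxy, "if A B or C" reads as (A and B) or C.
def some_country(f):
    return (f["via_akamai"] and f["akamai_country"] == "XX") or f["src_country"] == "XX"


def non_us_original(f):
    return (f["via_akamai"] and f["akamai_country"] != "US") or f["src_country"] != "US"


def non_us_fixed(f):
    return (
        (f["via_akamai"] and f["akamai_country"] not in ("US", "XX"))
        or f["src_country"] not in ("US", "XX")
    )


# A visitor from the new country hitting the load balancer directly.
visitor = {"via_akamai": False, "akamai_country": None, "src_country": "XX"}

broken = [("www_row-backend", non_us_original), ("www_some_country-backend", some_country)]
fixed = [("www_row-backend", non_us_fixed), ("www_some_country-backend", some_country)]
reordered = [("www_some_country-backend", some_country), ("www_row-backend", non_us_original)]

for name, rules in (("broken", broken), ("fixed", fixed), ("reordered", reordered)):
    print(name, "->", choose_backend(rules, visitor))
```

In this model both the tightened non-US condition and the reordering send the visitor to www_some_country-backend, while the original rule set sends them to www_row-backend.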

Wednesday, September 10, 2014

Booting a Raspberry Pi B+ with the Raspbian Debian Wheezy image

It took me a while to boot my brand new Raspberry Pi B+ with a usable Linux image. I chose the Raspbian Debian Wheezy image available on the downloads page of the official site. Here are the steps I needed:

1) Bought a micro SD card. Note: do NOT get a regular SD card for the B+, because it will not fit in the SD card slot. You need a micro SD card.

2) Inserted the SD card via an SD USB adaptor in my MacBook Pro.

3) Went to the command line and ran df to see which volume the SD card was mounted as. In my case, it was /dev/disk1s1.

4) Unmounted the SD card. I initially tried 'sudo umount /dev/disk1s1' but the system told me to use 'diskutil unmount', so the command that worked for me was:

diskutil unmount /dev/disk1s1

5) Used dd to copy the Raspbian Debian Wheezy image (which I had previously downloaded) per these instructions. Important note: the target of the dd command is /dev/disk1 and NOT /dev/disk1s1. I initially tried the latter, and the Raspberry Pi wouldn't boot. Besides the fact that nothing appeared on the monitor, one symptom that something was wrong was that the green light was solid and not flashing; a Google search revealed that one possible cause for that was a problem with the SD card. The dd command I used was:

dd if=2014-06-20-wheezy-raspbian.img of=/dev/disk1 bs=1m
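
Since mixing up the partition and the whole disk cost me some time, here is a tiny helper that derives the dd target from what df reports. It is only a sketch and assumes the usual macOS naming scheme of /dev/diskN for the disk and /dev/diskNsM for its partitions.

```python
import re


def whole_disk(device):
    """Return the whole-disk device for a macOS partition such as /dev/disk1s1."""
    match = re.fullmatch(r"(/dev/r?disk\d+)(s\d+)*", device)
    if match is None:
        raise ValueError("not a macOS disk device: %r" % device)
    return match.group(1)


print(whole_disk("/dev/disk1s1"))  # /dev/disk1
```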

6) At this point, I inserted the micro SD card into the SD slot on the Raspberry Pi, then connected the Pi to a USB power cable, a monitor via an HDMI cable, a USB keyboard and a USB mouse. I was able to boot and change the password for the pi user. The sky is the limit next ;-)

Wednesday, August 20, 2014

Two lessons on haproxy checks and swap space

Let's assume you want to host a WordPress site which is not going to get a lot of traffic. You want to use EC2 for this. You still want as much fault tolerance as you can get at a decent price, so you create an Elastic Load Balancer endpoint which points to 2 (smallish) EC2 instances running haproxy, with each haproxy instance pointing in turn to 2 (not-so-smallish) EC2 instances running WordPress (Apache + MySQL).

You choose to run haproxy behind the ELB because it gives you more flexibility in terms of load balancing algorithms, health checks, redirections, etc. Within haproxy, one of the WordPress servers is marked as a backup for the other, so it only gets hit by haproxy when the primary one goes down. On this secondary WordPress instance you set up MySQL to be a slave of the primary instance's MySQL.

Here are two things (at least) that you need to make sure you have in this scenario:

1) Make sure you specify the httpchk option in haproxy.cfg, otherwise the primary server will not be marked as down even if Apache goes down. So you should have something like:

backend servers-http
  server s1 weight 1 maxconn 5000 check port 80
  server s2 backup weight 1 maxconn 5000 check port 80
  option httpchk GET /

2) Make sure you have swap space in case the memory on the WordPress instances gets exhausted, in which case processes will be killed by the kernel's OOM killer (and one of those processes can be mysqld). By default, there is no swap space when you spin up an Ubuntu EC2 instance. Here's how to set up a 2 GB swapfile:

dd if=/dev/zero of=/swapfile1 bs=1024 count=2097152
mkswap /swapfile1
chmod 0600 /swapfile1
swapon /swapfile1
echo "/swapfile1 swap swap defaults 0 0" >> /etc/fstab
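
The count value passed to dd is just the desired size divided by the block size. If you want a different swapfile size, the arithmetic can be sketched like this (the helper name is mine, not part of any tool):

```python
def dd_block_count(size_gib, block_size=1024):
    """Number of block_size-byte blocks needed for a file of size_gib GiB."""
    total_bytes = size_gib * 1024 ** 3
    if total_bytes % block_size:
        raise ValueError("size is not a multiple of the block size")
    return total_bytes // block_size


print(dd_block_count(2))  # 2097152, the count used above with bs=1024
```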

I hope these two things will help you if you're not already doing them ;-)

Friday, August 15, 2014

Managing OpenStack security groups from the command line

I had an issue today where I couldn't connect to a particular OpenStack instance on port 443. I decided to inspect the security group it belongs to (let's call it myapp) from the command line:

# nova secgroup-list-rules myapp
| IP Protocol | From Port | To Port | IP Range   | Source Group |
| tcp         | 80        | 80      |  |              |
| tcp         | 443       | 443     | |              |

Note that the IP range for port 443 is wrong. It should be all IPs and not a /24 network.

I proceeded to delete the wrong rule:

# nova secgroup-delete-rule myapp tcp 443 443                                                               
| IP Protocol | From Port | To Port | IP Range   | Source Group |
| tcp         | 443       | 443     | |              |

Then I added back the correct rule:

 # nova secgroup-add-rule myapp tcp 443 443                                                                   
| IP Protocol | From Port | To Port | IP Range  | Source Group |
| tcp         | 443       | 443     | |              |

Finally, I verified that the rules are now correct:

# nova secgroup-list-rules myapp                                                                                       
| IP Protocol | From Port | To Port | IP Range  | Source Group |
| tcp         | 443       | 443     | |              |
| tcp         | 80        | 80      | |              |

Of course, the real test was to see if I could now hit port 443 on my instance, and indeed I was able to.
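
A quick way to reason about whether a rule's IP range lets a given client in is Python's ipaddress module. The sketch below is illustrative only: the addresses come from the reserved documentation ranges and are not the ones from our environment, while 0.0.0.0/0 is the standard way to write "all IPv4 addresses".

```python
import ipaddress


def rule_allows(cidr, client_ip):
    """True if client_ip falls inside the rule's CIDR range."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


client = "198.51.100.7"  # hypothetical client address
print(rule_allows("203.0.113.0/24", client))  # False: a /24 that excludes the client
print(rule_allows("0.0.0.0/0", client))       # True: all IPs
```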

Tuesday, July 22, 2014

Troubleshooting haproxy 502 errors related to malformed/large HTTP headers

We had a situation recently where our web application started to behave strangely. First nginx (which sits in front of the application) started to error out with messages of this type:

upstream sent too big header while reading response header from upstream

A quick Google search revealed that a fix for this is to bump up proxy_buffer_size in nginx.conf, for both http and https traffic, along these lines:

proxy_buffer_size   256k;
proxy_buffers   4 256k;
proxy_busy_buffers_size   256k;

Now nginx was happy when hit directly. However, haproxy was still erroring out with a 502 'bad gateway' return code, followed by PH. Here is a snippet from the haproxy log file:

Jul 22 21:27:13 haproxy[14317]: [22/Jul/2014:21:27:12.776] www-frontend www-backend/www2:80 1/0/1/-1/898 502 8396 - - PH-- 0/0/0/0/0 0/0 "GET /someurl HTTP/1.1"

Another Google search revealed that PH means that haproxy rejected the header from the backend because it was malformed.
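
If you need to spot these entries in bulk, the status code and the termination state can be pulled out of each log line with a regular expression. The pattern below is my own sketch, written against the log line shown above (haproxy's HTTP log format), and the explanations only cover the two flags discussed here, paraphrased from the haproxy documentation.

```python
import re

LINE = (
    "Jul 22 21:27:13 haproxy[14317]: [22/Jul/2014:21:27:12.776] www-frontend "
    "www-backend/www2:80 1/0/1/-1/898 502 8396 - - PH-- 0/0/0/0/0 0/0 "
    '"GET /someurl HTTP/1.1"'
)

# timers, status code, bytes read, two captured-cookie fields, termination state
PATTERN = re.compile(
    r"(?P<timers>[-+\d/]+) (?P<status>\d{3}) (?P<bytes>\+?\d+) "
    r"(?P<req_cookie>\S+) (?P<resp_cookie>\S+) (?P<term>[A-Za-z-]{4}) "
)

FIRST_FLAG = {"P": "session aborted by the proxy itself"}
SECOND_FLAG = {"H": "while waiting for complete, valid response headers from the server"}


def explain(line):
    """Return (status, termination state, explanation), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    term = match.group("term")
    why = "%s, %s" % (
        FIRST_FLAG.get(term[0], "unknown cause"),
        SECOND_FLAG.get(term[1], "unknown session state"),
    )
    return match.group("status"), term, why


print(explain(LINE))
```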

At this point, an investigation into the web app discovered a loop in the code that kept adding elements to a cookie included in the response header.

Anyway, I leave this here in the hope that somebody will stumble on it and benefit from it.

Thursday, July 17, 2014

First experiences with OpenStack

We hit a big milestone this week, as we started to use OpenStack as a private cloud, initially just for QA/integration environments. Up to now we've been creating KVM machines semi-manually, which used to take minutes. Now we cut down that process to seconds, calling the Nova API from the command line, e.g.:

$ nova boot --image precise-image --flavor www --key_name mykey --nic net-id=3eafbd4f-0389-4c5b-93ba-7764742ee8cd www1.qa1

Once an instance is provisioned, we bootstrap it with Chef:

$ knife bootstrap -x ubuntu --sudo -E qa1 -N www1.qa1 -r "role[base], role[www]"

Our internal network architecture is fairly complex, so my colleague Jeff Roberts spent quite some time bending OpenStack Neutron to his will (in conjunction with Open vSwitch) in order to support our internal VLANs. The OpenStack infrastructure has been stable so far, and it's just such a pleasure to do everything via an API and not to spin VMs up manually. Being back to working with a (private) cloud feels good.

This is just version 1.0 of our OpenStack rollout. Soon we'll start spinning up one environment at a time using chef-metal and fog, and we'll also integrate instance + environment spin-up with Jenkins. Exciting times ahead!

Friday, June 13, 2014

Setting up the hostname in Ubuntu

Most people recommend setting up the hostname on a Linux box so that:

1) running 'hostname' returns the short name (e.g. myhost)
2) running 'hostname -f' returns the FQDN
3) running 'hostname -d' returns the domain name

After experimenting a bit and also finding this helpful Server Fault post, here's what we did to achieve this (we did it via Chef recipes, but it amounts to the same thing):

  • make sure we have the short name in /etc/hostname:

(also run 'hostname myhost' at the command line)
  • make sure we have the FQDN as the first entry associated with the IP of the server in /etc/hosts: myhost
  • make sure we have the domain name set up as the search domain in /etc/resolv.conf:

Reboot the box when you're done to make sure all of this survives reboots.
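
The relationship between the three values is simple enough to express in a few lines of Python. This is just a sketch to illustrate the convention; myhost.example.com is a made-up FQDN, and local_names is a helper I added for checking a live box, where socket.getfqdn may consult DNS.

```python
import socket


def split_fqdn(fqdn):
    """Split an FQDN into (short name, domain name)."""
    short, _, domain = fqdn.rstrip(".").partition(".")
    return short, domain


def local_names():
    """Return (hostname, FQDN) as seen by this machine."""
    return socket.gethostname(), socket.getfqdn()


print(split_fqdn("myhost.example.com"))  # ('myhost', 'example.com')
```

With the setup described above, the first element corresponds to what 'hostname' prints and the second to what 'hostname -d' prints.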

Tuesday, May 20, 2014

Technologies to look into as a sysadmin

These are some of the technologies that I think are either established, or new and promising, but all useful for sysadmins, no matter what their level of expertise is. Some of them I am already familiar with, some are on my TODO list, some I am exploring currently. They all reflect my own taste, so YMMV.

Operating systems
  • Ubuntu

Programming/scripting languages
  • Go
  • Python/Ruby

Configuration management
  • Chef
  • Ansible

  • Sensu
  • Graphite
  • Logstash
  • ElasticSearch

Load balancer/Web server
  • HAProxy
  • Nginx

Relational databases
  • MySQL
  • PostgreSQL

Non-relational distributed databases
  • Riak
  • Cassandra

Service discovery
  • etcd
  • consul

  • KVM
  • Vagrant
  • Docker

Software defined networking (SDN)
  • Open vSwitch

  • OpenStack

  • CloudFoundry

This should keep most people in the industry busy for a while ;-)

Friday, April 25, 2014

Dashboards are important!

In this case, they were a factor in my having a discussion with Eric Garcetti, the mayor of Los Angeles, who was visiting our office. He was intrigued by the Graphite dashboards we have on 8 monitors around the Ops area and I explained to him a little bit of what's going on in terms of what we're graphing. I'll let you guess who is the mayor in this photo:

Slides from my remote presentation on "Modern Web development and operations practices" at MSU

Titus Brown was kind enough to invite me to present to his students in the CSE 491 "Web development" class at MSU. I presented remotely, via Google Hangouts, on "Modern Web development and operations practices" and it was a lot of fun. Perhaps unsurprisingly, most of the questions at the end were on how to get a job in this field and be able to play with some of these cool technologies. My answer was to become active in Open Source, beef up your portfolio on GitHub, go to conferences and network (this was actually Titus's idea, but I wholeheartedly agree), and in general  be curious and passionate about your field, and success will follow. I posted my slides on Slideshare if you are curious to take a look. Thanks to Dr. Brown for inviting me! :-)

Monday, April 07, 2014

Why does it work in staging but not in production?

This is a question that I am sure every developer and operations engineer out there has faced. There can be multiple answers to this question, and I'll try to offer some of the ones we arrived at. They have to do mainly with our Chef workflow, but I think they can be applied to any other configuration management tool.

1) A Chef cookbook version in staging is different from the version in production

This is a common scenario, and it's supposed to work this way. You do want to test out new versions of your cookbooks in staging first, then update the version of the cookbook in production.

2) A feature flag is turned on in staging but turned off in production

We have Chef attributes defined in attributes/default.rb that serve as feature flags. If a certain attribute is true, some recipe code or template section gets included which wouldn't be included if the attribute were false. The situation can occur where a certain attribute is set to true in the staging environment but is set to false in the production environment, at which point things can get out of sync. Again, this is expected, as you do want to test new features out in staging first, but don't forget to turn them on in production at some point.

3) A block of code or template is included in staging but not in production

We had this situation very recently. Instead of using attributes as feature flags, we were directly comparing the environment against 'stg' or 'prod' inside an if block in a template, and only including that template section if the environment was 'stg'. So things were working perfectly in staging, but mysteriously the template section wasn't even there in production. An added difficulty was that the template in question was peppered with non-indented if blocks, so it took us a while to figure out what was going on.

Two lessons here:

a) Make your templates readable by indenting code blocks.

b) Use attributes as feature flags, and don't compare directly against the current environment. This way, it's easier to always look at the default attribute file and see if a given feature flag is true or false.

4) A modification is made to the cookbook version in production directly on the Chef server

I blogged about this issue in the past. Suppose you have an environments file that pins a given cookbook (let's designate it as cookbook C) to 1.0.1 in staging and to 1.0.0 in production. You want to upgrade production to 1.0.1, because it was tested in staging and it worked fine. However, instead of i) modifying the environments/prod.rb file and pinning the cookbook C to 1.0.1, ii) updating the Chef server via "knife environment from file environments/prod.rb" and iii) committing your changes in git, you modify the version of the cookbook C directly on the Chef server with "knife environment edit prod".

Then, the next time you or somebody else modifies environments/prod.rb to bump up another cookbook to the next version, the version of cookbook C in that file is still 1.0.0, so when you upload environments/prod.rb to the Chef server, it will downgrade cookbook C from 1.0.1 to 1.0.0. Chaos will ensue the next time chef-client runs on the nodes that have recipes from cookbook C. Production will be broken, while staging will still happily work.
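
One way to guard against this is to compare the pins in your local environment file with what the Chef server currently has before you upload. The sketch below is hypothetical: it assumes you have already extracted both sets of pins into plain dictionaries of "x.y.z" version strings (without the constraint operator), and the cookbook names are placeholders.

```python
def parse_version(version):
    """Turn '1.0.1' into (1, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_downgrades(local_pins, server_pins):
    """Return {cookbook: (server_version, local_version)} for would-be downgrades."""
    downgrades = {}
    for cookbook, server_version in server_pins.items():
        local_version = local_pins.get(cookbook)
        if local_version is None:
            continue  # not pinned locally, nothing to compare
        if parse_version(local_version) < parse_version(server_version):
            downgrades[cookbook] = (server_version, local_version)
    return downgrades


# Cookbook C was edited directly on the server; D is being bumped locally.
local = {"C": "1.0.0", "D": "2.1.0"}
server = {"C": "1.0.1", "D": "2.0.0"}
print(find_downgrades(local, server))  # {'C': ('1.0.1', '1.0.0')}
```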

Here are 2 other scenarios not related directly to staging vs production, but instead having the potential to break production altogether.

You forget to upload the new version of the cookbook to the Chef server

You make all of your modifications to the cookbook, you commit your code to git, but for some reason you forget to upload the cookbook to the Chef server. Particularly if you keep the same version of the cookbook that is in staging (and possibly in production), then your modifications won't take effect and you may spend some quality time pulling your hair.

You upload a cookbook to the Chef server without bumping its version

There is another, even worse, scenario though: you do upload your cookbook to the Chef server, but you realize that you didn't bump up the version number compared to what is currently pinned to production. As a consequence, all the nodes in production that have recipes from that cookbook will be updated the next time they run chef-client. That's a nasty one. It does happen. So make sure you pay attention to your cookbook versioning process and stick to it!

Friday, March 07, 2014

More on haproxy geolocation detection and CDN services

In a previous blog post I described a method to do geolocation detection with haproxy. The country detection was based on the user's client IP. However, if you have a CDN service in front of your load balancer, then the source IPs will all belong to the CDN server farm, and the closest such server to an end user may not be in the same country as the user. Fortunately, CDN services generally pass that end user IP address in some specific HTTP header, so you can still perform the geolocation detection by inspecting that header. For example, Akamai passes the client IP in a header called True-Client-IP.

In our haproxy.cfg rules detailed below we wanted to handle both the case where our load balancer is hit directly by end users (in case we bypass any CDN service), and the case where the load balancer is hit via a CDN.

1) We set our own HTTP headers containing the country code as detected by geolocation based on a) the source IP (this is so we can still look at the source IP in case we bypass the CDN and hit our load balancer directly) and b) the specific CDN header containing the actual client IP (True-Client-IP in the case of Akamai):

http-request set-header X-Country-Src %[src,map_ip(/etc/haproxy/geolocation.txt)]

http-request set-header X-Country-Akamai %[req.hdr_ip(True-Client-IP,-1),map_ip(/etc/haproxy/geolocation.txt)]

2) We set an ACL that is true if we detect the presence of the True-Client-IP header, which tells us that we are hit via Akamai:

acl acl_akamai_true_client_ip_header_exists req.hdr(True-Client-IP) -m found

3) We set an ACL that is true if we detect that the country of origin (obtained via Akamai's True-Client-IP) is US:

acl acl_geoloc_akamai_true_client_ip_us req.hdr(X-Country-Akamai) -m str -i US

4) We set an ACL that is true if we detect that the country of origin (obtained via the source IP of the client) is US:

acl acl_geoloc_src_us req.hdr(X-Country-Src) -m str -i US

5) Based on the ACLs defined above, we send non-US traffic to a specific backend, IF we are being hit via Akamai (ACL #2) AND we detected that the country of origin is non-US (negation of ACL #3) OR if we detected that the country of origin is non-US via the source IP (negation of ACL #4):

use_backend www-backend-non-us if acl_akamai_true_client_ip_header_exists !acl_geoloc_akamai_true_client_ip_us or !acl_geoloc_src_us

(note that the AND is implicit in the way haproxy looks at combinations of ACLs)

6) We also set an HTTP header called X-Country which our application inspects in order to perform country-specific logic. We first set this header to the X-Country-Src header set in rule #1, but we override it if we are getting hit via Akamai:

http-request set-header X-Country %[req.hdr(X-Country-Src)]
http-request set-header X-Country %[req.hdr(X-Country-Akamai)] if acl_akamai_true_client_ip_header_exists
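
The combined effect of rules 1 through 6 can be modeled in a few lines of Python. This is a sketch of my reading of the configuration rather than anything haproxy itself runs: akamai_country is None when the True-Client-IP header is absent, and "default" stands in for whatever backend handles US traffic.

```python
def route(src_country, akamai_country=None):
    """Return (backend, X-Country value) for a request.

    src_country is the country mapped from the source IP; akamai_country
    is the country mapped from True-Client-IP, or None without that header.
    """
    via_akamai = akamai_country is not None
    src_is_us = src_country.upper() == "US"
    akamai_is_us = via_akamai and akamai_country.upper() == "US"

    # "if A !B or !C" reads as (A and not B) or (not C)
    non_us = (via_akamai and not akamai_is_us) or not src_is_us
    backend = "www-backend-non-us" if non_us else "default"

    # X-Country starts as the source-IP country and is overridden via Akamai
    x_country = akamai_country if via_akamai else src_country
    return backend, x_country


print(route("US"))        # direct hit from a US address
print(route("FR"))        # direct hit from a non-US address
print(route("US", "DE"))  # via Akamai: US edge server, client in Germany
```

Note that under this reading the second term of the use_backend condition applies whether or not the Akamai header is present.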

This looks pretty complicated, but it works well :-)

Friday, February 28, 2014

Example of Chef workflow

Here is a quick example of a Chef workflow that has been working for us. It can be easily improved on, especially around testing, but it's a good foundation.

1) Put your chef-repo on GitHub.
2) When you want to modify a cookbook, do a git pull to get the latest version of the cookbook.
3) Modify the cookbook.
4) Check your environments (I'll assume staging and production for now, to keep it simple) to see what version of the cookbook is used in production vs staging. Let's assume both staging and production environments use the latest version of the cookbook, say 0.1.
5) Modify metadata.rb and bump up the version of the cookbook to 0.2.
6) Modify the staging environment file (for example environments/stg.rb) and pin the cookbook you modified to version 0.2. Make sure the production environment is still pinned to 0.1.
7) Update the staging environment on the Chef server via: 'knife environment from file environments/stg.rb'
8) Upload the new version of the cookbook (0.2) to the Chef server via: 'knife cookbook upload mycookbook' (it should report version 0.2 after the upload)
9) Run chef-client on a staging box that uses the cookbook you modified. Check that everything looks good.
10) Assuming everything looks good in staging, modify the production environment file (for example environments/prod.rb) and pin the cookbook you modified to the new version 0.2.
11) Update the production environment on the Chef server via: 'knife environment from file environments/prod.rb'.
12) Run chef-client on a prod box and check that everything is OK. If it looks good, either let chef-client run by itself on all prod boxes, or run chef-client manually to force the change.
13) Commit your cookbook and environment changes into git and push to GitHub.

Note that there is the possibility of screw-ups if somebody forgets step #13. For this reason, I am usually doubly careful and check my local version of the environment files (stg.rb and prod.rb) against what is actually running on the Chef server. I run 'knife environment show stg' and compare the result to stg.rb. I also run 'knife environment show prod' and compare the result to prod.rb. Only if they both look good do I modify my local copies of stg.rb and prod.rb and then upload them to the Chef server. We've had issues in the past with changes that were made to the Chef server directly (via 'knife environment edit') that got overwritten when somebody uploaded their version of the environment file that contained an older version of the given cookbook. For this reason I don't recommend making changes directly on the Chef server by editing roles, environments, etc., but instead making all changes in your local files, then uploading those files to Chef and also committing those changes to GitHub.

As I said in the beginning, there is the opportunity to run various testing tools (at a minimum rubocop and Foodcritic) on your cookbook before uploading it to the Chef server. But that is for another post.

Monday, January 13, 2014

Geolocation detection with haproxy

A useful feature for a web application is the ability to detect the user's country of origin based on their source IP address. This used not to be possible in haproxy unless you applied Cyril Bonté's geolocation patches (see the end of this blog post for how exactly to do that if you don't want to live on the bleeding edge of haproxy). However, the latest development version of haproxy (which is 1.5-dev21 at this time) contains geolocation detection functionality.

Here's how to use the geolocation detection feature of haproxy:

1) Generate text file which maps IP address ranges to ISO country codes

This is done using Cyril's haproxy-geoip utility, which is available in his geolocation patches. Here's how to locate and run this utility:
  • clone patch git repo: git clone
  • the haproxy-geoip script is now available in haproxy-patches/geolocation/tools
    • for the script to run, you need to have the funzip utility available on your system (it's part of the unzip package in Ubuntu)
    • you also need the iprange binary, which you can 'make' from its source file available in the haproxy-1.5-dev21/contrib/iprange directory; once you generate the binary, copy it somewhere in your PATH so that haproxy-geoip can locate it
  • run haproxy-geoip, which prints its output (IP ranges associated to ISO country codes) to stdout, and capture stdout to a file: haproxy-geoip > geolocation.txt
  • copy geolocation.txt to /etc/haproxy
2) Set custom HTTP header based on geolocation

For this, haproxy provides the map_ip function, which locates the source IP (the predefined 'src' variable in the line below) in the IP range in geolocation.txt and returns the ISO country code. We assign this country code to the custom X-Country HTTP header:

http-request set-header X-Country %[src, map_ip(/etc/haproxy/geolocation.txt)]

If you didn't want to map the source IP to a country code, but instead wanted to inspect the value of an HTTP header such as X-Forwarded-For, you could do this:

http-request set-header X-Country %[req.hdr_ip(X-Forwarded-For,-1), map_ip(/etc/haproxy/geolocation.txt)]
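
To get a feel for what the lookup does, here is a rough Python equivalent. It is a sketch under assumptions: it assumes geolocation.txt holds one "network country-code" pair per line, it picks the most specific matching range, and the sample ranges (reserved documentation networks) and their country codes are invented for the example.

```python
import ipaddress

# Hypothetical sample in the assumed "network country-code" format.
SAMPLE = """
192.0.2.0/24 US
198.51.100.0/24 AU
"""


def load_ranges(text):
    """Parse lines of 'network country' into (ip_network, country) pairs."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        network, country = line.split()
        ranges.append((ipaddress.ip_network(network), country))
    return ranges


def map_ip(ranges, ip, default=""):
    """Return the country of the most specific range containing ip."""
    addr = ipaddress.ip_address(ip)
    best = None
    for network, country in ranges:
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, country)
    return best[1] if best else default


def last_forwarded_ip(header_value):
    """Pick the last address of a comma-separated X-Forwarded-For value."""
    return header_value.split(",")[-1].strip()


ranges = load_ranges(SAMPLE)
print(map_ip(ranges, "192.0.2.10"))
print(map_ip(ranges, last_forwarded_ip("203.0.113.5, 198.51.100.20")))
```

The last_forwarded_ip helper mirrors the -1 argument of req.hdr_ip, which selects the last occurrence of the header value.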

3) Use geolocation in ACLs

Let's assume that if the country detected via geolocation is not US, then you want to send the user to a different backend. You can do that with an ACL. Note that we compare the HTTP header X-Country which we already set above to the string 'US' using the '-m str' string matching functionality of haproxy, and we also specify that we want a case insensitive comparison with '-i US':

acl acl_geoloc_us req.hdr(X-Country) -m str -i US
use_backend www-backend-non-us if !acl_geoloc_us

If you didn't want to set the custom HTTP header, you could use the map_ip function directly in the definition of the ACL, like this:

acl acl_geoloc_us %[src, map_ip(/etc/haproxy/geolocation.txt)] -m str -i US
use_backend www-backend-non-us if !acl_geoloc_us

Speaking of ACLs, here's an example of defining ACLs based on the existence of a cookie and on the value of that cookie, then choosing a backend based on those ACLs:

acl acl_cookie_country req.cook_cnt(country_code) eq 1
acl acl_cookie_country_us req.cook(country_code) -m str -i US
use_backend www-backend-non-us if acl_cookie_country !acl_cookie_country_us

And now for something completely different...which is what I mentioned in the beginning of this post: 

How to use the haproxy geolocation patches with the current stable (1.4) version of haproxy

a) Patch haproxy source code with geolocation patches, compile and install haproxy:
  • clone patch git repo: git clone
  • change to haproxy-1.4.24 directory
  • copy haproxy-1.4-geolocation.patch to the root of haproxy-1.4.24
  • apply the patch: patch -p1 < haproxy-1.4-geolocation.patch
  • make clean
  • make TARGET=linux26
  • make install
b) Generate text file which maps IP address ranges to ISO country codes
  • install funzip: apt-get install unzip
  • create iprange binary
    • cd haproxy-1.4.24/contrib/iprange
    • make
    • the iprange binary will be created in the same folder. copy that to /usr/local/sbin
  • haproxy-geoip is located here: haproxy-patches/geolocation/tools
  • haproxy-geoip > geolocation.txt
  • copy geolocation.txt to /etc/haproxy 
c) Obtain country code based on source IP and use it in ACL

This is done via the special 'geolocate' statement and the 'geoloc' variable added to the haproxy configuration syntax by the geolocation patch:

geolocate src /etc/haproxy/geolocation.txt
acl acl-au geoloc eq AU
use_backend www-backend-au if acl-au

If instead of the source IP you want to map the value of the X-Forwarded-For header to a country, use:

geolocate hdr_ip(X-Forwarded-For,-1) /etc/haproxy/geolocation.txt

If you wanted to redirect to another location instead of using an ACL, use:

redirect location if { geoloc AU }

That's it for now. I want to thank Cyril Bonté, the author of the geolocation patches, and Willy Tarreau, the author of haproxy, for their invaluable help and their amazingly fast responses to my emails. It's a pleasure to deal with such open source developers passionate about the software they produce.  Also thanks to my colleagues Zmer Andranigian for working on getting version 1.4 of haproxy to work with geolocation, and Jeff Roberts for working on getting 1.5-dev21 to work.

One last thing: haproxy-1.5-dev21 has been very stable in production for us, but of course test it thoroughly before deploying it in your environment.

Monday, December 23, 2013

Ops Design Pattern: local haproxy talking to service layer

Modern Web application architectures are often composed of a front-end application layer (app servers running Java/Python/Ruby aided by a generous helping of JavaScript) talking to one or more layers of services (RESTful or not) which in turn may talk to a distributed database such as Riak.

Typically we see a pair of load balancers in HA mode deployed up front and handling user traffic, or more commonly serving as the origin for CDN traffic. In order to avoid deploying many pairs of load balancers in between the front-end app server layer and various services layers, or in between one service layer and another, one design pattern I've successfully used is an haproxy instance running locally on each node that needs to talk to N other nodes running some type of service. This approach has several advantages:

  • No need to use up nodes, be they bare-metal servers, cloud instances or VMs, for the sole purpose of running yet another haproxy instance (and you actually need 2 nodes for an HA configuration, plus you need to run keepalived or something similar on each node)
  • Potentially fewer bottlenecks, as each node fans out to all other services it needs to talk to, with no need to go first through a centralized load balancer
  • Easy deployment via Chef or Puppet, by simply adding the installation of the haproxy instance to the app node's cookbook/manifest
The main disadvantage of this solution is an increased number of health checks against the service nodes behind each haproxy (1 health check from each app node). Also, as @lusis pointed out, in some scenarios, for example when the local haproxy instances talk to a Riak cluster, there is the possibility of each app node seeing a different image of the cluster in terms of the particular Riak node(s) it gets the data from (but I think with Riak this is the case even with a centralized load balancer approach).
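To put a rough number on the health check overhead, here is a back-of-the-envelope Python sketch. The node counts and the 2-second check interval are made-up values for illustration, not measurements from our environment.

```python
def health_checks_per_second(checkers, interval_s):
    """Health checks per second arriving at ONE service node when
    `checkers` haproxy instances each probe it every `interval_s` seconds."""
    return checkers / interval_s

# Centralized approach: only the HA pair of load balancers runs checks.
central = health_checks_per_second(checkers=2, interval_s=2.0)

# Local approach: every app node runs its own haproxy (40 is hypothetical).
local = health_checks_per_second(checkers=40, interval_s=2.0)

print(central, local)
```

The check load on each service node grows linearly with the number of app nodes, so it is worth keeping the check itself cheap.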

In any case, I recommend this approach which has worked really well for us here at NastyGal. I used a similar approach in the past as well at Evite. 

Thanks to @bdha for spurring an interesting Twitter thread about this approach, and to @lusis and @obfuscurity for jumping into the discussion with their experiences. As @garethr said, somebody needs to start documenting these patterns!

Thursday, December 12, 2013

Setting HTTP request headers in haproxy and interpolating variables

I had the need to set a custom HTTP request header in haproxy. For versions up to 1.4.x, the way to do this is:

reqadd X-Custom-Header:\ some_string

However, some_string is just a static string, and I could see no way of interpolating a variable in the string. Googling around, this is possible in haproxy 1.5.x with this method:

http-request set-header X-Custom-Header %[dst_port]

where dst_port is the variable we want to interpolate and %[variable] is the syntax for interpolation.

Other examples of variables available for you in haproxy.cfg are in Section 7.3 "Fetching samples" in the haproxy 1.5 configuration manual.

Tuesday, December 10, 2013

Creating sensu alerts based on graphite data

We had the need to create Sensu alerts based on some of the metrics we send to Graphite. Googling around, I found this nice post by @ulfmansson talking about the way they did it at Recorded Future. Ulf recommends using Sean Porter's check-data.rb Sensu plugin for alerting based on Graphite data. It wasn't clear how to call the plugin, so we experimented a bit and came up with something along these lines (note that check-data.rb requires the sensu-plugin gem):

$ ruby check-data.rb -s -t "movingAverage(,10)" -w 100 -c 200

This runs the check-data.rb script against the server (-s option), requesting the value of the target metric movingAverage(,10) (-t option) and setting a warning threshold of 100 for this value (-w option) and a critical threshold of 200 (-c option). The target can be any function supported by Graphite. In this example, it is a 10-minute moving average for the number of sessions for the "assets" haproxy backend. By default check-data.rb looks at the last 10 minutes of Graphite data (this can be changed by specifying something like -f "-5mins").
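As a rough illustration of what a threshold check like this does with the Graphite data, here is a minimal Python sketch. It assumes the plugin looks at the most recent non-null datapoint and treats a value strictly above a threshold as a breach; I have not verified either detail against the check-data.rb source, and the sample datapoints are invented. The exit codes follow the usual Nagios-style convention that Sensu checks use.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def latest_value(datapoints):
    """Graphite's JSON output lists [value, timestamp] pairs and the
    value can be null (None); return the newest non-null value."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None

def check_status(value, warning, critical):
    """Map a metric value to an exit status (assumed comparison: '>')."""
    if value > critical:
        return CRITICAL
    if value > warning:
        return WARNING
    return OK

# Invented sample data in Graphite's [value, timestamp] layout.
datapoints = [[90.0, 1386700000], [150.0, 1386700060], [None, 1386700120]]
value = latest_value(datapoints)
print(value, check_status(value, warning=100, critical=200))
```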

To call the check in the context of sensu, you need to deploy it to the client which will run it, and configure the check on the Sensu server in a json file in /etc/sensu/conf.d/checks:

"command": "/etc/sensu/plugins/check-data.rb -s -t \"movingAverage(,10)\" -w 100 -c 200"

Friday, October 18, 2013

Avoiding keepalive storms in sensu

Sensu is a great new monitoring tool, but also a bit rough around the edges. We've been willing to live with that, because of its benefits, in particular ease of automation and increased scalability due to its use of a queuing system. Speaking of queueing systems, Sensu uses RabbitMQ for that purpose. We haven't had performance or stability issues with the rabbit, but we have been encountering a pretty severe issue with the way Sensu and RabbitMQ interact with each other.

We have systems deployed across several cloud providers and data centers, with site-to-site VPN links between locations. What started to happen fairly often for us was what we call a "keepalive storm", where all of a sudden all Sensu clients were seen by the Sensu server as unavailable, since no keepalive had been sent by the clients to RabbitMQ.  The thresholds for the keepalive timers in Sensu are hardcoded (at least in the Sensu version we are using, which is 0.10.2) and are defined in /opt/sensu/embedded/lib/ruby/gems/2.0.0/gems/sensu-0.10.2/lib/sensu/server.rb as 120 seconds for warnings and 180 seconds for critical alerts:

             thresholds = {
                :warning => 120,
                :critical => 180
             }

What we think was happening is that the connections between the Sensu clients and RabbitMQ (which in our case is running on the same box as the Sensu server) were reset, either because of a temporary glitch in the site-to-site VPN connection, or because of some other undetermined but probably network-related cause. In any case, this issue was becoming severe and was causing the engineer on pager duty to not get a lot of sleep at night.

After lots of hair-pulling, we found a workaround by specifying a non-default value for the heartbeat parameter in the RabbitMQ configuration file rabbitmq.config. Here's what the documentation says about the heartbeat parameter:

Value representing the heartbeat delay, in seconds, that the server sends in the connection.tune frame. If set to 0, heartbeats are disabled. Clients might not follow the server suggestion, see the AMQP reference for more details. Disabling heartbeats might improve performance in situations with a great number of connections, but might lead to connections dropping in the presence of network devices that close inactive connections.
Default: 600

Note that the default value is 600 seconds, much larger than the 120 and 180 second keepalive thresholds defined in Sensu. So what we did was set a heartbeat value of less than 120. We chose 60 seconds for this value and it seemed to work fine. We still have keepalive storms, but they are definitely due to real but temporary issues in site-to-site VPN connectivity and they usually resolve themselves immediately.
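The reasoning behind picking 60 can be written down as a tiny sanity check. This Python sketch only encodes our rule of thumb (heartbeats must be enabled and must fire more often than the lowest Sensu keepalive threshold); it is not something either Sensu or RabbitMQ documents as a guarantee.

```python
# Keepalive thresholds hardcoded in the Sensu version discussed above.
SENSU_KEEPALIVE_THRESHOLDS = {"warning": 120, "critical": 180}

def heartbeat_is_safe(heartbeat_s, thresholds=SENSU_KEEPALIVE_THRESHOLDS):
    """True when heartbeats are enabled (non-zero) and shorter than the
    smallest keepalive threshold."""
    return 0 < heartbeat_s < min(thresholds.values())

print(heartbeat_is_safe(600))  # the RabbitMQ default quoted above
print(heartbeat_is_safe(60))   # the value we chose
```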

One more thing: we install Sensu via its Chef community cookbook. The Sensu cookbook uses the RabbitMQ community cookbook, which doesn't define the heartbeat parameter as an attribute. We had to add that attribute, as well as use it in the rabbitmq.config.erb template file.

Just for reference, we modified cookbooks/rabbitmq/attributes/default.rb and added:

#avoid sensu keepalive storms!
default['rabbitmq']['heartbeat'] = 60

We also modified cookbooks/rabbitmq/templates/default/rabbitmq.config.erb and added:

{heartbeat, <%= node['rabbitmq']['heartbeat'] %>}

Disabling public key authentication in sftp

I just had an issue trying to sftp into a 3rd party vendor server using a user name and password. It worked fine with Filezilla, but from the command line I got:

Received disconnect from A.B.C.D: 11:
Couldn't read packet: Connection reset by peer

(A.B.C.D denotes the IP address of the sftp server)

I then ran sftp in verbose mode (-v) and got:

debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/mylocaluser/.ssh/id_rsa
Received disconnect from A.B.C.D: 11:
Couldn't read packet: Connection reset by peer

This made me realize that the sftp server is configured to accept password authentication only. I inspected the man page for sftp and googled around a bit to figure out how to disable public key authentication and I found a way that works:

sftp -oPubkeyAuthentication=no remoteuser@sftpserver

Wednesday, August 28, 2013

Keepalived, iproute2 and HAProxy (part 2)

In part 1 of this 2-part series, I explained how we initially set up keepalived and iproute2 on 2 HAProxy load balancers with the goal of achieving high availability at the load balancer layer. Each of the load balancers had 3 interfaces, and we wanted to be able to ssh into any IP address on those interfaces -- hence the need for iproute2 rules. However, adding keepalived into the mix complicated things.

To test failover at the HAProxy layer, we simulated a system failure by rebooting the primary load balancer. As expected, keepalived transferred the floating IP address to the secondary load balancer, and everything worked as expected. However, things started going south when the primary load balancer came back online. We had a chicken and egg problem: the iproute2 rules related to the floating IP address didn't kick in when rc.local was run, because the floating IP wasn't there yet. Then keepalived correctly identified the primary system as being up and transferred the floating IP there, but there was no route to it via iproute2. We decided that the iproute2 rules/policies unnecessarily complicated things, so we got rid of them. This meant we were back to one default gateway, on the same subnet as our front-end interface. The downside was that we were only able to ssh into one of the 3 IPs associated with the 3 interfaces on each load balancer, but the upside was that things were a lot simpler.

However, our failover tests with keepalived were still not working as expected. We mainly had issues when the primary load balancer came back after a reboot. Although keepalived correctly reassigned the floating IP to the primary LB, we weren't able to actually hit that IP over ssh or HTTP. It turned out that it was an ARP cache issue on the switch stack where the load balancers were connected. We had to clear the ARP cache in order for the floating IP to be associated again with the correct MAC. On further investigation, it turned out that the switches weren't accepting gratuitous ARP requests, so we enabled them by running this command on our switches:

ip arp gratuitous local

With this setup in place, we were able to fail over back and forth from the primary to the secondary load balancer. Note that whenever there are modifications to be made to the keepalived configuration, there is no good way we found to apply them (via chef) to the load balancers unless we take a very short outage while restarting keepalived on both load balancers.

Wednesday, July 10, 2013

The mystery of stale haproxy processes

We had the situation with our haproxy-based load balancers where our monitoring alerts were triggered by the fact that several haproxy processes were running, when in fact only one was supposed to be running. Looking more into it, we determined that each time Chef client ran (which by default is every 30 minutes), a new haproxy process was launched. The logic in the haproxy cookbook applied to that node was to do a 'service haproxy reload' every time the haproxy configuration file changed. Since our haproxy configuration file is based on a Chef template populated via a Chef search, that meant that the haproxy reload was happening on each Chef client run.

If you look in /etc/init.d/haproxy, you'll see that the reload launches a new haproxy process, while the existing process is supposed to finish serving existing connections, then exit. However, the symptom we were seeing was that the existing haproxy process never closed all the outstanding connections, so it never exited. Inspection via lsof also revealed that the haproxy process kept many network connections in the CLOSE_WAIT state. I need to mention that this particular haproxy box was load balancing requests from Ruby clients across a Riak cluster. After some research, it turned out that the symptom of haproxy connections in CLOSE_WAIT that never go away is due to the fact that the client connection goes away, while haproxy still waits for a confirmation of the termination of that connection. See this haproxy mailing list thread for a great in-depth explanation of the issue by the haproxy author Willy Tarreau.

In short, the solution in our case (per the mailing list thread) was to add

option forceclose

to the defaults section of haproxy.cfg.
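If you want to keep an eye on this symptom without running lsof by hand, one option is to count sockets by TCP state. The Python sketch below parses text in the layout of Linux's /proc/net/tcp, where the fourth column is the state in hex and 08 means CLOSE_WAIT. The sample rows are invented so the example runs anywhere, and note that unlike lsof this counts sockets system-wide rather than per process.

```python
from collections import Counter

# Subset of the kernel's TCP state codes as shown in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT",
              "08": "CLOSE_WAIT", "0A": "LISTEN"}

# Invented sample in the /proc/net/tcp layout (header plus three sockets).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 0
   1: 0100007F:1F90 0100007F:C350 08 00000000:00000000 00:00000000 00000000     0        0 0
   2: 0100007F:1F90 0100007F:C351 08 00000000:00000000 00:00000000 00000000     0        0 0
"""

def count_states(proc_net_tcp_text):
    """Return a Counter of TCP state names found in the given text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts

STATES = count_states(SAMPLE)
print(STATES["CLOSE_WAIT"])
```

On a real Linux box you would pass in the contents of /proc/net/tcp instead of SAMPLE and alert if the CLOSE_WAIT count keeps growing.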

Friday, May 31, 2013

Some gotchas around keepalived and iproute2 (part 1)

I should have written this blog post a while ago, while these things were still fresh in my mind. Still, better late than never.

Scenario: 2 bare-metal servers with 6 network ports each, to serve as our HAProxy load balancers in an active/failover configuration based on keepalived (I described how we integrated this with Chef in my previous post).

The architecture we have for the load balancers is as follows:

  • 1 network interface (virtual and bonded, see below) is on a 'front-end' VLAN which gets the incoming traffic hitting HAProxy
  • 1 network interface is on a 'back-end' VLAN where the actual servers behind HAProxy live
  • 1 network interface is on an 'ops' VLAN which we want to use for accessing the HAProxy server for monitoring purposes

We (and by the way, when I say we, I mean mostly my colleagues Jeff Roberts and Zmer Andranigian) used Open vSwitch to create a virtual bridge interface and bond 2 physical interfaces on this bridge for each of the 'front-end' and 'back-end' interfaces.

To install Open vSwitch on Ubuntu, use:

# apt-get install openvswitch-switch openvswitch-controller

To create a bridge:

# ovs-vsctl add-br frontend_if

To create a bonded interface with 2 physical NICs (eth0 and eth1) on the frontend_if bridge created above:

# ovs-vsctl add-bond frontend_if frontend_bond eth0 eth1 lacp=active other_config:lacp-time=slow bond_mode=balance-tcp

We did the same for the 'back-end' interface by creating a bridge and bonding eth2 and eth3. We also configured the 'ops' interface as a regular network interface on eth4. To assign IP addresses to frontend_if, backend_if and eth4, we edited /etc/network/interfaces and added stanzas similar to:

auto eth0
iface eth0 inet static

auto eth1
iface eth1 inet static

auto frontend_if
iface frontend_if  inet static
        # dns-* options are implemented by the resolvconf package, if installed

auto eth2
iface eth2 inet static

auto eth3
iface eth3 inet static

auto backend_if
iface backend_if  inet static
        # dns-* options are implemented by the resolvconf package, if installed

At this point, we wanted to be able to ssh from a remote location into the HAProxy box using any of the 3 IP addresses associated with frontend_if, backend_if, and eth4. The problem was that with the regular routing rules in Linux, there's one default gateway, which in our case was on the same VLAN with frontend_if (

The solution was to install and configure the iproute2 package. This allows you to have multiple default gateways, one per interface that you want to configure this way (this blog post on iproute2 commands proved to be very useful).

To configure a default gateway for each of the 2 interfaces we defined above (frontend_if and backend_if), we added the following commands to /etc/rc.local  so that they can be run each time the box gets rebooted:

echo "1 admin" > /etc/iproute2/rt_tables
ip route add dev backend_if src table admin
ip route add default via dev backend_if table admin
ip rule add from table admin
ip rule add to table admin

echo "2 admin2" >> /etc/iproute2/rt_tables
ip route add dev frontend_if src table admin2
ip route add default via dev frontend_if table admin2
ip rule add from table admin2
ip rule add to table admin2

This was working great, but there was another aspect to this setup: we needed to get keepalived working between the 2 HAProxy boxes. Running keepalived means there is a new floating IP, which is a virtual IP address maintained by the keepalived process. In our case, this floating IP ( was attached to the frontend_if interface, which means we had to add another iproute2 stanza:

echo "3 admin3" >> /etc/iproute2/rt_tables
ip route add dev frontend_if src table admin3
ip route add default via dev frontend_if table admin3
ip rule add from table admin3
ip rule add to table admin3
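Since the three stanzas above follow the same shape, they can be generated from a few parameters. The Python sketch below only prints the commands; it does not run them. The addresses are placeholders from the documentation range, and where each address goes in the commands (subnet, source address, gateway) is my reading of the standard ip syntax, since the actual values are not shown above.

```python
def policy_route_commands(table_id, table_name, dev, subnet, address, gateway):
    """Build the shell commands for one per-interface routing table."""
    return [
        # Register the table name (appends, so do not run this twice).
        f'echo "{table_id} {table_name}" >> /etc/iproute2/rt_tables',
        # Route for the interface's own subnet, inside the custom table.
        f"ip route add {subnet} dev {dev} src {address} table {table_name}",
        # Default gateway that only applies to this table.
        f"ip route add default via {gateway} dev {dev} table {table_name}",
        # Traffic from or to this address consults the custom table.
        f"ip rule add from {address} table {table_name}",
        f"ip rule add to {address} table {table_name}",
    ]

# Hypothetical values for the back-end interface.
commands = policy_route_commands(
    1, "admin", "backend_if", "192.0.2.0/24", "192.0.2.10", "192.0.2.1")
print("\n".join(commands))
```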

I'll stop here for now. Stay tuned for part 2, where you can read about our adventures trying to get keepalived to work as we wanted it to. Hint: it involved getting rid of iproute2 policies.

Friday, April 19, 2013

Setting up keepalived with Chef on Ubuntu 12.04

We have 2 servers running HAProxy on Ubuntu 12.04. We want to set them up in an HA configuration, and for that we chose keepalived.

The first thing we did was look for an existing Chef cookbook for keepalived -- luckily, @jtimberman already wrote it. It's a pretty involved cookbook, probably one of the most complex I've seen. The usage instructions are pretty good though. In any case, we ended up writing our own wrapper cookbook on top of keepalived -- let's call it frontend-keepalived.

The usage documentation for the Opscode keepalived cookbook contains a role-based example and a recipe-based example. We took inspiration from both. In our frontend-keepalived/recipes/default.rb file we have:

include_recipe 'keepalived'

node[:keepalived][:check_scripts][:chk_haproxy] = {
  :script => 'killall -0 haproxy',
  :interval => 2,
  :weight => 2
}
node[:keepalived][:instances][:vi_1] = {
  :ip_addresses => '',
  :interface => 'frontend_if',
  :track_script => 'chk_haproxy',
  :nopreempt => false,
  :advert_int => 1,
  :auth_type => :pass, # :pass or :ah
  :auth_pass => 'mypass'
}

This code overrides the default values for many of the attributes defined in the Opscode keepalived cookbook. It specifies the floating IP address that will be common between the 2 servers that will each run HAProxy (:ip_addresses). It also specifies the network interface on which the multicast-based keepalived protocol runs (:interface) and the 'check script' that tests whether HAProxy is still running on each server (:track_script).

However, we still needed a way to specify which of the 2 servers is the master and which is the backup (in keepalived parlance), as well as indicating priorities for each server. The usage document in the keepalived cookbook shows this as an example of using a single role to define the master and the backup:

  :keepalived => {
    :global => {
      :router_ids => {
        'node1' => 'MASTER_NODE',
        'node2' => 'BACKUP_NODE'
      }
    }
  }

We couldn't get this to work (if somebody who did reads this, please leave a comment and tell me how you did it!). Instead, we defined 2 roles, one for the master and one for the backup. Here's the master role:

$ cat frontend-keepalived-master.rb
name "frontend-keepalived-master"
description "install keepalived and set state to MASTER"

    "keepalived" => {
      "instance_defaults" => {
        "state" => "MASTER",
        "priority" => "101"
      }
    }


Here's the backup role:

$ cat frontend-keepalived-backup.rb
name "frontend-keepalived-backup"
description "install keepalived and set state to BACKUP"

    "keepalived" => {
      "instance_defaults" => {
        "state" => "BACKUP",
        "priority" => "100"
      }
    }


Notice that we override 2 attributes, the state and the priority. The defaults for these are in the Opscode keepalived cookbook, under attributes/default.rb

default['keepalived']['instance_defaults']['state'] = 'MASTER'
default['keepalived']['instance_defaults']['priority'] = 100

This was useful in determining how to specify the stanza overriding them in our roles -- it made us see that we needed to specify the instance_defaults key under keepalived in the role files.
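To see how the priorities from the two roles interact with the check script's weight of 2, here is a deliberately simplified Python model of the election. It assumes that a positive weight is added to a node's priority while its check script succeeds and that the node with the highest resulting priority holds the floating IP; it ignores nopreempt, ties, and everything else keepalived actually does. The node names lb1 and lb2 are hypothetical.

```python
WEIGHT = 2  # the :weight value of the chk_haproxy check script

def effective_priority(base_priority, haproxy_running, weight=WEIGHT):
    """Base priority plus the weight while the check script succeeds."""
    return base_priority + weight if haproxy_running else base_priority

def floating_ip_holder(nodes):
    """nodes maps a node name to (base_priority, haproxy_running)."""
    return max(nodes, key=lambda name: effective_priority(*nodes[name]))

# Priorities 101 (master role) and 100 (backup role), as in the roles above.
both_up = {"lb1": (101, True), "lb2": (100, True)}
haproxy_down_on_lb1 = {"lb1": (101, False), "lb2": (100, True)}

print(floating_ip_holder(both_up))
print(floating_ip_holder(haproxy_down_on_lb1))
```

In this model the weight has to be larger than the gap between the two priorities, otherwise a dead haproxy on the master would never cause a failover.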

At this point, we added the master role to the Chef run_list of server #1 and the backup role to the Chef run_list of server #2. We had to do one more thing on each server (which we'll add to the default recipe of our frontend-keepalived cookbook): per this very helpful blog post on setting up HAProxy and keepalived, we edited /etc/sysctl.conf and added:

then applied it via 'sysctl -p'. This was needed so that HAProxy can listen on the keepalived-created 'floating IP' common to the 2 servers, which is not a real IP tied to an existing local network interface.

Once we ran chef-client on each of the 2 servers, we were able to verify that keepalived does its job by pinging the common floating IP from a 3rd server, then shutting down the network interface 'frontend_if' on each server, with no interruption in the ICMP responses sent from the floating IP. Our next step is to do some heavy-duty testing involving HTTP requests handled by HAProxy, and see that there is no interruption in service when we fail over from one HAProxy server to the other.


My colleague Zmer Andranigian discovered an attribute in the Opscode keepalived cookbook that deals with the sysctl setup. The default value for this attribute is:

default['keepalived']['shared_address'] = false

If this attribute is set to 'true' (for example in one of the 2 roles we defined above), then the keepalived cookbook will create a file called /etc/sysctl.d/60-ip-nonlocal-bind.conf containing:


and will also set it in the running configuration of sysctl.

For reference, the role frontend-keepalived-master would contain the following attributes:

    "keepalived" => {
      "instance_defaults" => {
        "state" => "MASTER",
        "priority" => "101"
      },
      "shared_address" => "true"
    }

Tuesday, April 02, 2013

Using wrapper cookbooks in Chef

Not sure if this is considered a Chef best practice or not -- I would like to get some feedback, hopefully via constructive comments on this blog post. But I've started to see this pattern when creating application-specific Chef cookbooks: take a community cookbook, include it in your own, and customize it for your specific application.

A case in point is the haproxy community cookbook. We have an application that needs to talk to a Riak cluster. After doing some research (read 'googling around') and asking people on Twitter (because Tweeps are always right), it looks like the preferred way of putting a load balancer in front of Riak is to run haproxy on each application server that needs to talk to Riak, and have haproxy listen on some port number, then load balance those requests to the Riak backend. Here is an example of such an haproxy.cfg file.

So what I did was to create a small cookbook called haproxy-riak, add a default.rb recipe file that just calls

include_recipe 'haproxy'

and then customize the haproxy.cfg file via a template file in my new cookbook (I actually didn't do it via the template yet, only via a hardcoded cookbook file, but my colleague and Chef guru Jeff Roberts is working on templatizing it).

I also added

depends 'haproxy'

to metadata.rb.

I think this is pretty much all that's needed in order to create this 'wrapper' cookbook. This cookbook can then be used by any application that needs to talk to Riak. As a matter of fact, we (i.e. Jeff) are thinking that we should have such a cookbook per application (so we would call it app1-haproxy-riak for example) just so we can do things like search for different Riak clusters that we may have for different types of applications, and populate haproxy.cfg with the search results.

In any case, I would be curious to find out if other people are using this 'pattern', or if they found other ways to apply the DRY principle in their Chef cookbooks. Please leave a comment!

Monday, March 18, 2013

Installing ruby 2.0 on Ubuntu 12.04

I wanted to find a way to install ruby 2.0 on Ubuntu 12.04 via Chef, without going the rvm route, which is harder to automate. After several tries, I finally found a list of .deb packages which are necessary and sufficient as pre-requisites. Writing it down here for my future reference, and maybe it will prove useful to others out there.

I downloaded most of these packages from (if you google their names you'll find them), then I installed them via 'dpkg -i'. Here they are:


The actual ruby .deb package was built with fpm by my colleague Jeff Roberts courtesy of this blog post.

Wednesday, March 06, 2013

No snowflakes allowed

"Snowflake" is a term I learned from my colleague Jeff Roberts. It is used in the Chef community (maybe in the configuration management community at large as well) to designate a server/node that is 'unique', i.e. not in configuration management control. In a Chef environment, it means that the node in question was never added to Chef and never had chef-client run on it.

We've all been in situations where it seems overkill to go through the effort of automating the setup of a server. Maybe the server has a unique purpose within our infrastructure. Maybe we didn't feel like spending the time to create Chef recipes for that server. Whatever the reasoning, it seemed low-risk at the time.

Well, I am here to tell you there is danger in this way of thinking. Example: we deployed a server in EC2 manually. We installed the Sensu client on it manually and pointed it at our Sensu server. Everything seemed fine. Then one day we updated our Sensu configuration (via Chef) both on the Sensu server and on all the Sensu clients. Of course, the Sensu configuration on our snowflake server never got updated, since chef-client wasn't running on that server. As a result, the Sensu client wasn't checking in properly with the Sensu server, and the snowflake behaved as if it was falling off the map as far as our monitoring system was concerned. We had to manually update Sensu on the snowflake to bring it in sync with our configuration changes.

Basically, the result of having snowflake servers is that they do fall off the map as far as the overall automation of your infrastructure is concerned. They suffer bitrot, and you end up spending lots of time on their care and feeding, thus defeating the purpose of saving the time to automate them in the first place.

This being said, it's hard to be disciplined enough to run chef-client periodically on every single server in your infrastructure. I've never been able to do that before, but we are doing it now, mostly because of the insistence of Jeff. I do see the advantages of this discipline, and I do recommend it to everybody.

Tuesday, March 05, 2013

Video and slides for 'Five years of EC2 distilled' talk

On February 19th I gave a talk at the Silicon Valley Cloud Computing Meetup about my experiences and lessons learned while using EC2 for the past 5 years or so. I posted the slides on Slideshare and there's also a video recording of my presentation. I think the talk went pretty well, judging by the many questions I got at the end. Hopefully it will be useful to some people out there who are wondering if EC2 or The Cloud in general is a good fit for their infrastructure (short answer: it very much depends).

I want to thank Sebastian Stadil from Scalr for inviting me to give the talk (he first contacted me about giving a talk like this in -- believe it or not -- early 2008!). I also want to thank Adobe for hosting the meeting, and CDNetworks for sponsoring the meeting.

Thursday, February 07, 2013

A workflow of managing Chef with knife

My colleague Jeff Roberts has been trying hard to teach me how to properly manage Chef cookbooks, roles, and nodes using the knife utility. Here is a workflow for doing this in a way that aims to be as close as possible to 'best practices'. This assumes that you already have a Chef server set up.

1) Install chef on your personal machine

In my case, my personal laptop is running OS X (gasp! I used to be a big-time Ubuntu-everywhere user, but I changed my mind after being handed a MacBook Air).

I won't go into the gory details on installing chef locally, but here are a few notes:

  •  I installed the XCode command line tools, then I installed Homebrew. Note that plain XCode wasn't sufficient, I had to download the dmg package for the command line tools and install that.
  •  I installed git via brew.
  •  I installed chef via
  •  I installed the EC2 plugin for knife via
    • cd /opt/chef/embedded/bin/; sudo ./gem install knife-ec2
2) Clone chef-repo locally

Best practices dictate that you keep your chef-repo directory structure in version control. If you are using git, like we do, then you need to clone that locally via a git clone command.

3) Deploy chef client and validation keys

The keys are kept in chef-repo/.chef in our case. You need 2 keys: your_username.pem and validation.pem. You need to coordinate with your Chef server administrator to get them. A good way of passing keys around is to encrypt them on and send the link in an email, and communicate the decryption  by some out of band mechanism (such as voice).

4) Configure knife via knife.rb

You need a knife.rb file which sits in chef-repo/.chef as well. Here's a sample (replace username with your actual username, and set the proper EC2 access keys):
log_level :info
log_location STDOUT
node_name 'username'
client_key '/Users/username/chef-repo/.chef/username.pem'
validation_client_name 'chef-validator'
validation_key '/Users/username/chef-repo/.chef/validation.pem'
chef_server_url ''
cache_type 'BasicFile'
cache_options( :path => '/tmp/checksums' )
cookbook_path [ './cookbooks' ]
# EC2:
knife[:aws_access_key_id] = "xxxxxxxxxx"
knife[:aws_secret_access_key] = "XXXXXXXXXXXXXXXXXXX"
5) Test your knife setup

An easy way to see if knife can communicate properly with the Chef server at this point is to list the nodes in your infrastructure via

knife node list

If this doesn't work, you need to troubleshoot it until you make it work ;-)

BTW, in my case, I need to run knife while in the chef-repo directory, for it to properly read the files in the .chef subdirectory.

6) Create a new cookbook

For this example, I'll create a cookbook called myblog. The idea is to install nginx and Octopress.

The proper command to use is:

# knife cookbook create myblog

This will create a directory called myblog under chef-repo/cookbooks, and it will populate it with files and subdirectories pertaining to that cookbook (such as attributes, definitions, recipes, etc).

7) Download any other required cookbooks

For this example, I will download the nginx cookbook from the Opscode community cookbooks. I first search for the nginx cookbook, then I install it:

# knife cookbook site search nginx
# knife cookbook site install nginx

Once the nginx cookbook is installed locally, you still need to upload it to the Chef server:

# knife cookbook upload nginx

8) Create recipe for installing nginx and Octopress in new cookbook

Now that the pre-requisite cookbook is installed and uploaded to Chef, you can use it in your custom cookbook. You need to add references to the pre-requisite cookbook (nginx) in the following 2 files under cookbooks/myblog:

Add this to metadata.rb:

depends "nginx"

Add this to

* nginx

The actual custom recipe for myblog lives in cookbooks/myblog/recipes/default.rb. In my case, here's what I do to install Octopress:

include_recipe 'nginx'
include_recipe 'ruby::1.9.1'

# set default ruby to point to 1.9.1 (which is actually 1.9.3!)
system("update-alternatives --install /usr/bin/ruby ruby /usr/bin/ruby1.9.1 400 --slave /usr/share/man/man1/ruby.1.gz ruby.1.gz /usr/share/man/man1/ruby1.9.1.1.gz --slave /usr/bin/ri ri /usr/bin/ri1.9.1 --slave /usr/bin/irb irb /usr/bin/irb1.9.1 --slave /usr/bin/rdoc rdoc /usr/bin/rdoc1.9.1")

# install bundler via gems
system("gem install bundler")

# get octopress source code and install it via bundle and rake
system("cd /opt/; git clone git:// octopress")
system("cd /opt/octopress; bundle install; rake install")

It's a pretty convoluted way of installing Octopress, and it requires installing version 1.9.1 of ruby via the Opscode ruby cookbook first. It took me a few tries to get it right, but it seems to do the job, although I know running system commands on the remote node is not the preferred way of configuring nodes with Chef.
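One weakness of bare system() calls is that Ruby's Kernel#system reports failure through its return value instead of raising, so a failed step does not stop the rest of the recipe. Below is a minimal sketch of two plain-Ruby wrappers that address this. The names run_or_fail and run_unless_exists are hypothetical helpers of my own, not part of Chef, and the preferred approach is still to use proper Chef resources.

```ruby
# Hypothetical helper (not part of Chef): run a shell command and raise
# if it fails, instead of silently continuing like a bare system() call.
def run_or_fail(cmd)
  # Kernel#system returns true on exit status 0, false on a non-zero
  # status, and nil if the command could not be executed at all.
  ok = system(cmd)
  raise "command failed: #{cmd}" unless ok
  true
end

# Hypothetical helper: run cmd only when path does not exist yet, so a
# second chef-client run does not repeat a one-time step.
# Returns true if the command ran, false if it was skipped.
def run_unless_exists(path, cmd)
  return false if File.exist?(path)
  run_or_fail(cmd)
end
```

For example, run_or_fail("gem install bundler") would abort the run if the gem install fails, and wrapping the clone step in run_unless_exists("/opt/octopress", ...) would skip it when the directory is already there.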

9) Upload new cookbook to Chef server

In order for the new cookbook you created to be available, you need to upload it to the Chef server:

# knife cookbook upload myblog

10) Test new cookbook by deploying new node

At this point, you are ready to test your shiny new cookbook. I did this by launching a new EC2 instance associated with the recipe in the myblog cookbook.

What follows is a command line using the knife EC2 plugin which took me a few tries to get right. It works for me, so I hope it will work for you too if you ever decide to do something similar. I had to dig into the knife-ec2 source code to get to some of these options, since they aren't documented in the README.

# knife ec2 server create -r "role[base], recipe[myblog]" -I ami-0d153248 --flavor c1.medium --region us-west-1 -g sg-7babb117 -i ~/.ssh/mykey.pem -x ubuntu -N myblog -S mykey -s subnet-ca6d20a3 -T T=techblog --ebs-size 50

This tells knife to launch an Ubuntu 12.04 instance (the -I AMI_ID option) associated with a 'base' role and the 'myblog' recipe (the -r option), size c1.medium (the --flavor option), in the us-west-1 region (the --region option), in a given security group (the -g option) and a given VPC subnet (the -s option), using the mykey.pem to ssh into the instance (the -i option -- where mykey.pem is the private key corresponding to the keypair you specify with the -S option) as user ubuntu (the -x option), using the mykey keypair name (the -S option -- this is a keypair that you must already have created), with a Chef node name of myblog (the -N option), an EC2 tag of techblog (the -T option), and finally an EBS root volume size of 50 GB (the --ebs-size option). Whew.
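Since that command line is hard to read, here is a minimal Ruby sketch that builds the same argument list from named settings. The flags are the ones used in the command above; the hash keys (run_list, image and so on) and the helper name knife_ec2_args are labels I made up for illustration.

```ruby
# Map of hypothetical setting names to the knife ec2 flags used above.
KNIFE_EC2_FLAGS = {
  run_list:       '-r',         # roles and recipes for the new node
  image:          '-I',         # AMI ID
  flavor:         '--flavor',   # instance size
  region:         '--region',
  security_group: '-g',
  identity_file:  '-i',         # private key used for ssh
  ssh_user:       '-x',
  node_name:      '-N',         # Chef node name
  ssh_key:        '-S',         # EC2 keypair name
  subnet:         '-s',         # VPC subnet
  tags:           '-T',
  ebs_size:       '--ebs-size'  # root volume size in GB
}.freeze

# Build the argv array for `knife ec2 server create` from a settings hash.
# Raises ArgumentError for setting names that are not in the map above.
def knife_ec2_args(settings)
  unknown = settings.keys - KNIFE_EC2_FLAGS.keys
  raise ArgumentError, "unknown settings: #{unknown.join(', ')}" unless unknown.empty?

  %w[knife ec2 server create] +
    settings.flat_map { |name, value| [KNIFE_EC2_FLAGS.fetch(name), value.to_s] }
end
```

The resulting array can be passed to system(*args), which runs knife without going through a shell, so the run list needs no extra quoting. The flip side is that ~ is not expanded in that case, so the key path should go through File.expand_path first.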

If everything goes well, you'll see something similar to this:

Instance ID: i-3c98b265
Flavor: c1.medium
Image: ami-0d153248
Region: us-west-1
Availability Zone: us-west-1b
Security Group Ids: sg-7babb117
Tags: TtechblogNametechblog
SSH Key: mykey

Waiting for server................
Subnet ID: subnet-ca7e10a3
Private IP Address:

followed by the output of an ssh session in which chef-client will run on the newly created instance. You'll be able to see if the chef-client run was successful or not. In either case, you should be able to ssh into the new instance with the mykey.pem private key.

11) Commit any new or modified cookbooks

Now that you have tested your cookbooks (both the pre-requisite ones such as nginx, and new ones such as myblog), you need to commit them to the chef-repo git repository so other members of your team can take advantage of them. You do this with git add, git commit and git push.

12) Other useful knife commands

You should be able to get information from the Chef server about the new node you just launched by running:

# knife node show techblog
Node Name:   techblog
Environment: _default
Run List:    role[base],  recipe[myblog]
Roles:       base, sysadmin_sudoers
Recipes:     apt, ntp, timezone, chef-client::service, chef-client::delete_validation, base-apps, users::sysadmins, sudo, nagios-plugins, ruby, rubygems, sensu::client, myblog
Platform:    ubuntu 12.04

You can also edit the run list of a given node by running:

# knife node edit techblog

(you need to set your EDITOR variable to your favorite editor first).

To inspect a given role, use:

# knife role show monitoring
chef_type:            role
description:          Installs the sensu monitoring client and related software
json_class:           Chef::Role
name:                 monitoring

There are many other knife commands you can use -- in fact, using knife to its full potential is an art in itself. Here is a sample of knife commands, courtesy of our Chef guru Jeff Roberts:

This command searches sensu.client.subscriptions and finds nodes that are running the mysql check.

knife search node "sensu_client_subscriptions:mysql"  

Show the sensu subscriptions for the jira.corp node.

knife node show jira.corp -a sensu.client.subscriptions

Show the EC2 attrs for the test_box node.

knife node show test_box -a "ec2"

Search all nodes and find ones in the "us-west-*" availability zone.

knife search node "ec2_placement_availability_zone:us-west-*" -a "ec2"

Search for all nodes in the "webserver" role and show the "apache.sites" attribute.

knife search node "role:webserver" -a apache.sites

List all of the versions of the cookbook "nginx".

knife cookbook show nginx

Find all of the nodes in the "prod" environment.

knife search node "chef_environment:prod"

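Find the next available UID. The command lists the UIDs already in use; since sort compares lines as text here, the last line is the highest UID only as long as all UIDs have the same number of digits.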

knife search users "*:*" -a uid | grep uid | sort
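To avoid relying on a text sort, the next free UID can be computed numerically. This is a minimal sketch that assumes each user's UID shows up in the knife output on a line of the form "uid: 1234"; the helper name next_available_uid is hypothetical.

```ruby
# Hypothetical helper: given the text printed by the knife search above,
# return the next free UID (the highest UID found, plus one).
# Assumes each UID appears on its own line as "uid: <number>".
def next_available_uid(knife_output)
  uids = knife_output.scan(/^\s*uid:\s*(\d+)/).flatten.map(&:to_i)
  raise ArgumentError, 'no uid lines found' if uids.empty?

  uids.max + 1
end
```

It can be fed the command output directly, for example by capturing the knife search with backticks and passing the resulting string in.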