Tuesday, June 15, 2010

Common nginx configuration options

Google reveals a wealth of tutorials and sample nginx config files; here are some configuration tips that have been helpful to me.

Include files

Don't be shy about splitting up your main nginx.conf file into several smaller files. Your co-workers will be grateful. A structure that has been working for me is to have one file where I define my upstream pools, one file where I define locations that point to those pools, and one file where I define the servers that handle those locations.

Examples:

upstreams.conf

upstream cluster1 {
    # 'fair' requires the third-party nginx-upstream-fair module
    fair;
    server app01:7060;
    server app01:7061;
    server app02:7060;
    server app02:7061;
}

upstream cluster2 {
    fair;
    server app01:7071;
    server app01:7072;
    server app02:7071;
    server app02:7072;
}

locations.conf


location / {
    root /var/www;
    include cache-control.conf;
    index index.html index.htm;
}

location /services/service1 {
    proxy_pass_header Server;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Scheme $scheme;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    add_header Pragma "no-cache";

    proxy_pass http://cluster1/;
}

location /services/service2 {
    proxy_pass_header Server;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Scheme $scheme;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    add_header Pragma "no-cache";

    proxy_pass http://cluster2/service2;
}

servers.conf

server {
    listen 80;
    include locations.conf;
}

At this point, your nginx.conf looks very clean and simple. You can split it further if you like, for example by moving the gzip configuration options into their own include file.

nginx.conf

worker_processes 4;
worker_rlimit_nofile 10240;

events {
    worker_connections 10240;
    use epoll;
}

http {
    include upstreams.conf;

    include mime.types;
    default_type application/octet-stream;

    log_format custom '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $bytes_sent '
                      '"$http_referer" "$http_user_agent" "$http_x_forwarded_for" $request_time';

    access_log /usr/local/nginx/logs/access.log custom;

    proxy_buffering off;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    gzip on;
    gzip_min_length 10240;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml application/xml+rss image/svg+xml application/x-font-ttf application/vnd.ms-fontobject;
    gzip_disable "MSIE [1-6]\.";

    # proxy cache config
    proxy_cache_path /mnt/nginx_cache levels=1:2
                     keys_zone=one:10m
                     inactive=7d max_size=10g;
    proxy_temp_path /var/tmp/nginx_temp;

    proxy_next_upstream error;

    include servers.conf;
}

This nginx.conf file is fairly vanilla in terms of the configuration options I used, but a few of them are worth pointing out.

Multiple worker processes

This is useful when you're running nginx on a multi-core box. Example:

worker_processes 4;

Increased number of file descriptors

This is useful for nginx instances that get hit by very high traffic. You want to increase the maximum number of file descriptors that nginx can use (the default on most Unix systems is 1024; run 'ulimit -n' to see the value on your system). Example:

worker_rlimit_nofile 10240;
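Before relying on worker_rlimit_nofile, it helps to check what your OS currently allows. A quick sketch (the limits.conf path is a common default and may differ on your distro):

```shell
# Check the current per-process file descriptor limit (commonly 1024):
ulimit -n

# worker_rlimit_nofile can only raise nginx's limit up to the OS hard limit;
# if that is too low, raise it first, e.g. in /etc/security/limits.conf:
#   nginx  soft  nofile  10240
#   nginx  hard  nofile  10240
```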


Custom logging

See the log_format and access_log directives above. In particular, the "$http_x_forwarded_for" value is useful if nginx is behind another load balancer, and "$request_time" is useful to see the time taken by nginx when processing a request.
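With the custom log_format above, an access log entry looks roughly like this (all values invented for illustration); note that $request_time is in seconds, with millisecond resolution:

```
10.0.0.5 - - [15/Jun/2010:10:12:01 -0700] "GET /services/service1 HTTP/1.1" 200 1532 "-" "Mozilla/5.0" "203.0.113.7" 0.042
```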

Compression

This is useful when you want to compress certain types of content sent back to the client. Examples:


gzip on;
gzip_min_length 10240;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml application/xml+rss image/svg+xml application/x-font-ttf application/vnd.ms-fontobject;
gzip_disable "MSIE [1-6]\.";

Proxy options

These are options you can set per location. Examples:


proxy_pass_header Server;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Scheme $scheme;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
add_header Pragma "no-cache";


Most of these options have to do with setting custom HTTP headers (in particular 'no-cache', for when you don't want anything cached for that particular location).

Proxy cache

Nginx can be used as a caching server. You need to define a proxy_cache_path and a proxy_temp_path under your http directive, then use them in the locations you want to cache.


proxy_cache_path /mnt/nginx_cache levels=1:2
                 keys_zone=one:10m
                 inactive=7d max_size=10g;
proxy_temp_path /var/tmp/nginx_temp;

In the location you want to cache, you would add something like this:

proxy_cache one;
proxy_cache_key mylocation.$request_uri;
proxy_cache_valid 200 302 304 10m;
proxy_cache_valid 301 1h;
proxy_cache_valid any 1m;
proxy_cache_use_stale error timeout invalid_header http_500 http_502 http_503 http_504 http_404;

HTTP caching options

Many times you want to cache certain types of content and not others. You can specify your caching rules in a file that you include in your root location:

location / {
    root /var/www;
    include cache-control.conf;

    index index.html index.htm;
}

You can specify different expires headers and cache options based on the request URI. Examples (from cache-control.conf in my case):

# default: cache 1 day
expires +1d;

if ($request_uri ~* "^/services/.*$") {
    expires +0d;
    add_header Pragma "no-cache";
}

if ($request_uri ~* "^/(index.html)?$") {
    expires +1h;
}

SSL

All you need to do here is to define another server in servers.conf, and have it include the locations you need (which can be the same ones handled by the server on port 80 for example):

server {
    server_name www.example.com;
    listen 443;
    ssl on;
    ssl_certificate /usr/local/nginx/ssl/cert.pem;
    ssl_certificate_key /usr/local/nginx/ssl/cert.key;

    include locations.conf;
}
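For testing, you can generate a self-signed certificate and key with openssl (a production setup would use a CA-signed certificate; the file names match the ssl_certificate paths above):

```shell
# Create a self-signed cert/key pair valid for one year, no passphrase
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -subj "/CN=www.example.com" \
    -keyout cert.key -out cert.pem
```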



syslog-ng tips and tricks

Although I've been contemplating using scribe for our logging needs, for now I'm using syslog-ng. It's been doing the job well so far. Here are a couple of configuration tips:

1) Sending log messages for a given log facility to a given log file

Let's say you want to send all haproxy log messages to a file called /var/log/haproxy.log. In haproxy.cfg you can say:

global
 log 127.0.0.1 local7 info

...which means -- log all messages to localhost, to log facility local7 and with a log level of info.

To direct these messages to a file called /var/log/haproxy.log, you need to define the following in /etc/syslog-ng/syslog-ng.conf:

i) a destination:

destination df_haproxy { file("/var/log/haproxy.log"); };

ii) a filter:

filter f_haproxy { facility(local7); };

iii) a log (which ties the destination to the filter):

log {
    source(s_all);
    filter(f_haproxy);
    destination(df_haproxy);
};

You also need to configure syslog-ng to allow log messages sent via UDP from localhost. Add this line to the source s_all element:

udp(ip(127.0.0.1) port(514));
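For reference, the full s_all source might then look like this; the other entries shown are typical stock Debian/Ubuntu ones and may differ on your system, so add the udp() line to whatever s_all you already have:

```
source s_all {
    internal();
    unix-stream("/dev/log");
    file("/proc/kmsg" program_override("kernel"));
    udp(ip(127.0.0.1) port(514));
};
```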

Important note: since haproxy logs to the local7 facility, its messages will also be captured by /var/log/syslog and /var/log/messages, because those are configured in syslog-ng.conf as destinations for the f_syslog and f_messages filters, which by default include local7. As a result, every haproxy message would be logged three times. The solution: add local7 to the list of facilities excluded from the f_syslog and f_messages filters.
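The adjusted filters end up looking roughly like this; the stock definitions vary by distro, so edit your existing filters rather than copying these verbatim:

```
filter f_syslog { not facility(auth, authpriv, local7); };
filter f_messages { level(info..warn)
    and not facility(auth, authpriv, cron, daemon, mail, news, local7); };
```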

2) Sending log messages to a remote log host

Assume you want to centralize log messages for a given service by sending them to a remote log host. Let's assume that the service logs via the local0 facility. The same procedure applies, with the creation of the following elements in syslog-ng.conf:

i) a destination


destination df_remote_log {
  udp("remote_loghost" port (5000));
};


ii) a filter:


filter f_myservice { facility(local0); };

iii) a log:

log {
        source(s_all);
        filter(f_myservice);
        destination(df_remote_log);
};

Note that you can also send messages for this particular filter (corresponding to local0) to a local file, by creating a destination pointing to that file and a log element tying the filter to that destination, like this:

destination df_local_log { file("/var/log/myservice.log"); };
log {
        source(s_all);
        filter(f_myservice);
        destination(df_local_log);
};

Finally, to finish the remote logging bit, you need to configure syslog-ng on the remote host to allow messages on UDP port 5000, and to log them to a local file. Here's my configuration on host "remote_loghost":

i) a new source allowing messages on port 5000:

source s_remote_logging {
    udp(ip(0.0.0.0) port(5000));
};

ii) a destination pointing to a local file:

destination df_common_log { file("/var/log/myservice_common.log"); };

iii) a log combining the source and the destination above; I am using the predefined f_syslog filter here, because I don't need to select messages based on a given facility anymore:

log {
        source(s_remote_logging);
        filter(f_syslog);
        destination(df_common_log);
};




Thursday, June 03, 2010

Setting up PHP5/FastCGI with nginx

PHP is traditionally used with Apache, but can also be fronted by nginx. Here are some notes that I took while setting up PHP5/FastCGI behind nginx. My main source of inspiration was this howtoforge article. My OS flavor is Ubuntu 9.04 32-bit, but the same procedure applies to other flavors.

1) Install PHP5 and other required packages

# apt-get install php5 php5-cli php5-cgi php5-xcache php5-curl php5-sqlite libpcre3 libpcre3-dev libssl-dev

2) Configure xcache

I added the lines in the following gist to the end of /etc/php5/cgi/php.ini: http://gist.github.com/424172

3) Install spawn-fcgi, which used to be included in the lighttpd package, but now can be downloaded on its own from here.

# tar xvfz spawn-fcgi-1.6.3.tar.gz; cd spawn-fcgi-1.6.3; ./configure; make; make install

At this point you should have /usr/local/bin/spawn-fcgi installed. This wrapper needs to be launched at some point via a command line similar to this:

/usr/local/bin/spawn-fcgi -a 127.0.0.1 -p 9000 -u www-data -f /usr/bin/php5-cgi

This launches the php5-cgi process, which listens on port 9000 for PHP requests.

The original howtoforge article recommended writing an init.d wrapper for this command. I used to do this, but I noticed that the php5-cgi process dies quite frequently, so I needed something more robust. Hence...

4) Configure spawn-fcgi to be monitored by supervisor

I installed supervisor via:

# apt-get install python-setuptools
# easy_install supervisor
# echo_supervisord_conf > /etc/supervisord.conf

Then I added this section to /etc/supervisord.conf:

[program:php5-cgi]
command=/usr/local/bin/spawn-fcgi -n -a 127.0.0.1 -p 9000 -u www-data -f /usr/bin/php5-cgi
redirect_stderr=true          ; redirect proc stderr to stdout (default false)
stdout_logfile=/var/log/php5-cgi/php5-cgi.log
stdout_logfile_maxbytes=10MB   ; max # logfile bytes b4 rotation (default 50MB)

Note that I added the -n flag to the spawn-fcgi command line. This keeps the process it spawns (/usr/bin/php5-cgi) in the foreground, so that it can be daemonized and monitored properly by supervisor.

I use this init.d script to start/stop supervisord. Don't forget to 'chkconfig supervisord on' so that it starts at boot time.

Per the configuration above, I also created /var/log/php5-cgi and chown-ed it to www-data:www-data. That directory will contain a file called php5-cgi.log, which captures both the stdout and stderr of the php5-cgi process.

When I started supervisord via 'service supervisord start', I was able to see /usr/bin/php5-cgi running. I killed that process, and yes, supervisor restarted it. May I just say that SUPERVISOR ROCKS!!!

5) Install and configure nginx

I prefer to install nginx from source.

# tar xvfz nginx-0.8.20.tar.gz; cd nginx-0.8.20; ./configure --with-http_stub_status_module; make; make install

To configure nginx to serve php files, add these 'location' sections to the server listening on port 80 in /usr/local/nginx/conf/nginx.conf:

location / {
    root  /var/www/mydocroot;
    index index.php index.html;
}

location ~ \.php$ {
    include /usr/local/nginx/conf/fastcgi_params;
    fastcgi_pass  127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /var/www/mydocroot$fastcgi_script_name;
}

(assuming of course your root document directory is /var/www/mydocroot)

Now nginx will forward all requests for *.php files to the process listening on port 9000 on localhost, which, as we know by now, is php5-cgi.

That's about it. Start up nginx and you should be in business.
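A quick way to verify the whole chain is to drop a test file into the docroot (the file name is my choice, not required) and request it in a browser:

```
<?php
// /var/www/mydocroot/info.php -- should render the PHP configuration page
// if nginx is correctly passing .php requests to php5-cgi
phpinfo();
?>
```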
