Archive for the ‘apache httpd’ Category

Logging website subsections with Apache

Saturday, April 26th, 2008

This is a typical VirtualHost for example.com in Apache HTTP Server:

<VirtualHost xx.xxx.xx.xx:80>
    DocumentRoot /home/example.com/site
    ServerName example.com
    ErrorLog /home/example.com/logs/error.log
    CustomLog /home/example.com/logs/access.log "combined"
</VirtualHost>

This CustomLog directive will produce something like this on the logfile:

64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /foo/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/about.html HTTP/1.1

Suppose that you wanted to log requests for /foo and /bar subsections in 2 separate files. That way you (or your client) may find it easier later on to analyze web traffic for the subsites individually.

What you can do is log conditionally using environment variables that you’ve set using SetEnvIf based on the request.

SetEnvIf Request_URI "/foo.*" subsite_foo
SetEnvIf Request_URI "/bar.*" subsite_bar
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar

This works nice, but not when the urls for your individual subsites look like this:

http://example.com/?site=foo&page=5
http://example.com/?site=bar&page=8

That’s because the Request_URI does not contain the query string (whatever comes after the ? character).

The solution comes with a little bit of RewriteCond magic, which helps us set an environment variable using ${QUERY_STRING}:

RewriteEngine on
RewriteCond %{QUERY_STRING} .*site=foo.*
RewriteRule (.*) $1 [E=subsite_foo:1]
RewriteCond %{QUERY_STRING} .*site=bar.*
RewriteRule (.*) $1 [E=subsite_bar:1]
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar

Happy logging…