Logging website subsections with Apache
Saturday, April 26th, 2008
This is a typical VirtualHost for example.com in Apache HTTP Server:
<VirtualHost xx.xxx.xx.xx:80>
DocumentRoot /home/example.com/site
ServerName example.com
ErrorLog /home/example.com/logs/error.log
CustomLog /home/example.com/logs/access.log "combined"
</VirtualHost>
This CustomLog directive will produce entries like these in the log file:
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /foo/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/about.html HTTP/1.1
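For reference, "combined" is a nickname for a format string, not a built-in keyword. The stock httpd.conf usually defines it roughly as shown below; treat this as an assumption about your installation and check your own configuration for the actual definition.

```apache
# Typical definition of the "combined" nickname in a default httpd.conf
# (assumed here; your distribution's configuration may differ)
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
```

With that definition, a real entry also carries the quoted request line, the status code, the response size, the referer and the user agent; the sample above shows only the leading fields.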
Suppose that you want to log requests for the /foo and /bar subsections in two separate files. That way you (or your client) may find it easier later on to analyze web traffic for each subsite individually.
What you can do is log conditionally: set an environment variable with SetEnvIf based on the request, then tell each CustomLog to write only when its variable is set.
SetEnvIf Request_URI "/foo.*" subsite_foo
SetEnvIf Request_URI "/bar.*" subsite_bar
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar
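One caveat with the patterns above: SetEnvIf regular expressions are not anchored, so "/foo.*" also matches URIs such as /foobar or /bar/foo.html. A stricter variant, sketched here on the assumption that each subsite lives directly under the document root, anchors the match to the start of the path:

```apache
# Match /foo and anything below /foo/, but not /foobar or /bar/foo
SetEnvIf Request_URI "^/foo(/|$)" subsite_foo
# Same idea for the /bar subsite
SetEnvIf Request_URI "^/bar(/|$)" subsite_bar
```

The CustomLog lines stay exactly as they are, since they only look at the variable names.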
This works nicely, but not when the URLs for your individual subsites look like this:
http://example.com/?site=foo&page=5
http://example.com/?site=bar&page=8
That’s because the Request_URI does not contain the query string (whatever comes after the ? character).
The solution comes with a little bit of RewriteCond magic, which helps us set an environment variable based on %{QUERY_STRING}:
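RewriteEngine on
# Match site=foo as a whole query parameter, not as part of another name or value
RewriteCond %{QUERY_STRING} (^|&)site=foo(&|$)
# "-" leaves the URL untouched; the rule only sets the environment variable
RewriteRule ^ - [E=subsite_foo:1]
RewriteCond %{QUERY_STRING} (^|&)site=bar(&|$)
RewriteRule ^ - [E=subsite_bar:1]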
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar
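If you are on Apache 2.4 or later, mod_setenvif also provides SetEnvIfExpr, which can test the query string directly without mod_rewrite. This is only a sketch under that version assumption; the directive is not available in the older releases that were current when this post was written.

```apache
# Requires Apache 2.4+ (assumption): set the variable when site=foo
# appears as a complete query parameter
SetEnvIfExpr "%{QUERY_STRING} =~ /(^|&)site=foo(&|$)/" subsite_foo
SetEnvIfExpr "%{QUERY_STRING} =~ /(^|&)site=bar(&|$)/" subsite_bar
# The conditional logging itself is unchanged
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar
```

Either approach ends up with the same environment variables, so pick whichever modules you already have loaded.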
Happy logging…