Archive for the ‘deployment’ Category

Logging website subsections with Apache

Saturday, April 26th, 2008

This is a typical VirtualHost for example.com in Apache HTTP Server:

<VirtualHost xx.xxx.xx.xx:80>
    DocumentRoot /home/example.com/site
    ServerName example.com
    ErrorLog /home/example.com/logs/error.log
    CustomLog /home/example.com/logs/access.log "combined"
</VirtualHost>

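This CustomLog directive produces log entries like the following (abbreviated here; the full combined format also records the status code, the response size, the referrer and the user agent):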

64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /foo/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/index.html HTTP/1.1
64.229.111.27 - - [08/Aug/2007:16:12:54 +0300] GET /bar/about.html HTTP/1.1

Suppose that you want to log requests for the /foo and /bar subsections in two separate files. That way you (or your client) may find it easier later on to analyze web traffic for each subsite individually.

What you can do is log conditionally using environment variables that you’ve set using SetEnvIf based on the request.

# Anchor the patterns so that only paths under /foo or /bar match
SetEnvIf Request_URI "^/foo(/|$)" subsite_foo
SetEnvIf Request_URI "^/bar(/|$)" subsite_bar
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar
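
One thing to watch with SetEnvIf is that the pattern is a regular expression that may match anywhere in the URI, so a loose pattern such as /foo.* also fires for paths like /foobar/x.html or /img/foo.png. The Python sketch below only imitates that matching to show what anchoring changes; it is not Apache's code, and the sample paths and the helper name are made up for illustration.

```python
import re

# Loose, unanchored patterns: they match wherever "/foo" or "/bar" appears.
LOOSE = {
    "subsite_foo": re.compile(r"/foo.*"),
    "subsite_bar": re.compile(r"/bar.*"),
}
# Anchored patterns: the path must start with the subsection itself.
STRICT = {
    "subsite_foo": re.compile(r"^/foo(/|$)"),
    "subsite_bar": re.compile(r"^/bar(/|$)"),
}

def matched_vars(path, patterns):
    """Return the variable names whose pattern matches the request path."""
    return sorted(name for name, rx in patterns.items() if rx.search(path))

for path in ["/foo/index.html", "/bar/about.html", "/foobar/x.html", "/img/foo.png"]:
    print(path, matched_vars(path, LOOSE), matched_vars(path, STRICT))
```

Requests for which no variable is set simply do not appear in either subsite log; they still end up in the main access log.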

This works nicely, but not when the URLs for your individual subsites look like this:

http://example.com/?site=foo&page=5
http://example.com/?site=bar&page=8

That’s because the Request_URI does not contain the query string (whatever comes after the ? character).
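
To see the split concretely, here is a small Python sketch using the standard urllib.parse module. It only illustrates which part of the URL ends up where; it says nothing about how Apache parses the request internally, and the helper name is made up.

```python
from urllib.parse import urlsplit

def path_and_query(url):
    """Split a URL into its path and its query string."""
    parts = urlsplit(url)
    return parts.path, parts.query

for url in ["http://example.com/?site=foo&page=5",
            "http://example.com/?site=bar&page=8"]:
    path, query = path_and_query(url)
    # The path is "/" for both URLs; only the query string differs.
    print(f"path={path!r} query={query!r}")
```

Both requests have the same path, so a condition on the path alone cannot tell the two subsites apart.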

The solution comes with a little bit of RewriteCond magic, which lets us set an environment variable based on %{QUERY_STRING}:

RewriteEngine on
# Match site=foo only as a whole parameter; "-" leaves the URL untouched
RewriteCond %{QUERY_STRING} (^|&)site=foo(&|$)
RewriteRule ^ - [E=subsite_foo:1]
RewriteCond %{QUERY_STRING} (^|&)site=bar(&|$)
RewriteRule ^ - [E=subsite_bar:1]
CustomLog /home/example.com/logs/access.foo.log combined env=subsite_foo
CustomLog /home/example.com/logs/access.bar.log combined env=subsite_bar
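
A condition that merely looks for site=foo somewhere in the query string would also match mysite=foo or site=foobar. The Python sketch below compares such a loose test with a pattern that requires site=foo to be a whole parameter. It imitates the regex test only, and the sample query strings and the helper name are made up for illustration.

```python
import re

# Loose: "site=foo" may appear anywhere, even inside another name or value.
LOOSE = re.compile(r".*site=foo.*")
# Anchored: "site=foo" must be a complete parameter.
ANCHORED = re.compile(r"(^|&)site=foo(&|$)")

def matches(rx, query_string):
    """Mimic a regex condition: an unanchored search over the query string."""
    return rx.search(query_string) is not None

for qs in ["site=foo&page=5", "page=5&site=foo", "mysite=foo", "site=foobar"]:
    print(qs, matches(LOOSE, qs), matches(ANCHORED, qs))
```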

Happy logging…

Default Servlet and Resin

Thursday, March 8th, 2007

Suppose you use a servlet as a front controller to catch and process all URLs in a web app. If you want clean URLs, you may have mapped it using:

<servlet-mapping>
  <servlet-name>FrontController</servlet-name>
  <url-pattern>/</url-pattern>
</servlet-mapping>

Your front controller will now attempt to serve all URLs, static files included, and this is something you don’t want. Static content (png, html, ico, css…) is normally served by the container’s default servlet. In Tomcat that is org.apache.catalina.servlets.DefaultServlet, which has been configured for you in conf/web.xml with the name “default”.

So, in order to exclude all static content from the catch-all of your front controller, you have to map static content to the default servlet, before the mapping of the front controller:

<servlet-mapping>
  <servlet-name>default</servlet-name>
  <url-pattern>*.css</url-pattern>
</servlet-mapping>
<servlet-mapping>
  <servlet-name>default</servlet-name>
  <url-pattern>*.js</url-pattern>
</servlet-mapping>
<servlet-mapping>
  <servlet-name>default</servlet-name>
  <url-pattern>*.png</url-pattern>
</servlet-mapping>
<servlet-mapping>
  <servlet-name>default</servlet-name>
  <url-pattern>*.jpg</url-pattern>
</servlet-mapping>
...
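
Why do the extension mappings take static files away from the front controller? As I understand the Servlet specification, a container tries an exact match first, then the longest path-prefix match (such as /admin/*), then an extension match (such as *.css), and only then falls back to the servlet mapped to /. The Python sketch below is a simplified model of that lookup, not any container's actual code; the function name, the "Admin" servlet and the sample paths are made up.

```python
def resolve(path, mappings):
    """Pick a servlet name for path from a {url_pattern: servlet_name} dict."""
    # 1. Exact match ("/" is the default mapping, "*" patterns are not exact).
    for pattern, servlet in mappings.items():
        if pattern != "/" and "*" not in pattern and pattern == path:
            return servlet
    # 2. Longest path-prefix match, e.g. "/admin/*".
    prefixes = [p for p in mappings
                if p.endswith("/*") and (path == p[:-2] or path.startswith(p[:-1]))]
    if prefixes:
        return mappings[max(prefixes, key=len)]
    # 3. Extension match, e.g. "*.css", against the last path segment.
    last_segment = path.rsplit("/", 1)[-1]
    if "." in last_segment:
        ext_pattern = "*." + last_segment.rsplit(".", 1)[-1]
        if ext_pattern in mappings:
            return mappings[ext_pattern]
    # 4. Fall back to the default mapping.
    return mappings.get("/")

MAPPINGS = {
    "/": "FrontController",
    "*.css": "default",
    "*.js": "default",
    "*.png": "default",
    "*.jpg": "default",
}

for path in ["/styles/site.css", "/img/logo.png", "/products/42", "/"]:
    print(path, "->", resolve(path, MAPPINGS))
```

In this model the position of the entries in web.xml plays no role, so listing the static mappings before the front controller's mapping does no harm.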

That works nicely when deploying in Tomcat, Jetty and JBoss Application Server.

On Resin, deployment fails with the following message:

WEB-INF/web.xml:89: `default' is an unknown servlet-name. servlet-mapping requires that the named servlet be defined in a <servlet> configuration before the <servlet-mapping>.

Resin’s static content servlet is com.caucho.servlets.FileServlet, which up to 3.0 was registered under the name “file”. In 3.1, after some people complained that they couldn’t have a servlet of their own called “file”, the name was changed to “resin-file”.

So there are two ways to make your application work on Resin. You can either change all references from “default” to “resin-file” in your web.xml, or change the FileServlet’s name from “resin-file” to “default” in Resin’s conf/app-default.xml.
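
For the first option, the rename can be scripted. Below is a minimal Python sketch, assuming the references appear as plain <servlet-name>default</servlet-name> elements; the function name and the sample XML are made up for illustration, and the result should be reviewed before it is deployed.

```python
import re

def use_resin_file(web_xml_text, old="default", new="resin-file"):
    """Rewrite <servlet-name>old</servlet-name> elements to use the new name."""
    pattern = re.compile(
        r"(<servlet-name>\s*)" + re.escape(old) + r"(\s*</servlet-name>)"
    )
    # Keep the surrounding tags and whitespace, swap only the name itself.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), web_xml_text)

sample = """<servlet-mapping>
  <servlet-name>default</servlet-name>
  <url-pattern>*.css</url-pattern>
</servlet-mapping>"""

print(use_resin_file(sample))
```

Other servlet names, such as FrontController, are left untouched because only an exact match on the old name is replaced.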

Happy deployments.