Archive for the ‘webapps’ Category

Migrating from tomcat to weblogic

Thursday, March 11th, 2010

Moving from tomcat to weblogic may sound crazy. In case you need to do it though (e.g for business reasons) here are a couple of things which may go wrong.

First of all the classloader hierarchy in weblogic do not do what you usually expect from other servers such as tomcat, resin, jetty and jboss. If your application uses hibernate (and implicitly ANTLR) you may get the following exception:

Caused by: java.lang.Throwable: Substituted for missing class org.hibernate.QueryException - ClassNotFoundException: org.hibernate.hql.ast.HqlToken [from com.example.model.Person order by id]
        at org.hibernate.hql.ast.HqlLexer.panic(HqlLexer.java:80)
        at antlr.CharScanner.setTokenObjectClass(CharScanner.java:340)
        at org.hibernate.hql.ast.HqlLexer.setTokenObjectClass(HqlLexer.java:54)
        at antlr.CharScanner.<init>(CharScanner.java:51)
        at antlr.CharScanner.<init>(CharScanner.java:60)
        at org.hibernate.hql.antlr.HqlBaseLexer.<init>(HqlBaseLexer.java:56)
...

As explained in the Hibernate3 Migration Guide Weblogic doesn’t seem to support proper class loader isolation, will not see the Hibernate classes in the application’s context and will try to use it’s own version of ANTLR.

In the same fashion you may get the following exception for commons lang:

java.lang.NoSuchMethodError: org.apache.commons.lang.exception.ExceptionUtils.getMessage(Ljava/lang/Throwable;)Ljava/lang/String;

because weblogic internally uses commons lang 2.1 and the one you use may have more API methods.

For both these problems the solution is to instruct weblogic to prefer the jars from the WEB-INF of your application. You need to create a weblogic specific file called weblogic.xml and place it under WEB-INF:

<?xml version="1.0" encoding="UTF-8"?>
<weblogic-web-app>
    <container-descriptor>
        <prefer-web-inf-classes>true</prefer-web-inf-classes>
    </container-descriptor>
</weblogic-web-app>

Another problem is that, like in resin, the default servlet is not named “default” so if you depend on it in web.xml, your application may throw the following at the deployment phase:

Caused by: weblogic.management.DeploymentException: [HTTP:101170]The servlet default is referenced in servlet-mapping *.avi, but not defined in web.xml.

This is because the default servlet is called FileServlet in the web.xml, so you’ll need to change all references in your web.xml from “default” to “FileServlet”.

Last, but not least, tomcat will automatically issue a 302 redirect from http://localhost:8080/context to http://localhost:8080/context/ before allowing your application to do any processing. So all instances of request.getServletPath() will never return an empty string, but will always start with “/”. Weblogic doesn’t do this so http://localhost:8080/context resolves and if your code contains something like:

request.getServletPath().substring(1)

you’ll get:

java.lang.StringIndexOutOfBoundsException: String index out of range: -1

so a safer way to trim this leading slash is by doing:

request.getServletPath().replaceFirst("^/", "")

Good luck, and remember. Every time you use a full blown application server for something that a simple web container would be enough, god kills a kitten.

Retaining browser scrollTop between page refreshes

Tuesday, August 18th, 2009

Sometimes when you develop web applications with CRUD pages and other backend functionalities you need retain the vertical scrollbar state between page loads. This gives the user a smoother experience, especially when your application doesn’t do any AJAX but it’s based of good old full HTML page request-responses.

We’ll be using jQuery and the cookie plugin:

$(function(){
  // if current viewport height is at least 70% of the previous
  if ($(document.body).height()>$.cookie('h2')*0.7)
    // retain previous scroll
    $('html,body').scrollTop($.cookie('h1'));
});
$(window).unload(function(){
  // store current scroll
  $.cookie('h1', $(window).scrollTop());
  // store current viewport height
  $.cookie('h2', $(document.body).height());
});

We can completely skip the h2 cookie and the if statement but this is an automatic way to prevent scrolling to the very bottom of a short page when coming from a long page. Such use case is common when jumping from the “long list of items” to the “edit an item” page.

Good luck

mod_expires and Cache Killers

Sunday, May 3rd, 2009

Rule 3 of Steve Souders’ YSlow suggests that websites should Add a far future Expires header to the components. Components with a cache header could be static files such as those with extensions .css, .js, .jpg, .png, .gif etc. This gives a huge boost in client side performance of users with a primed cache. In apache this is done via mod_expires and an example configuration would be:

ExpiresActive On
ExpiresByType image/x-icon "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"

All this works well until you need to update a cached static file. The users with the primed cache will either have to wait 1 month to get the new file, or explicitly invalidate their cache. Some people will even ask their users to do a hard refresh but this obviously does not scale and it’s not very robust.

Since you cannot send an automatic signal to the browsers to reload those files all you can do is change the URL of those files (explicit invalidation). You could simply rename all those files, but an easier way to achieve the same effect is by adding a fake (unused – dummy) parameter at the end of the resource URL:

<img src="logo.jpg?2" />

The next logical step would be to automate this into the build system and have every production release feature new cache killer tokens. It seems that many well known sites do that already:

http://slashdot.org/

href="//s.fsdn.com/sd/idlecore-tidied.css?T_2_5_0_254a"
src="//s.fsdn.com/sd/all-minified.js?T_2_5_0_254a"

http://stackoverflow.com/

href="/content/all.css?v=3184"
src="/content/js/master.js?v=3141"

http://digg.com/

@import "/css/189/global.css";
src="http://media.digg.com/js/loader/187/dialog|digg|shouts"

http://www.bbc.co.uk/

@import 'http://wwwimg.bbc.co.uk/home/release-29-7/style/homepage.min.css';
src="http://wwwimg.bbc.co.uk/home/release-29-7/script/glow.homepage.compressed.js"

http://www.guardian.co.uk/

href="http://static.guim.co.uk/static/73484/common/styles/wide/ie.css"
src="http://static.guim.co.uk/static/73484/common/scripts/gu.js"

What happens with images referenced from within css files? You could rewrite the css files automatically as part of your production build process with Ant.

<tstamp>
    <format property="cacheKill" pattern="yyyyMMddhhmm" locale="en,UK"/>
</tstamp>

<target name="rewrite-css">
    <replace dir="${build.web.dir}" value="css?${cacheKill}&quot;)"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>css&quot;)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="png?${cacheKill}&quot;)"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>png&quot;)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="gif?${cacheKill}&quot;)"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>gif&quot;)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="jpg?${cacheKill}&quot;)"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>jpg&quot;)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="css?${cacheKill}')"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>css')</replacetoken></replace>
    <replace dir="${build.web.dir}" value="png?${cacheKill}')"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>png')</replacetoken></replace>
    <replace dir="${build.web.dir}" value="gif?${cacheKill}')"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>gif')</replacetoken></replace>
    <replace dir="${build.web.dir}" value="jpg?${cacheKill}')"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>jpg')</replacetoken></replace>
    <replace dir="${build.web.dir}" value="css?${cacheKill})"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>css)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="png?${cacheKill})"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>png)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="gif?${cacheKill})"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>gif)</replacetoken></replace>
    <replace dir="${build.web.dir}" value="jpg?${cacheKill})"><include name="css/**/*.css"/><include name="scripts/**/*.css"/><replacetoken>jpg)</replacetoken></replace>
</target>

This will take care of the following background image reference styles for css, png, gif and jpg files:

... background-image: url("images/ed-bg.gif");
... background-image: url('images/ed-bg.gif');
... background-image: url(images/ed-bg.gif);

and convert them to:

... background-image: url("images/ed-bg.gif?200905031126");
... background-image: url('images/ed-bg.gif?200905031126');
... background-image: url(images/ed-bg.gif?200905031126);

Good luck!

A better SMTPAppender

Saturday, May 2nd, 2009

SMTPAppender for log4j is a type of appender which sends emails via an SMTP server. It’s very useful for applications released in production where you’d definitely need to know of all application errors logged. Of course every caring developer should look at the server logs every now and then, but if you’ve got hundreds of them (applications) then it becomes a full time job in itself.

Sometimes a fresh release of a high traffic website may produce hundreds or thousands of ERROR level log events. Many times this may be something minor which is being logged deep inside your code. Until the bug is fixed and a new release is deployed, your inbox and the mail server may suffer heavily.

What follows is an extension of SMTPAppender which limits the amount of emails sent in a specified period of time. It features sensible defaults which of course can be configured externally via the log4j configuration file.

package com.cherouvim;

import org.apache.log4j.Logger;
import org.apache.log4j.net.SMTPAppender;

public class LimitedSMTPAppender extends SMTPAppender {

    private int limit = 10;           // max at 10 mails ...
    private int cycleSeconds = 3600;  // ... per hour

    public void setLimit(int limit) {
        this.limit = limit;
    }

    public void setCycleSeconds(int cycleSeconds) {
        this.cycleSeconds = cycleSeconds;
    }

    private int lastVisited;
    private long lastCycle;

    protected boolean checkEntryConditions() {
        final long now = System.currentTimeMillis();
        final long thisCycle =  now - (now % (1000L*cycleSeconds));
        if (lastCycle!=thisCycle) {
            lastCycle = thisCycle;
            lastVisited = 0;
        }
        lastVisited++;
        return super.checkEntryConditions() && lastVisited<=limit;
    }

}

The configuration would look something like this:

log4j.appender.???=com.cherouvim.LimitedSMTPAppender
log4j.appender.???.limit=3
log4j.appender.???.cycleSeconds=60
log4j.appender.???.BufferSize=25
log4j.appender.???.SMTPHost=${mail.smtp.host}
log4j.appender.???.From=${mail-sender}
log4j.appender.???.To=${sysadmin.email}
log4j.appender.???.Subject=An error occured
log4j.appender.???.layout=org.apache.log4j.PatternLayout
log4j.appender.???.layout.ConversionPattern=%d{ISO8601} %-5p (%F:%L) - %m%n
log4j.appender.???.threshold=ERROR

The above configuration will limit the mail dispatch to only 3 emails per minute. Any further errors in that minute will not be emailed. The limit and cycleSeconds setting lines can be omitted and the defaults will be applied.

Happy logging!

My favourite string for testing web applications

Monday, February 16th, 2009

Weird title, huh?

When creating templates, pages and action responses for your web application you really need to take HTML escaping into consideration. Sometimes the use cases of the system are so many that you may omit HTML escaping for some piece of dynamic or user entered text.

HTML escaping means converting <foo>bar & " to &lt;foo&gt;bar &amp; &quot; in the HTML source.

One of the reasons for HTML escaping is to avoid XSS attacks or simply to make your site valid.

Reasons for making your HTML output valid include:

  1. It’s the “right thing to do”
  2. Does not tire the browser
  3. Allows you to manually detect (via CTRL+SHIFT+A on web developer toolbar) for real HTML output errors
  4. Saves you from css rendering issues due to HTML anomalies
  5. Ensures your content is easily parsable from third party agents (crawlers, scrappers, XSLT transformators etc)

So my favourite string is <script’”<i>
You can alter your development database content using statements like these:

update articles set title=concat("<script'\"<i>", title);
update users set firstname=concat("<script'\"<i>", lastname), lastname=concat("<script'\"<i>", lastname);
update categories set title=concat("<script'\"<i>", title);
...

If after this database content change your site is not functional then there is a problem. You can also check for HTML validity with CTRL+SHIFT+A on web developer toolbar and quickly spot areas where you missed HTML escaping.

You could even automate this whole process by having a tool (JTidy?) scan that all your pages and use cases produce valid HTML. So indirectly you would be testing for insecure (in XSS terms) parts of the application.

HTML escaping in JSTL
HTML escaping in freemarker
HTML escaping in velocity

Duplicate content test and URL canonicalization

Sunday, February 15th, 2009

Days ago I uploaded the following script on my server:

<?php

  if ($_SERVER["QUERY_STRING"]=='foo&bar') {
    echo "index test one";
  }

  if ($_SERVER["QUERY_STRING"]=='bar&foo') {
    echo "bar and foo";
  }

  if ($_SERVER["QUERY_STRING"]=='bar&foo&test') {
    echo "bar and foo";
  }

?>

I then published 3 links to my site’s index so Google could follow them:

http://cherouvim.com/foo.php?foo&bar

http://cherouvim.com/foo.php?bar&foo

http://cherouvim.com/foo.php?bar&foo&test

Days later I got this result for the Google query site:cherouvim.com/foo:

The first and third result are the same (duplicate content). Google has indexed them both though. This is a common SEO problem in dynamic web sites where there can be many different URLs linking to the same page (paginators, out of date URLs, archive pages etc) or where you want to do URL Referrer Tracking.

Google has recently published a way of overcoming this problem. You can now specify which is the real (or primary) URL for the page. E.g:

<link rel="canonical" href="/foo.php?foo&bar" />

So, as SEOmoz said, this definitely is The Most Important Advancement in SEO Practices Since Sitemaps.

Re: The 140 character webapp challenge!

Saturday, February 14th, 2009

This is my response to The 140 character webapp challenge!

Epilepsy

javascript:{r=0;setInterval(function(){document.body.style.background=(r++%2==0?'#'+r%7+r%9+r%8:'0')},50);void(0)}

114 bytes of inline javascript that you can paste to your browser’s URL.

It’s a very useful webapp* in case you want to cause epilepsy to yourself.

Via: http://f055.net/article/the-140-character-webapp-challenge/
Via: http://synodinos.net/2009/02/14/re-the-140-character-webapp-challenge/

* just kidding :)

The * stupidest things I’ve done in my programming job

Saturday, February 7th, 2009

I’m not ashamed of those sins any more, so here you go :)

1. ORM

Stupidity
Building my own Object Relational Mapping framework.
Consequence
Project is a mess after 2 years of maintenance with hardcore hacks to bypass my own ORM and call custom SQL queries.
What should I have done
Use hibernate, iBATIS, Cayenne or something similar.

2. EAV

Stupidity
Using an Entity-Attribute-Value model database schema design.
Consequence
Non scalable solution and total impossibility to run any useful queries on the database level.
What should I have done
Use an ordinary normalized database schema design.

3. Database Access

Stupidity
Synchronize (serialize) database access using one shared connection.
Consequence
Zero scalability. Very slow response times when more than 10 users where using the application.
What should I have done
Don’t do that and use a connection pool such as c3p0 and use a “new” (reused) connection returned from the pool for every request/response cycle.

4. IDE

Stupidity
Avoided learning and using an Integrated development environment.
Consequence
Inability to build test and deploy the application quickly and generally do anything useful.
What should I have done
Get familiar with an IDE. NetBeans, eclipse etc.

5. Transactions

Stupidity
Not using them.
Consequence
Corrupt data in an application involving e-shop like functionality.
What should I have done
Use database transactions. When in MySQL use InnoDB.

6. Prepared Statements

Stupidity
Using Statements, string concatenation and naive character escaping to assemble my own “safe” queries.
Consequence
SQL Injections possible in my application. I managed to login using ‘or 1=1;delete from users;– and alter the database state in a very nasty way.
What should I have done
Use Prepared Statements which correctly assemble and escape the query properly depending on the JDBC driver used.

7. Business Logic

Stupidity
Doing it in the template (JSP).
Consequence
Messy non maintainable application.
What should I have done
Do it in an MVC style with servlets or with a Front Controller. Even better by using an existing open source MVC framework such as Struts, Spring MVC etc.

Of course, all the bad choices above have probably made me a better programmer.

FreeMarker exception handling

Wednesday, June 20th, 2007

FreeMarker is a very flexible templating engine for Java. Exception handling (while rendering the template) is a very important issue for a templating engine. As with JSP the default behaviour of FreeMarker is to completely cancel rendering and display an error page. When developing a webapp this might not be very helpful. Sometimes errors might need to be tolerated; at least for the development phase of the application development.

FreeMarker provides the TemplateExceptionHandler interface with some implementations but we’ll define our own in order to provide a more failsafe and usable behaviour.

public class MyTemplateExceptionHandler
          implements TemplateExceptionHandler {

  public void handleTemplateException(TemplateException te,
          Environment env, Writer out) {
    freemarkerlog.error("template error", te);
    try {
      out.write("<span style=\"cursor:help; color: red\" " +
                "title=\"" + ExceptionUtils.getMessage(te) + "\">" +
                "[e]" +
                "</span>\n");
    } catch (IOException ignored) { }
  }

}

Then, in the code where you configure FreeMarker you need:

config.setTemplateExceptionHandler(new MyTemplateExceptionHandler());

This is what you’ll see whenever there is an exception thrown while rendering the template:
FreeMarker exception handling
A nice little [e] with a tooltip containing the exception message.

Runtime dispatching freemarker macros for pojo views

Sunday, June 10th, 2007

One of the (many) reasons I switched from JSP to FreeMarker is that I couldn’t achieve what I describe in this post. Tutorials or blog posts regarding this situation were never to be found, and in addition it was really hard to find anyone considering this issue a real problem.

The problem

Suppose we are building an issue tracking system. We have a rich Domain Model which includes entities such as User, Project, Account, Role etc. We’ve also got an abstract Issue object which is the root of the issue’s hierarchy. Concrete classes extending Issue include Bug, Feature, Request and Change. These 4 POJOs inherit common fields from Issue but add fields, methods and logic of their own.

Each of the issue’s subclass will need to have a slightly different HTML view. I tend to use the composite design pattern for my views, so I can break the HTML down to small reusable components. So it is obvious that we’re going to need 4 different views, one for each of them. Here are those issue rendering methods (presented in an imaginary pseudolanguage which combines EL, HTML and functions):

renderBug(bug) {
  <fieldset>
    <legend>Bug #${bug.id}</legend>
    <p>Author: ${bug.author}</p>
    <p>Date: ${bug.date}</p>
    <p>Description: ${bug.description}</p>
    <p>Steps to recreate bug: ${bug.stepsToRecreate}</p>
  </fieldset>
}

renderFeature(feature) {
  <fieldset>
    <legend>feature #${feature.id}</legend>
    <p>Author: ${feature.author}</p>
    <p>Date: ${feature.date}</p>
    <p>Description: ${feature.description}</p>
    <p>Related URL: ${feature.url}</p>
    <p>Screenshot upload: <img src="${feature.screenshot}" /></p>
  </fieldset>
}

...

Our DAO (probably called IssueDao) is going to fetch a Collection<Issue> (a bunch of issues) from the database for a particular use case. The runtime type of each of those entities cannot be Issue but it will be Bug, Feature, Request or Change. The problem is that we are presenting them altogether in the same screen, so in order to render them we have to write code such as this:

foreach(issues as issue) {
  if (issue instanceof Bug) renderBug(issue); continue;
  if (issue instanceof Feature) renderFeature(issue); continue;
  if (issue instanceof Request) renderRequest(issue); continue;
  if (issue instanceof Change) renderChange(issue);
}

If this doesn’t seem very bad to you, here is an actual view implementation of a slightly bigger hierarchy using JSP 2.0 Tag Files:

if (t instanceof ActivityInternal) {%><p:activityInternalView pojo="${t}" /><%;}
if (t instanceof ActivityExternal) {%><p:activityExternalView pojo="${t}" /><%;}
if (t instanceof ActivityMilestone) {%><p:activityMilestoneView pojo="${t}" /><%;}
if (t instanceof Review) {%><p:reviewView pojo="${t}" /><%;}
if (t instanceof PublicationReport) {%><p:publicationReportView pojo="${t}" /><%;}
if (t instanceof PublicationWebsite) {%><p:publicationWebsiteView pojo="${t}" /><%;}
if (t instanceof InfoConference) {%><p:infoConferenceView pojo="${t}" /><%;}
if (t instanceof InfoBase) {%><p:infoBaseView pojo="${t}" /><%;}
if (t instanceof InfoChannel) {%><p:infoChannelView pojo="${t}" /><%;}
if (t instanceof Meeting) {%><p:meetingView pojo="${t}" /><%;}
if (t instanceof Interpretation) {%><p:interpretationView pojo="${t}" /><%;}
if (t instanceof BudgetItem) {%><p:budgetItemView pojo="${t}" /><%;}
if (t instanceof FocusGeneral) {%><p:focusGeneralView pojo="${t}" /><%;}
if (t instanceof FocusResearch) {%><p:focusResearchView pojo="${t}" /><%;}
if (t instanceof Risk) {%><p:riskView pojo="${t}" /><%;}
if (t instanceof QAChecklist) {%><p:QAChecklistView pojo="${t}" /><%;}
if (t instanceof TargetAudience) {%><p:targetAudienceView pojo="${t}" /><%;}
if (t instanceof LessonsLearned) {%><p:lessonsLearnedView pojo="${t}" /><%;}

If you still don’t think this is bad, you can stop reading ;)

What not to do

In a project I did in my early JSP days, what I did was to put all the view logic in the Java class! So it looked like this (this is actual Java):

public class Bug extends Issue {

  ...

  public String renderMe() {
    return "<fieldset><legend>" + this.getName() + "</legend>" +
           "<p>Author: " + this.getAuthor() + "</p>" +
           "<p>Date: " + this.getDate() + "</p>" +
           "<p>Description: " + this.getDescription() + "</p>" +
           "</fieldset>";
  }
}

Although this type of code is a perfect candidate for The Daily WTF, the (only) advantage was that I could now render my pojos using (pseudocode):

foreach(issues as issue) {
  issue.renderMe();
}

The solution

It seems that all we want is the ability to construct and dynamically (reflectively in Java terms) call the appropriate render tag each time. In freemarker we define macros which look like this:

<#macro renderBug bug>
  <fieldset>
    <legend>Bug #${bug.id}</legend>
    <p>Author: ${bug.author}</p>
    <p>Date: ${bug.date}</p>
    <p>Description: ${bug.description}</p>
    <p>Steps to recreate bug: ${bug.stepsToRecreate}</p>
  </fieldset>
</#macro>

We need a way to call renderXXX where XXX is the short class name of the issue in question. And here is how you can do this in freemarker:

<#local macroname='render' + issue.class.name?split(".")?last />
<@.vars[macroname] issue />

For an issue of runtime type com.example.Foo, it concatenates the word “render” with “Foo” and calls the macro with that name. The magic happens with the help of the .vars special variable. It allows us to access variables by name. The full code now becomes:

<#macro renderIssue issue>
  <#local macroname='render' + issue.class.name?split(".")?last />
  <@.vars[macroname] issue />
</#macro>

<#list issues as issue>
  <@renderIssue issue />
</#list>

By the way, this capability is usually present in dynamic scripting languages. So for example there are many ways to do that in PHP.

using dynamic evaluation
$functionName = "renderBug";
$functionName($issue);
using eval
eval("renderBug($issue);");
using call_user_func (probably safest of all)
call_user_func("renderBug", $issue);