Thursday, August 5, 2010

mod_proxy or mod_jk

There are several ways to run Tomcat applications. You can either run Tomcat directly on port 80, or you can put a web server in front of Tomcat and proxy connections to it. I would highly recommend using Apache as a front end. The main reason for this suggestion is that Apache is more flexible than Tomcat. Apache has many modules for features you would otherwise have to code yourself in Tomcat. For example, while Tomcat can do gzip compression, it's a single switch: enabled or disabled. Sadly, you cannot serve compressed CSS or JavaScript to Internet Explorer 6, so those responses need to be left uncompressed for that browser. This is easy to support in Apache, but impossible to do in Tomcat. Things like caching are also easier to do in Apache.
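
As a rough illustration of the compression point, here is a minimal sketch of how the Internet Explorer 6 case might be handled in httpd.conf. It assumes mod_deflate, mod_setenvif and mod_headers are loaded; the list of MIME types and the User-Agent pattern are my own choices for the example, so adjust them to your site.

```apache
# Sketch only: assumes mod_deflate, mod_setenvif and mod_headers are loaded.
# Compress the common text types (this list is an assumption, not exhaustive).
AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/javascript application/x-javascript

# For Internet Explorer 6, restrict compression to text/html,
# so CSS and JavaScript are sent to it uncompressed.
BrowserMatch "MSIE 6\." gzip-only-text/html

# Tell downstream caches that the response depends on the browser.
Header append Vary User-Agent
```

The gzip-only-text/html environment variable is what mod_deflate checks, so the per-browser decision stays in configuration rather than in application code.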

Having decided to use Apache to front Tomcat, you need to decide how to connect them. There are several choices: mod_proxy (more accurately, mod_proxy_http in Apache 2.2, but I'll refer to this as mod_proxy), mod_jk and mod_jk2. mod_jk2 is not under active development and should not be used. This leaves us with mod_proxy or mod_jk.

Both methods forward requests from Apache to Tomcat. mod_proxy uses the HTTP that we all know and love. mod_jk uses a binary protocol, AJP. The main advantages of mod_jk are:

  • AJP is a binary protocol, so is slightly quicker for both ends to deal with and uses slightly less overhead compared to HTTP, but this is minimal.
  • AJP carries information such as the original host name, the remote host and the SSL connection details. This means that ServletRequest.isSecure() works as expected, that you know who is connecting to you, and that you can do some sort of virtual hosting in your code.

A slight disadvantage is that AJP is based on fixed-size chunks, and can break with long headers, particularly request URLs with a long list of parameters. That said, you should rarely be in a position of having 8K of URL parameters. (It would suggest you were doing it wrong. :) )

It used to be the case that mod_jk provided basic load balancing between two tomcats, which mod_proxy couldn't do, but with the new mod_proxy_balancer in Apache 2.2, this is no longer a reason to choose between them.

The position is slightly complicated by the existence of mod_proxy_ajp. Of mod_jk and mod_proxy_ajp, mod_jk is the more mature, but mod_proxy_ajp works in the same framework as the other mod_proxy modules. I have not yet used mod_proxy_ajp, but would consider doing so in the future, as mod_proxy_ajp is part of Apache and mod_jk involves additional configuration outside of Apache.

Given a choice, I would prefer an AJP-based connector, mostly because of the second advantage listed above rather than the performance aspect. Of course, if your application vendor doesn't support anything other than mod_proxy_http, that does tie your hands somewhat.

You could use an alternative web server like lighttpd, which does have an AJP module. Sadly, my preferred lightweight HTTP server, nginx, does not support AJP and is unlikely ever to do so, due to the design of its proxying system.

########################################

mod_jk Configuration

########################################

Here's an example of the extra configuration needed in the Apache httpd configuration file (/usr/local/apache2/conf/httpd.conf):

LoadModule jk_module modules/mod_jk.so
JkWorkersFile /usr/local/apache2/conf/jkworkers.properties
JkMount /latmjdemo* catkin


And here's the jkworkers.properties file:

worker.list=catkin
worker.oak.port=8009
worker.oak.host=192.168.200.1
worker.oak.lbfactor=5
worker.elm.port=8009
worker.elm.host=192.168.200.158
worker.elm.lbfactor=15
worker.catkin.type=lb
worker.catkin.balanced_workers=oak,elm
worker.catkin.sticky_session=1


Traffic is forwarded to a Tomcat server called "Oak" on 192.168.200.1, or a Tomcat server called "Elm" on 192.168.200.158, with the latter getting 3 forwards for every one passed to Oak (the lbfactor values of 15 and 5 give that 3 : 1 ratio).

The "sticky_session" setting is worth a comment. Rather than forwarding each request to whichever server the balancer picks, httpd will forward users who already have sessions established to the same system right through their session. That way, a multiple page process (such as an online ordering system) can easily be implemented without the need for a lot of extra code to share work-in-progress data between the various Tomcat servers.

In order for sticky sessions to work, you need to set the jvmRoute attribute on the Engine element in each Tomcat's server.xml to match that server's worker name (oak or elm in this example).

########################################

########################################

mod_proxy Configuration

########################################

Proxy forwarding to a Java Server

Here's an example of a proxied request from Apache httpd on to a server (probably Apache Tomcat) that's running the AJP protocol on port 5090:

ProxyPass /harry ajp://192.168.200.215:5090/latmjdemo
ProxyPassReverse /harry ajp://192.168.200.215:5090/latmjdemo


That's code to be added to the end of your httpd.conf file!
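
For the ajp:// form to work, the proxy modules have to be loaded too. Here is a minimal sketch, assuming the same modules/ directory layout as the mod_jk example earlier. The HTTP variant is left commented out: port 8080 is an assumption on my part (it is Tomcat's usual HTTP connector port), not something taken from the example above.

```apache
# Modules needed for ProxyPass with an ajp:// target
# (paths are relative to ServerRoot and assumed to match your install).
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

# HTTP alternative for when AJP is not an option; needs mod_proxy_http.
# Port 8080 is an assumption - check the HTTP connector in your Tomcat.
LoadModule proxy_http_module modules/mod_proxy_http.so
# ProxyPass /harry http://192.168.200.215:8080/latmjdemo
# ProxyPassReverse /harry http://192.168.200.215:8080/latmjdemo
```

Use one form or the other for a given path, not both.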

Proxy forwarding to a group of Java Servers

It gets even better ... mod_proxy_balancer lets you define a group of Java servers which you can forward your traffic on to - ideal on a busy site where the background task that's running in Java is a resource hog and needs to be shared between systems. Here's an example of what you would add to httpd.conf:


<Proxy balancer://catbox>
BalancerMember ajp://192.168.200.215:5090/latmjdemo
BalancerMember ajp://192.168.200.214:5009/latmjdemo
</Proxy>

ProxyPass /prince balancer://catbox/


In this example, any request for a resource under /prince on the server will be forwarded to one of two other machines, on the AJP port listed for each BalancerMember, and will be directed to the "latmjdemo" web application there.

More flexibility in forwarding to a group of Servers

The example above uses the default "round robin" scheduler - but there are other facilities available too to help you tune your forwarding. Here's a further example:


<Proxy balancer://kennel>
BalancerMember ajp://192.168.200.219:5009/latmjdemo loadfactor=1
BalancerMember ajp://192.168.200.218:5009/latmjdemo loadfactor=3
BalancerMember ajp://192.168.200.215:5009/latmjdemo status=+h
ProxySet lbmethod=bytraffic
ProxySet stickysession=JSESSIONID
</Proxy>

ProxyPass /corgi balancer://kennel/


In this example, we are forwarding to 2 systems, in a ratio of 1 : 3, and we're allocating traffic based on the traffic quantities coming back from each server rather than the number of requests (so queries that generate a lot of traffic count for more). An extra machine has been designated as a "hot standby" (status=+h), which is only used if neither of the others is available. Once a visitor is allocated to a particular machine, they'll continue to be forwarded to that same system while their JSESSIONID cookie remains live.
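
One detail worth sketching: for mod_proxy_balancer to send a returning visitor back to the same machine, each BalancerMember generally needs a route that matches the jvmRoute configured on that Tomcat, just as with mod_jk. The following is a variant of the kennel group (use it instead of the definition above, not alongside it), and the route names tc219 and tc218 are hypothetical.

```apache
<Proxy balancer://kennel>
# "tc219" and "tc218" are made-up names for this sketch;
# each must equal the jvmRoute set on the matching Tomcat.
BalancerMember ajp://192.168.200.219:5009/latmjdemo loadfactor=1 route=tc219
BalancerMember ajp://192.168.200.218:5009/latmjdemo loadfactor=3 route=tc218
ProxySet lbmethod=bytraffic
# JSESSIONID covers the cookie, jsessionid the URL-encoded form.
ProxySet stickysession=JSESSIONID|jsessionid
</Proxy>
```

Tomcat appends the jvmRoute to the session id, and the balancer uses that suffix to pick the member with the matching route.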

Some other notes about mod_proxy and family in Apache 2.2:

• ProxyPassMatch is available, which lets you specify a pattern (regular expression) for your forwarding - for example, if you wanted to forward all your image requests to an image server:
ProxyPassMatch ^/(.*\.jpg)$ http://images.wellho.net/$1

• mod_rewrite IS aware of mod_proxy_balancer, so that you can rewrite your requests as we do in many parts of our site, and then forward them on to other systems through an appropriate balancer.

• As of Apache 2.2.9, ProxyPassReverse is also mod_proxy_balancer aware.
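
To illustrate the mod_rewrite note, here is a minimal sketch of a rewrite that hands requests to a balancer. It assumes mod_rewrite is loaded and that a balancer://kennel group like the one above is defined; the /hound/ prefix is hypothetical and only there for the example.

```apache
# Sketch only: assumes mod_rewrite is loaded and balancer://kennel is defined.
RewriteEngine On
# [P] passes the rewritten request to mod_proxy; [L] stops further rewriting.
# The /hound/ prefix is a made-up path for this example.
RewriteRule ^/hound/(.*)$ balancer://kennel/$1 [P,L]
```

The part of the URL captured after /hound/ is appended to whichever BalancerMember the balancer selects.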

########################################
