A few months ago we set out to build a high-speed reverse proxy server. From the beginning we had a strict set of requirements: it had to be fast, handle a large number of simultaneous connections, and scale to high throughput. A reverse proxy server dispatches inbound network traffic to a set of servers and presents a single interface to the caller. For example, a reverse proxy could be used to load balance a cluster of web servers, or to route certain traffic to back end servers based on configurable rules. Our proxy server was going to sit in front of all our ad servers and route many tens of millions of HTTP requests per day!
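To make the dispatching idea concrete, here is a minimal sketch of one possible selection policy, round-robin load balancing. It is not our proxy's actual routing code: the class name `RoundRobinRouter` and the back end names `ad-1` to `ad-3` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: hands out back end servers in rotation.
class RoundRobinRouter {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinRouter(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("at least one back end is required");
        }
        this.backends = new ArrayList<>(backends); // defensive copy
    }

    // Safe to call from many threads; floorMod keeps the index
    // non-negative even after the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(index);
    }

    public static void main(String[] args) {
        // Back end names are made up for the example.
        RoundRobinRouter router = new RoundRobinRouter(Arrays.asList("ad-1", "ad-2", "ad-3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(router.pick()); // ad-1, ad-2, ad-3, then ad-1 again
        }
    }
}
```

A rule-based router would replace `pick()` with a lookup keyed on something in the request, such as the path or a header, but the shape stays the same: one entry point for callers, many servers behind it.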
At a high level the reverse proxy server needs to have the following components:
- I/O Services, Filters and Handlers. These modules would manage the inbound and outbound connections and filter traffic.
- Protocol decoding and encoding. It needs to support both HTTP/1.0 and HTTP/1.1.
- Statistics Manager for collecting and reporting real-time and historical metrics.
- Server Monitoring and Management.
- HTTP Server for serving static and servlet-based content.
Building such a server requires careful design. In order to support a large number of client connections we opted to use asynchronous (non-blocking) I/O for handling both the server side socket and the back end connections. Java’s implementation is called NIO and lives under the java.nio.* packages. Traditional synchronous I/O requires a dedicated thread per connection, which blocks until inbound data arrives to be read or outbound data can be written. With asynchronous I/O, on the other hand, the server is notified when I/O events are ready to be consumed. The I/O threads do not block while waiting for reads and writes. Instead they continue processing events for other sessions, so the server needs to keep track of where each client is within its I/O transaction. This model allows a small group of I/O threads to handle a large number of simultaneous connections and to pass execution of complex tasks to a larger pool of worker threads. Asynchronous I/O is limited by the usual constraints such as CPU, bandwidth and file descriptors, while synchronous I/O is limited by these and, in addition, by how many threads can run inside a JVM.
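The event-driven model described above can be sketched with plain java.nio, without any framework. The example below is a toy echo server, not our proxy: the class name `SelectorEchoServer`, the 4096-byte buffer and the `roundTrip` helper are assumptions made for the illustration. One thread waits on a `Selector` and serves every connection, and the per-connection state is the `ByteBuffer` attached to each key.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration: a single I/O thread multiplexing all connections.
class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector = Selector.open();
    private final ServerSocketChannel server = ServerSocketChannel.open();

    SelectorEchoServer(int port) throws IOException {
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", port)); // port 0 = any free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) accept();
                    else if (key.isReadable()) echo(key);
                }
            }
        } catch (IOException | IllegalStateException e) {
            // Selector, key or server channel was closed underneath us: stop the loop.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client == null) return;
        client.configureBlocking(false);
        // The attached buffer is this connection's state.
        client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(4096));
    }

    private void echo(SelectionKey key) {
        SocketChannel client = (SocketChannel) key.channel();
        ByteBuffer buffer = (ByteBuffer) key.attachment();
        try {
            if (client.read(buffer) < 0) { client.close(); return; } // peer closed
            buffer.flip();
            // Simplification: a real server would register OP_WRITE instead of looping.
            while (buffer.hasRemaining()) client.write(buffer);
            buffer.clear();
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // also wakes up a blocked select()
        server.close();
    }

    // Starts a server, sends one message with a blocking client, returns the echo.
    static String roundTrip(String message) {
        try (SelectorEchoServer srv = new SelectorEchoServer(0)) {
            new Thread(srv, "io-loop").start();
            try (Socket client = new Socket("127.0.0.1", srv.port())) {
                byte[] out = message.getBytes(StandardCharsets.UTF_8);
                client.getOutputStream().write(out);
                byte[] in = new byte[out.length];
                new DataInputStream(client.getInputStream()).readFully(in);
                return new String(in, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping")); // prints "ping"
    }
}
```

The sketch leaves out everything a production server needs, such as write-readiness handling, timeouts, and handing decoded requests to a worker pool. Its only purpose is to show that the I/O thread never blocks on any single connection.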
There are a number of frameworks out there which make it easier to write NIO-based servers in Java. The most popular of these are Apache MINA and Grizzly. Both are good frameworks; however, MINA has a cleaner API (in our opinion) and excellent documentation. Stay tuned as we talk more about building the various modules, including performance results and tuning tips for scaling our server!
Moe Alabed, Server Development Team
