Fighting spam with Mollom on Glassfish
One of my main projects for the past six months has been the conversion of the Mollom spam fighting service from an isolated Java project into a Java Enterprise project. Recently, as part of this conversion, we've migrated the Mollom Backend to the Glassfish 3.0.1 Application Server. This is, however, only the beginning....
The Mollom project, co-founded by Dries Buytaert and Benjamin Schrauwen, provides a "software as a service" backend that tells clients whether a specific comment is spam or ham. If Mollom is not sure how a specific post should be classified, it returns "unsure," and clients most often then display a CAPTCHA challenge to verify the "humanity" of the poster. The whole process is self-learning, and a number of parameters (text analysis, IP addresses, and included links) are taken into account. I'm continually amazed by Mollom's accuracy.
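As a rough illustration of the client side of this flow, here is a minimal Java sketch of how a site might act on the three possible answers. The names `CommentModeration`, `Verdict`, `Action`, and `decide` are hypothetical and are not part of the Mollom API; the CAPTCHA check is passed in as a plain callback so that no real service call is assumed.

```java
import java.util.function.BooleanSupplier;

// Hypothetical client-side handling of a spam verdict; not Mollom's actual API.
final class CommentModeration {

    /** The three answers described above. */
    enum Verdict { HAM, SPAM, UNSURE }

    /** What the client site does with the comment. */
    enum Action { PUBLISH, REJECT }

    /**
     * Maps a verdict to an action. The CAPTCHA callback is only
     * invoked when the verdict is UNSURE.
     */
    static Action decide(Verdict verdict, BooleanSupplier captchaSolved) {
        switch (verdict) {
            case HAM:
                return Action.PUBLISH;
            case SPAM:
                return Action.REJECT;
            case UNSURE:
                // Fall back to a human check when the classifier cannot decide.
                return captchaSolved.getAsBoolean() ? Action.PUBLISH : Action.REJECT;
            default:
                throw new IllegalArgumentException("Unknown verdict: " + verdict);
        }
    }

    public static void main(String[] args) {
        // An "unsure" comment whose poster solved the CAPTCHA gets published.
        System.out.println(decide(Verdict.UNSURE, () -> true));
    }
}
```

Keeping the CAPTCHA behind a callback means the challenge is shown only to the posters the classifier could not place, which is the point of the "unsure" answer.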
When I began work on this project in July of last year, the backend used its own implementations of thread pooling, connection pooling, and resource management (and other similar things that we have all written at least once in our Java careers). This was quite reasonable for the early development of the system: early Mollom work concentrated on the functionality of the classifiers and reputation systems, and infrastructure was added and refined gradually.
As Mollom grew in popularity, more infrastructure work was needed. In the end, before our most recent upgrades began, most resources were consumed by infrastructure issues rather than by Mollom-specific concepts. Both Dries and Benjamin realized that this was the moment to start using software designed to solve these typical infrastructure problems, and we began work on the port to Glassfish.
The Java EE 6 specification addresses many of the needs of Mollom. Database connection pooling, persistence management, data abstraction, thread pooling, enterprise beans to provide core functionality -- all these features are needed in large enterprise projects and Mollom is no different in this area than other large projects.
Glassfish was the first application server that fully implemented the Java EE 6 specification, and it does so in a clear way. This is important to me: when unexpected behavior occurs, I want to be able to trace that behavior. The fact that Glassfish is open source is also really helpful. The Glassfish engineers are very approachable, and for people who don't want to spend hours in deep code dives, there are supported solutions to many common problems.
The migration to a professional enterprise infrastructure like Glassfish is not the endpoint of the project. It does, however, provide Mollom a solid foundation that allows the creation of additional services and the use of additional protocols.
Currently, the Mollom API uses XML-RPC but we are testing a REST implementation as well. Actually, my blog has been using the Mollom REST implementation for some time now, and we've been experimenting with it in a number of ways.
I strongly believe that a clean REST interface to the Mollom spam module will make it easy for many existing projects to integrate with Mollom.
The additional services we've added -- and the robust backend they are running on -- are very exciting. Mollom goes way beyond protecting websites from comment spam, and we've a number of things up our sleeve. Stay tuned for new chapters in the Mollom story.