Tuesday, June 30, 2009

No Need To Write The Parser

For the last 8 to 10 years I have watched teams waste time writing the same parsing code over and over and over. I have been on both sides of the exercise: the one saying we need to do it, and the one asking whether we can do it differently.
Recently I was writing a simple, generic parsing engine configurable by either an XML document or a property file. I was a few days into the project when one of my long-time CTOs bonked me on the head and pointed me towards Smooks (http://www.smooks.org).
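
To give a feel for what a "configurable" parser means here, below is a minimal sketch in Python. It is not Smooks and not the engine I was building; the config format, the field names, and the functions load_mapping and parse_feed are all hypothetical, made up for illustration. The idea is only that the mapping from feed columns to named fields lives in a property-style config rather than in hand-written parsing code.

```python
import csv
import io

# Hypothetical property-style config: target field name = source column index.
FEED_CONFIG = """
# one line per output field
id=0
name=1
amount=2
"""


def load_mapping(config_text):
    """Parse 'field=index' lines into a dict of field name -> column index."""
    mapping = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        field, _, index = line.partition("=")
        mapping[field.strip()] = int(index)
    return mapping


def parse_feed(csv_text, mapping):
    """Turn each non-empty CSV row into a dict keyed by the configured fields."""
    reader = csv.reader(io.StringIO(csv_text))
    return [
        {field: row[idx] for field, idx in mapping.items()}
        for row in reader
        if row
    ]


# Made-up sample feed, for illustration only.
records = parse_feed("1,Widget,9.99\n2,Gadget,19.50\n", load_mapping(FEED_CONFIG))
```

Supporting a new feed layout then means changing the config text, not the code. A tool like Smooks takes the same idea much further.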

Smooks is a generic parsing engine which can take an XML document, a CSV document, a POJO and almost anything else you can think of and transform it into something else. I am currently building an engine around Smooks to take data from many different feeds and convert it into a proprietary format for consumption.

I am writing about this type of product because I believe there are organizations out there which are starting these types of projects and reaching for large, heavyweight ETL tools when all they need is a little Smooks!

Roque Martinez is an Enterprise Architect with experience in Financial, Insurance and Independent Software development. Roque is the founder of RM Technology Systems LLC, www.rmtechsys.com, and can be reached at rmartinez@rmtechsys.com.