Information Retrieval and Search solutions for Java platform

Posted by – April 3, 2008

After a few days of reading the Lucene and Solr mailing list archives, I discovered a wide range of frameworks and tools related to information retrieval and natural language processing. The most important search engine components can be found on the Lucene site as Lucene subprojects, but there are several other initiatives related to this topic.

I really liked Solr. It has a lot of cool features that resemble some of those found in FAST ESP, but one feature I really missed was a document processing pipeline, where I can run several small programs that each perform some operation on the document before sending it to the indexer. This can be solved by integrating OpenPipe with Solr. The main idea behind OpenPipe is to provide a document pipeline very similar to the FAST ESP one but much easier to configure and maintain. The project is quite new but promising from my perspective.
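To make the pipeline idea concrete, here is a minimal Java sketch of what I mean by "several small programs" running before the indexer. It is only an illustration under my own assumptions: `DocumentPipelineSketch`, `Stage`, `Indexer`, and the two toy stages are hypothetical names, not the API of OpenPipe, Solr, or FAST ESP, and a document is modeled as a simple map of field names to values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these names come from OpenPipe, Solr, or FAST ESP.
final class DocumentPipelineSketch {

    /** One small processing step; it may modify the document's fields in place. */
    interface Stage {
        void process(Map<String, String> document);
    }

    /** Stand-in for the real indexer (for example, a call to a search server). */
    interface Indexer {
        void index(Map<String, String> document);
    }

    private final List<Stage> stages = new ArrayList<>();

    /** Registers a stage; stages run in the order they were added. */
    DocumentPipelineSketch addStage(Stage stage) {
        stages.add(stage);
        return this;
    }

    /** Runs every stage in registration order, then hands the document to the indexer. */
    void submit(Map<String, String> document, Indexer indexer) {
        for (Stage stage : stages) {
            stage.process(document);
        }
        indexer.index(document);
    }

    /** Builds a pipeline with two toy stages: whitespace cleanup and a lowercased title field. */
    static DocumentPipelineSketch example() {
        return new DocumentPipelineSketch()
            // Stage 1: trim leading and trailing whitespace from every field value.
            .addStage(doc -> doc.replaceAll((field, value) -> value.trim()))
            // Stage 2: add a derived field holding the lowercased title.
            .addStage(doc -> doc.put("title_lower",
                doc.getOrDefault("title", "").toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "  Search on the Java Platform ");
        doc.put("body", " Lucene and Solr notes. ");

        // The "indexer" here just collects documents in a list.
        List<Map<String, String>> indexed = new ArrayList<>();
        example().submit(doc, indexed::add);
        System.out.println(indexed);
    }
}
```

The point of this shape is that each stage stays small and independent, so stages can be added, removed, or reordered through configuration without touching the indexer itself.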

If you need linguistic features such as sentiment analysis, POS tagging, or entity extraction in your search solution, LingPipe can solve your problem. LingPipe is free for applications that will themselves be available for free, but if you use it for commercial purposes a license must be purchased.

In the end, though, we still have a big problem: packaging all these components and frameworks together to offer a complete open source search solution. I hope to see something like this become a reality in the near future.

4 Comments on Information Retrieval and Search solutions for Java platform

  1. Jan Høydahl says:

    Hi pal,

    You might want to have a look at my blog post and at a new open source product called OpenPipeline, which looks promising!

    Jan

  2. rogerio says:

    Thanks dude! I would like to see things like your work integrated into Apache Solr!

  3. Jan Høydahl says:

    Continued discussion about Solr Pipeline at http://search-lucene.com/m/pFegS7BQ7k2
