Multi-threaded Extension of the IR Platform Terrier
Overview
Searching the large number of available patents is a time-consuming task. A parallel implementation of the Information Retrieval toolkit Terrier on a high-performance computer will increase the efficiency of searching very large document collections.
Goals
The goal of this project is to extend the Information Retrieval toolkit Terrier so that it can run in parallel on a supercomputing infrastructure. This includes parallelising the indexing process and the resulting indexes, the query expansion algorithms, and the query processing itself. As a result, users will be able to search very large document collections efficiently on a scalable information retrieval service.
Expected outcome for IP experts
A highly efficient parallelised information retrieval toolkit.
Timeline
This one-year project started in 2008. First results were presented at IRFS2008. The command line tool is expected by September 2009.
Project Partners
- Fondazione Ugo Bordoni, IT (Research & Development)
- Matrixware Information Services GmbH, AT (showcase, funding)
- Information Retrieval Facility, AT (data, infrastructure)
Links
Matrixware.net/Terrier (for more information about methods and findings, as well as publications and related works)
Contact
Please send your inquiry to: science@ir-facility.org.

