
Symposium Programme

The 1st IRF Symposium focussed on five main problem areas within the patent information search & retrieval world.

Data Quality
A range of errors can arise at the point of generating electronic database content, such as OCR faults, transposed digits, or inadequate field tagging. The information retrieval tools currently available to the patent searcher rely upon an exact match between the search term and the database content. Consequently, any such data error can fatally undermine retrieval. Attempts to attain 100% data accuracy at the point of document creation can never hope to address the full scope of this problem, especially as document volumes continue to rise. New search methods are needed which neither rely upon exact matching nor assume the availability of very high-quality data.
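As a rough illustration of what searching without exact match might look like, the sketch below compares a query term against each word of a record using Python's standard `difflib.SequenceMatcher` and accepts near misses above a similarity threshold. The record identifiers, the sample texts, the function names and the 0.8 threshold are assumptions made for this example; they are not taken from any tool discussed at the Symposium.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact_search(query, records):
    """Baseline: a record matches only if it contains the query as a whole word."""
    return [rid for rid, text in records.items() if query in text.split()]


def fuzzy_search(query, records, threshold=0.8):
    """Return (record id, score) pairs whose best-matching word reaches the threshold."""
    hits = []
    for rid, text in records.items():
        best = max((similarity(query, word) for word in text.split()), default=0.0)
        if best >= threshold:
            hits.append((rid, round(best, 3)))
    # Highest-scoring records first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical records; doc2 simulates an OCR fault where "l" was read as "1".
records = {
    "doc1": "polymerisation catalyst for olefins",
    "doc2": "po1ymerisation cata1yst for olefins",
    "doc3": "method of brewing coffee",
}

exact_hits = exact_search("catalyst", records)
fuzzy_hits = fuzzy_search("catalyst", records)
```

Here the exact search finds only `doc1`, while the fuzzy search also recovers the OCR-damaged `doc2`. Word-by-word comparison of this kind would not scale to a full patent collection; it is meant only to show the principle.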

Language Gap
One of the primary characteristics of a global patentability search is the need to locate prior art in documents written in a wide variety of languages. The most recent challenge has been the rapid increase in Chinese- and Korean-language documents, although machine translation for Western languages and for languages written in Cyrillic script is also still far from perfect. Patent specialists need translation assistance, both to locate candidate documents and to assess them for relevance. One solution is to convert documents to a common language before searching; an alternative approach is to translate queries dynamically during database interrogation and to display the results in the language of the query.
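The second approach, translating the query rather than the documents, can be sketched as follows. This is a toy example under strong assumptions: the two-entry glossary stands in for a real translation service, and the document identifiers and texts are invented. It covers only the retrieval step, not the display of results in the language of the query.

```python
# Hypothetical glossary standing in for a real query-translation service.
GLOSSARY = {
    "catalyst": {"de": "katalysator", "fr": "catalyseur"},
    "pump": {"de": "pumpe", "fr": "pompe"},
}

# Hypothetical collection: (document id, language code, lower-cased text).
DOCUMENTS = [
    ("EP-A", "en", "catalyst for olefin polymerisation"),
    ("DE-B", "de", "katalysator zur polymerisation von olefinen"),
    ("FR-C", "fr", "pompe centrifuge"),
]


def translate_query(term, language):
    """Translate an English query term into the given language, or return None."""
    if language == "en":
        return term
    return GLOSSARY.get(term, {}).get(language)


def cross_language_search(term):
    """Search each document using the query translated into that document's language."""
    hits = []
    for doc_id, language, text in DOCUMENTS:
        translated = translate_query(term, language)
        if translated is not None and translated in text.split():
            hits.append(doc_id)
    return hits


catalyst_hits = cross_language_search("catalyst")
pump_hits = cross_language_search("pump")
```

The design choice here is that documents stay in their original language and only the short query is translated, so nothing needs to be re-indexed when translation quality improves.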

Corpus Enrichment
Current technologies treat almost all parts of a patent document as equal at the point of search, and cannot discriminate between different segments of the text (preamble, detailed description, examples, claims) or between different data attributes (numeric values such as temperature, pressure or frequency range, or words with special meanings such as journal names). Much of the information contained in a single patent specification is implicit (such as specific embodiments of a generic chemical structure, or context-dependent acronyms and abbreviations). One avenue of investigation for improving search results is to extract implicit information from the original document and link it back to that document as a form of metadata. Ideally, this enrichment methodology should be applicable both to current documents and to the substantial back-file of older patents.
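One simple form of such enrichment can be sketched with a regular expression that picks out numeric values with units and records which segment of the document each came from. The unit list, the segment names and the sample text are assumptions chosen for illustration; real enrichment of chemical structures or acronyms would need far more than pattern matching.

```python
import re

# A number (optionally decimal) followed by one of a small, assumed set of units.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(K|kPa|MPa|bar|Hz|kHz|MHz|GHz)\b")


def enrich(document):
    """Extract numeric quantities per segment and return them as metadata entries."""
    metadata = []
    for segment, text in document.items():
        for match in QUANTITY.finditer(text):
            metadata.append(
                {
                    "segment": segment,  # which part of the patent the value came from
                    "value": float(match.group(1)),
                    "unit": match.group(2),
                }
            )
    return metadata


# Hypothetical patent text, already split into named segments.
patent = {
    "description": "The mixture is heated to 423 K at 2.5 MPa.",
    "claims": "A filter with a passband centred on 40 kHz.",
}

metadata = enrich(patent)
```

Because each entry keeps its segment label, a search could then be restricted to, for example, pressures stated in the description or frequencies stated in the claims.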

Tools for Information Professionals
The standard techniques of patent searching using today’s retrieval tools vary according to the type of end-result desired. Time and cost constraints are both powerful influences upon the choice of retrieval tool. Search outcomes may be improved by applying more iterative search strategies and ‘learning’ search tools, but many searches still proceed in a step-wise fashion, dictated by cost. Additionally, search results usually need to be customised so that they can be effectively communicated to specific customer groups, each with varied technical backgrounds, business needs and familiarity with patent systems.

Tools for Management & Research
Research managers and commercial decision-makers are increasingly trying to use patent information which has been aggregated at the industry, national or even international scale. This demands that different techniques be applied to retrieval, to organisation of the information and especially to display of the results. Non-text methods for conveying both large-scale trends and fine detail exist, but have largely been adapted from tools developed for other information types, not patents. The specific requirements of the patent searcher and the patent information user have not hitherto been adequately considered at the development stage. Future developments should consist of tools which have been tailored to the needs of the industry.

The IRFS 2007 Programme can be viewed here:
IRFS2007_programme.pdf