
Intellectual Property Evaluation Campaign (CLEF-IP)

--- 1st CALL FOR PARTICIPATION ---

http://www.ir-facility.org/clef-ip

The Intellectual Property Evaluation Campaign (CLEF-IP) is a benchmarking activity for evaluating retrieval techniques in the patent domain. Its two main goals, formulated in 2009, remain:

  • creating a large test collection of multi-lingual European patents
  • evaluating retrieval techniques in the patent domain

In 2011, the CLEF-IP activities are organised as a benchmarking activity of the forthcoming CLEF 2011 conference.

Deadline for registration is May 2011. Registration details will follow.

Data collection

The main body of the CLEF-IP collection consists of patent documents published by the EPO (European Patent Office), with content in English, German and French. This year, two important additions will be made to the patent collection.

The first concerns the patent documents in the collection that, due to certain regulations under the Patent Cooperation Treaty (PCT), contain mainly bibliographic data and little else. For these documents we will add the corresponding patent documents published by the WIPO (World Intellectual Property Organisation). Compared to previous years, participants in the lab will have more than 1 million additional documents at their disposal.

The second addition to the CLEF-IP collection is a set of images attached to patent documents. These images will be used in the image-related tasks (see below).

Tasks offered

  • Prior Art Candidate Search: Find patent documents that are likely to constitute prior art to a given patent application.
  • Image-based Document Retrieval: Find patent documents or images relevant to a given patent document containing images.
  • Classification: Classify a given patent document according to the IPC system, up to the subclass level.
  • Refined Classification: Classify a given patent document up to the group/subgroup level, when the subclass is given.
  • Image-based Classification: Categorize given patent images into pre-defined categories of images (such as graph, flowchart, drawing, etc.).
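As an illustration of the levels named in the two classification tasks, the following is a minimal Python sketch that splits an IPC symbol into its section, class, subclass, main group and subgroup. It assumes symbols written in the common textual form "H04L 12/56"; the function name ipc_levels and the output keys are hypothetical, and the topic and result formats used by the lab may differ.

```python
import re

# Assumed textual form of an IPC symbol, e.g. "H04L 12/56":
# section (one letter A-H), class (two digits), subclass (one letter),
# main group (1-4 digits), "/" and subgroup (two or more digits).
IPC_PATTERN = re.compile(
    r"^(?P<section>[A-H])(?P<cls>\d{2})(?P<subclass>[A-Z])"
    r"\s*(?P<group>\d{1,4})/(?P<subgroup>\d{2,})$"
)


def ipc_levels(symbol: str) -> dict:
    """Return the IPC hierarchy levels contained in one full symbol."""
    m = IPC_PATTERN.match(symbol.strip().upper())
    if m is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section = m["section"]
    cls = section + m["cls"]
    subclass = cls + m["subclass"]
    group_number = int(m["group"])
    return {
        "section": section,                                   # e.g. "H"
        "class": cls,                                         # e.g. "H04"
        "subclass": subclass,                                 # Classification task level
        "main_group": f"{subclass} {group_number}/00",        # group level
        "subgroup": f"{subclass} {group_number}/{m['subgroup']}",  # finest level
    }


if __name__ == "__main__":
    for level, value in ipc_levels("H04L 12/56").items():
        print(f"{level}: {value}")
```

Under these assumptions, a Classification run would be compared at the "subclass" value, while a Refined Classification run would start from that subclass and predict the "main_group" or "subgroup" value.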

Participants are invited to apply their favourite retrieval method and/or classifier to the given sets of topics and submit their results. Relevance assessment will be done using patent citations and current patent classifications.
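To give an idea of how a ranked result list can be scored against patent citations, here is a minimal Python sketch. It treats the documents cited for a topic as the relevant set and computes recall at a cut-off and average precision. The measures actually used by the lab are not stated in this announcement, so the choice of measures, the function names and the document identifiers below are assumptions made for illustration only.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears.

    Assumes the ranked list contains no duplicate documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


if __name__ == "__main__":
    # Hypothetical identifiers: one topic, a submitted ranking, and the
    # documents cited as prior art for that topic.
    ranked = ["EP-1", "EP-2", "EP-3", "EP-4"]
    cited = {"EP-2", "EP-4", "EP-9"}
    print("recall@3:", recall_at_k(ranked, cited, 3))
    print("average precision:", average_precision(ranked, cited))
```

In this example one of the three cited documents appears in the top three results, and a cited document that is never retrieved (EP-9) lowers both scores.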

Tentative timeline

  • March-April 2011: data release (corpus, test topics, training topics, guidelines)
  • end of May 2011: submission deadline for the Image-based Document Retrieval task (the best runs will be made available to participants in the Prior Art task for possible post-processing steps)
  • June 2011: submission deadline for all other tasks
  • July 2011: evaluation of submissions, release of results
  • August 2011: submission of Notebook Papers to CLEF 2011 (see the conference's schedule)

Organizers

  • Allan Hanbury (Information Retrieval Facility, Vienna, AT)
  • Florina Piroi (Information Retrieval Facility, Vienna, AT)
  • Veronika Zenz (max.recall, Vienna, AT)

CLEF-IP 2011 Overview

The CLEF-IP track was launched in 2009 to investigate IR techniques for patent retrieval within the Cross-Language Evaluation Forum. After a successful first year, the track continued in 2010 as a benchmarking activity at the CLEF 2010 conference (22 September 2010) in Padua, Italy. The track runs again as a benchmarking activity at the CLEF 2011 conference (September 2011) in Amsterdam, The Netherlands.

The track utilizes a large data collection of patent documents derived from EPO sources, covering English, French, and German patents. In 2011 the CLEF-IP data will include patent images as well.

Tasks

If you are interested in CLEF-IP, please subscribe to our mailing list.

The tasks involving patent images are organized in collaboration with ImageCLEF.

Co-ordinators

  • Allan Hanbury, Information Retrieval Facility, Vienna, Austria
  • Florina Piroi, Information Retrieval Facility, Vienna, Austria
  • Veronika Zenz, max.recall, Vienna, Austria

Data Collection

The data collections are extracts of the MAREC dataset, containing over 2.6 million patent documents pertaining to 1.3 million patents from the European Patent Office with content in English, German and French, and extended by documents from the WIPO.

The data for all tasks has been released. Prior to downloading it, participants must follow the instructions given under "How to Participate".

How To Participate

To register as a participating group, please follow the steps below:

  1. Register as a participant in the CLEF-IP lab on the CLEF registration page.
  2. Download and print the License Agreement.
  3. Fill in and sign the License Agreement and send it
    • as a PDF attachment to allan.hanbury@tuwien.ac.at, and
    • by post to the following address:
      Allan Hanbury (for CLEF-IP)
      Department of Software Technology and Interactive Systems
      Vienna University of Technology
      Favoritenstr. 9-11/188
      A-1040 Vienna, Austria

  4. Download the data: Once we have received your signed License Agreement, you will receive the access data to the CLEF-IP 2011 download site (in preparation).
  5. [Optional] Subscribe to the CLEF-IP mailing list by sending an e-mail to: clef-ip-subscribe@ifs.tuwien.ac.at

 
