Image Retrieval Session

Tuesday 7 June 2011 afternoon

 

Efficient search for images in patents, and for the information hidden within those images, remains an unresolved problem. Current image search systems still rely on text-based retrieval methodologies. After a keynote outlining the state of the art in content-based image retrieval and semantic indexing, this session will focus on image analysis and similarity search approaches and will show a patent image retrieval demo based on concept extraction and classification.

 

Schedule:

  • Session Keynote: Stefan Rueger, Professor, The Open University, UK
  • Demos: Visual and text-based search engine, patent image retrieval based on concept extraction and classification
  • Q&As

 

Browsing and searching in image databases using image content

Stefan Rueger, Knowledge Media Institute, The Open University, UK

In his keynote presentation Stefan Rueger will give an overview of automated image retrieval techniques and attempt to relate these to patent retrieval. The common best practice of locating images in most commercial image databases is still a structured text search in metadata and in text describing the images. Generating the right metadata and image description is a process that is subjective, costly, and depends on language, ontologies, (controlled) vocabulary and, ultimately, the purpose of the search.

Utilising the image content itself and an automated analysis of the images holds the promise of improving and widening access to image databases, but indexing image content is made difficult by a number of challenges: some aspects of external metadata simply cannot be extracted by analysing the image contents; there is a pronounced semantic gap between human interpretation and low-level image features; image elements have many meanings and interpretations, which forms yet another barrier to automated image understanding.

Stefan Rueger will argue that — in particular in vertical domains of image search — some of these challenges can be tackled by automated processing, machine learning and by utilising the skills of the user, for example through browsing or through a process that is called relevance feedback, thus putting the user at centre stage. 
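The relevance feedback mentioned above can be illustrated with the classic Rocchio update from the information-retrieval literature (a textbook sketch, not the speaker's specific method): the query's feature vector is moved towards the mean of the items the user marked relevant and away from the mean of those marked non-relevant. The feature vectors and weights below are purely illustrative.

```python
def rocchio_update(query, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: shift the query vector towards the
    centroid of relevant examples and away from the non-relevant one."""
    def mean(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    rel, nonrel = mean(relevant), mean(nonrelevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel, nonrel)]

# Hypothetical two-dimensional image feature vectors.
query = [1.0, 0.0]
marked_relevant = [[0.0, 1.0], [0.0, 3.0]]
marked_nonrelevant = [[2.0, 0.0]]
new_query = rocchio_update(query, marked_relevant, marked_nonrelevant)
```

After a few such rounds the query drifts towards the region of feature space the user finds relevant, which is why browsing plus feedback can partly bridge the semantic gap without any explicit metadata.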

Stefan Rueger

read Physics at the Freie Universität Berlin and gained his PhD at the Technische Universität Berlin (1996). He carved out his academic career at Imperial College London (1997-2006), where he also held an EPSRC Advanced Research Fellowship (1999-2004). In 2006 he became a Professor of Knowledge Media when he joined The Open University's Knowledge Media Institute to head a research group on Multimedia and Information Systems. Since 2009 he has held an Honorary Professorship from the University of Waikato, New Zealand, for his collaboration with the Greenstone Digital Library group on Multimedia Digital Libraries. Rüger has published widely in the area of Multimedia Information Retrieval. He was Principal Investigator of the EPSRC-funded Multimedia Knowledge Management Network, of a recent EPSRC grant to research and develop video digital libraries, and, for The Open University, of the European FP6-ICT project PHAROS, which established a horizontal layer of technologies for large-scale audio-visual search engines. As of 2011, he has served the academic community in various roles as conference chair (3x), programme chair (3x), journal editor (3x), guest editor (3x) and as referee for a wide range of Computing journals (>25), international conferences (>50) and research sponsors (>10). Rüger is a member of the EPSRC College, ACM, BCS, the BCS IRSG committee and a fellow of the Higher Education Academy.

Image Retrieval Demonstrations: 

Patent Retrieval based on both Visual and Textual Information

Alba García Seco de Herrera, University of Applied Sciences Western Switzerland

The proposed system is a simple combination of existing open-source tools for the retrieval of patents using both the visual and the textual information in the patents.
For text retrieval the Lucene system is used, and for visual retrieval GIFT (the GNU Image Finding Tool). A simple AJAX (Asynchronous JavaScript and XML) interface connects the two components and allows switching between visual search of images, including relevance feedback, and exact search in the patent documents. The visual and textual information sources are fairly complementary, and sometimes visual information can help find particular patents that might not share any words with the query.

Currently, the retrieval unit for visual search is the image, while for text search it is the full patent. We are working on extending visual retrieval to a patent basis as well, to allow for a fully mixed visual/text-based search in which the balance between visual and textual evidence can be adapted by the searcher.

See: http://medgift.unige.ch:8080/Patents/faces/Search.jsp
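The adjustable visual/textual balance described above is commonly realised as a late fusion of the two subsystems' ranked lists. The sketch below assumes min-max-normalised scores and a single mixing weight; the patent identifiers and scores are invented for illustration and do not come from the actual Lucene/GIFT demo.

```python
def fuse_scores(text_scores, visual_scores, alpha=0.5):
    """Late fusion of per-document text and visual relevance scores.

    alpha = 1.0 gives a purely textual ranking, alpha = 0.0 a purely
    visual one. Each source is min-max normalised first so the two
    score scales are comparable.
    """
    def normalise(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    t, v = normalise(text_scores), normalise(visual_scores)
    docs = set(t) | set(v)
    fused = {d: alpha * t.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0)
             for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical scores for three patents from the two subsystems.
text_scores = {"EP100": 12.0, "EP200": 4.0}
visual_scores = {"EP200": 0.9, "EP300": 0.7, "EP100": 0.5}
ranking = fuse_scores(text_scores, visual_scores, alpha=0.7)
```

Note that EP300 is only reachable through the visual channel here, which mirrors the point above that visual information can surface patents sharing no words with the query.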

 

Alba García Seco de Herrera studied Mathematics at the Universidad Complutense de Madrid in Spain until 2008. She received her master's degree in Telemedicine and Bioengineering at the Universidad Politécnica de Madrid in Spain in 2009. She then worked as a research assistant at General Electric Healthcare on several projects concerning Magnetic Resonance Image processing. Currently, as a PhD student, she is a research assistant at the University of Applied Sciences Western Switzerland (HES-SO) in Sierre, Switzerland, where she works on the EU projects Promise and Khresmoi. The GNU Image Finding Tool she is presenting has been used for medical projects but also in other domains such as patent retrieval.

 


Patent Image Retrieval based on Concept extraction and Classification

Stefanos Vrochidis, Informatics and Telematics Institute

 

In this demonstration, a patent image analysis and classification system that facilitates search by supporting concept-based retrieval will be presented. The challenge faced by this approach is to extract human-understandable high-level concepts (e.g. ski boot, cleats) from patent figures by exploiting low-level visual information. The presented framework is built on top of machine learning techniques and image analysis algorithms. Specifically, support vector machine classifiers are trained on manually annotated examples, using the Adaptive Hierarchical Density Histogram [1] as the low-level image feature. The system is evaluated by reporting the accuracy of the classifiers for a set of concepts and patent images from the test data provided by the IRF.


[1] P. Sidiropoulos, S. Vrochidis, I. Kompatsiaris, "Content-Based Binary Image Retrieval using the Adaptive Hierarchical Density Histogram", Pattern Recognition Journal, Elsevier, Volume 44, Issue 4, pp 739-750, April 2011.
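The pipeline described above (a density-based feature extracted from a binary drawing, then a trained classifier mapping it to a concept) can be sketched in miniature. The grid-density feature below is a deliberate simplification of the Adaptive Hierarchical Density Histogram of [1], and a nearest-centroid classifier stands in for the actual SVM; the toy 4x4 "figures" and concept labels are invented for illustration.

```python
def grid_density_feature(image, grid=2):
    """Fraction of foreground pixels in each cell of a grid laid over a
    binary image (a list of rows of 0/1 pixels). A crude stand-in for
    the Adaptive Hierarchical Density Histogram feature."""
    h, w = len(image), len(image[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            cells = [image[y][x]
                     for y in range(gy * h // grid, (gy + 1) * h // grid)
                     for x in range(gx * w // grid, (gx + 1) * w // grid)]
            feats.append(sum(cells) / len(cells))
    return feats

def train_centroids(examples):
    """Nearest-centroid training on (feature_vector, concept) pairs --
    a stand-in for fitting per-concept SVM classifiers."""
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def classify(centroids, vec):
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vec))

# Toy annotated training figures for two hypothetical concepts.
top_heavy = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
bottom_heavy = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]
model = train_centroids([(grid_density_feature(top_heavy), "top"),
                         (grid_density_feature(bottom_heavy), "bottom")])

# A noisy unseen figure is assigned the nearest concept.
query = [[1, 1, 1, 1], [1, 0, 1, 1], [0, 0, 0, 0], [0, 1, 0, 0]]
label = classify(model, grid_density_feature(query))
```

The real system differs in the feature (hierarchical rather than fixed-grid) and the classifier (margin-based SVMs), but the train-then-predict structure is the same.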

 

 

Stefanos Vrochidis received the Diploma degree in electrical and computer engineering from the Aristotle University of Thessaloniki, Greece, and the MSc degree in radio frequency communication systems from the University of Southampton, UK, in 2000 and 2001, respectively. Currently, he is an Associate Researcher at the Informatics and Telematics Institute - Centre for Research and Technology Hellas (ITI-CERTH) and a PhD candidate at Queen Mary University of London. His research interests include semantic multimedia analysis, indexing and retrieval, as well as search engine development and patent retrieval. He has successfully participated in many European and national projects related to multimedia analysis and retrieval, patent search and digital TV technologies. He is the co-author of 6 articles in refereed journals, 3 book chapters and more than 20 papers in international conferences.

 

 Sponsored by

 

 In cooperation with

 

 Supported and endorsed by


WON

CEPIUG