Industry Track Speakers

Task based information seeking in a large organisation, report from a collaborative project

Summary: We used a work-task based research approach in which we studied information practices, that is, the normalized ways in which we recognize information needs, look for information, and value and use it. By studying such practices in real-life work tasks, we can outline the role that a search engine plays in relation to other work tasks as well as to other ways of finding information. The results also led to an initially tested model for approaching content acquisition for enterprise search engines in a structured way.

 

Lessons learned and challenges ahead for patent analytics software

Summary: This presentation covers the lessons learned from the development of Treparel's text mining and visualization software platform (KMX) for patents and research literature. The importance of capturing client requirements and use cases from the beginning will be described. The evolution of the software's features and capabilities, and the challenges that lie ahead, will also be presented. Finally, the value of classification and clustering for text data in other markets with other use cases will be addressed.
 

Community and corporate patent search portals

Summary: This talk outlines some of the opportunities and challenges we have faced in deploying datasets of up to 100 million patents and patent applications to groups of specialist patent data users. Examples are drawn from Boliven.com, which hosts one of the world’s largest collections of patents and related scientific literature, comprising over 100 million documents. Boliven.com documents come from a very wide range of data sources, concern specialist subject matter and are associated with a variety of classification schemes by suppliers. As user numbers and user types proliferate, so does the range of user needs that the system must address. Our user base of over 15,000 is very diverse, spanning individuals to multi-national corporate groups, across a wide range of technical disciplines. The users require a range of outputs, including web-based information, reports and API access to our data. Topics covered in the talk include data curation, cloud storage, user modelling and personalisation, collaborative workflows, machine learning, multi-language information retrieval, patent analytics and visualisation of search results.
 

Patent and Norm Exploration with the m2n Knowledge Discovery Suite

Summary: In patents, texts and images are linked through references in the patent text itself. This kind of association is not represented explicitly in patent databases, so it cannot be used efficiently during search. Databases also provide only partial metadata. For example, they lack image type information (e.g. technical drawings, diagrams, flow charts and graphs). Moreover, extracting additional metadata to allow for powerful querying is a costly manual task (in the norm domain as well).
We present patent and norm exploration with the m2n Knowledge Discovery Suite as a prototype implementation that addresses several of these issues. We will show how textual figure references are semantically linked to the image files and can be inspected in the norm/patent viewer. We will also show how textual references can be defined as "finding objects" to allow interaction during search (e.g. searching for similar referencing paragraphs). Additionally, advanced metadata generation, browsing and searching will be presented on example norm documents, along with some collaboration features of the m2n-kd suite.