
INFINITI (Information retrieval for information services)

Sketch of the VARA second screen application. The column on the far right-hand side shows icons representing key topics mentioned during a live television show; by clicking on them, users can bookmark them or get further information through a pop-up window. ©VARA

  

The author John Naisbitt observed that we drown in information but starve for knowledge. He said this decades ago, and with exponential data growth it is only becoming more difficult to make sense of it all.

Fundamental notion of information retrieval, namely relevance

The ICT-science challenge of INFINITI lies in the fundamental notion of information retrieval, namely relevance. How do we extract meaning from Big Data so as to inform intelligent actions? The body of information that we are confronted with today is growing in volume; in length, with records reaching back over ever longer periods; and in variety, with more and more parallel sources of information, each contributing a piece of the puzzle. The strategy adopted by the project is to approximate relevance as a combination of a large number of signals: textual, structural, audio, visual, semantic, social, cognitive and behavioural.
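One simple way to picture "relevance as a combination of signals" is a weighted average of per-signal scores. The sketch below is only an illustration under stated assumptions: the weights, the function names `relevance` and `rank`, and the convention that every signal is scored in [0, 1] are all hypothetical and do not describe the project's actual models.

```python
from __future__ import annotations

# Hypothetical weights per signal type; the project does not publish such values.
WEIGHTS = {
    "textual": 0.4,
    "semantic": 0.3,
    "social": 0.2,
    "behavioural": 0.1,
}


def relevance(signals: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the signal scores that are present (each assumed in [0, 1])."""
    used = {name: w for name, w in weights.items() if name in signals}
    total = sum(used.values())
    if total == 0:
        return 0.0  # no known signal available for this item
    return sum(w * signals[name] for name, w in used.items()) / total


def rank(candidates: dict[str, dict[str, float]]) -> list[str]:
    """Order candidate items by decreasing combined relevance."""
    return sorted(candidates, key=lambda item: relevance(candidates[item]), reverse=True)
```

Normalising by the weights of the signals that are actually present lets items with missing signals (for example, no social data yet) still be compared; in practice the weights would more plausibly be learned from data than set by hand.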

Mining and analysing signals in an online setting

INFINITI is inspired by the fact that we increasingly live our lives online. Not only are data streams expanding in every imaginable dimension, the nature of Big Data is changing too: we share experiences, all of our transactions are logged, and body and brain signals are becoming a commodity. Being able to extract useful knowledge from these streams is not just an economic opportunity: for all of us, this ability is becoming a key asset for operating successfully in today’s world. The ICT-science challenge is to mine and analyse signals in an online setting, with little to no supervision. The INFINITI team consists of ICT researchers as well as scholars in the humanities, social sciences and cognitive sciences.
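To make "online, with little to no supervision" concrete, the following minimal sketch flags bursts in a stream of counts (say, mentions of a term per minute) using an exponentially weighted running mean and variance. It processes one value at a time and needs no labelled data. The class name `BurstDetector` and the parameters `alpha` and `threshold` are assumptions chosen for illustration, not part of any INFINITI software.

```python
import math


class BurstDetector:
    """Flags values that lie far above an exponentially weighted running mean."""

    def __init__(self, alpha=0.1, threshold=3.0):
        self.alpha = alpha          # how quickly older observations are forgotten
        self.threshold = threshold  # burst = this many standard deviations above the mean
        self.mean = None
        self.var = 0.0

    def update(self, value):
        """Consume one observation; return True if it looks like a burst."""
        if self.mean is None:
            self.mean = float(value)  # first observation only initialises the state
            return False
        deviation = value - self.mean
        std = math.sqrt(self.var)
        is_burst = std > 0 and deviation > self.threshold * std
        # Incremental exponentially weighted updates of mean and variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return is_burst
```

Because the detector keeps only two numbers of state per stream, it could in principle be run over many streams in parallel; the cost is that it adapts slowly to genuine level shifts when `alpha` is small.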

Second screen application with VARA

To prepare for future valorization, the project has laid the foundation for large-scale demonstrators for Dutch broadcasting. The first step has been to design and implement a second screen application together with the Dutch broadcaster VARA, in which TV viewers receive automatically generated links to background information during live television talk shows, based on a combination of textual, semantic and social signals. The application currently uses Dutch subtitles for link generation, but could be extended to other languages and to programs without subtitles.
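As a rough illustration of subtitle-based link generation, the sketch below matches subtitle text against a small topic lexicon and returns the most frequently mentioned topics with their links. Everything here is hypothetical: the lexicon, the `example.org` URLs, the function name `suggest_links` and the frequency-only ranking are assumptions, and the semantic and social signals used by the real application are not modelled. English text stands in for the Dutch subtitles.

```python
import re
from collections import Counter

# Hypothetical topic lexicon: surface form -> background link (placeholder URLs).
TOPIC_LINKS = {
    "european union": "https://example.org/wiki/European_Union",
    "climate": "https://example.org/wiki/Climate",
    "amsterdam": "https://example.org/wiki/Amsterdam",
}


def suggest_links(subtitle_lines, lexicon=TOPIC_LINKS, top_k=2):
    """Return up to top_k (topic, url) pairs, most frequently mentioned first."""
    text = " ".join(subtitle_lines).lower()
    counts = Counter()
    for topic in lexicon:
        # Whole-word match so that a short topic does not fire inside longer words.
        pattern = r"\b" + re.escape(topic) + r"\b"
        hits = len(re.findall(pattern, text))
        if hits:
            counts[topic] = hits
    return [(topic, lexicon[topic]) for topic, _ in counts.most_common(top_k)]
```

For a live show, the same function could be applied to a sliding window of the most recent subtitle lines so that the suggested links follow the conversation.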

Valorization strategy

During 2012 INFINITI started rolling out its three-pronged valorization strategy.

  1. The project releases open source software for text and multimedia analytics. Based on its work on text analysis, the project pushed a release of xTAS, the open source extensible text analysis service. In late 2012, the Netherlands eScience Center pledged long-term support for the development of xTAS. The contributing work packages here are WP01–05.
  2. The project is pursuing a small number of targeted demonstrators. We have focused on Shoshin (a longitudinal search engine for digital humanities based on the open Elasticsearch platform), OpenGeist (a public API for trend analysis from Wikipedia data) and Treinplanner.info (a natural search in databases for the Dutch Railways). Each demonstrator will offer mining functionality for one of the signal types studied in the project. So far, WP02, WP05 and WP07 have contributed to this action line.
  3. To prepare for future valorization, the project has laid the foundation for large-scale demonstrators for Dutch broadcasting, starting with the second screen application developed together with VARA and described above.

TV playlists with VPRO

Planning for a second large-scale demonstrator, in collaboration with another Dutch broadcasting organization, VPRO, started towards the end of 2012. The VPRO use case will revolve around a complex recommendation scenario in which “TV playlists” are recommended based on user interests and a mixture of textual, audio, social, visual and semantic signals.
