Project ‘Art DATIS’ receives NWO Creative Industries funding

The Netherlands Organisation for Scientific Research (NWO) has awarded Smart Culture – Big Data / Digital Humanities funding of ca. € 500.000 to the project Digital Art Technical sources for the Netherlands: Integration and improvement of sources on glass for a Sustainable future (Art DATIS), by prof. dr. Sven Dupré (UU/UvA), dr. Marieke Hendriksen (UU), prof. dr. Evangelos Kanoulas (UvA) and their private partners: RKD Netherlands Institute for Art History (Anita Hopmans and Reinier van ‘t Zelfde), Stichting Vrij Glas (Durk Valkema) and Picturae (Mark Lindeman).

The project, which will start in 2018, will focus on the questions of how historical sources were used to innovate glass production and education in the twentieth century, and how we can efficiently link the enormous amount of newly digitized art technical sources on artistic glass production, such as object documentation, technical texts, images, and research data. The project will digitize unique materials from the archives of glass artist Sybren Valkema (1916-1996), managed by Stichting Vrij Glas, and integrate and enrich them with existing databases containing different kinds of data on the history of artistic glass production. The project is designed to link with the RKD Explore and ARTECHNE databases. The latter, directed by Marieke Hendriksen, is part of the ERC Artechne Project led by Sven Dupré. The Art DATIS researchers will collaborate with internationally renowned data and collections specialists from the Corning Museum of Glass and Glasmuseum Hentrich/Museum Kunstpalast Düsseldorf to ensure integration with museum databases worldwide.

The project will recruit two PhD students – a glass historian and a data scientist. Both will be embedded at the RKD and Stichting Vrij Glas and will start their work in 2018.

Posted in Funding | Comments Off on Project ‘Art DATIS’ receives NWO Creative Industries funding

Session Search

Information Retrieval (IR) research has traditionally focused on serving the best results for a single query, so-called ad hoc retrieval. However, users typically search iteratively, refining and reformulating their queries during a session. A key challenge in the study of this interaction is the creation of suitable evaluation resources to assess the effectiveness of IR systems over sessions. This paper describes the TREC Session Track, which ran from 2010 through 2014 and focused on building test collections that included various forms of implicit feedback. We describe the test collections, provide a brief analysis of the differences between the datasets over the years, and present evaluation results demonstrating that the use of user session data significantly improves effectiveness.


Lexical query modeling has been the leading paradigm for session search. In this paper, we analyze TREC session query logs and investigate the viability of lexical query models for session search by comparing the performance of different lexical matching approaches. We find that naive methods based on term frequency weighting perform on par with specialized session models. We provide insights into the potential and limitations of lexical query modeling for session search and propose future directions for the field.
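To make the term-frequency-weighting baseline mentioned above concrete, here is a minimal sketch (the helper names and the toy session are illustrative assumptions, not the paper's code): the terms of all queries in a session are aggregated into a frequency-weighted lexical query model, and documents are scored by summing the model weights of their terms.

```python
from collections import Counter

def session_query_model(queries):
    """Aggregate all session queries into a single lexical query model,
    weighting each term by its relative frequency across the session."""
    counts = Counter()
    for q in queries:
        counts.update(q.lower().split())
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}

def score(doc, model):
    """Naive lexical matching: sum the query-model weights of the
    document's terms."""
    return sum(model.get(t, 0.0) for t in doc.lower().split())

# Toy session of reformulated queries and a tiny document set.
session = ["glass blowing history", "history of glass production techniques"]
model = session_query_model(session)
docs = ["a history of glass production", "modern ceramics overview"]
ranked = sorted(docs, key=lambda d: score(d, model), reverse=True)
```

Despite its simplicity, this kind of aggregation already exploits the repetition of terms across reformulations, which is why such baselines can be surprisingly competitive.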


Posted in Publications, Sessions, TREC | Comments Off on Session Search

A Short Survey on Search Evaluation

Evaluation has always been the cornerstone of scientific development. Scientists come up with hypotheses (models) to explain physical phenomena, and validate these models by comparing their output to observations in nature. A scientific field then consists merely of a collection of hypotheses that have not (yet) been disproved when compared to nature. Evaluation plays the same key role in the field of information retrieval. Researchers and practitioners develop models to explain the relation between an information need expressed by a person and the information contained in available resources, and test these models by comparing their outcomes to collections of observations.

This article is a short survey on the methods, measures, and designs used in the field of Information Retrieval to evaluate the quality of search algorithms (i.e., the implementation of a model) against collections of observations. The phrase “search quality” has more than one interpretation; here I will only discuss one of them: the effectiveness of a search algorithm in finding the information requested by a user. There are two types of collections of observations used for the purpose of evaluation: (a) relevance annotations, and (b) observable user behaviour. I will call the evaluation framework based on the former collection-based evaluation, and the one based on the latter in-situ evaluation. This survey is far from complete; it only presents my personal viewpoint on recent developments in the field.
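As a minimal example of collection-based evaluation, consider precision at k, one of the simplest effectiveness measures computed from relevance annotations (the document identifiers below are made up for illustration):

```python
def precision_at_k(ranked_docs, relevant, k):
    """Collection-based evaluation: the fraction of the top-k retrieved
    documents that are judged relevant in the annotation set (qrels)."""
    top = ranked_docs[:k]
    return sum(1 for d in top if d in relevant) / k

# A system's ranking for one query, and the relevance annotations for it.
ranked = ["d3", "d1", "d7", "d2", "d9"]
qrels = {"d1", "d2", "d5"}
p5 = precision_at_k(ranked, qrels, 5)  # 2 relevant documents in the top 5
```

In-situ evaluation, by contrast, would replace the `qrels` set with signals logged from real users, such as clicks or dwell time.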


Posted in Publications | Comments Off on A Short Survey on Search Evaluation

From Patient Dossiers to Structured Forms

Hospitals need to provide information to many external parties (e.g. tumor registration to IKNL, statistics to clinical auditing institutes such as DICA, information to health insurance companies etc.). Often this requires filling in predefined forms for all eligible patients/cases. Being able to fill in forms and check for eligibility automatically can save hospitals time and money, since most effort is currently manual.

The goal of this project is to develop an algorithmic pipeline that automatically (a) extracts information from medical dossiers, (b) tests for eligibility, and (c) fills in predefined forms.
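The three stages above can be sketched as follows. This is a hypothetical toy pipeline: the function names, the regular-expression extraction, and the eligibility rule are my own illustrative assumptions, not the project's actual implementation.

```python
import re

def extract(dossier_text):
    """(a) Extract structured fields from free-text dossier notes
    (here: a toy regex for age and a keyword check for diagnosis)."""
    fields = {}
    m = re.search(r"age[:\s]+(\d+)", dossier_text, re.IGNORECASE)
    if m:
        fields["age"] = int(m.group(1))
    fields["diagnosis"] = "tumor" if "tumor" in dossier_text.lower() else None
    return fields

def eligible(fields):
    """(b) Test eligibility against a made-up registry criterion:
    adult patients with a tumor diagnosis."""
    return fields.get("diagnosis") == "tumor" and fields.get("age", 0) >= 18

def fill_form(fields):
    """(c) Fill a predefined form for an eligible case."""
    return {"patient_age": fields["age"],
            "registered_diagnosis": fields["diagnosis"]}

note = "Age: 57. Findings consistent with tumor in left lung."
fields = extract(note)
form = fill_form(fields) if eligible(fields) else None
```

In practice, stage (a) would rely on clinical NLP rather than regexes, but the decomposition into extraction, eligibility testing, and form filling stays the same.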

This is a joint project with CTcue, funded by Amsterdam Data Science.

Posted in Funding | Comments Off on From Patient Dossiers to Structured Forms

Aldo Lipani visits ILPS

Aldo Lipani is visiting me for a month. Aldo is a PhD student at the Vienna University of Technology working on evaluation in Information Retrieval. His main focus is on improving the reliability of test-collection-based evaluation and on developing an analytical approach to accessibility measures. Aldo and I will work on extending the definition and measurement of retrievability.
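As a rough illustration of what retrievability measures, here is a simplified sketch (the toy corpus and ranking function are my own assumptions): a document's retrievability counts, over a set of queries, how often the document surfaces within the top-c results of the retrieval system.

```python
def retrievability(doc, queries, rank_fn, c=10):
    """Simplified cumulative retrievability r(d): the number of queries
    for which the document appears in the top-c ranked results."""
    return sum(1 for q in queries if doc in rank_fn(q)[:c])

# Toy corpus and a trivial ranking function standing in for a real
# retrieval system: rank documents by term overlap with the query.
corpus = {"d1": "glass art", "d2": "data science", "d3": "glass data"}

def rank_fn(query):
    terms = set(query.split())
    return sorted(corpus, key=lambda d: -len(terms & set(corpus[d].split())))

queries = ["glass", "data", "art"]
r_d3 = retrievability("d3", queries, rank_fn, c=2)
```

A document with low retrievability is hard to surface no matter what users search for, which makes the measure useful for auditing access bias in test collections and retrieval systems.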

Posted in Funding | Comments Off on Aldo Lipani visits ILPS

ControCurator: Crowds and Machines for Controversy Discovery and Modeling


This is a joint project with Lora Aroyo (VU), Sound & Vision and Crowdynews, funded by COMMIT, on identifying controversial topics in multimodal data, with applications ranging from health to political discourse, news, etc.
The ControCurator project aims to enable modern information access systems to discover and understand controversial topics and events by bringing together different types of crowds (niches of experts, lay crowds and engaged social media contributors) and machines in a joint active learning workflow for the creation of adequate training data (real-time and offline).
Posted in Funding | Comments Off on ControCurator: Crowds and Machines for Controversy Discovery and Modeling

TREC 2016 Tasks Track

Are you interested in building systems that will assist users towards task completion rather than simply showing relevant results for a query?

The primary goals of the Tasks track are (1) to evaluate a system’s understanding of the tasks users aim to complete, and (2) to evaluate how useful the retrieved documents are for completing the underlying task.

Ideally, a search engine should understand the reason that led the user to submit a query (i.e., the actual task behind it), and rather than just showing results relevant to the submitted query, it should guide the user towards completing that task by incorporating information about the actual information need.

The Tasks track will run for a second year at TREC. For details visit the track’s website.


Posted in TREC, Tasks | Comments Off on TREC 2016 Tasks Track

Google Faculty Research Award

Google will be supporting my research on ‘Session-based Personalization: Analysis and Evaluation’. The focus of this research is the personalization of search engine results based on the user’s interactions with the search engine in the current session.

The Google Research Awards Program received 805 proposals and funded 113 of them, with only 3 of them in the fields of Information retrieval, extraction and organization (including semantic graphs).

More Information.

Posted in Funding | Comments Off on Google Faculty Research Award