SIGIR 2015 tutorials on click models

Aleksandr Chuklin, Ilya Markov and I will be teaching an introductory and advanced tutorial on click models for web search at SIGIR 2015, on August 10, 2015.

The tutorial is based on a forthcoming book on click models for web search. Participants will have access to the book, the slides, and the code used for the demo sessions during the tutorial. The advanced part of the tutorial will also feature a guest speaker who will talk about developing new click models.

We’ll announce more details as soon as we can.

Now playing: Jamie xx — Far Nearer

ICWSM 2015 paper on determining the presence of political parties in social circles online

“Determining the Presence of Political Parties in Social Circles” by Christophe Van Gysel, Bart Goethals and Maarten de Rijke is available online.

In the paper, we derive the political climate of the social circles of Twitter users using a weakly-supervised approach. By applying random walks over a sub-sample of Twitter’s social graph we infer a distribution indicating the presence of eight Flemish political parties in users’ social circles in the months before the 2014 elections. The graph structure is induced through a combination of connection and retweet features and combines information from over a million tweets and 14 million follower connections. We solely exploit the social graph structure and do not rely on tweet content. For validation we compare the affiliation of politically active Twitter users with the most influential party in their network. On a validation set of around 700 politically active individuals we achieve F1 scores of 0.85 and greater. We asked the Twitter community to evaluate our classification performance. More than half of the 2,258 users who responded reported a score higher than 60 out of 100.
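To make the random-walk idea concrete, here is a minimal sketch in Python. It is not the method from the paper: the function name `party_presence`, the toy graph, the edge weights, and the choice of a fixed-length walk are all assumptions made for illustration. It propagates probability mass from one user along weighted edges and reports how much of that mass lands on accounts labelled with a party.

```python
from collections import defaultdict


def party_presence(graph, party_of, start, steps=3):
    """Share of random-walk mass per party, for walks of up to `steps` steps.

    graph:    {node: {neighbour: weight}} (hypothetical weights, e.g. a mix
              of follower and retweet signals)
    party_of: {node: party} for the labelled seed accounts (an assumption)
    """
    dist = {start: 1.0}
    presence = defaultdict(float)
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            neighbours = graph.get(node, {})
            total = sum(neighbours.values())
            if total == 0:
                continue  # dead end: this mass is dropped
            for neighbour, weight in neighbours.items():
                nxt[neighbour] += mass * weight / total
        dist = nxt
        # Record the mass that currently sits on party-labelled accounts.
        for node, mass in dist.items():
            if node in party_of:
                presence[party_of[node]] += mass
    norm = sum(presence.values())
    return {party: mass / norm for party, mass in presence.items()} if norm else {}


# Toy example: user "u" is tied twice as strongly to "a" as to "b".
toy_graph = {"u": {"a": 2.0, "b": 1.0}, "a": {"pX": 1.0}, "b": {"pY": 1.0}}
toy_parties = {"pX": "X", "pY": "Y"}
presence = party_presence(toy_graph, toy_parties, "u", steps=2)
```

In this toy graph, party X ends up with twice the presence of party Y, mirroring the stronger tie between "u" and "a".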

CfP: 1st Workshop on User Modeling in Heterogeneous Search Environments (HetUM 2015) in conjunction with CIKM 2015

1st Workshop on User Modeling in Heterogeneous Search Environments (HetUM 2015) in conjunction with CIKM 2015
19 October 2015, Melbourne, Australia


Regular paper submission: 19 June 2015
Special track for re-submissions of CIKM papers: 8 July 2015
Notification of acceptance: 23 July 2015
Camera ready: 7 August 2015
Workshop: 19 October 2015

When users interact with information retrieval (IR) systems, they leave rich implicit feedback in the form of clicks, mouse movements, etc. This feedback contains valuable information about users and about IR systems. Analyzing and interpreting user interactions and modeling user search behavior has become an important research direction. It enables us to better understand users, perform user simulations, improve search algorithms and build quality metrics.


We’re hiring: Fully funded PhD student and two postdocs

We’re looking for three strong candidates for a fully funded PhD student position in information retrieval, a three-year postdoc position in information retrieval, and a three-year postdoc position in media studies. This is a collaborative project, called MediaNow, between the Informatics Institute and Department of Media Studies of the University of Amsterdam and the Netherlands Institute for Sound and Vision. More information about the project, including links to the advertisements, is available online. The deadline for applications is April 12, 2015. Interviews are scheduled for May 1, 2015.

Fully funded PhD position in machine learning for information retrieval

We’re looking for a strong candidate for a fully funded four-year PhD position on a collaborative project with Microsoft Research Cambridge. The research will focus on the development of new algorithms for leveraging data reuse in order to efficiently evaluate and optimize the behavior of information retrieval systems. See this page for the advertisement, further requirements, and conditions. The deadline for applications is March 22, 2015.

NWO grant: MediaNow project on narrative search engines

José van Dijck, Johan Oomen and I obtained an NWO Creative Industries grant to work on next generation search engine technologies for exploring large multimedia archives. The target users are media professionals. The proposed innovations at the interface of computer science and media studies come in three kinds. First, we will develop, test and release self-learning search algorithms that adapt and improve their behavior while being used. Second, we will create robust methods for semantically analyzing content in media archives. Third, we will develop new search engine result page presentations that provide automatically generated storylines as narratives for professionals in the creative industries. The algorithmic solutions will be implemented in the research environment of the Netherlands Institute for Sound and Vision and released as open source search solutions.

Now playing: Bright Eyes — Messenger Bird’s Song

ACM TOIS paper on a comparative analysis of interleaving methods for aggregated search online

Our ACM Transactions on Information Systems paper called “A comparative analysis of interleaving methods for aggregated search” by Aleksandr Chuklin, Anne Schuth, Ke Zhou and Maarten de Rijke is available online now.

A result page of a modern search engine often goes beyond a simple list of “ten blue links.” Many specific user needs (e.g., News, Image, Video) are addressed by so-called aggregated or vertical search solutions: specially presented documents, often retrieved from specific sources, that stand out from the regular organic web search results. When it comes to evaluating ranking systems, such complex result layouts raise their own challenges. This is especially true for so-called interleaving methods that have arisen as an important type of online evaluation: by mixing results from two different result pages, interleaving can easily break the desired web layout in which vertical documents are grouped together, and hence hurt the user experience.

We conduct an analysis of different interleaving methods as applied to aggregated search engine result pages. Apart from conventional interleaving methods, we propose two vertical-aware methods: one derived from the widely used Team-Draft Interleaving method by adjusting it in such a way that it respects vertical document groupings, and another based on the recently introduced Optimized Interleaving framework. We show that our proposed methods are better at preserving the user experience than existing interleaving methods while still performing well as a tool for comparing ranking systems. For evaluating our proposed vertical-aware interleaving methods we use real world click data as well as simulated clicks and simulated ranking systems.
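For readers unfamiliar with the baseline, the following is a minimal sketch of conventional Team-Draft Interleaving in Python. It is the standard, vertical-unaware method, not the vertical-aware variants proposed in the paper, and the function names and document ids are made up for illustration. The two rankers take turns, with a coin flip whenever they have contributed equally, and each adds its highest-ranked document that is not yet shown; clicks are then credited to the team that contributed the clicked document.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Return (interleaved list, team label "A"/"B" per position)."""
    rng = rng or random.Random()
    interleaved, teams, used = [], [], set()
    count_a = count_b = 0

    def next_unused(ranking):
        for doc in ranking:
            if doc not in used:
                return doc
        return None

    while len(interleaved) < length:
        cand_a, cand_b = next_unused(ranking_a), next_unused(ranking_b)
        if cand_a is None and cand_b is None:
            break  # both rankings are exhausted
        # The team with fewer picks goes next; ties are broken by a coin flip.
        pick_a = count_a < count_b or (count_a == count_b and rng.random() < 0.5)
        if cand_a is None:
            pick_a = False
        elif cand_b is None:
            pick_a = True
        doc = cand_a if pick_a else cand_b
        interleaved.append(doc)
        teams.append("A" if pick_a else "B")
        used.add(doc)
        if pick_a:
            count_a += 1
        else:
            count_b += 1
    return interleaved, teams


def credit(teams, clicked_positions):
    """Count clicks per team; the team with more clicks wins the comparison."""
    wins_a = sum(1 for i in clicked_positions if teams[i] == "A")
    wins_b = sum(1 for i in clicked_positions if teams[i] == "B")
    return wins_a, wins_b


shown, teams = team_draft_interleave(
    ["d1", "d2", "d3"], ["d2", "d4", "d1"], length=4, rng=random.Random(0)
)
wins_a, wins_b = credit(teams, clicked_positions=[0, 3])
```

Nothing in this procedure knows about vertical blocks, so documents that belong to the same vertical can end up scattered across the interleaved list, which is the layout problem described above.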

IPM paper on burst-aware data fusion for microblog search online

An Information Processing & Management paper on burst-aware data fusion for microblog search by Shangsong Liang and Maarten de Rijke is online now.

We consider the problem of searching posts in microblog environments. We frame this microblog post search problem as a late data fusion problem. Previous work on data fusion has mainly focused on aggregating document lists based on retrieval status values or ranks of documents without fully utilizing temporal features of the set of documents being fused. Additionally, previous work on data fusion has often worked on the assumption that only documents that are highly ranked in many of the lists are likely to be of relevance. We propose BurstFuseX, a fusion model that not only utilizes a microblog post’s ranking information but also exploits its publication time. BurstFuseX builds on an existing fusion method and rewards posts that are published in or near a burst of posts that are highly ranked in many of the lists being aggregated. We experimentally verify the effectiveness of the proposed late data fusion algorithm, and demonstrate that in terms of mean average precision it significantly outperforms the standard, state-of-the-art fusion approaches as well as burst or time-sensitive retrieval methods.
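The general idea of rewarding posts published during a burst can be illustrated with a toy sketch in Python. This is not BurstFuseX: the reciprocal-rank base score, the fixed time bins, the "above-average bin" burst test, and the multiplicative `bonus` are all assumptions chosen to keep the example short, and unlike the method described above it only rewards posts inside a burst bin, not near one.

```python
from collections import defaultdict


def fuse_with_burst_bonus(ranked_lists, timestamps, bin_width=3600, bonus=0.5):
    """Fuse ranked lists of post ids, boosting posts published in burst bins.

    ranked_lists: list of rankings (each a list of post ids, best first)
    timestamps:   {post id: publication time in seconds}
    """
    # Base score: sum of reciprocal ranks over all lists (a simple stand-in
    # for an existing fusion method).
    base = defaultdict(float)
    for ranking in ranked_lists:
        for rank, post in enumerate(ranking, start=1):
            base[post] += 1.0 / rank
    if not base:
        return []

    # A time bin counts as a burst if its accumulated score is above average.
    bin_mass = defaultdict(float)
    for post, score in base.items():
        bin_mass[timestamps[post] // bin_width] += score
    mean_mass = sum(bin_mass.values()) / len(bin_mass)
    bursts = {b for b, mass in bin_mass.items() if mass > mean_mass}

    fused = {
        post: score * (1.0 + bonus) if timestamps[post] // bin_width in bursts else score
        for post, score in base.items()
    }
    # Highest fused score first; ties are broken by post id for determinism.
    return sorted(fused, key=lambda post: (-fused[post], post))


lists = [["x", "p1", "p2"], ["p2", "p1", "x"]]
times = {"x": 100000, "p1": 0, "p2": 10}
with_bonus = fuse_with_burst_bonus(lists, times)
without_bonus = fuse_with_burst_bonus(lists, times, bonus=0.0)
```

In this example "p1" and "p2" were published within seconds of each other, so their bin is treated as a burst and "p1" overtakes the isolated post "x" once the bonus is applied.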

ECIR 2015 paper on user behavior in location search on mobile devices online

Our ECIR 2015 paper on user behavior in location search on mobile devices by Yaser Norouzzadeh Ravari, Ilya Markov, Artem Grotov, Maarten Clements and Maarten de Rijke is online now.

Location search engines are an important part of GPS-enabled devices such as mobile phones and tablet computers. In this paper, we study how users behave when they interact with a location search engine by analyzing logs from a popular GPS-navigation service to find out whether mobile users’ location search characteristics differ from those of regular web search. In particular, we analyze query- and session-based characteristics and the temporal distribution of location searches performed on smart phones and tablet computers. Our findings may be used to improve the design of search interfaces in order to help users perform location search more effectively and improve the overall experience on GPS-enabled mobile devices.