We’ve just created an “official” Meetup for SEA: Search Engines Amsterdam, with the idea of expanding the reach of the meetings. Join the Meetup group at http://www.meetup.com/SEA-Search-Engines-Amsterdam/ and RSVP to the SEA Meetup this Friday.
Our CIKM 2014 paper on “Online exploration for detecting shifts in fresh intent” by Damien Lefortier, Pavel Serdyukov and Maarten de Rijke is available online now.
In web search, recency ranking refers to the task of ranking documents while taking into account freshness as one of the criteria of their relevance. There are two approaches to recency ranking. One focuses on extending existing learning to rank algorithms to optimize for both freshness and relevance. The other relies on an aggregated search strategy: a (dedicated) fresh vertical is used and fresh results from this vertical are subsequently integrated into the search engine result page. In this paper, we adopt the second strategy. In particular, we focus on the fresh vertical prediction task for repeating queries and identify the following novel algorithmic problem: how to quickly correct fresh intent detection mistakes made by a state-of-the-art fresh intent detector, which erroneously detected or missed a fresh intent shift upwards for a particular repeating query (i.e., a change in the degree to which the query has a fresh intent). We propose a method for solving this problem. We use online exploration at the very start of what we believe to be a detected intent shift. Based on this exploratory phase, we correct fresh intent detection mistakes made by a state-of-the-art fresh intent detector for queries whose fresh intent has shifted. Using query logs of Yandex, we demonstrate that our methods allow us to significantly improve the speed and quality of the detection of fresh intent shifts.
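To give a feel for the general idea of online exploration, here is a minimal Python sketch. It is not the method from the paper: the function names, the exploration rate, the minimum sample size, and the click-through threshold are all assumptions made up for illustration. Fresh results are shown on a small random share of impressions for a query with a suspected intent shift, and the clicks observed on them are used to confirm or reject the shift.

```python
import random


def run_exploration(n_impressions, true_fresh_ctr, explore_rate=0.1, seed=0):
    """Simulate the exploratory phase for one repeating query.

    Hypothetical setup: on each impression, fresh results are shown with
    probability `explore_rate`; when shown, the user clicks a fresh result
    with probability `true_fresh_ctr`. Returns one boolean per exploratory
    impression (True means a fresh result was clicked).
    """
    rng = random.Random(seed)
    click_log = []
    for _ in range(n_impressions):
        if rng.random() < explore_rate:
            click_log.append(rng.random() < true_fresh_ctr)
    return click_log


def decide_fresh_intent(click_log, min_shown=20, ctr_threshold=0.3):
    """Confirm or reject a suspected fresh intent shift.

    Returns None while there is too little exploratory evidence, otherwise
    True if the observed click-through rate on fresh results reaches the
    (assumed) threshold and False if it does not.
    """
    shown = len(click_log)
    if shown < min_shown:
        return None
    ctr = sum(click_log) / shown
    return ctr >= ctr_threshold


if __name__ == "__main__":
    log = run_exploration(n_impressions=1000, true_fresh_ctr=0.5)
    print(len(log), decide_fresh_intent(log))
```

A fixed threshold is the simplest possible decision rule; a real system would more likely learn the decision from logged features.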
Our CIKM 2014 paper “Time-aware rank aggregation for microblog search” by Shangsong Liang, Zhaochun Ren, Wouter Weerkamp, Edgar Meij and Maarten de Rijke is available online now.
In the paper we tackle the problem of searching microblog posts and frame it as a rank aggregation problem where we merge result lists generated by separate rankers so as to produce a final ranking to be returned to the user. We propose a rank aggregation method, TimeRA, that is able to infer the rank scores of documents via latent factor modeling. It is time-aware and rewards posts that are published in or near a burst of posts that are ranked highly in many of the lists being aggregated. Our experimental results show that it significantly outperforms state-of-the-art rank aggregation and time-sensitive microblog search algorithms.
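The intuition behind rewarding posts near a burst can be sketched in a few lines of Python. This is not TimeRA: there is no latent factor modeling here, and the reciprocal-rank base score, the fixed-width time bins, and the `top_k` and `burst_weight` parameters are assumptions chosen only for illustration.

```python
from collections import defaultdict


def aggregate(ranked_lists, timestamps, bin_size=3600, top_k=3, burst_weight=0.5):
    """Merge several ranked lists of post ids into one ranking.

    Base score: sum of reciprocal ranks over all lists.
    Burst bonus: posts whose time bin holds many top-k posts across the
    lists get up to `burst_weight` added to their score.
    `timestamps` maps each post id to a time in seconds.
    """
    base = defaultdict(float)
    bin_votes = defaultdict(int)
    for ranking in ranked_lists:
        for rank, post in enumerate(ranking, start=1):
            base[post] += 1.0 / rank
            if rank <= top_k:
                # A highly ranked post counts as a vote for its time bin.
                bin_votes[timestamps[post] // bin_size] += 1

    max_votes = max(bin_votes.values(), default=0)
    scores = {}
    for post, score in base.items():
        votes = bin_votes.get(timestamps[post] // bin_size, 0)
        bonus = burst_weight * votes / max_votes if max_votes else 0.0
        scores[post] = score + bonus

    # Highest score first; ties are broken by post id for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))


if __name__ == "__main__":
    lists = [["a", "b", "near", "late"], ["b", "a", "late", "near"]]
    times = {"a": 10, "b": 20, "near": 30, "late": 99999}
    print(aggregate(lists, times, top_k=2))
```

In this toy input, `near` and `late` have the same base score, but `near` falls in the same hour as the top-ranked posts `a` and `b`, so the burst bonus moves it ahead of `late`.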