Maarten de Rijke

Information retrieval


ECIR 2019 paper on online learning to rank online

Optimizing ranking models in an online setting by Harrie Oosterhuis and Maarten de Rijke is available online now at this location.

Online Learning to Rank (OLTR) methods optimize ranking models by directly interacting with users, which allows them to be very efficient and responsive. All OLTR methods introduced during the past decade have extended on the original OLTR method: Dueling Bandit Gradient Descent (DBGD). Recently, a fundamentally different approach was introduced with the Pairwise Differentiable Gradient Descent (PDGD) algorithm. To date, the only comparisons of the two approaches are limited to simulations with cascading click models and low levels of noise. The main outcome so far is that PDGD converges at higher levels of performance and learns considerably faster than DBGD-based methods. However, the PDGD algorithm assumes cascading user behavior, potentially giving it an unfair advantage. Furthermore, the robustness of both methods to high levels of noise has not been investigated. Therefore, it is unclear whether the reported advantages of PDGD over DBGD generalize to different experimental conditions. In this paper, we investigate whether the previous conclusions about the PDGD and DBGD comparison generalize from ideal to worst-case circumstances. We do so in two ways. First, we compare the theoretical properties of PDGD and DBGD, by taking a critical look at previously proven properties in the context of ranking. Second, we estimate an upper and lower bound on the performance of methods by simulating both ideal user behavior and extremely difficult behavior, i.e., almost-random non-cascading user models. Our findings show that the theoretical bounds of DBGD do not apply to any common ranking model and, furthermore, that the performance of DBGD is substantially worse than PDGD in both ideal and worst-case circumstances. These results reproduce previously published findings about the relative performance of PDGD vs. DBGD and generalize them to extremely noisy and non-cascading circumstances.
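The simulations referred to here can be sketched compactly. Below is a minimal cascading click model in Python; the function name and the probability values are illustrative choices, not the paper's exact instantiations, but they show how an "ideal" user and an "almost-random" user differ only in their click and stop probabilities.

```python
import random

def simulate_cascade_clicks(relevances, p_click, p_stop, rng):
    """Simulate one session under a cascading click model.

    relevances: 0/1 relevance of the ranked documents, top to bottom.
    p_click[r]: probability of clicking a document with relevance r.
    p_stop[r]: probability of leaving the list after clicking relevance r.
    """
    clicks = []
    for rank, rel in enumerate(relevances):
        if rng.random() < p_click[rel]:
            clicks.append(rank)
            if rng.random() < p_stop[rel]:
                break  # cascading user stops examining the list
    return clicks

rng = random.Random(42)
# Ideal user: clicks exactly the relevant documents, never the others.
ideal = simulate_cascade_clicks([0, 1, 0, 1], {0: 0.0, 1: 1.0}, {0: 0.0, 1: 0.0}, rng)
# Almost-random user: click probabilities barely depend on relevance.
noisy = simulate_cascade_clicks([0, 1, 0, 1], {0: 0.4, 1: 0.6}, {0: 0.1, 1: 0.1}, rng)
```

With the ideal setting, only the relevant documents (ranks 1 and 3) are ever clicked; the almost-random setting produces the heavy-noise, worst-case regime the paper probes.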

The paper will be presented at ECIR 2019.

WSDM 2019 paper on off-policy evaluation online

When people change their mind: Off-policy evaluation in non-stationary recommendation environments by Rolf Jagerman, Ilya Markov, and Maarten de Rijke is online now at this location.

We consider the novel problem of evaluating a recommendation policy offline in environments where the reward signal is non-stationary. Non-stationarity appears in many Information Retrieval (IR) applications such as recommendation and advertising, but its effect on off-policy evaluation has not been studied at all. We are the first to address this issue. First, we analyze standard off-policy estimators in non-stationary environments and show both theoretically and experimentally that their bias grows with time. Then, we propose new off-policy estimators with moving averages and show that their bias is independent of time and can be bounded. Furthermore, we provide a method to trade off bias and variance in a principled way to get an off-policy estimator that works well in both non-stationary and stationary environments. We experiment on publicly available recommendation datasets and show that our newly proposed moving average estimators accurately capture changes in non-stationary environments, while standard off-policy estimators fail to do so.
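The contrast the abstract draws can be illustrated with inverse propensity scoring (IPS). The sketch below is generic and deliberately simplified (scalar rewards, known propensities, an exponential moving average); the paper's estimators and their bias analysis differ in detail.

```python
def ips_estimate(rewards, target_probs, logging_probs):
    """Plain IPS estimate: an unweighted average of importance-weighted
    rewards over the whole interaction log."""
    ratios = [r * pt / pl for r, pt, pl in zip(rewards, target_probs, logging_probs)]
    return sum(ratios) / len(ratios)

def ema_ips_estimate(rewards, target_probs, logging_probs, alpha=0.1):
    """Moving-average variant: an exponential moving average down-weights
    old interactions, so the estimate can track a drifting reward signal."""
    est = 0.0
    for r, pt, pl in zip(rewards, target_probs, logging_probs):
        est = (1.0 - alpha) * est + alpha * (r * pt / pl)
    return est
```

On a log where the reward flips from 0 to 1 halfway through, the plain average reports 0.5 forever, while the moving-average estimate converges to the recent value of 1, which is the behavior the abstract describes for non-stationary environments.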

The paper will be presented at WSDM 2019.

WSDM 2019 paper on open-domain question answering online

Learning to transform, combine, and reason in open-domain question answering by Mostafa Dehghani, Hosein Azarbonyad, Jaap Kamps, and Maarten de Rijke is online now at this location.

Users seek direct answers to complex questions from large open-domain knowledge sources like the Web. Open-domain question answering has become a critical task to be solved for building systems that help address users’ complex information needs. Most open-domain question answering systems use a search engine to retrieve a set of candidate documents, select one or a few of them as context, and then apply reading comprehension models to extract answers. Some questions, however, require taking a broader context into account, e.g., by considering low-ranked documents that are not immediately relevant, combining information from multiple documents, and reasoning over multiple facts from these documents to infer the answer. In this paper, we propose a model based on the Transformer architecture that is able to efficiently operate over a larger set of candidate documents by effectively combining the evidence from these documents during multiple steps of reasoning, while it is robust against noise from low-ranked non-relevant documents included in the set. We use our proposed model, called TraCRNet, on two public open-domain question answering datasets, SearchQA and Quasar-T, and achieve results that meet or exceed the state-of-the-art.
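As a toy picture of "multiple steps of reasoning" over a candidate set, the loop below repeatedly softmax-attends over document vectors with a query state and folds the attended evidence back into that state. This is a deliberately simplified sketch of the general idea, not TraCRNet's actual Transformer-based architecture; all names here are illustrative.

```python
import math

def attend(state, docs):
    """One reasoning step: attend over candidate document vectors with the
    current state, then mix the attended evidence into the state."""
    scores = [sum(s * d for s, d in zip(state, doc)) for doc in docs]
    m = max(scores)
    weights = [math.exp(sc - m) for sc in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    mixed = [sum(w * doc[i] for w, doc in zip(weights, docs)) for i in range(len(state))]
    return [s + x for s, x in zip(state, mixed)]

def multi_step_evidence(question_vec, docs, steps=3):
    """Run several attention steps so evidence gathered early can steer
    which documents are attended to later."""
    state = list(question_vec)
    for _ in range(steps):
        state = attend(state, docs)
    return state
```

Because the state is updated between steps, a document that only becomes relevant given earlier-gathered evidence can still dominate a later attention step, which is the intuition behind reasoning over multiple documents.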

The paper will be presented at WSDM 2019.

AAAI 2019 paper on repeat aware recommendation online

RepeatNet: A Repeat Aware Neural Recommendation Machine for Session-based Recommendation by Pengjie Ren, Zhumin Chen, Jing Li, Zhaochun Ren, Jun Ma, and Maarten de Rijke is online now at this location.

Recurrent neural networks for session-based recommendation have attracted a lot of attention recently because of their promising performance. Repeat consumption is a common phenomenon in many recommendation scenarios (e.g., e-commerce, music, and TV program recommendations), where the same item is re-consumed repeatedly over time. However, no previous studies have emphasized repeat consumption with neural networks. An effective neural approach is needed to decide when to perform repeat recommendation. In this paper, we incorporate a repeat-explore mechanism into neural networks and propose a new model, called RepeatNet, with an encoder-decoder structure. RepeatNet integrates a regular neural recommendation approach in the decoder with a new repeat recommendation mechanism that can choose items from a user’s history and recommend them at the right time. We report on extensive experiments on three benchmark datasets. RepeatNet outperforms state-of-the-art baselines on all three datasets in terms of MRR and Recall. Furthermore, as the dataset size and the repeat ratio increase, the improvements of RepeatNet over the baselines also increase, which demonstrates its advantage in handling repeat recommendation scenarios.
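The repeat-explore mechanism can be pictured as a gated mixture of two distributions: one over items already in the session (repeat) and one over the full catalogue (explore). The sketch below renders that idea in plain Python; it is a simplified, hypothetical illustration, since RepeatNet itself produces these quantities with an encoder-decoder network.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def repeat_explore_dist(scores, history, mode_logits):
    """Final item distribution as a gated mixture of a repeat distribution
    (scores renormalized over session items) and an explore distribution
    (softmax over all items)."""
    p_repeat, p_explore = softmax(mode_logits)
    explore = softmax(scores)
    hist_probs = softmax([scores[i] for i in history])
    repeat = [0.0] * len(scores)
    for i, p in zip(history, hist_probs):
        repeat[i] = p
    return [p_repeat * r + p_explore * e for r, e in zip(repeat, explore)]
```

When the gate leans toward repeat mode, items from the session history receive extra probability mass even if their raw scores equal those of unseen items.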

The paper will be presented at AAAI 2019.

AAAI 2019 paper on dialogue generation online

Dialogue generation: From imitation learning to inverse reinforcement learning by Ziming Li, Julia Kiseleva, and Maarten de Rijke is online now at this location.

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate higher-quality responses and achieve better overall performance than the state-of-the-art.

The paper will be presented at AAAI 2019.

Come work with us. We’re hiring again

We have the following open positions:

  • One PhD student in AI and IR, to work on multi-modal search, see this page for details
  • Two PhD students in AI and Operations Research, to work on replenishment, see this page for details

Contact me at derijke@uva.nl if you’d like to find out more. I particularly encourage those of diverse backgrounds to apply. I strongly believe that diversity of education, experience, career, and perspectives will lead to a better environment for the researchers in my team and a better intellectual environment for the team as a whole. This is something I value deeply.

Paper on incremental sparse Bayesian ordinal regression published in Neural Networks

“Incremental sparse Bayesian ordinal regression” by Chang Li and Maarten de Rijke has been published in the October 2018 issue of Neural Networks. See the journal’s site.

Ordinal Regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high-dimensional non-linear space. However, most of the basis function-based algorithms are time consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid big matrix inverses, which is the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.
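ISBOR's sparse Bayesian machinery does not fit in a snippet, but the ordinal part of the task is easy to picture: a latent regression score is mapped to an ordered category by counting how many ordered thresholds it exceeds. A minimal sketch of that standard threshold view:

```python
def ordinal_predict(score, thresholds):
    """Threshold view of ordinal regression: with K ordered thresholds,
    a latent score falls into one of K + 1 ordered categories."""
    return sum(1 for t in thresholds if score > t)
```

Unlike plain multi-class classification, this mapping guarantees that a larger latent score never yields a smaller category, which is exactly the ordering information OR is meant to preserve.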

CIKM 2018 paper on Web-based Startup Success Prediction online

Web-based Startup Success Prediction by Boris Sharchilev, Michael Roizner, Andrey Rumyantsev, Denis Ozornin, Pavel Serdyukov, and Maarten de Rijke is online now at this page.

In this paper, we consider the problem of predicting the success of startup companies at their early development stages. We formulate the task as predicting whether a company that has already secured initial (seed or angel) funding will attract a further round of investment in a given period of time. Previous work on this task has mostly been restricted to mining structured data sources, such as databases of the startup ecosystem consisting of investors, incubators and startups. Instead, we investigate the potential of using web-based open sources for the startup success prediction task and model the task using a very rich set of signals from such sources. In particular, we enrich structured data about the startup ecosystem with information from a business- and employment-oriented social networking service and from the web in general. Using these signals, we train a robust machine learning pipeline encompassing multiple base models using gradient boosting. We show that utilizing companies’ mentions on the Web yields a substantial performance boost in comparison to only using structured data about the startup ecosystem. We also provide a thorough analysis of the obtained model that allows one to obtain insights into both the types of useful signals discoverable on the Web and market mechanisms underlying the funding process.

The paper will be presented at CIKM 2018 in October 2018.

SCAI 2018 paper on Understanding the Low-Diversity Problem of Chatbots online

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots by Shaojie Jiang and Maarten de Rijke is available online now at this location.

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
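Label smoothing, one of the remedies sketched in the paper, is straightforward to write down: the target distribution puts mass 1 − ε on the gold token and spreads ε uniformly over the vocabulary, which penalizes over-confident output distributions. A minimal sketch (the function name is illustrative):

```python
import math

def label_smoothing_nll(logprobs, target, epsilon=0.1):
    """Cross-entropy against a smoothed target: (1 - epsilon) on the gold
    token plus epsilon spread uniformly over the whole vocabulary."""
    vocab = len(logprobs)
    smooth = epsilon / vocab
    loss = 0.0
    for i, lp in enumerate(logprobs):
        weight = (1.0 - epsilon) + smooth if i == target else smooth
        loss -= weight * lp
    return loss

# A model that is already near-certain about the gold token:
confident = [math.log(0.9), math.log(0.1)]
plain = label_smoothing_nll(confident, target=0, epsilon=0.0)
smoothed = label_smoothing_nll(confident, target=0, epsilon=0.1)
```

Here `smoothed` exceeds `plain`: the smoothed loss keeps charging a confident model for the probability mass it withholds from non-gold tokens, which counteracts the over-confidence the paper identifies as a source of dull, low-diversity responses.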

The paper will be presented at the SCAI 2018 workshop at EMNLP 2018 in October 2018.

CIKM 2018 paper on Differentiable Unbiased Online Learning to Rank online

Differentiable Unbiased Online Learning to Rank by Harrie Oosterhuis and Maarten de Rijke is available online now at this location.

Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks and can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods in terms of both learning speed and final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows for non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible.
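The preference-inference step at the heart of this approach can be sketched simply. The rule below, under which a clicked document is taken to be preferred over the unclicked documents ranked above it, is a simplified version of the idea; PDGD additionally reweights these inferred preferences to obtain the unbiasedness result mentioned in the abstract.

```python
def inferred_preferences(ranking, clicked):
    """Infer pairwise document preferences from clicks: each clicked
    document is preferred over every unclicked document ranked above it
    (a simplified sketch; PDGD's debiasing weights are omitted)."""
    prefs = []
    for pos, doc in enumerate(ranking):
        if doc in clicked:
            for other in ranking[:pos]:
                if other not in clicked:
                    prefs.append((doc, other))  # doc preferred over other
    return prefs
```

Each inferred pair yields a differentiable pairwise loss term, which is what lets this family of methods optimize any differentiable ranking model, including neural networks, rather than only linear rankers.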

The paper will be presented at CIKM 2018 in October 2018.

