## Maarten de Rijke

### Information retrieval

#### Category: Publications

The papers that my team and I will be presenting at CIKM 2019 are online now:

• Wanyu Chen, Fei Cai, Honghui Chen, and Maarten de Rijke. A Dynamic Co-attention Network for Session-based Recommendation. In CIKM 2019: 28th ACM Conference on Information and Knowledge Management. ACM, November 2019. Bibtex, PDF
@inproceedings{chen-2019-dynamic,
Author = {Chen, Wanyu and Cai, Fei and Chen, Honghui and de Rijke, Maarten},
Booktitle = {CIKM 2019: 28th ACM Conference on Information and Knowledge Management},
Date-Modified = {2019-08-09 09:23:39 +0200},
Month = {November},
Publisher = {ACM},
Title = {A Dynamic Co-attention Network for Session-based Recommendation},
Year = {2019}}
• Svitlana Vakulenko, Javier Fernández, Axel Polleres, Maarten de Rijke, and Michael Cochez. Message Passing for Complex Question Answering over Knowledge Graphs. In CIKM 2019: 28th ACM Conference on Information and Knowledge Management. ACM, November 2019. Bibtex, PDF
@inproceedings{vakulenko-2019-message,
Author = {Vakulenko, Svitlana and Fern{\'{a}}ndez, Javier and Polleres, Axel and de Rijke, Maarten and Cochez, Michael},
Booktitle = {CIKM 2019: 28th ACM Conference on Information and Knowledge Management},
Date-Modified = {2019-08-09 09:29:12 +0200},
Month = {November},
Publisher = {ACM},
Title = {Message Passing for Complex Question Answering over Knowledge Graphs},
Year = {2019}}
• Shanshan Wang, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Jun Ma, and Maarten de Rijke. Order-free Medicine Combination Prediction With Graph Convolutional Reinforcement Learning. In CIKM 2019: 28th ACM Conference on Information and Knowledge Management. ACM, November 2019. Bibtex, PDF
@inproceedings{wang-2019-order-free,
Author = {Wang, Shanshan and Ren, Pengjie and Chen, Zhumin and Ren, Zhaochun and Ma, Jun and de Rijke, Maarten},
Booktitle = {CIKM 2019: 28th ACM Conference on Information and Knowledge Management},
Date-Modified = {2019-08-09 09:26:11 +0200},
Month = {November},
Publisher = {ACM},
Title = {Order-free Medicine Combination Prediction With Graph Convolutional Reinforcement Learning},
Year = {2019}}
• Xiaohui Xie, Jiaxin Mao, Yiqun Liu, Maarten de Rijke, Qingyao Ai, Yufei Huang, Min Zhang, and Shaoping Ma. Improving Web Image Search with Contextual Information. In CIKM 2019: 28th ACM Conference on Information and Knowledge Management. ACM, November 2019. Bibtex, PDF
@inproceedings{xie-2019-improving,
Author = {Xie, Xiaohui and Mao, Jiaxin and Liu, Yiqun and de Rijke, Maarten and Ai, Qingyao and Huang, Yufei and Zhang, Min and Ma, Shaoping},
Booktitle = {CIKM 2019: 28th ACM Conference on Information and Knowledge Management},
Date-Modified = {2019-08-09 09:27:42 +0200},
Month = {November},
Publisher = {ACM},
Title = {Improving Web Image Search with Contextual Information},
Year = {2019}}

Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model by Chang Li and Maarten de Rijke is online now at this location.

In the paper, we argue that non-stationarity appears in many online applications such as web search and advertising. We study the online learning to rank problem in a non-stationary environment where user preferences change abruptly at an unknown moment in time. We consider the problem of identifying the K most attractive items and propose cascading non-stationary bandits, an online learning variant of the cascading model, where a user browses a ranked list from top to bottom and clicks on the first attractive item. We propose two algorithms for solving this non-stationary problem: CascadeDUCB and CascadeSWUCB. We analyze their performance and derive gap-dependent upper bounds on the $n$-step regret of these algorithms. We also establish a lower bound on the regret for cascading non-stationary bandits and show that both algorithms match the lower bound up to a logarithmic factor. Finally, we evaluate their performance on a real-world web search click dataset.
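To make the cascade click model from the abstract concrete, here is a minimal simulation sketch. It is not the paper's CascadeSWUCB: the function names, the exploration constant 1.5, and the window handling are illustrative assumptions. It only shows how cascade feedback can be combined with a sliding-window upper confidence bound when attraction probabilities change abruptly.

```python
import math
import random
from collections import deque


def cascade_click(ranked_items, attraction, rng):
    """Position of the first clicked item, or None if nothing is clicked.

    The user scans top to bottom and clicks the first attractive item.
    """
    for position, item in enumerate(ranked_items):
        if rng.random() < attraction[item]:
            return position
    return None


def observed_feedback(ranked_items, click_position):
    """(item, reward) pairs for the items examined under the cascade model.

    Items above the click were examined and skipped (0), the clicked item
    gets 1, and items below the click are unobserved. Without a click,
    every item was examined and skipped.
    """
    if click_position is None:
        return [(item, 0) for item in ranked_items]
    feedback = [(item, 0) for item in ranked_items[:click_position]]
    feedback.append((ranked_items[click_position], 1))
    return feedback


def sliding_window_ucb(history, t, window):
    """UCB computed only from observations in the last `window` steps.

    `history` is a deque of (step, reward) pairs for a single item.
    The constant 1.5 is an assumption made for this sketch.
    """
    while history and history[0][0] <= t - window:
        history.popleft()  # forget observations that fell out of the window
    if not history:
        return float("inf")  # unobserved items are explored first
    mean = sum(reward for _, reward in history) / len(history)
    bonus = math.sqrt(1.5 * math.log(min(t, window) + 1) / len(history))
    return mean + bonus


def run(attraction_before, attraction_after, change_at, steps, k, window, seed=0):
    """Simulate `steps` rounds with one abrupt change; return the click count."""
    rng = random.Random(seed)
    items = list(attraction_before)
    histories = {item: deque() for item in items}
    clicks = 0
    for t in range(1, steps + 1):
        attraction = attraction_before if t < change_at else attraction_after
        ucbs = {i: sliding_window_ucb(histories[i], t, window) for i in items}
        ranking = sorted(items, key=lambda i: ucbs[i], reverse=True)[:k]
        click = cascade_click(ranking, attraction, rng)
        for item, reward in observed_feedback(ranking, click):
            histories[item].append((t, reward))
        clicks += click is not None
    return clicks
```

Calling `run({"a": 0.9, "b": 0.1, "c": 0.1}, {"a": 0.1, "b": 0.1, "c": 0.9}, change_at=50, steps=100, k=2, window=20)` simulates one preference change halfway through. The window lets old observations of item "a" expire, so the ranking can adapt after the change.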

• Chang Li and Maarten de Rijke. Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model. In IJCAI 2019: Twenty-Eighth International Joint Conference on Artificial Intelligence, page 2859–2865, August 2019. Bibtex, PDF
@inproceedings{li-2019-cascading,
Author = {Li, Chang and de Rijke, Maarten},
Booktitle = {IJCAI 2019: Twenty-Eighth International Joint Conference on Artificial Intelligence},
Date-Modified = {2019-08-04 15:53:45 +0200},
Month = {August},
Pages = {2859--2865},
Title = {Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model},
Year = {2019}}

Other papers and presentations at IJCAI are part of the SCAI workshop:

• Jiahuan Pei, Arent Stienstra, Julia Kiseleva, and Maarten de Rijke. SEntNet: Source-aware Recurrent Entity Networks for Dialogue Response Selection. In 4th International Workshop on Search-Oriented Conversational AI (SCAI), August 2019. Bibtex, PDF
@inproceedings{pei-2019-sentnet,
Author = {Pei, Jiahuan and Stienstra, Arent and Kiseleva, Julia and de Rijke, Maarten},
Booktitle = {4th International Workshop on Search-Oriented Conversational AI (SCAI)},
Date-Modified = {2019-06-06 11:56:16 +0200},
Month = {August},
Title = {SEntNet: Source-aware Recurrent Entity Networks for Dialogue Response Selection},
Year = {2019}}
• Yangjun Zhang, Pengjie Ren, and Maarten de Rijke. Improving Background Based Conversation with Context-aware Knowledge Pre-selection. In 4th International Workshop on Search-Oriented Conversational AI (SCAI), August 2019. Bibtex, PDF
@inproceedings{zhang-2019-improving,
Author = {Zhang, Yangjun and Ren, Pengjie and de Rijke, Maarten},
Booktitle = {4th International Workshop on Search-Oriented Conversational AI (SCAI)},
Date-Modified = {2019-06-06 11:55:01 +0200},
Month = {August},
Title = {Improving Background Based Conversation with Context-aware Knowledge Pre-selection},
Year = {2019}}

• Maarten de Rijke and Pengjie Ren. SERP-based Conversations. In 4th International Workshop on Search-Oriented Conversational AI (SCAI), August 2019.

Here’s our harvest for SIGIR 2019, which starts in less than 24 hours:

• Joris Baan, Maartje ter Hoeve, Marlies van der Wees, Anne Schuth, and Maarten de Rijke. Do Transformer Attention Heads Provide Transparency in Abstractive Summarization? In FACTS-IR: SIGIR 2019 Workshop on Fairness, Accountability, Confidentiality, Transparency and Safety in Information Retrieval, July 2019. Bibtex, PDF
@inproceedings{baan-2019-do,
Author = {Baan, Joris and ter Hoeve, Maartje and van der Wees, Marlies and Schuth, Anne and de Rijke, Maarten},
Booktitle = {FACTS-IR: SIGIR 2019 Workshop on Fairness, Accountability, Confidentiality, Transparency and Safety in Information Retrieval},
Date-Modified = {2019-05-31 22:38:02 +0200},
Month = {July},
Title = {Do Transformer Attention Heads Provide Transparency in Abstractive Summarization?},
Year = {2019}}
• Yifan Chen, Pengjie Ren, Yang Wang, and Maarten de Rijke. Bayesian Personalized Feature Interaction Selection for Factorization Machines. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 665–674. ACM, July 2019. Bibtex, PDF
@inproceedings{chen-2019-bayesian,
Author = {Chen, Yifan and Ren, Pengjie and Wang, Yang and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:58:43 +0200},
Month = {July},
Pages = {665--674},
Publisher = {ACM},
Title = {Bayesian Personalized Feature Interaction Selection for Factorization Machines},
Year = {2019}}
• Yang Fang, Xiang Zhao, Peixin Huang, Weidong Xiao, and Maarten de Rijke. M-HIN: Complex Embeddings for Heterogeneous Information Networks via Metagraphs. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 913–916. ACM, July 2019. Bibtex, PDF
@inproceedings{fang-2019-m-hin,
Author = {Fang, Yang and Zhao, Xiang and Huang, Peixin and Xiao, Weidong and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:59:43 +0200},
Month = {July},
Pages = {913--916},
Publisher = {ACM},
Title = {M-HIN: Complex Embeddings for Heterogeneous Information Networks via Metagraphs},
Year = {2019}}
• Rolf Jagerman, Harrie Oosterhuis, and Maarten de Rijke. To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 15–24. ACM, July 2019. Bibtex, PDF
@inproceedings{jagerman-2019-model,
Author = {Jagerman, Rolf and Oosterhuis, Harrie and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:58:05 +0200},
Month = {July},
Pages = {15--24},
Publisher = {ACM},
Title = {To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions},
Year = {2019}}
• Claudio Lucchese, Franco Maria Nardini, Rama Kumar Pasumarthi, Sebastian Bruch, Michael Bendersky, Xuanhui Wang, Harrie Oosterhuis, Rolf Jagerman, and Maarten de Rijke. Learning to Rank in Theory and Practice: From Gradient Boosting to Neural Networks and Unbiased Learning. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 1419–1420. ACM, July 2019. Bibtex, PDF
@inproceedings{lucchese-2019-learning,
Author = {Lucchese, Claudio and Nardini, Franco Maria and Pasumarthi, Rama Kumar and Bruch, Sebastian and Bendersky, Michael and Wang, Xuanhui and Oosterhuis, Harrie and Jagerman, Rolf and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 16:00:12 +0200},
Month = {July},
Pages = {1419--1420},
Title = {Learning to Rank in Theory and Practice: From Gradient Boosting to Neural Networks and Unbiased Learning},
Year = {2019}}
• Ana Lucic, Hinda Haned, and Maarten de Rijke. Explaining Predictions from Tree-based Boosting Ensembles. In FACTS-IR: SIGIR 2019 Workshop on Fairness, Accountability, Confidentiality, Transparency and Safety in Information Retrieval, July 2019. Bibtex, PDF
@inproceedings{lucic-2019-explaining,
Author = {Lucic, Ana and Haned, Hinda and de Rijke, Maarten},
Booktitle = {FACTS-IR: SIGIR 2019 Workshop on Fairness, Accountability, Confidentiality, Transparency and Safety in Information Retrieval},
Date-Modified = {2019-05-31 22:38:12 +0200},
Month = {July},
Title = {Explaining Predictions from Tree-based Boosting Ensembles},
Year = {2019}}
• Muyang Ma, Pengjie Ren, Yujie Lin, Zhumin Chen, Jun Ma, and Maarten de Rijke. $\pi$-Net: A Parallel Information-sharing Network for Cross-domain Shared-account Sequential Recommendations. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 685–694. ACM, July 2019. Bibtex, PDF
@inproceedings{ma-2019-pi-net,
Author = {Ma, Muyang and Ren, Pengjie and Lin, Yujie and Chen, Zhumin and Ma, Jun and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:59:03 +0200},
Month = {July},
Pages = {685--694},
Publisher = {ACM},
Title = {$\pi$-Net: A Parallel Information-sharing Network for Cross-domain Shared-account Sequential Recommendations},
Year = {2019}}
• Alexandra Olteanu, Jean Garcia-Gathright, Maarten de Rijke, and Michael D. Ekstrand. Workshop on Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval (FACTS-IR). In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 1423–1425. ACM, July 2019. Bibtex, PDF
@inproceedings{olteanu-2019-workshop,
Author = {Olteanu, Alexandra and Garcia-Gathright, Jean and de Rijke, Maarten and Ekstrand, Michael D.},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 16:00:36 +0200},
Month = {July},
Pages = {1423--1425},
Publisher = {ACM},
Title = {Workshop on Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval (FACTS-IR)},
Year = {2019}}
• Jiahuan Pei, Pengjie Ren, and Maarten de Rijke. A Modular Task-oriented Dialogue System Using a Neural Mixture-of-Experts. In WCIS: SIGIR 2019 Workshop on Conversational Interaction Systems. ACM, July 2019. Bibtex, PDF
@inproceedings{pei-2019-modular,
Author = {Pei, Jiahuan and Ren, Pengjie and de Rijke, Maarten},
Booktitle = {WCIS: SIGIR 2019 Workshop on Conversational Interaction Systems},
Date-Modified = {2019-07-14 12:24:23 +0200},
Month = {July},
Publisher = {ACM},
Title = {A Modular Task-oriented Dialogue System Using a Neural Mixture-of-Experts},
Year = {2019}}
• Taihua Shao, Fei Cai, Honghui Chen, and Maarten de Rijke. Length-adaptive Neural Network for Answer Selection. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 869–872. ACM, July 2019. Bibtex, PDF
@inproceedings{shao-2019-length-adaptive,
Author = {Shao, Taihua and Cai, Fei and Chen, Honghui and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:59:20 +0200},
Month = {July},
Pages = {869--872},
Publisher = {ACM},
Title = {Length-adaptive Neural Network for Answer Selection},
Year = {2019}}
• Meirui Wang, Pengjie Ren, Lei Mei, Zhumin Chen, Jun Ma, and Maarten de Rijke. A Collaborative Session-based Recommendation Approach with Parallel Memory Modules. In SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval, page 345–354. ACM, July 2019. Bibtex, PDF
@inproceedings{wang-2019-collaborative,
Author = {Wang, Meirui and Ren, Pengjie and Mei, Lei and Chen, Zhumin and Ma, Jun and de Rijke, Maarten},
Booktitle = {SIGIR 2019: 42nd international ACM SIGIR conference on Research and Development in Information Retrieval},
Date-Modified = {2019-08-02 15:58:23 +0200},
Month = {July},
Pages = {345--354},
Publisher = {ACM},
Title = {A Collaborative Session-based Recommendation Approach with Parallel Memory Modules},
Year = {2019}}

“Incremental sparse Bayesian ordinal regression” by Chang Li and Maarten de Rijke has been published in the October 2018 issue of Neural Networks. See the journal’s site.

Ordinal Regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high-dimensional non-linear space. However, most of the basis function-based algorithms are time-consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid big matrix inverses, which is the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.

Web-based Startup Success Prediction by Boris Sharchilev, Michael Roizner, Andrey Rumyantsev, Denis Ozornin, Pavel Serdyukov, and Maarten de Rijke is online now at this page.

In the paper we consider the problem of predicting the success of startup companies at their early development stages. We formulate the task as predicting whether a company that has already secured initial (seed or angel) funding will attract a further round of investment in a given period of time. Previous work on this task has mostly been restricted to mining structured data sources, such as databases of the startup ecosystem consisting of investors, incubators and startups. Instead, we investigate the potential of using web-based open sources for the startup success prediction task and model the task using a very rich set of signals from such sources. In particular, we enrich structured data about the startup ecosystem with information from a business- and employment-oriented social networking service and from the web in general. Using these signals, we train a robust machine learning pipeline encompassing multiple base models using gradient boosting. We show that utilizing companies’ mentions on the Web yields a substantial performance boost in comparison to only using structured data about the startup ecosystem. We also provide a thorough analysis of the obtained model that allows one to obtain insights into both the types of useful signals discoverable on the Web and market mechanisms underlying the funding process.

The paper will be presented at CIKM 2018 in October 2018.

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots by Shaojie Jiang and Maarten de Rijke is available online now at this location.

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
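For readers unfamiliar with the two remedies named at the end of the abstract, here is a minimal sketch of label smoothing and a confidence penalty for a single output token. The function names and the default values of `epsilon` and `beta` are assumptions made for illustration; the paper's own formulation may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_targets(target_index, vocab_size, epsilon=0.1):
    """One-hot target mixed with a uniform distribution over the vocabulary."""
    targets = [epsilon / vocab_size] * vocab_size
    targets[target_index] += 1.0 - epsilon
    return targets


def cross_entropy(probs, targets):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for p, t in zip(probs, targets) if t > 0)


def entropy(probs):
    """Shannon entropy of the predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def confidence_penalized_loss(probs, target_index, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy.

    Low-entropy (over-confident) outputs receive less of the entropy bonus,
    so they are penalized relative to flatter outputs.
    """
    return -math.log(probs[target_index]) - beta * entropy(probs)
```

With smoothed targets, an extremely peaked prediction such as `softmax([20, 0, 0, 0])` incurs a higher cross-entropy than a prediction that matches the smoothed distribution. This is how both techniques discourage over-confidence.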

The paper will be presented at the SCAI 2018 workshop at EMNLP 2018 in October 2018.

Differentiable Unbiased Online Learning to Rank by Harrie Oosterhuis and Maarten de Rijke is available online now at this location.

Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks and can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods both in terms of learning speed as well as final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows for non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible.

The paper will be presented at CIKM 2018 in October 2018.

Calibration: A Simple Way to Improve Click Models by Alexey Borisov, Julia Kiseleva, Ilya Markov, and Maarten de Rijke is available online now at this location.

In the paper we show that click models trained with suboptimal hyperparameters suffer from the issue of bad calibration. This means that their predicted click probabilities do not agree with the observed proportions of clicks in the held-out data. To repair this discrepancy, we adapt a non-parametric calibration method called isotonic regression. Our experimental results show that isotonic regression significantly improves click models trained with suboptimal hyperparameters in terms of perplexity, and that it makes click models less sensitive to the choice of hyperparameters. Interestingly, the relative ranking of existing click models in terms of their predictive performance changes depending on whether or not their predictions are calibrated. Therefore, we advocate that calibration becomes a mandatory part of the click model evaluation protocol.
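As a rough illustration of the calibration step described above, the sketch below fits an isotonic (monotonically non-decreasing) step function to held-out clicks with the pool-adjacent-violators algorithm. The function names and the step-function lookup are assumptions for this example, not the paper's implementation.

```python
from itertools import groupby


def isotonic_fit(predicted, clicked):
    """Pool-adjacent-violators fit of click outcomes on predicted probabilities.

    Returns (upper_threshold, calibrated_value) steps sorted by threshold.
    """
    pairs = sorted(zip(predicted, clicked))
    # Each block holds [sum of outcomes, count, largest predicted value].
    blocks = []
    for p, group in groupby(pairs, key=lambda pair: pair[0]):
        outcomes = [y for _, y in group]  # tied predictions are pooled first
        blocks.append([float(sum(outcomes)), len(outcomes), p])
        # Merge backwards while neighboring block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(steps, p):
    """Map a raw predicted click probability to its calibrated value."""
    for upper, value in steps:
        if p <= upper:
            return value
    return steps[-1][1]  # above the largest threshold: use the last step
```

For example, `isotonic_fit([0.2, 0.4, 0.6, 0.8], [0, 1, 0, 1])` pools the two middle observations, because a click at 0.4 followed by a skip at 0.6 violates monotonicity. Predictions in that range are then mapped to the pooled click rate.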

The paper will be presented at CIKM 2018 in October 2018.

Attentive Encoder-based Extractive Text Summarization by Chong Feng, Fei Cai, Honghui Chen, and Maarten de Rijke is available online now at this location.

In previous work on text summarization, encoder-decoder architectures and attention mechanisms have both been widely used. Attention-based encoder-decoder approaches typically focus on taking the sentences preceding a given sentence in a document into account for document representation, failing to capture the relationships between a sentence and sentences that follow it in a document in the encoder. We propose an attentive encoder-based summarization (AES) model to generate article summaries. AES can generate a rich document representation by considering both the global information of a document and the relationships of sentences in the document. A unidirectional recurrent neural network (RNN) and a bidirectional RNN are considered to construct the encoders, giving rise to unidirectional attentive encoder-based summarization (Uni-AES) and bidirectional attentive encoder-based summarization (Bi-AES), respectively. Our experimental results show that Bi-AES outperforms Uni-AES. We obtain substantial improvements over a relevant state-of-the-art baseline.

The paper will be presented at CIKM 2018 in October 2018.

Mix ‘n Match: Integrating Text Matching and Product Substitutability within Product Search by Christophe Van Gysel, Maarten de Rijke, and Evangelos Kanoulas is available online now at this location.

Two products are substitutes if both can satisfy the same consumer need. Intrinsic incorporation of product substitutability—where substitutability is integrated within latent vector space models—is in contrast to the extrinsic re-ranking of result lists. The fusion of text matching and product substitutability objectives allows latent vector space models to mix and match regularities contained within text descriptions and substitution relations. We introduce a method for intrinsically incorporating product substitutability within latent vector space models for product search that are estimated using gradient descent; it integrates flawlessly with state-of-the-art vector space models. We compare our method to existing methods for incorporating structural entity relations, where product substitutability is incorporated extrinsically by re-ranking. Our method outperforms the best extrinsic method on four benchmarks. We investigate the effect of different levels of text matching and product similarity objectives, and provide an analysis of the effect of incorporating product substitutability on product search ranking diversity. Incorporating product substitutability information improves search relevance at the cost of diversity.

The paper will be presented at CIKM 2018 in October 2018.