Vrij Nederland has an extensive piece on the use of social media and social media analysis by police and intelligence services. In it, I make some statements about the relative ease with which all kinds of personal characteristics can be inferred from social media, both from text and from other signals such as “likes.”
Author: mdr
ECIR 2014 proceedings online
With 10 days to go until ECIR 2014, the ECIR 2014 proceedings are now online. You can find information about them at the Springer web site or access the online version directly at this page. Now playing: Portico Quartet - Rubidium
HPC grant
My colleague Lars Buitinck and I received a grant from the HPC fund to support the development of a more scalable version of xTAS, our extensible text analysis service. It’s the pipeline that we (and others) use for our text mining work. This is great news as it allows us to port modules to the…
The autonomous search engine
After Tuesday’s talk on personal data mining, I gave another talk to non-experts on Thursday. This time the topic was “The Autonomous Search Engine”. The backbone of the story is the move from supervised to weakly supervised technology development of one of the core components of search engines: rankers. Weak supervision in this context means…
Life mining talk
I gave a talk aimed at the general public on personal data mining last night, in Maastricht. The talk explains what types of information can be mined from the content of open sources (news, social media, etc.) using state-of-the-art search and text mining technology. The focus is on extracting personal information,…
Microsoft PhD Fellowship
For a proposal entitled “Leveraging Data Reuse for Efficient Ranker Evaluation in Information Retrieval”, my colleague Shimon Whiteson and I received funding. The proposal was submitted to the Microsoft Research PhD Scholarship Programme. The project is a collaboration with Filip Radlinski and will run for three years, with a start planned in the fall. We’ll…
ECIR 2014 paper on blending vertical and web results online
“Blending Vertical and Web results: A Case Study using Video Intent” by Damien Lefortier, Pavel Serdyukov, Fedor Romanenko and Maarten de Rijke is available online now. Modern search engines aggregate results from specialized verticals into the Web search results. We study a setting where vertical and Web results are blended into a single result list, a setting…
ECIR 2014 paper on query-dependent contextualization of streaming data online
“Query-dependent contextualization of streaming data” by Nikos Voskarides, Daan Odijk, Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke is available online. We propose a method for linking entities in a stream of short textual documents that takes into account context both inside a document and inside the history of documents seen so far. Our method uses a…
ECIR 2014 paper on cluster-based fusion for microblog search online
“The impact of semantic document expansion on cluster-based fusion for microblog search” by Shangsong Liang, Zhaochun Ren and Maarten de Rijke is available online now. Searching microblog posts, with their limited length and creative language usage, is challenging. We frame the microblog search problem as a data fusion problem. We examine the effectiveness of a recent cluster-based…
ECIR 2014 paper on predicting new concepts in social streams online
“Generating Pseudo-ground Truth for Predicting New Concepts in Social Streams” by David Graus, Manos Tsagkias, Lars Buitinck and Maarten de Rijke is available online now. The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identification of nascent concepts is important for improving early entity linking, content interpretation, and…