energy in physical systems, the
meaning of an utterance cannot be
observed directly. Yet meaning
preservation can be
measured on naturally produced human
input-output data in tasks such as
translation, summarization and
paraphrasing. Physics discovered
energy conservation laws. We are yet
to discover adequate models of meaning
conservation without observing
meaning itself. I believe that meaning
(or its representations) must emerge
as a factor in meaning preserving
data without trees is as arid
as a desert. Linguistic grammars
without data are decorative plants.
Data is the soil on which we should
grow our trees. But if we can't see
the wood for the trees, might it not
be better to stay in the desert?
P.O. Box 94242
Science Park 107
+31 20 525 6573
+31 20 525 5206
of Computational Linguistics and Vici
Laureate. I lead the Statistical
Language Processing and Learning Lab
at the Institute
for Logic, Language and Computation (ILLC).
I work on statistical
learning for natural language processing
including Machine Translation, Syntactic
Parsing, Morphology and Semantics. See the
language processing and learning lab
for projects and the group's research.
I have a PhD degree in Computational
Linguistics from Utrecht
University and was a postdoctoral fellow
of the Royal
Netherlands Academy of Arts and
Sciences (KNAW) before joining the
ILLC-UvA as assistant professor in 2003.
I became associate professor at ILLC-UvA
in 2011, and full professor by the
summer of 2014. I have been a visiting
researcher at Technion (2000),
University of Maryland (2002), Johns
Hopkins University (summer workshops
2005) and CNGL at Dublin City University
(frequently 2003-2011). I am also a
laureate of the Vidi (2006) and Vici
(2013) personal funding competitions of the
Netherlands Organization for Scientific Research (NWO).
I serve on the Editorial Boards of Machine
Translation and the Journal
of Natural Language Engineering,
on the Standing Elite Reviewer
committee for TACL,
as Advisory Board member for John
Benjamins Publishers' book series on
Natural Language Processing, as co-opted
member of the EAMT
Executive Committee (since 2012), and as
Advisory Board member of the TRAMOOC
European project. I also
served as PC Chair for MT
Summit XIV (2013) and the International
Conference on Parsing Technologies
(2013), as Tutorial Co-chair for
EMNLP 2015 and as Area Chair for syntax and
parsing at ACL 2010.
2019: Methodology developed in project
is now deployed and tested by TAUS.
See the TAUS
Blog and UvA
News press releases.
version 1.1 with
Milos Stanojević performs very well
again as an evaluation metric at WMT
2015, both in correlation with human
judgements and as a tuning metric (now
available in MOSES as well).
members Milos Stanojević and Amir
Kamran co-organize the WMT 2015
metrics and tuning tasks (together
with Prague and other
metric with Milos
Stanojević: wins sentence-level
evaluation at WMT 2014.
Domain Data Selection for MT.
Fondazione Bruno Kessler (FBK),
Trento, Italy, Nov 16 2015.
Treebanks and Reordering Grammar.
Keynote given at the International
Conference on Recent Advances in
NLP (RANLP), Hissarya,
Bulgaria 7 Sep 2015.
QTLeap Project meeting, Prague
16 Apr 2015.
Treasures of Translation Data.
Science Faculty Colloquium,
UvA, 1 Dec 2014.
Structure in Statistical
Machine Translation, Tilburg
University, April 2014.
- Syntax in
Statistical Machine Translation
in Action, Max-Planck
Institute for Psycholinguistics,
Nijmegen, 6 November 2013.
Data Adaptation at TAUS
Industry Leaders Forum,
Dublin, 11 June 2013.
- Statistical Machine
- Statistical Parsing and
- Statistical Learning for
- Computational and Cognitive
Models of Language Processing
- Natural Language Engineering
More than formal
linguistics and logic: The case for
Prediction and Learning