Like energy in physical systems, the meaning of an utterance cannot be observed directly. Yet meaning preservation can be measured on naturally produced human input-output data in tasks such as translation, summarization and paraphrasing. Physics discovered energy conservation laws. We have yet to discover adequate models of meaning conservation without observing meaning itself. I believe that meaning (or its representations) must emerge as a factor in meaning-preserving models.

Language data without trees is as arid as a desert. Linguistic grammars without data are decorative plants. Data is the soil on which we should grow our trees. But if we can't see the wood for the trees, might it not be better to stay in the desert?


Postal address
P.O. Box 94242
1090 GE
The Netherlands

Visiting address
Room F2.06
Science Park 107
Building F
+31 20 525 6573

+31 20 525 5206
Khalil Sima'an
خليل سمعان

Computational Linguistics
Vici Laureate 2013

Khalil in Turkey
"I see (bi-)trees everywhere"

Professor of Computational Linguistics and Vici Laureate. I lead the Statistical Language Processing and Learning Lab at the Institute for Logic, Language and Computation, University of Amsterdam.

I am a computational linguist. My work is on statistical learning for natural language processing, including Machine Translation, Syntactic Parsing, Morphology and Semantics. See the Statistical Language Processing and Learning Lab for projects and the group's research.
I have a PhD degree in Computational Linguistics from Utrecht University and was a postdoctoral fellow of the Royal Netherlands Academy of Arts and Sciences (KNAW) before joining the ILLC-UvA as assistant professor in 2003. I became associate professor at ILLC-UvA in 2011, and full professor by the summer of 2014. I have been a visiting researcher at Technion (2000), University of Maryland (2002), Johns Hopkins University (summer workshops 2005) and CNGL at Dublin City University (frequently 2003-2011). I am also a laureate of the Vidi (2006) and Vici (2013) personal funding competitions of the Netherlands Organization for Scientific Research (NWO).

I serve on the Editorial Boards of Machine Translation and Journal of Natural Language Engineering, on the Standing Elite Reviewer committee for TACL, as Advisory Board member for John Benjamins Publishers' book series on Natural Language Processing, and as co-opted member of the EAMT Executive Committee (since 2012). I also served as PC Chair for MT Summit XIV (2013) and the International Conference on Parsing Technologies (2013), and as Area Chair for ACL 2010.


  • BEER@ILLC-UvA metric with Milos Stanojević: wins sentence-level evaluation at WMT 2014.
  • Recent invited talks
    • Latent Reordering Grammar. QTLeap Project meeting, Prague 16 Apr 2015.
    • The Treasures of Translation Data. Science Faculty Colloquium, UvA, 1 Dec 2014.
    • Hierarchical Structure in Statistical Machine Translation. Tilburg University, April 2014.
    • Syntax in Statistical Machine Translation. e-Humanities in Action, Max-Planck Institute for Psycholinguistics, Nijmegen, 6 November 2013.
    • Big Data Adaptation. TAUS Industry Leaders Forum, Dublin, 11 June 2013.

Research Topics
  • Statistical Machine Translation
  • Statistical Parsing and Probabilistic Grammars
  • Statistical Learning for structured models
  • Computational and Cognitive Models of Language Processing
  • Natural Language Engineering and Technology
More than formal linguistics and logic: The case for Prediction and Learning