Like energy in physical systems, the meaning of an utterance cannot be observed directly. Yet meaning preservation can be measured on naturally produced human input-output data in tasks such as translation, summarization and paraphrasing. Physics discovered energy conservation laws. We have yet to discover adequate models of meaning conservation without observing meaning itself. I believe that meaning (or its representations) must emerge as a factor in meaning-preserving models.

Language data without trees is as arid as a desert. Linguistic grammars without data are decorative plants. Data is the soil on which we should grow our trees. But if we can't see the wood for the trees, might it not be better to stay in the desert?


Postal address
P.O. Box 94242
1090 GE
The Netherlands

Visiting address
Room F2.06
Science Park 107
Building F
+31 20 525 6573

+31 20 525 5206
Khalil Sima'an
خليل سمعان

Computational Linguistics
Vici Laureate 2013

Khalil in Turkey
"I see (bi-)trees everywhere"

I am a professor of Computational Linguistics and a Vici Laureate. I lead the Statistical Language Processing and Learning Lab (SLPL) at the Institute for Logic, Language and Computation, University of Amsterdam.

I work on Computational Linguistics, Natural Language Processing and Artificial Intelligence. My focus is on statistical machine learning for natural language processing (NLP) tasks including machine translation, syntactic parsing, morphology and semantics. See the Statistical Language Processing and Learning Lab for projects and the group's research.
I have a PhD degree in Computational Linguistics from Utrecht University and was a postdoctoral fellow of the Royal Netherlands Academy of Arts and Sciences (KNAW) before joining the ILLC-UvA as assistant professor in 2003. I became associate professor at ILLC-UvA in 2011, and full professor by the summer of 2014. I have been a visiting researcher at Technion (2000), University of Maryland (2002), Johns Hopkins University (summer workshops 2005) and CNGL at Dublin City University (frequently 2003-2011). I am also a laureate of the Vidi (2006) and Vici (2013) personal funding competitions of the Netherlands Organization for Scientific Research (NWO).

I serve on the Editorial Boards of Machine Translation and the Journal of Natural Language Engineering, on the Standing Elite Reviewer committee for TACL, as an Advisory Board member for John Benjamins Publishers' book series on Natural Language Processing, as a co-opted member of the EAMT Executive Committee (since 2012), and as an Advisory Board member of the TRAMOOC European project. I also served as PC Chair for MT Summit XIV (2013) and the International Conference on Parsing Technologies (2013), as Tutorial Co-chair for EMNLP 2015, and as Area Chair for syntax and parsing at ACL 2010.


  • Jan 2019: The methodology developed in the DatAptor project is now deployed and tested by TAUS. See the TAUS Blog and UvA News press releases.
  • BEER@ILLC-UvA version 1.1 (with Milos Stanojević) again performs very well as an evaluation metric at WMT 2015, both in correlation with human judgements and as a tuning metric (now also available in MOSES).
  • SLPL members Milos Stanojević and Amir Kamran co-organize the WMT 2015 metrics and tuning tasks (together with colleagues from Prague and elsewhere).
  • BEER@ILLC-UvA metric (with Milos Stanojević) wins the sentence-level evaluation at WMT 2014.

Research Topics
  • Statistical Machine Translation
  • Statistical Parsing and Probabilistic Grammars
  • Statistical Learning for Structured Models
  • Computational and Cognitive Models of Language Processing
  • Natural Language Engineering and Technology
More than formal linguistics and logic: The case for Prediction and Learning