Refereed publications Khalil Sima'an

    • 2018
      • Miguel Rios, Wilker Aziz, Khalil Sima'an:
        Deep Generative Model for Joint Alignment and Word Representation. NAACL-HLT 2018: 1011-1023
      • Joost Bastings, Wilker Aziz, Ivan Titov, Khalil Sima'an.
        Modeling Latent Sentence Structure in Neural Machine Translation.
        Extended abstract at ACL's NMT workshop, 2018.
    • 2017
      • Gideon Maillette de Buy Wenniger, Khalil Sima'an and Andy Way. 2017. "Elastic-substitution decoding for Hierarchical SMT: efficiency, richer search and double labels." MT Summit. pages 201--215. September 2017.
      • Hoang Cuong and Khalil Sima'an. Induction of Latent Domains in Heterogeneous Corpora: A Case Study of Word Alignment. Machine Translation Journal 31(4): 225-249.
      • Hoang Cuong and Khalil Sima'an. A Survey of Domain Adaptation for Statistical Machine Translation. Machine Translation Journal 31(4): 187-224.
      • Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima’an (2017). Graph Convolutional Encoders for Syntax-aware Neural Machine Translation. EMNLP’17. Copenhagen, Denmark.
    • 2016
      • Joachim Daiber, Miloš Stanojević and Khalil Sima'an. Universal Reordering via Linguistic Typology. In Proceedings COLING 2016.
      • Miloš Stanojević and Khalil Sima'an. Hierarchical Permutation Complexity for Word Order Evaluation. In Proceedings COLING 2016.
      • Joachim Daiber, Miloš Stanojević, Wilker Aziz and Khalil Sima'an. Examining the Relationship between Preordering and Word Order Freedom in Machine Translation. In Proceedings of the First Conference on Machine Translation (WMT 2016), Berlin, August 2016.
      • Sophie Arnoult and Khalil Sima'an. Factoring Adjunction in Phrase-based SMT. Proceedings of the second workshop on Deep Machine Translation. Lisbon, 2016.

    • 2015
      • Hoang Cuong and Khalil Sima'an. Latent Domain Word Alignment for Heterogeneous Corpora. Proceedings HLT-NAACL 2015: 398-408.
      • Miloš Stanojević and Khalil Sima'an. Reordering Grammar Induction. Proceedings of the Conference on Empirical Methods in NLP 2015, EMNLP 2015, Lisboa, Portugal.
      • Miloš Stanojević and Khalil Sima'an. Evaluating MT systems with BEER. The Prague Bulletin of Mathematical Linguistics No. 104, 2015.
      • Miloš Stanojević and Khalil Sima'an. BEER 1.1: ILLC UvA submission to metrics and tuning task. The WMT 2015 Metrics and tuning tasks proceedings.
      • Joachim Daiber and Khalil Sima’an. Machine Translation with Source-Predicted Target Morphology. Proceedings of MT Summit XV. 2015. Miami, USA.
      • Joachim Daiber and Khalil Sima’an. Delimiting Morphosyntactic Search Space with Source-Side Reordering Models. Proceedings of the first Deep Machine Translation Workshop. 2015. Prague, Czech Republic.
      • Sophie Arnoult and Khalil Sima'an: Modelling the Adjunct/Argument Distinction in Hierarchical Phrase-Based SMT.  Proceedings of the first Deep Machine Translation Workshop. 2015. Prague, Czech Republic.
      • Constantin Orasan, Alessandro Cattelan, Gloria Corpas Pastor, Josef van Genabith, Manuel Herranz, Juan José Arevalillo, Qun Liu, Khalil Sima'an and Lucia Specia. The EXPERT project: Advancing the state of the art in hybrid translation technologies.  In proceedings Translating and the Computer 37, 2015.

    • 2014
      • Gideon Maillette de Buy Wenniger and Khalil Sima'an. Bilingual Markov Labels for Hierarchical SMT. Proceedings Workshop on SSST'2014 @ EMNLP 2014.
      • Miloš Stanojević and Khalil Sima'an. Fitting Sentence Level Translation Evaluation with Many Dense Features. Proceedings EMNLP 2014. Qatar.
      • Hoang Cuong and Khalil Sima'an. Latent Domain Phrase-Based Translation Models for Adaptation. Proceedings EMNLP 2014. Qatar.
      • Hoang Cuong and Khalil Sima'an. Latent Domain Translation Models in Mix-of-Domains Haystack. Proceedings COLING 2014. Dublin, Ireland.
      • Miloš Stanojević and Khalil Sima'an. BEER: Better Evaluation by Ranking. Proceedings Workshop on Statistical Machine Translation (WMT) 2014. Software for BEER: available for download (no additives, pure ingredients and straight from the tap!)
      • Sophie Arnoult and Khalil Sima'an. Translation Equivalence of Adjuncts. Proceedings Workshop on SSST'2014 @ EMNLP 2014.
      • Miloš Stanojević and Khalil Sima'an. Evaluating Word Order Recursively over Permutation Forests. Proceedings Workshop on SSST'2014 @ EMNLP 2014. Software also included in BEER: available for download (no additives, pure ingredients and straight from the tap!)
      • Joost Bastings and Khalil Sima'an: All Fragments Count in Parser Evaluation. Proceedings LREC 2014: 78-82. Software available for download from FREVAL: https://github.com/bastings/freval
      • Gideon Maillette de Buy Wenniger and Khalil Sima'an. Visualization, Search and Analysis of Hierarchical Translation Equivalence in Machine Translation Data. The Prague Bulletin of Mathematical Linguistics. Number 101, pages 43-54. April 2014. Software available for download https://bitbucket.org/teamwildtreechase/hatparsing
    • 2013
      • Gideon Maillette de Buy Wenniger and Khalil Sima'an. A Formal Characterization of Parsing Word Alignments by Synchronous Grammars with Empirical Evidence to the ITG Hypothesis. In proceedings of NAACL workshop on Syntax, Semantics and Structure in Statistical Translation (SSST), June 2013, Atlanta, USA.
      • Gideon Maillette de Buy Wenniger and Khalil Sima'an. Hierarchical Alignment Decomposition Labels for Hiero Grammar Rules. In proceedings of NAACL workshop on Syntax, Semantics and Structure in Statistical Translation (SSST), June 2013, Atlanta, USA.
      • Tejaswini Deoskar, Markos Mylonakis and Khalil Sima'an. Learning Structural Dependencies of Words in the Zipfian Tail. Journal of Logic and Computation, 2013. 
      • Khalil Sima'an and Gideon Maillette de Buy Wenniger. Hierarchical Alignment Trees: A Recursive Factorization of Reordering in Word Alignments with Empirical Results. Technical report.
    • 2012 (on sabbatical January-August)
      • Maxim Khalilov and Khalil Sima'an. Statistical Translation After Source Reordering: Oracles, Context-Aware Models and Empirical Analysis.  Journal of Natural Language Engineering (JNLE), to appear 2012.
      • Sophie Arnoult and Khalil Sima'an. Adjunct Alignment in Translation Data with an Application to Phrase-Based Statistical Machine Translation. European Association for Machine Translation. Trento, Italy, May 2012.
    • 2011
      • Tejaswini Deoskar, Markos Mylonakis and Khalil Sima'an. Learning Structural Dependencies of Words in the Zipfian Tail. International Conf. on Parsing Technologies (IWPT 2011), Dublin, Ireland.
      • Markos Mylonakis and Khalil Sima'an. Learning Hierarchical Translation Structure with Linguistic Annotations. In the Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL:HLT 2011).
      • Hany Hassan, Khalil Sima'an and Andy Way. Efficient Accurate Direct Translation Models: One Tree at a Time. Machine Translation Journal, Springer. 2011.
      • Maxim Khalilov and Khalil Sima'an. Context-Sensitive Syntactic Source-Reordering by Statistical Transduction. Proc. of the 5th International Joint Conference on Natural Language Processing (IJCNLP'11), pages - to appear, Chiang Mai (Thailand), November 2011.
      • Maxim Khalilov and Khalil Sima'an. ILLC-UvA translation system for EMNLP-WMT 2011. Proc. of the EMNLP 2011 5th Workshop on Statistical Machine Translation (WMT'11), pages - to appear, Edinburgh (UK), July 2011.
    • 2010
      • Maxim Khalilov and Khalil Sima'an. ILLC-UvA machine translation system for the IWSLT 2010 evaluation. Proc. of the 7th Int. Workshop on Spoken Language Translation (IWSLT'10), Paris (France), December 2010.
      • Markos Mylonakis and Khalil Sima'an. Learning Probabilistic Synchronous CFGs for Phrase Translation Models. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning (CoNLL 2010), Uppsala, Sweden, July 2010.
      • Gideon Maillette de Buy Wenniger, Maxim Khalilov and Khalil Sima'an. A Toolkit for Visualizing the Coherence of Tree-based Reordering with Word-Alignments. In The Prague Bulletin of Mathematical Linguistics, Charles University Prague, 2010.
      • Reut Tsarfaty and Khalil Sima'an. Modeling Morphosyntactic Agreement for Constituency-Based Parsing of Modern Hebrew. In: Proceedings of the first workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL) at NAACL. Los Angeles, CA, USA, June 6, 2010.
      • Maxim Khalilov and Khalil Sima'an. A discriminative syntactic model for source permutation via tree transduction. Proc. of The Fourth Workshop on Syntax and Structure in Statistical Translation (SSST-4) at the 23rd International Conference on Computational Linguistics (COLING'10), pages, Beijing (China), August 2010.
      • Maxim Khalilov and Khalil Sima'an. Source reordering using MaxEnt classifiers and supertags. Proc. of the 14th Annual Conference of the European Association for Machine Translation (EAMT'10), pp. 292-299, St. Raphael (France), 2010.
    • 2009
      • Reut Tsarfaty and Khalil Sima'an. Evaluating an Alternative to Head-Driven Approaches to Parsing a (Relatively) Free Word-Order Language. In Proceedings of the Conference on Empirical Methods in NLP (EMNLP'09), Singapore.
      • Hany Hassan, Khalil Sima'an and Andy Way. A Syntactified Direct Translation Model with Linear-Time Decoding. In Proceedings of the Conference on Empirical Methods in NLP (EMNLP'09), Singapore.
      • Hany Hassan, Khalil Sima'an and Andy Way. Lexicalized Semi-Incremental Dependency Parsing. In proceedings Recent Advances in NLP (RANLP'09), Borovets, Bulgaria.
      • Tejaswini Deoskar, Mats Rooth and Khalil Sima'an. Smoothing fine-grained PCFG Lexicons. Proceedings International Conference on Parsing Technologies, Oct 2009.
    • 2008
      • Khalil Sima'an and Markos Mylonakis. Better Statistical Estimation Can Benefit All Phrases in Phrase-Based Statistical Machine Translation. In Proceedings IEEE Workshop on Spoken Language Technology (SLT) 2008, Goa, India.
      • Hany Hassan, Khalil Sima'an and Andy Way. A Syntactic Language Model based on Incremental CCG Parsing. In Proceedings IEEE Workshop on Spoken Language Technology (SLT) 2008, Goa, India.
      • Markos Mylonakis and Khalil Sima'an. Phrase Translation Probabilities with ITG Priors and Smoothing as Learning Objective. In Proceedings Conf. on Empirical Methods in NLP (EMNLP'08), 2008.
      • Barbara Plank and Khalil Sima'an. Parsing with Subdomain Instance Weighting from Raw Corpora. In proceedings Interspeech 2008, Australia, Sep. 2008.
      • Reut Tsarfaty and Khalil Sima'an. Relational Realizational Parsing. In proceedings COLING 2008, Manchester, UK, August 2008.
      • Hany Hassan, Khalil Sima'an and Andy Way. Syntactically Lexicalized Phrase-Based Statistical Translation. In IEEE Transactions on Audio, Speech and Language Processing, August 2008.
      • Barbara Plank and Khalil Sima'an. Subdomain Sensitive Statistical Parsing using Raw Corpora. In Proceedings sixth International conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco.
      • Roy Bar-Haim, Khalil Sima'an and Yoad Winter. Part-of-Speech Tagging of Modern Hebrew Text.  Journal of Natural Language Engineering (J-NLE), 14(2):223-251, 2008.
    • 2007
      • Markos Mylonakis,  Khalil Sima'an and R. Hwa.  Unsupervised Estimation for Noisy-Channel Models.  In 24th Annual International Conference on Machine Learning (ICML 2007).
      • Hany Hassan, Khalil Sima'an and Andy Way. Supertagged Phrase-Based Statistical Machine Translation. In Proceedings of 45th Annual Meeting of the Association for Comp. Linguistics (ACL'07). 
      • Reut Tsarfaty and Khalil Sima'an. Accurate Unlexicalized Parsing for Modern Hebrew. In Proceedings of Text, Speech and Dialog (TSD'07). Lecture Notes in Computer Science (LNCS). Pilsen, Czech Republic, September 2007.
      • Reut Tsarfaty and Khalil Sima'an. Three-Dimensional Parametrization for Parsing Morphologically Rich Languages.  In Proceedings of the International Conference on Parsing Technologies (IWPT'07). Prague, Czech Republic, June 2007. 
      • Saib Mansour, Khalil Sima'an and Yoad Winter. Smoothing a Lexicon-based POS tagger for Arabic and Hebrew. In proceedings of ACL 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources. Prague, Czech Republic, 2007. Also presented as an extended abstract at Bar Ilan Symposium on Artificial Intelligence (BISFAI 2007).
      • Markos Mylonakis and Khalil Sima'an. Translation Lexicon Estimates from Non-Parallel Corpora Pairs. In Proceedings Belgian-Netherlands AI Conference (BNAIC), Utrecht, 2007. BNAIC'07 Best Paper Award.
      • Reut Tsarfaty and Khalil Sima'an. Dimensions of Parameterization for Modern Hebrew Statistical Parsing. Extended abstract at Bar Ilan Symposium on Artificial Intelligence (2007).