- 2018
- Miguel Rios, Wilker Aziz, Khalil
Sima'an:
Deep Generative Model for Joint Alignment and Word
Representation. NAACL-HLT 2018: 1011-1023
- Joost Bastings, Wilker Aziz, Ivan Titov, Khalil
Sima'an.
Modeling Latent Sentence Structure in Neural Machine
Translation.
Extended abstract at ACL's NMT workshop,
2018.
- 2017
- Gideon Maillette de Buy Wenniger, Khalil Sima'an
and Andy Way. 2017. "Elastic-substitution decoding
for Hierarchical SMT: efficiency, richer search and
double labels." MT Summit. pages 201--215. September
2017.
- Hoang Cuong and Khalil Sima'an. Induction of
Latent Domains in Heterogeneous Corpora: A Case
Study of Word Alignment. Machine Translation
Journal
31(4): 225-249 (2017).
- Hoang Cuong and Khalil Sima'an. A Survey of Domain
Adaptation for Statistical Machine Translation.
Machine Translation Journal
31(4): 187-224 (2017).
- Joost Bastings, Ivan Titov, Wilker Aziz, Diego
Marcheggiani, Khalil Sima’an (2017). Graph
Convolutional Encoders for Syntax-aware Neural
Machine Translation. EMNLP’17. Copenhagen,
Denmark.
- 2016
- Joachim Daiber, Miloš Stanojević and Khalil
Sima'an. Universal
Reordering via Linguistic Typology. In Proceedings
COLING 2016.
- Miloš Stanojević and Khalil Sima'an. Hierarchical
Permutation Complexity for Word Order Evaluation. In
Proceedings COLING 2016.
- Joachim Daiber, Miloš Stanojević, Wilker Aziz and
Khalil Sima'an. Examining the Relationship between
Preordering and Word Order Freedom in Machine
Translation. In proceedings First Conference on
Statistical Machine Translation (WMT 2016), Berlin,
August 2016.
- Sophie Arnoult and Khalil Sima'an. Factoring
Adjunction in Phrase-based SMT. Proceedings of the
second workshop on Deep Machine Translation. Lisbon,
2016.
- Philip Schulz, Wilker Aziz and Khalil Sima'an.
Word Alignment without NULL Words. Proceedings ACL
2016, Berlin, August 2016.
-
Lucia
Specia, Stella Frank, Khalil Sima’an and Desmond
Elliott. 2016. A
Shared Task on Multimodal Machine Translation
and Crosslingual Image Description. In
Proceedings of the First Conference on Statistical
Machine Translation (WMT).
-
Desmond
Elliott, Stella Frank, Khalil Sima’an, and Lucia
Specia.
Multi30K: Multilingual English-German Image
Descriptions. 2016. In Proceedings of the
5th Workshop on Vision and Language (VL’16).
- Hoang Cuong, Stella Frank and Khalil Sima'an.
ILLC-UvA Adaptation System (Scorpio) at WMT'16
IT-DOMAIN Task. In proceedings First Conference on
Statistical Machine Translation (WMT 2016), Berlin,
August 2016.
- Hoang Cuong, Khalil Sima'an and Ivan Titov.
Adapting to All Domains at Once: Rewarding Domain
Invariance in SMT. Transactions
of the Association for Computational Linguistics
(TACL) 2016.
- Gideon Maillette de Buy Wenniger and Khalil
Sima'an. Labeling Hierarchical Phrase-Based Models
without Linguistic Resources. Machine Translation
Journal. Online
January 2016.
- 2015
- Hoang Cuong and Khalil Sima'an. Latent Domain Word
Alignment for Heterogeneous Corpora. Proceedings HLT-NAACL 2015: 398-408.
- Miloš Stanojević and Khalil Sima'an. Reordering
Grammar Induction. Proceedings of the Conference on
Empirical Methods in NLP 2015, EMNLP 2015,
Lisboa, Portugal.
- Miloš Stanojević and Khalil Sima'an. Evaluating
MT systems with BEER. The Prague Bulletin of
Mathematical Linguistics No. 104, 2015.
- Miloš Stanojević and Khalil Sima'an. BEER 1.1:
ILLC UvA submission to metrics and tuning task. The
WMT 2015 Metrics and tuning tasks proceedings.
- Joachim Daiber and Khalil
Sima’an. Machine Translation
with Source-Predicted Target Morphology. Proceedings
of MT Summit XV. 2015. Miami, USA.
- Joachim Daiber and Khalil
Sima’an. Delimiting
Morphosyntactic Search Space with Source-Side
Reordering Models. Proceedings of the
first Deep Machine Translation Workshop. 2015.
Prague, Czech Republic.
- Sophie Arnoult and Khalil
Sima'an: Modelling the Adjunct/Argument
Distinction in Hierarchical Phrase-Based SMT.
Proceedings of the first Deep Machine Translation
Workshop. 2015. Prague, Czech Republic.
- Constantin Orasan, Alessandro
Cattelan, Gloria Corpas Pastor, Josef van Genabith,
Manuel Herranz, Juan José Arevalillo, Qun Liu,
Khalil Sima'an and Lucia Specia. The EXPERT project:
Advancing the state of the art in hybrid translation
technologies. In proceedings Translating and
the Computer 37, 2015.
- 2014
- Gideon Maillette de Buy Wenniger and Khalil
Sima'an. Bilingual Markov Labels for Hierarchical
SMT. Proceedings Workshop on SSST'2014 @ EMNLP 2014.
- Miloš Stanojević and Khalil Sima'an. Fitting
Sentence Level Translation Evaluation with Many
Dense Features. Proceedings EMNLP 2014. Qatar.
- Hoang Cuong and Khalil Sima'an. Latent Domain
Phrase-Based Translation Models for Adaptation.
Proceedings EMNLP 2014. Qatar.
- Hoang Cuong and Khalil Sima'an. Latent Domain
Translation Models in Mix-of-Domains Haystack.
Proceedings COLING 2014. Dublin, Ireland.
- Miloš Stanojević and Khalil Sima'an. BEER: Better
Evaluation by Ranking. Proceedings Workshop on
Statistical Machine Translation (WMT) 2014. Software
for Beer:
available for download (no additives, pure
ingredients and straight from the tap!)
- Sophie Arnoult and Khalil Sima'an. Translation
Equivalence of Adjuncts. Proceedings Workshop on
SSST'2014 @ EMNLP 2014.
- Miloš Stanojević and Khalil Sima'an. Evaluating
Word Order Recursively over Permutation
Forests. Proceedings Workshop on SSST'2014 @ EMNLP
2014. Software also included in Beer:
available for download (no additives, pure
ingredients and straight from the tap!)
- Joost Bastings and Khalil Sima'an: All Fragments
Count in Parser Evaluation. Proceedings LREC 2014:
78-82. Software available for download from
FREVAL: https://github.com/bastings/freval
- Gideon Maillette de Buy Wenniger and Khalil
Sima'an. Visualization, Search and Analysis of
Hierarchical Translation Equivalence in Machine
Translation Data. The Prague Bulletin of
Mathematical Linguistics. Number 101, pages 43-54.
April 2014. Software available for download https://bitbucket.org/teamwildtreechase/hatparsing
- 2013
- Gideon Maillette de Buy Wenniger and Khalil
Sima'an. A Formal Characterization of Parsing Word
Alignments by Synchronous Grammars with Empirical
Evidence to the ITG Hypothesis. In proceedings of
NAACL workshop on Syntax, Semantics and Structure in
Statistical Translation (SSST), June 2013, Atlanta,
USA.
- Gideon Maillette de Buy Wenniger and Khalil
Sima'an. Hierarchical Alignment Decomposition Labels
for Hiero Grammar Rules. In proceedings of NAACL
workshop on Syntax, Semantics and Structure in
Statistical Translation (SSST), June 2013, Atlanta,
USA.
- Tejaswini Deoskar, Markos Mylonakis and Khalil
Sima'an. Learning Structural Dependencies of Words
in the Zipfian Tail. Journal of Logic and
Computation, 2013.
- Khalil Sima'an and Gideon Maillette de Buy
Wenniger. Hierarchical Alignment Trees: A Recursive
Factorization of Reordering in Word Alignments with
Empirical Results. Technical report.
- 2012 (on sabbatical January-August)
- Maxim Khalilov and Khalil Sima'an. Statistical
Translation After Source Reordering: Oracles,
Context-Aware Models and Empirical Analysis.
Journal of Natural Language Engineering (JNLE), to
appear 2012.
- Sophie Arnoult and Khalil Sima'an. Adjunct
Alignment in Translation Data with an Application to
Phrase-Based Statistical Machine Translation.
European Association for Machine Translation.
Trento, Italy, May 2012.
- 2011
- Tejaswini Deoskar, Markos Mylonakis and Khalil
Sima'an. Learning Structural Dependencies of Words
in the Zipfian Tail. International Conf. on Parsing
Technologies (IWPT 2011), Dublin, Ireland.
- Markos Mylonakis and Khalil Sima'an. Learning
Hierarchical Translation Structure with Linguistic
Annotations. In the Proceedings of the 49th Annual
Meeting of the Association for Computational
Linguistics: Human Language Technologies (ACL:HLT
2011).
- Hany Hassan, Khalil Sima'an and Andy Way.
Efficient Accurate Direct Translation Models: One
Tree at a Time. Machine Translation Journal,
Springer. 2011.
- Maxim Khalilov and Khalil Sima'an.
Context-Sensitive Syntactic Source-Reordering by
Statistical Transduction. Proc. of the 5th
International Joint Conference on Natural Language
Processing (IJCNLP'11), pages - to appear, Chiang
Mai (Thailand), November 2011.

- Maxim Khalilov and Khalil Sima'an. ILLC-UvA
translation system for EMNLP-WMT 2011. Proc. of the
EMNLP 2011 5th Workshop on Statistical Machine
Translation (WMT'11), pages - to appear, Edinburgh
(UK), July 2011.
- 2010
- Maxim Khalilov and Khalil Sima'an. ILLC-UvA
machine translation system for the IWSLT 2010
evaluation. Proc. of the 7th Int. Workshop on Spoken
Language Translation (IWSLT'10), Paris (France),
December 2010.
- Markos Mylonakis and Khalil Sima'an. Learning
Probabilistic Synchronous CFGs for Phrase
Translation Models.
In Proceedings of the Fourteenth Conference on
Computational Natural Language Learning (CoNLL
2010), Uppsala, Sweden, July 2010.
- Gideon Maillette de Buy Wenniger, Maxim Khalilov
and Khalil Sima'an. A Toolkit for Visualizing the
Coherence of Tree-based Reordering with
Word-Alignments. In The Prague Bulletin of
Mathematical Linguistics, Charles University Prague,
2010.
- Reut Tsarfaty and Khalil Sima'an. Modeling
Morphosyntactic Agreement for Constituency-Based
Parsing of Modern Hebrew. In: Proceedings of the
first workshop on Statistical Parsing of
Morphologically Rich Languages (SPMRL) at NAACL.
Los Angeles, CA, USA, June 6, 2010.
- Maxim Khalilov and Khalil Sima'an. A
discriminative syntactic model for source
permutation via tree transduction. Proc. of The
Fourth Workshop on Syntax and Structure in
Statistical Translation (SSST-4) at the 23rd
International Conference on Computational
Linguistics (COLING'10), Beijing (China),
August 2010.
- Maxim Khalilov and Khalil Sima'an. Source
reordering using MaxEnt classifiers and supertags.
Proc. of the 14th Annual Conference of the European
Association for Machine Translation (EAMT'10), pp.
292-299, St. Raphael (France), 2010.
- 2009
- Reut Tsarfaty and Khalil Sima'an. Evaluating an
Alternative to Head-Driven Approaches to Parsing a
(Relatively) Free Word-Order Language. In
Proceedings of the Conference on Empirical Methods
in NLP (EMNLP'09), Singapore.
- Hany Hassan, Khalil Sima'an and Andy Way. A
Syntactified Direct Translation Model with
Linear-Time Decoding. In Proceedings of the
Conference on Empirical Methods in NLP (EMNLP'09),
Singapore.
- Hany Hassan, Khalil Sima'an and Andy Way.
Lexicalized Semi-Incremental Dependency Parsing. In
proceedings Recent Advances in NLP (RANLP'09),
Borovets, Bulgaria.
- Tejaswini Deoskar, Mats Rooth and Khalil Sima'an.
Smoothing fine-grained PCFG Lexicons. Proceedings
International Conference on Parsing Technologies,
Oct 2009.
- 2008
- Khalil Sima'an and Markos Mylonakis. Better
Statistical Estimation Can Benefit All Phrases in
Phrase-Based Statistical Machine Translation. In Proceedings IEEE
Workshop on Spoken Language Technology (SLT) 2008,
Goa, India.
- Hany Hassan, Khalil Sima'an and Andy Way. A
Syntactic Language Model based on Incremental CCG
Parsing. In Proceedings
IEEE Workshop on Spoken Language Technology (SLT)
2008, Goa, India.
-
Markos Mylonakis and Khalil Sima'an. Phrase
Translation Probabilities with ITG Priors and
Smoothing as Learning Objective. In Proceedings Conf.
on Empirical Methods in NLP (EMNLP'08),
2008.
- Barbara Plank and Khalil Sima'an. Parsing with
Subdomain Instance Weighting from Raw Corpora. In
proceedings Interspeech
2008, Australia, Sep. 2008.
- Reut Tsarfaty and Khalil Sima'an. Relational
Realizational Parsing. In proceedings COLING 2008,
Manchester, UK, August 2008.
- Hany Hassan, Khalil Sima'an and Andy Way.
Syntactically Lexicalized Phrase-Based Statistical
Translation. In IEEE Transactions
on Audio, Speech and Language Processing, August
2008.
- Barbara Plank and Khalil Sima'an. Subdomain
Sensitive Statistical Parsing using Raw Corpora. In
Proceedings sixth International conference on
Language Resources and Evaluation (LREC'08),
Marrakech, Morocco.
- Roy Bar-Haim, Khalil Sima'an and Yoad Winter.
Part-of-Speech Tagging of Modern Hebrew Text.
Journal of Natural Language Engineering (J-NLE),
14(2):223-251, 2008.
- 2007
- Markos Mylonakis, Khalil Sima'an and R.
Hwa. Unsupervised Estimation for Noisy-Channel
Models. In
24th Annual International Conference on Machine
Learning (ICML
2007).
- Hany Hassan, Khalil Sima'an and Andy Way. Supertagged
Phrase-Based Statistical Machine Translation. In
Proceedings of 45th Annual Meeting of the
Association for Comp. Linguistics (ACL'07).
- Reut Tsarfaty and Khalil Sima'an. Accurate
Unlexicalized Parsing for Modern Hebrew. In
Proceedings of Text, Speech and Dialog (TSD'07).
Lecture Notes in Computer Science (LNCS). Pilsen,
Czech Republic, September 2007.
- Reut Tsarfaty and Khalil Sima'an.
Three-Dimensional Parametrization for Parsing
Morphologically Rich Languages. In Proceedings
of the International Conference on Parsing
Technologies (IWPT'07). Prague, Czech Republic, June
2007.
- Saib Mansour, Khalil Sima'an and Yoad Winter.
Smoothing a Lexicon-based POS tagger for Arabic and
Hebrew. In proceedings of ACL 2007
Workshop on Computational Approaches to Semitic
Languages: Common Issues and Resources. Prague,
Czech Republic, 2007. Presented also as extended
abstract at Bar Ilan Symposium on Artificial
Intelligence (BISFAI 2007).
- Markos Mylonakis and Khalil Sima'an. Translation
Lexicon Estimates from Non-Parallel Corpora
Pairs. In Proceedings Belgian-Netherlands AI
Conference (BNAIC), Utrecht, 2007. BNAIC'07 Best
Paper Award.
- Reut Tsarfaty and Khalil Sima'an. Dimensions of
Parameterization for Modern Hebrew Statistical
Parsing. Extended abstract at Bar Ilan Symposium on
Artificial Intelligence (2007).
- 2006
- Khalil Sima'an, Maarten de Rijke, Remko Scha and
Rob van Son (eds.) Proceedings
of 16th Computational Linguistics in the
Netherlands (selected papers from CLIN 2005
meeting). Edited volume, December 2006.
- Khalil Sima'an. Book review of Applied
Combinators on Words (M. Lothaire). In
Computational Linguistics (Vol 32, No. 3), Briefly
Noted Section, 2006
- Hany
Hassan, Mary Hearne, Khalil Sima'an and
Andy Way. Syntactic
Phrase-based
Statistical Machine Translation.
Proceedings IEEE/ACL first
International Workshop on Spoken Language
Technology (SLT), December 2006, Aruba.
- Rebecca
Hwa, Carol Nichols and Khalil
Sima'an. Corpus
Variations for Translation Lexicon Induction.
In proceedings of the Association for Machine
Translation in the Americas (AMTA 2006).
- D.
Prescher, Remko Scha, Khalil Sima'an and Andreas
Zollmann. What are
Treebank Grammars? In proceedings
of the Belgian-Netherlands Artificial
Intelligence Conference (BNAIC), Namur,
Belgium, 2006.
- 2005 and before
- Andreas Zollmann and Khalil Sima'an. A
Consistent and Efficient Estimator for
Data-Oriented Parsing. Journal of Automata,
Languages and Combinatorics (JALC), Vol. 10 (2005)
Number 2/3, pages 367-388. In short: As far as I
know, this is the first proof of consistency (in the limit) of
an estimator for models based on probabilistic
grammars. Building on (Sima'an & Buratto 2003), the
estimation by smoothing approach for DOP is enhanced
here by held-out estimation (and leave-one-out),
which leads to efficiency gains. For simplicity and
efficiency reasons, the DOP* estimator is restricted
to shortest-derivations instead of EM-training.
- Roy Bar-Haim, Khalil Sima'an and Yoad
Winter. Choosing
an
Optimal
Architecture for Segmentation and POS-Tagging of
Modern Hebrew. In proceedings of
ACL 2005 Workshop on Computational Approaches to
Semitic Languages.
MorphTagger
- D.
Prescher, Remko Scha, Khalil
Sima'an, Andreas Zollmann. On the
Statistical Consistency of DOP Estimators. In
Proceedings of the 14th Meeting of Computational
Linguistics in the Netherlands (CLIN), 15 pages.
Antwerp, Belgium. In short: We prove that unbiased
statistical estimators for all-fragments DOP models
necessarily lead to overfitting. We also prove that
all-fragment DOP models can capture any distribution
over sentence-parse pairs (which PCFGs cannot do).
Finally we show that all-fragments DOP must be
treated as a nonparametric learning model with
consistent estimators to be found in smoothing
techniques, just like other nonparametric and
memory-based methods. This paper provides the
theoretical and mathematical foundation for the
previous papers [Sima'an and Buratto 2003; Hearne
and Sima'an 2003].
- Mary Hearne and Khalil Sima'an. Structured
Parameter Estimation for LFG-DOP (pre-publication
version). In Recent Advances in Natural
Language Processing III. N. Nicolov, K. Bontcheva,
G. Angelova and R. Mitkov (eds). Current
Issues
in
Linguistic Theory 260. John Benjamins
Publishing Company.
- Khalil Sima'an. Robust
Data-Oriented
Understanding
of Spoken Utterances. In H. Bunt, J.
Carroll and G. Satta (eds.), New Developments in
Parsing Technologies, pages 323-338,
Kluwer (2004). (In short: Statistical
parsing of word-lattices output by a
speech-recognizer under a DOP-like language model
enriched with Lambda-calculus style update
semantics).
- O. Tsur, M. de Rijke, and Khalil Sima'an. BioGrapher: Biography Questions as
a Restricted Domain Question Answering Task.
In Proceedings of the ACL 2004 Workshop on Question
Answering in Restricted Domains, 2004.
- Rens Bod, Remko Scha and Khalil Sima'an (eds.). Data-Oriented
Parsing (edited volume) (410 pp.)
Studies in Computational Linguistics, CSLI
Publications, University of Chicago Press,
2003. Contributions by Rens Bod, Remko Bonnema, John
Carroll, Jean-Cédric Chappelier, David Chiang, Ido
Dagan, Guy De Pauw, Joshua Goodman, Lars Hoogweg,
Aravind Joshi, Ronald Kaplan, Yuval Krymolowski,
Günter Neumann, Arjen Poutsma, Martin Rajman, Anoop
Sarkar, Remko Scha, Khalil Sima'an, Srinivas
Bangalore, Andy Way, David Weir and Menno van
Zaanen.
- Rens Bod, Remko Scha and Khalil Sima'an:
"Introduction." In: Rens Bod, Remko Scha and Khalil
Sima'an (eds.): Data-Oriented Parsing. Stanford:
CSLI Publications, 2003, pp. 1-9.
- Khalil Sima'an and L. Buratto. Backoff
Parameter Estimation for the DOP Model. In N.
Lavrac, D. Gamberger, H. Blockeel and L. Todorovski
(eds.). Proceedings of the European Conference on
Machine Learning (ECML'03), Lecture Notes in
Artificial Intelligence (LNAI 2837), pages 373-384,
Springer, 2003. In short: The first consistent
estimators for the DOP model are presented in this
paper. It is shown how the DOP model parameters can
be structured into a connected, directed acyclic
graph that allows parameter estimation by smoothing
(just like K-NN approaches) using backoff or
interpolation. These insights can be transferred to
the estimation of other models in NLP and possibly
other areas.
- M. Hearne and Khalil Sima'an. Structured
Parameter
Estimation
for LFG-DOP by Backoff. In Proceedings of
International Conference on Recent Advances in
Natural Language Processing (RANLP'03),
Bulgaria, 2003.
- L. Buratto and Khalil Sima'an. Backoff
DOP:
Parameter
Estimation by Backoff. In V.
Matousek and P. Mautner (eds.). Proceedings of the
International Conference on Text, Speech and
Dialogue (TSD'03), Lecture Notes in Artificial
Intelligence (LNAI 2807), Springer, 2003 (8
pages).
-
Khalil Sima'an. On
Maximizing
Metrics
for Syntactic Disambiguation. In
Proceedings of the International Workshop on Parsing
Technologies (IWPT'03). Nancy, France,
April 2003. In short: In this paper I extend
the "Maximizing Metrics" (MM) disambiguation
idea (Joshua Goodman, PhD thesis) to linguistically
more adequate metrics and algorithms, showing that
optimizing stricter metrics could provide better
results than optimizing a weak evaluation metric.
The MM method is also known in
speech recognition under the names "Maximum-Expected
Recall" (Goodman),
"Minimum Risk Decoding" (Goel
and Byrne), and "Word Error Minimization" (Stolcke
et al 1997). One of the algorithms described
in this paper is referred to as the "Max-Rule"
algorithm in recent work (e.g.,
Matsuzaki et al 2005; Petrov
et al 2008).
- Khalil Sima'an.
Empirical
validity and technological viability:
Probabilistic models of Natural Language
Processing. In R. Bernardi and M.
Moortgat (eds.), Linguistic
Corpora
and
Logic Based Grammar Formalisms,
CoLogNET Area 6, 2003
- Khalil Sima'an.
Computational
Complexity
of
Probabilistic Disambiguation: NP-Completeness
results for Parsing Problems that arise in speech
and language Processing Applications. In the
journal Grammars
vol. 5 (2), Kluwer Publishers,
2002. The paper provides proofs of NP-Completeness
for the probabilistic disambiguation problems (1)
Selecting the most probable sentence/path in a
lattice/word-graph/SFSA (Stochastic Finite State
Automaton) using a parser based on PCFG/SCFG or
STSG/PTSG and (2) Selecting the most probable parse
for an input sentence or lattice using a parser
based on STSG. The results hold for weighted
versions of these grammars also.
- G. Musillo and Khalil Sima'an. Towards
Comparing
Parsers
from Different Linguistic Frameworks: An
Information Theoretic Approach.
Proceedings of Beyond PARSEVAL: Towards Improved
Evaluation Measures for Parsing Systems, LREC'02,
Las Palmas, Gran Canaria, Spain, 2002.
- G. Infante-Lopez, M. de Rijke and Khalil Sima'an.
A General Probabilistic Model for Dependency
Parsing. In proceedings of the BNAIC 2002,
Leuven, Belgium.
- M. Dastani and Khalil Sima'an. A
Machine Learning Approach to Visual Perception.
In European Society for the Study of Cognitive
Systems (ESSCS 2002), Workshop
on Multidisciplinary Aspects of Learning, Paris.
- W. Daelemans, Khalil Sima'an, J. Veenstra and J.
Zavrel (editors). Computational
Linguistics in the Netherlands 2000 (CLIN'00).
Language
and Computers - Studies in Practical Linguistics 37,
Rodopi publications, November 2001.
- Khalil Sima'an, A. Itai, Y. Winter, A. Altman and
N. Nativ.
Building a Tree-Bank of Modern Hebrew Text.
In Beatrice Daille and Laurent Romary (eds.),
Journal Traitement Automatique des Langues
(t.a.l.), 2001. Special Issue on Natural Language
Processing and Corpus Linguistics. Appeared also in
Chinese translation in the book Window to the
Computational Linguistics (ISBN 7-81085-140-3/N.48).
- Khalil Sima'an. Robust
Data-Oriented Parsing for Speech-Understanding.
Proceedings of the International Workshop on Parsing
Technologies (IWPT'01). Beijing, China, October
2001. This paper describes treebank-based
syntactic+semantic parsing of word lattices/graphs
output by a speech-recognizer in order to obtain
update semantics for the user utterances in a spoken
dialogue system over the telephone. It presents
robustness techniques for data-oriented parsing.
- Khalil Sima'an. Enhancing
the Robustness of Data-Oriented Parsing for
Speech-Understanding. Proceedings of the
Natural Language Processing Pacific Rim Symposium
(NLPRS'01). Tokyo, Japan, November 2001.
- Khalil Sima'an. Tree-gram
Parsing:
Lexical
Dependencies and Structural Relations.
Proceedings of 38th Annual Meeting of
the Association for Computational Linguistics
(ACL'00), Hong Kong, China, 2000.
- Khalil Sima'an. Efficient
Parsing
of
Domain Language. Proceedings of the
Belgian-Dutch Artificial Intelligence Conference
(BNAIC'00), Efteling, The Netherlands, 2000. BNAIC'00 Best
Paper Award.
- Remko Scha, Rens Bod and Khalil Sima'an. A
Memory-Based Model of Syntactic Analysis:
Data-Oriented Parsing. In special
Issue on Memory-Based Processing, W.
Daelemans (ed.), Journal of Empirical and
Theoretical Artificial Intelligence (JETAI), 11 (3),
1999.
- Gert Veldhuijzen van Zanten, Gosse Bouma, Khalil
Sima'an, Gertjan van Noord and Remko
Bonnema. Evaluation
of
the
NLP Components of the OVIS2 Spoken Dialogue
System. In Computational Linguistics in
the Netherlands 1998 (CLIN), F. van
Eynde, I. Schuurman and N. Schelkens (editors),
1999.
- Khalil Sima'an.
Efficient Disambiguation by means of Stochastic
Tree Substitution Grammars. In New Methods in
Language Processing. D. Jones and H. Somers
(editors), UCL Press, UK, 1997.
- Khalil Sima'an.
An Optimized Algorithm for Data-Oriented
Parsing. In R. Mitkov and N.
Nicolov (editors), Recent Advances in Natural
Language Processing, Vol. 136 of Current Issues in
Linguistic Theory, John Benjamins, Amsterdam, 1996.
- Khalil Sima'an. Explanation-Based
Learning
of
Data-Oriented Parsing. In Proceedings of the
Conference on Computational Natural Language
Learning (CoNLL) at ACL/EACL-97, T. Mark Ellison
(editor), Madrid, Spain, July 1997.
- Khalil Sima'an. Explanation-Based
Learning of Partial-Parsing. In W. Daelemans,
A. Van den Bosch and A. Weijters (editors). Workshop
Notes of the ECML / MLnet Workshop on Empirical
Learning of Natural Language Processing
Tasks. Prague, Czech Republic. April 1997.
- Khalil Sima'an. Computational
Complexity
of
Probabilistic Disambiguation by means of Tree
Grammars. In Proceedings of the
International Conference on Computational
Linguistics (COLING '96), pp. 1175-1180
(vol. 2), Copenhagen, Denmark, August 1996.
- Khalil Sima'an.
An Optimized Algorithm for Data-Oriented Parsing.
In proceedings of International Conference on
Recent Advances in Natural Language Processing
(RANLP'95). Tzigov Chark, Bulgaria, 1995.
- Khalil Sima'an, Rens Bod, S. Krauwer and Remko
Scha. Efficient
Disambiguation by means of Stochastic Tree
Substitution Grammars. In Proceedings of the
International Conference on New Methods in Language
Processing (NeMLaP). Centre for Computational
Linguistics, UMIST, Manchester, UK, pp. 50-58,
1994.
- Ph.D. Thesis
- Popularizing articles
about language technology
- Khalil Sima'an. "Zes Weken JHU Summer Language
Engineering Workshop (Baltimore 11 Juli - 20
Augustus, 2005)" (in Dutch). DIXIT 2005. (A Dutch
language and speech technology magazine).
- Khalil Sima'an. "Een half lege glas wijn..." (in
Dutch). DIXIT (A Dutch language and speech
technology magazine), issue 2, 2004.
- Khalil Sima'an. Feeling the mood: On Machine
Learning for Natural Language Processing. A column
in ILLC Annual Report 2004, Institute for Logic,
Language and Computation (ILLC), 2004.
- Technical
reports/Tutorials:
- JHU
Summer workshop team. Parsing
Arabic Dialects. Final report (version I),
January 2006. CLSP, JHU, Baltimore, USA.
- JHU
Summer workshop team. Closing day
presentation. August 2005. CLSP,
JHU, Baltimore, USA.
- Khalil Sima'an. Probabilistic
Models for NLP. Course slides for ESSLLI
Foundational course with same title (176
slides in total)
- Khalil Sima'an. Probabilistic
Parsing. Course slides for ESSLLI Advanced
course with same title (45 slides)
- Khalil Sima'an. A
Short Introduction to the DOP Model. Part
of material for ESSLLI
Advanced course on Probabilistic Parsing
- Rens Bod, R. Kaplan, Remko Scha, and Khalil
Sima'an. A Data-Oriented Approach to
Lexical-Functional Grammar. Computational
Linguistics in the Netherlands 1996 (CLIN),
Eindhoven, The Netherlands, 1997. (See also
``A probabilistic approach to LFG." Ftp://ftp-lfg.stanford.edu/pub/lfg/lfg-presentations/LFG96/kaplan-doptalk.ps.
Slides of the keynote lecture by Ronald Kaplan held
at LFG-workshop, Grenoble, France).
- Khalil Sima'an. Learning
Efficient
Parsing,
with application to DOP and Speech Understanding
(Draft version). Report #35, Probabilistic
Natural Language Processing, NWO's Priority
Programme on Language and Speech Technology,
Amsterdam/Utrecht, January 1997.
- Khalil Sima'an (University of Utrecht) and Remko
Scha, R. Bonnema, Rens Bod (University of
Amsterdam). Disambiguation and Interpretation of
Wordgraphs using Data-Oriented Parsing. Report
#31, Probabilistic Natural Language Processing, NWO
Priority Programme on Language and Speech Technology,
Amsterdam, November 1996.
- Rens Bod, S. Krauwer and Khalil Sima'an. Combining
Linguistic
and
Statistical Knowledge CLASK: Final report.
Institute for Language and Speech, Faculty of Arts,
University Utrecht. 1996.
- Khalil Sima'an. Design
Principles for Real-Time Process Control
Systems. Technical Report 94-42, Delft
University of Technology, 1994.