PAGE for 2004/2005 (see later years at http://staff.science.uva.nl/~simaan).





Language and Speech Processing

               Lecturer: Khalil Sima'an




Place and Time:

  Block A, Sep 6 till Oct 22:  Friday  9:00-11:00,  room P.018
  Block B, Nov 5 till Dec 17:  Friday 11:00-13:00,  room I.203
      Except for Dec 10:       Friday 11:00-13:00,  room A.728

Course and Reading material:  See special page.









Slides and readings for each lecture:  
  1. Introduction and Motivation (Why probabilistic models for language and speech processing?)
    Read chapters 1 and 2 of Manning and Schutze or Jurafsky and Martin; chapters 1 and 2 of Mitchell.
    Also read this paper: Empirical validity and technological viability: Probabilistic models of Natural Language Processing.  
  2. Probability Theory, Statistics, Machine Learning and Objective Functions
    Read chapters 1 and 2 from Manning and Schutze; read also chapter 1 of Krenn and Samuelsson.
    HOMEWORK: a pdf file containing the homework
  3. Word prediction, sentence probability (without structure), Ngrams and Markov models
    Read chapter 6 of Jurafsky and Martin, or sections 6.1-6.3 and 9.1 from Manning and Schutze.
    Homework: Exercises 6.3, 6.4 and 6.5 of Jurafsky and Martin (pages 232-233); combine them into one program. (A minimal bigram sketch is given below.)
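    As a starting point for this homework, here is a minimal illustrative sketch (not part of the book exercises) of a bigram
    (first-order Markov) language model with maximum-likelihood estimates; the toy corpus and all function names are made up.

        from collections import defaultdict

        BOS, EOS = "<s>", "</s>"   # sentence-boundary markers

        def train_bigrams(sentences):
            # Collect unigram and bigram counts from tokenised sentences.
            uni, bi = defaultdict(int), defaultdict(int)
            for words in sentences:
                tokens = [BOS] + words + [EOS]
                for w in tokens:
                    uni[w] += 1
                for w1, w2 in zip(tokens, tokens[1:]):
                    bi[(w1, w2)] += 1
            return uni, bi

        def sentence_probability(words, uni, bi):
            # P(w_1 ... w_n) = prod_i P(w_i | w_{i-1}), with maximum-likelihood estimates.
            # Unseen bigrams get probability 0 here; smoothing is the topic of lecture 6.
            tokens = [BOS] + words + [EOS]
            p = 1.0
            for w1, w2 in zip(tokens, tokens[1:]):
                if uni[w1] == 0:
                    return 0.0
                p *= bi[(w1, w2)] / uni[w1]
            return p

        if __name__ == "__main__":
            corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"], ["a", "cat", "runs"]]
            uni, bi = train_bigrams(corpus)
            print(sentence_probability(["the", "cat", "sleeps"], uni, bi))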
  4. POS tagging and Markov Models: standard generative POS taggers. (includes slides of next lecture)
    Read chapter 8 (Jurafsky and Martin) about POS tagging in general (you may skip section 8.6)
    On HMMs: read from chapter 9 (Manning and Schutze) only sections 9.1, 9.2, 9.3.1 and 9.3.2.
    Further, on the evaluation of taggers: read section 10.6 (Manning and Schutze).
    Homework: Build a POS tagger based on the standard architecture of a Generative Stochastic Markov Tagger.
              At your disposal are a training set and a test set of tagged sentences (to be obtained from the lecturer).
              The goal is to vary the architecture of the tagger by adding different context to the language model and/or
              the lexical model, in order to observe the effect of this on tagging accuracy on the test set.
              Here is the general architecture:
                  P(t_1 ... t_n | w_1 ... w_n)  =  prod_i  P(t_i | Hl) * P(w_i | Hx)
              where Hl and Hx are the conditioning contexts for the language and the lexical models respectively.
              Here are some suggestions for different instantiations of Hl and Hx:
                  A.  Hl = <t_{i-1}>            and   Hx = <t_i>
                  B.  Hl = <t_{i-2}, t_{i-1}>   and   Hx = <t_i>
                  C.  Hl = <t_{i-2}, t_{i-1}>   and   Hx = <t_i, t_{i-1}>
                  D.  Hl = <t_{i-2}, t_{i-1}>   and   Hx = <t_i, SUFFIX(w_{i-1})>,   where SUFFIX(word) = the last 3 letters of "word"
              To build these models you use a training set of tagged sentences, from which you extract the count tables that fit each model.
              Having built these four taggers, you test them: strip the tags from the test set, leaving only the words, and tag the words
              with each of the taggers. Then compare the output of each tagger to the manually tagged test set (the original, including the tags).
              Report the precision of each tagger as: the count of correct tags divided by the total count of tags in the test set.
              (A minimal sketch of this pipeline, for model A, is given right after this item.)
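    Below is a minimal illustrative sketch (not the required solution) of model A, i.e. Hl = <t_{i-1}> and Hx = <t_i>,
    trained by relative-frequency counting and decoded with Viterbi. The corpus format (one sentence per line, tokens
    written as word/tag), the file names and all function names are assumptions made only for this illustration.

        import math
        from collections import defaultdict

        def read_tagged(path):
            # Assumed corpus format: one sentence per line, tokens written as word/tag and separated by spaces.
            sentences = []
            with open(path, encoding="utf-8") as f:
                for line in f:
                    sentences.append([tuple(tok.rsplit("/", 1)) for tok in line.split()])
            return sentences

        def train_model_a(tagged_sentences):
            # Model A: language model P(t_i | t_{i-1}), lexical model P(w_i | t_i), estimated by counting.
            trans = defaultdict(int)      # counts of tag bigrams (t_{i-1}, t_i)
            emit = defaultdict(int)       # counts of (tag, word) pairs
            tag_count = defaultdict(int)  # counts of tags (including the start marker <s>)
            for sent in tagged_sentences:
                prev = "<s>"
                tag_count[prev] += 1
                for w, t in sent:
                    trans[(prev, t)] += 1
                    emit[(t, w)] += 1
                    tag_count[t] += 1
                    prev = t
            return trans, emit, tag_count

        def viterbi(words, trans, emit, tag_count, eps=1e-6):
            # Find argmax over tag sequences of prod_i P(t_i | t_{i-1}) * P(w_i | t_i), in log space.
            tags = [t for t in tag_count if t != "<s>"]
            def lp(count, total):
                # crude add-epsilon smoothing, only so that unseen events do not get probability zero
                return math.log((count + eps) / (total + eps * len(tags)))
            best = {"<s>": (0.0, [])}     # best (log score, tag sequence) ending in each possible last tag
            for w in words:
                new_best = {}
                for t in tags:
                    score, seq = max((s + lp(trans[(pt, t)], tag_count[pt]) + lp(emit[(t, w)], tag_count[t]), seq)
                                     for pt, (s, seq) in best.items())
                    new_best[t] = (score, seq + [t])
                best = new_best
            return max(best.values())[1]

        if __name__ == "__main__":
            # hypothetical file names; the actual corpora are obtained from the lecturer
            train = read_tagged("train.tagged")
            test = read_tagged("test.tagged")
            trans, emit, tag_count = train_model_a(train)
            predicted, gold = [], []
            for sent in test:
                predicted += viterbi([w for w, _ in sent], trans, emit, tag_count)
                gold += [t for _, t in sent]
            print("precision:", sum(p == g for p, g in zip(predicted, gold)) / len(gold))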

  5. (same slides as 4) HMM implemented as SFST; Tagging Algorithms; Forward/Backward + Application of Markov Models in Spelling  Correction.
    Read Chapter 10 of Manning and Schutze and on Spelling Correction from Jurafsky and Martin chapter 5 (till section 5.6) and chapter 6.
    Homework: finish the exercise given in the preceding lecture (number 4 above). (A small sketch of the forward algorithm is given below.)
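    To complement the Viterbi tagger above, here is a minimal sketch of the forward algorithm, which computes the total
    probability of a word sequence under the same bigram HMM by summing over all tag sequences. It reuses the count
    tables in the format of the tagger sketch; the toy counts in the example are made up.

        import math
        from collections import defaultdict

        def forward_probability(words, trans, emit, tag_count, eps=1e-6):
            # Forward algorithm: P(w_1 ... w_n) = sum over all tag sequences t_1 ... t_n of
            # prod_i P(t_i | t_{i-1}) * P(w_i | t_i), computed by dynamic programming.
            tags = [t for t in tag_count if t != "<s>"]
            def p(count, total):
                return (count + eps) / (total + eps * len(tags))   # same crude smoothing as in the tagger sketch
            alpha = {"<s>": 1.0}   # alpha[t] = probability of the prefix read so far with last tag t
            for w in words:
                alpha = {t: sum(a * p(trans[(pt, t)], tag_count[pt]) * p(emit[(t, w)], tag_count[t])
                                for pt, a in alpha.items())
                         for t in tags}
            return sum(alpha.values())

        if __name__ == "__main__":
            # toy counts in the same format as produced by train_model_a in the tagger sketch
            trans = defaultdict(int, {("<s>", "DT"): 2, ("DT", "NN"): 2})
            emit = defaultdict(int, {("DT", "the"): 2, ("NN", "cat"): 1, ("NN", "dog"): 1})
            tag_count = defaultdict(int, {"<s>": 2, "DT": 2, "NN": 2})
            print(forward_probability(["the", "cat"], trans, emit, tag_count))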
  6. Dealing with Unseen Events: Methods for Smoothing Maximum-Likelihood Ngram Statistics  (Monday 20 October - to compensate for first lecture)
    Read: 1) Chapter 6 from Manning and Schutze (or sections 6.1-6.6 from Jurafsky and Martin);
          2) Up to page 15 of Stanley Chen and Joshua Goodman, "An empirical study of smoothing techniques for language modeling",
             Technical Report TR-10-98, Harvard University, August 1998. Print it from http://research.microsoft.com/~joshuago/
             A correction of a small error in the statement of the Katz formula in this report can be found here.

    See also more on Simple Good-Turing. (An illustrative smoothing sketch is given below.)
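    As a small illustration of the smoothing topic (not taken from the course materials), here is a sketch of add-one
    (Laplace) smoothing and simple linear interpolation for bigram probabilities; the toy counts and the interpolation
    weight are assumptions made only for this example.

        from collections import defaultdict

        def add_one_bigram(bi, uni, w1, w2, vocab_size):
            # Add-one (Laplace) smoothed bigram probability P(w2 | w1).
            return (bi[(w1, w2)] + 1) / (uni[w1] + vocab_size)

        def interpolated_bigram(bi, uni, total, w1, w2, lam=0.7):
            # Linear interpolation of the bigram MLE with the unigram MLE:
            # P(w2 | w1) = lam * c(w1,w2)/c(w1) + (1 - lam) * c(w2)/N.
            bigram_mle = bi[(w1, w2)] / uni[w1] if uni[w1] > 0 else 0.0
            unigram_mle = uni[w2] / total if total > 0 else 0.0
            return lam * bigram_mle + (1 - lam) * unigram_mle

        if __name__ == "__main__":
            # toy counts; in practice these come from the training corpus
            uni = defaultdict(int, {"the": 3, "cat": 2, "dog": 1})
            bi = defaultdict(int, {("the", "cat"): 2, ("the", "dog"): 1})
            total = sum(uni.values())
            print(add_one_bigram(bi, uni, "the", "dog", vocab_size=len(uni)))
            print(interpolated_bigram(bi, uni, total, "the", "mouse"))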

    MID-TERM PROJECT: COUNTS FOR 40% of the final mark for this subject (SEE TOP OF THIS PAGE)



  7. Parsing, Phrase-Structure and Probabilistic Context-Free Grammars
    Read chapters 9 and [10.1-10.4] of Jurafsky and Martin; for the CYK algorithm, read chapters 1 and 2 of Charniak's book.
  8. Tabular Parsing Algorithms for CFGs and Probabilistic Context-Free Grammars
    Read chapters 9 and [10.1-10.4] of Jurafsky and Martin; for the CYK algorithm, read chapters 1 and 2 of Charniak's book.
  9. Probabilistic Context-Free Grammars, Viterbi-like Disambiguation and Treebank PCFGs
    *Read chapters 9 and [10.1-10.4] of Jurafsky and Martin; for the CYK algorithm, read chapters 1 and 2 of Charniak's book.
    *Read also Tree-bank Grammars, Technical Report CS-96-02, Department of Computer Science, Brown University (1996).
      Here are an abstract and a postscript version.
    (A small probabilistic CYK sketch is given right after this item.)
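    The following is a minimal illustrative sketch (not the course's reference implementation) of probabilistic CYK
    parsing with a PCFG in Chomsky Normal Form: it computes the Viterbi (most probable) parse for a toy grammar.
    The grammar, rule probabilities and all names are made up for this example.

        import math
        from collections import defaultdict

        # A toy PCFG in Chomsky Normal Form: rules are (lhs, rhs, probability),
        # where rhs is either (B, C) for a binary rule or (word,) for a lexical rule.
        RULES = [
            ("S",  ("NP", "VP"), 1.0),
            ("NP", ("DT", "NN"), 0.6), ("NP", ("she",), 0.4),
            ("VP", ("VB", "NP"), 1.0),
            ("DT", ("the",), 1.0),
            ("NN", ("cat",), 0.7), ("NN", ("dog",), 0.3),
            ("VB", ("saw",), 1.0),
        ]

        def cyk_viterbi(words, rules, start="S"):
            # Probabilistic CYK: chart[(i, j)][A] = best log-probability of A spanning words[i:j],
            # together with a backpointer for recovering the most probable parse.
            n = len(words)
            chart = defaultdict(dict)
            lexical = [(a, rhs[0], p) for a, rhs, p in rules if len(rhs) == 1]
            binary = [(a, rhs, p) for a, rhs, p in rules if len(rhs) == 2]
            for i, w in enumerate(words):                      # fill the diagonal with lexical rules
                for a, word, p in lexical:
                    if word == w:
                        chart[(i, i + 1)][a] = (math.log(p), w)
            for span in range(2, n + 1):                       # combine smaller spans bottom-up
                for i in range(0, n - span + 1):
                    j = i + span
                    for k in range(i + 1, j):
                        for a, (b, c), p in binary:
                            if b in chart[(i, k)] and c in chart[(k, j)]:
                                score = math.log(p) + chart[(i, k)][b][0] + chart[(k, j)][c][0]
                                if a not in chart[(i, j)] or score > chart[(i, j)][a][0]:
                                    chart[(i, j)][a] = (score, (k, b, c))
            return chart[(0, n)].get(start)

        if __name__ == "__main__":
            best = cyk_viterbi(["the", "cat", "saw", "the", "dog"], RULES)
            print(best)   # (log-probability of the best S parse, backpointer for the top split)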
  10. Probability Estimation and the Maximum-Likelihood Principle  (lecture by Dr. Detlef Prescher).
  11. Treebank Parsing with PCFGs     (students' presentations: see special page !!).
  12. Transforms on Phrase-Structure for Improved PCFG parsing  (students' presentations: see special page !!).
  13. Data Oriented Parsing  (students' presentations: see special page !!).
  14. (Reserve lecture:) Information Theory, Communication, Compression and Error Minimization
    Read chapters 1 and 2 from Manning and Schutze; read also chapter 1 of Krenn and Samuelsson. (A small entropy and perplexity sketch is given below.)
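    As a quick illustration of the information-theoretic notions in this lecture (a sketch with made-up numbers, not
    course material), entropy of a distribution and the per-word cross-entropy and perplexity of a language model on
    test data can be computed as follows.

        import math

        def entropy(dist):
            # H(p) = - sum_x p(x) * log2 p(x), in bits.
            return -sum(p * math.log2(p) for p in dist.values() if p > 0)

        def cross_entropy(test_word_probs):
            # Per-word cross-entropy of a model on test data:
            # H = -(1/N) * sum_i log2 P(w_i | history), with perplexity 2^H.
            n = len(test_word_probs)
            h = -sum(math.log2(p) for p in test_word_probs) / n
            return h, 2 ** h

        if __name__ == "__main__":
            print(entropy({"a": 0.5, "b": 0.25, "c": 0.25}))   # 1.5 bits
            # hypothetical model probabilities assigned to each word of a test sentence
            h, pp = cross_entropy([0.2, 0.1, 0.05, 0.25])
            print(h, pp)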


FINAL PROJECT: COUNTS FOR 40% of the final mark for this subject (COMING SOON)

THE LAST 20% of the final mark will be based on participation in the discussions in the last two lectures.