- Motivation. General overview of the nature of our data: Treebanks and Parallel Corpora.
Hidden variables/regularities in Treebanks: derivations (the rules or nodes generated at each step) and node labels.
In parallel corpora: word alignment (translation lexicon), ITG (Inversion Transduction Grammar) structures, node labels,
generic edit operators on trees, etc. A road map for the course: data and learning structured models. Parsing: Treebanks
and learning how to parse; the explosion of the parse space and the disambiguation problem. Defining probabilistic
models over parse trees; Probabilistic Grammars.
- Manning & Schütze, section 3.2 and chapter 11. Chapter 12 includes material for the next lecture.
Alternative: from Jurafsky & Martin (J & M), read chapters 9, 10 and 12,
and, as formal background, Secs. 13.0-13.3.