Course Description:
Computational Linguistics is the study of human language behaviour and language learning from a computational perspective. This course will explore computational techniques for understanding, translating and producing natural language, and investigate the structure and meaning of sentences and connected discourse. After providing the necessary linguistic background, symbolic (unification-based approaches) and probabilistic (statistical language processing) techniques will be considered. If time permits, some applications will be discussed, such as the problems of question answering, machine translation, text classification, information extraction, and grammar induction.
Objectives (expected learning outcomes):
Learning objectives include:
Knowledge of the terminology and concepts of computational linguistics;
Insight into the possibilities and fundamental limitations of computational linguistics;
Insight into the relative advantages and disadvantages of two major approaches to computational linguistics (statistical and unification-based approaches);
Understanding of the basic methods and techniques used in computational linguistics;
Skills in applying the basic methods and techniques to concrete problems in computational linguistics.
Classes: Tues/Thurs 10:00-11:30, LAS 3033
Office Hours: Wednesdays 3:00 or by appointment
Topics
Course Introduction
Part I: Computational Linguistics, Language, Natural Language Processing, Theory and Applications
Part II: Linguistic Background - Unification-based approach to NLP
Part III: Statistical Approach to NLP - Statistical Methods in Natural Language Processing and Data Analysis
Part IV: Course review – one day
Part V: Student Presentations
Grading
The course will be graded on the basis of one minor assignment (10%) and one substantial assignment (25%), one major in-class lecture presentation (10%), one minor (10 min) project report (5%), and one course project (50%). Grades should follow the distribution A (90-100); B (80-89); C (70-79); D (60-69); uh oh (below 60).
Class Materials
1. Many, many class handouts.
2. Copies of Chapters 1, 5 & 9 from Reference 1; Chapters 4-6 from Reference 2.
3. Copies of many, many relevant papers.
4. Many other notes.
References
Recommended
1. Manning, C., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA.
2. Jurafsky, D., & Martin, J.H. (2000; 2nd ed. 2009). Speech and Language Processing, Prentice-Hall, Upper Saddle River, New Jersey.
Other
3. Charniak, E. (1993). Statistical Language Learning, MIT Press, Cambridge, MA.
4. Klavans, J., & Resnik, P. (1996). The Balancing Act: Combining Symbolic and Statistical Approaches to Language, MIT Press, Cambridge, MA.
5. Bennett Jr., W. (1976). Scientific and Engineering Problem-Solving with the Computer, Prentice-Hall, Englewood Cliffs, New Jersey, Chapter 4, 103-199.