Computer Science 384

Computational Linguistics
Fall 2017
Thomas VanDrunen



Meeting time: MWF 12:55-2:05 pm.
Meeting place: Science 131

Office hours: MWF 3:30-4:30 pm; Th 9:00-10:30 am, 11:00-11:30 am, and 1:15-3:15 pm.

Contact: 163 Meyer Science; 752-5692; Thomas.VanDrunen@wheaton.edu
http://cs.wheaton.edu/~tvandrun/cs384


Syllabus




Final exam: Thurs, Dec 14, 8:00am


Moon's day, Woden's day, Frigga's day

Aug 21

NO CLASS

Aug 23

Background. Introduction
Slides

Read J&M ch 1

Aug 25

Preliminaries, history, etc

Learn Python (for 8/30)

Aug 28

Introduction to NLTK

Aug 30

Trying out NLTK


Read J&M Chapter 2

Sept 1

Regular expressions

Sept 4

NO CLASS

Sept 6

Trying out NLTK

In-class activity
Project 1 Due 9/20
Read given prob & stats summary (for 9/11)

Sept 8

Conversational agents

In-class activity

Sept 11

Words and language models. Probability and statistics background

Sept 13

More probability and statistics.

Read J&M 4.(1&2)

Sept 15

The noisy channel model. Introduction to language models
Slide

Do the practice problem on the handout

Sept 18

More about language models

Read J&M 4.(3&4)

Sept 20

Statistics about words and other N-grams
Slides

Sept 22

Evaluating language models

Read J&M 4.(5-7); find something interesting in Google NGram viewer

Sept 25

Smoothing

Sept 27

Good-Turing smoothing
Slides

Project 2, due 10/13

Sept 29

Engineering a language model
Slides

Oct 2

Linear interpolation; Expectation-Maximization
Slides

Read J&M 3.(10-12)

Oct 4

Edit distance

Oct 6

Spelling correction

Read J&M 4.(10&11)

Oct 9

Information theory. Introduction. Entropy and perplexity
Slides

Oct 11

The entropy of English
Slides

Oct 13

Text compression
Slides

Project 3, due Oct 27
Read J&M 5.(1&2)

Oct 16

NO CLASS

Oct 18

NO CLASS

Oct 20

POS tagging and HMMs. Lexical categories
Slides

Read J&M 5.3 and 5.5 through pg 145

Oct 23

Introduction to HMMs

Oct 25

More on HMMs (forward algorithm)

Oct 27

State recovery (Viterbi algorithm)

Oct 30

[Take-home midterm released]

Read J&M 5.5.3

Nov 1

HMM training (Forward-Backward algorithm)

Nov 3

Implementing HMMs

Project 4, due Nov 17

Nov 6

Experiments with HMMs

Read J&M 19.(1-3) (the rest of Ch 19 is recommended)

Nov 8

Lexical semantics. Introduction to lexical semantics

Nov 10

Applied lexical semantics

In-class activity
Read J&M 12.(1-3,9) for Mon 11/13, and the rest of chapter 12 by 11/17

Nov 13

Grammars and parsing. Introduction to grammars

Read J&M 13.(1-3)

Nov 15

Grammars and parsing

Read J&M 13.4.1

Nov 17

CKY parsing algorithm

Project 5, due 12/8

Nov 20

More about CKY

Nov 22

NO CLASS

Nov 24

NO CLASS

Nov 27

Machine learning. Stylometry and authorship attribution

Spend an hour and a half learning R with swirl

Nov 29

Introduction to R

Dec 1

Support vector machines; Introduction to R

Read this paper about stylo
See also the stylo manual

Dec 4

The Stylo package

In-class activity

Dec 6

Stylometry investigations

Dec 8

Review