Computer Science 384

Computational Linguistics
Fall 2017
Thomas VanDrunen



Meeting time: MWF 12:55-2:05 pm.
Meeting place: Science 131

Office hours: MWF 3:30-4:30 pm; Th 9:00-10:30 am, 11:00-11:30 am, and 1:15-3:15 pm.

Contact: 163 Meyer Science; 752-5692; Thomas.VanDrunen@wheaton.edu
http://cs.wheaton.edu/~tvandrun/cs384


Syllabus




Final exam: Thurs, Dec 14, 8:00 am


Moon's day, Woden's day, Frigga's day

Aug 21

NO CLASS

Aug 23

Background. Introduction
Slides

Read J&M Chapter 1

Aug 25

Preliminaries, history, etc.

Learn Python (for 8/30)

Aug 28

Introduction to NLTK

Aug 30

Trying out NLTK


Read J&M Chapter 2

Sept 1

Regular expressions

Sept 4

NO CLASS

Sept 6

Trying out NLTK

In-class activity
Project 1 Due 9/20
Read given prob & stats summary (for 9/11)

Sept 8

Conversational agents

In-class activity

Sept 11

Words and language models. Probability and statistics background

Sept 13

More probability and statistics.

Read J&M 4.(1&2)

Sept 15

The noisy channel model. Introduction to language models
Slide

Do the practice problem on the handout

Sept 18

More about language models

Read J&M 4.(3&4)

Sept 20

Statistics about words and other N-grams
Slides

Sept 22

Evaluating language models

Sept 25

Smoothing

Sept 27

Good-Turing smoothing

Sept 29

Linear interpolation

Oct 2

Expectation-maximization

Oct 4

Edit distance

Oct 6

Spelling correction

Oct 9

Information theory. Introduction. Entropy and perplexity

Oct 11

The entropy of English

Oct 13

Text compression

Oct 16

NO CLASS

Oct 18

NO CLASS

Oct 20

Hidden Markov Models. Lexical categories

Oct 23

Introduction to HMMs

Oct 25

More on HMMs

Oct 27

Part-of-speech tagging

Oct 30

Viterbi algorithm

Nov 1

HMM training

Nov 3

More on HMM training

Nov 6

Lexical semantics. Introduction to lexical semantics

Nov 8

Applied lexical semantics

Nov 10

Grammars and parsing. Introduction to grammars

Nov 13

Grammars and parsing

Nov 15

CKY parsing algorithm

Nov 17

Machine learning. Principles of machine learning

Nov 20

Support vector machines

Nov 22

NO CLASS

Nov 24

NO CLASS

Nov 27

Stylometry and authorship attribution

Nov 29

Introduction to R

Dec 1

The Stylo package

Dec 4

Dec 6

Dec 8

Review