CSCI 384. Computational Linguistics

Fall 2021
Thomas VanDrunen



Meeting time: MWF 12:55pm-2:05pm.
Meeting place: Meyer 131

Office hours: Schedule through Calendly
Contact: 163 Mey Sci; 752-5692; Thomas.VanDrunen@wheaton.edu
http://cs.wheaton.edu/~tvandrun/cs384


Syllabus



Final exam: Wed, Dec 15, 1:30-3:30pm


Moon's day    Woden's day    Frigga's day

Aug 23

NO CLASS

Aug 25

General introduction
Slides

Aug 27

Finishing general introduction

Aug 30

Python and NLTK

Sept 1

Regular expressions
Slides

Sept 3

Regular expression exercise

Sept 6

NO CLASS

Sept 8

Language models. Probability and statistics background

Sept 10

Statistics about language, N-grams

Sept 13

Introduction to language models
Slides

Sept 15

Smoothing
Slides

Sept 17

Interpolation among language models
Slides

Sept 20

Finishing language models; spelling correction

Sept 22

Edit distance
Slides

Sept 24

Information theory. Introduction to information theory
Slides

Sept 27

Language entropy; compression

Sept 29

Review

Oct 1

Midterm

Oct 4

HMMs and POS tagging. Lexical categories
Slides

Oct 6

Hidden Markov models

Oct 8

The forward-backward algorithm
Slides

Oct 11

The Viterbi algorithm and the Baum-Welch algorithm

Oct 13

HMM experiment

Oct 15

HMM experiment

Oct 18

NO CLASS

Oct 20

NO CLASS

Oct 22

Parsing. Introduction to syntax
Slides

Oct 25

Recursive descent parsing

Oct 27

CKY parsing

Oct 29

Semantics. Lexical semantics

Nov 1

WordNet

Nov 3

Machine learning background. Word embeddings

Nov 5

Vector semantics

Nov 8

Neural nets and RNNs

Nov 10

Machine translation. Introduction

Nov 12

Encoder-decoder model

Nov 15

More machine translation techniques

Nov 17

Ethics in machine translation

Nov 19

Stylometry. Introduction

Nov 22

The stylo package

Nov 24

NO CLASS

Nov 26

NO CLASS

Nov 29

Authorship attribution

Dec 1

Sentiment classification. Naive Bayes classifiers

Dec 3

Sentiment analysis

Dec 6

Sentiment analysis

Dec 8

Text generation. Survey

Dec 10

Review