Project 2: A spelling-checker

The goals of this project are

You will write a program that reads in a text and produces a similar text with some words replaced; ideally, each replaced word is a misspelling and each replacement is its correct spelling. The success of the program is measured by how well it does this correction: you want to catch as many misspelled words as possible, replace them with the right word as often as possible, and change correctly spelled words as rarely as possible (i.e., your program will almost certainly have some false positives, but you want to minimize them).
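One way to put numbers on those three criteria is sketched below. It is only an illustration: the function name evaluate, the three aligned word lists, and the names of the reported rates are assumptions made for this example, not part of the given resources or the way the project will be graded. It also assumes Python 3 division and that the three lists have the same length.

```python
def evaluate(original, corrupted, corrected):
    """Compare three aligned word lists and return three rates.

    original  -- the correctly spelled words (ground truth)
    corrupted -- the same words with some misspellings introduced
    corrected -- the program's output for the corrupted text
    All three names are hypothetical; the lists must have equal length.
    """
    misspelled = caught = fixed = false_pos = 0
    for truth, seen, out in zip(original, corrupted, corrected):
        if seen != truth:        # the input word was misspelled
            misspelled += 1
            if out != seen:      # the program changed it
                caught += 1
            if out == truth:     # the program restored the right word
                fixed += 1
        elif out != seen:        # a correct word was changed anyway
            false_pos += 1
    correct_words = len(original) - misspelled
    return {
        "detection_rate": caught / misspelled if misspelled else 0.0,
        "correction_rate": fixed / misspelled if misspelled else 0.0,
        "false_positive_rate": false_pos / correct_words if correct_words else 0.0,
    }
```

A test case could then be a short text you corrupt by hand, run through your program, and pass to a function like this one.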

1. Given resources

You can find the following files in ~tvandrun/Public/cs394/proj2:

Your task, then, is to modify spellcheck.py so that the decisions it makes are based on probabilities suggested by a language model and the edit distance between words.
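To make the idea concrete, here is a minimal sketch of one way to combine the two signals. It does not reflect the contents of spellcheck.py. The function names, the unigram word counts standing in for the language model, and the max_distance and penalty parameters are all assumptions chosen for illustration; the edit distance shown is plain Levenshtein distance, which counts a transposition such as "teh" for "the" as two edits.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))   # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                   # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def best_correction(word, counts, max_distance=2, penalty=0.01):
    """Pick the vocabulary word with the highest combined score.

    counts is a hypothetical unigram model (word -> frequency). Each
    candidate's relative frequency is multiplied by penalty once per
    edit, so nearer and more frequent words score higher. If no
    candidate is within max_distance, the word is returned unchanged.
    """
    total = sum(counts.values())
    best, best_score = word, 0.0
    for candidate, n in counts.items():
        d = edit_distance(word, candidate)
        if d > max_distance:
            continue
        score = (n / total) * (penalty ** d)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Tiny made-up corpus, for demonstration only.
counts = Counter("the cat sat on the mat the end".split())
print(best_correction("teh", counts))
print(best_correction("czt", counts))
```

The penalty and the distance cutoff are the kind of choices worth varying and discussing in your report, as is replacing the unigram counts with a model that uses the surrounding words.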

2. The minimum way to complete this project

3. How to ramp it up (which I expect most of you to do, each according to his or her ability)

4. To turn in

Turn in your code and test cases and a short (about one page) report on the choices you made for varying edit distance, the language model, etc. Discuss your program's performance, especially what, if anything, you observed to improve it. Explain the test cases you used. Copy your files to

/cslab.all/ubuntu/cs394/proj2/your_name

where your_name is [brandon|chris|davenport|elliot|gill|josh|kate|leanne|nathan|taylor|tmac].

DUE: 12:00 midnight between Wednesday, Nov 6, and Thursday, Nov 7, 2013.


Thomas VanDrunen
Last modified: Tue Nov 5 15:10:29 CST 2013