Elaboration and hints on assignments

CLRS naming scheme. Note that CLRS has "exercises" at the end of each section and "problems" at the end of each chapter. The exercises are enumerated as chapter.section-exercise, for example 2.3-7. Problems are enumerated as chapter-problem, for example 2-3. Problems often have sub-parts, for example 2-3c. "Daily work" will come mainly from the exercises, and "homework" will come mainly from the problems, but there will be some crossover as well.

Turning in HW assignments. Please turn in your solutions electronically to /cslab/class/cs445/(your id)/(assignment id) where assignment id is in the form hw-month-day, referring to the date it was assigned. For example, the homework assigned on Aug 31 should be turned in to hw-8-31.

Please turn in proofs etc. as a PDF. Turn in the source code for your solutions to problems designated as "complete." Consider including a README file if the way your code is set up or how it is to be executed is non-obvious.

Hints on coding up solutions. Some of the problems in the book are not specified completely enough to implement in a real programming language without filling in a few details. You'll need to make a few reasonable assumptions about how the input data is represented---and about how the result should be represented. This is especially true for problems that take a list or array or set. Sometimes the data is complicated enough that in Java you may want to make a little class for it. In any case, state your assumptions/interpretations of the problem clearly in comments.

Here's how I structure my code for problems in this course. My Java solution to Exercise 2.3-7 is in a project folder called c2s3e7j, that is, "Chapter 2 Section 3 Exercise 7 Java." That folder contains a package folder called c2s3e7. In that package I have two files: FindPairSum.java, which contains the solution to the problem (as a static method in the FindPairSum class), and TestFindPairSum.java, which contains the JUnit tests. I have a Python solution set up similarly, except it has only one file: c2s3e7p/c2s3e7/findPairSum.py contains both the solution (as a standalone function) and the PyUnit test case.

The real way to do test-infected programming is to start by writing the stub for the method (or more generally, unit) that you will test; then writing the unit tests; then confirming that the unit tests fail; and finally writing the solution so that the unit tests pass.
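
As a rough illustration of where that workflow ends up, here is a minimal sketch of a single Python file holding both a solution function and its unit tests. The function name find_pair_sum, its list-based signature, and the sort-then-scan approach are assumptions made for this example, not the contents of the provided stubs. In the test-first order described above, the function body would start out as a stub (for example, raising NotImplementedError) so that the tests fail before the real code is written.

```python
import unittest


def find_pair_sum(values, x):
    """Return True if two elements at distinct positions in values sum to x.

    Assumption for this sketch: the "set" is given as a list of integers.
    Sorting costs O(n lg n); the two-index scan afterwards is O(n).
    """
    s = sorted(values)
    lo, hi = 0, len(s) - 1
    while lo < hi:
        total = s[lo] + s[hi]
        if total == x:
            return True
        if total < x:
            lo += 1   # s[lo] is too small to pair with anything at or below hi
        else:
            hi -= 1   # s[hi] is too large to pair with anything at or above lo
    return False


class TestFindPairSum(unittest.TestCase):
    def test_pair_exists(self):
        self.assertTrue(find_pair_sum([8, 3, 11, 5], 14))

    def test_no_pair(self):
        self.assertFalse(find_pair_sum([8, 3, 11, 5], 7))

    def test_fewer_than_two_elements(self):
        self.assertFalse(find_pair_sum([], 0))
        self.assertFalse(find_pair_sum([4], 8))
```

Keeping the function and the TestCase in one module means a single unittest invocation on that module finds and runs the tests.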

Of course, doing this in Eclipse will be very convenient. But it's also good to have a command-line option available (it's what I use when grading, for example). To run the tests in c2s3e7.TestFindPairSum, do

java -cp .:/usr/share/java/junit4.jar org.junit.runner.JUnitCore c2s3e7.TestFindPairSum

For Python it's a bit easier:

python -m unittest c2s3e7.findPairSum

I've never used CUnit (the unit testing library for C), and it appears that the unit testing tools available for SML are pretty primitive. I would probably just make unit tests from scratch when writing in those languages (but feel free to explore the available tools yourself).

Dates refer to when the assignment is assigned (not due).

Daily Aug 24. Skim the first chapter of the book to become familiar with their terminology. Read Chapter 2 carefully--the thrust of the content should be review, but pay attention to the details. Do 2.2-(2&4) and 2.3-(3, 6, 7).

For 2.3-7, practice what I'm going to call a complete solution: implement your solution in a programming language of your choice with unit tests; state any invariants for the loops, prove them, and use the proof(s) to argue for the algorithm's correctness; analyze the algorithm's efficiency carefully.

If you need help with that, I'm providing stubs and unit tests for 2.3-7 in both Java and Python at ~tvandrun/Public/cs445/c2s3e7j and ~tvandrun/Public/cs445/c2s3e7p, but give it a try on your own before you use my stuff.

This is a particularly long daily assignment (especially for the first day of the semester). If it's too much for you, you can hold off on 2.3-7; we probably won't get around to talking about it until Monday anyway.

Daily Aug 26. Do Problem 2-3c from Chapter 2 (make sure you read the whole Problem 2-3 for context); part c doesn't explicitly say you should prove the loop invariant (it says "use"), but that is implied. Then read Section 3.1 carefully, noting the differences among the asymptotic categories and notations as best you can. Do Exercises 3.1-(4 & 5).

Daily Aug 31. I'm not giving any practice problems this time so you can start working on the first HW problem set. But read the first three sections of Chapter 4.

HW Aug 31. In addition to 2-2, 3-1(b,e) and 3-4(d-h), do the following problem based on Exercise 4.4.9 from Anany Levitin, Introduction to the Design and Analysis of Algorithms, Third Edition, Pearson, 2012; pg 157:

Given a list (or array) of length n containing, in increasing order, all integers from 0 to n inclusive except for one, find the missing number. For example, in the list 0 1 2 3 5 6 7 8, the number 4 is missing. Corner cases: 1 2 3 4 5 is lacking 0, and 0 1 2 3 4 5 6 is lacking 7.

Make your algorithm as efficient as possible. Give a complete answer: Implement this in a programming language of your choice plus unit tests; state and prove a loop invariant, and use that to argue for your solution's correctness; analyze the efficiency.

You can find a stub and some unit tests for a Python solution at ~tvandrun/Public/cs445/supp1p, but you may also make yours from scratch.
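
For orientation only, here is one possible sketch of the shape such a solution could take, using binary search with the loop invariant written as a comment. The function name find_missing and the choice of a Python list as input are assumptions for this example; it is not the provided stub, and the invariant still needs its own proof.

```python
def find_missing(a):
    """Return the integer missing from a, assuming a holds, in increasing
    order, every integer from 0 to len(a) inclusive except one."""
    lo, hi = 0, len(a)
    # Invariant: a[i] == i for every i < lo, and a[i] == i + 1 for every i >= hi.
    # So the missing number always lies in the range lo..hi inclusive.
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] == mid:
            lo = mid + 1   # nothing is missing at or before position mid
        else:
            hi = mid       # the gap occurs at or before position mid
    return lo              # lo == hi: the only candidate left
```

Each iteration halves the range lo..hi, so this sketch makes O(lg n) comparisons.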

Daily Sep 9. Section 8.1 is the main thing we'll be looking at; read it carefully. Sections 8.(2-4) should be review. Judge for yourself how carefully to read them---but you should know the stuff. Do Exercises 8.1-(1, 3, 4).

HW Sep 9. Do Problems 4-(2 & 5) and 7-(1 & 4).

Give a "complete" solution for Problem 4-5.c, that is, code up a solution in a language of your choice, demonstrate correctness with JUnit tests, give a correctness proof, and an analysis. (The problem as stated in the text essentially asks for the correctness proof.) You may want to make use of my stub-'n'-JUnit tests, found at ~tvandrun/Public/cs445/c4p5j. In particular, I've made some classes to represent the chips, good and bad, being careful to make the difference between the two opaque to the code using the algorithm. Chip.java contains the code for the chips. The stub for the algorithm is in Diogenes.java.

The significance of Problem 7-4 will be clearer to those who have taken Programming Language Concepts. If you haven't, you may want to discuss the problem with someone who has.

Daily Sep 12. Do the following problem:

You are playing a computer game in which the hero must pass through a series of rooms and halls collecting treasure. There are 2n rooms (in pairs) and n-1 halls interspersed between the pairs. Each room has a one-way door to the next hall, and each hall has two one-way doors to the rooms of the next pair. The hero must, therefore, pass through exactly one room in each pair. The area looks something like

T₃ S₃

H₂

T₂ S₂

H₁

T₁ S₁

H₀

T₀ S₀

Each room has a certain amount of treasure, T_i or S_i. Halls do not have treasure, but they each have a guardian who demands payment to let the hero cross diagonally through the hall. So, to move from T_i-1 to T_i is free, but to move from T_i-1 to S_i costs H_i-1.

Devise and implement an algorithm to find the route that yields the most treasure. Analyze its efficiency.
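
One way to approach this is dynamic programming over the pairs. The sketch below is an illustration under stated assumptions, not a reference solution: rooms and halls are given as three Python lists T, S, and H; hall H[i] sits between pair i and pair i+1, as in the diagram; crossing diagonally in either direction costs the same; and "most treasure" means treasure collected minus payments made. The name best_route is made up for this example.

```python
def best_route(T, S, H):
    """Return (net_treasure, route), where route is a string such as "STS"
    naming the room chosen in each pair, from pair 0 upward."""
    n = len(T)
    if n == 0:
        return 0, ""
    # best[r] = largest net treasure over routes ending in room r of the current pair
    best = {"T": T[0], "S": S[0]}
    back = []  # back[i-1][r] = room of pair i-1 on the best route into room r of pair i
    for i in range(1, n):
        prev, best, choice = best, {}, {}
        for room, other, value in (("T", "S", T[i]), ("S", "T", S[i])):
            straight = prev[room]             # same side: no payment
            diagonal = prev[other] - H[i - 1]  # other side: pay the guardian
            if straight >= diagonal:
                best[room], choice[room] = value + straight, room
            else:
                best[room], choice[room] = value + diagonal, other
        back.append(choice)
    last = "T" if best["T"] >= best["S"] else "S"
    route = [last]
    for choice in reversed(back):  # follow the back pointers down to pair 0
        route.append(choice[route[-1]])
    return best[last], "".join(reversed(route))
```

The loop does a constant amount of work per pair, so this sketch runs in O(n) time and uses O(n) space for the back pointers.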

Daily Sep 14. Do the following problem (based on a problem by Susanne Hambrusch, 1998):

A lumberjack has a k-yard-long log of wood he wants cut at n specific places j₁, j₂, ... j_n, represented as the distance of that cut point from one end of the log. (We can also consider the ends as trivial "cut points" j₀ = 0 and j_n+1 = k.) The sawmill charges $x to cut a log that is x yards long (regardless of where that cut is). The sawmill also allows the customer to specify the ordering and location of the cuts.

For example, if k = 20 and we want cuts at 3 yards, 6 yards, and 10 yards from the left end, then if we cut them from left to right the cost would be

20 + (20-3) + (20-6) = 20 + 17 + 14 = 51

But making the same cuts from right to left would cost

20 + 10 + 6 = 36

Devise and implement an algorithm to minimize the cost, and analyze its running time.
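
As a hedged sketch of one possible approach, the cost can be computed by dynamic programming over pairs of cut points. The function name min_cut_cost and the list-based input are assumptions for this example; the recurrence is that the cheapest way to finish the piece between two cut points is the length of that piece plus the cheapest way to finish the two pieces left by whichever cut is made first.

```python
def min_cut_cost(k, cuts):
    """Return the minimum total cost of making all the given cuts in a log
    of length k, where cutting a piece of length x costs x."""
    p = [0] + sorted(cuts) + [k]   # cut points, including the two ends
    m = len(p)
    # cost[i][j] = cheapest way to make every cut strictly between p[i] and p[j]
    cost = [[0] * m for _ in range(m)]
    for gap in range(2, m):            # adjacent points (gap 1) need no cuts
        for i in range(m - gap):
            j = i + gap
            cost[i][j] = (p[j] - p[i]) + min(
                cost[i][s] + cost[s][j] for s in range(i + 1, j)
            )
    return cost[0][m - 1]
```

On the example above, min_cut_cost(20, [3, 6, 10]) returns 36, matching the right-to-left order. With n cuts the table has O(n²) entries and each takes O(n) work to fill, so this sketch runs in O(n³) time.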

HW Sep 16. Do Problems 8-4, 15-(4 & 6). Each of them is to be "complete", i.e., implementation in a programming language of your choice with unit tests, a proof of correctness, and an analysis of the running time.

Daily Sept 19. This is a longer daily work assignment than usual. It's basically two days' worth, and that's because I'm not sure how long various pieces will take. You can think of 16.3 (Huffman encoding) as being for Wed, Sept 21, 16.4 (matroids) for Fri, Sept 23, and 17.(1-3) (amortized analysis) for Monday, Sept 26. However, we may be able to cover both 16.3 and 16.4 on Wednesday, or we might get partway through something. In other words, I'm asking you to be "one day ahead" on daily work.

HW Oct 3. Do Problem 16-2, both a and b complete. For the correctness proof in each part, explain what the subproblem is and what the greedy choice is, and then prove that the problem has the greedy-choice property. Recall the structure of a proof like that: suppose a solution for a given subproblem exists that doesn't use the greedy choice; construct a solution based on that supposed one but that does use the greedy choice; show that your constructed solution is as good as or better than the supposed solution.

I have provided "scaffolding", stubs, and one JUnit for each part, found at ~tvandrun/Public/cs445/c16p2j. Note that the output of a solution to part a is much simpler than that of part b. My stub for part a is void; the assumption is that the method will merely rearrange the given array of tasks into an optimal order. Part b somehow must construct a schedule indicating what portion of which tasks to run in what order. The suggestion implied by my stub is to return an ordered collection (such as an ArrayList) of "schedule units", each of which indicates a task to run and the length of time given to that task before it is preempted.

The constraint for part a is merely that all tasks are executed, which doesn't require any enforcement. The constraint for part b is that all tasks are completed and that no task is executed before it is released. Do not assume that it is possible to schedule the tasks in such a way that the processor is always busy. For example, if the tasks have running times 3, 4, 9, and 1 but release times 0, 5, 6, and 7, respectively, then after the 3-cycle task is finished executing, the other tasks haven't even been released yet, and so the schedule would need to include some idle time until another task is ready. (Obviously you want your schedule to include as little idle time as possible.)
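
To make the part b constraints concrete, here is a small sketch of a schedule checker in Python. It does not construct a schedule. The representation is an assumption for this example and is not taken from the provided Java stubs: a schedule is a list of (task, length) pairs, tasks are identified by index, and a pair whose task is None stands for idle time.

```python
def completion_times(run, release, schedule):
    """Check a preemptive schedule against the part b constraints and
    return each task's completion time.

    run[t] and release[t] are the running time and release time of task t.
    Raises ValueError if a task runs before its release, runs for longer
    than it needs, or is left unfinished.
    """
    remaining = list(run)
    done = [None] * len(run)
    clock = 0
    for task, length in schedule:
        if task is not None:           # None marks idle time
            if clock < release[task]:
                raise ValueError("task %d runs before it is released" % task)
            if length > remaining[task]:
                raise ValueError("task %d runs longer than it needs" % task)
            remaining[task] -= length
            if remaining[task] == 0:
                done[task] = clock + length
        clock += length
    if any(r != 0 for r in remaining):
        raise ValueError("some task is never completed")
    return done
```

For the example above, the schedule [(0, 3), (None, 2), (1, 4), (3, 1), (2, 9)] passes the check with completion times [3, 9, 19, 10]; it is shown only as a valid schedule, not as an optimal one. Dropping the idle unit makes task 1 start at time 3, before its release at time 5, and the checker rejects it.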

This problem set will have a temporal overlap with the B-tree problem set.

Daily Oct 7. Read the introduction to Chapter 27 (pg 772-774) and Section 27.1 through the part about performance (pg 774-781). Or you can just keep reading the rest of the section, since we'll talk about that the next class period. Also, this reading makes use of graph concepts. If you need review, see Chapters 22-24 in CLRS.

HW Oct 7. Implement the delete operation for B-trees, as described in CLRS Section 18.3. Recommended interpretation of previous sentence: Finish the implementation of B-trees found at ~tvandrun/Public/cs445/btreej, which contains starter code for a "CSCI 345-style" project. Specifically,

Implement BTreeMap.iterator(), to iterate over the B-tree, as a "warm-up." (Find the method stub way at the end of BTreeMap.java.) In this problem you need to reason about the state of the iteration. What information do you need in order to keep track of where you are in the progress of traversing the B-tree? In my solution, I used a stack. Alternatively, you could make this method recursive in the nodes: give each node an iterator method, and make your B-tree iterator an iterator-of-iterators.
Implement BTree.Leaf.delete() and BTree.Internal.delete(). I recommend you break this up into helper methods. Specifically, I used five helper methods, deletePred(), deleteSucc(), merge(), shareRight(), and shareLeft(). I have provided stubs for these five in BTree.Leaf and the whole implementation for them in BTree.Internal.
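
To illustrate the stack idea for the iterator in general terms, here is a language-neutral sketch written in Python. The Node class below is a hypothetical stand-in with a list of keys and a list of children (empty for a leaf); it is not the BTree.Leaf or BTree.Internal class from the starter code, and this is only one of several ways to organize the iteration state.

```python
class Node:
    """Hypothetical B-tree node: a leaf has no children; an internal node
    with m keys has m + 1 children."""
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)


class BTreeIterator:
    """In-order iterator whose whole state is an explicit stack.

    Each stack entry is [node, i], meaning keys[i] is the next key to emit
    from that node and everything to the left of keys[i] is already done.
    """
    def __init__(self, root):
        self._stack = []
        self._push_leftmost(root)

    def _push_leftmost(self, node):
        # Walk down the leftmost path, recording each node on the way.
        while node is not None:
            self._stack.append([node, 0])
            node = node.children[0] if node.children else None

    def __iter__(self):
        return self

    def __next__(self):
        while self._stack:
            top = self._stack[-1]
            node, i = top
            if i == len(node.keys):    # this node is exhausted
                self._stack.pop()
                continue
            top[1] = i + 1
            if node.children:          # visit the subtree right of keys[i] next
                self._push_leftmost(node.children[i + 1])
            return node.keys[i]
        raise StopIteration
```

The stack never holds more than one entry per level of the tree, so the iteration state is proportional to the tree's height.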

In the method stubs, you'll notice that I indicate the number of lines of code (LOC) that I used in my solution. (That count includes closing curly braces, but does not include comments, white space, or assertions.) Take that with a grain of salt---you don't have to do it my way. You could, for example, break the problem up differently and use different helper methods from the ones I suggest. Still, the LOC should give you a quick feel for how big of a task something is likely to be.

This problem isn't "complete" in that there's no correctness proof or anything to do, but,

Unlike in CSCI 345 (or CSCI 365, which about half of you have taken), I will inspect your code when I grade this.
I recommend using assertions and including comments that exhibit the sort of reasoning we've practiced in correctness proofs: pre/post conditions, loop invariants, etc. I've tried to set a good example in the code I've provided.
You should add more JUnit tests, especially tests that use a lot of data. Sharing tests with classmates is strongly encouraged. Extra credit for JUnit tests that break my solution (something I think should be possible---I didn't test mine as thoroughly as I should).
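
As a small, generic illustration of the second point (not drawn from the provided B-tree code), a precondition, a loop invariant, and a postcondition can be written directly as assertions. The function index_of_max is a made-up example, shown in Python for brevity.

```python
def index_of_max(a):
    """Return the index of the first maximum element of a nonempty list."""
    assert len(a) > 0, "precondition: a is nonempty"
    best = 0
    for i in range(1, len(a)):
        # Loop invariant: a[best] is the maximum of a[0:i].
        assert a[best] == max(a[:i])
        if a[i] > a[best]:
            best = i
    # Postcondition: a[best] is the maximum of the whole list.
    assert a[best] == max(a)
    return best
```

Checking the invariant this way costs O(n) per iteration, so assertions like these are for development and testing rather than for timing runs.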

HW Nov 14. These are difficult. I want to make allowance for that (i.e., not all of you will get all of them), but also to incentivize doing your best (you can get this with time and effort) and disincentivize wild guessing. To that end, the problems will be graded this way:

A good proof (correct answer, correct proof; formal, complete, etc): 6 points
A correct answer and essentially correct but informal proof: 4 points
A correct answer with a proof attempt, but marked as uncertain: 2 points
An incorrect answer, but marked as uncertain: 1 point
An incorrect answer, but not marked as uncertain: -2 points

That doesn't cover quite every possible kind of submission, but I'll try to grade all of them in the spirit of the above point schedule.

HW Nov 30. The book instructions for LP 7.3.4 say "prove that it is NP-complete by showing that it is the generalization of an NP-complete problem. Give the appropriate parameter restriction in each case." That seems to suggest a brief answer, "it's just like this other known NP-complete problem, just change this or that parameter to..." But that is not my intention with this assignment. You should do a complete NP-completeness proof for each of these (where "each" = parts f and h, which are assigned; trying the others wouldn't be a bad studying strategy, though). That means, prove that the problem is in class NP, then that the problem is NP-hard by showing a reduction. Of course, you may use a generalization or specialization of the problem for the reduction. That just makes the problem a bit easier. You still must do the proofs completely. Same thing for the other problem assigned, CLRS 34.5-2.

Clarification on LP 7.3.4.f: The phrase "two nodes 1 and n" means two distinct nodes. The phrase "not repeating any node twice" should be read as "not repeating any node" or "not visiting any node twice." (Taken literally, "not repeating any node twice" means "not visiting any node three times." I do not think that's what the authors intended.)

CLRS 34.5-2 is actually the same problem as LP 7.3.4.e, just stated a little more clearly and with a different hint. Note that what CLRS calls "3-CNF-SAT" is what LP calls "3-SAT".
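
As a reminder of what the "in class NP" half of such a proof involves, membership in NP amounts to exhibiting a polynomial-time verifier for certificates. The sketch below does this for CNF satisfiability only as a familiar illustration; it is not one of the assigned problems. The encoding is an assumption for this example: a formula is a list of clauses, each clause is a list of nonzero integers, and k stands for variable k while -k stands for its negation.

```python
def satisfies(clauses, assignment):
    """Verify a certificate for CNF-SAT.

    clauses: list of clauses, each a list of nonzero ints (k or -k).
    assignment: dict mapping each variable number to True or False.
    Runs in time linear in the total number of literals.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

For instance, the formula [[1, -2, 3], [-1, 2, 3], [-1, -2, -3]] is accepted with the certificate {1: True, 2: True, 3: False} and rejected with {1: True, 2: True, 3: True}.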

Thomas VanDrunen

Last modified: Fri Dec 2 10:29:09 CST 2016