Today, you will be working on completing the second program from last week’s lab. Complete occurs.py and submit it as P5 before 5:00 pm on Tuesday, Feb. 23.
Recall from class that a dictionary (Python type dict) is a type that provides a mapping from one set of values (the keys) to another (the values). Another way to think of a dictionary is as a collection of (key, value) pairs, stored so that, given a key, it is easy to find the corresponding value.
Dictionaries use much of the same syntax as lists, with the key as the subscript. Thus subscriptina allows a value to be stored into, accessed, and removed from a dictionary d with
A dictionary is not ordered in any meaningful way; so they do not support the slicing that lists do.
Unlike a list, which is dense—its subscripts form a compact range—a dictionary is sparse. To test whether there is an entry for a given key, use the in operator, so that
is true if and only if key k is present in the dictionary d. The absence can be checked with
It is so often useful to get the value for a key, if there is one, or some particular value if it is not present, that there is a method get() for this purpose:
A similar method setdefault() can be used to insert the default value for the key.
Not all types can be used as keys. The technical name for the requirement on a key type is that it be hashable–the property needed to make efficient lookup possible. In practice, one requirement for hashable is that the type be immutable. So we can use strings, integers, and tupes for keys, but not lists or dictionaries.
Python will print a dictionary uses braces, with a colon separating each key and value, as in
You can write out a dictionary in a program using the same syntax.
Note that, as this example shows, a dictionary need not be homogeneous in either the key or value type.
A dictionary is stored for quick access, but the order in which key-value pairs are stored is not meaningful to us. You can extract the whole of a dictionary as a list using one of three methods:
While the order is hard to predict, if you make calls to keys() and values() without making any changes to the dictionary, the two lists will be in corresponding order. A dictionary emphitem is a key-value pair, as a tuple. The ability to do multiple assignment from a tuple gives rise to a very common way of iterating over a dictionary:
The methods mentioned above build a list containing an entry for each item in the dictionary. If the dictionary were very large (or the computer’s memory small!), that might be wasteful when all you want to do is iterate over the list.
Python provides types known as iterators for this purpose. An iterator can be thought of as an object the represents a sequence, with the promise to produce each element of the sequence as needed. That may be more efficient than producing the entire list at one time.
For dictionaries, the methods
are just like the previous methods, except that they return an iterator instead of an actual list.
In computing, we use the verb sort to mean “put into (ascending) order”. The list type has a standard method sort() that rearranges the list in this fashion. So, for example
If you get the help on the method list.sort, you’ll see
The parameters listed here are different from what we’ve seen before. First, they are optional–you can call the sort() method without specifying any of them. What this description shows you is the default value that each formal parameter is assigned if there is no corresponding actual parameter.
The third parameter, reverse, says whether to sort backwards. Since we know what the defaults for the other two parameters are (even if we don’t know what they mean), we can set just the third with a call like:
But that is awkward: we have to remember what all the parameters are, what order they appear in, and their default values. Python makes this easier by allowing us to specify parameters by keyword, as in
When there is more than one optional parameter, as in this case, it is usual to specify values by keyword.
The other two parameters have the default value None, which is a very common choice for a default. We’ll ignore cmp for the time being, but the key parameter expects to be passed a function. That function should be one that, when called on an item in the list, returns the value that should be used in comparisons for that item. This makes it possible to sort lists of tuples, lists, etc., in interesting ways. Given the list
we could sort it lexicographically with
We could instead sort by the second element of each tuple by defining a function and passing it for key:
There are lots of ways to use something like this. For example, if we had a list of strings that we wanted to sort in a case-insensitive manner, we could do it thus
The sort() method works only on a list, and it changes the list on which it is invoked—that is what is meant by the “in place” in the description. There is also generic function sorted() that works on any iterable sequence and produces a new sequence (list) that is in sorted order. If you pass a list to sorted(), the original list is not changed. For example
Be sure that you do the following:
Review the section on “Standard output and standard error” from last week’s lab handout. Be sure that both occurs.py and words.py (from last week) are well-behaved when used in a pipeline.
As before, write your reflections on the exercise in a file notes5.txt in the same directory as occurs.py.
To turn in your program, make sure you are in the directory with your file occurs.py, then execute the command
If you’ve modified words.py and want to include that,
This is due before lab on February 18.