Lab 12: Working with C

The goal of this lab is to practice writing in C, particularly learning how to manage a small C project.

1. Introduction

Working with a project in C is different from managing a Java project in a variety of ways. In this lab we'll step through some practical considerations for developing even a small program in C.

Begin by copying code that I've given you from the course directory.

cp /cslab.all/ubuntu/cs245/lab12/* .

2. Writing functions

Begin by editing the file gcd.c. This contains an unfinished program for computing the greatest common divisor of 32 and 56. It expects two functions: gcd_it(), which computes the gcd using an iterative algorithm, and gcd_rec(), which computes the same result recursively.

Write these two functions. Try to use the C conventions, such as moving curly braces to the next line. Remember to compile with gcc gcd.c and run with ./a.out. When it works, move to the next step.
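To illustrate the brace convention and the iterative/recursive pairing without giving away the gcd exercise, here is a minimal sketch for a different function, factorial. The names fact_it and fact_rec are hypothetical and are not part of the lab files.

```c
/* Iterative version: multiply the numbers 2..n together. */
int fact_it(int n)
{
    int result = 1;
    int i;

    for (i = 2; i <= n; i++)
    {
        result *= i;
    }
    return result;
}

/* Recursive version: n! = n * (n-1)!, with 0! = 1! = 1. */
int fact_rec(int n)
{
    if (n <= 1)
    {
        return 1;
    }
    return n * fact_rec(n - 1);
}
```

Both functions return the same value for the same argument; only the way the repetition is expressed differs.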

3. Writing reusable functions

As with any programming language, we would like to write code that is reusable. We would like to write functions that can be plugged into other applications. This is done in C by making libraries. Actually, "library" is not a C-specific term; with any programming language, people talk about libraries. They merely take different forms in different languages. In Java, a library is a class or a package. In C, a library is a collection of functions and other constructs (such as those used to define new types).

The easiest way to make a library in C-- though not the best-- is to put reusable code in what's called a header file. To make a header file for your gcd functions, do the following steps:

Open a new file called gcd-lib.h. This will be your header file.
Cut the two functions out of gcd.c and paste them into gcd-lib.h.
Add the following line to gcd.c.
```
#include "gcd-lib.h"
```

Now it's time to explain what exactly #include means. When a C program is compiled, the first step is for the file to be processed by a compiler component called the preprocessor, which makes a few alterations to the text of the file before the compiler starts breaking it down (for example, it strips out all comments). The programmer is able to give some commands to the preprocessor using what are called preprocessor directives. (When you learn C, you really learn two languages: C itself, and the language of the C preprocessor.)

Preprocessor directives must be at the beginning of a line, and they all begin with the # character. The directive #include tells the preprocessor to take a file and paste its contents verbatim into that place in the code. When we said

#include <stdio.h>

we were telling the preprocessor to include information about the standard I/O functions that is contained in the file stdio.h. Likewise, we are now telling it to include our new file gcd-lib.h. (The difference between the quotes and the angle brackets is that the quotes tell the preprocessor to look first in the current directory, whereas the angle brackets indicate that the file can be found in a standard system location for C libraries.)

To see the effect, try compiling gcd.c with the -E flag. This means "preprocess only." It will spit out the resulting file after preprocessing, but will not continue with the compilation.

gcc -E gcd.c

You will see it spit out a lot of code included with stdio.h, but also your code from gcd-lib.h.

Now compile and test your code, and then move on.

4. Preprocessor fun

Let's learn a few other things about the preprocessor. Besides including other files, the preprocessor can be used to define replacement values for symbols and to declare that certain code should be compiled only under certain conditions. Here's what that means and how to do it:

The directive #define is used to define a symbol, indicating that everywhere the symbol occurs in the text, it should be replaced by another piece of text. Open the file hi.c. The line

#define LIMIT 10

means every time LIMIT occurs, it is replaced by 10. Thus it can be used to define constants, or short pieces of code.
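As a small sketch of this textual replacement (this is not the contents of hi.c, and the function name sum_to_limit is made up for illustration), consider:

```c
#define LIMIT 10

/* Adds the integers 1 through LIMIT. */
int sum_to_limit(void)
{
    int total = 0;
    int i;

    /* After preprocessing, the loop header reads:
       for (i = 1; i <= 10; i++)                    */
    for (i = 1; i <= LIMIT; i++)
    {
        total += i;
    }
    return total;
}
```

The compiler proper never sees the name LIMIT; by the time it runs, the preprocessor has already substituted the text 10.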

The other important directives are #ifdef and #endif. The first directive tests whether a certain symbol has been defined, and all the code between that directive and the #endif is included only if that symbol has been defined. Likewise, #ifndef is used to test whether a symbol has not been defined.
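A minimal sketch of conditional compilation follows. The symbol USE_FAST_PATH and the function chosen_path are hypothetical names chosen only for this illustration.

```c
#define USE_FAST_PATH   /* hypothetical symbol; delete this line to flip the result */

int chosen_path(void)
{
#ifdef USE_FAST_PATH
    return 1;   /* kept only when USE_FAST_PATH is defined */
#endif
#ifndef USE_FAST_PATH
    return 0;   /* kept only when USE_FAST_PATH is not defined */
#endif
}
```

Exactly one of the two return statements survives preprocessing, so the compiler sees a function with a single return.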

Here's why one would want to use this. In very complicated C applications, there will be many libraries, some of which might depend on each other. What we do not want is redundancy: two libraries that each depend on the same third library and both include it. These preprocessor directives prevent the library from being included more than once. What we do is wrap the contents of our header file like this:

#ifndef MY_LIBRARY_SYMBOL
#define MY_LIBRARY_SYMBOL

    // library contents

    // blah blah blah

#endif

Take a moment to figure out how the process described above works.
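One way to check your reasoning is to simulate a double inclusion by hand. In the sketch below, the same guarded "header" text appears twice in one file, which is what the preprocessor would produce if two libraries both included it. The names POINT_LIB_H, struct point, and point_sum are assumptions made up for this example.

```c
/* First "inclusion": POINT_LIB_H is not yet defined, so this text is kept. */
#ifndef POINT_LIB_H
#define POINT_LIB_H
struct point
{
    int x;
    int y;
};
#endif

/* Second "inclusion": POINT_LIB_H is now defined, so everything up to the
   #endif is skipped. Without the guard, the compiler would reject the
   second definition of struct point. */
#ifndef POINT_LIB_H
#define POINT_LIB_H
struct point
{
    int x;
    int y;
};
#endif

int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Try removing the guard lines from both copies and compiling again to see the redefinition error that the guard prevents.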

Now, to see how well you understand how the preprocessor works, let's look at a few cases of misuse of the preprocessor. These examples come from Steve Oualline, Practical C Programming, O'Reilly Media, 1997.

Open and inspect the file max.c. It should count down from 10, printing Hi there each time. Try compiling it. Can you tell what the error is? (The current version of C actually treats this example a little differently than the version in use when this example was first made. It used to be that the program compiled but behaved unexpectedly when run. Ask Dr VanDrunen about it.)

Next, open and inspect size.c. Trace what the program is doing, setting the symbols SIZE and FUDGE. From reading it, one would expect the program to output "Size is 8", but that's not what happens. Compile and run it to see. Can you explain the results? (If not, compile with the -E flag to see what the preprocessor is doing.)

The file die.c, which you should open and inspect next, uses a preprocessor symbol to define what amounts to a short void function, in this case one that exits the program abruptly. This program sets a variable to a value and tests that value, exiting if it is negative. Since the variable is set to 1, we expect to reach the line that says "We did not die". Compile and run the program. Can you explain what happens?

5. Managing libraries

Despite what we did in part 3, keeping an entire library in a header file and including it in every application that needs it is not a good idea. There are three problems with it.

A non-trivial application written in C contains a very large amount of code, especially if we consider all the code in the standard C libraries. Compiling all that code from scratch would take too long. In Java, we could compile a .java file once and then use its .class file in any other application--it didn't need to be recompiled. We want the same sort of efficiency with C.
If you are a professional C-library writer, you don't want to give out the source code for your C libraries. They might contain trade secrets about the most efficient algorithms for certain tasks. You'd much rather give out compiled versions of your libraries, which are inherently obfuscated.
Putting the code in a header file breaks encapsulation, in a C sort of way. Just as in Java we want classes to be dependent only on each other's interface, not implementation, so in C we want various applications to be dependent on the prototypes of the various functions, not on their implementations (for example, helper functions, or anything else that would be private if it were written in Java). We want to be able to change the library (say, to improve efficiency) without affecting the applications that are dependent on it.

To fix this, we split the library into two parts: the header and the implementation (thus the header should really correspond to the interface in Java). The header file generally contains the prototypes for the functions that the application is expected to use. The implementation file contains the definitions of those functions.
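The split can be sketched in a single listing, with comments marking which part would live in which file. The library here (cube-lib, with a function cube and a helper square) is hypothetical and deliberately unrelated to the gcd functions you are about to move yourself.

```c
/* ---- what would go in the header, say cube-lib.h ---- */
#ifndef CUBE_LIB_H
#define CUBE_LIB_H

int cube(int x);        /* prototype only: this is the interface */

#endif

/* ---- what would go in the implementation, say cube-lib.c ---- */

/* A helper the application should not depend on. Marking it static keeps
   it invisible outside the implementation file, much like a private
   method in Java. */
static int square(int x)
{
    return x * x;
}

/* The definition that matches the prototype in the header. */
int cube(int x)
{
    return x * square(x);
}
```

An application that includes only the header can call cube() but knows nothing about square(), so the helper can be changed or removed without affecting the application's source.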

Open a new file, gcd-lib.c. Make it include gcd-lib.h. Then copy the functions from gcd-lib.h to gcd-lib.c. Finally, remove the function bodies from gcd-lib.h, leaving only the prototypes (each ending with a semicolon).

Compiling this takes three steps. First, we need to compile the library. This will not make an executable file, since there is no main function. To deal with this, we need to compile with the -c flag:

gcc gcd-lib.c -c

Do this, and look at the directory to see what happened. It produced a file gcd-lib.o, which is an "object" file. It contains the compiled versions of the functions. -c means "compile only," as opposed to compile all the pieces and link them up together into an executable application.

Next, we compile gcd.c, also with the -c flag. Because gcd.c includes the header file, the compiler knows the parameter types and return type of each function, but it does not know the code for those functions.

gcc gcd.c -c

Finally, we join the object files together by what's called linking. The idea is that the various modules are hooked together to make a complete application. Here's the command (this will also call the resulting executable file gcd rather than a.out):

gcc gcd.o gcd-lib.o -o gcd

6. Makefiles

All these commands can be a lot of work, especially if you have many libraries you're dealing with. Fortunately, this process can be automated using a tool called make.

make reads in a file (usually called makefile) which contains a recipe for performing steps in building an application from source code. A makefile consists of a series of rules, each of which consists of a target, a list of prerequisites, and a list of commands. Open the file makefile and consider the following rule:

gcd: gcd.o gcd-lib.o
        gcc gcd.o gcd-lib.o -o gcd

The target is terminated by a colon, and the prerequisites are listed on the same line. The commands are listed on the following lines, each preceded by a tab. This rule means, "to make gcd, first check if the current version of gcd (if any) is older than the current versions of gcd.o or gcd-lib.o. (In turn, check to see if gcd.o or gcd-lib.o need to be made, either because they don't exist or because they are older than the current versions of their prerequisites.) If so, perform the command gcc gcd.o gcd-lib.o -o gcd."
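The out-of-date check described in that rule can be modeled in a few lines of C. This is only a simplified sketch of the decision, not how make is actually implemented: the function needs_rebuild is hypothetical, timestamps are plain numbers, and a timestamp of 0 is assumed to mean "the file does not exist." It also leaves out the step of first bringing the prerequisites themselves up to date.

```c
#include <stddef.h>

/* Returns 1 if the target should be rebuilt, 0 if it is up to date.
   Assumption for this sketch: a timestamp of 0 means the file is missing. */
int needs_rebuild(long target_mtime, const long *prereq_mtimes, size_t n)
{
    size_t i;

    if (target_mtime == 0)
    {
        return 1;       /* the target does not exist yet */
    }
    for (i = 0; i < n; i++)
    {
        if (prereq_mtimes[i] > target_mtime)
        {
            return 1;   /* some prerequisite is newer than the target */
        }
    }
    return 0;           /* nothing changed since the target was built */
}
```

For the rule above, the target would be gcd and the two prerequisite timestamps would be those of gcd.o and gcd-lib.o.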

The advantage of using a makefile is that it will automatically check to see which pieces need to be compiled because they are out of date. For example, if you made a change to gcd-lib.c but not to either of the other files, it will recompile gcd-lib.c and re-link, but it will not recompile gcd.c.

The prerequisites specify dependencies among files. gcd depends directly on gcd.o and gcd-lib.o. gcd.o depends directly on gcd-lib.h and gcd.c.

Usually the targets in a makefile are the names of files which the rule will produce, but this is not always the case. A target that is not the name of a file is called a phony target. The most common phony target is clean, which removes extraneous files, like the object files. If you wanted to get rid of all object files and compile completely from scratch, running make with the clean target first would ensure that all object files are compiled fresh.

You can run a makefile by typing make at the command line and specifying a target. Try

make clean

If you do not specify a target, then by default it will run the first target it sees in the file. Try

make

7. To turn in

In a typescript file, cat all the files you wrote. Then run the program to demonstrate that it works. Print the file.

Thomas VanDrunen. Last modified: Tue Apr 14 12:39:51 CDT 2009