CS 223: Data Structures. Instructor: Jim Aspnes
The Zoo is located on the third floor of Arthur K. Watson Hall, toward the front of the building.
More information about the Zoo can be found through its top-level web page.
Most people run Unix with a command-line interface provided by a shell. Each line typed to the shell tells it what program to run (the first word in the line) and what arguments to give it (remaining words). The interpretation of the arguments is up to the program.
When you log in to a Zoo node directly, you may not automatically get a shell window. If you use the default login environment (which puts you into the Gnome window manager), you can click on the foot in the middle of the toolbar to pop up a terminal emulator from which you can run emacs, gcc, and so forth.
Most of what one does with Unix programs is manipulate the filesystem. Unix files are unstructured blobs of data whose names are given by paths consisting of a sequence of directory names separated by slashes: for example /home/accts/some-user/cs223/hw1.c. At any time you are in a current working directory (type pwd to find out what it is and cd new-directory to change it). When working on files in the current working directory you only have to give the part of the pathname after the last slash.
Here are some handy Unix commands:
In order to write your programs you will need to use some sort of text editor. There are three reasonable text editors on Linux: Vi, Emacs, and gedit. My personal preference is for Vi, but almost everybody likes Emacs better.
To start Emacs, type emacs at the command line. If you are actually sitting at a Zoo node it should put up a new window. If not, Emacs will take over the current window. If you have never used Emacs before, you should immediately type C-h t (this means hold down the Control key, type h, then type t without holding down the Control key). This will pop you into the Emacs built-in tutorial.
General note: C-x means hold down Control and press x; M-x means hold down Alt (Emacs calls it "Meta") and press x. For M-x you can also hit Esc and then x.
If you don't find yourself liking Emacs very much, you might want to try Vim instead. Vim is a vastly enhanced reimplementation of the classic vi editor, which I personally find easier to use than Emacs. Type vimtutor to run the tutorial. You can always get out by hitting the Escape key a few times and then typing :qa!.
A C program will typically consist of one or more files whose names end with .c. To compile foo.c, you can type gcc foo.c. Assuming foo.c contains no errors egregious enough to be detected by the extremely forgiving C compiler, this will produce a file named a.out that you can then execute by typing ./a.out.
If you want to debug your program using gdb or give it a different name, you will need to use a longer command line. Here's one that compiles foo.c to foo (run it using ./foo) and includes the information that gdb needs:
gcc -g3 -o foo foo.c
By default, gcc doesn't check everything that might be wrong with your program. But if you give it a few extra arguments, it will warn you about many (but not all) potential problems:
gcc -g3 -Wall -std=c99 -pedantic -o foo foo.c
For complicated programs involving multiple source files, you are probably better off using make than calling gcc directly. Make is a "rule-based expert system" that figures out how to compile programs given a little bit of information about their components.
For example, if you have a file called foo.c, try typing make foo and see what happens.
In general you will probably want to write a Makefile, which is named Makefile or makefile and tells make how to compile programs in the same directory. Here's a typical Makefile:
# Any line that starts with a sharp is a comment and is ignored
# by Make.

# These lines set variables that control make's default rules.
# We STRONGLY recommend putting "-Wall -ansi -pedantic" in your CFLAGS.
CC=gcc
CFLAGS=-g3 -Wall -ansi -pedantic

# The next line is a dependency line.
# It says that if somebody types "make all"
# make must first make "hello-world".
# By default the left-hand-side of the first dependency is what you
# get if you just type "make" with no arguments.
all: hello-world

# How do we make hello-world?
# The dependency line says you need to first make hello-world.o
# and hello-library.o
hello-world: hello-world.o hello-library.o
# Subsequent lines starting with a TAB character give
# commands to execute. Note the use of the CC and CFLAGS
# variables.
	$(CC) $(CFLAGS) -o hello-world hello-world.o hello-library.o
	echo "I just built hello-world! Hooray!"

# We can also declare that several things depend on one thing.
# Here we are saying that hello-world.o and hello-library.o
# should be rebuilt whenever hello-library.h changes.
# There are no commands attached to this dependency line, so
# make will have to figure out how to do that somewhere else
# (probably from the builtin .c -> .o rule).
hello-world.o hello-library.o: hello-library.h

# Command lines can do more than just build things. For example,
# "make test" will rebuild hello-world (if necessary) and then run it.
test: hello-world
	./hello-world

# This lets you type "make clean" and get rid of anything you can
# rebuild. The -f tells rm not to complain about files that aren't
# there.
clean:
	rm -f hello-world *.o
Given a Makefile, make looks at each dependency line and asks: (a) does the target on the left-hand side exist, and (b) if it does, is it older than any of the files it depends on? If the target is missing or out of date, make looks for a set of commands for rebuilding it, after first rebuilding any of the files it depends on; the commands it runs will be underneath some dependency line where the target appears on the left-hand side. It has built-in rules for doing common tasks like building .o files (which contain machine code) from .c files (which contain C source code). If you have a fake target like all above, it will try to rebuild everything all depends on because there is no file named all (one hopes).
Make really really cares that the command lines start with a TAB character. TAB looks like eight spaces in Emacs and other editors, but it isn't the same thing. If you put eight spaces in (or a space and a TAB), Make will get horribly confused and give you an incomprehensible error message about a "missing separator". This misfeature is so scary that I avoided using make for years because I didn't understand what was going on. Don't fall into that trap: make really is good for you, especially if you ever need to recompile a huge program when only a few source files have changed.
Few programs do exactly what you expect on the first try. Sometimes it's not too hard to figure out why your program is misbehaving, but sometimes you have to look closely at what it's doing.
Let's look at a contrived example. Suppose you have the following program bogus.c:
#include <stdio.h>

/* Print the sum of the integers from 1 to 1000 */
int
main(int argc, char **argv)
{
    int i;
    int sum;
    sum = 0;
    for(i = 0; i -= 1000; i++) {
        sum += i;
    }
    printf("%d\n", sum);
    return 0;
}
Let's compile and run it and see what happens:
$ gcc -g3 -o bogus bogus.c
$ ./bogus
-34394132
$
That doesn't look like the sum of 1 to 1000. So what went wrong? If we were clever, we might notice that the test in the for loop is using the mysterious -= operator instead of the <= operator that we probably want. But let's suppose we're not so clever right now: it's four in the morning, we've been working on bogus.c for twenty-nine straight hours, and there's a -= up there because in our befuddled condition we know in our bones that it's the right operator to use. We need somebody else to tell us that we are deluding ourselves, but nobody is around this time of night. So we'll have to see what we can get the computer to tell us.
The first thing to do is fire up gdb, the debugger. This runs our program in stop-motion, letting us step through it a piece at a time and watch what it is actually doing. In the example below gdb is run from the command line. You can also run it directly from Emacs with M-x gdb, which lets Emacs track and show you where your program is in the source file with a little arrow.
$ gdb bogus
GNU gdb 4.17.0.4 with Linux/x86 hardware watchpoint and FPU support
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run
Starting program: /home/accts/aspnes/tmp/bogus
-34394132

Program exited normally.
So far we haven't learned anything. To see our program in action, we need to slow it down a bit. We'll stop it as soon as it enters main, and step through it one line at a time while having it print out the values of the variables.
(gdb) break main
Breakpoint 1 at 0x8048476: file bogus.c, line 9.
(gdb) run
Starting program: /home/accts/aspnes/tmp/bogus

Breakpoint 1, main (argc=1, argv=0xbffff9ac) at bogus.c:9
9           sum = 0;
(gdb) display sum
1: sum = 1
(gdb) n
10          for(i = 0; i -= 1000; i++)
1: sum = 0
(gdb) display i
2: i = 0
(gdb) n
11              sum += i;
2: i = -1000
1: sum = 0
(gdb) n
10          for(i = 0; i -= 1000; i++)
2: i = -1000
1: sum = -1000
(gdb) n
11              sum += i;
2: i = -1999
1: sum = -1000
(gdb) n
10          for(i = 0; i -= 1000; i++)
2: i = -1999
1: sum = -2999
(gdb) quit
The program is running. Exit anyway? (y or n) y
$
Here we are using break main to tell gdb to stop the program as soon as it enters main, display to tell it to show us the value of the variables i and sum whenever it stops, and n (short for next) to execute the program one line at a time.
When stepping through a program, gdb displays the line it will execute next as well as any variables you've told it to display. This means that any changes you see in the variables are the result of the previous displayed line. Bearing this in mind, we see that i drops from 0 to -1000 the very first time we hit the top of the for loop and drops to -1999 the next time. So something bad is happening in the top of that for loop, and if we squint at it a while we might begin to suspect that i -= 1000 is not the nice simple test we might have hoped it was.
In general, the idea behind debugging is that a bad program starts out sane, but after executing for a while it goes bananas. If you can find the exact moment in its execution where it first starts acting up, you can see exactly what piece of code is causing the problem and have a reasonably good chance of being able to fix it. So a typical debugging strategy is to put in a breakpoint (using break) somewhere before the insanity hits, "instrument" the program (using display) so that you can watch it going insane, and step through it (using next, step, or breakpoints and cont) until you find the point of failure. Sometimes this process requires restarting the program (using run) if you skip over this point without noticing it immediately.
For large or long-running programs, it often makes sense to do a binary search to find the point of failure. Put in a breakpoint somewhere (say, on a function that is called many times or at the top of a major loop) and see what the state of the program is after going through the breakpoint 1000 times (using something like cont 1000). If it hasn't gone bonkers yet, try restarting and going through 2000 times. Eventually you bracket the error as occurring (for example) somewhere between the 4000th and 8000th occurrence of the breakpoint. Now try stepping through 6000 times; if the program is looking good, you know the error occurs somewhere between the 6000th and 8000th breakpoint. A dozen or so more experiments should be enough to isolate the bug to a specific line of code.
The key to all debugging is knowing what your code is supposed to do. If you don't know this, you can't tell the lunatic who thinks he's Napoleon from the lunatic who really is Napoleon. If you're confused about what your code is supposed to be doing, you need to figure out what exactly you want it to do. If you can figure that out, often it will be obvious what is going wrong. If it isn't obvious, you can always go back to gdb.
Fri 03 May 2002 23:06:16 EDT howto.tyx Copyright © 1998-2002 by Jim Aspnes