Lab 7: Testing and Lists

Testing code for correctness; linked lists

Goals

This lab provides:

  1. practice writing tests to determine the correctness of functions.
  2. implementing operations on linked lists
  3. write an efficient list sort function

Setup

Note that these instructions no longer include all steps. You should know how to do these things by now. If in doubt, refer to previous lab instructions.

In the terminal:

  1. Add the new directory '7' to the root directory of your repository. This will be the working directory for all the instructions below.
  2. Fetch the new material for Lab 7, stored in a compressed tarball:
    $ wget http://www.cs.sfu.ca/~vaughan/127/lab7.tgz
  3. Expand the archive into the working directory:
    $ tar xzvf lab7.tgz

Make sure you read and underdstand the Guide section below. Lists have been described in CMPT125, but it doesn't hurt to see it again.

Drawing diagrams like those below is very helpful before writing code that modifes lists. In general, sketching machines - including code - is a tool for thinking: even if you think you know what you are doing, drawing out the operations is a concrete test of your understanding. Even very experienced people with great intuition use drawings for thinking.


Guide: Linked Lists

Linked lists are data structures that contain a sequence of data elements, like arrays, but with different dynamic properties.

The key idea in the linked list is to use a simple data structure to store each data element along with a pointer to the next element in the list. The end of the list is denoted by NULL next-pointer.

Our implementation is typical in that it uses a second data structure called a header to store pointers to the first (head) and last (tail) elements in the list.

The list is assembled as follows:

First a list_t structure is allocated on the heap, with its head- and tail-pointers set to NULL, representing an empty list.

To insert the first value into the list, a new element_t is allocated on the heap, the value is stored in it, and the header's head- and tail-pointers are both set to point to it. The first element's next-pointer is NULL to indicate it is the last element in the list.

When a subsequent element is added, the next-pointer of the tail element and the tail-pointer of the header are both changed to the address of the new element:

One more addition using the same mechanism. Notice that the tail element always has its next-pointer set to NULL.

Notice that the head- and tail-pointers can be used to add elements to the beginning or end of the list in constant time (O(1)), in contrast to extending native C arrays which in general takes time proportional to the number of elements in the array, i.e. O(n).

Also, given a pointer to any element, an element can be inserted after it in constant time.

However, looking up a list element by its position in the list, like an array index, takes time proportional to its position in the list, i.e. O(n). List elements are therefore best accessed in order, since accessing the next element takes constant time.

A common variant is the double-linked list in which every element contains a previous-pointer in addition to the next-pointer. Double-linked lists can be traversed forwards and backwards, at the cost of a little more storage space per element.

It is important to note that while lists have good theoretical resizing properties, iterating over elements in a list can be much slower than iterating over C arrays due to the cache behaviour of current computer architectures. C arrays guarantee that elements are contiguous in memory, making very effective use of the cache. We will talk about cache behaviour later in labs.

Task 1

There is no Task 1. The supplied file t1.c is just used in the lab demo. Move on to Task 2.

Tasks 2..6

The tarball contains a header file list.h that contains an interface specification for a linked-list-of-integers data structure. See the Guide below for notes on linked lists. Six slightly different implementations are provided, in files tN.c where N = [2..6]. The file main.c contains a very weak test program for the linked list code.

The Makefile will build programs t2 through t6, each linking the same main.c with one of the list implementation C files.

Build each program by naming it as your make target: e.g.

$ make t2
$ make t5

Running the resulting programs, you will see that every one passes the test in main.c. At this point the beginner might relax and load up Minecraft. But we have left blissful ignorance behind us. We are suspicious. In fact all of these implementations contain bugs.

Your task is to extend main.c to thoroughly test the list implementations. Your program must reliably distinguish all these faulty implementations from a correct one.

The server tests work as follows: Your main.c will be compiled with each of tN.c as well as a bug-free version (not supplied to you). You pass Task N if your program consistently returns 1 when it runs the buggy code, but 0 when it runs the bug-free code.

Requirements:

  1. A program built from your main.c and linked against any implementation of the functions in list.h must return 0 if the functions are bug-free, or 1 if they contain one or more bugs.
  2. Preferably, your program should not crash or halt on assert(). But a crash or assertion will be recorded as indicating the code contained bugs, just like on the server.
  3. Preferably, print an explanatory error message on stdout describing the problem you discovered.
  4. You may produce (a sensible amount) of other text output on stdout or stderr if you wish.
  5. The server will not test your text output: only the return value. Try to make the text output helpful for yourself or an instructor/TA helping you.

Submission

Commit a revised version of main.c. This will be linked against each of the buggy list implementations tN.c on the server in the same way you did locally. The server also has a correct version of the code. Your program should return zero when linked to the correct code, and non-zero when linked to any of the buggy versions.

Task N will be passed if your program can reliably distinguish between a buggy and correct version of the code. To test reliability, your test program will be run several times. It must correctly detect bugs or no-bugs every time.

A useful thing to know about the behaviour of the standard library

Recall that in foo_t* p = malloc( sizeof(foo_t) );, malloc() returns the base address of (i.e. a pointer to) a chunk of memory on the heap the size of foo_t. When you are finished with the memory, you a return it to the system with free(p);. Now, if you immediately malloc( sizeof(foo_t)) again, malloc() is very likely to give you the exact same piece of memory back again. This is not required by the specification, but it gives very good performance.

This fact is very handy for debugging, since we can exploit the fact that the contents of the memory chunk will probably be unchanged between the free() and malloc(). For example, if you set the entire contents of the memory chunk to be non-zero before you free it, it will not contain any NULL pointers when you get it back again. You can also set data fields to particular memorable values. You can check for them later and you might infer that these fields were not set by an intialization function.

Task 7

Requirements:

  1. Commit a new file called t7.c containing correct implementations of all the functions described in list.h.
  2. You may use any piece(s) of the supplied code, or write your own.

Submission

Commit a single C file called t7.c.

Task 8

Write a function that sorts an instance of our linked list of integers from smallest to largest value.

Requirements:

  1. Commit a new file called t8.c containing at least the function list_sort()
  2. Your file must not contain a main() function or change the structures in list.h.
  3. The function must #include the list.h header...
  4. ... and be consistent with the function declaration:
     void list_sort( list_t* list ); 
  5. Use an efficient algorithm.
  6. Did you read requirement 5?

Submission

Commit a single C file called t8.c.

Lab complete. Back to the list of labs.