Lab 5: Composite Data Types

struct and typedef; malloc() practice

Goals

After this lab you will be able to

  1. Create and use structured composite data types with typedef and struct.
  2. Implement an interface to a collection using structures.

Setup

In the terminal:

  1. Create and svn add the new directory '5' to the root directory of your repository, then make it your working directory. (Exact commands ommitted - you should know how to do this by now. Refer to earlier labs if necessary).
  2. Fetch the header file containing Lab 5 tasks:
    $ wget http://www.cs.sfu.ca/~vaughan/127/intarr.h

    You can also get the headerfile from your browser at http://www.cs.sfu.ca/~vaughan/127/intarr.h


Guide

Structures

It is often useful to collect multiple data items together into a single logical entity: a composite data type. For example, consider the raster images from Lab 3, each described by a pointer to an array of data, a width and height. All three items are required to interpret the encoded image. Most languages provide a mechanism to collect such a set into an object called a structure or class. C provides the struct keyword for this purpose. For example:

struct { 
  uint8_t* pixels;
  unsigned int cols;
  unsigned int rows;
} img;

This declaration creates a new variable called img, that collects the variables we need for one image.

The components, called fields or members, of the structure are accessed using "dot" syntax, so we can initialize our image like so:

img.cols = 640;
img.rows = 480;
img.pixels = malloc( img.cols * img.rows * sizeof(uint8_t));

The structure is implemented in the simple and fast sort of way you would expect from C. It is just a chunk of memory with space for all its fields. The dot syntax is interpreted at compile time as a number of bytes offset from the structure's base address, so these things are true:

sizeof(img) == sizeof(uint8_t*) + sizeof(int) + sizeof(int)

&img.pixels == (void*)&img + 0
&img.cols == (void*)&img + sizeof(uint8_t*)
&img.rows == (void*)&img + sizeof(uint8_t*) + sizeof(int)

This means that C structures have almost no overhead on CPU or memory use at run-time.

An occasion when this is not quite true is that the compiler may add some empty space called padding between fields so that each field starts at the CPU's favoured memory alignment boundary. Most modern CPUs are fastest when accessing memory at 4- or 8-byte boundaries. Inserting padding wastes a little bit of memory. Unless memory is very tight we usually ignore these details and stick to the dot syntax.

More information on memory alignment, including padding can be found here.

Typedef

The struct keyword as used above creates just one instance of a variable. Since we may want to create lots of images, or pass pointers to them into functions, we can create a new type based on our structure definition using the typedef keyword:

typedef struct { 
  uint8_t* pixels;
  unsigned int cols;
  unsigned int rows;
} img_t;

With the typedef prefix, instead of declaring a variable, we have declared a new type called img_t. It is conventional to use "_t" as a suffix on defined types. Having defined a type, we can create instances of it on the stack as shown below, creating two variables of type img_t, and a pointer-to-img_t. As the example shows, once typedef'd we can use our new type just like we use int, char, etc.

typedef struct { 
  uint8_t* pixels;
  unsigned int cols;
  unsigned int rows;
} img_t;

img_t img1;
img_t img2;
img_t* imgptr = &img1;

Allocating structures on the heap

Like any other type, space can be allocated for structures on the heap with malloc(). The sizeof() macro also works as it does for any other type, so the heap-allocation process looks very familiar:

img_t* imgptr = malloc( sizeof(img_t) );

This is an important mechanism because it allows functions to return pointers to newly-created structures. Recall from Lab 2 that returning pointers to local variables like this:

  1. // ...
  2. img_t img;
  3. return &img; // OOPS!
  4. }
is a serious bug as the stack memory allocated for variable img is freed when the function returns. Returning a pointer to heap memory provided by malloc() is safe:
  1. // ...
  2. img_t* imgptr = malloc( sizeof(img_t) );
  3. if( imgptr == 0 )
  4. {
  5. printf( "Warning: failed to allocate memory for an image structure\n" );
  6. }
  7.  
  8. return imgptr; // safe to return a pointer provided by malloc()
  9. }

Pointer indirection

There is one more bit of syntax to learn: how to access the fields of a structure through a pointer. We have two choices: either "look inside" the pointer using the normal '*' syntax, followed by the dot syntax to access the field:

int width = (*imgptr).cols;

The parentheses are necessary, since the '.' operator has precedence over the '*' operator.

Alternatively we can use the indirection arrow syntax to "look through" the pointer:

int width = imgptr->cols;

These two are exactly equivalent. The indirection arrow is arguably neater and is preferred.

User-defined types as function arguments

You may use any defined type for function arguments and return values, for example something like this:

void draw_image( uint8_t* pixels, 
                 unsigned int cols, 
                 unsigned int rows )
{ ... }

// call the function above
draw_image( arr, w, h );

Can be considerably simplified:

void draw_image( img_t* img )
{ ... }

// call the function above
draw_image( &img1 );

When a structure variable is assigned, as on line 6 below, the structure on the right-hand-side of the assignment is copied byte-for-byte into the variable on the left-hand-side.

  1. img_t original;
  2. original.cols = 32;
  3. original.rows = 32;
  4. original.pixels = malloc( w*h*sizeof(uint8_t);
  5.  
  6. img_t duplicate = original;
  7.  
  8. assert( duplicate.rows == original.rows ); // always true
  9. assert( duplicate.cols == original.cols ); // always true
  10. assert( duplicate.pixels == original.pixels ); // always true

Note that in this example, the original.pixels and duplicate.pixels fields have the same value. They point to the same array of pixels obtained by return from malloc() on line 4: the array is not duplicated, we just have two identical pointers to it. This is known as a shallow copy. Be very careful when copying structures that contain pointers, since this may not be the behaviour you want.

Also be aware that if you free( original.pixels ) any subsequent use of the duplicate.pixels pointer is a bug, since the memory it points to has been freed. Having more than one copy of a pointer is called pointer aliasing, and is a notorious source of bugs. If possible, avoid having more than one copy of a pointer.

Arrays of structs

Arrays of typedef'd structs work in the same way as any other type:

// small array on the stack
img_t imgarr[100];
imgarr[43].cols = 640;
unsigned int width = imgarr[43].cols;

// large array on the heap
img_t* thousand_images = malloc( 1000 * sizeof(img_t) );

When to use typedef struct

Deciding when to create a new structured type is an important part of program design. Your type choices can have effects throughout your code. Some rules of thumb:

You should probably use a struct when a set of variables always appears together, or are jointly responsible for something, e.g. interpreting an encoding in our image example.

You should probably use typedef when you need to declare more than once instance of a structure, and when using a structure for function arguments, since the resulting code is easier to read.

Tasks [1:8]

  1. Your task is to implement the the integer array functions declared and specified in the supplied header file intarr.h. These are introduced in the guided part of the lab.

  2. Your implementation should be entirely contained in a C source file called intarr.c (full path inside your repo: 4/intarr.c).

  3. Write your own test code in another source file to make sure your functions are correct.

  4. Important: DO NOT add a main() function to intarr.c. This will prevent the server's test program from linking the file.

  5. Important: DO NOT require modifications to intarr.h: the server will use the file as supplied to you. In particular, your function definitions must match the declararions given in intarr.h

Submission

Add and commit the file intarr.c to svn. The server will run code to test all the functions.

Lab complete. Back to the list of labs.