After this lab you will be able to
A program's working memory of stack and heap data structures only exists while the program is running. To store data between runs, and to capture output, we can use the filesystem. The filesystem is a service provided by the OS that provides files to your programs. A file is like a named array of bytes, and once created a file will persist until deleted, even when the computer is turned off.
You are familiar with files: text files like C source code; sound files like MP3s; executable files like your compiled programs. At the filesystem abstraction level, these are all the same thing: just a contiguous sequence of bytes. The interpretation of these bytes is up to your program.
Files are a common special case of the general problem of storing data outside a running program. In general this is called External Data Representation (XDR). Other examples of XDR occur when using databases or networking.
Files are identified by a path, which is a generalization of a filename and can be any of:
OPEN( path, mode ): opens a file for reading and/or writing, possibly creating a new file if path does not exist. These options are defined by the mode. Returns an identifier that the other functions use to identify an open file. The file has a length in bytes, which is initially zero for a new file, and a current read/write position which is an index into the bytes of the file at which read and write operations will do their work. After OPEN() the initial read/write position is either at the beginning or end of the file depending on the mode.
WRITE( ID, source, length ): writes length bytes from source into the file, starting from the current read/write position and overwriting anything already there. The length of the file will increase automatically if necessary. When the write has finished, the current read/write position is set to one beyond the data written.
READ( ID, dest, length ): reads length bytes from the file into dest. When the read is finished, the current read/write position is set to one beyond the data read.
CLOSE( ID ): closes the file, indicating to the OS that we have finished using it.
Almost every programming language supports a version of this interface. You may recognize it from Python. For the C programmer, this interface is provided by these four standard library functions declared in stdio.h:
FILE * fopen( const char * filename, const char * mode);
size_t fwrite( const void * ptr, size_t size, size_t nitems, FILE * stream);
size_t fread( void * ptr, size_t size, size_t nitems, FILE * stream);
int fclose( FILE *stream);
These calls closely match their abstract versions, except that read and write have a convenient extension that makes it easy to work with structs (see example code above). The following links give the specifications of each of these functions according to the Open Group standard:
Documentation is also available as man pages on your local computer. The advantages of the Open Group specifications are that they are sometimes better written, cover only the functionality supported by all standard implementations and often contain examples. The man pages will contain details that are specific to your local OS.
You should get used to reading documentation in these forms.
Unless you have a good reason, stick to the standard interfaces. This will make it easier (i) to port your code to another OS; and (ii) to find another programmer who can understand it. Also, new versions of OS are more likely to implement the standard than to retain their previous quirks.
These functions are a masterpiece of interface design. fopen() has the most complex functionality, but a very simple interface. fwrite() and fread() have the same interface to opposite functionality. Your calls to read and write look exactly the same, which makes it easy to write them correctly.
fseek() : repositions the current read/write location.
feof() : tells you if the end-of-file is reached.
ftell() : returns the current read/write location.
ftruncate() : truncate a file to a specified length.
stat() : get file status
Examples of using the file API as demonstrated in class, and beyond. Background on files and links to the interface specifications are provided below.
#include <stdio.h>

int main( int argc, char* argv[] )
{
  const size_t len = 100;
  int arr[len];

  // put data in the array
  // ...

  // write the array into a file (error checks omitted)
  FILE* f = fopen( "myfile", "w" );
  fwrite( arr, sizeof(int), len, f );
  fclose( f );

  return 0;
}
#include <stdio.h>

int main( int argc, char* argv[] )
{
  const size_t len = 100;
  int arr[len];

  // read the array from a file (error checks omitted)
  // note the "r" mode: opening with "w" would erase the file
  FILE* f = fopen( "myfile", "r" );
  fread( arr, sizeof(int), len, f );
  fclose( f );

  // use the array
  // ...

  return 0;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  int x,y,z;
} point3d_t;

int main( int argc, char* argv[] )
{
  // expects two arguments: the number of points and a filename
  if( argc < 3 )
    {
      printf( "usage: %s <count> <filename>\n", argv[0] );
      return 1;
    }

  const size_t len = atoi(argv[1]);

  // array of points to write out
  point3d_t wpts[len];

  // fill with random points
  for( size_t i=0; i<len; i++ )
    {
      wpts[i].x = rand() % 100;
      wpts[i].y = rand() % 100;
      wpts[i].z = rand() % 100;
    }

  // write the struct to a file (error checks omitted)
  FILE* f1 = fopen( argv[2], "w" );
  fwrite( wpts, sizeof(point3d_t), len, f1 );
  fclose( f1 );

  // array of points to read in from the same file
  point3d_t rpts[len];

  // read the array from a file (error checks omitted)
  FILE* f2 = fopen( argv[2], "r" );
  fread( rpts, sizeof(point3d_t), len, f2 );
  fclose( f2 );

  if( memcmp( wpts, rpts, len * sizeof(rpts[0]) ) != 0 )
    puts( "Arrays differ" );
  else
    puts( "Arrays match" );

  return 0;
}
This example shows the use of a simple file format that uses a short "header" to describe the file contents, so that an object of unknown size can be loaded.
Make sure you understand this example in detail. It combines elements from the examples above into a simple but realistic implementation of a file format.
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// img_t is defined elsewhere (not shown here); this code uses its
// cols, rows and data fields.

/* saves an image to the filesystem using the file format:
   [ cols | rows | pixels ]
   where:
   cols is a uint32_t indicating image width
   rows is a uint32_t indicating image height
   pixels is cols * rows of uint8_ts indicating pixel grey levels
*/
int img_save( const img_t* img, const char* filename )
{
  assert( img );
  assert( img->data );

  FILE* f = fopen( filename, "w" );
  if( f == NULL )
    {
      puts( "Failed to open image file for writing" );
      return 1;
    }

  // write the image dimensions header
  uint32_t hdr[2];
  hdr[0] = img->cols;
  hdr[1] = img->rows;

  if( fwrite( hdr, sizeof(uint32_t), 2, f ) != 2 )
    {
      puts( "Failed to write image header" );
      fclose( f ); // do not leak the open file on error
      return 2;
    }

  const size_t len = img->cols * img->rows;

  if( fwrite( img->data, sizeof(uint8_t), len, f ) != len )
    {
      puts( "Failed to write image pixels" );
      fclose( f );
      return 3;
    }

  fclose( f );
  return 0;
}

/* loads an img_t from the filesystem using the same format as img_save().
   Warning: any existing pixel data in img->data is not free()d.
*/
int img_load( img_t* img, const char* filename )
{
  assert( img );

  FILE* f = fopen( filename, "r" );
  if( f == NULL )
    {
      puts( "Failed to open image file for reading" );
      return 1;
    }

  // read the image dimensions header:
  uint32_t hdr[2];
  if( fread( hdr, sizeof(uint32_t), 2, f ) != 2 )
    {
      puts( "Failed to read image header" );
      fclose( f ); // do not leak the open file on error
      return 2;
    }

  img->cols = hdr[0];
  img->rows = hdr[1];

  // helpful debug:
  // printf( "read header: %u cols %u rows\n",
  //         img->cols, img->rows );

  // allocate array for pixels now we know the size
  const size_t len = img->cols * img->rows;
  img->data = malloc( len * sizeof(uint8_t) );
  assert( img->data );

  // read pixel data into the pixel array
  if( fread( img->data, sizeof(uint8_t), len, f ) != len )
    {
      puts( "Failed to read image pixels" );
      fclose( f );
      return 3;
    }

  fclose( f );
  return 0;
}
Usage:
- img_t img;
- img_load( &img, "before.img" );
- image_frobinate( img ); // manipulate the image somehow
- img_save( &img, "after.img" );
Extend the functionality of your integer array from Lab 5 to support saving and loading arrays from the filesystem in a binary format.
Fetch the header file "intarr.h". It contains these new function declarations:
/* LAB 6 TASK 1 */

/*
  Save the entire array ia into a file called 'filename' in a binary
  file format that can be loaded by intarr_load_binary(). Returns
  zero on success, or a non-zero error code on failure. Arrays of
  length 0 should produce an output file containing an empty array.
*/
int intarr_save_binary( intarr_t* ia, const char* filename );

/*
  Load a new array from the file called 'filename', that was
  previously saved using intarr_save_binary(). Returns a pointer to a
  newly-allocated intarr_t on success, or NULL on failure.
*/
intarr_t* intarr_load_binary( const char* filename );
Commit the single file "t1.c" to your repo in the lab 6 directory.
Extend the functionality of your integer array from Lab 5 to support saving and loading arrays from the filesystem in JSON, a common human- and machine-readable text format.
Sometimes it is useful for humans to be able to read your stored data, or to import your data into another program that does not understand your binary format. The most readable, portable XDR format is plain text. A popular syntax for text files is JSON (JavaScript Object Notation), which, as the name suggests, was originally an XDR format for web programs. It is easier to use and less verbose than the also-popular Extensible Markup Language (XML) and more expressive than the bare-bones Comma-Separated Values (CSV) formats you may have seen.
The downside of text formats is that they are:
The header file "intarr.h" also contains these new function declarations:
The standard library has two functions that can be very helpful for rendering text into files: fprintf() and fscanf().
They work just like the familiar printf() and scanf() but write to and read from FILE* objects instead of standard output and standard input. You should probably use these to solve this task. Notice from their man pages that another pair of functions, snprintf() and sscanf(), is available to print to and scan from C strings. (sprintf() exists, but its lack of array length checking means it is not safe or secure to use. Always use snprintf().)
/* LAB 6 TASK 2 */

/*
  Save the entire array ia into a file called 'filename' in a JSON
  text file array file format that can be loaded by
  intarr_load_json(). Returns zero on success, or a non-zero error
  code on failure. Arrays of length 0 should produce an output file
  containing an empty array.

  The JSON output should be human-readable.

  Examples:

  The following line is a valid JSON array:
  [ 100, 200, 300 ]

  The following lines are a valid JSON array:
  [
   100,
   200,
   300
  ]
*/
int intarr_save_json( intarr_t* ia, const char* filename );

/*
  Load a new array from the file called 'filename', that was
  previously saved using intarr_save_json(). The file may contain an
  array of length 0. Returns a pointer to a newly-allocated intarr_t
  on success (even if that array has length 0), or NULL on failure.
*/
intarr_t* intarr_load_json( const char* filename );
Commit the single file "t2.c" to your repo in the lab 6 directory.