CSE 333 Exercise 7

Out:   Wednesday, January 21
Due:   Monday, January 26 by 11 AM
Rating:   3 (note)
CSE 333 Exercise Rating Scale

Each exercise this quarter is rated on an integer scale of 1 – 5, inclusive, with 1 being the "least time-consuming" and 5 being the "most time-consuming".

This difficulty scale is meant as a rough guide for you in predicting the amount of time to set aside for each exercise as you balance the work required for 333 with your other obligations. However, it is necessarily imperfect as everyone's set of circumstances and experiences with the exercises differ. If your experience with an exercise does not align with its rating, that is not a reflection of you or your abilities.

Goals

  • Use various POSIX I/O library functions with files and directories.
  • Explain the relationship between C standard library I/O and POSIX I/O.
  • Implement code that error checks I/O function calls and properly cleans up resources on every execution path.

Background

Directories

A directory is a special file that stores the names and locations of related files and directories: itself, its parent, and all of its children (i.e., the directory's contents). You can take CSE 451 to learn more about the directory structure, but for this class, you can access directory information in C/C++ using the struct dirent structure and the POSIX library functions found in dirent.h.
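
As a rough illustration of the dirent.h functions, the sketch below walks a directory and prints the name of each entry. CountEntries is a hypothetical helper name chosen for this example; it is not part of the exercise or of POSIX. The sketch ignores the return value of closedir for brevity.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>

// Hypothetical helper: prints every entry name in the directory at `path`
// and returns how many entries were seen (including "." and "..").
// Returns -1 if the directory cannot be opened or if reading it fails.
int CountEntries(const char *path) {
  DIR *dir = opendir(path);
  if (dir == NULL) {
    perror("opendir");
    return -1;
  }

  int count = 0;
  struct dirent *entry;
  // readdir returns NULL both at end-of-directory and on error, so errno
  // is reset before each call to tell the two cases apart.
  errno = 0;
  while ((entry = readdir(dir)) != NULL) {
    printf("%s\n", entry->d_name);
    count++;
    errno = 0;
  }
  int read_failed = (errno != 0);

  closedir(dir);  // release the directory stream on every path
  return read_failed ? -1 : count;
}
```

Note that d_name holds only the entry's name, not a full path, so a caller that wants to open an entry must combine it with the directory name first.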

POSIX I/O

The Portable Operating System Interface (POSIX) defines a standard set of low-level interfaces for interacting directly with the operating system on Unix-like systems. These interfaces provide a common, portable API for tasks such as opening files, reading and writing bytes, seeking within files, and working with directories.

Unlike the C standard library I/O functions (such as fopen, fread, and fgets), POSIX I/O functions (such as open, read, write, and lseek) operate at a lower level and interact more directly with the operating system. POSIX I/O is typically unbuffered and works with explicit file descriptors, making program behavior more predictable and easier to reason about in systems-level code.
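
To make the comparison concrete, here is a small sketch that performs the same task twice: once with C standard library I/O and once with POSIX I/O. Both function names are hypothetical, and the POSIX version is deliberately simplified (it does not retry interrupted reads).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: reads up to `len` bytes from the start of `path`
// using C standard library I/O. Returns the byte count, or -1 on error.
long ReadStartStdio(const char *path, char *buf, size_t len) {
  FILE *f = fopen(path, "rb");       // yields a FILE* stream
  if (f == NULL) {
    return -1;
  }
  size_t n = fread(buf, 1, len, f);  // served through the stream's buffer
  int failed = ferror(f);
  fclose(f);
  return failed ? -1 : (long) n;
}

// Hypothetical helper: the same task using POSIX I/O.
long ReadStartPosix(const char *path, char *buf, size_t len) {
  int fd = open(path, O_RDONLY);     // yields an int file descriptor
  if (fd == -1) {
    return -1;
  }
  ssize_t n = read(fd, buf, len);    // one system call; may return < len
  close(fd);
  return (long) n;                   // read returns -1 on error
}
```

The shapes match closely (open, read, close), but the POSIX version deals in a plain integer descriptor and reports errors through -1 and errno rather than through a stream object.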

In this exercise, we use POSIX I/O to expose the mechanics of file access that are normally hidden behind higher-level library abstractions. This helps build an understanding of how files and directories are actually represented and accessed by the operating system, and why careful error handling and resource management are essential in systems programming.

Buffering

POSIX I/O functions do no buffering in user space, so each call to read or write is a system call. In practice, programs often add their own buffering to reduce system call overhead and improve performance.

While you are not expected to implement buffering here, being aware of how buffering affects performance and program structure is considered good systems programming practice and will be useful in later coursework and real-world applications.
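
One way to see the cost is to count how many read calls it takes to consume a file at different request sizes. The sketch below is only an illustration and is not something the exercise asks for; CountReadCalls and its 4096-byte internal buffer are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: consumes the file at `path`, requesting `chunk`
// bytes per read (1 <= chunk <= 4096), and returns how many read calls
// completed, including the final one that reports end-of-file.
// Returns -1 on error.
long CountReadCalls(const char *path, size_t chunk) {
  char buf[4096];
  if (chunk == 0 || chunk > sizeof(buf)) {
    return -1;
  }
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    perror("open");
    return -1;
  }

  long calls = 0;
  while (1) {
    ssize_t n = read(fd, buf, chunk);
    if (n == -1) {
      if (errno == EINTR) {
        continue;  // interrupted before any data arrived; try again
      }
      perror("read");
      close(fd);
      return -1;
    }
    calls++;
    if (n == 0) {
      break;  // end of file
    }
  }
  close(fd);
  return calls;
}
```

For a 100-byte regular file, a chunk of 1 takes 101 calls (100 that return data plus one that returns 0), while a chunk of 4096 would typically take 2. Each of those calls crosses into the kernel, which is the overhead that buffering reduces.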

Problem Description

Write a C program in ex7.c that:

  1. accepts a directory name as a command-line argument. The directory name can be a simple name or a longer file path; it may or may not have a leading '/', and it may or may not have a trailing '/'.
  2. scans through the directory looking for filenames that end in the four characters .txt. You should not scan subdirectories, and you do not need to sort the directory entries; you may also assume that no subdirectory itself has a directory name ending in .txt.
  3. reads the contents of those files, copying the contents to stdout.
    • You must use the POSIX open, read, and close functions to read the files. You may use C standard library functions (e.g., fwrite) to write to stdout.
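
The filename test in step 2 can be sketched as a small string helper. EndsWithTxt is a hypothetical name, and this version assumes the comparison is case-sensitive, which the problem statement does not spell out.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: returns true if `name` ends in the four
// characters ".txt" (case-sensitive; an assumption for this sketch).
bool EndsWithTxt(const char *name) {
  const char *suffix = ".txt";
  size_t name_len = strlen(name);
  size_t suffix_len = strlen(suffix);
  if (name_len < suffix_len) {
    return false;  // too short to contain the suffix at all
  }
  // Compare only the last suffix_len characters of name.
  return strcmp(name + name_len - suffix_len, suffix) == 0;
}
```

Checking the length first matters: without it, the pointer arithmetic would step before the start of a short name such as "a".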


Implementation Notes

User Input

You will be reading in user input from a command-line argument. You should handle various inputs from the user, which may be in an unexpected format. For each input, you should take some time to reason through your options for handling it gracefully (i.e., without unexpectedly crashing), decide which one seems "best", and document your decision in your code. Refer to the Problem Description instructions above for some directory name considerations.
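
As one example of such a decision, the sketch below shows a possible way to combine a directory name and a filename so that the result is the same whether or not the directory name has a trailing '/'. JoinPath is a hypothetical helper, and rejecting an empty directory name is just one reasonable choice among several.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: writes "<dir>/<name>" into `out`, adding a '/'
// only when `dir` does not already end in one. Returns 0 on success, or
// -1 if `dir` is empty or the result does not fit in `out_size` bytes.
int JoinPath(const char *dir, const char *name, char *out, size_t out_size) {
  size_t dir_len = strlen(dir);
  if (dir_len == 0) {
    return -1;  // design choice for this sketch: reject an empty name
  }
  const char *sep = (dir[dir_len - 1] == '/') ? "" : "/";
  int n = snprintf(out, out_size, "%s%s%s", dir, sep, name);
  if (n < 0 || (size_t) n >= out_size) {
    return -1;  // encoding error, or output would have been truncated
  }
  return 0;
}
```

With this helper, both JoinPath("docs", "a.txt", ...) and JoinPath("docs/", "a.txt", ...) produce "docs/a.txt". Checking the snprintf return value avoids silently opening a truncated path.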

Testing

Testing for this exercise should reflect the scope of the problem. At a minimum, you should test your program using a small directory that you create yourself, containing a few simple text files with known contents. This makes it easy to verify correctness and reason about program behavior.

As with most systems programs, testing can be approached at multiple levels, including user input testing, input file testing, and debugging individual functions. If you are interested in more robust testing, you may find it useful to experiment with larger or more varied text files, such as those available at Online Samples. Use of external files is optional and not required.

POSIX

The POSIX functions you will likely be most interested in are: open(), close(), read(), lseek(), opendir(), readdir(), and closedir().
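
Of the functions listed, lseek is the only one that does not open, read, or close anything; it just moves a file descriptor's current offset. The sketch below shows one possible use, finding a file's size by seeking to its end. FileSizeBytes is a hypothetical helper, and whether your solution needs lseek at all is up to your design.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: returns the size in bytes of the file at `path`,
// or -1 if it cannot be opened or the seek fails.
long long FileSizeBytes(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    perror("open");
    return -1;
  }
  // Seeking 0 bytes past SEEK_END returns the offset of the end of the
  // file, which equals its size in bytes.
  off_t end = lseek(fd, 0, SEEK_END);
  if (end == (off_t) -1) {
    perror("lseek");
  }
  close(fd);
  return (long long) end;
}
```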

Error Handling & Robustness

Library functions (e.g., C standard library and POSIX) have many possible errors that may arise during execution. It is your responsibility to make sure that you handle these errors correctly by retrying in the case of recoverable errors (EAGAIN and EINTR) and returning an error status in the case of a non-recoverable error. Make sure that you clean up system resources in all possible cases.
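
A minimal sketch of this pattern follows, assuming a 1024-byte buffer and a hypothetical helper named CopyFileToStream. It retries read on EINTR and EAGAIN, treats every other failure as non-recoverable, and funnels all paths through a single close. It illustrates the error-handling shape only and is not a complete solution.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: copies the contents of the file at `path` to the
// stream `out`. Returns 0 on success and -1 on a non-recoverable error.
int CopyFileToStream(const char *path, FILE *out) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    perror(path);
    return -1;  // nothing was acquired, so nothing to clean up
  }

  char buf[1024];
  int status = 0;
  while (1) {
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n == 0) {
      break;  // end of file
    }
    if (n == -1) {
      if (errno == EINTR || errno == EAGAIN) {
        continue;  // recoverable: try the same read again
      }
      perror("read");
      status = -1;
      break;
    }
    // n may be smaller than sizeof(buf); write only what was read.
    if (fwrite(buf, 1, (size_t) n, out) != (size_t) n) {
      perror("fwrite");
      status = -1;
      break;
    }
  }

  close(fd);  // reached on success and on every error after open
  return status;
}
```

Recording the outcome in status and breaking out of the loop, rather than returning from inside it, keeps the cleanup in one place.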

Style Focus

General

Organize your code in a clear and conventional way. Group related functionality together, place declarations before use, and structure your files so that their purpose and flow are easy to follow.

Favor readability and consistency over cleverness. Use descriptive names, consistent formatting, and explicit control flow so that your code can be easily understood by someone reading it for the first time.

As always in C, carefully check the results of library and system calls, report errors clearly, and clean up any resources you acquire before exiting. Your program should behave predictably even when errors occur.

POSIX I/O

When using POSIX I/O, be mindful of the differences from C standard library I/O. POSIX functions operate at a lower level and may return partial results, so code should be written defensively and avoid assumptions about how much data is read or written in a single call.
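
Partial results apply to writing as well as reading. The sketch below shows the usual defensive loop for write, which keeps going until every byte has been accepted. WriteAll is a hypothetical helper; the exercise itself permits C standard library functions for output, so this is shown only to illustrate the idea.

```c
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: writes all `len` bytes of `buf` to `fd`, looping
// because a single write may accept fewer bytes than requested.
// Returns 0 on success and -1 on a non-recoverable error.
int WriteAll(int fd, const char *buf, size_t len) {
  size_t done = 0;
  while (done < len) {
    ssize_t n = write(fd, buf + done, len - done);
    if (n == -1) {
      if (errno == EINTR || errno == EAGAIN) {
        continue;  // recoverable: retry from the same position
      }
      return -1;
    }
    done += (size_t) n;  // advance past the bytes actually written
  }
  return 0;
}
```

The same structure works for reads: track how many bytes have been handled so far and never assume that one call transferred the full amount requested.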

Submission

Submit the following file by creating an ex7-submit tag in your exercise repository before the assignment deadline. The file should be located in the exact directory listed below, including capitalization:

  • ex7/ex7.c

Your code must:

  • Compile without errors or warnings on CSE Linux machines (lab workstations, attu, or CSE home VM).
  • Have no crashes, memory leaks, or memory errors on CSE Linux machines. (We strongly recommend testing with valgrind.)
  • Be contained in a single file named ex7.c that compiles with the command:
    $ gcc -Wall -g -std=c17 -o ex7 ex7.c
    
  • Have a comment at the top of the source file with your name and CSE or UW email address.
  • Be pretty: the formatting, modularization, variable and function names, and commenting should be clear, consistent, and easy to read.
  • Be robust: your code should handle invalid or unexpected input (if any) and hard-to-handle cases gracefully.