Due: Wednesday, April 28, at 11 pm.
The purpose of this assignment is to gain some experience with C programming by implementing a utility program that is similar to grep, but without the ability to process regular expressions (i.e., a lot like a simple version of fgrep). In the process you will also gain a basic familiarity with the debugger gdb.

This assignment does not include any particularly complicated logic or algorithms, but it will require you to organize your code well and make effective use of the C language and libraries. You will also have to explore the details of the C string and file I/O libraries to discover how to do various operations that should already be familiar from your programming experience in other languages, but not in C. It is meant as an orientation to the Unix/Linux C programming environment. You should do this assignment by yourself.

Implement in C a Unix utility program gasp. The command

    gasp [options] STRING FILE...

should read the listed files (FILE...) and copy each line from the input to stdout if it contains STRING. Each output line should be preceded by the name of the file that contains it. The argument STRING may be any sequence of characters (as expanded, of course, by the shell depending on how it is quoted). There are two available options, which may appear in any order if both are present:

-i
    Ignore case when searching for lines that contain STRING. If the -i option is used, the strings "this", "This", "THIS", and "thiS" all match; if -i is not used, they are all considered different.

-n
    Number lines in output. Each line copied to stdout should include the line number in the file where it was found in addition to the file name. The lines in each file are numbered from 1.

Your program does not need to be able to handle combinations of option letters written as a single multi-character option like -in or -ni. But it does need to be able to handle any combination of either or both (or neither) option when they appear separately on the command line.
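
The option handling described above can be done with a short loop over argv and strcmp from <string.h>. The following is only a minimal sketch under stated assumptions: the names parse_options, ignore_case, and number_lines are hypothetical, and keeping the option state in two file-scope variables is one possible design, not a required one.

```c
#include <string.h>

/* Hypothetical flags recording which options were seen (assumed names). */
static int ignore_case = 0;   /* nonzero if -i was given */
static int number_lines = 0;  /* nonzero if -n was given */

/*
 * Scan argv[1..] for leading "-i" and "-n" arguments, in any order,
 * and set the flags above. Returns the index of the first argument
 * that is not one of these options (the STRING), or argc if nothing
 * follows the options.
 */
int parse_options(int argc, char *argv[])
{
    int i = 1;

    while (i < argc) {
        if (strcmp(argv[i], "-i") == 0) {
            ignore_case = 1;
        } else if (strcmp(argv[i], "-n") == 0) {
            number_lines = 1;
        } else {
            break;  /* first argument that is not a recognized option */
        }
        i++;
    }
    return i;
}
```

With this sketch, a call such as `gasp -n -i foo a.txt` would set both flags and return 3, the index of `foo`. A combined argument such as `-in` is not recognized as an option here and would be treated as the STRING, which is consistent with the statement that combined option letters need not be supported.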

(The output described above is basically the same as that produced by grep if the STRING argument is treated as literal data and not as a regular expression. You should pretty much match the output format of grep, although your program's output does not need to be byte-for-byte identical. One difference, though, is that a file name should be printed on every output line, even if only one file is specified on the gasp command line.)
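
To make the specification above concrete, here is a hedged sketch of the per-file search step. It is an illustration, not the required design: the names search_stream, to_lower, and MAX_LINE are hypothetical, the value 512 is a placeholder rather than the limit set by the assignment, and the `name:number:text` layout is an assumption modeled on typical grep output. The caller is assumed to have lower-cased the pattern once, before any file is read, when case is being ignored.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512  /* placeholder limit, including the terminating \0 */

/* Translate the \0-terminated string s to lower case in place. */
static void to_lower(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)tolower((unsigned char)*s);
    }
}

/*
 * Copy each line of in that contains pattern to out, preceded by the
 * file name and, if number_lines is nonzero, the 1-based line number.
 * If ignore_case is nonzero, pattern must already be lower case.
 * Returns the number of lines written.
 */
int search_stream(FILE *in, FILE *out, const char *name,
                  const char *pattern, int ignore_case, int number_lines)
{
    char line[MAX_LINE];
    char folded[MAX_LINE];
    int lineno = 0;
    int matches = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        const char *haystack = line;

        lineno++;
        if (ignore_case) {
            /* Search a lower-cased copy so the original text is printed. */
            strncpy(folded, line, sizeof folded - 1);
            folded[sizeof folded - 1] = '\0';
            to_lower(folded);
            haystack = folded;
        }
        if (strstr(haystack, pattern) != NULL) {
            if (number_lines) {
                fprintf(out, "%s:%d:%s", name, lineno, line);
            } else {
                fprintf(out, "%s:%s", name, line);
            }
            matches++;
        }
    }
    return matches;
}
```

Passing the output stream as a parameter keeps the function easy to exercise on its own; a real program would normally pass stdout. Lines longer than MAX_LINE are split by fgets in this sketch, so line numbers would be wrong for such input.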

Besides the general specification given above, your program should meet the following requirements to receive full credit.

* [...] \0). This number should not be hard-wired in the code, but should be specified with an appropriate #define preprocessor command so it can be changed easily. Your program is allowed to produce incorrect results or fail if presented with input data containing lines longer than this limit.
* [...] \0). This length should also be specified by an appropriate #define.
* strncpy in <string.h> can be used to copy \0-terminated strings; you should not be writing loops to copy such strings one character at a time.
* There is a getopt function in the Linux library that provides simplified handling of command line options. For this assignment only, you may not use this function. You should implement the processing of command line options yourself, of course using the string library functions when these are helpful.
* Use safe library routines like fgets and strncpy instead of routines like gets and strcpy. The safe functions allow specification of maximum buffer or array lengths and will not overrun adjacent memory if used properly.
* For the -i option, two characters are considered to be equal ignoring case if they are the same when translated by the tolower(c) function (or, alternatively, toupper(c)) in <ctype.h>.
* If an input file cannot be opened, write an error message to stderr and continue processing any remaining files on the command line.
* Compile your code using gcc with the -Wall option. The versions of gcc on dante and the EE lab machines are recent enough, as are the ones in any current cygwin or Linux distribution.

As with any program you write, your code should be readable and understandable to anyone who knows C. In particular, for full credit your code must observe the following requirements.

* Do not write a single main function that does everything. If you wish, you may include all of your functions in a single C source file, since the total size of this program will be fairly small. Be sure to include appropriate function prototypes near the beginning of the file so the actual function definitions can appear in whatever order is most appropriate in the remainder of the file.
* [...] void functions that perform an action without returning a value. Variables of local significance like loop counters, indices, or pointers should be given simple names like i, k, n, or p, and do not require further comments.
* You may use global variables to record whether the -i or -n options are selected or not.
* If you translate the STRING argument to lower- or upper-case, translate it (or a copy of it) once; don't do this repeatedly for each input line or even for each input file. Don't make unnecessary copies of large data structures; use pointers. (Copies of ints, pointers, and similar things are cheap; copies of arrays and large structs are expensive.) Don't read the input by calling a library function to read each individual character. Read the input a line at a time (it costs just about the same to call an I/O function to read an entire line into a char array as it does to read a single character). But don't overdo it. Your code should be simple and clear, not complex and cluttered with lots of micro-optimizations
that don't matter.

* Get the program copying its input to stdout before you add the code to search for the STRING argument and selectively print lines containing it. Be able to search for exact matches before adding the -i option to ignore case. Add the -n option separately when you're not trying to do something else.
* printf is your friend here to print values while debugging. The debugger is also your friend -- learn how to use it.
* Explore the <stdio.h>, <string.h>, <ctype.h> and <stdlib.h> libraries.
* One way to implement the -i option is to translate both the STRING argument and each input line to lowercase, then search for the translated STRING in the translated input line. (Translating a string to lower-case sure sounds like a well-defined operation that should be in a separate function!)

Learning how to use a debugger effectively can be a big help in getting your programs to work (although it is no substitute for thinking and careful programming). To encourage you to gain a basic familiarity with gdb, you are required to do the following.

1. Compile your program with the -g option, to include debugging information in the executable file.
2. Use the script program to capture the following console session in a text file named debug.txt.
3. Start gdb with your executable program as an argument.
4. Set two breakpoints: one at main, and the other at the point in your program right after the fopen function call that opens the input files.
5. Start the program with the gdb run command, providing a search string and at least one file name as arguments.
6. When the program stops at the breakpoint in main, use the appropriate gdb command to display the contents of the variable containing the search string (the first argument to the program following any options that are present). When you've done that, continue execution of the program.
7. When the second breakpoint is reached, use gdb commands to display (i) a backtrace showing the functions active at the time the breakpoint was reached, (ii) source code showing the line containing the breakpoint and a couple of surrounding lines, (iii) the name of the file that was supplied to the fopen function (this should be in a variable somewhere), and (iv) the pointer value returned by fopen (presumably just a hex address, although it might be NULL if the file can't be opened).
8. Quit gdb and exit from the script program's shell. Save the debug.txt output file from script to turn in later.

A small amount of extra credit will be awarded for adding the following extensions to an already complete, working assignment. No extra credit will be awarded if the basic program is not fully implemented and substantially bug-free.

If you do any extra credit parts, you should turn in both your original program
without the extra credit and your extended program. The extra credit version
should be in a separate file whose name ends with "-extra", like gasp-extra.c.

* [...] stdin. This should be fairly easy to add if your code is organized as a well-designed collection of functions.
* Add an option -w to search for words separated by whitespace. If -w is specified, the STRING on the gasp command line should only be found if it is surrounded by whitespace (blanks, tabs, newlines, etc.) and not as part of some other string. A character c in the input should be treated as whitespace if the library function isspace(c) returns true. If you implement this option, the program should find the word if it appears at the beginning or end of a line, as well as in the middle. You may also use an additional global variable to record the state of this option if you wish.
* [...] -n option to count line numbers incorrectly. However you decide to implement this, long input lines should not cause your program to fail or crash.
* [...] STRING parameter anywhere in the line. If you read arbitrarily long input lines in chunks that have only part of an input line, be sure you can correctly handle situations where the STRING value spans two parts of the line instead of falling entirely inside one chunk of the
line.

Use the regular online dropbox to turn in the source code to your program and a copy of the script (console) output file debug.txt from the Debugging exercise above. If you did any extra credit, you should turn in a separate source file with your additions. You should also turn in a plain text file named README that describes any extra credit that you added to your program, or just a brief note that you did not implement any of the extra credit parts. Be sure your name is included in the source code and README files.

Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350. (206) 543-1695 voice, (206) 543-2969 FAX. [comments to Hal Perkins]