The purpose of this assignment is to gain some experience with C programming by implementing a utility program that is similar to grep, but without the ability to process regular expressions (i.e., a lot like a simple version of fgrep). In particular, in this assignment, you will:
[...] gdb.

This assignment does not include any particularly complicated logic or algorithms, but it will require you to organize your code well and make effective use of the C language and libraries. It is meant as an orientation to the Unix/Linux C programming environment.
Implement in C a Unix utility program gasp. The command gasp [options] STRING FILE... should read the listed files (FILE...) and copy each line from an input file to stdout if it contains STRING. Each output line should be preceded by the name of the file that contains it. The argument STRING may be any sequence of characters (as expanded, of course, by the shell depending on how it is quoted). There are two available options, which may appear in any order if both are present:
-i  Ignore case when searching for lines that contain STRING. If the -i option is used, the strings "this", "This", "THIS", and "thiS" all match; if -i is not used, they are all considered different.
-n  Number lines in output. Each line copied to stdout should include the line number in the file where it was found in addition to the file name. The lines in each file are numbered from 1.

Your program does not need to handle combinations of option letters written as a single multi-character option like -in or -ni. But it does need to handle either, both, or neither of the options.
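
The option handling described above could be sketched roughly as follows. This is only an illustration, not a required design: the struct options type, its field names, and the parse_options function are hypothetical names invented here, and the sketch assumes that all options come before STRING and that STRING itself does not begin with a dash.

```c
#include <string.h>

/* Hypothetical record of which options were given (names are illustrative). */
struct options {
    int ignore_case;   /* nonzero if -i was present */
    int number_lines;  /* nonzero if -n was present */
};

/*
 * Scan the leading arguments for -i and -n, in either order.
 * Returns the index in argv of the first non-option argument (STRING),
 * or -1 if an unrecognized option is found.
 * Assumption: any argument that starts with '-' and has at least one more
 * character is an option, so a STRING beginning with '-' is rejected.
 */
int parse_options(int argc, char *argv[], struct options *opts)
{
    int i = 1;

    opts->ignore_case = 0;
    opts->number_lines = 0;
    while (i < argc && argv[i][0] == '-' && argv[i][1] != '\0') {
        if (strcmp(argv[i], "-i") == 0) {
            opts->ignore_case = 1;
        } else if (strcmp(argv[i], "-n") == 0) {
            opts->number_lines = 1;
        } else {
            return -1;  /* combined forms such as -in are not handled */
        }
        i++;
    }
    return i;
}
```

Called as parse_options(argc, argv, &opts), the return value is the index of STRING, and the arguments after it are the FILE names.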
(This is essentially the output produced by grep if the STRING argument is treated as literal data and not as a regular expression. You should match the output format of grep, although your program's output does not need to be byte-for-byte identical.)
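
As a rough sketch of what "literal data" and "the output format of grep" can mean in code, the fragment below treats STRING as a plain substring and builds an output line in the colon-separated layout that grep uses (file:line, or file:number:line when lines are numbered). The function names line_contains and format_match are hypothetical, and formatting into a caller-supplied buffer is just one possible choice; a real program might print directly to stdout instead.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return nonzero if string occurs in line as a literal substring.
 * No character is treated specially, so "a.c" matches only the
 * three characters 'a', '.', 'c' in that order.
 */
int line_contains(const char *line, const char *string)
{
    return strstr(line, string) != NULL;
}

/*
 * Build one output line in out (at most size bytes, including the \0).
 * Assumed layout: "file:line" or, when number_lines is nonzero,
 * "file:lineno:line". The input line is expected to keep its
 * trailing newline. Returns the value returned by snprintf.
 */
int format_match(char *out, size_t size, const char *file,
                 long lineno, int number_lines, const char *line)
{
    if (number_lines) {
        return snprintf(out, size, "%s:%ld:%s", file, lineno, line);
    }
    return snprintf(out, size, "%s:%s", file, line);
}
```

For example, a line "a.c is here" found as line 7 of a file named notes.txt would be formatted as "notes.txt:7:a.c is here" with numbering and "notes.txt:a.c is here" without it.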
Besides the general specification given above, your program should meet the following requirements to receive full credit.
1. [...] \0). This number should not be hard-wired in the code, but should be specified with an appropriate #define preprocessor command so it can be changed easily. Your program is allowed to produce incorrect results, fail, or crash if presented with input data containing lines longer than this limit.
2. If you need dynamically allocated storage, obtain it with malloc and return the storage with free when you are done with it. The amount allocated should be based on the actual size needed, not some arbitrary size that is assumed to be "large enough". Exception to the "only as long as needed" rule: if necessary, you should allocate a character array or two that is large enough for the largest input line (see #1) and reuse it as needed to process each input line. This avoids having to count the characters in each input line and (re-)allocate a new array for each one, which should not be necessary.
3. strcpy in <string.h> can be used to copy \0-terminated strings; you should not be writing loops to copy such strings one character at a time.
4. There is a getopt function in the Linux library that provides simplified handling of command line options. For this assignment only, you may not use this function. You should implement the processing of command line options yourself, using the string library functions where they are helpful.
5. With the -i option, two characters are considered to be equal ignoring case if they are the same when translated by the tolower(c) function (or, alternatively, toupper(c)) in <ctype.h>.
6. If a file cannot be opened, write an error message to stderr and continue processing any remaining files on the command line.
7. Your program must compile cleanly with gcc -Wall on attu.

As with any program you write, your code should be readable and understandable to anyone who knows C. In particular, for full credit your code must observe the following requirements.
* Divide your program into suitable functions rather than writing one large main function that does everything. If you wish, you may include all of your functions in a single C source file, since the total size of this program will be fairly small. Be sure to include appropriate function prototypes near the beginning of the file.
* [...] void functions that perform an action without returning a value. Variables of local significance like loop counters, indices, or pointers should be given simple names like i, n, or p, and do not require further comments.
* You may use global variables to record whether the -i or -n options are selected or not.
* If you translate the STRING argument to lower- or upper-case, make a copy of it and translate that copy once; don't do this repeatedly for each input line or even for each input file. Don't use malloc or free excessively - they are expensive. Don't make unnecessary copies of large data structures; use pointers. (Copies of ints, pointers, and similar things are cheap; copies of arrays and large structs are expensive.) Don't read the input by calling a library function to read each individual character. Read the input a line at a time (it costs just about the same to call an I/O function to read an entire line into a char array as it does to read a single character). But don't overdo it. Your code should be simple and clear, not complex and full of micro-optimizations that don't matter.

* Get the program to copy its input files to stdout before you add the code to search for the STRING argument and selectively print lines containing it. Be able to search for exact matches before adding the -i option to ignore case. Add the -n option separately when you're not trying to do something else.
* printf is your friend here to print values while debugging.
* Take advantage of the functions in the <stdio.h>, <string.h>, <ctype.h>, and <stdlib.h> libraries.
* One way to implement the -i option is to translate both the STRING argument and each input line to lowercase, then search for the translated STRING in the translated input line.

Learning how to use a debugger effectively can be a big help in getting
your programs to work (although it is no substitute for thinking and careful programming). To encourage you to gain a basic familiarity with gdb, you are required to do the following.
1. Compile your program with the -g option, to include debugging information in the executable file.
2. Use the script program to capture the following console session in a text file named debug.txt.
3. Start gdb with your executable program as an argument.
4. Set two breakpoints: one at main, and the other at the point in your program right after the fopen function call that opens the input files.
5. Run the program in gdb, providing a search string and at least one file name as arguments.
6. When the program stops at the breakpoint in main, use the appropriate gdb command to display the contents of the variable containing the search string (the first argument to the program following any options that are present). When you've done that, continue execution of the program.
7. When the program stops at the second breakpoint, use gdb commands to display (i) a backtrace showing the functions active at the time the breakpoint was reached, (ii) source code showing the line containing the breakpoint and a couple of surrounding lines, (iii) the name of the file that was supplied to the fopen function (this should be in a variable somewhere), and (iv) the pointer value returned by fopen (presumably just a hex address, although it might be NULL if the file can't be opened).
8. Quit gdb and exit from the script program's shell. Save the debug.txt output file from script to turn in later.

A small amount of extra credit will be awarded for adding the following extensions to an already complete, working assignment. No extra credit will be awarded if the basic program is not fully implemented and substantially bug-free.
If you do any extra credit parts, you should turn in both your original program without the extra credit and your extended program. The extra credit part should be in a file whose name ends with "-extra", like gasp-extra.c.
* [...] stdin. This should be fairly easy to add if your code is organized as a well-designed collection of functions.
* Add an option -w to search for words separated by whitespace. If -w is specified, the STRING on the gasp command line should only be found if it is surrounded by whitespace (blanks, tabs, newlines, etc.) and not as part of some other string. A character c in the input should be treated as whitespace if the library function isspace(c) returns true. If you implement this option, the program should find the word if it appears at the beginning or end of a line, as well as in the middle. You may also use an additional global variable to record the state of this option if you wish.
* [...] -n option to count line numbers incorrectly. However you decide to implement this, long input lines should not cause your program to fail or crash.
* [...] STRING parameter anywhere in the line. If you read arbitrarily long input lines in chunks that each hold only part of an input line, be sure you can correctly handle situations where the search string spans two chunks instead of falling entirely inside one of them.

Use the regular online dropbox to turn in the source code to your program
and a copy of the script (console) output file debug.txt from the Debugging exercise above. If you did any extra credit, you should turn in a separate source file with your additions. You should also turn in a plain text file named README that describes any extra credit that you added to your program, or just a brief note that you did not implement any of the extra credit parts. Be sure your name is included in the source code and README files.