PRACTICE: Homework 3
    
      
      Due: Friday, April 28, 2023, at 11:59pm
      Assignment Goal
      The purpose of this assignment is to gain some experience with C
	programming. In particular, in this assignment, you will:
	
	  - Gain experience creating and running C programs,
 
	  - Become familiar with some of the basic C libraries, including those for
	    file and string handling,
 
	  - Get a better understanding of how Unix utilities are implemented,
 
	  - Gain some basic experience with the Unix debugger, gdb, and
 
	  - Learn to use a style-checking tool to locate source code that may need attention.
 
	
      
      This project should be done independently.  If you work with a classmate, make sure you are each editing and working on your own set of files.  You should not copy and paste code.
      
      The material that we have learned in lecture is not enough
	to complete this assignment. It is expected that you will investigate
	the resources and libraries mentioned in this document to learn about
	how to use them.
      
      Synopsis
      In this assignment you will create your own version of the Unix command wc.  Your program will read one or more files and report statistics for each, including the number of lines, words, and characters.  Your code should behave as follows:
	
	  - Compile with the command gcc -Wall -std=c11 -o wordcount wordcount.c
	  - Requires one or more input files.
	    - If there are no input files, reports an error and exits with exit code 1.
 
	  - Does not need to work with redirected input.
 
	    - If a file cannot be opened, prints an error message saying that the file is being skipped.
 
	  
 
	  - For each input file calculate the number of lines, words, and characters
	  
	    - You can check your output against wc on any test file; the counts should be the same.
 
 
	  - Prints a message with the results and the file name, similar to wc.
	  - Additionally prints the total number of lines for all files.
 
	  - Processes a limited number of options: -l, -w, -c
	    - If an option is detected, the program outputs only the number of lines, words, or characters, respectively.
 
	      - In that case, the program does not print the total number of lines in all the files.
 
	      - The program shouldn't process more than one option.  If more than one option is given, only the first one takes effect.
	      - To simplify option handling, assume that all options come before the names of the input files.  If you detect an argument that is not a valid option, assume that it and all subsequent arguments are file names.
	  - Must handle input lines containing up to 500 characters (including the terminating \0 byte).  Behavior on files with longer lines is undefined and will not be evaluated.
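
The option rules above can be sketched as follows. This is only one possible reading of the requirements, not a reference solution: the names parse_options and enum mode are hypothetical, and the sketch assumes that a second valid option (as in -l -w) is consumed but ignored rather than treated as a file name.

```c
#include <string.h>

/* Which counts to print. The names are illustrative only. */
enum mode { MODE_ALL, MODE_LINES, MODE_WORDS, MODE_CHARS };

/*
 * Scan the leading arguments of argv for the options -l, -w, and -c.
 * Stores the first valid option found in *mode (MODE_ALL if there is
 * none) and returns the index of the first argument that should be
 * treated as a file name. Returns argc if there are no file names.
 */
int parse_options(int argc, char *argv[], enum mode *mode) {
  int i = 1;
  *mode = MODE_ALL;
  while (i < argc) {
    enum mode found;
    if (strcmp(argv[i], "-l") == 0) {
      found = MODE_LINES;
    } else if (strcmp(argv[i], "-w") == 0) {
      found = MODE_WORDS;
    } else if (strcmp(argv[i], "-c") == 0) {
      found = MODE_CHARS;
    } else {
      break;  /* not a valid option: this and the rest are file names */
    }
    if (*mode == MODE_ALL) {
      *mode = found;  /* only the first option takes effect */
    }
    i++;
  }
  return i;
}
```

With the arguments -l -wc shorttext, this sketch selects line counting and then treats both -wc and shorttext as file names, which matches the last example in Example Operation below.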
	
	
      Example Operation:
	
$ ./wordcount -l
Usage: ./wordcount requires an input file.
$ echo $?
1
$ ./wordcount point.c "NON FILE" shorttext
73 247 1574 point.c
NON FILE will not open.  Skipping.
4 13 68 shorttext
Total Lines = 77
$ wc point.c "NON FILE" shorttext
73  247 1574 point.c
wc: 'NON FILE': No such file or directory
4   13   68 shorttext
77  260 1642 total
$ ./wordcount -l point.c
73
$ ./wordcount -c point.c shorttext
1574
68
$ ./wordcount -l -wc shorttext
-wc will not open.  Skipping.
 4  
	
      Technical Requirements
	
      You should pay attention to the following guidelines for meeting performance expectations.
	
	  
	  - Use standard C library functions where possible; do not
	    reimplement operations available in the basic libraries. For
	    instance, strncpy in <string.h>
	    can be used to copy \0-terminated strings; you should
	    not be writing loops to copy such strings one character at a
	    time.
	    
	  - You should use "safe" versions of file and string
	    handling routines such as fgets
	    and strncpy instead of routines
	    like gets and strcpy.  The safe
	    functions allow specification of maximum buffer or array lengths
	    and will not overrun adjacent memory if used properly. 
	  
	  - If an error occurs when opening or reading a file, the program
	    should write an appropriate error message to stderr
	    and continue processing any remaining files on the command
	    line. 
	  
	  - Since this program is likely relatively short, all of the functions
	    should be in a single file called wordcount.c. You
	    should arrange your code so that related
	    functions are grouped together in a logical order in the file. 
	  
	  - Your code must compile and run without errors or warnings when
	    compiled and executed on seaside using gcc
	    with the -Wall
	    and -std=c11 options. Since this assignment should
	    not need to use any unusual or system-dependent code you can
	    probably develop and test your code on any recent Linux
	    system or other system that supports a standard C
	    compiler. However, you should verify your program before the
	    submission deadline. 
	    
	  - Your program must be robust.  It should not
	    crash (segfault or otherwise) or produce meaningless or incorrect
	    output regardless of the contents of command line parameters or input
	    files  (except, of course, you are not required to deal with files
	    or string parameters with lines longer than the limits given above).
	    If the program terminates prematurely because of some error,
	    it should print an appropriate error message to stderr and
	    exit with an exit code of EXIT_FAILURE (defined in
	    <stdlib.h> -- see the description of the
	    exit() function). 
	  
	  - If the program terminates normally after attempting to open and
	    process all of the files listed on the command line, it
	    should terminate with an exit
	    code of EXIT_SUCCESS (see <stdlib.h>).
	    This is normally done by returning this value as the int
	    result of the main function. 
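
As an illustration of several of the guidelines above (bounded reads with fgets, error messages on stderr, and a status the caller can use to continue with the next file), here is a minimal sketch. The function name copy_file and its return convention are assumptions made for this example; they are not part of the assignment.

```c
#include <stdio.h>

#define MAXLINE 500  /* maximum line length, including the terminating '\0' */

/*
 * Copy the contents of the file named filename to the stream out, one
 * line at a time. Returns 0 on success. If the file cannot be opened or
 * a read error occurs, writes a message to stderr and returns -1 so the
 * caller can move on to the next file.
 */
int copy_file(const char *filename, FILE *out) {
  char line[MAXLINE];  /* holds one input line at a time */
  FILE *fp = fopen(filename, "r");
  if (fp == NULL) {
    fprintf(stderr, "%s will not open.  Skipping.\n", filename);
    return -1;
  }
  /* fgets never stores more than sizeof(line) bytes, including '\0'. */
  while (fgets(line, sizeof(line), fp) != NULL) {
    fputs(line, out);
  }
  int status = 0;
  if (ferror(fp)) {
    fprintf(stderr, "Error reading %s.\n", filename);
    status = -1;
  }
  fclose(fp);  /* every stream that was opened is closed */
  return status;
}
```

A main function built around a helper like this would call it once per file name, ignore a -1 result apart from the message already printed, and return EXIT_SUCCESS after the last file.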
	
      
      Code Quality Requirements
	As with any program you write, your code should be readable and
	  understandable to anyone who knows C. In particular, for full credit
	  your code must observe the following requirements.
	  
	    - Divide your program into suitable functions, each of which does
	      a single well-defined task. For example, there should almost
	      certainly be a function that processes a single input file, which
	      is called as many times as needed to process each of the files
	      listed on the command line (and which, in turn, might call other
	      functions to perform identifiable subtasks). Your program most
	      definitely may not consist of one huge main function
	      that does everything. However, it also should not be chopped into
	      tiny functions that contain only isolated statements or code
	      fragments; divide the program into coherent pieces. Be sure to
	      include appropriate function prototype declarations near the
	      beginning of the file so the actual function definitions can
	      appear in whatever order is most appropriate for presenting the
	      code in the remainder of the file in a logical sequence and so that
	      related functions are grouped together. 
	    
	    - Comment sensibly, but not excessively. You should not use
	      comments to repeat the obvious or explain how the C language works
	      -- assume that the reader knows C at least as well as you. Your
	      code should, however, include the following minimum comments:
	      
		- Every function must include a heading comment that
		  explains what the function does (not how it does
		  it), including the significance of all parameters and any
		  effects on or use of global variables (to the extent that
		  there are any). 
 
		
		- Every significant variable must include a comment that is
		  sufficient to understand what information is stored in the
		  variable and how it is stored. In some cases, you may
		  describe many variables in one comment.
 
		- Any code based on someone else's work must include a
		  comment for citation.  (If the code is longer than a couple
		  of lines you should read for understanding and then create
		  your own version.  Citations are still appropriate.)
 
		- In addition, there should be a comment at the top of the
		  file giving basic identifying information, including your
		  name, the date, and the name and purpose of the
		  file. 
 
	      
	     
	    
	    - Use appropriate names for variables and functions: nouns or noun
	      phrases suggesting the contents of variables or the results of
	      value-returning functions; verbs or verb phrases
	      for void functions that perform an action without
	      returning a value. Variables of local significance like loop
	      counters, indices, or pointers should be given simple names
	      like i,
	      or p, and often do not require further comments. 
	    
	    - Avoid global variables.  Use parameters (particularly pointers)
	      appropriately.
 
	    
	    - You may use an appropriate #define MAXLINE command to set the
	      maximum line length mentioned above. 
 
	    
	    - No unnecessary computation. Don't make unnecessary copies of
	      large data structures; use pointers. (Copies of ints,
	      pointers, and similar things are cheap; copies of arrays and large
	      structs are expensive.) Don't read the input by calling a library
	      function to read each individual character. Read the input a line
	      at a time (it costs just about the same to call an I/O function to
	      read an entire line into a char array as it does to read a single
	      character). Your code should be simple and
	      clear, not complex and full of micro-optimizations that
	      don't matter. 
	    - You should use the cpplint.py style checker:
	      
		- cpplint.py (right-click to download, or use wget and
		  chmod +x to make it executable if needed).
		- Use ./cpplint.py --clint wordcount.c to review your code.
		  Note: If this fails, check your python installation (whereis python).  In some cases you must call python3 explicitly: python3 ./cpplint.py --clint wordcount.c, or modify the first line of cpplint.py to point to your system's python installation.
		- There is more help for using this code on the CSE 333 page.
		  
 - cpplint.py is an example of a linter, which is a tool that checks code for compliance with style and/or coding standards.  In this case, the linter checks that code complies with style guidelines developed at Google and used widely in industry. Compliance with style guidelines is essential when multiple developers are working on the same codebase.
 
		- While this checker may flag a few things that you wish to leave
		  as-is (notably, you may ignore warnings about 'strtok'), most of
		  the things it catches, including whitespace errors in the code,
		  should be fixed.  We will run this style checker on your code to
		  check for any issues that should have been fixed.  Use the
		  discussion board or office hours if you have questions about
		  particular clint warnings.
 
	
	    - Hint: All reasonable programming text editors have commands or
	      settings to use spaces instead of tabs at the beginning of lines,
	      which is required by the style checker and is much more robust than
	      having tabs in the code.  For example, if you are an emacs user, you
	      can add the following line to the .emacs file in your
	      home directory to ensure that emacs translates leading tabs to
	      spaces: (setq-default indent-tabs-mode nil)
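
To make the commenting and naming requirements above concrete, here is a small sketch of a helper with a heading comment, commented significant variables, and a simple name for a local pointer. The function count_words is a hypothetical example and uses isspace from <ctype.h>; it is one possible way to count words, not a required design.

```c
#include <ctype.h>

/*
 * Return the number of words in the '\0'-terminated string line, where
 * a word is a maximal run of non-whitespace characters (whitespace as
 * defined by isspace). The string is not modified.
 */
int count_words(const char *line) {
  int words = 0;    /* number of words found so far */
  int in_word = 0;  /* nonzero while p points inside a word */
  for (const char *p = line; *p != '\0'; p++) {
    if (isspace((unsigned char)*p)) {
      in_word = 0;
    } else if (!in_word) {
      in_word = 1;
      words++;
    }
  }
  return words;
}
```

The heading comment says what the function computes and what its parameter means, without narrating the loop. The cast to unsigned char is there because passing a negative char value to isspace is undefined behavior.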
 
	
	
	Implementation Hints
	
	  
	    - There are a lot of things to get right here; the job may seem
	      overwhelming if you try to do all of it at once. But if you break
	      it into small tasks, each one of which can be done individually by
	      itself, it should be quite manageable.  For instance, figure out
	      how to process a single file before you implement the logic to
	      process all of the files on the command line. (Or, vice-versa, but
	      start small and test before you move on). Figure out how to
	      open, read, and copy all of a file to stdout before
	      adding another step.  Hint: We have a sample program from an earlier lecture that does this. 
	    - You might notice that the structure of this program is similar
	      to your spellcheck script: You first check for usage
	      (is the command called correctly?), and then, for each input
	      file you perform a task.  You write similar error messages, and exit
	      with a failure code under similar circumstances. 
	    
	    - Implement a main that gets the arguments and prints a placeholder for the output
	      (either to screen or file), and you are off to a good start.  Test this before you move on.
 
	    
	    - Think before you code.  You will ultimately get the job done
	      faster, better, and with less pain if you spend some time to sketch
	      your design (which functions are needed? what exactly do they do? what
	      are the main data structures?) before you write detailed code.  Start
	      coding by writing function headings and heading comments and creating
	      significant variables -- and commenting those too.  Then as you write
	      detailed code and test it you will have your written design
	      information in the comments to compare and check as you work on the
	      code.
 
	    - I/O is relatively expensive, while storing one more integer is relatively inexpensive.
	      As a result, it is likely a good idea to write one function that calculates
	      all the potential output values in one go, and use the options to determine
	      which ones to print to stdout.
 
	    
	    - Every time you add something new to your code (see hint #1),
	      test it. Right Now! Immediately!! BEFORE YOU
		DO ANYTHING ELSE!!! (Did we mention that you
	      should test new changes right away?) It is much easier to find and
	      fix problems if you can isolate the potential bug to a small
	      section of code you just added or changed. 
 
	    
	    - The standard C library contains many functions that you will
	      find useful.  In particular, look at the <stdio.h>,
	      <string.h>, <ctype.h>,
	      and <stdlib.h> libraries.  Use the cplusplus reference
	      link on the course home page to look up details about functions
	      and their parameters; use a good book like The C Programming
		Language for background and perspective about how they are
	      intended to be used.
	     
	    
	    - strlen tells you how many characters are in a string (not counting the terminating \0).
	    - Every file stream that is open should be subsequently closed.
 
	    - Use the compiler -Wall option.  Don't waste time
	      searching for errors that the compiler or run-time tests could have
	      caught for you. 
	    
	    - Be sure to test for errors like trying to open or read a
	      nonexistent file to see if your error handling is working
	      properly.
 
	    
	    - Once you're done, read the instructions again to see if you
	      overlooked anything.
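
The hint above about computing every count in one pass might look like the following sketch. The struct counts type and the function count_stream are hypothetical names chosen for this example. The sketch relies on the assignment's assumption that each line fits in MAXLINE bytes, and it uses strtok to split words, which is only one of several reasonable choices.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 500  /* maximum line length, including the terminating '\0' */

/* Totals for one input stream. */
struct counts {
  long lines;  /* number of newline characters read */
  long words;  /* number of whitespace-separated words */
  long chars;  /* number of characters read */
};

/*
 * Read the open stream fp to end of file and store its line, word, and
 * character counts in *result. The caller opens and closes fp.
 */
void count_stream(FILE *fp, struct counts *result) {
  static const char delims[] = " \t\n\v\f\r";  /* word separators */
  char line[MAXLINE];  /* holds one input line at a time */
  result->lines = 0;
  result->words = 0;
  result->chars = 0;
  while (fgets(line, sizeof(line), fp) != NULL) {
    size_t len = strlen(line);
    result->chars += (long)len;
    if (len > 0 && line[len - 1] == '\n') {
      result->lines++;
    }
    /* strtok overwrites separators in line, so count chars first. */
    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
      result->words++;
    }
  }
}
```

With all three totals available, an option such as -l only changes which field is printed, for example printf("%ld\n", c.lines) for a struct counts variable named c.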
 
	    
	  
	
	Assessment
	This assignment is worth 45 points.
	Your code will be evaluated by an auto-grader that will check specified performance criteria.  There will be some additional manual grading that will look at the points brought up in the Code Quality requirements above (specifically looking at commenting and code efficiency).
	Turning In
	Please submit your files via the Gradescope HW3 Assignment. You should submit one file named wordcount.c.  As usual, you may resubmit for an improved score on the autograder.