Tip

Before trying to write and debug a script, it’s very helpful to first work out the necessary commands by experimenting in a shell window. Similarly, you can test one statement in your script at a time in your shell. Also, look at man pages and other descriptions of commands. Sometimes options are available that allow a single command to do something you want instead of having to use several commands or writing loops or complex control structures in a script. You can also find ways to control how much output a command gives.

There are two independent scripts you will be writing: combine and spellcheck.

Combine

Create a bash script combine that takes 2 or more arguments, call them f1, f2, ..., fn.

Script combine should work as follows:

  • All arguments are treated as filenames.

  • If fewer than two arguments are given, print the error message Usage: combine outputfilename [inputfilename ...] to stderr and exit with a return code of 1.

  • If a file or directory f1 already exists, print Error: Output file should not exist on stderr and exit with a return code of 1.

  • Otherwise, concatenate the contents of f2, ..., fn and put them in f1. Handle errors with input files (for example, a file that does not exist or is a directory), but do not print the resulting error messages to the terminal. Instead, write them to f1. Exit with a return code of 0 after copying the input files.
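The two argument checks above can be sketched as follows. This is a minimal sketch, not a required solution: the function name check_args is hypothetical, and it returns a status instead of exiting so that it can be tried safely in an interactive shell. Messages are sent to stderr with >&2, which does not rely on /dev/stderr.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): validate combine's arguments.
# Prints a message on stderr and returns 1 on failure; returns 0 otherwise.
check_args() {
    # Fewer than two arguments: usage error.
    if [ "$#" -lt 2 ]; then
        echo "Usage: combine outputfilename [inputfilename ...]" >&2
        return 1
    fi
    # -e is true for any existing path, whether a file or a directory.
    if [ -e "$1" ]; then
        echo "Error: Output file should not exist" >&2
        return 1
    fi
    return 0
}
```

In the script itself, a line such as check_args "$@" || exit 1 would then stop execution with a return code of 1.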

Restrictions

You may not use the file names /dev/stdout or /dev/stderr. These are not portable across *nix systems. Although they are found on most versions of Linux, the problem can be solved without them.

Example Run

$ echo "making file 1" > file1
$ echo "and another one for file 2" > file2
$ touch file3
$ ./combine output file1 file2 nonfile file3
$ cat output
making file 1
and another one for file 2
cat: nonfile: No such file or directory

Hint

  • Put filenames in double quotes in case they contain “funny characters” (such as spaces). Your script should work with any file names, no matter what they contain.

  • What happens if you run cat 'nonexisting file'?

  • You don’t need to write your own error message regarding non-existing input files. How about redirecting cat?

  • Some commands that may be useful (but you don’t have to use these): cat, shift, $@, -lt, -a, &>>.
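The redirection hint can be illustrated with a short sketch. The function name combine_into is hypothetical, and this is only one possible approach: it uses shift to drop the output filename from the argument list, then lets cat write both its normal output and its own error messages into the output file.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): concatenate the input files
# into the file named by the first argument, errors included.
combine_into() {
    local out="$1"
    shift                          # "$@" now holds only the input files
    # Order matters: stdout is pointed at the file first, then stderr is
    # pointed at wherever stdout currently goes (the same file).
    # "--" keeps cat from treating a filename that starts with "-" as an option.
    cat -- "$@" > "$out" 2>&1
    return 0                       # input-file errors are not fatal here
}
```

Because every filename expansion is double-quoted, names containing spaces are passed to cat as single arguments.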

Spellcheck

Create a bash script called spellcheck that takes one or more arguments, f1, f2, ..., fn.

Script spellcheck should work as follows:

  • If the script is given no arguments, print a usage note to stderr and exit with a code of 1:
$ ./spellcheck
Usage: ./spellcheck filename ...
  • If an argument is supplied that is not a valid file ([ ! -f "$1" ]), spellcheck prints an error message to stderr and skips that argument:
$ ./spellcheck nonfile
./spellcheck error: nonfile does not exist - skipping.
  • Your script should read each word in a supplied file and use grep to compare that word to the dictionary (which is assumed to exist at /usr/share/dict/words). If the exact word is not found in the dictionary, append it to a file called <FILE>.spelling. Append a word every time it occurs, repeats included, and keep words in the order you encounter them.

  • If <FILE>.spelling does not exist, print a message to stdout stating that the script is creating it: ./spellcheck creating <FILE>.spelling file

  • If <FILE>.spelling already exists, print a message to stdout stating that you are deleting the old file and replacing it: ./spellcheck replacing <FILE>.spelling file

  • The grep search should be case-insensitive.

  • At the end of processing each file, print a message to stdout reporting completion of the file, with the number of misspelled words found and how many of them are unique: ./spellcheck processed <FILE> and found <NUMBER> spelling errors, <NUMBER> of which are unique

  • Your script should exit with a return code of 0 after completing the final print statement.
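The dictionary lookup described above might look like the following sketch. The function name in_dictionary and its optional second argument are assumptions made for illustration. It uses grep's -x option (match the whole line) in place of explicit anchors, and -F so that punctuation in a word such as gnu.org is compared literally rather than as a regular expression. Other combinations of options or anchors can work as well.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): succeed if the word in $1
# appears as a complete line of the dictionary.
# $2 is an optional dictionary path, defaulting to the one in the assignment.
in_dictionary() {
    local word="$1"
    local dict="${2:-/usr/share/dict/words}"
    # -q  quiet: no output, only an exit status
    # -i  case-insensitive comparison
    # -x  the pattern must match an entire line
    # -F  treat the pattern as a fixed string, not a regular expression
    # --  end of options, in case the word starts with "-"
    grep -q -i -x -F -- "$word" "$dict"
}
```

A caller could then write something like: if ! in_dictionary "$word"; then echo "$word" >> "$file.spelling"; fi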

Note

This is not a true spell check - you will identify any word that does not appear precisely in the dictionary. This includes numbers, words with weird punctuation, names, and other edge cases. The provided test files have most of these removed for clarity.

Example Run

$ ./spellcheck shorttext
./spellcheck creating shorttext.spelling file
./spellcheck processed shorttext and found 1 spelling errors, 1 of which are unique
$ ./spellcheck tricky shorttext
./spellcheck error: tricky does not exist - skipping.
./spellcheck replacing shorttext.spelling file
./spellcheck processed shorttext and found 1 spelling errors, 1 of which are unique

Executing ./spellcheck longtext creates a file called longtext.spelling, which contains a list of words such as ‘gnu.org’ and ‘Korn’.

Test

You should use shorttext and longtext to test your code. The longtext file has 9 spelling errors, 8 of which are unique.

You can download these files directly in a bash shell using wget, or right-click each link, download the file to your local machine, and then scp it to seaside.

Hint

  • Your combine script should provide a method for stepping through input arguments that you can re-use here.

  • Nested loops allow you to read lines and words within lines. This is not the only way to solve the problem, but it works.

  • In Lecture 2 we learned how to use grep to find words in a file. Look for options that allow case-insensitive and quiet operation.

  • Make sure you match the word with the exact pattern. Do you need any anchors?

  • Some other commands that may be useful: wc, sort, and uniq.

  • There are a couple of slides from Lecture 4 that talk about redirecting your output.

  • As always, get one small part of your script working at a time.
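Two of the hints above, nested loops and the wc, sort, and uniq commands, can be tried out in isolation with a sketch like this one. The function names words_in and count_errors are hypothetical, and the counting is case-sensitive, which is an assumption the assignment does not settle.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): print every
# whitespace-separated word of the file named by $1, one word per line.
words_in() {
    local -a words
    local word
    # Outer loop: read one line at a time, split into the array "words".
    # The "||" test keeps a final line that lacks a trailing newline.
    while read -r -a words || [ "${#words[@]}" -gt 0 ]; do
        # Inner loop: visit each word on the current line.
        for word in "${words[@]}"; do
            printf '%s\n' "$word"
        done
    done < "$1"
}

# Hypothetical helper (name is an assumption): given a file with one word
# per line, print "<total> <unique>".
count_errors() {
    local total unique
    total=$(wc -l < "$1")
    # uniq only collapses adjacent duplicates, so the list is sorted first.
    unique=$(sort "$1" | uniq | wc -l)
    # Arithmetic expansion drops any padding some wc versions add.
    echo "$((total)) $((unique))"
}
```

Reading each line into an array avoids filename expansion on words such as *, which an unquoted for word in $line loop would be exposed to.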