CSE 303, Autumn 2009
Homework 2A: anagram

Due Tuesday, October 13, 2009, 11:30 PM
50 points total


This assignment focuses on basic shell scripting.  I chose a general topic, anagrams, that was the basis for an assignment many of you did in CSE143.  (Yes, I know a number of you did not do that assignment; you need no knowledge from it.)  The homework will be assigned and turned in in two parts: this is 2A (50 points); I will announce part 2B (50 points) on Monday, October 12, and it will be due on Sunday, October 18, at 11:30 PM.

This assignment has one objective: helping you get used to basic bash shell scripting.

Part 2A of the assignment is: write a bash shell script that accepts a string as a parameter and reports all entries in /usr/share/dict/linux.words that are anagrams of that string.  Here are some examples:

$ ./allgrams "wyxha"
$ ./allgrams "live"
evil
live
veil
vile
$

A word is considered an anagram only if it contains exactly the same letters as the parameter, in any order; said another way, the anagram has no extra or missing letters, but the order of the letters does not matter.  You must name your script allgrams.  The order of the results can vary (although alphabetical is surely the most likely).  Ignore any non-alphabetic characters (in Unix parlance, consider only [a-zA-Z]) in both the parameter and the dictionary.  My solution has several subscripts and uses commands including sort, tr, if, test and sed.  Your solution may use an entirely different set of commands.
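To make the "same letters, any order" rule concrete, here is a minimal sketch of one way to reduce a string to its sorted letters, so that two strings are anagrams exactly when the results match.  The function name signature is hypothetical, folding upper case to lower case is an assumption of this sketch (the rule above does not say how case should be treated), and this is only one possible approach, not necessarily the one my solution takes.

```shell
#!/bin/bash
# signature: hypothetical helper, not a required part of the assignment.
# Prints the letters of its argument in sorted order, with every
# non-alphabetic character dropped.
signature() {
    printf '%s' "$1" |
        tr -cd 'a-zA-Z' |   # delete everything except letters
        tr 'A-Z' 'a-z' |    # fold to lower case (an assumption of this sketch)
        fold -w1 |          # put one letter on each line
        sort |              # order the letters alphabetically
        tr -d '\n'          # join the letters back into one string
    echo                    # finish the output line
}

signature "live"
signature "E-v-i-l"
```

Both calls print the same string, which is what makes the two arguments anagrams of each other under this rule.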

Both parts will be submitted via the Catalyst dropbox.  Submit a single file named hw2a.tar.gz that includes your allgrams.sh script, any supporting subscripts, and a script named test-allgrams.sh that takes no arguments and runs a set of test cases on allgrams.sh.  What is a tar.gz file?  A tar file (see man tar for more information) archives a set of files together in a single file.  The gz extension marks a file compressed with gzip.  Because creating an archive and then compressing it is so common, you can take both steps with a single command:

$ tar -czvf hw2a.tar.gz *.sh

The c option says to create an archive, the v option says it should list the files added to the archive (this will include allgrams.sh, test-allgrams.sh, and any subscripts), the f option gives the name of the file that will contain the archive, and the z option says "then run gzip on the archive."   The man pages (tar, gzip) can be used to figure out what to do if you need to decompress and/or extract files from the archive.
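If you want to check an archive before submitting it, the following sketch shows a round trip: create, list, and extract.  It works in a temporary directory with empty stand-in scripts (placeholders, not real solutions), and the directory name unpacked is arbitrary.  The t and x options are tar's standard "list" and "extract" modes.

```shell
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in scripts; the real ones would be your homework files.
printf '#!/bin/bash\n' > allgrams.sh
printf '#!/bin/bash\n' > test-allgrams.sh

tar -czf hw2a.tar.gz *.sh          # create the compressed archive (no v, so quiet)

tar -tzf hw2a.tar.gz               # t: list the contents without extracting

mkdir unpacked
tar -xzf hw2a.tar.gz -C unpacked   # x: extract; -C names the target directory
ls unpacked
```

Listing the archive first is a cheap way to confirm that every script you meant to submit is actually in it.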

Grading will be consistent with the objective stated above: (a) correct scripts are valued; and (b) common Unix style is valued.  This second point may be confusing, so let me clarify a little.  The way you solve problems in, for example, Java is in general quite different from the way you solve them in a Unix-based scripting language like bash.  For example, chaining commands together with pipes wherever possible is common in Unix scripting but not in Java, and extensive use of loops and recursion is more common in Java than in shell scripting.  Don't stress too much about this issue, but if you find yourself thinking about how to solve the problem in (say) Java and then converting it to bash, you may well be off track and should check in with the staff.
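As an illustration of the pipe-oriented style, on a task unrelated to anagrams, the sketch below counts how many words in a list start with each letter.  The function name first_letter_counts and the sample words are made up for this example.  Note that there is no explicit loop and no counter variable: each command does one small job and hands its output to the next.

```shell
#!/bin/bash
# first_letter_counts: hypothetical example function.
# Reads words on standard input, one per line, and prints how many
# start with each letter, most frequent letter first.
first_letter_counts() {
    cut -c1 |                   # keep only the first character of each line
        tr 'A-Z' 'a-z' |        # treat upper and lower case alike
        sort |                  # bring identical letters together
        uniq -c |               # collapse each run into "count letter"
        sort -k1,1nr -k2,2      # largest count first; ties in letter order
}

printf '%s\n' apple Avocado banana cherry cranberry apricot | first_letter_counts
```

For the six sample words this reports a count of 3 for a, 2 for c, and 1 for b.  A Java-style version would typically loop over the words and update a map of counters; here the sort and uniq -c pair does that bookkeeping.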