The following is a suggested path for completing this
assignment. This is provided merely as a recommendation for anyone
that may have a hard time knowing how to get started. By no means are
you required to follow this plan of attack.
- Spend some time reading the program description and making
sure that you understand it on paper before you start implementing
anything. Make sure to also spend some time thinking about some of
the design issues that I haven't specified for you (see the second to
last paragraph in the assignment
description).
- Set up the List, Stack, and Queue ADTs. For good modularity,
put each in its own file. Implement any routines that the book
doesn't supply for you. Note that the code in the book is available
on-line (check the textbook
section for the link). Note that you will need to instantiate the
ADTs for the specific types that you plan to be storing in each (the
book's implementation is for a non-specified type).
- To make sure that the implementations are working, and working
the way you expect them to, I'd strongly recommend writing some short
test codes that manipulate each of the ADTs to ensure that they work
as you expect (using integers, for example). Although this doesn't
get you any closer to finishing the program, it's a good way to build
up confidence in the ADT implementation and your ability to use them
before doing anything complicated with them.
- Write a routine that takes a filename as an argument, opens
the file, reads it in a character at a time, and prints the characters
to the console (this should just result in the file being printed to
the screen). Once you've done this, you've done all the file I/O
you'll need to for this program.
- Test the above by prompting the user for a root filename and
sending the filename to your function. In general, you should test
each of the following steps before proceeding to the next too (I won't
say so every time).
- Modify the above code so that only the tags are printed out
but none of the text that falls between tags.
- Modify the above so that only the first word of each tag (the
characters between the "<" and the next space or ">" that you
see) is stored into a string and then printed to the console at the
end of the tag.
- Classify the string that you got in the previous step as being
one of the three tag types: singleton, opener, or closer.
I'd say that this is the minimal amount of work you
should try to get done in the first week. You should make a strong
effort to get through a few more steps, though, so that you start
working with the ADTs.
- Take the appropriate action based on the tag type: ignore it,
push it on the stack, or compare it to the tag on top of the
stack.
- Do what you need to do to generate appropriate error messages
when an unbalanced closer is encountered.
- Check the stack once the entire file has been read and
generate an appropriate error message if it isn't empty. At this
point your program should work for a single HTML file, not following
any of its hyperlinks.
I'd say that this is a good amount of work to try to get done in
the first week. If you can, you're over the hump, for sure
- Put in code that reads a filename whenever you get to an
anchor tag. The filename will be the string after the
<a
that falls within double quotes "". For now, print out this
filename so that you can check that it's working right.
- Enqueue this filename into your queue.
- At this point, you're ready to make your program be driven by
the queue -- rather than sending your original root filename directly
to a function, enqueue it in the queue, and then have a loop that
pulls things out of the queue and processes them until the queue is
empty.
- Make your function that processes a filename return without an
error if the file is not found when you open it (we're assuming that
files that don't exist are remote links and can be ignored). You may
want to print out a message to this effect. At this point your
code should work for HTML webs that have no recursive
hyperlinks.
- At this point, change your program so that every time you're
done with a file, you add it to your list of files that you've
processed.
- Now change it so that before processing a file, you check your
list first to make sure that it's something that you haven't processed
before. Once you have this working, your program should be able to
handle general webs.
- Make the output your program creates as clear and useful as
possible. Be sure to take out any printing that you were doing just
to debug it.