The following is a suggested path for completing this assignment. This is provided merely as a recommendation for anyone that may have a hard time knowing how to get started. By no means are you required to follow this plan of attack.

  1. Spend some time reading the program description and making sure that you understand it on paper before you start implementing anything. Make sure to also spend some time thinking about some of the design issues that I haven't specified for you (see the second to last paragraph in the assignment description).

  2. Set up the List, Stack, and Queue ADTs. For good modularity, put each in its own file. Implement any routines that the book doesn't supply for you. Note that the code in the book is available on-line (check the textbook section for the link). Note that you will need to instantiate the ADTs for the specific types that you plan to be storing in each (the book's implementation is for a non-specified type).

  3. To make sure that the implementations are working, and working the way you expect them to, I'd strongly recommend writing some short test codes that manipulate each of the ADTs to ensure that they work as you expect (using integers, for example). Although this doesn't get you any closer to finishing the program, it's a good way to build up confidence in the ADT implementation and your ability to use them before doing anything complicated with them.

  4. Write a routine that takes a filename as an argument, opens the file, reads it in a character at a time, and prints the characters to the console (this should just result in the file being printed to the screen). Once you've done this, you've done all the file I/O you'll need to for this program.

  5. Test the above by prompting the user for a root filename and sending the filename to your function. In general, you should test each of the following steps before proceeding to the next too (I won't say so every time).

  6. Modify the above code so that only the tags are printed out but none of the text that falls between tags.

  7. Modify the above so that only the first word of each tag (the characters between the "<" and the next space or ">" that you see) is stored into a string and then printed to the console at the end of the tag.

  8. Classify the string that you got in the previous step as being one of the three tag types: singleton, opener, or closer.

    I'd say that this is the minimal amount of work you should try to get done in the first week. You should make a strong effort to get through a few more steps, though, so that you start working with the ADTs.

  9. Take the appropriate action based on the tag type: ignore it, push it on the stack, or compare it to the tag on top of the stack.

  10. Do what you need to do to generate appropriate error messages when an unbalanced closer is encountered.

  11. Check the stack once the entire file has been read and generate an appropriate error message if it isn't empty. At this point your program should work for a single HTML file, not following any of its hyperlinks.

    I'd say that this is a good amount of work to try to get done in the first week. If you can, you're over the hump, for sure

  12. Put in code that reads a filename whenever you get to an anchor tag. The filename will be the string after the <a that falls within double quotes "". For now, print out this filename so that you can check that it's working right.

  13. Enqueue this filename into your queue.

  14. At this point, you're ready to make your program be driven by the queue -- rather than sending your original root filename directly to a function, enqueue it in the queue, and then have a loop that pulls things out of the queue and processes them until the queue is empty.

  15. Make your function that processes a filename return without an error if the file is not found when you open it (we're assuming that files that don't exist are remote links and can be ignored). You may want to print out a message to this effect. At this point your code should work for HTML webs that have no recursive hyperlinks.

  16. At this point, change your program so that every time you're done with a file, you add it to your list of files that you've processed.

  17. Now change it so that before processing a file, you check your list first to make sure that it's something that you haven't processed before. Once you have this working, your program should be able to handle general webs.

  18. Make the output your program creates as clear and useful as possible. Be sure to take out any printing that you were doing just to debug it.