Project 2
Notes on Part B
Deadline Change: Electronic submission is due Wednesday, Feb. 5 (usual time);
paperwork is due in class the next day.
These notes don't stand alone. Refer back to the
main Project 2 instructions for a base
description of the project. In particular, those instructions described briefly
the three oracle types to be added to Part B.
[New! OracleMall and OracleUtils (1/31). View README before using.]
New specifications:
- Text searcher oracles should be of type GenericTextSearcherOracle, and
their omens should be of type GenericTextSearcherOmen.
- The identity of the source text being searched should be available to
clients of any type of text searcher oracle or omen. In particular:
- Such oracles and omens must implement a public method String
getTextSource(), which returns the full name of the file being
searched (path name or URL). The name should be in a format that would
be directly usable in another Java program that needed to locate or search
the same file. (In other words, don't abbreviate the name or add
formatting, comments, etc.).
- The value returned by an omen's interpretInDetail() method should
include the text source, in addition to the other information expected of
interpretInDetail. (Unlike in the requirement above, here the text source
is formatted as part of a larger message.)
- The numeric result of an omen is refined and reported more carefully.
In particular, all text searcher omen types should make two pieces of numeric
information available to clients: the location of the match and the quality
of the match. As before, the numeric information should also appear in a
formatted form as part of the interpretInDetail output.
- int getMatchLocation() tells the line number
within the file where the match occurred (or began, in the case of matches
that span several lines). As before, lines are numbered from 1.
The value should be negative if there was no match at all.
- double getMatchScore() tells how good the match was.
This value should be 0.0 if there was no match at all; 1.0 if there was a
perfect match. Values between 0.0 and 1.0 indicate some degree of
partial match (for the current project, no partial matches are defined or
required; you may define partial match values if you can do it in a
consistent way.)
- When two omens (or two matches) are compared, the one with the higher
match score is considered more favorable. If the two have
exactly the same match score, then the one which occurs earlier in the file
is considered more favorable.
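As a rough illustration of the omen-side requirements above, the sketch below gathers the three accessor methods into one interface and compares two results by favorability. Only the signatures getTextSource(), getMatchLocation(), and getMatchScore() come from these notes; the names TextSearchResult, SimpleResult, and Favorability are hypothetical and are not the project's GenericTextSearcherOmen.

```java
/** Hypothetical interface; only the three method signatures come from the notes. */
interface TextSearchResult {
    String getTextSource();   // full path name or URL, with no added formatting
    int getMatchLocation();   // line number, counted from 1; negative if no match
    double getMatchScore();   // 0.0 = no match, 1.0 = perfect match
}

/** Hypothetical minimal implementation, used only to exercise the comparison. */
final class SimpleResult implements TextSearchResult {
    private final String textSource;
    private final int matchLocation;
    private final double matchScore;

    SimpleResult(String textSource, int matchLocation, double matchScore) {
        this.textSource = textSource;
        this.matchLocation = matchLocation;
        this.matchScore = matchScore;
    }

    public String getTextSource() { return textSource; }
    public int getMatchLocation() { return matchLocation; }
    public double getMatchScore() { return matchScore; }
}

final class Favorability {
    /**
     * Returns a negative number if a is more favorable than b, a positive
     * number if b is more favorable, and 0 if neither wins.
     */
    static int compare(TextSearchResult a, TextSearchResult b) {
        // Higher match score wins, so compare b against a.
        int byScore = Double.compare(b.getMatchScore(), a.getMatchScore());
        if (byScore != 0) {
            return byScore;
        }
        // Equal scores: the match that occurs earlier in the file wins.
        return Integer.compare(a.getMatchLocation(), b.getMatchLocation());
    }
}
```

How two results with no match at all should be ordered is not specified in the notes, so this sketch simply lets the location tiebreak decide that case.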
Clarifications
- Your program should be able to correctly process any normal (ASCII) text
file, regardless of file name or length. If the program is given a file
of some other format, such as a binary file, your program does not have to
interpret the data -- but it must not blow up, under any circumstance.
It should simply return no omen, or return an omen which shows no match for
the required data. To achieve this, you will need to pay attention to
the exceptions which can arise from methods that you call.
- The file and stream processing you do should use Java classes from the
java.io package. Do not use any uwcse or other external code. As mentioned
originally, process the file data as you get it, rather than storing the whole
file in an array or other structure to search later. Of course, you are
welcome to look at the sample solution to project 1 for ideas on how to
process the file.
- For the Secret Prophecy Finder Plus! oracle/omen only...
- The requests are taken from a file. This should be a text file.
The user should select it using a File Browser. Each line of the file
is taken to be a complete and independent request. All requests
are processed against the same source text file.
- The interpretInDetail method should list the best matches found.
In particular, it should list all of the perfect matches, or state that
there were none. For each match listed, it should include the
original request string, the match location, and the match score.
- The (overall) getMatchLocation and getMatchScore should both refer
to the same match, and that should be the "best" match found.
- The requirements from Part A, including the operation of the Luck Tester
Oracle, still apply. You will have to turn in Luck Tester again and it may
be tested again. You are free to use or adapt any official sample
solution code that is posted.
- There will probably be a new version of OracleMall within the next couple
of days.
- Contest. If there is a contest, separate instructions will
follow. Contest results will not affect your project grade.
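The first two clarifications (never blow up, and process the file data as it arrives) might be combined along the following lines. This is a minimal sketch using only java.io. The class and method names are hypothetical, and a plain substring test stands in for whatever matching rule a given oracle actually uses.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

final class SafeLineScan {
    /**
     * Reads the source one line at a time and returns the 1-based number of
     * the first line containing target, or -1 if there is no such line or
     * the source cannot be read.
     */
    static int firstLineContaining(Reader source, String target) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            int lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                if (line.contains(target)) {
                    return lineNo;   // stop at the first match
                }
            }
        } catch (IOException | RuntimeException e) {
            // Unreadable or unexpected input: fall through and report no match.
        }
        return -1;
    }

    /** Same search, starting from a file name; a missing file yields -1. */
    static int firstLineContaining(String fileName, String target) {
        try {
            return firstLineContaining(new FileReader(fileName), target);
        } catch (IOException | RuntimeException e) {
            return -1;
        }
    }
}
```

Only one line is held in memory at a time, and every failure path ends in the same "no match" value, which follows the convention that the match location is negative when nothing is found.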
Examples
A "letter" is any of the letters a-z of the English alphabet. A
"word" is a sequence of consecutive letters, preceded by whitespace or punctuation,
and followed by whitespace or punctuation. Nothing else is a word.
(Note that under this definition, a word cannot span more than one line.) In the
following
R U aware, CSE142 and 143 is the funnest d*** course, on this
plnet!
the words are:
R U aware and is the funnest course on this
plnet
Each of the following contains two words:
O'Brien, helter-skelter, MyProject.java, base-10-Ethernet,
Nick's
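One way to express the word definition in code is a regular expression with lookarounds. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes that the start and end of a line count as word boundaries and that any character other than a letter or digit counts as whitespace or punctuation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordSplitter {
    // A run of letters that is not directly preceded or followed by a letter
    // or digit. Treating every other character as "whitespace or punctuation"
    // is an assumption of this sketch.
    private static final Pattern WORD =
            Pattern.compile("(?<![A-Za-z0-9])[A-Za-z]+(?![A-Za-z0-9])");

    /** Returns the words of a single line, in order of appearance. */
    static List<String> wordsIn(String line) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }
}
```

Under these assumptions, wordsIn("O'Brien") yields O and Brien, and wordsIn("base-10-Ethernet") yields base and Ethernet, because the digits are not adjacent to either run of letters.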
"P2Main.java" is one word ("java").
When matching a request string to the file, you should ignore any characters
in the request that are not English letters. Case is not considered in making a match.
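The request rule can be sketched as a small normalizing step applied before any matching is done. The class and method names below are hypothetical, and treating only the ASCII letters A-Z and a-z as English letters is an assumption.

```java
final class RequestNormalizer {
    /** Keeps only English letters and folds them to lower case. */
    static String normalize(String request) {
        StringBuilder letters = new StringBuilder();
        for (char c : request.toCharArray()) {
            boolean isEnglishLetter =
                    (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            if (isEnglishLetter) {
                letters.append(Character.toLowerCase(c));
            }
        }
        return letters.toString();
    }
}
```

Under this sketch, normalize("July 4, 1776") and normalize("1776, ju (1776)-ly") both return "july", so two such requests would be searched for identically.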
For example, the following three requests should give identical results:
"July 4, 1776" " Jul Y 5 1976"
"1776, ju (1776)-ly"
The following line of a text file:
"Someone took Oscar Peterson's last song"
will match any of these requests (and many others):
"STOP" "top" "tops"
Just as with the original Text Searcher, your program should stop when it
finds the first (complete) match to the request. The line number reported
should be the line of the file where the match begins. The text captured
in the omen should include that line plus any following lines that are part of
the match. For example, given the following lines (with line numbers
shown):
55 a b c d
56 e f g
57 h i j k
59 l m n o p
If the request is "DE", then the omen will contain 55 as the line number, and
the two matching lines, both of which the omen should report, are
a b c d
e f g
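The examples in this section read as though a request is matched against the initial letters of consecutive words ("STOP" against Someone took Oscar Peterson's; "DE" against the d that ends one line and the e that starts the next). These notes do not state that rule outright, so treat it as an assumption. Under that assumption, a streaming search that reports the starting line and captures the spanned lines might look roughly like this. All class and method names are hypothetical, and lines are numbered from 1 within the text supplied.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InitialsSearch {
    /** Result of a successful search: where it began and the lines it covers. */
    static final class Match {
        final int startLine;        // 1-based line where the match begins
        final List<String> lines;   // that line plus following lines in the match
        Match(int startLine, List<String> lines) {
            this.startLine = startLine;
            this.lines = lines;
        }
    }

    // Assumed word rule: a run of letters not adjacent to a letter or digit.
    private static final Pattern WORD =
            Pattern.compile("(?<![A-Za-z0-9])[A-Za-z]+(?![A-Za-z0-9])");

    /** Returns the first match, or null if there is none or reading fails. */
    static Match find(Reader source, String request) {
        String wanted = request.replaceAll("[^A-Za-z]", "").toLowerCase(Locale.ROOT);
        if (wanted.isEmpty()) {
            return null;
        }
        StringBuilder window = new StringBuilder();           // most recent initials
        ArrayDeque<Integer> windowLines = new ArrayDeque<>(); // line of each initial
        ArrayDeque<String> kept = new ArrayDeque<>();         // lines keptFrom..current
        int keptFrom = 1;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            int lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                kept.addLast(line);
                Matcher m = WORD.matcher(line);
                while (m.find()) {
                    window.append(Character.toLowerCase(m.group().charAt(0)));
                    windowLines.addLast(lineNo);
                    if (window.length() > wanted.length()) {
                        window.deleteCharAt(0);
                        windowLines.removeFirst();
                    }
                    if (wanted.contentEquals(window)) {
                        int start = windowLines.peekFirst();
                        for (; keptFrom < start; keptFrom++) kept.removeFirst();
                        return new Match(start, new ArrayList<>(kept));
                    }
                }
                // Discard lines that can no longer be part of a match.
                int needFrom = windowLines.isEmpty() ? lineNo + 1 : windowLines.peekFirst();
                for (; keptFrom < needFrom; keptFrom++) kept.removeFirst();
            }
        } catch (IOException e) {
            // Unreadable source: treat as no match.
        }
        return null;
    }
}
```

The sketch keeps only the lines that could still belong to a match, rather than the whole file, and stops at the first complete match. Given the three lines a b c d, e f g, and h i j k with the request "DE", it reports the first of those lines as the start and captures the first two lines.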
Further hints, etc., are on the main P2 page.