Package websphinx

Interface Summary
Action  
Classifier Classifier interface.
CrawlListener Crawl event listener.
LinkListener Link event listener.
LinkPredicate  
PagePredicate  

Class Summary
Chronicle Run a crawler periodically.
Concatenator Transformer that concatenates multiple pages into a single HTML page.
CrawlAdapter Adapter for CrawlListener interface.
Crawler Web crawler.
CrawlEvent Crawling event.
CrawlTimer  
DownloadParameters Download parameters.
Element Element in an HTML page.
EventLog Crawling monitor that writes messages to standard output or a file.
Form <FORM> element in an HTML page.
FormButton Button element in an HTML form -- for example, <INPUT TYPE=submit> or <INPUT TYPE=image>.
Hashtable2  
HTMLParser HTML parser.
HTMLTransformer  
Link Link to a Web page.
LinkEvent Link event.
LinkTransformer Transformer that remaps URLs in links.
Mirror Offline mirror of a Web site.
MirrorTransformer  
Netscape4Policy  
Page A Web page.
Pattern Base class for pattern matchers.
PatternMatcher  
RecordTransformer  
Regexp  
RegexpMatcher  
Region Region of an HTML page.
RewritableLinkTransformer Transformer that remaps URLs in links so that, if the URL mapping changes while or after some HTML is transformed, the already-written HTML can be fixed up afterward.
RewriteRegion  
RobotExclusion  
SecurityPolicy  
StandardClassifier Standard classifier, installed in every crawler by default.
Tag Tag in an HTML page.
Tagexp Tag pattern.
TagexpMatcher  
Text Tagless text regions on an HTML page.
Wildcard Wildcard pattern.
Worm  
WormTimer
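
The RewritableLinkTransformer entry above describes fixing up already-written HTML when the URL mapping changes. The sketch below illustrates one way that idea could work; it is not the websphinx implementation. The class DeferredLinkRewriter and its methods (map, writeText, writeUrl, fixUp, result) are hypothetical names introduced only for this illustration: each emitted URL is recorded with its offset and length so it can be replaced later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the websphinx package.
final class DeferredLinkRewriter {
    // One URL occurrence that was written into the output buffer.
    private static final class Slot {
        final String originalUrl;
        int start;   // offset of the emitted URL in the buffer
        int length;  // length of the text currently emitted for it

        Slot(String originalUrl, int start, int length) {
            this.originalUrl = originalUrl;
            this.start = start;
            this.length = length;
        }
    }

    private final StringBuilder out = new StringBuilder();
    private final List<Slot> slots = new ArrayList<>();
    private final Map<String, String> mapping = new HashMap<>();

    // Add or change a URL mapping; may be called at any time.
    void map(String from, String to) {
        mapping.put(from, to);
    }

    // Append literal HTML that contains no remappable URL.
    void writeText(String html) {
        out.append(html);
    }

    // Append a URL using the current mapping and remember where it went.
    void writeUrl(String originalUrl) {
        String emitted = mapping.getOrDefault(originalUrl, originalUrl);
        slots.add(new Slot(originalUrl, out.length(), emitted.length()));
        out.append(emitted);
    }

    // Re-apply the current mapping to every URL written so far.
    void fixUp() {
        int shift = 0; // how far earlier replacements moved later text
        for (Slot s : slots) { // slots are stored in increasing offset order
            s.start += shift;
            String current = mapping.getOrDefault(s.originalUrl, s.originalUrl);
            out.replace(s.start, s.start + s.length, current);
            shift += current.length() - s.length;
            s.length = current.length();
        }
    }

    String result() {
        return out.toString();
    }
}
```

In this sketch a caller writes HTML with writeText and writeUrl, changes the mapping with map whenever a page's final location becomes known, and calls fixUp before reading result. Recording offsets avoids re-parsing the HTML that was already written.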