websphinx
Class StandardClassifier
java.lang.Object
|
+--websphinx.StandardClassifier
- All Implemented Interfaces:
- Classifier, java.io.Serializable
- public class StandardClassifier
- extends java.lang.Object
- implements Classifier
Standard classifier, installed in every crawler by default.
On the entire page, this classifier sets the following labels:
- root: page is the root page of a Web site. For instance,
"http://www.digital.com/" and "http://www.digital.com/index.html" are both
marked as root, but "http://www.digital.com/about" is not.
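The root rule described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `RootPageSketch`, the method `isRoot`, and the exact set of accepted paths (empty, "/", or "/index.html") are assumptions drawn only from the examples given here.

```java
import java.net.URI;

class RootPageSketch {
    // Assumption: "root" means the URL path is empty, "/", or "/index.html".
    // The file names the real classifier accepts are not listed on this page.
    static boolean isRoot(String url) {
        String path = URI.create(url).getPath();
        return path == null || path.isEmpty()
                || path.equals("/") || path.equals("/index.html");
    }

    public static void main(String[] args) {
        System.out.println(isRoot("http://www.digital.com/"));           // true
        System.out.println(isRoot("http://www.digital.com/index.html")); // true
        System.out.println(isRoot("http://www.digital.com/about"));      // false
    }
}
```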
Also sets one or more of the following labels on every link:
- hyperlink: link is a hyperlink (A, AREA, or FRAME tags) to another page on the Web (using the http, file, ftp, or gopher protocols).
- image: link is an inline image (IMG).
- form: link is a form (FORM tag). A form generally requires some parameters to use.
- code: link points to code (APPLET, EMBED, or SCRIPT).
- remote: link points to a different Web server.
- local: link points to the same Web server.
- same-page: link points to the same page (e.g., by an anchor reference like "#top").
- sibling: a local link that points to a page in the same directory (e.g., "sibling.html").
- descendent: a local link that points downwards in the directory structure (e.g., "deep/deeper/deepest.html").
- ancestor: a local link that points upwards in the directory structure (e.g., "../..").
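The location-based labels above (remote, local, same-page, sibling, descendent, ancestor) can be approximated by comparing the server and directory of the link with those of the page. The sketch below is only an illustration under stated assumptions: `LinkLabelSketch`, `labels`, `pathOf`, and `directoryOf` are hypothetical names, the comparison uses `java.net.URI` rather than any WebSPHINX type, "same server" is taken to mean same host and port, and query strings are ignored. The tag-based labels (hyperlink, image, form, code) are not covered.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

class LinkLabelSketch {
    // Treat an empty path (as in "http://host") as "/".
    static String pathOf(URI uri) {
        String p = uri.getPath();
        return (p == null || p.isEmpty()) ? "/" : p;
    }

    // Directory part of a path, including the trailing slash.
    static String directoryOf(String path) {
        return path.substring(0, path.lastIndexOf('/') + 1);
    }

    // Returns the location labels for href as seen from pageUrl.
    static Set<String> labels(String pageUrl, String href) {
        URI page = URI.create(pageUrl);
        URI link = page.resolve(href); // make the link absolute
        Set<String> out = new LinkedHashSet<>();

        // Assumption: same server means same host (case-insensitive) and port.
        String host = page.getHost();
        boolean sameServer = host != null
                && host.equalsIgnoreCase(link.getHost())
                && page.getPort() == link.getPort();
        if (!sameServer) {
            out.add("remote");
            return out;
        }
        out.add("local");

        String pagePath = pathOf(page);
        String linkPath = pathOf(link);
        String pageDir = directoryOf(pagePath);
        String linkDir = directoryOf(linkPath);
        if (linkPath.equals(pagePath)) {
            out.add("same-page");   // e.g. "#top"
        } else if (linkDir.equals(pageDir)) {
            out.add("sibling");     // same directory, different page
        } else if (linkDir.startsWith(pageDir)) {
            out.add("descendent");  // below the page's directory
        } else if (pageDir.startsWith(linkDir)) {
            out.add("ancestor");    // above the page's directory
        }
        return out;
    }

    public static void main(String[] args) {
        String page = "http://www.digital.com/a/b/c/page.html";
        for (String href : new String[] {
                "#top", "sibling.html", "deep/deeper/deepest.html",
                "../..", "http://www.example.org/"}) {
            System.out.println(href + " -> " + labels(page, href));
        }
    }
}
```

A local link into an unrelated directory (neither above nor below the page's directory) receives only the local label in this sketch; how the real classifier treats that case is not stated here.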
- See Also:
- Serialized Form
Field Summary
static float priority
- Priority of this classifier.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
priority
public static final float priority
- Priority of this classifier.
StandardClassifier
public StandardClassifier()
- Make a StandardClassifier.
classify
public void classify(Page page)
- Classify a page.
- Specified by:
- classify in interface Classifier
- Parameters:
- page - Page to classify
getPriority
public float getPriority()
- Get priority of this classifier.
- Specified by:
- getPriority in interface Classifier
- Returns:
- the priority of this classifier.