java.lang.Object
  |
  +--websphinx.DownloadParameters
Download parameters. These parameters limit how a Page can download a Link. A Crawler has a default set of download parameters, but the defaults can be overridden on individual links by calling Link.setDownloadParameters().
DownloadParameters is an immutable class (like String). "Changing" a parameter actually returns a new instance of the class with only the specified parameter changed.
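The copy-on-change behavior described above can be sketched as follows. This is a minimal, self-contained illustration and not the websphinx source: `ParamsSketch` is a hypothetical stand-in class that carries only two of the parameters, and its default values are assumptions made for the example. The `change...` method names mirror the ones documented on this page.

```java
// Hypothetical stand-in for illustration only; not the websphinx implementation.
final class ParamsSketch implements Cloneable {
    private int maxThreads = 4;       // assumed default, chosen for this sketch
    private String userAgent = null;  // null stands for "use the library default"

    @Override
    public ParamsSketch clone() {
        try {
            return (ParamsSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }

    int getMaxThreads() { return maxThreads; }

    String getUserAgent() { return userAgent; }

    // Copy-on-change: the receiver is never modified, a changed copy is returned.
    ParamsSketch changeMaxThreads(int maxThreads) {
        ParamsSketch copy = clone();
        copy.maxThreads = maxThreads;
        return copy;
    }

    ParamsSketch changeUserAgent(String userAgent) {
        ParamsSketch copy = clone();
        copy.userAgent = userAgent;
        return copy;
    }
}

class ParamsSketchDemo {
    public static void main(String[] args) {
        ParamsSketch defaults = new ParamsSketch();
        // Calls chain because each change returns a new instance.
        ParamsSketch tuned = defaults.changeMaxThreads(2).changeUserAgent("example-bot");
        System.out.println("defaults: " + defaults.getMaxThreads());
        System.out.println("tuned: " + tuned.getMaxThreads());
    }
}
```

Because every change returns a new object, a caller has to keep the returned reference. Calling a `change...` method and discarding its result leaves the original parameters untouched.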
Field Summary
private java.lang.String | acceptedMIMETypes
private int | crawlTimeout
private int | delay
private int | downloadTimeout
private boolean | interactive
private int | maxPageSize
private int | maxRequestsPerServer
private int | maxThreads
private boolean | obeyRobotExclusion
private boolean | useCaches
private java.lang.String | userAgent
Constructor Summary
DownloadParameters() | Make a DownloadParameters object with default settings.
Method Summary
DownloadParameters | changeAcceptedMIMETypes(java.lang.String types) | Change accepted MIME types.
DownloadParameters | changeCrawlTimeout(int timeout) | Change timeout on entire crawl.
DownloadParameters | changeDownloadTimeout(int timeout) | Change download timeout value.
DownloadParameters | changeInteractive(boolean f) | Change interactive flag.
DownloadParameters | changeMaxPageSize(int maxPageSize) | Change maximum page size.
DownloadParameters | changeMaxThreads(int maxthreads) | Change maximum threads.
DownloadParameters | changeObeyRobotExclusion(boolean f) | Change obey-robot-exclusion flag.
DownloadParameters | changeUseCaches(boolean f) | Change use-caches flag.
DownloadParameters | changeUserAgent(java.lang.String userAgent) | Change User-agent field used in HTTP requests.
java.lang.Object | clone() | Clone a DownloadParameters object.
java.lang.String | getAcceptedMIMETypes() | Get accepted MIME types.
int | getCrawlTimeout() | Get timeout on entire crawl.
int | getDownloadTimeout() | Get download timeout value.
boolean | getInteractive() | Get interactive flag.
int | getMaxPageSize() | Get maximum page size.
int | getMaxThreads() | Get maximum threads.
boolean | getObeyRobotExclusion() | Get obey-robot-exclusion flag.
boolean | getUseCaches() | Get use-caches flag.
java.lang.String | getUserAgent() | Get User-agent header used in HTTP requests.
Methods inherited from class java.lang.Object |
|
Field Detail |
private int maxThreads
private int maxPageSize
private int downloadTimeout
private int crawlTimeout
private boolean obeyRobotExclusion
private int maxRequestsPerServer
private int delay
private boolean interactive
private boolean useCaches
private java.lang.String acceptedMIMETypes
private java.lang.String userAgent
Constructor Detail |
public DownloadParameters()
Method Detail |
public java.lang.Object clone()
Overrides: clone in class java.lang.Object

public int getMaxThreads()

public DownloadParameters changeMaxThreads(int maxthreads)
Parameters: maxthreads - maximum number of background threads used by crawler

public int getMaxPageSize()

public DownloadParameters changeMaxPageSize(int maxPageSize)
Parameters: maxPageSize - maximum page size in kilobytes

public int getDownloadTimeout()

public DownloadParameters changeDownloadTimeout(int timeout)
Parameters: timeout - length of time (in seconds) to wait for a page to download. Use a negative value to turn off timeout.

public int getCrawlTimeout()

public DownloadParameters changeCrawlTimeout(int timeout)
Parameters: timeout - maximum length of time (in seconds) that crawler will run. Use a negative value to turn off timeout.

public boolean getObeyRobotExclusion()

public DownloadParameters changeObeyRobotExclusion(boolean f)
Parameters: f - If true, then the crawler checks robots.txt on the remote Web site before downloading a page.

public boolean getInteractive()

public DownloadParameters changeInteractive(boolean f)
Parameters: f - true if a user is available to respond to dialog boxes

public boolean getUseCaches()

public DownloadParameters changeUseCaches(boolean f)
Parameters: f - true if cached pages should be used whenever possible

public java.lang.String getAcceptedMIMETypes()

public DownloadParameters changeAcceptedMIMETypes(java.lang.String types)
Parameters: types - list of MIME types that can be handled by the crawler. Use null if the crawler can handle anything.

public java.lang.String getUserAgent()

public DownloadParameters changeUserAgent(java.lang.String userAgent)
Parameters: userAgent - user-agent field used in HTTP requests. Pass null to use the Java library's default user-agent field.
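
The unit conventions documented above (timeouts in seconds with a negative value meaning "no timeout", page size in kilobytes) are easy to get wrong when passing values on to code that expects milliseconds or bytes. The sketch below shows one way a caller might interpret them. `LimitUnits` and its methods are hypothetical helpers, not part of websphinx; the choice of 1024 bytes per kilobyte and the handling of a zero timeout are assumptions, since this page does not specify either.

```java
// Hypothetical helpers for illustration only; not part of websphinx.
final class LimitUnits {
    private LimitUnits() {}

    // Convert a timeout in seconds to milliseconds.
    // A negative input means "timeout turned off" and is reported as -1.
    static long timeoutMillis(int timeoutSeconds) {
        return timeoutSeconds < 0 ? -1L : timeoutSeconds * 1000L;
    }

    // True if the elapsed time exceeds the limit; never true when the timeout is off.
    // Assumption: a timeout of 0 expires as soon as any time has elapsed.
    static boolean timedOut(long elapsedMillis, int timeoutSeconds) {
        long limit = timeoutMillis(timeoutSeconds);
        return limit >= 0 && elapsedMillis > limit;
    }

    // Convert a page-size limit in kilobytes to bytes.
    // Assumption: 1 kilobyte = 1024 bytes.
    static long maxPageBytes(int maxPageSizeKb) {
        return maxPageSizeKb * 1024L;
    }
}

class LimitUnitsDemo {
    public static void main(String[] args) {
        System.out.println(LimitUnits.timeoutMillis(30));
        System.out.println(LimitUnits.timedOut(1_000_000L, -1));
        System.out.println(LimitUnits.maxPageBytes(100));
    }
}
```

The arithmetic is done in `long` so that large second or kilobyte values do not overflow `int` during conversion.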