
Class weka.classifiers.ThresholdSelector

java.lang.Object
    |
    +----weka.classifiers.Classifier
            |
            +----weka.classifiers.DistributionClassifier
                    |
                    +----weka.classifiers.ThresholdSelector

public class ThresholdSelector
extends DistributionClassifier
implements OptionHandler
Class for selecting a threshold on a probability output by a distribution classifier. The threshold is set so that a given performance measure is optimized. Currently this is the F-measure. Performance is measured either on the training data, a hold-out set or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities will reside between 0 and 1 (this is useful if the scheme normally produces probabilities in a very narrow range).
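
The threshold search described above can be sketched in plain Java. This is a minimal illustration and not WEKA's implementation: the class name FMeasureThresholdSketch, its methods, and the choice to try every observed probability as a candidate threshold are all assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and search strategy are assumptions, not WEKA code. */
public class FMeasureThresholdSketch {

    /** F-measure (harmonic mean of precision and recall) computed from raw counts. */
    public static double fMeasure(int tp, int fp, int fn) {
        if (tp == 0) {
            return 0.0; // without true positives the F-measure is treated as 0
        }
        double precision = (double) tp / (tp + fp);
        double recall = (double) tp / (tp + fn);
        return 2 * precision * recall / (precision + recall);
    }

    /**
     * Tries each observed probability as a threshold and returns the one with
     * the highest F-measure. An instance is predicted positive when its
     * probability for the designated class is >= the threshold.
     */
    public static double selectThreshold(double[] probs, boolean[] isPositive) {
        double[] candidates = probs.clone();
        Arrays.sort(candidates);
        double bestThreshold = 0.5; // fallback when no candidate scores above 0
        double bestF = 0.0;
        for (double t : candidates) {
            int tp = 0, fp = 0, fn = 0;
            for (int i = 0; i < probs.length; i++) {
                boolean predicted = probs[i] >= t;
                if (predicted && isPositive[i]) tp++;
                else if (predicted) fp++;
                else if (isPositive[i]) fn++;
            }
            double f = fMeasure(tp, fp, fn);
            if (f > bestF) { // ties keep the lowest threshold seen first
                bestF = f;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    public static void main(String[] args) {
        double[] probs = {0.10, 0.35, 0.40, 0.80};
        boolean[] actual = {false, true, false, true};
        System.out.println("threshold = " + selectThreshold(probs, actual));
    }
}
```

In this sketch the probabilities and labels would come from whichever evaluation data is in use (training data, a hold-out set, or cross-validation folds).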

Valid options are:

-C num
The class for which the threshold is determined. Valid values are: 1, 2 (for the first and second classes, respectively), 3 (for whichever class is least frequent), 4 (for whichever class is most frequent), and 5 (for the first class named any of "yes", "pos(itive)", or "1"; method 3 is used if no name matches). (default 5).

-W classname
Specify the full class name of the base classifier.

-X num
Number of folds used for cross-validation. If only a hold-out set is used, this determines the size of the hold-out set (default 3).

-R integer
Sets whether confidence range correction is applied. This can be used to ensure the confidences range from 0 to 1. Use 0 for no range correction, 1 for correction based on the min/max values seen during threshold selection (default 0).

-S seed
Random number seed (default 1).

-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation, 1 for evaluation using a hold-out set, and 2 for evaluation on the training data (default 1).

Options after -- are passed to the designated sub-classifier.
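
The options listed above can be assembled into an argument array as sketched below. This is only an illustration: the helper class ThresholdSelectorOptionsSketch, the base classifier name my.pkg.HypotheticalClassifier, and the sub-classifier flag -K are placeholders invented for the example. The resulting array has the shape that setOptions(String[]) is documented to accept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper; not part of WEKA. */
public class ThresholdSelectorOptionsSketch {

    /** Builds the option array in the order the flags are documented. */
    public static String[] buildOptions(int designatedClass, String baseClassifier,
                                        int folds, int rangeCorrection, int seed,
                                        int evalMode, String... baseOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-C"); opts.add(String.valueOf(designatedClass));
        opts.add("-W"); opts.add(baseClassifier);
        opts.add("-X"); opts.add(String.valueOf(folds));
        opts.add("-R"); opts.add(String.valueOf(rangeCorrection));
        opts.add("-S"); opts.add(String.valueOf(seed));
        opts.add("-E"); opts.add(String.valueOf(evalMode));
        if (baseOptions.length > 0) {
            opts.add("--"); // everything after "--" goes to the sub-classifier
            opts.addAll(Arrays.asList(baseOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Placeholder classifier name and sub-option, chosen only for illustration.
        String[] options = buildOptions(5, "my.pkg.HypotheticalClassifier", 3, 0, 1, 1, "-K");
        System.out.println(String.join(" ", options));
    }
}
```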

Version:
$Revision: 1.22 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)

Variable Index

 o EVAL_CROSS_VALIDATION
 
 o EVAL_TRAINING_SET
 
 o EVAL_TUNED_SPLIT
 
 o OPTIMIZE_0
 
 o OPTIMIZE_1
 
 o OPTIMIZE_LFREQ
 
 o OPTIMIZE_MFREQ
 
 o OPTIMIZE_POS_NAME
 
 o RANGE_BOUNDS
 
 o RANGE_NONE
 
 o TAGS_EVAL
 
 o TAGS_OPTIMIZE
 
 o TAGS_RANGE
 

Constructor Index

 o ThresholdSelector()
 

Method Index

 o buildClassifier(Instances)
Generates the classifier.
 o designatedClassTipText()
 
 o distributionClassifierTipText()
 
 o distributionForInstance(Instance)
Calculates the class membership probabilities for the given test instance.
 o evaluationModeTipText()
 
 o getDesignatedClass()
Gets the method to determine which class value to optimize.
 o getDistributionClassifier()
Get the DistributionClassifier used as the classifier.
 o getEvaluationMode()
Gets the evaluation mode used.
 o getNumXValFolds()
Get the number of folds used for cross-validation.
 o getOptions()
Gets the current settings of the Classifier.
 o getRangeCorrection()
Gets the confidence range correction mode used.
 o getSeed()
Gets the random number seed.
 o globalInfo()
 
 o listOptions()
Returns an enumeration describing the available options
 o main(String[])
Main method for testing this class.
 o numXValFoldsTipText()
 
 o rangeCorrectionTipText()
 
 o seedTipText()
 
 o setDesignatedClass(SelectedTag)
Sets the method to determine which class value to optimize.
 o setDistributionClassifier(DistributionClassifier)
Set the DistributionClassifier for which threshold is set.
 o setEvaluationMode(SelectedTag)
Sets the evaluation mode used.
 o setNumXValFolds(int)
Set the number of folds used for cross-validation.
 o setOptions(String[])
Parses a given list of options.
 o setRangeCorrection(SelectedTag)
Sets the confidence range correction mode used.
 o setSeed(int)
Sets the seed for random number generation.
 o toString()
Returns a description of the cross-validated classifier.

Field Detail

 o RANGE_NONE
public static final int RANGE_NONE
 o RANGE_BOUNDS
public static final int RANGE_BOUNDS
 o TAGS_RANGE
public static final Tag[] TAGS_RANGE
 o EVAL_TRAINING_SET
public static final int EVAL_TRAINING_SET
 o EVAL_TUNED_SPLIT
public static final int EVAL_TUNED_SPLIT
 o EVAL_CROSS_VALIDATION
public static final int EVAL_CROSS_VALIDATION
 o TAGS_EVAL
public static final Tag[] TAGS_EVAL
 o OPTIMIZE_0
public static final int OPTIMIZE_0
 o OPTIMIZE_1
public static final int OPTIMIZE_1
 o OPTIMIZE_LFREQ
public static final int OPTIMIZE_LFREQ
 o OPTIMIZE_MFREQ
public static final int OPTIMIZE_MFREQ
 o OPTIMIZE_POS_NAME
public static final int OPTIMIZE_POS_NAME
 o TAGS_OPTIMIZE
public static final Tag[] TAGS_OPTIMIZE

Constructor Detail

 o ThresholdSelector
public ThresholdSelector()

Method Detail

 o listOptions
public java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
Returns:
an enumeration of all the available options
 o setOptions
public void setOptions(java.lang.String options[]) throws java.lang.Exception
          Parses a given list of options. Valid options are:

-C num
The class for which the threshold is determined. Valid values are: 1, 2 (for the first and second classes, respectively), 3 (for whichever class is least frequent), 4 (for whichever class is most frequent), and 5 (for the first class named any of "yes", "pos(itive)", or "1"; method 3 is used if no name matches). (default 3).

-W classname
Specify the full class name of the base classifier for which the threshold is selected.

-X num
Number of folds used for cross-validation. If only a hold-out set is used, this determines the size of the hold-out set (default 3).

-R integer
Sets whether confidence range correction is applied. This can be used to ensure the confidences range from 0 to 1. Use 0 for no range correction, 1 for correction based on the min/max values seen during threshold selection (default 0).

-S seed
Random number seed (default 1).

-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation, 1 for evaluation using a hold-out set, and 2 for evaluation on the training data (default 1).

Options after -- are passed to the designated sub-classifier.

Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
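
The rule that options after -- are passed to the sub-classifier can be illustrated with a small splitting helper. This is a sketch under assumptions, not the parsing code used by setOptions: the class OptionSplitSketch and its split method are hypothetical names.

```java
import java.util.Arrays;

/** Illustrative sketch; not WEKA's option parser. */
public class OptionSplitSketch {

    /**
     * Splits an option array at the first "--".
     * Element 0 holds the options before it, element 1 the options after it.
     */
    public static String[][] split(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        // No separator: every option belongs to the wrapper, none to the sub-classifier.
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        String[][] parts = split(new String[] {"-X", "3", "-E", "0", "--", "-K"});
        System.out.println("own options: " + Arrays.toString(parts[0]));
        System.out.println("sub-classifier options: " + Arrays.toString(parts[1]));
    }
}
```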
 o getOptions
public java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
Returns:
an array of strings suitable for passing to setOptions
 o buildClassifier
public void buildClassifier(Instances instances) throws java.lang.Exception
          Generates the classifier.
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully
Overrides:
buildClassifier in class Classifier
 o distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
          Calculates the class membership probabilities for the given test instance.
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if instance could not be classified successfully
Overrides:
distributionForInstance in class DistributionClassifier
 o globalInfo
public java.lang.String globalInfo()
Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui
 o designatedClassTipText
public java.lang.String designatedClassTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o getDesignatedClass
public SelectedTag getDesignatedClass()
          Gets the method to determine which class value to optimize. Will be one of OPTIMIZE_0, OPTIMIZE_1, OPTIMIZE_LFREQ, OPTIMIZE_MFREQ, OPTIMIZE_POS_NAME.
Returns:
the class selection mode.
 o setDesignatedClass
public void setDesignatedClass(SelectedTag newMethod)
          Sets the method to determine which class value to optimize. Must be one of OPTIMIZE_0, OPTIMIZE_1, OPTIMIZE_LFREQ, OPTIMIZE_MFREQ, OPTIMIZE_POS_NAME.
Parameters:
newMethod - the new class selection mode.
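
How the five class selection modes could resolve to a class index is sketched below. This is an illustration and not WEKA's code: the enum Mode stands in for the OPTIMIZE_* constants (whose integer values are not given here), and the case-insensitive matching and the prefix test for "pos(itive)" are assumptions.

```java
/** Illustrative sketch; the enum and matching rules are assumptions, not WEKA code. */
public class DesignatedClassSketch {

    /** Hypothetical stand-ins for the five OPTIMIZE_* selection modes. */
    public enum Mode { FIRST, SECOND, LEAST_FREQUENT, MOST_FREQUENT, POSITIVE_NAME }

    /** Index of the largest (or smallest) count; ties resolve to the first index. */
    static int extreme(int[] counts, boolean largest) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (largest ? counts[i] > counts[best] : counts[i] < counts[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Resolves a selection mode to the index of the class whose threshold is tuned. */
    public static int resolve(Mode mode, String[] classNames, int[] classCounts) {
        switch (mode) {
            case FIRST:
                return 0;
            case SECOND:
                return 1;
            case LEAST_FREQUENT:
                return extreme(classCounts, false);
            case MOST_FREQUENT:
                return extreme(classCounts, true);
            default: // POSITIVE_NAME
                for (int i = 0; i < classNames.length; i++) {
                    String name = classNames[i].toLowerCase();
                    if (name.equals("yes") || name.startsWith("pos") || name.equals("1")) {
                        return i;
                    }
                }
                // No positive-looking name: fall back to the least frequent class.
                return extreme(classCounts, false);
        }
    }

    public static void main(String[] args) {
        String[] names = {"no", "yes"};
        int[] counts = {70, 30};
        System.out.println("designated class index = "
            + resolve(Mode.POSITIVE_NAME, names, counts));
    }
}
```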
 o evaluationModeTipText
public java.lang.String evaluationModeTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setEvaluationMode
public void setEvaluationMode(SelectedTag newMethod)
          Sets the evaluation mode used. Must be one of EVAL_TRAINING_SET, EVAL_TUNED_SPLIT, or EVAL_CROSS_VALIDATION.
Parameters:
newMethod - the new evaluation mode.
 o getEvaluationMode
public SelectedTag getEvaluationMode()
          Gets the evaluation mode used. Will be one of EVAL_TRAINING_SET, EVAL_TUNED_SPLIT, or EVAL_CROSS_VALIDATION.
Returns:
the evaluation mode.
 o rangeCorrectionTipText
public java.lang.String rangeCorrectionTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setRangeCorrection
public void setRangeCorrection(SelectedTag newMethod)
          Sets the confidence range correction mode used. Must be one of RANGE_NONE or RANGE_BOUNDS.
Parameters:
newMethod - the new correction mode.
 o getRangeCorrection
public SelectedTag getRangeCorrection()
          Gets the confidence range correction mode used. Will be one of RANGE_NONE or RANGE_BOUNDS.
Returns:
the confidence correction mode.
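
Range correction based on the min/max values seen during threshold selection can be pictured as a min-max rescaling. The sketch below is an assumption about the general technique, not WEKA's code: the class RangeCorrectionSketch, the clamping to [0, 1], and the handling of a degenerate range are choices made for the example.

```java
/** Illustrative sketch of min-max range correction; not WEKA's implementation. */
public class RangeCorrectionSketch {

    /**
     * Rescales a probability so that the observed [min, max] interval maps
     * onto [0, 1]. Values outside the observed interval are clamped.
     */
    public static double correct(double p, double min, double max) {
        if (max <= min) {
            return p; // degenerate range: nothing sensible to rescale against
        }
        double scaled = (p - min) / (max - min);
        return Math.max(0.0, Math.min(1.0, scaled));
    }

    public static void main(String[] args) {
        // A base learner whose probabilities were only ever seen between 0.40 and 0.60.
        double min = 0.40, max = 0.60;
        for (double p : new double[] {0.38, 0.45, 0.60}) {
            System.out.println(p + " -> " + correct(p, min, max));
        }
    }
}
```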
 o seedTipText
public java.lang.String seedTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setSeed
public void setSeed(int seed)
          Sets the seed for random number generation.
Parameters:
seed - the random number seed
 o getSeed
public int getSeed()
          Gets the random number seed.
Returns:
the random number seed
 o numXValFoldsTipText
public java.lang.String numXValFoldsTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o getNumXValFolds
public int getNumXValFolds()
          Get the number of folds used for cross-validation.
Returns:
the number of folds used for cross-validation.
 o setNumXValFolds
public void setNumXValFolds(int newNumFolds)
          Set the number of folds used for cross-validation.
Parameters:
newNumFolds - the number of folds used for cross-validation.
 o distributionClassifierTipText
public java.lang.String distributionClassifierTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setDistributionClassifier
public void setDistributionClassifier(DistributionClassifier newClassifier)
          Set the DistributionClassifier for which threshold is set.
Parameters:
newClassifier - the Classifier to use.
 o getDistributionClassifier
public DistributionClassifier getDistributionClassifier()
          Get the DistributionClassifier used as the classifier.
Returns:
the DistributionClassifier used as the base classifier
 o toString
public java.lang.String toString()
          Returns a description of the cross-validated classifier.
Returns:
description of the cross-validated classifier as a string
Overrides:
toString in class java.lang.Object
 o main
public static void main(java.lang.String argv[])
          Main method for testing this class.
Parameters:
argv - the options
