
Class weka.classifiers.MetaCost

java.lang.Object
    |
    +----weka.classifiers.Classifier
            |
            +----weka.classifiers.MetaCost

public class MetaCost
extends Classifier
implements OptionHandler
This metaclassifier makes its base classifier cost-sensitive using the method specified in:

Pedro Domingos (1999). MetaCost: A general method for making classifiers cost-sensitive, Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pp. 155-164. Also available online at http://www.cs.washington.edu/homes/pedrod/kdd99.ps.gz.

This classifier should produce similar results to one created by passing the base learner to Bagging, which is in turn passed to a CostSensitiveClassifier operating on minimum expected cost. The difference is that MetaCost produces a single cost-sensitive classifier of the base learner, giving the benefits of fast classification and interpretable output (if the base learner itself is interpretable). This implementation uses all bagging iterations when reclassifying training data (the MetaCost paper reports a marginal improvement when only those iterations containing each training instance are used in reclassifying that instance).
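
The relabelling idea behind this description (average the class distributions of the bagged models, then pick the class with minimum expected cost) can be sketched in a few lines. This is a self-contained illustration only, not WEKA's code: the class `MinExpectedCostSketch`, its method names, and the convention that `cost[actual][predicted]` holds the cost of predicting `predicted` when the true class is `actual` are assumptions made for the example.

```java
// Hypothetical helper for illustration; not part of WEKA.
final class MinExpectedCostSketch {

    /** Averages the class distributions produced by each bagged model. */
    static double[] averageDistributions(double[][] perModelProbs) {
        double[] avg = new double[perModelProbs[0].length];
        for (double[] probs : perModelProbs) {
            for (int c = 0; c < avg.length; c++) {
                avg[c] += probs[c] / perModelProbs.length;
            }
        }
        return avg;
    }

    /**
     * Returns the class index with the lowest expected cost.
     * Assumed convention: cost[actual][predicted].
     */
    static int minExpectedCostClass(double[] classProbs, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < classProbs.length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < classProbs.length; actual++) {
                expected += classProbs[actual] * cost[actual][predicted];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two bagged models vote on one training instance.
        double[][] votes = {{0.8, 0.2}, {0.6, 0.4}};
        double[] avg = averageDistributions(votes);
        // Missing class 1 costs five times as much as missing class 0.
        double[][] cost = {{0, 1}, {5, 0}};
        // Class 1 is chosen although class 0 is more probable.
        System.out.println(minExpectedCostClass(avg, cost));
    }
}
```

With a uniform cost matrix the same instance would keep the more probable class 0; the asymmetric costs are what move the label.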

Valid options are:

-W classname
Specify the full class name of a classifier (required).

-C cost file
File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -D option.

-D directory
Name of a directory to search for cost files when loading costs on demand (default current directory).

-I num
Set the number of bagging iterations (default 10).

-S seed
Random number seed used for resampling (default 1).

-P num
Size of each bag, as a percentage of the training size (default 100).

Options after -- are passed to the designated classifier.
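
The `--` convention above splits one option array into two parts: the options for MetaCost itself and the options forwarded to the base classifier. The sketch below shows that split in isolation. It is an assumption-laden illustration rather than WEKA's own parser: `OptionSplitSketch` and `splitAtDoubleDash` are hypothetical names, and `some.base.Learner` and `-x` are placeholder values.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of WEKA.
final class OptionSplitSketch {

    /**
     * Returns {metaOptions, baseOptions}. baseOptions is empty
     * when the array contains no "--" separator.
     */
    static String[][] splitAtDoubleDash(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        // Placeholder class name and base-learner flag.
        String[] cmd = {"-W", "some.base.Learner", "-I", "10", "-P", "100", "--", "-x", "1"};
        String[][] parts = splitAtDoubleDash(cmd);
        System.out.println("meta: " + Arrays.toString(parts[0]));
        System.out.println("base: " + Arrays.toString(parts[1]));
    }
}
```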

Version:
$Revision: 1.7 $
Author:
Len Trigg (len@intelligenesis.net)

Variable Index

 o MATRIX_ON_DEMAND
 
 o MATRIX_SUPPLIED
 
 o TAGS_MATRIX_SOURCE
 

Constructor Index

 o MetaCost()
 

Method Index

 o buildClassifier(Instances)
Builds the model of the base learner.
 o classifyInstance(Instance)
Classifies a given test instance.
 o getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.
 o getClassifier()
Gets the distribution classifier used.
 o getCostMatrix()
Gets the misclassification cost matrix.
 o getCostMatrixSource()
Gets the source location method of the cost matrix.
 o getNumIterations()
Gets the number of bagging iterations.
 o getOnDemandDirectory()
Returns the directory that will be searched for cost files when loading on demand.
 o getOptions()
Gets the current settings of the Classifier.
 o getSeed()
Gets the seed for resampling.
 o listOptions()
Returns an enumeration describing the available options.
 o main(String[])
Main method for testing this class.
 o setBagSizePercent(int)
Sets the size of each bag, as a percentage of the training set size.
 o setClassifier(Classifier)
Sets the distribution classifier.
 o setCostMatrix(CostMatrix)
Sets the misclassification cost matrix.
 o setCostMatrixSource(SelectedTag)
Sets the source location of the cost matrix.
 o setNumIterations(int)
Sets the number of bagging iterations.
 o setOnDemandDirectory(File)
Sets the directory that will be searched for cost files when loading on demand.
 o setOptions(String[])
Parses a given list of options.
 o setSeed(int)
Sets the seed for resampling.
 o toString()
Outputs a representation of this classifier.

Field Detail

 o MATRIX_ON_DEMAND
public static final int MATRIX_ON_DEMAND
 o MATRIX_SUPPLIED
public static final int MATRIX_SUPPLIED
 o TAGS_MATRIX_SOURCE
public static final Tag[] TAGS_MATRIX_SOURCE

Constructor Detail

 o MetaCost
public MetaCost()

Method Detail

 o listOptions
public java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
Returns:
an enumeration of all the available options
 o setOptions
public void setOptions(java.lang.String options[]) throws java.lang.Exception
          Parses a given list of options. Valid options are:

-W classname
Specify the full class name of a classifier (required).

-C cost file
File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -D option.

-D directory
Name of a directory to search for cost files when loading costs on demand (default current directory).

-I num
Set the number of bagging iterations (default 10).

-S seed
Random number seed used for resampling (default 1).

-P num
Size of each bag, as a percentage of the training size (default 100).

Options after -- are passed to the designated classifier.

Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
 o getOptions
public java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
Returns:
an array of strings suitable for passing to setOptions
 o getCostMatrixSource
public SelectedTag getCostMatrixSource()
          Gets the source location method of the cost matrix. Will be one of MATRIX_ON_DEMAND or MATRIX_SUPPLIED.
Returns:
the cost matrix source.
 o setCostMatrixSource
public void setCostMatrixSource(SelectedTag newMethod)
          Sets the source location of the cost matrix. Values other than MATRIX_ON_DEMAND or MATRIX_SUPPLIED will be ignored.
Parameters:
newMethod - the cost matrix location method.
 o getOnDemandDirectory
public java.io.File getOnDemandDirectory()
          Returns the directory that will be searched for cost files when loading on demand.
Returns:
The cost file search directory.
 o setOnDemandDirectory
public void setOnDemandDirectory(java.io.File newDir)
          Sets the directory that will be searched for cost files when loading on demand.
Parameters:
newDir - The cost file search directory.
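
As described in the class options, the on-demand cost file is named after the relation name of the training data plus ".cost" and is looked up in this directory. A minimal sketch of that path construction follows; `OnDemandCostFileSketch` and `costFileFor` are hypothetical names, and the example directory and relation name are made up.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of WEKA.
final class OnDemandCostFileSketch {

    /** Builds the path <onDemandDir>/<relationName>.cost. */
    static File costFileFor(File onDemandDir, String relationName) {
        return new File(onDemandDir, relationName + ".cost");
    }

    public static void main(String[] args) {
        // Made-up directory and relation name.
        File costFile = costFileFor(new File("costs"), "credit");
        System.out.println(costFile.getName());
    }
}
```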
 o setClassifier
public void setClassifier(Classifier classifier)
          Sets the distribution classifier.
Parameters:
classifier - the distribution classifier with all options set.
 o getClassifier
public Classifier getClassifier()
          Gets the distribution classifier used.
Returns:
the classifier
 o getBagSizePercent
public int getBagSizePercent()
          Gets the size of each bag, as a percentage of the training set size.
Returns:
the bag size, as a percentage.
 o setBagSizePercent
public void setBagSizePercent(int newBagSizePercent)
          Sets the size of each bag, as a percentage of the training set size.
Parameters:
newBagSizePercent - the bag size, as a percentage.
 o setNumIterations
public void setNumIterations(int numIterations)
          Sets the number of bagging iterations.
Parameters:
numIterations - the number of bagging iterations
 o getNumIterations
public int getNumIterations()
          Gets the number of bagging iterations.
Returns:
the number of bagging iterations
 o getCostMatrix
public CostMatrix getCostMatrix()
          Gets the misclassification cost matrix.
Returns:
the cost matrix
 o setCostMatrix
public void setCostMatrix(CostMatrix newCostMatrix)
          Sets the misclassification cost matrix.
Parameters:
newCostMatrix - the cost matrix
 o setSeed
public void setSeed(int seed)
          Sets the seed for resampling.
Parameters:
seed - the seed for resampling
 o getSeed
public int getSeed()
          Gets the seed for resampling.
Returns:
the seed for resampling
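
To show how the seed and the bag size percentage interact, here is a small sketch of drawing one bag by sampling with replacement. It is an illustration under stated assumptions, not WEKA's resampling code: `BagSampleSketch` and `sampleIndices` are hypothetical names, the bag size is assumed to be the training size times the percentage with integer truncation, and `numInstances` is assumed to be positive.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of WEKA.
final class BagSampleSketch {

    /**
     * Draws the indices of one bag by sampling with replacement.
     * Assumes numInstances > 0 and truncating integer arithmetic
     * for the bag size.
     */
    static int[] sampleIndices(int numInstances, int bagSizePercent, int seed) {
        int bagSize = numInstances * bagSizePercent / 100;
        Random random = new Random(seed);
        int[] indices = new int[bagSize];
        for (int i = 0; i < bagSize; i++) {
            indices[i] = random.nextInt(numInstances);
        }
        return indices;
    }

    public static void main(String[] args) {
        // 200 training instances, bags half that size, seed 1.
        int[] bag = sampleIndices(200, 50, 1);
        System.out.println(bag.length);
    }
}
```

Reusing the same seed reproduces the same bag, which is what makes a run repeatable.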
 o buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
          Builds the model of the base learner.
Parameters:
data - the training data
Throws:
java.lang.Exception - if the classifier could not be built successfully
Overrides:
buildClassifier in class Classifier
 o classifyInstance
public double classifyInstance(Instance instance) throws java.lang.Exception
          Classifies a given test instance.
Parameters:
instance - the instance to be classified
Throws:
java.lang.Exception - if instance could not be classified successfully
Overrides:
classifyInstance in class Classifier
 o toString
public java.lang.String toString()
          Outputs a representation of this classifier.
Overrides:
toString in class java.lang.Object
 o main
public static void main(java.lang.String argv[])
          Main method for testing this class.
Parameters:
argv - should contain the following arguments: -t training file [-T test file] [-c class index]
