
Class weka.classifiers.CostMatrix

java.lang.Object
    |
    +----weka.core.Matrix
            |
            +----weka.classifiers.CostMatrix

public class CostMatrix
extends Matrix
Class for a misclassification cost matrix. The element in the i'th column of the j'th row is the cost for (mis)classifying an instance of class j as having class i. It is valid to have non-zero values down the diagonal (these are typically negative to indicate some varying degree of "gain" from making a correct prediction).

Version:
$Revision: 1.8 $
Author:
Len Trigg (len@intelligenesis.net)

Variable Index

 o FILE_EXTENSION
The filename extension that should be used for cost files

Constructor Index

 o CostMatrix(CostMatrix)
Creates a cost matrix identical to an existing matrix.
 o CostMatrix(int)
Creates a default cost matrix for the given number of classes.
 o CostMatrix(Reader)
Creates a cost matrix from a cost file.

Method Index

 o applyCostMatrix(Instances, Random)
Changes the dataset to reflect a given set of costs.
 o expectedCosts(double[])
Calculates the expected misclassification cost for each possible class value, given class probability estimates.
 o getMaxCost(int)
Gets the maximum misclassification cost possible for a given actual class value.
 o initialize()
Sets the costs to default values (i.e. 0 down the diagonal, and 1 for any misclassification).
 o main(String[])
Tests out creation of a frequency dependent cost matrix from the command line.
 o makeFrequencyDependentMatrix(Instances, double)
Creates a cost matrix for the class attribute of the supplied instances, where the misclassification costs are higher for misclassifying a rare class as a frequent one.
 o normalize()
Normalizes the cost matrix so that diagonal elements are zero.
 o readOldFormat(Reader)
Reads misclassification cost matrix from given reader.
 o size()
Gets the number of classes.

Field Detail

 o FILE_EXTENSION
public static java.lang.String FILE_EXTENSION
          The filename extension that should be used for cost files

Constructor Detail

 o CostMatrix
public CostMatrix(CostMatrix toCopy)
          Creates a cost matrix identical to an existing matrix.
Parameters:
toCopy - the matrix to copy.
 o CostMatrix
public CostMatrix(int numClasses)
          Creates a default cost matrix for the given number of classes. The default misclassification cost is 1.
Parameters:
numClasses - the number of classes
 o CostMatrix
public CostMatrix(java.io.Reader r) throws java.lang.Exception
          Creates a cost matrix from a cost file.
Parameters:
r - a reader from which the cost matrix will be read
Throws:
java.lang.Exception - if an error occurs

Method Detail

 o makeFrequencyDependentMatrix
public static CostMatrix makeFrequencyDependentMatrix(Instances instances,
                                                      double weight) throws java.lang.Exception
          Creates a cost matrix for the class attribute of the supplied instances, where the misclassification costs are higher for misclassifying a rare class as a frequent one. The cost of classifying an instance of class i as class j is weight * Pj / Pi, where Pi and Pj are Laplace estimates of the class probabilities.
Parameters:
instances - the dataset whose class attribute frequencies determine the costs
weight - the scaling factor applied to every misclassification cost
Returns:
the frequency dependent cost matrix
Throws:
java.lang.Exception - if no class attribute is assigned, or the class attribute is not nominal
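
The formula weight * Pj / Pi can be illustrated with a small standalone sketch that does not use the WEKA classes. The class name FrequencyDependentCostSketch and its method are hypothetical. The Laplace estimate is assumed here to be (count + 1) / (total + numClasses), and correct predictions are assumed to cost 0; neither detail is stated on this page.

```java
public class FrequencyDependentCostSketch {

    /**
     * Builds costs[i][j] = weight * Pj / Pi from per-class instance counts.
     * Row i is the actual class, column j the predicted class.
     */
    public static double[][] frequencyDependentCosts(int[] classCounts, double weight) {
        int numClasses = classCounts.length;
        int total = 0;
        for (int count : classCounts) {
            total += count;
        }
        // Assumed Laplace estimate: (count + 1) / (total + numClasses).
        double[] p = new double[numClasses];
        for (int i = 0; i < numClasses; i++) {
            p[i] = (classCounts[i] + 1.0) / (total + numClasses);
        }
        double[][] costs = new double[numClasses][numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                // Assumption: a correct prediction (i == j) costs nothing.
                costs[i][j] = (i == j) ? 0.0 : weight * p[j] / p[i];
            }
        }
        return costs;
    }

    public static void main(String[] args) {
        // 9 instances of class 0 (frequent) and 1 instance of class 1 (rare).
        double[][] costs = frequencyDependentCosts(new int[] {9, 1}, 1.0);
        System.out.println("frequent classified as rare: " + costs[0][1]);
        System.out.println("rare classified as frequent: " + costs[1][0]);
    }
}
```

With these counts the rare class is far more expensive to misclassify than the frequent one, which is the behaviour the description above asks for.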
 o readOldFormat
public void readOldFormat(java.io.Reader reader) throws java.lang.Exception
          Reads a misclassification cost matrix from the given reader. Each line must contain three numbers separated by whitespace: the index of the true class, the index of the incorrectly assigned class, and the weight. A comment can be appended to the end of a line with the '%' character.
Parameters:
reader - the reader from which the cost matrix is to be read
Throws:
java.lang.Exception - if the cost matrix does not have the right format
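
A rough sketch of a parser for the three-number line format follows. It is standalone and is not the WEKA implementation; the class name OldFormatCostReaderSketch is hypothetical. It assumes that class indices are zero-based, that entries not listed keep the default costs (0 on the diagonal, 1 elsewhere), and that blank or comment-only lines are skipped. None of these details is confirmed by this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class OldFormatCostReaderSketch {

    /** Parses "trueClass predictedClass weight" lines into a cost matrix. */
    public static double[][] readOldFormat(Reader reader, int numClasses) {
        // Start from the default costs: 0 on the diagonal, 1 elsewhere.
        double[][] costs = new double[numClasses][numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                costs[i][j] = (i == j) ? 0.0 : 1.0;
            }
        }
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Strip a trailing '%' comment, then surrounding whitespace.
                int comment = line.indexOf('%');
                if (comment >= 0) {
                    line = line.substring(0, comment);
                }
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // assumption: blank and comment-only lines are ignored
                }
                String[] fields = line.split("\\s+");
                if (fields.length != 3) {
                    throw new IllegalArgumentException("Expected three numbers: " + line);
                }
                int actual = Integer.parseInt(fields[0]);    // assumed zero-based
                int predicted = Integer.parseInt(fields[1]); // assumed zero-based
                double weight = Double.parseDouble(fields[2]);
                if (actual < 0 || actual >= numClasses
                        || predicted < 0 || predicted >= numClasses) {
                    throw new IllegalArgumentException("Class index out of range: " + line);
                }
                costs[actual][predicted] = weight;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return costs;
    }

    public static void main(String[] args) {
        String file = "0 1 5.0 % class 0 mistaken for class 1\n1 0 2.0\n";
        double[][] costs = readOldFormat(new StringReader(file), 2);
        System.out.println(costs[0][1] + " " + costs[1][0]);
    }
}
```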
 o initialize
public void initialize()
          Sets the costs to default values (i.e. 0 down the diagonal, and 1 for any misclassification).
 o size
public int size()
          Gets the number of classes.
Returns:
the number of classes
 o normalize
public void normalize()
          Normalizes the cost matrix so that diagonal elements are zero. The value of non-zero diagonal elements is subtracted from the row containing the value. For example:

 2  5
 3 -1

becomes

 0  3
 4  0

This normalization will affect total classification cost during evaluation, but will not affect the decision made by applying minimum expected cost criteria during prediction.
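
The row-wise subtraction and the claim about unchanged decisions can be checked with a small standalone sketch. NormalizeCostSketch and its methods are hypothetical names, and unlike normalize() the sketch returns a copy instead of modifying the matrix in place. Because the same constant is subtracted from every entry of a row, the expected cost of each candidate prediction drops by the same amount, so the cheapest prediction stays the same.

```java
import java.util.Arrays;

public class NormalizeCostSketch {

    /** Returns a copy in which each row has its diagonal value subtracted. */
    public static double[][] normalize(double[][] costs) {
        double[][] result = new double[costs.length][];
        for (int row = 0; row < costs.length; row++) {
            double diagonal = costs[row][row];
            result[row] = new double[costs[row].length];
            for (int col = 0; col < costs[row].length; col++) {
                result[row][col] = costs[row][col] - diagonal;
            }
        }
        return result;
    }

    /** Index of the prediction with the lowest expected cost (row = actual class). */
    public static int minExpectedCostClass(double[][] costs, double[] probabilities) {
        int best = -1;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < costs.length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < costs.length; actual++) {
                expected += probabilities[actual] * costs[actual][predicted];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] raw = {{2, 5}, {3, -1}};
        double[][] normalized = normalize(raw);
        // prints [[0.0, 3.0], [4.0, 0.0]]
        System.out.println(Arrays.deepToString(normalized));

        // The decision is the same before and after normalization.
        double[] probabilities = {0.7, 0.3};
        System.out.println(minExpectedCostClass(raw, probabilities)
                + " " + minExpectedCostClass(normalized, probabilities));
    }
}
```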

 o applyCostMatrix
public Instances applyCostMatrix(Instances instances,
                                 java.util.Random random) throws java.lang.Exception
          Changes the dataset to reflect a given set of costs. Sets the weights of instances according to the misclassification cost matrix, or does resampling according to the cost matrix (if a random number generator is provided). Returns a new dataset.
Parameters:
instances - the instances to apply cost weights to.
random - a random number generator
Returns:
the new dataset
Throws:
java.lang.Exception - if the cost matrix does not have the right format
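
This page does not say how instance weights are derived from the costs, so the following standalone sketch shows only one plausible scheme and should not be read as the WEKA implementation. It assumes each instance weight is multiplied by the total misclassification cost of its class (the off-diagonal sum of that class's row), and that the weights are then rescaled so their sum is unchanged. CostWeightingSketch and reweight are hypothetical names, and the resampling variant is not covered.

```java
public class CostWeightingSketch {

    /**
     * Returns new instance weights. classOf[k] is the class index of instance k
     * and weights[k] its current weight.
     */
    public static double[] reweight(double[][] costs, int[] classOf, double[] weights) {
        int numClasses = costs.length;
        // Assumed per-class factor: sum of the off-diagonal costs in the row.
        double[] rowCost = new double[numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                if (i != j) {
                    rowCost[i] += costs[i][j];
                }
            }
        }
        double oldTotal = 0.0;
        double newTotal = 0.0;
        double[] result = new double[weights.length];
        for (int k = 0; k < weights.length; k++) {
            result[k] = weights[k] * rowCost[classOf[k]];
            oldTotal += weights[k];
            newTotal += result[k];
        }
        if (newTotal == 0.0) {
            throw new IllegalArgumentException("All cost-weighted instance weights are zero");
        }
        // Rescale so that the total weight of the dataset is preserved.
        for (int k = 0; k < result.length; k++) {
            result[k] *= oldTotal / newTotal;
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] costs = {{0, 1}, {5, 0}};
        double[] reweighted = reweight(costs, new int[] {0, 0, 1}, new double[] {1, 1, 1});
        // The single class-1 instance ends up with five times the weight
        // of each class-0 instance.
        for (double w : reweighted) {
            System.out.println(w);
        }
    }
}
```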
 o expectedCosts
public double[] expectedCosts(double probabilities[]) throws java.lang.Exception
          Calculates the expected misclassification cost for each possible class value, given class probability estimates.
Parameters:
probabilities - an array containing probability estimates for each class value.
Returns:
an array containing the expected misclassification cost for each class.
Throws:
java.lang.Exception - if the number of probabilities does not match the number of classes.
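
Given the convention in the class description (row = actual class, column = assigned class), the expected cost of predicting class i is the sum over actual classes j of P(j) times the cost in row j, column i. A minimal standalone sketch of that calculation follows; ExpectedCostSketch is a hypothetical name, and the sketch throws IllegalArgumentException where the real method declares java.lang.Exception.

```java
public class ExpectedCostSketch {

    /** expected[i] = sum over actual classes j of probabilities[j] * costs[j][i]. */
    public static double[] expectedCosts(double[][] costs, double[] probabilities) {
        if (probabilities.length != costs.length) {
            throw new IllegalArgumentException(
                "Number of probabilities does not match the number of classes");
        }
        double[] expected = new double[costs.length];
        for (int predicted = 0; predicted < costs.length; predicted++) {
            for (int actual = 0; actual < costs.length; actual++) {
                expected[predicted] += probabilities[actual] * costs[actual][predicted];
            }
        }
        return expected;
    }

    public static void main(String[] args) {
        // Mistaking class 1 for class 0 is ten times worse than the reverse.
        double[][] costs = {{0, 1}, {10, 0}};
        double[] expected = expectedCosts(costs, new double[] {0.8, 0.2});
        // Class 0 is more probable, yet predicting class 1 has the lower expected cost.
        System.out.println(expected[0] + " " + expected[1]);
    }
}
```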
 o getMaxCost
public double getMaxCost(int actualClass)
          Gets the maximum misclassification cost possible for a given actual class value.
Parameters:
actualClass - the index of the actual class value
Returns:
the highest cost possible for misclassifying this class
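
Under the row = actual class convention, this amounts to taking the largest entry in the row for actualClass. The standalone sketch below scans the whole row, diagonal included; whether the real method skips the diagonal is not stated on this page, so treat that as an assumption. MaxCostSketch is a hypothetical name.

```java
public class MaxCostSketch {

    /** Largest cost in the row of the given actual class (diagonal included). */
    public static double maxCost(double[][] costs, int actualClass) {
        double max = costs[actualClass][0];
        for (int predicted = 1; predicted < costs[actualClass].length; predicted++) {
            max = Math.max(max, costs[actualClass][predicted]);
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] costs = {{0, 3}, {4, 0}};
        System.out.println(maxCost(costs, 0) + " " + maxCost(costs, 1));
    }
}
```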
 o main
public static void main(java.lang.String args[])
          Tests out creation of a frequency dependent cost matrix from the command line. Either pipe a set of instances into System.in or give the name of a dataset as an argument. The last column will be treated as the class attribute, and a cost matrix with weight 1000 is output.
Parameters:
args - the command-line arguments; optionally the name of a dataset file
