Class weka.classifiers.Evaluation
java.lang.Object
|
+----weka.classifiers.Evaluation
- public class Evaluation
- extends java.lang.Object
- implements Summarizable
Class for evaluating machine learning models.
-------------------------------------------------------------------
General options when evaluating a learning scheme from the command-line:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing, a cross-validation
is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-s seed
Random number seed for the cross-validation (default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file.
-d filename
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances, along with the attributes in
the specified range (and nothing else). Use '-p 0' if no attributes are
desired.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
-------------------------------------------------------------------
Example usage as the main of a classifier (called FunkyClassifier):
public static void main(String [] args) {
try {
Classifier scheme = new FunkyClassifier();
System.out.println(Evaluation.evaluateModel(scheme, args));
} catch (Exception e) {
System.err.println(e.getMessage());
}
}
------------------------------------------------------------------
Example usage from within an application:
Instances trainInstances = ...; // training instances obtained elsewhere
Instances testInstances = ...;  // test instances obtained elsewhere
Classifier scheme = ...;        // classifier obtained elsewhere
Evaluation evaluation = new Evaluation(trainInstances);
evaluation.evaluateModel(scheme, testInstances);
System.out.println(evaluation.toSummaryString());
- Version:
- $Revision: 1.42 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz)
- Author:
- Len Trigg (trigg@cs.waikato.ac.nz)
Evaluation(Instances)
- Initializes all the counters for the evaluation.
Evaluation(Instances, CostMatrix)
- Initializes all the counters for the evaluation and also takes a
cost matrix as parameter.
avgCost()
- Gets the average cost, that is, total cost of misclassifications
(incorrect plus unclassified) over the total number of instances.
confusionMatrix()
- Returns a copy of the confusion matrix.
correct()
- Gets the number of instances correctly classified (that is, for
which a correct prediction was made).
correlationCoefficient()
- Returns the correlation coefficient if the class is numeric.
crossValidateModel(Classifier, Instances, int)
- Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
crossValidateModel(String, Instances, int, String[])
- Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
equals(Object)
- Tests whether the current evaluation object is equal to another
evaluation object
errorRate()
- Returns the estimated error rate or the root mean squared error
(if the class is numeric).
evaluateModel(Classifier, Instances)
- Evaluates the classifier on a given set of instances.
evaluateModel(Classifier, String[])
- Evaluates a classifier with the options given in an array of
strings.
evaluateModel(String, String[])
- Evaluates a classifier with the options given in an array of
strings.
evaluateModelOnce(Classifier, Instance)
- Evaluates the classifier on a single instance.
evaluateModelOnce(double[], Instance)
- Evaluates the supplied distribution on a single instance.
evaluateModelOnce(double, Instance)
- Evaluates the supplied prediction on a single instance.
falseNegativeRate(int)
- Calculate the false negative rate with respect to a particular class.
falsePositiveRate(int)
- Calculate the false positive rate with respect to a particular class.
fMeasure(int)
- Calculate the F-Measure with respect to a particular class.
incorrect()
- Gets the number of instances incorrectly classified (that is, for
which an incorrect prediction was made).
kappa()
- Returns value of kappa statistic if class is nominal.
KBInformation()
- Return the total Kononenko & Bratko Information score in bits
KBMeanInformation()
- Return the Kononenko & Bratko Information score in bits per
instance.
KBRelativeInformation()
- Return the Kononenko & Bratko Relative Information score
main(String[])
- A test method for this class.
meanAbsoluteError()
- Returns the mean absolute error.
meanPriorAbsoluteError()
- Returns the mean absolute error of the prior.
numFalseNegatives(int)
- Calculate number of false negatives with respect to a particular class.
numFalsePositives(int)
- Calculate number of false positives with respect to a particular class.
numInstances()
- Gets the number of test instances that had a known class value
(actually the sum of the weights of test instances with known
class value).
numTrueNegatives(int)
- Calculate the number of true negatives with respect to a particular class.
numTruePositives(int)
- Calculate the number of true positives with respect to a particular class.
pctCorrect()
- Gets the percentage of instances correctly classified (that is, for
which a correct prediction was made).
pctIncorrect()
- Gets the percentage of instances incorrectly classified (that is, for
which an incorrect prediction was made).
pctUnclassified()
- Gets the percentage of instances not classified (that is, for
which no prediction was made by the classifier).
precision(int)
- Calculate the precision with respect to a particular class.
priorEntropy()
- Calculate the entropy of the prior distribution
recall(int)
- Calculate the recall with respect to a particular class.
relativeAbsoluteError()
- Returns the relative absolute error.
rootMeanPriorSquaredError()
- Returns the root mean prior squared error.
rootMeanSquaredError()
- Returns the root mean squared error.
rootRelativeSquaredError()
- Returns the root relative squared error if the class is numeric.
setPriors(Instances)
- Sets the class prior probabilities
SFEntropyGain()
- Returns the total SF, which is the null model entropy minus
the scheme entropy.
SFMeanEntropyGain()
- Returns the SF per instance, which is the null model entropy
minus the scheme entropy, per instance.
SFMeanPriorEntropy()
- Returns the entropy per instance for the null model
SFMeanSchemeEntropy()
- Returns the entropy per instance for the scheme
SFPriorEntropy()
- Returns the total entropy for the null model
SFSchemeEntropy()
- Returns the total entropy for the scheme
toClassDetailsString()
- Calls toClassDetailsString() with a default title.
toClassDetailsString(String)
- Generates a breakdown of the accuracy for each class,
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure.
toCumulativeMarginDistributionString()
- Output the cumulative margin distribution as a string suitable
for input for gnuplot or similar package.
toMatrixString()
- Calls toMatrixString() with a default title.
toMatrixString(String)
- Outputs the performance statistics as a classification confusion
matrix.
toSummaryString()
- Calls toSummaryString() with no title and no complexity stats
toSummaryString(boolean)
- Calls toSummaryString() with a default title.
toSummaryString(String, boolean)
- Outputs the performance statistics in summary form.
totalCost()
- Gets the total cost, that is, the cost of each prediction times the
weight of the instance, summed over all instances.
trueNegativeRate(int)
- Calculate the true negative rate with respect to a particular class.
truePositiveRate(int)
- Calculate the true positive rate with respect to a particular class.
unclassified()
- Gets the number of instances not classified (that is, for
which no prediction was made by the classifier).
updatePriors(Instance)
- Updates the class prior probabilities (when incrementally
training)
Evaluation
public Evaluation(Instances data) throws java.lang.Exception
Initializes all the counters for the evaluation.
- Parameters:
data
- set of training instances, to get some header
information and prior class distribution information
- Throws:
- java.lang.Exception - if the class is not defined
Evaluation
public Evaluation(Instances data,
CostMatrix costMatrix) throws java.lang.Exception
Initializes all the counters for the evaluation and also takes a
cost matrix as parameter.
- Parameters:
data
- set of instances, to get some header information
costMatrix
- the cost matrix---if null, default costs will be used
- Throws:
- java.lang.Exception - if cost matrix is not compatible with
data, the class is not defined or the class is numeric
confusionMatrix
public double[][] confusionMatrix()
Returns a copy of the confusion matrix.
- Returns:
- a copy of the confusion matrix as a two-dimensional array
crossValidateModel
public void crossValidateModel(Classifier classifier,
Instances data,
int numFolds) throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
- Parameters:
classifier
- the classifier with any options set.
data
- the data on which the cross-validation is to be
performed
numFolds
- the number of folds for the cross-validation
- Throws:
- java.lang.Exception - if a classifier could not be generated
successfully or the class is not defined
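The stratification mentioned above can be illustrated independently of WEKA. The following is a minimal sketch, not WEKA's implementation: the class StratifiedFolds and the method assignFolds are hypothetical names, and the sketch assumes nominal class labels encoded as ints in the range 0..numClasses-1. It shuffles the instances of each class with a seeded random generator and deals them round-robin into folds, so that every fold receives roughly the same class proportions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of WEKA.
final class StratifiedFolds {

    /** Returns an array where entry i is the test fold assigned to instance i. */
    static int[] assignFolds(int[] labels, int numClasses, int numFolds, long seed) {
        if (numFolds < 2) {
            throw new IllegalArgumentException("numFolds must be at least 2");
        }
        // Group instance indices by class label.
        List<List<Integer>> byClass = new ArrayList<List<Integer>>();
        for (int c = 0; c < numClasses; c++) {
            byClass.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < labels.length; i++) {
            byClass.get(labels[i]).add(i);
        }

        Random random = new Random(seed);
        int[] fold = new int[labels.length];
        int next = 0; // running counter, so fold sizes stay balanced across classes
        for (List<Integer> indices : byClass) {
            Collections.shuffle(indices, random);
            for (int index : indices) {
                fold[index] = next % numFolds;
                next++;
            }
        }
        return fold;
    }
}
```

In each round of a cross-validation, the instances assigned to one fold would serve as test data and all remaining instances as training data.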
crossValidateModel
public void crossValidateModel(java.lang.String classifierString,
Instances data,
int numFolds,
java.lang.String options[]) throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
- Parameters:
classifierString
- a string naming the class of the classifier
data
- the data on which the cross-validation is to be
performed
numFolds
- the number of folds for the cross-validation
options
- the options to the classifier. Any options
accepted by the classifier will be removed from this array.
- Throws:
- java.lang.Exception - if a classifier could not be generated
successfully or the class is not defined
evaluateModel
public static java.lang.String evaluateModel(java.lang.String classifierString,
java.lang.String options[]) throws java.lang.Exception
Evaluates a classifier with the options given in an array of
strings.
Valid options are:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing, a cross-validation
is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-s seed
Random number seed for the cross-validation (default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file.
-d filename
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances, along with the attributes in
the specified range (and nothing else). Use '-p 0' if no attributes are
desired.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
- Parameters:
classifierString
- class of machine learning classifier as a string
options
- the array of strings containing the options
- Returns:
- a string describing the results
- Throws:
- java.lang.Exception - if model could not be evaluated successfully
main
public static void main(java.lang.String args[])
A test method for this class. Just extracts the first command line
argument as a classifier class name and calls evaluateModel.
- Parameters:
args
- an array of command line arguments, the first of which
must be the class name of a classifier.
evaluateModel
public static java.lang.String evaluateModel(Classifier classifier,
java.lang.String options[]) throws java.lang.Exception
Evaluates a classifier with the options given in an array of
strings.
Valid options are:
-t name of training file
Name of the file with the training data. (required)
-T name of test file
Name of the file with the test data. If missing, a cross-validation
is performed.
-c class index
Index of the class attribute (1, 2, ...; default: last).
-x number of folds
The number of folds for the cross-validation (default: 10).
-s random number seed
Random number seed for the cross-validation (default: 1).
-m file with cost matrix
The name of a file containing a cost matrix.
-l name of model input file
Loads classifier from the given file.
-d name of model output file
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances, along with the attributes in
the specified range (and nothing else). Use '-p 0' if no attributes are
desired.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
- Parameters:
classifier
- machine learning classifier
options
- the array of strings containing the options
- Returns:
- a string describing the results
- Throws:
- java.lang.Exception - if model could not be evaluated successfully
evaluateModel
public void evaluateModel(Classifier classifier,
Instances data) throws java.lang.Exception
Evaluates the classifier on a given set of instances.
- Parameters:
classifier
- machine learning classifier
data
- set of test instances for evaluation
- Throws:
- java.lang.Exception - if model could not be evaluated
successfully
evaluateModelOnce
public double evaluateModelOnce(Classifier classifier,
Instance instance) throws java.lang.Exception
Evaluates the classifier on a single instance.
- Parameters:
classifier
- machine learning classifier
instance
- the test instance to be classified
- Returns:
- the prediction made by the classifier
- Throws:
- java.lang.Exception - if model could not be evaluated
successfully or the data contains string attributes
evaluateModelOnce
public double evaluateModelOnce(double dist[],
Instance instance) throws java.lang.Exception
Evaluates the supplied distribution on a single instance.
- Parameters:
dist
- the supplied distribution
instance
- the test instance to be classified
- Returns:
- the prediction
- Throws:
- java.lang.Exception - if model could not be evaluated
successfully
evaluateModelOnce
public void evaluateModelOnce(double prediction,
Instance instance) throws java.lang.Exception
Evaluates the supplied prediction on a single instance.
- Parameters:
prediction
- the supplied prediction
instance
- the test instance to be classified
- Throws:
- java.lang.Exception - if model could not be evaluated
successfully
numInstances
public final double numInstances()
Gets the number of test instances that had a known class value
(actually the sum of the weights of test instances with known
class value).
- Returns:
- the number of test instances with known class
incorrect
public final double incorrect()
Gets the number of instances incorrectly classified (that is, for
which an incorrect prediction was made). (Actually the sum of the weights
of these instances)
- Returns:
- the number of incorrectly classified instances
pctIncorrect
public final double pctIncorrect()
Gets the percentage of instances incorrectly classified (that is, for
which an incorrect prediction was made).
- Returns:
- the percent of incorrectly classified instances
(between 0 and 100)
totalCost
public final double totalCost()
Gets the total cost, that is, the cost of each prediction times the
weight of the instance, summed over all instances.
- Returns:
- the total cost
avgCost
public final double avgCost()
Gets the average cost, that is, total cost of misclassifications
(incorrect plus unclassified) over the total number of instances.
- Returns:
- the average cost.
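As an illustration of the two cost measures, here is a minimal standalone sketch; it does not use WEKA's CostMatrix class. The class CostSketch is a hypothetical name, and the sketch assumes that both matrices are indexed as [actual class][predicted class], that the confusion matrix holds summed instance weights, and that there are no unclassified instances.

```java
// Hypothetical helper, not part of WEKA.
final class CostSketch {

    /** Sums (weight in each confusion cell) * (cost of that kind of prediction). */
    static double totalCost(double[][] confusion, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    /** Divides the total cost by the total weight of all instances. */
    static double avgCost(double[][] confusion, double[][] cost) {
        double weight = 0.0;
        for (double[] row : confusion) {
            for (double cell : row) {
                weight += cell;
            }
        }
        return totalCost(confusion, cost) / weight;
    }
}
```

With a zero diagonal and ones elsewhere in the cost matrix, the average cost in this sketch reduces to the plain fraction of misclassified instances.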
correct
public final double correct()
Gets the number of instances correctly classified (that is, for
which a correct prediction was made). (Actually the sum of the weights
of these instances)
- Returns:
- the number of correctly classified instances
pctCorrect
public final double pctCorrect()
Gets the percentage of instances correctly classified (that is, for
which a correct prediction was made).
- Returns:
- the percent of correctly classified instances (between 0 and 100)
unclassified
public final double unclassified()
Gets the number of instances not classified (that is, for
which no prediction was made by the classifier). (Actually the sum
of the weights of these instances)
- Returns:
- the number of unclassified instances
pctUnclassified
public final double pctUnclassified()
Gets the percentage of instances not classified (that is, for
which no prediction was made by the classifier).
- Returns:
- the percent of unclassified instances (between 0 and 100)
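The relationship between the weighted counts and the three percentages can be sketched as follows. This is a standalone illustration, not WEKA code: the class OutcomeTally is a hypothetical name, a negative predicted value stands for "no prediction" purely as a convention of this sketch, and the denominator is assumed to be the total weight of all instances with a known class.

```java
// Hypothetical helper, not part of WEKA.
final class OutcomeTally {
    private double correct;
    private double incorrect;
    private double unclassified;

    /** Records one test instance; a negative prediction means "not classified". */
    void add(int actual, int predicted, double weight) {
        if (predicted < 0) {
            unclassified += weight;
        } else if (predicted == actual) {
            correct += weight;
        } else {
            incorrect += weight;
        }
    }

    /** Sum of the weights of all recorded instances. */
    double total() {
        return correct + incorrect + unclassified;
    }

    double pctCorrect() {
        return 100.0 * correct / total();
    }

    double pctIncorrect() {
        return 100.0 * incorrect / total();
    }

    double pctUnclassified() {
        return 100.0 * unclassified / total();
    }
}
```

Because each count is a sum of weights rather than a tally of instances, the three percentages always add up to 100 in this sketch, even when instance weights differ.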
errorRate
public final double errorRate()
Returns the estimated error rate or the root mean squared error
(if the class is numeric). If a cost matrix was given this
error rate gives the average cost.
- Returns:
- the estimated error rate (between 0 and 1, or between 0 and
maximum cost)
kappa
public final double kappa()
Returns value of kappa statistic if class is nominal.
- Returns:
- the value of the kappa statistic
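For readers unfamiliar with the statistic, the following minimal sketch computes the standard Cohen's kappa from a confusion matrix. It assumes that this is the statistic meant here; the class KappaSketch is a hypothetical name and the code is not taken from WEKA. Kappa compares the observed agreement with the agreement expected by chance from the row and column totals.

```java
// Hypothetical helper, not part of WEKA.
final class KappaSketch {

    /** Cohen's kappa for a square confusion matrix indexed [actual][predicted]. */
    static double kappa(double[][] confusion) {
        int n = confusion.length;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        double total = 0.0;
        double diagonal = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSum[i] += confusion[i][j];
                colSum[j] += confusion[i][j];
                total += confusion[i][j];
            }
            diagonal += confusion[i][i];
        }
        double observed = diagonal / total;
        double chance = 0.0;
        for (int i = 0; i < n; i++) {
            chance += (rowSum[i] * colSum[i]) / (total * total);
        }
        if (chance >= 1.0) {
            return 1.0; // degenerate case; this return value is a convention of the sketch
        }
        return (observed - chance) / (1.0 - chance);
    }
}
```

A value of 1 indicates perfect agreement, and a value of 0 indicates agreement no better than chance.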
correlationCoefficient
public final double correlationCoefficient() throws java.lang.Exception
Returns the correlation coefficient if the class is numeric.
- Returns:
- the correlation coefficient
- Throws:
- java.lang.Exception - if class is not numeric
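A minimal sketch of a correlation coefficient between actual and predicted numeric class values is given below. It assumes the Pearson correlation is meant and that all instances have weight 1; the class CorrelationSketch is a hypothetical name and the code is not WEKA's.

```java
// Hypothetical helper, not part of WEKA.
final class CorrelationSketch {

    /** Pearson correlation between actual and predicted values. */
    static double pearson(double[] actual, double[] predicted) {
        if (actual.length != predicted.length || actual.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = actual.length;
        double meanActual = 0.0;
        double meanPredicted = 0.0;
        for (int i = 0; i < n; i++) {
            meanActual += actual[i] / n;
            meanPredicted += predicted[i] / n;
        }
        double covariance = 0.0;
        double varActual = 0.0;
        double varPredicted = 0.0;
        for (int i = 0; i < n; i++) {
            double da = actual[i] - meanActual;
            double dp = predicted[i] - meanPredicted;
            covariance += da * dp;
            varActual += da * da;
            varPredicted += dp * dp;
        }
        if (varActual * varPredicted <= 0.0) {
            return 0.0; // constant input; returning 0 is a convention of this sketch
        }
        return covariance / Math.sqrt(varActual * varPredicted);
    }
}
```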
meanAbsoluteError
public final double meanAbsoluteError()
Returns the mean absolute error. Refers to the error of the
predicted values for numeric classes, and the error of the
predicted probability distribution for nominal classes.
- Returns:
- the mean absolute error
meanPriorAbsoluteError
public final double meanPriorAbsoluteError()
Returns the mean absolute error of the prior.
- Returns:
- the mean absolute error of the prior
relativeAbsoluteError
public final double relativeAbsoluteError() throws java.lang.Exception
Returns the relative absolute error.
- Returns:
- the relative absolute error
- Throws:
- java.lang.Exception - if it can't be computed
rootMeanSquaredError
public final double rootMeanSquaredError()
Returns the root mean squared error.
- Returns:
- the root mean squared error
rootMeanPriorSquaredError
public final double rootMeanPriorSquaredError()
Returns the root mean prior squared error.
- Returns:
- the root mean prior squared error
rootRelativeSquaredError
public final double rootRelativeSquaredError()
Returns the root relative squared error if the class is numeric.
- Returns:
- the root relative squared error
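The error measures above are related to one another, as the following minimal sketch for a numeric class shows. It rests on several assumptions that are not stated in this page: the "prior" predictor always predicts one constant value (for example the mean class value of the training data), the relative measures are expressed as percentages, and all instances have weight 1. The class NumericErrors is a hypothetical name.

```java
// Hypothetical helper, not part of WEKA.
final class NumericErrors {

    static double meanAbsoluteError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = predicted[i] - actual[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    /** Predictions of a prior model that always outputs the same value. */
    static double[] constant(int n, double value) {
        double[] result = new double[n];
        java.util.Arrays.fill(result, value);
        return result;
    }

    /** Scheme error relative to the prior's error, as a percentage (assumed scaling). */
    static double relativeAbsoluteError(double[] actual, double[] predicted, double prior) {
        double[] priorPredictions = constant(actual.length, prior);
        return 100.0 * meanAbsoluteError(actual, predicted)
                / meanAbsoluteError(actual, priorPredictions);
    }

    static double rootRelativeSquaredError(double[] actual, double[] predicted, double prior) {
        double[] priorPredictions = constant(actual.length, prior);
        return 100.0 * rootMeanSquaredError(actual, predicted)
                / rootMeanSquaredError(actual, priorPredictions);
    }
}
```

Under these assumptions, a relative error below 100 means the scheme predicts better than the constant prior, and a value above 100 means it predicts worse.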
priorEntropy
public final double priorEntropy() throws java.lang.Exception
Calculate the entropy of the prior distribution
- Returns:
- the entropy of the prior distribution
- Throws:
- java.lang.Exception - if the class is not nominal
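The entropy of a class distribution can be computed as in the following minimal sketch. It assumes the entropy is measured in bits and that the prior is estimated directly from class counts; whether WEKA smooths these counts is not stated here. The class PriorEntropy is a hypothetical name.

```java
// Hypothetical helper, not part of WEKA.
final class PriorEntropy {

    /** Entropy in bits of the distribution given by (possibly weighted) class counts. */
    static double entropyBits(double[] classCounts) {
        double total = 0.0;
        for (double count : classCounts) {
            total += count;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("no instances");
        }
        double entropy = 0.0;
        for (double count : classCounts) {
            if (count > 0.0) { // empty classes contribute nothing
                double p = count / total;
                entropy -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return entropy;
    }
}
```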
KBInformation
public final double KBInformation() throws java.lang.Exception
Return the total Kononenko & Bratko Information score in bits
- Returns:
- the K&B information score
- Throws:
- java.lang.Exception - if the class is not nominal
KBMeanInformation
public final double KBMeanInformation() throws java.lang.Exception
Return the Kononenko & Bratko Information score in bits per
instance.
- Returns:
- the K&B information score per instance
- Throws:
- java.lang.Exception - if the class is not nominal
KBRelativeInformation
public final double KBRelativeInformation() throws java.lang.Exception
Return the Kononenko & Bratko Relative Information score
- Returns:
- the K&B relative information score
- Throws:
- java.lang.Exception - if the class is not nominal
SFPriorEntropy
public final double SFPriorEntropy()
Returns the total entropy for the null model
- Returns:
- the total null model entropy
SFMeanPriorEntropy
public final double SFMeanPriorEntropy()
Returns the entropy per instance for the null model
- Returns:
- the null model entropy per instance
SFSchemeEntropy
public final double SFSchemeEntropy()
Returns the total entropy for the scheme
- Returns:
- the total scheme entropy
SFMeanSchemeEntropy
public final double SFMeanSchemeEntropy()
Returns the entropy per instance for the scheme
- Returns:
- the scheme entropy per instance
SFEntropyGain
public final double SFEntropyGain()
Returns the total SF, which is the null model entropy minus
the scheme entropy.
- Returns:
- the total SF
SFMeanEntropyGain
public final double SFMeanEntropyGain()
Returns the SF per instance, which is the null model entropy
minus the scheme entropy, per instance.
- Returns:
- the SF per instance
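The relationship "null model entropy minus scheme entropy" can be sketched as follows. This is one plausible reading, stated as an assumption: each entropy is taken to be the sum over test instances of -log2 of the probability that the respective model assigned to the instance's actual class, with all instance weights equal to 1. The class EntropyGain is a hypothetical name and the code is not WEKA's.

```java
// Hypothetical helper, not part of WEKA.
final class EntropyGain {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Total entropy in bits: sum of -log2(probability given to the actual class). */
    static double totalEntropy(double[] probOfActualClass) {
        double sum = 0.0;
        for (double p : probOfActualClass) {
            sum -= log2(p);
        }
        return sum;
    }

    /** Total gain: null model (prior) entropy minus scheme entropy. */
    static double gain(double[] priorProbOfActual, double[] schemeProbOfActual) {
        return totalEntropy(priorProbOfActual) - totalEntropy(schemeProbOfActual);
    }

    /** Gain per instance. */
    static double meanGain(double[] priorProbOfActual, double[] schemeProbOfActual) {
        return gain(priorProbOfActual, schemeProbOfActual) / priorProbOfActual.length;
    }
}
```

Under this reading, a positive gain means the scheme assigns more probability to the actual classes than the null model does.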
toCumulativeMarginDistributionString
public java.lang.String toCumulativeMarginDistributionString() throws java.lang.Exception
Output the cumulative margin distribution as a string suitable
for input for gnuplot or similar package.
- Returns:
- the cumulative margin distribution
- Throws:
- java.lang.Exception - if the class attribute is not nominal
toSummaryString
public java.lang.String toSummaryString()
Calls toSummaryString() with no title and no complexity stats
- Returns:
- a summary description of the classifier evaluation
toSummaryString
public java.lang.String toSummaryString(boolean printComplexityStatistics)
Calls toSummaryString() with a default title.
- Parameters:
printComplexityStatistics
- if true, complexity statistics are
returned as well
toSummaryString
public java.lang.String toSummaryString(java.lang.String title,
boolean printComplexityStatistics)
Outputs the performance statistics in summary form. Lists
number (and percentage) of instances classified correctly,
incorrectly and unclassified. Outputs the total number of
instances classified, and the number of instances (if any)
that had no class value provided.
- Parameters:
title
- the title for the statistics
printComplexityStatistics
- if true, complexity statistics are
returned as well
- Returns:
- the summary as a String
toMatrixString
public java.lang.String toMatrixString() throws java.lang.Exception
Calls toMatrixString() with a default title.
- Returns:
- the confusion matrix as a string
- Throws:
- java.lang.Exception - if the class is numeric
toMatrixString
public java.lang.String toMatrixString(java.lang.String title) throws java.lang.Exception
Outputs the performance statistics as a classification confusion
matrix. For each class value, shows the distribution of
predicted class values.
- Parameters:
title
- the title for the confusion matrix
- Returns:
- the confusion matrix as a String
- Throws:
- java.lang.Exception - if the class is numeric
toClassDetailsString
public java.lang.String toClassDetailsString() throws java.lang.Exception
Calls toClassDetailsString() with a default title.
toClassDetailsString
public java.lang.String toClassDetailsString(java.lang.String title) throws java.lang.Exception
Generates a breakdown of the accuracy for each class,
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure. Should be
useful for ROC curves, recall/precision curves.
- Parameters:
title
- the title to prepend the stats string with
- Returns:
- the statistics presented as a string
numTruePositives
public double numTruePositives(int classIndex)
Calculate the number of true positives with respect to a particular class.
This is defined as
correctly classified positives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the number of true positives
truePositiveRate
public double truePositiveRate(int classIndex)
Calculate the true positive rate with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total positives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the true positive rate
numTrueNegatives
public double numTrueNegatives(int classIndex)
Calculate the number of true negatives with respect to a particular class.
This is defined as
correctly classified negatives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the number of true negatives
trueNegativeRate
public double trueNegativeRate(int classIndex)
Calculate the true negative rate with respect to a particular class.
This is defined as
correctly classified negatives
------------------------------
total negatives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the true negative rate
numFalsePositives
public double numFalsePositives(int classIndex)
Calculate number of false positives with respect to a particular class.
This is defined as
incorrectly classified negatives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the number of false positives
falsePositiveRate
public double falsePositiveRate(int classIndex)
Calculate the false positive rate with respect to a particular class.
This is defined as
incorrectly classified negatives
--------------------------------
total negatives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the false positive rate
numFalseNegatives
public double numFalseNegatives(int classIndex)
Calculate number of false negatives with respect to a particular class.
This is defined as
incorrectly classified positives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the number of false negatives
falseNegativeRate
public double falseNegativeRate(int classIndex)
Calculate the false negative rate with respect to a particular class.
This is defined as
incorrectly classified positives
--------------------------------
total positives
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the false negative rate
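The four counts and four rates defined above can all be read off a confusion matrix once one class is designated "positive". The following is a minimal standalone sketch, not WEKA code: the class ConfusionCounts is a hypothetical name, and the matrix is assumed to be indexed as [actual class][predicted class]. Returning 0 for an empty denominator is a convention of this sketch.

```java
// Hypothetical helper, not part of WEKA.
final class ConfusionCounts {

    static double numTruePositives(double[][] cm, int c) {
        return cm[c][c];
    }

    /** Actual class c, but predicted as some other class. */
    static double numFalseNegatives(double[][] cm, int c) {
        double sum = 0.0;
        for (int j = 0; j < cm.length; j++) {
            if (j != c) sum += cm[c][j];
        }
        return sum;
    }

    /** Actual class is not c, but predicted as c. */
    static double numFalsePositives(double[][] cm, int c) {
        double sum = 0.0;
        for (int i = 0; i < cm.length; i++) {
            if (i != c) sum += cm[i][c];
        }
        return sum;
    }

    /** Neither the actual nor the predicted class is c. */
    static double numTrueNegatives(double[][] cm, int c) {
        double sum = 0.0;
        for (int i = 0; i < cm.length; i++) {
            for (int j = 0; j < cm.length; j++) {
                if (i != c && j != c) sum += cm[i][j];
            }
        }
        return sum;
    }

    private static double ratio(double numerator, double denominator) {
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    static double truePositiveRate(double[][] cm, int c) {
        return ratio(numTruePositives(cm, c), numTruePositives(cm, c) + numFalseNegatives(cm, c));
    }

    static double falseNegativeRate(double[][] cm, int c) {
        return ratio(numFalseNegatives(cm, c), numTruePositives(cm, c) + numFalseNegatives(cm, c));
    }

    static double falsePositiveRate(double[][] cm, int c) {
        return ratio(numFalsePositives(cm, c), numFalsePositives(cm, c) + numTrueNegatives(cm, c));
    }

    static double trueNegativeRate(double[][] cm, int c) {
        return ratio(numTrueNegatives(cm, c), numFalsePositives(cm, c) + numTrueNegatives(cm, c));
    }
}
```

For any class, the true positive rate and false negative rate sum to 1, as do the true negative rate and false positive rate, provided the respective denominators are non-zero.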
recall
public double recall(int classIndex)
Calculate the recall with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total positives
(Which is also the same as the truePositiveRate.)
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the recall
precision
public double precision(int classIndex)
Calculate the precision with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total predicted as positive
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the precision
fMeasure
public double fMeasure(int classIndex)
Calculate the F-Measure with respect to a particular class.
This is defined as
2 * recall * precision
----------------------
recall + precision
- Parameters:
classIndex
- the index of the class to consider as "positive"
- Returns:
- the F-Measure
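The three definitions above (recall, precision, and F-Measure) can be sketched together. This is a standalone illustration under stated assumptions, not WEKA code: the class PrfSketch is a hypothetical name, the confusion matrix is assumed to be indexed as [actual class][predicted class], and a zero denominator yields 0 as a convention of the sketch.

```java
// Hypothetical helper, not part of WEKA.
final class PrfSketch {

    /** Correctly classified positives over total positives (row sum). */
    static double recall(double[][] cm, int c) {
        double totalPositives = 0.0;
        for (int j = 0; j < cm.length; j++) {
            totalPositives += cm[c][j];
        }
        return totalPositives == 0.0 ? 0.0 : cm[c][c] / totalPositives;
    }

    /** Correctly classified positives over total predicted as positive (column sum). */
    static double precision(double[][] cm, int c) {
        double predictedPositives = 0.0;
        for (int i = 0; i < cm.length; i++) {
            predictedPositives += cm[i][c];
        }
        return predictedPositives == 0.0 ? 0.0 : cm[c][c] / predictedPositives;
    }

    /** 2 * recall * precision / (recall + precision). */
    static double fMeasure(double[][] cm, int c) {
        double r = recall(cm, c);
        double p = precision(cm, c);
        return (r + p) == 0.0 ? 0.0 : 2.0 * r * p / (r + p);
    }
}
```

The F-Measure is the harmonic mean of recall and precision, so it is high only when both are high.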
setPriors
public void setPriors(Instances train) throws java.lang.Exception
Sets the class prior probabilities
- Parameters:
train
- the training instances used to determine
the prior probabilities
- Throws:
- java.lang.Exception - if the class attribute of the instances is not
set
updatePriors
public void updatePriors(Instance instance) throws java.lang.Exception
Updates the class prior probabilities (when incrementally
training)
- Parameters:
instance
- the new training instance seen
- Throws:
- java.lang.Exception - if the class of the instance is not
set
equals
public boolean equals(java.lang.Object obj)
Tests whether the current evaluation object is equal to another
evaluation object
- Parameters:
obj
- the object to compare against
- Returns:
- true if the two objects are equal
- Overrides:
- equals in class java.lang.Object