java.lang.Object
   |
   +----weka.clusterers.Clusterer
           |
           +----weka.clusterers.DistributionClusterer
                   |
                   +----weka.clusterers.EM
EM assigns each instance a probability distribution that gives the probability of the instance belonging to each of the clusters. EM can decide how many clusters to create by cross-validation, or you may specify a priori how many clusters to generate.
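To make the idea of a per-instance probability distribution concrete, here is a minimal sketch of how membership probabilities can be derived in a generic mixture model: each cluster's weight is multiplied by the density of the instance under that cluster, and the products are normalized to sum to one. The class name, the method name, and the `priors` and `densities` arrays are hypothetical; this illustrates the general technique and is not taken from the WEKA source.

```java
class MembershipSketch {
    /**
     * Turns per-cluster weights and densities into membership probabilities.
     * priors[k] is an assumed weight for cluster k; densities[k] is the
     * density of one instance under cluster k. Both arrays are hypothetical
     * inputs for illustration only.
     */
    static double[] memberships(double[] priors, double[] densities) {
        if (priors.length != densities.length) {
            throw new IllegalArgumentException("priors and densities must have equal length");
        }
        double[] result = new double[priors.length];
        double total = 0.0;
        for (int k = 0; k < priors.length; k++) {
            result[k] = priors[k] * densities[k]; // unnormalized membership
            total += result[k];
        }
        if (total <= 0.0) {
            throw new ArithmeticException("instance has zero density under every cluster");
        }
        for (int k = 0; k < result.length; k++) {
            result[k] /= total; // normalize so the entries sum to 1
        }
        return result;
    }

    public static void main(String[] args) {
        double[] p = memberships(new double[] {0.5, 0.5}, new double[] {0.3, 0.1});
        System.out.println(p[0] + " " + p[1]);
    }
}
```

With equal weights and densities of 0.3 and 0.1, the instance belongs to the first cluster with probability of about 0.75 and to the second with about 0.25.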
Valid options are:
-V
Verbose.
-N
Specify the number of clusters to generate. If omitted,
EM will use cross validation to select the number of clusters
automatically.
-I
Terminate after this many iterations if EM has not converged.
-S
Specify random number seed.
-M
Set the minimum allowable standard deviation for normal density calculation.
Command-line arguments: -t training file [-T test file] [-N number of clusters] [-S random seed]
EM()
buildClusterer(Instances)
densityForInstance(Instance)
distributionForInstance(Instance)
getDebug()
getMaxIterations()
getMinStdDev()
getNumClusters()
getOptions()
getSeed()
globalInfo()
listOptions()
main(String[])
maxIterationsTipText()
minStdDevTipText()
numberOfClusters()
numClustersTipText()
seedTipText()
setDebug(boolean)
setMaxIterations(int)
setMinStdDev(double)
setNumClusters(int)
setOptions(String[])
setSeed(int)
toString()
EM
public EM()
Constructor.
globalInfo
public java.lang.String globalInfo()
Returns a string describing this clusterer
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
Valid options are:
-V
Verbose.
-N
Specify the number of clusters to generate. If omitted,
EM will use cross validation to select the number of clusters
automatically.
-I
Terminate after this many iterations if EM has not converged.
-S
Specify random number seed.
-M
Set the minimum allowable standard deviation for normal density
calculation.
setOptions
public void setOptions(java.lang.String options[]) throws java.lang.Exception
Parses a given list of options.
options
- the list of options as an array of strings
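As an illustration of what such an array of strings might look like, the sketch below assembles the flags documented for this class (-V, -N, -I, -S, -M) into a String[]. It assumes that each flag is followed by its value as a separate array element, which is the usual command-line convention but is not stated on this page. The class and method names are hypothetical, and -N is left out when the cluster count is not positive so that EM falls back to cross validation.

```java
import java.util.ArrayList;
import java.util.List;

class EmOptionsSketch {
    /**
     * Builds an option array from individual settings.
     * Assumption: every flag and its value occupy consecutive elements.
     */
    static String[] buildOptions(int numClusters, int maxIterations, int seed,
                                 double minStdDev, boolean verbose) {
        List<String> opts = new ArrayList<>();
        if (verbose) {
            opts.add("-V");
        }
        if (numClusters > 0) {
            // Omitting -N leaves the choice of cluster count to cross validation.
            opts.add("-N");
            opts.add(Integer.toString(numClusters));
        }
        opts.add("-I");
        opts.add(Integer.toString(maxIterations));
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-M");
        opts.add(Double.toString(minStdDev));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(3, 100, 42, 1.0E-6, true);
        System.out.println(String.join(" ", options));
    }
}
```

The resulting array could then be handed to setOptions(String[]) on an EM instance; the values used here are arbitrary.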
minStdDevTipText
public java.lang.String minStdDevTipText()
Returns the tip text for this property
setMinStdDev
public void setMinStdDev(double m)
Set the minimum value for standard deviation when calculating
normal density. Increasing this value can help prevent arithmetic
overflow resulting from multiplying large densities (arising from small
standard deviations) when there are many singleton or near-singleton
values.
m
- minimum value for standard deviation
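The effect of a minimum standard deviation can be seen in a small sketch of a normal density calculation. The code below is an assumption about how such a floor might be applied and is not the WEKA implementation. The standard deviation is clamped to the floor before the density is evaluated, so the peak density can never exceed 1 / (minStdDev * sqrt(2 * pi)). The class and method names are hypothetical.

```java
class NormalDensitySketch {
    /**
     * Normal density at x, with the standard deviation clamped to a floor.
     * A zero or tiny stdDev (e.g. from singleton values) would otherwise
     * produce an infinite or extremely large density.
     */
    static double density(double x, double mean, double stdDev, double minStdDev) {
        if (minStdDev <= 0.0) {
            throw new IllegalArgumentException("minStdDev must be positive");
        }
        double sd = Math.max(stdDev, minStdDev); // apply the floor
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    public static void main(String[] args) {
        // A degenerate cluster (stdDev = 0) still yields a finite density.
        System.out.println(density(1.0, 1.0, 0.0, 1.0E-6));
    }
}
```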
getMinStdDev
public double getMinStdDev()
Get the minimum allowable standard deviation.
seedTipText
public java.lang.String seedTipText()
Returns the tip text for this property
setSeed
public void setSeed(int s)
Set the random number seed
s
- the seed
getSeed
public int getSeed()
Get the random number seed
numClustersTipText
public java.lang.String numClustersTipText()
Returns the tip text for this property
setNumClusters
public void setNumClusters(int n) throws java.lang.Exception
Set the number of clusters (-1 to select the number by cross-validation).
n
- the number of clusters
getNumClusters
public int getNumClusters()
Get the number of clusters
maxIterationsTipText
public java.lang.String maxIterationsTipText()
Returns the tip text for this property
setMaxIterations
public void setMaxIterations(int i) throws java.lang.Exception
Set the maximum number of iterations to perform
i
- the number of iterations
getMaxIterations
public int getMaxIterations()
Get the maximum number of iterations
setDebug
public void setDebug(boolean v)
Set debug mode - verbose output
v
- true for verbose output
getDebug
public boolean getDebug()
Get debug mode
getOptions
public java.lang.String[] getOptions()
Gets the current settings of EM.
toString
public java.lang.String toString()
Outputs the generated clusters into a string.
numberOfClusters
public int numberOfClusters() throws java.lang.Exception
Returns the number of clusters.
buildClusterer
public void buildClusterer(Instances data) throws java.lang.Exception
Generates a clusterer. Has to initialize all fields of the clusterer
that are not being set via options.
data
- set of instances serving as training data
densityForInstance
public double densityForInstance(Instance inst) throws java.lang.Exception
Computes the density for a given instance.
inst
- the instance to compute the density for
distributionForInstance
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
Predicts the cluster memberships for a given instance.
inst
- the instance to be assigned a cluster
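When a single cluster label is needed rather than a full set of memberships, one option is to pick the entry with the highest value from the returned array. The sketch below assumes that element k of the double[] is the membership probability for cluster k, as the class description suggests. The class and method names are hypothetical.

```java
class HardAssignmentSketch {
    /**
     * Returns the index of the largest entry in a membership distribution.
     * Ties are resolved in favor of the lowest index.
     */
    static int mostLikelyCluster(double[] distribution) {
        if (distribution.length == 0) {
            throw new IllegalArgumentException("distribution must not be empty");
        }
        int best = 0;
        for (int k = 1; k < distribution.length; k++) {
            if (distribution[k] > distribution[best]) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] memberships = {0.1, 0.7, 0.2}; // illustrative values
        System.out.println(mostLikelyCluster(memberships));
    }
}
```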
main
public static void main(java.lang.String argv[])
Main method for testing this class.
argv
- should contain the following arguments: -t training file [-T test file] [-N number of clusters] [-S random seed]