
Gain Ratio
Gain ratio is an alternative metric from Quinlan’s 1986 paper, used in the popular C4.5 package (free!).
\[
\mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}
\]
\[
\mathrm{SplitInfo}(S, A) = -\sum_{i=1}^{n} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}
\]
where S_i is the subset of S in which attribute A has its i-th value, and n is the number of values of A.
SplitInfo measures the amount of information provided by an attribute that is not specific to the category.
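
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the C4.5 implementation: it assumes Gain(S,A) is the usual entropy-based information gain, and the function names (`entropy`, `split_info`, `gain`, `gain_ratio`) and the toy data are hypothetical. Returning 0 when SplitInfo is 0 (an attribute with a single value) is also an assumption made here to avoid dividing by zero.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Entropy (base 2) of the empirical distribution over items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def split_info(values):
    """SplitInfo(S, A): entropy of the attribute's own value distribution."""
    return entropy(values)


def gain(values, labels):
    """Information gain of splitting labels by the attribute values (assumed definition)."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """GainRatio(S, A) = Gain(S, A) / SplitInfo(S, A); 0.0 if SplitInfo is 0 (assumption)."""
    si = split_info(values)
    return gain(values, labels) / si if si > 0 else 0.0


labels = ["+", "+", "-", "-"]
two_valued = ["a", "a", "b", "b"]   # splits the classes perfectly with 2 values
id_like = [1, 2, 3, 4]              # unique value per example, like an ID field

print(gain_ratio(two_valued, labels))  # 1.0
print(gain_ratio(id_like, labels))     # 0.5
```

Both attributes have the same gain (1 bit) on this toy data, but the ID-like attribute has SplitInfo of 2 bits, so its gain ratio is halved. This is the sense in which SplitInfo captures information in the attribute that is not specific to the category.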