Gain Ratio
Gain ratio is an alternative metric from Quinlan’s 1986
paper; it is used in the popular C4.5 package (free!).
$$\mathrm{GainRatio}(S,A) = \frac{\mathrm{Gain}(S,A)}{\mathrm{SplitInfo}(S,A)}$$

$$\mathrm{SplitInfo}(S,A) = \sum_{i=1}^{n} -\frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$

where $S_i$ is the subset of $S$ in which attribute $A$ has its $i$th value, and $n$ is the number of values of $A$.
SplitInfo measures the amount of information provided
by an attribute that is not specific to the category.
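As a rough illustration of the formulas above, here is a minimal Python sketch. It assumes the standard entropy-based definition of Gain(S,A), which is not restated on this slide, and the function names (entropy, split_info, gain, gain_ratio) are hypothetical, not taken from C4.5. Returning 0 when SplitInfo is 0 is a choice made for this sketch only.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Entropy (in bits) of the empirical distribution of `items`."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def split_info(values):
    """SplitInfo(S,A): entropy of the partition of S by A's values.

    Class labels play no part here; only the subset sizes |Si| matter.
    """
    return entropy(values)


def gain(values, labels):
    """Information gain of A (assumed standard definition):
    entropy of the labels minus the size-weighted entropy of each subset Si."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for x, lab in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """GainRatio(S,A) = Gain(S,A) / SplitInfo(S,A)."""
    si = split_info(values)
    if si == 0.0:
        # A has a single value on S, so the ratio is undefined.
        # Returning 0.0 is an assumption of this sketch.
        return 0.0
    return gain(values, labels) / si


# Toy data (made up for illustration): four examples, two classes.
labels = ["+", "+", "-", "-"]
id_like = [1, 2, 3, 4]          # a distinct value per example
binary = ["a", "a", "b", "b"]   # two values, aligned with the class

print(gain_ratio(id_like, labels))
print(gain_ratio(binary, labels))
```

On this toy data both attributes have a gain of 1 bit, but the ID-like attribute has a SplitInfo of 2 bits and so a gain ratio of 0.5, while the binary attribute has a SplitInfo of 1 bit and a gain ratio of 1.0. The many-valued attribute is penalized for splitting S into many small subsets.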