Readings for this homework:
- Ch 18.1-18.4 (Decision Trees)
1. Complete the "Additional Exercises" for 4 solvers other than C4.5 for the "Classification with WEKA" exercise begun in class on Nov 11. Write a short (1 paragraph) description of how each solver works, using information you find in the help files, the textbook, or on the web. Create a table comparing the solvers' performance (including C4.5).
2. Suppose you want to use decision tree learning to learn an unknown Boolean function over k-digit binary numbers. You first task is to decide what the features are that can appear in the nodes of the tree. You consider two alternatives:
(a) How many possible features are there for choice 1?
(b) How many possible features are there for choice 2?
(c) Give an example of a simple target function which to learn exactly would require an exponentially larger tree using feature set 1 than using feature set 2, or provide an argument why none could exist. (By "exponential" we mean that if the tree using feature set 2 has m nodes, then the smallest equivalent tree using feature set 1 has (at least) cm nodes, for some constant c.)
(d) Give an example of a simple target function which to learn exactly would require an exponentially larger tree using feature set 2 than using feature set 1, or provide an argument why none could exist.
*** SPOILERS *** (read only if you get stuck...)
Readings for next week: