Optimality Principle 3
Assume we have an optimal Huffman tree T whose two lowest-probability symbols are siblings at maximum depth. These two symbols can be replaced by a single new symbol whose probability is the sum of their probabilities.
- The resulting tree is optimal for the new symbol set.
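C(T') = C(T) + (h-1)(p+q) - hp - hq = C(T) - (p+q)
- Here p and q are the probabilities of the two replaced symbols, h is their depth in T, T' is the tree after the replacement (the new symbol sits at depth h-1), and C is the expected code length, i.e. the sum of probability times depth over all leaves.
- Since C(T') and C(T) differ only by the constant p+q, a cheaper tree for the new symbol set could be expanded (splitting the new symbol back into its two children) into a tree cheaper than T, contradicting the optimality of T.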
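The cost identity can be checked numerically on a small case. The following is a minimal sketch, not part of the original material: the four-symbol distribution, the leaf names, and the helper functions `cost` and `merge_siblings` are all assumptions chosen for illustration. A tree is represented only by its leaves, each mapped to a (probability, depth) pair, which is enough to compute C.

```python
from fractions import Fraction


def cost(tree):
    """Expected code length C: sum of probability * depth over all leaves."""
    return sum(p * depth for p, depth in tree.values())


def merge_siblings(tree, a, b, merged_name):
    """Replace sibling leaves a and b (both at depth h) by one leaf at depth h - 1."""
    (p, h), (q, h_b) = tree[a], tree[b]
    assert h == h_b, "siblings must sit at the same depth"
    reduced = {s: v for s, v in tree.items() if s not in (a, b)}
    reduced[merged_name] = (p + q, h - 1)
    return reduced


# Hypothetical example: leaf -> (probability, depth) in an optimal tree T.
# Exact fractions avoid floating-point noise in the equality check.
T = {
    "a": (Fraction(4, 10), 1),
    "b": (Fraction(3, 10), 2),
    "c": (Fraction(2, 10), 3),  # lowest-probability pair: c and d,
    "d": (Fraction(1, 10), 3),  # siblings at maximum depth h = 3
}

T_reduced = merge_siblings(T, "c", "d", "cd")
p, q = T["c"][0], T["d"][0]

# C(T') = C(T) - (p + q)
assert cost(T_reduced) == cost(T) - (p + q)
```

The check passes for any depth h, because the merged leaf contributes (h-1)(p+q) in place of hp + hq; the specific numbers above only make the identity concrete.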