B-Trees Study Guide
Depth and Height. We define the depth of a node as the number of links between it and the root, so the root has a depth of 0. We define the height of a tree as the depth of its deepest node. Depending on the order in which we insert into a BST, the height can vary drastically. We say a tree is “spindly” if its height is close to N and “bushy” if its height is close to log N. Operations such as finding a node take time proportional to the height, so we want the height to be as small as possible, which favors “bushy” BSTs.
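To make the spindly/bushy contrast concrete, here is a minimal sketch of an unbalanced BST. The names BSTNode, insert, height, and build are made up for this illustration and are not from the course code. It inserts the same seven keys in two different orders and measures the resulting height.

```python
class BSTNode:
    """Hypothetical node for a plain, unbalanced BST."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key iteratively and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = BSTNode(key)
                break
            node = node.right
        else:
            break  # duplicate key: nothing to do
    return root


def height(root):
    """Depth of the deepest node; the root has depth 0, an empty tree gives -1."""
    best = -1
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def build(keys):
    """Insert keys one at a time, in the given order."""
    root = None
    for k in keys:
        root = insert(root, k)
    return root


spindly = build([1, 2, 3, 4, 5, 6, 7])  # sorted order: every node goes right
bushy = build([4, 2, 6, 1, 3, 5, 7])    # middle keys first: levels fill evenly
print(height(spindly))  # 6
print(height(bushy))    # 2
```

With N = 7 keys, the sorted insertion order gives a height of N - 1, while the second order gives a height of about log N.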
B-trees. Two specific B-trees in this course are 2-3 Trees (a B-tree where each non-leaf node has 2 or 3 children), and 2-3-4/2-4 Trees (a B-tree where each non-leaf node has 2, 3, or 4 children). The key idea of a B-tree is to overstuff the nodes at the bottom instead of adding new leaves, which would increase the height of the tree. When a node holds too many items, it splits and one of its middle items moves up to the parent; the tree only gets taller when the root itself splits. This keeps the height in O(log N). Make sure you know how to insert into a B-tree. With this restriction on height, the runtimes for contains and add are both in O(log N).
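The insertion procedure can be sketched as follows. This is an illustrative implementation under stated assumptions, not the course's reference code: the names Node, MAX_KEYS, insert, contains, and height are hypothetical, and the choice of which middle key moves up on a split (here the left-middle one) is a convention that only matters for 2-3-4 trees.

```python
MAX_KEYS = 2  # assumption: 2 gives a 2-3 tree; 3 would give a 2-3-4 tree


class Node:
    """Hypothetical B-tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

    def is_leaf(self):
        return not self.children


def _child_index(node, key):
    """Index of the child (or key slot) where key belongs."""
    return sum(1 for k in node.keys if k < key)


def _insert(node, key):
    """Insert below node; return (middle_key, new_right_node) if node split, else None."""
    if key in node.keys:
        return None  # duplicate key: nothing to do
    i = _child_index(node, key)
    if node.is_leaf():
        node.keys.insert(i, key)  # overstuff the leaf instead of adding a new level
    else:
        promoted = _insert(node.children[i], key)
        if promoted is None:
            return None
        mid, right = promoted
        node.keys.insert(i, mid)            # key pushed up from the child
        node.children.insert(i + 1, right)  # new sibling created by the split
    if len(node.keys) <= MAX_KEYS:
        return None
    # Node is overstuffed: split it and push a middle key up to the parent.
    m = (len(node.keys) - 1) // 2
    mid = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return mid, right


def insert(root, key):
    """Insert key and return the root; the tree grows taller only when the root splits."""
    if root is None:
        return Node([key])
    promoted = _insert(root, key)
    if promoted is None:
        return root
    mid, right = promoted
    return Node([mid], [root, right])


def contains(node, key):
    """Walk one root-to-leaf path, so the work is bounded by the height."""
    while node is not None:
        if key in node.keys:
            return True
        if node.is_leaf():
            return False
        node = node.children[_child_index(node, key)]
    return False


def height(node):
    """All leaves are at the same depth, so following the leftmost path suffices."""
    h = 0
    while not node.is_leaf():
        node = node.children[0]
        h += 1
    return h


root = None
for k in [1, 2, 3, 4, 5, 6, 7]:  # sorted order, the worst case for a plain BST
    root = insert(root, k)
print(root.keys, height(root))  # [4] 2
```

Inserting 1 through 7 in sorted order produces a root holding 4 with children holding 2 and 6, and a height of 2 rather than the height of 6 that a plain BST would reach on the same input.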
B-tree Invariants. Because of how we add to our tree, we get two nice invariants for B-trees:
- All leaves must be at the same depth.
- A non-leaf node with K keys must have exactly K + 1 non-null children.
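As a rough sketch, both invariants can be checked mechanically. The BNode class and check_invariants function below are hypothetical names chosen for this example; a leaf is modeled as a node with an empty children list.

```python
from dataclasses import dataclass, field


@dataclass
class BNode:
    """Hypothetical B-tree node; children is empty for a leaf."""
    keys: list
    children: list = field(default_factory=list)


def check_invariants(root):
    """True if all leaves share one depth and every non-leaf
    with K keys has exactly K + 1 non-null children."""
    leaf_depths = set()
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if not node.children:
            leaf_depths.add(depth)  # record the depth of each leaf
            continue
        if len(node.children) != len(node.keys) + 1:
            return False  # wrong number of children for K keys
        if any(child is None for child in node.children):
            return False  # a null child in a non-leaf node
        for child in node.children:
            stack.append((child, depth + 1))
    return len(leaf_depths) == 1


good = BNode([4], [
    BNode([2], [BNode([1]), BNode([3])]),
    BNode([6], [BNode([5]), BNode([7])]),
])
uneven = BNode([4], [
    BNode([2], [BNode([1]), BNode([3])]),
    BNode([6, 7]),  # this leaf sits one level higher than the others
])
too_few_children = BNode([2, 4], [BNode([1]), BNode([3])])  # 2 keys, only 2 children

print(check_invariants(good))              # True
print(check_invariants(uneven))            # False
print(check_invariants(too_few_children))  # False
```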