AVL trees (Also called "Red-Green" trees (no relation to the show)1, and "Height Balanced" trees) are really, really cool. The AVL comes from the creators, Adelson, Veslkij and Laudis (Source: vera). I will go into a bit more detail behind the theory than the previous posters.
The general premise behind AVL trees is "The height of the left and right subtrees differ by no more than one." What this means is that it's time to whip out the ASCII Art
h: fig 1. | fig 2. |
| |
0: C | B |
1: / \ | / \ |
2. / \ | / \ |
3: A 4 | A C |
4: / \ | / \ / \ |
5: 1 B | 1 2 3 4 |
6: / \ | |
7: 2 3 | |
Figure 1 is unbalanced because the height of the leaf node 3'is of height 7 whereas the height of leaf node 4 is 3. The difference between these nodes is greater than one, and so they are in an unbalanced state.
Unbalanced trees are bad because they increase search time - and that is bad. The whole point to using an AVL tree is for the speed of O(log n).
It is possible for an AVL tree to be unbalanced in four possible ways: left-of-left, right-of-left, left-of-right, right-of-right. Time again for some diagrams.
fig 3. | fig 4. | fig 5. | fig 6. |
| | | |
C | C | A | A |
/ | / | \ | \ |
B | A | C | B |
/ | \ | / | \ |
A | B | B | C |
Figure 3 illustrates left-of-right, figure 4 illustrates right-of-left, figure 5 illustrates left-of-right and figure 6 illustrates right-of-right.
To balance these we need an algorithm. I will do right-of-left; the others are easy to derive from the pattern.
fig 4.
C
/
A
\
B
Here again is figure 4. The C node is the root node and grandparent to node B. Node A is the parent of node B and the child of node C. To balance this tree we do a "rotation" of nodes.
The general algorithm is:
Grandparent -> Temp
Parent -> Grandparent
Child -> Parent
Temp -> Child
When balanced figure 4 will look like:
fig 7.
B
/ \
A C
I will now provide some code to demonstrate the left-of-right balancing. This assumes an intuitive Binary Tree implementation.
public BiTNode rotateRL(...) {
// Left of Right balancing function
// You can figure out the parameters :-)
root = child;
grandparent.setLeft(child.getRight());
parent.setRight(child.getLeft());
root.setLeft(parent);
root.setRight(grandparent);
return root;
}
Once we identify how to rotate nodes and balance AVL trees we have to decide when to balance. To do that we can either write one function to balance the whole tree all at once and call it periodically, or we can do one of two things. What we can do is assign a node a depth identifier (see fig 1 & 2). There are actually two ways to do this.
The first is to assign a height variable to each node. What this height variable does is store how many children are below this (that is, the one storing the numbers) node on the left and right side. Take figure 7, for example. Node A has 0 children, so its left and right height is 0. The same for Node C. Node B, however, has two children, so its height is 1. These numbers are computed recursively so the super-root node will always be correct. We know when a node is unbalanced when its left and right height differ by 2.
The other method is a spinoff of the above. It uses a balance factor. This value is an integer and can have three values:
state meaning
-----------------------
0 | Balanced tree
<0 | Left heavy
>0 | Right heavy
Let's look at figure 3 again. Node A has a BF of 0, node B has a BF of -1 -- meaning it has a left child, but no right -- and node C (the root node) has a BF of -2, which means it has 2 more left children than right. This value is also computed recursively.
All of this is really pointless unless the reader understands and can answer the question "why am I using an AVL tree and rotation code?" The answer is that an AVL tree is faster to search than a standard binary tree that is not balanced (an unbalanced binary tree has O(n) speed). What makes an AVL tree stand out from a regular binary tree is that an AVL tree is (self) balancing. Remember: Speed is the name of the game. While there are speedier data structures, such as B-Trees or M-way trees, an AVL tree is easier to implement.
Source: Notes from CSC-103 - Introduction to Data Structures at Monroe Community College, Fall 2002 class. Thanks to kthejoker for editorial help.
1: m_turner writes that this may be Red-Black instead of Red Green. My professor said "Red-Green" in class, so I will include m_turner's note here as well.