The reason for this post is that I've been working through teaching myself computer science topics and I felt like this data structure was unnecessarily difficult. I would like to share my findings in hopes of helping someone on their journey.
The problems I ran into learning AVL Trees were mostly in actually coding the thing.
- Some of the code that I found online was broken. It didn't actually work, and that's my fault for not testing.
- Some code was unnecessarily complicated (we'll talk about what I think that means later on).
- Some code was difficult to read and understand the underlying concepts. I spent a lot of time trying to figure out just what the code itself was doing and it didn't teach me much when I did figure it out.
AVL trees are named after the inventors Adelson-Velsky and Landis. In short they are a self-balancing binary search tree (BST). The idea behind a BST is that on average BSTs make looking up info really fast because data has a very specific place it can be. The worst case scenario though is if a BST turns lopsided and all of the data ends up making it look more like a list and less like a tree. That is where AVLs come in. They ensure that BSTs never get out of balance and we can keep those super fast look ups.
- Binary Search Trees
- Node height
- balancing (calculating the balance factor)
- what it means to be heavy
- The four rotations LL, RR, LR, RL
I feel like there are a lot of great authoritative sources out there that teach the concepts really well. I gravitated to this list below. This is by no means an exhaustive list.
- San Diego State University's Dr. Rob Edwards helped me through the basics and understanding of a linear fashion through all of my data structure learning.
- I thought Abdul Bari gave an amazing explanation and really drove home balancing factors for me.
My first instinct after understanding the concepts was to try and implement it through sheer problem solving. I think its a great exercise to try to solve an AVL Tree with the above knowledge. When I felt stuck that's when I started looking for other people's implementations so that I could cherry pick what I liked out of those.
My criteria for an implementation that will help me learn the concept should have:
- Traditional implementations that I can imagine using for other problems.
- Novel solutions to traditional problems.
- Has no dependencies and is isolated in one file mostly.
The different implementations I mostly looked at were:
Trekhleb's on Github is a clean implementation and pretty cool concept of using a loop for the insert. However, I didn't like that it uses a
findfunction in the
bstfile, which I just didn't want to have hunt down to juggle back and forth.
Gwtw's on Github seems more of the classic insert implementation that I saw. However I was not a fan of how the rotation methods were defined. They didn't seem as intuitive to me.
3.The Tutorials Point implementation helped me understand most concepts clearly. It has clean naming conventions and a good order of how it does things and is nice and short. This is the code for the LL rotation, which I admire.
The negative aspects of it though were that because it is so short it does some interesting short cuts that make you have to do some extra mental gymnastics. The biggest issue with this code is that it doesn't completely work. There are some typos and it doesn't handle some basic use cases. My biggest issue was that there was no forum, community, or way to know if this is a good or bad implementation because there is no way to give feedback. I think the code however is really elegant and solves the problem in a really interesting, and succinct way.
So, here is the code that I came up with after all this.
Please comment, submit pull requests and let me know what you would like to see in it. Thanks for reading.