DEV Community

Andy Zhao (he/him)
Andy Zhao (he/him)

Posted on

Explain Indexing Like I'm Five

This could be in the context of a database (like Postgres), or with a search service (like Algolia).

Top comments (10)

Collapse
 
idanarye profile image
Idan Arye

There is a library with many books, ordered by author. Let's say you want to find an Harry Potter. You go to one of the bookcases, look at one of the books, and it happens to be one of the Lord of the Rings books. Tolkien starts with "T", which is after "R" in the ABC, so the Harry Potters must be before them. So you go to some place before, look at some book, and it's a Wheel of Time book. This time "Jordan" comes before "Rowling", so you look at books after that one - but before the Tolkien you found earlier. By following that method you can quickly find your Harry Potter book.

But what if you were living under a rock, and don't know that the author of Harry Potter is J.K. Rowling? You only know the title, but the books are alphabetically ordered by author, not title, so that fast lookup method won't be of any help.

Luckily, you are not the only uncultured reader that visits that library, so the staff decided to help. They prepared a big list of all the books, and ordered it by the title. You can search the title alphabetically - just like you did with the actual books - find the author, and then search the actual books for it.

One day, the library got a new series - A Song of Ice and Fire by George R. R. Martin. The staff wanted to put it in it's place, right between "Martel" and "Marton" - but that would mean they have to move all the books after it - and that's a lot of work! Do they have to do it every time they get a new book?

So, the library decided to change it's system. From now on, each book has a fixed place - bookcase, shelf, and place in the shelf. New books are always added to the last shelf in the last bookcase - and if that bookcase is full the library orders a new bookcase. These locations are written in the list of books. Writing something in the middle of the list is easier than moving all the books, because you can write between the lines/words and once it gets too messy you can write it again - that's a lot of work, but you only need to do it after many in-place edits, so it's not that bad.

So now if you want to find a book you look it up at the list, read it's location, and go directly there. This is even better than the old system, because you only have to do the alphabetical search once. Searching alphabetically is faster than looking at the titles one by one, but it's still more work than just knowing the location.

But... what if you want to find a book by author? You could do it before, but you can't now because they are no longer ordered by author!

Well - who said we can only have one list? Just make another list, this time ordered by author. And you can make a list for everything you want to search by, or even for several thing (like author and then title). Just... don't overuse it, because you have to update each and every one of the lists every time you get a new book, so you don't want hundreds of them...


The books are the data, and the lists are indexes. You usually append to the data, but the indexes are structures that are easier to edit and keep sorted. When the database writes a new data entry, it also updates the indexes, making it easier to search for that entry later. You can define multiple indexes, each sorted by different fields.

Collapse
 
casperbraske profile image
Luis

Thank god I'm not your 5 years old son asking about sex...

Collapse
 
andy profile image
Andy Zhao (he/him)

What's that have to do with anything? Not necessary nor constructive.

Thread Thread
 
casperbraske profile image
Luis • Edited

Calm down, bro. It's just a joke because you said "explain like I'm five" and he used very hard language for a kid that age.
I can delete it if you will.

Thread Thread
 
mtbsickrider profile image
Enrique Jose Padilla

I think that the library example is amazing and thorough. Sure it might confuse a 5 year old, but probably a 10 year old would get this.

Collapse
 
andy profile image
Andy Zhao (he/him)

Awesome answer! Thanks!

Collapse
 
adjoa profile image
Adjoa

This really helped me get it. Thank you! :)

Collapse
 
sam_ferree profile image
Sam Ferree

You couldn't actually use this on a five year old today, because it requires understanding what phonebook is, but...

The yellow pages lists people by name in alphabetical order. It is indexed by last name. Searching for a number by last name is very fast. Searching for a last name by number is very slow, and could potentially require you to look at the entire book.

The white pages list businesses by sector. It is indexed by sector. Searching for a business by sector is very fast. Searching for a business by phone number is very slow, and could potentially require you to look at the entire book.

Collapse
 
svemaraju profile image
Srikanth

I have found this blog post on Postgres to be very informative - malisper.me/postgres-index-scans/
There's lot more information about Postgres in general on that blog.

Collapse
 
andy profile image
Andy Zhao (he/him)

Hmm interesting, I'll check it out. Thanks.