Unionfind (Disjointset)
JB ・3 min read
CS Level Up Series (28 Part Series)
1) CS Level Up Series Introduction
2) Dynamic Arrays
3 ... 26
3) Linked Lists
4) Stacks
5) Queues
6) Hash Tables
7) Binary Search Trees
8) Binary Heaps
9) Priority Queues
10) Graphs
11) Tries
12) Binary & Bit Manipulation
13) Common Sorting Algorithms
14) Searching Algorithms
15) Permutations, Combinations, & Subsets
16) NPComplete & Fibonacci Heap
17) Detecting Graph Cycles With DepthFirst Search
18) Finding Shortest Paths In Graphs (using Dijkstra's & BFS)
19) Topological Sorting of Directed Acyclic Graphs (DAGs)
20) Finding Articulation Points & Bridges in Undirected Graphs
21) Finding Strongly Connected Components in Directed Graphs using Tarjan's Algorithm
22) Checking If An Undirected Graph Is Bipartite
23) Extending An Iterator
24) Unionfind (Disjointset)
25) Minimum Spanning Tree (Kruskal's Algorithm)
26) Sliding Window Technique
27) String Searching Using RabinKarp
28) Fenwick Tree (Binary Indexed Tree)
Resources:
Takeaways:
 The unionfind data structure (also known as disjointset) keeps track of elements that are partitioned into disjoint subsets (also called components).
 When elements are initially added to unionfind, they are all considered members of their own component. That means if we create a unionfind with 3 elements it will also contain 3 components (each component being a single element). This is achieved by having each element specify itself as its parent (selfreferencing).
 Unionfind has three main operations:
MakeSet()
,Union()
, &Find()
.
MakeSet()
takes an element, adds it to the underlying collection/tree, gives it either arank
of 0 orsize
of 1 (explained later on), and specifies a parent pointer to itself (which creates a new component). 
Find()
takes an elementx
and traverses it's parent chain.Find()
tells us what componentx
is in by finding the root element of that component. If two elements have the same root, they are in the same component. 
Union()
takes two elementsx & y
and merges the components they belong to together.Union()
leveragesFind()
to determine what componentsx & y
are in. If the roots are the same,x & y
are in the same component, and no action is taken. If the roots are different, this meansx & y
are in separate components. These components are merged by pointing the root of one of the components to the root of the other (one root becomes the parent of the other). How do we determine which component is the one to merge into the other? By using
rank
orsize
mentioned earlier.
 How do we determine which component is the one to merge into the other? By using

 Union by size merges the smallest component (fewest elements) into the larger one.

Union by rank merges the component with the shorter tree into the taller one. Each componenet has a rank, which starts as 0. If two components are unioned and they have the same rank then the resulting component's rank is increased by 1. If the unioned components have a different rank, then the resulting component's rank is equal to the highest rank between the two.
 We use rank instead of tree height or depth because of a techninque called path compression that changes the height/depth of the components over time
 Path compression is utilized in the
Find()
operation to flatten a components tree. It does this by making every element of the component point to the components root (every element's parent is now the root of the component). This makes theFind()
operation faster for that component, as it has a flatter chain to traverse when looking for a component's root.  Unionfind is often implemented with either an array or using a more objectoriented approach. In my implementations I use an array.
 Unionfind is a useful data structure when used on graphs. Unionfind can be used for things such as efficiently connecting vertices, finding connected components, & determining minimum spanning tree.
 Using path compression, both unionfind using
rank
orsize
have an amortized time complexity ofO(n)
forUnion()
&Find()
operations. Space isO(n)
. Without path compression, bothUnion()
&Find()
operations have a complexity ofO(log n)
.
Below are array based unionfind implementations by size & rank:
As always, if you found any errors in this post please let me know!
CS Level Up Series (28 Part Series)
1) CS Level Up Series Introduction
2) Dynamic Arrays
3 ... 26
3) Linked Lists
4) Stacks
5) Queues
6) Hash Tables
7) Binary Search Trees
8) Binary Heaps
9) Priority Queues
10) Graphs
11) Tries
12) Binary & Bit Manipulation
13) Common Sorting Algorithms
14) Searching Algorithms
15) Permutations, Combinations, & Subsets
16) NPComplete & Fibonacci Heap
17) Detecting Graph Cycles With DepthFirst Search
18) Finding Shortest Paths In Graphs (using Dijkstra's & BFS)
19) Topological Sorting of Directed Acyclic Graphs (DAGs)
20) Finding Articulation Points & Bridges in Undirected Graphs
21) Finding Strongly Connected Components in Directed Graphs using Tarjan's Algorithm
22) Checking If An Undirected Graph Is Bipartite
23) Extending An Iterator
24) Unionfind (Disjointset)
25) Minimum Spanning Tree (Kruskal's Algorithm)
26) Sliding Window Technique
27) String Searching Using RabinKarp
28) Fenwick Tree (Binary Indexed Tree)
Discussion
Classic DEV Post from Jul 30 '19