Data structures are formats used to organize, store, and modify data. Data structures are a fundamental component of computer science and software engineering. They can be implemented in any programming language, including the C++ programming language. If you have a coding interview ahead of you, you’ll be expected to know how to use data structures to efficiently work with data.
Today, we'll cover C++ data structures you need to know for your coding interview.
We’ll cover:
- What are C++ data structures?
- 9 C++ data structures you need to know for your interview
- Trees
- Wrapping up and next steps
What are C++ data structures?
Data structures are formats used to organize, store, and manage data. You can use data structures to access and modify data efficiently. There are various types of data structures. The type you use will depend on the application you’re using. Each data structure has its advantages and disadvantages.
Data structures generally fall under these two categories:
- Linear data structures: Elements are arranged sequentially (e.g. arrays, linked lists, stacks, queues)
- Non-linear data structures: Elements are not arranged sequentially, but are stored within different levels (e.g. trees, graphs)
9 C++ data structures you need to know for your interview
1. Arrays
An array is a sequential list of values. You can use arrays to store and access multiple values of the same data type under one identifier. Arrays makes it easier to perform operations on an entire dataset. They're also great for retrieving data.
There are one-dimensional arrays and multi-dimensional arrays. The values stored in an array are called elements. Elements are stored in consecutive blocks of memory called indexes. The size of an array is determined by the number of elements it contains. Pictured is a one-dimensional array with a size of six.
The general syntax for initializing a one-dimensional array is:
In this code, we initialize an array named Example
that stores 200 at index 0
and 201 at index 1
.
#include <iostream>
using namespace std;
int main() {
int Example[2];
Example[0] = 200;
Example[1] = 201;
}
2. Graphs
Graphs are non-linear data structures. They consist of a finite set of vertices (i.e. nodes) connected by a set of *edges*. The order of a graph refers to the number of vertices in the graph. A graph's size corresponds to its number of edges.
This illustration represents vertices and edges. The graph also represents a cyclic graph. A graph is cyclic if at least one node has a path that directs back to itself.
Graphs are frequently used to model real-life relationships in various contexts ranging from social networks to neural networks.
There are three common types of graphs:
- Undirected graphs: A graph in which all edges are bi-directional
- Directed graphs: A graph in which all edges are uni-directional
- Weighted graphs: A graph in which all edges are assigned a value, called a weight
You’ll likely be asked to traverse a graph in your interview. For your C++ interview preparation, make sure you’re comfortable with using common graph algorithms such as breadth-first search and depth-first search.
3. Linked lists
Linked lists are linear data structures consisting of a collection of nodes that represent a sequence. Each node contains a value and a pointer to the next node in the sequence. Linked lists are great for adding and deleting nodes. However, they’re not great for retrieving nodes, since each node is only aware of the node next to it.
There are both singly linked lists and doubly linked lists. A singly linked list is the most basic form of a linked list, where each node is only aware of the next node in the sequence. In a doubly linked list, each node points to the node preceding and following it in the sequence. The direction of these links are often distinguished as “forward” and “backwards,” or “next” and “prev(ious)”.
4. Hash tables
Hash tables store key and data pairs. They're usually implemented using arrays, where each element has its own unique index value to enable quick access. While you could assign the key (an integer value) as the index for data, this gets complicated in the case of large keys. Instead, the hashing technique is used to generate indexes for the elements.
This figure represents key and data pairs stored in a hash table.
The hashing technique uses a hash function to convert large keys into smaller keys which can then be stored in a hash table. A good hash function should meet the following conditions:
- Easy to compute
- Distributes data uniformly (without clustering)
- Minimizes collision
A hash table's performance depends on the size of the hash table, the hash function used, and the collision handling method.
Collision is something we try to avoid when hashing. Collision refers to instances where two different keys passed to a hash function return the same index. The chances of collision increase when we're dealing with large keys. Collision handling methods help to work around collisions, and include the following common strategies: linear probing, chaining, and resizing the array.
Hash tables are excellent data structures for algorithms prioritizing search operations. Insertion and search operations are quite fast in hash tables. Hash tables are a popular topic for C++ coding interviews.
5. Stacks and queues
Stacks and queues are separate but similar linear data structures. They’re both containers of objects based on arrays. They are flexible in size.
Their distinguishing characteristic is the order in which operations may be performed on them:
- Stack operations follow a Last-In-First-Out (LIFO) or First-In-Last-Out (FILO) order
- Queue operations follow a First-In-First-Out (FIFO) order
With stacks, we only remove the most recently added item. A popular example is a stack of lunch trays in a cafeteria. With queues, we only remove the least recently added item. A popular example is a queue in a public setting, where the person waiting the longest is served first (of course, this depends on whether the culture or locality observes the FIFO rule for queueing)!
Stacks and queues have various applications. Stacks are often used with the recursive backtracking algorithm. Stacks are the mechanism supporting the "undo" or "back" button on many applications and browsers. Compilers often use stacks to parse the syntax of expressions before translating them into low-level code. Queues are useful in cases where the order of data needs to be maintained. As such, they're used for caching, as well as handling real-time system interruptions.
6. Strings
A string is generally a data type, but it can often be implemented as an array data structure that stores a sequence of characters. C++ has a string class that makes it easy to manipulate strings of characters. A C++ string is mutable and can be changed, as opposed to an immutable Java string.
You can declare a string by declaring a one-dimensional array. The difference between a character array and a string is that a string is terminated with the null character \0
.
You don't have to place the null character \0
when declaring a string. Using the syntax in the following code, the double quotation "
tells your compiler to automatically append \0
.
char greeting[] = "hello"
This corresponds to the following illustration.
Strings are often used to store human-readable data such as sentences. They're even used to represent DNA sequences.
Trees (#7, 8, and 9)
Next, we'll cover binary trees, binary search trees, and tries. Before we do, let's learn about the category they all belong to: trees.
Trees are abstract data types that model a hierarchical tree structure. Trees consist of nodes (i.e. vertices) and edges that connect them. Linked nodes
are called “parent” and "children” nodes. Unlike graphs, a cycle can't exist in a tree. In a tree, there is no more than a single path connecting any two nodes.
Some basic terminology for trees are:
- Root node: A node with no parent nodes
- Parent nodes: A node connected to one or more child nodes
- Child node: A node that is connected to an upper node (parent node)
- Sibling node: Nodes that share the same parent node
- Leaf node: A node with no child nodes
-
Level: Each step of the tree, beginning from level
0
at the root node and increasing incrementally from top to bottom - Subtree: A portion of the tree that forms a complete tree on its own
This image illustrates the relationships in a tree.
7. Binary trees
Binary trees are one type of tree data structure. You have a binary tree whenever you have a tree wherein each node has no more than two children. Each parent node's children are referred to as the left child and right child.
Binary trees consist of the following types:
-
Complete binary tree: A binary tree in which all levels of the tree are filled, with the exception of the last level, which can be filled from left to right (refer to illustration)
- Heaps are important data structures that are complete binary trees with nodes following the Heap Order Property
Full binary tree: A binary tree for which each node has zero or two children
Perfect binary tree: A binary tree that is both full and complete (i.e. all interior nodes have two children and all leaves are at the same level)
8. Binary search trees
Binary search trees allow binary search algorithms to quickly lookup, add and remove data elements. Binary search is a search algorithm that finds the position of a target value within a sorted array. The algorithm does this by comparing the target value to the middle element of the array.
A binary tree is a binary search tree if it satisfies the following conditions:
- Values of all nodes in the left sub-tree are equal to or less than the value of the current node
- Values of all nodes in the right sub-tree are equal to or greater than the value of the current node
- The above conditions apply to both left and right sub-trees
This illustration represents a binary search tree.
Binary search trees are often used in search applications where data is continuously entering and leaving. Operations such as iterative and recursive searching and traversal can be applied to binary search trees. Binary search trees will likely be involved in solutions to C++ coding interview questions.
9. Tries
A trie is another type of search tree. Tries are efficient data structures for solving programming problems involving strings. They're also known as a prefix tree because they're based on the prefix of a string.
The term "trie" is derived from the word "retrieve" because tries are great for retrieving data.
Tries consist of a combination of nodes, wherein each node represents a unique letter of the alphabet. A trie consists of one empty root node which links to other nodes. A trie can't have duplicate nodes.
Each node in a trie contains:
- A value (can be
null
) - An array of references to child nodes (can also be
null
)
Tries were introduced to help efficiently represent sets of words. They are frequently used for autocomplete and search functionalities, as well as for IP routing.
Wrapping up and next steps
Demonstrating your ability to work with data structures will be essential to acing your coding interview. During your C++ interview preparation, be sure you have a strong understanding of how to leverage the data structures in this article.
To help you master C++ data structures, check out our learning path: Ace the C++ Coding Interview. This interactive learning path consists of several modules and quizzes to help programmers like you walk into your interview with confidence.
Continue learning about data structures and C++
- How to pass an array to a function in C++
- Algorithms 101: How to use graph algorithms
- Data Structures 101: How to build min and max heaps
Start a discussion
What data structures are you studying for your coding interviews? Is there anything we missed that you'd like to learn about? Let us know in the comments below!
Top comments (0)