loading...
Cover image for STL Algorithms: Counting and Finding

STL Algorithms: Counting and Finding

olask profile image Ola Sk Updated on ・9 min read

Foreword

(TL;DR)
STL isn't a very big library compared to many libraries available in other languages. But it's very robust, well tested and documented and it is included with every copy of a compiler. It handles most edge cases you can think of for a particular application of functions there included. So there are definitely good reasons for which it's worth to use it. Plus there's an undergoing work by C++ Standard Committee, on developing and maintaining the library. Moreover, your code just gets beautiful when you use them.
For a person who's at least a bit familiar with what is there in an <algorithm> header, the code is so much shorter and more readable. Code should NOT be the place for rolling out your amazing loops and handling edge cases, thinking that you got everything right. Did you really? There's just a mental burden for anyone else reading your code to read your loop, examine conditions, think of whether for sure it's written well—while in the meantime in the STL lies am amazing function written by experts that handles everything that you've handled and more, sad that you don't want to use it.

All code examples from the article in the github gist
for an overview of all code snippets from the article, see the gist.

Here are some things to remember while reading this article. I included them at the beginning so that the article is to the point and easier to read. For that, you should understand what is a predicate and why I don't include two first parameters in all the descriptions of functions—what they are? You'll know soon.

You should also be familiar with

  • the concept of iterators to collections,

    They are much, kind of like pointers: they can be dereferenced to access the value, incremented and decremented to iterate through elements of a container, they can be passed to other functions that take iterator as its parameter. Each container type in STL has a specific iterator designed to iterate through its elements.

  • the fact that the header is completely coupled with iterators,

  • the fact that you should fell in love with iterators and algorithms from the header,

  • you need to #include the <algorithm> header in order to use them,

  • C++ really sucks at giving you hints about the functions you could use on a particular collection since all of the functions from the algorithm header are free functions. They're not member functions of some collections. Some collections from STL implement a few of some useful algorithms as their member functions—you'll get autocomplete suggestions on your IDE for them, but do not expect much.

What is a predicate

Condition the element must meet (usually a lambda)

A predicate is an expression that evaluates to true or false and serves as an internal switch/condition for STL algorithms to determine, basing on its value, whether the given element should or should not be considered as a subject for counting or as a result of a search.

Same 1st and 2nd parameter for all counting/searching functions

All of the counting and finding functions take a range to count/search over:

  • 1st param: iterator to the beginning of the range.
  • 2nd param: iterator to the end of the range.

There may be one or two more parameters specific for each function.

A collection for examples

In the article I use this vector for all examples:

std::vector<int> v{ 2, 5, 7, 1, 0, 6, 6, 2, -2, 4, 4, 7, 9, 0, 5}; 
//6 odd values

Counting

Count elements in the collection that have a particular value or meet a particular condition.

Functions

count()

2nd+ parameters

  • 3rd: value to which compare elements of a sequence in a range determined by the two iterators (1st and 2nd parameter to count function) in order to count them.

Return

The number of elements of a given value.

//count how many entries have the target value (2)
int const target = 2;
int count_twos = count(begin(v), end(v), target); 
//alternative:
//int twos = count(v.begin(), v.end(), target); 

count_if()

2nd+ parameters

  • 3rd: predicate—condition the element should meet.

Return

Number of elements satisfying the condition.

//count how many entries are odd
int count_odds = count_if(begin(v), end(v), [](auto elem) { return elem % 2 != 0; }); 

//count how many long months are there
std::map<int, int> month_lengths{{1, 31}, {2, 28}, {3, 31}, {4, 30}, {5, 31}, {6, 30}, {7, 31}, {8, 31}, {9, 30}, {10, 31}, {11, 30}, {12, 31}}; 
int long_months = count_if(begin(month_lengths), end(month_lengths), [](auto elem) { /*elem is a pair*/ return elem.second == 31; }); 

all_of; none_of; any_of, and naming conventions of STL

Once you face a problem, you may find yourself splitting it into "manageable pieces". But you don't have to always do it this way. Different questions, in order to get answered, require you to formulate them well. There are many ready-made STL algorithms that are probable to do the task and most of them tend to follow a naming convention you should get comfortable with after some time.

Q: Are all of these elements...? [Null; odd; high priority etc.]
Q: Are any of these elements...?
Q: Are none of these elements...?

One way (too long) to answer these questions is to count and do some comparisons with your count:

bool all_of_them_are_odd = (count_odds == v.size()); 
bool none_of_them_is_odd = (count_odds == 0); 
bool any_of_them_are_odd = (count_odds > 0); 

There's a better, STL way though:

bool allof = all_of(begin(v), end(v), [](auto elem) { return elem % 2 != 0; }); 
bool anyof = any_of(begin(v), end(v), [](auto elem) { return elem % 2 != 0; }); 
bool noneof = none_of(begin(v), end(v), [](auto elem) { return elem % 2 != 0; }); 

Finding

Find elements in the collection that have a particular value or meet a particular condition.

Functions

Chain calls to searching functions

As most of the searching functions take a pair of iterators and return an iterator to the first occurrence of an element with a given value or condition met, you can perform chain calls inside a loop, reusing the return iterators (using return iterators from the previous searches as arguments for next searches), in a loop, comparing current iterator to the end() iterator.

In most of the following functions, you can pass in either a value or a lambda (predicate), so you call search_n(...) function to find a number of consecutive elements that are all returning true from some condition like e.g. being over a €1000, etc. or a consecutive elements that give the same value from the evaluation of some expression based on the value.

find()

2nd+ parameters

  • 3rd: value to search for.

Return

An iterator to an element in a collection (first occurrence) that has a value passed in as a third argument.
If the element isn't found, the end() iterator of a collection is returned.

// find the first zero in the collection
std::vector<int>::iterator result = find(begin(v), end(v), 0); 
int we_looked_for;
if (result != end(v))
{
    we_looked_for = *result; // result is a pointer
}

// find the first 2 after that 0
result = find(result, end(v), 2); 
// result as a "starting from" iterator
if (result != end(v))
{
    we_looked_for = *result;
}

//find the first 'a'
std::string s{ "So, find this a!" };
auto letter = find(begin(s), end(s), 'a');
if (letter != end(s))
{
    char a = *letter; // result is a pointer
}

find_if()

2nd+ parameters

  • 3rd: predicate; a condition that the value of an element shall meet.

Return

Iterator to an element in a collection that meets the condition (first occurrence).
If the element isn't found, the end() iterator of a collection is returned.

// find the first occurence of an odd value (in a vector)
result = find_if(begin(v), end(v), [](auto elem) { return elem % 2 != 0; });
if (result != end(v))
{
    we_looked_for = *result;
}

find_if_not()

Works exactly like find_if, except it finds the first element that doesn't return the true from the predicate.

It's trivially easy to change an equals to a non-equals, or a less than to greater than or equal to. Some conditions though can be harder to negate easily, or you may go after, the code being more readable if you put explicit find_if_not in there.

2nd+ parameters

  • 3rd: predicate; condition that the value of an element shall not meet.

Return

Iterator to an element in a collection that doesn't meet the condition (first occurrence).
If the element isn't found, the end() iterator of a collection is returned.

// find first even value
result = find_if_not(begin(v), end(v), [](auto elem) { return elem % 2 != 0; });
if (result != end(v))
{
    we_looked_for = *result;
}

find_first_of()

Lets you pass in two additional arguments–iterators to a collection and whichever element of that collection is found first—that element's iterator is returned.

2nd+ parameters

  • 3rd: iterator to the beginning of the sub-range to search for.
  • 4th: iterator to the end of the sub-range.

Return

Iterator to the first occurrence of an element, the value of which equals to any of the values from the collection of values to search for.

// here as third and fourth arguments we pass the iterators
// to the beginning and an end of a collection of values, 
// one of which we wanna find.
std::vector<int> primes = { 1, 2, 3, 5, 7, 11, 13, 17 };
result = find_first_of(begin(v), end(v), begin(primes), end(primes));
if (result != end(v))
{
    we_looked_for = *result;
}

search()

Rather than for a single value—it looks for a sequence. So in a collection of integers it may look for 1 followed by 2.
In a collection of characters, it may look for the word "is".

2nd+ parameters

  • 3rd: iterator to the beginning of the sub-range to search for.
  • 4th: iterator to the end of the sub-range.

Return

Iterator to the first element of the first occurrence of the subsequence.

std::vector<int> subsequence = { 2, 4 };
result = search(begin(v), end(v), begin(subsequence), end(subsequence));
if (result != end(v))
{
    we_looked_for = *result;
}

find_end()

It's exactly like search, with the exception that it gives you back the iterator to the first element of the last occurrence in a collection we look for that subsequence in.

2nd+ parameters

Same as search() above

Return

Iterator to the first element of the last occurrence of the subsequence.

//std::vector<int> subsequence = { 2, 4 };
result = search(begin(v), end(v), begin(subsequence), end(subsequence));
if (result != end(v))
{
    we_looked_for = *result;
}

search_n()

Finds consequtive instances. So if you don't just wanna find a zero—you wanna find 4 0s in a row, you use search_n().

2nd+ parameters

  • 3rd: number of consecutive matches we want.
  • 4th: the value that we want to have matched.
// We're looking for two fours in a row (inside of v vector)
// The resulting iterator points to the first element (earlier in terms of an index) of the sequence that we looked for.

result = search_n(begin(v), end(v), 2, 4); 
if (result != end(v))
{
    we_looked_for = *result;
}

adjacent_find()

Finds two consecutive elements that have the same value—whether they are two int values for a collection of integers or two same customers for a collection of customers.
They have to be consecutive, so it's not just that there are two elements with the same value anywhere in the collection, but they have to be right next to each other.

Parameters

notice that it doesn't take any parameters besides two iterators that specify the range to look in.

result = adjacent_find(begin(v), end(v)); 
if (result != end(v))
{
    //there are two sixes in a row in v, 
    //so the value expected is 6/
    we_looked_for = *result;
}

Summary

A well named (they're usually well named with some exceptions) function says more than a comment.
Algorithms work with any collection or part of such a collection that provides the right iterators.
The pattern used in code samples—to prefer end(v) over v.end() is because it works on more kinds of collections and it makes no difference for a common, regarded vector.

Side note

Have you set-up your C++ IDE already? You may want to take a look at how to perform a quick and easy set-up of Visual Studio Code on Linux.

Posted on Dec 9 '19 by:

olask profile

Ola Sk

@olask

I program a bit in Python and C++. Love machine learning and currently getting more knowledge of statistics.

Discussion

markdown guide
 
[deleted]
 

It's a good point. As for code it highly depends whether or not auto makes it more readable, sometimes it's enough for you to know that it's an iterator and you can use it like so, ie. for a further search in this case or dereferencing to extract the value.
As the article aims at rather explaing, this will make the explanation clearer, so I'm gonna update the article soon. Thanks.

 

Update:
auto iterator result; changed to
std::vector<int>::iterator result; to make it easier to deduce the type right away.