Hello! Both core Java and Vavr provide streams, a very handy tool that, together with the previously covered Optional/Option and Try, enables a functional style in your applications. As usual, we will start with vanilla Java: first we describe what a stream is and how to build pipelines. After that we will dive deeper into Vavr Stream and check how it differs from what Java gives us out of the box.
What is a stream in Java?
Streams were introduced in Java 8 and were updated in later releases. The documentation describes a stream as a sequence of elements supporting sequential and parallel aggregate operations. Don't be confused by the word "stream": Java had InputStream and OutputStream long before version 8, but those I/O classes have nothing in common with the hero of this post. The Java Stream API borrows its ideas from functional languages and is often described as monad-like: a computation is defined as a sequence of steps.
Let's have a look at a simple case written in the traditional manner:
List<String> names = Arrays.asList("Anna", "Bob", "Carolina", "Denis", "Anna", "Jack", "Marketa", "Simon", "Anna");
for (String name : names) {
    if (name.equalsIgnoreCase("Anna")) {
        System.out.println(name);
    }
}
What we do here is find all Annas in the list and print them. It is a simple operation, but it still requires a lot of code for such a trivial task! Take another code snippet:
List<String> names; // same names as before
names.stream().filter(name->name.equalsIgnoreCase("Anna")).forEach(System.out::println);
Same task, but now it takes only one line of code. What did we do here? We built a pipeline:
- Find all names equal to Anna
- Print each of them
This pipeline consists of an intermediate operation (filter()) and a terminal operation (forEach()), both of which we will look at later in this post.
Create streams
A stream is a programming abstraction: it is not a collection, although we often create one from a collection. Developers who are new to functional Java tend to mix these concepts up, so it is important to distinguish them. In the example above we created a stream from a List. There are several ways to initialize streams:
From collections
This is the easiest and most obvious one. Java's Collection interface has a built-in stream() method that returns a sequential Stream with the collection as its source. Take a look at the code snippet below:
List<Person> people;
Stream<Person> stream = people.stream();
// do something with stream...
Generating streams
If you don't have a collection of predefined data, you can generate the data for a stream. This may be useful for experimenting with the Stream API. We need to provide a Supplier that produces the elements; the generate method returns an infinite sequential unordered stream, so it is usually combined with limit. Here is an example:
Stream<Double> numbers = Stream.generate(Math::random);
In this case we generate a stream of random Double values (the primitive counterpart is DoubleStream.generate). IntStream and LongStream also provide range methods that we can use to generate a stream. Take a look at the code snippet below:
IntStream integers = IntStream.range(1,20);
integers.forEach(System.out::println);
LongStream longs = LongStream.rangeClosed(1,20);
longs.forEach(System.out::println);
In both cases we pass the bounds 1 and 20, but the outputs differ, because the two methods treat the upper bound differently: rangeClosed returns a range that includes both bounds, while range excludes the upper one.
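To make the difference concrete, here is a minimal sketch that collects both ranges into lists. The class and method names (RangeExample, halfOpen, closed) are made up for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class RangeExample {

    // range: the upper bound is excluded
    static List<Integer> halfOpen(int from, int to) {
        return IntStream.range(from, to).boxed().collect(Collectors.toList());
    }

    // rangeClosed: the upper bound is included
    static List<Integer> closed(int from, int to) {
        return IntStream.rangeClosed(from, to).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(halfOpen(1, 5)); // [1, 2, 3, 4]
        System.out.println(closed(1, 5));   // [1, 2, 3, 4, 5]
    }
}
```

With the bounds 1 and 20 from the snippet above, range yields 19 elements and rangeClosed yields 20.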
ofNullable
Another static method for creating streams is ofNullable. It returns a stream containing a single element, or an empty stream if that element is null. NB: this method was introduced in Java 9.
Here is the code:
Person anna = null;
Stream<Person> personStream = Stream.ofNullable(anna);
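A small sketch of both outcomes, assuming Java 9 or newer. It uses String instead of Person to stay self-contained; OfNullableExample and countElements are names invented for this example.

```java
import java.util.stream.Stream;

class OfNullableExample {

    // Counts the elements of the stream produced by ofNullable:
    // 0 for null, 1 for any non-null value.
    static long countElements(String value) {
        return Stream.ofNullable(value).count();
    }

    public static void main(String[] args) {
        System.out.println(countElements(null));   // 0
        System.out.println(countElements("anna")); // 1
    }
}
```

This is handy when a value may be missing, because the rest of the pipeline does not need a separate null check.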
of
Another method worth a look is of. There are two overloaded versions of this method:
- of (T element)
- of (T... elements)
In the first case, it returns a sequential Stream containing a single element. In the second, it returns a sequential ordered stream whose elements are the specified values. NB: the second version takes varargs. This code snippet illustrates both:
Stream<Car> cars = Stream.of(new Car("tesla"), new Car("skoda"), new Car("toyota"), new Car("mazda"));
cars.forEach(System.out::println);
Stream<Car> skoda = Stream.of(new Car("skoda"));
skoda.forEach(System.out::println);
iterate
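iterate takes two parameters: an initial value (seed) and a UnaryOperator that produces the next element. The method starts with the seed and repeatedly applies the given function to the previous element, which results in an infinite stream. This two-argument form has been available since Java 8; what was added in Java 9, together with ofNullable, is a three-argument overload that also takes a hasNext predicate. Here is an example: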
Stream.iterate(0, i -> i + 2);
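The stream above never ends on its own, so in practice it has to be bounded. Below is a minimal sketch of two ways to do that, assuming Java 9 or newer for the three-argument overload; the names IterateExample, firstEvens and evensBelow are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateExample {

    // Two-argument iterate is infinite, so we cut it off with limit.
    static List<Integer> firstEvens(int n) {
        return Stream.iterate(0, i -> i + 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // Three-argument iterate (Java 9+) stops as soon as the predicate fails.
    static List<Integer> evensBelow(int bound) {
        return Stream.iterate(0, i -> i < bound, i -> i + 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(5));  // [0, 2, 4, 6, 8]
        System.out.println(evensBelow(10)); // [0, 2, 4, 6, 8]
    }
}
```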
Empty stream
Finally, we can always create an empty stream. NB: we mentioned that ofNullable can return an empty stream, but there is also a way to get one explicitly. The empty method returns an empty sequential Stream:
Stream<Double> empty = Stream.empty();
What about Builder?
We explored the static methods used to create streams. Besides them, there is another way: use a Builder. Stream.Builder allows creating a stream by generating elements individually and adding them to the builder, without a temporary collection or buffer. Let's have a look at it:
// 1. create builder
Stream.Builder<String> builder = Stream.builder();
// 2. create stream
Stream<String> names = builder.add("anna").add("bob")
        .add("carolina").add("david")
        .build();
A builder is another approach to building streams. We initialize a Stream.Builder instance and populate it with values using the add method. Finally, we turn the builder into a Stream by calling build.
Assemble a pipeline
We took a broad introduction to stream creation and covered the key ways to do it. Now that we have a stream instance, we can assemble a pipeline in order to do something useful with it. From a technical point of view, a pipeline consists of a source (a Collection, a generator function and so on), followed by zero or more intermediate operations and a terminal operation. The graph below represents the concept of a pipeline:
In this section we briefly explore the roles of intermediate and terminal operations and look at the most notable of them.
Intermediate operations
Intermediate operations return a new stream and are lazy. Laziness means that the actual computation on the source data is performed only after the terminal operation is invoked, and source elements are consumed only as needed. We can chain multiple intermediate operations, as each returns a new Stream object. Take a look at the graph below:
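Laziness can also be observed in code. The sketch below records every step in a log: nothing is filtered while the pipeline is being built, and the elements then flow through the pipeline one at a time once forEach starts. LazinessExample and its log list are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazinessExample {

    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline does not call the predicate yet.
        Stream<String> pipeline = Stream.of("anna", "bob", "carolina")
                .filter(name -> {
                    log.add("filter " + name);
                    return name.length() > 3;
                });
        log.add("pipeline built");

        // The terminal operation triggers the actual work.
        pipeline.forEach(name -> log.add("forEach " + name));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        // pipeline built
        // filter anna
        // forEach anna
        // filter bob
        // filter carolina
        // forEach carolina
    }
}
```

Note that "pipeline built" comes first, even though filter appears earlier in the source code.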
Let's now have a quick look at the most commonly used intermediate operations.
Filter
At the beginning of the post we already used this operation to filter a collection and find matching names. In a nutshell, it returns a new stream consisting of the elements that match a given condition. The method accepts a Predicate that specifies the condition.
names.stream().filter(name -> name.equalsIgnoreCase("Joe"));
In this code we use filter to keep only the names that match Joe. As a result, we obtain a new stream that contains only Joes.
Map
There are several map operations, so I decided to group them under one heading. Let's start with the general map method. It returns a new stream consisting of the results of applying the mapper function to the elements of the stream. Here is some example code:
Stream.of("anna", "benjamin", "carol", "david", "eliska", "frank")
        .map(String::toUpperCase)
        .forEach(System.out::println);
Here the source data is again a list of names. We apply a mapping function to transform the names into UPPERCASE STRINGS. In all cases, the mapper is a Function that accepts one argument and produces a result. There are other, more specific mapping operations:
- mapToInt = produces an IntStream consisting of the results of applying the given mapper function
- mapToDouble = produces a DoubleStream consisting of the results of applying the given mapper function
- mapToLong = produces a LongStream consisting of the results of applying the given mapper function
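As a small illustration of why the primitive variants are useful, the sketch below uses mapToInt to get an IntStream, which offers numeric operations such as sum directly. MapToIntExample and totalLength are hypothetical names.

```java
import java.util.stream.Stream;

class MapToIntExample {

    // Maps each name to its length and sums the lengths.
    static int totalLength(String... names) {
        return Stream.of(names)
                .mapToInt(String::length) // IntStream of lengths
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalLength("anna", "benjamin", "carol")); // 17
    }
}
```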
Distinct
Another notable intermediate operation in the Java Stream API is distinct. It produces a stream of distinct (unique) elements. Technically, distinct relies on the equals method of the elements to detect duplicates. For ordered streams, the selection of distinct elements is stable, while for unordered ones Java provides no stability guarantees.
List<Integer> numbers = Arrays.asList(1, 1, 2, 3, 3, 4, 5, 5);
numbers.stream().distinct().forEach(System.out::println);
That is how the method works with numbers. For your custom entities, as mentioned, you have to override equals and hashCode so that distinct can recognize duplicates. I advise you to read up on overriding hashCode and equals before you do this.
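Here is a minimal sketch of distinct with a custom entity. The Car class below is a simplified stand-in written for this example (the article does not show its own Car), and two cars are assumed to be equal when their brands are equal.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DistinctExample {

    // Hypothetical entity: equality is based on the brand only.
    static final class Car {
        private final String brand;

        Car(String brand) {
            this.brand = brand;
        }

        String getBrand() {
            return brand;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Car)) return false;
            return brand.equals(((Car) o).brand);
        }

        @Override
        public int hashCode() {
            return Objects.hash(brand);
        }
    }

    // Builds cars from brand names, drops duplicates, returns the brands.
    static List<String> distinctBrands(String... brands) {
        return Stream.of(brands)
                .map(Car::new)
                .distinct()
                .map(Car::getBrand)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctBrands("tesla", "skoda", "tesla", "mazda", "skoda"));
        // [tesla, skoda, mazda]
    }
}
```

Without the equals and hashCode overrides, every Car instance would count as unique and all five elements would pass through.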
Sort
Sorting is another important task we perform with streams. The sorted method is an intermediate operation that returns a stream consisting of the elements of this stream, sorted according to natural order. Take a look at the code snippet below:
List<Integer> numbers = Arrays.asList(-9, -18, 0, 25, 4);
numbers.stream().sorted().forEach(System.out::println);
Again, this is how sorting works with numbers. Custom entities need to implement Comparable; otherwise a ClassCastException may be thrown when the terminal operation executes. If your class does not implement this interface, you can use the overloaded version of sorted that accepts a Comparator as an argument:
Stream.of("barbora", "daria", "cristopher", "adam", "fritz")
        .sorted((s1, s2) -> s1.compareTo(s2))
        .forEach(System.out::println);
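For a custom class that does not implement Comparable, a Comparator can be built from a key extractor. The sketch below assumes a hypothetical Person class with a name and an age; SortedExample and namesByAge are also made-up names.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedExample {

    // Hypothetical entity that deliberately does not implement Comparable.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }

        int getAge() {
            return age;
        }
    }

    // Sorts people by age (ascending) and returns their names.
    static List<String> namesByAge(List<Person> people) {
        return people.stream()
                .sorted(Comparator.comparingInt(Person::getAge))
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("anna", 30),
                new Person("bob", 25),
                new Person("carol", 41));
        System.out.println(namesByAge(people)); // [bob, anna, carol]
    }
}
```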
While
Two more methods were added in Java 9: dropWhile and takeWhile. Both are intermediate operations that accept a predicate.
- dropWhile = produces a stream consisting of the remaining elements of this stream after dropping the longest prefix of elements that match the given predicate.
- takeWhile = produces a stream consisting of the longest prefix of elements taken from this stream that match the given predicate.
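NB: both are meant for ordered streams. On an unordered stream (for example one created from a Set) the result is nondeterministic when only some of the elements match the predicate, so the example uses a List.
Take a look at the example code snippet below:
List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8);
numbers.stream()
        .takeWhile(x -> x < 5)
        .forEach(System.out::println);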
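The sketch below puts takeWhile and dropWhile side by side, assuming Java 9 or newer and using a List so that the encounter order is defined. The trailing 1 in the sample data shows how they differ from filter: both look only at the prefix of the stream. WhileExample and its method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class WhileExample {

    // Keeps the longest prefix of elements smaller than 5.
    static List<Integer> takeSmall(List<Integer> numbers) {
        return numbers.stream()
                .takeWhile(x -> x < 5)
                .collect(Collectors.toList());
    }

    // Drops the longest prefix of elements smaller than 5, keeps the rest.
    static List<Integer> dropSmall(List<Integer> numbers) {
        return numbers.stream()
                .dropWhile(x -> x < 5)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 1);
        System.out.println(takeSmall(numbers)); // [1, 2, 3, 4]
        System.out.println(dropSmall(numbers)); // [5, 6, 7, 8, 1]
    }
}
```

A filter with the same predicate would also keep the last 1, because filter tests every element independently.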
Limit
The last intermediate operation we will look at in this post is limit. It returns a stream consisting of the elements of this stream, truncated to be no longer than the specified length. The method accepts one argument: a long value representing the maximum length.
List<Integer> numbers = Arrays.asList(-9, -18, 0, 12, -5, 92, 13, 50, -75, 25, 4);
numbers.stream().sorted().limit(5).forEach(System.out::println);
Terminal operations
The other group of operations is called terminal operations. In contrast to intermediate operations, only one terminal operation can be executed on a stream, because once it has been performed, the stream pipeline is considered consumed and can no longer be used. Terminal operations produce a result or a side effect, not a stream:
There are several notable terminal operations that we will explore here.
For each
We used this operation in most of the examples above. The method accepts a Consumer that defines an action to perform on each element of the stream. Remember that at the beginning of the post we compared two ways of doing this task:
List<String> names = Arrays.asList("Anna", "Bob", "Carolina", "Denis", "Anna", "Jack", "Marketa", "Simon", "Anna");
for (String name : names) {
    if (name.equalsIgnoreCase("Anna")) {
        System.out.println(name);
    }
}
names.stream().filter(name->name.equalsIgnoreCase("Anna")).forEach(System.out::println);
We also used a method reference here to make the code shorter and more readable. Written out in full, it looks like this:
stream.forEach(name->System.out.println(name));
NB that for parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.
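When the encounter order does matter in a parallel pipeline, the Stream API offers forEachOrdered as an alternative to forEach. The sketch below is only an illustration of that trade-off; ForEachOrderedExample is a made-up name, and a thread-safe list is used because the action writes to shared state.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ForEachOrderedExample {

    // Processes a parallel stream, but visits elements in encounter order.
    static List<Integer> visitInEncounterOrder(List<Integer> source) {
        List<Integer> seen = new CopyOnWriteArrayList<>(); // thread-safe shared state
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visitInEncounterOrder(List.of(1, 2, 3, 4, 5, 6, 7, 8)));
        // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

Replacing forEachOrdered with forEach here may produce the elements in a different order on each run.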
Collect
The previous terminal operation returns nothing: it consumes data but does not give anything back. Often, however, we need to perform some stream operations on a collection and then get the changed collection back. In these situations we use the collect method. It performs a mutable reduction operation on the elements of the stream using a Collector.
There are two overloaded versions of the collect method: one accepts a Collector, while the other accepts a supplier, an accumulator and a combiner function. Let's have a detailed look at the first one:
Stream<Integer> numbers = Stream.of(1, 2, 3, 4, 5);
List<Integer> result = numbers.collect(Collectors.toList());
In this code snippet we use a built-in Collectors method to collect the stream into a list. There are other useful collectors that Java provides out of the box:
- Collectors.toMap
- Collectors.toSet
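A short sketch of these two collectors follows. CollectExample and its method names are invented for the illustration. Note that the two-argument toMap throws an IllegalStateException on duplicate keys, which is why the example removes duplicates first.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class CollectExample {

    // toSet: duplicates disappear, the order is not guaranteed.
    static Set<String> uniqueNames(List<String> names) {
        return names.stream().collect(Collectors.toSet());
    }

    // toMap: name -> length; distinct() avoids duplicate keys.
    static Map<String, Integer> lengthsByName(List<String> names) {
        return names.stream()
                .distinct()
                .collect(Collectors.toMap(Function.identity(), String::length));
    }

    public static void main(String[] args) {
        List<String> names = List.of("anna", "bob", "anna");
        System.out.println(uniqueNames(names));   // contains anna and bob, in no particular order
        System.out.println(lengthsByName(names)); // maps anna to 4 and bob to 3
    }
}
```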
Find
Finally, there are operations that return an Optional. I group them together, although they are separate methods. Let's list them first:
- findAny
- findFirst
Neither of them takes any arguments, so you may ask a very reasonable question: how do they actually find data? These methods are typically combined with filter, which we described earlier. Take a look at the example:
List<String> names = Arrays.asList("anna", "barbora", "andrew", "benjamin", "carol");
Optional<String> anna = names.stream().filter(name -> name.equalsIgnoreCase("anna")).findFirst();
if (anna.isPresent()) {
    System.out.println("Anna is here!");
} else {
    System.out.println("No Anna there");
}
Here we use findFirst in combination with filter to find a matching result. However, this is a rather artificial example: usually we don't look for an exact value, but filter by some pattern:
// names list
names.stream().filter(name -> name.startsWith("a")).findAny().ifPresent(System.out::println);
In both cases we find a match. What is the difference between these two methods? As the names imply, findFirst returns the first matching element in encounter order; in our list that is anna. findAny returns any matching element, which may or may not be the first one (here it could be anna or andrew): the behavior of this operation is explicitly nondeterministic, and it is free to select any element in the stream.
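The difference can be sketched as follows. FindExample and its methods are hypothetical names; the findAny variant runs on a parallel stream, where the freedom to pick any match is most visible, so the example makes no claim about which element it returns.

```java
import java.util.List;
import java.util.Optional;

class FindExample {

    // Deterministic: always the first match in encounter order.
    static Optional<String> firstStartingWith(List<String> names, String prefix) {
        return names.stream()
                .filter(name -> name.startsWith(prefix))
                .findFirst();
    }

    // Nondeterministic: any match may be returned.
    static Optional<String> anyStartingWith(List<String> names, String prefix) {
        return names.parallelStream()
                .filter(name -> name.startsWith(prefix))
                .findAny();
    }

    public static void main(String[] args) {
        List<String> names = List.of("anna", "barbora", "andrew", "benjamin", "carol");
        System.out.println(firstStartingWith(names, "b"));             // Optional[barbora]
        System.out.println(firstStartingWith(names, "z").isPresent()); // false
        System.out.println(anyStartingWith(names, "a"));               // anna or andrew
    }
}
```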
We did a comprehensive review of Java Stream and described the most notable methods every developer should know (this is not a complete list, so feel free to explore the Javadoc). Streams are a concept borrowed from functional programming languages, and they work best together with other functional tools in Java, such as Optional. Many developers, however, find them not powerful enough. As an alternative to the built-in Java tools, we can use the Vavr library. In previous parts of this trilogy, I already described the Option and Try classes.
Conclusion
This post explored the subject of streams, a concept that came from functional programming languages. Technically, a Java stream is a sequence of elements supporting sequential and parallel aggregate operations. Java has included the Stream API since JDK 8 and has improved it in subsequent releases. Java streams are not the only ones, though: the Vavr library, which offers functional additions to Java, also supplies its own streams that are a bit different from vanilla Java. In this post we covered the core tasks connected to Java streams: creating them and building pipelines. Have a nice day!
Top comments (4)
Nice, similar to observables / RXJS in JS/typescript by the looks.
That's because RX came first (actually a MS Research project, early 2010s), and Java 8 followed (2014). More than that, lazy list operations (that are semantically very similar to streams) came even earlier in various lisp derivatives.
Great write up, reminds me on LINQ in C#.
Nice one. It reminds me of Reactive Extension in .NET.