How to Use Java Streams Explained
Making streams, intermediate operations, and terminal operations work in Java
Streams are Java’s way of integrating functional programming with its object-oriented style. They offer several benefits: you can write functions at a higher level of abstraction, which can reduce bugs; you can compact functions into fewer, more readable lines of code; and you can parallelize work more easily. Java streams are fairly well known, but not everyone knows how to take full advantage of them, including the finer points of creating streams, intermediate operations, and terminal operations. In this post we’re going to use a simple example to explain how Java streams work and how they can lead to less verbose, more intuitive, and less error-prone code.
An Example of How Java Streams Work
The simplest way to think of Java streams, and the way that helps me the most, is imagining a list of objects that are disconnected from each other, entering a pipeline one at a time. You can control how many of these objects enter the pipeline, what you do to these objects inside of the pipeline, and how you catch these objects as they exit the pipeline.
Streams are much easier to learn by looking at the big picture first and then breaking it down. To do so, let’s make a program that takes a certain number of multiples of four, squares each of them, then takes the sum of all of the squares that are divisible by ten:
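A minimal sketch of such a program is shown here. The class and method names (StreamSumExample, sumOfSquaresDivisibleByTen) are placeholders chosen for illustration; the pipeline follows the steps discussed in the rest of this post, with the filter keeping the squares for which % 10 == 0.

```java
import java.util.stream.Stream;

class StreamSumExample {

    // Sums the squares of the first numberOfMultiples multiples of four,
    // keeping only the squares that are divisible by ten.
    static int sumOfSquaresDivisibleByTen(int numberOfMultiples) {
        return Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)
                .limit(numberOfMultiples)
                .map(multipleOfFour -> multipleOfFour * multipleOfFour)
                .filter(multipleOfFourSquared -> multipleOfFourSquared % 10 == 0)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        // First five multiples: 4, 8, 12, 16, 20. Only 20 * 20 = 400 is divisible by ten.
        System.out.println(sumOfSquaresDivisibleByTen(5)); // prints 400
    }
}
```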
Even if you don’t know how streams work in Java, there’s a very good chance you were able to quickly figure out what the code above is doing. Apart from the first line (Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)), the rest of the code almost reads like a set of instructions.
Let’s take this one piece at a time.
Lambda Expression
This is how lambda expressions are written in Java. Lambda expressions provide a more compact and simplified way to define a function. Essentially, you can imagine the multipleOfFour on the left is the input parameter, the -> is the arrow that separates the parameter from the body, and the multipleOfFour + 4 on the right is the return value. Read it as “Give me an input (multipleOfFour), and I will return to you that input + 4.” Note that if the body of a lambda expression needs more than a single expression, you need to enclose the right side (after the ->) in curly brackets “{}” and add a return statement.
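The two forms can be compared side by side in a small sketch. The class and constant names here are hypothetical; UnaryOperator is the standard functional interface from java.util.function.

```java
import java.util.function.UnaryOperator;

class LambdaExample {

    // Single-expression form: the value of the expression is returned implicitly.
    static final UnaryOperator<Integer> ADD_FOUR = multipleOfFour -> multipleOfFour + 4;

    // Block form: curly brackets and an explicit return statement.
    static final UnaryOperator<Integer> ADD_FOUR_BLOCK = multipleOfFour -> {
        int next = multipleOfFour + 4;
        return next;
    };

    public static void main(String[] args) {
        System.out.println(ADD_FOUR.apply(4));       // prints 8
        System.out.println(ADD_FOUR_BLOCK.apply(8)); // prints 12
    }
}
```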
Stream Creation
Stream.iterate() is a function that creates an infinite stream by taking a seed (starting point) and a UnaryOperator (for which you can pass in a Lambda expression). Starting at four, this particular iteration will return a stream of: 4, 8, 12, 16, 20, 24, 28, ...
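One way to see that the stream really is unbounded is to pull values from it by hand. This is only a sketch; the class and method names are made up, and it uses the stream’s iterator() so that no size limit is ever placed on the stream itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class IterateExample {

    // Pulls the first `count` values out of the infinite stream, one at a time.
    static List<Integer> firstMultiplesOfFour(int count) {
        Iterator<Integer> multiples =
                Stream.iterate(4, multipleOfFour -> multipleOfFour + 4).iterator();
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add(multiples.next()); // each call computes one more value on demand
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(firstMultiplesOfFour(7)); // prints [4, 8, 12, 16, 20, 24, 28]
    }
}
```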
Limit
This is the next part of the stream pipeline. It truncates the previous stream so that it contains at most the number of elements passed to limit (in this case, numberOfMultiples). Thus, our stream isn’t an infinite stream anymore; it now contains only the first numberOfMultiples multiples of four.
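As a small sketch of the effect of limit, the following collects the truncated stream into a list so it can be printed. The names are hypothetical, and the collect step is used here only to make the result visible.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitExample {

    // Returns the first numberOfMultiples multiples of four as a list.
    static List<Integer> multiplesOfFour(int numberOfMultiples) {
        return Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)
                .limit(numberOfMultiples) // the infinite stream becomes finite here
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(multiplesOfFour(3)); // prints [4, 8, 12]
    }
}
```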
Higher Order Functions
Higher order functions are functions that take other functions as arguments, return a function, or both. With Java’s functional interfaces, higher order functions are extremely useful because they allow the developer to work at a higher level of abstraction.
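To make the idea concrete, here is a sketch of a hand-written higher order function. The method twice is hypothetical; it relies on the standard Function interface and its andThen method.

```java
import java.util.function.Function;

class HigherOrderExample {

    // Takes a function as an argument and returns a new function that applies it twice.
    static <T> Function<T, T> twice(Function<T, T> f) {
        return f.andThen(f);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> addFour = n -> n + 4;
        Function<Integer, Integer> addEight = twice(addFour);
        System.out.println(addEight.apply(4)); // prints 12
    }
}
```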
Map
Map is a higher order function. It takes a lambda expression, applies it to each value in the stream it is acting on, and produces a new stream of the results. In this case, it is taking every multiple of four in our original stream and creating a new stream that contains all of those multiples of four squared. Although in this scenario we are going Stream<Integer> -> Stream<Integer>, you can use map to completely convert your stream to a different type of stream!
For example, if I ran:
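One possible conversion of this kind is sketched here. The label text and the class and method names are invented for illustration, and the result is collected into a list only so it can be printed.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapToStringExample {

    // Converts each multiple of four into a descriptive String.
    static List<String> describeMultiples(int numberOfMultiples) {
        return Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)
                .limit(numberOfMultiples)
                .map(multipleOfFour -> "Multiple of four: " + multipleOfFour) // Integer in, String out
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(describeMultiples(2)); // prints [Multiple of four: 4, Multiple of four: 8]
    }
}
```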
It would convert my Stream<Integer> -> Stream<String>. This is extremely powerful, and if you were to implement something like this in a loop-based, object-oriented style, the code would not be as direct or elegant. You would also need to create extra variables and objects with limited scope. Here’s the object-oriented version:
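As a sketch, a loop-based conversion of the first few multiples of four into strings might look like this. The label text and all names are made up for illustration; note the extra list, loop counter, and temporary variable the loop needs.

```java
import java.util.ArrayList;
import java.util.List;

class LoopToStringExample {

    // Builds the list of descriptions with an explicit loop and temporary variables.
    static List<String> describeMultiples(int numberOfMultiples) {
        List<String> descriptions = new ArrayList<>();
        for (int i = 1; i <= numberOfMultiples; i++) {
            int multipleOfFour = i * 4;
            descriptions.add("Multiple of four: " + multipleOfFour);
        }
        return descriptions;
    }

    public static void main(String[] args) {
        System.out.println(describeMultiples(2)); // prints [Multiple of four: 4, Multiple of four: 8]
    }
}
```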
Filter
Similar to map, filter is another higher order function. Filter takes a predicate, that is, a function (here a lambda expression) that returns true for some inputs and false for others. It filters the stream so that the only objects remaining are those for which the predicate returned true. The way I would read a filter like this is “Filter the multiples of four squared in my stream such that the only remaining objects are those where multiple of four squared % 10 == 0.”
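Here is a sketch of the filter step on its own, applied to a fixed list of squares. The names are hypothetical; Predicate is the standard functional interface that filter accepts.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class FilterExample {

    // Returns true only for values that are divisible by ten.
    static final Predicate<Integer> DIVISIBLE_BY_TEN = square -> square % 10 == 0;

    // Keeps only the elements for which the predicate returns true.
    static List<Integer> keepDivisibleByTen(List<Integer> squares) {
        return squares.stream()
                .filter(DIVISIBLE_BY_TEN)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(keepDivisibleByTen(Arrays.asList(16, 64, 144, 256, 400))); // prints [400]
    }
}
```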
Reduce
Up to this point, map and filter are what we call intermediate operations: they take a stream and return a new stream. Reduce is an example of a terminal operation, which, as its name suggests, terminates the stream and returns something else, whether it’s a collection or a value. This form of reduce takes two parameters: an identity (the starting value, which is also the result for an empty stream) and a BinaryOperator (often called the “accumulator”) that combines the running result with the next element until a single value remains. The way to understand this specific reduce function is “Starting with the number 0, sum all of the integers in the stream, and return that sum.”
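The reduce step can be sketched in isolation as follows. The names are hypothetical; the lambda spells out what Integer::sum does, with the first argument to reduce acting as the starting value and the second as the combining function.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReduceExample {

    // Starts from 0 and folds every element into the running sum.
    static int sum(List<Integer> numbers) {
        return numbers.stream()
                .reduce(0, (runningSum, next) -> runningSum + next);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(400, 1600)));          // prints 2000
        System.out.println(sum(Collections.<Integer>emptyList())); // prints 0, the starting value
    }
}
```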
Example of Creating a Java Stream
Making a stream directly from a collection:
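A minimal sketch, assuming a list of two-element int arrays as the collection. The sample values and names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StreamFromCollectionExample {

    // Builds a small list of {first, second} pairs to stream over.
    static List<int[]> samplePairs() {
        List<int[]> list = new ArrayList<>();
        list.add(new int[] {1, 30});
        list.add(new int[] {2, 10});
        list.add(new int[] {3, 20});
        return list;
    }

    public static void main(String[] args) {
        // Every Collection has a stream() method that creates a stream over its elements.
        Stream<int[]> stream = samplePairs().stream();
        System.out.println(stream.count()); // prints 3
    }
}
```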
Example of an Intermediate Operation in Java Streams
.sorted() : Sorts your stream with the given comparator:
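A sketch of such a sort, assuming a list of two-element int arrays as input. The names are hypothetical, and the collect call at the end is there only to turn the sorted stream back into a list.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedExample {

    // Sorts the pairs in ascending order of their second value.
    static List<int[]> sortBySecondValue(List<int[]> pairs) {
        return pairs.stream()
                .sorted(Comparator.comparingInt(i -> i[1]))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<int[]> pairs = Arrays.asList(new int[] {1, 30}, new int[] {2, 10}, new int[] {3, 20});
        for (int[] pair : sortBySecondValue(pairs)) {
            System.out.println(Arrays.toString(pair)); // [2, 10] then [3, 20] then [1, 30]
        }
    }
}
```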
Comparator.comparingInt() builds a comparator from a function that extracts an int from each item, and then orders items by that int. It works the same as if you did .sorted((i,j) -> Integer.compare(i[1], j[1])). Here it sorts the stream of two-element arrays by the second value in each array. As for collect, that will be covered below in the examples of terminal operations.
Two Examples of Terminal Operations in Java Streams
Terminal Operation Example 1
.collect() : Collect the items in your stream into a certain collection:
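The following sketch collects a stream of two-element int arrays into a list and, for comparison, collects the second values into a set. All names and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class CollectExample {

    // Collects the sorted stream into a List.
    static List<int[]> toSortedList(List<int[]> pairs) {
        return pairs.stream()
                .sorted(Comparator.comparingInt(i -> i[1]))
                .collect(Collectors.toList());
    }

    // Collects the second value of each pair into a Set, which drops duplicates.
    static Set<Integer> secondValues(List<int[]> pairs) {
        return pairs.stream()
                .map(i -> i[1])
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<int[]> pairs = Arrays.asList(new int[] {1, 30}, new int[] {2, 10}, new int[] {3, 10});
        System.out.println(toSortedList(pairs).size()); // prints 3
        System.out.println(secondValues(pairs).size()); // prints 2, because 10 appears twice
    }
}
```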
Building off of the previous example, collect() is another very important terminal operation in Java streams. Based on the collector you specify, the items in your stream will be collected into that kind of collection. The example above collects the items in the stream into a list. After this is run, list contains the arrays we added, sorted in ascending order by the second item in each array. You can also collect the items in the stream into a map, set, concurrent map, or any other collection by choosing a different collector or by specifying a collection factory.
Here’s one more example of using collect on the same list we had before, but this time we’re collecting that list of integer arrays into a map where the first item in each array is the key and the second is the value of each <key,value> pair!
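A sketch of that conversion using Collectors.toMap, with hypothetical names and sample values. One caveat worth knowing: this two-argument form of toMap throws an IllegalStateException if two arrays share the same first item.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectToMapExample {

    // Uses the first item of each pair as the key and the second as the value.
    // Assumes the first items are unique; duplicate keys would cause an exception.
    static Map<Integer, Integer> toMap(List<int[]> pairs) {
        return pairs.stream()
                .collect(Collectors.toMap(i -> i[0], i -> i[1]));
    }

    public static void main(String[] args) {
        List<int[]> pairs = Arrays.asList(new int[] {1, 30}, new int[] {2, 10}, new int[] {3, 20});
        Map<Integer, Integer> map = toMap(pairs);
        System.out.println(map.get(2)); // prints 10
    }
}
```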
Terminal Operation Example 2
.forEach() : Perform an action for each item in your stream
Yes, streams have a forEach() implementation as well, and it is a terminal operation! Like a typical for-each loop in Java, forEach will iterate over the stream and perform an action on each item. It is important to note that there is a collection.stream().forEach() implementation and a collection.forEach() implementation. They are very similar, with the difference being that the stream version does not guarantee the order in which items are processed (which matters most with parallel streams), while the collection version follows the iteration order of the specific collection when that order is defined.
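Both variants are shown in the sketch below. The names are hypothetical, and the results are gathered into a list only so they can be inspected; this is fine for a sequential stream, though writing to a shared list from a parallel stream would not be safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachExample {

    // Runs an action on every number, once through a stream and once through the collection.
    static List<String> labels(List<Integer> numbers) {
        List<String> out = new ArrayList<>();
        // Stream version: a terminal operation on the stream.
        numbers.stream().forEach(n -> out.add("stream: " + n));
        // Collection version: defined on the collection itself, no stream involved.
        numbers.forEach(n -> out.add("collection: " + n));
        return out;
    }

    public static void main(String[] args) {
        labels(Arrays.asList(1, 2, 3)).forEach(System.out::println);
    }
}
```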
So Why Should I Use Java Streams?
Now that we have a good grasp of how streams work, let’s go back to our original example. Except this time, let’s compare it to the code we’d write to achieve the same functionality without using Java streams.
Code Using Java Streams:
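For reference, a sketch of the stream-based version, using the placeholder names WithStreams and sumOfSquaresDivisibleByTen:

```java
import java.util.stream.Stream;

class WithStreams {

    // Each line of the pipeline is one step of the calculation.
    static int sumOfSquaresDivisibleByTen(int numberOfMultiples) {
        return Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)
                .limit(numberOfMultiples)
                .map(multipleOfFour -> multipleOfFour * multipleOfFour)
                .filter(multipleOfFourSquared -> multipleOfFourSquared % 10 == 0)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresDivisibleByTen(10)); // prints 2000 (400 + 1600)
    }
}
```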
Code Without Using Java Streams:
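A loop-based sketch of the same calculation might look like this. The names are placeholders, and the loop bounds (start at four, a <= comparison against numberOfMultiples * 4, a step of four) are one plausible way to write it, chosen to line up with the discussion that follows.

```java
class WithoutStreams {

    // The same calculation with an explicit loop and three local variables.
    static int sumOfSquaresDivisibleByTen(int numberOfMultiples) {
        int sum = 0;
        for (int multipleOfFour = 4; multipleOfFour <= numberOfMultiples * 4; multipleOfFour += 4) {
            int multipleOfFourSquared = multipleOfFour * multipleOfFour;
            if (multipleOfFourSquared % 10 == 0) {
                sum += multipleOfFourSquared;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresDivisibleByTen(10)); // prints 2000 (400 + 1600)
    }
}
```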
This is only a very small example, but it already shows a number of advantages that streams have. These include less verbose code, more intuitive code, and less error-prone code.
Advantages of Java Streams - Less Verbose Code
Using streams, for just this example, chopped off two lines of code. In my experience, it is not uncommon for streams to be able to compact a 15-20 line function into an easily understood 3-4 lines.
You’ll also notice, thanks to all of the lambda expressions we used, we’re not creating unnecessary new variables that only exist in the function’s scope. For the code without streams to be readable, we had to create a new integer for the running sum of the entire for loop, a new integer for the multiple of four in the for loop’s declaration, and an integer for the multiple of four squared.
Advantages of Java Streams - More Intuitive Code
Another advantage of using streams is that the code ends up being more intuitive and requires less thinking to understand what’s going on. Notice that the code using streams almost reads like a list of the steps the calculation goes through. In the example without streams, you need to think about what the variable in the for-loop’s declaration is and why we’re incrementing it by four, and then notice that between increments we check whether its square is divisible by 10 and, if it is, add it to the sum. Instead of the calculation going step by step, we’re starting to mix different responsibilities together, which gets more difficult to reason about as the functions you work with become more complex.
Advantages of Java Streams - Less Error Prone Code
Lastly, due to their step by step nature, the stream version of this function will be a lot less error prone. As Streams involve higher order functions, you can work with them at a higher level of abstraction than you would a typical for-loop, which is a lot safer.
An example of this can be seen in the <= in the for-loop in the second example. It is very common to have for-loops that are exclusive in their comparison check, so a developer working on a function like this has a real chance of writing that comparison as <, which will result in summing over only the first numberOfMultiples - 1 multiples. It’s also very common to start a for-loop at zero instead of four, so that’s another place where a developer can introduce a bug. There are many fine details that you need to reason about in the non-streams version that you just don’t need to spend mental energy on when you work with streams.
Wrapping It Up
This article is just showing you the tip of the iceberg when it comes to all of the ways to make streams, intermediate operations, and terminal operations work in Java. There are many more that you can use, and you can create your own custom ones as well.
To recap, in this article we went over:
- Streams
- Lambda Expressions
- Stream Creation
- Map, Filter, Reduce, and other examples of intermediate and terminal operations
- Some of the pros of using streams
Another important characteristic of streams is that intermediate operations are lazy: the work described by an intermediate operation is not performed until a terminal operation needs its result, and then only for as many elements as are needed. This can be a significant optimization with streams. This article has a great section on how lazy evaluation works in Java streams.
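Laziness can be observed with a small counting sketch. The names are hypothetical; the counter records how often the map step runs, and findFirst is used as a terminal operation that stops as soon as it has a result, even though the stream is infinite.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyExample {

    // Returns {map calls before the terminal operation, map calls after it, first match}.
    static int[] countMapCalls() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> squaresDivisibleByTen = Stream.iterate(4, multipleOfFour -> multipleOfFour + 4)
                .map(multipleOfFour -> {
                    calls.incrementAndGet(); // record that the map step actually ran
                    return multipleOfFour * multipleOfFour;
                })
                .filter(square -> square % 10 == 0);

        int before = calls.get();                           // pipeline built, nothing processed yet
        int first = squaresDivisibleByTen.findFirst().get(); // terminal operation starts the work
        int after = calls.get();                            // only 4, 8, 12, 16, 20 were squared
        return new int[] {before, after, first};
    }

    public static void main(String[] args) {
        int[] result = countMapCalls();
        System.out.println(result[0]); // prints 0
        System.out.println(result[1]); // prints 5
        System.out.println(result[2]); // prints 400
    }
}
```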
If you are interested in diving deeper, a good starting point is to learn more intermediate and terminal operations, and practice using streams! If you are looking to learn parallelization as well, this presentation by Kenneth Kousen is a great and easy to grasp explanation on how parallel streams and completable futures work.
With streams in Java, you can write more elegant and less error-prone Java code, and as you improve your skills in using streams you will see that there are certain operations that become much easier to implement!