
Mạnh Vũ


Elixir Stream - A way to save resources

Intro

When we come to Elixir, most of us use Enum (or for) a lot. Enum and the pipe operator (|>) make a great pair, and together they make it easy to write clean code.

But we run into trouble when we process a big list or a large file, for example: all the data is processed and passed on together to the next function, which consumes a lot of memory. For this kind of problem Elixir provides Stream. In this post we will go through the Stream module and see how it processes data.

How it works

For example, suppose we need to process a large data set with Enum:

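1..10_000
# Pair each number with its index.
|> Enum.map(fn n -> {n, n} end)
# Square the value for even indexes, add 1 for odd indexes.
|> Enum.map(fn {index, n} ->
  case rem(index, 2) do
    0 -> {index, n * n}
    _ -> {index, n + 1}
  end
end)
# Keep only values strictly between 1_000 and 10_000.
|> Enum.filter(fn {_, n} -> n > 1_000 and n < 10_000 end)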

As this code shows, each Enum function always builds a complete list and passes it to the next Enum function. Processing data this way consumes a lot of memory for a large data set. That is fine for small data sets, but costly for large ones.

In this flow, every result is passed down the pipe as a full list:
[Figure: Enum flow]

With Stream, Elixir provides a different way. It passes data between the functions in the pipe one small piece at a time. This is the lazy way of processing data.

Stream is suitable for large data sets and for data that is generated in particular ways, such as events, I/O, or loops.
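A minimal sketch of this laziness. The counter process (an Agent) is a hypothetical helper that is not part of the original example; it is used only to observe how often the mapping function runs.

```elixir
# Hypothetical counter, used only to count calls to the mapping function.
{:ok, counter} = Agent.start_link(fn -> 0 end)

stream =
  Stream.map(1..1_000_000, fn n ->
    Agent.update(counter, fn count -> count + 1 end)
    n * 2
  end)

# Building the stream has not run the mapping function yet.
calls_before = Agent.get(counter, fn count -> count end)

# Enum.take/2 stops pulling from the stream once it has 3 elements.
first_three = Enum.take(stream, 3)

calls_after = Agent.get(counter, fn count -> count end)

IO.inspect(first_three)
# [2, 4, 6]
IO.inspect({calls_before, calls_after})
```

Here calls_before is 0, and calls_after stays tiny compared with the one million elements in the range, because only the elements that were actually requested were processed.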

The Enum example above, rewritten with Stream, looks like this:

1..10_000
# Each Stream step is lazy: it only describes the work.
|> Stream.map(fn n -> {n, n} end)
|> Stream.map(fn {index, n} ->
  case rem(index, 2) do
    0 -> {index, n * n}
    _ -> {index, n + 1}
  end
end)
|> Stream.filter(fn {_, n} -> n > 1_000 and n < 10_000 end)
# Only this step runs the pipeline and builds a list.
|> Enum.to_list()

In this example, Stream saves a lot of memory because no intermediate list is built between the Stream functions. Only at the end of the pipe does Enum.to_list/1 build a full list, and that list contains only the filtered data.
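One way to check that the two versions are interchangeable is to put the shared steps in functions and compare the results. The module name Pipeline and its function names are made up for this sketch; they do not come from any library.

```elixir
defmodule Pipeline do
  # Hypothetical helper: the per-element step shared by both versions.
  def transform({index, n}) do
    case rem(index, 2) do
      0 -> {index, n * n}
      _ -> {index, n + 1}
    end
  end

  # Hypothetical helper: the filter condition shared by both versions.
  def in_range?({_, n}), do: n > 1_000 and n < 10_000

  # Eager version: every step returns a full list.
  def eager(range) do
    range
    |> Enum.map(fn n -> {n, n} end)
    |> Enum.map(&transform/1)
    |> Enum.filter(&in_range?/1)
  end

  # Lazy version: elements flow through all steps one at a time,
  # and only Enum.to_list/1 builds a list.
  def lazy(range) do
    range
    |> Stream.map(fn n -> {n, n} end)
    |> Stream.map(&transform/1)
    |> Stream.filter(&in_range?/1)
    |> Enum.to_list()
  end
end

eager_result = Pipeline.eager(1..10_000)
lazy_result = Pipeline.lazy(1..10_000)

IO.inspect(eager_result == lazy_result)
# true
```

Both versions return the same list; the difference is only in how much data exists in memory between the steps.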

The flow now looks like this:
[Figure: Stream flow]

In each Stream function, the data goes piece by piece directly into the function's parameter. Of course, we can also group several elements into a chunk for better performance (batch processing).
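As a small sketch of batch processing, Stream.chunk_every/2 groups elements into lists of a fixed size while staying lazy. The batch size of 4 and the summing step are arbitrary choices for illustration.

```elixir
batch_sums =
  1..10
  # Group elements into batches of up to 4; the last batch may be smaller.
  |> Stream.chunk_every(4)
  # Each batch arrives as a list, so it can be handled in one call.
  |> Stream.map(fn batch -> Enum.sum(batch) end)
  |> Enum.to_list()

IO.inspect(batch_sums)
# [10, 26, 19]
```

In a real pipeline the summing step might be replaced by something like a bulk database insert, where handling many elements per call is cheaper than handling them one at a time.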

How to make a Stream

There are three ways to get a data source for a Stream:

First, we can use a stream returned by another function, such as IO.stream/2 or URI.query_decoder/1.

Second, we can use an enumerable, such as a List or a Range, as the input of a Stream function.

Third, we can construct a Stream ourselves with functions such as Stream.cycle/1, Stream.unfold/2, or Stream.resource/3.
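The three ways can be sketched as follows. The query string, the lists, and the countdown are made-up inputs chosen only to keep the example short.

```elixir
# 1. A stream returned by another function.
pairs =
  "lang=elixir&topic=stream"
  |> URI.query_decoder()
  |> Enum.to_list()

IO.inspect(pairs)
# [{"lang", "elixir"}, {"topic", "stream"}]

# 2. An enumerable (here a List) as the input of a Stream function.
squares =
  [1, 2, 3]
  |> Stream.map(fn n -> n * n end)
  |> Enum.to_list()

IO.inspect(squares)
# [1, 4, 9]

# 3. Streams we construct ourselves.

# Stream.cycle/1 repeats the given enumerable forever.
cycled = [:a, :b] |> Stream.cycle() |> Enum.take(5)

IO.inspect(cycled)
# [:a, :b, :a, :b, :a]

# Stream.unfold/2 emits a value and computes the next accumulator.
powers = 1 |> Stream.unfold(fn n -> {n, n * 2} end) |> Enum.take(5)

IO.inspect(powers)
# [1, 2, 4, 8, 16]

# Stream.resource/3 takes a start function, a next function and a
# cleanup function. Here the "resource" is just a number counting down.
countdown =
  Stream.resource(
    fn -> 3 end,
    fn
      0 -> {:halt, 0}
      n -> {[n], n - 1}
    end,
    fn _acc -> :ok end
  )
  |> Enum.to_list()

IO.inspect(countdown)
# [3, 2, 1]
```

Stream.cycle/1 and this use of Stream.unfold/2 produce infinite streams, so they need a limiting step such as Enum.take/2 before the data is collected.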
