Intro
We already saw how to spawn new processes to run our code concurrently, doing multiple HTTP requests at the same time, getting cryptocurrency prices.
Without messages and coordination between processes, the only way to see the results was to let each process print its price once it had finished.
iex> ["BTC-USD", "ETH-USD", "LTC-USD", "BCH-USD"] \
|> Enum.map(fn product_id ->
  spawn(fn -> Coinbase.print_price(product_id) end)
end)
BTC-USD: 3704.51000000
ETH-USD: 125.15000000
LTC-USD: 45.64000000
BCH-USD: 122.83000000
In the iex console, for each product we spawn a process in which we run the Coinbase.print_price/1 function. When a process receives the result, it prints the price and then exits.
This example was made to focus just on the spawn function and process creation, but it also shows a lack of coordination between the processes.
Shared Memory vs. Message Passing
Many mainstream languages (like Java, Ruby, Python, etc.) use a shared-memory concurrency model, where threads can write to, and read from, a shared block of memory.
Since many threads can concurrently alter the same block of memory, we use locks to avoid race conditions, coordinating the threads so that only one at a time writes to the memory. I briefly talked about how Python and Ruby use a GIL (Global Interpreter Lock) to protect the shared memory. The GIL lets only one thread at a time run interpreter code, which makes it really hard to achieve parallelism with threads.
Erlang and Elixir go in another direction, implementing a concurrency model (the Actor Model) where processes are isolated and do not share any memory. The memory in a process can't be altered directly by any other process.
Great, but if they don't share memory, how can we control and coordinate them? With messages.
Each process has a mailbox and can receive messages from other processes. Using messages, a process can, for example, ask another one to perform a computation and get the result back.
Sending the first message
Let's start with a simple example: we send a message to a fresh new process. Once received, the message is then printed to the terminal.
When a message is sent to the process, it's first stored in the process's mailbox. Then, we use receive to find the first message in the mailbox that matches one of the given patterns.
iex> pid = spawn fn ->
  receive do
    msg -> IO.inspect(msg)
  end
end
#PID<0.109.0>
- spawn creates a process, returning its pid, #PID<0.109.0> in the case above.
- The anonymous function we pass runs inside this new process.
- At the beginning of the function, we call receive to wait for a message.
iex> send pid, "Hello World"
"Hello World"
- With send/2, we asynchronously (it returns immediately) send the message "Hello World" to the process identified by #PID<0.109.0>.
- receive (which was waiting for a message) matches the received string with the catchall pattern, and prints it.
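The mailbox behaviour described above can be observed from a single process. This is a minimal sketch, assuming it runs as a script in a process whose mailbox starts empty: it sends two messages to itself before calling receive, checks how many are queued with Process.info/2, and then takes them out in arrival order.

```elixir
# send/2 returns immediately; the messages simply wait in the mailbox.
send(self(), :first)
send(self(), :second)

# Number of messages currently queued for this process
# (2 when the mailbox was empty before the two sends).
{:message_queue_len, queued} = Process.info(self(), :message_queue_len)
IO.inspect(queued, label: "queued messages")

# Each receive removes one matching message, oldest first.
first =
  receive do
    msg -> msg
  end

second =
  receive do
    msg -> msg
  end

IO.inspect({first, second})
# prints {:first, :second}
```

Nothing is lost when a message arrives before the process reaches receive; it stays queued until a receive picks it up.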
Process, please compute this for me
Let's now ask our process to compute the sum of a list of integers.
With send/2 we are not restricted to string messages; we can actually send any Elixir data type.
iex> pid = spawn fn ->
  receive do
    {:sum, numbers} when is_list(numbers) ->
      Enum.sum(numbers)
      |> IO.inspect(label: "sum result")
  end
end
#PID<0.110.0>
This time receive looks just for a specific type of message: a tuple where the first element is the atom :sum and the second element is a list.
iex> send pid, {:sum, [1,2,3,4,5]}
sum result: 15
In this way, thanks to message passing and pattern matching, we can easily pass the list of numbers as part of the message and control which action the process is going to perform.
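What happens to a message that does not match the pattern? The sketch below illustrates this under one assumption that is not part of the example above: it adds an after clause so the receive cannot block forever. A message that fails the is_list/1 guard is skipped and left in the mailbox, while the next matching one is processed.

```elixir
# The first message fails the guard, the second one matches.
send(self(), {:sum, "not a list"})
send(self(), {:sum, [1, 2, 3]})

result =
  receive do
    {:sum, numbers} when is_list(numbers) -> Enum.sum(numbers)
  after
    # Assumption for this sketch: give up after one second.
    1_000 -> :timeout
  end

IO.inspect(result, label: "sum result")
# prints sum result: 6

# The message that did not match is still waiting in the mailbox.
{:messages, leftover} = Process.info(self(), :messages)
IO.inspect(leftover, label: "still in mailbox")
```

Unmatched messages pile up in the mailbox over time, so a long-running process usually needs a catchall clause or some other way to discard them.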
Receive and process multiple messages
In the examples above, once the message is received and processed, the function ends, making the process exit.
Using the PID, we can check whether the process is still alive:
iex> Process.alive? pid
false
To be able to receive and serve many messages, we need to keep our process alive. To do so, we need to loop using recursion.
Let's consider another example, where this time we have different operations, :+ (sum) and :- (difference) of two numbers.
defmodule Example do
  def next_message do
    receive do
      {:+, {a, b}} ->
        IO.puts("#{a} + #{b} = #{a + b}")
      {:-, {a, b}} ->
        IO.puts("#{a} - #{b} = #{a - b}")
    end
    next_message()
  end
end
For simplicity, we define the function within a module. Recursion is simpler this way than with an anonymous function, since we don't have to pass the function to itself as an argument.
Once we have processed the message and reached the end of the function, we recurse by calling next_message(), rerunning the function.
It's important to note that a process handles one message at a time. If we want to process different messages concurrently, we need to send them to different processes.
We can use spawn/3, passing the module, function name and arguments, which avoids using an anonymous function.
iex> pid = spawn Example, :next_message, []
#PID<0.119.0>
iex> send pid, {:+, {10, 5} }
10 + 5 = 15
iex> send pid, {:-, {10, 5} }
10 - 5 = 5
This time our function loops, letting the process handle multiple messages.
To make the process exit, we can use Process.exit/2 to send it an exit signal.
iex> Process.alive?(pid)
true
iex> Process.exit(pid, :halt)
true
iex> Process.alive?(pid)
false
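An exit signal is not the only way to end the loop. As a rough sketch of an alternative, the process can handle a stop message and simply not recurse. The StoppableExample module, the :stop atom, and the use of Process.monitor/1 to wait for the exit are all assumptions made for this illustration, not part of the example above.

```elixir
defmodule StoppableExample do
  def next_message do
    receive do
      {:+, {a, b}} ->
        IO.puts("#{a} + #{b} = #{a + b}")
        # Keep looping only after a regular message.
        next_message()

      :stop ->
        # No recursive call: the function returns and the process exits normally.
        :ok
    end
  end
end

pid = spawn(StoppableExample, :next_message, [])

# Monitoring lets us be notified with a :DOWN message when the process exits.
ref = Process.monitor(pid)

send(pid, {:+, {10, 5}})
send(pid, :stop)

exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  end

IO.inspect(exit_reason, label: "exit reason")
# prints exit reason: :normal
IO.inspect(Process.alive?(pid), label: "alive?")
# prints alive?: false
```

With this approach the process finishes the messages queued before :stop, whereas an exit signal terminates it wherever it is.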
Sending the result back
Most of the time, printing the result is not enough. We want to get the result back or a confirmation that something has happened in another process.
We saw that we can send a message with any Elixir data type. It turns out that we can also send a PID. So, along with a message, we can send the PID of the process where we want to receive the result back.
self() returns the PID of the process where it's called. If we call it in iex, it shows the PID of the current console process.
iex> self()
#PID<0.103.0>
Let's add the PID element to the patterns in the receive block of the previous example.
def next_message do
  receive do
    {:+, {a, b}, from_pid} ->
      send from_pid, a + b
    {:-, {a, b}, from_pid} ->
      send from_pid, a - b
  end
  next_message()
end
Instead of printing the result, we send it back to from_pid.
iex> pid = spawn Example, :next_message, []
#PID<0.120.0>
iex> send pid, {:+, {10, 5}, self() }
{:+, {10, 5}, #PID<0.103.0>}
- We spawned the process, which waits for a message with the operation we want to perform.
- We send a message, this time embedding the iex PID as the third element of the tuple.
- The next_message function receives the message and uses from_pid to send the result back.
:erlang.process_info/2 is a useful (and debug-only!) function, which we can use to inspect the mailbox of a process.
iex> :erlang.process_info self(), :messages
{:messages, [15]}
Great, we received the result as a message. Now we just need a proper way to bind the result to a variable: we use a receive block.
iex> sum_result = receive do
...> res -> res
...> end
15
iex> sum_result
15
To keep it simple, we just sent back the result without any other information. In general this is not a great practice, since a process's mailbox can hold messages coming from multiple processes.
It's better to change the next_message function to also embed the sender's PID along with the result.
def next_message do
  receive do
    {:+, {a, b}, from_pid} ->
      send from_pid, {self(), a + b}
    ...
  end
end
iex> send pid, {:+, {10, 5}, self() }
{:+, {10, 5}, #PID<0.103.0>}
This time, in the message we've received back, the result comes paired with the sender's PID.
iex> :erlang.process_info self(), :messages
{:messages, [{#PID<0.120.0>, 15}]}
This is useful because we can now use the ^ pin operator to get just the message coming from pid.
iex> receive do
...> {^pid, result} -> result
...> end
15
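The send-then-receive steps above can be wrapped in a single function. The following is only a sketch: the Calculator module name, the call/5 helper and its timeout are assumptions made for illustration. It sends the request with self() as the reply address, then uses the pin operator to accept only the reply coming from that specific process.

```elixir
defmodule Calculator do
  # Same loop as in the article, replying with {self(), result}.
  def next_message do
    receive do
      {:+, {a, b}, from_pid} -> send(from_pid, {self(), a + b})
      {:-, {a, b}, from_pid} -> send(from_pid, {self(), a - b})
    end

    next_message()
  end

  # Hypothetical helper: send a request to `pid` and wait for its reply.
  def call(pid, op, a, b, timeout \\ 1_000) do
    send(pid, {op, {a, b}, self()})

    receive do
      # ^pid matches only replies tagged with the PID we sent the request to.
      {^pid, result} -> {:ok, result}
    after
      # Avoid blocking forever if the process is gone or never replies.
      timeout -> {:error, :timeout}
    end
  end
end

pid = spawn(Calculator, :next_message, [])

IO.inspect(Calculator.call(pid, :+, 10, 5))
# prints {:ok, 15}
IO.inspect(Calculator.call(pid, :-, 10, 5))
# prints {:ok, 5}
```

Tagging replies with a PID is enough for one outstanding request per process; abstractions such as GenServer use unique references for the same purpose.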
Wrap up and useful resources
Using spawn and send directly is a great way to understand how concurrency and message passing work.
Most of the time though, it's better to use modules like Task or GenServer, which are built on top of spawn and send. They give us an easier way to deal with processes and messages, without having to reinvent the wheel.
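As a small illustration of the difference, here is the sum computation written with the standard library's Task module instead of spawn, send and receive. It is a minimal sketch: Task.async/1 starts the work in a separate process and Task.await/1 waits for the result message, so we don't handle PIDs or mailboxes ourselves. The input numbers are arbitrary.

```elixir
# Run one computation in another process and wait for its result.
task = Task.async(fn -> Enum.sum([1, 2, 3, 4, 5]) end)
result = Task.await(task)
IO.inspect(result, label: "sum result")
# prints sum result: 15

# Start several computations concurrently, then collect the results in order.
results =
  [{10, 5}, {20, 7}]
  |> Enum.map(fn {a, b} -> Task.async(fn -> a + b end) end)
  |> Enum.map(&Task.await/1)

IO.inspect(results)
# prints [15, 27]
```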
To see how powerful and easy Task can be, the documentation is obviously a great start. Elixir has some of the best documentation I've ever seen: super clear, with a lot of examples.
Percy Grunwald also wrote a great article showing how clean concurrent code can be in Elixir, using the Task module.
If you are interested in knowing more about the Actor Model, here are two other great resources:
Hewitt, Meijer and Szyperski: The Actor Model
One of the best videos you could see about the actor model, explained by its creator.
The actor model in 10 minutes
If you don't have the time to watch a 40-minute video, in this article Brian Storti clearly explains what the actor model is.