Sterling has too many projects Blogging about programming, microcontrollers & electronics, 3D printing, and whatever else...

Introducing Async & Concurrency in Raku

| 1804 words | 9 minutes | raku advent-2019

Before we get into the guts of this advent calendar for the next 23 days after today, I want to be sure to introduce the basic concepts I’m going to cover. I am calling this the Raku Async & Concurrency Advent Calendar, but what do those terms mean and how do they apply to Raku?

This advent calendar assumes a basic knowledge of Raku. If you don’t know Raku, but you are familiar with another language, you will probably be able to follow along. However, this is aimed at intermediate to advanced developers, so I recommend working through an introductory Raku resource first.

Asynchronous Programming

Async is short for asynchronous programming because asynchronous is annoyingly painful to type. Asynchronous programming is a style of programming where the call to a function is disconnected from its return. Normally, we write a function like this:

sub double($x) { $x * 2 }
my $val = double(21);

If we want to make this async, we might try something like this:

sub double($x) { sub () { $x * 2 } }
my $promise = double(21);
# more code
# ...
# more code
my $val = $promise.();

That’s a silly contrived example, but the point is that we don’t actually receive the value until later. This style of programming is useful primarily when you know you need some work done, but don’t need the results immediately. This is especially useful when you might need to do other work in the meantime.
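For comparison, here is a minimal sketch of the same idea using Raku’s built-in Promise instead of a hand-rolled closure. It assumes the work is wrapped in a start block and collected later with await; the variable names are only illustrative.

```raku
sub double($x) { $x * 2 }

# start schedules the block and immediately returns a Promise
my $promise = start { double(21) };

# ... other work can happen here while the block runs ...

# await blocks until the Promise is kept, then returns its result
my $val = await $promise;
say $val;   # 42
```

The shape is the same as the closure version: the call and the return are separated, and the value only becomes available at the point where we ask for it.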

When does it make sense?

A couple of example situations where async could be helpful:

  • All socket-based client/server applications are necessarily async. For TCP, on a server, you establish a listening socket and then, when a client connects, you receive a connected socket that you can exchange data on using read and write operations. A TCP client, meanwhile, requests a connection and then waits until the server accepts the connection. The client could continue to do work while waiting.

  • A program that processes log data in real-time typically has several processing steps to run each line through. It waits for data to arrive. It parses each line of data to get the interesting information out of it. It filters the data to decide if it’s useful for the current task or not. It runs calculations on the data. Each of these operations could be performed in an asynchronous programming chain where the operations are performed as soon as some amount of data is ready for each step.
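The log-processing chain in the second example might be sketched roughly like this with a Supply. The log format, the INFO filter, and the variable names are all made up for illustration, and Supply.from-list stands in for data that would really arrive over time.

```raku
# Hypothetical log lines; a real program would receive these as they arrive
my @lines =
    '2019-12-01 INFO request took 120ms',
    '2019-12-01 DEBUG cache warm',
    '2019-12-01 INFO request took 80ms';

# Parse: split each line into its whitespace-separated fields
my $parsed = Supply.from-list(@lines).map({ .words.List });

# Filter: keep only the INFO entries
my $filtered = $parsed.grep({ .[1] eq 'INFO' });

# Extract: turn the last field (e.g. "120ms") into an integer
my $durations = $filtered.map({ .[*-1].subst('ms', '').Int });

# Calculate: accumulate a running total as each value comes through
my $total = 0;
react {
    whenever $durations -> $ms { $total += $ms }
}

say $total;   # 200
```

Each step only describes what to do when a value shows up. Nothing runs until the react block taps the end of the chain.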

Async is just a style of programming and it fits wherever you, the software developer, decide it fits. Some problems are more inherently asynchronous than others, but you could build them synchronously too. Use async when it suits you.

What are the tools?

In Raku, asynchronous programming is primarily performed through high-level, “composable” interfaces, such as Promise, Supply, and Channel. These are high-level because you can use them without having to rigorously prove the safety of how you use them. You still have to be careful, to be sure, but the pitfalls are much reduced and should be more obvious from their interface. They are composable because they are designed to let you take different libraries using these interfaces and use them together in predictable ways.

Low-level tools, on the other hand, like Lock and Semaphore can be used in ways that result in unpredictable behavior when different libraries using these are combined. These too are available in Raku because sometimes you need to build with them, but they aren’t the bread and butter of Raku’s async programming.
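As a small taste of the high-level interfaces, here is a sketch of a Channel passing values from one task to another. The producer/consumer split and the names are my own choices for illustration, not the only way to use a Channel.

```raku
my $channel = Channel.new;

# Producer: runs as a separate task and sends values into the channel
my $producer = start {
    $channel.send($_) for 1..5;
    $channel.close;
};

# Consumer: .list waits for each value and ends once the channel is closed
my @received = $channel.list;
await $producer;

say @received;   # [1 2 3 4 5]
```

Notice that there is no explicit locking anywhere. The Channel takes care of handing values safely between the two tasks.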

What does it look like?

So what does async look like in Raku? Here’s a basic counting program, but instead of using a loop, we’ll output one number every second.

react {
    whenever Supply.interval(1) -> $n {
        say $n;
    }
}

This is functionally equivalent to running:

for 0..* -> $n {
    say $n;
    sleep 1;
}

As you can see, the syntax is very simple and hopefully easy to follow. You can even think of it as sort of being synchronous without getting into too much trouble. A goal of async programming in Raku is to make it accessible enough that even those who don’t fully understand it can at least follow what is happening.
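The counting example never ends on its own. Here is a sketch of a variation that stops after five ticks by calling done, which ends the enclosing react block. The shorter interval and the @seen array are assumptions I’ve added just to keep the demonstration quick and checkable.

```raku
my @seen;

react {
    # Tick every 0.1 seconds to keep the demonstration short
    whenever Supply.interval(0.1) -> $n {
        @seen.push($n);
        done if $n == 4;   # done ends the enclosing react block
    }
}

say @seen;   # [0 1 2 3 4]
```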

Concurrent Programming

I took a class on concurrent programming in college. I’m afraid I passed the class without really grasping the topic very well. I think that’s partly because it is a hard concept to understand by lifting the hood and looking at all the moving parts, and partly because the class taught concurrency in mathematical terms, whereas I’ve found the practical terms much easier to follow.

Concurrent programming is merely the act of running parts of a program simultaneously. The parts of a program that can be run simultaneously are called threads. You can think of threads as being lanes in which a program can run its code. For example, each thread might be running the same program code with different starting data. Or a program could have a thread for updating the graphics interface and a thread for talking to other programs on the network and a thread for doing local processing of the data received.

Bare metal threads

Running in lanes can be achieved in various ways, including:

  • On a single CPU/single core computer, the CPU will interleave code execution that needs to happen simultaneously. Code does not actually run at the same time, but every thread of each program is given time to run some instructions on the CPU, then a context switch is performed and another thread runs.

  • On a multi-CPU/multi-core computer, code is still interleaved on each CPU core, but programs and threads also run truly simultaneously on separate cores.

In any case, you have a problem where the code in any given thread might be only fractionally complete with a task, so you need to make sure the threads interact with one another safely. This is called “thread safe code.”

Thread safety

There are a few basic programming styles for building thread safe code. I present them here from most preferred to least preferred style:

  1. Any code that is stateless or which only has changing state internal to itself is inherently thread safe. It is only when code needs to reach data that other code might use that it becomes unsafe.

  2. Often, code can be written such that it is stateless internally, but performs state changes when it finishes execution. These data transformations are thread safe so long as something synchronizes the state updates in a safe way. The code that performs the transformation does not itself have to make any special provision for safety.

  3. For situations where some changeable state must be shared between executing threads, special care must be taken every time that data is read from and written to in order to make sure the changes don’t corrupt the state.

Raku provides specialized tools for handling each of these situations. Whenever writing concurrent programs, you should aim for the first two coding styles when you can. These are the easiest to understand and maintain. However, the third option is sometimes the best.
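To make the styles a little more concrete, here is a rough sketch. The first half has each task work only on its own data and hands back a result, so the state change happens once, after the tasks finish. The second half shares a counter between tasks and guards it with a Lock. The chunk sizes and names are arbitrary choices for illustration.

```raku
# Styles 1 and 2: each task touches only its own chunk and returns a result;
# the shared total is computed once, after await has collected the results
my @chunks   = (1..25), (26..50), (51..75), (76..100);
my @partials = await @chunks.map(-> $chunk { start { $chunk.sum } });
my $total    = @partials.sum;

# Style 3: shared mutable state, guarded by a Lock on every update
my $lock    = Lock.new;
my $counter = 0;
await (1..4).map({ start { $lock.protect({ $counter++ }) for ^1000 } });

say $total;     # 5050
say $counter;   # 4000
```

Without the protect call, the four tasks in the second half could interleave their updates and lose increments, which is exactly the kind of corruption the third style has to guard against.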

The Raku way

In Raku, we don’t tend to work with threads directly. Instead, we schedule blocks of code to run asynchronously. How the code is actually run depends on the scheduler used. The default scheduler on Rakudo will schedule blocks to run on a separate thread from the main thread.[1]

The primary means of scheduling a block to run is using a start block similar to the following example.

start {
    # Subtask 1
}
start {
    # Subtask 2
}
# Main task

I, personally, refer to work scheduled this way or by other means as a “task” or a “subtask” depending on context. That’s not a Rakuism, though, so beware that the terminology of others may vary.
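Here is a slightly fuller sketch of that pattern, assuming you want the results of the subtasks back in the main task. Each start block returns a Promise, and await collects the results. The messages are only there to show which thread each piece ran on, and the thread numbers you see will vary from run to run.

```raku
my $first  = start { "subtask 1 ran on thread {$*THREAD.id}" };
my $second = start { "subtask 2 ran on thread {$*THREAD.id}" };

# The main task continues here while the subtasks run
say "main task is on thread {$*THREAD.id}";

# await waits for both Promises and returns their results in order
my @results = await $first, $second;
.say for @results;
```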

Anyway, the main point is that concurrency refers to running two different parts of your code simultaneously.

Parallel Programming

A third term we will frequently be using in this calendar is parallel programming. Parallel programming is very similar to concurrency, but primarily oriented around processing data in parallel. For example, if you have a large number of integers you want to run through a function, you can perform that action concurrently for every one of those integers. This is parallel programming.

The reason it is a separate term from async or concurrency is twofold. One, a parallel program is not necessarily concurrent or async and, two, there are some special optimizations related to parallel programming that are not necessarily a part of general concurrency or async.

For example, here is a small parallel program in Raku:

my @doubles = (1...10_000_000).hyper.map(* * 2);

When this program runs, Raku will schedule the tasks to run in multiple threads to iterate through all values in the range and double all 10 million values in batches. The work of dividing the work into batches and then choosing how to schedule them is what makes parallel programming a special topic of its own. This program is probably concurrent, but it is not async. We could do parallel programming asynchronously too, but we aren’t in this example.
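The batching can also be tuned by hand. This sketch passes the batch and degree named arguments to hyper, and shows race, the sibling of hyper that does not promise to keep results in order. The particular numbers are arbitrary picks of mine, not recommendations.

```raku
# batch: how many items each worker takes at a time; degree: how many workers
my @doubles = (1..100_000).hyper(batch => 1_000, degree => 4).map(* * 2);

# hyper preserves the input order in its results
say @doubles[^5];   # (2 4 6 8 10)

# race batches the same way but may return results in any order, so only
# order-independent operations such as a sum give predictable answers
my @raced = (1..100_000).race(batch => 1_000, degree => 4).map(* * 2);
say @raced.sum;     # 10000100000
```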

Programming Programming

So, when we put the terms together we find that a program might have any of the three properties described. It might be concurrent and async and parallel. It might be none of those things. It might be just one or two of those things, in any combination. These are different styles of programming.

Raku provides tooling that aims to make it easier to read your code when it is being performed asynchronously, concurrently, and in parallel. It also aims to encourage you to write your code in an inherently thread safe way by allowing you to safely transform state while operating in multiple threads simultaneously.

It is my hope that over the following 23 days, I will better equip you to decide when to embrace each of these features of Raku and how to do so in a way that makes your code faster while remaining easy for humans to parse and understand.

Cheers.


  1. I will cover the details of schedulers later. When I do, you will see the exact mechanism used by Raku to determine whether a task is scheduled to run immediately in a separate thread, run at some future time on a separate thread, run on the current thread whenever the current thread stops being busy, etc. However, I will generally assume that start will schedule a task on the next available thread, as this is generally safe to assume in Raku.

The content of this site is licensed under Attribution 4.0 International (CC BY 4.0).