Functional Data Structures

Speaker: Ivan Cukic

Audience level: Intermediate | Advanced

category

Today, most software systems have the need for some sort of concurrency. This poses problems regarding the software design. In the past few years, Meeting C++ had more than a few talks related to software design in the concurrent environment. From higher-level topics like Functional Reactive Programming to lower level stuff like Coroutines TS.

There is another important aspect of writing software in the concurrent environment - handling state. Most programs have a lot of moving parts, and it is not easy to keep track of them. It is not rare to have software bugs coming from the program being in an invalid state while not realizing it is invalid.

To quote John Carmack:

A large fraction of the flaws in software development are due to programmers not fully understanding all the possible states their code may execute in. In a multithreaded environment, the lack of understanding and the resulting > problems are greatly amplified, almost to the point of panic if you are paying attention.

In order to minimize the moving parts, we can just decide never to change any data we create - to make everything immutable. This solves the problem of having multiple concurrent processes try to update the same data ending up in a data-race, and it produces systems for which it is much harder to end up in an invalid state.

The problem with this idea is that if we can not change any existing data, we will need always need to create altered copies of old data. If we used ordinary data structures, this would explode our memory requirements.

Functional data structures are a perfect solution for this. They are immutable - they do not allow destructive updates, and they efficiently keep the history of all the previous values still referenced by different parts of the program.

We will present the general ideas behind the design of functional data structures, with a few examples culminating in the data structure which has most of the benefits of std::vector - from algorithm complexity, to memory locality - while being optimized for creating modified copies.