Sparks

C++20 Coroutines: a short series

3: Implementing Simple Generator Coroutines in C++03

Coroutines: Basics, Py->C++20, Benchmarking

This is a mini-series leading up to implementing coroutines in C++20 (“what I learnt on my holiday”). It also compares them with other implementations, as part of my ModernCPP explorations. (first post)

In the previous post I showed some sample generators in C++ and noted that you could not know what the following did without knowing about the implementation:

class Fibonnaci: public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0)  { }
    int next() {
    GENERATOR_CODE_START
        while ( true ) {
            YIELD(a);
            s = a + b;
            a = b;
            b = s;
        };
    GENERATOR_CODE_END
    };
};

This post therefore walks through that implementation.

Expanding the example

There’s no formal support for coroutines in C++03. So how does this work? Fundamentally, this uses preprocessor macros to implement the generator behaviour. This is what the code looks like when those macros are expanded:

class Fibonnaci: public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0)  { }
    int next() {
    //GENERATOR_CODE_START
      if (__generator_state == -1) { throw StopIteration(); } switch(__generator_state) { default:

        while ( true ) {
             //YIELD(a)  (both uses of __LINE__ become the line number of this YIELD)
             __generator_state = __LINE__; return (a); case __LINE__:
            s = a + b;
            a = b;
            b = s;
        };
    //GENERATOR_CODE_END
      }; __generator_state = -1; throw StopIteration();
    };
};

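It should be relatively clear from this that this is really just a bit of syntactic sugar around a switch-based state machine. The state is tracked using the predefined __LINE__ macro, which expands to the number of the source line it is used on. One consequence is that each YIELD needs a line to itself: two on the same line would produce duplicate case labels, which will not compile.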
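
To make the state machine easier to see, here is a minimal hand-written sketch of the same pattern without any macros. The class name TwoValues, the member name state and the labels 1 and 2 are made up for illustration; in the macro version the labels would be whatever line numbers the YIELDs happen to sit on. It also assumes the state starts at 0.

```cpp
#include <cassert>

// Thrown when the generator has no more values.
class StopIteration { };

// Hypothetical generator that yields 10, then 20, then stops.
class TwoValues {
    int state;   // 0 = not started, -1 = finished, otherwise a resume label
public:
    TwoValues() : state(0) { }
    int next() {
        if (state == -1) { throw StopIteration(); }
        switch (state) {
        default:               // first call: start at the top of the body
            state = 1; return 10;
        case 1:                // second call resumes here
            state = 2; return 20;
        case 2:                // third call resumes here and leaves the switch
            ;
        }
        state = -1;            // mark as finished so later calls hit the guard
        throw StopIteration();
    }
};

// Usage: two values come out, then every further call throws.
void demo() {
    TwoValues g;
    assert(g.next() == 10);
    assert(g.next() == 20);
    bool stopped = false;
    try { g.next(); } catch (const StopIteration &) { stopped = true; }
    assert(stopped);
}
```

Each return is paired with a label immediately after it, which is exactly the job YIELD does with __LINE__.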

Base class support

To support this, we derive from a Generator base class, which looks like this:

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { };  // start at 0, which the switch routes to default:
    ~Generator() {     };
};

In this scheme, a generator’s “local” variables are stored as members of the generator object, not declared inside the coroutine body. The protected __generator_state member records the line number just before each return, and a switch statement wrapping the whole body controls re-entry into it.
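
The following sketch illustrates why the “locals” have to live in the object. BrokenCounter and Counter are hypothetical examples, not code from the series, and the base class here assumes the state starts at 0. The only difference between the two is where i is declared.

```cpp
#include <cassert>

class StopIteration { };

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { }   // assumption: 0 means "not started"
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                     { throw StopIteration(); }            \
                     switch(__generator_state)             \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                   \
                         case __LINE__:
#define GENERATOR_CODE_END     };                      \
                     __generator_state = -1; throw StopIteration();

// i is a true local, so it is reset to 0 on every call to next().
class BrokenCounter : public Generator {
public:
    int next() {
        int i = 0;
    GENERATOR_CODE_START
        while ( true ) {
            YIELD(i);
            i = i + 1;
        }
    GENERATOR_CODE_END
    }
};

// i is a member, so it keeps its value between calls.
class Counter : public Generator {
    int i;
public:
    Counter() : i(0) { }
    int next() {
    GENERATOR_CODE_START
        while ( true ) {
            YIELD(i);
            i = i + 1;
        }
    GENERATOR_CODE_END
    }
};

void demo() {
    BrokenCounter bad;
    assert(bad.next() == 0);
    assert(bad.next() == 1);
    assert(bad.next() == 1);   // stuck: i was reset before resuming

    Counter good;
    assert(good.next() == 0);
    assert(good.next() == 1);
    assert(good.next() == 2);  // state survived the return
}
```

The resume point is remembered in both cases; only the member version also remembers the data.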

Support Macros

The preprocessor macros that are needed to enable this look like this:

// #define GENERATOR struct
#define GENERATOR_CODE_START  if (__generator_state == -1)      \
                     { throw StopIteration(); }         \
                     switch(__generator_state)         \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                         \
                         case __LINE__:
#define GENERATOR_CODE_END     };                                      \
                     __generator_state = -1; throw StopIteration();

It should be clear that this is primarily code focussed on building a state machine driven by __LINE__.

Complete C++03 Generator code

This means that the whole code for the old-style generators is pretty short:

// NOTE: cpp03generators.h

class StopIteration { };

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { };  // start at 0, which the switch routes to default:
    ~Generator() {     };
};

#define GENERATOR_CODE_START  if (__generator_state == -1)      \
                     { throw StopIteration(); }         \
                     switch(__generator_state)         \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                         \
                         case __LINE__:
#define GENERATOR_CODE_END     };                                      \
                     __generator_state = -1; throw StopIteration();
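
As a usage sketch, this is roughly how calling code might drive the Fibonnaci generator from the start of the post. The header contents are repeated so the example compiles on its own, the state is assumed to start at 0, and take() is a hypothetical helper rather than part of the header.

```cpp
#include <cassert>
#include <vector>

// Contents of cpp03generators.h, repeated so this sketch is self-contained.
class StopIteration { };

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { }   // assumption: 0 means "not started"
    ~Generator() { }
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                     { throw StopIteration(); }            \
                     switch(__generator_state)             \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                   \
                         case __LINE__:
#define GENERATOR_CODE_END     };                      \
                     __generator_state = -1; throw StopIteration();

class Fibonnaci: public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0)  { }
    int next() {
    GENERATOR_CODE_START
        while ( true ) {
            YIELD(a);
            s = a + b;
            a = b;
            b = s;
        }
    GENERATOR_CODE_END
    }
};

// Hypothetical helper: pull at most n values, stopping early on StopIteration.
std::vector<int> take(Fibonnaci &gen, int n) {
    assert(n >= 0);
    std::vector<int> out;
    try {
        for (int i = 0; i < n; ++i) { out.push_back(gen.next()); }
    } catch (const StopIteration &) {
        // a finite generator would end up here; Fibonnaci never does
    }
    return out;
}
```

Calling take() with a fresh generator and n = 6 gives 1, 1, 2, 3, 5, 8. A second call on the same object carries on from 13, because the resume point and the values of a, b and s all live in the object.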

Where’s the code?

Update: 11/4/2023

Want to read and run code, not the blog? I know the feeling. I recently created a GitHub repo specifically for putting code from blog posts in. Recognising that it might also gain non-code material at some point, I’ve called it blog-extras.

The blog-extras repo can be found here:

The code specific to this series of short posts can be found here:

That repo will get fleshed out with code/examples from other posts in this blog too.

Discussion

This is a fairly brief and clear implementation, which pretty much mirrors what every coroutine implementation has to do:

  • Store local state somewhere that outlives a single call (here the generator object; often a heap structure)
  • Keep track of the point to return to when resumed (here, a line number)
  • Guard against restarting a stopped generator
  • Wrap all the body code inside a switch statement to allow jumping back to where we left off

It’s cheap and cheerful, and because it is loosely based on Duff’s device based coroutines it is guaranteed to work (with a small caveat around one type of debug environment). It’s also an old approach, so it is very portable and compiles almost everywhere. I’ve used this approach in my Python to C++ compiler pyxie to implement the range() method, and it’s relatively clear and lightweight. Because it is so lightweight, this sort of coroutine is also pretty useful in constrained environments.
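
For a feel of what a finite generator looks like in this style, here is a sketch of a range-like generator. It is a hypothetical illustration written for this post, not pyxie’s actual code, and it assumes the state starts at 0. It also shows the guard against restarting a stopped generator.

```cpp
#include <cassert>
#include <vector>

class StopIteration { };

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { }   // assumption: 0 means "not started"
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                     { throw StopIteration(); }            \
                     switch(__generator_state)             \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                   \
                         case __LINE__:
#define GENERATOR_CODE_END     };                      \
                     __generator_state = -1; throw StopIteration();

// Hypothetical Range: yields start, start+1, ... up to but excluding stop.
class Range : public Generator {
    int i, stop;
public:
    Range(int start, int stop_) : i(start), stop(stop_) { }
    int next() {
    GENERATOR_CODE_START
        // No initialiser in the for: i was set in the constructor and must
        // not be reset when the switch jumps back into the loop body.
        for ( ; i < stop; i = i + 1) {
            YIELD(i);
        }
    GENERATOR_CODE_END
    }
};

// Collects values until the generator throws StopIteration.
std::vector<int> drain(Range &gen) {
    std::vector<int> out;
    try {
        while (true) { out.push_back(gen.next()); }
    } catch (const StopIteration &) {
        // normal end of iteration
    }
    return out;
}

void demo() {
    Range r(0, 3);
    std::vector<int> v = drain(r);
    assert(v.size() == 3 && v[0] == 0 && v[1] == 1 && v[2] == 2);
    assert(drain(r).empty());   // already stopped: the guard throws at once
}
```

Once the loop finishes, control falls out of the switch, the state is set to -1, and every later call is rejected by the check at the top.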

If you wanted to serialise this and send it over the network to run elsewhere, it should be fairly obvious how you would do this - since this doesn’t rely on any internal / compiler specific functionality. It should be fairly easy to also see how to implement Modern C++ functionality on top of this - things like templating, move support, etc.
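
As a rough sketch of the serialisation idea: since everything the generator needs is a handful of plain members, a snapshot can just be those members copied out. CounterSnapshot, save() and restore() are hypothetical names invented here, and the state is assumed to start at 0. One caveat: the saved state is a source line number, so a snapshot only makes sense to a program built from the same source.

```cpp
#include <cassert>

class StopIteration { };

class Generator {
  protected:
    int __generator_state;
  public:
    Generator() : __generator_state(0) { }   // assumption: 0 means "not started"
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                     { throw StopIteration(); }            \
                     switch(__generator_state)             \
                     { default:
#define YIELD(v)         __generator_state = __LINE__; \
                         return (v);                   \
                         case __LINE__:
#define GENERATOR_CODE_END     };                      \
                     __generator_state = -1; throw StopIteration();

// Plain-data snapshot: the resume label plus every "local".
struct CounterSnapshot {
    int state;
    int i;
};

class Counter : public Generator {
    int i;
public:
    explicit Counter(int start) : i(start) { }
    int next() {
    GENERATOR_CODE_START
        while ( true ) {
            YIELD(i);
            i = i + 1;
        }
    GENERATOR_CODE_END
    }
    // Copy out everything needed to resume later (or elsewhere).
    CounterSnapshot save() const {
        CounterSnapshot s = { __generator_state, i };
        return s;
    }
    // Rebuild a generator that continues from where the snapshot was taken.
    static Counter restore(const CounterSnapshot &s) {
        Counter c(s.i);
        c.__generator_state = s.state;
        return c;
    }
};

void demo() {
    Counter a(0);
    assert(a.next() == 0);
    assert(a.next() == 1);
    CounterSnapshot snap = a.save();   // these two ints could be written to a socket
    Counter b = Counter::restore(snap);
    assert(b.next() == 2);             // b continues where a was
    assert(a.next() == 2);             // a is unaffected by b
    assert(b.next() == 3);
}
```

Turning the two ints into bytes on the wire is left out; the point is only that no hidden compiler state is involved.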

One downside is that the syntax looks a little funky. It also doesn’t provide for everything modern C++20 coroutines do - including things like the iterator/ranges protocol, sending values in, throwing exceptions in, and so on. (These could be added, though.)

NEXT POST: The next post in this short series looks at implementing simple generator style coroutines in C++20.

Updated: 2023/09/15 21:12:33