Iterative Versions of Recursive Functions in C++

6 min readJun 21, 2023

It is much more easy to come up of recursive algorithms for many problems. The only catch is that recursion uses stack memory which is in short supply while iterative versions use a list accessed in a last-in-first-out manner. Here we will see three algorithms namely, the factorial, the depth of a binary tree and the generation of all n-choose-k combinations of a set.

Factorial

The factorial of number can be defined recursively as:

f(0) = 1
f(n) = n* f(n-1)

This allows a direct translation to the following code:

int factorial(int n) {
  if (n > 1) {
    // Recursive case
    return n * factorial(n - 1);
  } else {
    // Base case
    return 1;
  }
}

Converting it to an iterative function is pretty simple. However even the recursive function when compiled with GNU C++ compiler produces an iterative code as the following assembler output shows.

_Z9factoriali:
.LFB0:
 endbr64
 movl $1, %eax
 cmpl $1, %edi
 jle .L4
.L3:
 movl %edi, %edx
 subl $1, %edi
 imull %edx, %eax
 cmpl $1, %edi
 jne .L3
 ret  ;; redundant code
.L4:
 ret

Thus there may not be a need to actively convert a recursive function to an iterative function especially if the function is tail recursive. Many languages support tail calls where, if the last statement is a call to another function then there is a jump not a call, thereby preventing the stack from growing. C++ cannot do that, partly because destructors and finalisers are meant to called just before the function returns.

Depth of Binary Tree

Consider the function to compute the depth of a binary tree. The recursive version is reproduced below:

int recursive_depth(Node* root){
  if (root ==nullptr) return 0;
  else {
      int dleft = recursive_depth(root->left);
      int dright = recursive_depth(root->right);
      return 1 + std::max(dleft,dright);
  }
}

The assembler language output as shown below indicates that much like the C++ code the function is called twice recursively.

_Z15recursive_depthP4Node:
.LFB6:
 endbr64
 testq %rdi, %rdi
 je .L6
 pushq %rbp
 pushq %rbx
 movq %rdi, %rbx
 subq $8, %rsp
 movq (%rdi), %rdi
 call _Z15recursive_depthP4Node
 movq 8(%rbx), %rdi
 movl %eax, %ebp
 call _Z15recursive_depthP4Node
 leal 1(%rbp), %ecx
 movl %eax, %edx
 leal 1(%rax), %eax
 cmpl %edx, %ebp
 cmovg %ecx, %eax
 addq $8, %rsp
 popq %rbx
 popq %rbp
 ret
.L6:
 xorl %eax, %eax
 ret

One way to convert it to an iterative function is to convert it a Continuation Passing Style (CPS). Translating the Ocaml version by Gasche we get the following C++ code.

typedef std::function<int(int)> call_back;
int cps_depth( Node* root, call_back cps){

  if(root==nullptr) return cps(0);
  else return cps_depth(root->left,[&] (int dleft) {
                        return cps_depth(root->right, [&] (int dright){
                        return cps(1 + std::max(dleft,dright));
                        });});
}
int depth(Node* root){
  return cps_depth(root, [](int k){return k;});
}

However despite being tail recursive the C++ compiler cannot translate it to an iterative function as the assembled output, not shown, indicates. The way to convert a recursive function to CPS, usually, is to use a CPS function that takes the return value of the recursive function as the input parameter and then to add the rest of the original function as the code for the CPS function. It is important that the recursive function uses the CPS function before returning at every instance.

Following Gasche, we can now convert the above code to two mutually recursive functions that are actually tail calls.

enum class item_type { tree, height};
struct item {
  item_type type;
 union { Tree* root; unsigned height; };
 item(Tree* r):type(item_type::tree)
    {root=r;}
 item(unsigned h):type(item_type::height)
    {height=h;}
};
static int height_aux(Tree *root, stack<item>& items){
    if(root)
    {
      item it(root->right);
      items.push(it);
      return height_aux(root->left,items);
    } else {
      return eval(items, 0);
    }
  }
  static int eval(stack<items>, unsigned depth){
    if(items.empty()){
      return depth;
    }else{
 
      item cur = items.top();
      items.pop();
      switch (cur.type)
      {
        case item_type::tree:
          {
            item next(depth);
            items.push(next);
            return height_aux(cur.root, items);
          }
        case item_type::height:
          {
            return eval(items,  1 + std::max(depth, cur.height));
          }
      }
    }
 
  }
static unsigned height(Tree* root){
    stack<item> items;
    return height_aux(root,items);
}

Let’s decode what is happening here. Let’s look at the return call of the CPS version again.

return cps_depth(root->left,[&] (int dleft) {
                        return cps_depth(root->right, [&] (int dright){
                        return cps(1 + std::max(dleft,dright));
                        });});

The second or inner call to cps_depth requires the value of root->right and nothing else. The subsequent call to cps needs only the value of dleft, which becomes available only when the outer function is called. Hence we need a union that indicates which of the two values is required.

Combining eval and height_aux into a single iterative function is not too difficult as shown below

static int height(Tree *root){
  enum class item_type { tree, height};
  struct item {
    item_type type;
    union {
      Tree* root;
      unsigned height;
    };
    item(Tree* r):type(item_type::tree)
    {root=r;}
    item(unsigned h):type(item_type::height)
    {height=h;}
 
  };
    bool go_on=true;
    unsigned depth = 0;
    std::stack<item> items;
    while (go_on) {
 
      if (root) {
        item it(root->right);
        items.push(it);
        root = root->left;
      } else {
        go_on=false;
        depth=0;
        while (!items.empty()&& !go_on) {
          item cur = items.top();
          items.pop();
          switch (cur.type) {
            case item_type::tree: {
                item next(depth);
                items.push(next);
                root = cur.root;
                go_on = true;
              }
            break;
            case item_type::height: {
                  depth = 1 + std::max(depth, cur.height);
                 }
             break;
          }
        }
      }
    }
    return depth;
  }

Generate All Combinations

Consider the problem of generating all k-combinations of set containing ’n’ elements.

void print(const vector<int>& nums, ostream& fout)
{
    for (auto& a : nums) { fout << a << ' '; }
    fout << endl;
}
void combo(vector<int>& nums, int n, int k, vector<int>& result) 
{
    if ( k == 0 ) {
       print(result,cout); 
    }
    else if (n == k){
            result.insert(result.end(),nums.begin(),nums.begin()+k);
           print(result,cout);
            result.erase(result.end()-k,result.end());
    } else {
            result.push_back(nums[0]);
            //include num[0] and reduce n
            swap(nums[0],nums[n-1]);
            combo(nums,n-1,k-1,result);
            result.pop_back();
            //exclude old num[0]           
            combo(nums,n-1,k,result);
            swap(nums[0],nums[n-1]);
    }
}
void doCombo(vector<int>& nums, int n, int k){
  vector<int> result;
  combo(nums,n,k,result);
}

Notice that result is empty initially and in the course of the computation, the ‘result’ will be the same at the start and end of the function in all instances of the call. Needless to say, the compiler does not automatically convert it an iterative function. Using the same technique as before let us generate the CPS version is shown below:

typedef std::function<void()> callback;
void combo_cps(vector<int>& nums, int n, int k,
  vector<int>& result, callback cps){
    if ( k == 0 ) {
        print(result,cout); 
        cps();
    } else if (n == k){
        result.insert(result.end(),nums.begin(),nums.begin()+k);
        print(result,cout);
        result.erase(result.end()-k,result.end());
        cps();
    } else {
        result.push_back(nums[0]);
        //include num[0] 
        swap(nums[0],nums[n-1]);
        combo_cps(nums,n-1,k-1,result, 
                  [&](){
                      result.pop_back();
                      //exclude old num[0]            
                      combo_cps(nums,n-1,k,result, [&]()
                                {
                                    swap(nums[0],nums[n-1]);
                                    cps();
                                });
                  });
    }

}
void combo(vector<int>& nums, int n, int k){
   combo_cps(nums,n,k,[](){}):
}

(On a personal note: I made the rookie mistake of not calling cps before returning and took me a while to fix the issue).

To convert the above CPS version to an iterative version we need to track two values, n and k and the state which is whether num[0] is included or excluded. The result is shown below:

struct iter{
  int n;
  int k;
  enum { include, exclude, none } state;
  
};

void combo_iter(vector<int>& nums, int n, int k){
    vector<int> result;
    vector<iter> items;
    iter it{0,0,iter::none};
    items.push_back(it);
    for(;;){
        while(!(k==0 || k==n)){
            result.push_back(nums[0]);
            swap(nums[0],nums[n-1]);
            iter it{n-1,k-1,iter::include};
            items.push_back(it);
            --n;
            --k;
        }

    if ( k == 0 ) {
        print(result,cout); 
    } else if (n == k){
        result.insert(result.end(),nums.begin(),nums.begin()+k);
        print(result,cout);
        result.erase(result.end()-k,result.end());
    } 
        bool go_on = true;
        while(go_on){
            iter it = items.back();
            items.pop_back();
            switch (it.state){
                case iter::exclude:
                     swap(nums[0],nums[n]);
                    break;
                case iter::include:
                    result.pop_back();
                    it = iter{it.n,it.k+1,iter::exclude};
                    items.emplace_back(it);
                    n=it.n;
                    k= it.k;
                    go_on = false;
                    break;
                case iter::none:
                    return;
                    break;
            }
        }
    }
}

Notice that the structure of the code is almost the same as for the tree-depth problem. To include num[0] we decrease k and add num[0] to the result. To exclude num[0] we revert to the old value of k and pop the top value from the result. Notice that the switch statement is executed only after at least one whole result is computed.

Footnote:

Notice that instead of passing vector nums we could use a const_iterator and thereby get rid of all the swaps. In fact if nums is a vector or array we could use a pointer to the first element of the vector.

Iterative Versions of Recursive Functions in C++

Factorial

Depth of Binary Tree

Generate All Combinations

Written by Joseph Mariadassou