Wednesday, July 21, 2010

Recursion is not always good!

In my university's computer science curriculum, they seem to put a big emphasis in using recursion to solve things, as opposed to alternative methods (such as loops).

It is indeed true that recursion can simplify the high-level code, and many times it can look beautiful. There is of-course a price to pay with recursion, and that is the extra overhead recursive calls have (therefor they're usually slower than alternative iterative/loop versions).

I will stress that learning to think recursively is indeed a good skill to have, and any good programmer should be able to do so. So in that sense, it is good that computer science curriculums seem to be stressing recursion; however, once you have learned to think recursively, it is again beneficial to start thinking non-recursively when algorithms are simple-enough to do so.

Let's consider one of the most common examples of recursion used, the factorial function! (I have used this function many times in my previous articles to demonstrate things, so shouldn't be a surprise :D)


template<typename T>
T factorial(T n) {
return n <= 1 ? 1 : n * factorial(n-1);
}

int main() {
int n = factorial(5);
cout << n << endl; // Prints "120"
return 0;
}


This does end up looking pretty clean, as we can do the whole factorial function in one line.

Now lets take a look at the iterative/loop version:

template<typename T>
T factorial(T n) {
T ret = n ? n :(n=1);
while(--n) ret*=n;
return ret;
}

int main() {
int n = factorial(5);
cout << n << endl; // Prints "120"
return 0;
}


Now this iterative version is not as 'beautiful' as the recursive version, because as you can see its 3 lines instead of 1.

However the generated code for this version ends up being nicer than the recursive version.

When you make recursive calls, the arguments that are passed to the recursive function need to be pushed on the stack, and then these arguments eventually need to be popped back out as well. Furthermore, not only do the arguments need to be pushed on the stack, but also the return address needs to be pushed. Sometimes the recursion can be nested so-deep, that you end up running out of stack, and get a stack-overflow crash. This extra overhead is the downside of recursion.

The iterative/loop version doesn't have such problems. No function calls are made so you don't have the function call overhead. Instead the loop versions usually end up compiling with a simple short-jump.

Another thing is that the loop version is inline-able, whereas the recursive version likely will not be inlined, and even if it does get inlined, it probably will only inline the first call to the function.

Potentially the compiler might be able to turn a recursive call into a iterative/loop-version, but don't count on this. At least, I haven't seen it happen...


Now, there are some cases where an algorithm will end up being naturally-recursive, and it ends up being difficult or unpractical to code an iterative version. In these cases recursion is the better way to go (in these cases, the algorithm can end up being faster using recursion instead of iteration as well). So in the end, it really depends on the situation whether to go with recursion or iteration/loops; the more experience you get, the more you figure out when to use recursion vs iteration.


Notes:
If you noticed, I was using template-functions for the factorial function. The reason I made it a template, is because the factorial function is something you may want to re-use with different data-types (int, unsigned int, long long int, float, double, some custom BigInteger class, etc...).
Using templates, we only have to code one function that can be reused by these different data-types.

4 comments:

  1. It's worth mentioning when recursion is a good idea. Namely, when you're dealing with recursive data structures such as trees. There's other places where it's good too, obviously, but this is an immediately apparent example. Why is your blog called trash can of dreams?

    ReplyDelete
  2. Most algorithms performed on tree data structures are indeed well-fit for recursion.

    I was always taught to use recursion when traversing trees and etc...
    I wonder how the code will look without using recursion. I don't believe I've seen that before.


    Anyways, the blog name is something i thought up years ago. Essentially it symbolizes a place where a bunch of random ideas are piled up, as well as thoughts/ideas/dreams that will never become a reality.

    ReplyDelete
  3. Nice read! :)

    I was hoping to see some benchs :-(

    I did once a quadtree using loops rather than recursion. It was fast but it ended up being _very_ hard to understand and modify.

    In the end we ended up refactoring to use recursion. Much more cleaner and easier to understand. Lots of bugs were also fixed thanks to that.

    By the way, the "loop" way of doing things is called Tail recursion (vs Call stack recursion).

    And inlining callstack recursion in MSVC won't ever happen (as to the date) unless you use some pragmas and pray: http://msdn.microsoft.com/en-us/library/z8y1yy88%28VS.80%29.aspx

    I was just reading a few days ago an article about the C# JIT compiler which converted call stack recursion code into tail recursion, under very strict circumstances. Here's the link: http://blogs.msdn.com/b/davbr/archive/2007/06/20/tail-call-jit-conditions.aspx

    ReplyDelete
  4. Dark Sylinc,

    I ended up making a quick test-case, although it was a bit of a pain since the recursive version kept getting stack overflows whenever i set a high input.
    So I ended up going with this code:
    u64 r = 0;
    for(u64 i = 0; i < 0x3fff; i++) {
        r += factorial(i);
    }
    cout << r << endl;

    The runtime is 2.226 seconds for the recursive version, and 0.500 seconds for the loop version.

    And thanks for the links, I'll check them out :P

    ReplyDelete