In “Systematic Error Handling in C++”, Andrei Alexandrescu claims that C++ exceptions are very slow. This is understandable: the code that handles an exception behind the scenes is rather complicated. It has to dispatch the exception to the proper catch block and unwind the stack so that every object that needs to be destroyed is destroyed.
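
To make the unwinding part concrete, here is a minimal sketch. It is not taken from the talk, and `Tracer`, `inner`, `outer` and `run_unwinding_demo` are names I made up for illustration. It records the order in which objects are destroyed while an exception travels up to its handler.

```cpp
#include <cassert>    // for the check shown in the note below
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: logs its own construction and destruction.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;

    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("ctor " + name);
    }
    ~Tracer() { log.push_back("dtor " + name); }
};

void inner(std::vector<std::string>& log) {
    Tracer b(log, "b");
    throw std::runtime_error("boom");  // unwinding starts here
}

void outer(std::vector<std::string>& log) {
    Tracer a(log, "a");
    inner(log);                        // never returns normally
}

// Runs the scenario and returns the recorded sequence of events.
std::vector<std::string> run_unwinding_demo() {
    std::vector<std::string> log;
    try {
        outer(log);
    } catch (const std::exception&) {
        log.push_back("caught");       // runs after both destructors
    }
    return log;
}
```

The returned log is `ctor a`, `ctor b`, `dtor b`, `dtor a`, `caught`, so a check such as `assert(run_unwinding_demo().size() == 5);` passes. Both destructors run before the handler body does, and that is the work the runtime has to do on every throw.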

It is clear that exceptions should be used for exceptional situations, and not just as a way to transfer control way up the stack.

Nevertheless, it is interesting to know how fast or slow exceptions really are.

I used Google’s benchmark library to benchmark exception handling.

The scenarios I benchmarked were:

  1. No exceptions: the benchmarked function calls an external function and modifies its result. Why call an external function? Because the optimizer could otherwise optimize away an empty try…catch block. Why modify the result? Because otherwise the optimizer could turn the CALL instruction into a JMP (a tail call).
  2. Same as above, but with try…catch around the external call. This measures the real cost of the “zero cost” exception model.
  3. Same as above, but the external function throws an exception. The catch block swallows the exception and does nothing else. This measures the cost of exception handling itself.
extern int do_something();
extern int throw_something();

int func_empty()
{
    return do_something() + 1;
}

int func_trycatch()
{
    try {
        return do_something() + 1;
    }
    catch (...) {
        return -1;
    }
}

int func_throw()
{
    try {
        return throw_something() + 1;
    }
    catch (...) {
        return -1;
    }
}
#ifndef FUNCS_H
#define FUNCS_H

// Return types must match the definitions, which return int.
int func_empty();
int func_trycatch();
int func_throw();

#endif
#include <exception>
#include <benchmark/benchmark.h>
#include "funcs.h"

int do_something()
{
    return 22;
}

int throw_something()
{
    throw std::exception();
}

static void BM_func_empty(benchmark::State& state)
{
    for (auto _ : state) {
        func_empty();
    }
}

static void BM_func_trycatch(benchmark::State& state)
{
    for (auto _ : state) {
        func_trycatch();
    }
}

static void BM_func_throw(benchmark::State& state)
{
    for (auto _ : state) {
        func_throw();
    }
}

BENCHMARK(BM_func_empty);
BENCHMARK(BM_func_trycatch);
BENCHMARK(BM_func_throw);

BENCHMARK_MAIN();

System specifications: 4 X 3800 MHz CPUs, CPU Caches: L1 Data 32K (x4), L1 Instruction 32K (x4), L2 Unified 256K (x4), L3 Unified 6144K (x1)

Compilers tested: g++ 7.2.0, g++ 6.4.0, g++ 5.4.1 (all x86_64-linux-gnu); clang++ 5.0.0-3, clang++ 4.0.1-6, clang++ 3.9.1-17ubuntu1, clang++ 3.8.1-24ubuntu7 (all x86_64-pc-linux-gnu).

g++ 4.8.0 didn’t work; the application crashed: bm: malloc.c:2427: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

According to the <a href="https://gcc.godbolt.org/">Compiler Explorer</a>, all the g++ versions generated identical code, and so did all the clang++ versions, although the clang++ output differs slightly from the g++ output.

<pre class="lang:asm" title="g++">
func_empty():
        sub     rsp, 8
        call    do_something()
        add     rsp, 8
        add     eax, 1
        ret
func_trycatch():
        sub     rsp, 8
        call    do_something()
        add     eax, 1
.L4:
        add     rsp, 8
        ret
        mov     rdi, rax
        call    __cxa_begin_catch
        call    __cxa_end_catch
        or      eax, -1
        jmp     .L4
func_throw():
        sub     rsp, 8
        call    throw_something()
        add     eax, 1
.L9:
        add     rsp, 8
        ret
        mov     rdi, rax
        call    __cxa_begin_catch
        call    __cxa_end_catch
        or      eax, -1
        jmp     .L9
</pre>

<pre class="lang:asm" title="clang++">
func_empty():                           # @func_empty()
        push    rax
        call    do_something()
        inc     eax
        pop     rcx
        ret
func_trycatch():                        # @func_trycatch()
        push    rax
        call    do_something()
        mov     ecx, eax
        inc     ecx
.LBB1_3:
        mov     eax, ecx
        pop     rcx
        ret
        mov     rdi, rax
        call    __cxa_begin_catch
        call    __cxa_end_catch
        mov     ecx, -1
        jmp     .LBB1_3
func_throw():                           # @func_throw()
        push    rax
        call    throw_something()
        mov     ecx, eax
        inc     ecx
.LBB2_3:
        mov     eax, ecx
        pop     rcx
        ret
        mov     rdi, rax
        call    __cxa_begin_catch
        call    __cxa_end_catch
        mov     ecx, -1
        jmp     .LBB2_3
</pre>

<table class="benchmark">
  <tr> <th></th> <th>func_empty<br/>time, ns</th> <th>func_trycatch<br/>time, ns</th> <th>func_throw<br/>time, ns</th> </tr>
  <tr> <th>g++ 7.2</th> <td>2</td> <td>2</td> <td>1320</td> </tr>
  <tr> <th>g++ 6.4</th> <td>3</td> <td>2</td> <td>1288</td> </tr>
  <tr> <th>g++ 5.4.1</th> <td>3</td> <td>2</td> <td>1305</td> </tr>
  <tr> <th>clang++ 5</th> <td>2</td> <td>2</td> <td>1323</td> </tr>
  <tr> <th>clang++ 4.0.1</th> <td>2</td> <td>2</td> <td>1329</td> </tr>
  <tr> <th>clang++ 3.9.1</th> <td>2</td> <td>2</td> <td>1321</td> </tr>
  <tr> <th>clang++ 3.8.1</th> <td>2</td> <td>2</td> <td>1304</td> </tr>
</table>

Some g++ runs report func_empty() as slower than func_trycatch(). I find that hard to believe, since both functions execute the same instructions when no exception is thrown; only the catch handler, which is never reached here, differs. This is probably measurement error, and a difference of a nanosecond is insignificant.

The results show that:

  • in terms of execution time, the zero-cost exception model really is zero cost, as long as no exception is actually thrown;
  • when an exception is thrown and caught, the overhead is significant: about 1.3 microseconds versus 2 to 3 nanoseconds, a difference of several hundred times. In my opinion, this confirms that exceptions should be reserved for exceptional situations, at least when you care about performance.
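
One way to act on this in a hot path is to report expected failures through the return value and keep a throwing wrapper for the truly exceptional call sites. The sketch below is only an illustration under that assumption; `try_parse_number`, `parse_number` and `count_valid` are hypothetical names and not part of the benchmark.

```cpp
#include <cassert>    // for the check shown in the note below
#include <stdexcept>
#include <string>
#include <vector>

// Non-throwing variant: failure is an ordinary return value,
// so a bad input costs a branch rather than a stack unwind.
bool try_parse_number(const std::string& s, int& out) {
    // At most 9 digits, so the value always fits in a 32-bit int.
    if (s.empty() || s.size() > 9) return false;
    int value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}

// Throwing wrapper for call sites where bad input is a genuine error.
int parse_number(const std::string& s) {
    int value = 0;
    if (!try_parse_number(s, value)) {
        throw std::invalid_argument("not a decimal number: " + s);
    }
    return value;
}

// Hot-path caller: skips bad entries without ever throwing.
int count_valid(const std::vector<std::string>& inputs) {
    int valid = 0;
    for (const auto& s : inputs) {
        int ignored = 0;
        if (try_parse_number(s, ignored)) ++valid;
    }
    return valid;
}
```

For example, `assert(count_valid({"1", "x", "42"}) == 2);` passes, and the invalid entry never triggers the expensive throw path. Code that treats malformed input as a real error can still call `parse_number` and catch `std::invalid_argument`.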

Recommended reading: Top 15 C++ Exception handling mistakes and how to avoid them.

How Slow Are C++ Exceptions?