PHP extensions are usually written in C. C is a great and powerful language, but it demands a lot of attention from the programmer because memory management is manual. If you forget to destroy a resource once you are done with it, you have a memory leak; if you copy a zval and forget to increment its reference counter, chances are you will get a nice crash.

This is where C++ can help, thanks to the RAII technique. But just like any tool, C++ must be used properly.

Memory allocation

Zend Engine uses its own memory allocation functions. They allow Zend Engine to recover from memory leaks (which happen either because of genuine bugs, or when the code did not have a chance to recover from a fatal error). Consider the following situation:

We allocate a 1 MB buffer, call a userland function that we do not control and that may suddenly terminate the script, and then dispose of the allocated buffer. If that userland function never returns, we have a memory leak. Zend Engine will be unable to fix it automatically because we did not use Zend memory allocation functions.
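To make the scenario concrete, here is a minimal, self-contained sketch. It does not use the Zend API: `bailout_target`, `call_userland_stub()` and `run_request()` are names I made up for illustration, and the abrupt termination of the script is modelled with a plain `longjmp()`.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// Hypothetical stand-ins, not Zend API.
static std::jmp_buf bailout_target;   // where "the engine" resumes after a bailout
static int live_buffers = 0;          // malloc()'ed buffers that were never freed

// Models a userland callback that calls die()/exit() or hits a fatal error.
static void call_userland_stub()
{
    std::longjmp(bailout_target, 1);
}

static void do_work()
{
    void* buf = std::malloc(1024 * 1024);  // 1 MB that Zend Engine knows nothing about
    if (buf == nullptr) {
        return;
    }
    ++live_buffers;

    call_userland_stub();                  // control never comes back here

    std::free(buf);                        // never reached
    --live_buffers;
}

// Runs one simulated request and returns the number of leaked buffers.
static int run_request()
{
    live_buffers = 0;
    if (setjmp(bailout_target) == 0) {
        do_work();
    }
    return live_buffers;
}
```

In this model `run_request()` reports one leaked buffer, because the `free()` call sits after a function that never returns.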

If you are familiar with C++, you may wonder why we didn’t use something like std::unique_ptr:

The answer is that it does not really matter whether we use smart pointers here. When I said that the userland function may not return, I meant it literally: stack unwinding will not happen, the destructor will never run, and we still have a memory leak. I will return to this issue a bit later.

In C, we would use something like this:

All memory allocated with emalloc() or ecalloc() is automatically freed by Zend Engine when the request finishes, and therefore there will be no memory leak (debug PHP builds do report memory leaks but that happens only when the script terminates normally, that is, without calls to die()/exit() or uncaught exceptions or fatal errors).
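The idea of request-scoped memory can be illustrated with a toy model. This is only my sketch of the concept, not how the Zend memory manager is actually implemented; `RequestHeap` and its methods are hypothetical names, with `alloc()` and `release()` standing in for emalloc() and efree().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy model only: remembers every live block so that "request shutdown"
// can release whatever the code forgot (or had no chance) to free.
class RequestHeap {
public:
    // Stand-in for emalloc().
    void* alloc(std::size_t size)
    {
        void* p = std::malloc(size);
        if (p != nullptr) {
            blocks_.push_back(p);
        }
        return p;
    }

    // Stand-in for efree(); pointers not owned by this heap are ignored.
    void release(void* p)
    {
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            if (blocks_[i] == p) {
                blocks_[i] = blocks_.back();
                blocks_.pop_back();
                std::free(p);
                return;
            }
        }
    }

    // Frees everything that is still alive; returns the number of reclaimed blocks.
    std::size_t shutdown()
    {
        const std::size_t reclaimed = blocks_.size();
        for (void* p : blocks_) {
            std::free(p);
        }
        blocks_.clear();
        return reclaimed;
    }

    std::size_t live() const { return blocks_.size(); }

private:
    std::vector<void*> blocks_;
};
```

If a block is allocated and the code that should release it never runs, `shutdown()` still reclaims it, which is roughly what happens to emalloc()'ed memory at the end of a request.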

OK, for memory allocations we can use emalloc() and ecalloc() but what about std::string, std::vector etc?

The good news is that the STL supports custom allocators, and therefore we can easily write an emalloc()-based allocator that plays nicely with Zend Engine:

Now, if we need a string that uses our allocator, we can use something like this:
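Below is a minimal sketch of such an allocator together with string and vector aliases built on it. So that it compiles without PHP headers, it calls `emalloc_stub()` and `efree_stub()`, which are stand-ins I invented; in a real extension they would be replaced with emalloc() and efree() from Zend/zend_alloc.h. The names `EAllocator`, `estring` and `evector` are also just illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Stand-ins (assumptions): replace with emalloc()/efree() in a real extension.
static std::size_t live_allocations = 0;

static void* emalloc_stub(std::size_t size)
{
    void* p = std::malloc(size);
    if (p != nullptr) {
        ++live_allocations;
    }
    return p;
}

static void efree_stub(void* p)
{
    if (p != nullptr) {
        --live_allocations;
        std::free(p);
    }
}

// Minimal C++11 allocator; std::allocator_traits supplies everything else.
template <typename T>
struct EAllocator {
    using value_type = T;

    EAllocator() noexcept = default;

    template <typename U>
    EAllocator(const EAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) {
            throw std::bad_alloc();
        }
        void* p = emalloc_stub(n * sizeof(T));
        if (p == nullptr) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { efree_stub(p); }
};

// The allocator is stateless, so all instances are interchangeable.
template <typename T, typename U>
bool operator==(const EAllocator<T>&, const EAllocator<U>&) noexcept { return true; }

template <typename T, typename U>
bool operator!=(const EAllocator<T>&, const EAllocator<U>&) noexcept { return false; }

using estring = std::basic_string<char, std::char_traits<char>, EAllocator<char>>;

template <typename T>
using evector = std::vector<T, EAllocator<T>>;
```

Keep in mind that `estring` and `std::string` are different types, so converting between them means copying the characters.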

Control Flow

Earlier I mentioned that stack unwinding does not always happen when the script terminates abnormally. Now I will try to explain why this happens and what we can do about it.

Consider the following PHP function:

What will happen when the interpreter executes this function?

die() translates to the EXIT opcode, which is handled this way:

That is, in the end, the handler calls the zend_bailout() function.

zend_bailout() is a macro defined as _zend_bailout(__FILE__, __LINE__), which is

LONGJMP is another macro that expands either to longjmp() or something compatible.

What does this mean for C++?

No destructors for automatic objects are called. If replacing of std::longjmp with throw and setjmp with catch would execute a non-trivial destructor for any automatic object, the behavior of such std::longjmp is undefined.

What is the proper way to handle such situations?

The trick here is to intercept the error, clean up all resources we may have allocated, and rethrow the error.

Resources must be allocated before we do SETJMP and freed before we call _zend_bailout().
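The pattern can be sketched without PHP headers as follows. Everything here is a stand-in that I made up for illustration: `current_bailout` plays the role of EG(bailout), `bailout_stub()` the role of _zend_bailout(), and `run_request()` the role of the engine that catches the bailout at the top level.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// Hypothetical stand-ins, not Zend API.
static std::jmp_buf* current_bailout = nullptr;  // models EG(bailout)
static int freed_buffers = 0;

// Models _zend_bailout(): jump to whoever is currently listening.
[[noreturn]] static void bailout_stub()
{
    assert(current_bailout != nullptr);
    std::longjmp(*current_bailout, 1);
}

// Models a userland callback that may terminate the script.
static void call_userland_stub(bool should_die)
{
    if (should_die) {
        bailout_stub();
    }
}

static void guarded_call(bool should_die)
{
    // Allocate BEFORE setjmp so the pointer is not modified between
    // setjmp and longjmp and is still valid when we jump back.
    void* buf = std::malloc(1024);
    std::jmp_buf* const outer = current_bailout;  // remember the outer handler
    std::jmp_buf local;
    current_bailout = &local;

    if (setjmp(local) == 0) {
        call_userland_stub(should_die);
        current_bailout = outer;   // normal path: restore and clean up
        std::free(buf);
        ++freed_buffers;
    } else {
        current_bailout = outer;   // restore the outer handler first
        std::free(buf);            // clean up our resources
        ++freed_buffers;
        bailout_stub();            // then "rethrow" to the outer handler
    }
}

// Plays the role of the engine; returns true if the request bailed out.
static bool run_request(bool should_die)
{
    std::jmp_buf top;
    current_bailout = &top;
    freed_buffers = 0;
    if (setjmp(top) == 0) {
        guarded_call(should_die);
        current_bailout = nullptr;
        return false;
    }
    current_bailout = nullptr;
    return true;
}
```

Note that `guarded_call()` deliberately holds only a raw pointer: an automatic object with a non-trivial destructor in that frame would bring back the undefined behaviour described above.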

Restoration of EG(bailout) can be automated with a simple class, but this is left as an exercise to the reader.

Mix C and C++ headers

PHP/Zend makes extensive use of inline functions, and those functions are not always enclosed in extern "C". Because of name mangling, the internal name of the same function may differ between C and C++. This is usually not an issue, until the compiler refuses to inline the function and insists on calling it by name. I have faced this several times with PHP 7.0 (maybe things got better with 7.1 or 7.2, I don’t know), and therefore I strongly recommend wrapping the #include directives for Zend/PHP headers in an extern "C" block.

C++ exceptions

C++ exceptions should not leak into PHP, as this will crash the interpreter. This is especially critical for multi-threaded SAPIs.

If an exception has to be passed to the userland, it should be converted into a PHP exception:
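One way to do this is to run the C++ code through a small wrapper that catches everything at the boundary. The sketch below is self-contained and therefore uses `throw_php_exception_stub()`, a stand-in I invented; in a real extension that call would be something like zend_throw_exception(zend_ce_exception, message, 0) from Zend/zend_exceptions.h. The name `call_guarded()` is likewise only illustrative.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Stand-in (assumption): records the message instead of raising a PHP exception.
static std::string last_php_exception;

static void throw_php_exception_stub(const char* message)
{
    last_php_exception = message;
}

// Runs `body` and converts any C++ exception into a (stand-in) PHP exception.
// Returns true if `body` completed without throwing.
template <typename F>
bool call_guarded(F&& body)
{
    try {
        body();
        return true;
    } catch (const std::exception& e) {
        throw_php_exception_stub(e.what());
    } catch (...) {
        // Anything not derived from std::exception has no message to forward.
        throw_php_exception_stub("Unknown C++ exception");
    }
    return false;
}
```

The body of every function exposed to PHP can then be passed to `call_guarded()` as a lambda, so that no C++ exception crosses into the interpreter.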

Link In the Standard C++ Library

By default, PHP’s build system uses a C compiler and does not link in the standard C++ library.

To tell it you need C++, you need to tweak your extension’s config.m4 file.

For example, if it looks this way for C:

You will need to add PHP_REQUIRE_CXX() and pass true as the cxx argument of PHP_NEW_EXTENSION (the sixth one, which is typically the very last argument you pass):

PHP Extensions and C++