From time to time you have to deal with someone else’s code. And the code you have to deal with sometimes surprises you.

Take Pimple, for example: a small Dependency Injection Container for PHP whose recent versions, according to the README, are more focused on performance.

Performance… So much in this word…

Learning something new is always interesting and sometimes funny, and therefore I started to read the code.

The deeper I went, the more I got confused: if that is the code focused on performance, what kind of code do you usually write?

class Container implements \ArrayAccess

ArrayAccess is slow. Yes, it is convenient and arguably more readable, but it is very slow. It is even faster to call offsetGet/offsetSet/offsetExists directly than to go through the array syntax that ArrayAccess enables.
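A rough way to check this claim on your own PHP build is a micro-benchmark along the following lines. `Box` is a hypothetical class made up for this sketch, and the timings depend heavily on the PHP version and hardware, so treat them as an indication only.

```php
<?php

// Hypothetical minimal ArrayAccess implementation used only for timing.
final class Box implements ArrayAccess
{
    private $data = ['a' => 1];

    public function offsetExists($key): bool { return isset($this->data[$key]); }
    #[\ReturnTypeWillChange]
    public function offsetGet($key) { return $this->data[$key]; }
    public function offsetSet($key, $value): void { $this->data[$key] = $value; }
    public function offsetUnset($key): void { unset($this->data[$key]); }
}

$box = new Box();
$n = 1000000;

// Read through the array syntax provided by ArrayAccess.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $v = $box['a'];
}
$viaSyntax = microtime(true) - $start;

// Read by calling the method directly.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $v = $box->offsetGet('a');
}
$viaMethod = microtime(true) - $start;

printf("array syntax: %.3f s, direct call: %.3f s\n", $viaSyntax, $viaMethod);
```

Both loops read the same value; only the way the read is dispatched differs.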

$this->factories = new \SplObjectStorage();
$this->protected = new \SplObjectStorage();

It is a pretty interesting approach to use objects as keys (are you really sure you need this?), but it artificially introduces some limitations: SplObjectStorage accepts only objects, and therefore factories can be either closures or invokable objects (instances of classes with an __invoke() method), but not arrays (say, [$object, 'method'], which is a valid callable in PHP).
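The limitation can be illustrated without Pimple itself. In the sketch below, `Mailer` is a hypothetical class, and wrapping the array callable with `Closure::fromCallable()` (PHP 7.1 or later is assumed) is just one possible workaround, not something Pimple does for you.

```php
<?php

// Hypothetical service class used only for this illustration.
final class Mailer
{
    public function create()
    {
        return 'mailer';
    }
}

$mailer = new Mailer();
$arrayCallable = [$mailer, 'create'];

// A perfectly valid callable, but not an object,
// so it cannot serve as an SplObjectStorage key.
$isCallable = is_callable($arrayCallable);
$isObject = is_object($arrayCallable);

// Possible workaround: wrap the callable in a Closure, which is an object.
$closure = Closure::fromCallable($arrayCallable);

$storage = new SplObjectStorage();
$storage->offsetSet($closure);
$stored = $storage->offsetExists($closure);
$result = $closure();
```

Here `$isCallable` is true while `$isObject` is false; the wrapped closure can be stored and still returns 'mailer' when invoked.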


Oh well, oh well… In PHP, if you really want performance, you need to change your habits as to how you write the code. The reason is that the PHP interpreter in many ways is as dumb as a rock: it will do what you tell it to, and it won’t try to be smarter than you.

So, whenever you see $this->values[$id], you can be sure that PHP will really fetch the property called ‘values’ from $this, and then will look up a value by the key $id. Common subexpression elimination? No way. Honestly, it is probably not that easy a task anyway: if a property is an object implementing the ArrayAccess interface, you can in theory get a different value every time you call offsetGet; to make sure that this does not happen, the optimizer needs to know the name of the class and have that class available (which is not always the case thanks to autoloading).
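A small sketch of why the optimizer cannot blindly merge the two lookups: with an ArrayAccess object whose reads have side effects, two textually identical subexpressions produce different values. `Ticker` and `Holder` are hypothetical names invented for this example.

```php
<?php

// Hypothetical ArrayAccess implementation whose reads have side effects.
final class Ticker implements ArrayAccess
{
    private $n = 0;

    public function offsetExists($key): bool { return true; }
    #[\ReturnTypeWillChange]
    public function offsetGet($key) { return ++$this->n; }
    public function offsetSet($key, $value): void {}
    public function offsetUnset($key): void {}
}

final class Holder
{
    public $values;

    public function __construct()
    {
        $this->values = new Ticker();
    }

    public function sum($id)
    {
        // Two identical subexpressions, evaluated twice: 1 + 2.
        return $this->values[$id] + $this->values[$id];
    }
}

$holder = new Holder();
$sum = $holder->sum('x');
```

If the engine evaluated `$this->values[$id]` only once and reused the result, `$sum` would be 2 instead of 3, which would change the program's behavior.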

Below is what VLD shows for Pimple's Container::offsetGet() implementation (you can reproduce the result with php -d extension=vld.so -d vld.active=1 -d opcache.enable_cli=1 Container.php; I intentionally used opcache.enable_cli=1 to mimic the real production environment):

function name:  offsetGet
number of ops:  60
compiled vars:  !0 = $id, !1 = $raw, !2 = $val
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
  98     0  E >   RECV                                             !0      
 100     1        FETCH_OBJ_IS                                     $4      'keys'
         2        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, !0
         3      > JMPNZ                                                    ~3, ->8
 101     4    >   NEW                                              $3      :-4
         5        SEND_VAR_EX                                              !0
         6        DO_FCALL                                      0          
         7      > THROW                                         0          $3
 105     8    >   FETCH_OBJ_IS                                     $4      'raw'
         9        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, !0
        10      > JMPNZ                                                    ~3, ->27
 106    11    >   FETCH_OBJ_R                                      $4      'values'
        12        FETCH_DIM_R                                      $5      $4, !0
        13        TYPE_CHECK                                    8  ~4      $5
        14      > JMPZ                                                     ~4, ->27
 107    15    >   FETCH_OBJ_R                                      $4      'values'
        16        FETCH_DIM_R                                      $5      $4, !0
        17        FETCH_OBJ_IS                                     $4      'protected'
        18        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, $5
        19      > JMPNZ                                                    ~3, ->27
 108    20    >   INIT_FCALL                                               'method_exists'
        21        FETCH_OBJ_R                                      $4      'values'
        22        FETCH_DIM_R                                      $3      $4, !0
        23        SEND_VAR                                                 $3
        24        SEND_VAL                                                 '__invoke'
        25        DO_ICALL                                         $3      
        26      > JMPNZ                                                    $3, ->30
 110    27    >   FETCH_OBJ_R                                      $4      'values'
        28        FETCH_DIM_R                                      $3      $4, !0
        29      > RETURN                                                   $3
 113    30    >   FETCH_OBJ_R                                      $3      'values'
        31        FETCH_DIM_R                                      $5      $3, !0
        32        FETCH_OBJ_IS                                     $4      'factories'
        33        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, $5
        34      > JMPZ                                                     ~3, ->42
 114    35    >   FETCH_OBJ_R                                      $4      'values'
        36        FETCH_DIM_R                                      $3      $4, !0
        37        INIT_DYNAMIC_CALL                                        $3
        38        FETCH_THIS                                       $3      
        39        SEND_VAR_EX                                              $3
        40        DO_FCALL                                      0  $3      
        41      > RETURN                                                   $3
 117    42    >   FETCH_OBJ_R                                      $4      'values'
        43        FETCH_DIM_R                                      $3      $4, !0
        44        QM_ASSIGN                                        !1      $3
 118    45        INIT_DYNAMIC_CALL                                        !1
        46        FETCH_THIS                                       $3      
        47        SEND_VAR_EX                                              $3
        48        DO_FCALL                                      0  $4      
        49        FETCH_OBJ_W                                      $5      'values'
        50        ASSIGN_DIM                                       $3      $5, !0
        51        OP_DATA                                                  $4
        52        QM_ASSIGN                                        !2      $3
 119    53        FETCH_OBJ_W                                      $3      'raw'
        54        ASSIGN_DIM                                               $3, !0
        55        OP_DATA                                                  !1
 121    56        FETCH_OBJ_W                                      $3      'frozen'
        57        ASSIGN_DIM                                               $3, !0
        58        OP_DATA                                                  
 123    59      > RETURN                                                   !2

For the record, the “unoptimized” version (with OpCache disabled) had 67 operations.

Oplines 8 to 29 are responsible for the second if statement. I will decode them to explain what happens:

  • FETCH_OBJ_IS $4 'raw' silently (that is, without complaining if the property does not exist) fetches the property $this->raw into the variable $4.
  • ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0 checks whether $4[!0] is set (that is, isset($this->raw[$id])) and stores the result in the temporary variable ~3.
  • JMPNZ ~3, ->27 transfers control to the 27th opline if ~3 is not zero.
  • FETCH_OBJ_R $4 'values' fetches $this->values into $4.
  • FETCH_DIM_R $5 $4, !0 reads $4[!0] (that is, $this->values[$id]) into $5.
  • TYPE_CHECK 8 ~4 $5 checks whether $5 is of type 8 (IS_OBJECT) and stores the result in ~4.
  • JMPZ ~4, ->27 transfers control to the 27th opline if ~4 is zero (that is, the value is not an object).
  • Oplines 15 and 16 are the same as 11 and 12 because Zend OpCache does not eliminate common subexpressions.
  • FETCH_OBJ_IS $4 'protected' silently fetches $this->protected into $4.
  • ISSET_ISEMPTY_DIM_OBJ ~3 $4, $5 checks whether $4[$5] is set (isset($this->protected[$this->values[$id]])) and stores the result in ~3.
  • JMPNZ ~3, ->27 transfers control to the 27th opline if ~3 is not zero.
  • INIT_FCALL 'method_exists' prepares the function call info and the function call info cache for the method_exists() function (INIT_FCALL is roughly equivalent to the zend_fcall_info_init() Zend API).
  • Oplines 21 and 22… they look so familiar, I bet we have already seen them somewhere!
  • Oplines 23 and 24 send the parameters to method_exists(), and DO_ICALL invokes method_exists() and stores the return value in $3. JMPNZ transfers control to the 30th opline if $3 is not zero. And I cannot get rid of the annoying feeling that I have already seen oplines 27 and 28.
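To make the repetition in these oplines concrete, here is a small sketch that counts how many times `$this->values[$id]` is evaluated on the factory path. The condition is copied from the diff shown later in this post; `CountingStore`, `MiniContainer`, `getRepeated()` and `getHoisted()` are hypothetical names invented for the illustration and are not part of Pimple.

```php
<?php

// Hypothetical array-like store that counts reads; not part of Pimple.
final class CountingStore implements ArrayAccess
{
    public $reads = 0;
    private $data = [];

    public function offsetExists($key): bool { return isset($this->data[$key]); }
    #[\ReturnTypeWillChange]
    public function offsetGet($key) { $this->reads++; return $this->data[$key]; }
    public function offsetSet($key, $value): void { $this->data[$key] = $value; }
    public function offsetUnset($key): void { unset($this->data[$key]); }
}

// Hypothetical, stripped-down container covering only the factory path.
final class MiniContainer
{
    public $values;
    private $raw = [];
    private $factories;
    private $protected;

    public function __construct()
    {
        $this->values = new CountingStore();
        $this->factories = new SplObjectStorage();
        $this->protected = new SplObjectStorage();
    }

    public function factory(Closure $callable): Closure
    {
        $this->factories->offsetSet($callable);

        return $callable;
    }

    // Same shape as the original condition: values[$id] is re-read each time.
    public function getRepeated($id)
    {
        if (
            isset($this->raw[$id])
            || !\is_object($this->values[$id])
            || isset($this->protected[$this->values[$id]])
            || !\method_exists($this->values[$id], '__invoke')
        ) {
            return $this->values[$id];
        }

        if (isset($this->factories[$this->values[$id]])) {
            return $this->values[$id]($this);
        }

        return null; // shared-service resolution is left out of this sketch
    }

    // Same logic with the lookup hoisted into a local variable.
    public function getHoisted($id)
    {
        $raw = $this->values[$id];

        if (
            isset($this->raw[$id])
            || !\is_object($raw)
            || isset($this->protected[$raw])
            || !\method_exists($raw, '__invoke')
        ) {
            return $raw;
        }

        if (isset($this->factories[$raw])) {
            return $raw($this);
        }

        return null; // shared-service resolution is left out of this sketch
    }
}

$c = new MiniContainer();
$c->values['factory'] = $c->factory(function () {
    return 1;
});

$c->getRepeated('factory');
$repeatedReads = $c->values->reads; // reads in the original-style code

$c->values->reads = 0;
$c->getHoisted('factory');
$hoistedReads = $c->values->reads;  // reads after hoisting the lookup
```

For a factory entry, the original-style method reads the store five times (is_object, the protected check, method_exists, the factories check, and the call itself), while the hoisted variant reads it once.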

I have written a small PhpBench benchmark to see how fast Pimple is:

<?php

/**
 * @Revs(1000000)
 * @Iterations(5)
 * @OutputMode("throughput")
 * @OutputTimeUnit("seconds", precision=3)
 * @Groups({"Container"})
 */
class PimpleContainerBench
{
    private $x;

    public function __construct()
    {
        $this->x = new Pimple\Container();
        $this->x['factory'] = $this->x->factory(function() {
            return 1;
        });

        $this->x['shared'] = function() {
            return 2;
        };

        $x = $this->x['shared']; // Resolve
    }

    /**
     * @Subject
     */
    public function getShared()
    {
        $this->x['shared'];
    }

    /**
     * @Subject
     */
    public function getFactory()
    {
        $this->x['factory'];
    }

}

On average, getShared performed 7,916,916 operations per second; getFactory was slower and performed 2,067,365 operations per second.

Now let us try to optimize the container.

The most obvious optimization is to eliminate the common subexpression:

@@ -101,20 +101,21 @@
             throw new UnknownIdentifierException($id);
         }

+        $raw = $this->values[$id];
+
         if (
             isset($this->raw[$id])
-            || !\is_object($this->values[$id])
-            || isset($this->protected[$this->values[$id]])
-            || !\method_exists($this->values[$id], '__invoke')
+            || !\is_object($raw)
+            || isset($this->protected[$raw])
+            || !\method_exists($raw, '__invoke')
         ) {
-            return $this->values[$id];
+            return $raw;
         }

-        if (isset($this->factories[$this->values[$id]])) {
-            return $this->values[$id]($this);
+        if (isset($this->factories[$raw])) {
+            return $raw($this);
         }

-        $raw = $this->values[$id];
         $val = $this->values[$id] = $raw($this);
         $this->raw[$id] = $raw;

However, this is not the best solution, and in fact, it will be slower for shared services than the original code.

line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
…
 104     8    >   FETCH_OBJ_R                                      $4      'values'
         9        FETCH_DIM_R                                      $3      $4, !0
        10        QM_ASSIGN                                        !1      $3
 107    11        FETCH_OBJ_IS                                     $4      'raw'
        12        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, !0
        13      > JMPNZ                                                    ~3, ->24
 108    14    >   TYPE_CHECK                                    8  ~4      !1
        15      > JMPZ                                                     ~4, ->24
 109    16    >   FETCH_OBJ_IS                                     $4      'protected'
        17        ISSET_ISEMPTY_DIM_OBJ                       33554432  ~3      $4, !1
        18      > JMPNZ                                                    ~3, ->24
 110    19    >   INIT_FCALL                                               'method_exists'
        20        SEND_VAR                                                 !1
        21        SEND_VAL                                                 '__invoke'
        22        DO_ICALL                                         $3      
        23      > JMPNZ                                                    $3, ->25
 112    24    > > RETURN                                                   !1

For a shared service, the condition isset($this->raw[$id]) will be true, so the path will be:

FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
QM_ASSIGN !1 $3
FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPNZ ~3, 24
RETURN !1

In the original code the path was:

FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPNZ ~3, 27
FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
RETURN $3

That is, 6 vs 7 instructions. Does one instruction make a difference? You bet!

The getShared benchmark now shows 7,761,829 operations per second, which is roughly 150,000 ops/sec fewer 🙂 That’s the power of one instruction.

The proper optimization would be:

@@ -101,20 +101,24 @@
             throw new UnknownIdentifierException($id);
         }

+        if (isset($this->raw[$id])) {
+            return $this->values[$id];
+        }
+
+        $raw = $this->values[$id];
+
         if (
-            isset($this->raw[$id])
-            || !\is_object($this->values[$id])
-            || isset($this->protected[$this->values[$id]])
-            || !\method_exists($this->values[$id], '__invoke')
+            !\is_object($raw)
+            || isset($this->protected[$raw])
+            || !\method_exists($raw, '__invoke')
         ) {
-            return $this->values[$id];
+            return $raw;
         }

-        if (isset($this->factories[$this->values[$id]])) {
-            return $this->values[$id]($this);
+        if (isset($this->factories[$raw])) {
+            return $raw($this);
         }

-        $raw = $this->values[$id];
         $val = $this->values[$id] = $raw($this);
         $this->raw[$id] = $raw;

This change gives us 7,993,924 (vs 7,916,916) ops/s for getShared and 2,310,604 (vs 2,067,365) ops/s for getFactory.

getShared path is now

FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPZ ~3, ->14 /* branch NOT taken */
FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
RETURN $3

We have the same 6 opcodes, so the speed should not really differ for getShared (77 kops/sec is probably a measurement error), but the difference between the two getFactory benchmarks is ~250 kops/sec.
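For reference, here is a self-contained sketch of how the method might look with the second diff applied. `TinyContainer` is a hypothetical stand-in that keeps only the parts of Pimple visible in this post; `get()` and `set()` stand in for offsetGet() and offsetSet(), a plain `InvalidArgumentException` replaces Pimple's UnknownIdentifierException, and the `frozen` assignment is inferred from the opcode dump above rather than from the diff.

```php
<?php

// Hypothetical stand-in for the container; not Pimple's actual class.
final class TinyContainer
{
    private $values = [];
    private $keys = [];
    private $raw = [];
    private $frozen = [];
    private $factories;
    private $protected;

    public function __construct()
    {
        $this->factories = new SplObjectStorage();
        $this->protected = new SplObjectStorage();
    }

    public function set($id, $value)
    {
        $this->values[$id] = $value;
        $this->keys[$id] = true;
    }

    public function factory(Closure $callable): Closure
    {
        $this->factories->offsetSet($callable);

        return $callable;
    }

    public function get($id)
    {
        if (!isset($this->keys[$id])) {
            throw new InvalidArgumentException("Unknown identifier: $id");
        }

        // Fast path: an already resolved shared service.
        if (isset($this->raw[$id])) {
            return $this->values[$id];
        }

        // Single lookup, reused by every check below.
        $raw = $this->values[$id];

        if (
            !\is_object($raw)
            || isset($this->protected[$raw])
            || !\method_exists($raw, '__invoke')
        ) {
            return $raw;
        }

        if (isset($this->factories[$raw])) {
            return $raw($this);
        }

        $val = $this->values[$id] = $raw($this);
        $this->raw[$id] = $raw;
        $this->frozen[$id] = true; // inferred from the opcode dump

        return $val;
    }
}

$calls = 0;
$c = new TinyContainer();
$c->set('shared', function () use (&$calls) {
    $calls++;

    return new stdClass();
});
$c->set('factory', $c->factory(function () {
    return new stdClass();
}));

$sameShared = $c->get('shared') === $c->get('shared');
$freshFactory = $c->get('factory') !== $c->get('factory');
```

The shared definition is invoked once and its result is reused, while the factory produces a new object on every lookup, which matches the behavior the benchmark subjects rely on.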

Frankly speaking, offsetGet is not the only place that can be optimized. For example, I would get rid of the Container::$keys property, which would save some space and code; I would implement __construct() differently (yes, I do realize it is called once, but the code smells); finally, I probably would not use SplObjectStorage. However, offsetGet is the most used method, and therefore it makes sense to optimize it first.

The moral of this story is that if you really care about performance, don’t delegate this to the optimizer: do it yourself.

PS: if this version of Pimple is more focused on performance, I don’t think I want to see slower ones.

Performance in PHP: Pimple Container