From time to time, you have to deal with someone else’s code. And the code you have to deal with sometimes surprises you.
Take Pimple, for example: a small Dependency Injection Container for PHP whose recent versions, according to the README, are more focused on performance.
Performance… So much in this word…
Learning something new is always interesting and sometimes funny, and therefore, I started to read the code.
The deeper I went, the more confused I got: if that is the code focused on performance, what kind of code do you usually write?
class Container implements \ArrayAccess
ArrayAccess is slow. Yes, it is convenient and offers better readability, but it is very slow. It is even faster to call offsetGet/offsetSet/offsetExists directly than to go through the array syntax ($container['id']).
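If you want to check this claim on your own PHP build, a rough timing sketch like the one below may help. It assumes PHP 8.0 or later; the Box class and the timeIt() helper are hypothetical names made up for this comparison, and the absolute numbers will depend on the PHP version and on whether OpCache is enabled.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal ArrayAccess implementation, used only for this comparison.
final class Box implements ArrayAccess
{
    private array $items = [];

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->items[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->items[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        $this->items[$offset] = $value;
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->items[$offset]);
    }
}

// Hypothetical helper: returns the wall-clock seconds taken by $fn.
function timeIt(callable $fn): float
{
    $start = hrtime(true);
    $fn();

    return (hrtime(true) - $start) / 1e9;
}

$box = new Box();
$box['answer'] = 42;
$n = 200000;

// Read through the array syntax.
$viaSyntax = timeIt(function () use ($box, $n): void {
    for ($i = 0; $i < $n; ++$i) {
        $v = $box['answer'];
    }
});

// Read through a direct method call.
$viaCall = timeIt(function () use ($box, $n): void {
    for ($i = 0; $i < $n; ++$i) {
        $v = $box->offsetGet('answer');
    }
});

printf("array syntax: %.4f s\ndirect call:  %.4f s\n", $viaSyntax, $viaCall);
```

This is a crude loop, not a replacement for a proper benchmark tool; treat the output as a hint only.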
$this->factories = new \SplObjectStorage();
$this->protected = new \SplObjectStorage();
It is a pretty interesting approach to use objects as keys (are you really sure you need this?), but it artificially introduces some limitations: SplObjectStorage accepts only objects, and therefore factories can be either closures or invocable objects (classes with __invoke() method), but not arrays (say, [$object, 'method'], which is a valid callable in PHP).
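To make the limitation concrete, here is a small sketch (assuming PHP 7.4 or later). The Mailer class is a hypothetical stand-in for any service; the point is only that an array callable is not an object, so it cannot serve as an SplObjectStorage key unless it is first wrapped in a Closure.

```php
<?php
declare(strict_types=1);

// Hypothetical service class, used only to build an array-style callable.
final class Mailer
{
    public function create(): string
    {
        return 'mailer instance';
    }
}

$arrayCallable = [new Mailer(), 'create'];

// A perfectly valid callable, but not an object.
$isCallable = is_callable($arrayCallable); // true
$isObject = is_object($arrayCallable);     // false

// Possible workaround: wrap the array callable in a Closure, which IS an object.
$closure = Closure::fromCallable($arrayCallable);

$factories = new SplObjectStorage();
$factories[$closure] = true; // same as $factories->offsetSet($closure, true)

$isStored = isset($factories[$closure]); // true
$result = $closure();                    // 'mailer instance'
```

The wrapper costs an extra object per callable, which is exactly the kind of overhead an array keyed by service identifier would avoid.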
public function offsetGet($id)
{
    if (!isset($this->keys[$id])) {
        throw new UnknownIdentifierException($id);
    }

    if (
        isset($this->raw[$id])
        || !\is_object($this->values[$id])
        || isset($this->protected[$this->values[$id]])
        || !\method_exists($this->values[$id], '__invoke')
    ) {
        return $this->values[$id];
    }

    if (isset($this->factories[$this->values[$id]])) {
        return $this->values[$id]($this);
    }

    $raw = $this->values[$id];
    $val = $this->values[$id] = $raw($this);
    $this->raw[$id] = $raw;
    $this->frozen[$id] = true;

    return $val;
}
Oh well, oh well… In PHP, if you really want performance, you need to change your coding habits. The reason is that the PHP interpreter in many ways is as dumb as a rock: it will do what you tell it to, and it won’t try to be smarter than you.
So, whenever you see $this->values[$id], you can be sure that PHP will really fetch the property called values from $this, and then will look up a value by the key $id. Common subexpression elimination? No way. And it is probably not that easy a task anyway: if a property is an object implementing the ArrayAccess interface, you can have different values every time you call offsetGet; to make sure this does not happen, the optimizer needs to know the class name and have that class available (which is not always the case thanks to autoloading).
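The ArrayAccess argument can be illustrated with a contrived sketch (assuming PHP 8.0 or later). CountingMap and Holder are hypothetical classes written for this example: the same expression, $this->values[$id], yields different results on consecutive evaluations when the property holds an object with a side-effecting offsetGet, so merging the two reads would change the program's behavior.

```php
<?php
declare(strict_types=1);

// Hypothetical ArrayAccess implementation whose reads have a side effect.
final class CountingMap implements ArrayAccess
{
    private int $reads = 0;

    public function offsetExists(mixed $offset): bool
    {
        return true;
    }

    public function offsetGet(mixed $offset): mixed
    {
        // Every read returns a different value.
        return ++$this->reads;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
    }

    public function offsetUnset(mixed $offset): void
    {
    }
}

// Hypothetical holder: $values may be a plain array or an ArrayAccess object.
final class Holder
{
    public $values;

    public function __construct($values)
    {
        $this->values = $values;
    }

    public function readTwice(string $id): array
    {
        // Two textually identical subexpressions, evaluated separately.
        return [$this->values[$id], $this->values[$id]];
    }
}

$plain = (new Holder(['a' => 1]))->readTwice('a');         // [1, 1]
$tricky = (new Holder(new CountingMap()))->readTwice('a'); // [1, 2]
```

At compile time, readTwice() looks the same in both cases; only the runtime type of the property decides whether the two reads are interchangeable.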
Below is what VLD shows for the above offsetGet implementation. You can reproduce the result with php -d extension=vld.so -d vld.active=1 -d opcache.enable_cli=1 Container.php; I intentionally used opcache.enable_cli=1 to mimic a real production environment.
function name: offsetGet
number of ops: 60
compiled vars: !0 = $id, !1 = $raw, !2 = $val
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
98 0 E > RECV !0
100 1 FETCH_OBJ_IS $4 'keys'
2 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, !0
3 > JMPNZ ~3, ->8
101 4 > NEW $3 :-4
5 SEND_VAR_EX !0
6 DO_FCALL 0
7 > THROW 0 $3
105 8 > FETCH_OBJ_IS $4 'raw'
9 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, !0
10 > JMPNZ ~3, ->27
106 11 > FETCH_OBJ_R $4 'values'
12 FETCH_DIM_R $5 $4, !0
13 TYPE_CHECK 8 ~4 $5
14 > JMPZ ~4, ->27
107 15 > FETCH_OBJ_R $4 'values'
16 FETCH_DIM_R $5 $4, !0
17 FETCH_OBJ_IS $4 'protected'
18 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, $5
19 > JMPNZ ~3, ->27
108 20 > INIT_FCALL 'method_exists'
21 FETCH_OBJ_R $4 'values'
22 FETCH_DIM_R $3 $4, !0
23 SEND_VAR $3
24 SEND_VAL '__invoke'
25 DO_ICALL $3
26 > JMPNZ $3, ->30
110 27 > FETCH_OBJ_R $4 'values'
28 FETCH_DIM_R $3 $4, !0
29 > RETURN $3
113 30 > FETCH_OBJ_R $3 'values'
31 FETCH_DIM_R $5 $3, !0
32 FETCH_OBJ_IS $4 'factories'
33 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, $5
34 > JMPZ ~3, ->42
114 35 > FETCH_OBJ_R $4 'values'
36 FETCH_DIM_R $3 $4, !0
37 INIT_DYNAMIC_CALL $3
38 FETCH_THIS $3
39 SEND_VAR_EX $3
40 DO_FCALL 0 $3
41 > RETURN $3
117 42 > FETCH_OBJ_R $4 'values'
43 FETCH_DIM_R $3 $4, !0
44 QM_ASSIGN !1 $3
118 45 INIT_DYNAMIC_CALL !1
46 FETCH_THIS $3
47 SEND_VAR_EX $3
48 DO_FCALL 0 $4
49 FETCH_OBJ_W $5 'values'
50 ASSIGN_DIM $3 $5, !0
51 OP_DATA $4
52 QM_ASSIGN !2 $3
119 53 FETCH_OBJ_W $3 'raw'
54 ASSIGN_DIM $3, !0
55 OP_DATA !1
121 56 FETCH_OBJ_W $3 'frozen'
57 ASSIGN_DIM $3, !0
58 OP_DATA
123 59 > RETURN !2
For the record, the “unoptimized” (with OpCache disabled) version had 67 operations.
Oplines 8 to 29 are responsible for the second if statement; I will decode them to explain what happens:
- FETCH_OBJ_IS $4 'raw' silently (that is, without complaining if the property does not exist) fetches the property $this->raw into the variable $4.
- ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0 checks whether $4[!0] (that is, $this->raw[$id]) is set and stores the result in the temporary variable ~3.
- JMPNZ ~3, ->27 transfers control to the 27th opline if ~3 is not zero.
- FETCH_OBJ_R $4 'values' fetches $this->values into $4.
- FETCH_DIM_R $5 $4, !0 reads $4[!0] (that is, $this->values[$id]) into $5.
- TYPE_CHECK 8 ~4 $5 checks whether $5 is of type 8 (IS_OBJECT) and stores the result into ~4.
- JMPZ ~4, ->27 transfers control to the 27th opline if ~4 is zero (that is, the value is not an object).
- Oplines 15 and 16 are the same as 11 and 12 because Zend OpCache does not eliminate common subexpressions.
- FETCH_OBJ_IS $4 'protected' silently fetches $this->protected into $4.
- ISSET_ISEMPTY_DIM_OBJ ~3 $4, $5 checks if $4[$5] is set (isset($this->protected[$this->values[$id]])) and stores the result to ~3.
- JMPNZ ~3, ->27 transfers control to the 27th opline if ~3 is not zero.
- INIT_FCALL 'method_exists' prepares function call info and function call info cache for the method_exists() function (INIT_FCALL is roughly equivalent to the zend_fcall_info_init() Zend API).
- Oplines 21 and 22… they look so familiar, I bet we have already seen them somewhere!
- Oplines 23 and 24 send parameters to method_exists, and DO_ICALL invokes method_exists() and stores the return value to $3.
- JMPNZ transfers control to the 30th opline if $3 is not zero. And I cannot get rid of the annoying feeling that I have already seen oplines 27 and 28.
I wrote a small PhpBench benchmark to see how fast Pimple is:
<?php
/**
 * @Revs(1000000)
 * @Iterations(5)
 * @OutputMode("throughput")
 * @OutputTimeUnit("seconds", precision=3)
 * @Groups({"Container"})
 */
class PimpleContainerBench
{
    private $x;

    public function __construct()
    {
        $this->x = new Pimple\Container();
        $this->x['factory'] = $this->x->factory(function() {
            return 1;
        });
        $this->x['shared'] = function() {
            return 2;
        };
        $x = $this->x['shared']; // Resolve
    }

    /**
     * @Subject
     */
    public function getShared()
    {
        $this->x['shared'];
    }

    /**
     * @Subject
     */
    public function getFactory()
    {
        $this->x['factory'];
    }
}
On average, getShared performed 7,916,916 operations per second. getFactory was slower, performing 2,067,365 operations per second.
Now, let us try to optimize the container.
The most obvious optimization is to eliminate common subexpressions:
@@ -101,20 +101,21 @@
throw new UnknownIdentifierException($id);
}
+ $raw = $this->values[$id];
+
if (
isset($this->raw[$id])
- || !\is_object($this->values[$id])
- || isset($this->protected[$this->values[$id]])
- || !\method_exists($this->values[$id], '__invoke')
+ || !\is_object($raw)
+ || isset($this->protected[$raw])
+ || !\method_exists($raw, '__invoke')
) {
- return $this->values[$id];
+ return $raw;
}
- if (isset($this->factories[$this->values[$id]])) {
- return $this->values[$id]($this);
+ if (isset($this->factories[$raw])) {
+ return $raw($this);
}
- $raw = $this->values[$id];
$val = $this->values[$id] = $raw($this);
$this->raw[$id] = $raw;
However, this is not the best solution, and in fact, it will be slower for shared services than the original code.
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
…
104 8 > FETCH_OBJ_R $4 'values'
9 FETCH_DIM_R $3 $4, !0
10 QM_ASSIGN !1 $3
107 11 FETCH_OBJ_IS $4 'raw'
12 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, !0
13 > JMPNZ ~3, ->24
108 14 > TYPE_CHECK 8 ~4 !1
15 > JMPZ ~4, ->24
109 16 > FETCH_OBJ_IS $4 'protected'
17 ISSET_ISEMPTY_DIM_OBJ 33554432 ~3 $4, !1
18 > JMPNZ ~3, ->24
110 19 > INIT_FCALL 'method_exists'
20 SEND_VAR !1
21 SEND_VAL '__invoke'
22 DO_ICALL $3
23 > JMPNZ $3, ->25
112 24 > > RETURN !1
For a shared service, the condition isset($this->raw[$id]) will be true, so the path will be:
FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
QM_ASSIGN !1 $3
FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPNZ ~3, 24
RETURN !1
In the original code, the path was:
FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPNZ ~3, 27
FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
RETURN $3
That is, seven instructions instead of six. Does one instruction make a difference? You bet!
The getShared benchmark now shows 7,761,829 operations per second, which is roughly 150,000 ops/sec fewer than the original 🙂 That’s the power of one instruction.
The proper optimization will be
@@ -101,20 +101,24 @@
throw new UnknownIdentifierException($id);
}
+ if (isset($this->raw[$id])) {
+ return $this->values[$id];
+ }
+
+ $raw = $this->values[$id];
+
if (
- isset($this->raw[$id])
- || !\is_object($this->values[$id])
- || isset($this->protected[$this->values[$id]])
- || !\method_exists($this->values[$id], '__invoke')
+ !\is_object($raw)
+ || isset($this->protected[$raw])
+ || !\method_exists($raw, '__invoke')
) {
- return $this->values[$id];
+ return $raw;
}
- if (isset($this->factories[$this->values[$id]])) {
- return $this->values[$id]($this);
+ if (isset($this->factories[$raw])) {
+ return $raw($this);
}
- $raw = $this->values[$id];
$val = $this->values[$id] = $raw($this);
$this->raw[$id] = $raw;
This change gives us 7,993,924 (vs 7,916,916) ops/s for getShared and 2,310,604 (vs 2,067,365) ops/s for getFactory.
The getShared path is now
FETCH_OBJ_IS $4 'raw'
ISSET_ISEMPTY_DIM_OBJ ~3 $4, !0
JMPZ ~3, ->14 /* branch NOT taken */
FETCH_OBJ_R $4 'values'
FETCH_DIM_R $3 $4, !0
RETURN $3
We have the same six opcodes, so the speed should not really differ for getShared (77 kops/sec is probably a measurement error); the difference between the two getFactory benchmarks is ~250 kops/sec.
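For readers who want to experiment with the reordered checks outside of Pimple, here is a self-contained sketch (assuming PHP 8.0 or later). TinyContainer is a hypothetical, stripped-down class and not Pimple's actual code: it has no protected services, no $keys or $frozen bookkeeping, and it throws a generic InvalidArgumentException. It only illustrates the two ideas from the diff: return resolved shared services before doing anything else, and read $this->values[$id] into a local variable once.

```php
<?php
declare(strict_types=1);

// Hypothetical stripped-down container, NOT Pimple's actual class.
final class TinyContainer implements ArrayAccess
{
    private array $values = [];
    private array $raw = [];
    private SplObjectStorage $factories;

    public function __construct()
    {
        $this->factories = new SplObjectStorage();
    }

    // Marks a closure as a factory: it runs on every read.
    public function factory(Closure $callable): Closure
    {
        $this->factories[$callable] = true;

        return $callable;
    }

    public function offsetExists(mixed $id): bool
    {
        return array_key_exists($id, $this->values);
    }

    public function offsetSet(mixed $id, mixed $value): void
    {
        $this->values[$id] = $value;
    }

    public function offsetUnset(mixed $id): void
    {
        unset($this->values[$id], $this->raw[$id]);
    }

    public function offsetGet(mixed $id): mixed
    {
        if (!array_key_exists($id, $this->values)) {
            throw new InvalidArgumentException("Unknown identifier: $id");
        }

        // Fast path: an already resolved shared service.
        if (isset($this->raw[$id])) {
            return $this->values[$id];
        }

        // Fetch the property and the array slot once, then reuse the local.
        $raw = $this->values[$id];

        if (!$raw instanceof Closure) {
            return $raw; // plain parameter
        }

        if (isset($this->factories[$raw])) {
            return $raw($this); // factory: new result on every read
        }

        // Shared service: resolve once and remember the original definition.
        $val = $this->values[$id] = $raw($this);
        $this->raw[$id] = $raw;

        return $val;
    }
}

$c = new TinyContainer();
$c['shared'] = function (): stdClass { return new stdClass(); };
$c['factory'] = $c->factory(function (): stdClass { return new stdClass(); });

$sameShared = $c['shared'] === $c['shared'];    // true: resolved once
$sameFactory = $c['factory'] === $c['factory']; // false: new object per read
```

Restricting services to Closure instances keeps the sketch short; Pimple itself also accepts other invokable objects.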
offsetGet() is not the only place that can be optimized; for example, I would:
- Get rid of the Container::$keys property to save some space and code.
- Implement __construct() differently (yes, I do realize it is called once, but the code smells).
- Avoid SplObjectStorage.
However, offsetGet is the most used method, and therefore it makes sense to optimize it first.
The moral of this story is that if you really care about performance, don’t delegate this to the optimizer; do it yourself.
PS: If this version of Pimple is more performance-focused, I don’t want to see slower ones.