Changes of Zend API in PHP 7.3 you should be aware of

After an unsuccessful attempt to compile my extension against the latest PHP, I discovered that you can no longer update GC_REFCOUNT() directly. In zend_types.h, the macros for the GC refcount are now defined as follows:

#define GC_REFCOUNT(p) zend_gc_refcount(&(p)->gc)
#define GC_SET_REFCOUNT(p, rc) zend_gc_set_refcount(&(p)->gc, rc)
#define GC_ADDREF(p) zend_gc_addref(&(p)->gc)
#define GC_DELREF(p) zend_gc_delref(&(p)->gc)

Meanwhile, in PHP 7.2 and older versions:

#define GC_REFCOUNT(p) (p)->gc.refcount

Here’s a simple workaround for compatibility.

#if PHP_VERSION_ID < 70300
/* PHP 7.2 and older only expose GC_REFCOUNT() as an lvalue, so emulate the 7.3 accessors. */
#define GC_ADDREF(p) (++GC_REFCOUNT(p))
#define GC_DELREF(p) (--GC_REFCOUNT(p))
#define GC_SET_REFCOUNT(p, rc) (GC_REFCOUNT(p) = (rc))
#endif

This change to the internal API was intended to eliminate race conditions in multi-threaded applications, as mentioned in this pull request.
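To see the shim at work without the PHP source tree, here is a minimal sketch. It assumes a PHP 7.2-style GC_REFCOUNT() and uses made-up stand-ins (mock_refcounted_h, mock_object, hold_and_release) in place of the real Zend types; a real extension would get PHP_VERSION_ID and GC_REFCOUNT() from php.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the PHP 7.2 headers (assumption: not the real
// Zend definitions), so that this sketch builds on its own.
#define PHP_VERSION_ID 70200
struct mock_refcounted_h { uint32_t refcount; };
struct mock_object { mock_refcounted_h gc; };
#define GC_REFCOUNT(p) (p)->gc.refcount

// The compatibility shim: only defined when the engine lacks the accessors.
#if PHP_VERSION_ID < 70300
#define GC_ADDREF(p) (++GC_REFCOUNT(p))
#define GC_DELREF(p) (--GC_REFCOUNT(p))
#define GC_SET_REFCOUNT(p, rc) (GC_REFCOUNT(p) = (rc))
#endif

// Extension code written once against the 7.3-style accessors.
inline uint32_t hold_and_release(mock_object *obj)
{
    assert(obj != nullptr);
    GC_SET_REFCOUNT(obj, 1);   // fresh object with a single owner
    GC_ADDREF(obj);            // a second owner appears
    GC_DELREF(obj);            // the second owner goes away
    return GC_REFCOUNT(obj);   // back to the single owner
}
```

With the shim in place, the same extension source should compile against both old and new headers, since it never writes to GC_REFCOUNT() directly.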

Other notable API changes can be found here, with which you can make your extension compatible with PHP 7.3.

Fast ZPP's Incompatibility with C++

PHP 7.0 introduced a new Zend API for faster parameter parsing, known as Fast ZPP.

For example, if your function accepts an integer parameter foo, then the code may look like this.

ZEND_PARSE_PARAMETERS_START(1, 1)
    Z_PARAM_LONG(foo)
ZEND_PARSE_PARAMETERS_END();

However, if your extension is written in C++, the compiler will complain and refuse to compile.

You’ll get an error message like:

error: invalid conversion from 'int' to 'zend_expected_type {aka _zend_expected_type}' [-fpermissive]

Confused? Let’s take a look at the macro definition, which is located in zend_API.h:

1
2
3
4
5
6
7
8
9
10
11
#define ZEND_PARSE_PARAMETERS_START_EX(flags, min_num_args, max_num_args) do { \
    const int _flags = (flags); \
    int _min_num_args = (min_num_args); \
    int _max_num_args = (max_num_args); \
    int _num_args = EX_NUM_ARGS(); \
    int _i; \
    zval *_real_arg, *_arg = NULL; \
    zend_expected_type _expected_type = IS_UNDEF; \
    char *_error = NULL; \
    zend_bool _dummy; \
    // Some more code...

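On line 8 (the zend_expected_type _expected_type = IS_UNDEF; declaration), Zend initializes a variable of the enum type zend_expected_type with IS_UNDEF, that is, with the plain integer 0. C allows this implicit int-to-enum conversion, but C++ forbids it. In C++, you must either cast explicitly with static_cast or initialize with an enumerator of that type.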
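The rule can be reproduced outside of PHP. The sketch below uses a made-up enum (mock_expected_type) and a made-up constant (MOCK_IS_UNDEF) as stand-ins for zend_expected_type and IS_UNDEF; the rejected initialization is left as a comment so that the rest compiles.

```cpp
#include <cassert>

// Hypothetical stand-in for zend_expected_type (the real enum lives in zend_API.h).
enum mock_expected_type {
    MOCK_EXPECTED_LONG,   // first enumerator, so its value is 0
    MOCK_EXPECTED_BOOL
};

// Plain integer constant, playing the role of IS_UNDEF.
#define MOCK_IS_UNDEF 0

// Option 1: keep the integer constant, but convert it explicitly.
inline mock_expected_type init_with_cast()
{
    // mock_expected_type t = MOCK_IS_UNDEF;   // valid C, rejected by a C++ compiler
    mock_expected_type t = static_cast<mock_expected_type>(MOCK_IS_UNDEF);
    return t;
}

// Option 2: initialize with an enumerator that has the same value.
inline mock_expected_type init_with_enumerator()
{
    mock_expected_type t = MOCK_EXPECTED_LONG;
    return t;
}
```

Both functions yield the same value; the second form is what the IS_UNDEF redefinition trick effectively produces after macro expansion.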

Fortunately, the value 0 comes from the macro IS_UNDEF (why this??), so you can just redefine it temporarily instead of patching zend_API.h with sed from your config.m4 script.

Now your code may look like this.

#undef IS_UNDEF
#define IS_UNDEF Z_EXPECTED_LONG // Which is zero
ZEND_PARSE_PARAMETERS_START(1, 1)
    Z_PARAM_LONG(foo)
ZEND_PARSE_PARAMETERS_END();
#undef IS_UNDEF
#define IS_UNDEF 0

Ugly, but your code will compile. Cheers :)

P.S. The latest PHP 7.2 still has this problem. Perhaps I should report this issue to the PHP internals guys.

PHP-CPP bug which causes memory leak

TL;DR

Recently I was working with PHP-CPP (2.0.0 release) for my projects. While running a test, I accidentally discovered (with debug_zval_dump()) that when an object goes out of scope, its refcount is not decremented, which causes a memory leak.

This is not normal behavior for a regular PHP object. However, this object is declared and instantiated in my C++ code instead of user space, so Zend Engine cannot garbage-collect it automatically. There must be something wrong with PHP-CPP.

Wrapping C++ objects in PHP

As we know, an object whose class extends Php::Base can be wrapped into a Php::Value using the constructor Value::Value(const Base *object). If the object is instantiated in C++, you first have to make it accessible to Zend Engine by calling the constructor Object::Object(const char *name, Base *base) at least once. Make sure the object is instantiated with new, and once it is wrapped, never attempt to delete it yourself. (Similarly, never hold the object in a std::shared_ptr.) Otherwise you’ll get a segmentation fault.

PHP-CPP handles garbage collection of C++ objects wrapped in a Php::Value automatically: the Php::Base* pointer is stored in a std::unique_ptr inside PHP-CPP’s Php::ObjectImpl object, and the latter is destroyed once the refcount drops to zero.

Problem cause discovered

And here’s the problem: why didn’t the refcount decrease when the Php::Value was destroyed? Let’s take a peek at the destructor of Php::Value:

/**
 *  Destructor
 */
Value::~Value()
{
    // reduce the refcount - if necessary
    Z_TRY_DELREF_P(_val);
}

And constructor for wrapping Php::Base*:

/**
 *  Wrap around an object
 *  @param  object
 */
Value::Value(const Base *object)
{
    // there are two options: the object was constructed from user space,
    // and is already linked to a handle, or it was constructed from C++
    // space, and no handle does yet exist. But if it was constructed from
    // C++ space and not yet wrapped, this Value constructor should not be
    // called directly, but first via the derived Php::Object class.
    auto *impl = object->implementation();

    // do we have a handle?
    if (!impl) throw FatalError("Assigning an unassigned object to a variable");

    // set it to an object
    Z_TYPE_INFO_P(_val) = IS_OBJECT;
    Z_OBJ_P(_val) = impl->php();

    // increase refcount
    GC_REFCOUNT(impl->php())++;
}

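It seems that we have found the cause of the problem. The constructor bumps the counter directly with GC_REFCOUNT() on the zend_object, while the destructor goes through Z_TRY_DELREF_P(), which only decrements when the zval is flagged as refcounted. Because the constructor assigns the bare IS_OBJECT to Z_TYPE_INFO_P() (rather than IS_OBJECT_EX, which carries the refcounted flag), that decrement is skipped. Thus, the refcount is incremented every time you wrap the object with Value::Value(const Base *object), but never goes down when the wrapping Value is destroyed. Hence the problem.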
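One way to picture how an increment can go unmatched: in the Zend headers, Z_TRY_DELREF_P() checks the refcounted type flag of the zval before it touches the counter. The following is a deliberately simplified model of that behavior, not the Zend or PHP-CPP code; every name in it (mock_object, mock_zval, raw_addref, try_delref, wrap_unwrap) is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model: the names only mirror the idea of the Zend macros.
struct mock_object { uint32_t refcount; };
struct mock_zval {
    mock_object *obj;
    bool refcounted_flag;   // stands in for the refcounted type flag of a zval
};

// Like GC_REFCOUNT(obj)++: touches the counter directly and ignores the flag.
inline void raw_addref(mock_zval &z) { ++z.obj->refcount; }

// Like Z_TRY_DELREF_P(): does nothing unless the zval is flagged as refcounted.
inline void try_delref(mock_zval &z)
{
    if (z.refcounted_flag) --z.obj->refcount;
}

// Wrap and unwrap the same object `times` times, starting from a refcount
// of 1, and return the refcount that is left afterwards.
inline uint32_t wrap_unwrap(bool flag_set, int times)
{
    mock_object obj{1};
    for (int i = 0; i < times; ++i) {
        mock_zval z{&obj, flag_set};   // "constructor": build the wrapper
        raw_addref(z);                 // ...and take a reference
        try_delref(z);                 // "destructor": try to give it back
    }
    return obj.refcount;
}
```

In this model, wrapping three times with the flag unset leaves the counter at 4 instead of 1, which is the kind of drift that debug_zval_dump() reveals.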

Problem solved

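Changing GC_REFCOUNT(impl->php())++ into Z_ADDREF_P(_val) was the fix I had in mind. Note, however, that the destructor’s Z_TRY_DELREF_P() only takes effect if the zval is also marked as refcounted, for example by initializing it with ZVAL_OBJ() instead of assigning the bare IS_OBJECT. Not long after I started writing this post, I discovered that the master branch of PHP-CPP’s GitHub repository has already fixed this bug. So it’s highly recommended that you use the master branch of PHP-CPP (it currently works fine with my projects) instead of the 2.0.0 release. See here for commits since the 2.0.0 release.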

NEVER use Chinese as system language on your Linux devices

During my attempt to execute a shell script provided by the manufacturer, I got a couple of annoying error messages.

Going through the script several times, it seemed that there was nothing wrong with it. But it suddenly dawned on me that I was using Chinese as the system language on my CentOS 7.

Then I found this. fdisk localizes its output, so with a Chinese locale the line may no longer contain the English word Disk. In that case grep Disk matches nothing and SIZE ends up empty.

SIZE=`fdisk -l $DRIVE | grep Disk | awk '{print $5}'`
CYLINDERS=`echo $SIZE/255/63/512 | bc`

Gotcha..

The script worked after I changed the system language to English. Perhaps we should never use Chinese (or any language besides English) as the system language. :)

P.S. In fact, if you really want to keep another system language, you can switch only the locale with something like export LC_ALL="en_US.UTF-8" before running the script.

Plans For 20172

1. List of plans

  • Learn Java and Spring Boot.
  • Maintain the TJUBBS backend, including bug fixes and minor new features.
  • Finish implementing all functionalities in php-asio and make a stable release.
  • Continue learning C# and Unity3D. Finish the VR project.
  • Continue learning data mining.

2. Optional

  • Write a client of YAP, and test the sanity of the demo server.
  • Rewrite RARBG crawler using amphp/aerys and amphp/artax for better performance.
  • Write an asynchronous MySQL client for Workerman.

Concurrency Programming in PHP

1. When is concurrency needed?

Bad examples

  • Attempting to fetch content from a remote server.
$handle = curl_init('http://www.an-extremely-slow-website.com/');
// This may take several seconds.
curl_exec($handle);
  • Performing a slow MySQL query.
$pdo = new PDO($dsn, $username, $password);
// There are millions of rows in table `history`, and the query cannot use an index.
$stmt = $pdo->prepare('SELECT * FROM `history` WHERE `file_name` NOT LIKE "%.jpg"');
$stmt->execute();
  • Network transmission is slow.
// The file is about 2 MiB in total.
while ($buffer = fread($handle, 8192)) {
    // Unfortunately, the client can only receive around 100 KiB per second.
    fwrite($socket, $buffer, 8192);
}
fclose($handle);
  • Deferring is needed.
while ($buffer = fread($socket, 8192)) {
    // Write the chunk that was just read.
    fwrite($handle, $buffer);
    // We want to restrict transmission speed by sleeping 100 milliseconds every 8 KiB.
    usleep(100000);
}

What do they have in common?

  1. Something slow needs to be done.
  2. The process blocks on I/O and does nothing while it waits.

Probable solutions

  1. Spawn a thread/process for each request.
  2. Handle requests asynchronously within one thread.

2. How to implement concurrency in PHP?

Event-based model

Event-driven libraries in PHP

  1. libevent
  2. libev
  3. libuv

Frameworks based on event-driven libraries

  1. ReactPHP
  2. Amp
  3. Workerman
  4. Swoole

Features and advantages

  1. Asynchronous, non-blocking I/O for web services, the filesystem, database connections, etc.
  2. High concurrency within a single thread, with no cost for forking processes or spawning threads.

3. The Amp Framework

Basic usage

  • Event watchers
Watcher Type    Callback Signature
defer()         function (string $watcherId, $callbackData)
delay()         function (string $watcherId, $callbackData)
repeat()        function (string $watcherId, $callbackData)
onReadable()    function (string $watcherId, $stream, $callbackData)
onWritable()    function (string $watcherId, $stream, $callbackData)
onSignal()      function (string $watcherId, $signal, $callbackData)
  • Controlling event watchers
Method          Behaviour
run()           Start the event loop with all watchers active.
stop()          Terminate the event loop and continue execution at the next line after run().
enable()        Put a disabled watcher back into the event loop.
disable()       Temporarily remove a watcher from the event loop.
reference()     Mark a watcher as referenced.
unreference()   Mark a watcher as unreferenced.
cancel()        Destroy a watcher.
  • Example
use Amp\Loop;

// Passing a callback to Loop::run() is the same as calling Loop::defer() with it
// right before calling Loop::run() without a callback.
Loop::run(function (string $id) {
    echo "Event loop started. Watcher id is $id.\n";
    // Called right after this tick.
    $id = Loop::defer(function (string $id, $param) {
        echo "Watcher id is $id. Watcher id of last tick is $param.\n";
    }, $id);
    echo "Watcher id of next tick is $id.\n";
    $count = 0;
    // Loop::repeat() callbacks are called after every specified interval (in milliseconds).
    Loop::repeat(1000, function (string $id) use (&$count) {
        ++$count;
        echo "Timer callback is called for $count time(s).\n";
        if ($count == 7) {
            // Loop::delay() is similar to Loop::repeat(),
            // except that the former is destroyed right after its tick.
            Loop::delay(2500, function () use ($id) {
                // Loop::cancel() removes a specified watcher from the event loop.
                Loop::cancel($id);
            });
        }
    });
    pcntl_signal(SIGINT, SIG_IGN);
    // Called whenever the specified signal is sent to the process.
    $id = Loop::onSignal(SIGINT, function () {
        echo "SIGINT received. Exiting event loop.\n";
        // When Loop::stop() is called,
        // the event loop will stop right after the current tick.
        Loop::stop();
    });
    // All watchers are referenced by default.
    // Unreferenced watchers won't keep the event loop alive.
    Loop::unreference($id);
});
// When there are no referenced watchers left, the event loop exits automatically.
echo "Terminated.\n";

Promises

  • What are Promises?

  1. Asynchronous functions should return an instance of a class which implements Amp\Promise.
  2. Promises are created by an instance of Amp\Deferred, which either resolves the Promise with a value or fails it with an exception when an error occurs.
  3. Unlike the Promises implemented in JavaScript, ReactPHP, etc., Amp Promises are not chained with then(); their results are consumed with onResolve() or, more commonly, by yielding them in coroutines.
  • Example
use Amp\Loop;

function asyncDivide($dividend, $divisor, $delay) {
    // Promises are created by Amp\Deferred.
    $deferred = new Amp\Deferred;
    Loop::delay($delay, function () use ($dividend, $divisor, $deferred) {
        $dividend = intval($dividend);
        $divisor = intval($divisor);
        if (!$divisor) {
            // Reject and emit an error.
            $deferred->fail(new DivisionByZeroError('Divided by zero'));
        } else {
            // Resolve a result.
            $deferred->resolve($dividend / $divisor);
        }
    });
    // The async function shall return a Promise.
    return $deferred->promise();
}

Loop::run(function () {
    // Call a function asynchronously.
    $promise = asyncDivide(4, 5, 1000);
    // The following callback is invoked when the Promise is resolved or rejected.
    $promise->onResolve(function (?Throwable $error, $result) {
        if ($error) {
            echo $error->getMessage(), PHP_EOL;
        } else {
            echo 'Result is ', $result, PHP_EOL;
        }
    });
});
  • Promise Combinators (in namespace Amp\Promise) combine multiple Promises into a single Promise.
Function    Behaviour
all()       Resolve when all Promises in the group resolve.
some()      Resolve when at least one Promise resolves.
any()       Resolve even when all Promises fail.
first()     Resolve when the first Promise in the group resolves.
  • Promise Helpers (in namespace Amp\Promise)
Function    Behaviour
rethrow()   Forward errors generated by the given Promise to the event loop.
timeout()   Throw an exception if the given Promise fails to resolve or reject within the given time.
wait()      Synchronously wait for a Promise to resolve.

Coroutines

  • What are coroutines?

  • In Amp, every value yielded by a coroutine must be one of the following types.
Type                             Description
Amp\Promise                      Control is returned to the coroutine once the Promise resolves.
React\Promise\PromiseInterface   Will be adapted to Amp\Promise.
array                            An array of Promises is combined implicitly with Amp\Promise\all().
  • Coroutine helpers (in Amp namespace)
Function                                         Behaviour
coroutine(callable $callback) : callable         Wrap a function into a coroutine.
asyncCoroutine(callable $callback) : callable    Like coroutine(), but the wrapped function does not return a Promise when called.
call(callable $callback, ...$args) : Promise     Call the given function, and return a Promise.
asyncCall(callable $callback, ...$args) : void   Like call(), but does not return a Promise.
  • Examples:
// Variant 1: Amp\coroutine() wraps the generator function into a callable which returns a Promise.
function asyncDivide($dividend, $divisor, $delay) {
    return \Amp\coroutine(function () use ($dividend, $divisor, $delay) {
        yield new Amp\Delayed($delay);
        return $dividend / $divisor;
    });
}
Amp\Loop::run(function () {
    // asyncDivide() returns a callable here, hence the extra ().
    $value = yield asyncDivide(3, 4, 500)();
    var_dump($value);
});

// Variant 2: Amp\call() calls the generator function right away and returns a Promise.
function asyncDivide($dividend, $divisor, $delay) {
    return Amp\call(function () use ($dividend, $divisor, $delay) {
        yield new Amp\Delayed($delay);
        return $dividend / $divisor;
    });
}
Amp\Loop::run(function () {
    $value = yield asyncDivide(3, 4, 500);
    var_dump($value);
});

Iterators

  • In Amp, an iterator iterates through a set of Promises and resolves along with them. It can be thought of as a “special” Promise which can be resolved multiple times.
  • Iterators are created by Amp\Emitter.
  • Iterator functions are listed below.
Method                    Behaviour
Iterator::getCurrent()    Return the value at the current position; call it only after advance() has resolved to true.
Iterator::advance()       Return a Promise which indicates whether there’s a value to consume.
Emitter::emit()           Emit a new value to the Iterator.
Emitter::complete()       Mark the Iterator as complete; no further values will be emitted.
Emitter::iterate()        Return the Iterator instance.
Iterator\fromIterable()   Convert an array or Traversable object into an Iterator.
  • Examples:
use Amp\Loop;

function subtractToZero($init, $interval) {
    $value = $init;
    // Iterators are created by Amp\Emitter.
    $emitter = new Amp\Emitter;
    Loop::repeat($interval, function ($id) use ($emitter, &$value) {
        if ($value > 0) {
            $emitter->emit(--$value);
        } else {
            $emitter->complete();
            // Cancel timer event when complete.
            Loop::cancel($id);
        }
    });
    // Return the iterator.
    return $emitter->iterate();
}

Loop::run(function () {
    $iterator = subtractToZero(10, 100);
    while (yield $iterator->advance()) {
        var_dump($iterator->getCurrent());
    }
});
  • A Producer is a simplified form of Emitter which can be used when all values can be emitted in a single coroutine.
Amp\Loop::run(function () {
    $iterator = new \Amp\Producer(function (callable $emit) {
        static $i = 0;
        while (++$i < 10) {
            yield new \Amp\Delayed(200);
            // $emit() returns a Promise, so yield it before emitting the next value.
            yield $emit($i);
        }
    });
    while (yield $iterator->advance()) {
        var_dump($iterator->getCurrent());
    }
});
  • Iterator combination functions combine an array of Iterators into a single Iterator.
Function            Behaviour
Iterator\concat()   Iterators are resolved one by one.
Iterator\merge()    Iterators are resolved simultaneously.
  • Iterator transformation functions intervene in the resolution of Iterators using a Producer.
Function            Behaviour
Iterator\map()      Transform the resolved value into another value.
Iterator\filter()   The resolved value is omitted if the filter callback returns false.
// Both snippets reuse subtractToZero() from the Emitter example.
Loop::run(function () {
    $iterator = \Amp\Iterator\map(subtractToZero(10, 200), function ($value) {
        return "Current value is $value.\n";
    });
    while (yield $iterator->advance()) {
        echo $iterator->getCurrent();
    }
});

Loop::run(function () {
    $iterator = \Amp\Iterator\filter(subtractToZero(10, 200), function ($value) {
        return $value != 3;
    });
    while (yield $iterator->advance()) {
        var_dump($iterator->getCurrent());
    }
});

Cancellation

  • Amp lets you request cancellation of a specific asynchronous operation, but it does not and cannot handle cancellation automatically. Instead, the operation itself has to check for the request and react to it.
  • Cancellation is implemented using Amp\CancellationTokenSource and Amp\CancellationToken.
Method                                   Behaviour
CancellationTokenSource::getToken()      Returns a unique CancellationToken instance.
CancellationTokenSource::cancel()        Emits a cancellation request to its CancellationToken.
CancellationToken::isRequested()         Returns whether there is a cancellation request.
CancellationToken::throwIfRequested()    Throws CancelledException if a cancellation request exists.
CancellationToken::subscribe()           Registers a callback to be invoked when the request occurs.
CancellationToken::unsubscribe()         Disables a specified callback by id.
  • Examples:
use Amp\Loop;

function subtractToZero($init, $interval, $token = null) {
    $value = $init;
    $emitter = new Amp\Emitter;
    Loop::repeat($interval, function ($id) use ($emitter, &$value, $token) {
        // Cancellation requests are detected with the isRequested() method.
        if ($value > 0 && (!isset($token) || !$token->isRequested())) {
            $emitter->emit(--$value);
        } else {
            $emitter->complete();
            Loop::cancel($id);
        }
    });
    return $emitter->iterate();
}

Loop::run(function () {
    $token_source = new \Amp\CancellationTokenSource;
    $iterator = subtractToZero(10, 200, $token_source->getToken());
    Loop::delay(1500, function () use ($token_source) {
        // Cancel this operation 1500 milliseconds after current tick.
        $token_source->cancel();
    });
    while (yield $iterator->advance()) {
        var_dump($iterator->getCurrent());
    }
});
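// Alternative body for subtractToZero(): react to the request through subscribe()
// instead of polling isRequested() on every tick.
$watcher = Loop::repeat($interval, function ($id) use ($emitter, &$value, $token, &$subscription) {
    if ($value > 0) {
        $emitter->emit(--$value);
    } else {
        $emitter->complete();
        Loop::cancel($id);
        // Finished normally, so the cancellation callback is no longer needed.
        $token->unsubscribe($subscription);
    }
});
// Subscribe once, outside the timer callback. The subscribed callback
// is invoked as soon as cancellation is requested.
$subscription = $token->subscribe(function () use ($watcher, $emitter) {
    Loop::cancel($watcher);
    // Complete the iterator so that consumers waiting on advance() can finish.
    $emitter->complete();
});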