Writing PHP Extensions

9. PHP Arrays

PHP arrays are complex data structures. They may represent an ordered map from integer and string keys to any PHP values (zvals). Internally, a PHP array is implemented as an adaptive data structure that may change its internal representation and behavior at run time, depending on the stored data. For example, if a script stores elements in an array with sorted, closely spaced numeric indexes (e.g. [0=>1, 1=>2, 3=>3]), the array is represented as a plain array. We will call such arrays packed. Elements of packed arrays are accessed by offset, at nearly the same speed as elements of a C array. Once a PHP array gets a new element with a string (or “bad” numeric) key (e.g. [0=>1, 1=>3, 3=>3, "ops"=>4]), it’s automatically converted to a real hash table with collision resolution.

The following examples explain how keys are logically organized in PHP:

    $a = [1, 2, 3];               // packed array
    $a = [0=>1, 1=>2, 3=>3];      // packed array with a "hole"
    $a = [0=>1, 2=>3, 1=>2];      // hash table (because of ordering) without collisions
    $a = [0=>1, 1=>2, 256=>3];    // hash table (because of density) with collisions
    $a = [0=>1, 1=>2, "x"=>3];    // hash table (because of string keys)

Values are always stored as an ordered plain array. They may simply be iterated from top to bottom or in the reverse direction. In fact, this is an array of Buckets with embedded zvals and some additional information.
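The transitions in the examples above can be modelled with a few lines of standalone C. This is only a sketch: the toy_array type and the toy_...() functions are hypothetical names invented for illustration, and the initial size of 8 and the growth thresholds are assumptions loosely based on the PHP 7 engine sources, which may differ between PHP versions. The model also assumes every inserted key is new, so a key below the used range always means filling a hole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: tracks only what is needed to decide "packed" vs. "hash". */
typedef struct {
    uint32_t table_size;   /* allocated slots (assumed to start at 8) */
    uint32_t num_used;     /* slots consumed so far, holes included */
    uint32_t num_elements; /* live elements */
    int      packed;       /* 1 while the array can stay a plain array */
} toy_array;

void toy_init(toy_array *a)
{
    a->table_size = 8;
    a->num_used = 0;
    a->num_elements = 0;
    a->packed = 1;
}

/* Adds a NEW integer key; returns 1 if the array is still packed. */
int toy_add_index(toy_array *a, uint32_t key)
{
    if (a->packed) {
        if (key < a->num_used) {
            /* Filling a hole would break insertion order. */
            a->packed = 0;
        } else if (key < a->table_size) {
            /* Fits into the already allocated plain array. */
        } else if ((key >> 1) < a->table_size &&
                   (a->table_size >> 1) < a->num_elements) {
            /* Dense enough: double the plain array and stay packed. */
            a->table_size *= 2;
        } else {
            /* Too sparse: switch to a real hash table. */
            a->packed = 0;
        }
        if (a->packed) {
            a->num_used = key + 1;
        }
    }
    a->num_elements++;
    return a->packed;
}

/* Any string key forces the hash representation. */
int toy_add_string_key(toy_array *a)
{
    a->packed = 0;
    a->num_elements++;
    return a->packed;
}

/* Replays a sequence of integer keys and reports the final state. */
int toy_stays_packed(const uint32_t *keys, size_t n)
{
    toy_array a;
    size_t i;

    assert(keys != NULL || n == 0);
    toy_init(&a);
    for (i = 0; i < n; i++) {
        toy_add_index(&a, keys[i]);
    }
    return a.packed;
}
```

Calling toy_stays_packed() with the key sequences 0, 1, 3 and 0, 2, 1 and 0, 1, 256 reproduces the classification given in the comments of the PHP examples: the first stays packed, the other two become hash tables.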
In packed arrays, the value index is the same as the numeric key, and the offset is calculated as key * sizeof(Bucket). Hash tables use an additional array of indexes (the “Hash” part). It maps the value of the hash function, calculated for a numeric or string key, to a value index. Several array keys may collide when they have the same hash function value. Such collisions are resolved through linked lists of elements with the same hash value.

INTERNAL PHP ARRAY REPRESENTATION

Now, let’s look into the internal PHP array representation. The value field of a “zval” with the IS_ARRAY type keeps a pointer to a “zend_array” structure. It’s “inherited” from “zend_refcounted”, which defines the format of the first 64-bit word with the reference counter.

Other fields are specific to zend_array or HashTable. The most important one is “arData”, which is a pointer to a dependent data structure. Actually, these are two data structures allocated as a single memory block. The “Hash” part (described above) sits immediately before the address pointed to by “arData”, at lower addresses. The “Ordered Values” part starts at that address.

The “Hash” part is an array of 32-bit Bucket offsets, indexed by hash value, that grows downward from “arData” (it is addressed with negative indexes). This part may be absent for packed arrays, and in this case the Buckets are accessed directly by numeric index.

The “Ordered Values” part is an array of Buckets. Each Bucket contains an embedded zval, a string key represented by a pointer to a zend_string (it’s NULL for a numeric key), and a numeric key (or the string hash_value for a string key). Reserved space in the zvals is used to organize the linked list of colliding elements. It contains the index of the next element with the same hash_value.

Historically, PHP 5 made a clear distinction between arrays and HashTable structures. (HashTable didn’t contain a reference counter.) However, in PHP 7, these structures were merged and became aliases.

PHP arrays may be immutable. This is very similar to interned strings. Such arrays don’t require reference counting and behave in the same way as scalar values.
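The layout described above (one memory block, the “Hash” part before the data pointer, buckets after it, collisions chained through a “next” index) can be sketched in standalone C. This is a simplified model, not the engine’s code: toy_bucket, toy_table, TOY_INVALID and the toy_...() functions are hypothetical names, a plain long stands in for the embedded zval, only numeric keys are handled, and the table neither resizes nor checks for duplicate keys.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_INVALID UINT32_MAX   /* "no bucket" marker for empty hash slots */

typedef struct {
    uint64_t h;      /* numeric key (a real Bucket keeps the string hash here too) */
    long     val;    /* stands in for the embedded zval */
    uint32_t next;   /* index of the next bucket in the same collision chain */
} toy_bucket;

typedef struct {
    uint32_t    size;  /* number of bucket slots, a power of two */
    uint32_t    used;  /* buckets filled so far */
    toy_bucket *data;  /* points between the two parts, like arData */
} toy_table;

/* One block: [ size x uint32_t hash slots | size x toy_bucket ]. */
int toy_table_init(toy_table *t, uint32_t size)
{
    size_t hash_bytes = (size_t)size * sizeof(uint32_t);
    char *block;

    assert(size >= 2 && (size & (size - 1)) == 0);
    block = malloc(hash_bytes + (size_t)size * sizeof(toy_bucket));
    if (block == NULL) {
        return -1;
    }
    memset(block, 0xff, hash_bytes);          /* every slot = TOY_INVALID */
    t->size = size;
    t->used = 0;
    t->data = (toy_bucket *)(block + hash_bytes);
    return 0;
}

/* The hash slot lives at a negative offset from the data pointer. */
uint32_t *toy_slot(const toy_table *t, uint64_t h)
{
    return (uint32_t *)t->data - t->size + (uint32_t)(h & (uint64_t)(t->size - 1));
}

/* Appends a bucket in insertion order and links it at the head of its chain. */
int toy_insert(toy_table *t, uint64_t key, long val)
{
    uint32_t idx;
    uint32_t *slot;

    if (t->used == t->size) {
        return -1;                            /* toy model: no resizing */
    }
    idx = t->used++;
    slot = toy_slot(t, key);
    t->data[idx].h = key;
    t->data[idx].val = val;
    t->data[idx].next = *slot;                /* old chain head, or TOY_INVALID */
    *slot = idx;
    return 0;
}

/* Walks the collision chain; returns NULL if the key is absent. */
long *toy_find(const toy_table *t, uint64_t key)
{
    uint32_t idx;

    for (idx = *toy_slot(t, key); idx != TOY_INVALID; idx = t->data[idx].next) {
        if (t->data[idx].h == key) {
            return &t->data[idx].val;
        }
    }
    return NULL;
}

void toy_table_free(toy_table *t)
{
    free((uint32_t *)t->data - t->size);      /* start of the single block */
    t->data = NULL;
}
```

With a table of 8 slots, keys 1 and 9 land in the same hash slot, so the second insert links to the first through its next field, while iterating t->data[0] to t->data[used - 1] still yields the elements in insertion order.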
PHP ARRAY APIS

Use the following macros to retrieve a zend_array from a zval (also available with the “_P” suffix for pointers to zval):

Z_ARR(zv) – returns the zend_array value of the zval (the type must be IS_ARRAY).
Z_ARRVAL(zv) – historical alias of Z_ARR(zv).

Use the following macros and functions to work with arrays represented by a zval:

ZVAL_ARR(zv, arr) – initializes a PHP array zval using the given zend_array.
array_init(zv) – creates a new empty PHP array.
array_init_size(zv, count) – creates a new empty PHP array and reserves memory for “count” elements.
add_next_index_null(zval *arr) – inserts a new NULL element with the next index.
add_next_index_bool(zval *arr, int b) – inserts a new boolean element (IS_TRUE or IS_FALSE) with value “b” and the next index.
add_next_index_long(zval *arr, zend_long val) – inserts a new IS_LONG element with value “val” and the next index.
add_next_index_double(zval *arr, double val) – inserts a new IS_DOUBLE element with value “val” and the next index.
add_next_index_str(zval *arr, zend_string *zstr) – inserts a new IS_STRING element with value “zstr” and the next index.
add_next_index_string(zval *arr, char *cstr) – creates a PHP string from the zero-terminated C string “cstr”, and inserts it with the next index.
add_next_index_stringl(zval *arr, char *cstr, size_t len) – creates a PHP string from the C string “cstr” with length “len”, and inserts it with the next index.
add_next_index_zval(zval *arr, zval *val) – inserts the given zval into the array with the next index. Note that the reference counter of the inserted value is not changed. You have to take care of reference counting yourself (e.g. by calling Z_TRY_ADDREF_P(val)). All other add_next_index_...() functions are implemented through this function.
add_index_...(zval *arr, zend_ulong idx, …) – another family of functions that insert a value with the given numeric “idx”. Variants with suffixes and arguments similar to the add_next_index_...() family above are available.
add_assoc_...(zval *arr, char *key, …) – another family of functions that insert a value with the given string “key”, defined by a zero-terminated C string.
add_assoc_..._ex(zval *arr, char *key, size_t key_len, …) – another family of functions that insert a value with the given string “key”, defined by a C string and its length.

Here are a few functions that work directly with zend_array:

zend_new_array(count) – creates and returns a new array (it reserves memory for “count” elements).
zend_array_destroy(zend_array *arr) – frees the memory allocated by the array and all its elements.
zend_array_count(zend_array *arr) – returns the number of elements in the array.
zend_array_dup(zend_array *arr) – creates another array identical to the given one.

zend_array and HashTable are represented identically. Each zend_array is also a HashTable, but not vice versa. zend_arrays may keep only zvals as elements. Generalized HashTables may keep pointers to any data structures. Technically, this is represented by a special zval type, IS_PTR. The HashTable API is quite extensive, so here we give just a quick overview:

zend_hash_init() – initializes a hash table. The HashTable itself may be embedded into another structure, or allocated on the stack or through malloc()/emalloc(). One of the arguments of this function is a destructor callback, which is executed for each element removed from the HashTable. For zend_arrays this is zval_ptr_dtor().
zend_hash_clean() – removes all elements of the HashTable.
zend_hash_destroy() – frees the memory allocated by the HashTable and all its elements.
zend_hash_copy() – copies all elements of the given HashTable into another one.
zend_hash_num_elements() – returns the number of elements in the HashTable.
zend_hash_[str_|index_]find[_ptr|_deref|_ind]() – finds and returns the element of the HashTable with the given string or numeric key. Returns NULL if the key doesn’t exist.
zend_hash_[str_|index_]exists[_ind]() – checks whether an element with the given string or numeric key exists in the HashTable.
zend_hash_[str_|index_](add|update)[_ptr|_ind]() – adds new or updates existing elements of the HashTable with the given string or numeric key. “zend_hash...add” functions return NULL if an element with the same key already exists. “zend_hash...update” functions insert a new element if it didn’t exist before.
zend_hash_[str_|index_]del[_ind]() – removes the element with the given string or numeric key from the HashTable.
zend_symtable_[str_]find[_ptr|_deref|_ind]() – is similar to zend_hash_find...(), but the given key is always represented as a string. It may contain a numeric string. In this case, it’s converted to a number and zend_hash_index_find...() is called.
zend_symtable_[str_]exists[_ind]() – is similar to zend_hash_exists...(), but the given key is always represented as a string. It may contain a numeric string. In this case, it’s converted to a number and zend_hash_index_exists...() is called.
zend_symtable_[str_](add|update)[_ptr|_ind]() – is similar to zend_hash_add/update...(), but the given key is always represented as a string. It may contain a numeric string. In this case, it’s converted to a number and zend_hash_index_add/update...() is called.
zend_symtable_[str_]del[_ind]() – is similar to zend_hash_del...(), but the given key is always represented as a string. It may contain a numeric string. In this case, it’s converted to a number and zend_hash_index_del...() is called.

There are also a number of ways to iterate over a HashTable:

ZEND_HASH_FOREACH_KEY_VAL(ht, num_key, str_key, zv) – a macro that starts an iteration loop over all elements of the HashTable “ht”. The nested C code block is executed for each element. The C variables “num_key”, “str_key” and “zv” are initialized with the numeric key, the string key and a pointer to the element zval. For elements with numeric keys, “str_key” is NULL. There are more similar macros that work only with values, only with keys, etc. There are also similar macros that iterate in the reverse order.
The usage of this macro is demonstrated in the example below.
ZEND_HASH_FOREACH_END() – a macro that ends an iteration loop.
zend_hash_[reverse_]apply() – calls the given callback function for each element of the HashTable.
zend_hash_apply_with_argument[s]() – calls the given callback function for each element of the given HashTable, with additional argument(s).

See more information in Zend/zend_hash.h.

USING PHP ARRAYS IN OUR EXAMPLE EXTENSION

Let’s extend our test_scale() function to support arrays. Let it return another array with preserved keys and scaled values. Because an element of an array may be another array (and so on, recursively), we have to move the scaling logic into a separate recursive function, do_scale(). The logic for IS_LONG, IS_DOUBLE and IS_STRING stays the same, except that our function now reports SUCCESS or FAILURE to the caller, and therefore we have to replace our RETURN_...() macros with RETVAL_...() and “return SUCCESS”.

    static int do_scale(zval *return_value, zval *x, zend_long factor)
    {
        if (Z_TYPE_P(x) == IS_LONG) {
            RETVAL_LONG(Z_LVAL_P(x) * factor);
        } else if (Z_TYPE_P(x) == IS_DOUBLE) {
            RETVAL_DOUBLE(Z_DVAL_P(x) * factor);
        } else if (Z_TYPE_P(x) == IS_STRING) {
            /* Allocate room for "factor" copies of the source string. */
            zend_string *ret = zend_string_safe_alloc(Z_STRLEN_P(x), factor, 0, 0);
            char *p = ZSTR_VAL(ret);

            while (factor-- > 0) {
                memcpy(p, Z_STRVAL_P(x), Z_STRLEN_P(x));
                p += Z_STRLEN_P(x);
            }
            *p = '\000';
            RETVAL_STR(ret);
        } else if (Z_TYPE_P(x) == IS_ARRAY) {
            /* Reserve as many elements as the source array has. */
            zend_array *ret = zend_new_array(zend_array_count(Z_ARR_P(x)));
            zend_ulong idx;
            zend_string *key;
            zval *val, tmp;

            ZEND_HASH_FOREACH_KEY_VAL(Z_ARR_P(x), idx, key, val) {
                /* Scale each element recursively into a temporary zval. */
                if (do_scale(&tmp, val, factor) != SUCCESS) {
                    return FAILURE;
                }
                /* Store the result under the same string or numeric key. */
                if (key) {
                    zend_hash_add(ret, key, &tmp);
                } else {
                    zend_hash_index_add(ret, idx, &tmp);
                }
            } ZEND_HASH_FOREACH_END();
            RETVAL_ARR(ret);
        } else {
            php_error_docref(NULL, E_WARNING, "unexpected argument type");
            return FAILURE;
        }
        return SUCCESS;
    }

    PHP_FUNCTION(test_scale)
    {
        zval *x;
        zend_long factor = TEST_G(scale); // default value

        ZEND_PARSE_PARAMETERS_START(1, 2)
            Z_PARAM_ZVAL(x)
            Z_PARAM_OPTIONAL
            Z_PARAM_LONG(factor)
        ZEND_PARSE_PARAMETERS_END();

        do_scale(return_value, x, factor);
    }

The new code for an IS_ARRAY argument creates an empty resulting array (reserving the same number of elements as in the source array). It then iterates through the elements of the source array and calls the same do_scale() function for each element, storing the temporary result in the tmp zval. Then it adds this temporary value to the resulting array under the same string key or numeric index.

Let’s test the new functionality:

    $ php -r 'var_dump(test_scale([2, 2.0, "x" => ["2"]], 3));'
    array(3) {
      [0]=>
      int(6)
      [1]=>
      float(6)
      ["x"]=>
      array(1) {
        [0]=>
        string(3) "222"
      }
    }

It works fine, but our function actually has a bug. It may leak memory under some edge conditions.