The document discusses various PHP data structures including arrays, structs, queues, stacks, and sets. PHP arrays are actually implemented as a hash table with O(1) access and can be iterated in both directions thanks to a doubly linked list. True arrays have a fixed size, and their elements are accessed by numeric index. Structs can be represented using arrays or classes, with classes allowing type hinting. Queues and stacks follow FIFO and LIFO principles respectively, while sets have no particular order and are useful for membership testing.
Mastering PHP Data Structures 102 Conference
1. Masterizing PHP Data Structure 102
Patrick Allaert
PHPBenelux Conference Antwerp 2012
2. About me
● Patrick Allaert
● Founder of Libereco
● Playing with PHP/Linux for 10+ years
● eZ Publish core developer
● Author of the APM PHP extension
● @patrick_allaert
● patrickallaert@php.net
● http://github.com/patrickallaert/
● http://patrickallaert.blogspot.com/
15. Array: PHP's untruthfulness
PHP “Arrays” are not true Arrays!
An array is typically implemented like this:
[Diagram: a row of six adjacent cells, each holding Data]
16. Array: PHP's untruthfulness
PHP “Arrays” can be iterated in both directions (reset(),
next(), prev(), end()), using only O(1) operations.
Implementation based on a Doubly Linked List (DLL):
[Diagram: five Data nodes chained between a Head pointer and a Tail pointer]
Enables List, Deque, Queue and Stack
implementations
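A minimal sketch of the bidirectional iteration described above, using only the internal-pointer functions named on the slide. The sample data and variable names are made up for illustration.

```php
<?php
// Walk a PHP array forwards, then backwards, using its internal pointer.
$letters = array('a' => 'alpha', 'b' => 'beta', 'c' => 'gamma');

$forward = array();
for (reset($letters); key($letters) !== null; next($letters)) {
    $forward[] = current($letters);
}

$backward = array();
for (end($letters); key($letters) !== null; prev($letters)) {
    $backward[] = current($letters);
}

// $forward  is array('alpha', 'beta', 'gamma')
// $backward is array('gamma', 'beta', 'alpha')
```

key() returns null once the pointer moves past either end, which is what stops both loops.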
19. Array: PHP's untruthfulness
PHP “Arrays” elements are always accessible using a
key (index).
Implementation based on a Hash Table:
[Diagram: Head and Tail pointers plus a bucket pointers array indexed 0 ... nTableSize - 1; each Bucket * entry points to a Bucket, and each Bucket holds Data]
22. Array: PHP's untruthfulness
● In C: 100 000 integers (using long on 64-bit => 8
bytes each) can be stored in 0.76 MB.
● In PHP: it will take ≅ 13.97 MB!
● A PHP variable (containing an integer) takes 48
bytes.
● The overhead of buckets for every “array” entry is
about 96 bytes.
● More details:
http://nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html
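To check the figures above on your own installation, a rough measurement can be sketched as follows. The numbers on the slide date from the PHP versions available in 2012; later PHP versions reworked these internals, so the result printed here will likely differ. Variable names are illustrative.

```php
<?php
// Measure roughly how much memory an "array" of 100 000 integers uses
// on the PHP build running this script.
$count = 100000;

$before = memory_get_usage();
$numbers = array();
for ($i = 0; $i < $count; $i++) {
    $numbers[] = $i;
}
$bytes = memory_get_usage() - $before;

printf(
    "%d integers: %.2f MB (about %.1f bytes per entry)\n",
    $count,
    $bytes / 1048576,
    $bytes / $count
);
```

For comparison, the C figure is simply 100 000 × 8 bytes = 800 000 bytes, which is about 0.76 MB.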
24. Structs (or records, tuples,...)
● A struct is a value containing other values which
are typically accessed using a name.
● Example:
Person => firstName / lastName
ComplexNumber => realPart / imaginaryPart
26. Structs – Using a class
$person = new PersonStruct(
"Patrick", "Allaert"
);
28. Structs – Using a class
(Implementation)
class PersonStruct
{
public $firstName;
public $lastName;
public function __construct($firstName, $lastName)
{
$this->firstName = $firstName;
$this->lastName = $lastName;
}
public function __set($key, $value)
{
// a. Do nothing
// b. trigger_error()
// c. Throws an exception
}
}
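A runnable sketch of option (c) above. The choice of LogicException and the error message are assumptions made for illustration; the slide only says that an exception may be thrown.

```php
<?php
// Sketch of option (c): reject writes to properties that were not declared.
class PersonStruct
{
    public $firstName;
    public $lastName;

    public function __construct($firstName, $lastName)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

    // __set() is only invoked for inaccessible or undeclared properties,
    // so assignments to $firstName and $lastName are unaffected.
    public function __set($key, $value)
    {
        throw new LogicException("Unknown property: $key");
    }
}

$person = new PersonStruct("Patrick", "Allaert");
$person->lastName = "A.";        // declared property: allowed

$rejected = false;
try {
    $person->middleName = "X";   // undeclared property: rejected
} catch (LogicException $e) {
    $rejected = true;
}
```

This is what makes the class-based struct "rigid": a typo in a property name fails loudly instead of silently creating a new field.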
29. Structs – Pros and Cons
Array                             Class
+ Uses less memory (PHP < 5.4)    - Uses more memory (PHP < 5.4)
- Uses more memory (PHP = 5.4)    + Uses less memory (PHP = 5.4)
- No type hinting                 + Type hinting possible
- Flexible structure              + Rigid structure
+|- Less OO                       +|- More OO
+ Slightly faster                 - Slightly slower
30. “true” Arrays
● An array is a fixed size collection where elements
are each identified by a numeric index.
0 1 2 3 4 5
Data Data Data Data Data Data
32. “true” Arrays – Using
SplFixedArray
$array = new SplFixedArray(3);
$array[0] = 1; // or $array->offsetSet()
$array[1] = 2; // or $array->offsetSet()
$array[2] = 3; // or $array->offsetSet()
$array[0]; // gives 1
$array[1]; // gives 2
$array[2]; // gives 3
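A short sketch of what "fixed size" means in practice with SplFixedArray: writing past the end fails, and growing the array has to be requested explicitly. Variable names are illustrative.

```php
<?php
$array = new SplFixedArray(3);
$array[0] = 1;

$outOfRange = false;
try {
    $array[3] = 4;               // index 3 does not exist in a 3-slot array
} catch (RuntimeException $e) {
    $outOfRange = true;
}

$array->setSize(4);              // resizing must be requested explicitly
$array[3] = 4;
$size = $array->getSize();       // 4
```

Slots that were never assigned, such as $array[1] here, hold null.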
33. “true” Arrays – Pros and Cons
Array                 SplFixedArray
- Uses more memory    + Uses less memory
+|- Less OO           +|- More OO
+ Slightly faster     - Slightly slower
34. Queues
● A queue is an ordered collection respecting First
In, First Out (FIFO) order.
● Elements are inserted at one end and removed at
the other.
[Diagram: Data is enqueued at one end of a row of Data elements and dequeued from the other end]
36. Queues – Using array
$queue = array();
$queue[] = 1; // or array_push()
$queue[] = 2; // or array_push()
$queue[] = 3; // or array_push()
array_shift($queue); // gives 1
array_shift($queue); // gives 2
array_shift($queue); // gives 3
37. Queues – Using SplQueue
$queue = new SplQueue();
$queue[] = 1; // or $queue->enqueue()
$queue[] = 2; // or $queue->enqueue()
$queue[] = 3; // or $queue->enqueue()
$queue->dequeue(); // gives 1
$queue->dequeue(); // gives 2
$queue->dequeue(); // gives 3
38. Queues – Pros and Cons
Array                            SplQueue
- Uses more memory               + Uses less memory
  (overhead / entry: 96 bytes)     (overhead / entry: 48 bytes)
- No type hinting                + Type hinting possible
+|- Less OO                      +|- More OO
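The "type hinting possible" point can be sketched like this. drainQueue() is a hypothetical helper written for this example, not part of SPL.

```php
<?php
// Hypothetical helper: the SplQueue type hint documents and enforces
// that callers pass a real queue rather than an arbitrary array.
function drainQueue(SplQueue $queue)
{
    $served = array();
    while (!$queue->isEmpty()) {
        $served[] = $queue->dequeue();   // oldest element first (FIFO)
    }
    return $served;
}

$queue = new SplQueue();
$queue->enqueue("first");
$queue->enqueue("second");
$queue->enqueue("third");

$served = drainQueue($queue);   // array("first", "second", "third")
```

With a plain array, the signature could only say "array", which tells the reader nothing about the intended FIFO behaviour.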
39. Stacks
● A stack is an ordered collection respecting Last In,
First Out (LIFO) order.
● Elements are inserted and removed on the same
end.
[Diagram: Data is pushed onto and popped from the same end of a row of Data elements]
41. Stacks – Using array
$stack = array();
$stack[] = 1; // or array_push()
$stack[] = 2; // or array_push()
$stack[] = 3; // or array_push()
array_pop($stack); // gives 3
array_pop($stack); // gives 2
array_pop($stack); // gives 1
42. Stacks – Using SplStack
$stack = new SplStack();
$stack[] = 1; // or $stack->push()
$stack[] = 2; // or $stack->push()
$stack[] = 3; // or $stack->push()
$stack->pop(); // gives 3
$stack->pop(); // gives 2
$stack->pop(); // gives 1
43. Stacks – Pros and Cons
Array                            SplStack
- Uses more memory               + Uses less memory
  (overhead / entry: 96 bytes)     (overhead / entry: 48 bytes)
- No type hinting                + Type hinting possible
+|- Less OO                      +|- More OO
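As an illustration of where LIFO order is useful, here is a small sketch that checks whether brackets are balanced. isBalanced() is a hypothetical helper invented for this example.

```php
<?php
// Hypothetical helper: checks that brackets are balanced using a stack.
function isBalanced($text)
{
    $pairs = array(')' => '(', ']' => '[', '}' => '{');
    $stack = new SplStack();

    foreach (str_split($text) as $char) {
        if (in_array($char, $pairs, true)) {
            $stack->push($char);                 // opening bracket
        } elseif (isset($pairs[$char])) {
            // closing bracket: must match the most recent opening one
            if ($stack->isEmpty() || $stack->pop() !== $pairs[$char]) {
                return false;
            }
        }
    }
    return $stack->isEmpty();                    // nothing left unclosed
}

$ok  = isBalanced("f(a[0], {b})");   // true
$bad = isBalanced("(]");             // false
```

The most recently opened bracket is always the first one that has to be closed, which is exactly what pop() returns.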
44. Sets
● A set is a collection with no particular ordering
especially suited for testing the membership of a
value against a collection or to perform
union/intersection/complement operations
between them.
[Diagram: five Data elements placed in no particular order]
51. Sets – Using SplObjectStorage
(objects)
$set = new SplObjectStorage();
$set->attach($object1); // or $set[$object1] = null;
$set->attach($object2); // or $set[$object2] = null;
$set->attach($object3); // or $set[$object3] = null;
isset($set[$object2]); // true
isset($set[$object4]); // false ($object4 was never attached)
$set1->addAll($set2); // union
$set1->removeAllExcept($set2); // intersection
$set1->removeAll($set2); // complement
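A self-contained sketch of the three set operations above. It clones $set1 before each operation because addAll(), removeAllExcept() and removeAll() modify the storage in place; the array-access form from the slide comments is used to add members. Variable names are illustrative.

```php
<?php
$a = new stdClass();
$b = new stdClass();
$c = new stdClass();

$set1 = new SplObjectStorage();
$set1[$a] = null;
$set1[$b] = null;

$set2 = new SplObjectStorage();
$set2[$b] = null;
$set2[$c] = null;

$union = clone $set1;
$union->addAll($set2);                    // {a, b, c}

$intersection = clone $set1;
$intersection->removeAllExcept($set2);    // {b}

$difference = clone $set1;
$difference->removeAll($set2);            // {a}: in $set1 but not in $set2
```

Without the clone, each call would overwrite $set1 and the later operations would start from the wrong set.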
52. Sets – Using QuickHash (int)
$set = new QuickHashIntSet(64,
QuickHashIntSet::CHECK_FOR_DUPES);
$set->add(1);
$set->add(2);
$set->add(3);
$set->exists(2); // true
$set->exists(5); // false
● No union/intersection/complement operations
(yet?)
● Yummy features like (loadFrom|saveTo)(String|File)
53. Sets – With finite possible values
define("E_ERROR", 1); // or 1<<0
define("E_WARNING", 2); // or 1<<1
define("E_PARSE", 4); // or 1<<2
define("E_NOTICE", 8); // or 1<<3
$set = 0;
$set |= E_ERROR;
$set |= E_WARNING;
$set |= E_PARSE;
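$set & E_ERROR; // non-zero (truthy): E_ERROR is in the set
$set & E_NOTICE; // 0 (falsy): E_NOTICE is not in the set
$set1 | $set2; // union
$set1 & $set2; // intersection
$set1 & ~$set2; // complement (in $set1 but not in $set2)
$set1 ^ $set2; // symmetric difference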
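The same idea as a runnable sketch. The FLAG_* constants are hypothetical names chosen for this example so that they do not clash with PHP's predefined E_* constants.

```php
<?php
// Hypothetical flags; each one occupies its own bit.
const FLAG_READ    = 1 << 0;   // 1
const FLAG_WRITE   = 1 << 1;   // 2
const FLAG_EXECUTE = 1 << 2;   // 4

$set1 = FLAG_READ | FLAG_WRITE;       // 3
$set2 = FLAG_WRITE | FLAG_EXECUTE;    // 6

$hasRead    = ($set1 & FLAG_READ) !== 0;      // true
$hasExecute = ($set1 & FLAG_EXECUTE) !== 0;   // false

$union        = $set1 | $set2;    // 7: read, write, execute
$intersection = $set1 & $set2;    // 2: write
$difference   = $set1 & ~$set2;   // 1: read (in $set1 but not in $set2)
$symmetric    = $set1 ^ $set2;    // 5: read, execute (in exactly one set)
```

Comparing against 0 explicitly gives real booleans instead of relying on the truthiness of the masked integer.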
54. Sets – With finite possible values
(function features)
Instead of:
function remove($path, $files = true, $directories = true, $links = true,
$executable = true)
{
if (!$files && is_file($path))
return false;
if (!$directories && is_dir($path))
return false;
if (!$links && is_link($path))
return false;
if (!$executable && is_executable($path))
return false;
// ...
}
remove("/tmp/removeMe", true, false, true, false); // WTF ?!
55. Sets – With finite possible values
(function features)
Use:
define("REMOVE_FILES", 1 << 0);
define("REMOVE_DIRS", 1 << 1);
define("REMOVE_LINKS", 1 << 2);
define("REMOVE_EXEC", 1 << 3);
define("REMOVE_ALL", ~0); // Setting all bits
function remove($path, $options = REMOVE_ALL)
{
if (~$options & REMOVE_FILES && is_file($path))
return false;
if (~$options & REMOVE_DIRS && is_dir($path))
return false;
if (~$options & REMOVE_LINKS && is_link($path))
return false;
if (~$options & REMOVE_EXEC && is_executable($path))
return false;
// ...
}
remove("/tmp/removeMe", REMOVE_FILES | REMOVE_LINKS); // Much better :)
56. Sets: Conclusions
● Use the key and not the value when using PHP
Arrays.
● Use QuickHash for sets of integers if possible.
● Use SplObjectStorage as soon as you are playing
with objects.
● Don't use array_unique() when you need a set!
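A minimal sketch of the first point, storing set members as array keys rather than values. The sample data is made up; the functions used are standard PHP array functions.

```php
<?php
// Store the members as keys; the values are irrelevant placeholders.
$set1 = array('apple' => true, 'pear' => true);
$set2 = array('pear' => true, 'plum' => true);

$set1['fig'] = true;                 // add; a key cannot appear twice
$hasPear = isset($set1['pear']);     // true, via a hash lookup
$hasKiwi = isset($set1['kiwi']);     // false

$union        = $set1 + $set2;                        // apple, pear, fig, plum
$intersection = array_intersect_key($set1, $set2);    // pear
$difference   = array_diff_key($set1, $set2);         // apple, fig
```

With members stored as values, the membership test would need in_array(), which scans the whole array, and duplicates would have to be removed afterwards with array_unique().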
57. Bloom filters
● A Bloom filter is a space-efficient probabilistic data
structure used to test whether an element is a
member of a set.
● False positives are possible, but false negatives are
not!
58. Bloom filters – Using bloomy
// BloomFilter::__construct(int capacity [, double
// error_rate [, int random_seed ] ])
$bloomFilter = new BloomFilter(10000, 0.001);
$bloomFilter->add("An element");
$bloomFilter->has("An element"); // true for sure
$bloomFilter->has("Foo"); // false, most probably
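To show why false positives are possible but false negatives are not, here is a toy Bloom filter in plain PHP. TinyBloomFilter is a hypothetical teaching class and has nothing to do with how the bloomy extension is implemented; it uses a string of '0'/'1' characters for readability, which is not space-efficient, and derives its hash functions from md5() purely for convenience.

```php
<?php
// Hypothetical teaching class, unrelated to the bloomy extension.
class TinyBloomFilter
{
    private $bits;        // string of '0'/'1' characters used as a bit array
    private $size;
    private $hashCount;

    public function __construct($size, $hashCount)
    {
        $this->size = $size;
        $this->hashCount = $hashCount;
        $this->bits = str_repeat('0', $size);
    }

    // Derive $hashCount positions by salting the value with the hash index.
    private function positions($value)
    {
        $positions = array();
        for ($i = 0; $i < $this->hashCount; $i++) {
            $hash = hexdec(substr(md5($i . ':' . $value), 0, 7));
            $positions[] = $hash % $this->size;
        }
        return $positions;
    }

    public function add($value)
    {
        foreach ($this->positions($value) as $position) {
            $this->bits[$position] = '1';
        }
    }

    public function has($value)
    {
        foreach ($this->positions($value) as $position) {
            if ($this->bits[$position] === '0') {
                return false;   // definitely never added
            }
        }
        return true;            // probably added
    }
}

$filter = new TinyBloomFilter(1024, 3);
$filter->add("An element");
$known = $filter->has("An element");   // true for sure
```

An added element always finds its own bits set, so it can never be reported as missing. An element that was never added is reported as present only if other elements happened to set all of its positions.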
59. Maps
● A map is a collection of key/value pairs where all
keys are unique.
60. Maps – Using array
$map = array();
$map["ONE"] = 1;
$map["TWO"] = 2;
$map["THREE"] = 3;
// Merging maps:
array_merge($map1, $map2); // SLOW!
$map2 + $map1; // Fast :)
● Don't use array_merge() on maps.
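Speed aside, there is a correctness reason behind this advice when the keys are integers. A small sketch with made-up data:

```php
<?php
// Maps keyed by integers (for example, IDs).
$map1 = array(10 => 'ten', 20 => 'twenty');
$map2 = array(20 => 'TWENTY', 30 => 'thirty');

$merged = array_merge($map1, $map2);  // integer keys are renumbered from 0
$summed = $map2 + $map1;              // keys preserved; left operand wins

// $merged is array(0 => 'ten', 1 => 'twenty', 2 => 'TWENTY', 3 => 'thirty')
// $summed is array(20 => 'TWENTY', 30 => 'thirty', 10 => 'ten')
```

array_merge() treats integer keys as list positions and appends, so the original keys are lost. The + operator keeps every key and takes the value from the left operand when a key exists on both sides, which is why $map2 is written first.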
65. Heap – Using Spl(Min|Max)Heap
$heap = new SplMinHeap;
$heap->insert(3);
$heap->insert(1);
$heap->insert(2);
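Continuing the example above, a sketch of how the values come back out of the heap. top(), extract() and isEmpty() are standard SplHeap methods; the variable names are illustrative.

```php
<?php
$heap = new SplMinHeap();
$heap->insert(3);
$heap->insert(1);
$heap->insert(2);

$smallest = $heap->top();            // 1, peeks without removing

$ordered = array();
while (!$heap->isEmpty()) {
    $ordered[] = $heap->extract();   // removes the current minimum
}

// $ordered is array(1, 2, 3)
```

With SplMaxHeap the same loop would yield the values from largest to smallest.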
66. Heaps: Conclusions
● MUCH faster than having to re-sort() an array at
every insertion.
● If you don't require the collection to be sorted at
every single step and can insert all data at once
and then sort(), an array is a much better/faster
approach.
● SplPriorityQueue is very similar: think of it as an
SplHeap where the ordering is based on the key
(priority) rather than on the value.
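A short SplPriorityQueue sketch to illustrate the last point. The task names and priority numbers are made up.

```php
<?php
$queue = new SplPriorityQueue();
$queue->insert('write report', 1);    // insert(value, priority)
$queue->insert('fix outage', 10);
$queue->insert('review patch', 5);

$tasks = array();
while (!$queue->isEmpty()) {
    $tasks[] = $queue->extract();     // highest priority comes out first
}

// $tasks is array('fix outage', 'review patch', 'write report')
```

The values themselves are never compared; only the priorities decide the order.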
67. Other related projects
● SPL Types: Various types implemented as objects:
SplInt, SplFloat, SplEnum, SplBool and SplString
http://pecl.php.net/package/SPL_Types
● Judy: Sparse dynamic arrays implementation
http://pecl.php.net/package/Judy
● Weakref: Weak references implementation.
Provides a gateway to an object without
preventing that object from being collected by the
garbage collector.
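The weak reference idea can be sketched with the WeakReference class built into PHP 7.4 and later. This is a substitution made for illustration: it is not the PECL Weakref extension the slide refers to, and that extension's API is not shown here.

```php
<?php
$object = new stdClass();
$reference = WeakReference::create($object);

// While a normal (strong) reference exists, get() returns the object.
$aliveBefore = $reference->get() === $object;

unset($object);   // drop the only strong reference

// The weak reference did not keep the object alive, so get() returns null.
$collected = $reference->get() === null;
```

This is useful for caches and registries that should not by themselves keep objects in memory.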
70. Conclusions
● Use the appropriate data structure. It will keep your
code clean and fast.
● Think about the time and space complexity
of your algorithms.
● Name your variables accordingly: use “Map”, “Set”,
“List”, “Queue”,... to describe them instead of using
something like: $ordersArray.