PostgreSQL 9.2 introduced native support for the JSON data type, as well as V8/Javascript and Coffeescript procedural languages.
Learn how you can write web applications in PHP using an intelligent and horizontally sharded cluster of PostgreSQL databases, bringing you infinite scalability and parallel processing.
This talk will guide you through the development lifecycle of the application, focusing on architecture, technologies, testing and deployment.
Vector Search -An Introduction in Oracle Database 23ai.pptx
Writing infinite scalability web applications with PHP and PostgreSQL
1. Writing infinite scalability web
applications with PHP and
PostgreSQL
PHPDay 2013 - Verona
Gabriele Bartolini
2ndQuadrant Italia
gabriele.bartolini@2ndQuadrant.it
@_GBartolini_
sabato 18 maggio 13
2. Gabriele Bartolini
•Co-Founder and Manager of 2ndQuadrant Italia
• Data Architect, Business critical environments (OLTP)
• Data warehousing
•Co-Founder Italian PostgreSQL Users Group
•Co-Founder PostgreSQL Europe
•PostgreSQL Contributor and Advocate
sabato 18 maggio 13
11. Sharding
•Horizontal partitioning
technique
•Logically and physically
split a table by distributing
data across different
databases/servers
• Reduced number of rows in
each shard
• Reduced index size
•Some DBMS have the
"DISTRIBUTED BY" clause
for "CREATE TABLE"
•Query distribution
• Parallel processing
• Shared nothing architecture
sabato 18 maggio 13
12. Table 1
Table 1
Shard 1
Table 1
Shard 2
Table 1
Shard 3
Table 1
Shard ...
Table 1
Shard n
Table 1
SELECT * FROM Table1;
Parallel execution on the ‘n’ shards
sabato 18 maggio 13
14. PostgreSQL
•Started in 1995 (Postgres started in 1986)
•Current major version is 9.2
• Supported major versions are 8.4, 9.0, 9.1 and 9.2
•The PostgreSQL License (BSD style)
• Not subject to monopoly nor acquisitions
•Platform independent (C language)
sabato 18 maggio 13
15. Some major features
•Client/Server
•Concurrency (MVCC)
•Referential integrity
•ACID transactions
•Views
•Functions
•Triggers
•Tablespaces
•Partitioning
•Schemas
•Point In Time Recovery
•Master/Slave Replication
sabato 18 maggio 13
16. Features for developers
•Standard compliant (e.g.
SQL, XML, UTF8, JSON, ...)
•Client interfaces: C/C++,
PHP, JDBC, ODBC, .Net,
Python, Perl, Ruby, ...
•Vast gamma of ORMs
•Several procedural
languages
•Versatile and extensible
• custom data types
• intelligence in the database
• extensions (from 9.1)
• PostGIS
•Database Unit testing
sabato 18 maggio 13
17. PostgreSQL 9.3
•Autumn 2013
•JSON operators and
functions
•Updatable views
•Materialised views
•Writable foreign tables
•Support for LATERAL
•...
•http://wiki.postgresql.org/
wiki/What
%27s_new_in_PostgreSQL_
9.3
sabato 18 maggio 13
18. NO ... DISTRIBUTED BY
•PostgreSQL does not support transparent distribution of
data and queries (yet)
•Sharding can be implemented through PL/Proxy using
functions
sabato 18 maggio 13
23. Disclaimer
For the sake of simplicity, every database server
described in the following architecture diagrams,
deliberately represents a single point of failure.
Due to its high availability/disaster recovery features and
open source tools, PostgreSQL is widely used in mission
critical environments.
sabato 18 maggio 13
27. Web server
Multiple web servers architecture
apache/php
DB Server
postgres
pgbouncer
pgbouncer
Web server
apache/php
pgbouncer
HA Proxy
HTTP Balancing
Pros: scalability and
continuity (Web)
Cons: scalability
and continuity (DB)
sabato 18 maggio 13
28. Web server
Read only database servers
apache/php
DB Master
postgres
pgbouncer
pgbouncer
Web server
apache/php
pgbouncer
HA Proxy
HTTP Balancing
DB Hot Standby
postgres
pgbouncer
Streaming
Replication
RO
RORW RW
sabato 18 maggio 13
29. Web server
Database sharding architecture
DB Access Server
postgres
pgbouncer
Web server
HA Proxy
Web layer
pl/proxy
Database Access Layer
DB Shard ...
postgres
pgbouncer
DB Shard n
postgres
pgbouncer
DB Shard 1
postgres
pgbouncer
Database
Data Layer
“NoSQL”“NoSQL”
“Infinite” horizontal scalability
Shared
nothing
architecture
sabato 18 maggio 13
30. Architecture recap
•Do not get carried away by current trends and buzzwords
•Make your decisions about architecture based on:
• business continuity requirements
• results of both database and application benchmarks
•Always start with a simple architecture (two-server)
•Enhance gradually up to sharding for infinite scalability
sabato 18 maggio 13
32. PgBouncer
•Lightweight connection
pooler for PostgreSQL
•Written in C by Skype
•Recycling connections on:
• sessions
• transactions
• statements (for PL/Proxy)
•“Virtual databases”
• Real databases can be on
different backend servers
•Online reconfiguration,
restart and upgrade
• Online statistics
• Supports PAUSE/RESUME
sabato 18 maggio 13
35. hstore
•Data type in PostgreSQL
•Key/value pairs as a single
value
•Useful for storing:
• unstructured data
• many attributes for a
record
•Keys and values are stored
as text
•Set of operators
• ->, @>, ?, ...
•Indexes
•Integrates with JSON in 9.3
sabato 18 maggio 13
36. JSON Support
•Introduced in PostgreSQL 9.2
•Data type
•Generators (9.3)
•Operators (9.3)
•Integration with hstore (9.3)
sabato 18 maggio 13
37. CREATE TABLE pictures (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL
REFERENCES users(id),
title TEXT,
metadata JSON,
thumbnail BYTEA,
picture BYTEA,
created_at TIMESTAMP DEFAULT now(),
updated_at TIMESTAMP DEFAULT now()
);
sabato 18 maggio 13
38. INSERT INTO pictures
(user_id, title, metadata)
VALUES (
100,
'A picture of my Stratocaster',
'{"latitude":43.883,"longitude":
11.1,"width":2240,"height":1680}'
);
Storing a JSON value
sabato 18 maggio 13
39. WITH m AS (
SELECT 43.883 AS latitude,
11.1 AS longitude,
2240 AS width, 1680 AS height
)
INSERT INTO pictures
(user_id, title, metadata)
SELECT 100,
'Another picture of my Stratocaster',
to_json(m) FROM m;
Storing a JSON value #2
sabato 18 maggio 13
40. WITH p AS (
SELECT title, metadata, created_at
FROM pictures WHERE id = 1
)
SELECT to_json(p) AS json FROM p;
{"title":"A picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":
2240,"height":1680},"created_at":"2013-05-11
13:11:37.788636"}
9.3 Returning a JSON object
sabato 18 maggio 13
41. WITH p AS (
SELECT id, title, metadata, created_at
FROM pictures WHERE user_id = 100
)
SELECT json_agg(p) AS json FROM p;
[{"id":1,"title":"A picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":2240,"height":
1680},"created_at":"2013-05-11 13:15:21.078747"}, +
{"id":2,"title":"Another picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":2240,"height":
1680},"created_at":"2013-05-11 13:15:21.078747"}]
9.3 Returning an array of JSONs
sabato 18 maggio 13
42. 9.3 A taste of JSON operators
SELECT id, (metadata->'longitude',
metadata->'latitude') AS coords
FROM pictures
WHERE user_id = 100;
id | coords
----+---------------
1 | (11.1,43.883)
2 | (11.1,43.883)
...
sabato 18 maggio 13
43. PL/Javascript
•Procedural Language
•Allows to implement functions inside PostgreSQL using the
Javascript Language
•Available as an extension
•Powered by V8 JavaScript Engine
sabato 18 maggio 13
44. CREATE FUNCTION plv8_test(keys TEXT[],
vals TEXT[]) RETURNS TEXT
LANGUAGE plv8 IMMUTABLE STRICT AS
$BODY$
var o = {};
for(var i=0; i<keys.length; i++) {
o[keys[i]] = vals[i];
}
return JSON.stringify(o);
$BODY$;
SELECT plv8_test(ARRAY['name', 'age'],
ARRAY['Angus', '58']);
sabato 18 maggio 13
45. PL/Proxy
•Procedural language for
defining proxy functions
•Written in C by Skype
•The body specifies:
• remote function (target)
• destination (shard)
•Execution can happen on
one, some or all shards
• Execution is parallel if
shards > 1
•Queries are run in auto-
commit mode
• PgBouncer (statement)
sabato 18 maggio 13
46. The need for an API
•PL/Proxy forces to work
with functions
•One database cluster with:
• access/proxy layer
• data layer (shards)
•Database cluster ~
Singleton class
•Business logic in the
database
•Programmers need to
define an API with access/
authorisation rules
• Proxy functions (PL/Proxy)
• Remote functions (any)
•Supports SQL/MED
sabato 18 maggio 13
48. Define the cluster with SQL/MED
CREATE SERVER mycluster
FOREIGN DATA WRAPPER plproxy
OPTIONS (
connection_lifetime '1800',
s0 'dbname=shard00 host=pg-data-1',
s1 'dbname=shard01 host=pg-data-1',
s2 'dbname=shard02 host=pg-data-2',
s3 'dbname=shard03 host=pg-data-2'
);
sabato 18 maggio 13
49. Proxy function for GET
CREATE OR REPLACE FUNCTION
get_pictures(i_uid INTEGER)
RETURNS json
LANGUAGE plproxy
AS
$BODY$
CLUSTER 'mycluster';
RUN ON hashint4(i_uid);
$BODY$;
sabato 18 maggio 13
50. Data layer function for GET
CREATE OR REPLACE FUNCTION
get_pictures(i_uid INTEGER)
RETURNS json
LANGUAGE SQL AS
$BODY$
WITH p AS (
SELECT id, title, metadata, created_at
FROM pictures WHERE user_id = $1
) SELECT json_agg(p) AS json FROM p;
$BODY$;
sabato 18 maggio 13
51. Proxy function for INSERT
CREATE OR REPLACE FUNCTION
insert_picture(i_uid INTEGER, i_title
TEXT, i_metadata JSON, i_thumbnail BYTEA,
i_picture BYTEA)
RETURNS INTEGER LANGUAGE plproxy
AS $BODY$
CLUSTER 'mycluster';
RUN ON hashint4(i_uid);
$BODY$;
sabato 18 maggio 13
52. Data function for INSERT
CREATE OR REPLACE FUNCTION
insert_picture(i_uid INTEGER, i_title
TEXT, i_metadata JSON, i_thumbnail BYTEA,
i_picture BYTEA)
RETURNS INTEGER LANGUAGE SQL
AS $BODY$
INSERT INTO pictures (user_id, title,
metadata, thumbnail, picture) VALUES ($1,
$2, $3, $4, $5) RETURNING id;
$BODY$;
sabato 18 maggio 13
53. Data representation in PHP
// Obviously all of this will come from a form
$user_id = 100;
$title = 'A picture of my Stratocaster';
$metadata = array (
'latitude' => 43.883,
'longitude' => 11.1,
'width' => 2240,
'height' => 1680, // ... EXIF?
);
$thumbnail = ... // Binary stream
$picture = ... // Binary stream
sabato 18 maggio 13
54. INSERT using PDO
// $access_layer_conninfo = 'pgsql:...'
$connection = new PDO($access_layer_conninfo);
$stmt = $connection->prepare('SELECT
insert_picture(?, ?, ?, ?, ?)'); // NOSQL!
$stmt->bindParam(1, $user_id, PDO::PARAM_INT);
$stmt->bindParam(2, $title, PDO::PARAM_STR);
$stmt->bindParam(3, json_encode($metadata),
PDO::PARAM_STR);
$stmt->bindParam(4, $thumbnail, PDO::PARAM_LOB);
$stmt->bindParam(5, $picture, PDO::PARAM_LOB);
$stmt->execute(); // TODO: Check and get the ID
sabato 18 maggio 13
55. Memcached
•Distributed system for caching of objects in memory
• key/value pairs
• volatile data
•pgmemcache extension
•Integration of database and applications (read scaling)
sabato 18 maggio 13
58. Test-Driven DB development
•Database development
technique
•Write tests before actual
features
•pgTAP is an extension for
TDDD in PostgreSQL
•Allows to check for objects,
privileges, functions,
triggers, data, queries, etc.
•Transactional
•Provides with statistics
• via SQL or pg_prove
•Supports “Test Anything
Protocol” (TAP)
• Continuous integration
sabato 18 maggio 13
59. BEGIN;
SELECT plan(3);
-- insert test 1 here
-- insert test 2 here
-- insert test 3 here
SELECT * FROM finish();
ROLLBACK;
pgTAP group test example
sabato 18 maggio 13
60. SELECT results_eq(
'SELECT * FROM active_users()',
'SELECT * FROM users WHERE active',
'active_users() should return active
users'
);
pgTAP example with ‘results_eq’
sabato 18 maggio 13
61. Deployment with EXTENSION
•Extensibility
•Deployment is a crucial part
of any ICT organisation
•PostgreSQL 9.1 introduces
EXTENSION for packaging
in-database applications
•Fully transactional (atomic)
•Allows for upgrades and
downgrades
•Simplifies logical dump/
restore of objects
•DevOps with EXTENSION:
• Devs write extensions
• Ops deploy them
sabato 18 maggio 13
62. Extensions in detail
•Extensions require developer to write:
• Control file (metadata, settings, dependencies, ...)
• Installation scripts (SQL statements)
• Update scripts (SQL statements, optional)
• Source code / binaries (optional)
sabato 18 maggio 13
63. Extension management
-- INSTALLATION
CREATE EXTENSION pictures_app;
-- REMOVAL
DROP EXTENSION pictures_app;
-- UPDATE (fully atomic)
ALTER EXTENSION pictures_app UPDATE;
sabato 18 maggio 13
65. Conclusions
•Postgres is much more than a “traditional” database
•PHP integrates perfectly with it, in multiple flavours (native
interface, PDO, Doctrine, etc.)
•Start exploiting “vertical scalability”
•Always perform benchmark!
•Evaluate costs and complexity of “horizontal scalability”
•Postgres is “business friendly” in terms of licensing
sabato 18 maggio 13
66. Feel free to approach me about
•Open Source Community
•License
•Total Cost of Ownership
•Development life cycle
•Quality of code and
documentation
•Business Continuity
•Security and data protection
•PostgreSQL Eco-system
•Performance and scalability
•Software development
sabato 18 maggio 13
67. Community
•Next events:
• PGDay.IT (Autumn 2013)
• PGConf.EU (Dublin, Oct 29
- Nov 1, 2013)
•Italian PostgreSQL Users
Group (ITPUG)
• www.itpug.org
•PostgreSQL Europe
• www.postgresql.eu
sabato 18 maggio 13