With 9.4 came logical decoding but what is it and how can it be used? Besides being a precursor to bi-directional replication there are plenty of use cases for this and many don't even require you to implement a plugin. We'll look at trigger-less auditing, partial replication and full statement replication.
1. Migrating Rant & Rave to PostgreSQL
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2014
Fun Things to do with Logical Decoding
Mike Fowler
PGDay UK 2015
2. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Overview
● What is Logical Decoding?
● Replications Slots
● Output Plugins
● Enabling Logical Decoding
● What does Logical Decoding make possible?
3. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
About Me
● Head of Systems Engineering at Rant & Rave
● Been using PostgreSQL for over 10 years (since 7.4)
● Contributed some XML support
– XMLEXISTS/xpath_exists()
– xml_is_well_formed()
● Contributed a number of bugfixes to the JDBC
driver
4. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
What is Logical Decoding?
“The process of extracting all persistent changes to
a database's tables into a coherent, easy to
understand format which can be interpreted
without detailed knowledge of the database's
internal state”
“Implemented by decoding the contents of the
write-ahead log, which describe changes on a
storage level, into an application-specific form such
as a stream of tuples or SQL statements”
5. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
No, really - what is Logical Decoding?
● The results of logical decoding give a perspective on the
logical changes made during a transaction
● The process of logical decoding allows human
understanding of the physical changes made during a
transaction
● Achieved by creating a replication slot with a plugin to
produce data for a receiver
– Basic example plugin available in contrib
– New utility pg_recvlogical to consume plugin output
– Can roll your own to match your requirements
6. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Some PostgreSQL Basics
● PostgreSQL is a transactional database
● All changes in PostgreSQL are recorded in a Write
Ahead Log (WAL) before any data on disk is actually
changed
● The WAL is the key to replication
– Essentially log based binary replication
– Master runs the walsender to transmit new segments
– Slaves run walreceivers to receive the segments
7. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
A Basic Example
We'll use a simple table called sample:
CREATE TABLE sample (
id SERIAL PRIMARY KEY,
data JSON,
captured TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
);
8. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
A Basic Example
INSERT INTO sample (data)
VALUES
('{"beverage":"coffee","temp":96}');
● The backend takes our INSERT and calculates the
values for the omitted columns id and captured
● The backend also determines exactly what needs to
be changed on disk and commits this to the WAL
9. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Replication Slots
● 9.4 introduced replication slots
● Capability to retain WAL segments on the server
instead of relying on continuous archiving
● Allows master to retain segments even when a
slave is offline
● There are two types of replication slot
– Physical
– Logical
10. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Logical Replication Slots
● Entries in a logical slot are WAL segments that have
been decoded by means of an output plugin
● Output plugins are responsible for transforming the
data from the WAL into the format the consumer of
a logical replication slot desires
● contrib/test_decoding is a working example of the
interface and can be used to play
11. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Output Plugin Rules
● Only transactions that have been successfully
flushed to disk are decoded
● Transactions that are rolled back never get decoded
● Changes to UNLOGGED or TEMPORARY tables are
also not decoded
● Concurrent transactions are decoded in commit
order
12. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Making an Output Plugin
● Consists of building a shared library that implements a number of
functions as detailed in section 46.6 of the manual
– An initialization function
– Transaction begin, row change and transaction end callbacks
– Optional plugin start & stop callbacks
– 9.5 also adds an optional origin filter callback
● For a given transaction the begin & end is called once with the row
change called for each changed row even if multiple rows were
changed in a single statement
13. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Output Plugin Considerations
● Keep the plugin simple
– It's running as part of your database
– It's going to process all your changes
● Offload complex filter/analysis to the slot consumer
● Drop the slot if it's no longer in use to stop using
server resourcesSELECT
pg_drop_replication_slot('pgday_slot');
14. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Enabling Logical Decoding
● Enable replication slots (will require restart)
– wal_level must be set to 'logical'
– max_replication_slots must be at least 1
● Create a replication slot
pgday15=# SELECT * FROM
pg_create_logical_replication_slot('pgday_slot',
'test_decoding');
slot_name | xlog_position
------------+---------------
pgday_slot | 0/16C9A78
(1 row)
15. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Revisiting the Basic Example
We used a simple table called sample:
CREATE TABLE sample (
id SERIAL PRIMARY KEY,
data JSON,
captured TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
);
16. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Revisiting the Basic Example
INSERT INTO sample (data)
VALUES ('{"beverage":"coffee","temp":96}');
● The backend took our INSERT and calculated the values for
the omitted columns id and captured
● The backend also determined exactly what needed to be
changed on disk and committed it to the WAL
● This time the test_decoding plugin was invoked when the
WAL segment entered the logical replication slot
17. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Consuming the Plugin Output
● All data from the plugin will remain in the slot until
consumed
● Data can be consumed in two ways
– SQL interface
● Useful for experimentation and demonstration
● Can be used to form a basic programmatic integration with non-C
applications (e.g. Java using JDBC)
– pg_recvlogical
● Similar to pg_receivexlog
● Can be used to output to a file for later consumption
18. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
The Output from our Example
pgday15=# SELECT * FROM
pg_logical_slot_get_changes('pgday_slot', NULL, NULL);
location | xid | data
-----------+-----+----------------------------------
0/16C9ED0 | 723 | BEGIN 723
0/16C9ED0 | 723 | table public.sample: INSERT:
id[integer]:1
data[json]:'{"beverage":"coffee","temp":96}'
captured[timestamp without tz]:'2015-07-03 19:52:39.497161'
0/16CA000 | 723 | COMMIT 723
(3 rows)
19. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
pg_recvlogical
● Use pg_recvlogical to record to a single file
– -f option to specify output file
– -F to specify frequency of fsync in seconds
● Run pg_recvlogical on a separate server
– -h to specify hostname or ip
– -p port
● Can also create/delete slots
20. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Some simple LD Use Cases
● Trigger-less auditing
– Keep a record of all changes in transaction order
– Change only query log
● Trigger auditing
● Basic throughput analysis
21. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Limitations of test_decoding
● Only details the new/changed row
● No detail of a WHERE clause
● No timestamps in output to help cross reference
● Changed indexes don't make for easy reading
table public.sample: UPDATE: old-key: id[integer]:3 new-tuple: id[integer]:4
data[json]:'{"beverage":"coffee","temp":96}' captured[timestamp without time
zone]:'2015-07-07 00:07:56.305964'
22. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Enhancing test_decoding
● Plugins can use most of backend infrastructure
– Can use output functions
– Read only access to relations
– Can not use anything that requires a transaction
● Information logged to WAL can be tuned with
REPLICA IDENTITY
– Can be adjusted per table with ALTER TABLE
– DEFAULT keeps old value of PRIMARY KEY
– USING INDEX will keep values of columns covered
23. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Example: decoder_raw
● Enhanced version of test_decoding by Michael Paquier
● Appends WHERE clauses based on data derived from
REPLICA IDENTITY
– If DEFAULT, PRIMARY KEY column(s) appear in WHERE
● Removes the commit number from BEGIN and COMMIT
● Available on Git Hub
https://github.com/michaelpq/pg_plugins/tree/master/decoder_r
aw
24. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
decoder_raw Output
=# SELECT slot_name
FROM
pg_create_logical_replication_slot('custom_slot',
'decoder_raw');
slot_name
--------------
custom_slot
(1 row)
=# CREATE TABLE aa (a int primary key, b text);
CREATE TABLE
=# INSERT INTO aa VALUES (1, 'aa'), (2, 'bb');
INSERT 0 2
=# UPDATE aa SET b = 'cc' WHERE a = 1;
UPDATE 1
=# DELETE FROM aa WHERE a = 1;
DELETE 1
25. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
decoder_raw Output
=# SELECT data
FROM pg_logical_slot_peek_changes('custom_slot',
NULL, NULL, 'include-transaction', 'on');
data
----------------------------------------------------
BEGIN;
COMMIT;
BEGIN;
INSERT INTO public.aa (a, b) VALUES (1, 'aa');
INSERT INTO public.aa (a, b) VALUES (2, 'bb');
COMMIT;
BEGIN;
UPDATE public.aa SET a = 1, b = 'cc' WHERE a = 1 ;
COMMIT;
BEGIN;
DELETE FROM public.aa WHERE a = 1 ;
COMMIT;
(12 rows)
26. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Partial Replication
● Can be achieved in two ways
– Plugin can filter out unneeded tables
– Slot consumer can discard unneeded tables
● Useful if you have write notes independent of a
cluster
● Streaming partial replication is possible
– If output of plugin is well structured pg_recvlogical
could be piped straight into a psql client
27. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Full Statement Replication
● Changes can be streamed to other DBMS platforms
– Allows replication to non-PostgreSQL servers
– Useful for keeping a backup during a migration to
PostgreSQL
● Differs from most versions of statement replication
– The master computes the statement and determines the
changes
– Values of columns have already been computed so will be
identical (e.g. CURRENT_TIMESTAMP)
28. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Conclusion
● Logical Decoding exposes meta-data about
committed transactions
● Plugins can be written to enrich and format the
meta-data for placement in replication slots
● Data in replication slots can be consumed by
external programs using SQL or
pg_recvlogical to perform a number of useful
tasks
29. Migrating Rant & Rave to PostgreSQL
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2014
Questions?
Mike Fowler
PGDay UK 2015
30. Migrating Rant & Rave to PostgreSQL
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2014
Thank you!
Mike Fowler
PGDay UK 2015