PostgreSQL is a well-known relational database. But in the last few years, it has gained capabilities that previously belonged only to "NoSQL" databases. In this talk, I describe several features of PostgreSQL that give it such capabilities.
4. Writing
• Linux Journal
• Blog: http://blog.lerner.co.il/
• Tweeting: @reuvenmlerner
• ebook: "Practice Makes Python"
• E-mail courses
• My programming newsletter
5. Curating
• Full-stack Web development
• http://DailyTechVideo.com • @DailyTechVideo
• Learning Mandarin Chinese?
• http://MandarinWeekly.com • @MandarinWeekly
6. What is a database?
• Store data securely
• Retrieve data flexibly
• Do this as efficiently as possible
7. My first database
• Text files!
• They're really fast to work with
• They're really flexible
• But all of the data handling is in our application!
• So things are slow
• And when there's more than one user, it gets bad
8. Things would be better if:
• The database let us structure our data
• The database did most of the computing work (high
speed and centralized), freeing up our application
• The database handled constraints and errors
• The database took care of simultaneous reads and
writes in the form of transactions
• The database handled errors well, reporting them
rather than dying on us
9. Relational model
• E. F. Codd, an IBM researcher, proposed it in 1970
• Replaced the previous hierarchical model
• Normalized data = easier, more flexible
• Eight relational operations:
• Union, intersection, difference, product
• Selection (WHERE), projection (select a, b), join,
division
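As a rough illustration, most of these operations map directly onto SQL. The tables, columns, and rows below are invented for the example. Division has no dedicated SQL operator, so it is left out of this sketch.

```sql
-- Hypothetical tables, made up for this illustration only
CREATE TEMP TABLE staff    (name TEXT, city TEXT);
CREATE TEMP TABLE students (name TEXT, city TEXT);

INSERT INTO staff    VALUES ('Ann', 'Haifa'), ('Bob', 'Eilat');
INSERT INTO students VALUES ('Bob', 'Eilat'), ('Cal', 'Haifa');

-- Union, intersection, difference
SELECT name, city FROM staff UNION     SELECT name, city FROM students;
SELECT name, city FROM staff INTERSECT SELECT name, city FROM students;
SELECT name, city FROM staff EXCEPT    SELECT name, city FROM students;

-- Product: every staff row paired with every student row
SELECT * FROM staff CROSS JOIN students;

-- Selection (WHERE) combined with projection (the column list)
SELECT name FROM staff WHERE city = 'Haifa';

-- Join: staff and students who share a city
SELECT s.name AS staff_name, t.name AS student_name
  FROM staff s
  JOIN students t ON s.city = t.city;
```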
10. Query languages
• Codd spoke in terms of mathematics.
• This was implemented using query languages
• SQL was not the first, or the only, query language!
• Codd wrote Alpha
• Stonebraker wrote Quel
• IBM (but not Codd!) wrote SEQUEL
• Larry Ellison made his own version of SEQUEL… and thus
was born the new, more generic name, SQL
11. Brief history
• 1977-1985: Ingres (Stonebraker)
• 1986-1994: Postgres (Stonebraker)
• 1995: Postgres + SQL = PostgreSQL
• 1996: Open-source project, run by the
“global development group”
• Ever since, one major release per year
• Current is 9.4, with 9.5 due in the autumn
12. It's getting popular…
• Rock solid
• High performance
• Extensible
• Heroku
• (Also: Thanks, Oracle!)
13. So, what is NoSQL?
• It's not really NoSQL.
• Rather, it's non-relational.
15. So, why NoSQL?
• Not everything is easily represented with tables
• Sometimes we want a more flexible schema — the
database equivalent of dynamic typing
• Some data is bigger, or comes faster, than a single
relational database can handle
16. NoSQL isn't a definition!
• "I want to travel using a non-flying vehicle."
• "I want a non-meat dinner."
• "I want to read a non-fiction book."
17. Key-value stores
• Examples: Redis, Riak
• Think of it as a hash table server, with typed data
• Especially useful for caching, but also good for
many name-value data sets
• Very fast, very reliable, very useful
19. What's wrong with this?
• New systems to learn, install, configure, and tune
• New query language(s) to learn, often without the
expressive power of SQL
• Non-normalized data!
• Splitting our data across different systems might
lead to duplication or corruption
• What about transactions? What about ACID?
20. Is NoSQL wrong?
• No, of course not.
• Different needs require different solutions.
• But let's not throw out 40+ years of database
research, just because NoSQL is new and cool.
• Engineering is all about trade-offs. There is no
perfect solution. Optimize for certain things.
22. SQL vs. NoSQL
• As a developer, I can then choose between SQL
and NoSQL
• NoSQL can be faster, more flexible, and easier
• But SQL databases have a lot of advantages, and
it's a shame to throw out so many years of
advancement
23. But wait!
• PostgreSQL has an extension mechanism
• Add new data types
• Add new functions
• Connect to external databases
• PostgreSQL is becoming a platform for data
storage and retrieval, and not just a database
24. HSTORE
• HSTORE is a data type, just like INTEGER,
TIMESTAMP, or TEXT
• If you define a column as HSTORE, it can contain
key-value pairs
• Keys and values are both strings
25. Create a table
CREATE EXTENSION HSTORE;
CREATE TABLE People (
    id SERIAL,
    info HSTORE,
    PRIMARY KEY(id)
);
26. Add a HSTORE value
INSERT INTO people(info)
VALUES ('foo=>1, bar=>abc, baz=>stuff');
27. Look at our values
[local]/reuven=# select * from people;
+----+------------------------------------------+
| id |                   info                   |
+----+------------------------------------------+
|  1 | "bar"=>"abc", "baz"=>"stuff", "foo"=>"1" |
+----+------------------------------------------+
(1 row)
28. Add (or replace) a pair
UPDATE People
SET info = info || 'abc=>def';
30. What else?
• Everything you would want in a hash table…
• Check for a key
• Remove a key-value pair
• Get the keys
• Get the values
• Turn the hstore into a PostgreSQL array or JSON
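A minimal sketch of those hash-table operations, using literal hstore values so that it does not depend on any table. It assumes the hstore extension is available to install.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Check for a key (returns a boolean)
SELECT 'foo=>1, bar=>abc'::hstore ? 'foo';

-- Look up the value stored under a key
SELECT 'foo=>1, bar=>abc'::hstore -> 'bar';

-- Remove a key-value pair
SELECT delete('foo=>1, bar=>abc'::hstore, 'foo');

-- Get the keys, and the values, as PostgreSQL arrays
SELECT akeys('foo=>1, bar=>abc'::hstore),
       avals('foo=>1, bar=>abc'::hstore);

-- Turn the whole hstore into an array or into JSON
SELECT hstore_to_array('foo=>1, bar=>abc'::hstore),
       hstore_to_json('foo=>1, bar=>abc'::hstore);
```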
31. Indexes
• PostgreSQL has several types of indexes
• You can index HSTORE columns with GIN and
GiST indexes, which let you search inside
• You can also index HSTORE columns with HASH
indexes, for finding equal values
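A sketch of what this might look like, assuming the People table with an HSTORE column named info from the earlier slides. The index names are made up.

```sql
-- GIN index: supports searching inside the hstore
CREATE INDEX people_info_gin ON People USING GIN (info);

-- Queries that look inside the hstore and can use the GIN index
SELECT * FROM People WHERE info @> 'bar=>abc';  -- contains this pair
SELECT * FROM People WHERE info ? 'foo';        -- has this key

-- HASH index: only helps when comparing whole values for equality
CREATE INDEX people_info_hash ON People USING HASH (info);

SELECT * FROM People WHERE info = 'foo=>1'::hstore;
```

Whether the planner actually uses either index depends on table size and statistics; EXPLAIN will show what it chose.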
32. HSTORE isn't Redis
• But it does give you lots of advantages
• Super reliable
• CHECK constraints
• Combine HSTORE queries with other queries
• Transactions!
• Master-slave replication for scalability
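To make the constraint and transaction points concrete, here is a small sketch. The Contacts table, its rule that each hstore must have an "email" key, and the sample values are all hypothetical.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical table: any non-NULL info must include an "email" key
CREATE TABLE Contacts (
    id   SERIAL PRIMARY KEY,
    info HSTORE CHECK (info ? 'email')
);

-- Both statements succeed or fail together
BEGIN;
INSERT INTO Contacts (info)
VALUES ('"name"=>"Reuven", "email"=>"reuven@example.com"');

UPDATE Contacts
   SET info = info || '"lang"=>"Python"'
 WHERE info -> 'name' = 'Reuven';
COMMIT;

-- Rejected by the CHECK constraint, because there is no "email" key
INSERT INTO Contacts (info)
VALUES ('"name"=>"Nobody"');
```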
33. JSON and JSONB
• In the last few versions, PostgreSQL has added
JSON support
• First, basic JSON support
• Then, some added operators
• Now, JSONB support — high-speed binary
JSON storage
34. Creating a table with JSONB
CREATE TABLE People (
    id SERIAL,
    info JSONB
);
35. Adding values
INSERT INTO people (info)
VALUES ('{"first":"Reuven",
"last":"Lerner"}'),
('{"first":"Atara",
"last":"Lerner-Friedman"}');
36. Retrieving values
select info from people;
+-----------------------------------------------+
|                     info                      |
+-----------------------------------------------+
| {"last": "Lerner", "first": "Reuven"}         |
| {"last": "Lerner-Friedman", "first": "Atara"} |
+-----------------------------------------------+
(2 rows)
37. Extract
SELECT info->'last' as last,
info->'first' as first
FROM People;
┌───────────────────┬──────────┐
│       last        │  first   │
├───────────────────┼──────────┤
│ "Lerner"          │ "Reuven" │
│ "Lerner-Friedman" │ "Atara"  │
└───────────────────┴──────────┘
(2 rows)
38. Use the inside data
select * from people order by info->'first' DESC;
+----+-----------------------------------------------+
| id |                     info                      |
+----+-----------------------------------------------+
|  4 | {"last": "Lerner", "first": "Reuven"}         |
|  5 | {"last": "Lerner-Friedman", "first": "Atara"} |
+----+-----------------------------------------------+
(2 rows)
39. JSONB operators
• Checking for existence
• Reading inside of the JSONB
• Retrieving data as text, or as JSON objects
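A short sketch of these operators, written against literal JSONB values so it runs without any table. The nested "name" document in the path example is invented for illustration.

```sql
-- Check whether a top-level key exists (returns a boolean)
SELECT '{"first":"Reuven","last":"Lerner"}'::jsonb ? 'first';

-- -> returns JSONB, so a string comes back with its double quotes
SELECT '{"first":"Reuven","last":"Lerner"}'::jsonb -> 'last';

-- ->> returns plain text, without the quotes
SELECT '{"first":"Reuven","last":"Lerner"}'::jsonb ->> 'last';

-- #>> follows a path into nested data and returns text
SELECT '{"name":{"first":"Reuven"}}'::jsonb #>> '{name,first}';

-- @> checks whether the left value contains the right one
SELECT '{"first":"Reuven","last":"Lerner"}'::jsonb @> '{"last":"Lerner"}';
```

The difference between -> and ->> matters when comparing against ordinary text columns or sorting by a field as text.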
40. Indexes
• You can even index your JSONB columns!
• You can use functional and partial indexes on
JSONB
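The index types mentioned here could be sketched as follows, assuming the People table with a JSONB column named info from the earlier slides. The index names and the choice of fields are only examples.

```sql
-- General-purpose GIN index over the whole document;
-- supports operators such as @> and ?
CREATE INDEX people_jsonb_gin ON People USING GIN (info);

SELECT * FROM People WHERE info @> '{"last":"Lerner"}';

-- Functional index on a single extracted field
CREATE INDEX people_last_idx ON People ((info ->> 'last'));

SELECT * FROM People WHERE info ->> 'last' = 'Lerner';

-- Partial index: only rows matching the WHERE clause are indexed
CREATE INDEX people_lerner_first_idx ON People ((info ->> 'first'))
 WHERE info ->> 'last' = 'Lerner';
```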
41. Performance
• EnterpriseDB (a PostgreSQL support company)
compared JSONB with MongoDB
• High-volume inserts: PostgreSQL was 2.2x faster
than MongoDB
• Inserts: PostgreSQL was 3x faster
• Disk space: MongoDB used 35% more
• JSONB is slower than MongoDB in updates, however
42. Foreign data wrappers
• Let's say that you have a NoSQL database
• However, you want to integrate that data into your
PostgreSQL system
• That's fine — just use a "foreign data wrapper"
• To PostgreSQL, it looks like a table. But in reality,
it's retrieving (and setting) data in the NoSQL
database!
43. Using an FDW
• Download, install the extension
• Create a foreign server
• Create a foreign table
• Now you can read from and write to the foreign
table
• How is NoSQL mapped to a table? Depends on
the FDW
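The steps above might look roughly like this with mongo_fdw, one FDW for MongoDB. Treat it as a sketch under assumptions: the option names follow that extension's documentation as best I recall and may differ between versions, and the server address, credentials, database, collection, and columns are all placeholders. It also needs the extension installed and a reachable MongoDB server.

```sql
-- 1. Install the extension (it must already be built for this server)
CREATE EXTENSION mongo_fdw;

-- 2. Create a foreign server pointing at the MongoDB instance
--    (address and port are placeholder values)
CREATE SERVER mongo_server
    FOREIGN DATA WRAPPER mongo_fdw
    OPTIONS (address '127.0.0.1', port '27017');

-- Credentials for the remote side (placeholder values)
CREATE USER MAPPING FOR CURRENT_USER
    SERVER mongo_server
    OPTIONS (username 'mongo_user', password 'mongo_pass');

-- 3. Create a foreign table that maps a collection to columns
--    (hypothetical database, collection, and fields)
CREATE FOREIGN TABLE warehouse (
    _id            NAME,
    warehouse_id   INTEGER,
    warehouse_name TEXT
)
    SERVER mongo_server
    OPTIONS (database 'db', collection 'warehouse');

-- 4. Query it like any other table
SELECT warehouse_id, warehouse_name FROM warehouse;
```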
45. Schema changes
• NoSQL loves to talk about "no schemas"
• But schemas make our data predictable, and help
us to exclude bad data
• You can always use ALTER TABLE to change the
schema — adding, removing, and renaming
columns, or even modifying data types or
constraints
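For example, each of those changes is a single statement. The Customers table and its columns are hypothetical.

```sql
-- Hypothetical starting point
CREATE TABLE Customers (
    id   SERIAL PRIMARY KEY,
    name TEXT,
    zip  INTEGER
);

-- Add a column
ALTER TABLE Customers ADD COLUMN email TEXT;

-- Rename a column
ALTER TABLE Customers RENAME COLUMN name TO full_name;

-- Change a column's data type, saying how to convert existing values
ALTER TABLE Customers ALTER COLUMN zip TYPE TEXT USING zip::TEXT;

-- Add a constraint
ALTER TABLE Customers
    ADD CONSTRAINT email_has_at CHECK (email LIKE '%@%');

-- Remove a column
ALTER TABLE Customers DROP COLUMN zip;
```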
46. Summary
• New problems can require new solutions
• But let's not give up all of the great solutions we've
created over the last few decades
• PostgreSQL has proven itself, time and again, as
an SQL solution
• But it's becoming a platform — one which includes
NoSQL data types, and integrates with NoSQL
databases
47. Any questions?
• Ask me now, or:
• reuven@lerner.co.il
• @reuvenmlerner
• http://lerner.co.il/