Getting Started with
      PL/Proxy
          Peter Eisentraut
        peter@eisentraut.org

          F-Secure Corporation



  PostgreSQL Conference East 2011



                                    CC-BY
Concept



  • a database partitioning system implemented as a
   procedural language
  • “sharding”/horizontal partitioning
  • PostgreSQL’s No(t-only)SQL solution
Concept


          application   application        application   application



                                frontend



          partition 1   partition 2        partition 3   partition 4
Areas of Application


   • high write load
   • (high read load)
   • allow for some “eventual consistency”
   • have reasonable partitioning keys
   • use/plan to use server-side functions
Example
 Have [1]:
 CREATE TABLE products (
     prod_id serial PRIMARY KEY ,
     category integer NOT NULL ,
     title varchar (50) NOT NULL ,
     actor varchar (50) NOT NULL ,
     price numeric (12 ,2) NOT NULL ,
     special smallint ,
     common_prod_id integer NOT NULL
 );

 INSERT INTO products VALUES (...) ;
 UPDATE products SET ... WHERE ...;
 DELETE FROM products WHERE ...;
 plus various queries

   [1] dellstore2 example database
Installation



   • Download: http://plproxy.projects.postgresql.org,
     Deb, RPM, . . .
   • Create language: psql -d dellstore2 -f
     ...../plproxy.sql
Backend Functions I
  CREATE FUNCTION insert_product ( p_category int ,
       p_title varchar , p_actor varchar , p_price
       numeric , p_special smallint ,
       p_common_prod_id int ) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int ;
  BEGIN
        INSERT INTO products ( category , title ,
           actor , price , special , common_prod_id )
           VALUES ( p_category , p_title , p_actor ,
           p_price , p_special , p_common_prod_id ) ;
        GET DIAGNOSTICS cnt = ROW_COUNT ;
        RETURN cnt ;
  END ;
  $$ ;
Backend Functions II
  CREATE FUNCTION update_product_price ( p_prod_id
       int , p_price numeric ) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int ;
  BEGIN
        UPDATE products SET price = p_price WHERE
            prod_id = p_prod_id ;
        GET DIAGNOSTICS cnt = ROW_COUNT ;
        RETURN cnt ;
  END ;
  $$ ;
Backend Functions III

  CREATE FUNCTION delete_product_by_title ( p_title
       varchar ) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int ;
  BEGIN
        DELETE FROM products WHERE title = p_title ;
        GET DIAGNOSTICS cnt = ROW_COUNT ;
        RETURN cnt ;
  END ;
  $$ ;
Frontend Functions I
  CREATE FUNCTION insert_product ( p_category int ,
       p_title varchar , p_actor varchar , p_price
       numeric , p_special smallint ,
       p_common_prod_id int ) RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON hashtext ( p_title ) ;
  $$ ;

  CREATE FUNCTION update_product_price ( p_prod_id
       int , p_price numeric ) RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL ;
  $$ ;
Frontend Functions II


  CREATE FUNCTION delete_product_by_title ( p_title
       varchar ) RETURNS int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON hashtext ( p_title ) ;
  $$ ;
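Applications call the frontend functions on the frontend database like ordinary functions. A brief sketch, assuming the cluster configuration from the later slides is already in place; the argument values are made up for illustration:

```sql
-- Routed to one partition, chosen by hashtext(p_title).
-- The explicit smallint cast is needed so the call matches the declared signature.
SELECT * FROM insert_product(1, 'EXAMPLE TITLE', 'EXAMPLE ACTOR', 9.99, 0::smallint, 1);

-- Runs on every partition; each partition reports its own row count,
-- so most of the returned rows will be 0.
SELECT * FROM update_product_price(42, 12.50);

-- Routed by the same title hash, so it reaches the partition
-- that received the insert above.
SELECT delete_product_by_title('EXAMPLE TITLE');
```

Because prod_id is not the hash key, the price update has to ask all partitions, while the title-based calls touch only one.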
Frontend Query Functions I


  CREATE FUNCTION get_product_price ( p_prod_id
       int ) RETURNS SETOF numeric
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL ;
  SELECT price FROM products WHERE prod_id =
       p_prod_id ;
  $$ ;
Frontend Query Functions II

  CREATE FUNCTION
       get_products_by_category ( p_category int )
       RETURNS SETOF products
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL ;
  SELECT * FROM products WHERE category =
       p_category ;
  $$ ;
Unpartitioned Small Tables


  CREATE FUNCTION insert_category ( p_categoryname varchar )
       RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON 0;
  $$ ;
Which Hash Key?



   • natural keys (names, descriptions, UUIDs)
   • not serials (Consider using fewer “ID” fields.)
   • single columns
   • group sensibly to allow joins on backend
Set Basic Parameters

   • number of partitions (2^n), e.g. 8
   • host names, e. g.
       • frontend: dbfe
       • backends: dbbe1, . . . , dbbe8
   • database names, e. g.
       • frontend: dellstore2
       • backends: store01, . . . , store08
   • user names, e. g. storeapp
   • hardware:
       • frontend: lots of memory, normal disk
       • backends: full-sized database server
Set Basic Parameters

   • number of partitions (2^n), e.g. 8
   • host names, e. g.
       • frontend: dbfe
       • backends: dbbe1, . . . , dbbe8 (or start at 0?)
   • database names, e. g.
       • frontend: dellstore2
       • backends: store01, . . . , store08 (or start at 0?)
   • user names, e. g. storeapp
   • hardware:
       • frontend: lots of memory, normal disk
       • backends: full-sized database server
Configuration
  CREATE FUNCTION
     plproxy . get_cluster_partitions ( cluster_name
     text ) RETURNS SETOF text LANGUAGE plpgsql AS
     $$ ... $$ ;

  CREATE FUNCTION
     plproxy . get_cluster_version ( cluster_name
     text ) RETURNS int LANGUAGE plpgsql AS
     $$ ... $$ ;

  CREATE FUNCTION plproxy . get_cluster_config ( IN
     cluster_name text , OUT key text , OUT val
     text ) RETURNS SETOF record LANGUAGE plpgsql
     AS $$ ... $$ ;
get_cluster_partitions
  Simplistic approach:
  CREATE FUNCTION
       plproxy . get_cluster_partitions ( cluster_name
       text ) RETURNS SETOF text
  LANGUAGE plpgsql
  AS $$
  BEGIN
        IF cluster_name = 'dellstore_cluster' THEN
             RETURN NEXT 'dbname=store01 host=dbbe1';
             RETURN NEXT 'dbname=store02 host=dbbe2';
             ...
             RETURN NEXT 'dbname=store08 host=dbbe8';
             RETURN ;
        END IF ;
        RAISE EXCEPTION 'Unknown cluster';
  END ;
  $$ ;
get_cluster_version
  Simplistic approach:
  CREATE FUNCTION
      plproxy . get_cluster_version ( cluster_name
      text ) RETURNS int
  LANGUAGE plpgsql
  AS $$
  BEGIN
        IF cluster_name = 'dellstore_cluster' THEN
            RETURN 1;
        END IF ;
        RAISE EXCEPTION 'Unknown cluster';
  END ;
  $$ ;
get_cluster_config
  CREATE OR REPLACE FUNCTION
       plproxy . get_cluster_config ( IN cluster_name
       text , OUT key text , OUT val text ) RETURNS
       SETOF record
  LANGUAGE plpgsql
  AS $$
  BEGIN
        -- same config for all clusters
        key := 'connection_lifetime';
        val := 30*60; -- 30 minutes, in seconds
        RETURN NEXT ;
        RETURN ;
  END ;
  $$ ;
Table-Driven Configuration I
  CREATE TABLE plproxy . partitions (
      cluster_name text NOT NULL ,
      host text NOT NULL ,
      port text NOT NULL ,
      dbname text NOT NULL ,
      PRIMARY KEY ( cluster_name , dbname )
  );

  INSERT INTO plproxy . partitions VALUES
  ( 'dellstore_cluster' , 'dbbe1' , '5432' ,
       'store01') ,
  ( 'dellstore_cluster' , 'dbbe2' , '5432' ,
       'store02') ,
  ...
  ( 'dellstore_cluster' , 'dbbe8' , '5432' ,
       'store08') ;
Table-Driven Configuration II

  CREATE TABLE plproxy . cluster_users (
      cluster_name text NOT NULL ,
      remote_user text NOT NULL ,
      local_user text NOT NULL ,
      PRIMARY KEY ( cluster_name , remote_user ,
         local_user )
  );

  INSERT INTO plproxy . cluster_users VALUES
  ( 'dellstore_cluster' , 'storeapp' , 'storeapp') ;
Table-Driven Configuration III
  CREATE TABLE plproxy . remote_passwords (
      host text NOT NULL ,
      port text NOT NULL ,
      dbname text NOT NULL ,
      remote_user text NOT NULL ,
      password text ,
      PRIMARY KEY ( host , port , dbname ,
         remote_user )
  );

  INSERT INTO plproxy . remote_passwords VALUES
  ( 'dbbe1' , '5432' , 'store01' , 'storeapp' ,
       'Thu1Ued0') ,
  ...

  -- or use . pgpass ?
Table-Driven Configuration IV

  CREATE TABLE plproxy . cluster_version (
      id int PRIMARY KEY
  );

  INSERT INTO plproxy . cluster_version VALUES (1) ;

  GRANT SELECT ON plproxy . cluster_version TO
     PUBLIC ;

  /* extra credit : write trigger that changes the
     version when one of the other tables changes
     */
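One possible answer to the extra-credit exercise, as a minimal sketch. The function and trigger names are hypothetical, and it assumes the three configuration tables from the previous slides already exist and that whoever edits them may also update plproxy.cluster_version:

```sql
-- Hypothetical trigger function: bump the single version row
-- whenever a configuration table changes.
CREATE FUNCTION plproxy.bump_cluster_version() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE plproxy.cluster_version SET id = id + 1;
    RETURN NULL;  -- the return value is ignored for statement-level AFTER triggers
END;
$$;

-- One statement-level trigger per configuration table.
CREATE TRIGGER partitions_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.partitions
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER cluster_users_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.cluster_users
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER remote_passwords_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.remote_passwords
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();
```

Statement-level triggers keep a bulk change to one version bump per statement rather than one per row.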
Table-Driven Configuration V
  CREATE OR REPLACE FUNCTION plproxy . get_cluster_partitions ( p_cluster_name text )
         RETURNS SETOF text
  LANGUAGE plpgsql
  SECURITY DEFINER
  AS $$
  DECLARE
        r record ;
  BEGIN
        FOR r IN
             SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname ||
                   ' user=' || remote_user || ' password=' || password AS dsn
             FROM plproxy . partitions NATURAL JOIN plproxy . cluster_users NATURAL JOIN
                   plproxy . remote_passwords
             WHERE cluster_name = p_cluster_name
             AND local_user = session_user
             ORDER BY dbname      -- important
        LOOP
             RETURN NEXT r. dsn ;
        END LOOP ;
        IF NOT found THEN
             RAISE EXCEPTION 'no such cluster: %', p_cluster_name ;
        END IF ;
        RETURN ;
  END ;
  $$ ;
Table-Driven Configuration VI
  CREATE FUNCTION
       plproxy . get_cluster_version ( p_cluster_name
       text ) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        ret int ;
  BEGIN
        SELECT INTO ret id FROM
            plproxy . cluster_version ;
        RETURN ret ;
  END ;
  $$ ;
SQL/MED Configuration
 CREATE SERVER dellstore_cluster FOREIGN DATA
    WRAPPER plproxy
 OPTIONS (
     connection_lifetime '1800' ,
     p0 'dbname=store01 host=dbbe1' ,
     p1 'dbname=store02 host=dbbe2' ,
     ...
     p7 'dbname=store08 host=dbbe8'
 );

 CREATE USER MAPPING FOR storeapp SERVER
    dellstore_cluster
       OPTIONS ( user 'storeapp' , password
          'sekret') ;

 GRANT USAGE ON SERVER dellstore_cluster TO
    storeapp ;
Hash Functions


  RUN ON hashtext ( somecolumn ) ;

    • want a fast, uniform hash function
    • typically use hashtext
    • problem: implementation might change
    • possible solution: https://github.com/petere/pgvihash
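PL/Proxy takes the partition number from the low bits of the hash value, which is why the partition count must be a power of two. A small sketch for trying out a candidate key on an unpartitioned copy of the data, assuming 8 partitions (mask 7); the column alias partition_no is made up:

```sql
-- Partition index (0..7) each title would be routed to.
SELECT title, hashtext(title::text) & 7 AS partition_no
FROM products
LIMIT 10;

-- Rough check of how evenly the key spreads the rows.
SELECT hashtext(title::text) & 7 AS partition_no, count(*)
FROM products
GROUP BY 1
ORDER BY 1;
```

A badly skewed count in the second query suggests the column is a poor partitioning key.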
Sequences


 shard 1:
 ALTER SEQUENCE products_prod_id_seq MINVALUE 1
    MAXVALUE 100000000 START 1;
 shard 2:
 ALTER SEQUENCE products_prod_id_seq MINVALUE
    100000001 MAXVALUE 200000000 START 100000001
    RESTART 100000001;
 etc.
Aggregates
 Example: count all products
 Backend:
 CREATE FUNCTION count_products () RETURNS bigint
    LANGUAGE SQL STABLE AS $$SELECT count (*)
    FROM products$$ ;
 Frontend:
 CREATE FUNCTION count_products () RETURNS SETOF
      bigint LANGUAGE plproxy AS $$
 CLUSTER 'dellstore_cluster';
 RUN ON ALL ;
 $$ ;

 SELECT sum ( x ) AS count FROM count_products () AS
    t(x);
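Counts can simply be summed, but an average cannot be averaged again across partitions; each partition has to return a partial sum and a count. A sketch under that assumption, with the hypothetical function name price_stats and OUT parameter names total and n:

```sql
-- Backend (on every partition): partial sum and row count.
CREATE FUNCTION price_stats(OUT total numeric, OUT n bigint)
LANGUAGE sql STABLE
AS $$ SELECT sum(price), count(*) FROM products $$;

-- Frontend: collect one (total, n) row per partition.
CREATE FUNCTION price_stats(OUT total numeric, OUT n bigint)
RETURNS SETOF record
LANGUAGE plproxy
AS $$
CLUSTER 'dellstore_cluster';
RUN ON ALL;
$$;

-- Combine the partial results on the frontend.
-- NULLIF avoids a division by zero when there are no products at all.
SELECT sum(total) / NULLIF(sum(n), 0) AS avg_price FROM price_stats();
```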
Dynamic Queries I
 a.k.a. “cheating” ;-)
 Frontend:
  CREATE FUNCTION execute_query ( sql text ) RETURNS
       SETOF RECORD LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL ;
  $$ ;

 Backend:
  CREATE FUNCTION execute_query ( sql text ) RETURNS
       SETOF RECORD LANGUAGE plpgsql
  AS $$
  BEGIN
        RETURN QUERY EXECUTE sql ;
  END ;
  $$ ;
Dynamic Queries II

  SELECT * FROM execute_query ( ' SELECT title ,
     price FROM products ') AS ( title varchar ,
     price numeric ) ;

  SELECT category , sum ( sum_price ) FROM
     execute_query ( ' SELECT category , sum ( price )
     FROM products GROUP BY category ') AS
     ( category int , sum_price numeric ) GROUP BY
     category ;
Repartitioning

   • changing partitioning key is extremely cumbersome
   • adding partitions is somewhat cumbersome, e. g., to split
    shard 0:
     COPY ( SELECT * FROM products WHERE
        hashtext ( title :: text ) & 15 <> 0) TO
        ' somewhere ';
     DELETE FROM products WHERE
        hashtext ( title :: text ) & 15 <> 0;
    Better start out with enough partitions!
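Before running the COPY and DELETE above, it can help to see how many rows would move. A sketch, assuming the cluster grows from 8 to 16 partitions (hence the mask 15) and the query runs on the current partition 0; the alias new_partition is made up:

```sql
-- On the old partition 0 every row satisfies hashtext(...) & 7 = 0,
-- so only two values can appear here:
--   0 -> the row stays on partition 0
--   8 -> the row moves to the new partition 8
SELECT hashtext(title::text) & 15 AS new_partition, count(*)
FROM products
GROUP BY 1
ORDER BY 1;
```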
PgBouncer

          application   application        application   application



                                frontend



          PgBouncer     PgBouncer          PgBouncer     PgBouncer



          partition 1   partition 2        partition 3   partition 4




 Use
 pool_mode = statement
Development Issues



   • foreign keys
   • notifications
   • hash key check constraints
   • testing (pgTAP), no validator
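The "hash key check constraints" item can be read as guarding each backend against rows that were routed to the wrong partition. A minimal sketch, assuming 8 partitions and title as the hash key; the constraint name is made up:

```sql
-- On the backend holding partition 0 (use = 1, = 2, ... on the others).
ALTER TABLE products
    ADD CONSTRAINT products_hash_key_check
    CHECK ((hashtext(title::text) & 7) = 0);
```

Such a constraint must be dropped and recreated when repartitioning, and, as the Hash Functions slide warns, a change in the hashtext implementation would invalidate it.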
Administration


   • centralized logging
   • distributed shell (dsh)
   • query canceling/timeouts
   • access control, firewalling
   • deployment
High Availability

  Frontend:
    • multiple frontends (DNS, load balancer?)
    • replicate partition configuration (Slony, Bucardo, WAL)
    • Heartbeat, UCARP, etc.
  Backend:
    • replicate backends shards individually (Slony, WAL, DRBD)
    • use partition configuration to configure load spreading or
      failover
Advanced Topics

   • generic insert, update, delete functions
   • frontend joins
   • backend joins
   • finding balance between function interface and dynamic
    queries
   • arrays, SPLIT BY
   • use for remote database calls
   • cross-shard calls
   • SQL/MED (foreign table) integration
The End

Contenu connexe

Tendances

Jersey framework
Jersey frameworkJersey framework
Jersey framework
knight1128
 
Tips
TipsTips
Tips
mclee
 
External Language Stored Procedures for MySQL
External Language Stored Procedures for MySQLExternal Language Stored Procedures for MySQL
External Language Stored Procedures for MySQL
Antony T Curtis
 

Tendances (20)

Oracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web ServicesOracle database - Get external data via HTTP, FTP and Web Services
Oracle database - Get external data via HTTP, FTP and Web Services
 
PL/Perl - New Features in PostgreSQL 9.0
PL/Perl - New Features in PostgreSQL 9.0PL/Perl - New Features in PostgreSQL 9.0
PL/Perl - New Features in PostgreSQL 9.0
 
Read, store and create xml and json
Read, store and create xml and jsonRead, store and create xml and json
Read, store and create xml and json
 
Adodb Pdo Presentation
Adodb Pdo PresentationAdodb Pdo Presentation
Adodb Pdo Presentation
 
PostgreSQL- An Introduction
PostgreSQL- An IntroductionPostgreSQL- An Introduction
PostgreSQL- An Introduction
 
PDO Basics - PHPMelb 2014
PDO Basics - PHPMelb 2014PDO Basics - PHPMelb 2014
PDO Basics - PHPMelb 2014
 
Redis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your applicationRedis & ZeroMQ: How to scale your application
Redis & ZeroMQ: How to scale your application
 
Jersey framework
Jersey frameworkJersey framework
Jersey framework
 
PL/Perl - New Features in PostgreSQL 9.0 201012
PL/Perl - New Features in PostgreSQL 9.0 201012PL/Perl - New Features in PostgreSQL 9.0 201012
PL/Perl - New Features in PostgreSQL 9.0 201012
 
Tips
TipsTips
Tips
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
 
On UnQLite
On UnQLiteOn UnQLite
On UnQLite
 
Hanganalyze presentation
Hanganalyze presentationHanganalyze presentation
Hanganalyze presentation
 
What you need to remember when you upload to CPAN
What you need to remember when you upload to CPANWhat you need to remember when you upload to CPAN
What you need to remember when you upload to CPAN
 
Melhorando sua API com DSLs
Melhorando sua API com DSLsMelhorando sua API com DSLs
Melhorando sua API com DSLs
 
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQLTop 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
 
Plproxy
PlproxyPlproxy
Plproxy
 
Why is the application running so slowly?
Why is the application running so slowly?Why is the application running so slowly?
Why is the application running so slowly?
 
Powershell alias
Powershell aliasPowershell alias
Powershell alias
 
External Language Stored Procedures for MySQL
External Language Stored Procedures for MySQLExternal Language Stored Procedures for MySQL
External Language Stored Procedures for MySQL
 

En vedette

C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
Insight Technology, Inc.
 
Chetan postgresql partitioning
Chetan postgresql partitioningChetan postgresql partitioning
Chetan postgresql partitioning
OpenSourceIndia
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
Sergey Bushik
 
Methods of Sharding MySQL
Methods of Sharding MySQLMethods of Sharding MySQL
Methods of Sharding MySQL
Laine Campbell
 

En vedette (10)

C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
C14 Greenplum Database Technology - Large Scale-out and Next generation Analy...
 
Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014
 
Chetan postgresql partitioning
Chetan postgresql partitioningChetan postgresql partitioning
Chetan postgresql partitioning
 
Useful PostgreSQL Extensions
Useful PostgreSQL ExtensionsUseful PostgreSQL Extensions
Useful PostgreSQL Extensions
 
BigDataを迎え撃つ! PostgreSQL並列分散ミドルウェア「Stado」の紹介と検証報告
BigDataを迎え撃つ! PostgreSQL並列分散ミドルウェア「Stado」の紹介と検証報告BigDataを迎え撃つ! PostgreSQL並列分散ミドルウェア「Stado」の紹介と検証報告
BigDataを迎え撃つ! PostgreSQL並列分散ミドルウェア「Stado」の紹介と検証報告
 
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
 
PostgreSQL в высоконагруженных проектах
PostgreSQL в высоконагруженных проектахPostgreSQL в высоконагруженных проектах
PostgreSQL в высоконагруженных проектах
 
Couchbase Performance Benchmarking
Couchbase Performance BenchmarkingCouchbase Performance Benchmarking
Couchbase Performance Benchmarking
 
Methods of Sharding MySQL
Methods of Sharding MySQLMethods of Sharding MySQL
Methods of Sharding MySQL
 

Similaire à Getting Started with PL/Proxy

Nko workshop - node js crud & deploy
Nko workshop - node js crud & deployNko workshop - node js crud & deploy
Nko workshop - node js crud & deploy
Simon Su
 
Zendcon 2007 Api Design
Zendcon 2007 Api DesignZendcon 2007 Api Design
Zendcon 2007 Api Design
unodelostrece
 
Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+
ConFoo
 
Implementing a basic directory-tree structure that is derived from a.pdf
Implementing a basic directory-tree structure that is derived from a.pdfImplementing a basic directory-tree structure that is derived from a.pdf
Implementing a basic directory-tree structure that is derived from a.pdf
funkybabyindia
 

Similaire à Getting Started with PL/Proxy (20)

Can't Miss Features of PHP 5.3 and 5.4
Can't Miss Features of PHP 5.3 and 5.4Can't Miss Features of PHP 5.3 and 5.4
Can't Miss Features of PHP 5.3 and 5.4
 
9.1 Grand Tour
9.1 Grand Tour9.1 Grand Tour
9.1 Grand Tour
 
Redis for your boss
Redis for your bossRedis for your boss
Redis for your boss
 
Internationalizing CakePHP Applications
Internationalizing CakePHP ApplicationsInternationalizing CakePHP Applications
Internationalizing CakePHP Applications
 
Nko workshop - node js crud & deploy
Nko workshop - node js crud & deployNko workshop - node js crud & deploy
Nko workshop - node js crud & deploy
 
Supercharging WordPress Development in 2018
Supercharging WordPress Development in 2018Supercharging WordPress Development in 2018
Supercharging WordPress Development in 2018
 
Plpgsql internals
Plpgsql internalsPlpgsql internals
Plpgsql internals
 
Perforce Object and Record Model
Perforce Object and Record Model  Perforce Object and Record Model
Perforce Object and Record Model
 
Sah
SahSah
Sah
 
Zendcon 2007 Api Design
Zendcon 2007 Api DesignZendcon 2007 Api Design
Zendcon 2007 Api Design
 
Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...
 
Replacing "exec" with a type and provider
Replacing "exec" with a type and providerReplacing "exec" with a type and provider
Replacing "exec" with a type and provider
 
ETL Patterns with Postgres
ETL Patterns with PostgresETL Patterns with Postgres
ETL Patterns with Postgres
 
Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
 
Building and Distributing PostgreSQL Extensions Without Learning C
Building and Distributing PostgreSQL Extensions Without Learning CBuilding and Distributing PostgreSQL Extensions Without Learning C
Building and Distributing PostgreSQL Extensions Without Learning C
 
Fatc
FatcFatc
Fatc
 
Php on the desktop and php gtk2
Php on the desktop and php gtk2Php on the desktop and php gtk2
Php on the desktop and php gtk2
 
Implementing a basic directory-tree structure that is derived from a.pdf
Implementing a basic directory-tree structure that is derived from a.pdfImplementing a basic directory-tree structure that is derived from a.pdf
Implementing a basic directory-tree structure that is derived from a.pdf
 
Pl python python w postgre-sql
Pl python   python w postgre-sqlPl python   python w postgre-sql
Pl python python w postgre-sql
 

Plus de Peter Eisentraut

Replication Solutions for PostgreSQL
Replication Solutions for PostgreSQLReplication Solutions for PostgreSQL
Replication Solutions for PostgreSQL
Peter Eisentraut
 
The Common Debian Build System (CDBS)
The Common Debian Build System (CDBS)The Common Debian Build System (CDBS)
The Common Debian Build System (CDBS)
Peter Eisentraut
 

Plus de Peter Eisentraut (20)

Programming with Python and PostgreSQL
Programming with Python and PostgreSQLProgramming with Python and PostgreSQL
Programming with Python and PostgreSQL
 
Linux distribution for the cloud
Linux distribution for the cloudLinux distribution for the cloud
Linux distribution for the cloud
 
Most Wanted: Future PostgreSQL Features
Most Wanted: Future PostgreSQL FeaturesMost Wanted: Future PostgreSQL Features
Most Wanted: Future PostgreSQL Features
 
Porting Applications From Oracle To PostgreSQL
Porting Applications From Oracle To PostgreSQLPorting Applications From Oracle To PostgreSQL
Porting Applications From Oracle To PostgreSQL
 
Porting Oracle Applications to PostgreSQL
Porting Oracle Applications to PostgreSQLPorting Oracle Applications to PostgreSQL
Porting Oracle Applications to PostgreSQL
 
PostgreSQL and XML
PostgreSQL and XMLPostgreSQL and XML
PostgreSQL and XML
 
XML Support: Specifications and Development
XML Support: Specifications and DevelopmentXML Support: Specifications and Development
XML Support: Specifications and Development
 
PostgreSQL: Die Freie Datenbankalternative
PostgreSQL: Die Freie DatenbankalternativePostgreSQL: Die Freie Datenbankalternative
PostgreSQL: Die Freie Datenbankalternative
 
The Road to the XML Type: Current and Future Developments
The Road to the XML Type: Current and Future DevelopmentsThe Road to the XML Type: Current and Future Developments
The Road to the XML Type: Current and Future Developments
 
Access ohne Access: Freie Datenbank-Frontends
Access ohne Access: Freie Datenbank-FrontendsAccess ohne Access: Freie Datenbank-Frontends
Access ohne Access: Freie Datenbank-Frontends
 
Replication Solutions for PostgreSQL
Replication Solutions for PostgreSQLReplication Solutions for PostgreSQL
Replication Solutions for PostgreSQL
 
PostgreSQL News
PostgreSQL NewsPostgreSQL News
PostgreSQL News
 
PostgreSQL News
PostgreSQL NewsPostgreSQL News
PostgreSQL News
 
Access ohne Access: Freie Datenbank-Frontends
Access ohne Access: Freie Datenbank-FrontendsAccess ohne Access: Freie Datenbank-Frontends
Access ohne Access: Freie Datenbank-Frontends
 
Docbook: Textverarbeitung mit XML
Docbook: Textverarbeitung mit XMLDocbook: Textverarbeitung mit XML
Docbook: Textverarbeitung mit XML
 
Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail Sy...
Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail Sy...Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail Sy...
Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail Sy...
 
Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail S...
Collateral Damage:
Consequences of Spam and Virus Filtering for the E-Mail S...Collateral Damage:
Consequences of Spam and Virus Filtering for the E-Mail S...
Collateral Damage: Consequences of Spam and Virus Filtering for the E-Mail S...
 
Spaß mit PostgreSQL
Spaß mit PostgreSQLSpaß mit PostgreSQL
Spaß mit PostgreSQL
 
The Common Debian Build System (CDBS)
The Common Debian Build System (CDBS)The Common Debian Build System (CDBS)
The Common Debian Build System (CDBS)
 
SQL/MED and PostgreSQL
SQL/MED and PostgreSQLSQL/MED and PostgreSQL
SQL/MED and PostgreSQL
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Getting Started with PL/Proxy

  • 1. Getting Started with PL/Proxy Peter Eisentraut peter@eisentraut.org F-Secure Corporation PostgreSQL Conference East 2011 CC-BY
  • 2. Concept • a database partitioning system implemented as a procedural language • “sharding”/horizontal partitioning • PostgreSQL’s No(t-only)SQL solution
  • 3. Concept application application application application frontend partition 1 partition 2 partition 3 partition 4
  • 4. Areas of Application • high write load • (high read load) • allow for some “eventual consistency” • have reasonable partitioning keys • use/plan to use server-side functions
  • 5. Example Have (from the dellstore2 example database): CREATE TABLE products ( prod_id serial PRIMARY KEY , category integer NOT NULL , title varchar (50) NOT NULL , actor varchar (50) NOT NULL , price numeric (12 ,2) NOT NULL , special smallint , common_prod_id integer NOT NULL ); INSERT INTO products VALUES (...) ; UPDATE products SET ... WHERE ...; DELETE FROM products WHERE ...; plus various queries
  • 6. Installation • Download: http://plproxy.projects.postgresql.org, Deb, RPM, . . . • Create language: psql -d dellstore2 -f ...../plproxy.sql
  • 7. Backend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN INSERT INTO products ( category , title , actor , price , special , common_prod_id ) VALUES ( p_category , p_title , p_actor , p_price , p_special , p_common_prod_id ) ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 8. Backend Functions II CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN UPDATE products SET price = p_price WHERE prod_id = p_prod_id ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 9. Backend Functions III CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN DELETE FROM products WHERE title = p_title ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 10. Frontend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON hashtext ( p_title ) ; $$ ; CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON ALL ; $$ ;
  • 11. Frontend Functions II CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON hashtext ( p_title ) ; $$ ;
  • 12. Frontend Query Functions I CREATE FUNCTION get_product_price ( p_prod_id int ) RETURNS SETOF numeric LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON ALL ; SELECT price FROM products WHERE prod_id = p_prod_id ; $$ ;
  • 13. Frontend Query Functions II CREATE FUNCTION get_products_by_category ( p_category int ) RETURNS SETOF products LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON ALL ; SELECT * FROM products WHERE category = p_category ; $$ ;
  • 14. Unpartitioned Small Tables CREATE FUNCTION insert_category ( p_categoryname varchar ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON 0; $$ ;
  • 15. Which Hash Key? • natural keys (names, descriptions, UUIDs) • not serials (Consider using fewer “ID” fields.) • single columns • group sensibly to allow joins on backend
  • 16. Set Basic Parameters • number of partitions (2^n), e.g. 8 • host names, e.g. • frontend: dbfe • backends: dbbe1, ..., dbbe8 • database names, e.g. • frontend: dellstore2 • backends: store01, ..., store08 • user names, e.g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
  • 17. Set Basic Parameters • number of partitions (2^n), e.g. 8 • host names, e.g. • frontend: dbfe • backends: dbbe1, ..., dbbe8 (or start at 0?) • database names, e.g. • frontend: dellstore2 • backends: store01, ..., store08 (or start at 0?) • user names, e.g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
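The naming scheme on the two slides above can be expanded mechanically into one connection string per backend. A minimal Python sketch, assuming the 1-based numbering shown on the slide (dbbe1/store01 through dbbe8/store08); the helper name `partition_dsns` is hypothetical and not part of PL/Proxy:

```python
def partition_dsns(n_partitions: int = 8) -> list[str]:
    """Build one libpq-style connection string per backend partition."""
    # The slide asks for a power-of-two partition count (2^n).
    if n_partitions < 1 or n_partitions & (n_partitions - 1) != 0:
        raise ValueError("number of partitions must be a power of two")
    # Assumed naming scheme from the slide: host dbbe<i>, database store<ii>.
    return [
        f"dbname=store{i:02d} host=dbbe{i}"
        for i in range(1, n_partitions + 1)
    ]


if __name__ == "__main__":
    for dsn in partition_dsns():
        print(dsn)
```

Generating the list rather than typing it keeps host and database numbers in step, which matters because the position of each entry decides which partition it becomes.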
  • 18. Configuration CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ ... $$ ;
  • 19. get_cluster_partitions Simplistic approach: CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = 'dellstore_cluster' THEN RETURN NEXT 'dbname=store01 host=dbbe1'; RETURN NEXT 'dbname=store02 host=dbbe2'; ... RETURN NEXT 'dbname=store08 host=dbbe8'; RETURN ; END IF ; RAISE EXCEPTION 'Unknown cluster'; END ; $$ ;
  • 20. get_cluster_version Simplistic approach: CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = 'dellstore_cluster' THEN RETURN 1; END IF ; RAISE EXCEPTION 'Unknown cluster'; END ; $$ ;
  • 21. get_cluster_config CREATE OR REPLACE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ BEGIN /* same config for all clusters */ key := 'connection_lifetime'; val := 30*60; /* 30 m */ RETURN NEXT ; RETURN ; END ; $$ ;
  • 22. Table-Driven Configuration I CREATE TABLE plproxy . partitions ( cluster_name text NOT NULL , host text NOT NULL , port text NOT NULL , dbname text NOT NULL , PRIMARY KEY ( cluster_name , dbname ) ); INSERT INTO plproxy . partitions VALUES ( 'dellstore_cluster' , 'dbbe1' , '5432' , 'store01' ) , ( 'dellstore_cluster' , 'dbbe2' , '5432' , 'store02' ) , ... ( 'dellstore_cluster' , 'dbbe8' , '5432' , 'store08' ) ;
  • 23. Table-Driven Configuration II CREATE TABLE plproxy . cluster_users ( cluster_name text NOT NULL , remote_user text NOT NULL , local_user text NOT NULL , PRIMARY KEY ( cluster_name , remote_user , local_user ) ); INSERT INTO plproxy . cluster_users VALUES ( 'dellstore_cluster' , 'storeapp' , 'storeapp' ) ;
  • 24. Table-Driven Configuration III CREATE TABLE plproxy . remote_passwords ( host text NOT NULL , port text NOT NULL , dbname text NOT NULL , remote_user text NOT NULL , password text , PRIMARY KEY ( host , port , dbname , remote_user ) ); INSERT INTO plproxy . remote_passwords VALUES ( 'dbbe1' , '5432' , 'store01' , 'storeapp' , 'Thu1Ued0' ) , ... /* or use .pgpass? */
  • 25. Table-Driven Configuration IV CREATE TABLE plproxy . cluster_version ( id int PRIMARY KEY ); INSERT INTO plproxy . cluster_version VALUES (1) ; GRANT SELECT ON plproxy . cluster_version TO PUBLIC ; /* extra credit : write trigger that changes the version when one of the other tables changes */
  • 26. Table-Driven Configuration V CREATE OR REPLACE FUNCTION plproxy . get_cluster_partitions ( p_cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql SECURITY DEFINER AS $$ DECLARE r record ; BEGIN FOR r IN SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname || ' user=' || remote_user || ' password=' || password AS dsn FROM plproxy . partitions NATURAL JOIN plproxy . cluster_users NATURAL JOIN plproxy . remote_passwords WHERE cluster_name = p_cluster_name AND local_user = session_user ORDER BY dbname /* important: the order determines the partition numbers */ LOOP RETURN NEXT r. dsn ; END LOOP ; IF NOT found THEN RAISE EXCEPTION 'no such cluster: %', p_cluster_name ; END IF ; RETURN ; END ; $$ ;
  • 27. Table-Driven Configuration VI CREATE FUNCTION plproxy . get_cluster_version ( p_cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE ret int ; BEGIN SELECT INTO ret id FROM plproxy . cluster_version ; RETURN ret ; END ; $$ ;
  • 28. SQL/MED Configuration CREATE SERVER dellstore_cluster FOREIGN DATA WRAPPER plproxy OPTIONS ( connection_lifetime '1800' , p0 'dbname=store01 host=dbbe1' , p1 'dbname=store02 host=dbbe2' , ... p7 'dbname=store08 host=dbbe8' ); CREATE USER MAPPING FOR storeapp SERVER dellstore_cluster OPTIONS ( user 'storeapp' , password 'sekret' ) ; GRANT USAGE ON SERVER dellstore_cluster TO storeapp ;
  • 29. Hash Functions RUN ON hashtext ( somecolumn ) ; • want a fast, uniform hash function • typically use hashtext • problem: implementation might change • possible solution: https://github.com/petere/pgvihash
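To make the routing step concrete, here is a minimal Python sketch of how a hash value could be turned into a partition number. It assumes that the partition index is the hash masked with the partition count minus one, as the `& 15` expression on the repartitioning slide suggests. CRC-32 is only a stand-in: it does not produce the same values as PostgreSQL's `hashtext`, and the function name `partition_for` is hypothetical.

```python
import zlib


def partition_for(key: str, n_partitions: int) -> int:
    """Map a key to a partition index in the range [0, n_partitions)."""
    if n_partitions < 1 or n_partitions & (n_partitions - 1) != 0:
        raise ValueError("number of partitions must be a power of two")
    # Stand-in hash (assumption): CRC-32, NOT what hashtext computes.
    h = zlib.crc32(key.encode("utf-8"))
    # Keep only the low bits of the hash to select the partition.
    return h & (n_partitions - 1)


if __name__ == "__main__":
    for title in ["ACADEMY ACADEMY", "ACADEMY ACE", "ACADEMY ADAPTATION"]:
        print(title, "->", partition_for(title, 8))
```

The sketch also shows why a changing hash implementation is a problem: if the function returns different values after an upgrade, existing rows no longer sit on the partition that the frontend routes to.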
  • 30. Sequences shard 1: ALTER SEQUENCE products_prod_id_seq MINVALUE 1 MAXVALUE 100000000 START 1; shard 2: ALTER SEQUENCE products_prod_id_seq MINVALUE 100000001 MAXVALUE 200000000 START 100000001 RESTART 100000001; etc.
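The "etc." on the sequence slide follows a simple pattern: each shard gets its own block of 100,000,000 IDs. A minimal Python sketch of that arithmetic, assuming 1-based shard numbers as on the slide; the function name `sequence_range` is hypothetical:

```python
def sequence_range(shard: int, block: int = 100_000_000) -> tuple[int, int]:
    """Return (MINVALUE, MAXVALUE) for a 1-based shard number."""
    if shard < 1:
        raise ValueError("shard numbers start at 1")
    low = (shard - 1) * block + 1   # first ID owned by this shard
    high = shard * block            # last ID owned by this shard
    return low, high


if __name__ == "__main__":
    for shard in range(1, 9):
        low, high = sequence_range(shard)
        print(f"shard {shard}: MINVALUE {low} MAXVALUE {high}")
```

Because the blocks never overlap, every shard can hand out `prod_id` values locally without coordinating with the others.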
  • 31. Aggregates Example: count all products Backend: CREATE FUNCTION count_products () RETURNS bigint LANGUAGE SQL STABLE AS $$SELECT count (*) FROM products$$ ; Frontend: CREATE FUNCTION count_products () RETURNS SETOF bigint LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON ALL ; $$ ; SELECT sum ( x ) AS count FROM count_products () AS t(x);
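The aggregate pattern above is a two-step computation: each partition returns a partial result, and the caller combines them. A minimal Python sketch of the combining step, using made-up partial results. The average case is an extension not shown on the slide: it illustrates that some aggregates need more than one partial value per partition. Both function names are hypothetical.

```python
def merge_counts(partial_counts: list[int]) -> int:
    """Combine per-partition counts, like sum(x) over count_products()."""
    return sum(partial_counts)


def merge_average(partials: list[tuple[float, int]]) -> float:
    """Combine per-partition (sum, count) pairs into a global average."""
    total = sum(s for s, _ in partials)
    rows = sum(c for _, c in partials)
    if rows == 0:
        raise ValueError("no rows in any partition")
    # Averaging the per-partition averages would weight partitions
    # equally regardless of size, so sums and counts are merged instead.
    return total / rows


if __name__ == "__main__":
    print(merge_counts([10, 0, 5]))
    print(merge_average([(10.0, 1), (90.0, 3)]))
```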
  • 32. Dynamic Queries I a.k.a. “cheating” ;-) Frontend: CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster'; RUN ON ALL ; $$ ; Backend: CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plpgsql AS $$ BEGIN RETURN QUERY EXECUTE sql ; END ; $$ ;
  • 33. Dynamic Queries II SELECT * FROM execute_query ( ' SELECT title , price FROM products ') AS ( title varchar , price numeric ) ; SELECT category , sum ( sum_price ) FROM execute_query ( ' SELECT category , sum ( price ) FROM products GROUP BY category ') AS ( category int , sum_price numeric ) GROUP BY category ;
  • 34. Repartitioning • changing partitioning key is extremely cumbersome • adding partitions is somewhat cumbersome, e.g., to split shard 0: COPY ( SELECT * FROM products WHERE hashtext ( title :: text ) & 15 <> 0) TO 'somewhere'; DELETE FROM products WHERE hashtext ( title :: text ) & 15 <> 0; Better start out with enough partitions!
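The `& 15` test on the repartitioning slide can be checked with a little arithmetic. A minimal Python sketch, assuming the cluster is doubled from 8 to 16 partitions and that the partition index is the hash masked with the partition count minus one; the function name `split_destination` is hypothetical:

```python
def split_destination(hash_value: int, old_n: int) -> tuple[int, int]:
    """Return (old_index, new_index) when old_n partitions are doubled."""
    if old_n < 1 or old_n & (old_n - 1) != 0:
        raise ValueError("number of partitions must be a power of two")
    old_index = hash_value & (old_n - 1)       # e.g. hash & 7
    new_index = hash_value & (2 * old_n - 1)   # e.g. hash & 15
    return old_index, new_index


if __name__ == "__main__":
    # Rows on old shard 0 either stay on shard 0 or move to shard 8.
    for h in [0, 8, 16, 24, -8, -16]:
        old, new = split_destination(h, 8)
        action = "stays" if new == old else f"moves to shard {new}"
        print(f"hash {h}: shard {old} {action}")
```

Under these assumptions a row only ever moves to the partition numbered old index plus the old partition count, so each shard can be split on its own, which is what the COPY and DELETE pair on the slide does for shard 0.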
  • 35. PgBouncer application application application application frontend PgBouncer PgBouncer PgBouncer PgBouncer partition 1 partition 2 partition 3 partition 4 Use pool_mode = statement
  • 36. Development Issues • foreign keys • notifications • hash key check constraints • testing (pgTAP), no validator
  • 37. Administration • centralized logging • distributed shell (dsh) • query canceling/timeouts • access control, firewalling • deployment
  • 38. High Availability Frontend: • multiple frontends (DNS, load balancer?) • replicate partition configuration (Slony, Bucardo, WAL) • Heartbeat, UCARP, etc. Backend: • replicate backend shards individually (Slony, WAL, DRBD) • use partition configuration to configure load spreading or failover
  • 39. Advanced Topics • generic insert, update, delete functions • frontend joins • backend joins • finding balance between function interface and dynamic queries • arrays, SPLIT BY • use for remote database calls • cross-shard calls • SQL/MED (foreign table) integration