Five Steps to PostgreSQL Performance

Josh Berkus
PostgreSQL Project
MelPUG 2013
The Five Steps:
1. Application Design
2. Query Tuning
3. postgresql.conf
4. OS & Filesystem
5. Hardware
0. Getting Outfitted
5 Layer Cake

Application:       Queries, Transactions
Middleware:        Drivers, Connections, Caching
PostgreSQL:        Schema, Config
Operating System:  Filesystem, Kernel
Hardware:          Storage, RAM/CPU, Network
Scalability Funnel

Application
Middleware
PostgreSQL
OS
HW
What Flavor is Your DB?
►W: Web Application (Web)
 ●DB smaller than RAM
 ●90% or more “one-liner” queries

What Flavor is Your DB?
►O: Online Transaction Processing (OLTP)
 ●DB slightly larger than RAM to 1TB
 ●20-70% small data write queries, some large transactions

What Flavor is Your DB?
►D: Data Warehousing (DW)
 ●Large to huge databases (100GB to 100TB)
 ●Large complex reporting queries
 ●Large bulk loads of data
 ●Also called "Decision Support" or "Business Intelligence"
Tips for Good Form
►Engineer for the problems you have
 ●not for the ones you don't

Tips for Good Form
►A little overallocation is cheaper than downtime
 ●unless you're an OEM, don't stint a few GB
 ●resource use will grow over time

Tips for Good Form
►Test, Tune, and Test Again
 ●you can't measure performance by “it seems fast”

Tips for Good Form
►Most server performance is thresholded
 ●“slow” usually means “25x slower”
 ●it's not how fast it is, it's how close you are to capacity
1. Application Design
Schema Design
►Table design
 ●do not optimize prematurely
   ▬normalize your tables and wait for a proven issue to denormalize
   ▬Postgres is designed to perform well with normalized tables
 ●Entity-Attribute-Value tables and other innovative designs tend to perform poorly

Schema Design
►Table design
 ●consider using natural keys
   ▬can cut down on the number of joins you need
 ●BLOBs can be slow
   ▬have to be completely rewritten, compressed
   ▬can also be fast, thanks to compression

Schema Design
►Table design
 ●think of when data needs to be updated, as well as read
   ▬sometimes you need to split tables which will be updated at different times
   ▬don't trap yourself into updating the same rows multiple times
Schema Design
►Indexing
 ●index most foreign keys
 ●index common WHERE criteria
 ●index common aggregated columns
 ●learn to use special index types: expressions, full text, partial (examples below)
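For instance, a minimal sketch of those special index types; the users, orders, and docs tables and their columns here are hypothetical:

    -- expression index: matches queries that filter on lower(email)
    CREATE INDEX users_email_lower ON users (lower(email));

    -- partial index: covers only the rows your hot queries actually touch
    CREATE INDEX orders_open ON orders (customer_id) WHERE status = 'open';

    -- full text search index: GIN over a tsvector expression
    CREATE INDEX docs_body_fts ON docs USING gin (to_tsvector('english', body));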
Schema Design
►Not Indexing
 ●indexes cost you on updates, deletes
   ▬especially with HOT
 ●too many indexes can confuse the planner
 ●don't index: tiny tables, low-cardinality columns
Right indexes?
►pg_stat_user_indexes
 ●shows indexes not being used
 ●note that it doesn't record unique index usage
►pg_stat_user_tables
 ●shows seq scans: index candidates?
 ●shows heavy update/delete tables: index less (sample queries below)
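A minimal sketch of how these views can be queried:

    -- indexes that have never been scanned, largest first
    SELECT schemaname, relname, indexrelname, idx_scan,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0
    ORDER BY pg_relation_size(indexrelid) DESC;

    -- tables doing lots of sequential scans: possible index candidates
    SELECT relname, seq_scan, idx_scan, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables
    ORDER BY seq_scan DESC
    LIMIT 20;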
Partitioning
►Partition large or growing tables
 ●historical data
   ▬data will be purged
   ▬massive deletes are server-killers
 ●very large tables
   ▬anything over 10GB / 10m rows
   ▬partition by active/passive

Partitioning
►Application must be partition-compliant
 ●every query should call the partition key
 ●pre-create your partitions (sketch below)
   ▬do not create them on demand … they will lock
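In the PostgreSQL versions current at this talk (pre-declarative partitioning), partitions are child tables created ahead of time; a minimal sketch with a hypothetical events table, omitting the INSERT-routing trigger and the constraint_exclusion setting:

    CREATE TABLE events (
        event_time  timestamptz NOT NULL,
        payload     text
    );

    -- pre-create the upcoming months' partitions before they are needed
    CREATE TABLE events_2013_05 (
        CHECK (event_time >= '2013-05-01' AND event_time < '2013-06-01')
    ) INHERITS (events);

    CREATE TABLE events_2013_06 (
        CHECK (event_time >= '2013-06-01' AND event_time < '2013-07-01')
    ) INHERITS (events);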
Query design
►Do more with each query
 ●PostgreSQL does well with fewer, larger queries
 ●not as well with many small queries
 ●avoid doing joins, tree-walking in middleware

Query design
►Do more with each transaction
 ●batch related writes into large transactions
Query design
►Know the query gotchas (per version)
 ●always try rewriting subqueries as joins
 ●try swapping NOT IN and NOT EXISTS for bad queries (example below)
 ●try to make sure that index/key types match
 ●avoid unanchored text searches: ILIKE '%josh%'
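For example, the NOT IN / NOT EXISTS swap, using hypothetical customers and orders tables:

    -- anti-join written with NOT IN: can plan badly and is surprising with NULLs
    SELECT c.*
    FROM customers c
    WHERE c.id NOT IN (SELECT o.customer_id FROM orders o);

    -- the same question as NOT EXISTS: usually plans as a cleaner anti-join
    SELECT c.*
    FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);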
But I use ORM!
►ORM != high performance
 ●ORM is for fast development, not fast databases
 ●make sure your ORM allows "tweaking" queries
 ●applications which are pushing the limits of performance probably can't use ORM
   ▬but most don't have a problem
It's All About Caching
►Use prepared queries (W, O)
 ●whenever you have repetitive loops (example below)
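At the SQL level that looks roughly like the sketch below (most drivers expose the same thing through their prepared-statement APIs); the users table is hypothetical:

    -- parse and plan once ...
    PREPARE get_user (int) AS
        SELECT id, name, email FROM users WHERE id = $1;

    -- ... then execute many times inside the loop
    EXECUTE get_user(42);
    EXECUTE get_user(43);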
It's All About Caching
►Cache, cache everywhere (W, O)
 ●plan caching: on the PostgreSQL server
 ●parse caching: in some drivers
 ●data caching:
   ▬in the appserver
   ▬in memcached/varnish/nginx
   ▬in the client (javascript, etc.)
 ●use as many kinds of caching as you can

It's All About Caching
But …
►think carefully about cache invalidation
 ●and avoid “cache storms”
Connection Management
►Connections take resources (W, O)
 ●RAM, CPU
 ●transaction checking

Connection Management
►Make sure you're only using connections you need (W, O)
 ●look for “<IDLE>” and “<IDLE> in Transaction” (query below)
 ●log and check for a pattern of connection growth
 ●make sure that database and appserver timeouts are synchronized
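A minimal sketch of spotting them via pg_stat_activity; on 9.2+ the state column replaces the old <IDLE> markers:

    SELECT datname, usename, state, count(*)
    FROM pg_stat_activity
    WHERE state IN ('idle', 'idle in transaction')
    GROUP BY datname, usename, state
    ORDER BY count(*) DESC;

    -- on 9.1 and earlier, filter on current_query = '<IDLE>'
    -- or current_query = '<IDLE> in transaction' instead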
Pooling
►Over 100 connections? You need pooling!

   Webserver --\
   Webserver ----> Pool ----> PostgreSQL
   Webserver --/
Pooling
►New connections are expensive
 ●use persistent connections or connection pooling software
   ▬appservers
   ▬pgBouncer
   ▬pgPool (sort of)
 ●set pool size to the maximum number of connections needed
2. Query Tuning
Bad Queries

[Chart: “Ranked Query Execution Times”, execution time (y-axis) vs. % ranking (x-axis)]
Optimize Your Queries in Test
►Before you go production
 ●simulate user load on the application
 ●monitor and fix slow queries
 ●look for worst procedures

Optimize Your Queries in Test
►Look for “bad queries”
 ●queries which take too long
 ●data updates which never complete
 ●long-running stored procedures
 ●interfaces issuing too many queries
 ●queries which block
Finding bad queries
►Log Analysis
 ●dozens of logging options (sample settings below)
 ●log_min_duration_statement
 ●pgFouine
 ●pgBadger
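An illustrative postgresql.conf starting point for log-based analysis; the values are examples, and the exact log_line_prefix your analyzer expects should be taken from that tool's documentation:

    log_min_duration_statement = 250     # log statements slower than 250ms (0 = log everything)
    log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '   # a prefix pgBadger can parse
    log_checkpoints = on
    log_lock_waits = on
    log_temp_files = 0                   # log every temp file, with its size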
Fixing bad queries
►EXPLAIN ANALYZE (example below)
►things to look for:
 ●bad rowcount estimates
 ●sequential scans
 ●high-count loops
 ●large on-disk sorts
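For example, against hypothetical customers and orders tables (BUFFERS is optional but often useful):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT c.name, sum(o.total) AS revenue
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.ordered_at >= '2013-01-01'
    GROUP BY c.name
    ORDER BY revenue DESC
    LIMIT 50;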
Fixing bad queries
►reading EXPLAIN ANALYZE is an art
 ●it's an inverted tree
 ●look for the deepest level at which the problem occurs
►try re-writing complex queries several ways

Query Optimization Cycle

log queries -> run pgBadger -> EXPLAIN ANALYZE worst queries
  -> troubleshoot worst queries -> apply fixes -> (repeat)
Query Optimization Cycle (new)

check pg_stat_statements -> EXPLAIN ANALYZE worst queries
  -> troubleshoot worst queries -> apply fixes -> (repeat; sample query below)
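A minimal sketch of the first step, assuming the extension is in shared_preload_libraries and created in the database (column names are the 9.x ones):

    -- worst queries by total time, with a rough per-call average
    SELECT calls,
           round(total_time::numeric, 1)           AS total_ms,
           round((total_time / calls)::numeric, 1) AS avg_ms,
           query
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 20;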
Procedure Optimization Cycle

log queries -> run pgFouine -> instrument worst functions
  -> find slow operations -> apply fixes -> (repeat)
Procedure Optimization Cycle (new)

check pg_stat_user_functions -> instrument worst functions
  -> find slow operations -> apply fixes -> (repeat; sample query below)
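A sketch of the first step; it requires track_functions = 'pl' (or 'all') in postgresql.conf:

    SELECT schemaname, funcname, calls, total_time, self_time
    FROM pg_stat_user_functions
    ORDER BY total_time DESC
    LIMIT 20;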
3. postgresql.conf
max_connections
►As many as you need to use
 ●web apps: 100 to 300 (W, O)
 ●analytics: 20 to 40 (D)
►If you need more than 100 regularly, use a connection pooler
 ●like pgBouncer
shared_buffers
►1/4 of RAM on a dedicated server (W, O)
 ●not more than 8GB (test)
 ●cache_miss statistics can tell you if you need more
►fewer buffers to preserve cache space (D)
Other memory parameters
►work_mem
 ●non-shared
   ▬lower it for many connections (W, O)
   ▬raise it for large queries (D)
 ●watch for signs of misallocation
   ▬swapping RAM: too much work_mem
   ▬log temp files: not enough work_mem
 ●probably better to allocate by task/ROLE (example below)
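Per-role allocation is one statement per role; the role names here are hypothetical:

    -- big sorts for the reporting role, small footprint for the web app
    ALTER ROLE reporting SET work_mem = '256MB';
    ALTER ROLE webapp    SET work_mem = '8MB';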
Other memory parameters
►maintenance_work_mem
 ●the faster vacuum completes, the better
   ▬but watch out for multiple autovacuum workers!
 ●raise to 256MB to 1GB for large databases
 ●also used for index creation
   ▬raise it for bulk loads

Other memory parameters
►temp_buffers
 ●max size of temp tables before swapping to disk
 ●raise if you use lots of temp tables (D)
►wal_buffers
 ●raise it to 32MB
Commits
►checkpoint_segments
 ●more if you have the disk: 16, 64, 128
►synchronous_commit
 ●response time more important than data integrity?
 ●turn synchronous_commit = off (W)
 ●lose a finite amount of data in a shutdown (combined excerpt below)
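Pulling the last few slides together, an illustrative (not prescriptive) postgresql.conf excerpt for a hypothetical dedicated 16GB web/OLTP server, using the 9.x parameter names from this talk:

    shared_buffers = 4GB             # ~1/4 of RAM, capped around 8GB
    work_mem = 8MB                   # per sort/hash node, per connection
    maintenance_work_mem = 512MB
    wal_buffers = 32MB
    checkpoint_segments = 64         # pre-9.5 parameter
    synchronous_commit = off         # only if losing a few recent commits is acceptable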
Query tuning
►effective_cache_size
 ●RAM available for queries
 ●set it to 3/4 of your available RAM
►default_statistics_target (D)
 ●raise to 200 to 1000 for large databases
 ●now defaults to 100
 ●setting statistics per column is better (example below)
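Per-column statistics targets are set with ALTER TABLE; the table and column here are hypothetical:

    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
    ANALYZE orders;   -- rebuild statistics with the new target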
Query tuning
►effective_io_concurrency
 ●set to number of disks or channels
 ●advisory only
 ●Linux only

A word about Random Page Cost
►Abused as a “force index use” parameter
►Lower it if the seek/scan ratio of your storage is actually different
 ●SSD/NAND: 1.0 to 2.0
 ●EC2: 1.1 to 2.0
 ●High-end SAN: 2.5 to 3.5
►Never below 1.0
Maintenance
►Autovacuum
 ●leave it on for any application which gets constant writes (W, O)
 ●not so good for batch writes -- do manual vacuum for bulk loads (D)

Maintenance
►Autovacuum
 ●have 100's or 1000's of tables? run multiple autovacuum workers (autovacuum_max_workers)
   ▬but not more than ½ of your cores
 ●large tables? raise autovacuum_vacuum_cost_limit
 ●you can change settings per table (example below)
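Per-table autovacuum settings are storage parameters, e.g. for a hypothetical large events table:

    ALTER TABLE events SET (
        autovacuum_vacuum_cost_limit   = 1000,
        autovacuum_vacuum_scale_factor = 0.05
    );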
4. OS & Filesystem
Spread Your Files Around
►Separate the transaction log if possible (O, D)
 ●pg_xlog directory
 ●on a dedicated disk/array, performs 10-50% faster
 ●many WAL options only work if you have a separate drive

Spread Your Files Around

                           which partition
number of drives/arrays     1    2    3
OS/applications             1    1    1
transaction log             1    1    2
database                    1    2    3
Spread Your Files Around
►Tablespaces for temp files (D)
 ●more frequently useful if you do a lot of disk sorts
 ●Postgres can round-robin multiple temp tablespaces (sketch below)
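A minimal sketch, with made-up mount points (the directories must already exist and be owned by the postgres user):

    CREATE TABLESPACE temp1 LOCATION '/mnt/disk1/pgtemp';
    CREATE TABLESPACE temp2 LOCATION '/mnt/disk2/pgtemp';

    -- temp files round-robin across the listed tablespaces
    ALTER DATABASE mydb SET temp_tablespaces = 'temp1, temp2';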
Linux Tuning
►Filesystems
 ●Use XFS or Ext4
   ▬Btrfs not ready yet, may never work for DB
   ▬Ext3 has horrible flushing behavior
 ●Reduce logging
   ▬data=ordered, noatime, nodiratime
Linux Tuning
►OS tuning
 ●must increase shmmax, shmall in the kernel
 ●use the deadline or noop scheduler to speed writes
 ●disable NUMA memory localization (recent)
 ●check your kernel version carefully for performance issues!

Linux Tuning
►Turn off the OOM Killer!
 ●vm.oom-kill = 0
 ●vm.overcommit_memory = 2
 ●vm.overcommit_ratio = 80
OpenSolaris/Illumos
►Filesystems
 ●Use ZFS
   ▬reduce block size to 8K (W, O)
 ●turn off full_page_writes
►OS configuration
 ●no need to configure shared memory
 ●use packages compiled with the Sun compiler
Windows, OSX Tuning
►You're joking, right?

What about The Cloud?
►Configuring for cloud servers is different
 ●shared resources
 ●unreliable I/O
 ●small resource limits
►Also depends on which cloud
 ●AWS, Rackspace, Joyent, GoGrid
… so I can't address it all here.

What about The Cloud?
►Some general advice:
 ●make sure your database fits in RAM
   ▬except on Joyent
 ●don't bother with most OS/FS tuning
   ▬just some basic FS configuration options
 ●use synchronous_commit = off if possible
Set up Monitoring!
►Get warning ahead of time
 ●know about performance problems before they go critical
 ●set up alerts (example check below)
   ▬80% of capacity is an emergency!
 ●set up trending reports
   ▬is there a pattern of steady growth?
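One possible capacity check, connection usage versus max_connections (the 80% threshold is this talk's rule of thumb; wire the result into whatever alerting system you use):

    SELECT count(*)                                            AS connections,
           current_setting('max_connections')::int             AS max_connections,
           round(100.0 * count(*)
                 / current_setting('max_connections')::int, 1) AS pct_used
    FROM pg_stat_activity;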
5. Hardware
Hardware Basics
►Four basic components:
 ●CPU
 ●RAM
 ●I/O: disks and disk bandwidth
 ●Network

Hardware Basics
►Different priorities for different applications
 ●Web: CPU, Network, RAM, ... I/O (W)
 ●OLTP: balance all (O)
 ●DW: I/O, CPU, RAM (D)
Getting Enough CPU
►One Core, One Query
 ●How many concurrent queries do you need?
 ●Best performance at 1 core per no more than two concurrent queries
►So if you can up your core count, do
►Also: L1, L2 cache size matters
Getting Enough RAM
►RAM use is "thresholded"
 ●as long as you are above the amount of RAM you need, even by 5%, the server will be fast
 ●let your needs go even 1% over available RAM and things slow down a lot

Getting Enough RAM
►Critical RAM thresholds (W)
 ●Do you have enough RAM to keep the database in shared_buffers?
   ▬RAM 3x to 6x the size of the DB

Getting Enough RAM
►Critical RAM thresholds (O)
 ●Do you have enough RAM to cache the whole database?
   ▬RAM 2x to 3x the on-disk size of the database
 ●Do you have enough RAM to cache the “working set”?
   ▬the data which is needed 95% of the time

Getting Enough RAM
►Critical RAM thresholds (D)
 ●Do you have enough RAM for sorts & aggregates?
   ▬What's the largest data set you'll need to work with?
   ▬For how many users?
Other RAM Issues
►Get ECC RAM
 ●Better to know about bad RAM before it corrupts your data.
►What else will you want RAM for?
 ●RAMdisk?
 ●SW RAID?
 ●Applications?

Getting Enough I/O
►Will your database be I/O bound?
 ●many writes: bound by transaction log (O)
 ●database much larger than RAM: bound by I/O for many/most queries (D)
Getting Enough I/O
►Optimize for the I/O you'll need
 ●if your DB is terabytes, spend most of your money on disks
 ●calculate how long it will take to read your entire database from disk
   ▬backups
   ▬snapshots
 ●don't forget the transaction log!
I/O Decision Tree

lots of writes?
  No  -> fits in RAM?
           Yes -> mirrored drives
           No  -> terabytes of data? (below)
  Yes -> afford good HW RAID?
           No  -> SW RAID
           Yes -> terabytes of data?
                    Yes -> Storage Device
                    No  -> HW RAID -> mostly read?
                                        Yes -> RAID 5
                                        No  -> RAID 1+0
I/O Tips
►RAID
 ●get battery backup and turn your write cache on
 ●SAS has 2x the real throughput of SATA
 ●more spindles = faster database
   ▬big disks are generally slow

I/O Tips
►DAS/SAN/NAS
 ●measure lag time: it can kill response time
 ●how many channels?
   ▬“gigabit” is only ~100MB/s
   ▬make sure multipath works
 ●use fiber if you can afford it

I/O Tips

iSCSI = death
SSD
►Very fast seeks (D)
 ●great for index access on large tables
 ●up to 20X faster
►Not very fast random writes
 ●low-end models can be slower than HDD
 ●most are about 2X speed
►And use server models, not desktop!

NAND (FusionIO)
All the advantages of SSD, plus:
►Very fast writes (5X to 20X) (W, O)
 ●more concurrency on writes
 ●MUCH lower latency
►But … very expensive (50X)
Tablespaces for NVRAM
►Have a "hot" and a "cold" tablespace (O, D)
 ●current data on "hot"
 ●older/less important data on "cold"
 ●combine with partitioning (sketch below)
►compromise between speed and size
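A minimal sketch, reusing the hypothetical partitioned events table and made-up mount points from earlier:

    CREATE TABLESPACE hot_space  LOCATION '/mnt/nvram/pgdata';
    CREATE TABLESPACE cold_space LOCATION '/mnt/sata/pgdata';

    -- keep current partitions on fast storage, age older ones off
    ALTER TABLE events_2013_06 SET TABLESPACE hot_space;
    ALTER TABLE events_2012_01 SET TABLESPACE cold_space;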
Network
►Network can be your bottleneck
 ●lag time
 ●bandwidth
 ●oversubscribed switches
 ●NAS

Network
►Have dedicated connections
 ●between appserver and database server
 ●between database server and failover server
 ●between database and storage

Network
►Data Transfers
 ●Gigabit is only ~100MB/s
 ●Calculate capacity for data copies, standby, dumps
The Most Important Hardware Advice:
►Quality matters
 ●not all CPUs are the same
 ●not all RAID cards are the same
 ●not all server systems are the same
 ●one bad piece of hardware, or a bad driver, can destroy your application performance

The Most Important Hardware Advice:
►High-performance databases mean hardware expertise
 ●the statistics don't tell you everything
 ●vendors lie
 ●you will need to research different models and combinations
 ●read the pgsql-performance mailing list

The Most Important Hardware Advice:
►Make sure you test your hardware before you put your database on it
 ●“Try before you buy”
 ●Never trust the vendor or your sysadmins

The Most Important Hardware Advice:
►So Test, Test, Test!
 ●CPU: PassMark, sysbench, SPEC CPU
 ●RAM: memtest, cachebench, Stream
 ●I/O: bonnie++, dd, iozone
 ●Network: bwping, netperf
 ●DB: pgbench, sysbench
Questions?

►Josh Berkus
 ●josh@pgexperts.com
 ●www.pgexperts.com
   ▬/presentations.html
 ●www.databasesoup.com

►More Advice
 ●www.postgresql.org/docs
 ●pgsql-performance mailing list
 ●planet.postgresql.org
 ●irc.freenode.net
   ▬#postgresql

This talk is copyright 2013 Josh Berkus, and is licensed under the Creative Commons Attribution license.