SlideShare une entreprise Scribd logo
1  sur  12
Télécharger pour lire hors ligne
PostgreSQL
Looking under the hood with Solaris



            / Presentation / Theo Schlossnagle
PostgreSQL is Awesome



    •   Fast.

    •   Extensible.

    •   Tablespaces.

    •   Robust data types.

    •   Partitioning (albeit fake).

    •   Partial and functional indexes.

    •   Extremely supportive community.

    •   Extremely compliant with database standards.
PostgreSQL is Lacking




    •   No upgrades (AYFKM).

    •   pg_dump is too intrusive.

    •   Poor system-level instrumentation.

    •   Poor methods to determine specific contention.

    •   It relies on the operating system’s filesystem cache.
        (which make PostgreSQL inconsistent across it’s supported OS base)
Enter Solaris


     •   Solaris is a UNIX from Sun Microsystems.

     •   How is it different than other UNIX and UNIX-like systems?

         •   Mostly it isn’t different (hence the term UNIX)

         •   It does have extremely strong ABI backward compatibility.

         •   It’s stable and works well on large machines.

     •   Solaris 10 shakes things up a bit:

         •   DTrace

         •   ZFS

         •   Zones
Solaris / ZFS


     •   ZFS: Zettaback Filesystem.

         •   264 snapshots, 248 files/directory, 264 bytes/filesystem,
             278 (256 ZiB) bytes in a pool, 264 devices/pool, 264 pools/system

     •   Extremely cheap differential backups.

         •   I have a 5 TB database, I need a backup!

     •   No rollback in your database? What is this? MySQL?

     •   No rollback in your filesystem?

         •   ZFS has snapshots, rollback, clone and promote.

         •   OMG! Life altering features.

     •   Caveat: ZFS is slower than alternatives, by about 10% with tuning.
Solaris / Zones




     •   Zones: Virtual Environments.

     •   Shared kernel.

     •   Can share filesystems.

     •   Segregated processes and privileges.

     •   No big deal for databases, right?




                                  But Wait!
Solaris / ZFS + Zones = Magic Juju
  https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh


      •   ZFS snapshot, clone, delegate to zone, boot and run.

      •   When done, halt zone, destroy clone.

      •   We get a point-in-time copy of our entire PostgreSQL database:

          •   read-write,

          •   low disk-space requirements,

          •   NO LOCKS! Welcome back pg_dump, you don’t suck anymore.

          •   Fast snapshot to usable copy time:

              •   On our 20 GB database: 1 minute.

              •   On our 1.2 TB database: 2 minutes.
ZFS: how I saved my soul.



     •   Database crash. Bad. 1.2 TB of data... busted.
         The reason Robert Treat looks a bit older than he should.

     •   xlogs corrupted. catalog indexes corrupted.

     •   Fault? PostgreSQL bug? Bad memory? Who knows?

     •   Trial & error on a 1.2 TB data set can be a cruel experience.

         •   In real-life, most recovery actions are destructive actions.

         •   PostgreSQL is no different.

     •   Rollback to last checkpoint (ZFS), hack postgres code, try, fail, repeat.
Let DTrace open your eyes


    •   DTrace: Dynamic Tracing

    •   Allow you to dynamically instrument “stuff” in the system:

        •   system calls (like strace/truss/ktrace).

        •   process/scheduler activity (on/off cpu, semaphores, conditions).

        •   see signals sent and received.

        •   trace kernel functions, networking.

        •   watch I/O down to the disk.

        •   user-space processes, each function... each machine instruction!

        •   Add probes into apps where it makes sense to you.
Can you see what I see?

     •   There is EXPLAIN... when that isn’t enough...

     •   There is EXPLAIN ANALYZE... when that isn’t enough.

     •   There is DTrace.

         ; dtrace -q -n ‘
         postgresql*:::statement-start
         {
            self->query = copyinstr(arg0);
            self->ok=1;
         }
         io:::start
         /self->ok/
         {
            @[self->query,
              args[0]->b_flags & B_READ ? "read" : "write",
              args[1]->dev_statname] = sum(args[0]->b_bcount);
         }’
         dtrace: description 'postgres*:::statement-start' matched 14 probes
         ^C

         select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000;
             read sd1 16384
         select division, sum(amount), avg(amount) from ods.billings where txn_timestamp
         between ‘2006-01-01 00:00:00’ and ‘2006-04-01 00:00:00’ group by division;
             read sd2 71647232
OmniTI Labs / pgsoltools

       •    https://labs.omniti.com/trac/pgsoltools

           •    Where we stick out PostgreSQL on Solaris goodies...

           •    like pg_file_stress


             FILENAME/DBOBJECT                              READS                    WRITES
                                                  #   min    avg    max      #   min    avg   max
  alldata1__idx_remove_domain_external            1    12     12     12    398     0      0     0
  slowdata1__pg_rewrite                           1    12     12     12      0     0      0     0
  slowdata1__pg_class_oid_index                   1     0      0      0      0     0      0     0
  slowdata1__pg_attribute                         2     0      0      0      0     0      0     0
  alldata1__mv_users                              0     0      0      0      4     0      0     0
  slowdata1__pg_statistic                         1     0      0      0      0     0      0     0
  slowdata1__pg_index                             1     0      0      0      0     0      0     0
  slowdata1__pg_index_indexrelid_index            1     0      0      0      0     0      0     0
  alldata1__remove_domain_external                0     0      0      0    502     0      0     0
  alldata1__promo_15_tb_full_2                   19     0      0      0     11     0      0     0
  slowdata1__pg_class_relname_nsp_index           2     0      0      0      0     0      0     0
  alldata1__promo_177intaoltest_tb                0     0      0      0   1053     0      0     0
  slowdata1__pg_attribute_relid_attnum_index      2     0      0      0      0     0      0     0
  alldata1__promo_15_tb_full_2_pk                 2     0      0      0      0     0      0     0
  alldata1__all_mailable_2                     1403     0      0    423      0     0      0     0
  alldata1__mv_users_pkey                         0     0      0      0      4     0      0     0
Thank you for listening.
Looking under PostgreSQL’s hood with Solaris.



            / Presentation

Contenu connexe

Tendances

Introduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System AdministratorsIntroduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System Administrators
Jignesh Shah
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
DataWorks Summit
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
distributed matters
 
Gc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linux
Cuong Tran
 
Ambari Meetup: NameNode HA
Ambari Meetup: NameNode HAAmbari Meetup: NameNode HA
Ambari Meetup: NameNode HA
Hortonworks
 
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
Tim Vaillancourt
 

Tendances (20)

Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
Architecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres ClusterArchitecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres Cluster
 
Introduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System AdministratorsIntroduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System Administrators
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4
 
PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALE
 
Migrating to XtraDB Cluster
Migrating to XtraDB ClusterMigrating to XtraDB Cluster
Migrating to XtraDB Cluster
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
 
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
 
Provisioning and automating high availability postgres on aws ec2 (1)
Provisioning and automating high availability postgres on aws ec2 (1)Provisioning and automating high availability postgres on aws ec2 (1)
Provisioning and automating high availability postgres on aws ec2 (1)
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStack
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
PostgreSQL Hangout Parameter Tuning
PostgreSQL Hangout Parameter TuningPostgreSQL Hangout Parameter Tuning
PostgreSQL Hangout Parameter Tuning
 
SQL Server vs Postgres
SQL Server vs PostgresSQL Server vs Postgres
SQL Server vs Postgres
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Gc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linux
 
Ambari Meetup: NameNode HA
Ambari Meetup: NameNode HAAmbari Meetup: NameNode HA
Ambari Meetup: NameNode HA
 
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
One Tool to Rule Them All- Seamless SQL on MongoDB, MySQL and Redis with Apac...
 
Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 

En vedette

Applying operations culture to everything
Applying operations culture to everythingApplying operations culture to everything
Applying operations culture to everything
Theo Schlossnagle
 
Monitoring is easy, why are we so bad at it presentation
Monitoring is easy, why are we so bad at it  presentationMonitoring is easy, why are we so bad at it  presentation
Monitoring is easy, why are we so bad at it presentation
Theo Schlossnagle
 
OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012
Theo Schlossnagle
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
Theo Schlossnagle
 

En vedette (12)

Applying operations culture to everything
Applying operations culture to everythingApplying operations culture to everything
Applying operations culture to everything
 
Web Operations Career
Web Operations CareerWeb Operations Career
Web Operations Career
 
Velocity 2010: Scalable Internet Architectures
Velocity 2010: Scalable Internet ArchitecturesVelocity 2010: Scalable Internet Architectures
Velocity 2010: Scalable Internet Architectures
 
Big Bad PostgreSQL @ Percona
Big Bad PostgreSQL @ PerconaBig Bad PostgreSQL @ Percona
Big Bad PostgreSQL @ Percona
 
Omnios and unix
Omnios and unixOmnios and unix
Omnios and unix
 
Craftsmanship
CraftsmanshipCraftsmanship
Craftsmanship
 
Monitoring is easy, why are we so bad at it presentation
Monitoring is easy, why are we so bad at it  presentationMonitoring is easy, why are we so bad at it  presentation
Monitoring is easy, why are we so bad at it presentation
 
OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012
 
Project reality
Project realityProject reality
Project reality
 
Scalable Internet Architecture
Scalable Internet ArchitectureScalable Internet Architecture
Scalable Internet Architecture
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Esperwhispering
EsperwhisperingEsperwhispering
Esperwhispering
 

Similaire à PostgreSQL on Solaris

Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
MongoDB
 
Deployment Preparedness
Deployment Preparedness Deployment Preparedness
Deployment Preparedness
MongoDB
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
Jonathan Levin
 
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
Insight Technology, Inc.
 
Оптимизация MySQL. Что должен знать каждый разработчик
Оптимизация MySQL. Что должен знать каждый разработчикОптимизация MySQL. Что должен знать каждый разработчик
Оптимизация MySQL. Что должен знать каждый разработчик
Agnislav Onufrijchuk
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
MongoDB
 

Similaire à PostgreSQL on Solaris (20)

Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101
 
Real World Performance - Data Warehouses
Real World Performance - Data WarehousesReal World Performance - Data Warehouses
Real World Performance - Data Warehouses
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
 
Deployment Preparedness
Deployment Preparedness Deployment Preparedness
Deployment Preparedness
 
Deployment
DeploymentDeployment
Deployment
 
Riak add presentation
Riak add presentationRiak add presentation
Riak add presentation
 
20150210 solr introdution
20150210 solr introdution20150210 solr introdution
20150210 solr introdution
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
 
Quick Wins
Quick WinsQuick Wins
Quick Wins
 
Cassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache CassandraCassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache Cassandra
 
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
[C12]元気Hadoop! OracleをHadoopで分析しちゃうぜ by Daisuke Hirama
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Оптимизация MySQL. Что должен знать каждый разработчик
Оптимизация MySQL. Что должен знать каждый разработчикОптимизация MySQL. Что должен знать каждый разработчик
Оптимизация MySQL. Что должен знать каждый разработчик
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 

Plus de Theo Schlossnagle

A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About Performance
Theo Schlossnagle
 
Social improvements in monitoring
Social improvements in monitoringSocial improvements in monitoring
Social improvements in monitoring
Theo Schlossnagle
 

Plus de Theo Schlossnagle (20)

Adding Simplicity to Complexity
Adding Simplicity to ComplexityAdding Simplicity to Complexity
Adding Simplicity to Complexity
 
Put Some SRE in Your Shipped Software
Put Some SRE in Your Shipped SoftwarePut Some SRE in Your Shipped Software
Put Some SRE in Your Shipped Software
 
Monitoring 101
Monitoring 101Monitoring 101
Monitoring 101
 
Distributed Systems - Like It Or Not
Distributed Systems - Like It Or NotDistributed Systems - Like It Or Not
Distributed Systems - Like It Or Not
 
Applying SRE techniques to micro service design
Applying SRE techniques to micro service designApplying SRE techniques to micro service design
Applying SRE techniques to micro service design
 
SRECon Coherent Performance
SRECon Coherent PerformanceSRECon Coherent Performance
SRECon Coherent Performance
 
Commandments of scale
Commandments of scaleCommandments of scale
Commandments of scale
 
Adaptive availability
Adaptive availabilityAdaptive availability
Adaptive availability
 
Monitoring the #DevOps way
Monitoring the #DevOps wayMonitoring the #DevOps way
Monitoring the #DevOps way
 
Operational Software Design
Operational Software DesignOperational Software Design
Operational Software Design
 
A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About Performance
 
The math behind big systems analysis.
The math behind big systems analysis.The math behind big systems analysis.
The math behind big systems analysis.
 
Understanding Slowness
Understanding SlownessUnderstanding Slowness
Understanding Slowness
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Xtreme Deployment
Xtreme DeploymentXtreme Deployment
Xtreme Deployment
 
Atldevops
AtldevopsAtldevops
Atldevops
 
It's all about telemetry
It's all about telemetryIt's all about telemetry
It's all about telemetry
 
Is this normal?
Is this normal?Is this normal?
Is this normal?
 
Social improvements in monitoring
Social improvements in monitoringSocial improvements in monitoring
Social improvements in monitoring
 
What's in a number?
What's in a number?What's in a number?
What's in a number?
 

Dernier

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

PostgreSQL on Solaris

  • 1. PostgreSQL Looking under the hood with Solaris / Presentation / Theo Schlossnagle
  • 2. PostgreSQL is Awesome • Fast. • Extensible. • Tablespaces. • Robust data types. • Partitioning (albeit fake). • Partial and functional indexes. • Extremely supportive community. • Extremely compliant with database standards.
  • 3. PostgreSQL is Lacking • No upgrades (AYFKM). • pg_dump is too intrusive. • Poor system-level instrumentation. • Poor methods to determine specific contention. • It relies on the operating system’s filesystem cache. (which make PostgreSQL inconsistent across it’s supported OS base)
  • 4. Enter Solaris • Solaris is a UNIX from Sun Microsystems. • How is it different than other UNIX and UNIX-like systems? • Mostly it isn’t different (hence the term UNIX) • It does have extremely strong ABI backward compatibility. • It’s stable and works well on large machines. • Solaris 10 shakes things up a bit: • DTrace • ZFS • Zones
  • 5. Solaris / ZFS • ZFS: Zettaback Filesystem. • 264 snapshots, 248 files/directory, 264 bytes/filesystem, 278 (256 ZiB) bytes in a pool, 264 devices/pool, 264 pools/system • Extremely cheap differential backups. • I have a 5 TB database, I need a backup! • No rollback in your database? What is this? MySQL? • No rollback in your filesystem? • ZFS has snapshots, rollback, clone and promote. • OMG! Life altering features. • Caveat: ZFS is slower than alternatives, by about 10% with tuning.
  • 6. Solaris / Zones • Zones: Virtual Environments. • Shared kernel. • Can share filesystems. • Segregated processes and privileges. • No big deal for databases, right? But Wait!
  • 7. Solaris / ZFS + Zones = Magic Juju https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh • ZFS snapshot, clone, delegate to zone, boot and run. • When done, halt zone, destroy clone. • We get a point-in-time copy of our entire PostgreSQL database: • read-write, • low disk-space requirements, • NO LOCKS! Welcome back pg_dump, you don’t suck anymore. • Fast snapshot to usable copy time: • On our 20 GB database: 1 minute. • On our 1.2 TB database: 2 minutes.
  • 8. ZFS: how I saved my soul. • Database crash. Bad. 1.2 TB of data... busted. The reason Robert Treat looks a bit older than he should. • xlogs corrupted. catalog indexes corrupted. • Fault? PostgreSQL bug? Bad memory? Who knows? • Trial & error on a 1.2 TB data set can be a cruel experience. • In real-life, most recovery actions are destructive actions. • PostgreSQL is no different. • Rollback to last checkpoint (ZFS), hack postgres code, try, fail, repeat.
  • 9. Let DTrace open your eyes • DTrace: Dynamic Tracing • Allow you to dynamically instrument “stuff” in the system: • system calls (like strace/truss/ktrace). • process/scheduler activity (on/off cpu, semaphores, conditions). • see signals sent and received. • trace kernel functions, networking. • watch I/O down to the disk. • user-space processes, each function... each machine instruction! • Add probes into apps where it makes sense to you.
  • 10. Can you see what I see? • There is EXPLAIN... when that isn’t enough... • There is EXPLAIN ANALYZE... when that isn’t enough. • There is DTrace. ; dtrace -q -n ‘ postgresql*:::statement-start { self->query = copyinstr(arg0); self->ok=1; } io:::start /self->ok/ { @[self->query, args[0]->b_flags & B_READ ? "read" : "write", args[1]->dev_statname] = sum(args[0]->b_bcount); }’ dtrace: description 'postgres*:::statement-start' matched 14 probes ^C select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000; read sd1 16384 select division, sum(amount), avg(amount) from ods.billings where txn_timestamp between ‘2006-01-01 00:00:00’ and ‘2006-04-01 00:00:00’ group by division; read sd2 71647232
  • 11. OmniTI Labs / pgsoltools • https://labs.omniti.com/trac/pgsoltools • Where we stick out PostgreSQL on Solaris goodies... • like pg_file_stress FILENAME/DBOBJECT READS WRITES # min avg max # min avg max alldata1__idx_remove_domain_external 1 12 12 12 398 0 0 0 slowdata1__pg_rewrite 1 12 12 12 0 0 0 0 slowdata1__pg_class_oid_index 1 0 0 0 0 0 0 0 slowdata1__pg_attribute 2 0 0 0 0 0 0 0 alldata1__mv_users 0 0 0 0 4 0 0 0 slowdata1__pg_statistic 1 0 0 0 0 0 0 0 slowdata1__pg_index 1 0 0 0 0 0 0 0 slowdata1__pg_index_indexrelid_index 1 0 0 0 0 0 0 0 alldata1__remove_domain_external 0 0 0 0 502 0 0 0 alldata1__promo_15_tb_full_2 19 0 0 0 11 0 0 0 slowdata1__pg_class_relname_nsp_index 2 0 0 0 0 0 0 0 alldata1__promo_177intaoltest_tb 0 0 0 0 1053 0 0 0 slowdata1__pg_attribute_relid_attnum_index 2 0 0 0 0 0 0 0 alldata1__promo_15_tb_full_2_pk 2 0 0 0 0 0 0 0 alldata1__all_mailable_2 1403 0 0 423 0 0 0 0 alldata1__mv_users_pkey 0 0 0 0 4 0 0 0
  • 12. Thank you for listening. Looking under PostgreSQL’s hood with Solaris. / Presentation