Himanchali
Tech Lead DBE @ InMobi
PostgreSQL as NoSQL
What's on the plate today?
• What is the requirement for a schemaless database?
• The data type and volume that have to be handled
• How can we use PostgreSQL to store schemaless data?
• Performance of PostgreSQL's schemaless options
What is schemaless data?
• It doesn't mean unstructured: the structure is defined per document
• Each document is a set of key-value pairs, which can nest hierarchically and contain arrays
• The application needs to be aware of this structure and handle documents that do not match what it expects
• Modern schemaless DBs mostly don't use SQL and hence are termed NoSQL DBs (a confusing term)
Requirement for schemaless data
• New reporting data is being introduced for merchants, where the set of dimensions is not fixed
• Currently there are 15 known fields; each merchant populates between 5 and 15 of them
• New dimensions are likely to be added in future
• Initial data volume is 1 million per hour
Issues with the current fixed-schema DB
• Fixed schema definition
• Too many NULL values
• Adding a new column needs a schema change
• Old rows may need to be updated
So, should we move to NoSQL?
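The problems listed above can be reproduced with a small sketch. The table and column names (`merchant_report_fixed`, `country`, `device`, `campaign`, `channel`) are hypothetical and stand in for the 15 known reporting fields, which the deck does not name.

```sql
-- Hypothetical fixed-schema reporting table: one nullable column per known dimension.
CREATE TABLE merchant_report_fixed (
    merchant_id  bigint      NOT NULL,
    reported_at  timestamptz NOT NULL,
    country      text,   -- NULL for merchants that do not report it
    device       text,
    campaign     text
    -- ...and so on for the remaining known fields
);

-- A merchant that reports only one dimension leaves the others NULL.
INSERT INTO merchant_report_fixed (merchant_id, reported_at, country)
VALUES (1, now(), 'IN');

-- Every new dimension is a schema change.
ALTER TABLE merchant_report_fixed ADD COLUMN channel text;

-- Count how many rows leave a sparse column empty.
SELECT count(*) - count(device) AS null_devices,
       count(*)                 AS total_rows
FROM merchant_report_fixed;
```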
What if PostgreSQL can solve it?
• PostgreSQL provides 3 different types for document data:
  • XML
  • hstore
  • JSON
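hstore: a smart contrib module
• A key-value storage type for PostgreSQL, shipped as a contrib extension
• Key-value store with ACID compliance
• Maps string keys to string values (released hstore versions are flat; nesting other hstore values is a proposed extension)
• Rich in its own operators and functions, for example:
  • h -> 'a' (value stored under key a)
  • h ? 'a' (does key a exist?)
  • h @> 'a=>2' (does h contain the pair a=>2?)
• Indexing available: GiST and GIN
• Indexes the whole value (every key), not just a single key
• Expression indexes are also supported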
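A minimal sketch of the three operators on a literal hstore value, assuming the hstore extension can be installed in the current database. The keys and values are made up for illustration.

```sql
-- hstore is a contrib extension; enable it once per database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Fetch the value stored under key 'a' (always returned as text).
SELECT 'a=>1, b=>2'::hstore -> 'a' AS value_of_a;

-- Test whether key 'a' exists.
SELECT 'a=>1, b=>2'::hstore ? 'a' AS has_key_a;

-- Test whether the left value contains the key/value pair a=>1.
SELECT 'a=>1, b=>2'::hstore @> 'a=>1' AS contains_pair;
```

Values come back as text, so numeric comparisons need an explicit cast such as `(h -> 'a')::int`.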
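hstore continued
-- hstore ships as a contrib extension and must be enabled once per database
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE TABLE my_store
(
id character varying(1024) NOT NULL,
doc hstore,
CONSTRAINT my_store_pkey PRIMARY KEY (id)
);
-- GiST index over the whole document; it supports the @>, ? and ?| operators used below
CREATE INDEX my_store_doc_idx_gist
ON my_store
USING gist(doc);
-- Documents created on a given date
SELECT doc -> 'text' AS merchant, doc -> 'created_at' AS created_at
FROM my_store
WHERE doc @> 'created_at=>"23/12/2013"';
-- Active documents that have the has_address key (hstore values are text, so this sorts as text)
SELECT doc -> 'text' AS merchant, doc -> 'created_at' AS created_at
FROM my_store
WHERE doc @> 'is_active=>":t"' AND doc ? 'has_address'
ORDER BY doc -> 'created_at' DESC;
-- Active documents that have at least one of the listed keys
SELECT doc -> 'text' AS merchant, doc -> 'created_at' AS created_at
FROM my_store
WHERE doc @> 'is_active=>":t"' AND doc ?| ARRAY['has_address', 'has_payoption'];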
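For a field that is filtered on often, an expression index on that single key is an alternative to indexing the whole document. The sketch below assumes the `my_store` table defined above; the index name is hypothetical, and the `DD/MM/YYYY` format is inferred from the sample date in the queries.

```sql
-- B-tree index on one extracted key (note the double parentheses
-- required around an index expression).
CREATE INDEX my_store_created_at_idx
    ON my_store ((doc -> 'created_at'));

-- An equality filter on the same expression can use that index.
SELECT id, doc -> 'text' AS merchant
FROM my_store
WHERE doc -> 'created_at' = '23/12/2013';

-- Sorting the raw text would order dates by day first, so convert
-- to a real date when chronological order matters.
SELECT doc -> 'text' AS merchant,
       to_date(doc -> 'created_at', 'DD/MM/YYYY') AS created_on
FROM my_store
ORDER BY created_on DESC;
```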
JSON
• Document format of well-known NoSQL DBs such as MongoDB (which stores it as BSON) and CouchDB
• PostgreSQL introduced the JSON type in 9.2; it is stored as text, but validated as JSON on input (to be fixed in 9.4)
• 9.2 ships production functions such as row_to_json and array_to_json
• But functions for processing (extracting from) JSON are absent in 9.2
• 9.3 came with many new features:
  • to_json(anyelement)
  • json_agg(record)
  • Many extraction functions and operators
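A short sketch of the functions and operators named above, run against inline sample data, so it needs no tables. It assumes PostgreSQL 9.3 or later, and the sample values are made up.

```sql
-- Production functions available since 9.2.
SELECT row_to_json(t) AS doc
FROM (SELECT 1 AS id, 'Acme' AS merchant) AS t;

SELECT array_to_json(ARRAY[1, 2, 3]) AS doc;

-- Added in 9.3: convert any value, and aggregate rows into a JSON array.
SELECT to_json('Acme'::text) AS doc;

SELECT json_agg(t) AS docs
FROM (VALUES (1, 'a'), (2, 'b')) AS t(id, name);

-- Extraction operators added in 9.3:
-- -> returns json, ->> returns text.
SELECT '{"merchant": {"name": "Acme"}}'::json -> 'merchant' ->> 'name' AS merchant_name;
```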
JSON continued
• JSONB, a binary JSON type, is being introduced in 9.4
• JSONB supports indexing
• Tables can be made UNLOGGED for a further performance increase, at the cost of reliability (their contents do not survive a crash)
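The three points above can be combined in one sketch, assuming PostgreSQL 9.4 or later. The table name, index name and document keys are hypothetical.

```sql
-- UNLOGGED skips write-ahead logging: faster writes, but the table
-- is emptied after a crash and is not replicated.
CREATE UNLOGGED TABLE my_json_store (
    id  bigint PRIMARY KEY,
    doc jsonb  NOT NULL
);

-- GIN index over the whole document; it supports the @> containment operator.
CREATE INDEX my_json_store_doc_idx
    ON my_json_store USING gin (doc);

INSERT INTO my_json_store (id, doc)
VALUES (1, '{"merchant": "Acme", "is_active": true}');

-- Containment query that can use the GIN index.
SELECT id, doc ->> 'merchant' AS merchant
FROM my_json_store
WHERE doc @> '{"is_active": true}';
```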
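hstore and JSON
• hstore_to_json(hstore): converts an hstore to JSON, with every non-NULL value as a JSON string
• hstore_to_json_loose(hstore): also converts, but emits numeric and boolean values unquoted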
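The difference between the two conversions shows up when the same value goes through both. This sketch assumes the hstore extension is available; the keys are made up.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Strict conversion: every value stays a JSON string.
SELECT hstore_to_json('clicks=>42, is_active=>t, name=>Acme'::hstore) AS strict_json;

-- Loose conversion: values that look like numbers or booleans
-- (42 and t here) are emitted as JSON numbers and booleans.
SELECT hstore_to_json_loose('clicks=>42, is_active=>t, name=>Acme'::hstore) AS loose_json;
```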
Some numbers
• Data used: numeric id, all other fields text
• 2 million records, averaging 64 bytes each
• Test: read a CSV file, parse it into the appropriate format, insert into the DB
• Write speed in records/second:
  • hstore: 4600
  • hstore (GiST): 3000
  • hstore (GIN): 1700
  • JSON: 4600
  • MongoDB: 4000
• DB size in MB (table/indexes):
  • hstore: 300/0
  • hstore (GiST): 300/71
  • hstore (GIN): 300/700
  • JSON: 300/60
  • MongoDB: 1800/200
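As a rough check, the measured write speeds can be set against the stated requirement. This is only a sketch: it assumes "1 million per hour" means 1 million records arriving evenly over the hour, which the deck does not state.

```sql
-- Required sustained rate, and the headroom of the slowest measured
-- option (hstore with a GIN index, 1700 records/second).
SELECT round(1000000 / 3600.0)                AS required_records_per_sec,  -- 278
       round(1700 / (1000000 / 3600.0), 1)    AS headroom_slowest_option;   -- 6.1
```

Under that assumption even the slowest configuration writes about six times faster than the initial load requires.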
[Chart: Write Speed, in records/second]
[Chart: Data Size, index (MB) and table (MB)]
• Select query on primary key, fetch time in milliseconds:
  • hstore: 320
  • hstore (GiST): 190
  • hstore (GIN): 180
  • JSON: 20
  • MongoDB: 40
• Select query on name (filter for 100 names), fetch time in milliseconds:
  • hstore: 350
  • hstore (GiST): 150
  • hstore (GIN): 140
  • JSON: 10000
  • MongoDB: 450
[Chart: Select Query Fetch on PK, in milliseconds]
[Chart: Select Query Fetch on Text, in milliseconds]
Some conclusions
• PostgreSQL can be used as a schemaless DB
• PostgreSQL's relational data storage is very efficient
• Build expression indexes on commonly used fields
• For hstore, the GiST index was much more efficient than GIN in our case (faster writes and a roughly 10x smaller index, for similar read times)
• GiST/GIN indexes accelerate lookups on every field, not just the primary key
• GIN indexes are best for static data because lookups are faster
• For dynamic data, GiST indexes are faster to update
• Specifically, GiST indexes are very good for dynamic data and fast if the number of unique words (lexemes) is under 100,000
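To repeat the GiST versus GIN comparison on your own data, both index types can be built on the same column and their sizes read from the catalog. This is a sketch with hypothetical names (`index_demo` and the two index names); load representative rows before measuring, since an empty table says nothing about real index sizes.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical scratch table for the comparison.
CREATE TABLE IF NOT EXISTS index_demo (
    id  bigint PRIMARY KEY,
    doc hstore
);

-- Both index types on the same column, only so they can be compared.
CREATE INDEX index_demo_doc_gist ON index_demo USING gist (doc);
CREATE INDEX index_demo_doc_gin  ON index_demo USING gin  (doc);

-- On-disk size of every index on the table.
SELECT indexrelid::regclass                         AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_index
WHERE indrelid = 'index_demo'::regclass;
```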
So what do you get with PostgreSQL?
• Use the features of a relational DB in your schemaless world!
  • Use constraints
  • Use transactions
  • Use indexing
  • Do joins on keys
• And with all these, get the NoSQL (schemaless data) requirement fulfilled as well
• Save the cost and time of migrating to a new DB
---
• As these are PostgreSQL-specific features, the schema and queries cannot be migrated as-is to another RDBMS
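The relational features listed above can be combined with a document column in one sketch. All names here (`merchants`, `merchant_docs`, the `merchant_id` and `clicks` keys) are hypothetical and introduced only for illustration.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Ordinary relational table.
CREATE TABLE merchants (
    merchant_id text PRIMARY KEY,
    name        text NOT NULL
);

-- Schemaless documents, with a constraint on their contents:
-- every document must carry a merchant_id key.
CREATE TABLE merchant_docs (
    id  bigserial PRIMARY KEY,
    doc hstore NOT NULL,
    CONSTRAINT doc_has_merchant CHECK (doc ? 'merchant_id')
);

-- Transaction: both inserts commit together or not at all.
BEGIN;
INSERT INTO merchants (merchant_id, name) VALUES ('m1', 'Acme');
INSERT INTO merchant_docs (doc) VALUES ('merchant_id=>m1, clicks=>42');
COMMIT;

-- Join a relational key to a key stored inside the document.
SELECT m.name, d.doc -> 'clicks' AS clicks
FROM merchant_docs AS d
JOIN merchants AS m ON m.merchant_id = d.doc -> 'merchant_id';
```

An insert into `merchant_docs` without a `merchant_id` key is rejected by the check constraint, which a typical schemaless store would leave to the application.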
Questions???
No offence to NoSQL DBs :)
Thank You !!
himamahi09@gmail.com
