SlideShare une entreprise Scribd logo
1  sur  7
Télécharger pour lire hors ligne
Two Elephants
in The Room!!


        Denish Patel
         Database Architect
Challenge
Big Data !
Challenge Simplified
1. Volume
   a. Size
2. Velocity
   a. streaming
   b. response time
3. Variety
   a. structured
   b. semi structured
   c. unstructured
4. Veracity
   a. uncertainty
   b. truthfulness
Solution




           Sqoop
PostgreSQL
Multi Model Database Server
● Relational
● Object Relational
● Nested Relational
● Array Stores
● Key-Value Store (hstore)
● Document Store (XML,JSON)
● Range Types ( PostgreSQL 9.2)
Hadoop
A Distributed File Structure
● ETL
   ○ Helps to convert unstructured data into Structured
     data for Analytics
● Ease of working on unstructured data
   ○ Log Processing
   ○ Data streaming in real time
● Parallel Processing
   ○ MapReduce
● Scale at Petabytes
Take away?
Two Elephants (PostgreSQL and Hadoop)
works together very well & can solve "most" of
the Big data challenges.

Contenu connexe

Tendances

Migration strategies for a mission critical cluster
Migration strategies for a mission critical clusterMigration strategies for a mission critical cluster
Migration strategies for a mission critical clusterFrancismara Souza
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroDaniel Marcous
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Icebergkbajda
 
Open stack @ iiit hyderabad
Open stack @ iiit hyderabad Open stack @ iiit hyderabad
Open stack @ iiit hyderabad openstackindia
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaObjectRocket
 
TDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDataTDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDatatdc-globalcode
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryChris Schalk
 
Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets robertlz
 
First NL-HUG: Large-scale data processing at SARA with Apache Hadoop
First NL-HUG: Large-scale data processing at SARA with Apache HadoopFirst NL-HUG: Large-scale data processing at SARA with Apache Hadoop
First NL-HUG: Large-scale data processing at SARA with Apache HadoopEvert Lammerts
 
Open Source Databases And Gis
Open Source Databases And GisOpen Source Databases And Gis
Open Source Databases And GisKudos S.A.S
 
Dremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasetsDremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasetsCarl Lu
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...javier ramirez
 
GeoMesa LocationTech DC
GeoMesa LocationTech DCGeoMesa LocationTech DC
GeoMesa LocationTech DCCCRinc
 
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012Big Data Spain
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyftkbajda
 

Tendances (19)

Migration strategies for a mission critical cluster
Migration strategies for a mission critical clusterMigration strategies for a mission critical cluster
Migration strategies for a mission critical cluster
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Iceberg
 
Open stack @ iiit hyderabad
Open stack @ iiit hyderabad Open stack @ iiit hyderabad
Open stack @ iiit hyderabad
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
Google's Dremel
Google's DremelGoogle's Dremel
Google's Dremel
 
TDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDataTDC2016SP - Trilha BigData
TDC2016SP - Trilha BigData
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
 
Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets
 
Large Data Analyze With PyTables
Large Data Analyze With PyTablesLarge Data Analyze With PyTables
Large Data Analyze With PyTables
 
Data Science as Scale
Data Science as ScaleData Science as Scale
Data Science as Scale
 
First NL-HUG: Large-scale data processing at SARA with Apache Hadoop
First NL-HUG: Large-scale data processing at SARA with Apache HadoopFirst NL-HUG: Large-scale data processing at SARA with Apache Hadoop
First NL-HUG: Large-scale data processing at SARA with Apache Hadoop
 
Open Source Databases And Gis
Open Source Databases And GisOpen Source Databases And Gis
Open Source Databases And Gis
 
Dremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasetsDremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasets
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
 
PyTables
PyTablesPyTables
PyTables
 
GeoMesa LocationTech DC
GeoMesa LocationTech DCGeoMesa LocationTech DC
GeoMesa LocationTech DC
 
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012
Crunching Data with Google BigQuery. JORDAN TIGANI at Big Data Spain 2012
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyft
 

Similaire à Two Elephants Inthe Room

NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduceJ Singh
 
Four NoSQL Databases You Should Know
Four NoSQL Databases You Should KnowFour NoSQL Databases You Should Know
Four NoSQL Databases You Should KnowMahmoud Khaled
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, HowIgor Moochnick
 
Next Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasNext Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasThoughtworks
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
Hadoop and Netezza - Co-existence or Competition?
Hadoop and Netezza - Co-existence or Competition?Hadoop and Netezza - Co-existence or Competition?
Hadoop and Netezza - Co-existence or Competition?Krishnan Parasuraman
 
Introduction to Google BigQuery
Introduction to Google BigQueryIntroduction to Google BigQuery
Introduction to Google BigQueryCsaba Toth
 
Processing Big Data: An Introduction to Data Intensive Computing
Processing Big Data: An Introduction to Data Intensive ComputingProcessing Big Data: An Introduction to Data Intensive Computing
Processing Big Data: An Introduction to Data Intensive ComputingCollin Bennett
 
Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataMelissa Hornbostel
 
Hive and data analysis using pandas
 Hive  and  data analysis  using pandas Hive  and  data analysis  using pandas
Hive and data analysis using pandasPurna Chander K
 
Hive and data analysis using pandas
Hive  and  data analysis  using pandasHive  and  data analysis  using pandas
Hive and data analysis using pandasPurna Chander
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Ben Stopford
 
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...Alexey Zinoviev
 

Similaire à Two Elephants Inthe Room (20)

NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
 
Four NoSQL Databases You Should Know
Four NoSQL Databases You Should KnowFour NoSQL Databases You Should Know
Four NoSQL Databases You Should Know
 
Drill njhug -19 feb2013
Drill njhug -19 feb2013Drill njhug -19 feb2013
Drill njhug -19 feb2013
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, How
 
Mathias test
Mathias testMathias test
Mathias test
 
Next Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasNext Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon Thomas
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Hadoop-2.6.0 Slides
Hadoop-2.6.0 SlidesHadoop-2.6.0 Slides
Hadoop-2.6.0 Slides
 
Hadoop and Netezza - Co-existence or Competition?
Hadoop and Netezza - Co-existence or Competition?Hadoop and Netezza - Co-existence or Competition?
Hadoop and Netezza - Co-existence or Competition?
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
Introduction to Google BigQuery
Introduction to Google BigQueryIntroduction to Google BigQuery
Introduction to Google BigQuery
 
Processing Big Data: An Introduction to Data Intensive Computing
Processing Big Data: An Introduction to Data Intensive ComputingProcessing Big Data: An Introduction to Data Intensive Computing
Processing Big Data: An Introduction to Data Intensive Computing
 
Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big Data
 
Hive and data analysis using pandas
 Hive  and  data analysis  using pandas Hive  and  data analysis  using pandas
Hive and data analysis using pandas
 
Hive and data analysis using pandas
Hive  and  data analysis  using pandasHive  and  data analysis  using pandas
Hive and data analysis using pandas
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012
 
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
 
Oslo baksia2014
Oslo baksia2014Oslo baksia2014
Oslo baksia2014
 
Apache Drill
Apache DrillApache Drill
Apache Drill
 

Plus de Denish Patel

Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres MonitoringDenish Patel
 
Out of the Box Replication in Postgres 9.4(PgConfUS)
Out of the Box Replication in Postgres 9.4(PgConfUS)Out of the Box Replication in Postgres 9.4(PgConfUS)
Out of the Box Replication in Postgres 9.4(PgConfUS)Denish Patel
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Denish Patel
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Denish Patel
 
Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Denish Patel
 
Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Denish Patel
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Denish Patel
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDSDenish Patel
 
Choosing the "D" , Lightning talk
Choosing the "D" , Lightning talkChoosing the "D" , Lightning talk
Choosing the "D" , Lightning talkDenish Patel
 
Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Denish Patel
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDenish Patel
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDenish Patel
 
P90 X Your Database!!
P90 X Your Database!!P90 X Your Database!!
P90 X Your Database!!Denish Patel
 
Achieving Pci Compliace
Achieving Pci CompliaceAchieving Pci Compliace
Achieving Pci CompliaceDenish Patel
 
Using SQL Standards? Database SQL comparition
Using SQL Standards? Database SQL comparitionUsing SQL Standards? Database SQL comparition
Using SQL Standards? Database SQL comparitionDenish Patel
 
Oracle10g New Features I
Oracle10g New Features IOracle10g New Features I
Oracle10g New Features IDenish Patel
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepDenish Patel
 

Plus de Denish Patel (18)

Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres Monitoring
 
Out of the Box Replication in Postgres 9.4(PgConfUS)
Out of the Box Replication in Postgres 9.4(PgConfUS)Out of the Box Replication in Postgres 9.4(PgConfUS)
Out of the Box Replication in Postgres 9.4(PgConfUS)
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)
 
Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)
 
Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)Out of the Box Replication in Postgres 9.4(PgCon)
Out of the Box Replication in Postgres 9.4(PgCon)
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
 
Choosing the "D" , Lightning talk
Choosing the "D" , Lightning talkChoosing the "D" , Lightning talk
Choosing the "D" , Lightning talk
 
Scaling postgres
Scaling postgresScaling postgres
Scaling postgres
 
Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQL
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQL
 
P90 X Your Database!!
P90 X Your Database!!P90 X Your Database!!
P90 X Your Database!!
 
Achieving Pci Compliace
Achieving Pci CompliaceAchieving Pci Compliace
Achieving Pci Compliace
 
Using SQL Standards? Database SQL comparition
Using SQL Standards? Database SQL comparitionUsing SQL Standards? Database SQL comparition
Using SQL Standards? Database SQL comparition
 
Oracle10g New Features I
Oracle10g New Features IOracle10g New Features I
Oracle10g New Features I
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRep
 

Two Elephants Inthe Room

  • 1. Two Elephants in The Room!! Denish Patel Database Architect
  • 3. Challenge Simplified 1. Volume a. Size 2. Velocity a. streaming b. response time 3. Variety a. structured b. semi structured c. unstructured 4. Veracity a. uncertainty b. truthfulness
  • 4. Solution Sqoop
  • 5. PostgreSQL Multi Model Database Server ● Relational ● Object Relational ● Nested Relational ● Array Stores ● Key-Value Store (hstore) ● Document Store (XML,JSON) ● Range Types ( PostgreSQL 9.2)
  • 6. Hadoop A Distributed File Structure ● ETL ○ Helps to convert unstructured data into Structured data for Analytics ● Ease of working on unstructured data ○ Log Processing ○ Data streaming in real time ● Parallel Processing ○ MapReduce ● Scale at Petabytes
  • 7. Take away? Two Elephants (PostgreSQL and Hadoop) works together very well & can solve "most" of the Big data challenges.