Good Morning
NoSQL
Mayank Singh
1316110115
CSE - sec ’B’
Agenda
• What is NoSQL?
• NoSQL Technology Breakdown
• Where is NoSQL a Killer App?
• What Good is Relational?
• NoSQL and CouchDB
• NoSQL, Relational, or Both?
What is NoSQL?
• Common traits
• Non-relational, Scalable
• Non-schematized/schema-free
• Eventual consistency
• Open source
• Distributed
• “Web scale”
• Developed at big Internet companies
Consistency
• CAP theorem: a distributed database can guarantee at most two of the following three
attributes: consistency, availability and partition tolerance
• Many NoSQL databases do not offer full “ACID” guarantees (atomicity, consistency,
isolation and durability)
• Instead they offer “eventual consistency”, similar to DNS propagation
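Eventual consistency can be illustrated with a toy model. The sketch below is not how any particular database works: the `Replica` class, the integer timestamps and the `sync` function are all hypothetical, and conflicts are resolved with a simple "newest timestamp wins" rule chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; each key maps to a (timestamp, value) pair."""
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Keep the write only if it is newer than what this replica holds.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions so each replica keeps the newest."""
    for key, (ts, value) in list(a.store.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.store.items()):
        a.write(key, value, ts)


r1, r2 = Replica(), Replica()
r1.write("status", "shipped", timestamp=2)  # newer write reaches r1 only
r2.write("status", "packed", timestamp=1)   # older write reaches r2 only

stale = r2.read("status")  # before syncing, r2 still serves the older value
sync(r1, r2)
converged = (r1.read("status"), r2.read("status"))
```

Until `sync` runs, the two replicas disagree; afterwards both return the same value. That window of disagreement is the price paid for staying available.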
• Indexing
• Most NoSQL databases are indexed by key
• Some allow so-called “secondary” indexes
• Often the primary key indexes are clustered
• HBase uses the Hadoop Distributed File System (HDFS), which is append-only
• Writes are logged
• Logged writes are batched
• File is re-created and sorted
• You get control back quickly
• Queries
• Typically no query language
• Instead, create procedural program
• Sometimes SQL is supported
• Sometimes MapReduce code is used…
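The append-only write path described under Indexing (log the write, batch it, then re-create a sorted file) can be sketched in a few lines. This is a simplified illustration, not HBase's actual implementation; the class name `LogStructuredStore`, its fields and the `batch_size` parameter are all assumptions made for the example.

```python
class LogStructuredStore:
    def __init__(self, batch_size=3):
        self.log = []          # append-only record of every write
        self.pending = {}      # logged writes not yet merged into the file
        self.sorted_file = []  # stand-in for the sorted on-disk file
        self.batch_size = batch_size

    def put(self, key, value):
        self.log.append((key, value))  # 1. the write is logged
        self.pending[key] = value      # 2. logged writes are batched
        if len(self.pending) >= self.batch_size:
            self._flush()
        # Most calls return here without touching the sorted file at all.

    def _flush(self):
        # 3. the file is re-created: old contents plus the batch, sorted by key
        merged = dict(self.sorted_file)
        merged.update(self.pending)
        self.sorted_file = sorted(merged.items())
        self.pending = {}

    def get(self, key):
        # Recent writes win over what is already in the sorted file.
        if key in self.pending:
            return self.pending[key]
        return dict(self.sorted_file).get(key)


store = LogStructuredStore(batch_size=2)
store.put("b", 2)
store.put("a", 1)  # second write fills the batch and triggers a flush
store.put("c", 3)  # stays in the pending batch for now
```

Nothing is ever updated in place: the log only grows, and the sorted file is replaced wholesale on each flush.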
MapReduce
• Map step: split the query up
• Reduce step: merge the results
• Most typical of Hadoop and used with Wide Column Stores, esp. HBase
• Amazon Web Services’ Elastic MapReduce (EMR) can read/write
DynamoDB, S3, Relational Database Service (RDS)
• The “Hive” add-on to Hadoop offers HiveQL, a SQL-like abstraction over MapReduce
• Use with Hive tables
• Use with Hbase
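The map and reduce steps can be shown with a word count in plain Python. This runs in a single process and only illustrates the shape of the computation; it does not use Hadoop, and the function names `map_step`, `shuffle` and `reduce_step` are chosen for the example.

```python
from collections import defaultdict


def map_step(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_step(key, values):
    """Merge all values for one key into a single result."""
    return key, sum(values)


# Each chunk could be mapped on a different machine in a real cluster.
chunks = ["NoSQL scales out", "SQL scales up", "NoSQL and SQL"]
mapped = [pair for chunk in chunks for pair in map_step(chunk)]
counts = dict(reduce_step(k, v) for k, v in shuffle(mapped).items())
```

Because each chunk is mapped independently and each key is reduced independently, both steps can be spread across many nodes.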
NoSQL Technology Breakdown
• Key-Value Stores
• The most common; not necessarily the most popular
• Has rows, each with something like a big dictionary/associative array
• Schema may differ from row to row
• Common on Cloud platforms
• e.g. Amazon SimpleDB, Azure Table Storage
• MemcacheDB, Voldemort
• DynamoDB (AWS), Dynomite, Redis and Riak
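A key-value store with per-row schemas can be modelled as a dictionary of dictionaries. The sketch below is an in-memory stand-in, not the API of any product listed above; `put`, `get` and the sample keys are hypothetical.

```python
table = {}  # row key -> dictionary of attributes


def put(key, **attributes):
    """Store a row; no schema is declared, so any attributes are accepted."""
    table[key] = dict(attributes)


def get(key, attribute, default=None):
    """Look up one attribute of one row, tolerating missing rows or attributes."""
    return table.get(key, {}).get(attribute, default)


# The two rows carry different attribute sets.
put("user:1", name="Asha", email="asha@example.com")
put("user:2", name="Ravi", twitter="@ravi", last_login="2024-01-05")
```

Reading an attribute that a row does not have simply returns the default, which is how the application copes with a schema that differs from row to row.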
Wide Column Stores
• Has tables with declared column families
• Each column family has “columns”, which are KV pairs that can vary from row
to row
• These are the most foundational for large sites
• Bigtable (Google)
• HBase (Originally part of Yahoo-dominated Hadoop project)
• Cassandra (Facebook)
• Calls column families “super columns” and tables “super column families”
• They are the most “Big Data”-ready
• Especially HBase + Hadoop
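The wide column layout (declared column families, free-form columns inside each family) can be sketched as nested dictionaries. This is a conceptual model only, not the HBase or Cassandra API; `WideColumnTable` and the sample family names are invented for the illustration.

```python
class WideColumnTable:
    def __init__(self, column_families):
        self.column_families = set(column_families)  # declared up front
        self.rows = {}  # row key -> family -> column -> value

    def put(self, row_key, family, column, value):
        # Families are fixed, but columns inside a family are not.
        if family not in self.column_families:
            raise KeyError(f"undeclared column family: {family}")
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get(self, row_key, family):
        return self.rows.get(row_key, {}).get(family, {})


products = WideColumnTable(["info", "specs"])
products.put("sku-1", "info", "name", "Laptop")
products.put("sku-1", "specs", "ram_gb", 16)
products.put("sku-2", "info", "name", "T-shirt")
products.put("sku-2", "specs", "size", "M")  # a different column, same family
```

Both rows use the `specs` family, yet each holds different columns in it, which is the "vary from row to row" property.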
Document Stores
• Have “databases,” which are akin to tables
• Have “documents,” akin to rows
• Documents are typically JSON objects
• Each document has properties and values
• Values can be scalars, arrays, links to documents in other databases or sub-documents
(i.e. contained JSON objects, which allow for hierarchical storage)
• Can have attachments as well
• Old versions are retained
• So Doc Stores work well for content management
• Some view doc stores as specialized KV stores
• Most popular with developers, startups, VCs
• The biggies:
• CouchDB
• MongoDB
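A document of the kind described above might look like the following. The field names and values are made up for illustration; the `_id` key follows the convention used by both CouchDB and MongoDB.

```python
import json

# One document: scalars, an array, a sub-document, and a link by id.
article = {
    "_id": "post-42",
    "title": "Why document stores suit content management",
    "tags": ["nosql", "cms"],                          # array value
    "author": {"name": "A. Writer", "city": "Delhi"},  # sub-document
    "related": ["post-17"],                            # link to another document
}

serialized = json.dumps(article)   # what would be sent to the store
restored = json.loads(serialized)  # what would come back
author_city = restored["author"]["city"]  # hierarchical access
```

The whole hierarchy travels as one JSON object, so reading an article and its author needs no join.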
Document Store Application Orientation
• Documents can each be addressed by URIs
• CouchDB supports full REST interface
• Very geared towards JavaScript and JSON
• Documents are JSON objects
• CouchDB/MongoDB use JavaScript as native language
• In CouchDB, views also have unique URIs, and “show” and “list” functions can return HTML
• So you can build entire applications in the database
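URI addressing can be sketched by building, but not sending, an HTTP request for one document. CouchDB addresses a document as `/{db}/{docid}` and listens on port 5984 by default; the host, database name, document id and helper names below are assumptions for the example.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumed address of a local server


def document_uri(database, doc_id):
    """Every document gets its own URI of the form /{db}/{docid}."""
    return f"{BASE_URL}/{quote(database, safe='')}/{quote(doc_id, safe='')}"


def build_put(database, doc_id, document):
    """Prepare a PUT request that would create or update the document."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        document_uri(database, doc_id),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# The request is only constructed here; nothing is sent over the network.
request = build_put("articles", "post 42", {"title": "Hello"})
```

Passing the request to `urllib.request.urlopen` would perform the write against a running server.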
Graph Databases
• Great for social network applications and others where relationships are
important
• Nodes and edges
• Edges are like joins
• Nodes are like rows in a table
• Nodes can also have properties and values
• Neo4j is a popular graph db
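Nodes, edges and a relationship query can be sketched with plain dictionaries and sets. This is not Neo4j or its query language; the names, properties and the `friends_of_friends` helper are hypothetical.

```python
from collections import defaultdict

# Nodes carry properties, much like rows in a table.
nodes = {
    "asha": {"city": "Delhi"},
    "ravi": {"city": "Pune"},
    "meera": {"city": "Chennai"},
    "john": {"city": "Goa"},
}
edges = defaultdict(set)  # person -> set of people they follow


def follow(a, b):
    """Add a directed edge from a to b."""
    edges[a].add(b)


def friends_of_friends(person):
    """People reachable in exactly two hops who are not already followed."""
    direct = edges[person]
    two_hops = set()
    for friend in direct:
        two_hops |= edges[friend]
    return two_hops - direct - {person}


follow("asha", "ravi")
follow("ravi", "meera")
follow("ravi", "asha")
follow("meera", "john")
suggestions = friends_of_friends("asha")
```

The traversal walks edges directly, where a relational schema would need a self-join on a followers table.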
Where is NoSQL a Killer App?
• Content Management
• Document databases work really well here
• Regular KV pairs can store metadata
• Can also store text-based content
• Attachments can store file-based or binary content
• Versioning and URI addressability help as well
• CouchDB gets called a “Web database”
• Product Catalogs
• Products in a catalog tend to have many attributes in common and then
various others that are class-specific
• Common
• Product ID
• Name
• Description
• Price
• Key Value Stores and Wide Column Stores work well here
• Social
• Graph databases work best here
• Great for tracking:
• Networks
• Followers
• Group membership
• Threaded interactions (comments, likes/favorites)
• Great for Membership, Ownership
• Avoids the self-joins and many-to-many tables necessary in relational DBs
• Big Data
• Wide Column and Key-Value Stores work best here
• MapReduce is designed for this scenario
• Hadoop and HBase come up a lot
• Sharding and append-only help here
• Premise of analytics is reading data, not maintaining it.
• Miscellaneous
• Event-driven data (e.g. logs)
• User profiles, preferences
• Mail, status message streams
• Other Web data
• Automobile directions
• Info for sites on maps (category, name, description, photo)
• User reviews
• Etc.
What Good is Relational?
Transactional
• Relational databases support transactions
• Business systems require atomic transactions
• You can’t process an order without decrementing inventory
• You can’t register a credit without its corresponding debit
• No exceptions, no excuses
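The order and inventory rule can be demonstrated with an atomic transaction in SQLite through Python's standard `sqlite3` module. The table names, columns and the `place_order` helper are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
conn.commit()


def place_order(sku, qty):
    """Record the order and decrement stock together, or do neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders (sku, qty) VALUES (?, ?)", (sku, qty))
            conn.execute(
                "UPDATE inventory SET qty = qty - ? WHERE sku = ?", (qty, sku)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused negative stock; the order row is undone too.
        return False


ok = place_order("widget", 3)
rejected = place_order("widget", 10)
remaining = conn.execute(
    "SELECT qty FROM inventory WHERE sku = 'widget'"
).fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second order fails on the inventory update, and the rollback also removes its order row, so the two tables never disagree.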
Formal Schema
• Regular processes have regular data
• Stocks, trades
• PO line items
• Personnel records
• Insurance policies
• These need relational databases with declared schema
• These don’t need MapReduce, document or graph representation
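A declared schema lets the database itself reject irregular data. The sketch below uses SQLite via Python's `sqlite3` module; the `po_line_items` table, its columns and the `insert_line` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE po_line_items (
        po_number  INTEGER NOT NULL,
        line_no    INTEGER NOT NULL,
        sku        TEXT    NOT NULL,
        quantity   INTEGER NOT NULL CHECK (quantity > 0),
        unit_price REAL    NOT NULL CHECK (unit_price >= 0),
        PRIMARY KEY (po_number, line_no)
    )
    """
)


def insert_line(row):
    """Try to insert one line item; report whether the schema accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO po_line_items VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_line((1001, 1, "widget", 4, 2.50))
missing_sku = insert_line((1001, 2, None, 1, 9.99))     # violates NOT NULL
duplicate = insert_line((1001, 1, "gadget", 1, 5.00))   # violates PRIMARY KEY
```

Every row that reaches the table has the same shape, which is what regular business processes and their reports rely on.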
• Banded Reporting
• Operational reporting is based on detail and group sections with predictable,
consistent layout, based on known schema
• Very hard to design pixel-perfect reports against indeterminate schema
• This highlights how operational business processes almost always require
relational databases
• Data Size
• A well-defined query language
• Mature development and administration tools
• Denormalize the database
NoSQL and CouchDB
• Source: Microsoft Azure
NoSQL, Relational, or Both?
• Type of App
• Really a question of consistency versus massive scale
• Is this an internal system or a public one?
• Is it an application for the data or data for a system?
• Below a certain threshold of concurrent usage, NoSQL may be slower than relational
• Skill Sets and Investment
• Does your staff have RDBMS skills already?
• Do you have significant investment in relational database hw/sw?
• Lots of apps that use an RDBMS?
• Do you want to support both?
• Are you a startup?
• Employ developers who possess NoSQL skills and prefer NoSQL?
• Relational databases use Structured Query Language (SQL), making them a
good choice for applications that involve managing many
transactions.
• NoSQL Limitations
• In non-relational databases like MongoDB, joins do not work the way they do in relational
databases.
• MongoDB also doesn’t automatically treat operations as transactions the way a relational database
does. You must explicitly choose to create a transaction, then verify it and either
commit it or roll it back yourself.
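Without a join in the database, the application often combines the data itself. The sketch below uses plain Python lists as stand-ins for two collections; the collection contents and the `join_orders_to_customers` helper are invented for the illustration and no MongoDB driver is involved.

```python
# Stand-ins for two collections fetched separately from a document store.
customers = [{"_id": 1, "name": "Asha"}, {"_id": 2, "name": "Ravi"}]
orders = [
    {"_id": 101, "customer_id": 1, "total": 250},
    {"_id": 102, "customer_id": 2, "total": 90},
    {"_id": 103, "customer_id": 1, "total": 40},
]


def join_orders_to_customers(orders, customers):
    """Application-side join: index one collection, then look up per order."""
    by_id = {c["_id"]: c for c in customers}
    return [
        {
            "order_id": o["_id"],
            "customer": by_id[o["customer_id"]]["name"],
            "total": o["total"],
        }
        for o in orders
        if o["customer_id"] in by_id  # skip orders with no matching customer
    ]


joined = join_orders_to_customers(orders, customers)
```

The alternative design is to embed the customer name inside each order document, trading duplicated data for a single read.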
Companies Using NoSQL DB