NOSQL
Agenda
 Introduction to NOSQL
 Objective
 Examples of NOSQL databases
 NOSQL vs SQL
 Conclusion
Basic Concepts

 Database – an organized collection of data.
 Database Management System (DBMS) – a software
  package of computer programs that controls the
  creation, maintenance & use of a database.
     With a DBMS, we use a structured language to interact with it
     Ex. Oracle, IBM DB2, MS Access, MySQL, FoxPro, etc.
 Relational DBMS - A relational database is a
  collection of data items organized as a set of formally
  described tables from which data can be accessed easily.
  A relational database is created using the relational
  model. The software used in a relational database is
  called a relational database management
  system (RDBMS).
SQL

 Structured Query Language
 Special-purpose programming language designed for
    managing data in an RDBMS.
   Originally based upon relational algebra & tuple relational
    calculus.
   SQL’s scope includes data insert, update & delete, schema
    creation and modification, and data access control.
   It is statically and strongly typed.
   The most widely used database language.
   Query is the most important operation in SQL.
   Ex. SELECT *
         FROM Book
         WHERE price > 100.00
         ORDER BY title;
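The query above can be tried without installing a database server. The following is a minimal sketch using Python's built-in `sqlite3` module; the `Book` columns (`title`, `price`) and the helper name `expensive_books` are assumptions made for illustration and are not part of the slides.

```python
import sqlite3


def expensive_books(rows):
    """Load (title, price) rows into an in-memory table and run the slide's query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical two-column schema; the slides only name the table and two columns.
        conn.execute("CREATE TABLE Book (title TEXT, price REAL)")
        conn.executemany("INSERT INTO Book (title, price) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT * FROM Book WHERE price > 100.00 ORDER BY title"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `expensive_books` with a handful of rows returns only those priced above 100.00, sorted by title, which is the declarative style the later slides contrast with NoSQL access patterns.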
NOSQL

 Stands for Not Only SQL
 Class of non-relational data storage systems
 Usually do not require a fixed table schema nor do
  they use the concept of joins
 All NOSQL offerings relax one or more of the ACID
  properties.
    Atomicity, Consistency, Isolation, Durability (ACID)
 “NOSQL” = “Not Only SQL” =
       Not Only using traditional relational DBMS
NOSQL

•   Alternative to traditional relational DBMS
    •   Flexible schema
    •   Quicker/cheaper to set up
    •   Massive scalability
    •   Relaxed consistency → higher performance &
        availability

    * No declarative query language → more programming
    * Relaxed consistency → fewer guarantees
Why NOSQL?


 Not every problem can be solved by a traditional
    relational database system alone.
   Handles huge databases.
   Redundancy, data is pretty safe on commodity
    hardware
   Super flexible queries using map/reduce
   Rapid development (no fixed schema, yeah!)
   Very fast for common use cases
Contd..


 Inspired by Distributed Data Storage problems
 Scale easily by adding servers
 Not suited to all problem types, but super-suited to
  certain large problem types
 High-write situations (e.g. activity tracking or timeline
  rendering for millions of users)
 A lot of relational uses are really dumbed down (e.g.
  fetch by PK with update)
Architecture
How does it work?

 Clients know how to:
  Send items to servers (consistent hashing)
  Handle a server failure
  Fetch keys from servers
  “Weight” servers by capacity

 Servers know how to:
  Store items they receive
  Expire them from the cache
  No inter-server comms – everything is unaware
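The client-side routing described above can be sketched as a consistent-hash ring. This is only an illustration under stated assumptions: the class name `HashRing`, the use of MD5 for ring positions, and the figure of 100 virtual nodes per unit of weight are all choices made here, not details from the slides.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Client-side consistent-hash ring; the servers never talk to each other."""

    def __init__(self, vnodes_per_weight: int = 100):
        self.vnodes_per_weight = vnodes_per_weight
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> server name

    def add_server(self, name: str, weight: int = 1) -> None:
        # A heavier server gets more virtual nodes, and therefore more keys.
        for i in range(self.vnodes_per_weight * weight):
            pos = _hash(f"{name}#{i}")
            self._owner[pos] = name
            bisect.insort(self._positions, pos)

    def remove_server(self, name: str) -> None:
        # On failure, only the keys owned by this server move elsewhere.
        self._positions = [p for p in self._positions if self._owner[p] != name]
        self._owner = {p: s for p, s in self._owner.items() if s != name}

    def server_for(self, key: str) -> str:
        if not self._positions:
            raise LookupError("no servers in the ring")
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]
```

Because every client computes the same ring, any client can find the server for a key without a central directory, and removing a failed server leaves the placement of all other keys unchanged.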
Performance

 An RDBMS uses buffers to ensure ACID properties
 NoSQL systems typically do not guarantee full ACID and
  can therefore be much faster
 We don’t need ACID everywhere!
 Ex. Data processing (every minute) is 4x faster with
  MongoDB, despite being a lot more detailed (due to
  much simpler development)
Why is NOSQL faster than SQL? - Scaling

 Simple web application with not much traffic
   Application server, database server all on one machine
Scaling contd..

 More traffic comes in
   Application server

   Database server




 Even more traffic comes in
   Load balancer

   Application server x2

   Database server
Scaling contd..


 Even more traffic comes in
     Load balancer x N
       easy
     Application server x N
       easy
     Database server xN
       hard for SQL databases
SQL Slowdown




 Not linear!
Scaling contd..


 NoSQL Scaling -
 Need more storage?
   Add more servers!

 Need higher performance?
   Add more servers!

 Need better reliability?
   Add more servers!
Scaling Summary

 You can scale SQL databases (Oracle, MySQL, SQL
  Server…)
     This will cost you dearly
     If you don’t have a lot of money, you will reach limits quickly
 You can scale NoSQL databases
   Very easy horizontal scaling

   Lots of open-source solutions

   Scaling is one of the basic incentives for design, so it is well
    handled
   Scaling is the cause of trade-offs causing you to have to use
    map/reduce
Characteristics

 Almost infinite horizontal scaling
 Very fast
 Performance doesn’t deteriorate with growth (much)
 No fixed table schemas
 No join operations
 Ad-hoc queries difficult or impossible
 Structured storage
 Almost everything happens in RAM
NOSQL Types


 Wide Column Store / Column Families
 Document Store
 Key Value / Tuple Store
 Graph Databases
 Object Databases
 XML Databases
 Multivalue Databases
Main types -

 Key-Value Stores
 Map Reduce Framework
 Document Databases
 Graph Databases
Key Value Stores

 Lineage: Amazon's Dynamo paper and Distributed
  HashTables.
 Data model: A global collection of key-value pairs
 Example systems
   Google BigTable, Amazon Dynamo, Cassandra,
     Voldemort, HBase, …
 Implementation: efficiency, scalability, fault-tolerance
   Records distributed to nodes based on key
   Replication

   Single-record transactions, “eventual consistency”
Document Databases

 Lineage: Inspired by Lotus Notes.
 Data model: Collections of documents, which
  contain key-value collections (called "documents").
 Example: CouchDB, MongoDB, Riak
Graph Database

 Lineage: Draws from Euler and graph theory.
 Data model: Nodes & relationships, both of which can
  hold key-value pairs
 Example: AllegroGraph, InfoGrid, Neo4j
Map Reduce Framework

 Google’s framework for processing highly
  distributable problems across huge datasets
  using a large number of computers
 Let’s define large number of computers
    Cluster if all of them have same hardware
    Grid unless Cluster (if !Cluster for old-style programmers)
 Process split into two phases
   Map
      Take the input, partition it, and delegate the parts to other machines
      Other machines can repeat the process, leading to a tree structure
      Each machine returns results to the machine that gave it the task
Map Reduce Framework contd..

   Reduce
     Collect results from the machines you gave tasks to
     Combine the results and return them to the requester

   Slower than sequential data processing, but massively parallel
   Can sort a petabyte of data in a few hours
   Input, Map, Shuffle, Reduce, Output
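The Input, Map, Shuffle, Reduce, Output pipeline can be sketched on a single machine with a word count, the customary introductory example. This is a toy illustration, not Google's or Hadoop's implementation; the function names are chosen here, and each input chunk stands in for the part of the data a separate machine would receive.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one input chunk into (key, value) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different machine.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

The map and reduce functions only ever see one chunk or one key at a time, which is what makes it possible to spread the work across many machines.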
Popular NoSQL


 Hadoop / HBase
 Cassandra
 Amazon SimpleDB
 MongoDB
 CouchDB
 Redis
 MemcacheDB
 Voldemort
 Hypertable
 Cloudata
 IBM Lotus/Domino
Real World Use

 Cassandra
   Facebook (original developer, used it till late 2010)
   Twitter
   Digg
   Reddit
   Rackspace
   Cisco

 BigTable
   Google (open-source version is HBase)

 MongoDB
   Foursquare
   Craigslist
   Bit.ly
   SourceForge
   GitHub
MONGODB

  Document store
  Basic support for dynamic (ad hoc) queries
  Query by example (nice!)




 Conditional Operators
    <, <=, >, >=
    $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $and,
     $size, $type
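To make "query by example" concrete, here is a small pure-Python sketch of the idea: the query is itself a document, and a stored document matches when every field in the query is satisfied. This is a simplified model written for illustration, not MongoDB's implementation or driver API; it handles only exact matches plus the `$gt`, `$gte`, `$lt`, `$lte`, `$in` and `$exists` operators, and the names `matches` and `find` are chosen here.

```python
_MISSING = object()  # sentinel for "field not present in the document"

# Only a handful of operators are modelled; any other operator raises KeyError.
COMPARATORS = {
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$in": lambda value, arg: value in arg,
}


def matches(doc, query):
    """Return True if doc satisfies every field condition in query."""
    for field, condition in query.items():
        value = doc.get(field, _MISSING)
        if isinstance(condition, dict):
            # Operator form, e.g. {"price": {"$gt": 100}}
            for op, arg in condition.items():
                if op == "$exists":
                    if (value is not _MISSING) != bool(arg):
                        return False
                elif value is _MISSING or not COMPARATORS[op](value, arg):
                    return False
        elif value is _MISSING or value != condition:
            # Plain form, e.g. {"title": "A"}: the field must equal the example.
            return False
    return True


def find(collection, query):
    """Return the documents in a list that match the example query."""
    return [doc for doc in collection if matches(doc, query)]
```

With this model, `find(books, {"price": {"$gt": 100}})` plays the same role as the `WHERE price > 100.00` clause in the earlier SQL example, but documents in the same collection are free to omit the `price` field altogether.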
MONGODB

 Data is stored as BSON (binary JSON)
     Makes it very well suited for languages with native JSON support
 Map/Reduce written in JavaScript
     Slow! There is one single thread of execution in JavaScript
 Master/slave replication (auto failover with replica sets)
 Sharding built-in
 Uses memory mapped files for data storage
 Performance over features
 On 32-bit systems, limited to ~2.5 GB
 An empty database takes up 192 MB
 GridFS to store big data + metadata (not actually an FS)
CASSANDRA

 Written in: Java
 Protocol: Custom, binary (Thrift)
 Tunable trade-offs for distribution and replication
  (N, R, W)
 Querying by column, range of keys
 BigTable-like features: columns, column families
 Writes are much faster than reads (!)
    Constant write time regardless of database size
 Map/reduce possible with Apache Hadoop
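The (N, R, W) trade-off mentioned above is usually read in the Dynamo style: N is the number of replicas of each item, W is how many replicas must acknowledge a write, and R is how many replicas are consulted on a read. That reading is an assumption here, since the slides do not define the letters. Under it, a read is guaranteed to touch at least one replica holding the latest acknowledged write exactly when R + W > N. The sketch below checks that rule by brute force for small clusters; the function names are chosen for illustration.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """The rule of thumb: read and write sets must intersect when r + w > n."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def every_read_sees_a_written_replica(n: int, r: int, w: int) -> bool:
    """Brute force: does every possible read set share a replica with every write set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

For example, with N = 3, choosing W = 2 and R = 2 gives overlapping quorums, while W = 1 and R = 1 favours speed and accepts that a read may return stale data.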
Some more info about Cassandra in Facebook

 Cassandra is an open-source DBMS from the Apache
  Software Foundation.
 Cassandra provides a structured key-value
  store with tunable consistency
 Cassandra is a distributed storage system for
  managing structured data that is designed to scale to
  a very large size across many commodity
  servers, with no single point of failure
 It is a NoSQL solution that was initially developed
  by Facebook and powered their Inbox Search feature
  until late 2010
HBASE

 Written in: Java
 Main point: Billions of rows X millions of columns
 Modeled after BigTable
 Map/reduce with Hadoop
 Query predicate push down via server side scan and get filters
 Optimizations for real time queries
 A high performance Thrift gateway
 HTTP supports XML, Protobuf, and binary
 Cascading, Hive, and Pig source and sink modules
 No single point of failure
 While Hadoop streams data efficiently, it has overhead for
  starting map/reduce jobs. HBase is a column-oriented
  key/value store and allows for low-latency reads and writes.
 Random access performance is like MySQL
COUCHDB

 Written in: Erlang
 Main point: DB consistency, ease of use
 Bi-directional (!) replication, continuous or ad-hoc, with conflict
    detection, thus, master-master replication. (!)
   MVCC - write operations do not block reads
   Previous versions of documents are available
   Crash-only (reliable) design
   Needs compacting from time to time
   Views: embedded map/reduce
   Formatting views: lists & shows
   Server-side document validation possible
   Authentication possible
   Real-time updates via _changes (!)
   Attachment handling
   CouchApps (standalone JS apps)
HADOOP

 Apache project
 A framework that allows for the distributed processing of
    large data sets across clusters of computers
   Designed to scale up from single servers to thousands of
    machines
   Designed to detect and handle failures at the application
    layer, instead of relying on hardware for it
   Created by Doug Cutting, who named it after his son's toy
    elephant
   Hadoop-related Apache projects
       Cassandra
       HBase
       Pig
   Hive was a Hadoop subproject, but is now a top-level Apache project
HADOOP contd..

 Scales to hundreds or thousands of computers, each with several
    processor cores
   Designed to efficiently distribute large amounts of work across a
    set of machines
   Hundreds of gigabytes of data constitute the low end of Hadoop-
    scale
   Built to process "web-scale" data on the order of hundreds of
    gigabytes to terabytes or petabytes
   Uses Java, but allows streaming so other languages can easily
    send and accept data items to/from Hadoop
HADOOP contd..

 Uses distributed file system (HDFS)
   Designed to hold very large amounts of data (terabytes or even
    petabytes)
   Files are stored in a redundant fashion across multiple
    machines to ensure their durability to failure and high
    availability to very parallel applications
   Data organized into directories and files

   Files are divided into blocks (64 MB by default) and distributed
    across nodes
 Design of HDFS is based on the design of the Google
  File System
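The block layout described above can be sketched in a few lines. The 64 MB block size comes from the slide; the replication factor of 3 and the round-robin placement are assumptions made for illustration (real HDFS placement is rack-aware and decided by the NameNode), and the function names are hypothetical.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the default quoted on the slide


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block of the file."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def place_replicas(num_blocks: int, nodes, replication: int = 3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Round-robin is a stand-in for illustration only; it is not how HDFS
    actually chooses nodes.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }
```

A 200 MB file, for instance, splits into three full 64 MB blocks plus one 8 MB block, and losing any single node still leaves copies of every block on other machines.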
HIVE

 A petabyte-scale data warehouse system for Hadoop
 Easy data summarization, ad-hoc queries
 Query the data using a SQL-like language called
  HiveQL
 Hive compiler generates map-reduce jobs for most
  queries
Conclusion

 NoSQL is a great problem solver if you need it
 Choose your NoSQL platform carefully as each is
  designed for specific purpose
 Get used to Map/Reduce
 It’s not a sin to use NoSQL alongside a (yes)SQL
  database
References

 http://www.facebook.com/note.php?note_id=24413138919
   http://en.wikipedia.org/wiki/Apache_Cassandra
   http://en.wikipedia.org/wiki/SQL
   http://en.wikipedia.org/wiki/NoSQL
   www.slideshare.com
THANK
YOU..!!

Nosql seminar

  • 2. Agenda  Introduction to NOSQL  Objective  Examples of NOSQL databases  NOSQL vs SQL  Conclusion
  • 3. Basic Concepts  Database – is a organized collection of data.  Data base Management System (DBMS)- is a software package with computer program that controls the creation , maintainance & use of a database.  for DBMS , we use structured language to interact with it  Ex. Oracle , IBM DB2 , Ms Access , MySQL , FoxPro etc.  Relational DBMS - A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily. A relational database is created using the relational model. The software used in a relational database is called a relational database management system (RDBMS).
  • 4. SQL  Stuctured Query Language  Special purpose programming language designed for managing data in RDBMS.  Origininally based upon relational algebra & tuple relation calculas.  SQl’s scope include data insert,upadte & delete, schema creation and modification , data access control.  It is static and strong used in database.  Most used widely used database language.  Query is the most important operation in SQL.  Ex. SELECT * FROM Book WHERE price > 100.00 ORDER BY title;
  • 5. NOSQL  Stands for Not Only SQL  Class of non-relational data storage systems  Usually do not require a fixed table schema nor do they use the concept of joins  All NOSQL offerings relax one or more of the ACID properties .  Atomicity , Consistancy , Isolation , Durability ( ACID )  “NOSQL” = “Not Only SQL” = Not Only using traditional relational DBMS
  • 6. NOSQL • Alternative to traditional relational DBMS • Flexible schema • Quicker/cheaper to set up • Massive scalability • Relaxed consistency higher performance & availability * No declarative query language more programming * Relaxed consistency fewer guarantees
  • 7. Why NOSQL?  Every problem cannot be solved by traditional relational database system exclusively.  Handles huge databases.  Redundancy, data is pretty safe on commodity hardware  Super flexible queries using map/reduce  Rapid development (no fixed schema, yeah!)  Very fast for common use cases
  • 8. Contd..  Inspired by Distributed Data Storage problems  Scale easily by adding servers  Not suited to all problem types, but super-suited to certain large problem types  High-write situations (eg activity tracking or timeline rendering for millions of users)  A lot of relational uses are really dumbed down (eg fetch by PK with update)
  • 10. How does it work?  Clients know how to: Send items to servers (consistent hashing) What to do when a server fails How to fetch keys from servers How to weight servers by capacity  Servers know how to: Store items they receive Expire them from the cache No inter-server communication: servers are unaware of each other
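The client-side logic on the slide above (consistent hashing, weighting by capacity, coping with a failed server) can be sketched as a small hash ring. This is a minimal illustration under assumptions, not the code of any particular client library: the class name HashRing, the use of MD5, and the points_per_weight parameter are all hypothetical choices.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; servers maps a server name to an integer weight."""

    def __init__(self, servers, points_per_weight=100):
        self._ring = []  # sorted list of (hash, server name) points
        for name, weight in servers.items():
            # A server with a larger weight gets more points, hence more keys.
            for i in range(points_per_weight * weight):
                self._ring.append((self._hash(f"{name}#{i}"), name))
        self._ring.sort()
        self._hashes = [point_hash for point_hash, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        """Return the server owning the first ring point after the key's hash."""
        if not self._ring:
            raise ValueError("no servers in the ring")
        index = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing({"cache-a": 1, "cache-b": 1, "cache-c": 2})
print(ring.server_for("user:42"))
```

When a server fails, the client rebuilds the ring without it. Only the keys that lived on the failed server move; every other key keeps its server, which is why no inter-server communication is needed.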
  • 11. Performance  An RDBMS uses buffering to ensure ACID properties  Many NoSQL systems do not guarantee full ACID and can therefore be much faster  We don’t need ACID everywhere!  Ex. Data processing (every minute) is 4x faster with MongoDB, despite being a lot more detailed (due to much simpler development)
  • 12. Why is NOSQL faster than SQL? - Scaling  Simple web application with not much traffic  Application server, database server all on one machine
  • 13. Scaling contd..  More traffic comes in  Application server  Database server  Even more traffic comes in  Load balancer  Application server x2  Database server
  • 14. Scaling contd..  Even more traffic comes in  Load balancer x N → easy  Application server x N → easy  Database server x N → hard for SQL databases
  • 16. Scaling contd..  NoSQL scaling -  Need more storage?  Add more servers!  Need higher performance?  Add more servers!  Need better reliability?  Add more servers!
  • 17. Scaling Summary  You can scale SQL databases (Oracle, MySQL, SQL Server…)  This will cost you dearly  If you don’t have a lot of money, you will reach limits quickly  You can scale NoSQL databases  Very easy horizontal scaling  Lots of open-source solutions  Scaling is one of the basic incentives for the design, so it is well handled  Scaling comes with trade-offs, such as having to use map/reduce for queries
  • 18. Characteristics  Almost infinite horizontal scaling  Very fast  Performance doesn’t deteriorate with growth (much)  No fixed table schemas  No join operations  Ad-hoc queries difficult or impossible  Structured storage  Almost everything happens in RAM
  • 19. NOSQL Types  Wide Column Store / Column Families  Document Store  Key Value / Tuple Store  Graph Databases  Object Databases  XML Databases  Multivalue Databases
  • 20. Main types -  Key-Value Stores  Map Reduce Framework  Document Databases  Graph Databases
  • 21. Key Value Stores  Lineage: Amazon's Dynamo paper and distributed hash tables.  Data model: A global collection of key-value pairs  Example systems  Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, …  Implementation: efficiency, scalability, fault-tolerance  Records distributed to nodes based on key  Replication  Single-record transactions, “eventual consistency”
  • 22. Document Databases  Lineage: Inspired by Lotus Notes.  Data model: Collections of key-value collections (called "documents").  Example: CouchDB, MongoDB, Riak
  • 23. Graph Database  Lineage: Draws from Euler and graph theory.  Data model: Nodes & relationships, both of which can hold key-value pairs  Example: AllegroGraph, InfoGrid, Neo4j
  • 24. Map Reduce Framework  Google’s framework for processing highly distributable problems across huge datasets using a large number of computers  Let’s define "large number of computers"  Cluster if all of them have the same hardware  Grid otherwise (if !Cluster for old-style programmers)  Process split into two phases  Map  Take the input, partition it, and delegate the parts to other machines  Other machines can repeat the process, leading to a tree structure  Each machine returns results to the machine that gave it the task
  • 25. Map Reduce Framework contd..  Reduce  Collect results from the machines you gave tasks to  Combine the results and return them to the requester  Slower than sequential data processing, but massively parallel  Sort a petabyte of data in a few hours  Input, Map, Shuffle, Reduce, Output
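The Input, Map, Shuffle, Reduce, Output pipeline from the two slides above can be sketched on a single machine with a word count. This is an illustration only, not Google's or Hadoop's implementation: the function names are hypothetical, and a real framework would run the map and reduce calls on many machines in parallel.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key so each key is reduced once."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: combine all values collected for one key into a single result."""
    return word, sum(counts)


def map_reduce(documents):
    # In a real cluster each document (or partition) would be mapped on a different machine.
    mapped = [pair for document in documents for pair in map_phase(document)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


print(map_reduce(["NoSQL scales out", "SQL scales up", "NoSQL is not only SQL"]))
```

Because each map call looks at only one document and each reduce call at only one key, both phases can be spread across machines without coordination between them.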
  • 26. Popular NoSQL  Hadoop / HBase  MemcacheDB  Cassandra  Voldemort  Amazon SimpleDB  Hypertable  Cloudata  MongoDB  IBM Lotus/Domino  CouchDB  Redis
  • 27. Real World Use  Cassandra  Facebook (original developer, used it till late 2010)  Twitter  Digg  Reddit  Rackspace  Cisco  BigTable  Google (HBase is an open-source system modeled after it)  MongoDB  Foursquare  Craigslist  Bit.ly  SourceForge  GitHub
  • 28. MONGODB  Document store  Basic support for dynamic (ad hoc) queries  Query by example (nice!)  Conditional Operators  <, <=, >, >=  $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $and, $size, $type
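To show what "query by example" with conditional operators means, here is a toy matcher written in plain Python. It is a sketch under assumptions, not MongoDB's implementation or driver API: the matches function is hypothetical, it handles only a handful of the operators listed on the slide above, and it ignores many rules of the real query language (nested fields, arrays, null handling).

```python
# Supported comparison operators; $gt, $gte, $lt, $lte are the named forms of >, >=, <, <=.
OPERATORS = {
    "$gt": lambda value, arg: value is not None and value > arg,
    "$gte": lambda value, arg: value is not None and value >= arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$lte": lambda value, arg: value is not None and value <= arg,
    "$ne": lambda value, arg: value != arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
}


def matches(document, query):
    """Return True if the document (a dict) satisfies every condition in the query dict."""
    for field, condition in query.items():
        present = field in document
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"price": {"$gt": 100}}.
            for operator, argument in condition.items():
                if operator == "$exists":
                    if present != bool(argument):
                        return False
                elif not OPERATORS[operator](value, argument):
                    return False
        elif not present or value != condition:
            # Plain form, e.g. {"title": "NoSQL"}: the field must equal the example value.
            return False
    return True


# Hypothetical documents; note that they need not share the same fields.
books = [
    {"title": "NoSQL Distilled", "price": 120, "tags": ["db"]},
    {"title": "Pocket SQL", "price": 80},
]
print([b["title"] for b in books if matches(b, {"price": {"$gt": 100}})])
```

The query itself is just a document: a dict that looks like the records it should match, with operator sub-documents where an exact value is not enough.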
  • 29. MONGODB  Data is stored as BSON (binary JSON)  Makes it very well suited for languages with native JSON support  Map/Reduce written in JavaScript  Slow! There is a single thread of execution in JavaScript  Master/slave replication (auto failover with replica sets)  Sharding built-in  Uses memory-mapped files for data storage  Performance over features  On 32-bit systems, limited to ~2.5 GB  An empty database takes up 192 MB  GridFS to store big data + metadata (not actually an FS)
  • 30. CASSANDRA  Written in: Java  Protocol: Custom, binary (Thrift)  Tunable trade-offs for distribution and replication (N, R, W)  Querying by column, range of keys  BigTable-like features: columns, column families  Writes are much faster than reads (!)  Constant write time regardless of database size  Map/reduce possible with Apache Hadoop
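The (N, R, W) trade-off on the slide above is usually read as follows in Dynamo-style systems: N is the number of replicas of a record, W the number of replicas that must acknowledge a write, and R the number of replicas consulted on a read. A read is guaranteed to overlap the latest acknowledged write when R + W > N. The helper below is a hypothetical sketch of that rule only, not Cassandra's API.

```python
def read_overlaps_latest_write(n, r, w):
    """Return True if every read quorum must intersect every write quorum (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return r + w > n


# Three replicas with quorum reads and writes: any 2 readers overlap any 2 writers.
print(read_overlaps_latest_write(n=3, r=2, w=2))
# Three replicas with single-replica reads and writes: faster, but a read may miss the write.
print(read_overlaps_latest_write(n=3, r=1, w=1))
```

Lower R and W values give lower latency and higher availability; higher values give stronger consistency. That is the tuning knob the slide refers to.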
  • 31. Some more info about Cassandra at Facebook  Cassandra is an open-source DBMS from the Apache Software Foundation.  Cassandra provides a structured key-value store with tunable consistency  Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure  It is a NoSQL solution that was initially developed by Facebook and powered their Inbox Search feature until late 2010
  • 32. HBASE  Written in: Java  Main point: Billions of rows x millions of columns  Modeled after BigTable  Map/reduce with Hadoop  Query predicate push-down via server-side scan and get filters  Optimizations for real-time queries  A high-performance Thrift gateway  HTTP supports XML, Protobuf, and binary  Cascading, Hive, and Pig source and sink modules  No single point of failure  While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is a column-oriented key/value store and allows for low-latency reads and writes.  Random access performance is like MySQL
  • 33. COUCHDB  Written in: Erlang  Main point: DB consistency, ease of use  Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication. (!)  MVCC - write operations do not block reads  Previous versions of documents are available  Crash-only (reliable) design  Needs compacting from time to time  Views: embedded map/reduce  Formatting views: lists & shows  Server-side document validation possible  Authentication possible  Real-time updates via _changes (!)  Attachment handling  CouchApps (standalone JS apps)
  • 34. HADOOP  Apache project  A framework that allows for the distributed processing of large data sets across clusters of computers  Designed to scale up from single servers to thousands of machines  Designed to detect and handle failures at the application layer, instead of relying on hardware for it  Created by Doug Cutting, who named it after his son's toy elephant  Hadoop-related Apache projects  Cassandra  HBase  Pig  Hive was a Hadoop subproject, but is now a top-level Apache project
  • 35. HADOOP contd..  Scales to hundreds or thousands of computers, each with several processor cores  Designed to efficiently distribute large amounts of work across a set of machines  Hundreds of gigabytes of data constitute the low end of Hadoop-scale  Built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes  Uses Java, but allows streaming so other languages can easily send and accept data items to/from Hadoop
  • 36. HADOOP contd..  Uses a distributed file system (HDFS)  Designed to hold very large amounts of data (terabytes or even petabytes)  Files are stored redundantly across multiple machines to ensure durability under failure and high availability to highly parallel applications  Data organized into directories and files  Files are divided into blocks (64 MB by default) and distributed across nodes  Design of HDFS is based on the design of the Google File System
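A quick back-of-the-envelope sketch of the block layout described on the slide above. The 64 MB block size comes from the slide; the replication factor of 3 is an assumption added here for illustration, and both function names are hypothetical.

```python
import math

BLOCK_SIZE_MB = 64  # default block size stated on the slide


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of blocks a file is split into; the last block may be partly filled."""
    return math.ceil(file_size_mb / block_size_mb)


def replicated_size_mb(file_size_mb, replication=3):
    """Raw storage used when every block is kept on `replication` machines (assumed value)."""
    return file_size_mb * replication


# A 1000 MB file: 15 full blocks plus one partial block.
print(block_count(1000))
print(replicated_size_mb(1000))
```

Large blocks keep the number of pieces per file small, and each block can be placed (and replicated) on a different node.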
  • 37. HIVE  A petabyte-scale data warehouse system for Hadoop  Easy data summarization, ad-hoc queries  Query the data using a SQL-like language called HiveQL  Hive compiler generates map-reduce jobs for most queries
  • 38. Conclusion  NoSQL is a great problem solver if you need it  Choose your NoSQL platform carefully, as each is designed for a specific purpose  Get used to Map/Reduce  It’s not a sin to use NoSQL alongside a (yes)SQL database
  • 39. References  http://www.facebook.com/note.php?note_id=24413138919  http://en.wikipedia.org/wiki/Apache_Cassandra  http://en.wikipedia.org/wiki/SQL  http://en.wikipedia.org/wiki/NoSQL  www.slideshare.com