SlideShare une entreprise Scribd logo
1  sur  17
Télécharger pour lire hors ligne
BIG DATA, NOSQL & 
ELASTICSEARCH 
BY SANURA HETTIARACHCHI 
INTERN AT VOCANIC
GRAPH DATABASE 
• Database that uses graph structures with nodes, edges, and properties to 
represent and store data. 
• Nodes represent entities. 
• Properties are pertinent information that relate to nodes. 
• Edges represent the relationship between the two. Most of the important 
information is really stored in the edges.
ADVANTAGES & DISADVANTAGES 
• Faster for associative data sets 
• Map more directly to the structure of object-oriented applications. 
• Do not typically require expensive join operations. 
• Depend less on a rigid schema, they are more suitable to manage ad hoc and 
changing data with evolving schemas. 
o Relational databases are typically faster at performing the same operation on 
large numbers of data elements. 
o Relational databases are well known.
BIG DATA 
• Any collection of data sets so large and complex that it becomes difficult to 
process using traditional data processing applications. 
• Require "massively parallel software running on tens, hundreds, or even 
thousands of servers"
FACTORS OF GROWTH, CHALLENGES AND 
OPPORTUNITIES OF BIG DATA 
• Volume – the quantity of data that is generated. 
• Variety – category to which Big Data belongs to. 
• Velocity – how fast the data is generated and processed to meet the demands. 
• Variability – the inconsistency which can be shown by the data at times. 
• Complexity – data needs to be linked, connected and correlated in order to be able 
to grasp the information.
HORIZONTAL & VERTICAL SCALING 
• Horizontal scaling - scale by adding more machines to your pool of resources. 
• Vertical scaling - scale by adding more power (CPU, RAM, etc.) to your existing 
machine. 
• Horizontal scaling is easier to scale dynamically by adding more machines into 
the existing pool. 
• Vertical scaling is often limited to the capacity of a single machine 
• Horizontal scaling are the Cloud data stores, e.g. DynamoDB, Cassandra , 
MongoDB 
• Vertical scaling is MySQL - Amazon RDS (The cloud version of MySQL)
ACID 
• Atomicity - all of a transaction happens, or none of it does. 
• Consistency - data will be consistent. 
• Isolation - one transaction cannot read data from another transaction that is not yet 
completed. 
• Durability - once a transaction is complete, it is guaranteed that all of the changes have 
been recorded to a durable medium.
NOSQL 
• Basically a large serialized object store 
• (mostly) retrieve objects by defined ID 
• In general, doesn’t support complicated queries 
• Doesn’t have a structured schema 
• Recommends de-normalization 
• Designed to be distributed (cloud-scale) out of the box 
• Because of this, drops the ACID requirements 
• Any database can answer any query 
• Any write query can operate against any database and will “eventually” propagate to other 
distributed servers
BASE-THE OPPOSITE OF ACID 
• Basically Available – guaranteed availability 
• Soft-state – the state of the system may change, even without a query 
(because of node updates) 
• Eventually Consistent – the system will become consistent over time
WHY NOSQL? 
• Today, data is becoming easier to access and capture through third parties such as 
Facebook, Google+ and others. 
• Personal user information, social graphs, geo-location data, user-generated content 
and machine logging data are just a few examples where the data has been 
increasing exponentially. 
• To use the above services properly requires the processing of huge amounts of data. 
Which SQL databases are no good for, and were never designed for. 
• NoSQL databases have evolved to handle this huge data properly.
CAP THEOREM 
• Consistency - This means that the data in the database 
remains consistent after the execution of an operation. 
• Availability - This means that the system is always on, 
no downtime. 
• Partition Tolerance - This means that the system 
continues to function even if the communication 
among the servers is unreliable 
Distributed systems must be partition tolerant , so we 
have to choose between Consistency and Availability.
DIFFERENT TYPES OF NOSQL 
 Column Store 
• Column data is saved together, as opposed to 
row data 
• Super useful for data analytics 
• Hadoop, Cassandra, Hypertable 
 Key-Value Store 
• A key that refers to a payload 
• MemcacheDB, Azure Table Storage, Redis 
 Document / XML / Object Store 
• Key (and possibly other indexes) point at a 
serialized object 
• DB can operate against values in document 
• MongoDB, CouchDB, RavenDB 
 Graph Store 
• Nodes are stored independently, and the 
relationship between nodes (edges) are stored 
with data
RDBMS VS NOSQL 
RDBMS NoSQL 
Structured and organized data Semi-structured or unorganized data 
Structured Query Language (SQL) No declarative query language 
Tight consistency Eventual consistency 
ACID transactions BASE transactions 
Data and Relationships stored in tables No pre defined schema
ELASTICSEARCH 
• Elasticsearch is a flexible and powerful open source, distributed, real-time 
search and analytics engine. 
• It is based on Apache Lucene which is a free open source information retrieval 
software library, originally written in Java. 
• ElasticSearch is distributed, which means that indices can be divided 
into shards and each shard can have zero or more replicas.
OVERVIEW 
• Real time 
Elasticsearch supports real-time GET requests, which makes it 
suitable as a NoSQL solution. 
• Distributed 
It is built to scale horizontally out of the box. As you need more 
capacity, just add more nodes, and let the cluster reorganize itself to 
take advantage of the extra hardware. 
• High Availability 
They will detect and remove failed nodes, and reorganize themselves 
to ensure that your data is safe and accessible.
OVERVIEW 
• Multi Tenancy 
A cluster can host multiple indices which can be queried 
independently or as a group. 
• Document Oriented 
Store complex real world entities in Elasticsearch as structured JSON 
documents. All fields are indexed by default, and all the indices can 
be used in a single query, to return results at breath taking speed. 
• Conflict Management 
Optimistic version control can be used where needed to ensure that 
data is never lost due to conflicting changes from multiple processes
THANK YOU

Contenu connexe

Tendances

Elastic Search
Elastic SearchElastic Search
Elastic SearchNavule Rao
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchBo Andersen
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiRobert Calcavecchia
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Presentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membasePresentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membaseArdak Shalkarbayuli
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseRobert Lujo
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search medcl
 
Big Data Overview Part 1
Big Data Overview Part 1Big Data Overview Part 1
Big Data Overview Part 1William Simms
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningPetar Djekic
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic IntroductionMayur Rathod
 
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.ioDataconomy Media
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
Elasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseElasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseRicha Budhraja
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchJason Austin
 
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...Edureka!
 

Tendances (18)

Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
 
Elastic search
Elastic searchElastic search
Elastic search
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Presentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membasePresentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membase
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
 
Big Data Overview Part 1
Big Data Overview Part 1Big Data Overview Part 1
Big Data Overview Part 1
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuning
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io
"TextMining with ElasticSearch", Saskia Vola, CEO at textminers.io
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
Elasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseElasticsearch V/s Relational Database
Elasticsearch V/s Relational Database
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
 

En vedette

Put Your Thinking CAP On
Put Your Thinking CAP OnPut Your Thinking CAP On
Put Your Thinking CAP OnTomer Gabel
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big dataSteven Francia
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseKristijan Duvnjak
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014Stratebi
 

En vedette (6)

Put Your Thinking CAP On
Put Your Thinking CAP OnPut Your Thinking CAP On
Put Your Thinking CAP On
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014
 

Similaire à BIG DATA, NOSQL & ELASTICSEARCH TECHNOLOGIES

Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...raghdooosh
 
Basic Introduction to Crate @ ViennaDB Meetup
Basic Introduction to Crate @ ViennaDB MeetupBasic Introduction to Crate @ ViennaDB Meetup
Basic Introduction to Crate @ ViennaDB MeetupJohannes Moser
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelRishikese MR
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology LandscapeShivanandaVSeeri
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
Introduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesIntroduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesShilpaKrishna6
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL David Smelker
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overviewPritamKathar
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3Simon Ambridge
 

Similaire à BIG DATA, NOSQL & ELASTICSEARCH TECHNOLOGIES (20)

Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
 
Basic Introduction to Crate @ ViennaDB Meetup
Basic Introduction to Crate @ ViennaDB MeetupBasic Introduction to Crate @ ViennaDB Meetup
Basic Introduction to Crate @ ViennaDB Meetup
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
NOsql Presentation.pdf
NOsql Presentation.pdfNOsql Presentation.pdf
NOsql Presentation.pdf
 
Big data stores
Big data  storesBig data  stores
Big data stores
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
NoSql
NoSqlNoSql
NoSql
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
NoSQL and Couchbase
NoSQL and CouchbaseNoSQL and Couchbase
NoSQL and Couchbase
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra Model
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology Landscape
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
Introduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesIntroduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databases
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3
 

Dernier

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Mark Simos
 

Dernier (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 

BIG DATA, NOSQL & ELASTICSEARCH TECHNOLOGIES

  • 1. BIG DATA, NOSQL & ELASTICSEARCH BY SANURA HETTIARACHCHI INTERN AT VOCANIC
  • 2. GRAPH DATABASE • Database that uses graph structures with nodes, edges, and properties to represent and store data. • Nodes represent entities. • Properties are pertinent information that relate to nodes. • Edges represent the relationship between the two. Most of the important information is really stored in the edges.
  • 3. ADVANTAGES & DISADVANTAGES • Faster for associative data sets • Map more directly to the structure of object-oriented applications. • Do not typically require expensive join operations. • Depend less on a rigid schema, they are more suitable to manage ad hoc and changing data with evolving schemas. o Relational databases are typically faster at performing the same operation on large numbers of data elements. o Relational databases are well known.
  • 4. BIG DATA • Any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. • Require "massively parallel software running on tens, hundreds, or even thousands of servers"
  • 5. FACTORS OF GROWTH, CHALLENGES AND OPPORTUNITIES OF BIG DATA • Volume – the quantity of data that is generated. • Variety – category to which Big Data belongs to. • Velocity – how fast the data is generated and processed to meet the demands. • Variability – the inconsistency which can be shown by the data at times. • Complexity – data needs to be linked, connected and correlated in order to be able to grasp the information.
  • 6. HORIZONTAL & VERTICAL SCALING • Horizontal scaling - scale by adding more machines to your pool of resources. • Vertical scaling - scale by adding more power (CPU, RAM, etc.) to your existing machine. • Horizontal scaling is easier to scale dynamically by adding more machines into the existing pool. • Vertical scaling is often limited to the capacity of a single machine • Horizontal scaling are the Cloud data stores, e.g. DynamoDB, Cassandra , MongoDB • Vertical scaling is MySQL - Amazon RDS (The cloud version of MySQL)
  • 7. ACID • Atomicity - all of a transaction happens, or none of it does. • Consistency - data will be consistent. • Isolation - one transaction cannot read data from another transaction that is not yet completed. • Durability - once a transaction is complete, it is guaranteed that all of the changes have been recorded to a durable medium.
  • 8. NOSQL • Basically a large serialized object store • (mostly) retrieve objects by defined ID • In general, doesn’t support complicated queries • Doesn’t have a structured schema • Recommends de-normalization • Designed to be distributed (cloud-scale) out of the box • Because of this, drops the ACID requirements • Any database can answer any query • Any write query can operate against any database and will “eventually” propagate to other distributed servers
  • 9. BASE-THE OPPOSITE OF ACID • Basically Available – guaranteed availability • Soft-state – the state of the system may change, even without a query (because of node updates) • Eventually Consistent – the system will become consistent over time
  • 10. WHY NOSQL? • Today, data is becoming easier to access and capture through third parties such as Facebook, Google+ and others. • Personal user information, social graphs, geo-location data, user-generated content and machine logging data are just a few examples where the data has been increasing exponentially. • To use the above services properly requires the processing of huge amounts of data. Which SQL databases are no good for, and were never designed for. • NoSQL databases have evolved to handle this huge data properly.
  • 11. CAP THEOREM • Consistency - This means that the data in the database remains consistent after the execution of an operation. • Availability - This means that the system is always on, no downtime. • Partition Tolerance - This means that the system continues to function even if the communication among the servers is unreliable Distributed systems must be partition tolerant , so we have to choose between Consistency and Availability.
  • 12. DIFFERENT TYPES OF NOSQL  Column Store • Column data is saved together, as opposed to row data • Super useful for data analytics • Hadoop, Cassandra, Hypertable  Key-Value Store • A key that refers to a payload • MemcacheDB, Azure Table Storage, Redis  Document / XML / Object Store • Key (and possibly other indexes) point at a serialized object • DB can operate against values in document • MongoDB, CouchDB, RavenDB  Graph Store • Nodes are stored independently, and the relationship between nodes (edges) are stored with data
  • 13. RDBMS VS NOSQL RDBMS NoSQL Structured and organized data Semi-structured or unorganized data Structured Query Language (SQL) No declarative query language Tight consistency Eventual consistency ACID transactions BASE transactions Data and Relationships stored in tables No pre defined schema
  • 14. ELASTICSEARCH • Elasticsearch is a flexible and powerful open source, distributed, real-time search and analytics engine. • It is based on Apache Lucene which is a free open source information retrieval software library, originally written in Java. • ElasticSearch is distributed, which means that indices can be divided into shards and each shard can have zero or more replicas.
  • 15. OVERVIEW • Real time Elasticsearch supports real-time GET requests, which makes it suitable as a NoSQL solution. • Distributed It is built to scale horizontally out of the box. As you need more capacity, just add more nodes, and let the cluster reorganize itself to take advantage of the extra hardware. • High Availability They will detect and remove failed nodes, and reorganize themselves to ensure that your data is safe and accessible.
  • 16. OVERVIEW • Multi Tenancy A cluster can host multiple indices which can be queried independently or as a group. • Document Oriented Store complex real world entities in Elasticsearch as structured JSON documents. All fields are indexed by default, and all the indices can be used in a single query, to return results at breath taking speed. • Conflict Management Optimistic version control can be used where needed to ensure that data is never lost due to conflicting changes from multiple processes