The term Big Data is characterized by the 5 Vs:
Volume: the large volume of data.
Velocity: the amount of data arriving per second.
Variability: the level of unintentional modification affecting data quality throughout the data lifecycle.
Value: the value derived from the data.
Variety: the wide range of data types received, such as video, audio, text and images.
Example sources by V:
Volume: YouTube. A large volume of video feeds is received and maintained by video sites such as YouTube and Vimeo.
Variety: A large variety of data (text, audio, video, images) is received by sites such as Facebook, Twitter and other social media platforms.
Velocity: The speed at which data is received by sites such as Twitter and Facebook (1 billion people all feeding their data into one site).
Batch processing vs. real-time processing
Batch jobs run at a particular time of day, such as nightly or morning jobs, chosen to fall in slack time when the server has less load.
But people now want to see status in real time, for example in transportation, when a bus is arriving at a particular stop. Or in retail, where real-time advertisements are required as soon as customers update their status. This is shaping the move towards Big Data.
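The difference between the two styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bus-delay events, the stop names and the `RunningAverage` class are hypothetical and do not come from the slides. The batch function waits for the full set of events and computes the answer in one pass, while the real-time version keeps an up-to-date answer after every single event.

```python
from collections import defaultdict

# Hypothetical bus-arrival events: (stop name, delay in minutes).
EVENTS = [("Main St", 2), ("Main St", 4), ("Park Ave", 1), ("Main St", 0)]


def batch_average_delay(events):
    """Batch style: process the whole day's events in one pass, e.g. at night."""
    totals = defaultdict(lambda: [0, 0])  # stop -> [sum of delays, count]
    for stop, delay in events:
        totals[stop][0] += delay
        totals[stop][1] += 1
    return {stop: s / n for stop, (s, n) in totals.items()}


class RunningAverage:
    """Real-time style: update the answer as each event arrives."""

    def __init__(self):
        self._totals = defaultdict(lambda: [0, 0])

    def update(self, stop, delay):
        entry = self._totals[stop]
        entry[0] += delay
        entry[1] += 1
        return entry[0] / entry[1]  # current average for this stop


live = RunningAverage()
latest = {}
for stop, delay in EVENTS:
    # A fresh value is available immediately after every event.
    latest[stop] = live.update(stop, delay)

batch_result = batch_average_delay(EVENTS)
```

Both approaches end with the same averages; what differs is when the answer becomes available.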
Problems differentiated by the 5 Vs
Velocity: Given the large volume of data received and the short turnaround latency required to reflect the data fed into Facebook, can it be managed by a regular DBMS?
A DBMS maintains ACID properties and has many constraints such as primary keys, foreign keys and check constraints. When a quick turnaround or short latency is required, these constraints add processing time and storage volume. So these sites built their own file-based, DBMS-like storage systems that do not have these constraints. All data is maintained in files, and the ids assigned to the files are indexed and regularly moved. (Publicly known examples are Cassandra, developed by Facebook and later open sourced, and Bigtable by Google, whose design was published and which HBase reimplements as open source.)
Most of these databases are popularly categorized as NoSQL databases.
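The idea of keeping data in files and finding it through an index of ids can be sketched as follows. This is a toy, in-memory illustration and not how Cassandra or Bigtable are actually implemented: the `AppendOnlyStore` class, its method names and the sample records are all hypothetical, and an `io.BytesIO` buffer stands in for a data file on disk. Note that a write does no constraint checking at all; it only appends and updates the index.

```python
import io
import json


class AppendOnlyStore:
    """Toy store: records are appended to a file-like log and found via an id index."""

    def __init__(self):
        self._log = io.BytesIO()  # stands in for a data file on disk
        self._index = {}          # record id -> (byte offset, length)

    def put(self, record_id, record):
        data = json.dumps(record).encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)  # always append at the end
        self._log.write(data)
        # No key or check constraints: the newest version simply wins.
        self._index[record_id] = (offset, len(data))

    def get(self, record_id):
        offset, length = self._index[record_id]
        self._log.seek(offset)
        return json.loads(self._log.read(length).decode("utf-8"))


store = AppendOnlyStore()
store.put("post:1", {"user": "alice", "text": "hello"})
store.put("post:2", {"user": "bob", "text": "hi"})
store.put("post:1", {"user": "alice", "text": "hello, edited"})
```

An update never rewrites old bytes; the index just points at the latest copy, which keeps writes cheap at the cost of leaving stale data in the file until it is cleaned up.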
Technology | Company | Released as
Cassandra | DataStax | Apache Cassandra (used by Facebook, LinkedIn, Twitter)
Bigtable | Google | Google Bigtable (proprietary Google service)
Apache HBase | Apache | HBase (used by many companies, most popular)
MongoDB | MongoDB Inc. | Open source (written in C++)
Couchbase | Couchbase Inc. | Apache license (written in C++, Erlang, C)
Category | NoSQL databases
Column-oriented | Accumulo, Cassandra, HBase
Document | Clusterpoint, CouchDB, Couchbase, MarkLogic, MongoDB
Key-value | Dynamo, FoundationDB, MemcacheDB, Redis, Riak, FairCom c-treeACE
Graph | AllegroGraph, Neo4j, OrientDB, Virtuoso, Stardog
- Column-oriented databases store values column by column, rather than row by row as in an RDBMS.
- This leads to better compression of the data, and hence less space is required to store the database.
- Still higher compression can be achieved with probabilistic databases.
- Similarly, document-oriented databases store and arrange data in the form of documents.
- Key-value stores hold data as a collection of key-value pairs, allowing pairs to be inserted, updated and deleted.
- Graph databases: every element holds a direct pointer to its adjacent elements, hence no index lookup is required.
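Why a column layout tends to compress better can be shown with a small sketch. The table, its column names and the choice of run-length encoding are assumptions made for illustration; real column stores use a range of encodings. Because every row carries a unique id, no two adjacent rows are identical and row-wise run-length encoding finds nothing to merge, whereas the country and status columns each collapse into a couple of runs.

```python
from itertools import groupby

# Hypothetical table stored row by row: (user id, country, status).
ROWS = [
    (1, "IE", "active"),
    (2, "IE", "active"),
    (3, "IE", "active"),
    (4, "IN", "active"),
    (5, "IN", "inactive"),
    (6, "IN", "inactive"),
]


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    return [[value, len(list(group))] for value, group in groupby(values)]


# Column layout: one list per column instead of one tuple per row.
country_column = [row[1] for row in ROWS]
status_column = [row[2] for row in ROWS]

encoded_country = run_length_encode(country_column)
encoded_status = run_length_encode(status_column)

# Row layout: the unique id makes every row distinct, so nothing is merged.
encoded_rows = run_length_encode(ROWS)
```

Here six country values shrink to two runs and six status values to two runs, while the row-wise encoding still holds six entries.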
Go through the link below:
http://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing
As we now know, Big Data addresses the problems of the 5 Vs, such as the huge (V)olume of storage required by video sites like YouTube.
It is changing how we perceive, visualize and analyze data: for example, HBase is used for data storage and Mahout is used to run analytics and find patterns. These databases hold a variety of data requiring different kinds of processing that cannot be achieved with traditional RDBMS-based products. Example link below:
http://sandyclassic.wordpress.com/2013/06/18/gini-coefficient-of-economics-and-roc-curve-machine-learning/
The MapReduce algorithm, created by Google researchers, was the starting point of all we see in Big Data.
The mapper divides the work into multiple parallel tasks and emits key-value pairs; these are then sorted and grouped into queues by key, say one queue for each name.
The reducer component aggregates or summarizes the data from multiple units.
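A minimal single-process sketch of the map, group and reduce steps, using the classic word-count task, might look like the following. The function names and sample documents are hypothetical, and everything runs sequentially here; in a real framework such as Hadoop the map and reduce calls would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle/sort: group all emitted values by key, one group per word."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    """Reducer: aggregate the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped independently, which is what allows parallelism.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


DOCUMENTS = ["big data big clusters", "data moves fast"]
counts = word_count(DOCUMENTS)
```

The mapper never needs to see more than one document and the reducer never needs more than one key's values, which is why both steps distribute well.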
Since the data is mostly unstructured, and the best way to analyze unstructured data is through analytics, a new career has emerged: the data scientist.
Skill set required for a data scientist:
mathematics (mostly statistics), computer science, and a domain such as sociology (for example, for social media analysis).
One application of Big Data has been to gather feedback about products from social media.
The sample project report below shows how, and with which tools, social media can be analyzed.
http://www.slideshare.net/SandeepSharma65/social-media-analysis-project
Hadoop allows load to be distributed among the many nodes of a cluster.
There can be database clusters, OS clusters and application or web server level clustering, but here we are dealing with OS-like clustering: a distributed file system (DFS).
Hadoop DFS (HDFS), a file system developed largely at Yahoo and modeled on the Google File System, provides quick storage and retrieval of data in the form of files and is used by many social media platforms.
R is an open-source statistical analysis language whose built-in statistical constructs are used for data analysis.
Java data mining APIs, .NET data mining APIs and Python libraries are used to mine data and understand trends in it.
Pig is another Apache Hadoop-based system; it provides a high-level language for analyzing large data sets.
Data Science: http://thedatascience.wordpress.com/
Big Data: http://thebigdatatrends.wordpress.com
Data Science Blog 2: http://thedatascientistview.blogspot.ie/
Retail generates a huge amount of data: product positioning on different store shelves, replenishment levels, reorder levels, merchandising and assortment planning. Most of this data is structured, since many of the systems are automated, but there is also a lot of unstructured data in forms, customer feedback, planning data, and the analysis of emails and other chat platforms.
Large retail warehouses need to plan the positioning of goods and containers in the aisles.
Trends from social media can be analyzed to find customer preferences for products and offers.
On retail innovation, read:
http://sandyclassic.wordpress.com/2013/10/26/retail-sector-innovations/
Retail uses many sensors to track items within the warehouse and inside the store. The huge volume of real-time data (video, text and other forms) generated every millisecond by sensors embedded across every store and warehouse is best analyzed in Hadoop or another Big Data-based system.
Finance is a game of numbers: huge amounts of data from books of accounts, P&L statements, balance sheets and so on accumulate across different businesses over time. Most books are structured, and hence so is the data, but Hadoop offers huge scalable clusters to quickly analyze structured data as well.
A lot of social media data about interest in a share or any other instrument does get reflected in the numbers.
Spreadsheets are a popular medium of analysis; they and other textual forms can be better analyzed if available on Hadoop-like clusters for a kind of semi-structured data analysis.

No sql databases

  • 1. The Term Bigdata stems from Characterisized by 5V: Volume: Large Volume of data Velocity: amount of data per seconds Variability: level of unintentional modification affecting data Quality throughout lifecycle of data. Value: Value derived from data. Variety: large range of data which is received from video , audio, text, image.
  • 2. Example sources for the 5 Vs. Volume: large volumes of video are received and maintained at video sites such as YouTube and Vimeo. Variety: a wide variety of data (text, audio, video, images) is received at sites such as Facebook, Twitter, and other social media platforms. Velocity: the speed at which data is received at sites such as Twitter and Facebook (1 billion people all feeding their data into one site).
  • 3. Batch processing vs. real-time processing. Batch jobs run at a particular time of day, such as nightly or morning jobs, scheduled for slack time when the server has less load. But people now want to see status in real time, for example in transportation, when a bus is arriving at a particular stand. In retail, as soon as customers update their status, retailers want to serve real-time advertisements. This is shaping the move towards Big Data.
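The contrast between a scheduled batch job and real-time processing can be illustrated with a minimal Python sketch. The bus-arrival events, the `batch_arrivals` function, and the `LiveArrivals` class are hypothetical names invented for illustration; they are not taken from the slides or from any particular product.

```python
from collections import Counter

# Hypothetical event log: (stop, bus) pairs in arrival order.
events = [("stop_12", "bus_7"), ("stop_12", "bus_9"), ("stop_40", "bus_7")]


def batch_arrivals(log):
    """Batch style: process the whole accumulated log in one scheduled run."""
    return Counter(stop for stop, _bus in log)


class LiveArrivals:
    """Real-time style: update the result as each event arrives."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, stop, bus):
        # The answer is current immediately after every single event.
        self.counts[stop] += 1
        return self.counts[stop]


live = LiveArrivals()
for stop, bus in events:
    live.on_event(stop, bus)

# Both styles reach the same totals; they differ in when the answer is available.
print(batch_arrivals(events) == live.counts)
```

The batch function only has an answer after its scheduled run, while the live object can be queried after every event, which is the property the real-time use cases above depend on.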
  • 4. Problems differentiated by the 5 Vs. Velocity: given the large volume of data received and the quick turnaround (low latency) needed to reflect data fed into Facebook, can it be managed by a regular DBMS? A DBMS maintains ACID properties and enforces many constraints, such as primary keys, foreign keys, and check constraints. When low latency is required, these constraints add processing time and storage overhead. So these sites have their own file-based, DBMS-like storage systems that do not have these constraints. All data is kept in files, and the IDs assigned to the files are indexed and regularly moved. Publicly known examples include Cassandra, developed at Facebook and later open-sourced, and BigTable from Google. Most of these databases are popularly categorized as NoSQL databases.
  • 5. Technology | Company | Open sourced on. Cassandra | DataStax | Apache Cassandra (used by Facebook, LinkedIn, Twitter). BigTable | Google | Google BigTable. Apache HBase | Apache | HBase (used by many companies; most popular). MongoDB | MongoDB Inc. | open source (written in C++). Couchbase | Couchbase Inc. | Apache (written in C++, Erlang, and C).
  • 6. Category | NoSQL databases. Column-oriented: Accumulo, Cassandra, HBase. Document: Clusterpoint, CouchDB, Couchbase, MarkLogic, MongoDB. Key-value: Dynamo, FoundationDB, MemcacheDB, Redis, Riak, FairCom c-treeACE. Graph: Allegro, Neo4j, OrientDB, Virtuoso, Stardog. - Column-oriented databases store values column by column, rather than row by row as in an RDBMS. - This leads to better compression of data, and hence less space is required to store the database. - Still higher compression can be achieved with probabilistic databases. - Similarly, document-oriented databases store and arrange data in the form of documents. - Key-value stores hold data as a collection of key-value pairs and allow add, insert, and delete operations on the pairs. - Graph databases: every element holds a direct pointer to its adjacent elements, so no index lookup is required.
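Why column-by-column storage tends to compress better can be seen in a small Python sketch. It pivots a few hypothetical rows into columns and applies run-length encoding, which is one simple compression scheme chosen here for illustration; the slides do not say which scheme any particular database uses, and the sample data and function names are assumptions.

```python
from itertools import groupby

# Hypothetical row-oriented records, as an RDBMS would store them.
rows = [
    {"id": 1, "country": "IE", "status": "active"},
    {"id": 2, "country": "IE", "status": "active"},
    {"id": 3, "country": "IE", "status": "inactive"},
    {"id": 4, "country": "US", "status": "inactive"},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in records] for name in records[0]}


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    return [[value, len(list(group))] for value, group in groupby(values)]


columns = to_columns(rows)
encoded = {name: run_length_encode(vals) for name, vals in columns.items()}

print(encoded["country"])  # [['IE', 3], ['US', 1]]
print(encoded["status"])   # [['active', 2], ['inactive', 2]]
```

Values from the same column sit next to each other and often repeat, so four stored values shrink to two pairs here. In a row layout the same values are interleaved with `id` and the other fields, so these runs never form.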
  • 7. Go through the link below: http://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing
  • 8. As we now know, Big Data addresses the problems of the 5 Vs, such as the huge (V)olume of storage required by video sites like YouTube. It is changing how we perceive, visualize, and analyze data: for example, HBase is used for data storage and Mahout is used to run analytics and find patterns. These databases hold a variety of data that requires different kinds of processing, which cannot be achieved with traditional RDBMS-based products. Example link below: http://sandyclassic.wordpress.com/2013/06/18/gini-coefficient-of-economics-and-roc-curve-machine-learning/
  • 9. The MapReduce algorithm, created by Google researchers, was the starting point of everything we see in Big Data. The mapper divides the work into multiple parallel tasks, then sorts and filters the results into queues, for example one queue for each name. The reducer component aggregates or summarizes the data from the multiple units.
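The map, sort-into-queues, and reduce steps described in slide 9 can be sketched in a few lines of single-process Python. This is only an illustration of the idea under simplifying assumptions: a real framework such as Hadoop runs the map and reduce tasks in parallel across machines, and the function names and sample records below are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (name, 1) pair for every name found in one input record."""
    return [(name.lower(), 1) for name in record.split()]


def shuffle(pairs):
    """Group emitted values by key: one queue (list) per name, sorted by name."""
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return dict(sorted(queues.items()))


def reduce_phase(key, values):
    """Summarize one queue into a single aggregated value."""
    return key, sum(values)


def map_reduce(records):
    # Each record could be mapped by a different worker; here they run in turn.
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = map_reduce(["alice bob alice", "bob carol", "alice"])
print(counts)  # {'alice': 3, 'bob': 2, 'carol': 1}
```

Because each record is mapped independently and each queue is reduced independently, both phases can be spread over many machines without changing the result.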
  • 10.
  • 11. Since the data is mostly unstructured, the best way to analyze it is with analytics. Here comes a new career called data scientist. Skill set required for a data scientist: mathematics (mostly statistics), computer science, and a domain such as sociology (for example, social media analysis).
  • 12.
  • 13.
  • 14. One application of Big Data has been to gather feedback about products from social media. The sample project report below shows how, and with what tools, social media can be analyzed. http://www.slideshare.net/SandeepSharma65/social-media-analysis-project
  • 15. Hadoop allows load to be distributed across many nodes in a cluster. Clustering can be done at the database, OS, or application/web server level, but here we are dealing with an OS-like layer, the distributed file system (DFS). The Hadoop Distributed File System (HDFS), developed largely at Yahoo and modeled on the Google File System that underlies BigTable, provides quick storage and retrieval of data in the form of files and is used by many social media platforms.
  • 16. R is an open-source statistical analysis language whose built-in statistical constructs are used for data analysis. The Java data mining API, .NET data mining APIs, and Python libraries are used to mine data and understand trends in it. Pig is another Apache Hadoop-based system; it provides a high-level language for analyzing large data sets.
  • 17. Data Science: http://thedatascience.wordpress.com/ Big Data: http://thebigdatatrends.wordpress.com Data Science Blog 2: http://thedatascientistview.blogspot.ie/
  • 18. Retail generates a huge amount of data: product positions on different shelves in the store, replenishment levels, reorder levels, merchandising, and assortment planning. Most of this data is structured, since many systems are automated, but there are also many forms, customer feedback, planning data, and analysis of emails and chat platforms. Large retail warehouses need to plan the positioning of containers in aisles. Retailers analyze trends from social media to find customer preferences for products and offers. For retail innovation, read: http://sandyclassic.wordpress.com/2013/10/26/retail-sector-innovations/
  • 19. Retail uses many sensors to track items within the warehouse and inside the store. The huge volume of real-time data (video, text, and other forms) generated every millisecond by sensors embedded across every store and warehouse is best analyzed in a Hadoop- or Big Data-based system.
  • 20. Finance is a game of numbers: huge amounts of data from books of accounts, P&L statements, balance sheets, and so on accumulate for different businesses over time. Most books are structured, and hence so is the data, but Hadoop offers large, scalable clusters that can quickly analyze structured data as well. Much social media data about interest in a share or other instrument does get reflected in the numbers. Spreadsheets are a popular medium of analysis, and they and other textual forms can be better analyzed if made available on Hadoop-like clusters for a kind of semi-structured data analysis.