Submitted By
Supervised By
Problem Definition
Purpose
What is ….
Challenges with data
Big data algorithms
 How To Produce The Big Data
Big Data Characteristics
Applications of Data Mining
FIELD OF BIG DATA
Variety (Complexity)
Real-time/Fast Data
Real-Time Analytics/Decision Requirement
A Single View to the Customer
What’s driving Big Data
Benefits
Big Data consists of large, complex, growing data sets
drawn from numerous independent sources. With the fast
development of networking, data storage, and data-gathering
capacity, Big Data is now growing quickly in all science and
engineering domains, as well as the animal, genetic, and
biomedical sciences. This paper elaborates a HACE
theorem that states the characteristics of the Big
Data revolution and proposes a Big Data processing
model from the data mining view.
This requires carefully designed algorithms to
analyze model correlations between distributed sites
and to fuse decisions from multiple sources into the
best possible model of the Big Data. Developing a safe
and sound information-sharing protocol is a major
challenge. Supporting Big Data mining requires
high-performance computing platforms, which in turn
call for systematic designs to unleash the full
power of the Big Data. Big Data is an emerging trend,
and the need for Big Data mining is rising in all
science and engineering domains.
What is …… ?
Data Mining
• The computational process of discovering patterns in large data sets
Big Data
• Data characterized by three attributes: volume, variety, and velocity
• The term for a collection of data sets so large and complex that it becomes
difficult to process
• Data, both structured and unstructured, that is growing exponentially
Data: any set of characters that has been gathered and translated
for some purpose, usually analysis. It can be any character, including text and
numbers, pictures, sound, or video. If data is not put into context, it is of
no use to a human or a computer.
How much data exists?
• 2.5 quintillion bytes of data are created
EVERY DAY
• IBM: 90 percent of the data in the world today
was produced within the past two years
• Forms of data?
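To put the daily figure in more familiar units, here is a small back-of-the-envelope conversion. It assumes decimal (SI) units and a constant creation rate over the day, neither of which the slide states.

```python
# Back-of-the-envelope conversion of the slide's daily data volume.
# Assumptions (not from the slide): SI units and a constant rate over the day.
BYTES_PER_DAY = 2.5e18           # 2.5 quintillion bytes, the figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

bytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY
exabytes_per_day = BYTES_PER_DAY / 1e18          # 1 EB = 10**18 bytes
terabytes_per_second = bytes_per_second / 1e12   # 1 TB = 10**12 bytes

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

Under those assumptions the figure works out to 2.5 exabytes per day, or roughly 28.9 terabytes every second.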
Data Mining Challenges with Big Data
• Big Data Mining Platform
• Big Data Semantics and Application Knowledge
I. Information Sharing and Data Privacy
II. Domain and Application Knowledge
• Big Data Mining Algorithm
I. Local Learning and Model Fusion for Multiple
Information Sources
II. Mining from Sparse, Uncertain, and Incomplete Data
III. Mining Complex and Dynamic Data
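"Local Learning and Model Fusion for Multiple Information Sources" can be pictured with a toy sketch: each site trains on its own data, and only the local decisions are combined. The threshold learner, the majority vote, and the three sites below are illustrative assumptions, not the method from the slides.

```python
from collections import Counter


def train_local_model(samples):
    """Fit a one-feature threshold classifier on one site's data.

    samples: list of (value, label) pairs with labels 0 or 1.
    The threshold is the midpoint between the two class means,
    a deliberately simple stand-in for a real learner.
    """
    zeros = [x for x, y in samples if y == 0]
    ones = [x for x, y in samples if y == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def fuse_predictions(models, x):
    """Combine the local decisions by majority vote (one possible fusion rule)."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three hypothetical sites; the raw samples never leave their site.
site_data = [
    [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)],    # local threshold 5.0
    [(0.5, 0), (3.0, 0), (7.0, 1), (10.0, 1)],   # local threshold 5.125
    [(2.0, 0), (4.0, 0), (9.0, 1), (11.0, 1)],   # local threshold 6.5
]
local_models = [train_local_model(data) for data in site_data]

# Two sites vote 1 and one votes 0, so the fused decision is 1.
print(fuse_predictions(local_models, 6.2))
```

Only predictions cross site boundaries here, which is one way to read the link between model fusion and the information sharing and privacy challenge listed above.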
Data Mining Algorithm
• Decision tree induction classification algorithms
• Evolutionary based classification algorithms
• Partitioning based clustering algorithms
• Hierarchical based clustering algorithms
• Model based clustering algorithms
How To Produce The Big Data
Big Data Types
• Enterprise Data
• Transactions
• Public Data
• Social Media
• Sensor Data
Big Data Characteristics
• Data has grown tremendously.
• Big Data starts with large-volume, heterogeneous, autonomous sources with a distributed and decentralized system
Applications of Data Mining
• Marketing
o Analysis of consumer behavior
o Advertising campaigns
o Targeted mailings
• Finance
o Creditworthiness of clients
o Performance analysis of financial investments
• Manufacturing
o Optimization of resources
o Optimization of manufacturing processes
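Analysis of consumer behavior is a typical use of the partitioning based clustering algorithms listed earlier. The sketch below runs a minimal one-dimensional k-means (Lloyd's algorithm) on made-up monthly spend figures. The data, the starting centroids, and the function name are assumptions for illustration only.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Minimal k-means (Lloyd's algorithm) on one-dimensional points.

    Each pass assigns every point to its nearest centroid, then moves each
    centroid to the mean of the points assigned to it.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer, in arbitrary currency units.
monthly_spend = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, segments = kmeans_1d(monthly_spend, centroids=[1.0, 2.0])
print(centroids)
print(segments)
```

With these inputs the customers separate into a low-spend segment centered on 2.0 and a high-spend segment centered on 11.0. Real data would need more features and a considered choice of the number of clusters.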
Variety (Complexity)
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
o Social Network, Semantic Web (RDF), …
• Streaming Data
o You can only scan the data once
• A single application can be generating/collecting many types of data
• Big Public Data (online, weather, finance, etc.)
To extract knowledge, all these types of data need to be linked together.
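The streaming constraint above ("you can only scan the data once") rules out algorithms that need a second pass. As an illustration, Welford's online algorithm keeps a running mean and variance while seeing each value exactly once. The sensor readings and names below are made up for the example.

```python
class RunningStats:
    """Single-pass mean and population variance (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0


def sensor_stream():
    """Hypothetical stream: each reading is yielded once and never stored."""
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield reading


stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)

print(stats.count, stats.mean, stats.variance)
```

For these eight readings the result should be a mean of 5 and a population variance of 4, up to floating-point rounding, with memory use that stays constant however long the stream runs.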
Real-time/Fast Data
• Progress and innovation are no longer hindered by the ability to collect
data
• But by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely manner and in a scalable
fashion
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Mobile devices (tracking all objects all the time)
• Sensor technology and networks (measuring all kinds of data)
Real-Time Analytics/Decision Requirement
Customer
Influence Behavior
• Product recommendations that are relevant and compelling
• Friend invitations to join a game or activity that expands business
• Preventing fraud as it is occurring, and preventing more proactively
• Learning why customers switch to competitors and their offers, in time to counter
• Improving the marketing effectiveness of a promotion while it is still in play
A Single View to the Customer
Customer
• Social Media
• Gaming
• Entertain
• Banking
• Finance
• Our Known History
• Purchase
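A single view of the customer amounts to linking records from each source under one customer key. The sketch below assumes every source already carries a shared `customer_id`, which real systems often have to establish first. The field names and sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-source records, each carrying a shared customer_id.
social_media = [{"customer_id": 1, "handle": "@ana"}]
banking = [
    {"customer_id": 1, "balance": 1200.0},
    {"customer_id": 2, "balance": 50.0},
]
purchases = [
    {"customer_id": 1, "item": "laptop"},
    {"customer_id": 1, "item": "mouse"},
]


def build_single_view(**sources):
    """Group every source's records under the customer they belong to."""
    view = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            fields = {k: v for k, v in record.items() if k != "customer_id"}
            view[record["customer_id"]][source_name].append(fields)
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {cid: dict(by_source) for cid, by_source in view.items()}


single_view = build_single_view(
    social_media=social_media, banking=banking, purchases=purchases
)
print(single_view[1])
```

Customer 1 ends up with social media, banking, and purchase records side by side, while customer 2 has only a banking entry, which makes gaps in coverage easy to spot.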
5 Vs of Big Data
Volume
• Data quantity
Velocity
• Data Speed
Variety
• Data Types
Veracity
• Authenticity
Value
• Statistical
• Events
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time
Benefits
• Cost & management
o Economies of scale, “out-sourced” resource management
• Reduced time to deployment
o Ease of assembly, works “out of the box”
• Scaling
o On-demand provisioning, co-locate data and compute
• Reliability
o Massive, redundant, shared resources
• Sustainability
o Hardware not owned
Data Mining With Big Data
Editor's notes

  1. According to IBM