Data Mining With Big Data
Guide: Prof. Prashant G. Ahire
Presented by:
Miss Rupa Solapure
Roll no. 259
Agenda
Problem Definition
Objectives
Literature Survey
Architecture/Big Data mining algorithm
Existing System/Mathematical model
Advantages
Disadvantages/Limitations
Characteristics of Big Data
Big Data and its challenges
Big Data mining Tools
Applications of Big Data
References
Problem Definition:
Big Data consists of large-volume, complex, growing data sets with
numerous independent sources. With the fast development of
networking, data storage, and data gathering capacity, Big Data are
now quickly increasing in all science and engineering domains, as well as
in the animal, genetic, and biomedical sciences. This paper elaborates a HACE
theorem that states the characteristics of the Big Data revolution, and
proposes a Big Data processing model from the data mining view.
Objective:
Mining Big Data requires carefully designed algorithms to analyze model correlations
between distributed sites and to fuse decisions from multiple sources into the best
possible model. Developing a safe and sound information sharing
protocol is a major challenge.
To support Big Data mining, high-performance computing platforms are
required, which impose systematic designs to unleash the full power of
Big Data. Big Data is an emerging trend, and the need for Big Data mining is rising in
all science and engineering domains.
Literature Survey
Title/Year: “Data Mining With Big Data”, Jan 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a Big Data processing model from the data mining perspective.
Author: Xindong Wu, Fellow, IEEE; Xingquan Zhu, Senior Member, IEEE; Gong-Qing Wu; and Wei Ding

Title/Year: “The Survey of Data Mining Applications And Feature Scope”, June 2012
Keywords: Data mining tasks, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications.
Concept/Abstract: This paper presents a number of data mining applications and also focuses on the scope of data mining, which will be helpful in further research.
Author: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi

Title/Year: “Review on Data Mining with Big Data”, Dec 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations.
Author: Savita Suryavanshi, Prof. Bharati Kale

Title/Year: “Survey on Big Data Mining Platforms, Algorithms and Challenges”, Sep 2014
Keywords: Big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining.
Concept/Abstract: This paper reviews various big data mining platforms, algorithms, and challenges.
Author: Sherin A, Dr. S. Uma, Saranya K, Saranya Vani M
Architecture:
Fig.: Big data Memory evolution
Data Mining Algorithm
• Decision tree induction classification algorithms
• Evolutionary based classification algorithms
• Partitioning based clustering algorithms
• Hierarchical based clustering algorithms
• Model based clustering algorithms
Existing System:
Big Data applications have emerged in which data collection has grown tremendously
and is beyond the ability of commonly used software tools to capture,
manage, and process within a “tolerable elapsed time.”
The most fundamental challenge for Big Data applications is to explore the large
volumes of data and extract useful information or knowledge for future actions.
In many situations, the knowledge extraction process has to be very efficient and
close to real time because storing all observed data is nearly infeasible.
The unprecedented data volumes require an effective data analysis and prediction
platform to achieve fast response and real-time classification for such Big Data.
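As a rough illustration of processing data in one pass without storing every observation, the sketch below uses reservoir sampling. This technique is not named in the slides; it is swapped in here only as one common way to keep a bounded, uniform sample from a stream of unknown length. The function name, the seed parameter, and the sample readings are hypothetical.

```python
import random


def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of at most k items from a stream.

    Memory use is bounded by k no matter how long the stream is,
    so the full data set never has to be stored.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Hypothetical sensor readings produced lazily, never held in memory at once.
readings = (x * 0.5 for x in range(100_000))
kept = reservoir_sample(readings, k=5)
print(kept)
```

The sample can then feed any downstream analysis; a real platform would more likely combine sampling with incremental models.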
At the model level, each site mines its local data to produce local patterns.
By sharing these local patterns with the other local sites, a single
global pattern can be produced.
At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how closely
the data sources are correlated with each other, and how to form accurate decisions
based on models built from autonomous sources.
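The idea of combining local patterns into one global pattern can be sketched as follows. This is a minimal illustration, assuming that a "pattern" is an itemset with its occurrence count and that each site shares all of its counts, not only the frequent ones; the function names, the two example sites, and the 0.5 support threshold are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_local_patterns(transactions, max_size=2):
    """Count itemsets (up to max_size items) in one site's local data."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts, len(transactions)


def merge_patterns(local_results, min_support):
    """Fuse per-site counts into global patterns with their support."""
    total_counts = Counter()
    total_transactions = 0
    for counts, n_transactions in local_results:
        total_counts.update(counts)  # adds counts itemset by itemset
        total_transactions += n_transactions
    return {
        itemset: count / total_transactions
        for itemset, count in total_counts.items()
        if count / total_transactions >= min_support
    }


# Two hypothetical sites; only counts leave each site, never raw records.
site_a = [["milk", "bread"], ["milk", "eggs"], ["bread"]]
site_b = [["milk", "bread"], ["milk", "bread", "eggs"]]

global_patterns = merge_patterns(
    [mine_local_patterns(site_a), mine_local_patterns(site_b)],
    min_support=0.5,
)
print(global_patterns)
```

Sharing counts instead of raw transactions keeps transmission small, although itemset counts can still leak information, so a real system would also need the privacy protections the slides mention.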
Big Data
Big Data is a broad term for any collection of data sets so large and complex
that it becomes difficult to process them using conventional data processing applications.
There are two types of Big Data: structured and unstructured.
Structured data
Structured data are numbers and words that can be easily categorized and analyzed.
These data are generated by things like network sensors embedded in electronic
devices, smart phones, and global positioning system (GPS) devices. Structured data
also include things like sales figures, account balances, and transaction data.
Unstructured data
Unstructured data include more complex information, such as customer reviews
from commercial websites, photos and other multimedia, and comments on social
networking sites. These data cannot easily be separated into categories or analyzed
numerically.
Big Data Characteristics (HACE Theorem)
Figure: The blind men and the giant elephant: the limited view
of each blind man leads to a biased conclusion.
The HACE theorem suggests that the key characteristics of
Big Data are:
A. Huge volume with heterogeneous and diverse data sources
B. Autonomous sources with distributed and decentralized control
C. Complex and evolving associations
Applications of Data Mining
Marketing
• Analysis of consumer behaviour
• Advertising campaigns
• Targeted mailings
• Segmentation of customers, stores, or products
Finance
• Creditworthiness of clients
• Performance analysis of financial investments
• Fraud detection
Manufacturing
• Optimization of resources
• Optimization of manufacturing processes
• Product design based on customer requirements
Health Care
• Discovering patterns in X-ray images
• Analyzing side effects of drugs
• Effectiveness of treatments
Big Data Mining Algorithm
Big Data applications gather information from many sources.
To mine the data directly, all distributed data would have to be gathered at a
centralized site, but this is prohibitive because of high data transmission costs
and privacy concerns.
Mining is therefore carried out at several levels, so that correlations and
patterns can be discovered from a combined variety of sources.
Global data mining is done as a two-step process:
• Model level
• Knowledge level
At the data level, each local site uses its local data to calculate data statistics
and shares this information with the other sites in order to obtain the global
data distribution.
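The data-level exchange can be sketched as below: each site shares only a few summary numbers, and the combined numbers give the global mean and variance. This is a minimal sketch, assuming one numeric attribute and the simple sum-of-squares formula; the LocalStats class, the function names, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Summary statistics that a site shares instead of its raw data."""
    n: int           # number of local records
    total: float     # sum of the values
    total_sq: float  # sum of the squared values


def summarize(values):
    """Compute one site's local statistics."""
    return LocalStats(len(values), sum(values), sum(v * v for v in values))


def combine(all_stats):
    """Merge local statistics into the global count, mean, and variance."""
    n = sum(s.n for s in all_stats)
    total = sum(s.total for s in all_stats)
    total_sq = sum(s.total_sq for s in all_stats)
    mean = total / n
    # Population variance: E[x^2] - (E[x])^2.
    variance = total_sq / n - mean ** 2
    return n, mean, variance


# Two hypothetical sites holding different parts of the same attribute.
site_1 = summarize([1.0, 2.0, 3.0])
site_2 = summarize([4.0, 5.0])

n, mean, variance = combine([site_1, site_2])
print(n, mean, variance)
```

The result matches what a centralized computation over all five values would give, yet only three numbers per site are transmitted. For very large or badly scaled values, a numerically stabler merge formula would be preferable to the plain sum of squares used here.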
Data Mining Challenges With Big Data
Fig. a conceptual view of the Big Data processing framework
DISADVANTAGES OF EXISTING SYSTEM
To explore Big Data, we have analysed several challenges at the
data, model, and system levels.
The challenges at Tier I focus on data accessing and arithmetic
computing procedures. Because Big Data are often stored at
different locations and data volumes may continuously grow, an
effective computing platform will have to take distributed
large-scale data storage into consideration for computing.
PROPOSED SYSTEM
We propose a HACE theorem to model Big Data characteristics. The
characteristics of HACE make discovering useful knowledge from
Big Data an extreme challenge.
ADVANTAGES OF PROPOSED SYSTEM
Provides the most relevant and accurate social sensing feedback to
better understand our society in real time.
Characteristics of Big Data
Fig. Five Vs of BIG DATA
Volume: the quantity of data
Variety: the categories of data
Velocity: the speed of generation of data or the speed of processing the data
Variability: inconsistency in the data
Complexity: managing the data
BIG Data Mining Tools
Hadoop
Apache S4
Storm
Apache Mahout
MOA
Fig.: Big Data processing
Conclusion:
Because the amount of data is increasing in fields such as genomics,
meteorology, biology, and environmental research, it is becoming difficult to handle
the data, to find associations and patterns, and to analyze the large data sets.
As organizations collect more data at this scale, formalizing the process of
Big Data analysis will become paramount. The paper describes the
algorithms used to handle such large data sets and gives an
overview of the architecture and algorithms used with large data sets.
References
• McKinsey Global Institute, Big Data: The next frontier for
innovation, competition, and productivity, May 2011
• Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013,
Data Mining with Big Data
• Ahmed and Karypis 2012, Rezwan Ahmed, George Karypis,
Algorithms for mining the evolution of conserved relational states in
dynamic networks
• IEEE, Data Mining with Big Data, January 2014
• Oracle, June 2013, Unstructured Data Management with Oracle
Database 12c