Facts about Big Data: how it is stored, how it is processed, and which tools and techniques are used to handle it. All are covered in these slides.
32. • Data is starting to play an increasingly important role in
business and science.
• Storing, searching, sharing, analysing and visualising big
data has become a challenge.
• Data storage in particular is often disregarded as an
issue. Note that sometimes a MySQL database is not
enough.
• Hadoop offers an out-of-the-box distributed filesystem for
storing data files. However, the challenge appears when
someone needs DB capabilities, frequent updates or real
33. Problems Nowadays
Traditional relational databases can now reach their performance
limits.
Data keeps arriving at high velocity, in high volume, and in great
variety.
Common practices to increase performance fail after a while:
buying a faster server, getting more RAM, using materialised
views, fine-tuning queries...
Furthermore, schema changes such as “alter table” can become slow
and disruptive on very large tables. Backups and data availability
become an issue.
34. NoSQL
• The term is too broad and too new to define precisely.
• No schema
• No joins between tables
• No common scripting language (like SQL)
• Often no full ACID guarantees (atomicity, consistency, isolation, durability)
• On the other hand, you gain horizontal scalability and high performance.
Also, many NoSQL systems are Map/Reduce ready and/or integrate with
Hadoop.
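The Map/Reduce idea mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps on a word count. It does not use Hadoop or any NoSQL system, and the function names are made up for this example.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # Map every document, then group by word, then reduce each group.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big tools", "data tools data"])
print(counts)
```

In a real cluster the map and reduce calls would run on different machines. Here they run in one process so the data flow is easy to follow.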
37. MongoDB - Sharding
If a new shard is added, data is rebalanced automatically.
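To make the rebalancing idea concrete, here is a minimal sketch that places keys on shards by hash and redistributes them when a shard is added. It is a simplified stand-in and an assumption for illustration only. It is not MongoDB's balancer, and the helper names shard_for and distribute are hypothetical.

```python
import hashlib

def shard_for(key, shard_count):
    """Pick a shard from a stable hash of the key (illustrative placement rule)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

def distribute(keys, shard_count):
    """Place every key on its shard and return one list of keys per shard."""
    shards = [[] for _ in range(shard_count)]
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards

keys = [f"user{i}" for i in range(1000)]
before = distribute(keys, 2)   # two shards
after = distribute(keys, 3)    # a third shard is added

print([len(s) for s in before])
print([len(s) for s in after])
```

After the third shard is added, each shard holds roughly a third of the keys. A production system would move far less data than this naive re-hash does.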
38. Data Processing
Without data processing, organizations cannot make use of the
massive amounts of data that can help them gain a competitive
edge and give them insight into sales, marketing strategies and
consumer needs. It is imperative that companies large and small
understand the necessity of data processing.
Data processing occurs when data is collected and translated
into usable information.
39. The Six Stages of Data Processing
• Data Collection
• Data Preparation
• Data Input
• Processing
• Data Output/Interpretation
• Data Storage
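The six stages above can be sketched as a small pipeline, one function per stage. The sample records and function names are assumptions made up for this illustration. A real pipeline would read from actual sources and write to a real store.

```python
import statistics

def collect():
    # Stage 1, Data Collection: gather raw records (hypothetical readings as text).
    return ["12.5", " 7.0", "n/a", "20.5", ""]

def prepare(raw):
    # Stage 2, Data Preparation: drop blanks and entries that are not numbers.
    cleaned = []
    for item in raw:
        item = item.strip()
        try:
            float(item)
        except ValueError:
            continue
        cleaned.append(item)
    return cleaned

def to_input(cleaned):
    # Stage 3, Data Input: convert the cleaned text into machine-readable numbers.
    return [float(item) for item in cleaned]

def process(values):
    # Stage 4, Processing: compute summary figures from the input.
    return {"count": len(values), "total": sum(values), "mean": statistics.mean(values)}

def interpret(summary):
    # Stage 5, Data Output/Interpretation: present the result in readable form.
    return f"{summary['count']} records, total {summary['total']:.1f}, mean {summary['mean']:.2f}"

def store(summary, storage):
    # Stage 6, Data Storage: keep the result for later use (a list stands in for a database).
    storage.append(summary)
    return storage

summary = process(to_input(prepare(collect())))
report = interpret(summary)
archive = store(summary, [])
print(report)
```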
40. The Future of Data Processing
The future of data processing lies in the cloud. Cloud technology
builds on the convenience of current electronic data processing
methods and increases their speed and effectiveness. Faster,
higher-quality data means more data for each organization to
utilize and more valuable insights to extract.
41. Big data tools
1. Apache Hadoop 2. Microsoft HDInsight
3. NoSQL 4. Hive
5. Sqoop 6. PolyBase
7. Big data in Excel 8. Presto
42. Big Data Techniques
Quantitative Analysis
Quantitative analysis is a data analysis technique that focuses on quantifying
the patterns and correlations found in the data. Based on statistical practices,
this technique involves analyzing a large number of observations from a dataset.
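One common way to quantify a correlation is the Pearson coefficient. The sketch below computes it in plain Python. The two small lists are invented sample data, and a real quantitative analysis would use far more observations.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical observations: advertising spend and units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [2, 4, 5, 4, 5]

r = pearson(ad_spend, units_sold)
print(round(r, 3))
```

A value near 1 indicates that the two quantities tend to rise together, a value near -1 that one falls as the other rises, and a value near 0 that there is no linear relationship.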
43. Qualitative Analysis
Qualitative analysis is a data analysis technique that focuses
on describing various data qualities using words. It involves
analyzing a smaller sample in greater depth compared to
quantitative data analysis. These analysis results cannot be
generalized to an entire dataset due to the small sample size.
44. DATA MINING
Data mining, also known as data discovery, is a specialized form of
data analysis that targets large datasets. In relation to Big Data
analysis, data mining generally refers to automated, software-based
techniques that sift through massive datasets to identify patterns and
trends.
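As a small illustration of automated pattern finding, the sketch below counts which pairs of items occur together in at least a given number of transactions. The baskets and the name frequent_pairs are assumptions for this example. Real data mining tools use more scalable algorithms on much larger datasets.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical shopping baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["eggs", "milk"],
]

patterns = frequent_pairs(baskets, min_support=2)
print(patterns)
```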
45. STATISTICAL ANALYSIS
Statistical analysis uses statistical methods based on mathematical formulas as a
means for analyzing data. Statistical analysis is most often quantitative, but can also be
qualitative. This type of analysis is commonly used to describe datasets via
summarization, such as providing the mean, median, or mode of the values in
the dataset.
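The summary measures named above are available in Python's standard statistics module. The list of order sizes below is invented sample data.

```python
import statistics

# Hypothetical dataset: number of items per order.
order_sizes = [3, 5, 5, 8, 13, 5, 8]

summary = {
    "mean": statistics.mean(order_sizes),      # arithmetic average
    "median": statistics.median(order_sizes),  # middle value when sorted
    "mode": statistics.mode(order_sizes),      # most frequent value
}
print(summary)
```

The mean is pulled upward by the single large order, while the median and mode both stay at the typical order size. This is why summaries often report more than one measure.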
46. MACHINE LEARNING
Humans are good at spotting patterns and relationships within data.
Unfortunately, we cannot process large amounts of data very quickly.
Machines, on the other hand, are very adept at processing large amounts of
data quickly, but only if they know how.
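One simple way a machine can "know how" is to learn a rule from examples. The sketch below fits a straight line to a few invented data points with ordinary least squares and uses it to predict a new value. It is a minimal illustration, not a complete machine learning workflow.

```python
def fit_line(xs, ys):
    """Learn slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training examples: machine hours and units produced.
hours = [1, 2, 3, 4]
units = [3, 5, 7, 9]

slope, intercept = fit_line(hours, units)

# Use the learned relationship to predict an unseen case.
prediction = slope * 5 + intercept
print(slope, intercept, prediction)
```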
47. SEMANTIC ANALYSIS
A fragment of text or speech data can carry different meanings in different
contexts, whereas a complete sentence may retain its meaning, even if
structured in different ways. In order for the machines to extract valuable
information, text and speech data need to be understood by the machines
in the same way as humans do. Semantic analysis represents practices for
extracting meaningful information from textual and speech data.
48. VISUAL ANALYSIS
Visual analysis is a form of data analysis that involves the
graphic representation of data to enable or enhance its visual
perception. Based on the premise that humans can
understand and draw conclusions from graphics more quickly
than from text, visual analysis acts as a discovery tool in the
field of Big Data.