2. Meaning of Big Data
“Big data is a high volume, high velocity, high variety
information asset that demands cost-effective and
innovative forms of information processing for
enhanced business insight and decision making”.
Dr. Amitabh Mishra 2
3. • Big data involves heterogeneous, voluminous data that could
be:
• Structured (as in RDBMS) or
• Unstructured (as in blogs, tweets, Facebook comments, emails)
• The content may come in different varieties, such as:
– Audio
– Picture
– Large text
4. • Handling Big Data needs newer and innovative technologies for
capturing, storing, searching, integrating, analysing and
presenting newly found insights.
5. Benefits of Big Data Analytics
• Here is a list of advantages that can be achieved by using Big Data
analytics:
– Understanding and Targeting Customers
– Understanding and Optimizing Business Processes
– Re-develop your products
– Personal Quantification and Performance Optimization
– Detecting Fraud and Improving Security
– Perform Risk Analysis
– Customize your website in real time
– Optimizing Machine and Device Performance
6. Characteristics of Big Data
Characteristics
Volume
Variety
Velocity
Variability
7. Characteristics: Volume
• Volume is obviously the most common trait of Big Data.
• Many factors contributed to the exponential increase in data volume, such
as:
– Transaction-based data stored through the years,
– Text data constantly streaming in from social media,
– Increasing amounts of sensor data being collected,
– Automatically generated GPS data, and so on.
• With the staggering increase in data volume, even the naming of the next Big Data echelon has been a challenge. The highest mass of data, which used
to be measured in petabytes (PB), is now measured in zettabytes (ZB); one zettabyte is a billion terabytes (TB).
(1 Terabyte can hold 200,000 songs or 17,000 hours of music / 500 hours of movies)
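The unit arithmetic can be checked with a short sketch. It assumes decimal (SI) units, where each unit is 1000 times the previous one; the names `UNITS`, `bytes_in` and `convert` are illustrative only, and the songs-per-terabyte figure is simply the one quoted on this slide, not an independent measurement.

```python
from fractions import Fraction

# Decimal (SI) byte units; each one is 1000 times the previous one (assumption).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit`."""
    return 1000 ** UNITS.index(unit)


def convert(value, from_unit: str, to_unit: str) -> Fraction:
    """Convert `value` from one unit to another using exact arithmetic."""
    return Fraction(value) * bytes_in(from_unit) / bytes_in(to_unit)


# Figure quoted on the slide: 1 TB holds about 200,000 songs.
SONGS_PER_TB = 200_000

tb_per_zb = convert(1, "ZB", "TB")
print(f"1 ZB = {int(tb_per_zb):,} TB")
print(f"1 PB = {int(convert(1, 'PB', 'TB')):,} TB")
print(f"Songs in 1 ZB at the slide's rate: {int(tb_per_zb * SONGS_PER_TB):,}")
```

With binary units (factors of 1024) the ratios would differ slightly, but the order of magnitude stays the same.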
8. Characteristics: Variety
• Data today comes in all types of formats, ranging from traditional
databases to:
– Hierarchical data stores created by end users and OLAP (Online
Analytical Processing) systems
– Text documents, e-mail, XML, meter-collected, and sensor-captured data
– Video, audio, and stock ticker data
• By some estimates, 80 to 85 percent of all organizations’ data is in some
sort of unstructured or semi-structured format (a format that is not
suitable for traditional database schemas).
9. Characteristics: Velocity
• Velocity means the speed of something in a given direction.
• According to Gartner, velocity means both
– How fast data is being produced and
– How fast the data must be processed (i.e., captured, stored, and
analysed) to meet the need or demand.
• Velocity is perhaps the most overlooked characteristic of
Big Data. Reacting quickly enough to deal with velocity is a
challenge to most organizations.
10. Characteristics: Variability
• In addition to the increasing velocities and
varieties of data, data flows can be highly
inconsistent with periodic peaks.
• Daily, seasonal, and event-triggered peak data
loads can be challenging to manage,
especially with social media involved.
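Why peaks are hard to manage can be illustrated with a minimal sketch. It assumes a system with a fixed processing capacity per time interval and a made-up series of arrival volumes; the function name `backlog_over_time` and all numbers are hypothetical and do not come from the slides.

```python
def backlog_over_time(arrivals, capacity):
    """Track unprocessed data per interval for a fixed processing capacity.

    arrivals: amount of data arriving in each interval (any consistent unit)
    capacity: amount the system can process per interval (assumed constant)
    """
    backlog = 0
    history = []
    for arrived in arrivals:
        # Anything that cannot be processed in this interval carries over.
        backlog = max(0, backlog + arrived - capacity)
        history.append(backlog)
    return history


# Hypothetical load: quiet intervals, then an event-triggered peak.
arrivals = [50, 60, 200, 180, 40, 30]
capacity = 100

print(backlog_over_time(arrivals, capacity))
# prints [0, 0, 100, 180, 120, 50]
```

Although the average arrival rate here (about 93 units per interval) is below the capacity, the peak still leaves a backlog that takes several quiet intervals to clear.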