2. Data Centre built by the US National Security
Agency in Bluffdale, Utah, reportedly capable of
storing a yottabyte of data (that is, one thousand
trillion gigabytes)
3. The size of the digital universe, in
terms of the amount of data being
generated, is forecast by IDC to grow
to a staggering 44 zettabytes by
2020.
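The storage figures above can be sanity-checked with a short calculation. This is a minimal sketch that assumes decimal (SI) prefixes, where a gigabyte is 10^9 bytes, a zettabyte is 10^21 bytes and a yottabyte is 10^24 bytes; the variable names are illustrative only.

```python
# Decimal (SI) byte units are assumed; binary units (GiB, ZiB, YiB) differ slightly.
GIGABYTE = 10**9
ZETTABYTE = 10**21
YOTTABYTE = 10**24

# One yottabyte expressed in gigabytes.
gigabytes_per_yottabyte = YOTTABYTE // GIGABYTE
thousand_trillion = 1_000 * 10**12

# The 44-zettabyte forecast expressed in gigabytes.
forecast_in_gigabytes = 44 * ZETTABYTE // GIGABYTE

print(gigabytes_per_yottabyte == thousand_trillion)
print(f"{forecast_in_gigabytes:,} GB")
```

Under these assumptions a yottabyte is 1,000 zettabytes, so the 44-zettabyte forecast is still well below a single yottabyte.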
5. Data in the organization
• Transaction Records
• Documents
Types of Data
• Structured
• Semi-structured
• Unstructured
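To make the three types concrete, here is a small sketch. All field names and values are hypothetical and chosen only for illustration.

```python
import json

# Structured: fixed fields with known types, like a row in a table.
structured = {"booking_id": 101, "nights": 2, "rate": 120.0}

# Semi-structured: self-describing but flexible, e.g. JSON with
# optional and nested fields that can vary from record to record.
semi_structured = json.loads(
    '{"guest": "A. Rao", "preferences": {"floor": "high"}, "tags": ["vip"]}'
)

# Unstructured: free text (or images, video) with no predefined fields.
unstructured = "The room was clean but check-in took far too long."

print(sorted(structured))                       # field names are known in advance
print(semi_structured["preferences"]["floor"])  # nested access by key
print(len(unstructured.split()))                # only generic processing is possible
```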
6. Difficulties in Managing Data
The amount of data grows exponentially.
Data are scattered and collected by many
individuals using various methods and devices.
Data come from many sources including
internal sources, personal sources and external
sources.
Data security, quality and integrity are
critical.
7. Different approaches to the management
of data:
• Conventional file system (use of flat
files)
• DBMS
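The contrast between the two approaches can be sketched with Python's standard library. This is an illustration under stated assumptions, not a recommendation of specific tools: the `booking` table, its columns and the sample rows are hypothetical, an in-memory CSV stands in for the flat file, and SQLite stands in for the DBMS.

```python
import csv
import io
import sqlite3

# Hypothetical records: (id, guest, nights).
rows = [(101, "Rao", 2), (102, "Lee", 5), (103, "Khan", 1)]

# Flat-file approach: the program itself parses the file, converts
# the types and scans every record to answer a question.
flat_file = io.StringIO()
csv.writer(flat_file).writerows(rows)
flat_file.seek(0)
long_stays_file = [r[1] for r in csv.reader(flat_file) if int(r[2]) >= 2]

# DBMS approach: the structure is declared once, and the question
# is expressed as a query that the DBMS evaluates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking (id INTEGER PRIMARY KEY, guest TEXT, nights INTEGER)"
)
conn.executemany("INSERT INTO booking VALUES (?, ?, ?)", rows)
long_stays_db = [
    guest
    for (guest,) in conn.execute(
        "SELECT guest FROM booking WHERE nights >= 2 ORDER BY id"
    )
]
conn.close()

print(long_stays_file)
print(long_stays_db)
```

Both approaches return the same guests here; the difference is where the knowledge of the data's structure lives (in every program that reads the file, or once in the database schema).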
8. Key Definitions
Database:
Organized collection of logically related data
Data:
Stored representations of meaningful objects and
events
Structured:
Numbers, text, dates
Unstructured:
Images, video, documents
Information:
Data processed to increase knowledge in the person
using the data
Metadata:
Data that describes the properties and context of user
data
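The difference between data, metadata and information can be illustrated with a few lines of code. The prices and the metadata fields below are made up for the example.

```python
# Data: stored facts (hypothetical room rates).
prices = [120.0, 95.0, 150.0, 110.0]

# Metadata: describes the properties and context of the data.
metadata = {
    "field": "room_rate",
    "type": "float",
    "unit": "USD per night",
    "source": "booking system",
}

# Information: data processed so that it tells the user something.
average = sum(prices) / len(prices)
information = f"Average {metadata['field']}: {average:.2f} {metadata['unit']}"

print(information)
```

Without the metadata, the list of numbers cannot be interpreted; without the processing step, it does not yet answer any question.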
10. Misconception about big data:
"If it is data and it is big, it is big data."
In fact, what is big today may not be big tomorrow.
Big data has attributes that challenge the
constraints of the current system or business needs.
11. 4 Vs of big data
Volume
Velocity
Variety
Value
12. Volume
Machine-generated data is produced in much
larger quantities than traditional data.
For example, a single jet engine can
generate 10 TB of data in 30 minutes
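As a rough worked calculation, the jet engine figure can be converted into a sustained data rate. This sketch assumes decimal units (1 TB = 10^12 bytes) and a constant rate over the 30 minutes, neither of which is stated in the original figure.

```python
TERABYTE = 10**12
GIGABYTE = 10**9

bytes_generated = 10 * TERABYTE  # 10 TB, as quoted above
seconds = 30 * 60                # 30 minutes

# Average rate, assuming the data is produced evenly over the period.
rate_gb_per_s = bytes_generated / seconds / GIGABYTE
tb_per_hour = bytes_generated / TERABYTE * (3600 / seconds)

print(f"{rate_gb_per_s:.2f} GB/s, {tb_per_hour:.0f} TB per hour")
```

That is roughly 5.56 GB every second, or 20 TB per hour, from a single engine.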
13. Velocity
Social media data streams, while not as
massive as machine-generated data, produce
a large influx of opinions and relationships
valuable to customer relationship
management. Even at 140 characters per
tweet, the high velocity (or frequency) of
Twitter data ensures large volumes.
14. Variety
Traditional data formats tend to be
relatively well described and change slowly.
In contrast, non-traditional data formats
exhibit a dazzling rate of change.
15. Value
The economic value of different data varies
significantly. Typically there is good
information hidden amongst a larger body of
non-traditional data. The challenge is
identifying what is valuable and then
transforming and extracting that data for
analysis.
16. Data is key to enterprise decision
support, business process
optimization, next best action, and
other initiatives that are vital to
success and growth.
27. Preparing Data for Analysis using
Spreadsheet Applications (e.g. MS-Excel)
Sorting
Filtering
Handling missing data
Eliminating duplicates
Lookups
Pivoting
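The same preparation steps can be sketched outside a spreadsheet. The example below uses plain Python on a tiny, made-up survey table; the column names, the price list and the choice to fill missing ratings with the mean are all assumptions made for illustration, not the only way to do each step.

```python
from collections import defaultdict

# Hypothetical survey rows; None marks a missing rating.
rows = [
    {"guest": "Rao", "room": "deluxe", "rating": 4},
    {"guest": "Lee", "room": "standard", "rating": None},
    {"guest": "Khan", "room": "standard", "rating": 3},
    {"guest": "Rao", "room": "deluxe", "rating": 4},  # duplicate entry
    {"guest": "Diaz", "room": "suite", "rating": 5},
]
room_price = {"standard": 90, "deluxe": 120, "suite": 200}  # lookup table

# Eliminating duplicates: keep the first occurrence of each identical row.
seen, unique = set(), []
for row in rows:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        unique.append(row)

# Missing data: here, fill gaps with the mean of the known ratings.
known = [r["rating"] for r in unique if r["rating"] is not None]
mean_rating = sum(known) / len(known)
cleaned = [
    {**r, "rating": mean_rating if r["rating"] is None else r["rating"]}
    for r in unique
]

# Lookup: attach each room's price from the lookup table.
cleaned = [{**r, "price": room_price[r["room"]]} for r in cleaned]

# Sorting (highest rating first) and filtering (ratings of 4 or more).
by_rating = sorted(cleaned, key=lambda r: r["rating"], reverse=True)
satisfied = [r["guest"] for r in by_rating if r["rating"] >= 4]

# Pivoting: average rating per room type.
groups = defaultdict(list)
for r in cleaned:
    groups[r["room"]].append(r["rating"])
pivot = {room: sum(v) / len(v) for room, v in groups.items()}

print(satisfied)
print(pivot)
```

In MS-Excel these steps correspond roughly to Remove Duplicates, filling blank cells, VLOOKUP, Sort, Filter and a PivotTable.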
28. Class Activity
Identify the data requirements for a hotel that is
interested in studying all aspects of its business. The
management of the hotel intends to use this study to make
appropriate decisions for improving its services.
1. Create a survey form using Google Drive
2. Export the data to MS-Excel
3. Perform basic analysis with the help of a Pivot Table
(quantitative data)
4. Identify ways in which qualitative data can be
analyzed.
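One possible starting point for step 4 is a simple word-frequency count over free-text survey answers. This is only a sketch of one technique among many (manual coding of themes is another); the comments and the stop-word list are invented for the example.

```python
import re
from collections import Counter

# Hypothetical free-text answers from the survey.
comments = [
    "Friendly staff, but slow check-in.",
    "Check-in was slow and the room was noisy.",
    "Great breakfast and friendly staff!",
]
stop_words = {"and", "but", "the", "was", "a"}  # small illustrative list

# Split each comment into lowercase words (keeping hyphenated terms
# such as "check-in" together) and drop the stop words.
words = []
for comment in comments:
    for word in re.findall(r"[a-z]+(?:-[a-z]+)*", comment.lower()):
        if word not in stop_words:
            words.append(word)

counts = Counter(words)
print(counts.most_common(4))
```

Terms that recur across comments (here "friendly", "staff", "slow" and "check-in") suggest themes worth reading in context before drawing conclusions.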