1. The World Is Changing Rapidly, and Here Is How Big Data Is Contributing
Prepared by: Madhu Reddiboina
2. What are you going to get out of this session today?
• Recognize how the world is changing… rapidly!
• Understand the role of Big Data and Machine Learning in this transformation
• One approach to harvest Big Data within an enterprise
3. How is the World Changing?
Accelerated Disruption
5. Lending and Borrowing
Used to be: Borrower → Apply for Short-Term Credit → Banks/Services → Package Loans → Securitization → Mutual Funds → Investors
What it is now: Borrower → Lending Platforms → Investors
6. Higher Education and Learning
Used to be: Students → Entrance Exam → Apply for Admission → Banks (Loans) → Colleges & Universities → In-Person Courses (Instructors, Teachers, Professors) → Graduation
What it is now: Students → Education Platforms, Massive Open Online Courses (MOOCs) → Instructors, Teachers, Professors
8. Patterns of Disruption
New Business Models
Completely new business models are disrupting and displacing existing businesses and creating white space for new ones.
Breaking the Barriers
Technology, Big Data, and analytics are breaking barriers that could not be broken before.
Emerging Asian Economies
Almost half a billion people in Asia are connecting to the internet for the first time, creating many new opportunities for every business to market to and serve them.
18. What is Big Data?
“Big Data is like teenage sex: everyone talks about it, nobody really knows how to do it, so everyone claims they are doing it...”
- Dan Ariely, Professor at Duke University
Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying and information privacy.
- Wikipedia
Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.
- Gartner (IT Glossary)
19. What are some examples of Big Data sets?
Click-Stream Data from Websites
Data from Sensors
• Sensors on a Car
• Sensors on Industrial Machinery
• Sensors on Wi-Fi Networks
• Sensors on Airplanes
• Sensors on IoT/IoE Devices
System Log Data
20. What are some examples of Big Data sets?
Geolocation Data
Social Media Comments and Likes
Human Genome
22. What is Machine Learning?
“Field of study that gives computers the ability to learn without being explicitly programmed.”
- Arthur Samuel (1959)
“In the past decade, machine learning has given us self-driving cars, practical machine translation, effective web search, and a vastly improved understanding of the human genome.”
- Andrew Ng, Assoc. Professor, Stanford
At a high level, there are two forms of Machine Learning:
– Supervised Learning
– Unsupervised Learning
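The difference between the two forms can be illustrated with a small sketch. It is not from the original slides: the data (session lengths in minutes), the labels, and the function names `train_centroids`, `predict`, and `two_means` are all hypothetical. The supervised part learns from examples that already carry a label; the unsupervised part receives the same numbers without labels and has to find the groups itself.

```python
from statistics import mean

def train_centroids(samples, labels):
    """Supervised: learn one average value (centroid) per known label."""
    grouped = {}
    for value, label in zip(samples, labels):
        grouped.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in grouped.items()}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(samples, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2).

    Assumes the samples contain at least two distinct values.
    """
    low, high = min(samples), max(samples)
    for _ in range(iterations):
        near_low = [v for v in samples if abs(v - low) <= abs(v - high)]
        near_high = [v for v in samples if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return sorted(near_low), sorted(near_high)

# Hypothetical data: session lengths in minutes.
minutes = [1, 2, 3, 20, 22, 25]

# Supervised: the labels are given up front.
centroids = train_centroids(
    minutes, ["short", "short", "short", "long", "long", "long"]
)
label = predict(centroids, 4)

# Unsupervised: no labels, the grouping is discovered from the data.
clusters = two_means(minutes)
```

Here `label` is the predicted class for a new 4-minute session, and `clusters` holds the two groups found without any labels.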
25. How Does a Recommendation System Work?
Continuous Learning System
Closed Loop Process
• On-line Web Application
• Real-time Click-Stream Data
• Recommendation Engine
• New Business Rules
Upfront Data Science Work
• Data Engineering → Data Analysis → Feature Engineering → Model Creation → Model Evaluation and Testing
• Transaction History
• Metadata
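One way to picture the closed loop is a recommender that is first built from transaction history and then keeps updating as click-stream data arrives. The sketch below is an illustration only, not the engine from the slide: it uses simple item co-occurrence counts, and the class name `CoOccurrenceRecommender`, its methods, and the sample baskets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

class CoOccurrenceRecommender:
    """Recommends items that were most often seen together with a given item."""

    def __init__(self):
        # co_counts[a][b] = number of baskets or sessions containing both a and b
        self.co_counts = defaultdict(Counter)

    def observe(self, basket):
        """Update the model with one basket (history) or one session (click-stream)."""
        for a, b in permutations(sorted(set(basket)), 2):
            self.co_counts[a][b] += 1

    def recommend(self, item, k=1):
        """Return up to k items most frequently seen with `item`."""
        return [other for other, _ in self.co_counts[item].most_common(k)]

engine = CoOccurrenceRecommender()

# Upfront work: build the model from (hypothetical) transaction history.
history = [
    ["laptop", "mouse"],
    ["laptop", "mouse", "bag"],
    ["laptop", "bag"],
    ["laptop", "mouse"],
]
for basket in history:
    engine.observe(basket)
before = engine.recommend("laptop")

# Closed loop: new click-stream sessions keep updating the same model.
for session in [["laptop", "bag"], ["laptop", "bag"]]:
    engine.observe(session)
after = engine.recommend("laptop")
```

The recommendation for `"laptop"` changes between `before` and `after` because the newer sessions shift the co-occurrence counts, which is the continuous learning idea in miniature. A production engine would add feature engineering, model evaluation, and business rules on top of this.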
29. Solution Architecture of a Data Platform to Enable…
• Descriptive Analytics
• Advanced Analytics
• Streaming Analytics
30. Information Management Framework
One flavor of information management framework and process infrastructure to operationalize the data ecosystem
People, Process, Technology
Management layers:
• Infrastructure Management
• Data Governance
• Data Quality Management
• Identity Management & Security
• Metadata Management & Data Discovery
Stages: Data Acquisition → Data Ingestion → Data Preparation → Business Access Services → Information Consumers
31. To Enable Descriptive Analytics
Management layers: Infrastructure Management; Data Governance; Data Quality Management; Identity Management & Security; Metadata Management & Data Discovery
Stages: Data Acquisition → Data Ingestion → Data Preparation → Business Access Services → Information Consumers
Sources: Unstructured & Semi-Structured Data Sources; Reference/Master Data Sources; Operational Transactional Sources
Interfaces: Custom Interfaces
Hadoop File System (HDFS): Landing Zone; Enterprise Data Hub; Historical; Parquet Ingest; Analytics Sandbox; Archive
Processing: Workflow Orchestration; Data Engineering; Data Quality Profiling; Search & Indexing; Data Enrichment; Data Aggregation
Access: Summarized Data Marts; Integrated Data Extracts; Dashboards & Reports
Consumers: Customers; External Users; Internal Users
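As a rough illustration of what the descriptive path does, the sketch below takes raw rows, applies a minimal quality check, and aggregates them into a summarized mart that a dashboard could read. It is an assumption-laden toy in plain Python, not the Hadoop implementation the slide describes: the field names (`region`, `amount`), the sample rows, and the functions `is_valid` and `build_summary_mart` are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw rows as they might sit in a landing zone.
raw_transactions = [
    {"region": "East", "amount": 120.0},
    {"region": "East", "amount": 80.0},
    {"region": "West", "amount": 50.0},
    {"region": "West", "amount": None},  # fails the quality check below
]

def is_valid(row):
    """Minimal data quality check: region and amount must be present."""
    return bool(row.get("region")) and row.get("amount") is not None

def build_summary_mart(rows):
    """Aggregate valid rows into one summary record per region."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for row in filter(is_valid, rows):
        bucket = totals[row["region"]]
        bucket["orders"] += 1
        bucket["revenue"] += row["amount"]
    return {
        region: {**stats, "avg_order": stats["revenue"] / stats["orders"]}
        for region, stats in sorted(totals.items())
    }

summary_mart = build_summary_mart(raw_transactions)
```

`summary_mart` holds order count, revenue, and average order value per region, which is the kind of pre-aggregated answer to "what happened" that descriptive dashboards and reports consume.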
32. To Enable Advanced Analytics
Management layers: Infrastructure Management; Data Governance; Data Quality Management; Identity Management & Security; Metadata Management & Data Discovery
Stages: Data Acquisition → Data Ingestion → Data Preparation → Business Access Services → Information Consumers
Sources: Unstructured & Semi-Structured Data Sources; Reference/Master Data Sources; Operational Transactional Sources
Interfaces: Custom Interfaces
Hadoop File System (HDFS): Landing Zone; Enterprise Data Hub; Historical; Parquet Ingest; Analytics Sandbox; Archive
Processing: Workflow Orchestration; Data Engineering; Data Quality Profiling; Search & Indexing; Data Enrichment; Data Aggregation
Access: Summarized Data Marts; Integrated Data Extracts; Detailed Operational Marts; Dashboards & Reports; Advanced Analytics; Ad Hoc Data Analysis
Consumers: Data Analysts; Customers; External Users; Internal Users
33. To Enable Streaming Analytics
Management layers: Infrastructure Management; Data Governance; Data Quality Management; Identity Management & Security; Metadata Management & Data Discovery
Stages: Data Acquisition → Data Ingestion → Data Preparation → Business Access Services → Information Consumers
Sources: Unstructured & Semi-Structured Data Sources; Reference/Master Data Sources; Operational Transactional Sources
Interfaces: Custom Interfaces
Hadoop File System (HDFS): Landing Zone
Streaming: Real-time Streaming Data
Access: Custom Web Service Interfaces
Consumers: End Users
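Streaming analytics differs from the batch paths in that results are updated as each event arrives. A minimal sketch of that idea is a sliding-window counter, shown below. It is not the platform's stream processor: the class `SlidingWindowCounter`, the 60-second window, and the timestamps are hypothetical, and the code assumes events arrive in time order.

```python
from collections import deque

class SlidingWindowCounter:
    """Counts events seen within the most recent `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # timestamps, oldest first

    def add(self, timestamp):
        """Record one event and return the current count inside the window.

        Assumes timestamps are non-decreasing (in-order arrival).
        """
        self.events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window.
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()
        return len(self.events)

# Hypothetical click events, in seconds since the start of the stream.
counter = SlidingWindowCounter(window_seconds=60)
counts = [counter.add(t) for t in [0, 10, 20, 70, 75]]
```

Each value in `counts` is the number of events in the last 60 seconds at the moment that event arrived, so an end-user service could read the latest figure without waiting for a batch job.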
34. Platform Technologies
Management layers: Infrastructure Management; Data Governance; Data Quality Management; Identity Management & Security; Metadata Management & Data Discovery
Stages: Data Acquisition → Data Ingestion → Data Preparation → Business Access Services → Information Consumers
Management tooling: JAMS Scheduler; Custom Built Meta Data Tools; Custom Built Data QC Tools; Kerberos for Authentication; Sentry for Authorization; Custom Built Processes
Acquisition interfaces: Custom Web Service Integrations; SFTP; NFS
Platform capabilities: Data Integration; Machine Learning & Stream Processing; Data Flow Orchestration; Advanced Analytics; Search & Indexing; SQL Querying & Analytic Access; Data Storage
Data Delivery: Hive Data Marts; Tableau Data Extracts
Dashboards, Reports & Data Analysis: Tableau Server; Tableau Desktop
Applications: Custom Web Applications; Salesforce Applications; Custom Web Services
Consumers: Data Analysts; End Users; Customers; External Users; Internal Users