Presented to the British High Commission, British Council and the Royal Moroccan Science Council at the Mediterranean Space of Technology and Innovation (MSTI) event held in Rabat, Morocco, this presentation describes vertical innovation and the big data science revolution. It goes on to predict the future of big data science including the Moroccan opportunity to become the data science capital of North Africa.
Data Disruption by Vertical Innovation
1. Private and confidential. All rights reserved. March 1, 2015. @ChandanRajah
Big Data Disruption
by Vertical Innovation
Chandan Rajah
@ChandanRajah | chandan.rajah@gmail.com
“Price of light is far less than the cost of darkness”
Disruption by Vertical Innovation
Horizontal Innovation
Improvement to Status Quo
Vertical Innovation
Step Change to Status Quo
Horizontal & Vertical Innovation
Horizontal Innovation
From 1 to N
Vertical Innovation
From 0 to 1
Big Data Science Revolution
Statistical
Analysis
Data
Science
Monolithic
Processing
Big Data
Physical
Computing
Cloud
Big Data
Science
Revolution
What is Big Data Science?
Big Data
Big Data ≠ Data Volume
Big Data = Oil Transport
Think of data like ‘Crude Oil’
Big Data is about extracting ‘crude oil’; transporting it in ‘pipelines’; storing it in ‘mega tanks’
Data Science
Data Science ≠ Statistical Analysis
Data Science = Oil Refinery
Data science is about ‘treating’ the data, applying ‘science’ to it, refining the ‘results’, and combining them to form ‘insight’
data you know
data you don’t know
questions you’re asking
questions you’re not asking
Data Analysis
Data Science
reporting & description
discovery & prediction
DATA MODELLING
Y = F(X, random noise, parameters)
ALGORITHMIC MODELLING
Y ← [ BLACK BOX ] ← X
Data Modeling versus Data Science
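The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic data, the straight-line form chosen for F, and the nearest-neighbour averaging used as the "black box" are not from the deck, and the function names `fit_line` and `knn_predict` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Data modelling: assume Y = a*X + b + noise and estimate a and b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def knn_predict(xs, ys, x_new, k=3):
    """Algorithmic modelling: assume no functional form; average the k nearest observed outputs."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k

# Synthetic data (an assumption for this sketch): a noisy straight line.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

a, b = fit_line(xs, ys)               # interpretable parameters
line_pred = a * 5.0 + b               # prediction from the assumed model
knn_pred = knn_predict(xs, ys, 5.0)   # prediction from the black box
print(a, b, line_pred, knn_pred)
```

Both approaches predict a similar value at X = 5, but only the first yields parameters that describe how Y depends on X; the second trades that description for freedom from an assumed form.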
DIVIDE
SCATTER
Split Data into Blocks
Replicate and Store
Petabytes of Resilience
CONQUER
ANALYZE
Parallel Execution
Data Locality
Explore Every Path
INSIGHT
GATHER
Machine Learning
Iterative Evolution
Prediction & Action
Big Idea
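The scatter, analyze and gather steps can be sketched as a small word count. This is a single-machine illustration, not a cluster implementation: the threads only stand in for nodes, replication and data locality are not modelled, and the function names `scatter`, `analyze` and `gather` are hypothetical labels borrowed from the slide.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def scatter(records, block_size):
    """DIVIDE: split the data into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

def analyze(block):
    """CONQUER: compute a partial result on one block, independently of the others."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts

def gather(partials):
    """INSIGHT: merge the partial results into one answer."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

records = ["big data", "data science", "big data science", "vertical innovation"]
blocks = scatter(records, block_size=2)

# Each block is analyzed in parallel; worker threads stand in for cluster nodes here.
with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(analyze, blocks))

word_counts = gather(partials)
print(word_counts.most_common(3))
```

Because each block is analyzed without reference to any other, adding more blocks only requires more workers, which is the property the slide relies on.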
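WRITE (diagram): Client 1. Create Metadata on the Name Node; 2. Put Blocks on the Data Nodes. Blocks 1, 2 and 3 are each replicated across the Data Nodes, with Control / Monitoring between the Name Node and the Data Nodes.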
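READ (diagram): Client 1. Get Metadata from the Name Node; 2. Fetch Blocks from the Data Nodes. Blocks 1 to 4 are each replicated across the Data Nodes, with Control / Monitoring between the Name Node and the Data Nodes.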
DIVIDE = HDFS
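The write and read paths can be mimicked with a toy model. This is a sketch of the idea only and does not use the real HDFS API: `ToyNameNode`, `write_file` and `read_file` are hypothetical names, data nodes are plain dictionaries, and the round-robin placement is an assumption (real HDFS uses much larger blocks and rack-aware placement).

```python
class ToyNameNode:
    """Holds metadata only: which nodes store each block of each file."""

    def __init__(self, node_count, replication):
        self.node_count = node_count
        self.replication = replication
        self.files = {}  # filename -> list of node-index lists, one per block

    def create_metadata(self, filename, block_count):
        # Assumed placement policy: consecutive nodes, wrapping around.
        placement = [
            [(block + r) % self.node_count for r in range(self.replication)]
            for block in range(block_count)
        ]
        self.files[filename] = placement
        return placement

    def get_metadata(self, filename):
        return self.files[filename]

def write_file(name_node, data_nodes, filename, payload, block_size):
    blocks = [payload[i:i + block_size] for i in range(0, len(payload), block_size)]
    placement = name_node.create_metadata(filename, len(blocks))   # 1. Create Metadata
    for index, (block, nodes) in enumerate(zip(blocks, placement)):
        for node in nodes:                                         # 2. Put Blocks
            data_nodes[node][(filename, index)] = block

def read_file(name_node, data_nodes, filename, failed=frozenset()):
    placement = name_node.get_metadata(filename)                   # 1. Get Metadata
    chunks = []
    for index, nodes in enumerate(placement):                      # 2. Fetch Blocks
        live = next(node for node in nodes if node not in failed)  # first surviving replica
        chunks.append(data_nodes[live][(filename, index)])
    return b"".join(chunks)

data_nodes = [dict() for _ in range(4)]
name_node = ToyNameNode(node_count=4, replication=3)
payload = b"price of light is far less than the cost of darkness"
write_file(name_node, data_nodes, "quote.txt", payload, block_size=8)

# Two of the four data nodes are treated as failed; every block still has a live replica.
restored = read_file(name_node, data_nodes, "quote.txt", failed={0, 1})
print(restored == payload)
```

With three replicas spread over four nodes, any two nodes can be lost without losing a block, which is the resilience the DIVIDE step is meant to provide.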
INSIGHT = Functional Programming
Lambda Calculus
Alonzo Church
Alan Turing
Statistical Programming
Java Virtual Machine
Machine Learning
Scala
Python
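One reason functional programming suits this kind of insight work is that pure functions compose and can be applied to parts of the data independently. A minimal Python sketch follows; the sample readings, the threshold of 7 and the helper names are assumptions made for illustration.

```python
from functools import reduce

def is_large(x):
    """Pure predicate: depends only on its argument."""
    return x >= 7

def square(x):
    return x * x

def add(a, b):
    return a + b

def sum_of_large_squares(values):
    """Compose filter, map and reduce without mutating any state."""
    return reduce(add, map(square, filter(is_large, values)), 0)

readings = [3, 8, 1, 12, 7, 10]
whole = sum_of_large_squares(readings)

# Because the functions are pure and addition is associative, the data can be
# split, processed in pieces, and the partial results combined afterwards.
left, right = readings[:3], readings[3:]
combined = add(sum_of_large_squares(left), sum_of_large_squares(right))
print(whole, combined)
```

The split-and-combine result matches the single-pass result, which is what allows the same code to run on one machine or across many.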
Value Exponent
R: REPORT
I: INFORM
K: KNOW
P: PRESCRIBE
I: INFER
PAST → FUTURE
Substantial Value Increase
The Moroccan Opportunity
Skill Deficit and the Moroccan Opportunity
Dev Ops Engineer
Builds the cluster
Data Analyst
SQL guru
Big Data Developer
Productise insight
Data Scientist
Machine learning expert
What’s the Moroccan Opportunity?
• Talent: 7+ million graduates in training from world-class universities
• Initiative: Government push to create a big data skill hub
• Potential: Build a pipeline of outsourced talent for the EU & US, a market of £10bn
• Opportunity: The next Moroccan Mark Zuckerberg or Larry Page
WARNING: All disruptions have a shelf life. Only the agile reap the rewards.
State of Play – The Gartner Hype Curve
Horizontal Innovation
From 1 to N
Vertical Innovation
From 0 to 1
Big Data Science Revolution
Batch
Streaming
Hadoop
Spark
Message
Passing
Actor
Systems
Reactive
Insight Model
Insight Ecosystem
Reactive Insight Model
Dev Ops Engineers
Data Providers
Data Analysts
Data Scientists
Big Data Developers
Data Innovators
• Low Cost, Secure, Multi-tenancy
• Centralised, Readily Available
• Accessible to all skills
• Preferably state financed
Reactive Insight Model
Periodic Batch
Data
Streaming Data
Stream
Processing
Mutable Store
Apache Kafka
REAL TIME
INCREMENT
Apache HBase
Cassandra
Apache Spark
Apache Storm
Immutable
Storage
Pre-computed
Views
Apache HDFS
BATCH
RE-COMPUTE
Apache Hive
Apache Spark
MapReduce
Single Version
of Truth
Metadata
Management
Statistical
Analysis
Machine
Learning
IPython & Jupyter
R Studio Server
SQL Access
Hue
Productisation
& Web Services
Scala & Akka
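The interplay of the batch re-compute path and the real-time increment path can be sketched as follows. This is a toy in-memory model under stated assumptions: `InsightStore` and its methods are hypothetical names, a Python list stands in for immutable storage, and counters stand in for the pre-computed views and the mutable store. The layout resembles what is commonly called the Lambda architecture, a term the deck itself does not use.

```python
from collections import Counter

class InsightStore:
    def __init__(self):
        self.immutable_log = []         # stand-in for immutable storage (append-only)
        self.batch_view = Counter()     # pre-computed view, rebuilt by the batch path
        self.realtime_view = Counter()  # mutable store, incremented per event

    def ingest(self, event_key):
        """REAL TIME INCREMENT: record the event and update the mutable store."""
        self.immutable_log.append(event_key)
        self.realtime_view[event_key] += 1

    def recompute_batch(self):
        """BATCH RE-COMPUTE: rebuild the view from the full log, then clear the increments."""
        self.batch_view = Counter(self.immutable_log)
        self.realtime_view.clear()

    def query(self, event_key):
        """Single version of truth: batch view plus increments since the last batch run."""
        return self.batch_view[event_key] + self.realtime_view[event_key]

store = InsightStore()
for key in ["click", "view", "click"]:
    store.ingest(key)
before = store.query("click")   # answered from real-time increments only

store.recompute_batch()         # periodic batch run absorbs everything so far
store.ingest("click")           # a new event arrives after the batch run
after = store.query("click")    # batch view plus one fresh increment
print(before, after)
```

Queries stay correct between batch runs because the real-time view only ever holds what the last batch has not yet absorbed.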
Where do we Start?
• Buy over Build. Quick catch-up with front runners.
• Centralise cloud infrastructure. Accessible technology.
• Specialised streams for Dev Ops, Data Science, etc.
• Smaller focused outsourcing units. EU sales presence.
• Hack, Hack, Hack, Hack. Compete on Kaggle. Build Rep.
Thank You
Chandan Rajah
@ChandanRajah | chandan.rajah@gmail.com
Senior technologist and inventor with 18 years of experience in big data science, HPC and AI.
Technology Expert and Advisor for Big Data and IoT at the Digital Catapult.
Entrepreneur in high-performance computing and large-scale real-time prescriptive analytics.