2. Enterprise Data & AI Strategy
End to End Data & AI Platform
Data & AI Lifecycle
Data & AI Key Personas
Ideal Data & AI Components
Azure Data & AI Services
3. Experience leading AI initiatives across various industries:
• Osedea
• Intact Assurances
• Immersive Design Studios
• P2IRC
• Wunderman Production
• Aprosoft
• BSc - NSU in CSE
• MSc - USask in AI
• Exec Ed in Leading AI Policy, Technology - HKS
• Microsoft MVP – AI – 6 times.
• Canada’s Top Dev 30 U 30 2018
4. By 2025, the AI market will be $60 billion; in 2016 it was $1.4 billion.
AI can increase business productivity by 40%.
COVID-19 will accelerate AI's replacement of humans for automation.
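To put the two market figures on this slide in perspective, the growth rate they imply can be worked out as a compound annual growth rate. This is only an arithmetic sketch: the function name `implied_cagr` is illustrative, and the dollar figures are taken from the slide as given.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted on the slide, in billions of US dollars.
market_2016 = 1.4
market_2025 = 60.0

cagr = implied_cagr(market_2016, market_2025, years=2025 - 2016)
print(f"Implied annual growth: {cagr:.1%}")
```

Under these figures the market would have to grow by a little over 50% every year for nine years.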
5. Data Components
• Data Ingestion
• Data Storing
• Data Transformation
• Data Reporting & Visualization
• Data Consumption
• Metadata Management
• Data Exploration & Democratizing
• DataOps, Automation and Data Quality Checking
AI Components
• Pre-defined AI models
• Custom AI modelling
• AI model hosting
• AI model monitoring
• AI model retraining
• AI computation
• AI use case – batch, realtime processing
• MLOps, AI – RPA, AI Automation
Platform Components
• Data Governance and Management
• Multi-Cloud Platform support
• Business Driven ROI
• Data Security, Ethics, Bias Governance
• Data Driven Decision Making
• Data & AI short – long term Strategy
• Data & AI Cost Monitoring
6. Data & AI Platform Strategy
AI & Data Centric
Planning
Data
Knowledge & Skill
Information Context
Insight
STRATEGY
Data Culture
Understanding
• Data Sources
• Data Lake
• Data Ingestion
• Cold & Hot Storage
• Data Processing
• Data Science
• Analytics
• Dashboard & Visualization
• Artificial Intelligence
ROI & Business Driven Priority
Data Centric Decision making & Governance
Organizational data needs & priorities
• Data & AI Platform
• Information Management
• Information Architecture & Data Flow
• Information Security & Governance
7. AI & Data Platform – Data Centric Strategic
Insight
Understanding
Knowledge
Information
Data
Data
Transformation
Data
Analytics
Maximizing Return On Investment Triangle
Bringing in raw data, creating a data drop zone, creating a data lake / DWH & data connections
Data insight, identifying data KPIs, platform, AI PoCs
Data management, defining governance, revenue generation from Data & AI
Reiterate based on learning, ROI & experience
Apply Learning & Understanding
Data Centric & AI Driven Organization
Data Capture
Data Store
Data Quality
Data Manage
Data Metadata
Data Access
Data Lineage
Power BI Dashboard
Analytics
Artificial Intelligence
Data Culture
Real-Time Decision
Factual Storytelling
Augmented Intelligence
Initialization
Descriptive
Planning
Data Based
Diagnostic
Future
Modules
Predictive
Module
Strategic
Review
Prescriptive
Insight
9. Data Ingestion Lifecycle and Key Personas
Key personas:
• Business Leadership
• Data Architect
• Data Engineer
• Data Modeler
• Data Visualization & Reporting
• Data Analysts & Scientists
Lifecycle steps:
• Set Goals
• Review & Create Data Ingestion Architecture
• Data Ingestion & Pipeline
• Data Modelling
• Data Reporting & Visualization
• Data Quality Review, KPI check, Insight
10. Data Platform Components
1. Data Ingestion
2. Data Storing
3. Data Transformation
4. Data Reporting & Visualization
5. Data Consumption
6. Data Governance & Management
7. Data Exploration Sandbox & Marketplace, data quality
11. Data Ingestion Data Store
Data Transformation
Data Reporting &
Visualization
Data Governance & Data Access Management
Cloud Data Platform
Data Sandbox
Data Marketplace
BI & Analytics
ML Use Cases
Data Archiving – Hot & Cold
Structured Data
Unstructured Data
Data Migration
Real-Time Ingestion
Trigger & Pipeline
Data Storage Zones
Data Consumption
Data Ingest
Data Discover & Govern
Data Store
Data Process
Data Serve & Enrich
Data & ML Ops
Raw, Transient, Curation
Data Wrangling
Data Monitoring, Alerting, Administration
Data Security
Data KPI
Data Computation
Data Recipe Repo
Data Share
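The storage zones named on this slide (raw, transient, curation) can be pictured as folders that a file passes through. The sketch below is a minimal illustration only. The folder names, the order transient, raw, curated, the sample file, and the cleaning rule are all assumptions; the slide does not specify them.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical zone folder names and ordering (not specified on the slide).
ZONES = ("transient", "raw", "curated")


def create_zones(lake_root: Path) -> dict[str, Path]:
    """Create one folder per zone under the data lake root."""
    zones = {name: lake_root / name for name in ZONES}
    for path in zones.values():
        path.mkdir(parents=True, exist_ok=True)
    return zones


def land_file(zones: dict[str, Path], filename: str, content: str) -> Path:
    """Drop an incoming file into the transient zone."""
    target = zones["transient"] / filename
    target.write_text(content)
    return target


def promote_to_raw(zones: dict[str, Path], landed: Path) -> Path:
    """Move a landed file, unchanged, into the raw zone."""
    return Path(shutil.move(str(landed), str(zones["raw"] / landed.name)))


def curate(zones: dict[str, Path], raw_file: Path) -> Path:
    """Write a cleaned copy (blank lines dropped, line ends trimmed) to the curated zone."""
    lines = [line.strip() for line in raw_file.read_text().splitlines() if line.strip()]
    target = zones["curated"] / raw_file.name
    target.write_text("\n".join(lines) + "\n")
    return target


with tempfile.TemporaryDirectory() as tmp:
    zones = create_zones(Path(tmp))
    landed = land_file(zones, "orders.csv", "id,amount\n1, 10 \n\n2,20\n")
    raw_file = promote_to_raw(zones, landed)
    curated_file = curate(zones, raw_file)
    curated_text = curated_file.read_text()
    raw_exists = raw_file.exists()
    transient_empty = not any(zones["transient"].iterdir())
```

On a real cloud data platform the zones would typically be containers or prefixes in object storage rather than local folders; the flow between them is the point of the sketch.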
15. Application Development Lifecycle / Data & AI Lifecycle
Application Development Lifecycle
• DevOps
• CI/CD implementation
• Ensures greater collaboration between teams
• Speed & stability of development & deployment
• Builds hosting platforms for different environments.
• Works at the application, platform, and infrastructure levels.
Data & AI Lifecycle
• DataOps, MLOps
• DevOps for Data & AI
• DataOps is an automated, process-oriented methodology used by analytics & data teams to improve data quality & reduce cycle time.
• MLOps ensures deployment, monitoring, reproducibility, and scalability of machine learning models while maintaining a CT-CI-CD pipeline.
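The DataOps idea of automated data quality checking can be sketched as a small gate that a pipeline runs before publishing a batch. This is a minimal sketch under stated assumptions: the two checks (missing required fields, duplicated keys), the function names, and the sample rows are illustrative and do not come from the slides.

```python
from typing import Any


def check_quality(rows: list[dict[str, Any]], required: list[str], key: str) -> dict[str, int]:
    """Count rows with a missing required field and rows that repeat an earlier key."""
    missing = sum(
        1 for row in rows
        if any(row.get(col) in (None, "") for col in required)
    )
    seen = set()
    duplicates = 0
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicates += 1
        seen.add(value)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


def quality_gate(report: dict[str, int], max_bad_ratio: float = 0.0) -> bool:
    """Pass only if the share of flagged rows stays within the allowed ratio."""
    bad = report["missing"] + report["duplicates"]
    return report["rows"] > 0 and bad / report["rows"] <= max_bad_ratio


# Hypothetical sample batch: one missing amount, one repeated id.
sample_rows = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 5},
]
report = check_quality(sample_rows, required=["id", "amount"], key="id")
print(report, "passes gate:", quality_gate(report))
```

In a CI/CD setting a failing gate would stop the pipeline run, which is one way the quality checks shorten the feedback cycle.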
16. AI/ML Lifecycle and Key Personas
Key personas:
• Business Leadership
• Data Engineer
• Data Scientists
• ML Engineer
• App Developer
• IT Operations
Lifecycle steps:
• Set Goals
• Gather & Prepare Data
• Develop ML Model
• Deploy ML models
• Implement Apps & Inference
• ML Model Monitoring & Management
• Retrain Models
17. Initial Dataset
x1  x2  x3  x4  y
1   2   7   1   A
5   4   4   5   A
9   5   5   5   B
9   2   2   5   B
0   2   3   3   B
Pre-Processing Steps
• Data Cleaning
• Data Curation
• Remove Redundancy
Exploratory Data Analysis
Processed Dataset
Data Split: 80% used as training set, 20% used as testing set
AI modelling
• Learning Algorithms: SVM, DL, GBM, XGBoost, kNN, DT, RF
• Hyperparameter optimization
• Feature Selection
• Cross-Validation
Trained Model
Model Testing: Predict Y values
Evaluate Model Performance
• Classification: Accuracy, Sensitivity, Specificity, MCC
• Regression: MSE, RMSE, R2
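Two parts of this workflow are easy to show concretely: the 80/20 data split and the evaluation metrics. The sketch below uses only the Python standard library and the five-row dataset from the slide. The function names, the fixed random seed, and the example predictions are assumptions made for illustration.

```python
import math
import random


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and hold out test_ratio of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, specificity and MCC for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def regression_metrics(y_true, y_pred):
    """MSE, RMSE and R2 for numeric predictions."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean = sum(y_true) / n
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot if ss_tot else 0.0
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2}


# The initial dataset from the slide: (x1, x2, x3, x4) and label y.
dataset = [
    ((1, 2, 7, 1), "A"),
    ((5, 4, 4, 5), "A"),
    ((9, 5, 5, 5), "B"),
    ((9, 2, 2, 5), "B"),
    ((0, 2, 3, 3), "B"),
]
train_rows, test_rows = train_test_split(dataset)

# Hypothetical predictions, only to exercise the metric functions.
cls = classification_metrics(["A", "A", "B", "B", "B"], ["A", "B", "B", "B", "A"], positive="A")
reg = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

With five rows, the 80/20 split leaves four rows for training and one for testing, which is far too few for a real evaluation; cross-validation on the training set is the usual remedy for small data.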
18. End-to-End Machine Learning Pipeline
Prepare Data
• Data Storage
• Data Ingestion
• Data Preparation: Data Wrangling & Processing, Data Transformation, Data Validation, Data Featurization
Build & Train Model
• Model Building & Training: Algorithm Selection, Model Training, Model Testing, Model Validation, Hyper-parameter Tuning
Deploy & Predict
• Model Deployment: Model Deployment, Batch Prediction, Model Monitoring
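The three stages of this pipeline (prepare data, build and train, deploy and predict) can be chained as plain functions. The sketch below is a toy illustration, not a production design: the nearest-centroid "model" is a stand-in chosen for brevity, and the records, function names, and monitoring summary are all hypothetical.

```python
from collections import Counter, defaultdict


def prepare(records):
    """Data validation and featurization: drop incomplete rows, cast features to float."""
    prepared = []
    for features, label in records:
        if label is None or any(value is None for value in features):
            continue  # validation step: skip rows with missing values
        prepared.append(([float(value) for value in features], label))
    return prepared


def train(prepared):
    """Fit a nearest-centroid model: the mean feature vector for each label."""
    grouped = defaultdict(list)
    for features, label in prepared:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_batch(model, batch):
    """Batch prediction: assign each row the label of the closest centroid."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(model, key=lambda label: squared_distance(model[label], features))
        for features in batch
    ]


def monitor(predictions):
    """Model monitoring, reduced here to a count of predictions per label."""
    return dict(Counter(predictions))


# Hypothetical records; the last one is incomplete and is dropped by prepare().
records = [((1, 2), "A"), ((2, 1), "A"), ((8, 9), "B"), ((9, 8), "B"), ((None, 3), "A")]

model = train(prepare(records))
predictions = predict_batch(model, [(0.0, 0.0), (10.0, 10.0)])
summary = monitor(predictions)
```

Keeping each stage a separate function mirrors the slide's layout and makes it straightforward to swap in a real training library or a scheduled batch job later.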
19. AI Platform Components
• Pre-trained common AI use case models
• Machine Learning Platform for custom AI Modelling
• Jupyter notebook
• AutoML
• ML Designer
• AI VM (DSVM)
• AI / ML Computational Components – CPU / GPU
• Training & Inferencing Compute Cluster (Cloud & On-Prem) – Container & Kubernetes Cluster
• Data Annotation Tool
• MLOps – Continuous training, model testing, monitoring, drift detection, deployment, dashboard.
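The monitoring and drift item in the MLOps bullet can be illustrated with a very simple drift signal that triggers retraining. The sketch below uses a mean-shift check, which is only one of many possible drift measures and is not named in the slides. The threshold of 2.0 and the sample values are assumptions.

```python
import statistics


def mean_shift(reference, current):
    """Shift of the current mean, measured in reference standard deviations."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    cur_mean = statistics.mean(current)
    if ref_std == 0:
        return 0.0 if cur_mean == ref_mean else float("inf")
    return abs(cur_mean - ref_mean) / ref_std


def needs_retraining(reference, current, threshold=2.0):
    """Flag drift when the mean moves more than `threshold` standard deviations."""
    return mean_shift(reference, current) > threshold


# Hypothetical values of one input feature at training time and in production.
training_values = [10, 11, 9, 10, 10, 12, 8, 10]
stable_batch = [10, 11, 9, 10]
drifted_batch = [15, 16, 14, 15]

print("stable batch drifted:", needs_retraining(training_values, stable_batch))
print("shifted batch drifted:", needs_retraining(training_values, drifted_batch))
```

In a continuous-training setup, a positive drift flag would typically start the retraining pipeline and surface on the monitoring dashboard.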