As the adoption of AI technologies grows and matures, the focus shifts from exploration to time to market, productivity, and integration with existing workflows. Governing enterprise data, scaling AI model development, and selecting a complete, collaborative hybrid platform and tools for rapid solution deployment are key focus areas for growing data science teams tasked with responding to business challenges. This talk covers the challenges and innovations of AI at scale for the enterprise, focusing on the modernization of data analytics, the AI ladder and AI life cycle, and infrastructure architecture considerations. We conclude by reviewing the benefits of running modern AI and data analytics applications such as SAS Viya and SAP HANA on IBM Power Systems and IBM Storage in hybrid cloud environments.
3. How Customers Do Data Analytics Traditionally
Spreadsheets
• No governance
• No collaboration
• Limited complexity
Business Rules
• Broad rules and categories
• Not dynamic
Homegrown Applications
• Hard to maintain
• Pre-set rules and approaches
Other Applications
• Limited use of analytics
• Hard-coded models that do not apply to unique needs
• Slow response
4. Enterprise Analytics Modernization: From Data to Actions
From data to actions, guided by human inputs:
• Descriptive: What has happened?
• Predictive: What will happen?
• Prescriptive: What should we do?
• Cognitive: Learn dynamically
5. Common Patterns of Analytics
Descriptive: What happened?
• Understand past activity
Predictive: What will happen?
• Predict a future event (supervised)
• Segment data / detect anomalies (unsupervised)
Prescriptive: What should we do?
• Determine optimal quantity, price, resource allocation, or best action
Planning: What is our plan?
• Forecast and budget based on past activity
Deep Learning and NLP
• Discover insights in content (text, images, video)
• Interact in natural language (supervised)
Solving challenges with Data and AI will utilize a combination of these analytics patterns.
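The supervised/unsupervised distinction behind these patterns can be shown with a toy sketch in plain Python (no ML library; every function and value here is illustrative, not from the talk): a supervised rule learns a cutoff from labeled examples, while an unsupervised rule segments unlabeled data.

```python
# Toy illustration of two analytics patterns (illustrative only).

def train_threshold(examples):
    """Supervised: learn a cutoff from (value, label) pairs.

    Uses the midpoint between the means of the two labeled classes.
    """
    pos = [v for v, y in examples if y == 1]
    neg = [v for v, y in examples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def segment(values, cutoff):
    """Unsupervised-style: split unlabeled values into two segments."""
    low = [v for v in values if v < cutoff]
    high = [v for v in values if v >= cutoff]
    return low, high

# Supervised: predict a future event (e.g., "will this transaction be flagged?")
labeled = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
cutoff = train_threshold(labeled)   # midpoint of class means -> 5.0
flagged = 7.5 >= cutoff             # new observation predicted positive

# Unsupervised: segment data / detect anomalies without any labels
low, high = segment([1.2, 0.9, 8.8, 9.4], cutoff)
```

Real projects would use proper estimators and clustering, but the division of labor is the same: labels drive the first pattern, structure in the data drives the second.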
6. Three Broad Categories of AI Use Cases
“Structured” Data Use Cases
• Big data (rows and columns)
• GPU servers
• Available AI software
• More accuracy!
Computer Vision Use Cases
• This is sort of “magic”: a deep learning model is trained to detect and classify objects
Natural Language Processing Use Cases
• A model learns to read, hear, and “understand” language
7. Organizations are adopting AI to solve business problems
• Fraud
• Safety, inspection, and process improvement
• Defense and security
8. “AI is the fastest-growing workload”*
*Forrester Research Inc., “AI Deep Learning Workloads Demand a New Approach to Infrastructure,” by Mike Gualtieri, Christopher Voce, Srividya Sridharan, Michele Goetz, Renee Taylor, May 4, 2018.
9. The AI Ladder
A prescriptive approach to accelerating the journey to AI:
• COLLECT – Make data simple and accessible: data of every type, regardless of where it lives
• ORGANIZE – Create a trusted analytics foundation
• ANALYZE – Scale AI everywhere with trust and transparency
• INFUSE – Operationalize AI across business processes
MODERNIZE your data estate for an AI and multicloud world, on AI-optimized systems infrastructure.
10. Data Is a Prerequisite to AI
Data sources: transaction and application data; machine and sensor data; enterprise content; image, geospatial, and video data; social data; third-party data.
Platform zones: unstructured, landing, exploration, and archive; operational data; real-time data processing and analytics; all under information integration and governance.
AI use cases this data feeds: risk and fraud; chatbots and personal assistants; supply chain optimization; dynamic pricing and recommenders; behavior modeling; vision and autonomous systems.
13. Metadata-Fueled Data Analysis
Large-Scale Data Ingest
• Scan records at high speed
• Live event notifications
• Capture system-level tags
• Automatic indexing
Business-Oriented Data Mapping
• Custom data tagging
• Content inspection via APIs
• Policy-driven workflows
Data Activation
• Data movement via APIs
• Extensible architecture
• Solution blueprints
Data Visualization
• Query billions of records in seconds
• Multi-faceted search
• Drill-down dashboards
• Customizable reports
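A minimal sketch of how such a metadata catalog might work, assuming a simple inverted index (all class and method names are hypothetical, not from any IBM product): records get system-level tags at ingest, custom business tags afterwards, and tag queries are answered by set intersection.

```python
from collections import defaultdict

class MetadataIndex:
    """Minimal sketch of a metadata catalog: tag records at ingest,
    then answer tag queries from an inverted index (illustrative only)."""

    def __init__(self):
        self.records = {}               # record_id -> set of tags
        self.by_tag = defaultdict(set)  # tag -> set of record ids

    def ingest(self, record_id, system_tags):
        """Large-scale ingest: capture system-level tags, index automatically."""
        self.records[record_id] = set(system_tags)
        for tag in system_tags:
            self.by_tag[tag].add(record_id)

    def tag(self, record_id, custom_tag):
        """Business-oriented mapping: add a custom data tag."""
        self.records[record_id].add(custom_tag)
        self.by_tag[custom_tag].add(record_id)

    def query(self, *tags):
        """Multi-faceted search: records carrying all requested tags."""
        sets = [self.by_tag[t] for t in tags]
        return set.intersection(*sets) if sets else set()

idx = MetadataIndex()
idx.ingest("scan-001", ["dicom", "2018"])
idx.ingest("scan-002", ["dicom", "2019"])
idx.tag("scan-002", "reviewed")
hits = idx.query("dicom", "reviewed")   # only scan-002 carries both tags
```

The point of the pattern is that queries touch only the small metadata index, never the billions of underlying records.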
14. AI Model Development Workflow
• Data preparation, cleaning, and labeling
• Model development environment
• Runtime environment
• Train, deploy, and manage models
• Business KPIs and production metrics
• Explainability and fairness
Shared between the Data Engineering and Data Science team and the IT Operations team.
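The workflow stages above can be sketched as plain functions handed off between the two teams (a toy illustration; the "model" is deliberately trivial and every name here is made up):

```python
# Sketch of the AI model development workflow stages (illustrative only).

def prepare(raw):
    """Data preparation: clean (drop missing values) and label
    (here: a toy rule that labels positives)."""
    clean = [x for x in raw if x is not None]
    return [(x, 1 if x > 0 else 0) for x in clean]

def train(dataset):
    """Model development: 'train' a trivial majority-class model."""
    ones = sum(y for _, y in dataset)
    majority = 1 if ones * 2 >= len(dataset) else 0
    return {"predict": lambda x: majority, "version": 1}

def deploy(model, registry):
    """Runtime environment: register the model for serving."""
    registry[model["version"]] = model
    return model["version"]

def monitor(model, holdout):
    """Production metrics: accuracy as a stand-in for a business KPI."""
    correct = sum(1 for x, y in holdout if model["predict"](x) == y)
    return correct / len(holdout)

registry = {}
data = prepare([3.0, None, -1.0, 2.0, 5.0])
model = train(data)
version = deploy(model, registry)
kpi = monitor(registry[version], data)
```

The hand-off boundary matters: data scientists own `prepare`/`train`, IT operations owns `deploy`/`monitor`, and the registry is the shared contract between them.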
15. Data Science Exploration to Production
Use Case Exploration → Data Science Model Build → Use Case Deployment in Production
Moving from model build to production deployment requires a solution architecture along with security, privacy, and governance.
Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
16. Designing Models Driven by User Desires
• #1 Model Quality: not enough knowledge about the problem to build a good model
• #2 System Usage: typical optimizations are limited to serial data collection
• #3 Complexity: typical optimizations do not work in high dimensions
• #4 Trust: typical optimizations do not explain their logic to the user
17.
18. AI Use Case: Automate Diagnostics to Increase Productivity
DIAGNOSTICS
Faster results with higher accuracy can be achieved with an image processing system designed to address
• workflow burdens,
• data governance challenges, and
• analysis challenges,
with the goal of reducing
• false-negative rates in imaging diagnostics and in clinical settings, and
• patient risk and medical-legal risk.
19. Examples of Medical Imaging Applications
DIAGNOSTICS
• 3D U-Net segmentation models with higher-resolution images allow learning and labeling of finer details and structures of brain tumors
(https://developer.ibm.com/linuxonpower/2018/07/27/tensorflow-large-model-support-case-study-3d-image-segmentation/)
• Automatic skin lesion image analysis for melanoma detection with Memorial Sloan Kettering (MSK-CC)
• Blood-based diagnosis: characterization of patient blood samples to detect and classify blood cell subtypes
20. Optimizing Medical Imaging
DIAGNOSTICS
Enhance image identification with deep learning to assist physicians and benefit patients.
20x faster: a model was trained on 1,300 MRI images with IBM Power Systems and IBM Storage in just two hours, compared to forty hours on traditional architectures.
23. AI Use Case: Molecular Modeling
MOLECULAR SIMULATION
• Force field tuning and intelligent phase diagram exploration
• An accelerated workflow uses fewer calculations to achieve an orders-of-magnitude resolution increase
• Achieves human-level performance in days instead of months
24. AI Use Case: Massive Data Sets Require Massive Processing Capability
BIOMOLECULAR STRUCTURE
• Advances in instrument design, sample preprocessing, and mathematical methods have enabled high-volume throughput imaging at atomic scale.
• Cryogenic electron microscopes generate an average of 5 TB of image data per day.
25. Accelerating Cryo-EM Imaging Analysis
BIOMOLECULAR STRUCTURE
Reduced time-to-completion for high-resolution image analysis jobs while increasing resource utilization.
100+: using an IBM AC922 cluster, more than 100 cryo-EM high-resolution image analysis jobs run in parallel on the Satori cluster.
26. Traditional Infrastructure Isn’t Suited for AI Workloads
The wrong infrastructure puts AI at risk:
• Systems don't easily scale to meet demand
• Processors are not optimized for AI workloads
• The data pipeline is too slow, causing a bottleneck effect
27. Common AI Data Considerations
Data sources: legacy data stores; IoT, mobile, and sensors; collaboration partners; new data.
Pipeline: Ingest → Preparation → Training → Inference, with iterative model training to improve accuracy and champion/challenger comparison; the trained model runs in the data center or at the edge.
Data requirements: ease of massive scale; high performance; tiered/archive storage; security; metadata tagging; single namespace; low latency.
Development and inference stack: open source; stable and supported; auditable.
Overall considerations: productivity, performance, robustness.
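The champion/challenger step in the iterative training loop can be sketched as a simple promotion rule (illustrative only; real systems compare richer metrics): a newly retrained challenger replaces the current champion only if it scores better on held-out data.

```python
# Champion/challenger promotion, sketched with callable "models" (illustrative).

def evaluate(model, holdout):
    """Score a model on held-out (input, label) pairs: fraction correct."""
    correct = sum(1 for x, y in holdout if model(x) == y)
    return correct / len(holdout)

def champion_challenger(champion, challenger, holdout):
    """Promote the challenger only if it beats the current champion."""
    if evaluate(challenger, holdout) > evaluate(champion, holdout):
        return challenger
    return champion

holdout = [(0, 0), (1, 1), (2, 1), (3, 1)]
champion = lambda x: 0      # stale model: always predicts 0 (25% accurate here)
challenger = lambda x: 1    # retrained model: always predicts 1 (75% accurate)
best = champion_challenger(champion, challenger, holdout)
```

Keeping the comparison on a fixed holdout set is what makes repeated retraining safe: a worse challenger simply never gets promoted.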
28. Infrastructure Demands for AI
Training
• Equipped for volumes of data
• Powerful data center accelerators with coherence
• Advanced I/O for high bandwidth and low latency
• Proven scalability
Inference
• Equipped for volumes of data
• Flexible storage for a range of data demands
• Versatile, power-efficient data center accelerators
• Advanced I/O for minimal latency
• Scalability and distributed data center capability
30. Inferencing Considerations
• Real-time (vs. batch): many AI applications require response times in milliseconds, and in many cases handle 100K+ IoT events per second (latency, latency, latency)
• Scalability: ability to scale the inference engine and manage the infrastructure
• Data pipeline: the data fed into models has to be cleaned and structured to produce accurate results
• Security: applications run AI models in the field and in back offices
• Multi-tenancy: multiple business applications leverage shared infrastructure, with multiple models per business application
• Tools proliferation: analytics, data/object tagging, model training, and inferencing
• Model management: continuous training/retraining of models, AI DevOps, ease of deployment
• Transparency: ability to explain decisions
Typical AI Inferencing Scenarios
• Inference in the data center or in the cloud: transaction integration, huge scale, as-a-service offerings, multi-tenancy
• Near-edge inferencing (on-prem or in-cloud): low latency, data movement considerations
• Inference at the edge (on-prem/device, stand-alone device): low latency, data movement considerations
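The real-time vs. batch trade-off above is often handled with micro-batching; a minimal sketch (all names illustrative, not from any product) buffers events and flushes either when the batch fills or when the oldest event hits a latency deadline:

```python
class MicroBatcher:
    """Buffer events and flush on size or deadline, bounding worst-case
    latency while keeping batches large (illustrative sketch)."""

    def __init__(self, max_size, max_wait_ms):
        self.max_size = max_size
        self.max_wait_ms = max_wait_ms
        self.buffer = []
        self.first_arrival_ms = None

    def add(self, event, now_ms):
        """Enqueue an event; return a full batch if the size limit is hit."""
        if not self.buffer:
            self.first_arrival_ms = now_ms
        self.buffer.append(event)
        if len(self.buffer) >= self.max_size:
            return self.flush()
        return None

    def poll(self, now_ms):
        """Flush a partial batch once the oldest event hits the deadline."""
        if self.buffer and now_ms - self.first_arrival_ms >= self.max_wait_ms:
            return self.flush()
        return None

    def flush(self):
        batch, self.buffer = self.buffer, []
        return batch

b = MicroBatcher(max_size=3, max_wait_ms=10)
b.add("e1", now_ms=0)
b.add("e2", now_ms=2)
full = b.add("e3", now_ms=3)   # size limit reached: a batch of three
b.add("e4", now_ms=5)
late = b.poll(now_ms=20)       # 15 ms elapsed >= 10 ms deadline: partial flush
```

Tuning `max_size` vs. `max_wait_ms` is exactly the throughput-vs-latency knob the slide's "latency, latency, latency" point is about.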
33. Open-CE
OpenPOWER is a technical community dedicated to expanding the IBM Power architecture ecosystem.
Open-CE (https://github.com/open-ce):
• Minimizes time to value for foundational ML/DL packages
• Provides a flexible source-to-image solution for a complete and customizable AI environment
34.
35. IBM Cloud Pak for Data
Cloud-native platforms combine data with microservices, containerized workloads, and multicloud provisioning, spanning on-premises and public cloud:
• Microservices: an architecture of loosely coupled data services, easily refactored to create containerized workloads (agility)
• Containerized workloads: stand-alone workloads composed of microservices and data that are flexibly deployed, orchestrated, and managed (efficiency)
• Multicloud provisioning: agile provisioning of containerized workloads in multicloud environments and consumption of cloud services (cost savings)
41. IBM Power Systems: Your Smart Choice to Run SAP HANA
Provision Faster
• Provision SAP HANA instances faster with built-in virtualization
• Easily make capacity changes
• Simplify management by consolidating HANA instances
Scale Affordably
• Minimize infrastructure with a scale-up environment
• Granular capacity allocation
• Share and optimize CPU allocation
• Capacity on Demand
Maximize Uptime
• Ranked most reliable server for over a decade¹
• Zero-impact planned maintenance with Live Partition Mobility (LPM)
• Virtual persistent memory for faster restart and shutdown
1. ITIC 2018 Global Server Hardware, Server OS Reliability Survey Mid-Year Update. The highest uptime of 99.9996% is calculated based on 2.0 minutes/server/annum unplanned downtime of any non-mainframe Linux platforms.
42. What Is SAS Viya?
A cloud-enabled, in-memory analytics engine:
– Provides quick, accurate, and reliable analytical insights
– Elastic, scalable, and fault-tolerant processing addresses today's complex analytical challenges
– Scales effortlessly for the future
SAS Viya provides:
– Faster processing for huge amounts of data and the most complex analytics, including machine learning, deep learning, and artificial intelligence
– A standardized code base that supports programming in SAS and other languages, like Python, R, Java, and Lua
– Support for cloud, on-site, or hybrid environments; it deploys seamlessly to any infrastructure or application ecosystem
43. SAS 9.4 & SAS Viya: Similarities, Differences, Relationships
SAS® 9.4
– Legacy platform for discovering insights and managing data; makes analytics approachable
SAS Viya
– Cloud-enabled, in-memory engine that provides quick, accurate, and reliable analytical insights
They complement each other; SAS Viya is not a direct replacement.
Related offerings: SAS Visual Analytics, SAS Report Viewer.
44. SAS Visual Analytics
SAS high-performance technologies accelerate analytic computations to derive value from massive amounts of data.
45. Accelerate Insights from SAS Solutions with IBM Power Systems
Eliminate bottlenecks
• Industry-leading throughput
• 2x I/O and 1.8x memory bandwidth compared to x86 platforms
• Generate insights on time, every time by scaling on demand
Drive agility
• Easily allocate precise capacity at the push of a button
• Simplify management with co-located workloads in the same system
• Optimize resource utilization
Reduce risk
• #1 ranked systems in reliability
• Zero-impact planned downtime with Live Partition Mobility
46. In Summary: Think Solutions!
Gaining insights with machine learning and deep learning requires a flexible, end-to-end, solution-first approach:
• Focus on solving problems and use cases
• Data is a prerequisite
• ML/DL is just one piece of an overall workflow
• Infrastructure matters
• Establish trusted collaborations and partners