Understanding Data, Data
Analytics and AI(I)
2020. 6. 10
Chun MK (chunmk80@gmail.com)
Contents
□ Prologue : Can a Machine imitate Human
Intelligence?
□ Brief History of Artificial Intelligence
□ Understanding (Big)Data
□ History of Data Analytics (Now & Future)
□ Summary
2
Prologue
Can a machine imitate
human intelligence?
3
Copyright © 2020 AcornSoft All Rights Reserved
Imitation Game
4
Can a machine imitate human intelligence?
• Alan Turing’s Bombe* machine “Victory”
* Bombe: a machine built to decipher German Enigma-machine-encrypted
secret messages
A wartime picture of
a Bletchley Park Bombe
The Imitation Game
Object Tools Output
Movie
Cryptogram Solver
Know the
Enemy
Encrypted
secret
messages
Machine
Algorithm
New
Information
Data analysis Insight
Big Data AI Value
Data Science
5
Can a machine imitate human intelligence?
• Turing Test
In 1950, Alan Turing published his seminal article “Computing Machinery and Intelligence” where he
described how to create intelligent machines and in particular how to test their intelligence. This Turing
Test is still considered today as a benchmark to identify intelligence of an artificial system:
if a human is interacting with another human and a machine and is unable to distinguish the
machine from the human, then the machine is said to be intelligent.
6
Can a machine imitate human intelligence?
• The fathers of artificial intelligence: two people
Rear view of the rebuilt Bombe
213(h)*182(w)*60(d), 1ton
Marvin Minsky
(1927~2016)
co-founder of the
MIT's AI laboratory
American
Alan Turing
(1912~1954)
the father of
theoretical
computer science
and artificial
intelligence
English
Perceptrons
by Marvin L. Minsky
7
Digital Transformation Era : Two Pillars
Machine
imitate human
intelligence
Data is the
new
application
“data is the new application. …
Data is now fundamental to how
people work and the most
successful companies have
intelligently integrated it into
everyone’s daily workflow.”
Alan Turing in 1950 : can a
machine imitate human
intelligence? In his seminal
paper “Computing
Machinery and Intelligence,”
he formulated a game,
called the imitation game.
Alan Turing
8
Understanding (Big)Data
9
What is Data ?
Data is how we
express Observation
in reusable form.
A collection of facts!
Discrete data can only take certain
values (like whole numbers)
Continuous data can take
any value (within a range)
• Your friends' favorite holiday destination
• The most common given names in your town
• How people describe the smell of a new
perfume
• Height (Continuous)
• Weight (Continuous)
• Petals on a flower (Discrete)
• Customers in a shop (Discrete)
10
What is Data ?
• Why is Data Important?
Data is the basis of
Information, Knowledge, and Value.
VALUE
KNOWLEDGE
INFORMATION
DATA
11
What is Data ?
• Data Generators = Increasing Rapidly
Source : KPCB INTERNET TRENDS 2016, Mary Meeker
12
What is Data ?
• Data is the new application
Pre-1995
Winning Businesses =
Use Human Data / Insights
To Improve Customer Experiences…
…1990s-2000s…
Internet + Mobile Devices + Cloud
Netscape Web Browser – 1994
Amazon Web Services (AWS) – 2006
Apple iPhone – 2007
Apple App Store – 2008
…2000s
Winning Businesses =
Build / Use Data Plumbing Tools
To Use Digital Data / Insights To Improve
Customer Experiences
Data is now fundamental to how people work
& the most successful companies have intelligently
integrated it into everyone's daily workflow…
Data is the new application.
Frank Bien – CEO & President, Looker, 6/19
Source : Internet Trends 2019, Mary Meeker
In the broadest sense, only data generated on the Internet is handled.
13
What is Data ?
Source: IDC ‘Digitization of the World From Edge to Core White Paper’ developed in collaboration with Seagate (11/18), IDC DataSphere. Note: 1 petabyte = 1MM gigabytes, 1 zettabyte = 1MM petabytes of new data created /
captured each year. The grey area in the graph represents data generated, not stored. Structured data indicates data that has been organized so that it is easily searchable & includes metadata & machine-to-machine (M2M)
data. Replicated data = data that is a copy of the original.
Data Volume = Extraordinary Growth…
~13% Structured / Tagged & Rising Rapidly
[Chart: New Data Captured / Created / Replicated, per IDC. Annual global data volume in ZB (axis 0ZB to 200ZB) for 2005, 2010, 2015, 2020E and 2025E, split into original data and replicated data, with 2018 highlighted. Structured-share labels shown on the chart: 9%, 10%, 13%, 16%, 32%. Callouts: "data generated, not stored" and "data that has been organized".]
14
What is Data ?
• Data is the new application
▪ Data volume and utilization are evolving rapidly, and broadly. Meeker projects
“extraordinary growth” in overall data volume, citing IDC research that the total
“datasphere” will grow from 33 zettabytes today to 175 zettabytes in 2025.
▪ Significantly, the percentage of data that is structured – organized for use in a
spreadsheet or database – will rise from 13 percent now to 32 percent in 2025.
▪ These improvements in data collection and formatting are driven by
business goals, as cloud technologies and artificial intelligence allow
companies to leverage data in new ways to improve their business.
▪ Data analysis is becoming a competitive differentiator, Meeker says. The ability
to mine data for customer insights (including personalization and
recommendations) is essential for most businesses, and will become more so in
the future.
https://datacenterfrontier.com/what-mary-meeker-predicts-about-the-future-of-data-centers/
15
What is Data ?
Source: Adapted from Graphics presented in IDC ‘Digitization of the World From Edge to Core White Paper’ developed in collaboration with Seagate (11/18), IDC DataSphere.
Connected
Processes
ENDPOINT
People
EDGE
Branch Offices
CORE
Large Datacenters, Including Public &
Private Cloud
Data Propagation = Expanding…
Endpoints ⇋ Edge ⇋Core The Core is the Heart of
the Datasphere
Core :
This consists of designated
computing datacenters in the
enterprise and cloud providers. It
includes all varieties of cloud
computing, including public,
private, and hybrid cloud. It also
includes enterprise operational
datacenters, such as those
running the electric grid and
telephone networks.
Edge :
Edge refers to enterprise-
hardened servers and
appliances that are not in
core datacenters. This
includes server rooms,
servers in the field, cell
towers, and smaller
datacenters located
regionally and remotely for
faster response times.
Endpoint :
Endpoints include all devices on the edge of the network, including
PCs, phones, industrial sensors, connected cars, and wearables.
We’ll have more intelligence
and more activity at the edge on
data coming from the
generators that we build and
the IoT devices we have
deployed…raw data will be
analyzed on the edge first, and
then the results will be sent
back to the core for deeper
analysis.
– CISO/CFO, Leading Manufacturing Firm
16
What is Data ?
• Data-driven World
“IDC predicts that the Global Datasphere will grow
from 33 Zettabytes in 2018 to 175 Zettabytes by 2025”
The data-driven world will be always on, always tracking, always
monitoring, always listening & always watching
– because it will be always learning.
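The growth figures quoted above imply a steady yearly rate. As a rough sketch (the helper name `cagr` is ours, not IDC's, and it assumes smooth compound growth between the two quoted endpoints), the implied rate can be worked out as follows:

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


# Figures quoted on the slide: 33 ZB in 2018, 175 ZB by 2025.
rate = cagr(33.0, 175.0, 2025 - 2018)
print(f"Implied annual growth of the Datasphere: {rate:.1%}")
```

Under that assumption the Datasphere would grow by roughly 27% per year.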
17
What is Data ?
• Data-driven World
“IDC forecasts that more than 150B devices will be
connected across the globe by 2025, most of which will
be creating data in real time.”
Real-time data represents 15% of the Datasphere in 2017, and
nearly 30% by 2025
18
Challenge created by digital disruption (too much data)
• Top 10 Data and Analytics Technology Trends for 2019, Gartner
According to Donald Feinberg (vice president and distinguished analyst at
Gartner),
the very challenge created by digital disruption — too much data — has
also created an unprecedented opportunity. The vast amount of data,
together with increasingly powerful processing capabilities enabled
by the cloud, means it is now possible to train and execute
algorithms at the large scale necessary to finally realize the full
potential of AI.
“The size, complexity, distributed nature of data, speed of action and the
continuous intelligence required by digital business means that rigid and
centralized architectures and tools break down,” Mr. Feinberg said. “The
continued survival of any business will depend upon an agile, data-
centric architecture that responds to the constant rate of change.”
https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo
19
History of (Big)Data
Analytics, Now & Future
20
History of Data Analysis
Data analysis is rooted in statistics,
which has a pretty long history.
It is said that the beginning of statistics was
marked in ancient Egypt,
when Egypt was taking a periodic census
for building pyramids.
Throughout history, statistics has played an important role for
governments all across the world, for the creation of censuses,
which were used for various governmental planning activities
(including, of course, taxation).
21
Define Big Data
VOLUME
▪ Data Object Size: small to big
▪ Data Object Quantity: few to many
VARIETY
▪ Data Sources: few to many
▪ Contents Types: few to many
▪ Structure Types: structured to unstructured
▪ Semantic Diversity: low to high
VELOCITY
▪ Acquisition Rate: slow to fast
▪ Update Rate: slow to fast
VERACITY
▪ Known Data Sources, Provenance, Data Integrity, Governance
VALUE
▪ Objectives: Findings, Recommend, Decisions
▪ Predictive, Prescriptive
* NIST, 2014
too big (volume),
arrives too fast (velocity),
changes too fast (variability),
contains too much noise (veracity),
too diverse (variety)
to be processed within
a local computing structure
using traditional approaches
and techniques
* ISO, 2014
22
Evolution of Data Analytics System
Rows: Database Management Technology / Development of Business Intelligence & Analytic Platform / Technologies and Packages for Statistical Processing
1960s
▪ Database management: Flat File Based (tape-based storage, batch reporting)
▪ BI & analytic platform: Query Modules & Report Generators (batch querying & reporting, report generators)
▪ Statistical processing: Niche Statistical Subroutines (social science, clinical trials, agriculture)
▪ Stage labels: Routinization / Querying & Reporting / Statistical Computation
1970s
▪ Database management: Navigational DBMS (late 1970s: RDBMS emerged)
▪ BI & analytic platform: Early DSS Tools (commercial tools for building DSS)
▪ Statistical processing: Statistical Software (pharma & social science; SPSS/SAS incorporated)
▪ Stage labels: Modularization / Decision Support & Modeling / 1st Gen Statistical Processing
1980s
▪ Database management: Relational DBMS (RDBMS solutions matured, personal databases for PC)
▪ BI & analytic platform: DSS & 4GL Environments (4GL, EIS, spreadsheet, descriptive analytics)
▪ Statistical processing: PC-based Statistical Packages (other industries; PC-based, graphics, expert systems)
▪ Stage labels: Abstraction / Analytical Processing / 2nd Gen Statistical Processing
1990s
▪ Database management: Distributed DBMS (distributed architecture, clustering)
▪ BI & analytic platform: Data Warehouse & BI (BI tool market grew rapidly, Web-based analytics)
▪ Statistical processing: Early Data Mining tools (vendors & solutions)
▪ Stage labels: Scaling & Distribution / Enterprise Performance Management / Data Mining
2000s
▪ Database management: Post Relational DBMS (unstructured data, non-relational data model, large scale distributed data)
▪ BI & analytic platform: Data Processing & Analytic Platform (large scale data processing, unstructured and real-time analytics, big data analytics)
▪ Statistical processing: Data Processing & Analytics Platforms (open source R based statistical platforms, NLP, text analysis)
▪ Stage labels: Specialization & Extension / Next Gen Data Processing / Next Gen Data Processing
* Compiled from Max Kanaskar’s “BIG DATA TECHNOLOGY SERIES”
1st AI Boom : The golden years, early enthusiasm (1956-1974)
The first "AI Winter" (1974-1980): the period in which government funding for,
and interest in, artificial intelligence dropped off.
The second AI winter (1987-1993): support for AI research fell again
because of its high cost without efficient results.
2nd AI Boom : The emergence of intelligent agents (1993-2011)
23
New Trend – Algorithm Marketplace
• Algorithm Marketplaces Are Bringing the App Economy to Analytics
Source: Gartner (October 2015)
Top 5 Deep Learning Frameworks
Deep Learning Framework      Open      Stable
TensorFlow                   2015.11   2019.6
Keras                        2015.3    2019.8
MXNet                        2016(?)   2020.2
PyTorch                      2016.10   2019.4
Microsoft Cognitive Toolkit  2016.1    2019.4
24
Analysis vs. Analytics
Data Analysis
✓ process of inspecting
✓ cleaning
✓ transforming
✓ modeling data
✓ separation of a whole into its
component parts
✓ looks backwards over time, providing
marketers with a historical view of
what has happened
Data Analytics
✓ discovery
✓ interpretation
✓ communication of meaningful patterns
in data
✓ method of logical analysis
✓ looks forward to model the future or
predict a result
✓ functions and process (main focus)
✓ enterprise architecture, process
architecture
✓ data and reporting (main focus)
✓ information architecture, data
architecture
Assistance Only
25
Artificial Intelligence –
Why is artificial intelligence important?
26
Artificial Intelligence
• The Timeline
27
Artificial Intelligence
• How Artificial Intelligence Works
AI works by combining large amounts of data
with fast, iterative processing
and intelligent algorithms,
allowing the software to learn automatically
from patterns or features in the data.
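To make "data + fast, iterative processing + algorithm" concrete, here is a minimal sketch, not taken from the deck: a two-parameter model learns the pattern y = 2x + 1 from toy data by gradient descent. The data, the learning rate and the step count are all assumptions chosen for illustration.

```python
# Toy data generated from y = 2x + 1 (an invented example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = 0.0, 0.0        # model parameters: the software starts knowing nothing
learning_rate = 0.05   # assumed step size

for step in range(2000):  # iterative processing over the same data
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"learned w={w:.3f}, b={b:.3f}")
```

Nothing in the loop mentions the numbers 2 or 1; the parameters are recovered from the pattern in the data alone.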
Born from the vision of Turing and
Minsky that a machine could imitate
intelligent life, AI received its name,
mission, and hype from the conference
organized by McCarthy at Dartmouth
College in 1956.
< A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence >
28
AI - Present
http://www.scryanalytics.com/domains-in-which-artificial-intelligence-is-rivaling-humans/
parallel and
distributed
computing
open source
software
availability
of
Big Data
growing
collaboration
between
academia
and industry
Key reasons for this hyper-growth
After 2011
29
Future AI will be…
• Predictions on the Power of AI by 2035
Over the past several years, there has been a growing belief that AI is a limitless,
mystical force that is (or will soon be) able to supersede humans and solve any
problem.
For instance, Ray Kurzweil predicted, "artificial intelligence will reach human
levels by around 2029," and
Gray Scott stated, "there is no reason and no way that a human mind can
keep up with an artificial intelligence machine by 2035."
An analogous but more ominous sentiment was expressed by Elon Musk, who
wrote, "The pace of progress in artificial intelligence . . . is incredibly fast . . . The
risk of something seriously dangerous happening is in the five-year timeframe.
10 years at most," and later said, "with artificial intelligence we’re summoning
the demon."
http://www.scryanalytics.com/the-current-hype-cycle-in-artificial-intelligence/
30
Artificial Intelligence
• Why is artificial intelligence important?
AI automates repetitive learning and
discovery through data
AI adds intelligence to existing
products
AI adapts through progressive
learning algorithms to let the data do the
programming.
AI analyzes more and deeper data
using neural networks that have many hidden layers.
AI achieves incredible accuracy
through deep neural networks – which was previously impossible.
AI gets the most out of data
When algorithms are self-learning,
the data itself can become intellectual property.
https://www.sas.com/en_us/insights/analytics/what-is-artificial-intelligence.html
1950s–1970s
Neural Networks
Early work with neural networks
stirs excitement for “thinking
machines.”
1980s–2010s
Machine Learning
Machine learning becomes
popular.
Present Day
Deep Learning
Deep learning breakthroughs
drive AI boom.
31
Artificial Intelligence
• Important Sub-fields of Artificial Intelligence in 2010
http://www.scryanalytics.com/resurgence-of-artificial-intelligence-during-1983-2010/
32
AI Diagram
Deterministic Rules &
Process & Decisions
Robotics
Event Processing
Predictive Knowledge
Management
Natural Language
Processing(NLP)
Rules Engine & BPM
Robotic Process(RPA)
Chat Bots/Virtual Assist
Simple Event
Complex Event(CEP)
Deep Q&A System
Text to Speech
Translation
Speech to Text
Image to Text
Alerts
Likely Answers
Automated Speech
Automated Writing
Automated Grouping
Automate Repetitive
Task
Machine
Learning
(ML)
Image Recognition
Unsupervised
Learning
Reinforced
Learning
Supervised
Learning
Deep (NN)
Learning
Classic Models
(Bayes, GA,
Regression,
Trees, etc.)
Classify(Text, etc.)
Numeric Prediction
Probability Assessment
Optimization(LP, etc.)
Predictive
Recommendation
Scores, Actions,
Ranking, Forecasts
Examples of Main Areas Examples of Sub Areas Results
AI
Intelligence
Automated
Source : http://vincejeffs.com/ai-use-cases-crm/
Artificial Intelligence
33
AI enabled Analytics
• The Evolution of Business Intelligence
https://www.eckerson.com/articles/the-impact-of-ai-on-analytics-machine-generated-intelligence
Gartner Analytics Ascendancy Model (axes: Difficulty vs. Value)
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
34
Changing the subject to
cloud containers:
3 survey results
35
Why are Enterprises Adopting Containers?
• Top Drivers for Containerization
Source : Modernizing Applications with Containers in the Public Cloud, June 2019 IDC
Data/Data Analytics (AI)
▪ Big data/machine learning/AI initiatives
▪ IoT/edge computing
Digital Transformation
▪ Support for mobile initiatives
▪ Pursue multicloud/hybrid cloud strategy
▪ Pursue DX/new business innovation
Agile DevOps
▪ Increase developer productivity
▪ Modernize existing applications
▪ Increase app development speed/time to market
▪ Support cloud-native, microservices architecture
OPEX
▪ Reduce infrastructure costs/improve efficiency
▪ Reduce operations management costs
▪ Move off old/unsupported operating systems
Observability/Security
▪ Reliability/availability/scalability
▪ Improve security
36
2019 CONTAINER ADOPTION BENCHMARK SURVEY
• IN WHAT USE CASES WILL YOUR CONTAINERS BE EMPLOYED?
Enterprises are using
containers for
everything from
modernizing legacy
applications to big
data analytics.
Enterprise applications, whether cloud-
native or traditional, need databases to
store and manage persistent data.
Databases have emerged this year as a
high-priority container use case for the
enterprise.
37
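IDG: Cloud, Act 2 Is Approaching
• The direction is hybrid, the goal is digital innovation
2019, IDG Korea, survey of 660 IT professionals in Korea. The survey asked which types of cloud
respondents have adopted, which types they plan to adopt, how much cloud usage for work will change,
how much their budgets were increased, and how much they use AI at work.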
38
Container & AI
• Tech trends in Data & Analytics
Source : Gartner, Gartner Top 10 Data and Analytics Trends, November 5, 2019 https://www.gartner.com/smarterwithgartner/gartner-top-10-data-analytics-trends/
Continuous intelligence relies on platforms, architectures, and software that allow
organizations to collect, organize, and analyze data to enable fast actions in response to real-
time events.
39
Container & AI
• Continuous Intelligence
Continuous intelligence analytics solutions are platforms that ingest streaming
data, perform analytics, and embed code, machine learning models, and rules to
enable the real-time enterprise.
Continuous Intelligence Analytics
Platform
Real-time
Enterprise
ingest
streaming data
machine learning
models, rules
embed code
perform analytics
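The four platform functions named on this slide (ingest streaming data, perform analytics, embed rules or models, act in real time) can be sketched in a few lines. This is only an illustration under our own assumptions: the function name, the window size, the threshold rule and the sensor readings are all invented, and a real platform would replace the rule with a trained model.

```python
from collections import deque
from statistics import mean


def continuous_intelligence(stream, window_size=5, threshold=1.5):
    """Yield (value, alert) for each event as it arrives."""
    window = deque(maxlen=window_size)  # rolling window of recent events
    for value in stream:                # ingest streaming data
        # Embedded rule: alert when a reading exceeds threshold x rolling mean.
        alert = len(window) == window_size and value > threshold * mean(window)
        yield value, alert              # available for action immediately
        window.append(value)


readings = [10, 11, 10, 12, 11, 30, 11, 10]  # invented sensor readings
alerts = [v for v, is_alert in continuous_intelligence(readings) if is_alert]
print(alerts)
```

Because the function is a generator, each event is evaluated as it is read rather than after the whole batch has been collected.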
40
Container & AI
• Continuous Intelligence
Continuous intelligence platforms must have an AI that easily integrates a plethora of analytical
tools and machine learning models to detect patterns of events. The goal is to detect urgent
situations to act upon automatically or provide information to real-time dashboards for human
decision-makers.
https://www.forbes.com/sites/forbestechcouncil/2018/10/18/what-is-continuous-intelligence/#326dd4bf7d25
All Data,
Pervasive Access
Continuous
Information
Adaptable
Actions
Complex &
Fast-Moving Data
Machine
Learning
Unconstrained
Exploration
Continuous intelligence
Build Your Own Cloud
41
Match made in Heaven – Kubernetes & AI
Many companies are starting to embed continuous
intelligence (CI) using artificial intelligence (AI) and
machine learning (ML) into business processes.
And the trend is expected to continue. Gartner
notes that by 2022, more than half of major new
business systems will incorporate CI that uses real-
time context data to improve decisions.
4 Things to Know About Using Kubernetes for AI, by Salvatore Salamone, January 2, 2020, rtinsights.com/4-things-to-know-about-using-kubernetes-for-ai
42
Kubernetes and containers and AI
Why Kubernetes and containers are the
perfect fit for machine learning (AI)
43
1 page Summary
44
Two Key Drivers to Containerization
Machine
imitate human
intelligence
Data is the
new
application
“data is the new application. …
Data is now fundamental to how
people work and the most
successful companies have
intelligently integrated it into
everyone’s daily workflow.”
Alan Turing in 1950 : can a
machine imitate human
intelligence? In his seminal
paper “Computing
Machinery and Intelligence,”
he formulated a game,
called the imitation game.
Alan Turing
Data
+
AI
Understanding Data, Data
Analytics, AI (II)
2020. 6. 24
Chun MK
Contents
□ Sectors of The Economy Affected by AI
□ Machine Learning in Practice
□ Anatomy of Machine Learning
□ Machine Learning Model Development Life Cycle
□ Containers as an enabler of AI
□ Machine Learning Platform Sample
□ Summary
47
Sectors of The Economy Affected by AI
• face detection and verification : CNNs
• extracting text from financial documents : LSTM
• Fraud detection : CNNs
• anomaly detection : Variational Autoencoders
• Chatbots
• Collaborative Filtering for recommendations
• NLP for mining descriptive text
• CNNs for apparel detection
• Physical robots powered by Deep Reinforcement Learning
• … …
48
Sectors of The Economy Affected by AI
• Top Use Cases by Function
https://appliedai.com/data/use-cases/1
MARKETING CUSTOMER SERVICE
SALES IT
OPERATIONS FINANCE
HR HEALTHTECH
• Retargeting
• Recommendation
Personalization
• Social Analytics &
Automation
• Predictive Sales
• Sales Data Input
Automation
• Sales Forecasting
• Customer Service
Chatbot (e2e
Solution)
• Intelligent Call
Routing
• Call Analytics
• Analytics Platform
• Natural Language
Processing Library
• Analytics &
Predictive
Intelligence for
Security
• Robotic Process
Automation (RPA)
• Predictive
Maintenance
• Manufacturing
Analytics
• Hiring
• Performance
Management
• HR Analytics
• Fraud Detection
• Financial Analytics
Platform
• Credit
Lending/Scoring
• Patient Data
Analytics
• Personalized
Medications and
Care
• Drug Discovery
NEW FIELD
• TELECOM
• IOT(IIOT)
• SELF-DRIVING CAR
• …
49
Machine
Learning
Impacts
Machine Learning in Practice
• Machine Learning Use example
https://squadex.com/insights/top-machine-learning-use-cases-business/
business decisions
streamlines work processes
reduces overheads
advances our everyday lives
50
Machine Learning in Practice
• 9 Practical Machine Learning Use Cases
▪ Image & Video Recognition
▪ Speech Recognition
face recognition, object detection,
text detection (printed and
handwritten), logo and landmark
detection, visual search, reverse
image search, image composition,
and image curation
search engines (e.g. Google,
Baidu), virtual digital
assistants (e.g. Alexa,
Cortana, Siri, Google
Assistant, AliGenie), smart
speakers (e.g. Amazon Echo,
Google Home), and voice-
activated applications (e.g.
Uber, Evernote)
51
Machine Learning in Practice
• 9 Practical Machine Learning Use Cases
▪ Fraud Detection
▪ Patient Diagnosis
identifying cancerous
tumors and skin cancer,
diagnosing diabetes, and
most importantly, predicting
disease progression
52
Machine Learning in Practice
• 9 Practical Machine Learning Use Cases
▪ Anomaly Detection
manufacturing
to increase
productivity
and efficiency,
reduce costs,
and optimize
downtime
credit card
fraud, clinical
diagnosis,
structural
defects are
anomalies
53
Machine Learning in Practice
• 9 Practical Machine Learning Use Cases
▪ Inventory Optimization
▪ Demand Forecasting
54
Machine Learning in Practice
• 9 Practical Machine Learning Use Cases
▪ Recommendation Systems
▪ Intrusion Detection
55
Anatomy of Machine Learning
• What is Machine Learning?
Machine learning is a subset of artificial intelligence (AI) that
gives machines the ability to learn automatically and improve
from experience without being explicitly programmed.
56
Anatomy of Machine Learning
• Types of Machine Learning
Supervised Learning: “Please teach me!”
Unsupervised Learning: “I can learn by myself!”
Reinforcement Learning: “My way or the highway”
57
Anatomy of Machine Learning
• Supervised Machine Learning
➢ The input, the output, the algorithm, and the scenario are all provided by humans (the supervisor).
▪ Makes the machine learn explicitly
▪ Data with clearly defined output is given
▪ Direct feedback is given
▪ Predicts outcome/future
▪ Resolves classification and regression problems
▪ Applications include risk evaluation and sales forecasting
Labelled Data
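As a small illustration of learning from labelled data, the sketch below classifies a new point by copying the label of its closest training example (a 1-nearest-neighbour classifier). The technique is our choice for illustration, not one named on the slide, and the measurements and labels are invented.

```python
import math

# Labelled training data: (height_cm, weight_kg) -> label. Values are invented.
training = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]


def predict(point):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training, key=lambda item: math.dist(item[0], point))
    return nearest[1]


print(predict((152.0, 53.0)))
print(predict((182.0, 88.0)))
```

The human supplies both the inputs and the correct outputs; the program only generalizes from them to unseen inputs.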
58
Anatomy of Machine Learning
• Unsupervised Machine Learning
• In unsupervised learning, as you might guess, the data is unlabeled and the system tries to learn
without a teacher.
• Unsupervised learning means discovering the hidden patterns or similarities in the
dataset and grouping or labeling them without any human assistance.
• Although unsupervised learning has not yet been implemented on a wide scale, this
methodology is expected to shape the future of Machine Learning and its possibilities.
▪ Machine understands the data (identifies patterns/structures)
▪ Evaluation is qualitative or indirect
▪ Does not predict/ find anything specific
▪ Applications are recommendation systems and anomaly detection
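Grouping unlabelled data can be sketched with a tiny k-means routine. This is a simplified one-dimensional version written for illustration; the data and the starting centers are invented, and production code would normally use a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


values = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]   # no labels are provided
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers, clusters)
```

No label is ever given; the two groups emerge from the distances between the values.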
59
Anatomy of Machine Learning
• Reinforcement Machine Learning
• In reinforcement learning, the algorithm discovers through trial and error which actions
yield the greatest rewards. This type of learning has three primary components: the agent
(the learner or decision-maker), the environment (everything the agent interacts with),
and actions (what the agent can do).
▪ An approach to AI
▪ Reward-based learning
▪ Learning from positive and negative reinforcement
▪ Machine learns how to act in a certain environment
▪ To maximize rewards
▪ Applications include self-driving cars and gaming
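The agent, environment, action and reward loop described above can be sketched with a two-action "bandit" problem and an epsilon-greedy agent. This is a deliberately simple stand-in for reinforcement learning; the reward probabilities, the exploration rate and the seed are assumptions made for the example.

```python
import random

random.seed(0)
# Environment: hidden reward probability of each action (invented values).
reward_prob = {"left": 0.2, "right": 0.8}
value = {"left": 0.0, "right": 0.0}   # agent's running reward estimates
count = {"left": 0, "right": 0}
epsilon = 0.1                         # share of steps spent exploring

for _ in range(5000):
    # Explore occasionally, otherwise exploit the best-known action.
    if random.random() < epsilon:
        action = random.choice(list(value))
    else:
        action = max(value, key=value.get)
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    count[action] += 1
    # Incremental average of the rewards observed for this action.
    value[action] += (reward - value[action]) / count[action]

print(value, count)
```

The agent is never told which action is better; it finds out by trial and error and ends up choosing the higher-reward action most of the time.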
60
Anatomy of Machine Learning
• Supervised vs. Unsupervised Machine Learning Characteristics
Learning Goal
▪ Supervised: used to predict the result for a new input.
▪ Unsupervised: used to discover hidden patterns in a dataset.
Dataset Used
▪ Supervised: algorithms are trained using a labelled dataset.
▪ Unsupervised: algorithms are trained using an unlabelled dataset.
Human Assistance
▪ Supervised: the complete learning process happens under human supervision and assistance.
▪ Unsupervised: the whole learning process happens without human supervision.
Basic Types
▪ Supervised: two types, Classification and Regression.
▪ Unsupervised: two basic types, Clustering and Association.
Output
▪ Supervised: predicts the result.
▪ Unsupervised: finds hidden relationships and patterns.
Accuracy
▪ Supervised: produces more accurate results.
▪ Unsupervised: results are less accurate than those of supervised learning.
61
Anatomy of Machine Learning
• Summary
62
Machine Learning Model Development Life Cycle
• Machine Learning Model Development Life Cycle
https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d
Business Requirements
& Hypothesis Designing
Exploratory Data
Analysis
Data Pre-Processing &
Data Cleaning
Feature Engineering &
Feature Selection
Model
Deployment
Model Performance
Model Hyper-
Parameter Tuning
Machine Learning
Model Selection
Visualizations
Process of Machine Learning
Data Platform
63
Machine Learning Model Development Life Cycle
• Machine Learning Model Development Life Cycle
https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d
Business Requirements
& Hypothesis Designing
Exploratory Data
Analysis
Data Pre-Processing &
Data Cleaning
Feature Engineering &
Feature Selection
exploration
the critical process of performing initial
investigations on data so as to discover
patterns, to spot anomalies, to test
hypothesis and to check assumptions
with the help of summary statistics and
graphical representations*
good data exploration can provide the
useful insights within the data as well as
solve almost 70% of the problem in the
EDA stage only.
Data Ready
• Missing Value Checks & Missing
Value Imputations
• Removal of the unwanted data
• Data Optimization on the basis of
Domain or Business
recommendations
• Outlier Detection & Removal
• Dimension Reduction
• Duplicate records removal
identify the most important features
within a dataset
• Correlation Checks or Collinearity
Checks
• Zero-Variance Checks
• Principal Component Analysis or PCA
• Categorical Data Encoding
• Data Normalization
• Data Standardization or Scaling
• Log Transformations
the recognition of a problem and the
idea that machine learning could
potentially be used to solve it
collaboration between domain experts
and machine learning experts
* Method : numeric summaries, aggregations, distributions, densities, reviewing all the levels of factor variables, applying
general statistical methods, exploratory plots, and expository plots
spin up clusters when needed and spin them back down when done
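Two of the preparation steps listed on this slide, missing value imputation and data standardization, can be sketched as follows. The column of values is invented, and mean imputation is only one of several possible strategies.

```python
from statistics import mean, pstdev

raw = [4.0, None, 6.0, 8.0, None, 2.0]   # None marks a missing value (invented data)

# Missing value imputation: fill gaps with the mean of the observed values.
observed = [v for v in raw if v is not None]
fill = mean(observed)
imputed = [fill if v is None else v for v in raw]

# Standardization: rescale to zero mean and unit (population) standard deviation.
mu, sigma = mean(imputed), pstdev(imputed)
standardized = [(v - mu) / sigma for v in imputed]

print(imputed)
print([round(v, 3) for v in standardized])
```

After standardization, features measured on different scales can be compared or combined without one dominating the others.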
64
Machine Learning Model Development Life Cycle
• Machine Learning Model Development Life Cycle
https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d
Model Deployment
Model Performance
Model Hyper-
Parameter Tuning
Machine Learning
Model Selection
Visualizations
training
deployment
type of business problem
• Decision Tree
• Random Forest
• Regression
• K-Means or Clustering
• K-Nearest Neighbors or KNN
• Support Vector Machine
• Logistic Regression
• Naive Bayes
• Artificial Neural Networks
tested on the unseen data before
deployed into the field or
production environments
• Confusion Matrix
• Area Under the Curve or AUC
• Precision & Recall
• Sensitivity & Specificity
• F1-Scores
• R-Square
• Gini Values
• KS Statistics
• Tableau
• Power BI
• Splunk
• Dynatrace
• QlikView
• Grafana
• R-Shiny
• Plotly
an iterative process which actually
consumes a lot of time after the
Data Processing step
• Cross-Validation
• Outlier or Noisy data removal
• e.g. the topology and size of a
neural network
• Should not run into
over-fitting
model object can be
deployed using various
methods
• Rest APIs
• Micro-Services
flexibility to create distributed training environments across multiple host servers, allowing for
better utilization of infrastructure resources
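Several of the performance measures named on this slide (confusion matrix, precision and recall, sensitivity and specificity, F1-score) follow directly from four counts. The sketch below assumes a binary classifier with labels 0 and 1; the function names and the sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Derive precision, recall (sensitivity), specificity and F1."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical predictions on unseen data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Computing these on held-out data, rather than on the training set, is what the "tested on the unseen data" note on the slide refers to.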
Machine Learning workflow & elements
• Machine Learning Workflow and elements
ML
code
Configuration
Data
Verification
Data Collection
Feature Extraction
Machine
Resource
Management
Analysis Tools
Automation
Testing and
Debugging
Process
Management Tools
Metadata Management Monitoring
Serving
Infrastructure
Model Centric Data Centric
Model and Data-centric elements of ML systems
Data (quantity and quality — ML ready), ML technical debt, MLOps vs DevOps and Enterprise ML
processes and skills remain the main barriers to adoption.
ML Workflow — % allocation of resources
https://tecton.ai/blog/devops-ml-data/
https://towardsdatascience.com/state-of-the-machine-learning-ai-industry-9bb477f840c8
Containers as an enabler of AI
• Background for using Kubernetes in Machine Learning workloads
https://platform9.com/blog/kubernetes-for-machine-learning/
Why are companies using containers to facilitate the development and deployment of AI apps?
Machine Learning Model
Iterative Process
Compute Intensive
Training Simultaneously
Enterprise Needs
Applications run on
Any Server
Any Cloud Provider
Any Operating system
Anywhere
Containers as an enabler of AI
• Background for using Kubernetes in Machine Learning workloads
https://platform9.com/blog/kubernetes-for-machine-learning/
the challenge
of scalability
the challenge of provisioning a computational infrastructure
that can support a resource-intensive machine-learning pipeline
apply the flexibility of cloud-native development and
infrastructure to machine-learning applications
container orchestrator:
Kubernetes
Containers as an enabler of AI
• Background for using Kubernetes in Machine Learning workloads
https://platform9.com/blog/kubernetes-for-machine-learning/
Auto-scaling
Data
Management
Multitenancy
Abstraction GPU Support
machine learning workflow works best when each
step in the process can be scaled-up when needed,
and scaled back down when done
automate that scaling and support granular scaling
easy to reproduce the environment necessary to
support computation on GPUs (Kubernetes on Nvidia
GPUs)
single access point for diverse data sources
and manages volume lifecycle
‘namespaces’ feature, which enables
a single cluster to be partitioned into
multiple virtual clusters. → own resource
quotas and access control policies
Data pipelines abstraction
Infrastructure abstraction
A containerized cloud-based machine learning workflow orchestrated by Kubernetes meets
many of the challenges posed by the computational requirements of machine learning.
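The auto-scaling idea on this slide (scale a step up when needed, back down when done) can be illustrated with the proportional rule that the Kubernetes Horizontal Pod Autoscaler documentation describes: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a simplified illustration only. The function name and the min/max bounds are assumptions, and the real autoscaler also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Load above target: scale up.
scale_up = desired_replicas(2, current_metric=90, target_metric=60)
# Load far below target: scale back down to the minimum.
scale_down = desired_replicas(4, current_metric=10, target_metric=60)
# Very high load: the result is capped at max_replicas.
capped = desired_replicas(4, current_metric=300, target_metric=60)
```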
Containers as an enabler of AI
• Kubernetes and containers : perfect fit for machine learning
https://jaxenter.com/containers-machine-learning-165203.html
three phases of an AI project where containers are beneficial: exploration, training, and deployment.
Exploration training deployment
Environment
Conditions
(experiment with different data sets
and machine learning algorithms to
find the right data and algorithms to
predict outcomes with maximum
accuracy and efficiency)
• Speed of iteration
• Ability to run tests in parallel
(AI model needs to be trained against
large volumes of data across different
platforms to maximize accuracy and
minimize resource utilization)
• Highly compute-intensive
• compute and storage separate
• combine several models that serve
different purposes
Appropriate
action
(Containers provide a way to package
up these libraries for specific domains,
point to the right data source and
deploy algorithms in a consistent
fashion)
• isolated environment - customize
for their exploration
• manage multiple sets of libraries
and frameworks in a shared
environment
• scale workloads up and down
• A distributed cloud environment
also allows compute and storage
to be managed separately, which
cuts storage utilization and
therefore costs.
• run their models on different types
of hardware, such as GPUs and
specialized processors
• Containers allow each model to be
deployed as a microservice
• Microservices also make it easier to
deploy models in parallel in
different production environments
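To make "each model deployed as a microservice" concrete, the sketch below wraps a model behind a small HTTP endpoint using only the Python standard library. Everything specific in it is an assumption for illustration: the linear model, its weights, the /predict path and the port. In practice this process would be packaged into a container image and run by the orchestrator.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained" parameters of a tiny linear model.
WEIGHTS = [0.4, 0.6]
BIAS = 0.1


def predict(features):
    """Score one feature vector with the linear model."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict with a JSON body like {"features": [1.0, 2.0]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Blocking call; a container entrypoint would invoke this."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Because each model lives in its own service, several such containers can run side by side in different production environments.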
Machine Learning Platform Sample
• Sample: Uber’s Machine Learning Platform, Michelangelo
Uber’s ML Platform — Michelangelo, source: Uber engineering https://eng.uber.com/michelangelo-machine-learning-platform/
Manage data
Data preparation pipelines push data into the Feature Store tables and training data repositories.
Machine Learning Platform Sample
• Sample: Uber’s Machine Learning Platform, Michelangelo
Uber’s ML Platform — Michelangelo, source: Uber engineering https://eng.uber.com/michelangelo-machine-learning-platform/
Train models Evaluate models
Model training jobs use Feature Store and training data repository data sets to train models and then push them
to the model repository.
Machine Learning Platform Sample
• Sample: Uber’s Machine Learning Platform, Michelangelo
Uber’s ML Platform — Michelangelo, source: Uber engineering https://eng.uber.com/michelangelo-machine-learning-platform/
Deploy models
Models from the model repository are deployed to online and offline containers for serving.
Machine Learning Platform Sample
• Sample: Uber’s Machine Learning Platform, Michelangelo
Uber’s ML Platform — Michelangelo, source: Uber engineering https://eng.uber.com/michelangelo-machine-learning-platform/
Manage data Train models Evaluate models Deploy models
Make predictions
Monitor predictions
Online and offline prediction services use sets of feature vectors to generate predictions.
Summary
Summary
• Challenges of speed and scale across the ML life cycle
Ensure
Seamless
Collaboration
Train
Deploy
Build
monitor
Data prep
Ever-changing, expanding open
source ecosystem
Infrastructure and model
performance
Seamless deployment and update of
a variety of models
Access to scalable infrastructure
on demand
Ever-increasing volume, variety, and
velocity of data
Now, as containers grow in popularity and AI adoption enters the mainstream,
enterprises are starting to leverage containerization
to gain flexibility, portability, and reliability for the AI and machine learning lifecycle.
Although containers are great at making applications flexible and portable, it is challenging to
manage multiple containers in a complex system.
That's where Kubernetes comes in.
3rd Seminar
How to Design AI functions
to the Cloud Native Infra
2020. 7. 15
Chun MK
Contents
□ What is Cloud Native?
□ What is Cloud Native ML(AI)?
□ Machine Learning operations Infrastructure
□ What is Kubeflow?
□ Why Use Kubeflow?
□ Summary
What is Cloud Native?
• CNCF’s Cloud Native Trail Map - Fundamentals of Cloud Native systems
#1 CONTAINERIZATION: application to run in any computing environment
#2 CI/CD: to bring all the changes in the code to the container automatically
#3 ORCHESTRATION: need container orchestration to manage the container lifecycles
#4 OBSERVABILITY & ANALYSIS: set up some of them like logging, tracing, metrics etc.
#5 SERVICE MESH: to enable more complex operational requirements, service discovery, health, routing, A/B testing etc.
#6 NETWORKING & POLICY: define flexible networking layers based on your requirements
#7 DISTRIBUTED DATABASE: need more scalability and resiliency
#8 MESSAGING: required sometimes too
#9 CONTAINER REGISTRY & RUNTIMES: store all your containers, also enable image scanning and signing if required
#10 SOFTWARE DISTRIBUTION: need a secure software distribution
everything after
step #3 is optional
a recommended
process for leveraging
open source, cloud
native technologies
What is Cloud Native?
• CNCF Cloud Native Definition v1.0 Approved by TOC: 2018-06-11
Cloud native technologies empower organizations to build and run scalable
applications in modern, dynamic environments such as public, private, and
hybrid clouds.
▪ Containers,
▪ service meshes,
▪ microservices,
▪ immutable infrastructure, and
▪ declarative APIs exemplify this approach.
https://github.com/cncf/toc/blob/master/DEFINITION.md
Cloud native is not just deploying your application on the cloud;
it is about taking full advantage of the cloud.
“Cloud native is a different way of thinking: we first need to make up our minds, not just the
systems, to utilize the full benefits of cloud.”
What is Cloud Native?
• What it means to be Cloud Native approach
https://medium.com/developingnodes/what-it-means-to-be-cloud-native-approach-the-cncf-way-9e8ab99d4923
Cloud-native is an approach to building and running
applications that exploit the advantages of the cloud computing
delivery model. Cloud-native is about how applications are
created and deployed, not where. … It’s appropriate for both
public and private clouds.
Why did Google donate
Kubernetes to the CNCF?
Google has been using
containers for many years and
they led the Kubernetes project
which is a leading container
orchestration platform.
But alone they can’t really
change the broad perspective
in the industry around modern
applications.
So there was a huge need for
industry leaders to come
together and solve the major
problems facing the modern
approach.
“the cloud isn’t a place, it’s a way of doing IT”
by Michael Dell
What is Cloud Native?
• Cloud Native Fundamentals & Container Orchestration Definition
Container
Orchestration
Resource Management Scheduling Service Management
Configuring
scheduling
traffic routing
Availability
deployments
Provisioning
Load balancing
Scaling
Allocation of
resources
Securing
Health monitoring
service discovery
configuration of applications
Cloud
native
fundamentals
Container registry &
runtimes
Networking & policy Software distribution
Distributed database Docker containerization Service mesh
CI/CD Orchestration Observability & analysis
messaging
automate the following tasks at scale
Basic Step Monitoring Object
What is Cloud Native ML(AI)?
• Why Cloud Native Machine Learning and AI
As big data gets more complex,
companies are struggling to accommodate
the storage and computing
needs of average organizations,
much less massive enterprises.
https://medium.com/@ODSC/the-benefits-of-cloud-native-ml-and-ai-b88f6d71783
This is where cloud-native ML and AI comes into play.
What is Cloud Native ML(AI)?
• Inherent issues in machine learning
https://www.alibabacloud.com/blog/build-a-machine-learning-system-using-kubernetes_595961#
common software development problems
data-driven features of machine learning
Complex
Machine Learning Workflow
▪ workflow becomes longer
▪ data versions are out of control
▪ experiments cannot be easily traced
▪ results cannot be conveniently reproduced
▪ costly to iterate the model
• Google TensorFlow Extended platform
• Facebook FBLearner Flow platform
• Uber Michelangelo platform
internal infrastructure of these enterprises
Google has extensive experience in
building machine learning workflow
platforms. Its TensorFlow Extended
platform supports Google's core businesses
such as search, translation, and video
playback. More importantly, Google has a
profound understanding of engineering
efficiency in the machine learning field.
Google's Kubeflow team made Kubeflow
Pipelines open-source at the end of 2018.
Kubeflow Pipelines is designed in the same
way as Google's internal TensorFlow
Extended machine learning platform. The
only difference is that Kubeflow Pipelines
runs on the Kubernetes platform while
TensorFlow Extended runs on Borg.
TensorFlow Extended
machine learning platform
What is Cloud Native ML(AI)?
• Inherent issues in machine learning
verification splitting
hyperparameter
tuning
more
observation
model
serving
data
loading
processing
feature
engineering
model
training
model
verification
Before a model ends up in production, there are potentially many steps required to
build and deploy an ML model
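The chain of steps on this slide can be read as an ordered pipeline in which each step consumes the output of the previous one. The toy sketch below only illustrates that ordering. The step bodies are hypothetical stand-ins (the "model" is just the training mean), and a real workflow engine would run each step as its own container.

```python
def run_pipeline(steps, data=None):
    """Run the named steps in order, feeding each one the previous result."""
    executed = []
    for name, step in steps:
        data = step(data)
        executed.append(name)
    return data, executed


# Hypothetical stand-ins for the stages shown on the slide.
steps = [
    ("data loading", lambda _: [4.0, None, 1.0, 3.0, 2.0]),
    ("processing", lambda rows: [r for r in rows if r is not None]),
    ("splitting", lambda rows: {"train": rows[:3], "test": rows[3:]}),
    ("model training",
     lambda d: {**d, "model": sum(d["train"]) / len(d["train"])}),
    ("model verification",
     lambda d: {**d, "error": abs(d["test"][0] - d["model"])}),
]

result, executed = run_pipeline(steps)
```

Keeping the steps explicit and ordered is what makes experiments traceable and results reproducible.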
Machine Learning operations Infrastructure
• The fundamental ML platform components
https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
“machine learning workloads are more prone to maintenance tasks”
Machine Learning operations Infrastructure
• Components inside the feature store
loading the raw data inside the feature
store storage (Batch and online)
actually computing the features. Computing time
performance is critical when designing this
component. (Batch and online)
features for downstream processing (Batch
and online)
https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
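The three responsibilities described on this slide (loading raw data, computing features, exposing features downstream) can be sketched as one small in-memory class. This is an illustration under assumptions, not the design of FEAST or of the cited article: the class, the event shape and the feature names are hypothetical, and the batch versus online distinction is ignored.

```python
class FeatureStore:
    """Toy feature store: ingest raw events, compute features, serve them."""

    def __init__(self):
        self._raw = {}       # entity id -> list of raw events
        self._features = {}  # entity id -> dict of computed features

    def ingest(self, entity_id, event):
        """Load a raw event into the store's storage."""
        self._raw.setdefault(entity_id, []).append(event)

    def compute(self):
        """Derive features from the raw events (the costly step)."""
        for entity_id, events in self._raw.items():
            amounts = [e["amount"] for e in events]
            self._features[entity_id] = {
                "order_count": len(amounts),
                "avg_amount": sum(amounts) / len(amounts),
            }

    def serve(self, entity_id, names):
        """Return a feature vector for downstream training or prediction."""
        row = self._features[entity_id]
        return [row[name] for name in names]


store = FeatureStore()
store.ingest("user-1", {"amount": 10.0})
store.ingest("user-1", {"amount": 30.0})
store.compute()
vector = store.serve("user-1", ["order_count", "avg_amount"])
```

Training and prediction both read from the same serve call, which keeps the features they see consistent.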
Machine Learning operations Infrastructure
• Components inside the training rig
The objective of the training rig is to find and produce the best model (at a specific point in time) given:
(i) an initial model architecture,
(ii) a set of tunable hyperparameters, and
(iii) a historical labeled feature set.
check for retrain
conditions
e.g. learning rate,
optimisers ...
e.g. number of layers
detect when it is needed
to re-train the current
golden model
discover potential new
models by continuously
optimising (or attempting to)
the current golden model.
resource-intensive
✓ actually performs the training (performance
considerations should be taken into account when
engineering the system)
✓ generate the model signature,
✓ clearly defining the input and output interfaces
✓ any initialisation task
issuing models ready for production
work (extensive testing should be
performed on this step)
all the metadata
associated with the
training phase (model
repository, parameters,
experiments ..)
https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
Output
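The training rig's search for a better "golden model" can be sketched as a plain grid search over hyperparameters (learning rate) and architecture parameters (number of layers). The scoring function below is a hypothetical stand-in for a real, resource-intensive training run, and exhaustive grid search is only one possible strategy. Tools such as Katib offer others.

```python
from itertools import product


def train_and_score(learning_rate, num_layers):
    """Hypothetical stand-in for a training run; returns a validation score."""
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(num_layers - 3) * 0.05


def search(grid, current_golden=None):
    """Try every combination and keep the best model seen so far."""
    names = sorted(grid)
    best = current_golden
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        # A challenger replaces the golden model only if it scores higher.
        if best is None or score > best["score"]:
            best = {"params": params, "score": score}
    return best


grid = {"learning_rate": [0.1, 0.01, 0.001], "num_layers": [2, 3, 4]}
golden = search(grid)
```

Passing the existing golden model as current_golden models the "continuously optimising" loop: a new experiment is promoted only when it beats the incumbent.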
Machine Learning operations Infrastructure
• Components inside the prediction rig (execute inferences)
apply low level
specific operations to
a potential reusable
and more abstract
feature
to route requests to a
particular prediction
endpoint
The horsepower of the prediction rig
design for classical non-functional
requirements such as performance, scalability
or fault tolerance
Low latency key value store to
quickly respond to re-entrant
queries. It must implement the
classical cache mechanisms
As A/B tests take place
analyse metadata and particularly ground
truth data in the feature store to suggest a
replacement of the golden model with one of
the experiments
ensure cache and memory warm-ups
when a cold-start situation
happens (e.g. new model
promotion)
implement model
explainability logic (e.g.
Anchors, CEM ..) and
returns it for a given
request
centralises all the
metadata associated with
the prediction phase (live
experiments performance,
prediction data stats …)
https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
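Two of the prediction-rig components described here, the request router used for A/B tests and the low-latency cache, can be sketched together. The class, the hash-based bucketing and the 10% challenger share are assumptions made for illustration. The cited article does not prescribe an implementation.

```python
import hashlib


class PredictionRig:
    """Routes requests between a golden and a challenger model, with a cache."""

    def __init__(self, golden, challenger, challenger_share=0.1):
        self.models = {"golden": golden, "challenger": challenger}
        self.challenger_share = challenger_share
        self.cache = {}  # (model name, features) -> prediction

    def route(self, request_id):
        """Deterministically map a request id to one prediction endpoint."""
        digest = hashlib.sha256(request_id.encode()).digest()
        bucket = digest[0] / 256  # value in [0, 1)
        return "challenger" if bucket < self.challenger_share else "golden"

    def predict(self, request_id, features):
        """Serve from the cache when the same query is seen again."""
        name = self.route(request_id)
        key = (name, tuple(features))
        if key not in self.cache:
            self.cache[key] = self.models[name](features)
        return name, self.cache[key]


# Hypothetical models: plain functions over a feature vector.
rig = PredictionRig(golden=lambda f: sum(f), challenger=lambda f: 2 * sum(f))
name, value = rig.predict("req-1", [1, 2])
```

Hashing the request id keeps routing stable, so a repeated request always reaches the same model and its cached answer.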
Machine Learning operations Infrastructure
• ML Platform open source instantiation
FEAST* and kubeflow integration is currently work in progress
On-Prem Infrastructure Cloud
Kubernetes
Feature Store Training rig Prediction rig Metadata
KF Serving
FEAST
MDML
KF pipeline
Distributed training
operators
Katib NAS and HP
*
kubeflow already packages all those components in a nice way
https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
NAS(neural architecture search)
General Machine Learning Workflow
• Machine Learning Workflow Components
AI (ML/DL) Workflow
AI (ML/DL) core
Data Algorithm set
structured
semi-
structured
Un-
structured
ingestion analysis
transform validation
splitting Feature store
Ad-hoc
training
Building a
model
Model
validation
Hp tuning
Distributed
training
Training at
scale
Roll-out serving logging monitoring
Data
versioning
Experiment
tracking
Kubeflow(+ecosystem)
Data AI code
Real world
What is Kubeflow?
• Kubeflow is the machine learning toolkit for Kubernetes
Kubeflow’s 1.0 applications that make up our develop, build, train, deploy critical user journey.
https://medium.com/analytics-vidhya/kubeflow-for-everyone-9b914d3f65b1
▪ Kubeflow UI, which is called the Central dashboard
▪ Jupyter notebook controller, for deploying and using Jupyter notebooks
▪ TensorFlow and PyTorch operators, for distributed training of models
▪ kfctl, the Kubeflow command-line interface for deployment and upgrades
▪ Profile controller, for multi-user support and management
https://medium.com/kubeflow/kubeflow-1-0-cloud-native-ml-for-everyone-a3950202751
➢ Graduating applications include: Kubeflow 1.0
➢ Develop, Build, Train, and Deploy with Kubeflow
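As an illustration of what the TensorFlow operator consumes, the sketch below builds a minimal TFJob custom resource as a Python dictionary and prints it as JSON. The field layout follows the kubeflow.org/v1 TFJob resource as we understand it. The job name, namespace, image and worker count are hypothetical, and a real job would normally add resource requests and other settings.

```python
import json


def tfjob_manifest(name, image, workers=2, namespace="kubeflow"):
    """Build a minimal TFJob resource for distributed training."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                # The operator expects this container name.
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


# Hypothetical image reference, for illustration only.
manifest = tfjob_manifest("mnist-train", "example.com/mnist-train:latest",
                          workers=3)
print(json.dumps(manifest, indent=2))
```

The printed document could then be submitted to a cluster where the training operator is installed, for example with kubectl apply.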
What is Kubeflow?
• Components of Kubeflow (Logical components that make up Kubeflow)
▪ Central Dashboard
The central user interface (UI) in Kubeflow
▪ Metadata
Tracking and managing metadata of machine learning
workflows in Kubeflow
▪ Jupyter Notebooks
Using Jupyter notebooks in Kubeflow
▪ Frameworks for Training
Training of ML models in Kubeflow
▪ Hyperparameter Tuning
Hyperparameter tuning of ML models in Kubeflow
▪ Pipelines
ML Pipelines in Kubeflow
▪ Tools for Serving
Serving of ML models in Kubeflow
▪ Multi-Tenancy in Kubeflow
Multi-user isolation and identity access management (IAM)
▪ Miscellaneous (Nuclio functions)
Miscellaneous Kubeflow components
Nuclio - High performance serverless for data processing and ML
What is Kubeflow?
• Kubeflow components in the ML workflow
Components of Kubeflow
model training
ML training operator
deploy the workflow to various clouds, local, and on-premises
platforms for experimentation and for production use
What is Kubeflow?
• Kubeflow is the machine learning toolkit for Kubernetes
Kubeflow’s goal is to make it easy for machine learning (ML) engineers and data scientists
to leverage cloud assets (public or on-premise) for ML workloads.
https://www.kubeflow.org/docs/started/kubeflow-overview/
[ Kubeflow as a platform(ML system) on top of Kubernetes ]
History
Kubeflow started as an open
sourcing of the way Google
ran TensorFlow internally,
based on a pipeline
called TensorFlow Extended.
It began as just a simpler
way to run TensorFlow jobs
on Kubernetes, but has since
expanded to be a multi-
architecture, multi-cloud
framework for running entire
machine learning pipelines.
Why Use Kubeflow?
• Kubeflow bridges the gap between AI workloads and Kubernetes
https://ubuntu.com/blog/data-science-workflows-on-kubernetes-with-kubeflow-pipelines-part-1
“Kubeflow Pipelines are a great way to build portable, scalable machine learning workflows.”
A machine learning workflow
managing ML workloads on top of Kubernetes is still a lot of specialized
operations work which we don’t want to add to the data scientist’s role.
Kubeflow bridges this gap between AI workloads and Kubernetes, making
MLOps more manageable.
Containers provide the right encapsulation, avoiding the need for debugging every time a developer
changes the execution environment, and Kubernetes brings scheduling and orchestration of
containers into the infrastructure.
Why Use Kubeflow?
• Reasons for using Kubeflow
▪ Deploying and managing a complex ML system at scale
▪ Experimentation with training an ML model
▪ End to end hybrid and multi-cloud ML workloads
▪ Tuning the model hyperparameters during training
▪ Continuous integration and deployment (CI/CD) for ML
“machine learning workloads are more prone to maintenance tasks”
Kubeflow Community User Survey Fall 2019
• What is your primary role?
N = 50
https://medium.com/kubeflow/kubeflow-community-user-survey-fall-2019-a84776c71743
▪ Kubeflow Community User Survey Fall 2019 Results
CNCF Survey 2019: conducted in October 2019,
received 1,337 responses
respondents from Europe (37%) and
North America (38%), followed by Asia
(17%)
majority of respondents (71%) were from
organizations with at least 100
employees, the largest portion of these
coming from enterprises with more than
5,000 employees (30%)
Two-thirds of the respondents were in
the software and technology industry
top job functions were software architect
(41%), DevOps manager (39%), and back-
end developer (24%)
https://www.cncf.io/blog/2020/03/04/2019-cncf-survey-results-are-here-deployments-are-growing-in-size-and-speed-as-cloud-native-adoption-becomes-mainstream/
Kubeflow Community User Survey Fall 2019
• Where do you run your AI/ML workloads? (Multiple select)
N = 33
Kubeflow Community User Survey Fall 2019
• What hardware do you use for your AI/ML workloads? (Multiple select)
Kubeflow Community User Survey Fall 2019
• Critical Kubeflow components that you use in your current workflows. (Multiple select)
Kubeflow Community User Survey Fall 2019
• What ML frameworks are typically used in your organization? -- spring survey response
Summary : AI needs data, lots of it
• AI needs data, lots of it
CEOs who have overlooked data-driven insights to follow intuition instead
Source: 2018 Global CEO Outlook, KPMG International
▪ the data is still very fragmented
▪ most of the current predictive models use only the
historic data and not the streaming (real-time) data
https://www.forbes.com/sites/andythurai/2020/07/06/ai-driven-enterprises/#128f2fea4cfd
More data leads to
better predictions
Thank you.

Contenu connexe

Tendances

Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the EnterpriseThe Hive
 
Future of jobs, big data & innovation
Future of jobs, big data & innovation Future of jobs, big data & innovation
Future of jobs, big data & innovation suresh sood
 
Mastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisMastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisTeradata Aster
 
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...Matt Stubbs
 
State of the State: What’s Happening in the Database Market?
State of the State: What’s Happening in the Database Market?State of the State: What’s Happening in the Database Market?
State of the State: What’s Happening in the Database Market?Neo4j
 
AI & Big Data Analytics : Innovation trends and use cases
AI & Big Data Analytics : Innovation trends and use casesAI & Big Data Analytics : Innovation trends and use cases
AI & Big Data Analytics : Innovation trends and use casesSarvesh Kumar
 
Systemof insight
Systemof insightSystemof insight
Systemof insightsuresh sood
 
Forecast of Big Data Trends
Forecast of Big Data TrendsForecast of Big Data Trends
Forecast of Big Data TrendsIMC Institute
 
Smart Data Webinar: Machine Learning Update
Smart Data Webinar: Machine Learning UpdateSmart Data Webinar: Machine Learning Update
Smart Data Webinar: Machine Learning UpdateDATAVERSITY
 
Big Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveBig Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveThe_IPA
 
Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Matt Turck
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Big data competitive landscape overview
Big data competitive landscape overviewBig data competitive landscape overview
Big data competitive landscape overviewBisakha Praharaj
 
Overview of analytics and big data in practice
Overview of analytics and big data in practiceOverview of analytics and big data in practice
Overview of analytics and big data in practiceVivek Murugesan
 
San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...DataWorks Summit
 

Tendances (20)

Big Data Overview
Big Data OverviewBig Data Overview
Big Data Overview
 
Datapreneurs
DatapreneursDatapreneurs
Datapreneurs
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
Future of jobs, big data & innovation
Future of jobs, big data & innovation Future of jobs, big data & innovation
Future of jobs, big data & innovation
 
Mastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisMastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and Analysis
 
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...
Big Data LDN 2017: Creating ROI from Big Data Investments - Monetizing your B...
 
State of the State: What’s Happening in the Database Market?
State of the State: What’s Happening in the Database Market?State of the State: What’s Happening in the Database Market?
State of the State: What’s Happening in the Database Market?
 
AI & Big Data Analytics : Innovation trends and use cases
AI & Big Data Analytics : Innovation trends and use casesAI & Big Data Analytics : Innovation trends and use cases
AI & Big Data Analytics : Innovation trends and use cases
 
Systemof insight
Systemof insightSystemof insight
Systemof insight
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
BIg Data Trends in 2016
BIg Data Trends in 2016BIg Data Trends in 2016
BIg Data Trends in 2016
 
Forecast of Big Data Trends
Forecast of Big Data TrendsForecast of Big Data Trends
Forecast of Big Data Trends
 
Smart Data Webinar: Machine Learning Update
Smart Data Webinar: Machine Learning UpdateSmart Data Webinar: Machine Learning Update
Smart Data Webinar: Machine Learning Update
 
Big Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveBig Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM Perspective
 
Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)
 
Big data analysis
Big data analysisBig data analysis
Big data analysis
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Big data competitive landscape overview
Big data competitive landscape overviewBig data competitive landscape overview
Big data competitive landscape overview
 
Overview of analytics and big data in practice
Overview of analytics and big data in practiceOverview of analytics and big data in practice
Overview of analytics and big data in practice
 
San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...
 

Similaire à How to design ai functions to the cloud native infra

Significant Changes in Digital Technology with ‘Manufacturing Innovation 3.0’...
Significant Changes in Digital Technology with ‘Manufacturing Innovation 3.0’...Significant Changes in Digital Technology with ‘Manufacturing Innovation 3.0’...
Significant Changes in Digital Technology with ‘Manufacturing Innovation 3.0’...Hong-Seok Kim
 

How to design ai functions to the cloud native infra

  • 1. Understanding Data, Data Analytics and AI(I) 2020. 6. 10 Chun MK (chunmk80@gmail.com)
  • 2. Contents □ Prologue : Can a Machine imitate Human Intelligence? □ Brief History of Artificial Intelligence □ Understanding (Big)Data □ History of Data Analytics (Now & Future) □ Summary
  • 3. Prologue: Can a machine imitate human intelligence?
  • 4. 3 Copyright © 2020 AcornSoft All Rights Reserved Imitation Game
  • 5. Can a machine imitate human intelligence? • Alan Turing’s Bombe* machine “Victory” (* Bombe: deciphers German Enigma-machine-encrypted secret messages). A wartime picture of a Bletchley Park Bombe. The Imitation Game: Object Tools Output Movie Cryptogram Solver Know the Enemy Encrypted secret messages Machine Algorithm New Information Data analysis Insight Big Data AI Value Data Science
  • 6. Can a machine imitate human intelligence? • Turing Test. In 1950, Alan Turing published his seminal article “Computing Machinery and Intelligence”, where he described how to create intelligent machines and, in particular, how to test their intelligence. This Turing Test is still considered today as a benchmark to identify the intelligence of an artificial system: if a human is interacting with another human and a machine and is unable to distinguish the machine from the human, then the machine is said to be intelligent.
  • 7. Can a machine imitate human intelligence? • The fathers of artificial intelligence: two persons. Marvin Minsky (1927~2016), American, co-founder of MIT's AI laboratory, author of Perceptrons. Alan Turing (1912~1954), English, the father of theoretical computer science and artificial intelligence. Rear view of the rebuilt Bombe: 213(h)*182(w)*60(d), 1 ton.
  • 8. Digital Transformation Era: Two Pillars. Pillar 1: Machines imitate human intelligence. Alan Turing in 1950: can a machine imitate human intelligence? In his seminal paper “Computing Machinery and Intelligence,” he formulated a game, called the imitation game. Pillar 2: Data is the new application. “Data is the new application. … Data is now fundamental to how people work and the most successful companies have intelligently integrated it into everyone’s daily workflow.”
  • 10. What is Data? Data is how we express observations in reusable form: a collection of facts. Examples: • Your friends' favorite holiday destination • The most common given names in your town • How people describe the smell of a new perfume. Discrete data can only take certain values (like whole numbers); continuous data can take any value (within a range): • Height (Continuous) • Weight (Continuous) • Petals on a flower (Discrete) • Customers in a shop (Discrete)
  • 11. What is Data? • Why is Data Important? Data is the basis of Information, Knowledge, and Value. Pyramid, top to bottom: VALUE, KNOWLEDGE, INFORMATION, DATA.
  • 12. What is Data? • Data Generators = Increasing Rapidly. Source : KPCB INTERNET TRENDS 2016, Mary Meeker
  • 13. What is Data? • Data is the new application. Pre-1995: Winning Businesses = Use Human Data / Insights To Improve Customer Experiences… 1990s-2000s: Internet + Mobile Devices + Cloud: Netscape Web Browser (1994), Amazon Web Services (AWS) (2006), Apple iPhone (2007), Apple App Store (2008). 2000s: Winning Businesses = Build / Use Data Plumbing Tools To Use Digital Data / Insights To Improve Customer Experiences. “Data is now fundamental to how people work & the most successful companies have intelligently integrated it into everyone's daily workflow… Data is the new application.” Frank Bien, CEO & President, Looker, 6/19. Source : Internet Trends 2019, Mary Meeker. In the broadest sense, only data generated on the Internet is handled.
  • 14. What is Data? Data Volume = Extraordinary Growth… ~13% Structured / Tagged & Rising Rapidly. [Chart: New Data Captured / Created / Replicated, per IDC 2018. Data Volume, Annual, Global (ZB), 2005 to 2025E, axis from 0ZB to 200ZB; series: Original Data, Replicated Data; structured share annotated by year.] Source: IDC ‘Digitization of the World From Edge to Core White Paper’ developed in collaboration with Seagate (11/18), IDC DataSphere. Note: 1 petabyte = 1MM gigabytes, 1 zettabyte = 1MM petabytes of new data created / captured each year. The grey area in the graph represents data generated, not stored. Structured data indicates data that has been organized so that it is easily searchable & includes metadata & machine-to-machine (M2M) data. Replicated data = data that is a copy of the original.
  • 15. What is Data? • Data is the new application ▪ Data volume and utilization are evolving rapidly, and broadly. Meeker projects “extraordinary growth” in overall data volume, citing IDC research that the total “datasphere” will grow from 33 zettabytes today to 175 zettabytes in 2025. ▪ Significantly, the percentage of data that is structured (organized for use in a spreadsheet or database) will rise from 13 percent now to 32 percent in 2025. ▪ These improvements in data collection and formatting are driven by business goals, as cloud technologies and artificial intelligence allow companies to leverage data in new ways to improve their business. ▪ Data analysis is becoming a competitive differentiator, Meeker says. The ability to mine data for customer insights (including personalization and recommendations) is essential for most businesses, and will become more so in the future. https://datacenterfrontier.com/what-mary-meeker-predicts-about-the-future-of-data-centers/
  • 16. What is Data? Data Propagation = Expanding… Endpoints ⇋ Edge ⇋ Core. Connected Processes ENDPOINT People EDGE Branch Offices CORE Large Datacenters, Including Public & Private Cloud. The Core is the Heart of the Datasphere. Core : This consists of designated computing datacenters in the enterprise and cloud providers. It includes all varieties of cloud computing, including public, private, and hybrid cloud. It also includes enterprise operational datacenters, such as those running the electric grid and telephone networks. Edge : Edge refers to enterprise-hardened servers and appliances that are not in core datacenters. This includes server rooms, servers in the field, cell towers, and smaller datacenters located regionally and remotely for faster response times. Endpoint : Endpoints include all devices on the edge of the network, including PCs, phones, industrial sensors, connected cars, and wearables. “We’ll have more intelligence and more activity at the edge on data coming from the generators that we build and the IoT devices we have deployed…raw data will be analyzed on the edge first, and then the results will be sent back to the core for deeper analysis.” (CISO/CFO, Leading Manufacturing Firm) Source: Adapted from Graphics presented in IDC ‘Digitization of the World From Edge to Core White Paper’ developed in collaboration with Seagate (11/18), IDC DataSphere.
  • 17. What is Data? • Data-driven World. “IDC predicts that the Global Datasphere will grow from 33 Zettabytes in 2018 to 175 Zettabytes by 2025.” The data-driven world will be always on, always tracking, always monitoring, always listening & always watching, because it will be always learning.
  • 18. What is Data? • Data-driven World. “IDC forecasts that more than 150B devices will be connected across the globe by 2025, most of which will be creating data in real time.” Real-time data represents 15% of the Datasphere in 2017, and nearly 30% by 2025.
  • 19. Challenge created by digital disruption (too much data) • Top 10 Data and Analytics Technology Trends for 2019, Gartner. According to Donald Feinberg (vice president and distinguished analyst at Gartner), the very challenge created by digital disruption (too much data) has also created an unprecedented opportunity. The vast amount of data, together with increasingly powerful processing capabilities enabled by the cloud, means it is now possible to train and execute algorithms at the large scale necessary to finally realize the full potential of AI. “The size, complexity, distributed nature of data, speed of action and the continuous intelligence required by digital business means that rigid and centralized architectures and tools break down,” Mr. Feinberg said. “The continued survival of any business will depend upon an agile, data-centric architecture that responds to the constant rate of change.” https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo
  • 21. History of Data Analysis. Data analysis is rooted in statistics, which has a pretty long history. It is said that the beginning of statistics was marked in ancient Egypt, when Egypt was taking a periodic census for building pyramids. Throughout history, statistics has played an important role for governments all across the world, for the creation of censuses, which were used for various governmental planning activities (including, of course, taxation).
  • 22. Define Big Data (* NIST, 2014). VALUE: Objectives (Prescriptive, Predictive, Decisions, Recommend, Findings). VOLUME: Data Object Size (small to big), Data Object Quantity (few to many). VARIETY: Data Sources (few to many), Contents Types (few to many), Structure Types (structured to unstructured), Semantic Diversity (low to high). VELOCITY: Acquisition Rate (slow to fast), Update Rate (slow to fast). VERACITY: Known Data Sources, Provenance, Data Integrity, Governance. Definition (* ISO, 2014): too big (volume), arrives too fast (velocity), changes too fast (variability), contains too much noise (veracity), too diverse (variety) to be processed within a local computing structure using traditional approaches and techniques.
  • 23. Evolution of Data Analytics System. Three tracks per decade: Database Management Technology / Development of Business Intelligence & Analytic Platform / Technologies and Packages for Statistical Processing.
    1960s: Flat File Based (tape based storage, batch reporting) / Query Modules & Report Generators (batch querying & reporting, reporting generators) / Niche Statistical Subroutines (social science, clinical trials, agriculture). Themes: Routinization / Querying & Reporting / Statistical Computation.
    1970s: Navigational DBMS (late 1970s: RDBMS emerged) / Early DSS Tools (commercial tools for building DSS) / Statistical Software (pharma & social science; SPSS/SAS incorporated). Themes: Modularization / Decision Support & Modeling / 1st Gen Statistical Processing.
    1980s: Relational DBMS (RDBMS solutions matured, personal databases for PC) / DSS & 4GL Environments (4GL, EIS, spreadsheet, descriptive analytics) / PC-based Statistical Packages (other industries; PC-based, graphics, expert systems). Themes: Abstraction / Analytical Processing / 2nd Gen Statistical Processing.
    1990s: Distributed DBMS (distributed architecture, clustering) / Data Warehouse & BI (BI tool market grew rapidly, Web based analytics) / Early Data Mining tools (vendors & solutions). Themes: Scaling & Distribution / Enterprise Performance Management / Data Mining.
    2000s: Post Relational DBMS (unstructured data, non-relational data model, large scale distributed data) / Data Processing & Analytic Platform (large scale data processing, unstructured and real-time analytics, big data analytics) / Data Processing & Analytics Platforms (open source R based statistical platforms, NLP text analysis). Themes: Specialization & Extension / Next Gen Data Processing / Next Gen Data Processing.
    * Compiled from Max Kanaskar’s “BIG DATA TECHNOLOGY SERIES”.
    AI timeline alongside: 1st AI Boom: the golden years, early enthusiasm (1956-1974). From 1974 to 1980 the first "AI Winter" occurred; the term refers to a period in which government funding and interest in artificial intelligence dropped off. In the second AI winter (1987-1993), AI research declined because of high cost without efficient results. 2nd AI Boom: the emergence of intelligent agents (1993-2011).
  • 24. New Trend: Algorithm Marketplace • Algorithm Marketplaces Are Bringing the App Economy to Analytics. Source: Gartner (October 2015). Top 5 Deep Learning Frameworks (Framework: Open, Stable): TensorFlow: 2015.11, 2019.6; Keras: 2015.3, 2019.8; MXNet: 2016(?), 2020.2; PyTorch: 2016.10, 2019.4; Microsoft Cognitive Toolkit: 2016.1, 2019.4.
  • 25. Analysis vs. Analytics. Data Analysis: ✓ process of inspecting, cleaning, transforming, modeling data ✓ separation of a whole into its component parts ✓ looks backwards over time, providing marketers with a historical view of what has happened. Data Analytics: ✓ discovery, interpretation, communication of meaningful patterns in data ✓ method of logical analysis ✓ looks forward to model the future or predict a result. Data Analysis Data Analytics ✓ functions and process (main focus) ✓ enterprise architecture, process architecture ✓ data and reporting (main focus) ✓ information architecture, data architecture Assistance Only
  • 26. Artificial Intelligence: Why is artificial intelligence important?
  • 27. Artificial Intelligence • The Timeline
  • 28. Artificial Intelligence • How Artificial Intelligence Works. AI works by combining large amounts of data with fast, iterative processing and intelligent algorithms, allowing the software to learn automatically from patterns or features in the data. Born from the vision of Turing and Minsky that a machine could imitate intelligent life, AI received its name, mission, and hype from the conference organized by McCarthy at Dartmouth College in 1956. < A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence >
  • 29. AI - Present (after 2011). Key reasons for this hyper-growth: parallel and distributed computing; open source software; availability of Big Data; growing collaboration between academia and industry. http://www.scryanalytics.com/domains-in-which-artificial-intelligence-is-rivaling-humans/
  • 30. Future AI will be… • Predictions on the Power of AI by 2035. Over the past several years, there has been a growing belief that AI is a limitless, mystical force that is (or will soon be) able to supersede humans and solve any problem. For instance, Ray Kurzweil predicted, "artificial intelligence will reach human levels by around 2029," and Gray Scott stated, "there is no reason and no way that a human mind can keep up with an artificial intelligence machine by 2035." An analogous but more ominous sentiment was expressed by Elon Musk, who wrote, "The pace of progress in artificial intelligence . . . is incredibly fast . . . The risk of something seriously dangerous happening is in the five-year timeframe. 10 years at most," and later said, "with artificial intelligence we’re summoning the demon". http://www.scryanalytics.com/the-current-hype-cycle-in-artificial-intelligence/
  • 31. Artificial Intelligence • Why is artificial intelligence important? AI automates repetitive learning and discovery through data. AI adds intelligence to existing products. AI adapts through progressive learning algorithms to let the data do the programming. AI analyzes more and deeper data using neural networks that have many hidden layers. AI achieves incredible accuracy through deep neural networks, which was previously impossible. AI gets the most out of data: when algorithms are self-learning, the data itself can become intellectual property. https://www.sas.com/en_us/insights/analytics/what-is-artificial-intelligence.html Timeline: 1950s-1970s, Neural Networks: early work with neural networks stirs excitement for “thinking machines.” 1980s-2010s, Machine Learning: machine learning becomes popular. Present day, Deep Learning: deep learning breakthroughs drive AI boom.
  • 32. Artificial Intelligence • Important Sub-fields of Artificial Intelligence in 2010 http://www.scryanalytics.com/resurgence-of-artificial-intelligence-during-1983-2010/
  • 33. AI Diagram: Artificial Intelligence. Deterministic Rules & Process & Decisions Robotics Event Processing Predictive Knowledge Management Natural Language Processing (NLP) Rules Engine & BPM Robotic Process (RPA) Chat Bots/Virtual Assist Simple Event Complex Event (CEP) Deep Q&A System Text to Speech Translation Speech to Text Image to Text Alerts Likely Answers Automated Speech Automated Writing Automated Grouping Automate Repetitive Task Machine Learning (ML) Image Recognition Unsupervised Learning Reinforcement Learning Supervised Learning Deep (NN) Learning Classic Models (Bayes, GA, Regression, Trees, etc.) Classify (Text, etc.) Numeric Prediction Probability Assessment Optimization (LP, etc.) Predictive Recommendation Scores, Actions, Ranking, Forecasts. Legend: Examples of Main Areas; Examples of Sub Areas; Results; AI Intelligence; Automated. Source : http://vincejeffs.com/ai-use-cases-crm/
  • 34. AI enabled Analytics • The Evolution of Business Intelligence. Gartner Analytics Ascendancy Model (axes: Difficulty, Value): Descriptive Analytics: What happened? Diagnostic Analytics: Why did it happen? Predictive Analytics: What will happen? Prescriptive Analytics: How can we make it happen? https://www.eckerson.com/articles/the-impact-of-ai-on-analytics-machine-generated-intelligence
  • 35. Change of subject to cloud containers: 3 survey results
  • 36. Why are Enterprises Adopting Containers? • Top Drivers for Containerization (chart, % of respondents). Data/Data Analytics (AI): Big data/machine learning/AI initiatives; IoT/edge computing. Digital Transformation: Support for mobile initiatives; Pursue multicloud/hybrid cloud strategy; Pursue DX/new business innovation. Agile DevOps: Increase developer productivity; Modernize existing applications; Increase app development speed/time to market; Support cloud-native, microservices architecture. OPEX: Reduce infrastructure costs/improve efficiency; Reduce operations management costs; Move off old/unsupported operating systems. Observability/Security: Reliability/availability/scalability; Improve security. Source : Modernizing Applications with Containers in the Public Cloud, June 2019, IDC
  • 37. 2019 CONTAINER ADOPTION BENCHMARK SURVEY • IN WHAT USE CASES WILL YOUR CONTAINERS BE EMPLOYED? Enterprises are using containers for everything from modernizing legacy applications to big data analytics. Enterprise applications, whether cloud-native or traditional, need databases to store and manage persistent data. Databases have emerged this year as a high-priority container use case for the enterprise.
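  • 38. IDG: Cloud, Act 2 Is Coming • The direction is hybrid, the goal is digital innovation. 2019, IDG Korea, survey of 660 IT professionals in Korea. The survey asked which types of cloud respondents have adopted so far, which types they plan to adopt, how the share of work running in the cloud will change, how much budgets have been increased, and how much AI is used at work.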
  • 39. Container & AI • Tech trends in Data & Analytics. Source: Gartner, Gartner Top 10 Data and Analytics Trends, November 5, 2019 https://www.gartner.com/smarterwithgartner/gartner-top-10-data-analytics-trends/ Continuous intelligence relies on platforms, architectures, and software that allow organizations to collect, organize, and analyze data to enable fast actions in response to real-time events.
  • 40. Container & AI • Continuous Intelligence. Continuous intelligence analytics solutions are platforms that ingest streaming data, perform analytics, and embed code, machine learning models, and rules to enable the real-time enterprise. Diagram: Continuous Intelligence Analytics Platform (ingest streaming data; perform analytics; embed code; machine learning models, rules) → Real-time Enterprise
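The ingest, analyze, and rule steps named on this slide can be sketched in a few lines. This is only an illustrative toy, not a continuous intelligence platform: the function name `detect_alerts`, the window size, and the threshold are assumptions made for the example, and a plain Python list stands in for a real event stream.

```python
from collections import deque
from statistics import mean, pstdev


def detect_alerts(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings that deviate strongly from recent history."""
    recent = deque(maxlen=window)            # sliding window of the latest readings
    for i, value in enumerate(stream):       # "ingest streaming data"
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)   # "perform analytics"
            # Embedded rule: flag values far outside the recent spread.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
alerts = list(detect_alerts(readings))
```

In a real deployment the rule could feed an automated action or a real-time dashboard; here the alerts are simply collected in a list.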
  • 41. Container & AI • Continuous Intelligence. Continuous intelligence platforms must have an AI that easily integrates a plethora of analytical tools and machine learning models to detect patterns of events. The goal is to detect urgent situations to act upon automatically or provide information to real-time dashboards for human decision-makers. https://www.forbes.com/sites/forbestechcouncil/2018/10/18/what-is-continuous-intelligence/#326dd4bf7d25 Diagram (Continuous intelligence): All Data, Pervasive Access; Continuous Information; Adaptable Actions; Complex & Fast-Moving Data; Machine Learning; Unconstrained Exploration
  • 42. Build Your Own Cloud: Match made in Heaven – Kubernetes & AI. Many companies are starting to embed continuous intelligence (CI) using artificial intelligence (AI) and machine learning (ML) into business processes. And the trend is expected to continue. Gartner notes that by 2022, more than half of major new business systems will incorporate CI that uses real-time context data to improve decisions. Source: 4 Things to Know About Using Kubernetes for AI, rtinsights.com/4-things-to-know-about-using-kubernetes-for-ai, by Salvatore Salamone, January 2, 2020
  • 43. Kubernetes and containers and AI: Why Kubernetes and containers are the perfect fit for machine learning (AI)
  • 45. Two Key Drivers to Containerization (Data + AI). Machines imitate human intelligence: in 1950, Alan Turing asked whether a machine can imitate human intelligence. In his seminal paper “Computing Machinery and Intelligence,” he formulated a game, called the imitation game. Data is the new application: “data is the new application. … Data is now fundamental to how people work and the most successful companies have intelligently integrated it into everyone’s daily workflow.”
  • 46. Understanding Data, Data Analytics, AI (II) 2020. 6. 24 Chun MK
  • 47. Contents □ Sectors of The Economy Affected by AI □ Machine Learning in Practice □ Anatomy of Machine Learning □ Machine Learning Model Development Life Cycle □ Containers as an enabler of AI □ Machine Learning Platform Sample □ Summary
  • 48. Sectors of The Economy Affected by AI • Face detection and verification: CNNs • Extracting text from financial documents: LSTM • Fraud detection: CNNs • Anomaly detection: Variational Autoencoders • Chatbots • Collaborative filtering for recommendations • NLP for mining descriptive text • CNNs for apparel detection • Physical robots powered by Deep Reinforcement Learning • … …
  • 49. Sectors of The Economy Affected by AI • Top Use Cases by Function https://appliedai.com/data/use-cases/1
      MARKETING: Retargeting; Recommendation Personalization; Social Analytics & Automation
      SALES: Predictive Sales; Sales Data Input Automation; Sales Forecasting
      CUSTOMER SERVICE: Customer Service Chatbot (e2e Solution); Intelligent Call Routing; Call Analytics
      IT: Analytics Platform; Natural Language Processing Library; Analytics & Predictive Intelligence for Security
      OPERATIONS: Robotic Process Automation (RPA); Predictive Maintenance; Manufacturing Analytics
      HR: Hiring; Performance Management; HR Analytics
      FINANCE: Fraud Detection; Financial Analytics Platform; Credit Lending/Scoring
      HEALTHTECH: Patient Data Analytics; Personalized Medications and Care; Drug Discovery
      NEW FIELD: TELECOM; IOT (IIOT); SELF-DRIVING CAR; …
  • 50. Machine Learning in Practice • Machine learning use examples https://squadex.com/insights/top-machine-learning-use-cases-business/ Machine Learning Impacts: improves business decisions; streamlines work processes; reduces overheads; advances our everyday lives
  • 51. Machine Learning in Practice • 9 Practical Machine Learning Use Cases ▪ Image & Video Recognition: face recognition, object detection, text detection (printed and handwritten), logo and landmark detection, visual search, reverse image search, image composition, and image curation ▪ Speech Recognition: search engines (e.g. Google, Baidu), virtual digital assistants (e.g. Alexa, Cortana, Siri, Google Assistant, AliGenie), smart speakers (e.g. Amazon Echo, Google Home), and voice-activated applications (e.g. Uber, Evernote)
  • 52. Machine Learning in Practice • 9 Practical Machine Learning Use Cases ▪ Fraud Detection ▪ Patient Diagnosis: identifying cancerous tumors and skin cancer, diagnosing diabetes, and, most importantly, predicting disease progression
  • 53. Machine Learning in Practice • 9 Practical Machine Learning Use Cases ▪ Anomaly Detection: used in manufacturing to increase productivity and efficiency, reduce costs, and optimize downtime; credit card fraud, clinical diagnoses, and structural defects are examples of anomalies
  • 54. Machine Learning in Practice • 9 Practical Machine Learning Use Cases ▪ Inventory Optimization ▪ Demand Forecasting
  • 55. Machine Learning in Practice • 9 Practical Machine Learning Use Cases ▪ Recommendation Systems ▪ Intrusion Detection
  • 56. Anatomy of Machine Learning • What is Machine Learning? Machine learning is a subset of artificial intelligence (AI) which gives machines the ability to learn automatically and improve from experience without being explicitly programmed.
  • 57. Anatomy of Machine Learning • Types of Machine Learning: Supervised Learning (“Please teach me!”); Unsupervised Learning (“I can learn by myself!”); Reinforcement Learning (“My way or the highway”)
  • 58. Anatomy of Machine Learning • Supervised Machine Learning ➢ The input, the output, the algorithm, and the scenario are all provided by humans (the supervisor). ▪ Makes the machine learn explicitly ▪ Data with clearly defined output (labelled data) is given ▪ Direct feedback is given ▪ Predicts outcome/future ▪ Resolves classification and regression problems ▪ Applications include risk evaluation and sales forecasting
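A minimal sketch of the supervised setting, assuming a tiny labelled dataset invented for illustration (`ad_spend` as input, `sales` as the known output). It fits a straight line by ordinary least squares and then predicts the outcome for a new input, which is the regression case named on the slide.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labelled data: every input comes with its known output.
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(ad_spend, sales)


def predict(x):
    """Forecast the output for an unseen input using the fitted line."""
    return slope * x + intercept
```

The "direct feedback" on the slide corresponds to the known `sales` values: the fit is chosen to minimise the error against them.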
  • 59. Anatomy of Machine Learning • Unsupervised Machine Learning • In unsupervised learning, as you might guess, the data is unlabeled and the system tries to learn without a teacher. • Unsupervised learning means discovering hidden patterns or similarities in the dataset and grouping or labeling them without any human assistance. • Although unsupervised learning has not yet been implemented on a wider scale, this methodology points to the future of machine learning and its possibilities. ▪ Machine understands the data (identifies patterns/structures) ▪ Evaluation is qualitative or indirect ▪ Does not predict or find anything specific ▪ Applications include recommendation systems and anomaly detection
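Grouping unlabeled data can be illustrated with a one-dimensional k-means sketch. The data points, the starting centers, and the function name `kmeans_1d` are assumptions for the example; no labels are supplied, and the algorithm finds the two groups on its own.

```python
def kmeans_1d(points, centers, iterations=10):
    """Cluster numbers around the given starting centers (Lloyd's algorithm)."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Unlabeled observations that happen to form two groups.
data = [1, 2, 3, 10, 11, 12]
centers, clusters = kmeans_1d(data, centers=[0, 5])
```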
  • 60. Anatomy of Machine Learning • Reinforcement Machine Learning • In reinforcement learning, the algorithm discovers through trial and error which actions yield the greatest rewards. This type of learning has three primary components: the agent (the learner or decision-maker), the environment (everything the agent interacts with), and actions (what the agent can do). ▪ An approach to AI ▪ Reward-based learning ▪ Learning from positive and negative reinforcement ▪ Machine learns how to act in a certain environment ▪ To maximize rewards ▪ Applications include self-driving cars and gaming
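The agent, environment, and actions can be made concrete with a small tabular Q-learning sketch. Everything here is an assumption chosen for illustration: a five-cell corridor as the environment, a reward of 1.0 at the right end, and the hyperparameter values. Q-learning is one common reinforcement learning algorithm, not the only one.

```python
import random

N_STATES, GOAL = 5, 4          # corridor cells 0..4, reward at cell 4
ACTIONS = (-1, +1)             # the agent can step left or right


def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # Q-value per (state, action)
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap the episode length
            if rng.random() < epsilon:            # explore: trial and error
                a = rng.randrange(2)
            else:                                 # exploit, breaking ties at random
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


q_table = train()
# Greedy policy for every non-goal cell: the action with the highest Q-value.
policy = [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q_table[:GOAL]]
```

After training, the greedy policy steps right in every cell, since that is the shortest route to the reward.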
  • 61. Anatomy of Machine Learning • Supervised vs. Unsupervised Machine Learning
      Characteristic | Supervised Learning | Unsupervised Learning
      Learning Goal | Used to predict the result for a new input. | Used to discover hidden patterns in a dataset.
      Dataset Used | Algorithms are trained using a labelled dataset. | Algorithms are trained using an unlabeled dataset.
      Human Assistance | The complete learning process happens under human supervision and assistance. | The whole learning process happens without human supervision.
      Basic Types | Two types: Classification and Regression. | Two basic types: Clustering and Association.
      Output | It predicts the result. | It finds the hidden relationships and patterns.
      Accuracy | It produces more accurate results. | Its results are less accurate than those of supervised learning.
  • 62. Anatomy of Machine Learning • Summary
  • 63. Machine Learning Model Development Life Cycle https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d Life cycle stages shown: Business Requirements & Hypothesis Designing; Exploratory Data Analysis; Data Pre-Processing & Data Cleaning; Feature Engineering & Feature Selection; Model Deployment; Model Performance; Model Hyper-Parameter Tuning; Machine Learning Model Selection; Visualizations. Diagram labels: Process of Machine Learning; Data Platform
  • 64. Machine Learning Model Development Life Cycle https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d
      Business Requirements & Hypothesis Designing: the recognition of a problem and the idea that machine learning could potentially be used to solve it; collaboration between domain experts and machine learning experts.
      Exploratory Data Analysis (exploration): the critical process of performing initial investigations on data so as to discover patterns, to spot anomalies, to test hypotheses, and to check assumptions with the help of summary statistics and graphical representations*. Good data exploration can provide useful insights within the data as well as solve almost 70% of the problem in the EDA stage alone. * Method: numeric summaries, aggregations, distributions, densities, reviewing all the levels of factor variables, applying general statistical methods, exploratory plots, and expository plots.
      Data Pre-Processing & Data Cleaning (Data Ready): • Missing Value Checks & Missing Value Imputations • Removal of unwanted data • Data Optimization on the basis of Domain or Business recommendations • Outlier Detection & Removal • Dimension Reduction • Duplicate records removal
      Feature Engineering & Feature Selection (identify the most important features within a dataset): • Correlation Checks or Collinearity Checks • Zero-Variance Checks • Principal Component Analysis or PCA • Categorical Data Encoding • Data Normalization • Data Standardization or Scaling • Log Transformations
      Infrastructure note: spin up clusters when needed and spin them back down when done.
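Three of the checks listed above (missing value imputation, zero-variance checks, and standardization) can be sketched with the standard library alone. The column names and values are made up for the example, and mean imputation is only one of several possible imputation strategies.

```python
from statistics import mean, pstdev


def impute_mean(column):
    """Replace missing values (None) with the mean of the known values."""
    known = [v for v in column if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in column]


def drop_zero_variance(table):
    """Keep only columns whose values actually vary."""
    return {name: col for name, col in table.items() if pstdev(col) > 0}


def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


# Hypothetical raw table: one column with a gap, one constant column.
raw = {"age": [20, None, 40], "country_code": [1, 1, 1]}

cleaned = {name: impute_mean(col) for name, col in raw.items()}
features = drop_zero_variance(cleaned)
scaled = {name: standardize(col) for name, col in features.items()}
```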
  • 65. Machine Learning Model Development Life Cycle https://medium.com/analytics-vidhya/machine-learning-model-development-life-cycle-dcb238a3bd2d
      Machine Learning Model Selection (depends on the type of business problem): • Decision Tree • Random Forest • Regression • K-Means or Clustering • K-Nearest Neighbors or KNN • Support Vector Machine • Logistic Regression • Naive Bayes • Artificial Neural Networks
      Model Performance (the model is tested on unseen data before being deployed into the field or production environments): • Confusion Matrix • Area Under the Curve or AUC • Precision & Recall • Sensitivity & Specificity • F1-Scores • R-Square • Gini Values • KS Statistics
      Visualizations: • Tableau • Power BI • Splunk • Dynatrace • QlikView • Grafana • R-Shiny • Plotly
      Model Hyper-Parameter Tuning (an iterative process which consumes a lot of time after the Data Processing step): • Cross-Validation • Outlier or noisy data removal • E.g., the topology and size of a neural network • Should not run into over-fitting
      Model Deployment (the model object can be deployed using various methods): • REST APIs • Micro-Services
      Training and deployment: flexibility to create distributed training environments across multiple host servers, allowing for better utilization of infrastructure resources.
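Several of the performance measures listed above derive from the same four confusion-matrix counts. The sketch below computes them for a binary classifier; the label vectors are invented for illustration and the helper names are not from any particular library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)           # how many predicted positives were right
    recall = tp / (tp + fn)              # how many actual positives were found (sensitivity)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical held-out labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```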
  • 66. Machine Learning workflow & elements • Model- and data-centric elements of ML systems: ML code; Configuration; Data Verification; Data Collection; Feature Extraction; Machine Resource Management; Analysis Tools; Automation; Testing and Debugging; Process Management Tools; Metadata Management; Monitoring; Serving Infrastructure. Legend: Model Centric; Data Centric. Data (quantity and quality, ML ready), ML technical debt, MLOps vs DevOps, and enterprise ML processes and skills remain the main barriers to adoption. Chart: ML Workflow, % allocation of resources. https://tecton.ai/blog/devops-ml-data/ https://towardsdatascience.com/state-of-the-machine-learning-ai-industry-9bb477f840c8
  • 67. Containers as an enabler of AI • Background for using Kubernetes in machine learning workloads https://platform9.com/blog/kubernetes-for-machine-learning/ Why are companies using containers to facilitate the development and deployment of AI apps? Machine Learning Model: Iterative Process; Compute Intensive; Training Simultaneously. Enterprise Needs: applications run on Any Server, Any Cloud Provider, Any Operating System, Anywhere.
  • 68. Containers as an enabler of AI • Background for using Kubernetes in machine learning workloads https://platform9.com/blog/kubernetes-for-machine-learning/ Challenges: the challenge of scalability; the challenge of provisioning a computational infrastructure that can support a resource-intensive machine-learning pipeline. Response: apply the flexibility of cloud-native development and infrastructure to machine-learning applications, using containerization and an orchestrator (Kubernetes).
  • 69. Containers as an enabler of AI • Background for using Kubernetes in machine learning workloads https://platform9.com/blog/kubernetes-for-machine-learning/
      Auto-scaling: a machine learning workflow works best when each step in the process can be scaled up when needed and scaled back down when done; Kubernetes can automate that scaling and support granular scaling.
      GPU Support: easy to reproduce the environment necessary to support computation on GPUs (Kubernetes on Nvidia GPUs).
      Data Management: single access point for diverse data sources; manages the volume lifecycle.
      Multitenancy: the ‘namespaces’ feature enables a single cluster to be partitioned into multiple virtual clusters → each with its own resource quotas and access control policies.
      Abstraction: data pipelines abstraction; infrastructure abstraction.
      A containerized cloud-based machine learning workflow orchestrated by Kubernetes meets many of the challenges posed by the computational requirements of machine learning.
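To make the auto-scaling point concrete, the sketch below reproduces the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0. The function name and the min/max clamping arguments are written for this example and are not a Kubernetes API; the metric values are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Scale replicas in proportion to how far the metric is from its target."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas                 # close enough: leave the workload alone
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical training step: 4 workers at 90% average CPU, target 60%.
scale_up = desired_replicas(4, current_metric=90, target_metric=60)
# The same step once the job is nearly idle.
scale_down = desired_replicas(4, current_metric=15, target_metric=60)
```

The clamp to a minimum and maximum mirrors the bounds an operator would normally set so that a spike cannot consume the whole cluster.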
  • 70. Containers as an enabler of AI • Kubernetes and containers: perfect fit for machine learning https://jaxenter.com/containers-machine-learning-165203.html Three phases of an AI project where containers are beneficial: exploration, training, and deployment.
      Exploration. Environment conditions: experiment with different data sets and machine learning algorithms to find the right data and algorithms to predict outcomes with maximum accuracy and efficiency • Speed of iteration • Ability to run tests in parallel. Appropriate action: containers provide a way to package up these libraries for specific domains, point to the right data source, and deploy algorithms in a consistent fashion • Isolated environment, customized for their exploration • Manage multiple sets of libraries and frameworks in a shared environment.
      Training. Environment conditions: the AI model needs to be trained against large volumes of data across different platforms to maximize accuracy and minimize resource utilization • Highly compute-intensive • Compute and storage separate. Appropriate action: • Scale workloads up and down • A distributed cloud environment also allows compute and storage to be managed separately, which cuts storage utilization and therefore costs • Run their models on different types of hardware, such as GPUs and specialized processors.
      Deployment. Environment conditions: • Combine several models that serve different purposes. Appropriate action: • Containers allow each model to be deployed as a microservice • Microservices also make it easier to deploy models in parallel in different production environments.
  • 71. Machine Learning Platform Sample • Sample: Uber’s Machine Learning Platform, Michelangelo (source: Uber Engineering) https://eng.uber.com/michelangelo-machine-learning-platform/ Manage data: data preparation pipelines push data into the Feature Store tables and training data repositories.
  • 72. Machine Learning Platform Sample • Sample: Uber’s Machine Learning Platform, Michelangelo (source: Uber Engineering) https://eng.uber.com/michelangelo-machine-learning-platform/ Train models; Evaluate models: model training jobs use Feature Store and training data repository data sets to train models and then push them to the model repository.
  • 73. Machine Learning Platform Sample • Sample: Uber’s Machine Learning Platform, Michelangelo (source: Uber Engineering) https://eng.uber.com/michelangelo-machine-learning-platform/ Deploy models: models from the model repository are deployed to online and offline containers for serving.
  • 74. Machine Learning Platform Sample • Sample: Uber’s Machine Learning Platform, Michelangelo (source: Uber Engineering) https://eng.uber.com/michelangelo-machine-learning-platform/ Workflow: Manage data; Train models; Evaluate models; Deploy models; Make predictions; Monitor predictions. Online and offline prediction services use sets of feature vectors to generate predictions.
  • 76. Summary • Challenges with speed and scale in the ML life cycle. Life cycle stages: Data prep; Build; Train; Deploy; Monitor. Challenges: ensure seamless collaboration; ever-changing, expanding open source ecosystem; infrastructure and model performance; seamless deployment and update of a variety of models; access to scalable infrastructure on demand; ever-increasing volume, variety, and velocity of data. Now, as containers grow in popularity and AI adoption enters the mainstream, enterprises are starting to leverage containerization to gain flexibility, portability, and reliability for the AI and machine learning lifecycle. Although containers are great at making applications flexible and portable, it is challenging to manage multiple containers in a complex system. That's where Kubernetes comes in.
  • 77. 3rd Seminar How to Design AI functions to the Cloud Native Infra 2020. 7. 15 Chun MK
  • 78. Contents □ What is Cloud Native? □ What is Cloud Native ML(AI)? □ Machine Learning operations Infrastructure □ What is Kubeflow? □ Why Use Kubeflow? □ Summary
  • 79. What is Cloud Native? • CNCF’s Cloud Native Trail Map: fundamentals of cloud native systems, a recommended process for leveraging open source, cloud native technologies (steps #1 to #10; everything after step #3 is optional). CONTAINERIZATION: lets the application run in any computing environment. CI/CD: brings all the changes in the code to the container automatically. ORCHESTRATION: container orchestration is needed to manage the container lifecycles. OBSERVABILITY & ANALYSIS: set up some of them, such as logging, tracing, and metrics. SERVICE MESH: enables more complex operational requirements such as service discovery, health, routing, and A/B testing. NETWORKING & POLICY: define flexible networking layers based on your requirements. DISTRIBUTED DATABASE: sometimes required when more scalability and resiliency are needed. MESSAGING. CONTAINER REGISTRY & RUNTIMES: store all your containers; also enable image scanning and signing if required. SOFTWARE DISTRIBUTION: needed for secure software distribution.
  • 80. What is Cloud Native? • CNCF Cloud Native Definition v1.0 (approved by TOC: 2018-06-11). Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. ▪ Containers, ▪ service meshes, ▪ microservices, ▪ immutable infrastructure, and ▪ declarative APIs exemplify this approach. https://github.com/cncf/toc/blob/master/DEFINITION.md Cloud native is not just deploying your application on the cloud; it is about taking full advantage of the cloud. “Cloud native is a different way of thinking: we need to first make up our minds, not just the systems, to utilize the full benefits of cloud.”
  • 81. What is Cloud Native? • What it means to be Cloud Native approach https://medium.com/developingnodes/what-it-means-to-be-cloud-native-approach-the-cncf-way-9e8ab99d4923 Cloud-native is an approach to building and running applications that exploit the advantages of the cloud computing delivery model. Cloud-native is about how applications are created and deployed, not where. … It’s appropriate for both public and private clouds. Why did Google donate Kubernetes to the CNCF? Google has been using containers for many years and led the Kubernetes project, which is a leading container orchestration platform. But alone it cannot really change the industry's broad perspective on modern applications, so there was a huge need for industry leaders to come together and solve the major problems facing the modern approach. “the cloud isn’t a place, it’s a way of doing IT” (Michael Dell)
  • 82. What is Cloud Native? • Cloud Native Fundamentals & Container Orchestration Definition. Container orchestration automates the following tasks at scale: Resource Management; Scheduling; Service Management; configuring; scheduling; traffic routing; availability; deployments; provisioning; load balancing; scaling; allocation of resources; securing; health monitoring; service discovery; configuration of applications. Cloud native fundamentals: Docker containerization; CI/CD; orchestration; observability & analysis; service mesh; networking & policy; distributed database; messaging; container registry & runtimes; software distribution. Legend: Basic Step; Monitoring Object
  • 83. What is Cloud Native ML(AI)? • Why Cloud Native Machine Learning and AI. As big data gets more complex, companies are struggling to accommodate the storage and computing needs of average organizations, much less massive enterprises. This is where cloud-native ML and AI comes into play. https://medium.com/@ODSC/the-benefits-of-cloud-native-ml-and-ai-b88f6d71783
  • 84. What is Cloud Native ML(AI)? • Inherent issues in machine learning https://www.alibabacloud.com/blog/build-a-machine-learning-system-using-kubernetes_595961# Complex Machine Learning Workflow (common software development problems; data-driven features of machine learning): ▪ workflow becomes longer ▪ data versions are out of control ▪ experiments cannot be easily traced ▪ results cannot be conveniently reproduced ▪ costly to iterate the model. Internal infrastructure of these enterprises: • Google TensorFlow Extended platform • Facebook FBLearner Flow platform • Uber Michelangelo platform. Google has extensive experience in building machine learning workflow platforms. Its TensorFlow Extended platform supports Google's core businesses such as search, translation, and video playback. More importantly, Google has a profound understanding of engineering efficiency in the machine learning field. Google's Kubeflow team made Kubeflow Pipelines open-source at the end of 2018. Kubeflow Pipelines is designed in the same way as Google's internal TensorFlow Extended machine learning platform. The only difference is that Kubeflow Pipelines runs on the Kubernetes platform while TensorFlow Extended runs on Borg. Figure: TensorFlow Extended machine learning platform
  • 85. What is Cloud Native ML(AI)? • Inherent issues in machine learning. Before a model ends up in production, there are potentially many steps required to build and deploy an ML model. Steps shown: verification; splitting; hyperparameter tuning; more observation; model serving; data loading; processing; feature engineering; model training; model verification
  • 86. Machine Learning operations Infrastructure • The fundamental ML platform components https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc “machine learning workloads are more prone to maintenance tasks”
  • 87. Machine Learning operations Infrastructure • Components inside the feature store: loading the raw data into the feature store storage (batch and online); actually computing the features, where computing time performance is critical when designing this component (batch and online); serving features for downstream processing (batch and online). https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
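The three responsibilities on this slide (load raw data, compute features, serve them downstream) can be sketched as a tiny in-memory class. This is a hypothetical illustration only: the class name, method names, and the `avg_fare` feature are invented here and do not correspond to FEAST or any other real feature store API, and the batch/online distinction is left out.

```python
class FeatureStore:
    """Toy in-memory feature store: ingest raw records, compute, then serve features."""

    def __init__(self):
        self._raw = {}          # entity_id -> raw record
        self._transforms = {}   # feature name -> function(raw record) -> value
        self._features = {}     # entity_id -> {feature name: value}

    def ingest(self, entity_id, record):
        """Load raw data into the store."""
        self._raw[entity_id] = record

    def register(self, name, fn):
        """Declare how a feature is derived from a raw record."""
        self._transforms[name] = fn

    def compute(self):
        """Materialise every registered feature for every ingested entity."""
        for entity_id, record in self._raw.items():
            self._features[entity_id] = {
                name: fn(record) for name, fn in self._transforms.items()
            }

    def serve(self, entity_id, names):
        """Return the requested features for downstream training or prediction."""
        row = self._features[entity_id]
        return {name: row[name] for name in names}


store = FeatureStore()
store.ingest("driver_1", {"trips": 4, "total_fare": 50.0})
store.register("avg_fare", lambda r: r["total_fare"] / r["trips"])
store.compute()
vector = store.serve("driver_1", ["avg_fare"])
```

Precomputing in `compute` and reading in `serve` reflects the slide's remark that computing time is the critical design concern: serving stays a cheap lookup.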
  • 88. Machine Learning operations Infrastructure • Components inside the training rig. The objective of the training rig is to find and produce the best model (at a specific point in time) given: (i) an initial model architecture (e.g. number of layers), (ii) a set of tunable hyperparameters (e.g. learning rate, optimisers ..), and (iii) a historical labeled feature set. Components: check for retrain conditions, i.e. detect when the current golden model needs to be re-trained; discover potential new models by continuously optimising (or attempting to optimise) the current golden model, which is resource-intensive; training: ✓ actually performs the training (performance considerations should be taken into account when engineering the system) ✓ generates the model signature ✓ clearly defines the input and output interfaces ✓ handles any initialisation task; output: issuing models ready for production work (extensive testing should be performed at this step); all the metadata associated with the training phase (model repository, parameters, experiments ..). https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
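The search for the "best model at a point in time" can be sketched as a grid search over hyperparameters, scored on held-out data. The model (a single weight fitted by gradient descent), the data, and the grid values are all assumptions chosen to keep the example small; a real training rig would also record each experiment as metadata.

```python
from itertools import product

# Hypothetical historical labeled feature set (y = 3x), split into train and validation.
TRAIN = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
VALID = [(4.0, 12.0), (5.0, 15.0)]


def train(lr, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in TRAIN) / len(TRAIN)
        w -= lr * grad
    return w


def validation_mse(w):
    return sum((w * x - y) ** 2 for x, y in VALID) / len(VALID)


def search(grid):
    """Train one model per hyperparameter combination and keep the best one."""
    results = []
    for lr, epochs in product(grid["lr"], grid["epochs"]):
        w = train(lr, epochs)
        results.append({"lr": lr, "epochs": epochs, "w": w, "mse": validation_mse(w)})
    return min(results, key=lambda r: r["mse"]), results


best, experiments = search({"lr": [0.01, 0.1, 0.3], "epochs": [5, 20]})
```

Note that the largest learning rate diverges on this data, which is exactly the kind of outcome the validation score filters out.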
  • 89. Machine Learning operations Infrastructure • Components inside the prediction rig (execute inferences): apply low-level, specific operations to a potentially reusable and more abstract feature; route requests to a particular prediction endpoint; the horsepower of the prediction rig, designed for classical non-functional requirements such as performance, scalability, or fault tolerance; a low-latency key-value store to quickly respond to re-entrant queries, which must implement the classical cache mechanisms; as A/B tests take place, analyse metadata and particularly ground truth data in the feature store to suggest a replacement of the golden model with one of the experiments; ensure cache and memory warm-ups when a cold start situation happens (e.g. new model promotion); implement model explainability logic (e.g. Anchors, CEM ..) and return it for a given request; centralise all the metadata associated with the prediction phase (live experiments performance, prediction data stats …). https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
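Two of the components above, request routing for A/B tests and a cache for re-entrant queries, can be sketched together. The class and attribute names are invented for this example, the "models" are plain functions, and hashing the request ID is just one simple way to get a stable traffic split; a production rig would use a low-latency key-value store rather than a Python dict.

```python
import hashlib


class PredictionRig:
    """Toy router: splits traffic between a golden and a candidate model, with a cache."""

    def __init__(self, golden, candidate, candidate_share=0.1):
        self.models = {"golden": golden, "candidate": candidate}
        self.candidate_share = candidate_share   # fraction of traffic for the experiment
        self.cache = {}                          # (model name, features) -> prediction
        self.cache_hits = 0

    def route(self, request_id):
        """Deterministically map a request ID to a model (stable A/B assignment)."""
        digest = hashlib.sha256(request_id.encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return "candidate" if bucket < self.candidate_share * 100 else "golden"

    def predict(self, request_id, features):
        name = self.route(request_id)
        key = (name, tuple(features))
        if key in self.cache:
            self.cache_hits += 1                 # re-entrant query answered from cache
        else:
            self.cache[key] = self.models[name](features)
        return name, self.cache[key]


# Stand-in models; a real rig would call served model endpoints instead.
rig = PredictionRig(golden=lambda f: sum(f), candidate=lambda f: 2 * sum(f),
                    candidate_share=0.0)
first = rig.predict("request-1", [1, 2])
second = rig.predict("request-1", [1, 2])
```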
  • 90. Machine Learning operations Infrastructure • ML platform open source instantiation. FEAST* and Kubeflow integration is currently work in progress. Diagram: On-Prem Infrastructure; Cloud; Kubernetes; Feature Store; Training rig; Prediction rig; Metadata; KF Serving; FEAST; MDML; KF pipeline; Distributed training operators; Katib (NAS and HP). * Kubeflow already packages all those components in a nice way. NAS: neural architecture search. https://towardsdatascience.com/a-view-on-machine-learning-operations-infrastructure-e2bbc7cf0bdc
  • 91. General Machine Learning Workflow • Machine Learning Workflow Components. AI (ML/DL) Workflow; AI (ML/DL) core: Data, Algorithm. Data set: structured; semi-structured; unstructured. Steps: ingestion; analysis; transform; validation; splitting; feature store; ad-hoc training; building a model; model validation; HP tuning; distributed training; training at scale; roll-out; serving; logging; monitoring; data versioning; experiment tracking. Kubeflow (+ ecosystem). Labels: Data; AI code; Real world
  • 92. What is Kubeflow? • Kubeflow is the machine learning toolkit for Kubernetes. Kubeflow 1.0: the applications that make up our develop, build, train, deploy critical user journey. https://medium.com/analytics-vidhya/kubeflow-for-everyone-9b914d3f65b1 ➢ Graduating applications include: ▪ Kubeflow UI, which is called the Central Dashboard ▪ Jupyter notebook controller, for deploying and using Jupyter notebooks ▪ TensorFlow and PyTorch operators, for distributed training of models ▪ kfctl, the Kubeflow command line interface, for deployment and upgrades ▪ Profile controller, for multi-user support and management https://medium.com/kubeflow/kubeflow-1-0-cloud-native-ml-for-everyone-a3950202751 ➢ Develop, Build, Train, and Deploy with Kubeflow
  • 93. What is Kubeflow? • Components of Kubeflow (logical components that make up Kubeflow) ▪ Central Dashboard: the central user interface (UI) in Kubeflow ▪ Metadata: tracking and managing metadata of machine learning workflows in Kubeflow ▪ Jupyter Notebooks: using Jupyter notebooks in Kubeflow ▪ Frameworks for Training: training of ML models in Kubeflow ▪ Hyperparameter Tuning: hyperparameter tuning of ML models in Kubeflow ▪ Pipelines: ML pipelines in Kubeflow ▪ Tools for Serving: serving of ML models in Kubeflow ▪ Multi-Tenancy in Kubeflow: multi-user isolation and identity access management (IAM) ▪ Miscellaneous: miscellaneous Kubeflow components, e.g. Nuclio functions (Nuclio: high-performance serverless for data processing and ML)
  • 94. What is Kubeflow? • Kubeflow components in the ML workflow [Diagram: components of Kubeflow mapped to ML workflow stages, e.g. model training handled by an ML training operator] The workflow can be deployed to various clouds, local, and on-premises platforms for experimentation and for production use.
  • 95. What is Kubeflow? • Kubeflow is the machine learning toolkit for Kubernetes. Kubeflow’s goal is to make it easy for machine learning (ML) engineers and data scientists to leverage cloud assets (public or on-premise) for ML workloads. https://www.kubeflow.org/docs/started/kubeflow-overview/ [Figure: Kubeflow as a platform (ML system) on top of Kubernetes] History: Kubeflow started as an open sourcing of the way Google ran TensorFlow internally, based on a pipeline called TensorFlow Extended. It began as just a simpler way to run TensorFlow jobs on Kubernetes, but has since expanded to be a multi-architecture, multi-cloud framework for running entire machine learning pipelines.
  • 96. Why Use Kubeflow? • Kubeflow bridges the gap between AI workloads and Kubernetes https://ubuntu.com/blog/data-science-workflows-on-kubernetes-with-kubeflow-pipelines-part-1 “Kubeflow Pipelines are a great way to build portable, scalable machine learning workflows.” [Diagram: a machine learning workflow] Managing ML workloads on top of Kubernetes is still a lot of specialized operations work, which we don’t want to add to the data scientist’s role. Kubeflow bridges this gap between AI workloads and Kubernetes, making MLOps more manageable. Containers provide the right encapsulation, avoiding the need for debugging every time a developer changes the execution environment, and Kubernetes brings scheduling and orchestration of containers into the infrastructure.
  • 97. Why Use Kubeflow? • Reasons for using Kubeflow ▪ Deploying and managing a complex ML system at scale ▪ Experimenting with training an ML model ▪ Running end-to-end hybrid and multi-cloud ML workloads ▪ Tuning the model hyperparameters during training ▪ Continuous integration and deployment (CI/CD) for ML. “Machine learning workloads are more prone to maintenance tasks.”
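One of the reasons listed above, tuning model hyperparameters during training, can be illustrated with a small grid search. The sketch below is plain Python and does not use any Kubeflow API; the data, the gradient-descent trainer, and the search grid are hypothetical and chosen only to keep the example short. The idea is simply to train once per hyperparameter combination and keep the combination with the lowest validation loss.

```python
from itertools import product


def train_gd(data, learning_rate, epochs):
    # Fit y = w * x by batch gradient descent on mean squared error.
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    return w


def mse(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def grid_search(train_data, val_data, grid):
    # Try every combination in the grid; keep the lowest validation loss.
    best = None
    for lr, epochs in product(grid["learning_rate"], grid["epochs"]):
        w = train_gd(train_data, lr, epochs)
        loss = mse(w, val_data)
        if best is None or loss < best["val_loss"]:
            best = {"learning_rate": lr, "epochs": epochs, "w": w, "val_loss": loss}
    return best


# Hypothetical data following y = 3x.
TRAIN = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
VAL = [(4.0, 12.0), (5.0, 15.0)]
GRID = {"learning_rate": [0.001, 0.01, 0.1], "epochs": [10, 50]}

if __name__ == "__main__":
    print(grid_search(TRAIN, VAL, GRID))
```

Each combination is independent of the others, which is why hyperparameter search parallelizes well across containers on a cluster.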
  • 98. Kubeflow Community User Survey Fall 2019 • What is your primary role? N = 50 ▪ Kubeflow Community User Survey Fall 2019 Results https://medium.com/kubeflow/kubeflow-community-user-survey-fall-2019-a84776c71743 ▪ CNCF Survey 2019: conducted in October 2019; received 1,337 responses. Respondents came mainly from Europe (37%) and North America (38%), followed by Asia (17%). The majority of respondents (71%) were from organizations with at least 100 employees, the largest portion of these coming from enterprises with more than 5,000 employees (30%). Two-thirds of the respondents were in the software and technology industry. Top job functions were software architect (41%), DevOps manager (39%), and back-end developer (24%). https://www.cncf.io/blog/2020/03/04/2019-cncf-survey-results-are-here-deployments-are-growing-in-size-and-speed-as-cloud-native-adoption-becomes-mainstream/
  • 99. Kubeflow Community User Survey Fall 2019 • Where do you run your AI/ML workloads? (Multiple select) N = 33
  • 100. Kubeflow Community User Survey Fall 2019 • What hardware do you use for your AI/ML workloads? (Multiple select)
  • 101. Kubeflow Community User Survey Fall 2019 • Critical Kubeflow components that you use in your current workflows. (Multiple select)
  • 102. Kubeflow Community User Survey Fall 2019 • What ML frameworks are typically used in your organization? (spring survey response)
  • 103. Summary: AI needs data, lots of it • AI needs data, lots of it [Chart: CEOs who have overlooked data-driven insights to follow intuition instead. Source: 2018 Global CEO Outlook, KPMG International] ▪ The data is still very fragmented ▪ Most of the current predictive models use only historic data and not streaming (real-time) data https://www.forbes.com/sites/andythurai/2020/07/06/ai-driven-enterprises/#128f2fea4cfd More data leads to better predictions.