Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
1. A Look Inside Applied ML
Brian Goral | Cloudera Fast Forward Labs
2. 2 © Cloudera, Inc. All rights reserved.
THE OPPORTUNITY: THE AI-FIRST ENTERPRISE
PROTECT business
CONNECT products & services (IoT)
GROW customer insights
Thousands of opportunities for ML and AI to unlock value within the enterprise
● Predictive maintenance
● Logistics optimization
● Self-driving cars
● Customer churn
● Marketing effectiveness
● Next best action
● Insider threat prevention
● Fraud prevention
● Risk analysis
3. MULTINATIONAL ACCOUNTING & PROFESSIONAL SERVICES FIRM
• Accounting questionnaire automation: hundreds of person hours per client saved
• NLP of Tax Code and legal filings: new data-based client services
4. GLOBAL INTERNET SERVICES & TELCO PROVIDER
• Modeling system and individual relationships in order to understand and predict partner and customer behavior
• Experimentation framework for custom automated interventions
• Developed advanced propensity to cross-sell/upsell models
5. CHESAPEAKE ENERGY
• IoT data streaming from more than 70 trillion sensor data points
• Predictive maintenance anticipates shutdowns up to 72 hours in advance.
• Data analytics across systems maintains production levels while slashing capital costs by more than 80 percent.
6. US PUBLIC SECTOR
• Domain generation algorithms (DGA) for malware threat detection.
• Builds on Probabilistic Programming research and methods.
7. LET’S MAKE MACHINE LEARNING BORING.
9. Puppy, Muffin, or Fried Chicken?
11. MOVING FROM EXPLORATION TO PRODUCTION OF ML & AI
WE’RE WITNESSING THE INDUSTRIALIZATION OF AI
FROM THE LAB… TO THE FACTORY
12. LET’S MAKE MACHINE LEARNING BORING.
13. WORKFLOWS FOR EFFECTIVE ML
DONE WELL, THE VALUE GENERATED BY MACHINE LEARNING GROWS OVER TIME
STRATEGY: prioritization and use case identification
DEVELOPMENT: research and exploration
PRODUCTION: critical business infrastructure
14. SCALING ML & AI IN THE ENTERPRISE
WHETHER YOU ARE A FORTUNE 100 OR A STARTUP
STRATEGY
PEOPLE & ORGANIZATION
TECHNOLOGY
SECURITY, GOVERNANCE, COMPLIANCE
15. OUR APPROACH
Modern enterprise platform, tools and expert guidance to help you unlock business value with ML/AI
• Agile platform to build, train, and deploy many scalable ML applications
• Enterprise data science tools to accelerate team productivity
• Expert guidance, services & training to fast track value & scale
16. PATH TO AI-FIRST
AI
MACHINE LEARNING
DATA SCIENCE
ANALYTICS
"BIG DATA"
17. END-TO-END INFRASTRUCTURE TO MAKE THIS EASY
OPPORTUNITY DISCOVERY: Systematic process to discover and prioritize opportunities for ML and AI
“AI FACTORY”: Agile platform to build, train, and deploy ML applications
BUSINESS TRANSFORMATION: Capability to absorb business transformation and discover next opportunity
18. IT’S NOT EASY...
Now trending: barriers to success & scale
• Expertise, experience, data, and data products aren’t being shared. Data is siloed with no central catalog. Teams end up duplicating work (and code) to collect, clean, and load datasets.
• Data science teams sit in [Org] and pursue technically interesting projects, but without aligning to business priorities, they become ineffective.
• Teams gravitate towards complexity, thinking complex = better. They focus on the algorithm as the solution to all their problems, but often the real problem is in their data.
19. THE CHALLENGE
Balance these needs for self-service
DATA SCIENCE
• Access to granular data
• Flexibility
• Preferred open source tools
• Elastic provisioning (compute, storage)
• Reproducible research
• Path to production
DATA MANAGEMENT
• Security
• Governance
• Standards
• Low maintenance
• Low cost
• Self-service access
20. THE TYPICAL SOLUTION
“If I can’t use my favorite tools, I’ll…”
• Copy data to my laptop
• Copy data to a data science appliance
• Copy data to a cloud service
Why this is a problem:
• Complicates security
• Breaks data governance
• Adds latency to process
• Makes collaboration more difficult
• Complicates model management and deployment
• Creates infrastructure silos
21. A MODERN DATA SCIENCE ARCHITECTURE
Containerized environments with scalable, on-demand compute
• Built with Docker and Kubernetes
• Isolated, reproducible user environments
• Supports both big and small data
• Local Python, R, Scala runtimes
• Schedule & share GPU resources
• Run Spark, Impala, and other CDH services
• Secure and governed by default
• Easy, audited access to Kerberized clusters
• Leverages SDX platform services
• Deployed with Cloudera Manager
[Architecture diagram. Labels: Cloudera Manager; gateway node(s) running CDSW (Master, Engines); CDH nodes (Hive, HDFS, ...)]
22. ACCELERATING THREE STAGES OF MACHINE LEARNING
Enterprise AI platform supporting model development, training, and deployment
DEVELOP
• Explore data
• Develop models
• Share results
TRAIN
• Optimize parameters
• Track experiments
• Compare performance
DEPLOY
• Manage models
• Deploy models
• Monitor performance
23. CLOUDERA FAST FORWARD LABS
24. Cloudera Fast Forward Labs’ mission is to fast track
knowledge transfer from the lab to industry. We bring
technical innovations to industry applications.
You can think of us as your data nerd best friends.
25. CLOUDERA FAST FORWARD LABS
Expert guidance to accelerate value and scale
ML APPLICATION DEVELOPMENT: Evaluates and, if feasible, delivers an ML application (model, code, docs) using your data on Cloudera architecture
ML STRATEGY ENGAGEMENT: Delivers a strategy prescription focused on ML by identifying and prioritizing ML use cases and recommending process design
ADVISING & RESEARCH: Ongoing ML expert advising plus access to CFFL research reports and prototypes
26. APPLIED Machine Learning
Practical guidance for implementing the recently possible
Machine learning with privacy across edge devices
• Breakthrough technology for use cases including industrial predictive maintenance
• Train models on distributed data across endpoints and share models (not data) for privacy and efficiency
• Shift the burden of training out of datacenter infrastructure and onto edge devices
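The "share models, not data" idea above can be sketched as federated averaging. This is a minimal illustration under stated assumptions, not the method from the talk: the devices, their synthetic sensor readings, the linear model, and the names `local_update` and `federated_average` are all hypothetical. Each device trains on its own data and sends back only model weights, which the server averages in proportion to each device's sample count.

```python
import random

def local_update(weights, data, lr=0.05, epochs=20):
    """Train y = w*x + b by gradient descent on one device's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Average the clients' weights, weighted by how much data each one holds."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def make_device_data(rng, n):
    """Hypothetical sensor readings: y = 3x + 1 plus a little noise."""
    xs = (rng.uniform(-1, 1) for _ in range(n))
    return [(x, 3 * x + 1 + rng.gauss(0, 0.1)) for x in xs]

rng = random.Random(0)
devices = [make_device_data(rng, n) for n in (30, 50, 80)]

global_weights = (0.0, 0.0)
for _ in range(30):
    # Only weights leave each device; the raw readings never do.
    updates = [local_update(global_weights, d) for d in devices]
    global_weights = federated_average(updates, [len(d) for d in devices])

print(global_weights)
```

After the rounds complete, `global_weights` should sit close to the slope and intercept used to generate the synthetic data, even though the server never saw a single reading.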
Highly efficient models for extracting value from real-time data streams
• Orders-of-magnitude efficiency boost without scaling compute
• Enables shifting computation to devices
• Efficiently track trends and correlation across multiple data series
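One simple way to track correlation across two data series without storing the stream is a single-pass, constant-memory update. The sketch below is a swapped-in illustration (a Welford-style running co-moment), not the specific technique the slide refers to; the class name `RunningCorrelation` and the example series are assumptions.

```python
import math

class RunningCorrelation:
    """Pearson correlation of two streams in O(1) memory per pair of series."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0   # running sum of squared deviations of x
        self.m2_y = 0.0   # running sum of squared deviations of y
        self.co = 0.0     # running sum of co-deviations of x and y

    def update(self, x, y):
        self.n += 1
        dx = x - self.mean_x          # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.co += dx * (y - self.mean_y)

    def correlation(self):
        if self.m2_x == 0 or self.m2_y == 0:
            return float("nan")       # undefined until both series vary
        return self.co / math.sqrt(self.m2_x * self.m2_y)

# Hypothetical usage: two sensor series arriving one reading at a time.
tracker = RunningCorrelation()
for t in range(1000):
    temperature = 20 + 0.01 * t
    pressure = 5 - 0.002 * t
    tracker.update(temperature, pressure)
print(tracker.correlation())
```

Because each update touches only six numbers, this kind of computation is small enough to run on the device that produces the readings.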
27. APPLIED Machine Learning
Practical guidance for implementing the recently possible
Understanding what’s inside the black boxes
• Customer churn reasoning
• Regulatory compliance and bias testing
• Reverse engineering third-party models
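A common model-agnostic way to look inside a black box is permutation importance: shuffle one input column and measure how much accuracy drops. The sketch below uses that technique as an illustration only; the slide does not say which interpretability method was used, and the `black_box` churn model and its two features are hypothetical.

```python
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, rng):
    """Accuracy drop when each feature column is shuffled in turn."""
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(baseline - accuracy(model, shuffled, labels))
    return importances

def black_box(row):
    """Hypothetical opaque churn model; it only looks at feature 0 (support calls)."""
    return 1 if row[0] > 3 else 0

rng = random.Random(1)
# Feature 0: number of support calls; feature 1: an irrelevant random value.
rows = [(rng.randint(0, 7), rng.random()) for _ in range(500)]
labels = [black_box(r) for r in rows]

importances = permutation_importance(black_box, rows, labels, 2, rng)
print(importances)
```

A large drop for a feature suggests the model relies on it; a drop of zero suggests the model ignores it, which is the kind of evidence useful for churn reasoning or bias testing.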
Makes language computable
• Quickly derive insights from large amounts of text data
• Understand customers based on call center transcripts and recommend actions
• Combine 360 data to develop cross-sell/up-sell profiles
28. STATE STREET VERUS
New revenue and new product opportunities
29. FORTUNE 500 FINANCIAL SERVICES & INSURANCE PROVIDER
• ML analytics on call center transcripts and customer history to deliver personalization that boosts conversion
• Data discovery, elimination of data & workflow silos, and optimization of team structure driving massive operational efficiencies
30. MACHINE LEARNING IN BANKING & FINANCIAL SERVICES
Investments in a few major areas
Fraud, Risk & Security
Investment Decisions
Customer Insights & Experience
31. MACHINE LEARNING IN TELECOMMUNICATIONS
Investments in a few major areas
Network Optimization
Next Best Offer | Dynamic Marketing
Customer Churn | Propensity to Buy
32. MACHINE LEARNING IN MFG & IOT
Investments in a few major areas
Automation
Digital Ecosystem Optimization
Risk Mitigation
34. GLOBAL ELECTRONICS MANUFACTURER - CONNECTIVITY & SENSORS
Applied ML
Interpretability, advanced recommendation systems and RNNs to identify customer cross-sell & upsell opportunities
37. BANKING, FINANCIAL SERVICES & INSURANCE
38. RECENT TOPICS
Practical guidance for implementing the recently possible
Prediction with confidence
• Interpretable models that quantify uncertainty
• Investment market pricing prediction
• Fraud & anomaly detection
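One lightweight way to attach a confidence range to a prediction is the bootstrap: refit the model on resampled data many times and read an interval off the spread of predictions. This is a swapped-in illustration rather than the approach the slide describes; the data, the straight-line model, and the names `fit_line` and `bootstrap_interval` are hypothetical. The interval covers uncertainty in the fitted line, not the noise of an individual future observation.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_interval(points, x0, rng, n_boot=500, level=0.9):
    """Percentile interval for the fitted prediction at x0."""
    preds = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]  # resample with replacement
        slope, intercept = fit_line(sample)
        preds.append(slope * x0 + intercept)
    preds.sort()
    lo = preds[int((1 - level) / 2 * n_boot)]
    hi = preds[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

rng = random.Random(42)
# Hypothetical observations: y = 2x + 1 plus noise.
xs = (rng.uniform(0, 10) for _ in range(100))
points = [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in xs]

low, high = bootstrap_interval(points, 5.0, rng)
print(low, high)
```

A narrow interval signals that the data pins the prediction down well; a wide one is a prompt to collect more data or to treat the prediction with caution.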
Makes language computable
• Derive insights from large amounts of unstructured text data
• Call center transcript analysis and recommended actions
• News analysis and insight alerts for traders
39. AS NEW TECH CAPABILITIES EMERGE, BE READY