AI algorithms offer great promise in criminal justice, credit scoring, hiring, and other domains. However, algorithmic fairness is a legitimate concern: bias and adversarial contamination can enter through training data, inappropriate data handling or model selection, or incorrect algorithm design. This talk discusses how to build an open, transparent, secure, and fair pipeline that fully integrates into the AI lifecycle, leveraging open-source projects such as AI Fairness 360 (AIF360), the Adversarial Robustness Toolbox (ART), Fabric for Deep Learning (FfDL), and the Model Asset eXchange (MAX).
1. Open, Secure & Transparent AI Pipelines
Nick Pentreath, Principal Engineer IBM
2. About
@MLnick on Twitter & Github
Principal Engineer, IBM
CODAIT - Center for Open-Source Data
& AI Technologies
Machine Learning & AI
Apache Spark committer & PMC
Author of Machine Learning with Spark
Various conferences & meetups
3. Center for Open Source Data and AI Technologies
Watson West Building
505 Howard St.
San Francisco, California
CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise.
Relaunch of the IBM Spark Technology Center (STC) to reflect an expanded mission.
We contribute to foundational open-source software across the enterprise AI lifecycle.
36 open-source developers!
Improving Enterprise AI Lifecycle in Open Source
CODAIT
codait.org
17. Fabric for Deep Learning
https://developer.ibm.com/open/projects/fabric-for-deep-learning-ffdl/
https://github.com/IBM/FfDL
• Fabric for Deep Learning, or FfDL (pronounced “fiddle”), aims to make deep learning easily accessible to data scientists and AI developers.
• FfDL provides a consistent way to train and visualize deep learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch and Keras.
FfDL
Community Partners
FfDL is one of InfoWorld’s 2018 Best of Open Source Software Award winners for machine learning and deep learning!
19. What, Where, How?
• What are you deploying?
• What is a “model”?
• Where are you deploying?
• Target environment
• Batch, streaming, real-time?
• How are you deploying?
• “devops” deployment mechanism
• Serving framework
We will talk mostly about the “what”
21. Pipelines, not Models
• Deploying just the model part of the workflow is not enough
• The entire pipeline must be deployed
• Data transforms
• Feature extraction & pre-processing
• The ML model itself
• Prediction transformation
• Technically, even ETL is part of the pipeline!
• Pipelines in open-source frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)
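As a concrete illustration of "pipelines, not models", here is a minimal scikit-learn sketch: the scaler and the classifier are fitted and deployed as one object, so the pre-processing cannot drift apart from the model (the toy data and step names are made up):

```python
# Minimal sketch: the whole pipeline (scaling + model) is the deployable unit,
# not just the fitted classifier.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = Pipeline([
    ("scale", StandardScaler()),    # feature pre-processing travels with the model
    ("clf", LogisticRegression()),  # the ML model itself
])

X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
y = [0, 0, 1, 1]
pipe.fit(X, y)

# At serving time, one call applies scaling and prediction consistently
print(pipe.predict([[2.5, 1.5]]))
```

Serializing `pipe` (rather than `pipe.named_steps["clf"]` alone) is what keeps training-time and serving-time feature handling identical.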
22. Challenges
• Proliferation of formats
• Open source, open standard: PMML, PFA, ONNX
• Open source, non-standard: MLeap, Spark, TensorFlow, Keras, PyTorch, Caffe, …
• Proprietary formats: lock-in, not portable
• Lack of standardization leads to custom solutions
• Where standards exist, limitations lead to custom extensions, eliminating the benefits
• Need to manage and bridge many different:
• Languages – Python, R, Notebooks, Scala / Java / C
• Frameworks – too many to count!
• Dependencies
• Versions
• Performance characteristics can be highly variable across these dimensions
• Friction between teams
• Data scientists & researchers – latest & greatest
• Production – stability, control, minimize changes, performance
• Business – metrics, business impact, product must always work!
24. Containers for ML Deployment
• Container-based deployment has significant benefits
• Repeatability
• Ease of configuration
• Separation of concerns – focus on what, not how
• Allow data scientists & researchers to use their language / framework of choice
• Container frameworks take care of (certain) monitoring, fault tolerance, HA, etc.
• But …
• What goes in the container is most important
• Performance can be highly variable across language, framework, version
• Requires devops knowledge, CI / deployment pipelines, good practices
• Does not solve the issue of standardization
• Formats
• APIs exposed
• A serving framework is still required on top
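To ground the "repeatability" point, a minimal Dockerfile sketch for serving a pickled pipeline; `serve.py`, `pipeline.pkl`, and the port are hypothetical placeholders, and a real deployment would add a serving framework and health checks on top:

```dockerfile
# Hypothetical minimal serving image: pins the runtime, bundles the
# exact dependency versions and the fitted pipeline artifact together.
FROM python:3.6-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY pipeline.pkl serve.py ./
EXPOSE 5000
CMD ["python", "serve.py"]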
25. IBM Developer
Model Asset eXchange
http://ibm.biz/model-exchange
Free, open-source deep learning models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a web service in 30 seconds.
Start training on Fabric for Deep Learning (FfDL) or Watson Machine Learning in minutes.
28. Predictive Model Markup Language (PMML)
• Data Mining Group (DMG)
• Model interchange format in XML with operators
• Widely used and supported; open standard
• Spark support lacking natively, but 3rd-party projects available: jpmml-sparkml
• Comprehensive support for Spark ML components (perhaps surprisingly!)
• Watch SPARK-11237
• Other exporters include scikit-learn, R, XGBoost and LightGBM
• Shortcomings
• Cannot represent arbitrary programs / analytic applications
• Flexibility comes from custom plugins => lose benefits of standardization
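To give a feel for the XML format, a hand-written sketch of a minimal PMML document for a one-split decision stump; the field names and threshold are made up, and real exporters such as jpmml-sparkml emit much richer documents:

```xml
<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">
  <Header description="Hypothetical minimal decision stump"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="label" optype="categorical" dataType="string">
      <Value value="yes"/>
      <Value value="no"/>
    </DataField>
  </DataDictionary>
  <TreeModel modelName="stump" functionName="classification">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="label" usageType="target"/>
    </MiningSchema>
    <Node score="no">
      <True/>
      <Node score="yes">
        <SimplePredicate field="x1" operator="greaterThan" value="0.5"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
```

The model is declarative data, not code, which is exactly why arbitrary programs cannot be represented without custom extensions.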
29. Portable Format for Analytics (PFA)
• PFA is being championed by the DMG
• PFA consists of:
• JSON serialization format
• Avro schemas for data types
• Encodes functions (actions) that are applied to inputs to create outputs, with a set of built-in functions and language constructs (e.g. control flow, conditionals)
• Essentially a mini functional math language + schema specification
• Portability across languages, frameworks, runtimes and versions
• Shortcomings
• PFA is still young and needs to gain adoption
• Production robustness and performance at scale?
• Limitations of PFA – especially for deep learning applications
• A standard can move slowly in terms of new features, fixes and enhancements
https://github.com/CODAIT/aardpfark
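A minimal PFA document showing the structure: Avro-typed input and output plus an action. This is the classic "add a constant" hello-world; a document produced by an exporter like aardpfark would carry a much larger action block:

```json
{
  "input": "double",
  "output": "double",
  "action": [
    {"+": ["input", 1.0]}
  ]
}
```

A PFA scoring engine reads this JSON, type-checks it against the Avro schemas, and applies the action to each input record, which is what makes the format portable across runtimes.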
30. Open Neural Network Exchange (ONNX)
• Championed by Facebook & Microsoft
• Protobuf serialization format
• Describes computation graph (including operators)
• In this way the serialized graph is “self-describing”, similarly to PFA
• More focused on deep learning / tensor operations
• Baked into PyTorch 1.0.0 / Caffe2 as the serialization & interchange format
• ONNX-ML covers “traditional” ML operators
• Shortcomings
• Relatively poor support (currently) for “traditional” ML or general language constructs
• String / categorical processing
• Datetime, collection operators
• Intermediate variables
• Evolving standard – coverage of operators (e.g. TensorFlow)
https://github.com/onnx
32. Monitoring
Software: traditional software monitoring – latency, throughput, resource usage, etc.
Model metrics: traditional ML evaluation measures + bias, robustness, explainability
Business metrics: impact of predictions on business outcomes
• Additional revenue – e.g. uplift from a recommender
• Cost savings – e.g. value of fraud prevented
• Metrics implicitly influencing these – e.g. user engagement
33. Feedback
Adapt: an intelligent system must automatically learn from & adapt to the world around it
Continual learning: retraining, online learning, reinforcement learning
Feedback loops
• Explicit: models create or directly influence their own training data
• Implicit: predictions influence behavior in longer-term or indirect ways
Humans in the loop
[Lifecycle diagram: Data → Transform → Train → Deploy → Feedback]
35. AI Fairness 360 (AIF360)
The AIF360 toolkit is an open-source library to help detect and remove bias in machine learning models. The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
Toolbox:
• Fairness metrics (30+)
• Fairness metric explanations
• Bias mitigation algorithms (10)
https://github.com/IBM/AIF360
https://developer.ibm.com/patterns/ensuring-fairness-when-processing-loan-applications/
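As a flavor of what such fairness metrics compute, here is a pure-Python sketch of two common group-fairness metrics, statistical parity difference and disparate impact. This illustrates the definitions only; it is not the AIF360 API, and the example predictions are made up:

```python
# Sketch of two group-fairness metrics (definitions only, not the AIF360 API).

def statistical_parity_difference(y_pred, protected):
    """P(favorable | unprivileged) - P(favorable | privileged); 0 is parity."""
    priv = [y for y, p in zip(y_pred, protected) if p == 1]
    unpriv = [y for y, p in zip(y_pred, protected) if p == 0]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)

def disparate_impact(y_pred, protected):
    """Ratio of favorable-outcome rates; values below ~0.8 are a common red flag."""
    priv = [y for y, p in zip(y_pred, protected) if p == 1]
    unpriv = [y for y, p in zip(y_pred, protected) if p == 0]
    return (sum(unpriv) / len(unpriv)) / (sum(priv) / len(priv))

# 1 = favorable prediction; protected: 1 = privileged group, 0 = unprivileged
y_pred    = [1, 1, 0, 1, 0, 0, 1, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]

print(statistical_parity_difference(y_pred, protected))  # -> -0.5
print(disparate_impact(y_pred, protected))               # -> 0.3333...
```

Here the unprivileged group receives the favorable outcome at a third of the privileged group's rate, which both metrics flag.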
39. IBM Adversarial Robustness Toolbox (ART)
ART is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides implementations of many state-of-the-art methods for attacking and defending classifiers.
Toolbox:
• Evasion attacks (11)
• Defenses (9)
• Detection methods for adversarial samples & poisoning attacks
• Robustness metrics
https://github.com/IBM/adversarial-robustness-toolbox
https://developer.ibm.com/patterns/integrate-adversarial-attacks-model-training-pipeline/
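To make "evasion attack" concrete, a self-contained sketch in the style of the fast gradient sign method (FGSM) against a fixed, hypothetical logistic-regression classifier. The weights and epsilon are made up, and ART's own implementations are far more general:

```python
# FGSM-style evasion attack sketch on a fixed logistic-regression classifier.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """Step each feature by eps in the direction that increases logistic loss."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]            # d(loss)/dx for logistic loss
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

w, b = [2.0, -3.0], 0.0                          # hypothetical fixed classifier
x, y = [0.5, 0.5], 0                             # clean input, true label 0
x_adv = fgsm(x, y, w, b, eps=0.4)

predict = lambda v: int(sigmoid(sum(wi * vi for wi, vi in zip(w, v)) + b) > 0.5)
print(predict(x), predict(x_adv))  # clean input classified 0, adversarial input 1
```

A small, bounded perturbation (here 0.4 per feature) is enough to flip the prediction, which is exactly the failure mode the toolbox's defenses and detectors target.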
40. Getting CLEVER
Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER):
• Published by IBM Research
• Attack-agnostic robustness measure
• Efficient computation
https://medium.com/@MITIBMLab/getting-clever-er-expanding-the-scope-of-a-robustness-metric-for-neural-networks-81c6c6ecb
https://arxiv.org/abs/1801.10578
https://github.com/IBM/CLEVER-Robustness-Score
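A toy illustration of the idea, not the CLEVER algorithm itself: CLEVER estimates the local Lipschitz constant L of the classification margin and uses margin / L as an attack-agnostic robustness score. The sketch below replaces the extreme-value-theory estimator with a crude max over random samples, on a hypothetical linear margin function:

```python
# Toy "CLEVER-like" robustness score: margin(x) / estimated local Lipschitz
# constant. CLEVER proper uses extreme value theory; this uses a naive max.
import random

def margin(x):
    """Hypothetical 2-class margin g(x); the prediction flips when g(x) <= 0."""
    return 2.0 * x[0] - 3.0 * x[1] + 1.0

def clever_like_score(x, radius=0.5, samples=2000, seed=0):
    rng = random.Random(seed)
    g0, lip = margin(x), 0.0
    for _ in range(samples):
        d = [rng.uniform(-radius, radius) for _ in x]
        norm = max(abs(di) for di in d) or 1e-12       # L-infinity norm of the step
        shifted = [xi + di for xi, di in zip(x, d)]
        lip = max(lip, abs(margin(shifted) - g0) / norm)
    return g0 / lip    # rough lower bound on the perturbation needed to flip x

x = [1.0, 0.5]
print(clever_like_score(x))
```

A larger score means a larger estimated perturbation is needed to change the prediction, without ever running a specific attack, which is what makes the measure attack-agnostic.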