The Road to Production: Automating your Anomaly Detectors - by jao (Jose A. Ortega), Co-Founder and Chief Technology Officer at BigML.
Machine Learning School in The Netherlands, 2022.
3. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
3 / 61
4. Machine Learning as a System Service
The goal
Machine Learning as a system-level service
• Accessibility
• Integrability
• Automation
• Ease of use
6. Machine Learning as a System Service
The goal
Machine Learning as a system-level service
The means
• APIs: ML building blocks
• Abstraction layer over feature
engineering
• Abstraction layer over algorithms
• Automation
7. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
10. RESTful done right: Whitebox resources
• Your data, your model
• Model reverse engineering becomes moot
• Maximizes reach (Web, CLI, desktop, IoT)
11. RESTful-ish ML Services
• Excellent abstraction layer
• Transparent data model
• Immutable resources and UUIDs: traceability
• Simple yet effective interaction model
• Easy access from any language (API bindings)
Algorithmic complexity and computing resources management
problems mostly washed away
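The immutable-resource model makes traceability mechanical: every resource is addressed by a stable ID of the form `<type>/<identifier>`. A minimal sketch of working with such IDs; the `parse_resource_id` helper and the exact 24-hex-digit grammar are illustrative assumptions, not part of any official bindings:

```python
import re

# Assumed ID grammar for this sketch: "<resource type>/<24 hex digits>".
RESOURCE_RE = re.compile(
    r"^(source|dataset|model|anomaly|evaluation)/([0-9a-f]{24})$")

def parse_resource_id(resource_id):
    """Split a resource ID into (type, uuid); raise ValueError if malformed."""
    match = RESOURCE_RE.match(resource_id)
    if match is None:
        raise ValueError(f"not a resource ID: {resource_id!r}")
    return match.group(1), match.group(2)
```

Because resources are immutable, such an ID always denotes the same artifact, which is what makes workflows auditable end to end.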
14. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
19. Tumor detection using anomalies
Given data about a tumor:
• Extract the relevant features that
characterize it (unsupervised
learning)
• Classify the tumor as either benign
or malignant, improving diagnosis
and avoiding unnecessary surgery
20. Tumor detection using anomalies
Given data about a tumor:
• Extract the relevant features that
characterize it (unsupervised
learning)
• Classify the tumor as either benign
or malignant, improving diagnosis
and avoiding unnecessary surgery
Example: University of Wisconsin Hospital’s Cancer dataset
https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/
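The idea can be sketched locally with scikit-learn, which ships this same Wisconsin dataset; this is only an illustration of anomaly flags versus malignancy labels, not the BigML workflow the talk describes:

```python
# Local sketch (assumption: scikit-learn stands in for BigML here).
# Fit an unsupervised anomaly detector, then check how its flags
# line up with the benign/malignant labels it never saw.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import IsolationForest

X, y = load_breast_cancer(return_X_y=True)    # y: 0 = malignant, 1 = benign
detector = IsolationForest(contamination=0.1, random_state=0).fit(X)
flags = detector.predict(X)                   # -1 = anomalous, 1 = normal

flagged = [label for flag, label in zip(flags, y) if flag == -1]
malignant_share = flagged.count(0) / len(flagged)
print(f"{len(flagged)} tumors flagged; {malignant_share:.0%} of them malignant")
```

The point of the example is the division of labor: the anomaly detector extracts "unusualness" without labels, and that signal is then used to support the benign/malignant decision.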
28. (Non) automation via Web UI
Strengths of Web UI
Simple: Just clicking around
Discoverable: Exploration and experimenting
Abstract: Transparent error handling and scalability
29. (Non) automation via Web UI
Strengths of Web UI
Simple: Just clicking around
Discoverable: Exploration and experimenting
Abstract: Transparent error handling and scalability
Problems of Web UI
Only simple: Simple tasks are simple, hard tasks quickly get hard
No automation or batch operations: Clicking humans don't scale well
30. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
34. Is this production code?
How do we generalize to, say, 100 datasets?
35. Example workflow: Python bindings
# Now do it 100 times, serially
train, test, model, evaluation = [], [], [], []
for i in range(0, 100):
    r, s = 0.8, i
    train.append(api.create_dataset(dataset, {"rate": r, "seed": s}))
    test.append(api.create_dataset(dataset, {"rate": r, "seed": s, "out_of_bag": True}))
    api.ok(train[i])
    model.append(api.create_model(train[i]))
    api.ok(model[i])
    api.ok(test[i])
    evaluation.append(api.create_evaluation(model[i], test[i]))
    api.ok(evaluation[i])
36. Example workflow: Python bindings
# More efficient if we parallelize, but at what level?
train, test, model, evaluation = [], [], [], []
for i in range(0, 100):
    r, s = 0.8, i
    train.append(api.create_dataset(dataset, {"rate": r, "seed": s}))
    test.append(api.create_dataset(dataset, {"rate": r, "seed": s, "out_of_bag": True}))
    # Do we wait here?
    api.ok(train[i])
    api.ok(test[i])
for i in range(0, 100):
    model.append(api.create_model(train[i]))
    api.ok(model[i])
for i in range(0, 100):
    evaluation.append(api.create_evaluation(model[i], test[i]))
    api.ok(evaluation[i])
37. Example workflow: Python bindings
# More efficient if we parallelize, but at what level?
train, test, model, evaluation = [], [], [], []
for i in range(0, 100):
    r, s = 0.8, i
    train.append(api.create_dataset(dataset, {"rate": r, "seed": s}))
    test.append(api.create_dataset(dataset, {"rate": r, "seed": s, "out_of_bag": True}))
for i in range(0, 100):
    # Or do we wait here?
    api.ok(train[i])
    model.append(api.create_model(train[i]))
for i in range(0, 100):
    # and here?
    api.ok(model[i])
    api.ok(test[i])
    evaluation.append(api.create_evaluation(model[i], test[i]))
    api.ok(evaluation[i])
38. Example workflow: Python bindings
# More efficient if we parallelize, but how do we handle errors?
train, test, model, evaluation = [], [], [], []
for i in range(0, 100):
    r, s = 0.8, i
    train.append(api.create_dataset(dataset, {"rate": r, "seed": s}))
    test.append(api.create_dataset(dataset, {"rate": r, "seed": s, "out_of_bag": True}))
for i in range(0, 100):
    api.ok(train[i])
    model.append(api.create_model(train[i]))
for i in range(0, 100):
    try:
        api.ok(model[i])
        api.ok(test[i])
        evaluation.append(api.create_evaluation(model[i], test[i]))
        api.ok(evaluation[i])
    except Exception:
        # How to recover if test[i] has failed? New datasets? Abort?
        pass
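One common client-side answer to both questions (where to wait, how to recover) is to make each train/test/model/evaluation chain an independent task and parallelize at the task level, so a failure only loses that one chain. A runnable sketch of the pattern; the `StubAPI` class is a hypothetical stand-in invented here so the structure can be shown without network access or real bindings:

```python
from concurrent.futures import ThreadPoolExecutor

class StubAPI:
    """Hypothetical stand-in for the real API bindings (no network calls)."""
    def create_dataset(self, parent, args=None):
        return {"resource": "dataset/stub", "args": args or {}}
    def create_model(self, dataset):
        return {"resource": "model/stub"}
    def create_evaluation(self, model, dataset):
        return {"resource": "evaluation/stub"}
    def ok(self, resource):
        return True  # the real call polls until the resource is finished

api = StubAPI()
dataset = {"resource": "dataset/parent"}

def evaluate_split(seed):
    """Build one whole chain; any failure stays contained to this task."""
    args = {"rate": 0.8, "seed": seed}
    train = api.create_dataset(dataset, args)
    test = api.create_dataset(dataset, dict(args, out_of_bag=True))
    api.ok(train)
    model = api.create_model(train)
    api.ok(model)
    api.ok(test)
    evaluation = api.create_evaluation(model, test)
    api.ok(evaluation)
    return evaluation

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(evaluate_split, i) for i in range(100)]
    results, failures = [], []
    for future in futures:
        try:
            results.append(future.result())
        except Exception as exc:
            failures.append(exc)  # recover, retry, or abort per chain
```

Even with this structure, all the scheduling, retry, and bookkeeping logic lives in the client, which is exactly the complexity the next slides object to.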
39. Client-side Machine Learning Automation
Problems of bindings-based, client solutions
Complexity: Lots of details outside the problem domain
Reuse: No inter-language compatibility
Scalability: Client-side workflows are hard to optimize
Reproducibility: Noisy, complex and hard to audit development environment
Not enough abstraction
41. Rich, parameterized workflows: cross-validation
# --dataset takes a parameterized input;
# --k-folds sets the number of folds during validation
bigmler analyze --cross-validation \
        --dataset $(cat output/diabetes/dataset) \
        --k-folds 3 \
        --output-dir output/diabetes-validation
42. Client-side Machine Learning automation
Problems of client-side solutions
Hard to generalize: Declarative client tools hide complexity at the cost of flexibility
Hard to combine: Black–box tools cannot be easily integrated as parts of bigger client–side workflows
Hard to audit: Client–side development environments are complex and very hard to sandbox
Not enough automation
43. Client-side Machine Learning automation
Problems of client-side solutions
Complex: Too fine-grained, leaky abstractions
Cumbersome: Error handling, network issues
Hard to reuse: Tied to a single programming language
Hard to scale: Parallelization again a problem
Hard to generalize: Declarative client tools hide complexity at the cost of flexibility
Hard to combine: Black–box tools cannot be easily integrated as parts of bigger client–side workflows
Hard to audit: Client–side development environments are complex and very hard to sandbox
Not enough abstraction
44. Client-side Machine Learning automation
Problems of client-side solutions
Complex: Too fine-grained, leaky abstractions
Cumbersome: Error handling, network issues
Hard to reuse: Tied to a single programming language
Hard to scale: Parallelization again a problem
Hard to generalize: Declarative client tools hide complexity at the cost of flexibility
Hard to combine: Black–box tools cannot be easily integrated as parts of bigger client–side workflows
Hard to audit: Client–side development environments are complex and very hard to sandbox
The algorithmic complexity and computing resources management problems that were "mostly washed away" are back!
45. Client-side Machine Learning automation
Algorithmic complexity and computing resources management problems are back!
46. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
52. In a Nutshell
1. Workflows reified as server–side, RESTful resources
2. Domain–specific language for ML workflow automation
53. Workflows as RESTful Resources
Library: Reusable building block: a collection of WhizzML definitions that can be imported by other libraries or scripts.
Script: Executable code that describes an actual workflow.
• Imports: List of libraries with code used by the script.
• Inputs: List of input values that parameterize the workflow.
• Outputs: List of values computed by the script and returned to the user.
Execution: Given a script and a complete set of inputs, the workflow can be executed and its outputs generated.
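The rough anatomy of such a script resource and of an execution request can be sketched as JSON; the field names below are illustrative of the Library/Script/Execution model, not an exact API schema, and the `script/...` and `dataset/...` IDs are placeholders:

```python
# Illustrative only: a script declares its code, imports, inputs and outputs,
# and an execution pairs a script with a complete set of input values.
script = {
    "source_code": "(create-model dataset-id)",
    "imports": [],                                  # libraries the script uses
    "inputs": [{"name": "dataset-id", "type": "dataset-id"}],
    "outputs": [{"name": "model-id", "type": "model-id"}],
}
execution_request = {
    "script": "script/...",                         # placeholder script ID
    "inputs": [["dataset-id", "dataset/..."]],      # complete set of inputs
}
```

Because both objects are themselves RESTful resources, workflows gain the same traceability and immutability as datasets and models.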
59. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
61. Syntactic Abstraction in WhizzML: Simple workflow
;; ML artifacts are first-class citizens,
;; we only need to talk about our domain
(let ([train-id test-id] (create-dataset-split id 0.8)
      model-id (create-model train-id))
  (create-evaluation test-id
                     model-id
                     {"name" "Evaluation 80/20"
                      "missing_strategy" 0}))
62. Syntactic Abstraction in WhizzML: Simple workflow
;; ML artifacts are first-class citizens,
;; we only need to talk about our domain
(let ([train-id test-id] (create-dataset-split id 0.8)
      model-id (create-model train-id))
  (create-evaluation test-id
                     model-id
                     {"name" "Evaluation 80/20"
                      "missing_strategy" 0}))
Ready for production!
63. Domain Specificity and Scalability: Trivial parallelization
;; Workflow for 1 resource
(let ([train-id test-id] (create-dataset-split id 0.8)
      model-id (create-model train-id))
  (create-evaluation test-id model-id))
64. Domain Specificity and Scalability: Trivial parallelization
;; Workflow for arbitrary number of resources
(let (splits (for (id input-datasets)
               (create-dataset-split id 0.8)))
  (for (split splits)
    (create-evaluation (create-model (split 0)) (split 1))))
65. Domain Specificity and Scalability: Trivial parallelization
;; Workflow for arbitrary number of resources
(let (splits (for (id input-datasets)
               (create-dataset-split id 0.8)))
  (for (split splits)
    (create-evaluation (create-model (split 0)) (split 1))))
Ready for production!
69. Outline
1 ML as a system service
2 ML as a RESTful cloudy service
3 Machine Learning workflows
4 Client–side automation
5 Server–side workflow automation
6 A first taste of WhizzML: abstraction is back
7 And back to the (distributed) client: BigMLOps
70. Package and deploy BigML workflows in a few clicks
1 Create an Application
2 Connect to BigML and add Workflows and models
3 Package everything in a container
4 Deploy and monitor your application