In early September, Apple released a paper describing Overton, the framework they built to create, monitor, and improve production-based ML systems. After presenting the main lines that define this framework, we will take a closer look at the heart of Overton: slice-based learning.
3. OverviewOverview
What / When / WhereWhat / When / Where
Publishing date
Early September
Conference
NeurIPS
Goal
ML software lifecycle management
Maturity
Production
11. Fine grained ML & multi-tasksFine grained ML & multi-tasks
ProblemsProblems
Lots of subtasks (implicit & explicit)
Need to evaluate & monitor them
Need to improve on them
12. Fine grained ML & multi-tasksFine grained ML & multi-tasks
ApproachApproach
Slice-based learning
De nition of data subsets
Augmentation of model capacity
Dedicated metrics
13. Fine grained ML & multi-tasksFine grained ML & multi-tasks
Slice-based learningSlice-based learning
14. Fine grained ML & multi-tasksFine grained ML & multi-tasks
De nition of critical dataDe nition of critical data
subsetssubsets
With “slice functions”:
def sf_bike(x):
return "bike" in object_detector(x)
def sf_night(x):
return avg(X.pixels.intensity) < 0.3
15. Fine grained ML & multi-tasksFine grained ML & multi-tasks
Slice expertsSlice experts
For each slice, an expert:
should add capacity to the model
has to know when to trigger
has to have dedicated metrics
16. Fine grained ML & multi-tasksFine grained ML & multi-tasks
Hard partsHard parts
Noise : slices are de ned with heuristics
Scale 1 : when the number of slices goes up, does the
model still run fast enough?
Scale 2 : when the number of slices goes up, is the
model still good enough?
17. Fine grained ML & multi-tasksFine grained ML & multi-tasks
Proposed solutionProposed solution
20. Weak supervisionWeak supervision
Problem statementProblem statement
We need data, but:
It's expensive, not available, yadda yadda
Especially for interesting corner cases
It's noisy
21. Weak supervisionWeak supervision
SolutionSolution
Use heuristics
@labeling_function()
def lf_regex_check_out(x):
"""Spam comments say 'check out my video', 'check it out', et
return SPAM if re.search(r"check.*out", x.text, flags=re.I) e