7. What we’re covering
• Boundary Erosion
• Data Dependencies
• Spaghetti Code
• The Real World
8. `whoami`
• O’Reilly Author - Thoughtful Machine Learning. Use AUTHD to get a discount on oreilly.com.
• Former Financial Quant
• Independent Consultant
• @mjkirk
28. Solution: Tombstones
• def run_this_once_in_prod!; Tombstone.new('2014-01-02'); end
• When you think something is dead, put a Tombstone on it
• https://www.youtube.com/watch?v=29UXzfQWOhQ
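The slide only shows the call site, so here is a minimal sketch of what a Tombstone could look like. The class body, the `hits` list, and the logger hook are assumptions made for illustration, not the API of any particular gem. The idea: each time supposedly dead code runs, the tombstone records the date it was planted and where it was hit.

```ruby
require 'date'
require 'logger'

# Hypothetical Tombstone class, sketched for illustration only.
class Tombstone
  class << self
    attr_writer :logger

    # Defaults to stderr; swap in your application's logger.
    def logger
      @logger ||= Logger.new($stderr)
    end

    # In-memory list of every tombstone hit in this process.
    def hits
      @hits ||= []
    end
  end

  attr_reader :planted_on, :location

  # planted_on: date string recording when the code was marked as suspect.
  def initialize(planted_on)
    @planted_on = Date.parse(planted_on)
    # Call-site string as reported by Kernel#caller.
    @location = Array(caller(1, 1)).first
    self.class.hits << self
    self.class.logger.warn("Tombstone from #{@planted_on} hit at #{@location}")
  end
end

# The method from the slide: believed dead, so it carries a tombstone.
def run_this_once_in_prod!
  Tombstone.new('2014-01-02')
end

run_this_once_in_prod!
```

If no warning shows up in the logs after a reasonable period, the method is a candidate for deletion. If one does, the code is still alive and the log line says where.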
38. Lessons Learned in One Slide
Danger                Solutions
Entanglement          Regularize or Isolate Models
Visibility Debt       Keep an access log of who uses what
Unstable Data         Version datasets
Underutilized Data    Trim by finding better features
Glue Code             Write your own implementations
Pipeline Jungle       Find minimum cut in systems
Experimental Paths    Use Tombstones
Configuration Debt    Reconfigure with new datasets
Fixed Thresholds      Include accuracy as part of model
Correlation Changes   Trim non-causal data from models
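As one illustration of the "Version datasets" remedy listed above, a dataset can be given a version id derived from its content, so a model can record exactly which data it was trained on. This is a rough sketch under assumptions: the `dataset_version` helper and the sample rows are hypothetical, and rows are assumed to serialize in a stable order.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: derive a short version id from the dataset's content.
# Assumes rows (and their keys) are always serialized in the same order.
def dataset_version(rows)
  canonical = JSON.generate(rows)
  Digest::SHA256.hexdigest(canonical)[0, 12]
end

# Made-up sample rows; any change to the data yields a different version id.
v1 = dataset_version([{ 'user' => 1, 'clicks' => 3 }])
v2 = dataset_version([{ 'user' => 1, 'clicks' => 4 }])

puts "v1=#{v1} v2=#{v2} changed=#{v1 != v2}"
```

Storing the version id next to each trained model makes it possible to tell when an upstream data change, rather than a code change, moved the results.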
39. Links and Contact
• @mjkirk
• matt@matthewkirk.com
• Machine Learning: The High-Interest Credit Card of Technical Debt: https://bit.ly/1zs9TXi
• Is that code dead?: http://bit.ly/1sg0B1L
40. Photo Sources
• Cost of gigabyte: http://royal.pingdom.com/2011/12/19/would-you-pay-7260-for-a-3-tb-drive-charting-hdd-and-ssd-prices-over-time/
• Golden Opportunity: https://flic.kr/p/7xvfZr
• Problems are Opportunities: https://flic.kr/p/ifFos
• Master Charge: https://flic.kr/p/noQUh1
• Erosion: https://flic.kr/p/9agH2q
• Coupler: https://flic.kr/p/ppm9HG
• Fruit Loops: https://flic.kr/p/5rkLhP
• Somewhere in Quản Bạ, Hà Giang: https://flic.kr/p/q4K9Bo
• Data Dependencies: https://flic.kr/p/dVq7vg
• Unstable!: https://flic.kr/p/s7RLj
• Underutilized Piano: https://flic.kr/p/2sZVP
• Spaghetti: https://flic.kr/p/tuwkp
• Glue: https://flic.kr/p/6L13SK
• Pipelines at google: https://flic.kr/p/pvLQG2