SlideShare une entreprise Scribd logo
1  sur  50
Automating Inventory at Stitch Fix
Using Beta Binomial Regression for Cold Start Problems
Sally Langford - Data Scientist
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
stitch-fix-ml
Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
How Stitch Fix works:
- Tell us about your style, fit and price preferences.
How Stitch Fix works:
- Tell us about your style, fit and price preferences.
- A personal stylist will curate five pieces for you.
How Stitch Fix works:
- Tell us about your style, fit and price preferences.
- A personal stylist will curate five pieces for you.
- Try all the items on at home.
How Stitch Fix works:
- Tell us about your style, fit and price preferences.
- A personal stylist will curate five pieces for you.
- Try all the items on at home.
- Give your stylist feedback on all items, then only pay for what you keep.
How Stitch Fix works:
- Tell us about your style, fit and price preferences.
- A personal stylist will curate five pieces for you.
- Try all the items on at home.
- Give your stylist feedback on all items, then only pay for what you keep.
- Return the other items in envelope provided.
How Stitch Fix works:
Benefits of Machine Learning in Inventory Management:
- Scalable with business.
- Rapid reforecasting.
- Capture nonlinear relationships.
- Cold start problems.
time
number of
units
order arrives
in warehouse
shirt is sent to
clients and is sold
plaid shirt
time
order arrives
in warehouse
shirt is sent to
clients and is sold
plaid shirt
number of
units
Inventory consumption of a style is proportional to;
- daily demand,
- clients for which the style is recommended,
- whether there are units in the warehouse,
- probability a stylist chooses to send the client this style,
- if the client buys the style.
Inventory consumption of a style is proportional to;
- daily demand,
- clients for which the style is recommended,
- whether there are units in the warehouse,
- probability a stylist chooses to send the client this style,
- if the client buys the style.
Inventory consumption of a style is proportional to;
- daily demand,
- clients for which the style is recommended,
- whether there are units in the warehouse,
- probability a stylist chooses to send the client this style,
- if the client buys the style.
Ranked styles recommended for client - which will the stylist choose to send?
Ranked styles recommended for client - which will the stylist choose to send?
Ranked styles recommended for client - which will the stylist choose to send?
Ranked styles recommended for client - which will the stylist choose to send?
Ranked styles recommended for client - which will the stylist choose to send?
plaid long-sleeve shirt
?
plaid long-sleeve shirt
plaid long-sleeve shirt
plaid long-sleeve shirt
...
plaid long-sleeve shirt
...
P(chosen) + P(not chosen) = 1
plaid long-sleeve shirt
?
...
plaid long-sleeve shirt
blue long-sleeve shirt
Prior Beliefs
Prior Beliefs
Evidence
Prior Beliefs
Posterior Beliefs
Evidence
N ~ Binom(Nav
, p)
N ~ Binom(Nav
, p)
N ~ Binom(Nav
, p)
p ~ B(α, β)
B(α’, β’) = B(α0
+ k, β0
+ n - k)
B(α’, β’) = B(α0
+ k, β0
+ n - k)
Step 1: Use maximum likelihood to calculate α0
and β0
for the distribution of p in
groups of similar styles.
Step 2: After a period of time, update this prior for the number of times the new
style has been recommended for a client (n), and chosen to be sent (k).
Step 3: Calculate the mean and confidence interval of p from the resulting
distribution. This is used as the probability that the new style will be chosen to be
sent to a client.
Step 4: Repeat steps 2-3.
VGAM (in python):
import rpy2.robjects as robjects
robjects.r.library("VGAM")
robjects.r("fit = vglm(cbind(successData,trialData - successData) ~ 1,
betabinomialff, trace=TRUE)")
alpha, beta = robjects.r("Coef(fit)")
import scipy
fit = scipy.stats.beta.fit(data, floc=0, fscale=1)
alpha, beta = fit[0], fit[1]
---
result = scipy.optimize.minimize(loss_function, p0, jac=True, **kwargs)
Data Storage
Job Scheduler
SQL Engine
SQL EngineHive Metastore
Flotilla: Auto scaling cluster
Job server for
Spark cluster
Data Scientist Code
Top 5 recommendations for client D
Top 5 recommendations for client C
Top 5 recommendations for client B
Top 5 recommendations for client A
Top 5 recommendations for client E
B(α , β ) = B(μ/σ, (1 - μ)/σ)
μ
σ
μ = μ0
+ μn
log(1 + n)
number of
units
time
now planned orders
new styles
number of
units
time
now
forecasted units
planned orders
new styles
How do we use our inventory forecast model?
- When should we re-order inventory?
- How should we buy inventory by size?
- How should orders be separated into different warehouses?
- When should a style not be sent out anymore, in place of a new option?
Metrics of success:
- Fraction of inventory out with clients compared to in the warehouse?
- How many styles are available to send to a client?
- ∆ in the beginning of month projected units.
- Cumulative units sold over time.
Do you want to calculate the probability of success in a binomial process?
Not enough data?
Use Beta Binomial Regression for your cold start problem!
slangford@stitchfix.com
Stitch Fix
Algorithms Blog
Algorithms Tour
@stitchfix_algo
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
stitch-fix-ml

Contenu connexe

Plus de C4Media

Plus de C4Media (20)

High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 

Dernier

Dernier (20)

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Automating Inventory at Stitch Fix

  • 1. Automating Inventory at Stitch Fix Using Beta Binomial Regression for Cold Start Problems Sally Langford - Data Scientist
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ stitch-fix-ml
  • 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. How Stitch Fix works:
  • 5. - Tell us about your style, fit and price preferences. How Stitch Fix works:
  • 6. - Tell us about your style, fit and price preferences. - A personal stylist will curate five pieces for you. How Stitch Fix works:
  • 7. - Tell us about your style, fit and price preferences. - A personal stylist will curate five pieces for you. - Try all the items on at home. How Stitch Fix works:
  • 8. - Tell us about your style, fit and price preferences. - A personal stylist will curate five pieces for you. - Try all the items on at home. - Give your stylist feedback on all items, then only pay for what you keep. How Stitch Fix works:
  • 9. - Tell us about your style, fit and price preferences. - A personal stylist will curate five pieces for you. - Try all the items on at home. - Give your stylist feedback on all items, then only pay for what you keep. - Return the other items in envelope provided. How Stitch Fix works:
  • 10. Benefits of Machine Learning in Inventory Management: - Scalable with business. - Rapid reforecasting. - Capture nonlinear relationships. - Cold start problems.
  • 11. time number of units order arrives in warehouse shirt is sent to clients and is sold plaid shirt
  • 12. time order arrives in warehouse shirt is sent to clients and is sold plaid shirt number of units
  • 13. Inventory consumption of a style is proportional to; - daily demand, - clients for which the style is recommended, - whether there are units in the warehouse, - probability a stylist chooses to send the client this style, - if the client buys the style.
  • 14. Inventory consumption of a style is proportional to; - daily demand, - clients for which the style is recommended, - whether there are units in the warehouse, - probability a stylist chooses to send the client this style, - if the client buys the style.
  • 15. Inventory consumption of a style is proportional to; - daily demand, - clients for which the style is recommended, - whether there are units in the warehouse, - probability a stylist chooses to send the client this style, - if the client buys the style.
  • 16. Ranked styles recommended for client - which will the stylist choose to send?
  • 17. Ranked styles recommended for client - which will the stylist choose to send?
  • 18. Ranked styles recommended for client - which will the stylist choose to send?
  • 19. Ranked styles recommended for client - which will the stylist choose to send?
  • 20. Ranked styles recommended for client - which will the stylist choose to send?
  • 26. ... P(chosen) + P(not chosen) = 1 plaid long-sleeve shirt
  • 28.
  • 34. N ~ Binom(Nav , p) p ~ B(α, β)
  • 35. B(α’, β’) = B(α0 + k, β0 + n - k)
  • 36. B(α’, β’) = B(α0 + k, β0 + n - k)
  • 37. Step 1: Use maximum likelihood to calculate α0 and β0 for the distribution of p in groups of similar styles. Step 2: After a period of time, update this prior for the number of times the new style has been recommended for a client (n), and chosen to be sent (k). Step 3: Calculate the mean and confidence interval of p from the resulting distribution. This is used as the probability that the new style will be chosen to be sent to a client. Step 4: Repeat steps 2-3.
  • 38. VGAM (in python): import rpy2.robjects as robjects robjects.r.library("VGAM") robjects.r("fit = vglm(cbind(successData,trialData - successData) ~ 1, betabinomialff, trace=TRUE)") alpha, beta = robjects.r("Coef(fit)") import scipy fit = scipy.stats.beta.fit(data, floc=0, fscale=1) alpha, beta = fit[0], fit[1] --- result = scipy.optimize.minimize(loss_function, p0, jac=True, **kwargs)
  • 39. Data Storage Job Scheduler SQL Engine SQL EngineHive Metastore Flotilla: Auto scaling cluster Job server for Spark cluster Data Scientist Code
  • 40.
  • 41. Top 5 recommendations for client D Top 5 recommendations for client C Top 5 recommendations for client B Top 5 recommendations for client A Top 5 recommendations for client E
  • 42. B(α , β ) = B(μ/σ, (1 - μ)/σ) μ σ
  • 43. μ = μ0 + μn log(1 + n)
  • 44. number of units time now planned orders new styles
  • 46. How do we use our inventory forecast model? - When should we re-order inventory? - How should we buy inventory by size? - How should orders be separated into different warehouses? - When should a style not be sent out anymore, in place of a new option?
  • 47. Metrics of success: - Fraction of inventory out with clients compared to in the warehouse? - How many styles are available to send to a client? - ∆ in the beginning of month projected units. - Cumulative units sold over time.
  • 48. Do you want to calculate the probability of success in a binomial process? Not enough data? Use Beta Binomial Regression for your cold start problem!
  • 50. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ stitch-fix-ml