SlideShare a Scribd company logo
1 of 42
Developing robust training
systems
About me
Illarion Khlestov, Senior Research Engineer
GitHub: https://github.com/ikhlestov
Blog: https://medium.com/@illarionkhlestov
Facebook: https://www.facebook.com/i.khlestov
Motivation
It’s not complicated
Structure
25-30 mins
10 mins
Data handling
Data variety issue
Solution - unified format
How to store data?
Format validation
- jsonschema
- trafaret
- validr
- voluptuous
Inbound data validation
- Allowed range of entries
- Annotation quality
- Data quantity
- Checksums
- Suitability
Visualization
Plotly & Dash
Plotly & Dash
Plotly structuring
Moving between servers
Data downloaders
Auto synchronization
Data update & cleanup
Training
The simplest approach
Results saving
- id:
- Logs
- Weights
- Graphs
- Predictions
- id example:
- 1541023813_mobile_net_trained_with_SGD_001
Configs
Python configs
Python configs import
https://docs.python.org/3/library/importlib.html
Configs updates
Network validation
- Test run:
- Train
- Validation
- Test
- Results saving
- Size calculation
- Speed calculation
Training Speedup
Image source
Evaluation
Summary table
Configs diffs
Model evaluation table
- Dataset ID
- Link to dataset statistic
- Data preprocessing
- Train graphs and logs
- Performance
- Accuracy
- Speed
Can you spot a frog-cat?
Generated by facets
Mislabeling Debugging
Pre-production
- Additional metrics
- Don’t mix ids
- Test auto-check
Training result structure
Bonus: Libraries
- https://github.com/IDSIA/sacred - configure, organize, log and reproduce experiments
- https://github.com/henripal/labnotebook - flexibly monitor, record, save, and query experiments
- https://github.com/facebookresearch/visdom - visualizations of live, rich data
- https://github.com/uber/horovod - distributed training framework
- https://github.com/autonomio/talos - Hyperparameter Optimization for Keras Models
- https://github.com/hyperopt/hyperopt - Hyperparameter Optimization in Python
http://bit.ly/training_systems

More Related Content

Similar to Illarion Khlestov "Developing robust ML training systems"

Similar to Illarion Khlestov "Developing robust ML training systems" (20)

Practical operability techniques for teams - webinar - Skelton Thatcher & Unicom
Practical operability techniques for teams - webinar - Skelton Thatcher & UnicomPractical operability techniques for teams - webinar - Skelton Thatcher & Unicom
Practical operability techniques for teams - webinar - Skelton Thatcher & Unicom
 
Practical, team-focused operability techniques for distributed systems - DevO...
Practical, team-focused operability techniques for distributed systems - DevO...Practical, team-focused operability techniques for distributed systems - DevO...
Practical, team-focused operability techniques for distributed systems - DevO...
 
MSBI Online Training in Hyderabad
MSBI Online Training in HyderabadMSBI Online Training in Hyderabad
MSBI Online Training in Hyderabad
 
MSBI Online Training in India
MSBI Online Training in IndiaMSBI Online Training in India
MSBI Online Training in India
 
MSBI Online Training
MSBI Online Training MSBI Online Training
MSBI Online Training
 
Automating Desktop Management with Windows Powershell V2.0 and Group Policy M...
Automating Desktop Management with Windows Powershell V2.0 and Group Policy M...Automating Desktop Management with Windows Powershell V2.0 and Group Policy M...
Automating Desktop Management with Windows Powershell V2.0 and Group Policy M...
 
Welcome Webinar Slides
Welcome Webinar SlidesWelcome Webinar Slides
Welcome Webinar Slides
 
Sprint 71
Sprint 71Sprint 71
Sprint 71
 
Model-based Analysis of Java EE Web Security Configurations - Mise 2016
Model-based Analysis of Java EE Web Security Configurations - Mise 2016Model-based Analysis of Java EE Web Security Configurations - Mise 2016
Model-based Analysis of Java EE Web Security Configurations - Mise 2016
 
Avoiding the 10 Deadliest and Most Common Sins for Securing Windows
Avoiding the 10 Deadliest and Most Common Sins for Securing WindowsAvoiding the 10 Deadliest and Most Common Sins for Securing Windows
Avoiding the 10 Deadliest and Most Common Sins for Securing Windows
 
Alluxio Webinar - Maximize GPU Utilization for Model Training
Alluxio Webinar - Maximize GPU Utilization for Model TrainingAlluxio Webinar - Maximize GPU Utilization for Model Training
Alluxio Webinar - Maximize GPU Utilization for Model Training
 
Practical operability techniques for teams - Matthew Skelton - Agile in the C...
Practical operability techniques for teams - Matthew Skelton - Agile in the C...Practical operability techniques for teams - Matthew Skelton - Agile in the C...
Practical operability techniques for teams - Matthew Skelton - Agile in the C...
 
Practical operability techniques for teams - IPEXPO 2017
Practical operability techniques for teams - IPEXPO 2017Practical operability techniques for teams - IPEXPO 2017
Practical operability techniques for teams - IPEXPO 2017
 
Management Information System by professor sai chandu
Management Information System by professor sai chanduManagement Information System by professor sai chandu
Management Information System by professor sai chandu
 
Rameshwar panchal Resume
Rameshwar panchal ResumeRameshwar panchal Resume
Rameshwar panchal Resume
 
PyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomersPyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomers
 
Semantic technologies in practice - KULeuven 2016
Semantic technologies in practice - KULeuven 2016Semantic technologies in practice - KULeuven 2016
Semantic technologies in practice - KULeuven 2016
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
 
New World Of SharePoint 2010 Administration Oleson
New World Of SharePoint 2010 Administration OlesonNew World Of SharePoint 2010 Administration Oleson
New World Of SharePoint 2010 Administration Oleson
 
SQL Bits 2018 | Best practices for Power BI on implementation and monitoring
SQL Bits 2018 | Best practices for Power BI on implementation and monitoring SQL Bits 2018 | Best practices for Power BI on implementation and monitoring
SQL Bits 2018 | Best practices for Power BI on implementation and monitoring
 

More from Lviv Startup Club

More from Lviv Startup Club (20)

Artem Bykovets: 4 Вершники апокаліпсису робочих стосунків (+антидоти до них) ...
Artem Bykovets: 4 Вершники апокаліпсису робочих стосунків (+антидоти до них) ...Artem Bykovets: 4 Вершники апокаліпсису робочих стосунків (+антидоти до них) ...
Artem Bykovets: 4 Вершники апокаліпсису робочих стосунків (+антидоти до них) ...
 
Dmytro Khudenko: Challenges of implementing task managers in the corporate an...
Dmytro Khudenko: Challenges of implementing task managers in the corporate an...Dmytro Khudenko: Challenges of implementing task managers in the corporate an...
Dmytro Khudenko: Challenges of implementing task managers in the corporate an...
 
Sergii Melnichenko: Лідерство в Agile командах: ТОП-5 основних психологічних ...
Sergii Melnichenko: Лідерство в Agile командах: ТОП-5 основних психологічних ...Sergii Melnichenko: Лідерство в Agile командах: ТОП-5 основних психологічних ...
Sergii Melnichenko: Лідерство в Agile командах: ТОП-5 основних психологічних ...
 
Mariia Rashkevych: Підвищення ефективності розроблення та реалізації освітніх...
Mariia Rashkevych: Підвищення ефективності розроблення та реалізації освітніх...Mariia Rashkevych: Підвищення ефективності розроблення та реалізації освітніх...
Mariia Rashkevych: Підвищення ефективності розроблення та реалізації освітніх...
 
Mykhailo Hryhorash: What can be good in a "bad" project? (UA)
Mykhailo Hryhorash: What can be good in a "bad" project? (UA)Mykhailo Hryhorash: What can be good in a "bad" project? (UA)
Mykhailo Hryhorash: What can be good in a "bad" project? (UA)
 
Oleksii Kyselov: Що заважає ПМу зростати? Розбір практичних кейсів (UA)
Oleksii Kyselov: Що заважає ПМу зростати? Розбір практичних кейсів (UA)Oleksii Kyselov: Що заважає ПМу зростати? Розбір практичних кейсів (UA)
Oleksii Kyselov: Що заважає ПМу зростати? Розбір практичних кейсів (UA)
 
Yaroslav Osolikhin: «Неідеальний» проєктний менеджер: People Management під ч...
Yaroslav Osolikhin: «Неідеальний» проєктний менеджер: People Management під ч...Yaroslav Osolikhin: «Неідеальний» проєктний менеджер: People Management під ч...
Yaroslav Osolikhin: «Неідеальний» проєктний менеджер: People Management під ч...
 
Mariya Yeremenko: Вплив Генеративного ШІ на сучасний світ та на особисту ефек...
Mariya Yeremenko: Вплив Генеративного ШІ на сучасний світ та на особисту ефек...Mariya Yeremenko: Вплив Генеративного ШІ на сучасний світ та на особисту ефек...
Mariya Yeremenko: Вплив Генеративного ШІ на сучасний світ та на особисту ефек...
 
Petro Nikolaiev & Dmytro Kisov: ТОП-5 методів дослідження клієнтів для успіху...
Petro Nikolaiev & Dmytro Kisov: ТОП-5 методів дослідження клієнтів для успіху...Petro Nikolaiev & Dmytro Kisov: ТОП-5 методів дослідження клієнтів для успіху...
Petro Nikolaiev & Dmytro Kisov: ТОП-5 методів дослідження клієнтів для успіху...
 
Maksym Stelmakh : Державні електронні послуги та сервіси: чому бізнесу варто ...
Maksym Stelmakh : Державні електронні послуги та сервіси: чому бізнесу варто ...Maksym Stelmakh : Державні електронні послуги та сервіси: чому бізнесу варто ...
Maksym Stelmakh : Державні електронні послуги та сервіси: чому бізнесу варто ...
 
Alexander Marchenko: Проблеми росту продуктової екосистеми (UA)
Alexander Marchenko: Проблеми росту продуктової екосистеми (UA)Alexander Marchenko: Проблеми росту продуктової екосистеми (UA)
Alexander Marchenko: Проблеми росту продуктової екосистеми (UA)
 
Oleksandr Grytsenko: Save your Job або прокачай скіли до Engineering Manageme...
Oleksandr Grytsenko: Save your Job або прокачай скіли до Engineering Manageme...Oleksandr Grytsenko: Save your Job або прокачай скіли до Engineering Manageme...
Oleksandr Grytsenko: Save your Job або прокачай скіли до Engineering Manageme...
 
Yuliia Pieskova: Фідбек: не лише "як", але й "коли" і "навіщо" (UA)
Yuliia Pieskova: Фідбек: не лише "як", але й "коли" і "навіщо" (UA)Yuliia Pieskova: Фідбек: не лише "як", але й "коли" і "навіщо" (UA)
Yuliia Pieskova: Фідбек: не лише "як", але й "коли" і "навіщо" (UA)
 
Nataliya Kryvonis: Essential soft skills to lead your team (UA)
Nataliya Kryvonis: Essential soft skills to lead your team (UA)Nataliya Kryvonis: Essential soft skills to lead your team (UA)
Nataliya Kryvonis: Essential soft skills to lead your team (UA)
 
Volodymyr Salyha: Stakeholder Alchemy: Transforming Analysis into Meaningful ...
Volodymyr Salyha: Stakeholder Alchemy: Transforming Analysis into Meaningful ...Volodymyr Salyha: Stakeholder Alchemy: Transforming Analysis into Meaningful ...
Volodymyr Salyha: Stakeholder Alchemy: Transforming Analysis into Meaningful ...
 
Anna Chalyuk: 7 інструментів та принципів, які допоможуть зробити вашу команд...
Anna Chalyuk: 7 інструментів та принципів, які допоможуть зробити вашу команд...Anna Chalyuk: 7 інструментів та принципів, які допоможуть зробити вашу команд...
Anna Chalyuk: 7 інструментів та принципів, які допоможуть зробити вашу команд...
 
Oksana Smilka: Цінності, цілі та (де) мотивація (UA)
Oksana Smilka: Цінності, цілі та (де) мотивація (UA)Oksana Smilka: Цінності, цілі та (де) мотивація (UA)
Oksana Smilka: Цінності, цілі та (де) мотивація (UA)
 
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
 
Andrii Skoromnyi: Чому не працює методика "5 Чому?" – і яка є альтернатива? (UA)
Andrii Skoromnyi: Чому не працює методика "5 Чому?" – і яка є альтернатива? (UA)Andrii Skoromnyi: Чому не працює методика "5 Чому?" – і яка є альтернатива? (UA)
Andrii Skoromnyi: Чому не працює методика "5 Чому?" – і яка є альтернатива? (UA)
 
Maryna Sokyrko & Oleksandr Chugui: Building Product Passion: Developing AI ch...
Maryna Sokyrko & Oleksandr Chugui: Building Product Passion: Developing AI ch...Maryna Sokyrko & Oleksandr Chugui: Building Product Passion: Developing AI ch...
Maryna Sokyrko & Oleksandr Chugui: Building Product Passion: Developing AI ch...
 

Recently uploaded

Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Dr.Costas Sachpazis
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Recently uploaded (20)

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 

Illarion Khlestov "Developing robust ML training systems"

Editor's Notes

  1. Welcome everyone, My name is Larry. Few words about me. I'm working as a Seniour Research Engineer. I'm enaged in developing of various computer vision algorithms for videos and images processing. Mainly I make cloud based solutions. Sometimes I post my ideas or notes to Medium blog. Also there are a few interesting projects on my gitHub. So - subscribe. Moreover it's always cool to work with clever and talanted people. That's why if you want to join the team - feel free to have a talk after the lecture.
  2. But let’s turn back to presentation. As it was described in the title we are going to talk about developing training systems. You may ask - what is the reason? From my point of view the main motivation is that there are a lot of tutorials how to build and train simple networks. Many sources try to give you a first impression what the Machine learning is. Mainly they are like: just take already existing code, try to find pretrained weights, tune them a little bit. And you are ready for production. Nothing complicated there.
  3. Unfortunatelly, there is not a lot info about how to build large systems that can be updated, reused and trained with upcoming data. On the other hand such systems should be easy to use by many people if you have a team with more than one person. Today I’ll try to show that such projects shouldn’t be very complicated and with some simple architecture decisions you can speedup and improve your development process. And of course, you may not build such systems by your own but at least you will know what should you are looking for.
  4. During the lecture I’ll cover three main chapters, such as * How to store, handle and prepare data for training * How to manage training itself * How we should analyze results after training Additionally in the end I’ll discuss some libraries and packages ready to use as a components. Lecture will take not very long. I’ll be happy to answer all questions in the end. And I'm going to provide link to slides in the end, that's why there is no reason to photo all links or library names. So let’s start.
  5. TODO: replace with icons?