Model Registry with Open-Source Tools: Git, GitHub and CI/CD
Dmitry Petrov, Ph.D.
dmitry@iterative.ai
Co-Founder & CEO at Iterative.ai
iterative.ai
About me
Before
Ex-Data Scientist at Microsoft
Now
Creator of Data Version Control
Co-founder, CEO at Iterative.ai
(San Francisco, CA)
Dmitry Petrov
@FullStackML
What will you learn?
Git as the single source of truth for ML models
I. What is a model registry?
What is Model Registry (1)
A centralized model store to collaboratively manage the full lifecycle of ML models.
⬥ Model lineage & versioning
⬥ Stage transitions: from staging to production
⬥ Model annotations & discovery: timestamp, description, labels
What is Model Registry (2)
[Diagram. Components: Model Registry, ML Train, ML Production, Git (GitHub/GitLab). Arrows: Save (API), Pull (API), commit.]
What is Model Registry (3)
Name     | Version | Status  | Code    | Meta-info: labels
churn    | 3.1     | prod    | 43ba862 | customer, xgboost
segment  | 0.4     | -       | d19ce1f | cv, car
cv-class | 1.13    | staging | afd8e35 | cv, car, doors, glass
…        | …       | …       | …       | …
II. Model registry challenges
Challenge 1: model & code lineage (1)
[Diagram. Components: Model Registry, ML Train, ML Production, Git (GitHub/GitLab). Arrows: Save (API), Pull (API), commit. The commit hash d19ce1f is shown twice.]
Challenge 1: model & code lineage (2)
Name     | Version | Status  | Code    | Meta-info: labels
churn    | 3.1     | prod    | 43ba862 | customer, xgboost
segment  | 0.4     | -       | d19ce1f | cv, car
cv-class | 1.13    | staging | afd8e35 | cv, car, doors, glass
…        | …       | …       | …       | …
Challenge 2: app & model deployment diff (1)
[Diagram. Components: Model Registry, ML Train, ML Production, App Production, Git (GitHub/GitLab), CI/CD. Arrows: Save (API), Pull (API), Commit, Push, Deploy.]
Challenge 2: app & model deployment diff (2)
[Same diagram, with the two deployment paths labeled "ML" and "App".]
Challenge 3: additional service
[Diagram. Components: Model Registry, ML Train, ML Production, App Production, Git (GitHub/GitLab), CI/CD. Arrows: Save (API), Pull (API), Commit, Push, Deploy. A "?" marks the diagram.]
Git as the source of truth
[Diagram. Components: ML Train, App & ML Production, Git (GitHub/GitLab), CI/CD. Arrows: Commit, Push, Deploy.]
Git as the source of truth: Git only API
[Same diagram: ML Train, App & ML Production, Git (GitHub/GitLab), CI/CD; Commit, Push, Deploy.]
III. What is needed from Git?
How to save the table in Git?
Name     | Version | Status  | Code    | Meta-info: labels
churn    | 3.1     | prod    | 43ba862 | customer, xgboost
segment  | 0.4     | -       | d19ce1f | cv, car
cv-class | 1.13    | staging | afd8e35 | cv, car, doors, glass
…        | …       | …       | …       | …
Model registry functionality
Basic (Git tags):
⬥ Human-friendly versioning: d19ce → v1.0.1
⬥ Status: prod, staging, dev
Integration to production (CI/CD):
⬥ Push to production
Advanced (meta-file artifacts.yaml):
⬥ Large ML models
⬥ Mono-repo or multiple models
⬥ Annotations
Solution: Git as the source of truth
Infrastructure as Code (IaC)
We are creating Model Registry as Code (MRaC)
IV. Git as model registry (Basic)
Git: Human-readable Versioning
Git: model status - from Stage to Prod (1)
Git: model status - from Stage to Prod (2)
Git: Model deployment (GitHub Action)
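A deployment job triggered by a tag push has to work out which model and which stage the tag refers to. The sketch below shows one way to do that, assuming status tags follow a `<model>-<stage>` pattern like the `model-prod` tag named in this deck and that the CI system exposes the pushed ref as `refs/tags/<tag>` (as GitHub Actions does). The function name `stage_from_ref` and the example refs are hypothetical.

```python
from __future__ import annotations

from fnmatch import fnmatch

# Stage names taken from the deck; the "<model>-<stage>" layout is an assumption.
STAGES = ("prod", "staging", "dev")


def stage_from_ref(git_ref: str) -> tuple[str, str] | None:
    """Return (model, stage) if the pushed ref is a status tag, else None."""
    prefix = "refs/tags/"
    if not git_ref.startswith(prefix):
        return None  # a branch push, not a tag
    tag = git_ref[len(prefix):]
    for stage in STAGES:
        if fnmatch(tag, f"*-{stage}"):
            model = tag[: -len(stage) - 1]  # strip the "-<stage>" suffix
            if model:
                return model, stage
    return None  # a version tag or an unrelated tag


if __name__ == "__main__":
    print(stage_from_ref("refs/tags/churn-prod"))
    print(stage_from_ref("refs/tags/churn-v1.0.0"))
```

A real workflow would call this with the ref supplied by the CI environment and start the deployment only when the result is not `None`.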
GTO - Git Tag Ops
https://github.com/iterative/gto
Turn your Git repo into an artifact registry:
● Register new versions of artifacts, marking significant changes to them
● Promote versions to signal downstream systems to act
● Attach additional info about your artifact with Enrichments
● Act on new versions and promotions in CI
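Because registrations and promotions are recorded as Git tags, downstream tooling can classify each tag by its shape. The sketch below assumes two tag shapes, `<artifact>@v<major>.<minor>.<patch>` for a registered version and `<artifact>#<stage>#<number>` for a stage assignment. These shapes are an assumption that should be checked against the GTO README; the function `classify_tag` is hypothetical and is not part of GTO.

```python
from __future__ import annotations

import re

# Assumed tag shapes (verify against the GTO documentation before relying on them):
#   "<artifact>@v<semver>"         registers a version, e.g. "churn@v3.1.0"
#   "<artifact>#<stage>#<number>"  assigns a stage,     e.g. "churn#prod#2"
VERSION_TAG = re.compile(r"^(?P<name>[^@#]+)@(?P<version>v\d+\.\d+\.\d+)$")
STAGE_TAG = re.compile(r"^(?P<name>[^@#]+)#(?P<stage>[^#]+)#(?P<n>\d+)$")


def classify_tag(tag: str) -> dict | None:
    """Describe a tag as a 'register' or 'assign' event, or None if unrelated."""
    m = VERSION_TAG.match(tag)
    if m:
        return {"event": "register", "name": m["name"], "version": m["version"]}
    m = STAGE_TAG.match(tag)
    if m:
        return {
            "event": "assign",
            "name": m["name"],
            "stage": m["stage"],
            "counter": int(m["n"]),  # numeric suffix keeps repeated assignments distinct
        }
    return None
```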
GTO: Human-readable Versioning
GTO: model status - from Stage to Prod
GTO: reliable / numeric model statuses
GTO: Model Registry in command line
GTO: important!
1. Do not forget to push tags:
2. CI/CD tags pattern:
Git: summary
Versioning: Git tag `model-v1.0.0`
Status: Git tag `model-prod`
Pull & push to production: CI/CD
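The registry table shown earlier in the deck can be rebuilt from tags alone. The sketch below assumes version tags named `<model>-v<version>` and status tags named `<model>-<stage>`, in line with the `model-v1.0.0` and `model-prod` examples above. The input is a plain mapping from tag name to commit hash, as could be collected from a tag listing; `build_registry` is a hypothetical helper, and a model shows a status only when a status tag points at the same commit as its latest version.

```python
from __future__ import annotations

import re

VERSION_RE = re.compile(r"^(?P<name>.+)-v(?P<version>\d+(?:\.\d+)*)$")
STAGES = ("prod", "staging", "dev")


def build_registry(tags: dict[str, str]) -> dict[str, dict]:
    """Map tag name -> commit hash into one registry row per model."""
    rows: dict[str, dict] = {}
    # First pass: version tags define the models; keep the highest version.
    for tag, commit in tags.items():
        m = VERSION_RE.match(tag)
        if not m:
            continue
        key = tuple(int(part) for part in m["version"].split("."))
        current = rows.get(m["name"])
        if current is None or key > current["_key"]:
            rows[m["name"]] = {
                "version": m["version"],
                "code": commit,
                "status": "-",
                "_key": key,
            }
    # Second pass: a status tag sets the status if it points at that commit.
    for tag, commit in tags.items():
        for stage in STAGES:
            if not tag.endswith(f"-{stage}"):
                continue
            name = tag[: -len(stage) - 1]
            if name in rows and rows[name]["code"] == commit:
                rows[name]["status"] = stage
    # Drop the internal sort key before returning.
    return {
        name: {k: v for k, v in row.items() if k != "_key"}
        for name, row in rows.items()
    }


# Example data: hashes for 3.1, 0.4 and 1.13 come from the deck's table;
# "aaa1111" is a made-up hash for an older version.
EXAMPLE_TAGS = {
    "churn-v3.0": "aaa1111",
    "churn-v3.1": "43ba862",
    "churn-prod": "43ba862",
    "segment-v0.4": "d19ce1f",
    "cv-class-v1.13": "afd8e35",
    "cv-class-staging": "afd8e35",
}
```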
V. Git as model registry (Advanced)
Git: Large ML model files - artifacts.yaml
Git: multiple models in a single repo (1)
Git: multiple models in a single repo (2)
artifacts.yaml (with sections)
Git: ML model annotations
GTO: Large ML model files
GTO: ML model annotations
artifacts.yaml
GTO: multiple models in a single repo
Git: summary
Versioning: Git tag `model-v1.0.0`
Status: Git tag `model-prod`
Pull & push to production: CI/CD
Large ML models: URL in `artifacts.yaml`
Multiple models: sections in `artifacts.yaml`
Annotations: labels in `artifacts.yaml`
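To make the role of the meta-file concrete, the sketch below models a possible `artifacts.yaml` as the dictionary a YAML parser would return: one section per model, a URL for the large model file, and labels as annotations. The field names `path` and `labels`, the bucket URLs, and both helper functions are assumptions for illustration; the deck does not show the actual file layout.

```python
from __future__ import annotations

# Hypothetical artifacts.yaml content, written as the parsed dictionary.
# One top-level section per model; field names and URLs are placeholders.
ARTIFACTS = {
    "churn": {
        "path": "s3://example-bucket/models/churn.pkl",
        "labels": ["customer", "xgboost"],
    },
    "segment": {
        "path": "s3://example-bucket/models/segment.pt",
        "labels": ["cv", "car"],
    },
    "cv-class": {
        "path": "s3://example-bucket/models/cv-class.pt",
        "labels": ["cv", "car", "doors", "glass"],
    },
}


def model_url(artifacts: dict[str, dict], name: str) -> str:
    """Return where the large model file lives; Git stores only this pointer."""
    return artifacts[name]["path"]


def find_by_label(artifacts: dict[str, dict], label: str) -> list[str]:
    """Return the names of all models annotated with the given label, sorted."""
    return sorted(
        name for name, info in artifacts.items() if label in info.get("labels", [])
    )
```

Keeping only a URL in the repository is what lets Git version large models: the commit pins the pointer and the annotations, while the binary stays in external storage.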
VI. Limitations of Git
Model visibility
Git works well as a model registry inside a single repository, but not across repositories.
Model visibility - User Interface
Solution: build a UI that collects models from all repositories.
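The data layer behind such a UI can be as simple as merging the per-repository registries into one flat list. The sketch below assumes each repository's registry has already been read into a dictionary of model name to row; the repository names and the `merge_registries` helper are hypothetical.

```python
from __future__ import annotations


def merge_registries(per_repo: dict[str, dict[str, dict]]) -> list[dict]:
    """Flatten {repo: {model: row}} into rows tagged with their repository."""
    rows = []
    for repo, registry in sorted(per_repo.items()):
        for name, info in sorted(registry.items()):
            rows.append({"repo": repo, "name": name, **info})
    return rows


# Hypothetical input: two repositories, each with its own in-repo registry.
PER_REPO = {
    "team-b/vision": {
        "segment": {"version": "0.4", "status": "-"},
        "cv-class": {"version": "1.13", "status": "staging"},
    },
    "team-a/churn": {
        "churn": {"version": "3.1", "status": "prod"},
    },
}
```

Sorting by repository and then by model name keeps the combined view stable between refreshes.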
Model visibility - User Interface example
VII. Summary
Summary: Model Registry with Git
⬥ Git as the single source of truth for ML models
⬥ No need for external services: only Git and GitHub/GitLab
⬥ Limitation: works only at the repository level
Thank you!
Dmitry Petrov
@FullStackML
Email dmitry@iterative.ai
iterative.ai
