SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
© 2024
A Guide to a Modern
Project Setup
PyConDE / PyData 2024, April 22nd
Florian Wilhelm
Streamlining
Python
Development
Mathematical Modelling
Modern Data Warehousing & Analytics
Personalisation & RecSys
Uncertainty Quantification & Causality
Python Data Stack
OSS Contributor & Creator of PyScaffold
Dr. Florian Wilhelm
• HEAD OF DATA SCIENCE
FlorianWilhelm.info
florian.wilhelm@inovex.de
FlorianWilhelm
‣ Application Development (Web Platforms, Mobile
Apps, Smart Devices and Robotics, UI/UX
design,Backend Services)
‣ Data Management and Analytics (Business
Intelligence, Big Data, Searches, Data Science
and Deep Learning, Machine Perception and
Artificial Intelligence)
‣ Scalable IT-Infrastructures (IT Engineering, Cloud
Services, DevOps, Replatforming, Security)
‣ Training and Coaching (inovex Academy)
is an innovation and quality-driven
IT project house with a focus on
digital transformation.
Using technology to
inspire our clients.
And ourselves.
Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen
www.inovex.de
1. Introduction:
a. What makes a good project setup?
b. How do we achieve it?
2. Streamlined Project Setup:
a. configuration with pyproject.toml
b. tooling with hatch, ruff, mypy, pytest, …
3. Conclusion
Agenda
Introduction
1. efficient development
2. easy collaboration
3. seamless build & deployment
What makes a streamlined Python Project Setup?
1. Conventions
a. project structure
b. code formatting, e.g., pep8, black, ruff
c. documentation, e.g., Sphinx, mkdocs
2. Automation
a. dependency & environment management
b. building & publishing
c. versioning, e.g., semantic versioning
d. testing, linting/formatting, type checking
3. Easy to Use!
Concrete Requirements for those Goals
Semantic Versioning
‣ tells developers what to
expect
‣ avoids dependency hell for
developers using your
software
‣ necessary for requirement
specifiers like ~= 2.21 or
^2.2.21 (Poetry only)
More Details: https://www.geeksforgeeks.org/introduction-semantic-versioning/ and https://semver.org/
This is not a talk about the best Package Management Tool
Source: An unbiased evaluation of environment management and packaging tools (https://www.inovex.de/de/blog/)
Streamlined Project Setup
‣ reproducibly building & publishing packages
‣ robust environment management with support for
custom scripts
‣ easy Python management, replacing pyenv
‣ easy semantic versioning based on Git tags
‣ sophisticated testing within various environments,
replacing tox
🐣 Hatch, the extensible Python project manager
Ofek Lev
‣ folders for
∙ source files
∙ documentation
∙ tests
‣ human-readable information
∙ README.md
∙ …
‣ configuration files
∙ pyproject.toml
∙ …
Project Directory Structure
‣ defines the build system
‣ metadata about your project
for PyPI
‣ configuration for (almost) all
tools
∙ pytest
∙ mypy
∙ ruff
∙ coverage
All-in-One Configuration with pyproject.toml
Scripts in pyproject.toml for automation of tasks, e.g.
∙ running unit-tests with our without coverage, debugging,
∙ building the documentation,
∙ running the linters, code checks, mypy,
∙ …
Automation with Scripts!
> hatch run test:cov
‣ replaces tons of tools
‣ easy configuration via
pyproject.toml
‣ extremely fast
‣ over 700 plugins
Code Quality: Linting & Formatting
Ruff
flake8
autoflake
pydocstyle
…
Why mypy?
Type Checking: Are you my type?
compile-time type checking finds many errors in
advance, often edge cases.
type declaration act as machine-checked
documentation, thus enhancing the dev
experience.
Mypy Example
> hatch run lint:typing
pytest
‣ defacto standard for unit testing
‣ powerful features like fixtures, etc.
‣ tons of useful plugins, e.g.:
∙ pytest-cov for coverage
∙ pytest-recording for mocking calls to external services
∙ pytest-sugar to make it easier on the eyes
Testing with pytest & hatch
hatch & tox
‣ isolated environments for testing different Python versions and
dependency combinations
Avoiding human-errors by automated checks on every git commit
Automated QA with pre-commit
‣ Automatic and reproducible testing
‣ Publishing packages based on git tags
‣ Established branching strategy, e.g. GithubFlow
for efficient collaboration
‣ Scalability and Adaptability when needed
‣ Automated deployments, building of
documentation etc.
Automation with CI/CD
More Details: Data Science in Production: Packaging, Versioning and Continuous Integration (https://www.inovex.de/de/blog/)
Conclusion
‣ unified configuration in pyproject.toml
‣ standardized folder structure with
src-layout and useful README.md
‣ easy package management and
automation with hatch
‣ automated QA with ruff, pytest,
pre-commit, mypy, CI/CD
‣ proper documentation with mkdocs
‣ automation & conventions are key!
https://github.com/FlorianWilhelm/the-hatchlor
Check out the Hatchlor!
⭐
CHEERS TO THE COMMUNITY
Credits & Resources
‣ Ofek Lev, the creator of hatch, for is
awesome work in his spare time ❤
‣ Michael Hofmann from inovex who
made these awesome slides
© 2023
Thank you!
Dr. Florian Wilhelm
Head of Data Science
PyConDE / PyData 2024
inovex.de
florian.wilhelm@inovex.de
@inovexlife
@inovexgmbh

Contenu connexe

Tendances

Tendances (20)

Experience A Live BI 4.3 Upgrade
Experience A Live BI 4.3 UpgradeExperience A Live BI 4.3 Upgrade
Experience A Live BI 4.3 Upgrade
 
Understanding Web Cache
Understanding Web CacheUnderstanding Web Cache
Understanding Web Cache
 
Ibm Optim Techical Overview 01282009
Ibm Optim Techical Overview 01282009Ibm Optim Techical Overview 01282009
Ibm Optim Techical Overview 01282009
 
How to Enable Monetization of Your API Ecosystem
How to Enable Monetization of Your API EcosystemHow to Enable Monetization of Your API Ecosystem
How to Enable Monetization of Your API Ecosystem
 
Starcor IPTV/OTT Solution. The Introduction
Starcor IPTV/OTT Solution. The IntroductionStarcor IPTV/OTT Solution. The Introduction
Starcor IPTV/OTT Solution. The Introduction
 
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
 
ZERTO Introduction to End User Presentation
ZERTO Introduction to End User PresentationZERTO Introduction to End User Presentation
ZERTO Introduction to End User Presentation
 
Iasp Enablement
Iasp EnablementIasp Enablement
Iasp Enablement
 
JFS 2021 - The Process Automation Map
JFS 2021 - The Process Automation MapJFS 2021 - The Process Automation Map
JFS 2021 - The Process Automation Map
 
Mulesoft corporate template final
Mulesoft corporate template  final Mulesoft corporate template  final
Mulesoft corporate template final
 
Event Driven Architecture
Event Driven ArchitectureEvent Driven Architecture
Event Driven Architecture
 
apidays Paris 2022 - The next five years of the API Economy, Paolo Malinverno...
apidays Paris 2022 - The next five years of the API Economy, Paolo Malinverno...apidays Paris 2022 - The next five years of the API Economy, Paolo Malinverno...
apidays Paris 2022 - The next five years of the API Economy, Paolo Malinverno...
 
Event-driven architecture
Event-driven architectureEvent-driven architecture
Event-driven architecture
 
Event Driven Architecture (EDA) Reference Architecture
Event Driven Architecture (EDA) Reference ArchitectureEvent Driven Architecture (EDA) Reference Architecture
Event Driven Architecture (EDA) Reference Architecture
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 
Introduction to Modern Software Architecture
Introduction to Modern Software ArchitectureIntroduction to Modern Software Architecture
Introduction to Modern Software Architecture
 
Sap bpc optimized for s4 hana real time consolidation
Sap bpc optimized for s4 hana   real time consolidationSap bpc optimized for s4 hana   real time consolidation
Sap bpc optimized for s4 hana real time consolidation
 
IBM Integration Bus and REST APIs - Sanjay Nagchowdhury
IBM Integration Bus and REST APIs - Sanjay NagchowdhuryIBM Integration Bus and REST APIs - Sanjay Nagchowdhury
IBM Integration Bus and REST APIs - Sanjay Nagchowdhury
 
Leveraging API Docs and Tools at Mercedes-Benz /developers
Leveraging API Docs and Tools at Mercedes-Benz /developersLeveraging API Docs and Tools at Mercedes-Benz /developers
Leveraging API Docs and Tools at Mercedes-Benz /developers
 
Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.
 

Similaire à Streamlining Python Development: A Guide to a Modern Project Setup

Similaire à Streamlining Python Development: A Guide to a Modern Project Setup (20)

DevSecOps - Security in DevOps
DevSecOps - Security in DevOpsDevSecOps - Security in DevOps
DevSecOps - Security in DevOps
 
Wie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der CloudWie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der Cloud
 
Secrets of Successful Cloud Foundry Adopters
Secrets of Successful Cloud Foundry AdoptersSecrets of Successful Cloud Foundry Adopters
Secrets of Successful Cloud Foundry Adopters
 
Velocity NY 2018 "The Cloud Native Developer Workflow"
Velocity NY 2018 "The Cloud Native Developer Workflow"Velocity NY 2018 "The Cloud Native Developer Workflow"
Velocity NY 2018 "The Cloud Native Developer Workflow"
 
microXchg 2019: "Creating an Effective Developer Experience for Cloud-Native ...
microXchg 2019: "Creating an Effective Developer Experience for Cloud-Native ...microXchg 2019: "Creating an Effective Developer Experience for Cloud-Native ...
microXchg 2019: "Creating an Effective Developer Experience for Cloud-Native ...
 
Cloud Native Development
Cloud Native DevelopmentCloud Native Development
Cloud Native Development
 
Weave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any KubernetesWeave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any Kubernetes
 
Room 2 - 4 - Juncheng Anthony Lin - Redhat - A Practical Approach to Traditio...
Room 2 - 4 - Juncheng Anthony Lin - Redhat - A Practical Approach to Traditio...Room 2 - 4 - Juncheng Anthony Lin - Redhat - A Practical Approach to Traditio...
Room 2 - 4 - Juncheng Anthony Lin - Redhat - A Practical Approach to Traditio...
 
Next Level DevOps Implementation with GitOps
Next Level DevOps Implementation with GitOpsNext Level DevOps Implementation with GitOps
Next Level DevOps Implementation with GitOps
 
DevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatDevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to Habitat
 
Is Automation Necessary for the CC Survival?
Is Automation Necessary for the CC Survival?Is Automation Necessary for the CC Survival?
Is Automation Necessary for the CC Survival?
 
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with ConcourseContinuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
 
DevSecOps: Bringing security to the DevOps pipeline
DevSecOps: Bringing security to the DevOps pipelineDevSecOps: Bringing security to the DevOps pipeline
DevSecOps: Bringing security to the DevOps pipeline
 
Next gen software operations models in the cloud
Next gen software operations models in the cloudNext gen software operations models in the cloud
Next gen software operations models in the cloud
 
DevSecOps: Bringing security to the DevOps pipeline
DevSecOps: Bringing security to the DevOps pipelineDevSecOps: Bringing security to the DevOps pipeline
DevSecOps: Bringing security to the DevOps pipeline
 
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
 
DevOps-Roadmap
DevOps-RoadmapDevOps-Roadmap
DevOps-Roadmap
 
Continuous Lifecycle London 2018 Event Keynote
Continuous Lifecycle London 2018 Event KeynoteContinuous Lifecycle London 2018 Event Keynote
Continuous Lifecycle London 2018 Event Keynote
 
Top 10 Python Frameworks for App Development
Top 10 Python Frameworks for App DevelopmentTop 10 Python Frameworks for App Development
Top 10 Python Frameworks for App Development
 
Security in the DevOps pipeline of containerized core application: Case Study...
Security in the DevOps pipeline of containerized core application: Case Study...Security in the DevOps pipeline of containerized core application: Case Study...
Security in the DevOps pipeline of containerized core application: Case Study...
 

Plus de Florian Wilhelm

Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
Florian Wilhelm
 

Plus de Florian Wilhelm (16)

Unlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingUnlocking the Power of Integer Programming
Unlocking the Power of Integer Programming
 
WALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackWALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics Stack
 
Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!
 
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
 
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
 
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AI
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...
 
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceDeep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
 
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
 
Declarative Thinking and Programming
Declarative Thinking and ProgrammingDeclarative Thinking and Programming
Declarative Thinking and Programming
 
Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017
 
PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19
 
Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...
 

Dernier

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Dernier (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Streamlining Python Development: A Guide to a Modern Project Setup

  • 1. © 2024 A Guide to a Modern Project Setup PyConDE / PyData 2024, April 22nd Florian Wilhelm Streamlining Python Development
  • 2. Mathematical Modelling Modern Data Warehousing & Analytics Personalisation & RecSys Uncertainty Quantification & Causality Python Data Stack OSS Contributor & Creator of PyScaffold Dr. Florian Wilhelm • HEAD OF DATA SCIENCE FlorianWilhelm.info florian.wilhelm@inovex.de FlorianWilhelm
  • 3. ‣ Application Development (Web Platforms, Mobile Apps, Smart Devices and Robotics, UI/UX design,Backend Services) ‣ Data Management and Analytics (Business Intelligence, Big Data, Searches, Data Science and Deep Learning, Machine Perception and Artificial Intelligence) ‣ Scalable IT-Infrastructures (IT Engineering, Cloud Services, DevOps, Replatforming, Security) ‣ Training and Coaching (inovex Academy) is an innovation and quality-driven IT project house with a focus on digital transformation. Using technology to inspire our clients. And ourselves. Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen www.inovex.de
  • 4. 1. Introduction: a. What makes a good project setup? b. How do we achieve it? 2. Streamlined Project Setup: a. configuration with pyproject.toml b. tooling with hatch, ruff, mypy, pytest, … 3. Conclusion Agenda
  • 6. 1. efficient development 2. easy collaboration 3. seamless build & deployment What makes a streamlined Python Project Setup?
  • 7. 1. Conventions a. project structure b. code formatting, e.g., pep8, black, ruff c. documentation, e.g., Sphinx, mkdocs 2. Automation a. dependency & environment management b. building & publishing c. versioning, e.g., semantic versioning d. testing, linting/formatting, type checking 3. Easy to Use! Concrete Requirements for those Goals
  • 8. Semantic Versioning ‣ tells developers what to expect ‣ avoids dependency hell for developers using your software ‣ necessary for requirement specifiers like ~= 2.21 or ^2.2.21 (Poetry only) More Details: https://www.geeksforgeeks.org/introduction-semantic-versioning/ and https://semver.org/
  • 9. This is not a talk about the best Package Management Tool Source: An unbiased evaluation of environment management and packaging tools (https://www.inovex.de/de/blog/)
  • 11. ‣ reproducibly building & publishing packages ‣ robust environment management with support for custom scripts ‣ easy Python management, replacing pyenv ‣ easy semantic versioning based on Git tags ‣ sophisticated testing within various environments, replacing tox 🐣 Hatch, the extensible Python project manager Ofek Lev
  • 12. ‣ folders for ∙ source files ∙ documentation ∙ tests ‣ human-readable information ∙ README.md ∙ … ‣ configuration files ∙ pyproject.toml ∙ … Project Directory Structure
  • 13. ‣ defines the build system ‣ metadata about your project for PyPI ‣ configuration for (almost) all tools ∙ pytest ∙ mypy ∙ ruff ∙ coverage All-in-One Configuration with pyproject.toml
  • 14. Scripts in pyproject.toml for automation of tasks, e.g. ∙ running unit-tests with our without coverage, debugging, ∙ building the documentation, ∙ running the linters, code checks, mypy, ∙ … Automation with Scripts! > hatch run test:cov
  • 15. ‣ replaces tons of tools ‣ easy configuration via pyproject.toml ‣ extremely fast ‣ over 700 plugins Code Quality: Linting & Formatting Ruff flake8 autoflake pydocstyle …
  • 16. Why mypy? Type Checking: Are you my type? compile-time type checking finds many errors in advance, often edge cases. type declaration act as machine-checked documentation, thus enhancing the dev experience.
  • 17. Mypy Example > hatch run lint:typing
  • 18. pytest ‣ defacto standard for unit testing ‣ powerful features like fixtures, etc. ‣ tons of useful plugins, e.g.: ∙ pytest-cov for coverage ∙ pytest-recording for mocking calls to external services ∙ pytest-sugar to make it easier on the eyes Testing with pytest & hatch hatch & tox ‣ isolated environments for testing different Python versions and dependency combinations
  • 19. Avoiding human-errors by automated checks on every git commit Automated QA with pre-commit
  • 20. ‣ Automatic and reproducible testing ‣ Publishing packages based on git tags ‣ Established branching strategy, e.g. GithubFlow for efficient collaboration ‣ Scalability and Adaptability when needed ‣ Automated deployments, building of documentation etc. Automation with CI/CD More Details: Data Science in Production: Packaging, Versioning and Continuous Integration (https://www.inovex.de/de/blog/)
  • 21. Conclusion ‣ unified configuration in pyproject.toml ‣ standardized folder structure with src-layout and useful README.md ‣ easy package management and automation with hatch ‣ automated QA with ruff, pytest, pre-commit, mypy, CI/CD ‣ proper documentation with mkdocs ‣ automation & conventions are key!
  • 23. CHEERS TO THE COMMUNITY Credits & Resources ‣ Ofek Lev, the creator of hatch, for is awesome work in his spare time ❤ ‣ Michael Hofmann from inovex who made these awesome slides
  • 24. © 2023 Thank you! Dr. Florian Wilhelm Head of Data Science PyConDE / PyData 2024 inovex.de florian.wilhelm@inovex.de @inovexlife @inovexgmbh