This document discusses Apache Airflow and Google Cloud Composer. It begins with background on Apache Airflow, an open-source workflow engine originally contributed by Airbnb. It then discusses how Codementor uses Airflow for ETL pipelines and machine learning workflows. The document mainly focuses on comparing self-hosting Airflow with using Google Cloud Composer. Cloud Composer reduces the effort spent on hosting, permissions management, and monitoring, but it has some limitations, such as occasional zombie tasks and higher costs. Overall, Cloud Composer lets teams focus on data logic and performance rather than infrastructure maintenance.
Airflow Best Practises & Roadmap to Airflow 2.0 by Kaxil Naik
This document provides an overview of new features in Airflow 1.10.8/1.10.9 and best practices for writing DAGs and configuring Airflow for production. It also outlines the roadmap for Airflow 2.0, including DAG serialization, a revamped real-time UI, a production-grade modern API, official Docker/Helm support, and scheduler improvements. The document aims to help users understand recent Airflow updates and plan their migration to version 2.0.
Airflow is a workflow management system for authoring, scheduling and monitoring workflows, or directed acyclic graphs (DAGs) of tasks. Its features include DAGs to define tasks and their relationships, operators to describe tasks, sensors to monitor external systems, hooks to connect to external APIs and databases, and a user interface for visualizing pipelines and monitoring runs. Airflow supports a variety of executors, such as the SequentialExecutor, CeleryExecutor and MesosExecutor, to distribute task execution across systems like Celery or Kubernetes. It provides security features like authentication, authorization and impersonation to manage access.
Building a Data Pipeline using Apache Airflow (on AWS / GCP) by Yohei Onishi
This is the slide deck I presented at PyCon SG 2019. I gave an overview of Airflow and showed how we can use Airflow and other data engineering services on AWS and GCP to build data pipelines.
Google Associate Cloud Engineer Certification Tips by Daniel Zivkovic
Tips & best practices to prepare for the GCP ACE (Associate Cloud Engineer) Exam by Dan Sullivan - the author of the official Google Cloud Certified study guides!
Event details: https://www.meetup.com/Serverless-Toronto/events/271344917/
Event recording: http://youtube.serverlesstoronto.org/
RSVP for more exciting (online) events at https://www.meetup.com/Serverless-Toronto/events/
Apache Airflow is an open-source workflow management platform developed by Airbnb and now an Apache Software Foundation project. It allows users to define and manage data pipelines as directed acyclic graphs (DAGs) of tasks. Tasks can use operators to perform actions or move data between systems, and sensors to monitor external systems. Airflow provides a rich web UI, a CLI, and integrations with databases, Hadoop, AWS and others. It is scalable and supports dynamic task generation, templates, alerting, retries, and distributed execution across clusters.
Orchestrating workflows Apache Airflow on GCP & AWS by Derrick Qin
Working in a cloud or on-premises environment, we all somehow move data from A to B, on demand or on a schedule. It is essential to have a tool that can automate recurring workflows. This can be anything from an ETL (Extract, Transform, and Load) job for a regular analytics report all the way to automatically re-training a machine learning model.
In this talk, we will introduce Apache Airflow and how it can help orchestrate your workflows. We will cover key concepts, features, and use cases of Apache Airflow, as well as how you can enjoy Apache Airflow on GCP and AWS, by demoing a few practical workflows.
Why we chose Argo Workflow to scale DevOps at InVision by Nebulaworks
As a DevOps team grows in size and starts to form a multi-team DevOps structure, it experiences growing pains such as working in silos, decreased velocity, or a lack of collaboration. The solution is to standardize tools for automation and make the building blocks of commonly used patterns readily available. This is where workflows come into play. Adopting workflows provides a common, scalable platform for DevOps engineers to automate, trigger, and execute repetitive tasks, and therefore leads to increased efficiency and innovation.
The document provides an overview of Apache Airflow, an open-source workflow management platform for data pipelines. It describes how Airflow allows users to programmatically author, schedule and monitor workflows or data pipelines via a GUI. It also outlines key Airflow concepts like DAGs (directed acyclic graphs), tasks, operators, sensors, XComs (cross-communication), connections, variables and executors that allow parallel task execution.
This document provides an overview of building data pipelines using Apache Airflow. It discusses what a data pipeline is, common components of data pipelines like data ingestion and processing, and the issues with traditional data flows. It then introduces Apache Airflow, describing features such as fault tolerance and support for defining workflows in Python code. The core components of Airflow, including the web server, scheduler, executor, and worker processes, are explained. Key concepts like DAGs, operators, tasks, and workflows are defined. Finally, it demonstrates Airflow through an example DAG that extracts and cleanses tweets.
Introduction to Apache Airflow, its main concepts and features, and an example of a DAG. Afterwards, some lessons and best practices learned from the 3 years I have been using Airflow to power workflows in production.
In the session, we discussed the end-to-end working of Apache Airflow, mainly focused on the "Why, What and How" factors. It covers DAG creation and implementation, the architecture, and pros & cons. It also covers how a DAG is created for scheduling a job, all the steps required to create the DAG using a Python script, and finally a working demo.
This document provides an overview of Apache Airflow, an open-source workflow management system. It describes Airflow's key features: workflow definition using directed acyclic graphs (DAGs), a rich UI, a scheduler, operators for interacting with systems like databases and web services, and the use of Jinja templating. The document also discusses Airflow's architecture with parallel execution, the UI, command-line operations like backfilling, and security features. Airflow is used by over 200 companies for workflows such as ETL, analytics, and machine learning pipelines.
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali... (NETWAYS)
The growth of observability trends and Kubernetes adoption creates more demanding requirements for monitoring systems. Volumes of time series data increase exponentially, and old solutions just can't keep up with the pace. The talk covers how and why we created a new open-source time series database from scratch, and which architectural decisions and trade-offs we had to make to meet the new expectations and handle 100 million metrics per second with VictoriaMetrics. The talk will be interesting for software engineers and DevOps familiar with observability and modern monitoring systems, or for those interested in building scalable, high-performance time series databases.
Building an analytics workflow using Apache Airflow by Yohei Onishi
This document discusses using Apache Airflow to build an analytics workflow. It begins with an overview of Airflow and how it can be used to author workflows through Python code. Examples are shown of using Airflow to copy files between S3 buckets. The document then covers setting up a highly available Airflow cluster, implementing continuous integration/deployment, and monitoring workflows. It emphasizes that Google Cloud Composer can simplify deploying and managing Airflow clusters on Google Kubernetes Engine and integrating with other Google Cloud services.
This document provides an overview of setting up monitoring for MySQL and MongoDB servers using Prometheus and Grafana. It discusses installing and configuring Prometheus, Grafana, exporters for collecting metrics from MySQL, MongoDB and the underlying systems, and dashboards for visualizing the metrics in Grafana. The hands-on tutorial sets up Prometheus and Grafana in two virtual machines to monitor a MySQL master-slave replication setup and a MongoDB cluster.
Cloud Run - Serverless Containers Done Right by mfazal
Have a peek into Cloud Run, GCP's new fully managed serverless platform, which lets you run stateless HTTP containers while paying only for actual usage and without worrying about the infrastructure. Includes demos of how to get started and real-life use cases of how this will change the deployment of containerized applications.
Prometheus is an open-source monitoring system started in 2012 by former Google engineers. It uses a pull-based architecture to easily scale and features a powerful multi-dimensional data model and query language. Prometheus scrapes metrics from instrumented jobs like node exporters and stores time series data which can then be queried and graphed.
This document describes how to set up monitoring for MySQL databases using Prometheus and Grafana. It includes instructions for installing and configuring Prometheus and Alertmanager on a monitoring server to scrape metrics from node_exporter and mysql_exporter. Ansible playbooks are provided to automatically install the exporters and configure Prometheus. Finally, steps are outlined for creating Grafana dashboards to visualize the metrics and monitor MySQL performance.
Slide deck for the fourth data engineering lunch, presented by guest speaker Will Angel. It covered the topic of using Airflow for data engineering. Airflow is a scheduling tool for managing data pipelines.
In this session, we will start with the importance of monitoring services and infrastructure. We will discuss Prometheus, an open-source monitoring tool, and its architecture. We will also discuss some visualization tools that can be used on top of Prometheus. Then we will have a quick demo of Prometheus and Grafana.
Google BigQuery is a big data analytics service that allows users to analyze petabytes of data using SQL queries. It offers features like fast query response times, SQL-like queries, multi-dataset support, and pay-as-you-go pricing. The document provides an overview of BigQuery and demonstrates how to import and query data from the BigQuery web UI, command line, and programmatically using Node.js and Google Apps Script.
This document provides an overview of using Prometheus for monitoring and alerting. It discusses using Node Exporters and other exporters to collect metrics, storing metrics in Prometheus, querying metrics using PromQL, and configuring alert rules and the Alertmanager for notifications. Key aspects covered include scraping configs, common exporters, data types and selectors in PromQL, operations and functions, and setting up alerts and the Alertmanager for routing alerts.
Slides from the Google Cloud meetup presenting the various Google Cloud services:
- Compute Engine
- BigQuery
- Cloud Storage
- Cloud Functions
- Google Dataflow / Apache Beam
- Google Spanner, etc.
In this session you will discover how the Visual Studio 2013 and .NET Framework 4.5.1 pairing increases your productivity as well as the performance of your .NET applications. In line with Framework 4.5, this latest version brings its share of improvements, thoroughly stabilized in step with Visual Studio releases. However, Microsoft has decided to ship official packages at a faster and more frequent pace than major Visual Studio releases. The latest version of NuGet, integrated into Visual Studio 2013, makes it easier to find the kind of package you are looking for. Don't miss this session: come discover the essential new features brought by Visual Studio 2013 and Framework 4.5.1.
Speakers: Michel Perfetti (Cellenza), Bruno Boucard (Cellenza)
Google is the champion of data, and naturally its cloud platform offers all the building blocks needed to set up a data lake.
In this presentation, we detail the various services that let you concretely build a data lake, answering the following questions:
How do I store my data?
How do I ingest it?
How do I exploit it?
How do I orchestrate processing?
How do I keep my data lake under control?
Devoxx: Tribulations of a developer in the Cloud by Tugdual Grall
Like many developers, I spend a large part of my free time discovering new technologies and building applications with them.
So I chose to explore Java application development in the cloud, with Google AppEngine, to build http://www.resultri.com, a site for managing triathlon results (my other passion).
Building this application is an interesting adventure that I share with you during this BOF:
discovering GAE and the development tools
the "surprises" of NoSQL, especially for a brain "wired relational" like mine
hmmm, not everything is free?
a few things worth knowing: the importance of memcache, using CloudSQL, batch jobs...
Our journey towards continuous deployment with micro-services, containerization and orchestration of containers using Kubernetes. On our way there, we've had to create various tools to help us better use and test everything before going to production. We also had to integrate a variety of other tools to give us visibility on our platform.
This talk will be an overview of our journey up to now.
Best practices for migrating from Oracle to Postgres by EDB
This presentation covers:
How to prioritize the right application or project for your first Oracle migration;
Tips for running a gradual, well-defined migration process to minimize risk and increase the value delivered for the time spent;
How to handle the common concerns and pitfalls of a migration project;
Which resources you can leverage before, during and after your migration;
Suggestions on how to achieve independence from an Oracle database, without sacrificing performance.
Target audience: this presentation is intended for IT decision-makers and team members involved in database decisions and execution.
This is a technical presentation about Openshift Platform-as-a-Service for Clermont'ech API Hour #26, 2017/03/27.
More informations here : http://clermontech.org/api-hours/api-hour-26.html
About the author : https://www.linkedin.com/in/jperville/
Open the door or run into a wall (Agile Tour Genève 2024) by Laurent Speyser
(An illustrated talk)
You are most likely driving, or involved in, a change within your organization. And maybe it is not going as well as expected...
For several years, I have regularly observed the failure of Agile adoption, and of large-scale change more generally, in organizations. I will try to explain why these changes generate little buy-in and little commitment, and why they do not last over time.
Fortunately, there is another path. Taking it means cultivating invitation, collective intelligence, game mechanics, rites of passage, and more, so that agility can take root.
You will leave this talk having taken a step back from change as it is usually carried out today, and having discovered (or rediscovered) the only guide worth following, in my view, for authentic, lasting change that respects individuals! And as a bonus, two or three practical tips!
AI is growing rapidly, and its integration into education raises many questions. Today, we will explore how students use AI, how teachers perceive these uses, and possible measures for governing them.
Current Situation
AI is increasingly present in our daily lives, including in education. Some universities, such as Sciences Po in January 2023, have banned the use of AI, while others, like the University of Prague, treat it as plagiarism. This diversity of positions highlights the urgent need for an institutional response to govern these uses and prevent the risks of cheating and plagiarism.
National Survey
To better understand these dynamics, a national survey entitled "AI in Education" was carried out. The survey's authors are Le Sphynx (polling) and Compilatio (academic fraud). It was distributed in the universities of Lyon and Aix-Marseille between June 21 and August 15, 2023, reaching 1242 teachers and 4443 students. The questionnaires, designed to study AI usage and how that usage is perceived, addressed themes such as fears, opportunities and acceptability.
Survey Results
The results show that 55% of students use AI occasionally or frequently, compared with 34% of teachers. However, 88% of teachers believe their students use AI, which may indicate an overestimation of usage. The uses identified include information retrieval and text writing, although these answers could not be combined in the proposed choices.
Critical Analysis
A deeper analysis reveals that teachers struggle to see the benefits of AI for learning, unlike students. Whether AI improves grades without developing skills remains debated. Is it academic doping, or an opportunity for more effective learning?
Acceptability and Ethics
The survey reveals that many students consider it acceptable to use AI to write their assignments, and even a quarter of teachers share this view. This raises crucial ethical questions: is copy-pasting cheating? Is using AI under supervision, or for translation, acceptable? The answer is not simple and requires an open debate.
Proposals and Solutions
To govern these uses, several solutions are proposed. Rather than banning AI, it is suggested to set rules for responsible use. Pedagogical innovations can also be explored, such as creating professional-competition scenarios or using AI detectors.
Conclusion
In conclusion, although the study has its limits, it highlights an urgent need for regulation. An institutional charter could provide a framework for ethical use.
Le Comptoir OCTO - Infra and prod teams, don't miss the boarding call for the... by OCTO Technology
By Claude Camus (agile organization coach @OCTO Technology) and Gilles Masy (Organizational Coach @OCTO Technology)
Infrastructure, security, production and cloud teams must devote time to modernizing their tools (automation, cloud, etc.) and their practices (DevOps, SRE, etc.). At the same time, they must respond to a growing avalanche of requests while maintaining an optimal level of service quality.
Being used to developer environments, agile transformations neglect the particularities of OPS teams. In this comptoir, we share our value proposition for agility@OPS, which will board your OPS teams in Business Class (Agility) and make them say: "we are not going back".
Le Comptoir OCTO - What life-cycle assessment brings to an eco-... audit by OCTO Technology
By Nicolas Bordier (responsible digital consultant @OCTO Technology) and Alaric Rougnon-Glasson (Sustainable Tech Consultant @OCTO Technology)
Using the very concrete example of an eco-design audit of C'Bilan, the carbon-footprint tool developed by ICDC (Caisse des dépôts et consignations), we explain how LCA (life-cycle assessment) was decisive in identifying courses of action to reduce the service's environmental footprint by up to 82%.
YouTube video: https://www.youtube.com/watch?v=7R8oL2P_DkU
Write-up:
MongoDB in a scale-up: how to get away from a monolithic hell — MongoDB Paris... by Horgix
This is the slide deck of a talk by Alexis "Horgix" Chotard and Laurentiu Capatina, presented at the MongoDB Paris User Group in June 2024, about how PayFit moved away from the monolithic hell of a self-hosted MongoDB cluster to managed alternatives. Pitch below.
March 15, 2023, 6:59 AM: a MongoDB cluster collapses. Tough luck, this cluster contains 95% of user data and is absolutely vital for even minimal operation of our application. To worsen matters, this cluster is 7 years behind on versions, is not scalable, and barely observable. Furthermore, even the data model would quickly raise eyebrows: applications communicating with each other by reading/writing in the same MongoDB documents, documents reaching the maximum limit of 16MiB with hundreds of levels of nesting, and so forth. The incident will last several days and result in the loss of many users. We've seen better scenarios.
Let's explore how PayFit found itself in this hellish situation and, more importantly, how we managed to overcome it!
On the agenda: technical stabilization, untangling data models, breaking apart a Single Point of Failure (SPOF) into several elements with a more restricted blast radius, transitioning to managed services, improving internal accesses, regaining control over risky operations, and ultimately, approaching a technical migration when it impacts all development teams.
8. AIRFLOW:
• Managed service
• Based on Apache Airflow
• In Python
• Integrated with GCP services
• BigQuery, Dataflow, Dataproc, Datastore, Cloud Storage, Pub/Sub, Cloud ML Engine
• Other clouds
9. AIRFLOW:
• Not just a CRON
• A robust workflow engine
• Metadata generation
• Task recovery
• Comes with a user interface
• Workflow as Code
• Many features
• Retries
• SLAs
• Complex workflows
10. ITS ROLE
• What is a DAG?
• It defines a hierarchy / workflow
• The workflow is a set of tasks (see the sketch below)
"A directed acyclic graph is a directed graph that has no cycles" (Wikipedia)
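To make "workflow as code" concrete, here is a minimal sketch of such a DAG in Python. It assumes Airflow 1.10-era import paths (the version contemporary with this deck); the DAG id, task names and schedule are purely illustrative:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

# The DAG ties tasks together into a directed acyclic graph.
dag = DAG(
    dag_id="example_workflow",        # illustrative name
    start_date=datetime(2019, 1, 1),  # every DAG needs a start date
    schedule_interval="@daily",       # more than a CRON: each run is tracked
)

# Two tasks (operator instances) and one dependency: extract -> load.
extract = BashOperator(task_id="extract", bash_command="echo extract", dag=dag)
load = BashOperator(task_id="load", bash_command="echo load", dag=dag)

extract >> load  # an edge of the graph; cycles are not allowed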
11. ITS ROLE
The DAG:
• Definition: a start date; mailing and retry parameters
• Operation(s): a description of the process; an implementation of an operator is a task
• Relation: relations between operations; conditional branching (see the sketch below)
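As a hedged sketch of the "conditional branching" relation mentioned above (Airflow 1.10 import paths; all ids are invented for illustration), a BranchPythonOperator returns the task_id of the branch to follow:

from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator

dag = DAG(
    dag_id="branching_example",       # illustrative
    start_date=datetime(2019, 1, 1),  # definition: the start date
    schedule_interval="@daily",
)

def choose_branch(**context):
    # Relation: choose the next operation based on a condition.
    weekday = context["execution_date"].weekday()
    return "weekday_path" if weekday < 5 else "weekend_path"

branch = BranchPythonOperator(
    task_id="branch",
    python_callable=choose_branch,
    provide_context=True,  # Airflow 1.x: pass the execution context to the callable
    dag=dag,
)

weekday_path = DummyOperator(task_id="weekday_path", dag=dag)
weekend_path = DummyOperator(task_id="weekend_path", dag=dag)

branch >> [weekday_path, weekend_path]  # only the returned branch runs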
12. FOCUS: DEFINITION
A task must have parameters (see the sketch after this list):
• When does it start?
• How many times can it be retried? What is the delay between two attempts?
• Email and callback actions
• Whether tasks depend on the past
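In Airflow these parameters are usually grouped in a default_args dict applied to every task of the DAG. A minimal sketch, with illustrative values:

from datetime import datetime, timedelta

from airflow import DAG

default_args = {
    "start_date": datetime(2019, 1, 1),  # when the task may start
    "retries": 3,                         # how many times it can be retried
    "retry_delay": timedelta(minutes=5),  # the delay between two attempts
    "email": ["team@example.com"],        # illustrative address
    "email_on_failure": True,             # mail action on failure
    "depends_on_past": False,             # whether runs depend on the past
}

dag = DAG(
    dag_id="defaults_example",  # illustrative
    default_args=default_args,
    schedule_interval="@daily",
)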
13. FOCUS: OPERATOR
Operator vs Sensor
Execution vs Trigger
Sensors: HttpSensor, HdfsSensor, S3Sensor
Operators: BigQueryOperator
• Parameters can be added to make tasks dynamic
• Retrieve files from Cloud Storage, etc. (see the sketch below)
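A hedged sketch of the trigger-then-execute pattern described above, using Airflow 1.10 contrib import paths; the bucket, object and table names are invented. A Cloud Storage sensor waits for a file to land, then a BigQueryOperator runs a templated query (the {{ ds }} macro expands to the execution date, which is what makes the task dynamic):

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.bigquery_operator import BigQueryOperator
from airflow.contrib.sensors.gcs_sensor import GoogleCloudStorageObjectSensor

dag = DAG(
    dag_id="sensor_then_operator",    # illustrative
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
)

# Sensor: it triggers, it does not execute work. Here it waits for a
# file to appear in a Cloud Storage bucket (names are invented).
wait_for_file = GoogleCloudStorageObjectSensor(
    task_id="wait_for_file",
    bucket="my-ingest-bucket",
    object="exports/{{ ds }}/data.csv",  # templated with the execution date
    dag=dag,
)

# Operator: it executes work. The sql parameter is templated too.
load_to_bq = BigQueryOperator(
    task_id="load_to_bq",
    sql="SELECT * FROM `my_project.my_dataset.raw` WHERE day = '{{ ds }}'",
    use_legacy_sql=False,
    dag=dag,
)

wait_for_file >> load_to_bq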
14. ENVIRONMENT CREATION
Interface vs command line:
gcloud beta composer environments create dev --location us-central1 --zone us-central1-f --machine-type n1-standard-2 --labels env=beta
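Once the environment exists, DAG files are deployed by copying them into the environment's Cloud Storage bucket; for example (a hedged illustration reusing the "dev" environment created above, with a hypothetical file name):

gcloud composer environments storage dags import --environment dev --location us-central1 --source my_dag.py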