Théorie des langages - TP - WellKnownText, by Yann Caron
Course on language theory, compiler theory, compilation techniques, and programming paradigms that I teach to second- and third-year engineering students (Ingé 2 and 3) at the École Nationale des Sciences Géographiques in Paris.
From 2009 to 2015, Yann Schwartz evolved the design of Criteo's log ingestion system from Ubu to more Kafka-like processes. The talk discusses the evolution of the system over time, including dealing with issues like scaling, fault tolerance, and data formats. It also pokes fun at some common misconceptions or underestimations people have about technologies over time.
Data Day Texas 2017: Scaling Data Science at Stitch Fix, by Stefan Krawczyk
At Stitch Fix we have a lot of Data Scientists. Around eighty at last count. One reason why I think we have so many is that we do things differently. To get their work done, Data Scientists have access to whatever resources they need (within reason), because they’re end-to-end responsible for their work; they collaborate with their business partners on objectives and then prototype, iterate, productionize, monitor and debug everything and anything required to get the desired output. They’re full data-stack data scientists!
The teams in the organization do a variety of different tasks:
- Clothing recommendations for clients.
- Clothes reordering recommendations.
- Time series analysis & forecasting of inventory, client segments, etc.
- Warehouse worker path routing.
- NLP.
… and more!
They’re also quite prolific at what they do -- we are approaching 4500 job definitions at last count. So one might be wondering now, how have we enabled them to get their jobs done without getting in the way of each other?
This is where the Data Platform team comes into play. With the goal of lowering the cognitive overhead and engineering effort required on the part of the Data Scientist, the Data Platform team tries to provide abstractions and infrastructure to help the Data Scientists. The relationship is a collaborative partnership, where the Data Scientist is free to make their own decisions and thus choose the way they do their work, and the onus then falls on the Data Platform team to convince Data Scientists to use their tools; the easiest way to do that is by designing the tools well.
In regard to scaling Data Science, the Data Platform team has helped establish some patterns and infrastructure that help alleviate contention. Contention on:
- Access to Data
- Access to Compute Resources:
  - Ad-hoc compute (think prototype, iterate, workspace)
  - Production compute (think where things are executed once they’re needed regularly)
For the talk (and this post) I only focused on how we reduced contention on Access to Data, & Access to Ad-hoc Compute to enable Data Science to scale at Stitch Fix. With that I invite you to take a look through the slides.
Samza at LinkedIn: Taking Stream Processing to the Next Level, by Martin Kleppmann
Slides from my talk at Berlin Buzzwords, 27 May 2014. Unfortunately Slideshare has screwed up the fonts. See https://speakerdeck.com/ept/samza-at-linkedin-taking-stream-processing-to-the-next-level for a version of the deck with correct fonts.
Stream processing is an essential part of real-time data systems, such as news feeds, live search indexes, real-time analytics, metrics and monitoring. But writing stream processes is still hard, especially when you're dealing with so much data that you have to distribute it across multiple machines. How can you keep the system running smoothly, even when machines fail and bugs occur?
Apache Samza is a new framework for writing scalable stream processing jobs. Like Hadoop and MapReduce for batch processing, it takes care of the hard parts of running your message-processing code on a distributed infrastructure, so that you can concentrate on writing your application using simple APIs. It is in production use at LinkedIn.
This talk will introduce Samza, and show how to use it to solve a range of different problems. Samza has some unique features that make it especially interesting for large deployments, and in this talk we will dig into how they work under the hood. In particular:
• Samza is built to support many different jobs written by different teams. Isolation between jobs ensures that a single badly behaved job doesn't affect other jobs. It is robust by design.
• Samza can handle jobs that require large amounts of state, for example joining multiple streams, augmenting a stream with data from a database, or aggregating data over long time windows. This makes it a very powerful tool for applications.
Rise of the Machines - Automate your Development, by Sven Peters
When we talk about automation in software development, we immediately think of automated builds and deployments. We may also be using scripts to help make our daily work easier. But this is really just the beginning of the rise of the machines.
I show you how leading developers in our industry are using open source and commercial tools for automating much more. They've got "robots" for monitoring production servers, updating issues, supporting customers, reviewing code, setting up laptops, doing development reporting, conducting customer feedback -- even automating daily standups. In what instances is it useful to automate? In what cases does it not make sense? Automation prevents us from having to do the same thing twice, helps us to work better together, reduces workflow errors and frees up time to write production code. Plus, as it turns out, spending time on automation is fun! Don't be afraid of robots in software development, embrace them! Even if I save you just half an hour a week, this talk will be a beneficial investment of your time.
We work together in teams, across divisions and with different companies. A lot of our productive work time is lost because information is kept in departments, on file servers or in people's heads. With the trend toward distributed organizations, we need to communicate more effectively.
This session shows how companies like Atlassian and Hubspot have encouraged their employees to live and breathe a collaborative culture. I will talk about 4 things that helped us work happily together: building a great work environment, focusing on people instead of roles, using tools to communicate faster and more transparently, and staying away from a command & control mentality. Collaboration creates greater value, enhances achievement, and produces sustainable business models. It’s time to move from the industrial age to the information age and start the collaboration revolution!
We often relate Domain-Driven Design with the content of Eric Evans' book; however even this book suggests looking outside for other patterns and inspirations: analysis patterns (Accounting, Finance), domain-oriented use of design patterns (the Flyweight pattern), established formalisms (e.g. monoids) and XP literature in particular (e.g. the patterns on the c2 wiki and OOPSLA papers).
The world has not stopped since the book either, and new ideas keep on emerging regularly. And you can share your own patterns as well.
In this session, through examples and code we'll go through some particularly important patterns which deserve to be in your tool belt. We'll also provide guidance on how best to use them (or not), at the right time and in the right context, and on how to train your colleagues on them!
Introduction au langage PHP (1ere partie), by Marouan OMEZZINE
An introduction to, and first outline of, the PHP language (intro, xampp, first steps, structures, variables, types, functions ...) prepared for the member-to-member training sessions of the Junior ENSI club (http://www.junior-ensi.org/) at the École nationale des sciences de l'informatique (http://www.ensi.rnu.tn/).
Codedarmor 2012 - 13/11 - Dart, un langage moderne pour le web, by codedarmor
Dart is a new programming language for developing web applications. Created by Google and opened to the general public in October 2011, it is an object-oriented language with a syntax familiar to both Java and JavaScript developers. Its two goals? Performance and ease of use.
In this presentation, we will look at Google's goals in introducing this new language. We will go to the heart of the language by presenting its distinctive features: optional typing, Isolates as a concurrency model, the different execution modes, DOM handling, and more. Finally, we will discuss the upcoming milestones to see whether or not Dart can establish itself as a language of the future for the web.
By Julien Vey of Zénika.
This presentation gives the complete list of new features in PHP 5.3, with implementation examples.
It also includes a section reflecting on the future of PHP as of 30 June 2010.
AI is growing rapidly, and its integration into education raises many questions. Today, we will explore how students use AI, how teachers perceive that use, and what measures could govern it.
Current Situation
AI is increasingly present in our daily lives, including in education. Some universities, such as Sciences Po in January 2023, have banned the use of AI, while others, such as the University of Prague, treat it as plagiarism. This diversity of positions underscores the urgent need for an institutional response to govern these uses and prevent the risks of cheating and plagiarism.
National Survey
To better understand these dynamics, a national survey titled "L'IA dans l'enseignement" (AI in teaching) was conducted. Its authors are Le Sphynx (surveys) and Compilatio (academic fraud). It was distributed in the universities of Lyon and Aix-Marseille between 21 June and 15 August 2023, reaching 1242 teachers and 4443 students. The questionnaires, designed to study uses of AI and how those uses are perceived, covered themes such as fears, opportunities, and acceptability.
Survey Results
The results show that 55% of students use AI occasionally or frequently, compared with 34% of teachers. However, 88% of teachers think their students use AI, which could indicate that usage is overestimated. The uses identified include information retrieval and text writing, although these answers could not be combined in the choices offered.
Critical Analysis
A closer analysis reveals that teachers, unlike students, struggle to see the benefits of AI for learning. Whether AI improves grades without developing skills remains debated. Is it academic doping or an opportunity for more effective learning?
Acceptability and Ethics
The survey reveals that many students consider it acceptable to use AI to write their assignments, and even a quarter of teachers share this view. This raises crucial ethical questions: is copy-pasting cheating? Is using AI under supervision or for translation acceptable? The answer is not simple and calls for an open debate.
Proposals and Solutions
Several solutions are proposed to govern these uses. Rather than banning AI, the suggestion is to set rules for responsible use. Pedagogical innovations can also be explored, such as creating situations of professional competition or using AI detectors.
Conclusion
In conclusion, although the study has limitations, it highlights an urgent need for regulation. An institutional charter could provide a framework for ethical use.
Ouvrez la porte ou prenez un mur (Agile Tour Genève 2024), by Laurent Speyser
(Illustrated talk)
You have certainly initiated, or been involved in, a change within your organization. And perhaps it is not going as well as expected...
For several years, I have regularly observed the failure of Agile adoption, and more broadly of major changes, in organizations. I will try to explain why these changes generate little buy-in and little engagement, and why they do not last.
Fortunately, there is another path. Taking it means cultivating invitation, collective intelligence, game mechanics, rites of passage, and more, so that agility takes root.
You will leave this talk having taken a step back from change as it is generally carried out today, and having discovered (or rediscovered) what is, in my view, the only guide worth following for authentic, lasting change that respects individuals. And as a bonus, 2 or 3 practical tips!
Le Comptoir OCTO - Qu’apporte l’analyse de cycle de vie lors d’un audit d’éco..., by OCTO Technology
By Nicolas Bordier (Responsible Digital Consultant @OCTO Technology) and Alaric Rougnon-Glasson (Sustainable Tech Consultant @OCTO Technology)
Using a very concrete example, the eco-design audit of C’Bilan, the carbon footprint assessment tool developed by ICDC (Caisse des dépôts et consignations), we explain how LCA (life cycle assessment) was decisive in identifying courses of action that reduce the service's environmental footprint by up to 82%.
YouTube video: https://www.youtube.com/watch?v=7R8oL2P_DkU
Report:
MongoDB in a scale-up: how to get away from a monolithic hell - MongoDB Paris..., by Horgix
This is the slide deck of a talk by Alexis "Horgix" Chotard and Laurentiu Capatina, presented at the MongoDB Paris User Group in June 2024, sharing feedback on how PayFit moved away from the monolithic hell of a self-hosted MongoDB cluster to managed alternatives. Pitch below.
March 15, 2023, 6:59 AM: a MongoDB cluster collapses. Tough luck, this cluster contains 95% of user data and is absolutely vital for even minimal operation of our application. To worsen matters, this cluster is 7 years behind on versions, is not scalable, and barely observable. Furthermore, even the data model would quickly raise eyebrows: applications communicating with each other by reading/writing in the same MongoDB documents, documents reaching the maximum limit of 16MiB with hundreds of levels of nesting, and so forth. The incident will last several days and result in the loss of many users. We've seen better scenarios.
Let's explore how PayFit found itself in this hellish situation and, more importantly, how we managed to overcome it!
On the agenda: technical stabilization, untangling data models, breaking apart a Single Point of Failure (SPOF) into several elements with a more restricted blast radius, transitioning to managed services, improving internal accesses, regaining control over risky operations, and ultimately, approaching a technical migration when it impacts all development teams.
2. But why?
Debug in the wild and diagnose problems with the lightest possible tool.
3. Windbg - cdb - kd
Debugging Tools for Windows: http://microsoft.com/whdc/devtools/debugging
- Originally a native debugger (user mode and kernel), a front end to low-level libraries
- windbg: "graphical" debugger, cdb + kd (kernel debugger)
- cdb (console debugger): same as windbg, but pure command line and user mode only
- kd: kernel debugger (command line)
- No installation required (the directory unpacked by the installer is self-contained)
4. sos and sosex
- sos (Son of Strike): extension for managed code, shipped with each version of the framework
- sosex: non-Microsoft extension to sos, by Steve Johnson - http://www.stevestechspot.com/
5. sos
This is how the world ends, not with a bang but a whimper. (T.S. Eliot, The Hollow Men)
This is how sos begins, not with a whimper but a bang.
!Help
6. Before you start
- .loadby sos mscorwks: loads the version of sos that matches the application's CLR version
- .load sosex: loads sosex (it must be in the windbg/cdb directory)
- .cmdtree [path]md_tree.txt: customizable shortcut menu
7. Fun commands
- !DumpHeap -stat: heap instances, grouped by type, sorted by total size
- !Threads: list of managed threads
- !runaway 7: ranks threads by CPU time used (flag 7 shows user, kernel, and elapsed time)
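To make the "grouped by type, sorted by total size" idea concrete, here is a minimal Python sketch of that kind of aggregation. It is not how sos is implemented and it does not parse real debugger output; the function name `heap_stats`, the input shape, and the ascending sort order are all assumptions made for illustration.

```python
from collections import defaultdict

def heap_stats(instances):
    """Group (type_name, size_in_bytes) pairs by type.

    Hypothetical helper: returns (type_name, count, total_size) rows,
    sorted by total size (ascending here, which is an assumption).
    """
    totals = defaultdict(lambda: [0, 0])  # type -> [count, total bytes]
    for type_name, size in instances:
        totals[type_name][0] += 1
        totals[type_name][1] += size
    rows = [(name, count, total) for name, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2])

# Made-up sample data, for illustration only.
sample = [("System.String", 40), ("System.Byte[]", 1024), ("System.String", 24)]
for name, count, total in heap_stats(sample):
    print(f"{count:>6} {total:>10} {name}")
```

Reading such a table from the largest totals is usually the quickest way to spot which type dominates the heap.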
8. More commands
- !SyncBlk: list of locks held or awaited
- !ClrStack [-a|-l|-p]: CLR call stack of the current thread, with current variables and arguments
- !GCRoot addr: recursive list of the instances that keep the instance at [addr] alive
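The idea behind finding what keeps an instance alive can be sketched as a search through the reference graph, from the roots down to the target. The Python below is only an illustration of that idea, not the actual sos algorithm; `root_path`, the `refs` dictionary, and the object names are hypothetical.

```python
from collections import deque

def root_path(roots, refs, target):
    """Return one chain of references from a root to target, or None.

    roots: iterable of root names (statics, stack locals, ...)
    refs:  dict mapping an object to the objects it references
    """
    visited = set(roots)
    queue = deque([root] for root in roots)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for child in refs.get(node, ()):
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None  # unreachable: nothing keeps the target alive

# Made-up object graph, for illustration only.
refs = {"static Cache": ["dict"], "dict": ["entry"], "entry": ["obj"]}
print(root_path(["stack local", "static Cache"], refs, "obj"))
```

A breadth-first search is used here so that the chain returned is a shortest one, which tends to be the easiest to read when hunting a leak.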
9. Higher still
- ~n: info on thread n
- ~ns: switch to thread n
- ~*e !ClrStack -a: run a command on all threads
10. sosex
- !dlk: automatic deadlock detection
- !refs addr: list of references from and to the instance at addr
- !mln addr: type of the CLR object at address addr
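Automatic deadlock detection is commonly explained as looking for a cycle in a wait-for graph: thread A waits for a lock held by thread B, which in turn waits for a lock held by A. The Python sketch below illustrates that general technique only; it is an assumption that sosex works along these lines, and `find_deadlock` and its inputs are hypothetical names.

```python
def find_deadlock(owner_of, waiting_for):
    """Return a list of threads forming a wait cycle, or None.

    owner_of:    dict lock -> thread currently holding it
    waiting_for: dict thread -> lock it is blocked on
    """
    for start in waiting_for:
        seen = []
        thread = start
        # Follow "waits for the owner of" links until they stop or repeat.
        while thread in waiting_for:
            if thread in seen:
                return seen[seen.index(thread):]
            seen.append(thread)
            thread = owner_of.get(waiting_for[thread])
    return None

# Made-up state: thread 1 holds A and wants B, thread 2 holds B and wants A.
print(find_deadlock({"A": 1, "B": 2}, {1: "B", 2: "A"}))
```

Because each blocked thread waits on a single lock in this model, following the chain from every waiting thread is enough to find any cycle.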
17. Powerdbg
PowerShell front end for driving windbg: http://www.codeplex.com/powerdbg
powershell -(stdin/out)-> cdb -(tcp)-> windbg
- Simple functions to launch commands
- cdb standard output processed line by line
- Scripts (and engine) written in PowerShell (about thirty existing scripts)
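The stdin/stdout half of that pipeline can be sketched in a few lines: start a console program, write commands to its standard input, and handle its standard output one line at a time. The Python below is an illustration of the pattern, not PowerDbg itself (which is written in PowerShell); `run_commands` is a hypothetical helper, and the demo uses a small echo process as a stand-in so it runs without a debugger installed.

```python
import subprocess
import sys

def run_commands(argv, commands):
    """Start a console program, send commands on stdin, return stdout lines."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    # communicate() writes the input, closes stdin, and collects all output.
    out, _ = proc.communicate("\n".join(commands) + "\n")
    return out.splitlines()

# Stand-in for the debugger: a child process that echoes each command.
ECHO = [sys.executable, "-c",
        "import sys\nfor line in sys.stdin: print('echo ' + line.strip())"]

for line in run_commands(ECHO, ["!threads", "q"]):
    print(line)  # each output line could be parsed here

# With a real debugger the call might look like this (assumed paths):
# run_commands(["cdb", "-z", "app.dmp"], ["!threads", "q"])
```

Collecting the output before splitting it keeps the sketch free of pipe deadlocks; a long-lived interactive session would need a reader that consumes lines as they arrive.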
20. Resources
- Debugging MS .Net 2.0 Applications, MS Press, John Robbins
- Windows Internals 5th edition, MS Press, Mark Russinovich
- Tess Ferrandez's blog
- John Robbins's blog
- www.codeplex.com/powerdbg
- www.polom.com/linqdbg