Presentation given at the start of the information-system automation project for the Mairie de Noumea.
Its purpose was to explain to the CIO (DSI) the value of such a change.
Social Networks and the Richness of Data - larsgeorge
Social networks by their nature deal with large amounts of user-generated data that must be processed and presented in a time sensitive manner. Much more write intensive than previous generations of websites, social networks have been on the leading edge of non-relational persistence technology adoption. This talk presents how Germany's leading social networks Schuelervz, Studivz and Meinvz are incorporating Redis and Project Voldemort into their platform to run features like activity streams.
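The activity-stream pattern behind such features can be sketched with a few Redis list operations. Below is a minimal, hypothetical fan-out-on-write example using the Jedis client; the key names and the 1,000-entry cap are illustrative assumptions, not details taken from the talk.

    import redis.clients.jedis.Jedis;

    // Minimal sketch of a Redis-backed activity stream (fan-out on write).
    // Key names and the 1000-entry cap are illustrative assumptions.
    public class ActivityStream {
        public static void main(String[] args) {
            try (Jedis redis = new Jedis("localhost", 6379)) {
                // Push a new activity onto each follower's stream list.
                String activity = "user:42 posted photo:99";
                for (String followerId : new String[] {"7", "8", "9"}) {
                    String key = "stream:" + followerId;
                    redis.lpush(key, activity);  // newest entry first
                    redis.ltrim(key, 0, 999);    // keep the stream bounded
                }
                // Reading a stream page is a single bounded list read.
                System.out.println(redis.lrange("stream:7", 0, 19));
            }
        }
    }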
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa - larsgeorge
Keynote during BiDaTA 2013 in Genoa, a special track of the ADBIS 2013 conference. URL: http://dbdmg.polito.it/bidata2013/index.php/keynote-presentation
From Batch to Realtime with Hadoop - Berlin Buzzwords - June 2012 - larsgeorge
This document summarizes Lars George's presentation on moving from batch to real-time processing with Hadoop. It discusses using Hadoop (HDFS and MapReduce) for batch processing of large amounts of data and integrating real-time databases and stream processing tools like HBase and Storm to enable faster querying and analytics. Example architectures shown combine batch and real-time systems by using real-time tools to process streaming data and periodically syncing results to Hadoop and HBase for long-term storage and analysis.
HBase Applications - Atlanta HUG - May 2014 - larsgeorge
HBase is good at various workloads, ranging from sequential range scans to purely random access. These access patterns can be translated into application types, usually falling into two major groups: entities and events. This presentation discusses the underlying implications and how to approach those use cases. Examples taken from Facebook show how this has been tackled in real life.
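For the "events" group, a standard approach is a composite row key of entity id plus a reversed timestamp, so a prefix scan returns an entity's newest events first. A minimal sketch with the HBase client API follows; the table and column names are invented for illustration.

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Sketch of the "events" pattern: entity id + reversed timestamp as
    // the row key, so a prefix scan yields the newest events first.
    // Table and column names are invented for the example.
    public class EventKeyExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = conn.getTable(TableName.valueOf("events"))) {
                long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();
                byte[] rowKey = Bytes.add(Bytes.toBytes("user42-"), Bytes.toBytes(reversedTs));
                Put put = new Put(rowKey);
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("type"), Bytes.toBytes("login"));
                table.put(put);
            }
        }
    }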
The document discusses several key factors for optimizing HBase performance (a configuration sketch follows this list):
1. Reads and writes compete for disk, network, and thread resources so they can cause bottlenecks.
2. Memory allocation needs to balance space for memstores, block caching, and Java heap usage.
3. The write-ahead log can be a major bottleneck and increasing its size or number of logs can improve write performance.
4. Flushes and compactions need to be tuned to avoid premature flushes causing "compaction storms".
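As a rough illustration of points 2 through 4, the knobs involved are ordinary HBase configuration properties. The values below are placeholders, not recommendations, and would normally be set in hbase-site.xml rather than in code.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Illustrative only: placeholder values, not tuning recommendations.
    public class TuningSketch {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // Point 2: balance block cache and memstore space within the heap.
            conf.setFloat("hfile.block.cache.size", 0.4f);
            conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
            // Point 3: allowing more WALs delays forced flushes on busy writers.
            conf.setInt("hbase.regionserver.maxlogs", 64);
            // Point 4: a wider blocking window helps avoid premature flushes
            // that trigger "compaction storms".
            conf.setInt("hbase.hregion.memstore.block.multiplier", 4);
        }
    }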
These are my slides for the 5-minute overview talk I gave during a recent workshop at the European Commission in Brussels, on the topic of "Big Data Skills in Europe".
Have a lot of data? Using or considering using Apache HBase (part of the Hadoop family) to store your data? Want to have your cake and eat it too? Phoenix is an open source project put out by Salesforce. Join us to learn how you can continue to use SQL, but get the raw speed of native HBase usage through Phoenix.
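In practice this means issuing plain SQL through Phoenix's JDBC driver while the engine translates it into native HBase operations. A minimal sketch, with an invented table; the connection string format (jdbc:phoenix:<zookeeper quorum>) is the documented one.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Minimal Phoenix-over-HBase sketch: plain SQL through JDBC, executed
    // natively against HBase. Table and data are invented for illustration.
    public class PhoenixExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                stmt.execute("CREATE TABLE IF NOT EXISTS metrics ("
                    + "host VARCHAR NOT NULL, ts BIGINT NOT NULL, value DOUBLE "
                    + "CONSTRAINT pk PRIMARY KEY (host, ts))");
                stmt.executeUpdate("UPSERT INTO metrics VALUES ('web1', 1, 0.5)");
                conn.commit(); // Phoenix batches mutations until commit
                try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM metrics")) {
                    while (rs.next()) System.out.println(rs.getLong(1));
                }
            }
        }
    }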
HBase Advanced Schema Design - Berlin Buzzwords - June 2012 - larsgeorge
While running a simple key/value-based solution on HBase usually requires an equally simple schema, it is less trivial to operate a different application that has to insert thousands of records per second. This talk addresses the architectural challenges when designing for either read or write performance in HBase, with examples of real-world use cases and how they were addressed.
http://berlinbuzzwords.de/sessions/advanced-hbase-schema-design
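One classic write-side technique in this space (not necessarily the exact example from these slides) is salting a sequential row key so consecutive writes spread across regions instead of hot-spotting one server. A hedged sketch, with an assumed bucket count:

    import org.apache.hadoop.hbase.util.Bytes;

    // Write-optimized key sketch: prefix a sequential key with a salt
    // bucket so consecutive writes spread across region servers instead
    // of hot-spotting one region. Bucket count and layout are assumptions.
    public class SaltedKey {
        static final int BUCKETS = 16;

        static byte[] saltedKey(long sequenceId) {
            byte bucket = (byte) (sequenceId % BUCKETS); // deterministic salt
            return Bytes.add(new byte[] { bucket }, Bytes.toBytes(sequenceId));
        }

        public static void main(String[] args) {
            System.out.println(Bytes.toStringBinary(saltedKey(123456789L)));
        }
    }

The trade-off lands on the read side: a complete scan of the data now requires one scan per salt bucket.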
Sept 17 2013 - THUG - HBase a Technical Introduction - Adam Muise
HBase Technical Introduction. This deck includes a description of memory design, write path, read path, some operational tidbits, SQL on HBase (Phoenix and Hive), as well as HOYA (HBase on YARN).
HBase Status Report - Hadoop Summit Europe 2014 - larsgeorge
This document provides a summary of new features and improvements in recent versions of Apache HBase, a distributed, scalable, big data store. It discusses major changes and enhancements in HBase 0.92+, 0.94+, and 0.96+, including new HFile formats, coprocessors, caching improvements, performance tuning, and more. The document is intended to bring readers up to date on the current state and capabilities of HBase.
Near-realtime analytics with Kafka and HBase - dave_revell
A presentation at OSCON 2012 by Nate Putnam and Dave Revell about Urban Airship's analytics stack. Features Kafka, HBase, and Urban Airship's own open source projects statshtable and datacube.
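The ingest loop can be sketched as a Kafka consumer writing into HBase. Note this uses today's Kafka consumer API rather than the 2012-era one, and the topic, table and key layout are invented rather than Urban Airship's actual code (their stack added statshtable and datacube on top).

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    // Sketch of the ingest loop: consume events from Kafka, persist them
    // to HBase for near-realtime queries. Names are invented.
    public class KafkaToHBase {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "analytics");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
                 Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = hbase.getTable(TableName.valueOf("events"))) {
                consumer.subscribe(List.of("app-events"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> rec : records) {
                        Put put = new Put(Bytes.toBytes(rec.key()));
                        put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                                      Bytes.toBytes(rec.value()));
                        table.put(put);
                    }
                }
            }
        }
    }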
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv - larsgeorge
This talk shows the complexity of building a data pipeline in Hadoop, starting with the technology aspects and then correlating those to the skill sets of current Hadoop adopters.
Parquet is an open-source columnar storage format that provides an efficient data layout for analytical queries. Twitter uses Parquet to store logs and analytics data across multiple large Hadoop clusters, saving petabytes of storage and reducing query times by up to 66% by reading only needed columns. Parquet defines a language-independent file format that stores data by column rather than row to optimize analytical access patterns.
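A hedged sketch of producing such a columnar file through the parquet-avro bindings follows; the schema, values and file name are invented, and this is not Twitter's pipeline code.

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.hadoop.fs.Path;
    import org.apache.parquet.avro.AvroParquetWriter;
    import org.apache.parquet.hadoop.ParquetWriter;
    import org.apache.parquet.hadoop.metadata.CompressionCodecName;

    // Sketch: write records into Parquet's columnar layout via the
    // parquet-avro bindings; a query engine can later read only the
    // columns it needs. Schema and file name are invented.
    public class ParquetWriteExample {
        public static void main(String[] args) throws Exception {
            Schema schema = SchemaBuilder.record("LogLine").fields()
                    .requiredLong("ts")
                    .requiredString("url")
                    .requiredInt("status")
                    .endRecord();
            try (ParquetWriter<GenericRecord> writer =
                     AvroParquetWriter.<GenericRecord>builder(new Path("logs.parquet"))
                         .withSchema(schema)
                         .withCompressionCodec(CompressionCodecName.SNAPPY)
                         .build()) {
                GenericRecord rec = new GenericData.Record(schema);
                rec.put("ts", 1700000000000L);
                rec.put("url", "/index.html");
                rec.put("status", 200);
                writer.write(rec);
            }
        }
    }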
This document summarizes Facebook's use cases and architecture for integrating Apache Hive and HBase. It discusses loading data from Hive into HBase tables using INSERT statements, querying HBase tables from Hive using SELECT statements, and maintaining low latency access to dimension tables stored in HBase while performing analytics on fact data stored in Hive. The architecture involves writing a storage handler and SerDe to map between the two systems and executing Hive queries by generating MapReduce jobs that read from or write to HBase.
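The documented integration point is the HBase storage handler in the Hive DDL. A minimal sketch through the Hive JDBC driver; the table names and column mapping are invented for illustration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Sketch of the Hive/HBase bridge: a Hive table backed by an HBase
    // table through HBaseStorageHandler, so Hive queries become MapReduce
    // jobs that read/write HBase. Names and mappings are invented.
    public class HiveHBaseExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
                 Statement stmt = conn.createStatement()) {
                stmt.execute(
                    "CREATE TABLE hbase_dim(key STRING, name STRING) " +
                    "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' " +
                    "WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:name') " +
                    "TBLPROPERTIES ('hbase.table.name' = 'dim')");
                // Populate from an existing Hive table, then query as usual.
                stmt.execute("INSERT OVERWRITE TABLE hbase_dim SELECT id, name FROM users");
            }
        }
    }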
Java in Windows Azure: the Jonas Example - Microsoft
Jonas, a J2EE application server, was recently ported to Windows Azure by Bull, with Microsoft's help. Beyond the mix of Java and Microsoft environments, this session demonstrates by example how open Windows Azure is to technologies that rarely run in a Windows environment.
A software factory is a set of pre-configured tools, frameworks, conventions, processes, documentation and project templates that give structure to developers and their work.
The goal is to automate the production and maintenance of applications as much as possible, in order to improve their quality and time to market.
Presentation of Maven and its use in the enterprise, given at the Ch'ti JUG on June 15, 2009.
Why Maven? Why adopt it? Good and bad practices, and its future.
Our journey towards continuous deployment with micro-services, containerization and orchestration of containers using Kubernetes. On our way there, we've had to create various tools to help us better use and test everything before going to production. We also had to integrate a variety of other tools to give us visibility on our platform.
This talk will be an overview of our journey up to now.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
CRM Acceleration Paris - May 2010 - presentation by Larry Augustin - Ysance
SugarCRM is a leading provider of open source CRM software with over 600 employees worldwide. It has strong market traction with nearly 600 new customers in Q1, over 60,000 paid subscriber seats, and more than 60,000 downloads per month. SugarCRM is introducing version 6.0 of its software, which is focused on speed, simplicity, and interoperability.
Puppet / Capistrano comparison - a table contrasting how the two tools are used:
1. Mode of operation: Puppet - regular polling by the clients; Capistrano - one-off tasks (invoked manually or via cron)
2. Orientation: Puppet - predefined objects/notions such as "Package", "Service", "File"; Capistrano - generic commands such as "upload", "download", "system", "run"
3. Target: Puppet - infrastructure and services; Capistrano - services and applications
4. Goal: Puppet - configuration homogeneity; Capistrano - reproducible tasks and parallel execution