Hadoop ecosystem framework n hadoop in live environment
Delhi/NCR HUG
Delhi Hadoop User Group MeetUp - 10th Sept. 2011 - Slides
Technology
Education
Hadoop ecosystem framework n hadoop in live environment
4. HDFS Architecture
5. Map Reduce Flow (by Ricky Ho)
6. HBase Architecture
8.
9.
10.
11.
12.
13.
Oozie Flow Start
Map reduce Fork MR Streaming Pig Join Decision MR Pipes Java FileSystem End
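The transcript only names the Map Reduce flow on slide 5 and does not spell it out. As a minimal single-process sketch of the general map, shuffle, reduce pattern, a word count might look like the following. This is not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are illustrative assumptions, and a real cluster would run each phase in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input; any iterable of text lines would do.
    print(run_job(["hadoop stores data", "hadoop processes data"]))
```

Keeping the three phases as separate functions mirrors the structure of the flow: only `map_phase` and `reduce_phase` carry job-specific logic, while the grouping step stays generic.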