Big Data at Globant
Success Cases in AWS
Sabina A. Schneider
What is Big Data?
What is Data Science?
• Data Architecture
• Enterprise Information Strategy
• High Availability and Performance
• NoSQL Distributed Solutions
• Mission Critical

• Product Positioning in the Market
• Deeper insight about your Customers
• Analytics and Alerts on KPIs
• Cross-reference data with different sources
Core Technologies
BigData Ecosystem
Scalable Architecture in the Cloud

[Architecture diagram. Components shown:]
• Mobile devices in the cars, mobile devices, web client
• Elastic Load Balancer
• Web App instances (auto scaling)
• Third-party integration
• BigData storage and processing: NoSQL DB, S3 bucket, CloudFront, EMR cluster, Hadoop, Pig, Storm (real-time processing)
• Trends, analytics dashboard, web client
Metamarkets has developed a web-based analytics console that supports drill-downs and roll-ups of high-dimensional data sets (real-time bidding), comprising billions of events, in real time.

The data store collects 10 GB of information every day and holds over 15 TB.

Reports using Hadoop and Hive on AWS infrastructure.

The 40-instance cluster can scan, filter, and aggregate 1 billion rows in 950 milliseconds.
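To make the roll-up and drill-down terms concrete, here is a minimal Python sketch of a scan, filter, and aggregate pass over a few events. The event fields, values, and the `aggregate` helper are hypothetical illustrations, not the Metamarkets schema or engine.

```python
from collections import defaultdict

# Hypothetical bid events; field names and values are illustrative only.
EVENTS = [
    {"country": "US", "publisher": "news", "spend": 1.5},
    {"country": "US", "publisher": "games", "spend": 0.5},
    {"country": "BR", "publisher": "news", "spend": 2.0},
    {"country": "BR", "publisher": "news", "spend": 1.0},
]

def aggregate(events, dimensions, metric="spend", where=None):
    """Scan events, keep those matching `where`, and sum `metric` per dimension tuple."""
    totals = defaultdict(float)
    for event in events:
        # Filter step: skip events that do not match every requested value.
        if where and any(event[k] != v for k, v in where.items()):
            continue
        key = tuple(event[d] for d in dimensions)
        totals[key] += event[metric]
    return dict(totals)

# Roll-up: collapse everything to one dimension.
rollup = aggregate(EVENTS, ["country"])
# Drill-down: add a dimension and restrict to one country.
drilldown = aggregate(EVENTS, ["country", "publisher"], where={"country": "US"})
```

A roll-up uses fewer dimensions in the grouping key, while a drill-down adds dimensions and usually a filter on the value being explored.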
Gree is a leading casual game development company.
Globant developed a Hadoop-based architecture to store gaming events and generate telemetry information. These metrics are used to analyze and segment gamer profiles, estimate revenue, and perform predictive analysis on game performance.
Product Positioning in the Market
• Collection of tweets on specific events (e.g., elections), integrated with a set of MapReduce-based queries

• Data stored in a 20-node Hadoop cluster

• Google Visualization tools for a widget-based dashboard
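As a rough illustration of what a MapReduce-based query over collected tweets could look like, the following Python sketch counts mentions of tracked terms with explicit map, shuffle, and reduce steps. It runs in a single process rather than on Hadoop, and the tweets, tracked terms, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tweets and tracked terms; not data from the project.
TWEETS = [
    "Election night: candidate A leads",
    "Candidate B concedes the election",
    "Great match tonight",
]
TRACKED = {"election", "candidate"}

def map_tweet(text):
    """Map step: emit (term, 1) for every tracked term found in one tweet."""
    for word in text.lower().split():
        word = word.strip(".,:;!?")
        if word in TRACKED:
            yield word, 1

def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce step: sum the counts for one term."""
    return key, sum(values)

pairs = [pair for tweet in TWEETS for pair in map_tweet(tweet)]
counts = dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster the map and reduce functions would run in parallel across nodes; only the structure of the query is shown here.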
What?
• Innovation for the Financial Market
• Sentiment Analytics on what's happening now and what can happen next in the Market
• Predictions one week in advance based on comments on Twitter

Challenges
• Aggressive Real Time analysis on Social Networks
• Dashboards comparing predictions with real values from Yahoo Finance
• Sentiment Analysis and Language filtering
• Analytics Predictions
Data Science
• Recommendation
• Classification
• Clustering
• Sophisticated Mathematical algorithm
• Statistical Algorithm

• Predictions on KPIs
• Predictions on Metrics
Moneygram Transaction Scoring
Analysis of Moneygram historical transactional data labeled as Fraudulent/Non-Fraudulent

     • 8 years of transactional data to analyze

Training of Support Vector Machines on historical data

     • Classification achieved using only a subset of the data, with soft margins (slack variables) to construct the dividing hyperplane
     • Possible use of kernel principal components to preprocess data and reduce the dimensionality of the training dataset
     • Avoids high computation times (sparse solution)

Benefits
    • Detect fraudulent transactions with a higher level of accuracy
    • Increase in customer service satisfaction (fewer false positives)
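The soft-margin idea can be sketched in a few lines of Python. This is a simplified stand-in, not the project's implementation: it trains a linear SVM by subgradient descent on the hinge loss, skips the kernel principal components step, and uses made-up, pre-scaled features. The feature meanings, the values, and the function names are assumptions for illustration.

```python
# Hypothetical, pre-scaled features per transaction: (amount, transfers in last 24h).
# Labels: +1 = fraudulent, -1 = non-fraudulent. Illustrative values only.
X = [(0.9, 0.8), (0.8, 0.95), (0.75, 0.7), (0.1, 0.1), (0.2, 0.05), (0.15, 0.2)]
Y = [1, 1, 1, -1, -1, -1]

def train_soft_margin_svm(X, y, C=10.0, lr=0.0005, epochs=10000):
    """Minimise 0.5*||w||^2 + C*sum(slack_i) by subgradient descent.

    slack_i = max(0, 1 - y_i*(w.x_i + b)) is the slack variable of sample i.
    Returns the (w, b) with the lowest objective seen during training.
    """
    w, b = [0.0] * len(X[0]), 0.0
    best = (float("inf"), list(w), b)
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0      # gradient of the 0.5*||w||^2 term
        objective = 0.5 * sum(wj * wj for wj in w)
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:                 # inside the margin: slack is active
                objective += C * (1 - margin)
                grad_w = [gj - C * yi * xj for gj, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        if objective < best[0]:            # remember the best iterate so far
            best = (objective, list(w), b)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return best[1], best[2]

def predict(w, b, x):
    """Return +1 (flag as fraudulent) or -1, depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

w, b = train_soft_margin_svm(X, Y)
training_predictions = [predict(w, b, x) for x in X]
```

The constant `C` sets how heavily margin violations are penalised: a larger value tolerates fewer violations, a smaller one allows a wider margin at the cost of more slack. A new transaction would be scored with `predict(w, b, features)`.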
Shopping cart suggestion engine
Generate suggestions based on client shopping history

• Cluster a large dataset representing clients' shopping history using unsupervised learning algorithms.

• Use information from a new or existing client to classify that client into the clustered shopping history of ALL clients.

• Generate suggestions based on the cluster's shopping preferences

• Use of Hadoop and Mahout for clustering and subsequent classification
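The cluster, classify, and suggest steps above can be sketched as follows. This is a small single-machine illustration using plain k-means in Python, not Hadoop or Mahout; the product list, purchase counts, and function names are hypothetical.

```python
PRODUCTS = ["laptop", "mouse", "monitor", "diapers", "baby wipes", "stroller"]

# Hypothetical purchase counts per client, one column per entry in PRODUCTS.
HISTORY = [
    [1, 2, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 2, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 0, 0, 2, 3, 0],
    [0, 0, 0, 4, 1, 1],
]

def sq_dist(a, b):
    """Squared Euclidean distance between two purchase vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) starting from the given centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the column-wise mean of its group.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def suggest(client, centroids, top_n=2):
    """Assign the client to a cluster and suggest its top products not yet bought."""
    centroid = centroids[nearest(client, centroids)]
    ranked = sorted(range(len(PRODUCTS)), key=lambda i: -centroid[i])
    return [PRODUCTS[i] for i in ranked if client[i] == 0 and centroid[i] > 0][:top_n]

centroids = kmeans(HISTORY, [HISTORY[0], HISTORY[3]])
suggestions = suggest([0, 0, 0, 2, 0, 0], centroids)
```

Here the initial centroids are picked by hand to keep the run deterministic; a production job would choose them differently and run the clustering in a distributed fashion.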
•   Metadata word clustering using Solr

•   Content management and information sorting/categorization classified by location.
    Enhances performance at the view level.

•   Indexing of jwt content coming from different sources (internal and external), developed
    with Solr on Lucene. Integration with myJwt.com, the internal social network.

      •   Content storage organization: a service running in the Cloud that receives content,
          generates different assets (snapshots, thumbnails), and extracts metadata to be
          centralized in one place
      •   myIdeas: collects ideas from creative designers in different locations
          and shares a bonus among the bright ideas
Data Visualization
Our data visualization practice allows our customers to understand the evolution of key business drivers and trends, and to drill down into the root causes of deviations.

Our HTML5 data visualization solution allows us to combine the flexibility of a custom-made solution with a fast time to market. It is based on standard widgets, allowing each user to customize the dashboard as required and view it on any device.
Big Data Visualization Framework
[Diagram: Cloud server and Browser, connected by user input and video streaming.]
Kantar Media manages TV advertisements displayed on DirecTV US.
We developed the addressable advertising reporting solution, used by advertisers to plan and analyze the performance of addressable advertisements.
Advertisements displayed on TV are customized to each user profile. The solution obtains reliable measurements from TV, analyzes the structure of the audience that has watched each advertisement, and allows evaluating the ROI of the marketing campaign.
Touch-screen-based scorecard, used by top management to analyze and compare results from different countries and products.
Thank you!

More Related Content

What's hot

Fraud prevention is better with TigerGraph inside
Fraud prevention is better with  TigerGraph insideFraud prevention is better with  TigerGraph inside
Fraud prevention is better with TigerGraph insideTigerGraph
 
Three Deep Web Analytics Wednesday
Three Deep Web Analytics WednesdayThree Deep Web Analytics Wednesday
Three Deep Web Analytics WednesdayThree Deep Marketing
 
Big data analytics use cases: all you need to know
Big data analytics use cases:  all you need to knowBig data analytics use cases:  all you need to know
Big data analytics use cases: all you need to knowJane Brewer
 
Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast DataBig Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast DataMatt Stubbs
 
Graph + AI World 2020: Opening Day Keynote
Graph + AI World 2020: Opening Day KeynoteGraph + AI World 2020: Opening Day Keynote
Graph + AI World 2020: Opening Day KeynoteTigerGraph
 
Big data landscape version 2.0
Big data landscape version 2.0Big data landscape version 2.0
Big data landscape version 2.0Matt Turck
 
Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Matt Turck
 
Callcenter HPE IDOL overview
Callcenter HPE IDOL overviewCallcenter HPE IDOL overview
Callcenter HPE IDOL overviewTania Akinina
 
Next Gen Analytics Going Beyond Data Warehouse
Next Gen Analytics Going Beyond Data WarehouseNext Gen Analytics Going Beyond Data Warehouse
Next Gen Analytics Going Beyond Data WarehouseDenodo
 
Mind Blowing Business Intelligence Dashboards
Mind Blowing Business Intelligence DashboardsMind Blowing Business Intelligence Dashboards
Mind Blowing Business Intelligence DashboardsUnilytics
 
EOH Analytics Offering
EOH Analytics OfferingEOH Analytics Offering
EOH Analytics Offeringalliekhan
 
Big data landscape map collection by aibdp
Big data landscape map collection by aibdpBig data landscape map collection by aibdp
Big data landscape map collection by aibdpAIBDP
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningTigerGraph
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Denodo
 
Location Intelligence - The where factor
Location Intelligence - The where factorLocation Intelligence - The where factor
Location Intelligence - The where factorThomas Lejars
 
Denodo DataFest 2016: Metadata and Data: Search and Exploration
Denodo DataFest 2016: Metadata and Data: Search and ExplorationDenodo DataFest 2016: Metadata and Data: Search and Exploration
Denodo DataFest 2016: Metadata and Data: Search and ExplorationDenodo
 
Is your data paying you dividends?
Is your data paying you dividends? Is your data paying you dividends?
Is your data paying you dividends? Karan Sachdeva
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best PracticesYellowfin
 
Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationDenodo
 

What's hot (20)

Fraud prevention is better with TigerGraph inside
Fraud prevention is better with  TigerGraph insideFraud prevention is better with  TigerGraph inside
Fraud prevention is better with TigerGraph inside
 
Three Deep Web Analytics Wednesday
Three Deep Web Analytics WednesdayThree Deep Web Analytics Wednesday
Three Deep Web Analytics Wednesday
 
Big data analytics use cases: all you need to know
Big data analytics use cases:  all you need to knowBig data analytics use cases:  all you need to know
Big data analytics use cases: all you need to know
 
Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast DataBig Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data
 
Graph + AI World 2020: Opening Day Keynote
Graph + AI World 2020: Opening Day KeynoteGraph + AI World 2020: Opening Day Keynote
Graph + AI World 2020: Opening Day Keynote
 
Big data landscape version 2.0
Big data landscape version 2.0Big data landscape version 2.0
Big data landscape version 2.0
 
Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)Big Data, Big Deal? (A Big Data 101 presentation)
Big Data, Big Deal? (A Big Data 101 presentation)
 
Callcenter HPE IDOL overview
Callcenter HPE IDOL overviewCallcenter HPE IDOL overview
Callcenter HPE IDOL overview
 
Next Gen Analytics Going Beyond Data Warehouse
Next Gen Analytics Going Beyond Data WarehouseNext Gen Analytics Going Beyond Data Warehouse
Next Gen Analytics Going Beyond Data Warehouse
 
Mind Blowing Business Intelligence Dashboards
Mind Blowing Business Intelligence DashboardsMind Blowing Business Intelligence Dashboards
Mind Blowing Business Intelligence Dashboards
 
Making Money With Big Data
Making Money With Big DataMaking Money With Big Data
Making Money With Big Data
 
EOH Analytics Offering
EOH Analytics OfferingEOH Analytics Offering
EOH Analytics Offering
 
Big data landscape map collection by aibdp
Big data landscape map collection by aibdpBig data landscape map collection by aibdp
Big data landscape map collection by aibdp
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph Learning
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)
 
Location Intelligence - The where factor
Location Intelligence - The where factorLocation Intelligence - The where factor
Location Intelligence - The where factor
 
Denodo DataFest 2016: Metadata and Data: Search and Exploration
Denodo DataFest 2016: Metadata and Data: Search and ExplorationDenodo DataFest 2016: Metadata and Data: Search and Exploration
Denodo DataFest 2016: Metadata and Data: Search and Exploration
 
Is your data paying you dividends?
Is your data paying you dividends? Is your data paying you dividends?
Is your data paying you dividends?
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best Practices
 
Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital Transformation
 

Viewers also liked

Nemes-Nagy Katalin Erika: Ezt főztük ki!
Nemes-Nagy Katalin Erika: Ezt főztük ki!Nemes-Nagy Katalin Erika: Ezt főztük ki!
Nemes-Nagy Katalin Erika: Ezt főztük ki!digipedkonf
 
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...tudostanar
 
Guia de ciencias n 1 periodo grado 2°
Guia de ciencias n 1 periodo grado 2°Guia de ciencias n 1 periodo grado 2°
Guia de ciencias n 1 periodo grado 2°Monica Muñoz
 
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...digitalisnemzedek
 
Testing Centre of Excellence Model 2016
Testing Centre of Excellence Model 2016Testing Centre of Excellence Model 2016
Testing Centre of Excellence Model 2016Tony Barber
 
Risk in the food supply chain
Risk in the food supply chainRisk in the food supply chain
Risk in the food supply chainTristan Wiggill
 
Józsa Gabriella: zanza.tv
Józsa Gabriella: zanza.tvJózsa Gabriella: zanza.tv
Józsa Gabriella: zanza.tvdigipedkonf
 
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...Amazon Web Services
 
Android Booting Sequence
Android Booting SequenceAndroid Booting Sequence
Android Booting SequenceJayanta Ghoshal
 
Announcing AWS CodeBuild - January 2017 Online Teck Talks
Announcing AWS CodeBuild - January 2017 Online Teck TalksAnnouncing AWS CodeBuild - January 2017 Online Teck Talks
Announcing AWS CodeBuild - January 2017 Online Teck TalksAmazon Web Services
 
Test Environment Management
Test Environment ManagementTest Environment Management
Test Environment ManagementKanoah
 

Viewers also liked (15)

Nemes-Nagy Katalin Erika: Ezt főztük ki!
Nemes-Nagy Katalin Erika: Ezt főztük ki!Nemes-Nagy Katalin Erika: Ezt főztük ki!
Nemes-Nagy Katalin Erika: Ezt főztük ki!
 
Goebbels, joseph fuhrerr
Goebbels, joseph   fuhrerrGoebbels, joseph   fuhrerr
Goebbels, joseph fuhrerr
 
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...
Hajdicsné Varga Katalin: A gépírástanulás eredményességének értékelése tanuló...
 
Uu praktik kedokteran
Uu praktik kedokteranUu praktik kedokteran
Uu praktik kedokteran
 
Guia de ciencias n 1 periodo grado 2°
Guia de ciencias n 1 periodo grado 2°Guia de ciencias n 1 periodo grado 2°
Guia de ciencias n 1 periodo grado 2°
 
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...
Sápi Vivien: Okostelefonok és applikációk legalizálása a középiskolai oktatás...
 
Sketch You Can!
Sketch You Can!Sketch You Can!
Sketch You Can!
 
Testing Centre of Excellence Model 2016
Testing Centre of Excellence Model 2016Testing Centre of Excellence Model 2016
Testing Centre of Excellence Model 2016
 
Risk in the food supply chain
Risk in the food supply chainRisk in the food supply chain
Risk in the food supply chain
 
Józsa Gabriella: zanza.tv
Józsa Gabriella: zanza.tvJózsa Gabriella: zanza.tv
Józsa Gabriella: zanza.tv
 
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
 
Android Booting Sequence
Android Booting SequenceAndroid Booting Sequence
Android Booting Sequence
 
Announcing AWS CodeBuild - January 2017 Online Teck Talks
Announcing AWS CodeBuild - January 2017 Online Teck TalksAnnouncing AWS CodeBuild - January 2017 Online Teck Talks
Announcing AWS CodeBuild - January 2017 Online Teck Talks
 
Filosofia medieval
Filosofia medievalFilosofia medieval
Filosofia medieval
 
Test Environment Management
Test Environment ManagementTest Environment Management
Test Environment Management
 

Similar to 16h00 globant - aws globant-big-data_summit2012

Evolving analytics at ebay - 2012 Tableau Customer Conference
Evolving analytics at ebay - 2012 Tableau Customer ConferenceEvolving analytics at ebay - 2012 Tableau Customer Conference
Evolving analytics at ebay - 2012 Tableau Customer Conferencegdougan1
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data SolutionsMark Kromer
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeMongoDB
 
Big Data Companies and Apache Software
Big Data Companies and Apache SoftwareBig Data Companies and Apache Software
Big Data Companies and Apache SoftwareBob Marcus
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureMongoDB
 
Big Data Expo 2015 - Pentaho The Future of Analytics
Big Data Expo 2015 - Pentaho The Future of AnalyticsBig Data Expo 2015 - Pentaho The Future of Analytics
Big Data Expo 2015 - Pentaho The Future of AnalyticsBigDataExpo
 
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud Modernization
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud ModernizationDenodo DataFest 2017: Lowering IT Costs with Big Data and Cloud Modernization
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud ModernizationDenodo
 
Denodo Datafest 2017 London Tekin Mentes Logitech
Denodo Datafest 2017 London Tekin Mentes LogitechDenodo Datafest 2017 London Tekin Mentes Logitech
Denodo Datafest 2017 London Tekin Mentes LogitechTekin Mentes
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBigDataExpo
 
Next-Gen Cloud Analytics with AWS, Big Data and Data Virtualization
Next-Gen Cloud Analytics with AWS, Big Data and Data VirtualizationNext-Gen Cloud Analytics with AWS, Big Data and Data Virtualization
Next-Gen Cloud Analytics with AWS, Big Data and Data VirtualizationDenodo
 
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S... New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...Big Data Spain
 
StreamCentral Technical Overview
StreamCentral Technical OverviewStreamCentral Technical Overview
StreamCentral Technical OverviewRaheel Retiwalla
 
Introduction to Big Data using AWS Services
Introduction to Big Data using AWS ServicesIntroduction to Big Data using AWS Services
Introduction to Big Data using AWS ServicesAnjani Phuyal
 
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaMindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaData Driven Innovation
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceSalesforce Developers
 
Mining Information from Data on Cloud
Mining Information from Data on CloudMining Information from Data on Cloud
Mining Information from Data on CloudAmazon Web Services
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Denodo
 

Similar to 16h00 globant - aws globant-big-data_summit2012 (20)

Evolving analytics at ebay - 2012 Tableau Customer Conference
Evolving analytics at ebay - 2012 Tableau Customer ConferenceEvolving analytics at ebay - 2012 Tableau Customer Conference
Evolving analytics at ebay - 2012 Tableau Customer Conference
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data Solutions
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data Lake
 
Big Data Companies and Apache Software
Big Data Companies and Apache SoftwareBig Data Companies and Apache Software
Big Data Companies and Apache Software
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise Architecture
 
Big Data Expo 2015 - Pentaho The Future of Analytics
Big Data Expo 2015 - Pentaho The Future of AnalyticsBig Data Expo 2015 - Pentaho The Future of Analytics
Big Data Expo 2015 - Pentaho The Future of Analytics
 
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud Modernization
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud ModernizationDenodo DataFest 2017: Lowering IT Costs with Big Data and Cloud Modernization
Denodo DataFest 2017: Lowering IT Costs with Big Data and Cloud Modernization
 
Denodo Datafest 2017 London Tekin Mentes Logitech
Denodo Datafest 2017 London Tekin Mentes LogitechDenodo Datafest 2017 London Tekin Mentes Logitech
Denodo Datafest 2017 London Tekin Mentes Logitech
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real Time
 
Next-Gen Cloud Analytics with AWS, Big Data and Data Virtualization
Next-Gen Cloud Analytics with AWS, Big Data and Data VirtualizationNext-Gen Cloud Analytics with AWS, Big Data and Data Virtualization
Next-Gen Cloud Analytics with AWS, Big Data and Data Virtualization
 
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S... New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 
StreamCentral Technical Overview
StreamCentral Technical OverviewStreamCentral Technical Overview
StreamCentral Technical Overview
 
Introduction to Big Data using AWS Services
Introduction to Big Data using AWS ServicesIntroduction to Big Data using AWS Services
Introduction to Big Data using AWS Services
 
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaMindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to Salesforce
 
Mining Information from Data on Cloud
Mining Information from Data on CloudMining Information from Data on Cloud
Mining Information from Data on Cloud
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
 

More from infolive

Projeto Exame Forum Virtual 3.0 v2
Projeto Exame Forum Virtual 3.0 v2Projeto Exame Forum Virtual 3.0 v2
Projeto Exame Forum Virtual 3.0 v2infolive
 
17h30 aws-databases-summit
17h30   aws-databases-summit17h30   aws-databases-summit
17h30 aws-databases-summitinfolive
 
16h30 aws gru security deck
16h30   aws gru security deck16h30   aws gru security deck
16h30 aws gru security deckinfolive
 
15h00 intel - intel big data for aws summits rev3
15h00   intel - intel big data for aws summits rev315h00   intel - intel big data for aws summits rev3
15h00 intel - intel big data for aws summits rev3infolive
 
14h00 aws costoptimization_jvaria
14h00 aws costoptimization_jvaria14h00 aws costoptimization_jvaria
14h00 aws costoptimization_jvariainfolive
 
13h00 aws 2012-fault_tolerant_applications
13h00   aws 2012-fault_tolerant_applications13h00   aws 2012-fault_tolerant_applications
13h00 aws 2012-fault_tolerant_applicationsinfolive
 
Keynote aws summit 2012 final
Keynote aws summit 2012 finalKeynote aws summit 2012 final
Keynote aws summit 2012 finalinfolive
 
Infolive apresentação 2012
Infolive apresentação 2012Infolive apresentação 2012
Infolive apresentação 2012infolive
 

More from infolive (8)

Projeto Exame Forum Virtual 3.0 v2
Projeto Exame Forum Virtual 3.0 v2Projeto Exame Forum Virtual 3.0 v2
Projeto Exame Forum Virtual 3.0 v2
 
17h30 aws-databases-summit
17h30   aws-databases-summit17h30   aws-databases-summit
17h30 aws-databases-summit
 
16h30 aws gru security deck
16h30   aws gru security deck16h30   aws gru security deck
16h30 aws gru security deck
 
15h00 intel - intel big data for aws summits rev3
15h00   intel - intel big data for aws summits rev315h00   intel - intel big data for aws summits rev3
15h00 intel - intel big data for aws summits rev3
 
14h00 aws costoptimization_jvaria
14h00 aws costoptimization_jvaria14h00 aws costoptimization_jvaria
14h00 aws costoptimization_jvaria
 
13h00 aws 2012-fault_tolerant_applications
13h00   aws 2012-fault_tolerant_applications13h00   aws 2012-fault_tolerant_applications
13h00 aws 2012-fault_tolerant_applications
 
Keynote aws summit 2012 final
Keynote aws summit 2012 finalKeynote aws summit 2012 final
Keynote aws summit 2012 final
 
Infolive apresentação 2012
Infolive apresentação 2012Infolive apresentação 2012
Infolive apresentação 2012
 

16h00 globant - aws globant-big-data_summit2012

  • 1. Big Data at Globant Success Cases in AWS Sabina A. Schneider
  • 2. What is Big Data?
  • 3. What is Data Science?
  • 4. Data Architecture; Enterprise Information Strategy; High Availability and Performance; NoSQL Distributed Solutions; Mission Critical. Product Positioning in the Market; Deeper insight about your Customers; Analytics and Alerts on KPIs; Cross-reference data with different sources
  • 7. Scalable Architecture in the Cloud. Clients: Mobile Devices in the cars; Mobile Devices; Web Client; Third Party Integration. Front end: Elastic Load Balancer; Web App (x3); Auto scaling singly. Back end: NoSQL DB; S3 Bucket; CloudFront; EMR Cluster; Storm (Real Time processing); Hadoop; Pig. Output: Analytics Dashboard; Trends Web Client. BigData: storage and processing
  • 8. Metamarkets has developed a web-based analytics console that supports drill-downs and roll-ups of high-dimensional data sets (real-time bidding), comprising billions of events, in real time. The data store collects 10 GB of information every day and holds over 15 TB. Reports are generated using Hadoop and Hive on AWS infrastructure. The 40-instance cluster can scan, filter, and aggregate 1 billion rows in 950 milliseconds.
  • 9. Gree is a leading casual game development company. Globant developed a Hadoop based architecture to store gaming events and generate telemetry information. These metrics are used to analyze, segment gamer profiles, estimate revenue and perform predictive analysis on game performance.
  • 10. Product Positioning in the Market • Collection of tweets on specific events (e.g., elections), integrated with a set of MapReduce-based queries • Data stored in a 20-node Hadoop cluster • Google Visualization tools for a widget-based dashboard
  • 11. What? • Innovation for the Financial Market • Sentiment analytics on what is happening now and what can happen next in the market • Predictions one week in advance based on comments on Twitter Challenges • Aggressive real-time analysis of social networks • Dashboards comparing predictions with real values from Yahoo Finance • Sentiment analysis and language filtering • Analytics predictions
  • 12. Data Science: Recommendation; Classification; Sophisticated Mathematical algorithm; Statistical Clustering Algorithm; Predictions on KPIs; Predictions on Metrics
  • 13. Moneygram Transaction Scoring. Analysis of Moneygram historical transactional data labeled as Fraudulent/Non Fraudulent • 8 years of transactional data to analyze. Training of Support Vector Machines on historical data • Classification achieved by using only a subset of the data, with soft margins (by use of slack variables) to construct the dividing hyperplane • Possible use of kernel principal components to preprocess data and reduce the dimensionality of the training dataset • Avoid high computation times (sparse solution). Benefits • Detect fraudulent transactions with a higher level of accuracy • Increase in customer service satisfaction (fewer false positives)
  • 14. Shopping cart suggestion engine. Generate suggestions based on client shopping history • Cluster a large dataset representing clients' shopping history using unsupervised learning algorithms. • Use information from a new or existing client to classify that client into the clustered shopping history of ALL clients. • Generate suggestions based on the cluster's shopping preferences • Use of Hadoop and Mahout for clustering and subsequent classification
  • 15. Metadata word clustering using Solr • Content management and information sorting/categorization classified by location. Enhance the performance at a view level. • Indexing of JWT content coming from different sources (internal and external), developed with Solr on Lucene. Integration with myJwt.com, the internal social network. • Organize the content storage: a service running in the Cloud that receives content, generates different assets (snapshots, thumbnails), and extracts metadata to be centralized in one place • myIdeas: collect ideas from different creative designers in different locations and share a bonus among the bright ideas
  • 16. Data Visualization. Our data visualization practice allows our customers to understand the evolution of key business drivers and trends, and to drill down into the root causes of deviations. Our HTML5 data visualization solution allows us to combine the flexibility of a custom-made solution with a fast time to market. It is based on standard widgets, allowing each user to customize the dashboard as required and to view it on every device.
  • 18. Cloud server; Browser; User input; Video streaming
  • 19.
  • 20. Kantar Media manages TV advertising displayed on DirecTV US. We developed the addressable advertising reporting solution, used by advertisers to plan and analyze the performance of addressable advertising. Advertising displayed on TV is customized to each user profile. The solution provides reliable measurements from TV, analyzes the structure of the audience that has watched each advertisement, and supports evaluating the ROI of the marketing campaign.
  • 21.
  • 22. Touch screen based scorecard, used by the top management to analyze and compare results from different countries and products.
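
The shopping cart suggestion engine on slide 14 can be illustrated with a small sketch. This is not the Hadoop and Mahout implementation the slide names: it is plain Python, and k-means is swapped in as one possible unsupervised algorithm, since the slide does not say which one was used. The catalog, the client names, and every function name here are hypothetical. The flow follows the slide: cluster all clients' shopping histories, place a client in the nearest cluster, then suggest what that cluster buys most.

```python
from collections import Counter

# Hypothetical catalog and shopping histories, for illustration only.
catalog = ["diapers", "wipes", "formula", "coffee", "filters", "mug"]
histories = {
    "ana": {"diapers", "wipes", "formula"},
    "ben": {"coffee", "filters", "mug"},
    "cara": {"diapers", "wipes"},
    "dan": {"coffee", "filters"},
}


def to_vector(items, catalog):
    """Encode a set of purchased items as a 0/1 vector over the catalog."""
    return [1.0 if item in items else 0.0 for item in catalog]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, centroids):
    """Index of the centroid closest to the vector (first one wins ties)."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(vector, centroids[c]))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        assignment = [nearest(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(v, centroids) for v in vectors]


def suggest(new_items, histories, catalog, k=2, top_n=2):
    """Suggest the most common items in the client's cluster not yet bought."""
    clients = list(histories)
    vectors = [to_vector(histories[c], catalog) for c in clients]
    centroids, assignment = kmeans(vectors, k)
    cluster = nearest(to_vector(new_items, catalog), centroids)
    counts = Counter()
    for client, a in zip(clients, assignment):
        if a == cluster:
            counts.update(histories[client])
    # Rank by popularity within the cluster, then by name for stable output.
    ranked = sorted((i for i in counts if i not in new_items),
                    key=lambda i: (-counts[i], i))
    return ranked[:top_n]


print(suggest({"diapers"}, histories, catalog))  # ['wipes', 'formula']
```

A client who has bought only diapers lands in the cluster with "ana" and "cara", so the sketch suggests the items those clients bought that the new client has not. At the scale described in the slide, the clustering step would run as a distributed job rather than in memory.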