Boston Hadoop User Group
Jeremy Rishel, SVP Engineering, Products, & Data
April 2012
Which is Better?

A. More Data

B. Better Data

C. Better Algorithms




                       Bluefin Labs Proprietary and Confidential
Which is Better?

A. More Data

B. Better Data

C. Better Algorithms

D. All of the Above




Social TV




Television          Social Web
Impressions   Expressions
Kinds of Data and Algorithms
Public social media (Twitter, Facebook) 250M+ documents per day

Programming info for 200+ U.S. networks

Video signal for 65+ U.S. networks

Brand conversation & ad tracking for thousands of brands

Realtime semantic analysis of comments

Demographic & behavioral analysis of authors

Advertising context & effect of advertising on brand dynamics

Overlap between audiences and comparative analysis
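One simple way to quantify the overlap between two audiences is the Jaccard index over sets of author IDs. The sketch below is illustrative only: the author IDs and the choice of Jaccard are assumptions, not a description of the method Bluefin Labs actually uses.

```python
def jaccard_overlap(audience_a: set, audience_b: set) -> float:
    """Fraction of all authors who appear in both audiences (0.0 to 1.0)."""
    union = audience_a | audience_b
    if not union:
        return 0.0
    return len(audience_a & audience_b) / len(union)


# Hypothetical author IDs who commented on a show and on a brand.
show_audience = {"u1", "u2", "u3", "u4"}
brand_audience = {"u3", "u4", "u5"}

overlap = jaccard_overlap(show_audience, brand_audience)
print(f"overlap = {overlap:.2f}")
```

Comparing this score across many show/brand pairs gives a basic form of comparative analysis; a production system would likely also normalize for audience size.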
Realtime & Historical Data
2M show telecasts

1.5M ad airings / month

50M links between social media users and TV shows / month

10B links between social media users and TV ads / month

End-to-end latency in minutes - visible & searchable in realtime

Historical data visible & searchable through various UIs/tools

Searchable text index of all social media comments in our archive &
methods for large-scale analysis jobs (including MR)
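As a rough illustration of the kind of MapReduce (MR) job mentioned above, the following in-memory sketch counts archived comments per show. It is a stand-in for a real Hadoop job, and the record layout and show names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (comment_id, show) records; a real job would read the archive.
records = [
    ("c1", "Show A"),
    ("c2", "Show B"),
    ("c3", "Show A"),
]


def map_phase(record):
    """Emit a (show, 1) pair for each comment record."""
    _comment_id, show = record
    yield show, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts emitted for one show."""
    return key, sum(values)


mapped = (pair for record in records for pair in map_phase(record))
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```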
Kinds of Questions
We often work at the intersection of multiple data streams, or of data &
algorithms

How much chatter about a show (realtime)? (Social media +
programming info + semantic analysis)

What ads are airing (near realtime)? (Video signals + programming
info + computer vision/audio fingerprinting)

Which brands does the audience of a show talk most about? Which
shows do brand-engaged authors talk most about? (Social media +
programming info + brand data + semantic analysis + audience
overlap analysis)
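The first question above (how much chatter about a show) can be sketched as a join between comments and programming info. This is a minimal illustration under stated assumptions: the schedule, the comments, and the keyword match standing in for semantic analysis are all hypothetical.

```python
from datetime import datetime

# Hypothetical programming info: (show, keyword, start, end).
schedule = [
    ("Show A", "showa", datetime(2012, 4, 1, 20, 0), datetime(2012, 4, 1, 21, 0)),
    ("Show B", "showb", datetime(2012, 4, 1, 21, 0), datetime(2012, 4, 1, 22, 0)),
]

# Hypothetical social media comments: (timestamp, text).
comments = [
    (datetime(2012, 4, 1, 20, 5), "loving #showa tonight"),
    (datetime(2012, 4, 1, 20, 30), "#ShowA is great"),
    (datetime(2012, 4, 1, 21, 10), "watching #showb"),
    (datetime(2012, 4, 1, 21, 15), "what a nice evening"),
]


def chatter_per_telecast(schedule, comments):
    """Count comments that mention a show while its telecast is airing."""
    counts = {show: 0 for show, _, _, _ in schedule}
    for timestamp, text in comments:
        lowered = text.lower()
        for show, keyword, start, end in schedule:
            # Keyword matching is a crude placeholder for semantic analysis.
            if start <= timestamp < end and keyword in lowered:
                counts[show] += 1
    return counts


chatter = chatter_per_telecast(schedule, comments)
print(chatter)
```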



More Data

“More data” can mean new streams, broader streams, or more
granular data

“More data” powers better algorithms & aids in creating better data

Capturing color, texture, & audio features from the TV video stream
improved our ad detection

Tapping into full author history permitted better age classification

Analyzing closed captions gave us another dimension of semantic
analysis and avenues to explore social/mass media engagement



Better Data
“Better data” is achieved through human-machine collaboration, with a
view to continual improvement

“Better data” makes for better algorithms & makes big data more useful

Both realtime and large-scale review & curation

Systematic monitoring, statistical QA, & estimation models

High-quality data supports in-domain benchmarking (How is a show
or network doing vs. competitors? How is a brand doing within its sector?)

High-quality, consistent data permits richer trend analysis (e.g.
season-over-season or ad campaign-to-ad campaign comparison)
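One common form of statistical QA is to have reviewers audit a random sample of machine-produced labels and put a confidence interval on the accuracy. The sketch below uses the Wilson score interval; the audit numbers are hypothetical and the slides do not say which estimator is actually used.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Hypothetical audit: reviewers checked 200 sampled labels and agreed with 184.
low, high = wilson_interval(184, 200)
print(f"estimated accuracy between {low:.3f} and {high:.3f}")
```

Tracking such an interval over time is one way to turn human review into a systematic monitoring signal.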

Better Algorithms

“Better algorithms” include both new analytics & improvements to
existing ones

“Better algorithm” approaches can be taken with more & better data

Focus areas of NLP/machine learning, computer vision, & statistical
analysis; key to “better” is having a way to measure “goodness”

The ad discovery methods open to us changed once we shifted to a broader
approach

Higher-quality show telecast engagement data permits more precise
audience analysis across domains - e.g. shows & networks to brands
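To make "a way to measure goodness" concrete, a detector can be scored against a hand-verified ground truth using precision, recall, and F1. This is a generic sketch; the ad airing IDs are hypothetical and the slides do not state which metrics Bluefin Labs uses.

```python
def precision_recall_f1(predicted: set, actual: set):
    """Score a set of detections against a verified ground-truth set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ad airings: what the detector found vs. what reviewers confirmed.
detected = {"ad1", "ad2", "ad3", "ad4"}
ground_truth = {"ad2", "ad3", "ad4", "ad5", "ad6"}

precision, recall, f1 = precision_recall_f1(detected, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With a fixed benchmark like this, a new algorithm counts as "better" only if it moves these numbers in the right direction.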

All of the Above

More data helps build better data & algorithms

Better data improves algorithms & makes large data more useful

Better algorithms get leverage out of more & better data

You should care about all three




Jeremy Rishel
 jrishel@bluefinlabs.com
Boston Hadoop User Group Presentation

  • 1. Boston Hadoop User Group Jeremy Rishel, SVP Engineering, Products, & Data April 2012
  • 2. Which is Better? A. More Data B. Better Data C. Better Algorithms Bluefin Labs Proprietary and Confidential
  • 3. Which is Better? A. More Data B. Better Data C. Better Algorithms D. All of the Above Bluefin Labs Proprietary and Confidential
  • 4. Social TV Television Social Web
  • 5. Social TV Television Social Web
  • 6. Social TV Television Social Web
  • 7.
  • 9. Impressions Expressions
  • 10. Impressions Expressions
  • 11. Kinds of Data and Algorithms Public social media (Twitter, Facebook) 250M+ documents per day Programming info for 200+ U.S. networks Video signal for 65+ U.S. networks Brand conversation & ad tracking for thousands of brands Realtime semantic analysis of comments Demographic & behavioral analysis of authors Advertising context & effect of advertising on brand dynamics Overlap between audiences and comparative analysis Bluefin Labs Proprietary and Confidential
  • 12. Realtime & Historical Data 2M show telecasts 1.5M ad airings / month 50M links between social media users and TV shows / month 10B links between social media users and TV ads / month End-to-end latency in minutes - visible & searchable in realtime Historical data visible & searchable through various UIs/tools Searchable text index of all social media comments in our archive & methods for large-scale analysis jobs (including MR) Bluefin Labs Proprietary and Confidential
  • 13. Kinds of Questions We often deal at the intersection of multiple data streams or data & algorithms How much chatter about a show (realtime)? (Social media + programming info + semantic analysis) What ads are airing (near realtime)? (Video signals + programming info + computer vision/audio fingerprinting) Which brands does the audience of a show talk most about? Which shows do brand engaged authors talk most about? (Social media + programming info + brand data + semantic analysis + audience overlap analysis) Bluefin Labs Proprietary and Confidential
  • 17. More Data
      • “More data” can mean new streams, broader streams, or more granular data
      • “More data” powers better algorithms & aids in creating better data
      • Capturing color, texture, & audio features from the TV video stream improved our ad detection
      • Tapping into full author history permitted better age classification
      • Analyzing closed captions gave us another dimension of semantic analysis and avenues to explore social/mass media engagement
  • 22. Better Data
      • “Better data” is achieved through human-machine collaboration, with a view to continual improvement
      • “Better data” makes for better algorithms & makes big data more useful
      • Both realtime and large-scale review & curation
      • Systematic monitoring, statistical QA, & estimation models
      • High-quality data supports in-domain benchmarking (How is a show or network doing vs. competitors? How is a brand doing within its sector?)
      • High-quality, consistent data permits richer trend analysis (e.g. season-over-season or ad campaign-to-ad campaign comparison)
  • 26. Better Algorithms
      • “Better algorithms” include both new analytics & improvements to existing ones
      • “Better algorithm” approaches can be taken with more & better data
      • Focus areas are NLP/machine learning, computer vision, & statistical analysis; the key to “better” is having a way to measure “goodness”
      • The ad discovery methods available to us changed once we shifted to a broader approach
      • Higher-quality show telecast engagement data permits more precise audience analysis across domains, e.g. from shows & networks to brands
  • 27. All of the Above
      • More data helps build better data & algorithms
      • Better data improves algorithms & makes large data more useful
      • Better algorithms get leverage out of more & better data
      • You should care about all three
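
The "audience overlap analysis" named on the Kinds of Questions slide can be illustrated with a small sketch. This is not Bluefin's method, which the slides do not describe; it only assumes that each show and each brand can be reduced to a set of author IDs. The function names, the sample IDs, and the choice of Jaccard similarity as an overlap measure are all assumptions made for illustration.

```python
from collections import Counter


def top_brands_for_show(show_authors, brand_mentions, top_n=3):
    """Rank brands by how many distinct authors in the show's audience mention them.

    show_authors: set of author IDs who commented on the show (hypothetical input).
    brand_mentions: dict mapping brand name -> set of author IDs who mentioned it.
    """
    counts = Counter()
    for brand, authors in brand_mentions.items():
        overlap = len(show_authors & authors)  # authors present in both audiences
        if overlap:
            counts[brand] = overlap
    return counts.most_common(top_n)


def jaccard(a, b):
    """Jaccard similarity of two audiences: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up sample data, for illustration only.
show_audience = {"u1", "u2", "u3", "u4"}
brand_mentions = {
    "brand_a": {"u1", "u2", "u9"},
    "brand_b": {"u3"},
    "brand_c": {"u7", "u8"},
}

ranked = top_brands_for_show(show_audience, brand_mentions)
similarity = jaccard(show_audience, brand_mentions["brand_a"])
print(ranked)
print(similarity)
```

Raw overlap counts favor brands with large audiences, so a normalized measure such as Jaccard is one way to compare shows and brands of different sizes. At the volumes quoted in the deck, the same set intersections would more plausibly run as a distributed job than in memory.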
