Search + Big Data:
It’s (still) All About the User
Grant Ingersoll, Chief Scientist – Lucid Imagination
           grant@lucidimagination.com
                 October 19, 2011
Promise and Reality

  “Data is increasingly digital air: the oxygen we
 breathe and the carbon dioxide that we exhale. It
can be a source of both sustenance and pollution.”
            Six Provocations for Big Data
          by Danah Boyd and Kate Crawford



  “The truth is, I spend most of my time trying to
reduce the size of my data so it can be analyzed.”
      Hilary Mason, Chief Scientist, Bitly @ Strata
Pragmatism
Evolution
Documents
   •  Models
   •  Feature Selection
Content Relationships
   •  Page Rank, etc.
   •  Organization
User Interaction
   •  Clicks
   •  Ratings/Reviews
   •  Learning to Rank
   •  Social Graph
Queries
   •  Phrases
   •  NLP
Minding the Intersection
(Diagram: Search, Analytics, Discovery)
Benefits
§  End users
   •  Better relevance/conversion
   •  Serendipity
   •  Better/faster insight


§  Business:
   •    ROI
   •    Awareness across organization
   •    Enablement
   •    Agility
Needs
§  Fast, efficient, scalable search
§  Large scale, cost effective storage
§  Processing Power:
   •  Large scale distributed for whole data consumption
   •  Streaming/In Memory for real time needs
   •  Ability to learn


§  Willingness to ask questions
The Good News
Search
§  Good, scalable search is a given
   •  Talks: Chitouras, Sturlese, Binns, Miller


§  Custom Relevancy via function queries, boosts
§  Explore other relevance models
   •  Talks: Muir, Pugh
   •  Lucene/Solr trunk has pluggable scoring (BM25, etc.)


§  NRT for timeliness
   •  Talks: Busch
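
To make the "pluggable scoring (BM25, etc.)" bullet above concrete, here is a minimal pure-Python sketch of BM25 scoring. It is not Lucene's or Solr's code: the function name `bm25_scores`, the toy documents, and the defaults `k1=1.2`, `b=0.75` are assumptions chosen for illustration, and the IDF uses the common "log(1 + ...)" variant that stays positive for frequent terms.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25.

    `docs` is a list of token lists. k1 and b are the usual BM25 tuning knobs;
    the defaults here are illustrative assumptions, not values from the talk.
    """
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, already tokenized.
docs = [
    ["lucene", "search"],
    ["big", "data", "search", "search"],
    ["basketball"],
]
scores = bm25_scores(["lucene"], docs)
```

Only the first toy document contains "lucene", so it is the only one with a non-zero score. Swapping in a different relevance model would mean replacing the body of the inner loop, which is roughly the role a pluggable scoring interface plays.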
Discovery
Facets
  •  Talks: Yonik
  •  Classification, Taxonomy
Clustering
  •  Talk: Frank S.
Suggestions
  •  Auto-suggest, Spelling,
     More Like This,
     Related Searches, search trails
Visualization
Analytics
Analytics for End Users
Offline
   •    Popularity/Click
   •    Link Analysis
   •    Search Trails
   •    Recommendations
   •    Spellchecking weights
   •    Collocations
Online
   •    Trends/Stats
   •    Social/Personal
   •    Location


                                         STORM
Analytics for Internal Users
Offline
   •    Top X
   •    Zero results
   •    MRR, MAP
   •    User segmentation
   •    Location, conversions
   •    Ad hoc Analysis
Online
   •    Trends
   •    Operational alerts (QPS, DPS, etc.)
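
The MRR and MAP entries above refer to Mean Reciprocal Rank and Mean Average Precision, two standard offline relevance metrics. The sketch below shows how they might be computed from judged queries. The helper names and the toy judgments are hypothetical; in practice the input would come from your own query logs and relevance judgments.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids, relevant_ids):
    """Average the precision at each rank where a relevant result appears."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    # Dividing by all relevant documents penalizes relevant results never returned.
    return precision_sum / len(relevant_ids)


def mean_over_queries(metric, judged_queries):
    """Average a per-query metric over (ranked_ids, relevant_ids) pairs."""
    if not judged_queries:
        return 0.0
    total = sum(metric(ranked, relevant) for ranked, relevant in judged_queries)
    return total / len(judged_queries)


# Hypothetical judgments: (results in ranked order, set of relevant document ids).
judged = [
    (["d1", "d2", "d3"], {"d2"}),
    (["d4", "d5"], {"d4", "d5"}),
    (["d6"], {"d9"}),
]
mrr = mean_over_queries(reciprocal_rank, judged)
map_score = mean_over_queries(average_precision, judged)
```

A query whose results contain no relevant document contributes zero to both metrics, so a high zero-results or no-click rate drags the averages down.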


                                   GIRAPH
What’s Missing?
§  The glue is up to you (us?)
   •    Lucene Index -> Pig/Others
   •    Mahout -> Pig/Others
   •    Mahout -> Lucene/Solr
   •    Logs -> Pig/Others


§  Nice to have:
   •  More in-index functionality (that performs)
         §  Aggregations
         §  Arbitrary stats
         §  Complex Joins
What’s Next?

“I can have all the data I want to have – but I still
  have to communicate it to our players. It has to
  get into their minds. And they have to utilize it. ”
        Brad Stevens, Head Basketball Coach,
     Butler University in Oct. ‘11 McKinsey Quarterly
Thanks!


§  http://www.lucidimagination.com

§  @gsingers

§  grant@lucidimagination.com

§  stump@lucene-eurocon.com
Lucene Ecosystem
(Diagram: Spark, Storm, Giraph)