SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
1©MapR Technologies - Confidential
Crowd Sourcing Reflected Intelligence
Using Search and Big Data
Ted Dunning, MapR
Grant Ingersoll, LucidWorks
2©MapR Technologies - Confidential
Is Search Enough?
• Keyword search is a commodity
• Holistic view of the data and the
user interactions with that data are
critical
• Search, Discovery and Analytics are
the key to unlocking this view of
users and data
Search
MapR and LucidWorks
3©MapR Technologies - Confidential
Grant’s Background
 Co-founder:
– LucidWorks – CTO
– Apache Mahout
 Long time Lucene/Solr committer
 Author: Taming Text
 Background in IR and NLP
– Built CLIR, QA and a variety of other search-based apps
4©MapR Technologies - Confidential
Ted’s Background
 Academia, Startups
– MapR, MusicMatch, ID Analytics, Veoh Networks
– Big data since before big
 Open source
– since the dark ages before the internet
– Mahout, Zookeeper, Drill
– bought the beer at first HUG
 Co-Author
– Mahout in Action
 MapR
– Chief Application Architect
 Founding member of Apache Drill
5©MapR Technologies - Confidential
Agenda
 Intro
 Search (R)evolution
 Reflected Intelligence Use Cases
 Building a Next Generation Search and Discovery Platform
– MapR
– LucidWorks
 1+1=3
6©MapR Technologies - Confidential
User Interactions With Big Data
Data
Data
Data
DFS
Key Value
Store
Index
Command
Line
Query
Language
Keyword
Search
System
Administrator
Engineer
End User
7©MapR Technologies - Confidential
User Interactions With Big Data
Data
Data
Data
DFS
Key Value
Store
Index
Command
Line
Query
Language
Keyword
Search
System
Administrator
Engineer
End User
Reflected
Intelligence
8©MapR Technologies - Confidential
Search (R)evolution
 Search use leads to search abuse
– denormalization frees your mind
– scoring is just a sparse matrix multiply
 Lucene/Solr evolution
– non free text usages abound
– many DB-like features
– noSQL before NoSQL was cool
– flexible indexing
– finite State Transducers FTW!
 Scale
 “This ain’t your father’s relevance anymore”
9©MapR Technologies - Confidential
Search, Discovery and Analytics
 Large-scale analysis is key to reflected intelligence
– correlation analysis
• based on queries, clicks, mouse tracks,
even explicit feedback
• produce clusters, trends, topics, SIP’s
– start with engineered knowledge,
refine with user feedback
 Large-scale discovery features
encourage experimentation
 Always test, always enrich!
Search
DiscoveryAnalytics
10©MapR Technologies - Confidential
Social Media Analysis in Telecom
 Goal
– Detect flash-mob traffic events
– Provision additional resources before failures
 Method: Correlate mobile traffic analysis with social media analysis
– events cause traffic micro-bursts
– participants tweet the events ahead of time
– tweet locations converge on burst location
 Deploy operations faster to predict outages and better handle
emergency situations
– high cost bandwidth augmentation can be marshaled as the traffic appears
– anticipation beats reaction
11©MapR Technologies - Confidential
Provenance is 80% of Value
 Problem
– Broadcasters don’t know what audiences really like at a micro level
 Method:
– Analysis of social media to determine advertising reach and response
– Time resolution of social traffic can provide detailed response metrics
 Results:
– In one case the untargeted advertising was worth 5x more if with
supporting response data
12©MapR Technologies - Confidential
Claims Analysis
 Goal
– Insurance claims processing and analysis
– fraud analysis
 Method
– Combine free text search with metadata analysis to identify high risk
activities across the country
– Integrate with corporate workflows to detect and fix outliers in customer
relations
 Results
– Questions that took 24-48 hours now take seconds to answer
13©MapR Technologies - Confidential
Virginia Tech - Help the World
 Grab data around crisis
 Search immediately
 Large-scale analysis enriches data to find
ways to improve responses and
understanding
 http://www.ctrnet.net
14©MapR Technologies - Confidential
Can Search Catch the Bad Guys?
 Online Drug Counterfeit
detection
 Identify commonly used language
indicating counterfeits
– you know it when you see it
– and you know you have seen it
 Feed to analyst via search-driven
application
– enrich based on analysts feedback
15©MapR Technologies - Confidential
Veoh - Cross Recommendations
 Cross recommendation as search
– with search used to build cross recommendation!
 Recommend content to people who exhibit certain behaviors
(clicks, query terms, other)
 (Ab)use of a search engine
– but not as a search engine for content
– more like a search engine for behavior
16©MapR Technologies - Confidential
Recommendation Basics
 History:
User Thing
1 3
2 4
3 4
2 3
3 2
1 1
2 1
17©MapR Technologies - Confidential
Recommendation Basics
 History as matrix:
 t1+t3 cooccur 2 times, t1+t4 once, t2+t4 once
t1 t2 t3 t4
u1 1 0 1 0
u2 1 0 1 1
u3 0 1 0 1
18©MapR Technologies - Confidential
Recommendation Basics
 Coocurrence
t1 t2 t3 t4
t1 2 0 2 1
t2 0 1 0 1
t3 2 0 1 1
t4 1 1 1 2
t3 not t3
t1 2 1
not t1 1 1
21©MapR Technologies - Confidential
Spot the Anomaly
 Root LLR is roughly like standard deviations
A not A
B 13 1000
not B 1000 100,000
A not A
B 1 0
not B 0 2
A not A
B 1 0
not B 0 10,000
A not A
B 10 0
not B 0 100,000
0.44 0.98
2.26 7.15
22©MapR Technologies - Confidential
Threshold by Score
 Coocurrence
t1 t2 t3 t4
t1 2 0 2 1
t2 0 1 0 1
t3 2 0 1 1
t4 1 1 1 2
23©MapR Technologies - Confidential
Threshold by Score
 Indicators
t1 t2 t3 t4
t1 1 0 0 1
t2 0 1 0 1
t3 0 0 1 1
t4 1 0 0 1
indicator: t2, t4
24©MapR Technologies - Confidential
Search Engine for Reflected Intelligence
 Map-reduce “big data” part
– Logs record user + item occurrence
– Group by user to get rows of occurrence matrix
– Self-join to get cooccurrence
– Log-likelihood test to find anomalies
 Search part
– Anomalous cooccurrences are indicators
(or use statistical scores to provide fancy boosts)
– Indicator fields and other meta-data are indexed
– Recommendation implemented using a single search
– Boosts, functions, similarity also can reflect learned behavior
25©MapR Technologies - Confidential
What Platform Do You Need?
 Fast, efficient, scalable search
– bulk and near real-time indexing
– handle billions of records with sub-second search and faceting
 Large scale, cost effective storage and processing capabilities
 NLP and machine learning tools that scale to enhance discovery
and analysis
 Integrated log analysis workflows that close the loop between the
raw data and user interactions
26©MapR Technologies - Confidential
Shards
1
2
3 N
Search View
•Documents
•Users
•Logs
Document
Store
Analytic
Services
View into
numeric/histor
ic data
Classification
Recommendation
Personalization &
Machine Learning
Services
Classification Models
In memory
Replicated
Multi-tenant
Discovery &
Enrichment
Clustering,
classification, NLP,
topic identification,
search log analysis,
user behavior
Content Acquisition
ETL, batch or near
real-time
Access APIs
Data
• LucidWorks Search
connectors
• Push
Reference Architecture
27©MapR Technologies - Confidential
MapR
 MapR provides the technology leading Hadoop distribution
– full eco-system distribution
– integrated data platform
– complete solution for data integrity
 MapR clusters also provide tight integration with search
technologies like LucidWorks
– integration is key for effective ops
28©MapR Technologies - Confidential
LucidWorks
 LucidWorks provides the leading packaging of Apache Lucene and
Solr
– build your own, we support
– founded by the most prominent Lucene/Solr experts
 LucidWorks Search
– “Solr++”
• UI, REST API, MapR connectors, relevance tools, much more
 LucidWorks Big Data
– Big Data as a Service
– Integrated LucidWorks Search, Hadoop, machine learning with prebuilt
workflows for many of these tasks
29©MapR Technologies - Confidential
LucidWorks Big Data
Inputs
API
MgmtSearch, Discovery, Analytics
Processing & Storage
Analytics Service Document Service
Big Data LucidWorks Web HDFS
Admin
Service
Mgmt
Data
Mgmt
Provisioning, Monitoring & Configuration
30©MapR Technologies - Confidential
Easy Wins
 Analyze logs from application stored in MapR
 Seamlessly store search indexes in MapR
– and feed to Pig, Mahout and others
– use mirrors + NFS to directly deploy indexes
 Snapshots make backups a snap
– A lot like version control for data
 LucidWorks 2.5.2 easily connects with MapR
31©MapR Technologies - Confidential
1 + 1 = 3
32©MapR Technologies - Confidential
Learn More
 Talk to Ted
@ted_dunning
tdunning@maprtech.com
 Talk to Grant
@gsingers
grant@lucidworks.com
 MapR and Lucid Works
http://www.mapr.com
http://www.lucidworks.com
Hash Tags
#mapr #lucenesolr #lucidworks

Contenu connexe

Tendances

Basic Sentiment Analysis using Hive
Basic Sentiment Analysis using HiveBasic Sentiment Analysis using Hive
Basic Sentiment Analysis using HiveQubole
 
Realtime Sentiment Analysis Application Using Hadoop and HBase
Realtime Sentiment Analysis Application Using Hadoop and HBaseRealtime Sentiment Analysis Application Using Hadoop and HBase
Realtime Sentiment Analysis Application Using Hadoop and HBaseDataWorks Summit
 
Machine Learning Approaches for Crime Pattern Detection
Machine Learning Approaches for Crime Pattern DetectionMachine Learning Approaches for Crime Pattern Detection
Machine Learning Approaches for Crime Pattern DetectionAPNIC
 
Beyond Matching: Applying Data Science Techniques to IOC-based Detection
Beyond Matching: Applying Data Science Techniques to IOC-based DetectionBeyond Matching: Applying Data Science Techniques to IOC-based Detection
Beyond Matching: Applying Data Science Techniques to IOC-based DetectionAlex Pinto
 
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...Alex Pinto
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingAlex Pinto
 
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...Alex Pinto
 
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...Alex Pinto
 
Tools for Unstructured Data Analytics
Tools for Unstructured Data AnalyticsTools for Unstructured Data Analytics
Tools for Unstructured Data AnalyticsRavi Teja
 
Foresight conversation
Foresight conversationForesight conversation
Foresight conversationsuresh sood
 
Whooster for Law Enforcement
Whooster for Law EnforcementWhooster for Law Enforcement
Whooster for Law EnforcementMelanie Scharton
 
Observing real world phenomena through event web
Observing real world phenomena through event webObserving real world phenomena through event web
Observing real world phenomena through event webSiripen Pongpaichet
 
Big data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersBig data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersRuhollah Farchtchi
 
How To Drive Value with Security Data
How To Drive Value with Security DataHow To Drive Value with Security Data
How To Drive Value with Security DataRaffael Marty
 
Law Enforcement - Overview
Law Enforcement - OverviewLaw Enforcement - Overview
Law Enforcement - OverviewG2 Research Ltd.
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Alex Pinto
 
Social Life Networks (Eventshop and Personal Event Shop)
Social Life Networks (Eventshop and Personal Event Shop)Social Life Networks (Eventshop and Personal Event Shop)
Social Life Networks (Eventshop and Personal Event Shop)Siripen Pongpaichet
 
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Alex Pinto
 
Advanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time SpeedAdvanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time Speeddanpotterdwch
 

Tendances (20)

Basic Sentiment Analysis using Hive
Basic Sentiment Analysis using HiveBasic Sentiment Analysis using Hive
Basic Sentiment Analysis using Hive
 
Realtime Sentiment Analysis Application Using Hadoop and HBase
Realtime Sentiment Analysis Application Using Hadoop and HBaseRealtime Sentiment Analysis Application Using Hadoop and HBase
Realtime Sentiment Analysis Application Using Hadoop and HBase
 
Machine Learning Approaches for Crime Pattern Detection
Machine Learning Approaches for Crime Pattern DetectionMachine Learning Approaches for Crime Pattern Detection
Machine Learning Approaches for Crime Pattern Detection
 
Beyond Matching: Applying Data Science Techniques to IOC-based Detection
Beyond Matching: Applying Data Science Techniques to IOC-based DetectionBeyond Matching: Applying Data Science Techniques to IOC-based Detection
Beyond Matching: Applying Data Science Techniques to IOC-based Detection
 
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
 
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...
Data-Driven Threat Intelligence: Useful Methods and Measurements for Handling...
 
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
 
Tools for Unstructured Data Analytics
Tools for Unstructured Data AnalyticsTools for Unstructured Data Analytics
Tools for Unstructured Data Analytics
 
Foresight conversation
Foresight conversationForesight conversation
Foresight conversation
 
Whooster for Law Enforcement
Whooster for Law EnforcementWhooster for Law Enforcement
Whooster for Law Enforcement
 
EventShop ISG talk 140213
EventShop ISG talk 140213EventShop ISG talk 140213
EventShop ISG talk 140213
 
Observing real world phenomena through event web
Observing real world phenomena through event webObserving real world phenomena through event web
Observing real world phenomena through event web
 
Big data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersBig data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makers
 
How To Drive Value with Security Data
How To Drive Value with Security DataHow To Drive Value with Security Data
How To Drive Value with Security Data
 
Law Enforcement - Overview
Law Enforcement - OverviewLaw Enforcement - Overview
Law Enforcement - Overview
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
 
Social Life Networks (Eventshop and Personal Event Shop)
Social Life Networks (Eventshop and Personal Event Shop)Social Life Networks (Eventshop and Personal Event Shop)
Social Life Networks (Eventshop and Personal Event Shop)
 
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
 
Advanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time SpeedAdvanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time Speed
 

En vedette

Lucene solr 4 spatial extended deep dive
Lucene solr 4 spatial   extended deep diveLucene solr 4 spatial   extended deep dive
Lucene solr 4 spatial extended deep divelucenerevolution
 
Brahe mass scale flexible indexing
Brahe   mass scale flexible indexingBrahe   mass scale flexible indexing
Brahe mass scale flexible indexinglucenerevolution
 
Next generation electronic medical records and search a test implementation i...
Next generation electronic medical records and search a test implementation i...Next generation electronic medical records and search a test implementation i...
Next generation electronic medical records and search a test implementation i...lucenerevolution
 
Cms integration of apache solr how we did it.
Cms integration of apache solr   how we did it.Cms integration of apache solr   how we did it.
Cms integration of apache solr how we did it.lucenerevolution
 
Advanced query parsing techniques
Advanced query parsing techniquesAdvanced query parsing techniques
Advanced query parsing techniqueslucenerevolution
 
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote   Yonik Seeley & Steve Rowe lucene solr roadmapKeynote   Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote Yonik Seeley & Steve Rowe lucene solr roadmaplucenerevolution
 
Implementing a custom search syntax using solr, lucene & parboiled
Implementing a custom search syntax using solr, lucene & parboiledImplementing a custom search syntax using solr, lucene & parboiled
Implementing a custom search syntax using solr, lucene & parboiledlucenerevolution
 
Semantic search in the cloud
Semantic search in the cloudSemantic search in the cloud
Semantic search in the cloudlucenerevolution
 
Migration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshMigration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshlucenerevolution
 
How to Access Your Library Book Collections Using Solr
How to Access Your Library Book Collections Using SolrHow to Access Your Library Book Collections Using Solr
How to Access Your Library Book Collections Using Solrlucenerevolution
 
Updateable Fields in Lucene and other Codec Applications
Updateable Fields in Lucene and other Codec ApplicationsUpdateable Fields in Lucene and other Codec Applications
Updateable Fields in Lucene and other Codec Applicationslucenerevolution
 
Television news search and analysis with lucene solr
Television news search and analysis with lucene solrTelevision news search and analysis with lucene solr
Television news search and analysis with lucene solrlucenerevolution
 
How to Gain Greater Business Intelligence from Lucene/Solr
How to Gain Greater Business Intelligence from Lucene/SolrHow to Gain Greater Business Intelligence from Lucene/Solr
How to Gain Greater Business Intelligence from Lucene/Solrlucenerevolution
 
Things Made Easy: One Click CMS Integration with Solr & Drupal
Things Made Easy: One Click CMS Integration with Solr & DrupalThings Made Easy: One Click CMS Integration with Solr & Drupal
Things Made Easy: One Click CMS Integration with Solr & Drupallucenerevolution
 
Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr lucenerevolution
 
Search with Polygons: Another Approach to Solr Geospatial Search
Search with Polygons: Another Approach to Solr Geospatial SearchSearch with Polygons: Another Approach to Solr Geospatial Search
Search with Polygons: Another Approach to Solr Geospatial Searchlucenerevolution
 
Lectura Comprensiva - Anexo 12
Lectura Comprensiva - Anexo 12Lectura Comprensiva - Anexo 12
Lectura Comprensiva - Anexo 12Jose luis Meza
 
Estatutos 2012
Estatutos 2012Estatutos 2012
Estatutos 2012fondiscol
 
Senior BI -Consultant -Sumit Kumar Thakur
Senior BI -Consultant -Sumit Kumar ThakurSenior BI -Consultant -Sumit Kumar Thakur
Senior BI -Consultant -Sumit Kumar ThakurSumit thakur
 

En vedette (20)

Lucene solr 4 spatial extended deep dive
Lucene solr 4 spatial   extended deep diveLucene solr 4 spatial   extended deep dive
Lucene solr 4 spatial extended deep dive
 
Brahe mass scale flexible indexing
Brahe   mass scale flexible indexingBrahe   mass scale flexible indexing
Brahe mass scale flexible indexing
 
Next generation electronic medical records and search a test implementation i...
Next generation electronic medical records and search a test implementation i...Next generation electronic medical records and search a test implementation i...
Next generation electronic medical records and search a test implementation i...
 
Cms integration of apache solr how we did it.
Cms integration of apache solr   how we did it.Cms integration of apache solr   how we did it.
Cms integration of apache solr how we did it.
 
Advanced query parsing techniques
Advanced query parsing techniquesAdvanced query parsing techniques
Advanced query parsing techniques
 
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote   Yonik Seeley & Steve Rowe lucene solr roadmapKeynote   Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
 
Implementing a custom search syntax using solr, lucene & parboiled
Implementing a custom search syntax using solr, lucene & parboiledImplementing a custom search syntax using solr, lucene & parboiled
Implementing a custom search syntax using solr, lucene & parboiled
 
Semantic search in the cloud
Semantic search in the cloudSemantic search in the cloud
Semantic search in the cloud
 
Migration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshMigration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntosh
 
How to Access Your Library Book Collections Using Solr
How to Access Your Library Book Collections Using SolrHow to Access Your Library Book Collections Using Solr
How to Access Your Library Book Collections Using Solr
 
Updateable Fields in Lucene and other Codec Applications
Updateable Fields in Lucene and other Codec ApplicationsUpdateable Fields in Lucene and other Codec Applications
Updateable Fields in Lucene and other Codec Applications
 
Television news search and analysis with lucene solr
Television news search and analysis with lucene solrTelevision news search and analysis with lucene solr
Television news search and analysis with lucene solr
 
How to Gain Greater Business Intelligence from Lucene/Solr
How to Gain Greater Business Intelligence from Lucene/SolrHow to Gain Greater Business Intelligence from Lucene/Solr
How to Gain Greater Business Intelligence from Lucene/Solr
 
Things Made Easy: One Click CMS Integration with Solr & Drupal
Things Made Easy: One Click CMS Integration with Solr & DrupalThings Made Easy: One Click CMS Integration with Solr & Drupal
Things Made Easy: One Click CMS Integration with Solr & Drupal
 
Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr
 
Search with Polygons: Another Approach to Solr Geospatial Search
Search with Polygons: Another Approach to Solr Geospatial SearchSearch with Polygons: Another Approach to Solr Geospatial Search
Search with Polygons: Another Approach to Solr Geospatial Search
 
Lectura Comprensiva - Anexo 12
Lectura Comprensiva - Anexo 12Lectura Comprensiva - Anexo 12
Lectura Comprensiva - Anexo 12
 
cv-chrif
cv-chrifcv-chrif
cv-chrif
 
Estatutos 2012
Estatutos 2012Estatutos 2012
Estatutos 2012
 
Senior BI -Consultant -Sumit Kumar Thakur
Senior BI -Consultant -Sumit Kumar ThakurSenior BI -Consultant -Sumit Kumar Thakur
Senior BI -Consultant -Sumit Kumar Thakur
 

Similaire à Crowd sourced intelligence built into search over hadoop

MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR Technologies
 
Hadoop Summit EU - Crowd Sourcing Reflected Intelligence
Hadoop Summit EU - Crowd Sourcing Reflected IntelligenceHadoop Summit EU - Crowd Sourcing Reflected Intelligence
Hadoop Summit EU - Crowd Sourcing Reflected IntelligenceMapR Technologies
 
Harnessing Big Data_UCLA
Harnessing Big Data_UCLAHarnessing Big Data_UCLA
Harnessing Big Data_UCLAPaul Barsch
 
MapR LucidWorks Joint Webinar 121211
MapR LucidWorks Joint Webinar 121211MapR LucidWorks Joint Webinar 121211
MapR LucidWorks Joint Webinar 121211MapR Technologies
 
MapR lucidworks joint webinar
MapR lucidworks joint webinarMapR lucidworks joint webinar
MapR lucidworks joint webinarTed Dunning
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormRevolution Analytics
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseSoftServe
 
Crowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopCrowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopGrant Ingersoll
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAjaved75
 
Big data beyond the hype may 2014
Big data beyond the hype may 2014Big data beyond the hype may 2014
Big data beyond the hype may 2014bigdatagurus_meetup
 
18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar 18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar Revolution Analytics
 
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Sarah Aerni
 
Buzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsBuzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsMapR Technologies
 
Buzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationBuzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationTed Dunning
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation TechnTed Dunning
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesKimberley Mitchell
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopDataWorks Summit
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentationTao Feng
 

Similaire à Crowd sourced intelligence built into search over hadoop (20)

MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012
 
Hadoop Summit EU - Crowd Sourcing Reflected Intelligence
Hadoop Summit EU - Crowd Sourcing Reflected IntelligenceHadoop Summit EU - Crowd Sourcing Reflected Intelligence
Hadoop Summit EU - Crowd Sourcing Reflected Intelligence
 
Harnessing Big Data_UCLA
Harnessing Big Data_UCLAHarnessing Big Data_UCLA
Harnessing Big Data_UCLA
 
MapR LucidWorks Joint Webinar 121211
MapR LucidWorks Joint Webinar 121211MapR LucidWorks Joint Webinar 121211
MapR LucidWorks Joint Webinar 121211
 
MapR lucidworks joint webinar
MapR lucidworks joint webinarMapR lucidworks joint webinar
MapR lucidworks joint webinar
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and Storm
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
 
Crowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopCrowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and Hadoop
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATA
 
Big data beyond the hype may 2014
Big data beyond the hype may 2014Big data beyond the hype may 2014
Big data beyond the hype may 2014
 
18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar 18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar
 
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
 
Polyvalent Recommendations
Polyvalent RecommendationsPolyvalent Recommendations
Polyvalent Recommendations
 
Buzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsBuzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal Recommendations
 
Buzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationBuzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendation
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation Techn
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
 
Zero Touch Analytics
Zero Touch AnalyticsZero Touch Analytics
Zero Touch Analytics
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over Hadoop
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentation
 

Plus de lucenerevolution

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucenelucenerevolution
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! lucenerevolution
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solrlucenerevolution
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloudlucenerevolution
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusterslucenerevolution
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs lucenerevolution
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?lucenerevolution
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APIlucenerevolution
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenallucenerevolution
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside downlucenerevolution
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
 

Plus de lucenerevolution (20)

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
 
Search at Twitter
Search at TwitterSearch at Twitter
Search at Twitter
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 

Dernier

Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 

Dernier (20)

Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 

Crowd sourced intelligence built into search over hadoop

  • 1. 1©MapR Technologies - Confidential Crowd Sourcing Reflected Intelligence Using Search and Big Data Ted Dunning, MapR Grant Ingersoll, LucidWorks
  • 2. 2©MapR Technologies - Confidential Is Search Enough? • Keyword search is a commodity • Holistic view of the data and the user interactions with that data are critical • Search, Discovery and Analytics are the key to unlocking this view of users and data Search MapR and LucidWorks
  • 3. 3©MapR Technologies - Confidential Grant’s Background  Co-founder: – LucidWorks – CTO – Apache Mahout  Long time Lucene/Solr committer  Author: Taming Text  Background in IR and NLP – Built CLIR, QA and a variety of other search-based apps
  • 4. 4©MapR Technologies - Confidential Ted’s Background  Academia, Startups – MapR, MusicMatch, ID Analytics, Veoh Networks – Big data since before big  Open source – since the dark ages before the internet – Mahout, Zookeeper, Drill – bought the beer at first HUG  Co-Author – Mahout in Action  MapR – Chief Application Architect  Founding member of Apache Drill
  • 5. 5©MapR Technologies - Confidential Agenda  Intro  Search (R)evolution  Reflected Intelligence Use Cases  Building a Next Generation Search and Discovery Platform – MapR – LucidWorks  1+1=3
  • 6. 6©MapR Technologies - Confidential User Interactions With Big Data Data Data Data DFS Key Value Store Index Command Line Query Language Keyword Search System Administrator Engineer End User
  • 7. 7©MapR Technologies - Confidential User Interactions With Big Data Data Data Data DFS Key Value Store Index Command Line Query Language Keyword Search System Administrator Engineer End User Reflected Intelligence
  • 8. 8©MapR Technologies - Confidential Search (R)evolution  Search use leads to search abuse – denormalization frees your mind – scoring is just a sparse matrix multiply  Lucene/Solr evolution – non free text usages abound – many DB-like features – noSQL before NoSQL was cool – flexible indexing – finite State Transducers FTW!  Scale  “This ain’t your father’s relevance anymore”
  • 9. 9©MapR Technologies - Confidential Search, Discovery and Analytics  Large-scale analysis is key to reflected intelligence – correlation analysis • based on queries, clicks, mouse tracks, even explicit feedback • produce clusters, trends, topics, SIP’s – start with engineered knowledge, refine with user feedback  Large-scale discovery features encourage experimentation  Always test, always enrich! Search DiscoveryAnalytics
  • 10. 10©MapR Technologies - Confidential Social Media Analysis in Telecom  Goal – Detect flash-mob traffic events – Provision additional resources before failures  Method: Correlate mobile traffic analysis with social media analysis – events cause traffic micro-bursts – participants tweet the events ahead of time – tweet locations converge on burst location  Deploy operations faster to predict outages and better handle emergency situations – high cost bandwidth augmentation can be marshaled as the traffic appears – anticipation beats reaction
  • 11. 11©MapR Technologies - Confidential Provenance is 80% of Value  Problem – Broadcasters don’t know what audiences really like at a micro level  Method: – Analysis of social media to determine advertising reach and response – Time resolution of social traffic can provide detailed response metrics  Results: – In one case the untargeted advertising was worth 5x more if with supporting response data
  • 12. 12©MapR Technologies - Confidential Claims Analysis  Goal – Insurance claims processing and analysis – fraud analysis  Method – Combine free text search with metadata analysis to identify high risk activities across the country – Integrate with corporate workflows to detect and fix outliers in customer relations  Results – Questions that took 24-48 hours now take seconds to answer
  • 13. 13©MapR Technologies - Confidential Virginia Tech - Help the World  Grab data around crisis  Search immediately  Large-scale analysis enriches data to find ways to improve responses and understanding  http://www.ctrnet.net
  • 14. 14©MapR Technologies - Confidential Can Search Catch the Bad Guys?  Online Drug Counterfeit detection  Identify commonly used language indicating counterfeits – you know it when you see it – and you know you have seen it  Feed to analyst via search-driven application – enrich based on analysts feedback
  • 15. 15©MapR Technologies - Confidential Veoh - Cross Recommendations  Cross recommendation as search – with search used to build cross recommendation!  Recommend content to people who exhibit certain behaviors (clicks, query terms, other)  (Ab)use of a search engine – but not as a search engine for content – more like a search engine for behavior
  • 16. 16©MapR Technologies - Confidential Recommendation Basics  History: User Thing 1 3 2 4 3 4 2 3 3 2 1 1 2 1
  • 17. 17©MapR Technologies - Confidential Recommendation Basics  History as matrix:  t1+t3 cooccur 2 times, t1+t4 once, t2+t4 once t1 t2 t3 t4 u1 1 0 1 0 u2 1 0 1 1 u3 0 1 0 1
  • 18. 18©MapR Technologies - Confidential Recommendation Basics  Coocurrence t1 t2 t3 t4 t1 2 0 2 1 t2 0 1 0 1 t3 2 0 1 1 t4 1 1 1 2 t3 not t3 t1 2 1 not t1 1 1
  • 19. 21©MapR Technologies - Confidential Spot the Anomaly  Root LLR is roughly like standard deviations A not A B 13 1000 not B 1000 100,000 A not A B 1 0 not B 0 2 A not A B 1 0 not B 0 10,000 A not A B 10 0 not B 0 100,000 0.44 0.98 2.26 7.15
  • 20. 22©MapR Technologies - Confidential Threshold by Score  Coocurrence t1 t2 t3 t4 t1 2 0 2 1 t2 0 1 0 1 t3 2 0 1 1 t4 1 1 1 2
  • 21. 23©MapR Technologies - Confidential Threshold by Score  Indicators t1 t2 t3 t4 t1 1 0 0 1 t2 0 1 0 1 t3 0 0 1 1 t4 1 0 0 1 indicator: t2, t4
  • 22. 24©MapR Technologies - Confidential Search Engine for Reflected Intelligence  Map-reduce “big data” part – Logs record user + item occurrence – Group by user to get rows of occurrence matrix – Self-join to get cooccurrence – Log-likelihood test to find anomalies  Search part – Anomalous cooccurrences are indicators (or use statistical scores to provide fancy boosts) – Indicator fields and other meta-data are indexed – Recommendation implemented using a single search – Boosts, functions, similarity also can reflect learned behavior
  • 23. 25©MapR Technologies - Confidential What Platform Do You Need?  Fast, efficient, scalable search – bulk and near real-time indexing – handle billions of records with sub-second search and faceting  Large scale, cost effective storage and processing capabilities  NLP and machine learning tools that scale to enhance discovery and analysis  Integrated log analysis workflows that close the loop between the raw data and user interactions
  • 24. 26©MapR Technologies - Confidential Shards 1 2 3 N Search View •Documents •Users •Logs Document Store Analytic Services View into numeric/histor ic data Classification Recommendation Personalization & Machine Learning Services Classification Models In memory Replicated Multi-tenant Discovery & Enrichment Clustering, classification, NLP, topic identification, search log analysis, user behavior Content Acquisition ETL, batch or near real-time Access APIs Data • LucidWorks Search connectors • Push Reference Architecture
  • 25. 27©MapR Technologies - Confidential MapR  MapR provides the technology leading Hadoop distribution – full eco-system distribution – integrated data platform – complete solution for data integrity  MapR clusters also provide tight integration with search technologies like LucidWorks – integration is key for effective ops
  • 26. 28©MapR Technologies - Confidential LucidWorks  LucidWorks provides the leading packaging of Apache Lucene and Solr – build your own, we support – founded by the most prominent Lucene/Solr experts  LucidWorks Search – “Solr++” • UI, REST API, MapR connectors, relevance tools, much more  LucidWorks Big Data – Big Data as a Service – Integrated LucidWorks Search, Hadoop, machine learning with prebuilt workflows for many of these tasks
  • 27. 29©MapR Technologies - Confidential LucidWorks Big Data Inputs API MgmtSearch, Discovery, Analytics Processing & Storage Analytics Service Document Service Big Data LucidWorks Web HDFS Admin Service Mgmt Data Mgmt Provisioning, Monitoring & Configuration
  • 28. 30©MapR Technologies - Confidential Easy Wins  Analyze logs from application stored in MapR  Seamlessly store search indexes in MapR – and feed to Pig, Mahout and others – use mirrors + NFS to directly deploy indexes  Snapshots make backups a snap – A lot like version control for data  LucidWorks 2.5.2 easily connects with MapR
  • 29. 31©MapR Technologies - Confidential 1 + 1 = 3
  • 30. 32©MapR Technologies - Confidential Learn More  Talk to Ted @ted_dunning tdunning@maprtech.com  Talk to Grant @gsingers grant@lucidworks.com  MapR and Lucid Works http://www.mapr.com http://www.lucidworks.com Hash Tags #mapr #lucenesolr #lucidworks