Searching with Solr: When, Why and How? By Paul Matthews
86p @paulmatthews86 86p.paul-matthews.co.uk pmatthews@ibuildings.com techportal.ibuildings.com Projects: Travel companies Media corporations 1
Searching… What? When? Why? How? 2
Searching… What? When? Why? How? 3
What is search? Text navigation Customers describing Sorting Examples Quick search Category listings 4
The power of search 5
Database Like 6
Database Like Very little effort A very basic search Poor at: > 1 word 7
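To make the "Poor at: > 1 word" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `talks` table and its rows are made up for illustration and are not from the talk. A single-word LIKE finds the row, but the same words in a different order match nothing, because LIKE only matches the literal substring.

```python
import sqlite3


def like_search(conn, phrase):
    """Naive search: wrap the user's phrase in % wildcards and use LIKE."""
    cur = conn.execute(
        "SELECT title FROM talks WHERE title LIKE ?",
        ("%" + phrase + "%",),
    )
    return [row[0] for row in cur]


# Hypothetical in-memory data set, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE talks (title TEXT)")
conn.executemany(
    "INSERT INTO talks VALUES (?)",
    [("Searching with Solr",), ("Brave new world of HTML5",)],
)

one_word = like_search(conn, "solr")             # finds the Solr talk
two_words = like_search(conn, "solr searching")  # finds nothing
```

Working around this in application code usually means splitting the phrase and adding one LIKE clause per word, which is where the effort starts to grow.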
Database Full-Text 8
Database Full-Text Some power Convenient Feature poor Often very slow 9
Basic Search Systems 10
Basic Search Systems Rapid search Simple to setup Feature poor Accuracy Require more application code 11
Solr Search 12
Solr Search Very powerful Feature rich Relatively simple Lots of plugins (community) Overkill? Java 13
Things you need to know 14
Searching… What? When? Why? How? 15
Applicable to me? Who is Solr designed for? Traffic Features When is a good time to implement it? Creation Post-live Open Source projects 16
Business indicators Money / Time / Effort spent Bugs Tuning Features Customers 17
Development indicators Data MySQL Full Text Degradation 18
Searching… What? When? Why? How? 19
Is Solr right for me? Know your enemy With great functionality comes great responsibility 20
Data sources Database Easy API Features CSV & XML Solr Cell - Rich Documents PDF MS Office 21
Indexing Parsing Half now, half later 22
Analyzer Process documents The query gets analyzed too 23
Tokenizer 24
Tokenizer Filter Synonym 25
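As a rough illustration of the analyzer idea on the last few slides (a tokenizer followed by filters), the sketch below splits text on whitespace, lower-cases each token, drops stop words and expands synonyms. It is plain Python, not Solr's analysis code. The function names and the synonym table are assumptions made for the example; the stop words are the ones listed in the speaker notes.

```python
# Illustrative only: a tokenizer followed by three token filters.
STOP_WORDS = {"a", "if", "to", "and"}   # stop words from the speaker notes
SYNONYMS = {"tv": ["television"]}       # hypothetical synonym table


def tokenize(text):
    """Whitespace tokenizer: turn a string into a list of tokens."""
    return text.split()


def analyze(text):
    """Apply lower-case, stop-word and synonym filters to each token."""
    tokens = []
    for token in tokenize(text):
        token = token.lower()
        if token in STOP_WORDS:
            continue  # stop filter: drop the token entirely
        tokens.append(token)
        tokens.extend(SYNONYMS.get(token, []))  # synonym filter
    return tokens
```

Because the query gets analyzed too, the same `analyze` function would be applied to the user's search text so that both sides produce comparable tokens.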
Stemming Matching similar words Reduce to Stem: Searching, Search, Searches, Searched, Searchers → Search 26
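The stemming slide can be sketched with a deliberately naive suffix stripper. This is a toy written for the five example words on the slide, not the algorithm Solr uses; real stemmers such as Porter's handle far more cases. The suffix list and the minimum stem length of three characters are assumptions.

```python
# Toy stemmer: strip the first matching suffix, longest suffixes first.
SUFFIXES = ("ers", "ing", "es", "ed", "er", "s")
MIN_STEM_LENGTH = 3  # assumed guard so short words are left alone


def naive_stem(word):
    """Reduce a word to a crude stem by removing one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


words = ["Searching", "Search", "Searches", "Searched", "Searchers"]
stems = {naive_stem(w) for w in words}
```

All five words on the slide collapse to the single stem `search`, which is what lets a query for one form match documents containing another.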
Hit Highlighting “Hit” ==> “This is a <em>Hit</em> test.” 27
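A minimal sketch of what hit highlighting does to a stored field, assuming a simple whole-word, case-insensitive match. The `highlight` function and its `pre` / `post` arguments are hypothetical names; they mirror the speaker's advice to set the delimiter, since not every consumer of the result is a web page. Solr exposes comparable settings through its highlighting parameters (`hl.simple.pre` and `hl.simple.post`).

```python
import re


def highlight(text, term, pre="<em>", post="</em>"):
    """Wrap every whole-word occurrence of term in the given delimiters."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda match: pre + match.group(0) + post, text)


html_snippet = highlight("This is a Hit test.", "hit")
plain_snippet = highlight("This is a Hit test.", "hit", pre="[", post="]")
```

The original casing of the matched word is preserved, and swapping the delimiters gives a result that is safe to show in a plain-text context.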
Spell Check Spelchk Did you mean …? “flickr”  28
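The "Did you mean …?" behaviour can be approximated with Python's standard difflib module. This is a sketch of the idea only, not Solr's spell check component. The dictionary below is hypothetical; it includes "flickr" to show why names that look like typos need to be configured as valid spellings.

```python
import difflib

# Hypothetical dictionary of known terms, including a brand name.
DICTIONARY = ["spellcheck", "search", "solr", "flickr"]


def did_you_mean(word, dictionary=DICTIONARY):
    """Return suggestions for an unknown word, or [] if it is known."""
    word = word.lower()
    if word in dictionary:
        return []  # known spelling, e.g. "flickr": nothing to suggest
    return difflib.get_close_matches(word, dictionary, n=3, cutoff=0.6)


suggestions = did_you_mean("Spelchk")
known_name = did_you_mean("flickr")
```

Without "flickr" in the dictionary, the same function would start offering corrections for a perfectly valid name.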
29
By the power of Queries! Phrase: “Search for a phrase”; Wildcards: Look*familiar?; Fuzzy: fuzzy~; Proximity: “two words”~12; Range: name:{Paul TO Jeff} 30
name:paul AND location:uk A single field Multiple Fields 31
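The query types on the last two slides are all plain strings in Solr's query syntax, so sending one is mostly a matter of URL-encoding it into the `q` parameter of a select request. The sketch below assumes a local Solr at a hypothetical base URL; the exact path depends on your container and core setup, and `select_url` is an illustrative helper, not part of any client library.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute your own host, port and core path.
BASE_URL = "http://localhost:8983/solr"

# One query string per query type shown on the slides.
EXAMPLE_QUERIES = {
    "phrase": '"search for a phrase"',
    "wildcard": "Look*familiar?",
    "fuzzy": "fuzzy~",
    "proximity": '"two words"~12',
    "range": "name:{Paul TO Jeff}",
    "multiple fields": "name:paul AND location:uk",
}


def select_url(query, rows=10, base_url=BASE_URL):
    """Build a select URL that passes the query as the q parameter."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return base_url + "/select?" + urlencode(params)


url = select_url(EXAMPLE_QUERIES["multiple fields"])
```

Encoding matters here because quotes, colons, braces and spaces are all significant in the query syntax and must survive the trip through the URL.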
Faceting (21) Pre-fetching (11) Results (37) 32
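The counts in brackets on the faceting slide are per-value totals over the matching documents. A minimal sketch of that calculation in plain Python follows; the documents and the `colour` field are made up for illustration. In Solr itself the counts come back with the search response when faceting is requested (for example with the `facet=true` and `facet.field` parameters), so no application-side counting is needed.

```python
from collections import Counter


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


def filter_by(docs, field, value):
    """Narrow the result set to one facet value, as a user click would."""
    return [doc for doc in docs if doc.get(field) == value]


# Hypothetical search results, for illustration only.
results = [
    {"id": "1", "colour": "red"},
    {"id": "2", "colour": "blue"},
    {"id": "3", "colour": "red"},
    {"id": "4"},
]

counts = facet_counts(results, "colour")
red_only = filter_by(results, "colour", "red")
```

The counts drive the navigation labels, and the filter is what runs when the user picks one of them.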
Ranked Search Ordered Any field 33
Simultaneous update & search Hold on a minute! Actually, I don’t have to… 34
Searching… What? When? Why? How? 35
Flow 36
Container Choose container Make accessible http://<host>:<port>/solr/admin 37
SolrConfig Cores ~ Database Schema schema.xml ~ Schema definition 38
Fields Define the data indexed Stored Important to model accurately Tweak to achieve functionality Conscious of space and index 39
Index Create documents to Schema Spec 40
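"Create documents to schema spec" can be sketched as a small step that shapes each source row into a document before it is sent for indexing. Everything named here is an assumption for illustration: the three-field schema, the `to_document` and `to_payload` helpers, and the choice of JSON as the format. It only shows the idea of dropping fields the schema does not define and rejecting values of the wrong type.

```python
import json

# Hypothetical schema: field name mapped to the expected Python type.
SCHEMA = {"id": str, "name": str, "location": str}


def to_document(row, schema=SCHEMA):
    """Keep only fields the schema defines and check each value's type."""
    doc = {}
    for field, expected in schema.items():
        if field not in row:
            continue
        if not isinstance(row[field], expected):
            raise TypeError(f"{field} should be {expected.__name__}")
        doc[field] = row[field]
    return doc


def to_payload(rows):
    """Serialise the rows as a JSON array of documents."""
    return json.dumps([to_document(row) for row in rows])


payload = to_payload([{"id": "1", "name": "Paul", "password": "secret"}])
```

Filtering at this point keeps unwanted columns out of the index, which also helps with the space concerns raised on the Fields slide.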
Search Quick Search Default Search Advanced Search 41
Quick Search Partial words Search all fields? Required response data 42
Default Search Consider useful Analyzers Potentially match on more fields Enrich or refine results with personal data More in depth results 43
Advanced Search Offer user control Consider search storage Data size vs Additional queries To return more / less results “Search entire document” “Filter by Colour” 44
Searching… What? When? Why? How? 45
Questions? 46
We’re Hiring NL Vlissingen Utrecht UK London Sheffield Liverpool Speak to me at the end… pmatthews@ibuildings.com 47
Thank you Resources Links: http://www.delicious.com/paulm86/solr This talk: http://joind.in/3221 Contact Me: @paulmatthews86 http://about.me/paul.matthews 48

Editor's notes

  1. Twitter: @paulmatthews86. Personal blog: 86p (technical and non-tech). Software Engineer at Ibuildings. Techportal: MongoDB, Solr (May 2011). Solr projects: travel company, media company.
  2. This talk. What: what is Solr? When: when is the right time? Why: why search? How: start the journey (investigate, explain to the business, integrate). Who is this talk aimed at? Developers toying with search, using DB search, or starting with search.
  3. This talk. When: identifying the right time. Why: search benefits; the dark horse. How: start the journey (investigate, explain to the business, integrate). Who is this talk aimed at? Developers toying with search, using DB search, or starting with search.
  4. What is search? Text-based navigation to content / products. Customers describing something: capture queries. Sorting: organising content. Examples: quick search, category listing, advanced search.
  5. The Power of Search: from LIKE to Solr.
  6. First up: DB LIKE.
  7. Pros: Little effort to use or understand. Cons: Not good. User data: not greater than 1 word.
  8. Full text: lots of people use it.
  9. Pros: Some power; convenient; in the DB. Cons: Feature poor; slow.
  10. Basic / easy to use proper search.
  11. Pros: Can be very fast; often simple to set up. Cons: Feature poor; less accurate; more application code? Examples: Google Custom Search Engine (crawls the site); Xapian (simple search solution).
  12. Pros: Powerful; feature rich; relatively simple; lots of plugins. Cons: Could be overkill; different language.
  13. On Java: stand-alone; requires a servlet container (Tomcat, Jetty stand-alone). Lucene: search library; offers full text; high performance; Java, other implementations available.
  15. Who? Traffic: not for Facebook; works for average. Features: it has many; no need to use them. When? Designed in from the beginning; easily used to enrich site navigation; implementation as a post-live project; implementation into existing open source software (Drupal, Magento).
  16. Spending time / effort / money on the search box: fixing bugs, endless tuning, adding functionality. Customers complaining: not finding content, high bounce rates, site is slow, not finding the *right* content.
  17. Large data sets: 10000 records. Speed: LIKE queries; MySQL full-text; site performance; slow log? Results: inaccurate; missing. Graceful degradation: important for quality; low cost.
  19. Is Solr right for me? Before answering: Terms: find materials; communicate to people. Functionality: most use (know the functionality); re-invent (the wheel).
  20. Main 2: Database tables: Data Import Handler; easy, just config. API: anything can publish; API hooked into content. Also: CSV & XML; Solr Cell for rich docs (PDF, MS Office).
  21. Parse: text to generate the index; removes junk; improves matches. Half now, half later: reduces time spent searching.
  22. Analyzer: groups the actions of parsing. Important to do the same / similar when searching.
  23. Tokenizer: strings to tokens. Example ones: Whitespace: splits on whitespace. Keyword: strips special chars. Standard: general purpose, adds context.
  24. Transforms tokens. Lower case. Stop: filters out stop words (a, if, to, and). Standard: removes dots, ‘s (context only). Synonym.
  25. Hit highlighting: remember to set the delimiter; not everything is a web page.
  26. Spell checking: configure spellings; names (Flickr); keywords.
  27. Autocomplete: common queries.
  28. Phrase queries: "search for a phrase". Wildcard queries: match with wildcards (? single, * multiple). Fuzzy queries: Levenshtein distance; similar to word (~). Proximity queries: words close together, "two words"~12. Range queries: between two values; started:[20110101 TO 20120101] is inclusive, name:{Paul TO Jeff} is exclusive.
  29. Fields: single field to target a search; multiple fields to build queries.
  30. Faceted: set counts; filter data; multiple classifications.
  31. Ordered results based on best match, or order by any field.
  32. Simultaneous update and search
  34. Blog post to explain. Configure: container, Solr. Index: documents, from any source. Search: default search, advanced search.
  35. Container setup: choose, configure, make accessible.
  36. Define the data: define what is indexed; define what is stored. Integral to returning relevant search responses. Requires tweaking to get right. Be conscious of space: size of the index, speed.
  37. Docs to schema spec. Indexing by database or API.
  38. Partial words: analyzing? Search all fields: possibly the main ones. Response: less data; stay clear of additional queries; consider caching.
  39. Consider using stemming analyzers to return more results. Increase matching columns. Use session data to affect results; consider caching effects. More response data required.
  40. Users modify their search: specify fields. For enriching the results: consider bloated storage; tradeoff with additional queries; tweak later? Advanced for returning more / less results: search more of the document; filter on property.