Mastering Solr
   Jur de Vries
Who am I?

Developer/architect at Triquanta
Trainer at Wizzlern
Use case

Market place
Advertisements
Adjust relevancy
Paid boosting of ads
Of course we use Drupal and Apache Solr
Running Solr Locally

Download latest version (3.6)
Be sure to download distribution (not src)
Unpack solr
Go to example directory
Run
 java -jar start.jar
Drupal: which contrib?

2 Possibilities
  Apachesolr search
  Search api with solr backend
Apache solr search

Strengths:
  Supported by Acquia
  Easy to set up
  Mature
Weaknesses
  Integration with views (still in dev)
Search Api

Strengths
  Flexible
  Indexes all entities
  Excellent views integration
  Related fields are easy to add to index
Weaknesses
  Not supported (yet) by Acquia
  Solr backend has some issues
Drupal: which contrib?

Apachesolr search integration
  Quick setup
  Acquia
Search API
  Exportable configuration
  Views integration
  Index all entities
Depends on your needs
Basic use of search api

Create server
Create index
  Select fields to index
  Define data alterations
  Define processors
Start indexing
Field types

Integer, date, boolean
String or fulltext?
  Fulltext will get processed!
      Tokenize
      Stopwords
      Ignore case
  String is as is
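
The difference between the two field types can be sketched in a few lines of Python. This is only an illustration of the idea, not Solr's actual analyzer chain: the stopword list and the function names are assumptions made for the example.

```python
import re

# Illustrative stopword list (an assumption, not Solr's stopwords.txt).
STOPWORDS = {"and", "or", "the", "a", "of"}

def analyze_fulltext(value):
    """Lowercase, tokenize on word characters, then drop stopwords."""
    tokens = re.findall(r"\w+", value.lower())
    return [t for t in tokens if t not in STOPWORDS]

def analyze_string(value):
    """A string field is kept as is: one single, unmodified term."""
    return [value]

title = "The Red Bike and the Blue Car"
print(analyze_fulltext(title))  # ['red', 'bike', 'blue', 'car']
print(analyze_string(title))    # ['The Red Bike and the Blue Car']
```

A search for "bike" can match the fulltext terms, while the string field only matches the exact, complete value.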
Demo

Run solr
Copy schema.xml and solrconfig.xml (!)
Create server
Create index
Create view
  ads
  Ad filter exposed: search
Advanced use of Search api

This talk is about Solr, not about search API
Understand Solr first!
Many resources on the web
Watch screencasts etc
Mastering Solr

Mastering Solr is understanding Solr
What happens after the Drupal module has done its work?
Let's have a look at the request
Solr request

Look at solr log
Parameters:
  start
  rows
  q (query)
  qf (query fields)
  fl (fields)
  fq (filter query)
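
To try these parameters outside Drupal, a request URL can be assembled by hand. The sketch below assumes the example Solr instance on localhost:8983; the field names (t_title, t_body, ss_type) and the search term are hypothetical.

```python
from urllib.parse import urlencode

def build_select_url(base_url, params):
    """Return a Solr select URL for the given request parameters."""
    # doseq=True repeats the key for list values, e.g. several fq filters.
    return base_url + "/select?" + urlencode(params, doseq=True)

params = {
    "q": "bike",               # the query itself
    "qf": "t_title t_body",    # fields to search in
    "fl": "item_id,score",     # fields to return
    "fq": ["ss_type:ad"],      # filter queries (do not influence the score)
    "start": 0,                # offset of the first result
    "rows": 10,                # number of results
}
url = build_select_url("http://localhost:8983/solr", params)
print(url)
```

Pasting the printed URL into a browser is a quick way to compare your own request with the one in the Solr log.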
Field names

item_id, id
t_.., ss_.., → why?
Solr has to know how to handle fields
Field api: field names differ
Dynamic field names: tell solr field type!
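
How a prefix tells Solr the field type can be illustrated with simple pattern matching. This is a simplified sketch: the patterns come from the prefixes above, the type names "text" and "string" are assumptions, and the first matching pattern wins here, which is simpler than Solr's own matching rules.

```python
from fnmatch import fnmatchcase

# Pattern -> field type, in the spirit of dynamicField entries in schema.xml.
# The type names are assumptions for this example.
DYNAMIC_FIELDS = [
    ("t_*", "text"),     # fulltext: analyzed
    ("ss_*", "string"),  # string: stored as is
]

def resolve_field_type(name):
    """Return the type of the first dynamic field pattern matching name."""
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    return None  # no dynamic field matches; an explicit field would be needed

for name in ("t_title", "ss_type", "item_id"):
    print(name, "->", resolve_field_type(name))
```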
Schema.xml

Defines field types and fields
The real tweaking starts here!
Let's have a look!
  dynamicField
  field type
  analyzers
Copyfield
What can you do in schema.xml?

Synonyms (disabled by default)
Stopwords (and, or, etc)
Stemming
Proper multilingual handling
Browse the schema

Solr offers schema browsing
Go to: http://localhost:8983/solr/admin
Search relevancy

Types of boosting:
  Field level boost
  Boost function
  Boost query
  (QueryElevation)
Boost parameters

Field level boosting: qf
   qf=t_body^20
   score in field is multiplied by 20
Boost function: bf
   bf=product(fieldname,2)
   result of function is added to score
Boost query: bq
boost (only for edismax): like bf, but the result is multiplied with the score
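
The difference between adding (bf) and multiplying (boost) is easiest to see with numbers. The sketch below is a toy model with made-up scores; real Solr scores involve more factors, so treat it only as an illustration of why the two parameters can rank documents differently.

```python
def additive_boost(score, function_value):
    """bf style: the function result is added to the text score."""
    return score + function_value

def multiplicative_boost(score, function_value):
    """boost style (edismax): the function result multiplies the score."""
    return score * function_value

# Hypothetical documents: (text score, value of the boost function).
docs = {
    "old_ad": (2.0, 0.1),
    "new_ad": (1.0, 0.9),
}

by_bf = sorted(docs, key=lambda d: additive_boost(*docs[d]), reverse=True)
by_boost = sorted(docs, key=lambda d: multiplicative_boost(*docs[d]), reverse=True)

print(by_bf)     # ['old_ad', 'new_ad']  (2.1 vs 1.9)
print(by_boost)  # ['new_ad', 'old_ad']  (0.9 vs 0.2)
```

With an additive boost a strong text match can drown out the function; with a multiplicative boost the function always scales the result.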
Let's boost title

Field level boost is incorporated in Search API...
But, where are the numbers in the request???
Search api solr forgot to add them!
There is a patch :-)
But let's do it another way...
Debugging Solr

Let's add &echoParams=all to the request...
Where do all these parameters come from?
Solrconfig.xml!!!
Among other things: request handler
Let's look at the dismax request handler
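
Adding a debug parameter to a request copied from the Solr log can also be scripted. A minimal sketch, assuming a request against the local example instance; the helper name with_param is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_param(url, key, value):
    """Return url with the query parameter key set to value."""
    parts = urlsplit(url)
    # Keep all existing parameters except an earlier value of the same key.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))

url = "http://localhost:8983/solr/select?q=bike&rows=10"
print(with_param(url, "echoParams", "all"))
```

The response header then lists every parameter in effect, including the defaults that come from solrconfig.xml.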
Solrconfig.xml

(Default) Request handler:
  Default parameters
  Add Spellcheck
  Tweak all kinds of search behavior!
  Let's add default search fields with boost
Boost function

Mathematical functions on field values
Available functions:
  sum(x,y): x + y
  product(x,y): x * y
  scale(x, minTarget, maxTarget)
  recip(x, m, a, b): a / (m * x + b)
  ms(): time → ms(NOW/DAY, created)
  Many more!
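
To get a feeling for these functions before putting them in a request, they can be mirrored in Python. This is a sketch for experimenting, not Solr code. Per the Solr function query documentation, recip(x,m,a,b) evaluates to a / (m*x + b). The scale helper works on a plain list of values, which is an assumption standing in for "the value of x across all documents".

```python
def product(x, y):
    """product(x,y): x * y"""
    return x * y

def recip(x, m, a, b):
    """recip(x,m,a,b): a / (m*x + b), a reciprocal that decays as x grows."""
    return a / (m * x + b)

def scale(values, min_target, max_target):
    """scale(x,minTarget,maxTarget): map values linearly onto the target range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: nothing to spread, use the lower target.
        return [float(min_target) for _ in values]
    span = (max_target - min_target) / (hi - lo)
    return [min_target + (v - lo) * span for v in values]

print(product(3, 2))                 # 6
print(recip(1000, 1, 1000, 1000))    # 0.5
print(scale([0, 5, 10], 1, 2))       # [1.0, 1.5, 2.0]
```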
Boost date

We need ms(): big values!
Linear? Too much difference
Recip!
recip(x,1,1000,1000)
if x = 1000: half
1 year: 3.1e10
recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)
bf=recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)^3
Use a graphing tool!
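
Without a graphing tool, a few sample points already show the shape of the curve. The sketch below evaluates recip(x,1,3.1e10,3.1e10), with recip(x,m,a,b) = a / (m*x + b), for ads of different ages. It uses the rounded 3.1e10 ms per year from above; the ages chosen are arbitrary.

```python
MS_PER_YEAR = 3.1e10  # rounded; 365 days is roughly 3.15e10 ms

def recip(x, m, a, b):
    """Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)

def date_boost(age_ms):
    """Boost for a document that is age_ms milliseconds old."""
    return recip(age_ms, 1, MS_PER_YEAR, MS_PER_YEAR)

for years in (0, 1, 2, 9):
    print(years, "year(s):", round(date_boost(years * MS_PER_YEAR), 3))
# 0 year(s): 1.0
# 1 year(s): 0.5
# 2 year(s): 0.333
# 9 year(s): 0.1
```

A brand-new ad gets the full boost, a one-year-old ad half of it, and the curve flattens out for older content instead of dropping linearly.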
Boost queries

Do a query like fq:
Boost ads:
  content_type:add
  bq=content_type:add
  bq=(content_type:add)^20
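
When a boost query is put in a URL, characters such as ( : ) and ^ have to be URL-encoded. A small sketch, reusing the content_type:add example from above; the helper name boost_query and the search term are assumptions.

```python
from urllib.parse import urlencode

def boost_query(field, value, boost):
    """Build a bq value that boosts documents where field matches value."""
    return f"({field}:{value})^{boost}"

params = {
    "q": "bike",
    "bq": boost_query("content_type", "add", 20),
}
print(params["bq"])       # (content_type:add)^20
print(urlencode(params))  # the same parameters, URL-encoded for the request
```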
Debugging relevancy

We know how to boost
How can fine-tuning be done?
Solr has the solution:
  add debugQuery=on
debugQuery=on

[Screenshots: debugQuery output in the normal browser view and in the page source view]
Relevancy

Choose your boosting methods
Try in your browser
Fine-tuning: debugQuery=on, view source
Add parameters to solrconfig.xml
Or...
Add parameters in code

use
hook_search_api_solr_query_alter(array
  &$call_args, SearchApiQueryInterface $query)
$call_args['params']['bq'] = '(t_title:foo)^20';
$call_args['params']['bf'][] = 'b_promote';
Override solr service class

In Search API: define server class
extend solr service class
Only change key methods
It's all about passing parameters!
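
The pattern itself is language independent: extend the service class and override only the method that assembles the parameters. The Python sketch below illustrates the idea with hypothetical classes; it does not reproduce the Search API class or method names.

```python
class SolrService:
    """Hypothetical stand-in for the Solr service class."""

    def build_params(self, keys):
        """Assemble the request parameters for a keyword search."""
        return {"q": keys, "rows": 10}

class BoostingSolrService(SolrService):
    """Overrides one key method to pass extra boost parameters."""

    def build_params(self, keys):
        params = super().build_params(keys)      # keep the default behaviour
        params["bq"] = "(content_type:add)^20"   # then add our own parameter
        return params

print(BoostingSolrService().build_params("bike"))
```

Everything else is inherited, so the override stays small and easy to maintain.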
Conclusion

Tweak indexing in schema.xml
  Stopwords
  Multilingual
Tweak searching in solrconfig.xml
Tweak searching by passing variables
This is only an introduction!
Questions?
Feedback & follow-up:
http://drupalcampgent.be/feedback
