andrew.janowczyk@searchbox.com
Solr is
◦ Blazing fast open source enterprise search platform
◦ Lucene-based Search Server
◦ Written in Java
◦ Has REST-like HTTP/XML and JSON APIs
◦ Extensive plugin architecture
http://lucene.apache.org/solr/
• Allows for the development of plugins which provide advanced operations
• Types of plugins:
◦ RequestHandlers: read URL parameters and return their own response
◦ SearchComponents: responses are embedded in other responses (such as /select)
◦ Update processors (UpdateRequestProcessorFactory): the result is stored in a field along with the document at index time
• A quick tutorial on how to program a RequestHandler to
◦ Be initialized
◦ Parse configuration file arguments
◦ Do something useful (count some words in the query)
◦ Format and return a response
• We’ll name our plugin “DemoPlugin” and show how to register it in solrconfig.xml for loading
• In the next slide, we’ll specify a list called “words”; each entry in the list is a string named “word”
• We want to load these specific words and then count them on all subsequent queries.
• Ex: config file has “body”, “fish”, “dog”
• Query is: dog body body body fish fish fish fish orange
• Result should be:
◦ body=3.0
◦ fish=4.0
◦ dog=1.0
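Before any Solr wiring, the expected result above can be checked in plain Java. This is a minimal sketch, not the plugin itself: the class name WordCountSketch and the method countWords are hypothetical, and a plain Map stands in for Solr's response types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of the Solr API.
final class WordCountSketch {

    // Counts how often each configured word appears in a space-separated query.
    static Map<String, Double> countWords(List<String> words, String query) {
        Map<String, Double> counts = new LinkedHashMap<>();
        for (String token : query.split(" ")) {
            if (words.contains(token)) {
                // Start at 1.0 on first sight, otherwise add 1.0 to the running total.
                counts.merge(token, 1.0, Double::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("body", "fish", "dog");
        String query = "dog body body body fish fish fish fish orange";
        System.out.println(countWords(words, query));
    }
}
```

The word "orange" is ignored because it is not in the configured list. The map here is ordered by first appearance in the query, so the order of entries differs from the slide even though the values match.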
<requestHandler name="/newendpoint"
                class="com.searchbox.DemoPlugin">
  <lst name="words">
    <str name="word">body</str>
    <str name="word">fish</str>
    <str name="word">dog</str>
  </lst>
</requestHandler>
Variables will be loaded from this section
during the init method discussed later
• We can see that we’re asking Solr to load com.searchbox.DemoPlugin. This class will be packaged in the .jar file that our project produces
• Copy the .jar file to the lib directory in the Solr installation so that Solr can find it.
• That’s it!
package com.searchbox;
import java.util.HashMap;
import java.util.List;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.search.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class DemoPlugin extends RequestHandlerBase {
    private static final Logger LOGGER = LoggerFactory.getLogger(DemoPlugin.class);
    // Counters reported later by getStatistics()
    volatile long numRequests;
    volatile long totalTime;
    volatile long numErrors;
    // Words to count, loaded from solrconfig.xml in init()
    List<String> words;
• Initialization is called when the plugin is first loaded
• This most commonly occurs when Solr starts up
• At this point we can load things from file (models, serialized objects, etc.)
• We have access to the variables set in solrconfig.xml
• We have chosen to pass a list called “words” containing the words we’d like to count: “body”, “fish”, “dog”
• During initialization we need to load this list from solrconfig.xml and store it locally
    @Override
    public void init(NamedList params) {
        // Read the <lst name="words"> block, then every <str name="word"> entry inside it
        words = ((NamedList) params.get("words")).getAll("word");
        if (words.isEmpty()) {
            throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
                    "Need to specify at least one word in requestHandler config!");
        }
        super.init(params); //pass the rest of the init up
    }
Notice that we’ve loaded the list “words”, then read all of its entries named “word” and put
them into the class-level variable words.
    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
        numRequests++;
        long startTime = System.currentTimeMillis();
        try {
            HashMap<String, Double> counts = new HashMap<String, Double>();
            SolrParams params = req.getParams();
            String q = params.get(CommonParams.Q); //get the q param from url
            for (String string : q.split(" ")) {
                if (words.contains(string)) {
                    Double oldcount = counts.containsKey(string) ? counts.get(string) : 0;
                    counts.put(string, oldcount + 1);
                }
            }
• We start by incrementing a volatile counter of the requests we’ve seen (used later for statistics) and noting the start time so we can measure how long the request takes.
• Next we initialize the local map that will hold our word counts
• Next we get the “q” parameter that was sent to us in the URL
• We do a very naive split on single spaces to break the query into words and iterate over them. If a word is in our “words” list, we keep a running total of the number of times it appears
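The split on a single space breaks down for a missing q parameter, repeated spaces, or tabs. The sketch below shows one way to tokenize more defensively in plain Java. The class QueryTokenizerSketch and the method tokenize are hypothetical names, and lowercasing the tokens is an added assumption that the original handler does not make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; it is not part of the Solr API.
final class QueryTokenizerSketch {

    // Splits a raw q value into tokens, tolerating null and irregular whitespace.
    static List<String> tokenize(String q) {
        List<String> tokens = new ArrayList<>();
        if (q == null) {
            return tokens; // no q parameter was sent
        }
        for (String token : q.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                // Assumption: matching should be case-insensitive.
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("  dog   Body\tfish "));
        System.out.println(tokenize(null));
    }
}
```

If the handler adopted this, the configured words would also need to be stored in lowercase so that lookups still match.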
            NamedList<Double> results = new NamedList<Double>();
            for (String word : words) {
                results.add(word, counts.get(word));
            }
            rsp.add("results", results);
        } catch (Exception e) {
            numErrors++;
            LOGGER.error(e.getMessage());
        } finally {
            totalTime += System.currentTimeMillis() - startTime;
        }
    }
• Now that we’ve looked at all of the strings and our processing is done, we need to return the results.
• We create a NamedList of Double to hold the counts, then iterate through our configured words and add each one's count to it
• Next, we add our result list to the Solr response variable rsp
• We also see the catch block of the try statement, which counts errors and writes the error message to the Solr log
• Finally, we add the time the request took to the total time
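One detail worth noting: counts.get(word) returns null for a configured word that never appears in the query, so a null value is added to the results. A possible alternative, sketched here with a plain Map standing in for NamedList, is to report 0.0 instead. The names ResultOrderSketch and toResults are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; a Map stands in for Solr's NamedList.
final class ResultOrderSketch {

    // Builds the results in configuration order, using 0.0 for words that were not seen.
    static Map<String, Double> toResults(List<String> words, Map<String, Double> counts) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (String word : words) {
            results.put(word, counts.getOrDefault(word, 0.0));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> words = List.of("body", "fish", "dog");
        Map<String, Double> counts = Map.of("body", 3.0, "fish", 4.0);
        System.out.println(toResults(words, counts));
    }
}
```

Iterating over the configured words rather than the counted ones is what keeps the response in the same order as solrconfig.xml, which matches the body, fish, dog order shown in the expected result.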
    @Override
    public String getDescription() {
        return "Searchbox DemoPlugin";
    }

    @Override
    public String getVersion() {
        return "1.0";
    }

    @Override
    public String getSource() {
        return "http://www.searchbox.com";
    }

    @Override
    public NamedList<Object> getStatistics() {
        // Report the counters maintained in handleRequestBody as strings
        NamedList<Object> all = new SimpleOrderedMap<Object>();
        all.add("requests", "" + numRequests);
        all.add("errors", "" + numErrors);
        all.add("totalTime(ms)", "" + totalTime);
        return all;
    }
• For a production-grade plugin, users expect to see certain pieces of information in their Solr admin panel
• Description, version and source are just Strings
• getStatistics() takes the volatile counters we were keeping track of before, puts them into another named list and returns them. These appear under the statistics panel in Solr.
• That’s it!
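A caveat on the counters: volatile guarantees visibility between threads, but numRequests++ is still a read-modify-write, so concurrent requests can lose updates. The sketch below shows the same three counters kept with AtomicLong from the JDK. The class HandlerCountersSketch and its method names are hypothetical and are not taken from the plugin source.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; it is not part of the Solr API.
final class HandlerCountersSketch {
    private final AtomicLong numRequests = new AtomicLong();
    private final AtomicLong numErrors = new AtomicLong();
    private final AtomicLong totalTimeMs = new AtomicLong();

    // Records one finished request atomically, whichever thread calls it.
    void recordRequest(long elapsedMs, boolean failed) {
        numRequests.incrementAndGet();
        if (failed) {
            numErrors.incrementAndGet();
        }
        totalTimeMs.addAndGet(elapsedMs);
    }

    long requests() { return numRequests.get(); }
    long errors() { return numErrors.get(); }
    long totalTimeMs() { return totalTimeMs.get(); }

    public static void main(String[] args) throws InterruptedException {
        HandlerCountersSketch counters = new HandlerCountersSketch();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counters.recordRequest(1, false);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counters.requests()); // prints 40000
    }
}
```

For a demo plugin the occasional lost increment is harmless, which may be why the slides keep the simpler volatile fields.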
http://192.168.56.101:8983/solr/core_name/newendpoint?q=dog%20body%20body%20body%20fish%20fish%20fish%20fish%20orange
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="results">
<double name="body">3.0</double>
<double name="fish">4.0</double>
<double name="dog">1.0</double>
</lst>
</response>
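The request URL above can also be assembled in code rather than typed by hand. This is a minimal sketch using only the JDK; the class EndpointUrlSketch and the method buildUrl are hypothetical, and the host, core name and endpoint are the ones from the slide. Note that URLEncoder applies form encoding, so spaces become "+" rather than "%20"; in a query string the two should be decoded the same way, but that is an assumption worth checking against your own Solr setup.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; it is not part of the Solr API.
final class EndpointUrlSketch {

    // Joins the base URL, core and handler path, and appends the encoded q parameter.
    static String buildUrl(String baseUrl, String core, String handler, String q) {
        return baseUrl + "/" + core + handler
                + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl(
                "http://192.168.56.101:8983/solr",
                "core_name",
                "/newendpoint",
                "dog body body body fish fish fish fish orange"));
    }
}
```

The handler argument must match the name attribute of the requestHandler element in solrconfig.xml, including the leading slash.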
• Because we’ve overridden the getStatistics() method, we can get real-time stats from the admin panel!
Happy Developing!
Full Source Code available at:
http://www.searchbox.com/developing-a-request-handler-for-solr

Develop a solr request handler plugin
