21. Load Balancing
Server Affinity Algorithm
$serversNew = array();  // e.g. ["host2", "host3", "host1", "host4"]
$numServers = count($servers);
while ($numServers > 0) {
    // Take the first 4 chars of the md5sum of the server count
    // and the query, mod the available servers
    $key = hexdec(substr(md5($numServers . '+' . $query), 0, 4)) % $numServers;
    $keySet = array_keys($servers);
    $serverId = $keySet[$key];
    // Push the chosen server onto the new list and remove it
    // from the initial list
    array_push($serversNew, $servers[$serverId]);
    unset($servers[$serverId]);
    --$numServers;
}
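The same ordering idea, sketched in Java for readers on the Solr side. This is an illustrative port, not the production code: the class and method names (`ServerAffinity`, `orderFor`, `first16Bits`) and the sample hosts and query are assumptions. The point it shows is that the same query always yields the same server order.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ServerAffinity {
    // Returns the servers in a query-dependent but repeatable order.
    static List<String> orderFor(String query, List<String> servers) {
        List<String> remaining = new ArrayList<>(servers); // copy, caller's list is untouched
        List<String> ordered = new ArrayList<>();
        while (!remaining.isEmpty()) {
            int n = remaining.size();
            // Hash "<count>+<query>" and reduce it to an index into the remaining servers
            int key = first16Bits(md5(n + "+" + query)) % n;
            ordered.add(remaining.remove(key));
        }
        return ordered;
    }

    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // The first 4 hex chars of an md5sum are its first two bytes, read as an unsigned integer
    static int first16Bits(byte[] digest) {
        return ((digest[0] & 0xff) << 8) | (digest[1] & 0xff);
    }

    public static void main(String[] args) {
        // Hypothetical hosts and query, for illustration only
        List<String> servers = Arrays.asList("host1", "host2", "host3", "host4");
        System.out.println(orderFor("silver ring", servers));
    }
}
```

Because the hash input includes the remaining server count, each round picks independently, and a repeated query lands on the same first-choice server, which keeps that server's caches warm.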
30. Replication
Multicast Rsync?
[15:25] <engineer> patrick: i'm gonna test multi-rsyncing some indexes
from host1 to host2 and host3 in prod. I'll be watching the graphs and
what not, but let me know if you see anything funky with the network
[15:26] <patrick> ok
....
[15:31] <keyur> is the site down?
47. Solr InterOp
QParsers
The QParserPlugin that returns our new QParser:
public class PersonNameRealQParserPlugin extends QParserPlugin {
    public static final String NAME = "personrealqp";

    @Override
    public void init(NamedList args) {}

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        return new PersonNameRealQParser(qstr, localParams, params, req);
    }
}
48. Solr InterOp
QParsers
Registering the plugin in solrconfig.xml:
<queryParser name="personrealqp"
class="com.etsy.person.solr.PersonNameRealQParserPlugin" />
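Once registered, the parser can be selected per request with Solr's local-params prefix, `{!personrealqp}`. A minimal sketch of building such a request URL follows. The base URL, the core name `people`, the sample query, and the helper class `PersonQueryUrl` are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PersonQueryUrl {
    // Builds a /select URL whose q parameter is routed to the named query parser.
    static String selectUrl(String baseUrl, String parserName, String userQuery) {
        String q = "{!" + parserName + "}" + userQuery;
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(q, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and core, for illustration only
        System.out.println(selectUrl("http://localhost:8983/solr/people", "personrealqp", "john smith"));
    }
}
```

Passing `defType=personrealqp` as a request parameter is the other standard way to pick the parser for the main query.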
52. Solr InterOp
Custom Stemmer
First we extend KStemmer and intercept stem calls:
public class LStemmer extends KStemmer {
    /**.....**/
    @Override
    String stem(String term) {
        String override = overrideStemTransformations.get(term);
        if (override != null) return override;
        return super.stem(term);
    }
}
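The override pattern can be tried without Lucene on the classpath. In the sketch below, `NaiveStemmer` is a deliberately crude stand-in for KStemmer (it only strips a trailing "s"), and `OverridingStemmer` and the sample override are assumptions for illustration, not the real override table.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a real stemmer such as KStemmer: strips one trailing "s" from longer words.
class NaiveStemmer {
    String stem(String term) {
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }
}

// Consults an override table first and falls back to the base stemmer otherwise.
class OverridingStemmer extends NaiveStemmer {
    private final Map<String, String> overrides;

    OverridingStemmer(Map<String, String> overrides) {
        this.overrides = new HashMap<>(overrides);
    }

    @Override
    String stem(String term) {
        String override = overrides.get(term);
        if (override != null) return override;
        return super.stem(term);
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new HashMap<>();
        overrides.put("glasses", "glasses"); // hypothetical override: keep this term unstemmed
        OverridingStemmer stemmer = new OverridingStemmer(overrides);
        System.out.println(stemmer.stem("glasses")); // taken from the override table
        System.out.println(stemmer.stem("rings"));   // falls through to the base stemmer
    }
}
```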
53. Solr InterOp
Custom Stemmer
Then create a TokenFilter that uses the new Stemmer:
final class LStemFilter extends TokenFilter {
    /**.....**/
    protected LStemFilter(TokenStream input, int cacheSize) {
        super(input);
        stemmer = new LStemmer(cacheSize);
    }

    @Override
    public boolean incrementToken() throws IOException {
        /**....**/
    }
}
54. Solr InterOp
Custom Stemmer
Create a FilterFactory that exposes it:
public class LStemFilterFactory extends BaseTokenFilterFactory {
    private int cacheSize = 20000;

    @Override
    public void init(Map<String, String> args) {
        super.init(args);
        String cacheSizeStr = args.get("cacheSize");
        if (cacheSizeStr != null) {
            cacheSize = Integer.parseInt(cacheSizeStr);
        }
    }

    @Override
    public TokenStream create(TokenStream in) {
        return new LStemFilter(in, cacheSize);
    }
}
55. Solr InterOp
Custom Stemmer
And finally plug it into your analysis chain:
<analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="solr/common/conf/stopwords.txt"/>
    <filter class="com.etsy.solr.analysis.LStemFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>