Lucene/Solr 8: The next major release
Steve Rowe
Senior Software Developer, Lucidworks
@steven_a_rowe
#Activate18 #ActivateSearch
Agenda
• Recent release cadence
• 7.X
• 8.0
• 8.X
[Figure: release timeline with a "You are here" marker]
6.X average: 10 weeks; 7.X average: 11 weeks
7.X
1. Metrics
2. Autoscaling
3. CDCR
4. Time Routed Aliases
5. Replica types
6. Streaming expressions
7. JSON facet API
8. Configset / schema
9. Text Analysis / ML
10. Collections API
11. Queries
12. Large index segment merging
13. Replication / recovery / rolling upgrades
14. Block-join / nested docs
15. Miscellaneous
7.X: Metrics
• Continuation of 6.X work to support Autoscaling efforts
• 7.0: - Aggregated metrics collected in overseer
       - solrconfig.xml <jmx> ➞ solr.xml <metrics><reporter>
• 7.1: Prometheus metrics exporter contrib
• 7.4: /admin/metrics/history API: basic long-term key metric time series aggregation
  • Fixed-width windows at several resolutions
  • Not yet in Admin UI: SOLR-12426
7.X: Autoscaling
• 7.0: - Preferences and policy DSL: flexible replica placement
         [ { minimize: cores }, { maximize: freedisk } ]
         { replica: "<2", shard: "#EACH", node: "#ANY" }
       - Diagnostics API: return sorted nodes, policy violations
• 7.1: - autoAddReplicas ported to autoscaling framework
       - Add/remove/suspend/resume triggers and listeners
       - Triggers for added and lost nodes
       - ComputePlanAction / ExecutePlanAction
       - /autoscaling/history API: cluster events and actions
• 7.2: - Search rate trigger
       - /autoscaling/suggestions API
       - UTILIZENODE collections API command
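The preferences and policy rule shown on this slide can be written as JSON command bodies. The sketch below only builds those bodies and sends nothing. It assumes the `set-cluster-preferences` and `set-cluster-policy` commands of the Solr 7 autoscaling write API; check the endpoint path and command names against the reference guide for your version.

```python
import json

# Cluster preferences from the slide: prefer nodes with fewer cores,
# then nodes with more free disk.
preferences = [{"minimize": "cores"}, {"maximize": "freedisk"}]

# Policy rule from the slide: fewer than 2 replicas of each shard on any node.
policy = [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}]


def autoscaling_commands(prefs, rules):
    """Return one JSON body per command (command names are assumptions)."""
    return [
        json.dumps({"set-cluster-preferences": prefs}),
        json.dumps({"set-cluster-policy": rules}),
    ]


for body in autoscaling_commands(preferences, policy):
    print(body)
```

Each printed body would be POSTed separately to the cluster's autoscaling endpoint.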
7.X: Autoscaling
• 7.3: - Simulation framework
       - Arbitrary metric threshold trigger
       - Scheduled trigger
       - Admin UI to display and execute suggestions
7.X: Autoscaling
• 7.4: - Periodic house-keeping task: cleans up inactive shards
       - Index size trigger: document count or size in bytes
• 7.5: - Policy replica attribute: #ALL, #EQUAL, percentage, range, and floating point values
       - Policy cores attribute: #EQUAL, percentage, range, and floating point values
       - Percentage in freedisk policy attribute
       - Simulation framework: test scaling up to 1 billion docs
7.X: Cross Data Center Replication
• 7.2: Support bi-directional syncing of CDCR clusters
  This is not active-active, but rather passive-active or active-passive: only one active cluster at a time.
7.X: Time Routed Aliases
• 7.3: - Specialization of Solr’s collection alias feature
       - Support time series data, e.g. logs / sensor data
       - Maintain performance under continuous indexing
       - CREATEALIAS: start, interval, retention policy
       - Automatically create new collections
       - Automatically delete old collections (optional)
       - Route updates based on timestamp
       - Search against all aliased collections*
• 7.5: Preemptively create the next collection when updates are near the latest collection’s end date (optional)
* Pending optimization: minimize queried collections (SOLR-9562)
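To make the CREATEALIAS options concrete, here is a minimal sketch that assembles the request parameters for a time routed alias. The `router.*` and `create-collection.*` parameter names follow the Solr 7.5 Collections API as I understand it. The alias name, field name, and values are hypothetical, and the function name is my own.

```python
from urllib.parse import urlencode


def create_time_routed_alias_params(alias, field, start, interval,
                                    auto_delete_age=None, preemptive_math=None):
    """Build CREATEALIAS parameters for a time routed alias."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "router.name": "time",
        "router.field": field,        # timestamp field used for routing
        "router.start": start,        # start of the first collection
        "router.interval": interval,  # date math: span of each collection
        "create-collection.collection.configName": "_default",
        "create-collection.numShards": "1",
    }
    if auto_delete_age is not None:
        # Optional retention policy: delete collections older than this.
        params["router.autoDeleteAge"] = auto_delete_age
    if preemptive_math is not None:
        # Optional (7.5): create the next collection ahead of time.
        params["router.preemptiveCreateMath"] = preemptive_math
    return params


params = create_time_routed_alias_params(
    "logs", "timestamp_dt", "NOW/DAY", "+1DAY",
    auto_delete_age="/DAY-90DAYS", preemptive_math="90MINUTES")
query = urlencode(params)
print("/solr/admin/collections?" + query)
```

Leaving out the two optional arguments gives an alias that never deletes old collections and creates each new one only on demand.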
7.X: Replica types
• 7.0: Replica types NRT, TLOG, and PULL:

Type        | Indexes locally | Supports soft commit & RTG | Pulls segments from leader | Writes to TLog | Can become shard leader | Queryable
NRT         | ✅ | ✅ | -  | ✅ | ✅ | ✅
TLOG leader | ✅ | ✅ | -  | ✅ | ✅ | ✅
TLOG        | -  | -  | ✅ | ✅ | ✅ | ✅
PULL        | -  | -  | ✅ | -  | -  | ✅

• 7.4: Query param to prioritize replicas by type, e.g.
shards.preference=replica.type:PULL,replica.type:TLOG
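As an illustration of the 7.4 query parameter, the sketch below builds a `shards.preference` value from an ordered list of replica types and attaches it to a select request. The collection name and helper name are hypothetical; only the parameter name and the `replica.type:` syntax come from the slide.

```python
from urllib.parse import urlencode


def select_params(query, preferred_types):
    """Build query params that prefer replicas in the given type order."""
    preference = ",".join(f"replica.type:{t}" for t in preferred_types)
    return {"q": query, "shards.preference": preference}


params = select_params("*:*", ["PULL", "TLOG"])
print(params["shards.preference"])
print("/solr/mycollection/select?" + urlencode(params))
```

Listing PULL first steers queries away from the replicas that do indexing work, when PULL replicas are available.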
7.X: Streaming expressions
• Parallel computation function suite
• Some use cases: MapReduce, aggregations, parallel SQL, pub/sub messaging, graph traversal, machine learning, statistical programming
• Each 7.X release has added many new functions
• 7.5: Ref guide: Math Expressions User Guide
7.X: JSON Facet API
• 7.0: Terms facets: added optional refinement support
• 7.4: Semantic Knowledge Graph support via new relatedness() aggregate function
  • Finds ad-hoc relationships by scoring documents relative to foreground and background document sets
• 7.5: Heatmap facet support
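A request using relatedness() might look like the following sketch, which builds a JSON Request API body with a terms facet sorted by relatedness score. The `relatedness($fore,$back)` form and the `fore` / `back` parameter convention follow the Solr reference guide as I recall it. The field names, queries, and facet label are hypothetical.

```python
import json


def relatedness_request(foreground_q, background_q, field, limit=5):
    """Build a JSON request that ranks terms in `field` by relatedness."""
    return {
        "query": foreground_q,
        # Foreground and background sets are passed as request params.
        "params": {"fore": foreground_q, "back": background_q},
        "facet": {
            "related_terms": {           # hypothetical facet label
                "type": "terms",
                "field": field,
                "limit": limit,
                "sort": {"r": "desc"},   # highest relatedness first
                "facet": {"r": "relatedness($fore,$back)"},
            }
        },
    }


request = relatedness_request("hobby_s:golf", "*:*", "interest_s")
print(json.dumps(request, indent=2))
```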
7.X: Configsets / schema
• 7.0: - _default configset
       - Data-driven schema: auto-guessed text fields indexed 2 ways:
         • tokenized for search
         • strings for sorting/faceting: "*_str" string field, max 256 chars
       - Turn off data-driven schema functionality:
         curl http://host:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields": "false"}}'
• 7.5: Disable configset upload: -Dconfigset.upload.enabled=false
7.X: Text analysis / machine learning
• 7.1: Bengali normalizer and stemmer
• 7.2: Enable off-ZooKeeper storage of large (>1MB) LTR models
• 7.3: OpenNLP integration: tokenization, POS tagging, phrase chunking, lemmatization, NER, language detection
• 7.4: - ProtectedTermFilterFactory: don’t filter protected terms
       - TaggerRequestHandler (a.k.a. SolrTextTagger): NER
• 7.5: - "nori" Korean morphological text analysis: "*_txt_ko"
       - PhrasesIdentificationComponent: identify and score candidate query phrases based on index statistics
       - UIMA integration removed
7.X: Collections API
• 7.3: Add collection level properties similar to cluster properties
• 7.4: Cluster-wide defaults for numShards, nrtReplicas, tlogReplicas, pullReplicas
• 7.5: - Support co-locating replicas of two or more collections together in a node via the withCollection parameter to the CREATE and MODIFYCOLLECTION commands
       - SPLITSHARD: New split method using hard links: splitMethod=link
         • 3-5 times faster than the original splitMethod=rewrite
         • Slows down replication
         • Increases disk usage on replica nodes
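For reference, a SPLITSHARD call choosing the split method could be assembled as in this sketch. The collection and shard names are hypothetical, and the helper is illustrative only; `splitMethod` and its two values come from the slide.

```python
from urllib.parse import urlencode


def splitshard_params(collection, shard, split_method="rewrite"):
    """Build Collections API params for SPLITSHARD."""
    if split_method not in ("rewrite", "link"):
        raise ValueError("splitMethod must be 'rewrite' or 'link'")
    return {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "splitMethod": split_method,
    }


params = splitshard_params("mycollection", "shard1", split_method="link")
print("/solr/admin/collections?" + urlencode(params))
```

The default stays at `rewrite` in this sketch because, per the slide, `link` trades faster splitting for slower replication and higher disk usage on replica nodes.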
7.X: Queries
• 7.1: JSON query DSL

curl http://localhost:8983/solr/books/query -d '
{
  query: {
    bool: {
      must: [
        "title:solr",
        {lucene: {df: content, query: "lucene solr"}}
      ],
      must_not: [
        {frange: {u: 3.0, query: ranking}}
      ]
    }
  }
}'
7.X: Queries
• 7.2: New synonymQueryStyle field type option: enable generation of appropriate queries for hierarchical relations between overlapping terms
  • as_same_term (default): SynonymQuery(bird,robin)
  • pick_best: Dismax(bird,robin)
  • as_distinct_terms: (bird OR robin)
• 7.4: JSON query DSL: Enable query/filter tagging, e.g. { "#colorfilt" : "color:blue" } equivalent to local-param {!tag=colorfilt}color:blue
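The equivalence between the tagged JSON form and the local-param form can be shown with a small converter. This is only an illustration of the mapping stated on the slide, not Solr's own parsing code, and it handles only the simple case of a single tag mapped to a query string.

```python
def tagged_filter_to_local_params(tagged):
    """Convert {"#tag": "query"} into the "{!tag=tag}query" string form."""
    if len(tagged) != 1:
        raise ValueError("expected exactly one tagged query")
    (key, query), = tagged.items()
    if not key.startswith("#"):
        raise ValueError("tag keys start with '#'")
    return "{!tag=" + key[1:] + "}" + query


print(tagged_filter_to_local_params({"#colorfilt": "color:blue"}))
```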

7.X: Large index segment merging
• Problem: Overly large segments (e.g. as a result of force-merge/optimize) stop being eligible for merging, and can start accumulating >50% deleted documents, wasting space and skewing index stats.
• 7.5: - TieredMergePolicy now respects maxMergedSegmentMB by default when executing force-merge/optimize and expunge-deletes
       - TieredMergePolicy’s reclaimDeletesWeight has been replaced with a new deletesPctAllowed setting to control how aggressively deletes should be reclaimed
7.X: Replication/recovery/rolling upgrades
• 7.3: The old Leader-Initiated-Recovery (LIR) implementation is deprecated and replaced
  • To perform a rolling upgrade to Solr 8, you must be on Solr 7.3 or higher
• 7.4: - IndexFetcher now skips fetching identical files
       - Buffering updates are written to a separate TLog
       - Parallel replay of buffering TLogs
7.X: Block-join / nested documents
• 7.3: Added filters and excludeTags local-params for {!parent} and {!child} query parsers, usable for multi-select faceting
• 7.5: WIP: Allow Solr to more faithfully represent deeply nested document relationships, rather than requiring reconstruction based on the flattened list of child docs returned by Solr
7.X: Miscellaneous
• 7.3: add-distinct atomic updates
• 7.4: - Ignore large document URP
       - TLog: maxSize auto hard-commit setting (in addition to maxDocs & maxTime)
• 7.5: Custom cluster properties allowed with ext. prefix
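The add-distinct atomic update adds values to a multi-valued field only when they are not already present. The sketch below shows an update document using that modifier and a client-side simulation of the intended result. The simulation is my own approximation of the semantics rather than Solr code, and the document ID and field name are hypothetical.

```python
def apply_add_distinct(existing, new_values):
    """Simulate add-distinct: append only values not already in the field."""
    result = list(existing)
    for value in new_values:
        if value not in result:
            result.append(value)
    return result


# Atomic update document as it would be sent to the update handler.
update_doc = {"id": "doc1", "tags_ss": {"add-distinct": ["solr", "lucene"]}}

current_tags = ["solr", "search"]
print(apply_add_distinct(current_tags, update_doc["tags_ss"]["add-distinct"]))
```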
8.0
• Autoscaling
• Index upgrades
• HTTP/2
• Miscellaneous
8.0: Autoscaling
• Suggestions API: rebalance options even if no violations
• Suggestions API: add-replica for lost replicas
• maxOps limit for index size trigger
• Autoscaling policy framework will be the default replica
placement strategy
8.0: Index upgrades
• 7.0: Lucene indexes record the major Lucene version that created the index, and the minimum Lucene version that contributed to segments.
• 8.0: Version N-2 or older indexes will now fail to open, even if they have been merged into an N-1 index.
  • IndexUpgrader will not upgrade 6.X or earlier indexes
  • Re-indexing will be required to upgrade
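The compatibility rule above reduces to a check on the major version that originally created the index. The following is a simplified model of that rule with a hypothetical function name. It is not Lucene's actual check, which reads the version recorded in the index metadata.

```python
def can_open_index(created_major, current_major):
    """True if an index created by `created_major` opens under `current_major`.

    Only the current and the previous major version are accepted, no matter
    how often the index has since been merged or upgraded.
    """
    return current_major - 1 <= created_major <= current_major


for created in (6, 7, 8):
    print(created, can_open_index(created, current_major=8))
```

Under this model an index first created with 6.X and later rewritten by 7.X still fails on 8.0, which is why re-indexing is required.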
8.0: HTTP/2
• May 2018: Mark Miller announced his Star Burst effort: many cleanups and performance enhancements
• July 2018: Cao Manh Dat took up the HTTP/2 aspects: SOLR-12639
• Indexing test: 33M docs, 1 shard, 2 replicas (SOLR-12642)
  • Garbage: Leader: 26% less; replica: 76% less
  • Indexing throughput: 54% higher
  • CPU time: Leader: 39% higher; replica: 76% lower
• Ready to merge back to master, pending release of Jetty 9.4.13, containing SPNEGO HTTP/2 implementation
8.0: Miscellaneous
• Lucene: scores must be non-negative
• Function(Score)Query-s convert negative scores to zero
• TODO: remove deprecations
• Trie fields? Removal effectively blocked by:
  • SOLR-12074: Add numeric equivalent to StrField
  • SOLR-11127: Mechanism to migrate the .system collection (a.k.a. blob store) schema from Trie (pre-7.0) to Points (7.0+)
8.X
• Lucene/Solr minimum JDK
• Luke: Lucene Toolbox
• New Lucene features
8.X: Lucene/Solr minimum JDK
• Oracle will end free JDK 8 support in January 2019
• Both JDK 9 & 10 are already EOL, no more Oracle support
• JDK 11 will very likely be next minimum supported JDK, no
schedule yet
• Under JDK 9+, Solr’s Hadoop-related functionality has
problems, including with Kerberos
• Uwe Schindler’s Jenkins server tests Lucene/Solr on Oracle
9+10+11+12 JDKs
• All have higher Solr test failure rates than on JDK 8
8.X: Luke: UI framework & licensing
• Andrzej Bialecki: Initial implementation: Thinlet, GPL
• Mark Harwood: GWT
• Mark Miller: Apache Pivot
• Dmitry Kan and Tomoko Uchida took ownership on Github
• Tomoko Uchida: JavaFX (bundled w/JDK 8)
• LUCENE-2562: Make Luke a Lucene/Solr Module
• JavaFX/OpenJFX unbundled from Java 11 JDK, GPL+CPE
• Tomoko Uchida: Swing (7.5 release available)
8.X: New Lucene features
• Index impacts, Block-Max WAND, similarity cleanups
• Some queries (especially term queries and disjunctions)
are much faster when number of hits is not required
• FeatureField: incorporate static relevance signals, e.g.
PageRank
• Soft deletes
• Merge policy retains deleted docs according to policy
• Enables document history, e.g. for time-travel indexes
• RAMDirectory replaced by ByteBuffersDirectory
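The speedup when the hit count is not required comes from skipping postings that cannot make the top results. The toy sketch below illustrates the block-max idea for a single term: each block of postings carries its maximum possible score, and a block is skipped once that maximum cannot beat the current k-th best score. It is a deliberately simplified illustration with made-up data, not Lucene's Block-Max WAND implementation.

```python
import heapq


def top_k_with_block_max(blocks, k):
    """blocks: list of (block_max_score, [(doc_id, score), ...]).

    Returns (top hits as (score, doc_id), best first; postings scored).
    """
    heap = []    # min-heap holding the current top-k (score, doc_id)
    scored = 0   # how many postings were actually examined
    for block_max, postings in blocks:
        # Once k hits are held, a block whose best possible score does not
        # exceed the weakest held hit cannot contribute: skip it entirely.
        if len(heap) == k and block_max <= heap[0][0]:
            continue
        for doc_id, score in postings:
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), scored


blocks = [
    (5.0, [(1, 5.0), (2, 1.0)]),
    (0.5, [(3, 0.5), (4, 0.2)]),   # skipped: max 0.5 cannot beat 1.0
    (4.0, [(5, 4.0), (6, 3.0)]),
]
hits, scored = top_k_with_block_max(blocks, k=2)
print(hits, scored)
```

Because skipped postings are never visited, the exact number of matching documents is unknown afterwards, which is the trade-off the slide refers to.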
Questions?
Thank you!
Steve Rowe
Senior Software Engineer, Lucidworks
@steven_a_rowe
#Activate18 #ActivateSearch

 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Lucene/Solr 8: The Next Major Release Steve Rowe, Lucidworks

  • 1. Lucene/Solr 8: The next major release • Steve Rowe, Senior Software Developer, Lucidworks • @steven_a_rowe • #Activate18 #ActivateSearch
  • 2. Agenda • Recent release cadence • 7.X • 8.0 • 8.X
  • 3. 7.X average: 11 weeks • 6.X average: 10 weeks
  • 4. 7.X 1. Metrics 2. Autoscaling 3. CDCR 4. Time Routed Aliases 5. Replica types 6. Streaming expressions 7. JSON facet API 8. Configset / schema 9. Text Analysis / ML 10. Collections API 11. Queries 12. Large index segment merging 13. Replication / recovery / rolling updates 14. Block-join / nested docs 15. Miscellaneous
  • 5. 7.X: Metrics • Continuation of 6.X work to support Autoscaling efforts • 7.0: - Aggregated metrics collected in overseer
 - solrconfig.xml <jmx> ➞ solr.xml <metrics><reporter> • 7.1: Prometheus metrics exporter contrib • 7.4: /admin/metrics/history API: basic long-term key metric time series aggregation • Fixed-width windows at
 several resolutions • Not yet in Admin UI:
 SOLR-12426
  • 6. 7.X: Autoscaling • 7.0: - Preferences and policy DSL: flexible replica placement
 [ { minimize: cores }, { maximize: freedisk } ]
 { replica: "<2", shard: "#EACH", node: "#ANY" }
 - Diagnostics API: return sorted nodes, policy violations • 7.1: - autoAddReplicas ported to autoscaling framework
 - Add/remove/suspend/resume triggers and listeners
 - Triggers for added and lost nodes
 - ComputePlanAction / ExecutePlanAction
 - /autoscaling/history API: cluster events and actions • 7.2: - Search rate trigger
 - /autoscaling/suggestions API
 - UTILIZENODE collections API command
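The preference and policy snippets on this slide can be wrapped into autoscaling API request bodies. The following is a minimal sketch that only builds the JSON payloads and does not contact Solr; the command names set-cluster-preferences and set-cluster-policy are assumptions based on the 7.X autoscaling write API and are not shown in the slides.

```python
import json


def cluster_preferences():
    # Prefer nodes with the fewest cores, then nodes with the most free disk.
    # "set-cluster-preferences" is an assumed command name (not in the slides).
    return {
        "set-cluster-preferences": [
            {"minimize": "cores"},
            {"maximize": "freedisk"},
        ]
    }


def cluster_policy():
    # Fewer than 2 replicas of each shard on any node.
    # "set-cluster-policy" is an assumed command name (not in the slides).
    return {
        "set-cluster-policy": [
            {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
        ]
    }


if __name__ == "__main__":
    print(json.dumps(cluster_preferences()))
    print(json.dumps(cluster_policy()))
```

Each payload would be sent as the body of a POST to the cluster's autoscaling endpoint; check the endpoint path against the reference guide for your Solr version.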
  • 7. 7.X: Autoscaling • 7.3: - Simulation framework
 - Arbitrary metric threshold trigger
 - Scheduled trigger
 - Admin UI to display and execute suggestions
  • 8. 7.X: Autoscaling • 7.4: - Periodic house-keeping task: cleans up inactive shards
 - Index size trigger: document count or size in bytes • 7.5: - Policy replica attribute: #ALL, #EQUAL, percentage,
 range, and floating point values
 - Policy cores attribute: #EQUAL, percentage, 
 range, and floating point values
 - Percentage in freedisk policy attribute
 - Simulation framework: test scaling up to 1 billion docs
  • 9. 7.X: Cross Data Center Replication • 7.2: Support bi-directional syncing of CDCR clusters. This is not active-active, but rather passive-active or active-passive: only one active cluster at a time.
  • 10. 7.X: Time Routed Aliases • 7.3: - Specialization of Solr’s collection alias feature
 - Support time series data, e.g. logs / sensor data
 - Maintain performance under continuous indexing
 - CREATEALIAS: start, interval, retention policy
 - Automatically create new collections
 - Automatically delete old collections (optional)
 - Route updates based on timestamp
 - Search against all aliased collections* • 7.5: Preemptively create the next collection when updates
 are near the latest collection’s end date (optional)
 * Pending optimization: minimize queried collections (SOLR-9562)
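As an illustration of the CREATEALIAS options listed on this slide (start, interval, retention, preemptive creation), here is a hedged sketch that assembles a Collections API URL without sending it. The router.* and create-collection.* parameter names are assumptions taken from the 7.X reference guide rather than from the slides, and the host, alias, field, and configset names are hypothetical.

```python
from urllib.parse import urlencode


def create_time_routed_alias_url(base_url, alias, field, start, interval,
                                 config_name, auto_delete_age=None,
                                 preemptive_create_math=None):
    """Build (but do not send) a CREATEALIAS request for a time routed alias."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "router.name": "time",          # assumed parameter names, see lead-in
        "router.field": field,          # timestamp field used to route updates
        "router.start": start,          # start of the first collection
        "router.interval": interval,    # date math for each collection's span
        "create-collection.collection.configName": config_name,
        "create-collection.numShards": 1,
    }
    if auto_delete_age is not None:
        # Optional retention policy: delete collections older than this.
        params["router.autoDeleteAge"] = auto_delete_age
    if preemptive_create_math is not None:
        # Optional (7.5): create the next collection ahead of time.
        params["router.preemptiveCreateMath"] = preemptive_create_math
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    print(create_time_routed_alias_url(
        "http://localhost:8983/solr", "logs", "timestamp_dt",
        "NOW/DAY", "+1DAY", "_default",
        auto_delete_age="/DAY-90DAYS", preemptive_create_math="90MINUTES"))
```

The two optional arguments correspond to the slide's "(optional)" items: automatic deletion of old collections and preemptive creation of the next one.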
  • 11. 7.X: Replica types • 7.0:
 Type        | Indexes locally | Supports soft commit & RTG | Pulls segments from leader | Writes to TLog | Can become shard leader | Queryable
 NRT         | ✅              | ✅                         |                            | ✅             | ✅                      | ✅
 TLOG leader | ✅              | ✅                         |                            | ✅             | ✅                      | ✅
 TLOG        |                 |                            | ✅                         | ✅             | ✅                      | ✅
 PULL        |                 |                            | ✅                         |                |                         | ✅
 • 7.4: Query param to prioritize replicas by type, e.g. shards.preference=replica.type:PULL,replica.type:TLOG
  • 12. 7.X: Streaming expressions • Parallel computation function suite • Some use cases: MapReduce, aggregations, parallel SQL, pub/sub messaging, graph traversal, machine learning, statistical programming • Each 7.X release has added many new functions • 7.5: Ref guide: Math Expressions User Guide
  • 13. 7.X: JSON Facet API • 7.0: Terms facets: added optional refinement support • 7.4: Semantic Knowledge Graph support via new 
 relatedness() aggregate function • Finds ad-hoc relationships by scoring documents relative to foreground and background document sets • 7.5: Heatmap facet support
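To make the relatedness() bullet concrete, the sketch below builds request parameters for a terms facet whose buckets are scored against a foreground and a background query. It only constructs the parameters and does not call Solr. The field name hobby_s, the queries, and the parameter names fore and back are hypothetical placeholders; the general shape follows the reference guide's JSON Facet examples as I understand them.

```python
import json


def relatedness_facet_params(field, foreground_query, background_query):
    """Build request params for a terms facet sorted by relatedness()."""
    facet = {
        "related_terms": {
            "type": "terms",
            "field": field,
            # Sort buckets by the relatedness score computed below.
            "sort": {"r1": "desc"},
            "facet": {
                # $fore and $back refer to the request params of the same name.
                "r1": "relatedness($fore,$back)",
            },
        }
    }
    return {
        "q": "*:*",
        "rows": 0,
        "fore": foreground_query,   # foreground document set
        "back": background_query,   # background document set
        "json.facet": json.dumps(facet),
    }


if __name__ == "__main__":
    for key, value in relatedness_facet_params(
            "hobby_s", "age:[35 TO *]", "*:*").items():
        print(f"{key}={value}")
```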
  • 14. 7.X: Configsets / schema • 7.0: - _default configset
 - Data-driven schema: auto-guessed text fields indexed 2 ways: • tokenized for search • strings for sorting/faceting: "*_str" string field, max 256 chars - Turn off data-driven schema functionality:
 curl http://host:8983/solr/mycollection/config 
 -d "{ set-user-property: { update.autoCreateFields: false }}" • 7.5: Disable configset upload: -Dconfigset.upload.enabled=false
  • 15. 7.X: Text analysis / machine learning • 7.1: Bengali normalizer and stemmer • 7.2: Enable off-ZooKeeper storage of large (>1MB) LTR models • 7.3: OpenNLP integration: tokenization, POS tagging, phrase
 chunking, lemmatization, NER, language detection • 7.4: - ProtectedTermFilterFactory: don’t filter protected terms
 - TaggerRequestHandler (a.k.a. SolrTextTagger): NER • 7.5: - "nori" Korean morphological text analysis: "*_txt_ko"
 - PhrasesIdentificationComponent: identify and score
 candidate query phrases based on index statistics
 - UIMA integration removed
  • 16. 7.X: Collections API • 7.3: Add collection level properties similar to cluster properties • 7.4: Cluster-wide defaults for numShards, nrtReplicas,
 tlogReplicas, pullReplicas • 7.5: - Support co-locating replicas of two or more collections
 together in a node via the withCollection parameter
 to the CREATE and MODIFYCOLLECTION commands
 - SPLITSHARD: New split method using hard links: splitMethod=link • 3-5 times faster than the original splitMethod=rewrite • Slows down replication • Increases disk usage on replica nodes
  • 17. 7.X: Queries • 7.1: JSON query DSL
 curl http://localhost:8983/solr/books/query -d '
 {
   query: {
     bool: {
       must: [
         "title:solr",
         {lucene: {df: content, query: "lucene solr"}}
       ],
       must_not: [
         {frange: {u: 3.0, query: ranking}}
       ]
     }
   }
 }'
  • 18. 7.X: Queries • 7.2: New synonymQueryStyle field type option: enable
 generation of appropriate queries for hierarchical
 relations between overlapping terms • as_same_term (default): SynonymQuery(bird,robin) • pick_best: Dismax(bird,robin) • as_distinct_terms: (bird OR robin) • 7.4: JSON query DSL: Enable query/filter tagging,
 e.g. { "#colorfilt" : "color:blue" } 
 equivalent to local-param {!tag=colorfilt}color:blue
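The equivalence stated on this slide between a tagged JSON filter and the local-param form can be sketched as a small conversion. This is an illustration only, not Solr's parser; the function name is hypothetical, and it assumes the tagged value is a plain query string.

```python
def json_filter_to_local_params(filter_clause):
    """Convert {"#tag": "query"} into "{!tag=tag}query".

    Plain string filters are returned unchanged.
    """
    if isinstance(filter_clause, str):
        return filter_clause
    if len(filter_clause) != 1:
        raise ValueError("expected exactly one '#tag' key")
    (key, query), = filter_clause.items()
    if not key.startswith("#"):
        raise ValueError("tag keys must start with '#'")
    return "{!tag=" + key[1:] + "}" + query


if __name__ == "__main__":
    print(json_filter_to_local_params({"#colorfilt": "color:blue"}))
```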

  • 19. 7.X: Large index segment merging • Problem: Overly large segments (e.g. as a result of force-merge/optimize) stop being eligible for merging, and can start accumulating >50% deleted documents, wasting space and skewing index stats. • 7.5: - TieredMergePolicy now respects maxSegmentSizeMB by default when executing force-merge/optimize and expunge-deletes
 - TieredMergePolicy’s reclaimDeletesWeight has been replaced with a new deletesPctAllowed setting to control how aggressively deletes should be reclaimed
  • 20. 7.X: Replication/recovery/rolling upgrades • 7.3: The old Leader-Initiated-Recovery (LIR) implementation
 is deprecated and replaced • To perform a rolling upgrade to Solr 8, you must be on Solr 7.3 or higher • 7.4: - IndexFetcher now skips fetching identical files
 - Buffering updates are written to a separate TLog
 - Parallel replay of buffering TLogs
  • 21. 7.X: Block-join / nested documents • 7.3: Added filters and excludeTags local-params for
 {!parent} and {!child} query parsers, usable for
 multi-select faceting • 7.5: WIP: Allow Solr to more faithfully represent deeply
 nested document relationships, rather than requiring
 reconstruction based on the flattened list of child docs
 returned by Solr
  • 22. 7.X: Miscellaneous • 7.3: add-distinct atomic updates • 7.4: - Ignore large document URP
 - TLog: maxSize auto hard-commit setting
 (in addition to maxDocs & maxTime) • 7.5: Custom cluster properties allowed with ext. prefix
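For the add-distinct atomic update mentioned on this slide, the following sketch shows the shape of an update document and a local simulation of the intended effect (append only values not already present). The simulation illustrates the semantics as I understand them and is not Solr's implementation; the document id and field name are hypothetical.

```python
def add_distinct_update(doc_id, field, values):
    """Build an atomic update document using the add-distinct modifier."""
    return {"id": doc_id, field: {"add-distinct": list(values)}}


def apply_add_distinct(existing, new_values):
    """Simulate add-distinct: keep existing order, append unseen values."""
    result = list(existing)
    for value in new_values:
        if value not in result:
            result.append(value)
    return result


if __name__ == "__main__":
    print(add_distinct_update("book1", "tags_ss", ["search", "lucene"]))
    print(apply_add_distinct(["solr", "search"], ["search", "lucene"]))
```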
  • 23. 8.0 • Autoscaling • Index upgrades • HTTP/2 • Miscellaneous
  • 24. 8.0: Autoscaling • Suggestions API: rebalance options even if no violations • Suggestions API: add-replica for lost replicas • maxOps limit for index size trigger • Autoscaling policy framework will be the default replica placement strategy
  • 25. 8.0: Index upgrades • 7.0: Lucene indexes record the major Lucene version that
 created the index, and the minimum Lucene version
 that contributed to segments. • 8.0: Version N-2 or older indexes will now fail to open,
 even if they have been merged into an N-1 index. • IndexUpgrader will not upgrade 6.X or earlier indexes • Re-indexing will be required to upgrade
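The index upgrade rule above can be written as a one-line check. This is a sketch of the rule as stated on the slide, not Lucene's actual code: it assumes the relevant value is the major version that originally created the index, which Lucene records since 7.0. The function name is hypothetical.

```python
def can_open_index(created_major, current_major):
    """Return True if an index created by `created_major` can be opened.

    Per the slide: indexes created by version N-2 or older fail to open,
    even if all segments were later rewritten by version N-1.
    """
    return current_major - 1 <= created_major <= current_major


if __name__ == "__main__":
    for created in (6, 7, 8):
        print(created, can_open_index(created, 8))
```

Under this rule an index first created with 6.X cannot be opened by 8.0 regardless of later merges under 7.X, which is why the slide says re-indexing is required.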
  • 26. 8.0: HTTP/2 • May 2018: Mark Miller announced his Star Burst effort:
 many cleanups and performance enhancements • July 2018: Cao Manh Dat took up the HTTP/2 aspects: SOLR-12639 • Indexing test: 33M docs, 1 shard, 2 replicas (SOLR-12642) • Garbage: Leader: 26% less; replica: 76% less • Indexing throughput: 54% higher • CPU time: Leader: 39% higher; replica: 76% lower • Ready to merge back to master, pending release of
 Jetty 9.4.13, containing SPNEGO HTTP/2 implementation
  • 27. 8.0: Miscellaneous • Lucene: scores must be non-negative • Function(Score)Query-s convert negative scores to zero • TODO: remove deprecations • Trie fields? Removal effectively blocked by: • SOLR-12074: Add numeric equivalent to StrField • SOLR-11127: Mechanism to migrate schema for .system collection (a.k.a. blob store) schema from Trie (pre-7.0) to Points (7.0+)
  • 28. 8.X • Lucene/Solr minimum JDK • Luke: Lucene Toolbox • New Lucene features
  • 29. 8.X: Lucene/Solr minimum JDK • Oracle will end free JDK 8 support in January 2019 • Both JDK 9 & 10 are already EOL, no more Oracle support • JDK 11 will very likely be next minimum supported JDK, no schedule yet • Under JDK 9+, Solr’s Hadoop-related functionality has problems, including with Kerberos • Uwe Schindler’s Jenkins server tests Lucene/Solr on Oracle 9+10+11+12 JDKs • All have higher Solr test failure rates than on JDK 8
  • 30. 8.X: Luke: UI framework & licensing • Andrzej Bialecki: Initial implementation: Thinlet, GPL • Mark Harwood: GWT • Mark Miller: Apache Pivot • Dmitry Kan and Tomoko Uchida took ownership on Github • Tomoko Uchida: JavaFX (bundled w/JDK 8) • LUCENE-2562: Make Luke a Lucene/Solr Module • JavaFX/OpenJFX unbundled from Java 11 JDK, GPL+CPE • Tomoko Uchida: Swing (7.5 release available)
  • 31. 8.X: New Lucene features • Index impacts, Block-Max WAND, similarity cleanups • Some queries (especially term queries and disjunctions) are much faster when number of hits is not required • FeatureField: incorporate static relevance signals, e.g. PageRank • Soft deletes • Merge policy retains deleted docs according to policy • Enables document history, e.g. for time-travel indexes • RAMDirectory replaced by ByteBuffersDirectory
  • 33. Thank you! Steve Rowe Senior Software Engineer, Lucidworks @steven_a_rowe #Activate18 #ActivateSearch