4. INTRODUCTION
• A full text search server based on Lucene
• XML/HTTP Interfaces
• Loose Schema to define types and fields
• Web Administration Interface
• Extensive Caching
• Index Replication
• Extensible Open Architecture
• Written in Java 5, deployable as a WAR
7. • Advanced full-text search
• Optimized for high traffic volume
• Standards-based open interfaces: XML, JSON and HTTP
• Comprehensive administration interfaces
• Near real-time indexing
• Extensible plugin architecture
• Multiple search indices
• Apache UIMA
• Rich document parsing
• Advanced storage options
• Performance optimization
FEATURES
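As a minimal sketch of the HTTP interface listed above, the snippet below builds a search URL for Solr's `/select` handler and asks for a JSON response. The host, port and core name (`demo`), the helper name `build_select_url`, and the field `title` are assumptions for illustration only; `q`, `wt`, `rows` and `start` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, response_format="json", rows=10, start=0):
    """Return a URL for Solr's /select handler (hypothetical helper)."""
    params = {
        "q": query,             # the full-text query
        "wt": response_format,  # response writer, e.g. "json" or "xml"
        "rows": rows,           # page size
        "start": start,         # offset of the first result
    }
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Assumed local instance and core name; adjust to the actual deployment.
url = build_select_url("http://localhost:8983/solr/demo", "title:lucene", rows=5)
print(url)
```

The resulting URL can be fetched with any HTTP client; switching `response_format` to `"xml"` requests the XML form of the same result.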
9. • XML/HTTP and JSON APIs
• Hit highlighting
• Faceted Search and Filtering
• Geospatial Search
• Fast Incremental Updates and Index Replication
• Caching
• Replication
• Web administration interface
FUNCTIONS
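To illustrate faceted search and hit highlighting, the sketch below prepares the request parameters (`facet`, `facet.field`, `hl`, `hl.fl`) and turns the flat facet list of a JSON response into a dictionary. The helper names, the field names `category` and `title`, and the sample response are made up for illustration; the response layout assumes Solr's default JSON format, where facet values and counts alternate in one list.

```python
def facet_highlight_params(facet_fields, highlight_fields):
    """Build extra query parameters for faceting and highlighting."""
    params = [("facet", "true"), ("hl", "true")]
    params += [("facet.field", name) for name in facet_fields]
    params.append(("hl.fl", ",".join(highlight_fields)))
    return params


def parse_facet_counts(response, field):
    """Convert ["value", count, "value", count, ...] into {value: count}."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Illustrative response fragment, not output from a real server.
sample_response = {
    "response": {"numFound": 4, "start": 0, "docs": []},
    "facet_counts": {"facet_fields": {"category": ["books", 3, "music", 1]}},
}

params = facet_highlight_params(["category"], ["title"])
counts = parse_facet_counts(sample_response, "category")
print(params)
print(counts)
```

The parameter list can be passed to `urllib.parse.urlencode` alongside `q` to form the full request.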
13. Performance Factors
• Schema design
– Number of indexed fields
– omitNorms
– Term vectors
– DocValues
• Configuration
– mergeFactor
– Caches
• Indexing
– Bulk updates
– Commit strategy
– Optimize
• Querying
PERFORMANCE
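The bulk-update and commit-strategy factors above can be sketched as follows: documents are grouped into batches, and each batch is sent to the `/update` handler with the `commitWithin` parameter, so Solr schedules the commit itself instead of committing per document. The batch size, the 10-second window, the core name `demo` and the helper names are assumptions, not recommendations from the slides; the code only prepares the requests and does not contact a server.

```python
import json
from itertools import islice


def batched(docs, size):
    """Yield successive lists of at most `size` documents."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_update_requests(base_url, docs, batch_size=1000, commit_within_ms=10000):
    """Yield (url, json_body) pairs, one per batch of documents."""
    url = f"{base_url.rstrip('/')}/update?commitWithin={commit_within_ms}"
    for chunk in batched(docs, batch_size):
        yield url, json.dumps(chunk)


# Five toy documents split into batches of two.
docs = [{"id": str(i)} for i in range(5)]
requests_to_send = list(
    build_update_requests("http://localhost:8983/solr/demo", docs, batch_size=2)
)
print(len(requests_to_send))
```

Each body would be POSTed with `Content-Type: application/json`. Fewer, larger requests and fewer explicit commits generally reduce indexing overhead, at the cost of documents becoming searchable slightly later.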
14. 1. Memory testing: SOLR response time for a 1-million-volume index on 8 GB and 32 GB instances.
Source: www.hathitrust.org
PERFORMANCE
15. 2. SOLR index size analysis for a Twitter dataset
Source: www.dzone.com
PERFORMANCE
17. PROS
• Easy monitoring
• Highly scalable
• Fault tolerant
• Flexible and adaptable with easy configuration
• Performance optimization
• Highly configurable and user-extensible caching
• Freely available
• Multilingual support
• Easy implementation and setup
• Less resource utilization
CONS
• A general lack of commitment towards SOLR
• Less attention to JVM settings and garbage collection
• Increased latency
• Occasional large I/O load to replicate large merges
• Complicated load balancing and management
• Reconfiguration if the master is lost
PROs & CONs
19. • Out-of-the-box (OOTB) simple faceted browsing
• Automatic Database Indexing
• Federated Search
– HA with failover
• Alternate output formats (JSON, Ruby)
• Highlighter integration
• Spellchecker
• Alternate APIs (Google Data, OpenSearch)
FUTURE TRENDS