This document summarizes the key changes to search in SharePoint 2013 compared to previous versions like FAST. The main changes include a new unified search architecture with improved user experience, extensibility through new APIs, and a focus on continuous crawling and analytics. Components like the crawl, index, and query processing were overhauled for better performance and scale.
2. Marcus Johansson
• Senior Consultant, Comperio
• V-TSP Enterprise Search, Microsoft
• Course instructor, Cornerstone
Email: marcus.johansson@comperiosearch.com
Twitter: @marcjoha
Blog: http://blog.comperiosearch.com
LinkedIn: http://www.linkedin.com/in/marcusjohansson
3. WHAT HAPPENED TO FAST?
FAST was a true Enterprise Search platform, so…
4. The evolution of FAST
[Diagram: FAST product lineage. Labels: FDS, ESP, FSIS, FSIA, FS4SP, Search in SP2010, Search in SP2013, "Secret sauce (incl. Mars)"]
5. End of an era, birth of a New age
• FAST now “fully integrated” into SP2013
– True, but there’s more!
• No longer a “FAST license”
– SP2013 contains everything
– Enterprise version
1997 – 2013
7. User Experience is finally key
• Revamped user/admin interface
• Hover panels, previews
• Query rules, result blocks
• Result types, display templates
• “You’ve seen this result before”
• Query Builder
• Content Search web part
• Etc.
8. For the first time, search isn't defined by the nuts and bolts, but by the user experience and the high-level tools around it.
17. Keeping it all together
Services Processes
Process name               Description
hostcontrollerservice.exe  Process controller. Monitors and restarts children.
noderunner.exe             A search component (except the crawl component).
mssearch.exe               The crawl component.
19. Continuous crawls
• Not event-driven indexing
• Starts crawl regardless of prior crawl session
• Large change sets no longer bad for freshness
• Only available for SharePoint content sources
– Possible to crawl SP 2010 and 2007
[Timeline diagram: continuous crawls compared with full and incremental crawls over time. Continuous crawl interval: default 15 min]
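The contrast between continuous crawls and full/incremental crawls can be illustrated with a small toy model. This is only a sketch, not SharePoint's actual scheduler: the function names are hypothetical, and it assumes that a non-continuous crawl simply waits for the previous session to finish, while a continuous crawl starts at every interval regardless.

```python
def incremental_starts(interval, durations):
    """Toy model: a crawl waits for its schedule slot AND for the prior crawl to end."""
    starts, prev_end = [], 0
    for i, duration in enumerate(durations):
        start = max(i * interval, prev_end)
        starts.append(start)
        prev_end = start + duration
    return starts


def continuous_starts(interval, durations):
    """Toy model: a crawl starts on schedule, regardless of the prior crawl session."""
    return [i * interval for i in range(len(durations))]


# Durations in minutes; the first crawl hits a large change set (assumed numbers).
durations = [40, 10, 10]
print(incremental_starts(15, durations))  # [0, 40, 50]
print(continuous_starts(15, durations))   # [0, 15, 30]
```

In this model the large first change set delays every later incremental crawl, while the continuous crawls keep their 15-minute cadence, which is why large change sets hurt freshness less.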
20. Crawl health reports
• Rate
• Latency
• Freshness
• CPU and memory load
• Content processing activity
• Etc.
[Screenshots: crawl rate per type; crawl load]
21. Content processing component
• Schema mapping
– Crawled properties to managed properties
• Entity extraction
– Companies and custom
• Advanced Filter Pack is gone
– PDFs are out of the box
• Extensible through web service
• Internally: processing flows
– Replaces Python pipeline
[Diagram label: Link]
23. Index component
• Proprietary disk-based index
• Discrete portions called partitions
• 1 partition per 10M docs
• Each partition contains 1+ replicas for fault tolerance and query volume
• 1 replica, 1 server
• All servers perform indexing (partially in-memory)
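The sizing rules on this slide reduce to simple arithmetic. The sketch below is a back-of-the-envelope helper, not an official sizing tool: the function name and parameters are hypothetical, and it takes "1 partition per 10M docs" and "1 replica, 1 server" literally.

```python
import math

DOCS_PER_PARTITION = 10_000_000  # rule of thumb from the slide


def index_layout(total_docs, replicas_per_partition=1):
    """Return (partitions, index_servers) for a given corpus size."""
    if replicas_per_partition < 1:
        raise ValueError("each partition needs at least one replica")
    partitions = max(1, math.ceil(total_docs / DOCS_PER_PARTITION))
    servers = partitions * replicas_per_partition  # one replica per server
    return partitions, servers


# 25M documents with 2 replicas per partition (assumed example numbers).
print(index_layout(25_000_000, replicas_per_partition=2))  # (3, 6)
```

Adding partitions scales the index with document count, while adding replicas scales fault tolerance and query volume, so the two factors multiply into the server count.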
25. Query processing component
• Prepares the queries
– Query rules
– Result sources
– Linguistics/dictionaries
– Etc.
• Manipulates the results
– Display templates
– Late security trimming
– Etc.
• Internally: processing flows
– No custom processing as in Content Processing
– Still MAJOR improvement
26. Query rules
• For a certain term, trigger a certain action:
– Add/change query terms
– Use alternate sorting/relevance
– Hybrid search (or other federated results)
– Etc.
• Replaces search keywords in SP2010
• Configure at farm, site collection, or site level
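The "term triggers action" idea can be sketched in a few lines. This is a conceptual illustration only and does not use the SharePoint API: `QueryRule`, `apply_rules`, and the example rule are all hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryRule:
    trigger_term: str              # term that fires the rule
    action: Callable[[str], str]   # rewrites the query when fired


def apply_rules(query: str, rules: List[QueryRule]) -> str:
    """Apply every rule whose trigger term occurs in the original query."""
    terms = query.lower().split()
    for rule in rules:
        if rule.trigger_term in terms:
            query = rule.action(query)
    return query


# Hypothetical rule: add a query term when the user mentions "presentation".
rules = [QueryRule("presentation", lambda q: q + " filetype:pptx")]

print(apply_rules("sales presentation", rules))  # sales presentation filetype:pptx
print(apply_rules("sales report", rules))        # sales report
```

Only the "add/change query terms" action is modeled here; alternate sorting or federated result blocks would be further actions of the same shape.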
29. Query health reports
• Trend
• Overall
• Latency in main flow
• Latency in each subflow
• Index times
• Etc.
[Screenshot: latency per processing node in SharePoint flow]
30. Analytics processing component
• Analyzes crawled items and search usage
• Updates index without re-indexing documents
• Result: relevance becomes self-learning
– Also: search reports and recommendations
[Diagram labels: Link, Analytics, Reporting]
31. Search reports
• Self-learning relevance aside, never underestimate manual effort!
– Query rules, synonyms, boosts, etc.
• Automatic reports:
– Number of queries
– Top queries
– Abandoned queries
– No-result queries
– Query rule usage
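To make the report types concrete, here is a minimal sketch of how such figures could be derived from a query log. The log format and the function name are assumptions for illustration, and "abandoned" is taken to mean a query that returned results but got no click; SharePoint's own definitions and storage may differ.

```python
from collections import Counter


def search_report(log):
    """log: iterable of (query, result_count, clicked) tuples (assumed format)."""
    log = list(log)
    return {
        "number_of_queries": len(log),
        "top_queries": Counter(q for q, _, _ in log).most_common(3),
        # Assumption: abandoned = results were shown, but nothing was clicked.
        "abandoned_queries": sorted({q for q, n, clicked in log if n > 0 and not clicked}),
        "no_result_queries": sorted({q for q, n, _ in log if n == 0}),
    }


# Made-up sample log for illustration.
sample_log = [
    ("vacation policy", 12, True),
    ("vacation policy", 12, False),
    ("expnse report", 0, False),
    ("org chart", 5, False),
]
report = search_report(sample_log)
print(report["number_of_queries"])   # 4
print(report["no_result_queries"])   # ['expnse report']
```

No-result and abandoned queries are the natural starting points for the manual effort mentioned above, such as adding synonyms or query rules.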
32. Search administration component
• Provisions other search components
• Talks to the Admin database on behalf of the crawl, content processing, and query processing components
• In previous FAST products, it was hard to make the admin component redundant
– Not the case in SP2013!
– Scale appropriately
[Diagram label: Admin]
33. Hardware properties
• Highlights
– In-memory technology
– VMs now supported for production
– SANs less problematic
Component CPU Memory Disk I/O Network
Crawl Medium Medium Medium High
Content processing High High Medium
Index High High High Medium
Query processing Low Medium Medium
Analytics processing Medium Medium Medium High
Search administration Low Low Low Low