No notes for this slide
Review "admin can control DB size"
Note: SharePoint does not honor the autogrowth settings of the model DB
Note: this illustrates a heavily write-oriented farm; otherwise, swap the content DB and content DB log files
Maximum of 8 data files on SQL Server 2012 when the number of cores > 8
TODO : link DB
Okay, now we’re going to start digging in with a quick high-level overview of the new search logical architecture. You’ll see here all of the components of the new search engine. We have all our content sources on the left that are going to be picked up by the new crawl component. The crawl component tracks what it needs to crawl, and whether each crawl was successful, in the crawl database. It hands off the content it crawls to the new content processing component. In the CPC we break down what was crawled into the individual pieces of data that will get stored in the index. It also writes to the link database to track all the different anchor content we’ve crawled, and then we push the data to the index component. The index component handles the process of creating the physical index, as well as responding to queries. The queries are delivered by the query processing component. It takes the individual queries from a client application, figures out the one-to-many queries required to execute them, and then aggregates the results and sends them back to the client. Periodically a timer job will run and perform analytics on the SharePoint content. This includes what content has been accessed through different query results, as well as what content has been accessed by users just clicking in and around different SharePoint sites. It pulls in data from the link database for additional relevance data and stores the reports in the analytics reporting database; it also pushes a small amount of analytical data back to the content processing component so that it can be included in the index. The search administration component is responsible for keeping it all up and running and healthy, and it stores its configuration in the search admin database.
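The crawl → content processing → index → query flow described above can be sketched as a simple pipeline. This is a minimal illustration only; all class, function, and field names here are assumptions for the sketch, not actual SharePoint APIs.

```python
# Minimal sketch of the search pipeline described above.
# All names and data shapes are illustrative assumptions, not real APIs.

def crawl(content_sources, crawl_db):
    """Crawl component: picks up items and records crawl status in the crawl DB."""
    for item in content_sources:
        crawl_db[item["url"]] = "crawled"          # track success per item
        yield item

def process(items, link_db):
    """Content processing component: breaks items into indexable terms
    and records anchor links in the link DB."""
    for item in items:
        link_db.setdefault(item["url"], []).extend(item.get("links", []))
        yield {"url": item["url"], "terms": item["body"].split()}

def build_index(processed):
    """Index component: builds an inverted index (term -> set of URLs)."""
    index = {}
    for doc in processed:
        for term in doc["terms"]:
            index.setdefault(term, set()).add(doc["url"])
    return index

def query(index, text):
    """Query processing component: fans a query out per term and aggregates."""
    results = [index.get(term, set()) for term in text.split()]
    return set.intersection(*results) if results else set()

sources = [{"url": "http://site/a", "body": "sales report", "links": []},
           {"url": "http://site/b", "body": "sales forecast", "links": ["http://site/a"]}]
crawl_db, link_db = {}, {}
idx = build_index(process(crawl(sources, crawl_db), link_db))
print(query(idx, "sales report"))   # {'http://site/a'}
```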
Now we’ll dig into each one of these components in more detail.
Those components are responsible for creating or modifying the index.
Other pieces outside the classic search architecture (e.g. APE) also consume and modify the index.
We will break down each component’s role and responsibilities in the next slides
Sid Shah: I like the way you’ve abstracted the topology model. For the content and query processing components, I’d recommend also mentioning CTS/IMS respectively because those are internally used terms – more like an AKA.
New search architecture, a single one. No more FAST, FAST4SP, or SharePoint Search.
We’ve made some big crawler improvements over the previous version of SharePoint. Previously, you could have one to many crawler components on a server, but each component was associated with only one crawl database. If some crawl databases had items to crawl and others did not, you could end up in a situation where some crawlers were busy trying to keep up with changes, while other crawlers sat idle, doing nothing, because their crawl databases were empty. In SharePoint 2013 we move from having crawl components to crawl roles. Unlike SharePoint 2010, you can’t have multiple crawl components on a single server – a server is either a crawler or it’s not. Now, though, ALL crawler roles talk to ALL crawl databases. So whenever there is content to crawl, every crawl server in the farm can work on it. Each crawler role pulls items that it needs to crawl out of the crawl database, does the work, sends the data to the content processing component, and updates the crawl status in the database. One thing that’s also different from SharePoint 2010 is that we no longer have host distribution rules, where you could pick certain crawl components to crawl a particular URL. Instead, a host can now be distributed across multiple crawl databases. That gives us much greater scalability. The search admin component will spread out content for a URL, when needed, across multiple crawl databases.
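The "all crawler roles talk to all crawl databases" model above behaves like a shared work queue: no crawler sits idle while any crawl DB still has pending items. A minimal sketch, with all database and URL names made up for illustration:

```python
from collections import deque

# Sketch: any crawler role can pull the next pending item from any crawl DB,
# unlike SharePoint 2010 where a crawler was tied to one crawl database.
# Names are illustrative assumptions, not SharePoint APIs.
crawl_dbs = {
    "CrawlDB1": deque(["http://hr/1", "http://hr/2"]),
    "CrawlDB2": deque(),                # empty - would have idled a 2010 crawler
    "CrawlDB3": deque(["http://sales/1"]),
}

def next_item(dbs):
    """Return the next pending (db, url) pair from whichever DB has work."""
    for name, queue in dbs.items():
        if queue:
            return name, queue.popleft()
    return None

crawled = []
while (work := next_item(crawl_dbs)) is not None:
    db, url = work
    crawled.append(url)   # crawl the item, hand it to content processing,
                          # then update the crawl status back in `db`

print(crawled)   # ['http://hr/1', 'http://hr/2', 'http://sales/1']
```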
The same host gets distributed across multiple crawl databases, which splits the work among multiple crawl components.
The SharePoint URLs are partitioned by content databases.
For huge content databases, the content gets subpartitioned across different crawl DBs
Subpartitioning happens automatically; admins don’t see anything
The system doesn’t automatically rebalance (i.e. make the crawl DBs roughly equal in size); this is something the admin can trigger from the health rules
We are still considering doing this automatically in O15. There is a timer job that does it as of beta 2.
The assignment algorithm is usually not great at the beginning (because we don’t know the size of a content DB upfront), but over time it gets better because new content DBs are assigned to the smaller crawl DBs.
The host distribution rules we had in 2010 no longer exist in “15”?
We removed these rules because we are now smarter and don’t need input from the admin.
If O14 customers used the rules to pin specific hosts to crawlers, they will no longer have that option.
Lead responsible for the BI-SP tools: Kay Unkroth