5. About
Serge Luca
SharePoint MVP
Consultant, speaker, trainer
Managing partner of www.ShareQL.com
Has worked with SharePoint since 2001
Blog: http://sergeluca.wordpress.com/
sergeluca@ShareQL.com
@SergeLuca
Serge Luca
SQLSaturday 323 – Paris 2014
6. About us
ShareQL
Isabelle Van Campenhoudt
MVP SQL
TheSQLgrrrl.wordpress.com
Serge Luca
MVP SharePoint
Sergeluca.wordpress.com
ShareQL
a successful marriage
Nearly 40 years of expertise and experience in the world of databases and SharePoint
7. Agenda
Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability (HA) and disaster recovery (DR)
8. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
9. SharePoint 2013: service catalog
Web Content Management: ***
Document Management: *****
Collaboration (teams & projects): *****
Social: ***
Workflows: *****
Project Management: *****
Enterprise Search: *****
Self Service BI: ****
Integration with LOB: *****
Application development: *****
Forms + Access: *****
Apps market store: ****
E-Discovery: ****
Info Lifecycle: ****
Personal Sites: *****
Enterprise Portal: *****
10. Brief history
2001
• v1 Team Services
2003
• v2 “Windows SharePoint Services v2” and “SharePoint Portal Server 2003”
• First version written in .NET
2007
• v3 “Windows SharePoint Services v3” and “Microsoft Office SharePoint Server 2007”
• Very popular: generated $1.5 billion in revenue
• Quadrant leader according to Gartner
2010
• v4 “Microsoft SharePoint Foundation 2010” and “Microsoft SharePoint Server 2010” + cloud (Office 365)
2013
• v5 “Microsoft SharePoint Foundation 2013” and “Microsoft SharePoint Server 2013” + cloud (Office 365)
2016 ?
11. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability (HA) and disaster recovery (DR)
12. High-level architecture
SharePoint = often 3 farms
The SharePoint farm and the workflow farm use SQL Server
The OWA (Office Web Apps) farm enables viewing/editing of Office documents in the browser
• does not use SQL Server directly
SharePoint farm
Office Web Apps (OWA) farm
Workflow farm
13. SharePoint farm architecture
2 products
Microsoft SharePoint Server 2013
Microsoft SharePoint Foundation 2013 (the SharePoint engine, free)
Microsoft SharePoint Foundation 2013
Browser Clients
Office Clients
…
Microsoft SharePoint Server 2013
.NET Framework and ASP.NET 4.5
Internet Information Services
Windows Server 2008R2/2012/2012R2
SQL Server 2008 R2 or 2012 or 2014
14. SharePoint farm: logical architecture
Site collection
Sites
Lists
Ex: team A
Ex: Project 1 Ex: Project 2 Ex: Project 3
15. SharePoint farm and databases
Farm
Web applications (= IIS web sites)
Site collection
Sites
Lists
Service Application
Content database
Configuration Databases
Service Databases
16. Example of a classic (minimal) farm
2 Web/Query/Application/Central Admin
1 Dedicated Index Server (with Web role to allow it to crawl content)
2 SQL Standard Edition cluster nodes (Active/Passive); mirroring is also an option
Smallest highly available farm
17. …or more complex
Scale up and scale out…
18. In short…
98% of SharePoint content is stored in SQL Server
The farm configuration is stored in the “configuration db”
The content of Central Administration is stored in its “content db”
Most services have at least one db
Every web application has at least one content db
19. …and furthermore
An SP farm often has at least 20 DBs
1 site collection resides in 1 DB
A content DB can hold n site collections (2000 by default)
Tip: 1 site coll > 100 GB -> dedicated DB
The SharePoint admin can “control” the size of the DB
• Quota Templates for 1 site coll
• Maximum Number of Site Collections
20. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
21. Prepare the SQL Server instance(s)
• 2008 R2, 2012 (Enterprise SP1 for BI), 2014
• 1 or n SQL Server instances
• Collation: Latin1_General_CI_AS_KS_WS (for the SharePoint DBs)
• MAXDOP=1
Decide who creates the DBs (configuration, content, services)
• Either the DBA
• Or the SharePoint admin (PowerShell or GUI: Central Administration)
• Avoid the configuration wizard because it creates DBs with GUIDs in their names
22. The SQL Server service: domain account
Named instances (e.g. .SharePoint)
Alias (preferably DNS)
All the DBs can be created in advance
Ask the SP admin which account is the “setup account”
Setup account =
securityadmin server role
dbcreator server role
db_owner on the affected DBs via PowerShell
Description of the DBs
http://technet.microsoft.com/en-us/library/cc678868(v=office.15).aspx
23. A proper SharePoint installation is done via PowerShell using AutoSPInstaller and optionally AutoSPInstallerGUI
An XML file containing the names of all the DBs is created
Check that the alias is actually used
24. Recovery model to use
Model db: recovery model = full
Tempdb: recovery model = simple
SharePoint DBs: recovery model?
Content DB = full
Config DB = simple
Service App DBs = it depends:
• http://technet.microsoft.com/en-us/library/cc678868.aspx
Always On Availability Groups: recovery model = full!
25. File placement
Priority (from the fastest disk to the slowest)
• Tempdb data and transaction log files
• Content DB transaction log files
• Search DB data files (except admin db)
• Content database data files
Use multiple data files for the content DBs and search DBs
• Distribute equally sized data files across separate disks
• Number of data files should be <= number of processor cores
• Multiple data files are not supported for other DBs
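The data file guidance above can be sketched as a small helper. This is only an illustration: the function name `plan_data_files` is hypothetical, and the default cap of 8 files is taken from the speaker note at the end of this deck (max 8 data files on SQL Server 2012 when there are more than 8 cores), not from this slide.

```python
def plan_data_files(total_size_gb: float, cores: int, max_files: int = 8):
    """Return (file_count, size_per_file_gb) for a content or search DB.

    Assumptions: at most one data file per processor core, capped at
    max_files, and all files equally sized so they can be spread
    across separate disks.
    """
    if cores < 1 or total_size_gb <= 0:
        raise ValueError("cores must be >= 1 and total_size_gb must be > 0")
    file_count = min(cores, max_files)
    return file_count, total_size_gb / file_count


if __name__ == "__main__":
    # A 400 GB content DB on a 16-core server, with the cap of 8 files.
    print(plan_data_files(400, 16))
```

The helper only computes a layout; creating the files and placing them on disks remains a DBA task.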
26. Content DB size
Content databases are the largest
• Best practice: avoid > 200 GB/DB (operational reasons)
• 0.5 IOPS/GB recommended
• 4 TB/DB supported, provided there is at least 0.25 IOPS/GB (ideal: 2 IOPS/GB)
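As a rough illustration of the size and IOPS-per-GB thresholds above, here is a minimal sketch. The function name and the returned messages are hypothetical, and it assumes that the 0.25 IOPS/GB minimum applies to databases above the 200 GB best-practice size and that 1 TB = 1024 GB.

```python
def check_content_db(size_gb: float, iops: float) -> str:
    """Classify a content DB against the size and IOPS/GB guidance."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    iops_per_gb = iops / size_gb
    if size_gb > 4 * 1024:  # 4 TB limit (assuming 1 TB = 1024 GB)
        return "unsupported: larger than 4 TB"
    if size_gb > 200:
        if iops_per_gb < 0.25:
            return "unsupported: needs at least 0.25 IOPS/GB"
        return "supported, but above the 200 GB best practice"
    if iops_per_gb < 0.5:
        return "size ok, but below the recommended 0.5 IOPS/GB"
    return "ok"


if __name__ == "__main__":
    print(check_content_db(size_gb=1000, iops=500))
```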
1 DB per site collection, or several site collections in one DB
• Insist on having the SLA of each site collection!
Capacity plan mandatory
Content DB size = ((D × V) × S) + (10 KB × (L + (V × D)))
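The capacity formula above can be evaluated with a few lines of code. In this sketch the variable meanings follow Microsoft's capacity planning guidance as I understand it (D = number of documents, V = average number of versions per document, S = average document size, L = number of list items, 10 KB = estimated metadata per item); the function name and the sample figures are illustrative.

```python
def content_db_size_kb(documents: int, versions: int,
                       avg_doc_size_kb: float, list_items: int) -> float:
    """Estimate content DB size in KB: ((D x V) x S) + (10 KB x (L + (V x D)))."""
    document_storage_kb = documents * versions * avg_doc_size_kb
    metadata_kb = 10 * (list_items + versions * documents)
    return document_storage_kb + metadata_kb


if __name__ == "__main__":
    size_kb = content_db_size_kb(documents=200_000, versions=2,
                                 avg_doc_size_kb=250, list_items=600_000)
    print(f"{size_kb:,.0f} KB (about {size_kb / 1024 / 1024:.0f} GB)")
```

The result is only an estimate of the stored content; it does not cover the transaction log, recycle bin contents or free space for growth.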
27. Avoid defragmenting indexes
A job will defragment the indexes
If fragmentation > 30% & row count > 10,000
A job will update statistics
AUTO_CREATE_STATISTICS OFF
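The threshold the defragmentation job applies can be written as a simple predicate. This is a sketch of the rule as stated on the slide, not the job's actual implementation; the function name is hypothetical.

```python
def job_would_defragment(fragmentation_pct: float, row_count: int) -> bool:
    """True when an index exceeds both thresholds: > 30% fragmentation and > 10,000 rows."""
    return fragmentation_pct > 30.0 and row_count > 10_000


if __name__ == "__main__":
    print(job_would_defragment(fragmentation_pct=45.0, row_count=50_000))
```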
28. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
31. Search databases: IOPS requirements
Database name | IOPS requirements | Typical load on I/O subsystem
Crawl database | Medium to high IOPS | 10 IOPS per 1 document per second (DPS) crawl rate
Link database | Medium IOPS | 10 IOPS per 1 million items in the search index
Search administration database | Low IOPS | Not applicable
Analytics reporting database | Medium IOPS | Not applicable
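The two sizing rules in the table above translate directly into arithmetic. A minimal sketch, with hypothetical function names and sample figures:

```python
def crawl_db_iops(crawl_rate_dps: float) -> float:
    """Crawl database: 10 IOPS per 1 document per second (DPS) of crawl rate."""
    return 10 * crawl_rate_dps


def link_db_iops(indexed_items: int) -> float:
    """Link database: 10 IOPS per 1 million items in the search index."""
    return 10 * indexed_items / 1_000_000


if __name__ == "__main__":
    # Example: crawling at 50 DPS with 20 million items in the index.
    print(crawl_db_iops(50), link_db_iops(20_000_000))
```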
32. Latency
Latency between the web front ends and SQL Server
< 1 ms for 99.9% of the time over 10 minutes
Test scripts
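One possible way to evaluate the latency target from a set of measurements is sketched below. It assumes the latency samples (in milliseconds) were already collected over a 10-minute window by some other tool; the function name and the sample data are hypothetical.

```python
def meets_latency_target(samples_ms, limit_ms=1.0, required_fraction=0.999):
    """True when at least 99.9% of the samples are below the 1 ms limit."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    within_limit = sum(1 for sample in samples_ms if sample < limit_ms)
    return within_limit / len(samples_ms) >= required_fraction


if __name__ == "__main__":
    # 10,000 samples, 5 of them slow: 99.95% are under 1 ms.
    samples = [0.4] * 9_995 + [3.0] * 5
    print(meets_latency_target(samples))
```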
33. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
34. The BI tools in SharePoint are:
Excel Services
Reporting Services
• Is a Service Application (fully managed by SP)
• With Power View
PowerPivot
PerformancePoint (scorecards)
These tools were created by Microsoft’s SQL Server team
The (complex) installation can be done by the SharePoint admin
• From the SQL Server setup
35. Installing the BI components of SharePoint 2013
1 Install SharePoint 2013 (Enterprise) with SQL Server 2012 SP1
2 Install Excel Services
3 Install Analysis Services in SharePoint mode
4 Specify the Analysis Services server in Excel Services
5 Install the Reporting Services add-in and RS in integrated mode
6 Deploy the PowerPivot for SharePoint add-in
36. For BI, Kerberos configuration is essential!
Otherwise
• double-hop problem
• data refresh problem
Define the SPNs
The SQL Server DBA must give the SharePoint admin
• the list of all SQL Server instances
• including Analysis Services (don’t forget the SharePoint instance)
• the ports (check that they are static)
37. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
38. Built-in SharePoint tools (GUI or PowerShell)
Use the SQL Server service in the background
Including for backup compression, encryption, snapshots
“Data” (granular)
Including service DBs
Avoid if site collection > 85 GB
“Farm” = the IIS configs, web.config files, + data
Full or differential
SQL Server tools
More flexible (transaction log backups, etc.)
Stop the SharePoint Timer Service before the restore
! The config DB can be backed up & restored only if the farm is offline!
Third-party tools
39. Introduction
SharePoint architecture
SP installation & SQL Server configuration
The search engine
BI
Backup/restore
High availability
40. SharePoint supports
SQL Server mirroring, log shipping, clustering, availability groups
The Analysis Services instance in SharePoint mode does not support clustering
41. Always On Availability Groups & SharePoint (HA)
SQL 1
FARM 1
SQL 2
High Availability
Synchronous
42. Always On Availability Groups & SharePoint (HA)
SQL 1
FARM 1
SQL 2
Synchronous
High Availability
43. Database Support – Sync Commit
Database | Supported
Admin Content | Yes
App Management | Yes
BDC | Yes
Config | Yes
Content | Yes
Managed Metadata | Yes
PerformancePoint | Yes
PowerPivot | Not Tested
Project | Yes
Search Analytic Reporting | Yes
Search Admin | Yes
Search Crawl | Yes
Search Links | Yes
Secure Store | Yes
State Service | Yes
Subscription Settings | Yes
Translation Services | Yes
UPA Profile | Yes
UPA Social | Yes
UPA Sync | Yes
Usage (= logging DB) | Yes – NR
Word Automation | Yes
44. Always On Availability Groups & SharePoint (DR)
SQL 1
FARM 1
SQL 2
FARM 2
Asynchronous
Disaster
Recovery
SQL 3
Synchronous
45. Database Support – Async Commit
Database | Supported
Admin Content | No
App Management | Yes
BDC | Yes
Config | No
Content | Yes
Managed Metadata | Yes
PerformancePoint | Yes
PowerPivot | Not Tested*
Project | Yes
Search Analytic Reporting | No
Search Admin | No
Search Crawl | No
Search Links | No
Secure Store | Yes
State Service | No
Subscription Settings | Yes
Translation Services | Yes
UPA Profile | Yes
UPA Social | Yes
UPA Sync | No
Usage | Yes – NR
Word Automation | Yes
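When planning a DR replica, the async-commit table above can be used as a lookup. The sketch below encodes the table as a dictionary and sorts a list of database types into three groups. The names `ASYNC_COMMIT_SUPPORT` and `plan_dr_replication` are hypothetical, and PowerPivot is deliberately left out of the dictionary because the slide marks it as not tested.

```python
# Async-commit support per database type, copied from the table above.
ASYNC_COMMIT_SUPPORT = {
    "Admin Content": False, "App Management": True, "BDC": True,
    "Config": False, "Content": True, "Managed Metadata": True,
    "PerformancePoint": True, "Project": True,
    "Search Analytic Reporting": False, "Search Admin": False,
    "Search Crawl": False, "Search Links": False, "Secure Store": True,
    "State Service": False, "Subscription Settings": True,
    "Translation Services": True, "UPA Profile": True, "UPA Social": True,
    "UPA Sync": False, "Usage": True, "Word Automation": True,
}


def plan_dr_replication(database_types):
    """Split database types into async-replicable, unsupported and unknown."""
    plan = {"replicate": [], "not_supported": [], "unknown": []}
    for db_type in database_types:
        supported = ASYNC_COMMIT_SUPPORT.get(db_type)
        if supported is True:
            plan["replicate"].append(db_type)
        elif supported is False:
            plan["not_supported"].append(db_type)
        else:
            plan["unknown"].append(db_type)  # e.g. PowerPivot: not tested
    return plan


if __name__ == "__main__":
    print(plan_dr_replication(["Content", "Config", "PowerPivot", "Search Crawl"]))
```

Note that Usage is listed as "Yes – NR" on the slide, so a real plan may still want to flag it separately.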
46. Conclusions
Good collaboration is needed between the SP admin and the SQL DBA
3 types of DB (config, content, services)
Understand the limits of HA/DR well
Capacity planning
Caution: SharePoint does not honor the autogrowth settings of the model DB
Caution: this illustrates a heavily write-oriented farm; otherwise swap the content DB data files and the content DB log files
Max number of data files: 8 on SQL Server 2012 if the number of cores > 8
TODO: link DB
Okay, now we’re going to start digging in with a quick high-level overview of the new search logical architecture. You’ll see here all of the components of the new search engine. We have all our content sources on the left that are going to be picked up by the new crawl component. The crawl component keeps track of what it needs to crawl, and whether the crawl was successful, in the crawl database. It hands off the content it crawls to the new content processing component. In the CPC we’ll break down what was crawled into the individual pieces of data that will get stored in the index. It also writes to the link database to track all the different anchor content we’ve crawled, and then we push the data to the index component.
The index component is what handles the process of creating the physical index, as well as responding to queries. The queries are delivered by the query processing component. It takes the individual queries from a client application and figures out the one-to-many queries that are required to execute them, and then aggregates the results and sends them back to the client.
Periodically a timer job will run and perform analytics on the SharePoint content. This includes what content has been accessed through different query results, as well as what content has been accessed by users just clicking in and around different SharePoint sites. It pulls in data from the links database for additional relevance data and stores the reports in the analytics reporting database; it also pushes a small amount of analytical data back to the content processing component so that it can be included in the index. The search admin component is what’s responsible for keeping it all up and running and healthy, and it stores its configuration in the search admin database.
Now we’ll dig into each one of these components in more detail.
************************************
Lucaband:
Those components are responsible for creating or modifying the index.
Other pieces outside the classic search architecture (e.g. APE) also consume and modify the index.
We will break down each component’s role and responsibilities in the next slides.
Reviewer Notes:
Sid Shah: I like the way you’ve abstracted the topology model. For the content and query processing components, I’d recommend also mentioning CTS/IMS respectively because those are internally used terms – more like an AKA.
Lucaband:
New search architecture, a single one. No more FAST, FAST4SP or Search SP.
We’ve made some big crawler improvements over the previous version of SharePoint. Previously, you could have one to many crawler components on a server, but each component was associated with only one crawl database. If some crawl databases had items to crawl and others did not, you could end up in a situation where some crawlers were busy trying to keep up with changes, while other crawlers were sitting idle, doing nothing, because their crawl databases were empty. In SharePoint 2013 we move from having crawl components to crawl roles. Unlike SharePoint 2010, you can’t have multiple crawl components on a single server: you’re either a crawler or not. Now, though, ALL crawler roles talk to ALL crawl databases. So whenever there is content to crawl, every crawl server in the farm can work on it. Each crawler role will pull items that it needs to crawl out of the crawl database, do the work, send the data to the content processing component, and update the crawl status in the database. One thing that’s also different from SharePoint 2010 is that we no longer have host distribution rules, which is where you could pick certain crawl components to crawl a particular URL. Instead, a host can now be distributed across multiple crawl databases. That gives us much greater scalability. The search admin component will spread out content for a URL, when needed, across multiple crawl databases.
********************
same host gets distributed across multiple crawl databases which splits the work among multiple crawl components.
The SharePoint URLs are partitioned by content databases.
For huge content databases the content gets subpartitioned across different Crawl DBs
Subpartitioning happens automatically, admins don’t see anything
The system doesn’t automatically rebalance (i.e. make the Crawl DBs almost equal in size);
this is something the admin can trigger from the health rules
We are still considering doing this automatically in O15. There is a timer job that does it as of beta 2.
assignment algorithm is usually not great at the beginning (because we don’t know upfront the size of the DB), but in time it gets better because the new content DBs are assigned to the smaller crawl DBs.
Host distribution rules we had in 2010 no longer exist in “15”?
we removed these rules because we are now smarter and don’t need the input from the admin.
If the O14 customers used the rules to pin specific hosts to crawlers, then they will not have that option anymore.
Main person responsible for the BI-SP tools: Kay Unkroth