This document discusses a presentation on how to warehouse SharePoint data. It provides an agenda for the SharePointalooza conference in Branson, MO in 2014, which will include sessions on Excel data storage in SharePoint, SharePoint content databases, and the benefits of data warehousing. It also outlines architectures for enterprise business intelligence with SharePoint, including ETL tools and considerations for data consumption.
4. 4 | SharePointalooza – Branson, MO 2014
The Bands
What better way to unwind after a long day of working out your brain than with some great live music at the amazing outdoor stage at Branson Landing! The bands will be playing both Friday and Saturday night from 6:30 pm to 10 pm.
14. ONE DOES NOT SIMPLY RETURN HIS RAW DATA FROM THE DATABASE
15. Storage Design and Visualization
BI Architecture 101
[Diagram: Source data → Extract, Transform, and Load (ETL) → Data Warehouse → Data Marts, Data Cubes and Tabular Models → Middleware Server(s) and Reporting Server(s) → BI and Designer Clients]
16. Storage Design and Visualization
Microsoft enterprise (classic) BI
[Diagram: Source data → ETL with SQL Server Integration Services (SSIS) → SQL Server DB (data warehouse and data marts) → SQL Server Analysis Services (Multidimensional and Tabular modes) → SQL Server Reporting Services (SSRS) and SharePoint (with Excel Services, PowerPivot for SharePoint, SSRS SharePoint Mode, PerformancePoint) → clients: Excel, SQL Data Tools, Report Builder, 3rd party tools]
17. Team BI and SharePoint Dashboards
Power Pivot Worksheets
• Pivot Tables and Charts
• Power View
Data Marts and other
Data Cubes and Tabular Models
Standard Worksheets
• Pivot Tables and Charts
PerformancePoint Scorecards and KPIs
PerformancePoint Reports
• Analytic Charts and Grids
• Decomposition trees
SQL Server Reporting Services Reports
• Standard
• Power View
We aren’t going into data warehouse design here (although we can if you would like); this is a demo-focused, hands-on “how to” session.
SharePoint and Excel have something in common
Easy place to store data, specifically list data
Once you have a hammer, everything starts to look like a nail
Victim of its own success
Analysts start trying to make it work like a database
Excel gets around it with VLOOKUP etc., but it’s really not a database
- A lot like SharePoint in some ways. But SharePoint is different, right? I mean, SharePoint stores its data in SQL databases!
Let’s have a look at those databases
A good example is the Workflow History list, the bane of many SharePoint administrators.
Start with Expenses – show workflow History as normal
- Show the NintexWorkflowHistory List
- Explain how it connects to, and disconnects from, source items.
- Show the historylist
- Go to reports – show the report that uses a SharePoint data source
- Run the report – talk a bit
- Show the historylist table and do a count
- Run the SQL report
The numbers are fairly compelling
So… HOW?
There is one very important rule to remember when delivering BI solutions.
What we’ve just seen shows that it’s just that much more important with SharePoint.
Business Intelligence is all about the data, but that doesn’t mean that you just wire up Excel to source data and start extracting (although far too many people do). This is bad for a number of reasons:
- Security – data level access to production data
- Usability – difficult to understand constructs (Great Plains anyone?)
- Performance – reporting against the production data concentrates the load.
- Organization – data optimized for transactions, not reporting
Instead of querying our source systems directly, we want to take our data and move it into Data Warehouses and data marts, which are optimized for the sorts of analysis that we want to perform. This is done through an ETL operation.
CLICK
The data is extracted from the source system, CLICK transformed into the shape we need, CLICK then loaded into the data warehouse. CLICK Other ETL processes or cube processes will load the data into any necessary marts, cubes or models.
From here various servers and clients will access the data, usually from the data marts or cubes, but occasionally from the warehouse directly.
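To make the extract, transform, load sequence concrete, here is a minimal sketch in Python. It is not what the demo uses (the session relies on SSIS and SQL Server); in-memory SQLite databases stand in for both the source system and the warehouse, and the table and column names (`workflow_history`, `fact_workflow_event`, and so on) are hypothetical.

```python
import sqlite3


def extract(source):
    """Extract: read the raw rows from the (hypothetical) source table."""
    return source.execute(
        "SELECT item_id, user_login, outcome, occurred FROM workflow_history"
    ).fetchall()


def transform(rows):
    """Transform: strip the domain prefix, tidy the outcome, keep only the date."""
    return [
        (item_id, login.split("\\")[-1].lower(), outcome.strip().title(), occurred[:10])
        for item_id, login, outcome, occurred in rows
    ]


def load(warehouse, rows):
    """Load: write the reshaped rows into a reporting table in the warehouse."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_workflow_event ("
        "item_id INTEGER, user_name TEXT, outcome TEXT, event_date TEXT)"
    )
    warehouse.executemany(
        "INSERT INTO fact_workflow_event VALUES (?, ?, ?, ?)", rows
    )
    warehouse.commit()


def build_demo_source():
    """Create a tiny stand-in for the production system (sample data only)."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE workflow_history ("
        "item_id INTEGER, user_login TEXT, outcome TEXT, occurred TEXT)"
    )
    source.executemany(
        "INSERT INTO workflow_history VALUES (?, ?, ?, ?)",
        [
            (1, "CONTOSO\\Alice", " approved ", "2014-05-01T09:15:00"),
            (2, "CONTOSO\\Bob", "REJECTED", "2014-05-01T10:30:00"),
            (3, "CONTOSO\\Alice", "approved", "2014-05-02T08:00:00"),
        ],
    )
    return source


source = build_demo_source()
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)))

# Reporting queries now run against the warehouse, not the source system.
summary = warehouse.execute(
    "SELECT outcome, COUNT(*) FROM fact_workflow_event "
    "GROUP BY outcome ORDER BY outcome"
).fetchall()
print(summary)
```

The point of the split is that the reporting query at the end never touches the source connection; only `extract` does.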
So how does this translate to the Microsoft stack? There are two ways: the enterprise, or “classic”, BI method and the Power (personal) method.
Starting with the classic method, SQL Server Integration Services is the tool that performs our ETL.
SQL Server Database Engine is used for the storage of the data warehouses and data marts
SQL Server Analysis Services is the multidimensional engine (traditional OLAP cubes) and is now also the engine for enterprise tabular models (xVelocity).
SSRS is the traditional server engine for serving reports, and can be deployed either standalone, or through SharePoint.
These tools all ship on the SQL Server media, but some (SSRS and PowerPivot for SharePoint) may be deployed to SharePoint.
Clients of this infrastructure may be servers themselves, or designers and Power Users. Consuming tools include Excel, SQL Server Data Tools, Excel Services, PowerPivot for SharePoint, or a host of other tools.
Recently, there has been a lot of work in the Personal BI space – so how does that compare to this approach? Fundamental BI concepts still apply.
Within SharePoint, we can publish reports and data models, and establish connections to the relevant back end systems. These components can then be used to construct dashboards, or used on their own as dashboards.
Dashboards can contain, but are not necessarily limited to
Worksheets and worksheet components through Excel Services, either directly connected or via PowerPivot
SSRS Reports
PerformancePoint scorecards and KPIs
PerformancePoint reports
So, how do we actually go about doing this stuff?
We’ve simply focused on moving data from production.
Schemas can be optimized for reporting. We used a view to add in user information; this could also be done in flight with SSIS.
Star, snowflake schemas
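As a rough illustration of a reporting-friendly schema, the sketch below builds a very small star: one fact table, one user dimension, and a view that adds the user information at query time. This is an assumption-laden example, not the schema from the demo; SQLite stands in for SQL Server, and every table, column and view name here is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    -- Dimension: one row per user, with descriptive attributes for reporting.
    CREATE TABLE dim_user (
        user_key     INTEGER PRIMARY KEY,
        login        TEXT,
        display_name TEXT,
        department   TEXT
    );

    -- Fact: one row per workflow event, pointing at the dimension by key.
    CREATE TABLE fact_workflow_event (
        event_id   INTEGER PRIMARY KEY,
        user_key   INTEGER REFERENCES dim_user (user_key),
        outcome    TEXT,
        event_date TEXT
    );

    -- View: adds user information so report authors query a single object.
    CREATE VIEW v_workflow_event AS
    SELECT f.event_id, f.event_date, f.outcome, u.display_name, u.department
    FROM fact_workflow_event AS f
    JOIN dim_user AS u ON u.user_key = f.user_key;
    """
)

db.executemany(
    "INSERT INTO dim_user VALUES (?, ?, ?, ?)",
    [(1, "alice", "Alice Adams", "Finance"), (2, "bob", "Bob Brown", "Sales")],
)
db.executemany(
    "INSERT INTO fact_workflow_event VALUES (?, ?, ?, ?)",
    [
        (10, 1, "Approved", "2014-05-01"),
        (11, 2, "Rejected", "2014-05-01"),
        (12, 1, "Approved", "2014-05-02"),
    ],
)

# A report can now group by a user attribute without knowing about the join.
by_department = db.execute(
    "SELECT department, COUNT(*) FROM v_workflow_event "
    "GROUP BY department ORDER BY department"
).fetchall()
print(by_department)
```

A snowflake variant would split `department` out into its own table referenced from `dim_user`; the view would hide that extra join in the same way.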
Loading the data places a load on the source system
Complete fill vs incremental load
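The difference between the two loading styles can be sketched as follows. This is one common way to do an incremental load (a high-water mark on a modified timestamp), offered as an assumption rather than the method used in the demo; SQLite stands in for both systems, and the `history` and `fact_event` tables are hypothetical.

```python
import sqlite3


def full_load(source, warehouse):
    """Complete fill: empty the target and copy every source row again."""
    warehouse.execute("DELETE FROM fact_event")
    rows = source.execute("SELECT id, outcome, modified FROM history").fetchall()
    warehouse.executemany("INSERT INTO fact_event VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def incremental_load(source, warehouse):
    """Incremental load: copy only rows modified after the last loaded timestamp."""
    (watermark,) = warehouse.execute(
        "SELECT COALESCE(MAX(modified), '') FROM fact_event"
    ).fetchone()
    rows = source.execute(
        "SELECT id, outcome, modified FROM history WHERE modified > ?",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE acts as an upsert on the primary key.
    warehouse.executemany("INSERT OR REPLACE INTO fact_event VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, outcome TEXT, modified TEXT)"
)
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_event (id INTEGER PRIMARY KEY, outcome TEXT, modified TEXT)"
)

source.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [(1, "Approved", "2014-05-01T09:15:00"), (2, "Rejected", "2014-05-01T10:30:00")],
)
first = incremental_load(source, warehouse)   # warehouse is empty, so both rows move

# Later, one existing row changes and one new row arrives in the source.
source.execute(
    "UPDATE history SET outcome = 'Rejected', modified = '2014-05-02T09:00:00' WHERE id = 1"
)
source.execute("INSERT INTO history VALUES (3, 'Approved', '2014-05-02T11:00:00')")
second = incremental_load(source, warehouse)  # only the changed and the new row move
third = incremental_load(source, warehouse)   # nothing has changed since
refill = full_load(source, warehouse)         # a complete fill always reads everything

print(first, second, third, refill)
```

The incremental version reads far fewer rows from the source on each run, which is where the load reduction comes from. This simple watermark does not detect rows deleted at the source, and it can miss rows that share the exact timestamp of the watermark; a complete fill avoids both problems at the cost of rereading everything.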
Once in a warehouse, the data is a first-class citizen alongside data from any other source, which helps with mashups.
Schedule data pulls as infrequently as the business requirements allow, to reduce load.
How close to real time does the data need to be? With Nintex workflows you can pump straight to SQL.
Be sure to clean up the source data if necessary.
Warehousing applies to SharePoint even more so than transactional systems
It’s relatively simple to surface back end transactional data in SharePoint
Use the right tool for the right job.