Business Intelligence Portfolio
                               Pamela Staerker
                                       pstaerker.data2info@gmail.com




         http://www.linkedin.com/in/pstaerker
Summary
This portfolio contains samples from BI solutions developed using Microsoft
SQL Server 2008 R2 and the Microsoft Business Intelligence toolset.

 T-SQL Programming

 MDX Programming

 Integration Services ETL System (SSIS)

 Analysis Services OLAP Database (SSAS)

 Reporting Services (SSRS)

 SharePoint BI Delivery

     PerformancePoint Services | Reporting Services | Excel Services
T-SQL PROGRAMMING
Freight Allocation
  Specification: Using the Northwind database Orders and [Order Details] tables, create a result set that allocates freight downward
  to all product line items, based on each product's dollars as a percentage of the dollars for the order as a whole. Validate this by
  summing the allocated freight. The grand total of the summed freight is $64,942.69.




Partial Result Set

To achieve the precision needed for this allocation, it was necessary to CAST OrderTot as FLOAT, where
OrderTot = UnitPrice * Quantity. Quantity has an INTEGER data type, but UnitPrice has a MONEY data type.
MONEY has a fixed scale of four decimal places, so an allocation ratio computed from it loses precision.
With floating point, the result is off by only 1/100,000,000 of a penny.
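The effect of a four-decimal ratio can be sketched outside the database. The Python below is only an illustration, not the portfolio's T-SQL: the order values are made up, `Decimal.quantize` stands in for a fixed four-decimal scale, and the rounding mode is an assumption.

```python
from decimal import Decimal

FOUR_PLACES = Decimal("0.0001")  # fixed four-decimal scale, standing in for MONEY


def allocate(freight, line_totals, fixed_scale):
    """Split one order's freight across its lines in proportion to line dollars."""
    order_total = sum(line_totals)
    shares = []
    for line_total in line_totals:
        ratio = line_total / order_total
        if fixed_scale:
            # Hold the ratio to four decimal places before multiplying.
            ratio = ratio.quantize(FOUR_PLACES)
        shares.append(freight * ratio)
    return shares


# Hypothetical order: three equal lines, 30.00 of freight.
lines = [Decimal("100"), Decimal("100"), Decimal("100")]
fixed_sum = sum(allocate(Decimal("30.00"), lines, fixed_scale=True))
float_sum = sum(allocate(30.0, [float(x) for x in lines], fixed_scale=False))
```

With the ratio held to four places, each line gets 0.3333 of the freight and the shares no longer add back to the order's freight, which is the kind of drift the FLOAT cast avoids.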
Top Five Vendors
Specification: Use the AdventureWorks2008 database Purchasing.Vendor and Purchasing.PurchaseOrderHeader tables to show
            the top five vendors for 2003. Generate a ranking number for each vendor and show the data by quarter.



                                                                           Complete Result Set




The native T-SQL Pivot operator is used to display the data with the VendorRank on rows.

The Pivot request involves three logical processing phases with associated elements:
1) grouping phase
2) spreading phase
3) aggregation phase

Here the pivot table is grouped by VendorRank, with the OrderDate quarter spread on columns
and the TotalDue to the vendor aggregated.
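The three phases can be sketched in a few lines of Python. This is an illustration of the logic only, not the T-SQL from the slide; the rows and the tuple layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical source rows: (vendor_rank, order_quarter, total_due)
rows = [
    (1, "Q1", 100.0), (1, "Q1", 50.0), (1, "Q2", 75.0),
    (2, "Q1", 40.0), (2, "Q3", 60.0),
]


def pivot(rows, quarters=("Q1", "Q2", "Q3", "Q4")):
    # Grouping phase: one output row per vendor_rank.
    grouped = defaultdict(lambda: defaultdict(float))
    for rank, quarter, total_due in rows:
        # Spreading phase: the quarter value selects the target column.
        # Aggregation phase: total_due is summed into that cell.
        grouped[rank][quarter] += total_due
    # Cells with no source rows stay None, like NULL in a pivoted result.
    return {
        rank: [cells.get(q) for q in quarters]
        for rank, cells in sorted(grouped.items())
    }


result = pivot(rows)
```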
DYNAMIC SQL
This is an example of using dynamic SQL when a pivoted result set is needed, but the number of columns to be created is
not known in advance.
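The general idea is to read the distinct column values first and then build the PIVOT statement text from them. The sketch below shows that string-building step in Python; the table `dbo.VendorOrders` and the column names are hypothetical, and `quote_name` is a hand-written helper that only imitates bracket quoting.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_pivot_sql(column_values):
    """Build a PIVOT statement whose column list is only known at run time."""
    cols = ", ".join(quote_name(str(v)) for v in column_values)
    return (
        f"SELECT VendorName, {cols}\n"
        "FROM (SELECT VendorName, OrderYear, TotalDue FROM dbo.VendorOrders) AS src\n"
        f"PIVOT (SUM(TotalDue) FOR OrderYear IN ({cols})) AS p;"
    )


sql = build_pivot_sql([2003, 2004])
```

Quoting each value before it is spliced into the statement matters because the values come from data rather than from the query author.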

                                                                                              Partial Result Set
GET VENDOR PRODUCTS
This procedure gets the top @n Vendors within a specified date range, and ranks them by PurchaseOrderHeader.TotalDue DESC. The
results are stored in the @VendorTable variable. For each of the vendors, the top @y products within the same date range are
subsequently selected and ranked by (PurchaseOrderDetail.UnitPrice * PurchaseOrderDetail.OrderQty) DESC. The final
result shows the top @n vendors and their top @y product sales.




                                                                                               Partial Result Set




Ranking is done using DENSE_RANK with an OVER clause. DENSE_RANK returns one plus the number of
distinct ordering values that precede the current row's value.

The CROSS APPLY operator is used to return the products for the top vendors by evaluating the
product query once for each row in the @VendorTable variable.
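As a rough illustration of how dense ranking and a per-row apply combine, the Python sketch below ranks products within each vendor and keeps ranks up to y. The data and function names are hypothetical; it mirrors the logic, not the stored procedure itself.

```python
def dense_rank_desc(values):
    """Map each value to 1 + the number of distinct values that are higher."""
    distinct = sorted(set(values), reverse=True)
    return {v: i + 1 for i, v in enumerate(distinct)}


def top_products_per_vendor(vendors, purchases, y):
    """For each vendor row, evaluate the product query and keep ranks 1..y."""
    out = []
    for vendor in vendors:  # outer row set, like @VendorTable
        totals = {}
        for v, product, unit_price, qty in purchases:
            if v == vendor:  # inner query correlated on the outer row
                totals[product] = totals.get(product, 0) + unit_price * qty
        ranks = dense_rank_desc(totals.values())
        for product, total in sorted(totals.items(), key=lambda kv: -kv[1]):
            if ranks[total] <= y:
                out.append((vendor, product, total, ranks[total]))
    return out


purchases = [
    ("A", "bolt", 2.0, 10), ("A", "nut", 1.0, 20),
    ("A", "gear", 5.0, 1), ("B", "chain", 3.0, 5),
]
top = top_products_per_vendor(["A", "B"], purchases, y=1)
```

Because ties share a rank under dense ranking, vendor A returns two products at rank 1 even though y is 1.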
USING T-SQL MERGE
Here the T-SQL MERGE statement, which is new in SQL Server 2008, is used to insert new records or update existing records in a
production table from a staging table in a data warehouse scenario.




The MERGE statement allows data to be inserted, updated or deleted based on conditional logic.
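The insert-or-update branch of that conditional logic can be sketched with two dictionaries keyed by a business key. This is a simplified Python stand-in for the staging-to-production pattern, with hypothetical data, and it omits the delete branch.

```python
def merge(target, source):
    """Upsert staging rows into the production rows; return (inserted, updated)."""
    inserted = updated = 0
    for key, row in source.items():
        if key not in target:        # not matched: insert the new row
            target[key] = dict(row)
            inserted += 1
        elif target[key] != row:     # matched and changed: update in place
            target[key] = dict(row)
            updated += 1
    return inserted, updated


production = {1: {"name": "Acme", "city": "Austin"}}
staging = {
    1: {"name": "Acme", "city": "Dallas"},   # existing key, changed city
    2: {"name": "Bolt Co", "city": "Tulsa"}, # new key
}
counts = merge(production, staging)
```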
MDX PROGRAMMING
TOP FIVE CITIES WITHIN TOP FIVE MONTHS
Using the Adventure Works cube, the top 5 cities were ranked within the top 5 months, based on the percent sales change over
the prior month.

                                                                                                Complete Result Set




RankedMonths (only)

To get the top 5 cities within the top 5 months based on the % increase of the internet sales over the previous month,
the top 5 months were determined first using the TOPCOUNT function and then ranked using the RANK function.

The GENERATE function, which builds a set by iterating over another set, was then applied to the current
[RankedMonths] member and another TOPCOUNT was used to find the top 5 customer cities within each of
the top 5 months.
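The nested top-N pattern described above can be sketched as a loop over the ranked outer set with a second top-N inside it. The Python below is only an analogy for the MDX; the dictionaries and their shapes are hypothetical.

```python
def top_count(items, n, key):
    """Return the n items with the highest key value, best first."""
    return sorted(items, key=key, reverse=True)[:n]


def top_cities_within_top_months(month_change, city_change, n_months=5, n_cities=5):
    """month_change: {month: pct}; city_change: {month: {city: pct}}."""
    ranked_months = top_count(month_change, n_months, key=month_change.get)
    result = []
    # Iterate the outer set and build an inner set for each member.
    for rank, month in enumerate(ranked_months, start=1):
        cities = city_change.get(month, {})
        for city in top_count(cities, n_cities, key=cities.get):
            result.append((rank, month, city, cities[city]))
    return result


month_change = {"Jan": 0.10, "Feb": 0.30, "Mar": 0.20}
city_change = {
    "Feb": {"Paris": 0.5, "Berlin": 0.7, "Rome": 0.1},
    "Mar": {"Paris": 0.2},
}
ranked = top_cities_within_top_months(month_change, city_change, 2, 2)
```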
LAST 2 QUARTERS OF DATA FOR HAMBURG
This query uses the Adventure Works cube. Rows show Hamburg and the states that have the same parent as Hamburg.
Columns show the last 2 available quarters of data for internet sales and the percent of geographical parent.




Complete Result Set

The significance of this query is that by using a FILTER for null data combined with the TAIL function,
the last two quarters of available data for the current customer geography will always be returned.
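The filter-then-tail idea translates directly to list operations. A minimal Python sketch, assuming periods arrive in chronological order and that an empty cell is represented by None:

```python
def last_available(periods, values, n=2):
    """Return the last n periods that actually have data."""
    # Filter step: drop periods whose cell is empty.
    non_empty = [p for p in periods if values.get(p) is not None]
    # Tail step: keep the final n members of what remains.
    return non_empty[-n:]


periods = ["2003-Q4", "2004-Q1", "2004-Q2", "2004-Q3", "2004-Q4"]
values = {"2003-Q4": 120.0, "2004-Q1": 90.0, "2004-Q2": 110.0, "2004-Q3": None}
latest = last_available(periods, values)
```

Taking the tail of the unfiltered list would have returned the two empty quarters instead.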
CALCULATING COMMON TIME BASED METRICS
The Adventure Works cube is used here to show all customer country members across the columns. Rows show June
and July of 2004, crossed with the following measures: Internet Sales Amount, Sales Amount Last Period, YTD Sales, YTD
Sales for last year, Geographic % of Parent for Internet sales for the month and Geographic % of Parent for Internet sales for last
year.




Time-based expressions shown here were combined with other time-based expressions in order to
assemble more complex metrics.

As an example, the [YTDLY] member used the MDX function PARALLELPERIOD with the [YTD Sales] member
to determine the sales one year ago.
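Composing a year-to-date calculation with a one-year shift can be sketched as two small functions. This Python is an analogy for the MDX composition, with a hypothetical `{(year, month): amount}` data shape and made-up figures.

```python
def ytd(sales, year, month):
    """Sum sales from January through the given month of the given year."""
    return sum(sales.get((year, m), 0.0) for m in range(1, month + 1))


def ytd_last_year(sales, year, month):
    # Shift back one year at the same month position, then reuse the YTD logic.
    return ytd(sales, year - 1, month)


sales = {
    (2003, 1): 10.0, (2003, 2): 20.0, (2003, 3): 30.0,
    (2004, 1): 15.0, (2004, 2): 25.0,
}
```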




                                                                                   Complete Result Set
INTEGRATION SERVICES
  ETL SYSTEM (SSIS)
ALLWORKS CONSTRUCTION COMPANY PROJECT
AllWorks is a fictitious construction company that uses data stored in various formats as part of their
enterprise system. Employee and Client Geography data, along with Overhead and Job Order master
data are stored in spreadsheets. The Material Purchases data is exported from an Oracle database into
XML format, and Timesheet data is provided in .csv files.

For this project all data from the files were transformed and loaded to a normalized database using
SSIS. The Package Flow Design shows the input data flow into the SSIS packages. Each SSIS package was
named for the respective AllWorksOLTP database table that it loads.



          AllWorksOLTP
            Database
PROJECT SOLUTION
The project solution contains all the data load packages as well as a Master package and a Database Maintenance package. The Master
package uses the Execute Package Task to run each of the packages in the proper sequence, based on database foreign key
constraints, followed by the Database Maintenance package.

The Control Flow of each package generates an email notification for either the success or failure of the package. Success emails
include counts of files processed: rows inserted, rows changed and invalid rows. Package configuration is used to dynamically update
the database server, SMTP Server and mailbox variables based on the current runtime environment.
TIMESHEET PACKAGE CONTROL FLOW
The Timesheet Package Control Flow uses a Foreach Loop Container to loop through a variable number of .csv timesheet files in a
folder. A Script Task inside the loop container accumulates insert/update/error totals in variables, for each file and for the entire
folder. Script Tasks are used to write either a success or a failure email, as applicable. The Send Mail Task is used to send the email to
an SMTP server.
TIMESHEET DATA FLOW
Each record that moves through the Timesheet Data Flow pipeline is first converted from a .csv file data type to a SQL
Server data type. The data is then checked using a Lookup Transformation to verify that ProjectID and EmployeeID are valid. A
Conditional Split Transformation is used to make sure the project has not been closed. If the project is not closed and the ProjectID
and EmployeeID are valid, a Lookup Transformation determines whether the employee timesheet already exists in the
database. If it does not, the timesheet record is inserted; otherwise it is updated. A second Conditional Split ensures that
only modified timesheet data is sent to the database. An OLE DB Destination is used to insert data and an OLE DB Command is used to
update data.
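The routing decisions in this data flow can be summarized as a single function. The Python below is a sketch of the decision order only; the field names, the lookup structures and the rule that any closed date marks a project as closed are assumptions, not details taken from the package.

```python
def route_timesheet(row, projects, employees, existing):
    """Return 'error', 'insert', 'update' or 'unchanged' for one timesheet row.

    projects:  {project_id: closed_date or None}   (lookup source)
    employees: set of valid employee ids           (lookup source)
    existing:  {(employee_id, project_id, work_date): hours}
    """
    # Lookups: both keys must exist.
    if row["project_id"] not in projects or row["employee_id"] not in employees:
        return "error"
    # Conditional split: reject rows for closed projects.
    if projects[row["project_id"]] is not None:
        return "error"
    key = (row["employee_id"], row["project_id"], row["work_date"])
    # Lookup against the destination: new rows are inserted.
    if key not in existing:
        return "insert"
    # Conditional split: only modified rows are sent as updates.
    return "update" if existing[key] != row["hours"] else "unchanged"


projects = {"P1": None, "P2": "2006-03-31"}
employees = {"E1"}
existing = {("E1", "P1", "2006-01-07"): 8.0}
```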
LOOKUP TRANSFORMATION
A Lookup Transformation, configured in the Lookup Transformation Editor, is used to validate the ProjectID and also to bring the
Project Closed Date data into the data flow pipeline.
C# SCRIPTING FOR ACCUMULATING TOTALS
The Microsoft Visual Studio Tools for Applications (VSTA) was accessed using the Script Task. C# scripting is used to accumulate the
Timesheet record count totals.




                          Variables used by the
                          Script Task were first
                          created in the SSIS
                          package.




Read-write variables were selected for use in the Script Task Editor from the SSIS package variables.
These variables were then available to be used within the code of the Script Task through the
Dts.Variables collection, which Integration Services automatically creates and makes available to the
script code.
MASTER PACKAGE
Execute Package Tasks with precedence constraints that mirror PK/FK database constraints were used to execute the individual
packages in the ETL solution. Upon successful completion of the ETL, database maintenance is performed.
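Deriving a load order from foreign key dependencies is a topological sort. The Python sketch below uses the standard library's `graphlib` (Python 3.9+); the table names and dependencies are hypothetical and are not the actual AllWorksOLTP schema.

```python
from graphlib import TopologicalSorter

# Hypothetical FK dependencies: each table maps to the tables it references.
depends_on = {
    "Employee": set(),
    "Client": set(),
    "Project": {"Client"},
    "Timesheet": {"Employee", "Project"},
}

# static_order() yields every table after all of the tables it depends on.
load_order = list(TopologicalSorter(depends_on).static_order())
```

Any order this produces loads a parent table before its children, which is the property the precedence constraints enforce.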
DATABASE MAINTENANCE PACKAGE
The Database Maintenance Package shrinks the database, rebuilds the database indexes, updates the database statistics and backs up
the database nightly following successful completion of the Master Package.
PRODUCTION SCHEDULE
Following deployment of the ETL packages to SQL Server, the Master Package was scheduled to run nightly using SQL Agent.
ANALYSIS SERVICES
OLAP DATABASE (SSAS)
DATA SOURCE VIEW
SQL Views were scripted in the AllWorks OLTP database to generate the dimensions and measures needed to create the AllWorks
dimensional database in Analysis Services. In SSAS a data connection to the SQL Server relational database was established and a data
source view (DSV) created using the dimensional views. In the DSV, entity relationships and logical primary keys were identified.




The JobClosedDate in the vwDimProject view is used in a role-playing dimension, and therefore an
entity relationship between vwDimProject.ClosedDateKey and vwDimDate.WeekendKey is needed.




                                                                                               A named calculation was created to
                                                                                               provide a description for the Boolean
                                                                                               value returned by the Employee Flag
                                                                                               in the vwDimEmployee view.
ALLWORKS DW CUBE
A cube consisting of four measure groups and five dimensions was then generated using the Cube Wizard. The Date dimension is
used in the cube twice: once as the Date dimension itself and once as a role-playing dimension for the Project Closed date.
DIMENSION USAGE


Relationships in the cube between the dimensions and measures were verified by inspecting the Dimension
Usage tab. A referenced relationship was manually created for the intersection between the Project
Closed Date (role-playing dimension) and the Summary Measure Group.




                              This dimension usage tab from another cube shows a many-to-
                              many relationship between BookSales and Authors. Each Author
                              can have multiple Books for sale, and each Book may have more
                              than one Author. This necessitates the intermediate cross
                              reference table, BookXAuthors.
DIMENSIONS
All cube dimension attributes and user-defined hierarchies.

The Dimension Structure Tab was used to specify each dimension’s attributes and user-defined
hierarchies. Here a date hierarchy was built using the Year, Quarter and Weekend date.




Rigid attribute relationships were created to associate the attributes used in the Date Tree hierarchy.

Setting the RelationshipType property determines whether Analysis Services creates rigid or flexible aggregations.

After an incremental update, Analysis Services drops flexible aggregations and those aggregations must be manually
reprocessed, but Analysis Services persists rigid aggregations, which improves query performance!
BROWSING THE CUBE
After the cube was deployed and processed successfully, dimensions and measures were selected from the metadata pane of the
Cube Browser and the results were checked against the original OLTP database for accuracy. The cube can also be easily browsed
using an Excel Pivot Table.
CALCULATIONS
MDX expressions were used to create calculated members and named sets.


                                                   This calculated member computes
                                                   the open receivables as a percent of
                                                   invoice amount.




It is also rendered here as an Excel pivot table, which uses it along with the Invoice Amount measure and
the Open Receivables calculated member, for the All Clients named set in 2006, broken out by quarter.
KPIs
Key performance indicators (KPIs) were created using the newly deployed calculated members for the KPI value expression. In this KPI
a traffic light is used to visually represent the Profit Percent metric, with a goal of greater than 15% profit. The KPI is displayed for all
clients using MS Excel. A green light indicates that the goal of greater than 15% profit for the client’s projects has been met. A yellow
light indicates that the profit percentage is between 5% and 15% inclusive. A red light indicates that the profit is less than 5%. No traffic
light is displayed for clients who do not have any closed projects.
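The thresholds described above map onto a small status function. A minimal Python sketch, assuming the profit percent is held as a fraction and that a client with no closed projects has no value (None):

```python
def profit_kpi_status(profit_pct):
    """Map a profit fraction to the traffic-light status described in the text."""
    if profit_pct is None:
        return None          # no closed projects: no light shown
    if profit_pct > 0.15:
        return "green"       # goal met: greater than 15%
    if profit_pct >= 0.05:
        return "yellow"      # between 5% and 15% inclusive
    return "red"             # less than 5%
```

The boundary values are the cases worth checking: exactly 15% is yellow rather than green, and exactly 5% is yellow rather than red.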




                                                                                      While the KPI Goal
                                                                                      here is static, it can
                                                                                      also be data driven
ACTIONS
A report URL action was created so that the client application can execute a live Google Maps search based on the project’s county.
PARTITIONS AND AGGREGATIONS
Partitions were created for each fact table to separate current from historical data. MOLAP storage was specified, with
aggregations designed for a 50% performance increase.




                                                                                                                         Aggregations are
                                                                                                                          precalculated
                                                                                                                      summaries of data from
                                                                                                                             leaf cells
REPORTING SERVICES
      (SSRS)
MOVING AVERAGE OLAP REPORT
This report shows sales dollar revenue as a column bar and the 12 month moving average as a horizontal line. The user can select one
or more years, and a product category, subcategory or product. The Fiscal Year dropdown excludes the first year of sales.
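A trailing 12 month average needs twelve months of history before it yields a value, which is one plausible reason for leaving the first year of sales out of the dropdown. The Python sketch below shows the calculation on a plain list; it is not the report's MDX, and returning None for incomplete windows is an assumption.

```python
def moving_average(values, window=12):
    """Trailing average; positions without a full window yield None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out
```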
SALES BY CATEGORY EXPLODED PIE CHART
This Exploded Pie Chart report shows sales proportions by category. Here the legend is hidden and an expression is used to label each
pie slice with the Category, Sales Dollar Amount and the % of Total. The labels are configured to display outside of the pie for
readability purposes.
EMPLOYEE SALES QUOTAS W/ SPARKLINES & GAUGES

This is an employee sales matrix OLAP report. The gauges give an ‘at a glance’ indication of each employee’s sales in comparison to
their quota, while the sparklines provide a visual representation of each employee’s sales vs quota trend.
TOP (N) PRODUCTS & TOP (Y) CITIES OLAP REPORT
This report allows the user to select the top N products by revenue and, within each product, the top Y cities. The user can select one
or more years. This report uses the MDX GENERATE function to achieve the top Y within the top N.
SHAREPOINT BI DELIVERY
    (DASHBOARD)
CONTOSO RETAIL DASHBOARD
The Contoso Retail Dashboard project consists of six PerformancePoint pages deployed to SharePoint 2010. An additional SSRS
report with a delivery subscription is also included.

PerformancePoint Content Site Collection




                             Dashboard Pages
KPI SCORECARD W/PROFIT MARGIN HOTLINK REPORT

The first dashboard page contains two objective KPIs (Financial and I.T. Systems) and 4 KPIs (Product Gross Profit Margin, Channel
Revenue, Returns Pct and Machine downtime trend). Each KPI has a “hot-link” report. The Financial KPIs show all Product Categories
underneath. The KPI score card and associated charts were created using the PerformancePoint Dashboard Designer.




The Product Gross Profit Margin KPI is linked to the chart by the filter date, filter geography and product
category selection. The hot-link chart shows monthly sales, gross profit margin and gross profit margin for the
past year.
KPI SCORECARD W/SALES DRILL DOWN TO BRAND
Charts can be drilled down into for more granular information. This is a drill down for the TV and Video Product Category to see which
Brands are being sold.
KPI SCORECARD W/CHANNEL SALES HOTLINK REPORT

The Channel Revenue KPI is linked to the chart by the filter date, filter geography and current product category selection. The
hot-link chart shows the monthly sales quota amount, the sales amount and the sales amount last year.
KPI SCORECARD W/RETURNS % HOTLINK REPORT
The Returns Percent KPI is linked to the chart by the filter date, filter geography and current product category selection. The hot-link
chart shows the sales return amount and the sales return percent for the ten stores with the most returns.
KPI SCORECARD W/HOTLINK SUPPORTING GRID
The Machine Downtime KPI contains a hot-link to an analytic grid that breaks down Machine Down Time, Machine Down Time Last
Year and the Trend Outage Type by Fiscal Year and Geography.
PRODUCT SALES PROFILE WITH SIBLINGS REPORT
The second page of the Dashboard contains a PerformancePoint Analytic Chart of Product Monthly Sales for the selected Fiscal Year,
along with the Product Sales as a percent of the product’s hierarchical parent. A Supporting Grid also displays the Monthly Product %
of Parent Sales for all siblings of the Product selection.
EMPLOYEE PROFILE W/DRILL DOWN TO DECOMPOSITION TREE
The third page of the PerformancePoint Dashboard contains an Analytic Chart of Sales as a % of Quota, by employee and month for
a year. A drill down to a decomposition tree for employee Kim Abercrombie is used to display her subordinates and their % of Sales
Quota, as well as how they rank in relation to their peers.




                                                                                                   Decomposition trees can
                                                                                                   be used for root cause
                                                                                                   analysis.
SSRS SALES MAP AND RETURNS % REPORT
The interactive Sales Map was created in SSRS and deployed to SharePoint. Here PerformancePoint acts as a wrapper for the SQL
Server Reporting Services report, so that the report can be displayed in the dashboard. The PerformancePoint Fiscal Year and Sort
Option filters are linked to the map and chart, which show Sales Amount by State. Additionally, the chart plots the Sales Return
Percent on a second vertical axis.
SSRS SALES MATRIX
The fifth page of the Dashboard is an SSRS Matrix report with two adjacent column groups for Sales by Retail Channel and Promotion,
and Row Groups for Product Category/Subcategory with drilldown and % of total functionality.
EXCEL SALES REPORT WITH SLICERS
A Pivot Table with Slicers for Product Category/Subcategory, Sales Channel and Fiscal Year was created in Excel 2010. The Pivot Table
shows Sales Amount, Sales Return Amount, Gross Profit Margin and Sales Return Percent by the Geography Hierarchy. The report was
saved to SharePoint from Excel and then utilized by PerformancePoint to create the last page of the Contoso Retail Dashboard.
SSRS TOP SELLING STORES SUBSCRIPTION REPORT
This is a simple Reporting Services report deployed to SharePoint. The report was set up to run nightly for the states of Maryland and
Virginia and be delivered to a SharePoint folder.
Pamela Staerker
SetFocus Master’s BI Program 2011-2012
Instructor: Kevin S. Goff, Microsoft SQL Server MVP




                              http://www.linkedin.com/in/pstaerker

 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
 
Part2 Best Practices for Managing Optimizer Statistics
Part2 Best Practices for Managing Optimizer StatisticsPart2 Best Practices for Managing Optimizer Statistics
Part2 Best Practices for Managing Optimizer Statistics
 
VMware Report Draft v2.1
VMware Report Draft v2.1VMware Report Draft v2.1
VMware Report Draft v2.1
 
Solved Practice questions for Microsoft Querying Data with Transact-SQL 70-76...
Solved Practice questions for Microsoft Querying Data with Transact-SQL 70-76...Solved Practice questions for Microsoft Querying Data with Transact-SQL 70-76...
Solved Practice questions for Microsoft Querying Data with Transact-SQL 70-76...
 
Jovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudJovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloud
 
Oracle Advanced SQL and Analytic Functions
Oracle Advanced SQL and Analytic FunctionsOracle Advanced SQL and Analytic Functions
Oracle Advanced SQL and Analytic Functions
 
Part5 sql tune
Part5 sql tunePart5 sql tune
Part5 sql tune
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011
 

BI Portfolio

  • 1. Business Intelligence Portfolio Pamela Staerker pstaerker.data2info@gmail.com http://www.linkedin.com/in/pstaerker
  • 2. Summary This Portfolio contains samples from BI solutions developed using Microsoft SQL Server R2 and the Microsoft Business Intelligence Toolset: T-SQL Programming; MDX Programming; Integration Services ETL System (SSIS); Analysis Services OLAP Database (SSAS); Reporting Services (SSRS); SharePoint BI Delivery (PerformancePoint Services | Reporting Services | Excel Services).
  • 4. Freight Allocation Specification: Using the Northwind Database Orders and [Order Details] tables, create a result set that allocates freight downward to all product line items, based on each product's dollars as a percentage of the dollars for the order as a whole. Validate this by summing the allocated freight. The grand total of the summed freight is $64,942.69. To achieve the precision needed for this allocation, it was necessary to CAST the OrderTot as FLOAT. OrderTot = UnitPrice * Quantity. While Quantity has an INTEGER data type, UnitPrice has a MONEY data type. The MONEY data type restricts any decimal to four non-expanding decimal places, which creates a problem for the allocation ratio. When using a floating-point type, the result is off by only 1/100,000,000 of a penny. Partial Result Set
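The effect of a four-decimal ratio can be illustrated outside SQL Server. The following is a minimal Python sketch, not the portfolio's T-SQL: `round(ratio, 4)` is only a stand-in for a ratio held in a four-decimal type such as MONEY, and the order lines and freight amount are hypothetical.

```python
def allocate_freight(freight, line_totals, ratio_places=None):
    """Spread freight over order lines in proportion to each line's dollars."""
    order_total = sum(line_totals)
    shares = []
    for line_total in line_totals:
        ratio = line_total / order_total
        if ratio_places is not None:
            # Stand-in (assumption) for a ratio stored with four decimal places.
            ratio = round(ratio, ratio_places)
        shares.append(ratio * freight)
    return shares


# Hypothetical order: three equal lines sharing 30.00 of freight.
lines = [50.0, 50.0, 50.0]
freight = 30.0

four_place_total = sum(allocate_freight(freight, lines, ratio_places=4))
float_total = sum(allocate_freight(freight, lines))
print(four_place_total, float_total)
```

With the four-place ratio each line receives 0.3333 of the freight, so the allocated amounts no longer sum back to the original freight; the floating-point ratio keeps the validation total intact to within rounding noise.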
  • 5. Top Five Vendors Specification: Use the AdventureWorks2008 Database Purchasing.Vendor and Purchasing.PurchaseOrderHeader tables to show the top five vendors for 2003. Generate a ranking number for each vendor and show the data by quarter. The native T-SQL PIVOT operator is used to display the data with the VendorRank on rows. The PIVOT request involves three logical processing phases with associated elements: 1) a grouping phase, 2) a spreading phase and 3) an aggregation phase. Here the pivot table is grouped by VendorRank, with the OrderDate quarter spread on columns and the TotalDue to the vendor aggregated. Complete Result Set
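The three pivot phases can be sketched in a few lines of Python. This is an illustration of the logic only, not the T-SQL PIVOT operator itself; the function name `pivot` and the sample rows are hypothetical, while the column names follow the slide.

```python
from collections import defaultdict


def pivot(rows, group_key, spread_key, value_key, spread_values):
    """Group rows, spread one column's values across output columns, and sum."""
    # Empty cells stay None, much as an empty pivoted cell comes back as NULL.
    table = defaultdict(lambda: {column: None for column in spread_values})
    for row in rows:
        group = row[group_key]        # grouping phase: one output row per group
        column = row[spread_key]      # spreading phase: pick the output column
        if column in table[group]:
            cell = table[group][column]
            # aggregation phase: SUM of the value column
            table[group][column] = row[value_key] if cell is None else cell + row[value_key]
    return dict(table)


# Hypothetical purchase order rows.
rows = [
    {"VendorRank": 1, "Quarter": "Q1", "TotalDue": 100.0},
    {"VendorRank": 1, "Quarter": "Q1", "TotalDue": 50.0},
    {"VendorRank": 1, "Quarter": "Q2", "TotalDue": 75.0},
    {"VendorRank": 2, "Quarter": "Q1", "TotalDue": 20.0},
]
result = pivot(rows, "VendorRank", "Quarter", "TotalDue", ["Q1", "Q2", "Q3", "Q4"])
print(result)
```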
  • 6. DYNAMIC SQL This is an example of using dynamic SQL when a pivoted result set is needed, but the number of columns to be created is not known in advance. Partial Result Set
  • 7. GET VENDOR PRODUCTS This procedure gets the top @n Vendors within a specified date range, and ranks them by PurchaseOrderHeader.TotalDue DESC. The results are stored in the @VendorTable variable. For each of the vendors, the top @y products within the same date range are subsequently selected and ranked by (PurchaseOrderDetail.UnitPrice * PurchaseOrderDetail.OrderQty) DESC. The final result shows the top @n vendors and their top @y product sales. Ranking is done using DENSE_RANK with the OVER clause. DENSE_RANK returns one plus the number of distinct ordering values that precede the current row, so ties share a rank and no rank numbers are skipped. The CROSS APPLY operator is used to return the products for the top vendors by evaluating the product query once for each row in the @VendorTable variable. Partial Result Set
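The ranking and per-vendor lookup described on this slide can be sketched in Python. This is a simplified illustration, not the stored procedure: the helper names and sample data are hypothetical, and the inner loop only mimics the way CROSS APPLY evaluates a query once per outer row.

```python
def dense_rank_desc(values):
    """Map each value to 1 + the number of distinct values greater than it."""
    distinct = sorted(set(values), reverse=True)
    return {value: position for position, value in enumerate(distinct, start=1)}


def top_products_per_vendor(vendors, products, y):
    """For each vendor row, evaluate the 'top y products' query and collect the results."""
    output = []
    for vendor in vendors:  # outer input, comparable to the rows of @VendorTable
        mine = [p for p in products if p["vendor"] == vendor]
        ranks = dense_rank_desc([p["amount"] for p in mine])
        for product in sorted(mine, key=lambda p: p["amount"], reverse=True):
            if ranks[product["amount"]] <= y:
                output.append((vendor, product["name"], ranks[product["amount"]]))
    return output


# Hypothetical product totals (UnitPrice * OrderQty already summed).
products = [
    {"vendor": "A", "name": "bolt", "amount": 90.0},
    {"vendor": "A", "name": "nut", "amount": 90.0},
    {"vendor": "A", "name": "gear", "amount": 40.0},
    {"vendor": "A", "name": "cog", "amount": 10.0},
    {"vendor": "B", "name": "chain", "amount": 70.0},
]
top = top_products_per_vendor(["A", "B"], products, y=2)
print(top)
```

Because the two tied products share rank 1, vendor A returns three rows for y=2; that is the behaviour that distinguishes a dense rank from a plain row number.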
  • 8. USING T-SQL MERGE Here the T-SQL MERGE statement, which is new in SQL Server 2008, is used to insert new records or update existing records in a production table from a staging table in a data warehouse scenario. The MERGE statement allows data to be inserted, updated or deleted based on conditional logic.
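The insert-or-update behaviour can be sketched with plain Python dictionaries. This only mirrors the matched / not matched branching of a merge; it is not T-SQL, and the table contents and the `merge` helper are hypothetical.

```python
def merge(target, source):
    """Upsert source rows into target by key; return (inserted, updated) counts."""
    inserted = updated = 0
    for key, row in source.items():
        if key not in target:
            # Comparable to WHEN NOT MATCHED: insert the staging row.
            target[key] = dict(row)
            inserted += 1
        elif target[key] != row:
            # Comparable to WHEN MATCHED with a change: update the production row.
            target[key] = dict(row)
            updated += 1
    return inserted, updated


# Hypothetical production and staging tables keyed by product id.
production = {
    1: {"name": "Widget", "price": 9.99},
    2: {"name": "Gadget", "price": 19.99},
}
staging = {
    2: {"name": "Gadget", "price": 21.50},
    3: {"name": "Gizmo", "price": 4.25},
}
counts = merge(production, staging)
print(counts, production)
```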
  • 10. TOP FIVE CITIES WITHIN TOP FIVE MONTHS Using the Adventure Works cube, the top 5 cities were ranked within the top 5 months, based on the percent sales change over the prior month. To get the top 5 cities within the top 5 months based on the % increase of internet sales over the previous month, the top 5 months (the [RankedMonths] set) were determined first using the TOPCOUNT function and then ranked using the RANK function. The GENERATE function, which builds a set by iterating over another set, was then applied to the current [RankedMonths] member, and another TOPCOUNT was used to find the top 5 customer cities within each of the top 5 months. Complete Result Set
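The nested "top N within top N" pattern can be sketched in Python. The two helpers below are hypothetical stand-ins that imitate what TOPCOUNT and GENERATE do conceptually; they are not MDX, and the growth figures are invented.

```python
def topcount(items, count, measure):
    """Return the `count` items with the largest measure, largest first."""
    return sorted(items, key=measure, reverse=True)[:count]


def generate(outer_set, inner_set_for):
    """For each member of outer_set, evaluate an inner set and concatenate the results."""
    result = []
    for member in outer_set:
        result.extend((member, inner) for inner in inner_set_for(member))
    return result


# Hypothetical percent change over the prior month, by month and by (month, city).
month_growth = {"Jan": 0.05, "Feb": 0.20, "Mar": 0.12}
growth = {
    ("Jan", "Berlin"): 0.50, ("Jan", "Paris"): 0.01, ("Jan", "London"): 0.02,
    ("Feb", "Berlin"): 0.30, ("Feb", "Paris"): 0.10, ("Feb", "London"): 0.25,
    ("Mar", "Berlin"): 0.05, ("Mar", "Paris"): 0.40, ("Mar", "London"): 0.15,
}
cities = ["Berlin", "Paris", "London"]

ranked_months = topcount(month_growth, 2, month_growth.get)
ranked = generate(
    ranked_months,
    lambda month: topcount(cities, 2, lambda city: growth[(month, city)]),
)
print(ranked)
```

The outer set is fixed first, so January's strong Berlin figure never appears: cities are only compared inside months that already made the cut.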
  • 11. LAST 2 QUARTERS OF DATA FOR HAMBURG This query uses the Adventure Works cube to show, on rows, Hamburg and the states that have the same parent as Hamburg. Columns show the last 2 available quarters of data for internet sales and the percent of geographical parent. The significance of this query is that by using a FILTER for null data combined with the TAIL function, the last two quarters of available data for the current customer geography will always be returned. Complete Result Set
  • 12. CALCULATING COMMON TIME BASED METRICS The Adventure Works cube is used here to show all customer country members across the columns. Rows show June and July of 2004, crossed with the following measures: Internet Sales Amount, Sales Amount Last Period, YTD Sales, YTD Sales for last year, Geographic % of Parent for Internet sales for the month and Geographic % of Parent for Internet sales for last year. The time-based expressions shown here were combined with other time-based expressions in order to assemble more complex metrics. As an example, the [YTDLY] member used the MDX function PARALLELPERIOD with the [YTD Sales] member to determine the year-to-date sales one year ago. Complete Result Set
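Composing one time-based metric from another can be shown with a small Python sketch. It assumes calendar years and invented monthly figures; the function names are hypothetical and the code is not MDX, it only mirrors the idea of shifting back one year and then reusing the year-to-date calculation.

```python
def ytd(sales, year, month):
    """Year-to-date total: sum of months 1..month in the given year."""
    return sum(sales.get((year, m), 0.0) for m in range(1, month + 1))


def ytd_last_year(sales, year, month):
    """Same month one year earlier, then accumulate year-to-date from there."""
    return ytd(sales, year - 1, month)


# Hypothetical monthly internet sales keyed by (year, month).
sales = {(2003, m): 100.0 * m for m in range(1, 13)}
sales.update({(2004, m): 200.0 for m in range(1, 8)})

ytd_july = ytd(sales, 2004, 7)
ytdly_july = ytd_last_year(sales, 2004, 7)
print(ytd_july, ytdly_july)
```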
  • 13. INTEGRATION SERVICES ETL SYSTEM (SSIS)
  • 14. ALLWORKS CONSTRUCTION COMPANY PROJECT AllWorks is a fictitious construction company that uses data stored in various formats as part of their enterprise system. Employee and Client Geography data, along with Overhead and Job Order master data are stored in spreadsheets. The Material Purchases data is exported from an Oracle database into XML format, and Timesheet data is provided in .csv files. For this project all data from the files were transformed and loaded to a normalized database using SSIS. The Package Flow Design shows the input data flow into the SSIS packages. Each SSIS package was named for the respective AllWorksOLTP database table that it loads. AllWorksOLTP Database
  • 15. PROJECT SOLUTION The project solution contains all the data load packages as well as a Master package and a Database Maintenance package. The Master package uses the Execute Package Task to run each of the packages in the proper sequence, based on database foreign key constraints, followed by the Database Maintenance package. The Control Flow of each package generates an email notification for either the success or failure of the package. Success emails include counts of files processed, rows inserted, rows changed and invalid rows. Package configuration is used to dynamically update the database server, SMTP server and mailbox variables based on the current runtime environment.
  • 16. TIMESHEET PACKAGE CONTROL FLOW The Timesheet Package Control Flow uses a Foreach Loop Container to loop through a variable number of .csv timesheet files in a folder. A Script Task inside the loop container accumulates insert/update/error totals in variables, for each file and for the entire folder. Script Tasks are used to write either a success or a failure email, as applicable. The Send Mail Task is used to send the email to an SMTP server.
  • 17. TIMESHEET DATA FLOW The data for each record that moves through the Timesheet Data Flow pipeline is first converted from a .csv file data type to a SQL Server data type. The data is then checked using a Lookup Transformation to verify that ProjectID and EmployeeID are valid. A Conditional Split Transformation is used to make sure the project has not been closed. If the project is not closed and the ProjectID and EmployeeID are valid, a Lookup Transformation is used to determine whether the employee timesheet already exists in the database. If it does not, the timesheet record is inserted; otherwise the timesheet record is updated. A Conditional Split is used so that only modified timesheet data is sent to the database. An OLE DB Destination is used to insert data. An OLE DB Command is used to update data.
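The routing decisions in this data flow can be sketched as a single Python function. This is an illustration of the branching only, not SSIS: the field names, the sample rows and the rule that any closed date marks a project as closed are assumptions, and the slide does not say how closed-project rows are counted.

```python
from collections import Counter


def route_timesheet(row, projects, employees, existing):
    """Return the branch a timesheet row would take: invalid, insert, update or unchanged."""
    if row["project_id"] not in projects or row["employee_id"] not in employees:
        return "invalid"  # failed the ProjectID / EmployeeID lookup
    if projects[row["project_id"]] is not None:
        return "invalid"  # assumption: a project with a closed date is closed
    key = (row["employee_id"], row["project_id"], row["work_date"])
    if key not in existing:
        return "insert"  # no matching timesheet yet
    if existing[key] != row["hours"]:
        return "update"  # timesheet exists and was modified
    return "unchanged"  # nothing to send to the database


# Hypothetical reference data: project -> closed date (None means open).
projects = {"P1": None, "P2": "2006-03-31"}
employees = {"E1", "E2"}
existing = {("E1", "P1", "2006-04-03"): 8.0, ("E2", "P1", "2006-04-03"): 6.0}
rows = [
    {"employee_id": "E1", "project_id": "P1", "work_date": "2006-04-03", "hours": 8.0},
    {"employee_id": "E2", "project_id": "P1", "work_date": "2006-04-03", "hours": 7.5},
    {"employee_id": "E1", "project_id": "P1", "work_date": "2006-04-04", "hours": 8.0},
    {"employee_id": "E1", "project_id": "P2", "work_date": "2006-04-04", "hours": 4.0},
    {"employee_id": "E9", "project_id": "P1", "work_date": "2006-04-04", "hours": 4.0},
]

# Per-file totals of the kind reported in the notification emails.
totals = Counter(route_timesheet(r, projects, employees, existing) for r in rows)
print(totals)
```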
  • 18. LOOKUP TRANSFORMATION A Lookup Transformation Editor is used to validate the ProjectID and also to bring the Project Closed Date data into the dataflow pipeline.
  • 19. C# SCRIPTING FOR ACCUMULATING TOTALS Microsoft Visual Studio Tools for Applications (VSTA) was accessed using the Script Task. C# scripting is used to accumulate the Timesheet record count totals. Variables used by the Script Task were first created in the SSIS package. Read-write variables were selected for use in the Script Task Editor from the SSIS package variables. These variables were then available to be used within the code of the Script Task through the Dts.Variables collection, which Integration Services automatically creates and makes available to the script code.
  • 20. MASTER PACKAGE Execute Package Tasks with precedence constraints that mirror PK/FK database constraints were used to execute the individual packages in the ETL solution. Upon successful completion of the ETL, database maintenance is performed.
  • 21. DATABASE MAINTENANCE PACKAGE The Database Maintenance Package shrinks the database, rebuilds the database indexes, updates the database statistics and backs up the database nightly following successful completion of the Master Package.
  • 22. PRODUCTION SCHEDULE Following deployment of the ETL packages to SQL Server, the Master Package was scheduled to run nightly using SQL Agent.
  • 24. DATA SOURCE VIEW SQL Views were scripted in the AllWorks OLTP database to generate the dimensions and measures needed to create the AllWorks dimensional database in Analysis Services. In SSAS a data connection to the SQL Server relational database was established and a data source view (DSV) created using the dimensional views. In the DSV, entity relationships and logical primary keys were identified. The JobClosedDate in the vwDimProject view is used in a role-playing dimension, and therefore an entity relationship between vwDimProject.ClosedDateKey and vwDimDate.WeekendKey is needed. A named calculation was created to provide a description for the Boolean value returned by the Employee Flag in the vwDimEmployee view.
  • 25. ALLWORKS DW CUBE A cube consisting of four measure groups and five dimensions was then generated using the Cube Wizard. The Date dimension is used in the cube twice as a role-playing dimension for the Project Closed date.
  • 26. DIMENSION USAGE Relationships in the cube between the dimensions and measures were verified by inspecting the Dimension Usage tab. A referenced relationship was manually created for the intersection between the Project Closed Date (role-playing dimension) and the Summary Measure Group. This Dimension Usage tab from another cube shows a many-to-many relationship between BookSales and Authors. Each Author can have multiple Books for sale, and each Book may have more than one Author. This necessitates the intermediate cross-reference table, BookXAuthors.
  • 27. DIMENSIONS All cube dimension attributes and user-defined hierarchies were specified using the Dimension Structure tab. Here a date hierarchy was built using the Year, Quarter and Weekend date. Rigid attribute relationships were created to associate the attributes used in the Date Tree hierarchy. Setting the RelationshipType property determines whether Analysis Services creates rigid or flexible aggregations. After an incremental update, Analysis Services drops flexible aggregations and those aggregations must be manually reprocessed, but Analysis Services persists rigid aggregations, which improves query performance.
  • 28. BROWSING THE CUBE After the cube was deployed and processed successfully, the Cube Browser dimensions and measures were selected from the metadata pane and the results were examined against the original OLTP database for accuracy. The cube can also be easily browsed using an Excel Pivot Table.
  • 29. CALCULATIONS MDX expressions were used to create calculated members and named sets. This calculated member computes the open receivables as a percent of invoice amount. It is also rendered here as an Excel pivot table, which uses it along with the Invoice Amount measure and the Open Receivables calculated member, for the All Clients named set in 2006, broken out by quarter.
  • 30. KPIs Key performance indicators (KPIs) were created using the newly deployed calculated members for the KPI value expression. In this KPI a traffic light is used to visually represent the Profit Percent metric, with a goal of greater than 15% profit. The KPI is displayed for all clients using MS Excel. A green light indicates that the goal of greater than 15% profit for the client’s projects has been met. A yellow light indicates that the profit percentage is between 5% and 15% inclusive. A red light indicates that the profit is less than 5%. No traffic light is displayed for clients who do not have any closed projects. While the KPI goal here is static, it can also be data driven.
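The traffic-light thresholds can be written down as a small Python function. This is a sketch of the status rule only, not the KPI's MDX status expression; the function name is hypothetical and `None` is used as a stand-in for a client with no closed projects.

```python
def profit_status(profit_pct):
    """Map a profit fraction to a traffic-light colour using the thresholds on the slide."""
    if profit_pct is None:
        return None  # no closed projects: no light is shown
    if profit_pct > 0.15:
        return "green"  # goal of more than 15% profit met
    if profit_pct >= 0.05:
        return "yellow"  # between 5% and 15%, both inclusive
    return "red"  # below 5%


for value in (0.22, 0.15, 0.05, 0.049, None):
    print(value, profit_status(value))
```

Exactly 15% is yellow rather than green, because the goal is stated as strictly greater than 15%.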
  • 31. ACTIONS A report URL action was created so that the client application can execute a live Google Maps search based on the project’s county.
  • 32. PARTITIONS AND AGGREGATIONS Partitions were created for each fact table to separate current from historical data. MOLAP storage with aggregations designed for a 50% performance increase was specified. Aggregations are precalculated summaries of data from leaf cells.
  • 34. MOVING AVERAGE OLAP REPORT This report shows sales dollar revenue as a column bar and the 12 month moving average as a horizontal line. The user can select one or more years, and a product category, subcategory or product. The Fiscal Year dropdown excludes the first year of sales.
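A trailing 12-month moving average can be sketched in Python as follows. This is not the report's query; the function name and the monthly figures are hypothetical. The sketch returns no value until a full window of history exists, which is one plausible reason (an assumption, the slide does not say) for leaving the first year of sales out of the Fiscal Year dropdown.

```python
def moving_average(values, window=12):
    """Trailing average over `window` periods; None until a full window is available."""
    averages = []
    running = 0.0
    for index, value in enumerate(values):
        running += value
        if index >= window:
            running -= values[index - window]  # drop the period that left the window
        averages.append(running / window if index >= window - 1 else None)
    return averages


# Hypothetical monthly revenue for two years: 1.0, 2.0, ..., 24.0.
monthly = [float(m) for m in range(1, 25)]
ma = moving_average(monthly)
print(ma[10], ma[11], ma[23])
```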
  • 35. SALES BY CATEGORY EXPLODED PIE CHART This Exploded Pie Chart report shows sales proportions by category. Here the legend is hidden and an expression is used to label each pie slice with the Category, Sales Dollar Amount and the % of Total. The labels are configured to display outside of the pie for readability purposes.
  • 36. EMPLOYEE SALES QUOTAS W/ SPARKLINES & GAUGES This is an employee sales matrix OLAP report. The gauges give an ‘at a glance’ indication of each employee’s sales in comparison to their quota, while the sparklines provide a visual representation of each employee’s sales vs quota trend.
  • 37. TOP (N) PRODUCTS & TOP (Y) CITIES OLAP REPORT This report allows the user to select the top N products by revenue and, within each product, the top Y cities. The user can select one or more years. This report uses the MDX GENERATE function to achieve the top Y within the top N.
  • 38. SHAREPOINT BI DELIVERY (DASHBOARD)
  • 39. CONTOSO RETAIL DASHBOARD The Contoso Retail Dashboard project is comprised of six PerformancePoint pages deployed to SharePoint 2010. An additional SSRS report with a delivery subscription is also included. Performance Point Content Site Collection Dashboard Pages
  • 40. KPI SCORECARD W/PROFIT MARGIN HOTLINK REPORT The first dashboard page contains two objective KPIs (Financial and I.T. Systems) and four KPIs (Product Gross Profit Margin, Channel Revenue, Returns Pct and Machine Downtime Trend). Each KPI has a “hot-link” report. The Financial KPIs show all Product Categories underneath. The KPI scorecard and associated charts were created using the PerformancePoint Dashboard Designer. The Product Gross Profit Margin KPI is linked to the chart by the filter date, filter geography and product category selection. The hot-link chart shows monthly sales, gross profit margin and gross profit margin for the past year.
  • 41. KPI SCORECARD W/SALES DRILL DOWN TO BRAND Charts can be drilled down into for more granular information. This is a drill down for the TV and Video Product Category to see which Brands are being sold.
  • 42. KPI SCORECARD W/CHANNEL SALES HOTLINK REPORT The Channel Revenue KPI is linked to the chart by the filter date, filter geography and current product category selection. The hot-link chart shows the monthly sales quota amount, the sales amount and the sales amount last year.
  • 43. KPI SCORECARD W/RETURNS % HOTLINK REPORT The Returns Percent KPI is linked to the chart by the filter date, filter geography and current product category selection. The hot-link chart shows the sales return amount and the sales return percent for the ten stores with the most returns.
  • 44. KPI SCORECARD W/HOTLINK SUPPORTING GRID The Machine Downtime KPI contains a hot-link to an analytic grid that breaks down Machine Down Time, Machine Down Time Last Year and the Trend Outage Type by Fiscal Year and Geography.
  • 45. PRODUCT SALES PROFILE WITH SIBLINGS REPORT The second page of the Dashboard contains a PerformancePoint Analytic Chart of Product Monthly Sales for the selected Fiscal Year, along with the Product Sales as a percent of the product’s hierarchical parent. A Supporting Grid also displays the Monthly Product % of Parent Sales for all siblings of the Product selection.
  • 46. EMPLOYEE PROFILE W/DRILL DOWN TO DECOMPOSITION TREE The third page of the PerformancePoint Dashboard contains an Analytic Chart of Sales as a % of Quota, by employee and month for a year. A drill down to a decomposition tree for employee Kim Abercrombie is used to display her subordinates and their % of Sales Quota, as well as how they rank in relation to their peers. Decomposition trees can be used for root cause analysis.
  • 47. SSRS SALES MAP AND RETURNS % REPORT The interactive Sales Map was created in SSRS and deployed to SharePoint. Here PerformancePoint acts as a wrapper for the SQL Server Reporting Services report, so that the report can be displayed in the dashboard. The PerformancePoint Fiscal Year and Sort Option filters are linked to the map and chart, which show Sales Amount by State. Additionally, the chart plots the Sales Return Percent on a second vertical axis.
  • 48. SSRS SALES MATRIX The fifth page of the Dashboard is an SSRS Matrix report with two adjacent column groups for Sales by Retail Channel and Promotion, and Row Groups for Product Category/Subcategory with drilldown and % of total functionality.
  • 49. EXCEL SALES REPORT WITH SLICERS A Pivot Table with Slicers for Product Category/Subcategory, Sales Channel and Fiscal Year was created in Excel 2010. The Pivot Table shows Sales Amount, Sales Return Amount, Gross Profit Margin and Sales Return Percent by the Geography Hierarchy. The report was saved to SharePoint from Excel and then utilized by PerformancePoint to create the last page of the Contoso Retail Dashboard.
  • 50. SSRS TOP SELLING STORES SUBSCRIPTION REPORT This is a simple Reporting Services report deployed to SharePoint. The report was set up to run nightly for the states of Maryland and Virginia and be delivered to a SharePoint folder.
  • 51. Pamela Staerker SetFocus Master’s BI Program 2011-2012 Instructor: Kevin S. Goff, Microsoft SQL Server MVP http://www.linkedin.com/in/pstaerker