Real-World Cube Design and Performance
Tuning with SSAS Multidimensional
Chris Webb
Who Am I?
• Chris Webb
– Email: chris@crossjoin.co.uk
– Twitter: @Technitrain
• Analysis Services consultant and trainer
– www.crossjoin.co.uk & www.technitrain.com
• Co-author:
– MDX Solutions
– Expert Cube Development with SSAS 2008
– Analysis Services 2012: The BISM Tabular Model
• SQL Server MVP
• Blogger: http://cwebbbi.wordpress.com
BID-202
Agenda
• 9:00-11:00 - Working in BIDS/SSDT and Dimension
Design Tips & Tricks
• 11:15-12:30 – Measure and Measure Group Tips &
Tricks
• 13:30-14:15 – Understanding SSAS Query Execution
• 14:15-15:00 - Designing Effective Aggregations
• 15:15-16:30 – Tuning MDX Calculations
• 16:45 – 17:30 - Caching and Cache Warming, Q&A
Working in BIDS,
Data Sources and Data Source Views
Agenda
• Online vs Project mode
• Project properties
• Data Sources
• Data Source Views
• BIDS Helper
Using BI Development Studio
• BI Development Studio (BIDS), or SQL Server Data Tools
(SSDT) in 2012, is a Visual Studio add-in for designing SSAS
databases
• You can use it in two ways:
– Project mode, where the SSAS database is held locally and
has to be specifically deployed to the server
– Online mode where you work live on the server and
deployment takes place every time you Save

• Project mode is best because:
– You can use source control
– If you make a mistake and save it, you don’t have the risk
that it will unprocess the cube
Configuring a New SSAS Project
• On a new SSAS project in BIDS, always set the
following properties:
– Processing Option should be set to Do Not Process
– Server should be changed from localhost to the real
name of your SSAS server
– Database should be set to the name of the SSAS
database you want to create
– Deployment Server Edition should be set to the name
of the SQL Server edition you plan to use in
production
Creating a Data Source
• A Data Source represents a connection to the cube’s
relational source data
• You need to choose which provider to use
– In general native OLEDB is faster than .NET

• You need to choose what security is needed for SSAS
to connect to the data source
– Supply username/password, eg for Oracle
– Windows Authentication
Creating a Data Source
• You need to choose which Windows user SSAS will
impersonate to connect to the cube
– Easiest to use the SSAS service account
– Most secure to supply a username/password

• You can have multiple Data Sources in a project but it
is a bad idea
– Your data should already be in one data mart!
– Needs to use SQL Server OpenRowset – slow
Creating a Data Source View
• A Data Source View (DSV) is simply an intermediate
stage where you can
– Pick the tables you wish to use to build your cube
– Define logical relationships between tables (note: they
are not written back to the RDBMS)
– Rename tables and columns
– Tweak the data so it is modelled the way you want

• DSVs are not visible to end users
Named Calculations and Named
Queries
• Named Calculations allow you to create derived
columns in tables
– Need a SQL expression that returns a value

• Named Queries allow you to create the equivalent of
views in the DSV
– Need a SQL SELECT statement
Dos and Don’ts of DSVs
• DO add in logical foreign key relationships between
tables, even if they are not physically present in your
database
– Means BIDS will automatically set other properties for you
later on in the design process

• DO create multiple diagrams to reduce complexity
• DO NOT use them as a replacement for proper ETL!
– It will slow down SSAS processing
– It hides important aspects of your design
– Use relational views if you must, or even better change
your SSIS packages
BIDS Helper
• BIDS Helper is a free tool that adds much important
functionality to BIDS
– Better attribute relationship visualisation
– Better aggregation designer
– Features for checking common problems

• Install it!
http://bidshelper.codeplex.com/
Modelling Complex Dimensions
Agenda
• Dimensions and Attributes
• Attribute Hierarchies and User Hierarchies
• Attribute Relationships
• Slowly Changing Dimensions
• Parent Child and Ragged Hierarchies
Dimensions and Attributes
• An SSAS Dimension is composed of multiple
Attributes
• Attributes represent the different levels of
granularity on a dimension that data can be
aggregated up to
– On a Time dimension you might have attributes like
Year and Month
– Based on the columns in the dimension table

• By default, each attribute becomes a hierarchy you
can use when browsing the cube
Dimension Properties
• ProcessingGroup – using ByTable results in a single
query being issued to process the whole dimension
– Might improve performance but not usually a good
idea!
– Also need to turn off Duplicate Key errors

• StringStoresCompatibilityLevel (new in 2012) - avoids
4GB string store limit
– May slow performance and increase size on disk
Attribute Properties
• AttributeHierarchyEnabled controls whether an attribute
hierarchy will be built
– If one isn’t the attribute is visible only as a property of
another attribute
– Useful for things like phone numbers and addresses
– Not building a hierarchy can save time when processing
the dimension

• AttributeHierarchyOptimizedState controls whether
indexes are built
– Can save time processing the dimension
– But can have a big impact on query performance
Attribute Properties
• DefaultMember controls what the default selected
member is
– If you don’t set it, the default member will be the first real
member on the first level
– So usually the All Member
– Setting it can confuse users

• IsAggregatable controls whether an All Member is
created for a hierarchy
– Some hierarchies contain data that should never be
aggregated, eg different budgets
– If this is set to False, always set a specific default member
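A default member can be set in the cube's MDX Script with ALTER CUBE; a minimal sketch, where the [Budget].[Scenario] hierarchy and its [Actual] member are hypothetical names:

ALTER CUBE CURRENTCUBE
UPDATE DIMENSION [Budget].[Scenario],
DEFAULT_MEMBER = [Budget].[Scenario].[Actual];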
Ordering Attributes
• The OrderBy property controls the order of members
on a hierarchy
– You can order by name, key, or the value of another
attribute

• The AttributeHierarchyOrdered property controls
whether the attribute is ordered at all
– Not ordering it can result in faster processing
– But means that member ordering is arbitrary
Time Dimensions
• Setting the Type property of a dimension to ‘Time’
enables some time-specific features:
– Certain MDX functions, like YTD, are time-aware
– Semi-additive measures need a time dimension

• You will also need to set the Type property on
attributes such as years, quarters, months and dates
for everything to work properly
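For example, a minimal YTD calculation that relies on these Type properties being set (the [Date].[Calendar] hierarchy and [Sales Amount] measure are assumed names):

CREATE MEMBER CURRENTCUBE.[Measures].[Sales YTD] AS
// YTD() can only find the year level if the Type properties are set
Sum(YTD([Date].[Calendar].CURRENTMEMBER), [Measures].[Sales Amount]);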
Attributes and User Hierarchies
• User hierarchies are multi-level hierarchies that
support more complex drill paths
• Each level in a user hierarchy is based on an attribute
– On a Time dimension there might be a Calendar
Hierarchy with levels Year, Quarter, Month, Date

• If you put an attribute hierarchy in a user hierarchy,
then set its Visible property to False
Attribute or User Hierarchies?
• As always, it depends...
• User hierarchies:
– Are easier for users to understand
– Make writing calculations where you need to find
ancestors/descendants easier
– May lead to better query performance overall

• Attribute hierarchies:
– Are more flexible – you can’t put two levels of the
same user hierarchy on different axes
Attribute Relationships
• Attribute Relationships describe the one-to-many
relationships between attributes
– For example a Year contains many Months and a Month
contains many Dates

• All attributes must have a one-to-many relationship with
the Key attribute
• Natural user hierarchies are user hierarchies where every
level has a one-to-many relationship with the one below
• The BIDS Helper Dimension Health Check functionality
can check whether attribute relationships have been
correctly configured
Optimal Attribute Relationships
• Optimising attribute relationships is very important
for query performance
• In general, query performance will be better with
‘bushy’ rather than ‘flat’ attribute relationships
• Tweaking your data to allow the creation of chains of
relationships is a good thing
– So long as it doesn’t compromise your data

• Also consider if merging many dimensions into one
will allow you to build better relationships
Attribute Relationship Scenarios
• Forks versus Chains
• Creating dummy attributes to make unnatural
hierarchies natural
• Type 2 Slowly Changing Dimensions
• Diamond-shaped relationships
Slowly Changing Dimensions
• Analysis Services has no specific support for SCDs,
but can handle them very easily
• Type 1 SCD changes require a Process Update
– Aggregations may also need to be rebuilt

• Type 2 SCD changes require a Process Add or a
Process Update
• Modelling attribute relationships correctly on a Type 2
SCD can make a big difference to query performance
Rigid and Flexible Relationships
• Attribute relationships can be rigid or flexible
• Rigid relationships are relationships that will never
undergo a Type 2 change
• Flexible relationships may undergo a Type 2 change
• Any aggregation that includes an attribute with a
flexible relationship between it and the key attribute
will be dropped after a Process Update on a
dimension
Parent/Child Hierarchies
• Parent/child hierarchies can be built on tables which
have a recursive or “pig’s ear” join
• Allow you to build variable-depth hierarchies such as Org
Charts or Charts of Accounts
• Are very flexible, but have several limitations:
– You can only have one per dimension, and it must be built
from the key attribute
– They can cause serious query performance problems,
especially when they contain more than a few thousand
members

• Avoid using them unless you have no other choice!
Members with Data
• Parent/child hierarchies can be used to load non-leaf
data into a cube
• Useful for data that is non-aggregatable
– Eg Data that has been anonymised

• However will perform relatively badly
• Each member on a parent/child hierarchy has an
associated DataMember
– The non-leaf data is associated with the DataMember
– The DataMember can be visible or hidden
– By default, the DataMember’s value will be aggregated
along with the other children’s values, but this can be
overridden with MDX
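As a sketch of that override, assuming a parent/child [Employee].[Employees] hierarchy and a [Sales Amount] measure, a separate calculated measure can expose just a member's own non-leaf data:

CREATE MEMBER CURRENTCUBE.[Measures].[Own Sales Amount] AS
// Reads the DataMember's value instead of the aggregated total
([Measures].[Sales Amount],
[Employee].[Employees].CURRENTMEMBER.DATAMEMBER);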
Ragged Hierarchies
• In some cases, you can make a normal user hierarchy
look ‘ragged’ by setting the HideMemberIf property
• This allows members to be hidden in a user hierarchy
when they have
– No name
– The same name as their parent
– Either of the above plus they are the only child

• Useful when you know in advance the maximum
number of levels in your hierarchy
Ragged Hierarchies
• Have to set the connection string property MDX
Compatibility=2 to make them appear ragged in all
cases
– Most client tools do this by default; Excel and the
Dimension Editor do not

• To make Excel work, duplicate values from the
bottom of the hierarchy upwards
Modelling complex Measures and Measure
Groups
Cubes and Measure Groups
• A Cube should contain all of the data that any user
would want to query
– You can’t query more than one cube at a time

• Each Fact Table in your data mart will become a
Measure Group in your cube
• Usually better to create one cube with multiple
measure groups than multiple cubes
• You can use Perspectives to simplify what end users
see
When To Use Multiple Cubes
• A single cube has some disadvantages:
– A single cube can become too complex to maintain
and develop
– If a dimension needs a full process, then a single cube
needs completely reprocessing
– Multiple cubes can be spread over multiple servers for
scale-out
– Multiple cubes are easier to secure
Measure Aggregation
• Basic measure aggregation types are Sum, Count
(either rows in the fact table, or non-null column
values), Min and Max
– Average can be written with some simple MDX (see the sketch after this slide)

• Enterprise Edition gives several more complex
aggregation functions
• Custom aggregation can be achieved with MDX
calculations
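A minimal sketch of such an average, assuming existing Sum and Count measures called [Sales Amount] and [Sales Count]:

CREATE MEMBER CURRENTCUBE.[Measures].[Average Sale] AS
// Return null rather than an error where there are no rows
IIF([Measures].[Sales Count] = 0, NULL,
[Measures].[Sales Amount] / [Measures].[Sales Count]);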
Distinct Count
• Distinct Count measures will be created in their own
measure group
– This is for performance reasons

• They are likely to perform worse than other
measures
– Require a special partitioning strategy
– Very IO intensive – solid state disks can be useful
Semi-Additive Measures
• Semi-additive measures act like Sum for all
dimensions except the Time dimension
– NB the Time dimension needs to have its Type
property set to ‘Time’
– If there are multiple Time dimensions, the measures
will only be semi-additive for one

• They are: FirstChild, LastChild, FirstNonEmpty,
LastNonEmpty, AverageOfChildren
• Useful for handling snapshot fact data
Other Aggregation Types
• None means that data only appears at the
intersection of the leaves of all dimensions
• ByAccount allows you to create measures which have
different semi-additive behaviour for different
members
– Again, useful for financial applications
Regular Relationships
• The Dimension Usage tab in BIDS shows how
Dimensions join to Measure Groups
• No Relationship is a valid relationship type!
• Regular relationships are the equivalent of inner
joins between dimensions and measure groups
– Most commonly used relationship type
Different Granularities
• The Granularity Attribute property controls which
attribute is used to make the join on a Regular
relationship
– Usually the Key Attribute is used
– Join always takes place on the columns used in the
attribute’s KeyColumns property

• The IgnoreUnrelatedDimensions property controls what
happens below granularity and when there is no
relationship between dimensions and measure groups
– Can either display nulls
– Or repeat the All Member value
Fact Relationships
• Very similar to Regular Relationships, but are used
when the dimension has been built directly from the
fact table
• Differences between Fact and Regular:
– Fact relationships generate slightly better SQL for
ROLAP queries
– A measure group can only have one Fact relationship
– Fact relationships may be treated differently by some
client tools
Referenced Relationships
• Used when a dimension connects to a measure group via
another dimension
• Materialized referenced relationships give fast query
performance
– Require a join between the Reference dimension and the
fact dimension at processing, which can slow down
processing a lot

• Non-materialized referenced relationships result in
slower query performance but no overhead at processing
time
• Often better to build a single dimension instead, unless it
would result in excessive duplication of attributes
Many-to-Many Relationships
• Allow you to model scenarios where dimension members
have many-to-many relationships with facts
– Eg a Bank Account might be associated with several
Customers

• M2M relationships require a bridge table
– Becomes an ‘intermediate’ measure group
– Once all regular relationships have been set up, the M2M
relationship can be created

• Carry a query performance penalty
• But extremely flexible and a very important feature for
modelling many complex scenarios
Role-Playing Dimensions
• It is possible to add the same Database dimension to
a cube more than once
– This is called a role-playing dimension
– Eg the Date dimension could be an Order Date and a
Ship Date

• Saves time processing and on maintenance
• But since you can only change the name of the
dimension, and not the hierarchies on it, it can be
confusing for users
Measure Expressions
• Measure Expressions allow you to:
– Multiply or divide two measures before aggregation
takes place
– The measures have to be from two different measure
groups
– There needs to be at least one common dimension

• Useful for currency conversion or share calculations
• Faster than the equivalent MDX
Partitions
• Measure groups can be divided up into multiple
partitions
– Enterprise Edition only

• You should partition your measure groups to reflect the
slices that users are likely to use in their queries
– Usually by the Time dimension

• Partitioning improves query performance in two ways:
– It reduces overall IO because AS should only scan the
partitions containing the data requested
– It increases parallelism because AS can scan more than
one partition simultaneously
Partitioning for Regular Measures
• There are two things to consider when planning a
partitioning strategy:
– How and when new data is loaded
– How users slice their queries

• Luckily, partitioning by time usually meets both
needs
– Eg building one partition per day or month

• Can also partition by another dimension if the
resulting partitions are too large, or users regularly
slice queries in other ways
Partition Sizing
• How big should a partition be?
– Up to 20-50 million rows is ok, but 100 million rows can be
fine too
– Try not to create small partitions: those with less than
4096 rows will have no indexes built

• Also, try not to create too many partitions:
– BIDS and SQLMS get slower the more partitions you have
(can be very bad in older versions of SSAS)
– Aggregations are only built per partition, so over
partitioning reduces their effectiveness
– Aim for no more than 500 to 2000 partitions
Setting the Slice
• By default, SSAS knows the minimum and maximum
dataids of members from each attribute in a partition
• Uses this information to work out which partitions to
scan for a query
• If ranges overlap then SSAS may scan many partitions
unnecessarily
• To stop this, set the Slice property on the partition
– Takes the form of an MDX tuple
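For example, the Slice property of a partition holding only January 2008 data might be set to a tuple like this (the hierarchy and member key are assumptions):

([Date].[Calendar].[Month].&[200801])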
Partitioning for Distinct Count
• Partitioning for distinct count measures must take a
completely different approach
– Remember distinct count measures should be in their
own measure group anyway

• You need to:
– Ensure each partition contains a non-overlapping
range of values in the column you’re doing a distinct
count on
– Ensure that each query hits as many partitions as
possible, so SSAS can use as many cores as possible in
parallel
Partitioning and M2M
• Partitioning large intermediate measure groups used
in many-to-many relationships can be useful for
performance
– Either partition by the dimensions used in the m2m
relationship
– Or add other dimensions to the relationship

• Querying by a m2m dimension does not result in a
filter being applied on intermediate dimensions
– Can result in a large number of partition scans
Controlling the Amount of Parallelism
• The CoordinatorExecutionMode server property
defines the number of partitions in one measure
group that can be queried in parallel
– By default is set to -4 = 4 * number of processors

• CoordinatorQueryBalancingFactor and
CoordinatorQueryBoostPriorityLevel can be used to
limit the ability of a single query to use all resources
and block other queries
Understanding How Analysis Services
Answers Queries
Agenda
• SSAS Architecture Overview
• The Storage Engine
• The Formula Engine
• Using Profiler
How SSAS answers queries
[Diagram: MDX Query In → Formula Engine → Cellset Out. The Formula Engine works out what data is needed for each query, and requests it from the Storage Engine as Query Subcube Requests; each engine has its own Cache. The Storage Engine handles retrieval of raw data from Disk, and any aggregation required.]
Subcubes
• Subcubes are slices of data from the cube
• Appear in many places:
– MDX Scope statements
– Caching
– Query Subcube requests

• Subcubes are defined by their granularity and slice
Granularity
• Single grain
– List of GROUP BY attributes in SQL SELECT

• Mixed grain
– Both Attribute.[All] and Attribute.MEMBERS
Granularity
[Diagram: two example lattices – for Product: All Products → Products; for Geography: (All Country, All City) → (Countries, All City) → (Countries, Cities)]
Lattice of granularities
(All, All, All)

(All, All, *)   (All, *, All)   (*, All, All)

(All, *, *)   (*, All, *)   (*, *, All)

(*, *, *)
Granularity ordering
• (All,*) is above (*,*)
• (All,*) is below (All,All)
• (All,2008) is at (All,*)
• (All,*) and (*,All) are incompatible
• Attribute relationships make some arrows impossible:
– (State.*, All Cities) – OK
– (State.All, City.*) – cannot occur
Slice
• Single member
– SQL: Where City = ‘Redmond’
– MDX: [City].[Redmond]

• Multiple members
– SQL: Where City IN (‘Redmond’, ‘Seattle’)
– MDX: { [City].[Redmond], [City].[Seattle] }
Slice at granularity
SQL:
SELECT Sum(Sales), City FROM Sales_Table
WHERE City IN ('Redmond', 'Seattle')
GROUP BY City

MDX:
SELECT Measures.Sales ON 0
, NON EMPTY {Redmond, Seattle} ON 1
FROM Sales_Cube
Slice below granularity
SQL:
SELECT Sum(Sales) FROM Sales_Table
WHERE City IN ('Redmond', 'Seattle')

MDX:
SELECT Measures.Sales ON 0
FROM Sales_Cube
WHERE {Redmond, Seattle}
Examples
[Diagrams: subcubes shown shaded on a grid of Cities (All Cities, Redmond, Seattle, New York, London) against Years (All Years, 2005, 2006, 2007, 2008) – first the full grid, then:
– (Seattle, Year.Year.MEMBERS)
– (Seattle, Year.MEMBERS)
– ({Redmond, Seattle, London}, Year.MEMBERS)
– ({Redmond, Seattle}, {2005, 2006, 2007})]
Retrieving Data
• The first step to answering a query is for the FE to
generate a query plan
– Unfortunately, we can’t see these query plans

• The next step to answering a query is to work out which
subcubes of real (ie not calculated) data are needed
– Not just for ‘real’ measures but also for calculations too

• These are translated to single-grain subcubes
• Each request from the FE to the SE is visible in Profiler as
a Query Subcube request
Sonar
• Sonar is the component of the FE engine that
determines which subcubes should be retrieved to
answer the query
• For complex queries this can involve many difficult
decisions:
– One large subcube or many small ones?
– In some cases it may be more efficient to retrieve
more data than the query requires
[Diagrams: a requested region of cells can be covered in two ways – a non-precise solution that retrieves the single larger subcube [2, *], or a precise solution that retrieves [2, {2,3,4,5,6}]. Which one is better?]
Controlling Sonar
• Sometimes SSAS can decide to prefetch too much
data, causing performance problems
• To stop this happening
– Rewrite your query to use different attributes or user
hierarchies
– Use a subselect instead of a WHERE clause (though
this will impact caching)
– Use the Disable Prefetch Facts=True; Cache Ratio=1
connection string properties
The Storage Engine
• The SE receives requests for subcubes of data from
the FE
• Then:
– If the subcube can be derived from data in cache, it
will read data from cache
– If the subcube can be derived from an aggregation, it
will read data from the aggregation
– Otherwise it will read the data from a partition

• Finally it will perform any aggregation necessary and
return the subcube
Reading Data from Disk
• Next, the SE determines which
partitions/aggregations contain the required slices of
data
• It then reads data from these partitions
– It will attempt to read from several partitions in
parallel if possible

• Finally merges and aggregates the data from all
partitions and loads it into the SE cache
• For distinct count measure groups things are more
complex...
The Formula Engine
• The FE can then read data from the SE cache and
perform calculations
• The FE tries to determine which subcubes have the same
calculations applied to them
– It can then perform calculations over large numbers of
cells very efficiently
– Known as ‘bulk mode’ or ‘block computation’

• When it cannot do this, it has to iterate over every cell,
determine which calculations need to be applied, and
perform the calculations
– This can be very slow
– Known as ‘cell by cell’ mode
Bulk Mode/Block Computation
• Cube space populated at fact table grain is generally extremely sparse
– Values only exist for a small minority of possible combinations of dimension keys
– Most often, everything takes on a default value – typically (but not always) null
• Goal is to compute expressions only where they need to be computed
• Take the following example:
WITH MEMBER Measures.ContributionToParent AS
Measures.Sales/
(Measures.Sales, Product.CurrentMember.Parent)
SELECT
Product.[Product Family].members ON COLUMNS,
Customer.Country.Members ON ROWS
FROM
Sales
WHERE
Measures.ContributionToParent
Cell-by-Cell Approach
Measures.ContributionToParent

         Drink      Food        Non-Consumable
Canada   (null)     (null)      (null)
Mexico   (null)     (null)      (null)
USA      9.22%      71.95%      18.83%

= Measures.[Unit Sales]

         Drink        Food          Non-Consumable
Canada   (null)       (null)        (null)
Mexico   (null)       (null)        (null)
USA      $24,597.00   $191,940.00   $50,236.00

/ (Measures.Sales, Product.CurrentMember.Parent)

         All Products
Canada   (null)
Mexico   (null)
USA      $266,773.00

AS Calc Engine Rules: Null / Null = Null
Disadvantages of Cell-by-Cell
• The same navigation for each cell repeated
– Same “relative offset” for each cell yet FE repeats the same navigation
for every cell in the subspace

• Other work repeated for every cell
– An expensive check for recursion to determine if pass should be
decremented
– Eg Sales = Sales * 1.2

• Calculating null values when we should know ahead of time
they are null
– Extremely important in a sparse value subspace
Bulk Mode Goals
• Only calculate non-default values for a subcube
– Usually, but not always, the default value is null

• Perform cell navigation once for each subcube
– Eg PrevMember, Parent – often the same for each cell
in a subcube
Bulk Mode Approach
1) Retrieve non-null values from the storage engine:

Country   Product          Measure   Value
USA       Drink            Sales     $24,597.00
USA       Food             Sales     $191,940.00
USA       Non-Consumable   Sales     $50,236.00

Product        Measure   Value
All Products   Sales     $266,773.00

2) Perform the computation for the non-null values only – 3 computations instead of 9:

Country   Product          Measure                  Value
USA       Drink            Contribution to Parent   9.22%
USA       Food             Contribution to Parent   71.95%
USA       Non-Consumable   Contribution to Parent   18.83%

3) ...and everything else is null:

         Drink      Food        Non-Consumable
Canada   (null)     (null)      (null)
Mexico   (null)     (null)      (null)
USA      9.22%      71.95%      18.83%
The Formula Engine Cache
• The results of calculations are then stored in the FE
cache
– Future queries can then avoid doing calculations if the
results can be retrieved from cache

• However, FE caches may exist only for the lifetime of
a query or session
Using Profiler
• SQL Profiler can be used to observe what happens
within SSAS when a query executes
– MDX Studio provides much the same information,
often in a more readable form

• Provides good information on what’s happening in
the SE, much less on FE
• It’s a good idea to create a new trace template with
the following events...
Profiler Events
• Query Begin/End
• Execute MDX Script Begin/End
• Query Cube Begin/End
• Progress Report Begin/End
• Get Data From Aggregation
• Query Subcube Verbose (shows same data as Query Subcube in a more readable form)
• Resource Usage
Interpreting Query Subcube Verbose
• Query Subcube and Query Subcube Verbose show the subcube
requests from the FE to the SE
– Single grain only
– Granularity bitmask: 1 is in grain, 0 is not in grain

Symbol   Meaning
*        All members at granularity
DataID   Single member slice
+        Set slice at granularity
-        Set slice below granularity
0        No slice and no granularity
(An arbitrary shape is a slice that cannot be expressed attribute-by-attribute)
Query Performance Benchmarking
• When performance tuning, you need to have a
benchmark to test against
– Are you improving performance?

• Methodology:
– Ensure no-one else is running queries on the server
– Ensure you’re running queries on prod data volumes and
server specs
– Clear Cache
– Run query and measure Duration, to get cold cache
performance
– Run query again to get warm cache performance
Designing Effective Aggregations in Analysis
Services 2008
Agenda
• What are aggregations and why should I build them?
• The importance of good dimension design
• The Aggregation Design Wizard
• Usage-based optimisation
• Designing aggregations manually
• Influence of cube design on aggregation usage
What is an aggregation?
• It’s a copy of the data in your fact table, pre-aggregated to a certain level
– Created when the cube is processed
– Stored on disk

• Think of it as being similar to the results of a GROUP
BY query in SQL
• It makes queries fast because it means SSAS does not
have to aggregate as much data at query time
Visualising aggregations
• You can actually see an aggregation if you:
– Create a cube using ROLAP storage
– Build aggregations
– Process the cube
– Look in your relational data source at the indexed view or table created
The price you pay
• Aggregations are created at processing time
• Therefore building more aggregations means
processing takes longer
• Also increases disk space used by the cube
• But SSAS can build aggregations very quickly
• And with the ‘right’ aggregation design, relatively
little extra processing time or disk space is needed
Should I pre-aggregate everything?
• NO!
• There is no need: you don’t need to hit an
aggregation direct to get the performance benefit
– In fact, too many aggregations can be bad for query
performance!

• Building every possible aggregation would also lead
to:
– Very, very long processing times
– Very, very large cubes (‘database explosion’)
Database explosion
10   20  |  30
30   40  |  70
---------+-----
40   60  | 100

5 extra values needed to hold all possible aggregations on
a table that had only 4 values in it originally!
When aggregations are used
• Aggregations are only useful when the Storage
Engine has to fetch data from disk
• Aggregations will not be used if the data is in the
Storage Engine cache
• Aggregations may not be useful if the cause of query
performance problems lies in the Formula Engine
Aggregation designs
• Each measure group can have 0 or more aggregation
design objects associated
– They are what gets created when you run either of the
Aggregation Design wizards
– They detail which aggregations should be built
– Remember to assign designs to partitions!

• Each partition in a measure group can be associated
with 0 or 1 aggregation designs
• Aggregations are built on a per-partition basis
Aggregation design methodology
1. Ensure your dimension design is correct
2. Set all properties that affect aggregation design
3. Run the Aggregation Design Wizard to build some
initial aggregations
4. Perform Usage-Based Optimisation for at least a
few weeks, and repeat regularly throughout the
cube’s lifetime (or automate!)
5. Design aggregations manually for individual queries
when necessary
Dimension design
• Dimension design has a big impact on the
effectiveness of aggregations
• Your dimension designs should be stable before you
think about designing aggregations
• Three important things:
– Delete any attributes that won’t ever be used
– Check your attribute relationships to ensure they are
set optimally
– Build any natural user hierarchies you think might be
needed
Attribute relationships
• Attribute relationships allow SSAS to derive values at
higher granularities from aggregations built at lower
granularities
• The more ‘bushy’ the attribute relationships on your
dimension, the more effective aggregations will be
• Flat attribute relationships make designing
aggregations much harder
Properties to set
• Before starting aggregation design, you should set
the following properties:
– EstimatedRows property for partitions
– EstimatedCount property for dimensions
– AggregationUsage property for attributes of cube
dimensions

• Setting these correctly will ensure the aggregation
design wizards will do the best job possible
AggregationUsage
• The AggregationUsage property has the following
possible values:
– Full – meaning every aggregation will include this
attribute
– None – meaning no aggregation will include this
attribute
– Unrestricted – meaning an aggregation may include
this attribute
– Default – means the same as Unrestricted for key
attributes, or attributes used in natural user
hierarchies
The Aggregation Design Wizard
• The Aggregation Design Wizard is a good way to
create a ‘first draft’ of your aggregation design
• Do not expect it to build every aggregation you’ll
ever need...
• ...it may not even produce any useful aggregations at
all!
• Increased cube complexity from AS2005 onwards
means it has a much harder job
The Aggregation Design Wizard
• The first few steps in the wizard ask you to set
properties we’ve already discussed
• However, the Partition Count property can only be
set in the wizard
• It specifies the number of members on an attribute
that are likely to have data in any given partition
• For example, if you partition by month, a partition
will only have data for one month
Set Aggregation Options step
• This is where the aggregations get designed!
• Warning: may take minutes, even hours to run
• Four options for working out which aggregations should be
built:
– Estimated Storage Reaches – build enough aggregations
to fill a certain amount of disk
– Performance Gain Reaches – build aggregations to get
x% of all possible performance gain from aggregations.
– I Click Stop – carry on until you get bored of waiting
– Do Not Design Aggregations
Set Aggregation Options step
• The strategy I use is:
– Choose I Click Stop, then see if the design process
finishes quickly and if so, how many aggregations are
built and what size
– If you have a reasonable size and number of
aggregations (build no more than 50 aggregations
here), click Finish
– Otherwise, click Stop and Reset and choose
Performance Gain Reaches and set it to 30%
– If only a few, small aggregations are designed,
increase by 10% and continue until you are happy
Usage-Based Optimisation
• Usage-Based Optimisation involves logging Query
Subcube requests and using that information to influence
aggregation design
• To set up logging:
– Open SQL Management Studio
– Right-click on your instance and select Properties
– Set up a connection to a SQL Server database using the
Log \ QueryLog properties
– The QueryLogSampling property sets the % of requests to
log – be careful, as setting this too high might result in
your log table growing very large very quickly
Usage-Based Optimisation
• You should log for at least a few weeks to get a good
spread of queries
• Remember: any changes to your dimensions will
invalidate the contents of the log
• Next, you can run the Usage-Based Optimisation
wizard
• This is essentially the same as the Aggregation
Design wizard, but with log data taken into account
(you can also filter the data in the log)
Manual aggregation design
• You will find that sometimes the wizards do not build the
aggregations needed for specific queries
– Running the wizard is a bit hit-and-miss
– The wizards may never build certain aggregations, for
example they may think they are too large
• In this case you will need to design aggregations manually
• BIDS 2008 allows you to do this on the Aggregations tab of the
cube editor
• BIDS Helper (http://www.codeplex.com/bidshelper) has
better functionality for this
Manual aggregation design
• To design aggregations manually:
– Clear the cache
– Start a Profiler trace
– Run your query
– Look for Query Subcube events that take more than 500ms
– Build aggregations at the same granularity as the Query Subcube events
– Save, deploy then run a ProcessIndex
Redundant attributes
• Often more than one attribute from the same
dimension will be used in a subcube request
• If those attributes are connected via attribute
relationships, only include the lowest one in the
aggregation
– Eg If Year, Quarter and Month are selected, just use
Month

• This will ensure that the aggregation can be used by
the widest range of queries
Influence of cube design
• There is a long list of features you might use in your
cube design that affect aggregation design and usage
• Mostly they result in queries executing at lower
levels of granularity than you’d expect
• This means aggregations are less likely to be hit, and
so less useful
• BIDS Helper and the BIDS Aggregations tab will help
you spot these situations
Many-to-many relationships
• Aggregations will never be used on an intermediate
measure group in a many-to-many relationship, if the
measure group is only used as an intermediate
measure group
• M2M queries are resolved at the granularity of the
key attributes of all dimensions common to the main
measure group and the intermediate measure group
• Therefore aggregations are less likely to be hit on the
main measure group
Semi-additive measures
• Queries involving semi-additive measures are always
resolved at the granularity attribute of the Time
dimension
• Therefore any aggregations must include the
granularity attribute of the Time dimension if they’re
to be used
Partitions
• It’s pointless to build aggregations above the
granularity of the attribute you’re using to slice your
partitions
– Eg, if you partition by Month, don’t build aggregations
above Month level
– Nothing stops you doing this though!

• Do this and you get no performance benefit and limit
the queries the aggregations can be used for
Parent-child hierarchies
• You cannot build aggregations that include parent-child hierarchies
• That also means that the levels that appear ‘within’
the parent-child structure cannot be used in an
aggregation
• You should build aggregations on the Key Attribute of
your dimension instead
– Very important if you have a DefaultMember or if
IsAggregatable=False
MDX Calculations
• Calculated members and MDX Script assignments are
very likely to request data for different granularities
from the one you’re querying
• Don’t assume – check what’s happening in Profiler!
Other things to watch for
• Measure expressions cause queries to be evaluated
at the common granularity of the two measure
groups involved
• Unary operators other than + and Custom Rollups
will also force query granularity down
• Do not include attributes from non-materialised
reference dimensions in aggregations as this may
result in incorrect values being returned
Monitoring Aggregation Usage
• Profiler can show when aggregations are used:
– Get Data From Aggregation event is fired
– Progress Report Begin/End events show reads from
aggregations
– ObjectPath column shows exactly which objects are
being read from

• XPerf utility can be used to determine how much
time is spent in IO and how much time is spent
aggregating data
Tuning MDX Calculations
Agenda
• Avoiding using MDX calculations
• Measure Expressions
• Bulk mode and Cell-by-Cell mode
• When Bulk Mode can be used
• Non Empty evaluation
• Using Profiler, Perfmon and MDX Studio to monitor FE activity
• Using named sets effectively
• Using caching to avoid recalculating expressions
• Scoping calculations appropriately
Avoiding MDX
• Doing calculations in MDX should be a last resort - if
you can design something into the cube, then do so
• Examples:
– Using calculations to aggregate members instead of
creating a new attribute
– Using calculations instead of creating a m2m
relationship
Measure Expressions
• Measure Expressions are simple calculations that are
performed in the SE not the FE
– Set using a measure’s MeasureExpression property
– Can take advantage of parallelism with partitioning
– Faster than FE equivalent, though not always by much

• Take the form M1*M2 or M1/M2, where M1 and M2 are
measures from two different measure groups with at
least one common dimension
– Calculation takes place before measure aggregation

• Useful for:
– Currency conversion
– Percentage ownership/share
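As an illustration, a currency conversion MeasureExpression might look like the following, where [Sales Amount] and [Average Rate] are assumed to be measures in two different measure groups sharing a Currency dimension:

[Sales Amount] * [Average Rate]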
Bulk Mode MDX Functions
• Most MDX functions can use bulk mode
• Some, however, cannot – so using them can seriously
harm query performance
– See the topic “Performance Improvements for SQL
Server 2008 Analysis Services” in BOL for a list
– MDX Studio can identify them too
Other Ways to Force Cell-by-Cell
• Use cell security
• Use SSAS stored procedures/custom functions
• Use named sets or set aliases inside calculations
(before SSAS 2012)
• Reference calculations that are defined later on in
the MDX Script
• Use non-deterministic functions, eg NOW()
Non Empty Evaluation
• SSAS regularly needs to filter empty cells
– When queries use NON EMPTY or NonEmpty() (example after this slide)
– As part of bulk mode evaluation

• It has two ways of doing this:
– Slow algorithm that evaluates all cell values and then
filters
• Can be very slow when calculations are involved

– Fast algorithm that ‘knows’ if a cell is empty by
looking at raw measure values from the SE
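A minimal query using the NON EMPTY operator mentioned above, against the [Adventure Works] cube (the customer attribute name is an assumption):

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY [Customer].[Customer].[Customer].MEMBERS ON 1
FROM [Adventure Works]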
Non_Empty_Behavior
• The Non_Empty_Behavior property in SSAS 2000 and
2005 was a hint to use the fast algorithm
• Should not be used in SSAS 2008 – 98% of the time
the engine knows when to use it anyway
– Was probably the most misused feature in 2005,
leading to incorrect values coming from the cube
– In 2008 only use it if you really know what you’re
doing and as a last resort
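For reference only, given the advice above, the property looks roughly like this on a calculated measure (all names are assumptions):

CREATE MEMBER CURRENTCUBE.[Measures].[Margin] AS
[Measures].[Sales Amount] - [Measures].[Cost],
// Hint: Margin is empty wherever Sales Amount is empty
NON_EMPTY_BEHAVIOR = {[Measures].[Sales Amount]};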
Using Profiler To Monitor the FE
• Profiler gives us little useful information on what’s
happening in the FE
• Useful events include:
– Calculate Non Empty Begin/End
• When the value of IntegerData is 11, SSAS is using the
cell-by-cell algorithm

– Get Data From Cache – shows reads from cache, and
which type of cache is used

• But usually these events return too much
information to be useful...
Using Perfmon to Monitor the FE
• The following counters are useful:
– Total Calculation Covers – gives the number of
subcubes over which a common set of calculations is
evaluated. High is bad.
– Total Cells Calculated – the number of cell values
evaluated for a query. High is bad.
– Total Sonar Subcubes – the number of SE subcube
requests generated by Sonar
– Total NON EMPTY/unoptimized/for calculated
members – the number of times a Non Empty is
evaluated
Other Tools
• Kernrate Viewer can be used to identify FE problems
relating to stored procedures
– Shouldn’t be using stored procedures anyway

• MDX Script Performance Analyser can help identify
individual problem calculations
– Often, though, the problem is with a combination of
overlapping calculations
MDX Studio
• MDX Studio is a great tool for diagnosing FE
problems
– Combines Profiler and Perfmon data
– Identifies functions you shouldn’t be using, and links
to blog posts with more detail
– But unlikely to be developed further
MDX Don’ts
• Do not use the LookUpCube function
• Do not use the NonEmptyCrossjoin function, use NonEmpty
instead
• Do not use the StrToX functions inside calculated members
– If you must use them, use the Constrained flag

• Do not use non-deterministic functions inside calculated
members
– Eg Now(), UserName()

• Do not compare objects by name, use the IS operator instead
• Do not return anything other than a Null from an IIF when you
don’t want a calculation to take place
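A sketch illustrating the last two points, assuming a [Date].[Calendar] hierarchy and a [Sales Amount] measure:

// Compare objects with IS, not by name, and return NULL (not 0) from IIF
IIF([Date].[Calendar].CURRENTMEMBER IS [Date].[Calendar].[Year].&[2008],
[Measures].[Sales Amount] * 1.1,
NULL)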
MDX Avoids
• Avoid using subselects wherever possible
– In some cases, despite the impact on caching, it can
produce faster queries though

• Avoid using the LinkMember function
• Avoid using cell security
Using Named Sets
• Named sets can be used to avoid re-evaluating the
same set expression multiple times
– Eg finding the rank of a member in an ordered set,
where the set order remains static (see the sketch after this slide)

• This can result in massive performance gains
• However, remember that using named sets also
disables bulk mode before 2012
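A minimal sketch of the rank scenario, with hypothetical customer and measure names – the ordered set is evaluated once and reused, instead of being re-sorted for every cell:

CREATE SET CURRENTCUBE.[Ordered Customers] AS
Order([Customer].[Customer].[Customer].MEMBERS,
[Measures].[Sales Amount], BDESC);

CREATE MEMBER CURRENTCUBE.[Measures].[Customer Rank] AS
Rank([Customer].[Customer].CURRENTMEMBER, [Ordered Customers]);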
Rewriting to use Bulk Mode
• In some cases, rewriting calculations to avoid certain
functions can enable the use of bulk mode
– Eg Count(Filter()) scenario
– Somewhat non-intuitive, but works
– May get fixed in future service pack or version
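A sketch of the Count(Filter()) rewrite, assuming a customer attribute and a [Sales Amount] measure:

// Slow: Filter() forces cell-by-cell evaluation of the condition
Count(Filter([Customer].[Customer].[Customer].MEMBERS,
[Measures].[Sales Amount] > 1000))

// Faster: Sum an IIF that returns 1 or NULL per customer, which can run in bulk mode
Sum([Customer].[Customer].[Customer].MEMBERS,
IIF([Measures].[Sales Amount] > 1000, 1, NULL))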
Using Caching
• The FE can only cache cell values, not the value of
individual expressions
• So if you have an expensive expression that returns a
numeric value, then it could be worthwhile to put it
into its own calculated measure
– Can then be hidden

• After the first time it is evaluated, its value will be
cached, so any other calculations that use it can read
values from cache
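A rough sketch with hypothetical names – the expensive expression gets its own hidden measure, so its values are cached once and reused:

CREATE MEMBER CURRENTCUBE.[Measures].[Avg 12M Sales] AS
Avg(LastPeriods(12, [Date].[Calendar].CURRENTMEMBER),
[Measures].[Sales Amount]),
VISIBLE = 0;

// Other calculations read the cached values instead of re-evaluating the Avg
CREATE MEMBER CURRENTCUBE.[Measures].[Sales vs 12M Avg] AS
[Measures].[Sales Amount] - [Measures].[Avg 12M Sales];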
Caching and Cache-Warming
Agenda
• How Analysis Services caching works
• When and why Analysis Services can’t cache data
• Warming the Storage Engine cache with the CREATE
CACHE statement
• Warming the Formula Engine cache by running
queries
• Automating cache warming
Types of Analysis Services cache
• Analysis Services can cache two types of value:
– Values returned by the Storage Engine
• ‘Raw’ measure values, one cache per measure group
• Dimension data, one cache per dimension

– Values returned by the Formula Engine
• Numeric values only – strings can’t be cached

• All SE caches have the same structure, known as the data
cache registry
• The FE can also store values in this structure if
calculations are evaluated in bulk mode
• The FE uses a different structure, the flat cache, for
calculations evaluated in cell-by-cell mode
Storage Engine Cache
• Data in the data cache registry is held in subcubes, ie
data at a common granularity
• Subcubes may not contain an entire granularity –
they may be filtered
• Prefetching is generally good for cache reuse
• Arbitrary-shaped subcubes cannot be cached
• Try to redesign hierarchies or queries to avoid this
Storage Engine Cache Aggregation
• SE cache data can be aggregated to answer queries in
some cases
• Except when the measure data itself cannot be
aggregated, for example with distinct count measures or
many-to-many

• Unlike aggregations, the SE cache isn’t aware of attribute
relationships
• Aggregation can only happen from the lower level of an
attribute hierarchy to the All Member
• For this reason user hierarchies can offer better
performance because of the subcubes they generate
Formula Engine Cache Scopes
• There are three different ‘scopes’ or lifetimes of a FE cache:
– Query – for calculations defined in the WITH clause of a query,
the FE values can only be cached for the lifetime of the query
– Session – for calculations defined for a session, using the
CREATE MEMBER statement executed on the client, FE values
can only be cached for the lifetime of a session
– Global – for calculations defined in the cube’s MDX Script, FE
values can be cached until either
• Any kind of cube processing takes place
• A ClearCache XMLA command is executed
• Writeback is committed
• Global scope is best from a performance point of view!
Cache Sharing
• Values stored in the SE cache can always be shared
between all users
• Values stored in the FE cache can be shared between
users, except when:
– Stored in Query or Session-scoped caches
– Users belong to roles with different dimensions security
permissions
• Note: dynamic security always prevents cache sharing

• Calculations evaluated in bulk mode cannot reference
values stored in the FE flat cache
• Calculations evaluated in cell-by-cell mode cannot
reference values stored in the FE data cache registry
Forcing Query-scoping
• In certain circumstances, SSAS uses query-scoped FE
caches when you would expect it to use global scope
• These are:
– Calculations that use the Username or LookupCube
functions
– Calculations use non-deterministic functions such as Now()
or any SSAS stored procedures
– Queries that use subselects
– When any calculated member is defined in the WITH
clause, whether it is referenced or not in the query
– When cell security is used
Warming the SE cache
• Considerations for warming the SE cache:
– We want to avoid cache fragmentation, for example having
one unfiltered subcube cached rather than multiple
filtered subcubes
– It is possible to overfill the cache – the SE will stop looking
in the cache after it has searched 1000 subcubes
– We want to cache lower rather than higher granularities,
since the latter can be aggregated from the former in
memory
– We need a way of working out which granularities are
useful
Warming the SE cache
• We can warm the SE cache by using either:
– WITH CACHE, to warm the cache for a single query – not
very useful
– The CREATE CACHE command
• Remember that building aggregations is often a better
alternative to warming the SE cache
• But in some cases you can’t build aggregations – for example
when there are many-to-many relationships
CREATE CACHE
• Example CREATE CACHE statement:

CREATE CACHE
FOR [Adventure Works] AS
'({[Measures].[Internet Sales Amount]},
{[Date].[Date].[Date].MEMBERS},
{[Product].[Category].[Category].MEMBERS})'
Which subcubes should I cache?
• The Query Subcube and Query Subcube Verbose
events in Profiler show the subcubes requested from
the SE by the FE
• This is also the information stored in the SSAS query
log, stored in SQL Server
• Analyse this data manually and find the most
commonly-requested, lower-granularity subcubes
• Maybe also query the Query Log, or a Profiler trace
saved to SQL Server, to find other subcubes –
perhaps for queries that have been run recently
Warming the FE cache
• First, tune your calculations! Ensure use of bulk mode
where possible
• The only way to warm the FE cache is to run MDX queries
containing calculations
• Remember, these queries must not:
– Include a WITH clause
– Use subselects

• Also, no point trying to cache calculations whose values
cannot be cached
• And think about how security can impact cache usage
Queries to warm the FE Cache
• Again, it is worth manually constructing some MDX queries yourself to
warm the FE cache
• Also, running regularly-used queries (for example those used in SSRS
reports) can be a good idea
• Can easily collect the queries your users are running by running a
Profiler trace, then saving that trace to SQL Server or a .trc file
– The Query Begin and Query End events contain the MDX query
– Need to filter out those with a WITH clause etc
– Watch out for parameterisation (eg SSRS)
– Watch out for use of session sets and calculations (eg Excel 2003)
– Watch out for queries that slice by Time, where the actual slicer
used may change regularly
– Think about the impact of dimension security too
Memory considerations
• SSAS caching can use a lot of memory!
• The cache will keep growing until SSAS thinks it is running out
of memory:
– When memory usage exceeds the % of available system
memory specified in the LowMemoryLimit property, data will be
dropped from cache
– When it exceeds the % specified in the TotalMemoryLimit
property, all data will be dropped from cache
– We therefore don’t want to exceed the LowMemoryLimit
– We also want to avoid paging
– We need to leave space for caching real user queries

• The FE flat cache is limited to 10% of the TotalMemoryLimit
– If it grows bigger than that, it is completely emptied
Automating Cache Warming
• We should perform cache-warming after cube processing
has taken place
• Remember – it may take a long time! It should not
overlap/interfere with real users querying
• We can automate it a number of different ways:
– Running SSRS reports on a data-driven subscription
– Using the ascmd.exe utility
– Building your own SSIS package – the best solution for
overall flexibility.
• Either fetch queries from a SQL Server table
• Or from a Profiler .trc file using the Konesans Trace File
Source component
Summary
• Clearly a lot of problems to watch out for!
• However, some cache-warming (however inefficient) is
often better than none at all
• A perfectly-tuned cube would have little need for cache-warming, but...
– Some performance problems we just don’t know about
– Some we may not be able to fix (eg with complex
calculations, hardware limitations)
– Cache warming is likely to have some positive impact in
these cases – maybe lots, maybe not much
OUR SPONSORS AND PARTNERS

Organised by: Polskie Stowarzyszenie Użytkowników SQL Server (Polish SQL Server Users Group) – PLSSUG
Produced by: DATA MASTER Maciej Pilecki

 
Ds03 data analysis
Ds03   data analysisDs03   data analysis
Ds03 data analysisDotNetCampus
 
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSINGSKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSINGSkillwise Group
 
Week 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxWeek 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxNurulIzrin
 
New features of sql server 2016 bi features
New features of sql server 2016 bi featuresNew features of sql server 2016 bi features
New features of sql server 2016 bi featuresChris Testa-O'Neill
 
Monitorando performance no Azure SQL Database
Monitorando performance no Azure SQL DatabaseMonitorando performance no Azure SQL Database
Monitorando performance no Azure SQL DatabaseVitor Fava
 
Got documents - The Raven Bouns Edition
Got documents - The Raven Bouns EditionGot documents - The Raven Bouns Edition
Got documents - The Raven Bouns EditionMaggie Pint
 
Building Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerBuilding Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerAntonios Chatzipavlis
 
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deck
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deckMySQL Visual Analysis and Scale-out Strategy definition - Webinar deck
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deckVladi Vexler
 
Charlotte SPUG - Planning for MySites and Social in the Enterprise
Charlotte SPUG - Planning for MySites and Social in the EnterpriseCharlotte SPUG - Planning for MySites and Social in the Enterprise
Charlotte SPUG - Planning for MySites and Social in the EnterpriseMichael Oryszak
 
Taming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI OptionsTaming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI OptionsKellyn Pot'Vin-Gorman
 
Architecture Principles CodeStock
Architecture Principles CodeStock Architecture Principles CodeStock
Architecture Principles CodeStock Steve Barbour
 
Data Warehouse approaches with Dynamics AX
Data Warehouse  approaches with Dynamics AXData Warehouse  approaches with Dynamics AX
Data Warehouse approaches with Dynamics AXAlvin You
 
The Rise of Self -service Business Intelligence
The Rise of Self -service Business IntelligenceThe Rise of Self -service Business Intelligence
The Rise of Self -service Business Intelligenceskewdlogix
 
Evolutionary database design
Evolutionary database designEvolutionary database design
Evolutionary database designSalehein Syed
 
Data modeling trends for analytics
Data modeling trends for analyticsData modeling trends for analytics
Data modeling trends for analyticsIke Ellis
 
Remote DBA Experts SQL Server 2008 New Features
Remote DBA Experts SQL Server 2008 New FeaturesRemote DBA Experts SQL Server 2008 New Features
Remote DBA Experts SQL Server 2008 New FeaturesRemote DBA Experts
 

Similar to SQLDay2013_ChrisWebb_CubeDesign&PerformanceTuning (20)

Tableau Seattle BI Event How Tableau Changed My Life
Tableau Seattle BI Event How Tableau Changed My LifeTableau Seattle BI Event How Tableau Changed My Life
Tableau Seattle BI Event How Tableau Changed My Life
 
SSAS Design & Incremental Processing - PASSMN May 2010
SSAS Design & Incremental Processing - PASSMN May 2010SSAS Design & Incremental Processing - PASSMN May 2010
SSAS Design & Incremental Processing - PASSMN May 2010
 
Ds03 data analysis
Ds03   data analysisDs03   data analysis
Ds03 data analysis
 
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSINGSKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
 
Week 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxWeek 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptx
 
New features of sql server 2016 bi features
New features of sql server 2016 bi featuresNew features of sql server 2016 bi features
New features of sql server 2016 bi features
 
Monitorando performance no Azure SQL Database
Monitorando performance no Azure SQL DatabaseMonitorando performance no Azure SQL Database
Monitorando performance no Azure SQL Database
 
tw_242.ppt
tw_242.ppttw_242.ppt
tw_242.ppt
 
Got documents - The Raven Bouns Edition
Got documents - The Raven Bouns EditionGot documents - The Raven Bouns Edition
Got documents - The Raven Bouns Edition
 
Building Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerBuilding Data Warehouse in SQL Server
Building Data Warehouse in SQL Server
 
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deck
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deckMySQL Visual Analysis and Scale-out Strategy definition - Webinar deck
MySQL Visual Analysis and Scale-out Strategy definition - Webinar deck
 
Charlotte SPUG - Planning for MySites and Social in the Enterprise
Charlotte SPUG - Planning for MySites and Social in the EnterpriseCharlotte SPUG - Planning for MySites and Social in the Enterprise
Charlotte SPUG - Planning for MySites and Social in the Enterprise
 
Taming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI OptionsTaming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI Options
 
Architecture Principles CodeStock
Architecture Principles CodeStock Architecture Principles CodeStock
Architecture Principles CodeStock
 
Data Warehouse approaches with Dynamics AX
Data Warehouse  approaches with Dynamics AXData Warehouse  approaches with Dynamics AX
Data Warehouse approaches with Dynamics AX
 
The Rise of Self -service Business Intelligence
The Rise of Self -service Business IntelligenceThe Rise of Self -service Business Intelligence
The Rise of Self -service Business Intelligence
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Evolutionary database design
Evolutionary database designEvolutionary database design
Evolutionary database design
 
Data modeling trends for analytics
Data modeling trends for analyticsData modeling trends for analytics
Data modeling trends for analytics
 
Remote DBA Experts SQL Server 2008 New Features
Remote DBA Experts SQL Server 2008 New FeaturesRemote DBA Experts SQL Server 2008 New Features
Remote DBA Experts SQL Server 2008 New Features
 

More from Polish SQL Server User Group

SQLDay2013_PawełPotasiński_ParallelDataWareHouse
SQLDay2013_PawełPotasiński_ParallelDataWareHouseSQLDay2013_PawełPotasiński_ParallelDataWareHouse
SQLDay2013_PawełPotasiński_ParallelDataWareHousePolish SQL Server User Group
 
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012SQLDay2013_PawełPotasiński_GeografiaSQLServer2012
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012Polish SQL Server User Group
 
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitectures
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitecturesSQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitectures
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitecturesPolish SQL Server User Group
 
SQLDay2013_Denny Cherry - Table indexing for the .NET Developer
SQLDay2013_Denny Cherry - Table indexing for the .NET DeveloperSQLDay2013_Denny Cherry - Table indexing for the .NET Developer
SQLDay2013_Denny Cherry - Table indexing for the .NET DeveloperPolish SQL Server User Group
 
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorld
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorldSQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorld
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorldPolish SQL Server User Group
 
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&Running
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&RunningSQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&Running
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&RunningPolish SQL Server User Group
 
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...Polish SQL Server User Group
 
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...Polish SQL Server User Group
 
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...Polish SQL Server User Group
 
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scriptsPolish SQL Server User Group
 
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_sessionPolish SQL Server User Group
 
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStolecki
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStoleckiSQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStolecki
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStoleckiPolish SQL Server User Group
 

More from Polish SQL Server User Group (20)

SQLDay2013_PawełPotasiński_ParallelDataWareHouse
SQLDay2013_PawełPotasiński_ParallelDataWareHouseSQLDay2013_PawełPotasiński_ParallelDataWareHouse
SQLDay2013_PawełPotasiński_ParallelDataWareHouse
 
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012SQLDay2013_PawełPotasiński_GeografiaSQLServer2012
SQLDay2013_PawełPotasiński_GeografiaSQLServer2012
 
SQLDay2013_MarekAdamczuk_Kursory
SQLDay2013_MarekAdamczuk_KursorySQLDay2013_MarekAdamczuk_Kursory
SQLDay2013_MarekAdamczuk_Kursory
 
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitectures
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitecturesSQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitectures
SQLDay2013_MarcinSzeliga_SQLServer2012FastTrackDWReferenceArchitectures
 
SQLDay2013_MarcinSzeliga_DataInDataMining
SQLDay2013_MarcinSzeliga_DataInDataMiningSQLDay2013_MarcinSzeliga_DataInDataMining
SQLDay2013_MarcinSzeliga_DataInDataMining
 
SQLDay2013_MaciejPilecki_Lock&Latches
SQLDay2013_MaciejPilecki_Lock&LatchesSQLDay2013_MaciejPilecki_Lock&Latches
SQLDay2013_MaciejPilecki_Lock&Latches
 
SQLDay2013_GrzegorzStolecki_RealTimeOLAP
SQLDay2013_GrzegorzStolecki_RealTimeOLAPSQLDay2013_GrzegorzStolecki_RealTimeOLAP
SQLDay2013_GrzegorzStolecki_RealTimeOLAP
 
SQLDay2013_GrzegorzStolecki_KonsolidacjaBI
SQLDay2013_GrzegorzStolecki_KonsolidacjaBISQLDay2013_GrzegorzStolecki_KonsolidacjaBI
SQLDay2013_GrzegorzStolecki_KonsolidacjaBI
 
SQLDay2013_Denny Cherry - Table indexing for the .NET Developer
SQLDay2013_Denny Cherry - Table indexing for the .NET DeveloperSQLDay2013_Denny Cherry - Table indexing for the .NET Developer
SQLDay2013_Denny Cherry - Table indexing for the .NET Developer
 
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorld
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorldSQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorld
SQLDay2013_Denny Cherry - SQLServer2012inaHighlyAvailableWorld
 
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&Running
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&RunningSQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&Running
SQLDay2013_DennyCherry_GettingSQLServiceBrokerUp&Running
 
SQLDay2013_ChrisWebb_SSASDesignMistakes
SQLDay2013_ChrisWebb_SSASDesignMistakesSQLDay2013_ChrisWebb_SSASDesignMistakes
SQLDay2013_ChrisWebb_SSASDesignMistakes
 
SQLDay2013_MarcinSzeliga_StoredProcedures
SQLDay2013_MarcinSzeliga_StoredProceduresSQLDay2013_MarcinSzeliga_StoredProcedures
SQLDay2013_MarcinSzeliga_StoredProcedures
 
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...
SQL DAY 2012 | DEV Track | Session 6 - Master Data Management by W.Bielski 6 ...
 
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...
SQL DAY 2012 | DEV Track | Session 8 - Getting Dimension with Data by C.Tecta...
 
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...
SQL DAY 2012 | DEV Track | Session 9 - Data Mining Analiza Przepływowa by M.S...
 
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts
26th_Meetup_of_PLSSUG_WROCLAW-ColumnStore_Indexes_byBeataZalewa_scripts
 
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session
26th_Meetup_of_PLSSUG-ColumnStore_Indexes_byBeataZalewa_session
 
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStolecki
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStoleckiSQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStolecki
SQLDay2011_Sesja03_Fakty,MiaryISwiatRealny_GrzegorzStolecki
 
SQLDay2011_Sesja02_Collation_Marek Adamczuk
SQLDay2011_Sesja02_Collation_Marek AdamczukSQLDay2011_Sesja02_Collation_Marek Adamczuk
SQLDay2011_Sesja02_Collation_Marek Adamczuk
 

Recently uploaded

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 

Recently uploaded (20)

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

SQLDay2013_ChrisWebb_CubeDesign&PerformanceTuning

  • 1. NASI SPONSORZY I PARTNERZY
  • 2. Real-World Cube Design and Performance Tuning with SSAS Multidimensional Chris Webb
  • 3. Who Am I? • Chris Webb • Email: chris@crossjoin.co.uk • Twitter @Technitrain • • Analysis Services consultant and trainer www.crossjoin.co.uk & www.technitrain.com Co-author: • MDX Solutions • Expert Cube Development with SSAS 2008 • Analysis Services 2012: The BISM Tabular Model • • SQL Server MVP Blogger: http://cwebbbi.wordpress.com BID-202
  • 4. Agenda • 9:00-11:00 - Working in BIDS/SSDT and Dimension Design Tips & Tricks • 11:15-12:30 – Measure and Measure Group Tips & Tricks • 13:30-14:15 – Understanding SSAS Query Execution • 14:15-15:00 - Designing Effective Aggregations • 15:15-16:30 – Tuning MDX Calculations • 16:45 – 17:30 - Caching and Cache Warming, Q&A
  • 5. Working in BIDS, Data Sources and Data Source Views
  • 6. Agenda • • • • • Online vs Project mode Project properties Data Sources Data Source Views BIDS Helper
  • 7. Using BI Development Studio • BI Development Studio (BIDS) or SQL Server Data Tools (SSDT) in 2012 is a Visual Studio addin for designing SSAS databases • You can use it in two ways: – Project mode, where the SSAS database is held locally and has to be specifically deployed to the server – Online mode where you work live on the server and deployment takes place every time you Save • Project mode is best because: – You can use source control – If you make a mistake and save it, you don’t have the risk that it will unprocess the cube
  • 8. Configuring a New SSAS Project • On a new SSAS project in BIDS, always set the following properties: – Processing Option should be set to Do Not Process – Server should be changed from localhost to the real name of your SSAS server – Database should be set to the name of the SSAS database you want to create – Deployment Server Edition should be set to the name of the SQL Server edition you plan to use in production
  • 9. Creating a Data Source • A Data Source represents a connection to the cube’s relational source data • You need to choose which provider to use – In general native OLEDB is faster than .NET • You need to choose what security is needed for SSAS to connect to the data source – Supply username/password, eg for Oracle – Windows Authentication
  • 10. Creating a Data Source • You need to choose which Windows user SSAS will impersonate to connect to the cube – Easiest to use the SSAS service account – Most secure to supply a username/password • You can have multiple Data Sources in a project but it is a bad idea – Your data should already be in one data mart! – Needs to use SQL Server OpenRowset – slow
  • 11. Creating a Data Source View • A Data Source View (DSV) is simply an intermediate stage where you can – Pick the tables you wish to use to build your cube – Define logical relationships between tables (note: they are not written back to the RDBMS) – Rename tables and columns – Tweak the data so it is modelled the way you want • DSVs are not visible to end users
  • 12. Named Calculations and Named Queries • Named Calculations allow you to create derived columns in tables – Need a SQL expression that returns a value • Named Queries allow you to create the equivalent of views in the DSV – Need a SQL SELECT statement
  • 13. Dos and Don’ts of DSVs • DO add in logical foreign key relationships between tables, even if they are not physically present in your database – Means BIDS will automatically set other properties for you later on in the design process • DO create multiple diagrams to reduce complexity • DO NOT use them as a replacement for proper ETL! – It will slow down SSAS processing – It hides important aspects of your design – Use relational views if you must, or even better change your SSIS packages
  • 14. BIDS Helper • BIDS Helper is a free tool that adds a lot of important functionality to BIDS – Better attribute relationship visualisation – Better aggregation designer – Features for checking common problems • Install it! http://bidshelper.codeplex.com/
  • 16. Agenda • Dimensions and Attributes • Attribute Hierarchies and User Hierarchies • Attribute Relationships • Slowly Changing Dimensions • Parent/Child and Ragged Hierarchies
  • 17. Dimensions and Attributes • An SSAS Dimension is composed of multiple Attributes • Attributes represent the different levels of granularity on a dimension that data can be aggregated up to – On a Time dimension you might have attributes like Year and Month – Based on the columns in the dimension table • By default, each attribute becomes a hierarchy you can use when browsing the cube
  • 18. Dimension Properties • ProcessingGroup – using ByTable results in a single query being issued to process the whole dimension – Might improve performance but not usually a good idea! – Also need to turn off Duplicate Key errors • StringStoresCompatibilityLevel (new in 2012) - avoids 4GB string store limit – May slow performance and increase size on disk
  • 19. Attribute Properties • AttributeHierarchyEnabled controls whether an attribute hierarchy will be built – If one isn’t built, the attribute is visible only as a property of another attribute – Useful for things like phone numbers and addresses – Not building a hierarchy can save time when processing the dimension • AttributeHierarchyOptimizedState controls whether indexes are built – Can save time processing the dimension – But can have a big impact on query performance
  • 20. Attribute Properties • DefaultMember controls what the default selected member is – If you don’t set it, the default member will be the first real member on the first level – So usually the All Member – Setting it can confuse users • IsAggregatable controls whether an All Member is created for a hierarchy – Some hierarchies contain data that should never be aggregated, eg different budgets – If this is set to False, always set a specific default member
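For example, if IsAggregatable is False on a non-aggregatable budget-version hierarchy, its DefaultMember property could be set to an MDX member expression along these lines (the dimension and member names here are illustrative, not from the deck):

    [Budget Version].[Budget Version].&[Actual]

Queries that never mention the hierarchy then return the Actual figures, rather than a total that meaninglessly sums different budget versions.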
  • 21. Ordering Attributes • The OrderBy property controls the order of members on a hierarchy – You can order by name, key, or the value of another attribute • The AttributeHierarchyOrdered property controls whether the attribute is ordered at all – Not ordering it can result in faster processing – But means that member ordering is arbitrary
  • 22. Time Dimensions • Setting the Type property of a dimension to ‘Time’ enables some time-specific features: – Certain MDX functions, like YTD, are time-aware – Semi-additive measures need a time dimension • You will also need to set the Type property on attributes such as years, quarters, months and dates for everything to work properly
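As a minimal sketch of what “time-aware” means, assume a [Date] dimension with Type set to Time, a [Calendar] user hierarchy with a Year-typed level, and a [Sales Amount] measure (all names hypothetical). YTD can then work out where each year starts:

    WITH MEMBER MEASURES.[Sales YTD] AS
      AGGREGATE(
        YTD([Date].[Calendar].CURRENTMEMBER),
        [Measures].[Sales Amount])
    SELECT MEASURES.[Sales YTD] ON 0,
      [Date].[Calendar].[Month].MEMBERS ON 1
    FROM [Sales]

If the Type properties are not set, YTD has no way of finding the enclosing year and the query fails.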
  • 23. Attributes and User Hierarchies • User hierarchies are multi-level hierarchies that support more complex drill paths • Each level in a user hierarchy is based on an attribute – On a Time dimension there might be a Calendar Hierarchy with levels Year, Quarter, Month, Date • If you put an attribute hierarchy in a user hierarchy, then set its Visible property to False
  • 24. Attribute or User Hierarchies? • As always, it depends... • User hierarchies: – Are easier for users to understand – Make writing calculations where you need to find ancestors/descendants easier – May lead to better query performance overall • Attribute hierarchies: – Are more flexible – you can’t put two levels of the same user hierarchy on different axes
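To illustrate the calculation point, here is a sketch of a “share of year” measure written against the same hypothetical [Date].[Calendar] user hierarchy; finding the ancestor is a one-liner because the Year level sits in the same hierarchy as the current member:

    WITH MEMBER MEASURES.[Share Of Year] AS
      [Measures].[Sales Amount] /
      ([Measures].[Sales Amount],
       ANCESTOR(
         [Date].[Calendar].CURRENTMEMBER,
         [Date].[Calendar].[Year])),
      FORMAT_STRING = 'Percent'
    SELECT MEASURES.[Share Of Year] ON 0,
      [Date].[Calendar].[Month].MEMBERS ON 1
    FROM [Sales]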
  • 25. Attribute Relationships • Attribute Relationships describe the one-to-many relationships between attributes – For example a Year contains many Months and a Month contains many Dates • All attributes must have a one-to-many relationship with the Key attribute • Natural user hierarchies are user hierarchies where every level has a one-to-many relationship with the one below • The BIDS Helper Dimension Health Check functionality can check whether attribute relationships have been correctly configured
  • 26. Optimal Attribute Relationships • Optimising attribute relationships is very important for query performance • In general, query performance will be better with ‘bushy’ rather than ‘flat’ attribute relationships • Tweaking your data to allow the creation of chains of relationships is a good thing – So long as it doesn’t compromise your data • Also consider if merging many dimensions into one will allow you to build better relationships
  • 27. Attribute Relationship Scenarios • Forks versus Chains • Creating dummy attributes to make unnatural hierarchies natural • Type 2 Slowly Changing Dimensions • Diamond-shaped relationships
  • 28. Slowly Changing Dimensions • Analysis Services has no specific support for SCDs, but can handle them very easily • Type 1 SCD changes require a Process Update – Aggregations may also need to be rebuilt • Type 2 SCD changes require a Process Add or a Process Update • Modelling attribute relationships correctly on a Type 2 SCD can make a big difference to query performance
  • 29. Rigid and Flexible Relationships • Attribute relationships can be rigid or flexible • Rigid relationships are relationships that will never undergo a Type 2 change • Flexible relationships may undergo a Type 2 change • Any aggregation that includes an attribute with a flexible relationship between it and the key attribute will be dropped after a Process Update on a dimension
  • 30. Parent/Child Hierarchies • Parent/child hierarchies can be built on tables which have a recursive or “pig’s ear” join • Allow you to build variable-depth hierarchies such as Org Charts or Charts of Accounts • Are very flexible, but have several limitations: – You can only have one per dimension, and it must be built from the key attribute – They can cause serious query performance problems, especially when they contain more than a few thousand members • Avoid using them unless you have no other choice!
  • 31. Members with Data • Parent/child hierarchies can be used to load non-leaf data into a cube • Useful for data that is non-aggregatable – Eg data that has been anonymised • However, they will perform relatively badly • Each member on a parent/child hierarchy has an associated DataMember – The non-leaf data is associated with the DataMember – The DataMember can be visible or hidden – By default, the DataMember’s value will be aggregated along with the other children’s values, but this can be overridden with MDX
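As a sketch of reading a member’s own value, assuming a hypothetical [Employee].[Employees] parent/child hierarchy and a [Sales Amount] measure:

    WITH MEMBER MEASURES.[Own Sales] AS
      ([Measures].[Sales Amount],
       [Employee].[Employees].CURRENTMEMBER.DATAMEMBER)
    SELECT {[Measures].[Sales Amount], MEASURES.[Own Sales]} ON 0,
      [Employee].[Employees].MEMBERS ON 1
    FROM [Sales]

Here [Sales Amount] returns each manager’s aggregated total, while [Own Sales] returns only the value loaded against that manager’s DataMember.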
  • 32. Ragged Hierarchies • In some cases, you can make a normal user hierarchy look ‘ragged’ by setting the HideMemberIf property • This allows members to be hidden in a user hierarchy when they have – No name – The same name as their parent – Either of the above plus they are the only child • Useful when you know in advance the maximum number of levels in your hierarchy
  • 33. Ragged Hierarchies • Have to set the connection string property MDX Compatibility=2 to make them appear ragged in all cases – Most client tools do this by default; Excel and the Dimension Editor do not • To make Excel work, duplicate values from the bottom of the hierarchy upwards
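For reference, the property goes in the client connection string; a minimal example, with placeholder server and database names:

    Provider=MSOLAP;Data Source=MyServer;Initial Catalog=MyOLAPDatabase;MDX Compatibility=2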
  • 34. Modelling complex Measures and Measure Groups
  • 35. Cubes and Measure Groups • A Cube should contain all of the data that any user would want to query – You can’t query more than one cube at a time • Each Fact Table in your data mart will become a Measure Group in your cube • Usually better to create one cube with multiple measure groups than multiple cubes • You can use Perspectives to simplify what end users see
  • 36. When To Use Multiple Cubes • A single cube has some disadvantages: – A single cube can become too complex to maintain and develop – If a dimension needs a full process, then a single cube needs completely reprocessing – Multiple cubes can be spread over multiple servers for scale-out – Multiple cubes are easier to secure
  • 37. Measure Aggregation • Basic measure aggregation types are Sum, Count (either rows in the fact table, or non-null column values), Min and Max – Average can be written with some simple MDX • Enterprise Edition gives several more complex aggregation functions • Custom aggregation can be achieved with MDX calculations
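For example, an average can be derived from a Sum measure and a Count measure with a calculated member like this (a sketch for the cube’s MDX Script, assuming measures named [Sales Amount] and [Sales Count]):

    CREATE MEMBER CURRENTCUBE.[Measures].[Avg Sale] AS
      IIF([Measures].[Sales Count] = 0,
          NULL,
          [Measures].[Sales Amount] / [Measures].[Sales Count]),
      FORMAT_STRING = '#,#.00';

Because both inputs aggregate natively, the division gives the correct average at every granularity.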
  • 38. Distinct Count • Distinct Count measures will be created in their own measure group – This is for performance reasons • They are likely to perform worse than other measures – Require a special partitioning strategy – Very IO intensive – solid state disks can be useful
  • 39. Semi-Additive Measures • Semi-additive measures act like Sum for all dimensions except the Time dimension – NB the Time dimension needs to have its Type property set to ‘Time’ – If there are multiple Time dimensions, the measures will only be semi-additive for one • They are: FirstChild, LastChild, FirstNonEmpty, LastNonEmpty, AverageOfChildren • Useful for handling snapshot fact data
  • 40. Other Aggregation Types • None means that data only appears at the intersection of the leaves of all dimensions • ByAccount allows you to create measures which have different semi-additive behaviour for different members – Again, useful for financial applications
  • 41. Regular Relationships • The Dimension Usage tab in BIDS shows how Dimensions join to Measure Groups • No Relationship is a valid relationship type! • Regular relationships are the equivalent of inner joins between dimensions and measure groups – Most commonly used relationship type
  • 42. Different Granularities • The Granularity Attribute property controls which attribute is used to make the join on a Regular relationship – Usually the Key Attribute is used – Join always takes place on the columns used in the attribute’s KeyColumns property • The IgnoreUnrelatedDimensions property controls what happens below granularity and when there is no relationship between dimensions and measure groups – Can either display nulls – Or repeat the All Member value
  • 43. Fact Relationships • Very similar to Regular Relationships, but are used when the dimension has been built directly from the fact table • Differences between Fact and Regular: – Fact relationships generate slightly better SQL for ROLAP queries – A measure group can only have one Fact relationship – Fact relationships may be treated differently by some client tools
  • 44. Referenced Relationships • Used when a dimension connects to a measure group via another dimension • Materialized referenced relationships give fast query performance – Require a join between the Reference dimension and the fact dimension at processing, which can slow down processing a lot • Non-materialized referenced relationships result in slower query performance but no overhead at processing time • Often better to build a single dimension instead, unless it would result in excessive duplication of attributes
  • 45. Many-to-Many Relationships • Allow you to model scenarios where dimension members have many-to-many relationships with facts – Eg a Bank Account might be associated with several Customers • M2M relationships require a bridge table – Becomes an ‘intermediate’ measure group – Once all regular relationships have been set up, the M2M relationship can be created • Carry a query performance penalty • But extremely flexible and a very important feature for modelling many complex scenarios
  • 46. Role-Playing Dimensions • It is possible to add the same Database dimension to a cube more than once – This is called a role-playing dimension – Eg the Date dimension could be an Order Date and a Ship Date • Saves time processing and on maintenance • But since you can only change the name of the dimension, and not the hierarchies on it, it can be confusing for users
  • 47. Measure Expressions • Measure Expressions allow you to: – Multiply or divide two measures before aggregation takes place – The measures have to be from two different measure groups – There needs to be at least one common dimension • Useful for currency conversion or share calculations • Faster than the equivalent MDX
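A measure expression is just a property value of the form Measure1 * Measure2 or Measure1 / Measure2. For currency conversion it might look like this, assuming a hypothetical [Average Rate] measure in a separate exchange-rate measure group:

    [Sales Amount] * [Average Rate]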
  • 48. Partitions • Measure groups can be divided up into multiple partitions – Enterprise Edition only • You should partition your measure groups to reflect the slices that users are likely to use in their queries – Usually by the Time dimension • Partitioning improves query performance in two ways: – It reduces overall IO because AS should only scan the partitions containing the data requested – It increases parallelism because AS can scan more than one partition simultaneously
  • 49. Partitioning for Regular Measures • There are two things to consider when planning a partitioning strategy: – How and when new data is loaded – How users slice their queries • Luckily, partitioning by time usually meets both needs – Eg building one partition per day or month • Can also partition by another dimension if the resulting partitions are too large, or users regularly slice queries in other ways
  • 50. Partition Sizing • How big should a partition be? – Up to 20-50 million rows is ok, but 100 million rows can be fine too – Try not to create small partitions: those with fewer than 4096 rows will have no indexes built • Also, try not to create too many partitions: – BIDS and SSMS get slower the more partitions you have (can be very bad in older versions of SSAS) – Aggregations are only built per partition, so over-partitioning reduces their effectiveness – Aim for no more than 500 to 2000 partitions
  • 51. Setting the Slice • By default, SSAS knows the minimum and maximum dataids of members from each attribute in a partition • Uses this information to work out which partitions to scan for a query • If ranges overlap then SSAS may scan many partitions unnecessarily • To stop this, set the Slice property on the partition – Takes the form of an MDX tuple
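For example, the Slice property of a partition holding only 2013 UK data might be set to a tuple such as this (dimension, hierarchy and member names are illustrative):

    ([Date].[Calendar Year].&[2013], [Geography].[Country].&[United Kingdom])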
  • 52. Partitioning for Distinct Count • Partitioning for distinct count measures must take a completely different approach – Remember distinct count measures should be in their own measure group anyway • You need to: – Ensure each partition contains a non-overlapping range of values in the column you’re doing a distinct count on – Ensure that each query hits as many partitions as possible, so SSAS can use as many cores as possible in parallel
  • 53. Partitioning and M2M • Partitioning large intermediate measure groups used in many-to-many relationships can be useful for performance – Either partition by the dimensions used in the m2m relationship – Or add other dimensions to the relationship • Querying by a m2m dimension does not result in a filter being applied on intermediate dimensions – Can result in a large number of partition scans
  • 54. Controlling the Amount of Parallelism • The CoordinatorExecutionMode server property defines the number of partitions in one measure group that can be queried in parallel – By default is set to -4 = 4 * number of processors • CoordinatorQueryBalancingFactor and CoordinatorQueryBoostPriorityLevel can be used to limit the ability of a single query to use all resources and block other queries
  • 55. Understanding How Analysis Services Answers Queries
  • 56. Agenda • SSAS Architecture Overview • The Storage Engine • The Formula Engine • Using Profiler
  • 57. How SSAS answers queries [architecture diagram: MDX query in, cellset out; the Formula Engine and the Storage Engine each have a cache, and the Storage Engine reads from disk in response to Query Subcube requests] • The Formula Engine works out what data is needed for each query, and requests it from the Storage Engine • The Storage Engine handles retrieval of raw data from disk, and any aggregation required
  • 58. Subcubes • Subcubes are slices of data from the cube • Appear in many places: – MDX Scope statements – Caching – Query Subcube requests • Subcubes are defined by their granularity and slice
  • 59. Granularity • Single grain – List of GROUP BY attributes in SQL SELECT • Mixed grain – Both Attribute.[All] and Attribute.MEMBERS
  • 61. Lattice of granularities [diagram: the lattice of granularity vectors for three attributes, from (All, All, All) at the top, through the mixed vectors (All, All, *), (*, All, All), (All, *, All), (All, *, *), (*, All, *) and (*, *, All), down to (*, *, *) at the bottom]
  • 62. Granularity ordering • (All,*) is above (*,*) • (All,*) is below (All,All) • (All,2008) is at (All,*) • (All,*) and (*,All) are incompatible • Attribute relationships make some arrows impossible: (State.*, All Cities) – OK; (State.All, City.*) – cannot occur
  • 63. Slice • Single member – SQL: Where City = ‘Redmond’ – MDX: [City].[Redmond] • Multiple members – SQL: Where City IN (‘Redmond’, ‘Seattle’) – MDX: { [City].[Redmond], [City].[Seattle] }
  • 64. Slice at granularity • SQL: SELECT Sum(Sales), City FROM Sales_Table WHERE City IN (‘Redmond’, ‘Seattle’) GROUP BY City • MDX: SELECT Measures.Sales ON 0, NON EMPTY {Redmond, Seattle} ON 1 FROM Sales_Cube
  • 65. Slice below granularity • SQL: SELECT Sum(Sales) FROM Sales_Table WHERE City IN (‘Redmond’, ‘Seattle’) • MDX: SELECT Measures.Sales ON 0 FROM Sales_Cube WHERE {Redmond, Seattle}
  • 66. Examples [diagram: a grid of Cities (All Cities, Redmond, Seattle, New York, London) against Years (All Years, 2005, 2006, 2007, 2008)]
  • 67. Examples [diagram: the same grid with the subcube (Seattle, Year.Year.MEMBERS) highlighted]
  • 68. Examples [diagram: the same grid with the subcube (Seattle, Year.MEMBERS) highlighted]
  • 69. Examples [diagram: the same grid with the subcube ({Redmond, Seattle, London}, Year.MEMBERS) highlighted]
  • 70. Examples [diagram: the same grid with the subcube ({Redmond, Seattle}, {2005, 2006, 2007}) highlighted]
  • 71. Retrieving Data • The first step to answering a query is for the FE to generate a query plan – Unfortunately, we can’t see these query plans • The next step to answering a query is to work out which subcubes of real (ie not calculated) data are needed – Not just for ‘real’ measures but also for calculations too • These are translated to single-grain subcubes • Each request from the FE to the SE is visible in Profiler as a Query Subcube request
  • 72. Sonar • Sonar is the component of the FE that determines which subcubes should be retrieved to answer the query • For complex queries this can involve many difficult decisions: – One large subcube or many small ones? – In some cases it may be more efficient to retrieve more data than the query requires
  • 75. [diagram: two candidate subcubes covering the same cells – the non-precise solution [2, *] versus the precise solution [2, {2,3,4,5,6}]]
  • 76. Which one is better? [diagram repeated: the non-precise solution [2, *] versus the precise solution [2, {2,3,4,5,6}]]
  • 77. Controlling Sonar • Sometimes SSAS can decide to prefetch too much data, causing performance problems • To stop this happening – Rewrite your query to use different attributes or user hierarchies – Use a subselect instead of a WHERE clause (though this will impact caching) – Use the Disable Prefetch Facts=True; Cache Ratio=1 connection string properties
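Putting those two properties into a connection string looks like this (server and database names are placeholders):

    Provider=MSOLAP;Data Source=MyServer;Initial Catalog=MyOLAPDatabase;Disable Prefetch Facts=True;Cache Ratio=1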
  • 78. The Storage Engine • The SE receives requests for subcubes of data from the FE • Then: – If the subcube can be derived from data in cache, it will read data from cache – If the subcube can be derived from an aggregation, it will read data from the aggregation – Otherwise it will read the data from a partition • Finally it will perform any aggregation necessary and return the subcube
  • 79. Reading Data from Disk • Next, the SE determines which partitions/aggregations contain the required slices of data • It then reads data from these partitions – It will attempt to read from several partitions in parallel if possible • Finally merges and aggregates the data from all partitions and loads it into the SE cache • For distinct count measure groups things are more complex...
  • 80. The Formula Engine • The FE can then read data from the SE cache and perform calculations • The FE tries to determine which subcubes have the same calculations applied to them – It can then perform calculations over large numbers of cells very efficiently – Known as ‘bulk mode’ or ‘block computation’ • When it cannot do this, it has to iterate over every cell, determine which calculations need to be applied, and perform the calculations – This can be very slow – Known as ‘cell by cell’ mode
  • 81. Bulk Mode/Block Computation • Cube space populated at fact-table granularity is generally extremely sparse – Values only exist for a small minority of possible combinations of dimension keys – Most often, everything takes on a default value, typically (but not always) null • Goal is to compute expressions only where they need to be computed • Take the following example: WITH MEMBER Measures.ContributionToParent AS Measures.Sales / (Measures.Sales, Product.CurrentMember.Parent) SELECT Product.[Product Family].MEMBERS ON COLUMNS, Customer.Country.MEMBERS ON ROWS FROM Sales WHERE Measures.ContributionToParent
  • 82. Cell-by-Cell Approach [diagram: Measures.ContributionToParent = Measures.Sales / (Measures.Sales, Product.CurrentMember.Parent) evaluated one cell at a time. Result grid (Drink / Food / Non-Consumable): Canada (null) (null) (null); Mexico (null) (null) (null); USA 9.22% / 71.95% / 18.83%. Source measures.[Unit Sales] grid: Canada and Mexico all (null); USA $24,597.00 / $191,940.00 / $50,236.00; All Products column: Canada (null), Mexico (null), USA $266,773.00. Calc Engine rule: Null / Null = Null]
  • 83. Disadvantages of Cell-by-Cell • The same navigation for each cell repeated – Same “relative offset” for each cell yet FE repeats the same navigation for every cell in the subspace • Other work repeated for every cell – An expensive check for recursion to determine if pass should be decremented – Eg Sales = Sales * 1.2 • Calculating null values when we should know ahead of time they are null – Extremely important in a sparse value subspace
  • 84. Bulk Mode Goals • Only calculate non-default values for a subcube – Usually, but not always, the default value is null • Perform cell navigation once for each subcube – Eg PrevMember, Parent – often the same for each cell in a subcube
  • 85. Bulk Mode Approach [diagram: 1. Retrieve the non-null values from the Storage Engine – USA/Drink/Sales $24,597.00, USA/Food/Sales $191,940.00, USA/Non-Consumable/Sales $50,236.00, USA/All Products/Sales $266,773.00; 2. Perform the computation for the non-null values only – USA Contribution to Parent: Drink 9.22%, Food 71.95%, Non-Consumable 18.83% – 3 computations instead of 9; 3. …and everything else is null]
  • 86. The Formula Engine Cache • The results of calculations are then stored in the FE cache – Future queries can then avoid doing calculations if the results can be retrieved from cache • However, FE caches may exist only for the lifetime of a query or session
  • 87. Using Profiler • SQL Profiler can be used to observe what happens within SSAS when a query executes – MDX Studio provides much the same information, often in a more readable form • Provides good information on what’s happening in the SE, much less on FE • It’s a good idea to create a new trace template with the following events...
  • 88. Profiler Events • Query Begin/End • Execute MDX Script Begin/End • Query Cube Begin/End • Progress Report Begin/End • Get Data From Aggregation • Query Subcube Verbose (shows same data as Query Subcube in a more readable form) • Resource Usage
  • 89. Interpreting Query Subcube Verbose • Query Subcube and Query Subcube Verbose show the subcube requests from the FE to the SE – Single grain only – Granularity bitmask: 1 is in grain, 0 is not in grain • Symbols used: * = all members at granularity; DataID = single member slice; + = set slice at granularity; - = set slice below granularity; 0 = no slice and no granularity; plus a further symbol (garbled in the source) meaning an arbitrary shape
  • 90. Query Performance Benchmarking • When performance tuning, you need to have a benchmark to test against – Are you improving performance? • Methodology: – Ensure no-one else is running queries on the server – Ensure you’re running queries on prod data volumes and server specs – Clear Cache – Run query and measure Duration, to get cold cache performance – Run query again to get warm cache performance
  • 91. Designing Effective Aggregations in Analysis Services 2008
  • 92. Agenda • What are aggregations and why should I build them? • The importance of good dimension design • The Aggregation Design Wizard • Usage-based optimisation • Designing aggregations manually • Influence of cube design on aggregation usage
  • 93. What is an aggregation? • It’s a copy of the data in your fact table, preaggregated to a certain level – Created when the cube is processed – Stored on disk • Think of it as being similar to the results of a GROUP BY query in SQL • It makes queries fast because it means SSAS does not have to aggregate as much data at query time
  • 94. Visualising aggregations • You can actually see an aggregation if you: – Create a cube using ROLAP storage – Build aggregations – Process the cube – Look in your relational data source at the indexed view or table created
  • 95. The price you pay • Aggregations are created at processing time • Therefore building more aggregations means processing takes longer • Also increases disk space used by the cube • But SSAS can build aggregations very quickly • And with the ‘right’ aggregation design, relatively little extra processing time or disk space is needed
  • 96. Should I pre-aggregate everything? • NO! • There is no need: you don’t need to hit an aggregation direct to get the performance benefit – In fact, too many aggregations can be bad for query performance! • Building every possible aggregation would also lead to: – Very, very long processing times – Very, very large cubes (‘database explosion’)
• 97. Database explosion
• Original table (4 values):
10 20
30 40
• With every possible aggregation (row totals, column totals and the grand total):
10 20 | 30
30 40 | 70
40 60 | 100
• 5 extra values needed to hold all possible aggregations on a table that had only 4 values in it originally!
  • 98. When aggregations are used • Aggregations are only useful when the Storage Engine has to fetch data from disk • Aggregations will not be used if the data is in the Storage Engine cache • Aggregations may not be useful if the cause of query performance problems lies in the Formula Engine
• 99. Aggregation designs
• Each measure group can have 0 or more aggregation design objects associated with it
– They are what gets created when you run either of the aggregation design wizards
– They detail which aggregations should be built
– Remember to assign designs to partitions!
• Each partition in a measure group can be associated with 0 or 1 aggregation designs
• Aggregations are built on a per-partition basis
  • 100. Aggregation design methodology 1. Ensure your dimension design is correct 2. Set all properties that affect aggregation design 3. Run the Aggregation Design Wizard to build some initial aggregations 4. Perform Usage-Based Optimisation for at least a few weeks, and repeat regularly throughout the cube’s lifetime (or automate!) 5. Design aggregations manually for individual queries when necessary
  • 101. Dimension design • Dimension design has a big impact on the effectiveness of aggregations • Your dimension designs should be stable before you think about designing aggregations • Three important things: – Delete any attributes that won’t ever be used – Check your attribute relationships to ensure they are set optimally – Build any natural user hierarchies you think might be needed
  • 102. Attribute relationships • Attribute relationships allow SSAS to derive values at higher granularities from aggregations built at lower granularities • The more ‘bushy’ the attribute relationships on your dimension, the more effective aggregations will be • Flat attribute relationships make designing aggregations much harder
  • 103. Properties to set • Before starting aggregation design, you should set the following properties: – EstimatedRows property for partitions – EstimatedCount property for dimensions – AggregationUsage property for attributes of cube dimensions • Setting these correctly will ensure the aggregation design wizards will do the best job possible
• 104. AggregationUsage
• The AggregationUsage property has the following possible values:
– Full – every aggregation will include this attribute
– None – no aggregation will include this attribute
– Unrestricted – an aggregation may include this attribute
– Default – behaves as Unrestricted for key attributes and attributes used in natural user hierarchies, and as None for everything else
  • 105. The Aggregation Design Wizard • The Aggregation Design Wizard is a good way to create a ‘first draft’ of your aggregation design • Do not expect it to build every aggregation you’ll ever need... • ...it may not even produce any useful aggregations at all! • Increased cube complexity from AS2005 onwards means it has a much harder job
  • 106. The Aggregation Design Wizard • The first few steps in the wizard ask you to set properties we’ve already discussed • However, the Partition Count property can only be set in the wizard • It specifies the number of members on an attribute that are likely to have data in any given partition • For example, if you partition by month, a partition will only have data for one month
• 107. Set Aggregation Options step
• This is where the aggregations get designed!
• Warning: may take minutes, even hours, to run
• Four options for working out which aggregations should be built:
– Estimated Storage Reaches – build enough aggregations to fill a certain amount of disk space
– Performance Gain Reaches – build aggregations to get x% of all possible performance gain from aggregations
– I Click Stop – carry on until you get bored of waiting
– Do Not Design Aggregations
• 108. Set Aggregation Options step
• The strategy I use is:
– Choose I Click Stop, then see if the design process finishes quickly and, if so, how many aggregations are built and what size they are
– If you have a reasonable size and number of aggregations (build no more than 50 aggregations here), click Finish
– Otherwise, click Stop and Reset, choose Performance Gain Reaches and set it to 30%
– If only a few, small aggregations are designed, increase it by 10% and continue until you are happy
• 109. Usage-Based Optimisation
• Usage-Based Optimisation involves logging Query Subcube requests and using that information to influence aggregation design
• To set up logging:
– Open SQL Server Management Studio
– Right-click on your instance and select Properties
– Set up a connection to a SQL Server database using the Log \ QueryLog properties (QueryLogConnectionString, QueryLogTableName, CreateQueryLogTable)
– The QueryLogSampling property controls how many subcube requests are logged (the default of 10 logs one request in every ten) – be careful, as logging too aggressively can make the log table grow very large very quickly
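Once logging has been running for a while you can also inspect the log directly; assuming the default table name of OlapQueryLog, a query along these lines shows the most frequently requested subcubes:

SELECT MSOLAP_ObjectPath, Dataset,
       COUNT(*) AS Requests, AVG(Duration) AS AvgDurationMs
FROM OlapQueryLog
GROUP BY MSOLAP_ObjectPath, Dataset
ORDER BY COUNT(*) DESC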
  • 110. Usage-Based Optimisation • You should log for at least a few weeks to get a good spread of queries • Remember: any changes to your dimensions will invalidate the contents of the log • Next, you can run the Usage-Based Optimisation wizard • This is essentially the same as the Aggregation Design wizard, but with log data taken into account (you can also filter the data in the log)
  • 111. Manual aggregation design • You will find that sometimes the wizards do not build the aggregations needed for specific queries – Running the wizard is a bit hit-and-miss – The wizards may never build certain aggregations, for example they may think they are too large • In this case you will need to design aggregations manually • BIDS 2008 allows you to do this on the Aggregations tab of the cube editor • BIDS Helper (http://www.codeplex.com/bidshelper) has better functionality for this
• 112. Manual aggregation design
• To design aggregations manually:
– Clear the cache
– Start a Profiler trace
– Run your query
– Look for Query Subcube events that take more than 500ms
– Build aggregations at the same granularity as those Query Subcube events
– Save, deploy, then run a ProcessIndex
  • 113. Redundant attributes • Often more than one attribute from the same dimension will be used in a subcube request • If those attributes are connected via attribute relationships, only include the lowest one in the aggregation – Eg If Year, Quarter and Month are selected, just use Month • This will ensure that the aggregation can be used by the widest range of queries
  • 114. Influence of cube design • There is a long list of features you might use in your cube design that affect aggregation design and usage • Mostly they result in queries executing at lower levels of granularity than you’d expect • This means aggregations are less likely to be hit, and so less useful • BIDS Helper and the BIDS Aggregations tab will help you spot these situations
  • 115. Many-to-many relationships • Aggregations will never be used on an intermediate measure group in a many-to-many relationship, if the measure group is only used as an intermediate measure group • M2M queries are resolved at the granularity of the key attributes of all dimensions common to the main measure group and the intermediate measure group • Therefore aggregations are less likely to be hit on the main measure group
  • 116. Semi-additive measures • Queries involving semi-additive measures are always resolved at the granularity attribute of the Time dimension • Therefore any aggregations must include the granularity attribute of the Time dimension if they’re to be used
  • 117. Partitions • It’s pointless to build aggregations above the granularity of the attribute you’re using to slice your partitions – Eg, if you partition by Month, don’t build aggregations above Month level – Nothing stops you doing this though! • Do this and you get no performance benefit and limit the queries the aggregations can be used for
• 118. Parent-child hierarchies
• You cannot build aggregations that include parent-child hierarchies
• That also means that the levels that appear "within" the parent-child structure cannot be used in an aggregation
• You should build aggregations on the key attribute of your dimension instead
– Very important if you have a DefaultMember or if IsAggregatable=False
  • 119. MDX Calculations • Calculated members and MDX Script assignments are very likely to request data for different granularities from the one you’re querying • Don’t assume – check what’s happening in Profiler!
• 120. Other things to watch for
• Measure expressions cause queries to be evaluated at the common granularity of the two measure groups involved
• Unary operators other than +, and custom rollups, will also force query granularity down
• Do not include attributes from non-materialised reference dimensions in aggregations, as this may result in incorrect values being returned
  • 121. Monitoring Aggregation Usage • Profiler can show when aggregations are used: – Get Data From Aggregation event is fired – Progress Report Begin/End events show reads from aggregations – ObjectPath column shows exactly which objects are being read from • XPerf utility can be used to determine how much time is spent in IO and how much time is spent aggregating data
• 123. Agenda
• Avoiding using MDX calculations
• Measure Expressions
• Bulk mode and cell-by-cell mode
• When bulk mode can be used
• Non Empty evaluation
• Using Profiler, Perfmon and MDX Studio to monitor FE activity
• Using named sets effectively
• Using caching to avoid recalculating expressions
• Scoping calculations appropriately
• 124. Avoiding MDX • Doing calculations in MDX should be a last resort – if you can design something into the cube, then do so • Examples: – Using calculations to aggregate members instead of creating a new attribute – Using calculations instead of creating an m2m relationship
  • 125. Measure Expressions • Measure Expressions are simple calculations that are performed in the SE not the FE – Set using a measure’s MeasureExpression property – Can take advantage of parallelism with partitioning – Faster than FE equivalent, though not always by much • Take the form M1*M2 or M1/M2, where M1 and M2 are measures from two different measure groups with at least one common dimension – Calculation takes place before measure aggregation • Useful for: – Currency conversion – Percentage ownership/share
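As a sketch of the currency conversion case (measure names are assumptions): you might set the MeasureExpression property of the Internet Sales Amount measure to

[Internet Sales Amount] * [Average Rate]

where Average Rate comes from a separate exchange-rate measure group sharing the Date and Currency dimensions; the multiplication is done in the SE at the common granularity of the two measure groups, before the results are aggregated.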
  • 126. Bulk Mode MDX Functions • Most MDX functions can use bulk mode • Some, however, cannot – so using them can seriously harm query performance – See the topic “Performance Improvements for SQL Server 2008 Analysis Services” in BOL for a list – MDX Studio can identify them too
  • 127. Other Ways to Force Cell-by-Cell • Use cell security • Use SSAS stored procedures/custom functions • Use named sets or set aliases inside calculations (before SSAS 2012) • Reference calculations that are defined later on in the MDX Script • Use non-deterministic functions, eg NOW()
  • 128. Non Empty Evaluation • SSAS regularly needs to filter empty cells – When queries use NON EMPTY or NonEmpty() – As part of bulk mode evaluation • It has two ways of doing this: – Slow algorithm that evaluates all cell values and then filters • Can be very slow when calculations are involved – Fast algorithm that ‘knows’ if a cell is empty by looking at raw measure values from the SE
  • 129. Non_Empty_Behavior • The Non_Empty_Behavior property in SSAS 2000 and 2005 was a hint to use the fast algorithm • Should not be used in SSAS 2008 – 98% of the time the engine knows when to use it anyway – Was probably the most misused feature in 2005, leading to incorrect values coming from the cube – In 2008 only use it if you really know what you’re doing and as a last resort
  • 130. Using Profiler To Monitor the FE • Profiler gives us little useful information on what’s happening in the FE • Useful events include: – Calculate Non Empty Begin/End • When the value of IntegerData is 11, SSAS is using the cell-by-cell algorithm – Get Data From Cache – shows reads from cache, and which type of cache is used • But usually these events return too much information to be useful...
• 131. Using Perfmon to Monitor the FE
• The following counters are useful:
– Total Calculation Covers – the number of subcubes over which a common set of calculations is evaluated. High is bad.
– Total Cells Calculated – the number of cell values evaluated for a query. High is bad.
– Total Sonar Subcubes – the number of SE subcube requests generated by Sonar
– Total NON EMPTY unoptimized – the number of times a NON EMPTY was evaluated using the slow algorithm
– Total NON EMPTY for calculated members – the number of NON EMPTY evaluations involving calculated members
  • 132. Other Tools • Kernrate Viewer can be used to identify FE problems relating to stored procedures – Shouldn’t be using stored procedures anyway • MDX Script Performance Analyser can help identify individual problem calculations – Often, though, the problem is with a combination of overlapping calculations
  • 133. MDX Studio • MDX Studio is a great tool for diagnosing FE problems – Combines Profiler and Perfmon data – Identifies functions you shouldn’t be using, and links to blog posts with more detail – But unlikely to be developed further
  • 134. MDX Don’ts • Do not use the LookUpCube function • Do not use the NonEmptyCrossjoin function, use NonEmpty instead • Do not use the StrToX functions inside calculated members – If you must use them, use the Constrained flag • Do not use non-deterministic functions inside calculated members – Eg Now(), UserName() • Do not compare objects by name, use the IS operator instead • Do not return anything other than a Null from an IIF when you don’t want a calculation to take place
• 135. MDX Avoids
• Avoid using subselects wherever possible
– Although in some cases, despite the impact on caching, they can produce faster queries
• Avoid using the LinkMember function
• Avoid using cell security
  • 136. Using Named Sets • Named sets can be used to avoid re-evaluating the same set expression multiple times – Eg finding the rank of a member in an ordered set, where the set order remains static • This can result in massive performance gains • However, remember that using named sets also disables bulk mode before 2012
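A sketch of the rank pattern with assumed names – the ordered set is evaluated once for the whole query rather than once per cell:

WITH
SET OrderedCustomers AS
  ORDER([Customer].[Customer].[Customer].MEMBERS,
        [Measures].[Internet Sales Amount], BDESC)
MEMBER [Measures].[Customer Rank] AS
  RANK([Customer].[Customer].CURRENTMEMBER, OrderedCustomers)
SELECT {[Measures].[Customer Rank]} ON 0,
HEAD([Customer].[Customer].[Customer].MEMBERS, 10) ON 1
FROM [Adventure Works]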
  • 137. Rewriting to use Bulk Mode • In some cases, rewriting calculations to avoid certain functions can enable the use of bulk mode – Eg Count(Filter()) scenario – Somewhat non-intuitive, but works – May get fixed in future service pack or version
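The Count(Filter()) scenario as a sketch (names assumed) – counting products with sales over 100 – rewritten as a Sum over an IIF that returns Null for cells that should not be counted:

// Forces cell-by-cell:
COUNT(FILTER([Product].[Product].[Product].MEMBERS,
             [Measures].[Internet Sales Amount] > 100))

// Bulk-mode-friendly rewrite:
SUM([Product].[Product].[Product].MEMBERS,
    IIF([Measures].[Internet Sales Amount] > 100, 1, NULL))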
  • 138. Using Caching • The FE can only cache cell values, not the value of individual expressions • So if you have an expensive expression that returns a numeric value, then it could be worthwhile to put it into its own calculated measure – Can then be hidden • After the first time it is evaluated, its value will be cached, so any other calculations that use it can read values from cache
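A sketch of the pattern in the MDX Script (member and hierarchy names are assumptions):

CREATE MEMBER CURRENTCUBE.[Measures].[Sales YTD Internal] AS
  AGGREGATE(YTD([Date].[Calendar].CURRENTMEMBER),
            [Measures].[Internet Sales Amount]),
  VISIBLE = 0;

CREATE MEMBER CURRENTCUBE.[Measures].[Sales YTD Growth] AS
  [Measures].[Sales YTD Internal] -
  ([Measures].[Sales YTD Internal],
   PARALLELPERIOD([Date].[Calendar].[Calendar Year], 1));

Once the hidden measure has been evaluated, its cell values are cached, so the visible measure (and anything else that references it) reads from cache instead of recalculating.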
  • 140. Agenda • How Analysis Services caching works • When and why Analysis Services can’t cache data • Warming the Storage Engine cache with the CREATE CACHE statement • Warming the Formula Engine cache by running queries • Automating cache warming
• 141. Types of Analysis Services cache
• Analysis Services can cache two types of value:
– Values returned by the Storage Engine
• "Raw" measure values, one cache per measure group
• Dimension data, one cache per dimension
– Values returned by the Formula Engine
• Numeric values only – strings can't be cached
• All SE caches have the same structure, known as the data cache registry
• The FE can also store values in this structure if calculations are evaluated in bulk mode
• The FE uses a different structure, the flat cache, for calculations evaluated in cell-by-cell mode
  • 142. Storage Engine Cache • Data in the data cache registry is held in subcubes, ie data at a common granularity • Subcubes may not contain an entire granularity – they may be filtered • Prefetching is generally good for cache reuse • Arbitrary-shaped subcubes cannot be cached • Try to redesign hierarchies or queries to avoid this
  • 143. Storage Engine Cache Aggregation • SE cache data can be aggregated to answer queries in some cases • Except when the measure data itself cannot be aggregated, for example with distinct count measures or many-to-many • Unlike aggregations, the SE cache isn’t aware of attribute relationships • Aggregation can only happen from the lower level of an attribute hierarchy to the All Member • For this reason user hierarchies can offer better performance because of the subcubes they generate
• 144. Formula Engine Cache Scopes
• There are three different "scopes", or lifetimes, of an FE cache:
– Query – for calculations defined in the WITH clause of a query, FE values can only be cached for the lifetime of the query
– Session – for calculations defined for a session, using a CREATE MEMBER statement executed from the client, FE values can only be cached for the lifetime of the session
– Global – for calculations defined in the cube's MDX Script, FE values can be cached until either:
• Any kind of cube processing takes place
• A ClearCache XMLA command is executed
• Writeback is committed
• Global scope is best from a performance point of view!
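To make the three scopes concrete (cube and measure names are assumptions):

// Query scope – the cache dies with the query
WITH MEMBER [Measures].[Margin] AS
  [Measures].[Sales Amount] - [Measures].[Total Cost]
SELECT {[Measures].[Margin]} ON 0 FROM [Adventure Works]

// Session scope – CREATE MEMBER executed from the client
CREATE MEMBER [Adventure Works].[Measures].[Margin] AS
  [Measures].[Sales Amount] - [Measures].[Total Cost];

// Global scope – the same CREATE MEMBER, but placed in the cube's MDX Script
CREATE MEMBER CURRENTCUBE.[Measures].[Margin] AS
  [Measures].[Sales Amount] - [Measures].[Total Cost];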
• 145. Cache Sharing
• Values stored in the SE cache can always be shared between all users
• Values stored in the FE cache can be shared between users, except when:
– They are stored in query- or session-scoped caches
– Users belong to roles with different dimension security permissions
• Note: dynamic security always prevents cache sharing
• Calculations evaluated in bulk mode cannot reference values stored in the FE flat cache
• Calculations evaluated in cell-by-cell mode cannot reference values stored in the FE data cache registry
• 146. Forcing Query-scoping
• In certain circumstances, SSAS uses query-scoped FE caches when you would expect it to use global scope
• These are:
– Calculations that use the Username or LookupCube functions
– Calculations that use non-deterministic functions such as Now(), or any SSAS stored procedures
– Queries that use subselects
– When any calculated member is defined in the WITH clause, whether or not it is referenced in the query
– When cell security is used
  • 147. Warming the SE cache • Considerations for warming the SE cache: – We want to avoid cache fragmentation, for example having one unfiltered subcube cached rather than multiple filtered subcubes – It is possible to overfill the cache – the SE will stop looking in the cache after it has searched 1000 subcubes – We want to cache lower rather than higher granularities, since the latter can be aggregated from the former in memory – We need a way of working out which granularities are useful
  • 148. Warming the SE cache • We can warm the SE cache by using either: – WITH CACHE, to warm the cache for a single query – not very useful – The CREATE CACHE command • Remember that building aggregations is often a better alternative to warming the SE cache • But in some cases you can’t build aggregations – for example when there are many-to-many relationships
• 149. CREATE CACHE
• Example CREATE CACHE statement:
CREATE CACHE FOR [Adventure Works] AS
'({[Measures].[Internet Sales Amount]},
  {[Date].[Date].[Date].MEMBERS},
  {[Product].[Category].[Category].MEMBERS})'
  • 150. Which subcubes should I cache? • The Query Subcube and Query Subcube Verbose events in Profiler show the subcubes requested from the SE by the FE • This is also the information stored in the SSAS query log, stored in SQL Server • Analyse this data manually and find the most commonly-requested, lower-granularity subcubes • Maybe also query the Query Log, or a Profiler trace saved to SQL Server, to find other subcubes – perhaps for queries that have been run recently
• 151. Warming the FE cache
• First, tune your calculations! Ensure bulk mode is used where possible
• The only way to warm the FE cache is to run MDX queries containing calculations
• Remember, these queries must not:
– Include a WITH clause
– Use subselects
• Also, there is no point trying to cache calculations whose values cannot be cached
• And think about how security can impact cache usage
• 152. Queries to warm the FE Cache
• Again, it is worth manually constructing some MDX queries yourself to warm the FE cache
• Also, running regularly-used queries (for example those used in SSRS reports) can be a good idea
• You can easily collect the queries your users are running with a Profiler trace, saving the trace to SQL Server or a .trc file
– The Query Begin and Query End events contain the MDX query
– Need to filter out queries with a WITH clause etc
– Watch out for parameterisation (eg SSRS)
– Watch out for use of session sets and calculations (eg Excel 2003)
– Watch out for queries that slice by Time, where the actual slicer used may change regularly
– Think about the impact of dimension security too
• 153. Memory considerations
• SSAS caching can use a lot of memory!
• The cache will keep growing until SSAS thinks it is running out of memory:
– When memory usage exceeds the % of available system memory specified in the LowMemoryLimit property, data will be dropped from cache
– When it exceeds the % specified in the TotalMemoryLimit property, all data will be dropped from cache
– We therefore don't want to exceed the LowMemoryLimit
– We also want to avoid paging
– We need to leave space for caching real user queries
• The FE flat cache is limited to 10% of the TotalMemoryLimit
– If it grows bigger than that, it is completely emptied
• 154. Automating Cache Warming
• We should perform cache warming after cube processing has taken place
• Remember – it may take a long time! It should not overlap with, or interfere with, real users querying
• We can automate it in a number of different ways:
– Running SSRS reports on a data-driven subscription
– Using the ascmd.exe utility
– Building your own SSIS package – the best solution for overall flexibility
• Either fetch queries from a SQL Server table
• Or from a Profiler .trc file using the Konesans Trace File Source component
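For example, a scheduled job could replay a file of warming queries with ascmd.exe; the flags shown are as I recall them from the SQL Server samples documentation, so verify against your version:

ascmd.exe -S MySSASServer -d "Adventure Works DW" -i WarmCacheQueries.mdx -o WarmCacheResults.xml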
• 155. Summary
• Clearly a lot of problems to watch out for!
• However, some cache-warming (however inefficient) is often better than none at all
• A perfectly-tuned cube would have little need for cache-warming, but...
– Some performance problems we just don't know about
– Some we may not be able to fix (eg with complex calculations, hardware limitations)
– Cache warming is likely to have some positive impact in these cases – maybe lots, maybe not much
• 156. OUR SPONSORS AND PARTNERS • Organised by: Polskie Stowarzyszenie Użytkowników SQL Server - PLSSUG • Production: DATA MASTER Maciej Pilecki