3. What Are We Covering?
• World of Appliances
• Introducing SQL Parallel Data Warehouse (PDW)
• Different Kinds of Nodes in PDW
• Hub and Spoke Architecture
2 Karan Gulati (SSAS Maestro)
4. What’s an Appliance?
Are we talking about a refrigerator or an oven?
5. Appliance World
An appliance is simply a preconfigured machine dedicated to a specific use,
in contrast to general use.
In the computer world, an appliance ships as hardware with a pre-installed OS
and software, built with best practices and guidelines in mind.
What does this mean for users?
Just plug and play: ready to use, just like a refrigerator or an oven.
6. Have You Heard About SQL PDW?
Microsoft SQL Server Parallel Data Warehouse (SQL Server PDW) is:
• A Massively Parallel Processing (MPP) appliance
• Simple to deploy
• A pre-built appliance with software, hardware and networking components
• Highly scalable in data storage, with high-speed data transfer
• One answer to the largest data warehouse workloads
7. Symmetric Multiprocessing
First, let's understand symmetric multiprocessing (SMP).
In SMP, each CPU core can work with any section of memory or disk; all
memory and all disks are available to every core.
The problem starts when too many CPUs request data over the system bus at the
same time. This creates a traffic jam on the bus: requests queue up, the system
slows down, and the amount of processing SMP can handle becomes limited as
usage grows.
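The queueing effect can be illustrated with a toy model. This is only a sketch under a loud assumption that is not from the slides: the shared bus serves exactly one request per tick, and all CPUs ask in the same tick. The function names are made up for illustration.

```python
def shared_bus_finish_time(num_cpus: int, requests_per_cpu: int) -> int:
    """Ticks needed when every request must cross one shared bus.

    Toy assumption: the bus serves exactly one request per tick, so
    requests from all CPUs line up in a single queue.
    """
    return num_cpus * requests_per_cpu


def average_wait(num_cpus: int) -> float:
    """Average ticks a request waits if all CPUs ask in the same tick.

    The k-th request in the queue waits k ticks (k = 0 .. num_cpus - 1).
    """
    return sum(range(num_cpus)) / num_cpus


# Adding CPUs makes each request wait longer on the one shared bus.
for cpus in (2, 8, 32):
    print(cpus, shared_bus_finish_time(cpus, 100), average_wait(cpus))
```

In this model the average wait grows roughly in line with the number of CPUs, which is the "traffic jam" described above.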
8. The Solution to the SMP Problem Lies in MPP
Massively Parallel Processing (MPP) architecture refers to the use of a large
number of separate computers to perform a job together.
In simple words, MPP is:
Multiple boxes, each with its own CPUs, memory and other resources, working on
a given task; this way we use the power of all machines / nodes in one go.
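A minimal sketch of the idea, assuming a simple total over a list of numbers: each "node" works only on the rows it owns, and the partial results are combined at the end. Threads stand in for separate machines here, and the function names and round-robin split are illustrative assumptions, not how PDW places data.

```python
from concurrent.futures import ThreadPoolExecutor


def split_across_nodes(rows, num_nodes):
    """Deal rows out round-robin so each node owns its own slice."""
    return [rows[i::num_nodes] for i in range(num_nodes)]


def node_total(local_rows):
    """Work one node does using only the rows it owns."""
    return sum(local_rows)


def mpp_total(rows, num_nodes=4):
    """Run every node's local work at once, then combine the partial results."""
    slices = split_across_nodes(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(node_total, slices))
    return sum(partials)


sales = list(range(1, 101))
print(mpp_total(sales))  # same answer as sum(sales)
```

No node ever touches another node's slice, so there is no shared bus for them to queue on.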
9. SQL PDW: Flow of Query Execution
• Step 1: The query hits the control node.
• Step 2: The control node breaks the query into multiple parallel operations
and distributes them out to the compute nodes where the actual data resides.
• Step 3: DMS, or Data Movement Service, coordinates any needed data movement
among nodes.
• Step 4: When the compute nodes are finished, the control node handles
post-processing and re-integration of result sets for delivery back to the users.
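The flow can be sketched with an "average amount per region" query. This is a simplified illustration, not PDW's actual engine: the data, the function names, and the choice of an average are all assumptions. Data movement between nodes is left out because each node can answer its part from local rows.

```python
from collections import defaultdict

# Hypothetical data: each compute node holds its own (region, amount) rows.
COMPUTE_NODES = [
    [("east", 10.0), ("west", 30.0)],
    [("east", 20.0), ("west", 50.0)],
    [("east", 30.0)],
]


def compute_node_step(local_rows):
    """Parallel operation on one compute node: partial sum and count per region."""
    partial = defaultdict(lambda: [0.0, 0])
    for region, amount in local_rows:
        partial[region][0] += amount
        partial[region][1] += 1
    return dict(partial)


def control_node_query(nodes):
    """Control node: send the step to every node, then re-integrate the results."""
    # Run sequentially here for clarity; on an appliance these run in parallel.
    partials = [compute_node_step(rows) for rows in nodes]
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for region, (total, count) in partial.items():
            merged[region][0] += total
            merged[region][1] += count
    # Post-processing: an average can only be finished once all partials are in.
    return {region: total / count for region, (total, count) in merged.items()}


print(control_node_query(COMPUTE_NODES))
```

Each node returns a sum and a count rather than its own average, because averaging the per-node averages would give the wrong answer when nodes hold different numbers of rows.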
10. SQL PDW: Nodes and Services
Control Node
Compute Node
Administrative Service Nodes
Data Movement Services
11. Control Node
The Control node is the central point of control for processing queries on the
SQL Server PDW appliance. The Control node receives the user query, creates a
distributed query plan, communicates relevant plan operations and data to
Compute nodes, receives Compute node results, performs any necessary
aggregation of results, and then returns the query results to the user.
12. Compute Node
A Compute node is the basic unit of scalability and storage. Each Compute
node in the SQL Server PDW appliance uses its own user-data and computing
resources to perform a portion of each parallel query.
13. Administrative Service Nodes
• Landing Zone node: An appliance node that provides temporary storage and
processing for loading data onto the appliance.
• Management node: An appliance node that performs multiple functions
related to managing the hardware and software in the appliance. This node is
the hub for software deployment and servicing, authentication within the
appliance (not login authentication), and monitoring system health and
performance.
• Backup node: The Backup node provides high-speed integrated backup at
the database level. This is tied to the organization’s overall backup strategy
and systems.
14. Data Movement Services
• When a query is submitted to the control node, the PDW Engine determines
what the query plan will be on each individual compute node, then submits the
query to all the compute nodes through the DMS.
• The DMS also coordinates any needed data movement among nodes and handles
any functions that need to be resolved centrally.
• In simple words, DMS is the brain that ties all the nodes together.
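One kind of data movement can be sketched as a redistribution step: rows that share a key are moved so they end up on the same node. The slides do not say how DMS chooses a target node, so the hash-then-modulo rule below is an assumption, and `target_node` and `redistribute` are hypothetical names.

```python
import zlib


def target_node(key, num_nodes):
    """Pick the node that should own a key (stable hash, then modulo)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def redistribute(nodes, key_index=0):
    """Move every row to the node chosen by its key."""
    num_nodes = len(nodes)
    moved = [[] for _ in range(num_nodes)]
    for rows in nodes:
        for row in rows:
            moved[target_node(row[key_index], num_nodes)].append(row)
    return moved


# Hypothetical starting layout: rows for the same customer sit on different nodes.
before = [
    [("alice", 10), ("bob", 5)],
    [("bob", 7), ("carol", 1)],
    [("alice", 3)],
]
after = redistribute(before)
for node_id, rows in enumerate(after):
    print(node_id, rows)
```

After the move, each customer's rows live on a single node, so a per-customer join or aggregate can run locally on that node.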
15. Hub and Spoke Architecture
A data warehousing architecture with a central hub data warehouse that provides
a flexible, high-speed ability to move or copy enterprise data warehouse (EDW)
data to spokes.
A spoke is typically a data mart held in physical storage optimized for a
particular user group or organization.
A data mart is usually a much smaller subset of the data in the EDW and specific
to the reporting and analytic needs of a specific user community.
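The hub-to-spoke relationship can be sketched as a filtered copy: the spoke receives only the rows and columns its user community reports on. The `EDW` rows, column names, and `build_spoke` helper are made up for illustration.

```python
# Hypothetical EDW rows held on the hub.
EDW = [
    {"dept": "sales", "region": "east", "amount": 120, "cost": 80},
    {"dept": "sales", "region": "west", "amount": 200, "cost": 150},
    {"dept": "hr", "region": "east", "amount": 0, "cost": 40},
]


def build_spoke(hub_rows, dept, columns):
    """Copy only one department's rows and the columns it reports on."""
    return [{c: row[c] for c in columns} for row in hub_rows if row["dept"] == dept]


sales_mart = build_spoke(EDW, "sales", ["region", "amount"])
print(sales_mart)
```

The hub keeps the full data set, while the sales data mart holds a much smaller subset shaped for that group's reports.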
16. SQL PDW – Acting as the Hub
Using a true hub-and-spoke architecture, all enterprise data can be
maintained on a SQL Server 2008 R2 Parallel Data Warehouse hub while
departments or business units keep their existing data marts to suit their
needs. High-speed data transfer relieves traditional barriers to hub and
spoke. Power users can even deploy a dedicated MPP appliance as a
spoke so they can autonomously manage resources, while IT can enforce
enterprise standards across all data.
17. Recommended Reading
SQL Server 2008 R2 Parallel Data Warehouse
ITIC: Comparison of Oracle Database Appliance to Microsoft SQL Server
Implementing a SQL Server PDW Using the Kimball Approach
Implementing Data Warehouse 2.0 by Inmon