All up data warehouse – From SMP to Parallel Data Warehousing
Take 1 big SAN
Add a little Server
Add a bigger Server
Add more networking
POTENTIAL PERFORMANCE BOTTLENECKS
[Diagram: the I/O path between storage and CPU. Components shown: disks, LUNs, storage controller (ports A/B, cache), FC switch, FC HBAs (ports A/B), server (cache), Windows, SQL Server, CPU cores. Rates labeled: CPU Feed Rate, HBA Port Rate, Switch Port Rate, SP Port Rate, SQL Server Read Ahead Rate, LUN Read Rate, Disk Feed Rate.]
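The rates named on this slide matter because a scan can only run as fast as the slowest stage on the path. A minimal Python sketch of that arithmetic follows. Every figure in it (core count, port counts, per-component MB/s) is an assumed placeholder for illustration, not a number from the deck.

```python
def bottleneck(stage_rates):
    """Return the (stage, rate) pair with the lowest throughput ceiling."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]

# Assumed throughput ceilings in MB/s for each stage named on the slide.
stages = {
    "CPU feed rate": 8 * 200,     # 8 cores x 200 MB/s per core (assumed)
    "HBA port rate": 2 * 400,     # 2 HBA ports x 400 MB/s (assumed)
    "Switch port rate": 2 * 400,  # 2 switch ports x 400 MB/s (assumed)
    "SP port rate": 2 * 400,      # 2 storage processor ports (assumed)
    "LUN read rate": 4 * 150,     # 4 LUNs x 150 MB/s (assumed)
    "Disk feed rate": 8 * 100,    # 8 disks x 100 MB/s (assumed)
}

stage, rate = bottleneck(stages)
print(f"Slowest stage: {stage} at {rate} MB/s")
```

With these assumed figures the CPUs could consume 1600 MB/s but the LUNs deliver only 600 MB/s, so adding cores would not help until the storage side is resized.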
It’s all about … SIZING
One SHOE does not FIT ALL
Transaction processing simplifies and accelerates data capture for accurate business decisions
Data warehousing enables common data model for single version of the truth
Analysis leads to optimized business processes and improved performance
Data Warehouse Scope
[Diagram: the data path across three tiers (Supporting Systems, BI Data Storage Systems, Presentation Layer Systems), with the data warehouse scope drawn as a dashed outline. Labeled components: Integration Services ETL; Data Staging, Bulk Loading; Data Warehouse; Dedicated SAN, Storage Array; Analysis Services Cubes; PerformancePoint; Reporting Services; Web Analytic Tools; SharePoint Services; Microsoft Office SharePoint; Presentation Data.]
Data Warehouse Scenarios
• No longer exclusive to
large enterprises and
specialist analysts
• Growth of affordable
self-service BI tools such
as PowerPivot and
Reporting Services has
created a DW
requirement for smaller
businesses and individual
departments
Microsoft Data Warehousing Offerings
Enterprise (Software only)
• Scalable and reliable SMP platform for data warehousing on any hardware
• Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs)
• Scale-Up DW, 10s of terabytes
• Support: Software Assurance; Premier Mission Critical Support
BDW Appliance (Integrated Appliance: Software and Hardware)
• Scalable and reliable platform for data warehousing on any hardware
• Ideal for large data marts or mid-sized EDWs
• Scale-Up DW, <5 terabytes
• Support: 3-Year Support Plus 24
Fast Track Data Warehouse RA (Reference Architectures: Software and Hardware)
• Reference architectures offering best price performance for data warehousing
• Ideal for data marts or small to mid-sized data warehouses with scan-centric workloads
• Scale-Up DW, 5–80 terabytes
• Support: Software Assurance; Premier Mission Critical Support
Parallel Data Warehouse (DW Appliance: Fully integrated Software and Hardware)
• Appliance for high end MPP Data Warehousing delivering highest scalability and performance
• Ideal for high scale or high performance data marts and EDWs
• Scale-Out DW with MPP, 10s - 100s of TB
• Support: Mission Critical Advantage Program
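Microsoft Data Warehouse Offerings
[Table: seven configurations compared left to right; the column labels are not present in the extracted text.]
Effort to Build: Very High | Very Low | Moderate | Moderate | Moderate | Moderate | Very Low
Capacity: Variable | 5 TB | 14 TB | 20 TB | 40 TB | 40 TB | 500 TB
Concurrency: Variable | Light | Light | Medium | Medium | High | Very High
Query Complexity: Variable | Medium | Medium | Medium | Medium | High | Very High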
Business Data Warehouse
Appliance
Business Data Warehouse Appliance
Agile
• Deploy in hours/days, not in
months
• Easy to use through built-in
dedicated tools to load and manage
your data warehouse
• Designed for up to 5TB data
warehouses
• Fast Track 3.0 compliant, license
path to Fast-Track
Complete
• Hardware + Software
+ Services
• Pre-tuned, pre-configured, pre-installed. Turn on and go!
• Single point of contact
for support
Optimized
• Specifically for small to
medium data warehouse
workload
• Designed for performance,
energy efficiency, and value
by HP and Microsoft’s best
engineers
• Security and reliability built
in
Scenarios
Small/Departmental
Data Warehouse
Spoke in EDW Hub and
Spoke Architecture
Reference Architectures
Fast Track Data Warehouse Components
Software:
• SQL Server 2008 R2
Enterprise
• Windows Server 2008 R2
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
Hardware:
• Tight specifications for
servers, storage
and networking
• ‘Per core’ building block
SQL Server Parallel Data
Warehouse
SQL Server Parallel Data Warehouse
• Tier-1 Enterprise Data Warehouse Appliance Offering
– High scalability from tens to hundreds of terabytes
– High performance through the MPP system
• Flexibility and Choice
– Choice of deployment options through distributed
architecture
• Most Comprehensive Solution
– Complete data warehouse solution spanning desktop,
enterprise data warehouse, and data marts
CONTROL NODE
• Client connections always go through the control node
• Contains no persistent user data
• Parallel Data Warehouse advantages:
– Processes SQL requests
– Prepares execution plan
– Orchestrates distributed execution
• Local SQL Server processes final query plan and aggregates results
• Client drivers provided by DataDirect
– Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), Java Database Connectivity (JDBC), and ActiveX® Data Objects (ADO.NET) client drivers
– Wire protocol (SequeLink)
– Drivers are available for 32-bit and 64-bit
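The control node's role (accept the request, fan it out, combine the partial results) can be illustrated with a small scatter-gather sketch in Python. This is only an analogy under stated assumptions: the node names, the data, and the functions `run_on_node` and `control_node_avg` are hypothetical and do not represent the appliance's actual interfaces.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each compute node holds one slice of a distributed table.
node_slices = {
    "node1": [120, 80, 45],
    "node2": [60, 310],
    "node3": [15, 25, 35, 5],
}

def run_on_node(amounts):
    """Partial aggregate computed locally on one compute node."""
    return sum(amounts), len(amounts)

def control_node_avg(slices):
    """Fan the request out, then combine partial results into the final answer."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(run_on_node, slices.values()))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

average = control_node_avg(node_slices)
print(f"Average across all nodes: {average:.2f}")
```

Note that the control node holds no rows itself in this sketch. It only sees the small (sum, count) pairs, which mirrors the slide's point that the control node contains no persistent user data.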
MANAGEMENT NODE
• Provides Support and Patching for the Appliance
• Holds image for re-deployment of compute node
• Holds Active Directory
LANDING ZONE
• Provides high-capacity storage for data files from ETL processes
• Is available as a sandbox for other applications and scripts that run on the internal network
• Provides SQL Server Integration Services
[Diagram: load path with the labels Source, Landing Zone (Files), Data Loader (DWLoader or SQL Server Integration Services), Compute Nodes.]
BACKUP NODE
• Provides Integrated Backup Solution
• Integrates with third-party backup options
• Orderable in different sizes
• Data rack servers: 10 active + 1 passive
• HP ProLiant DL360 G7 compute nodes
• InfiniBand, FC and Ethernet switching, 42U rack
• Expansion: grow from 1–4 data racks, storage options, test/dev system
• Storage: 10x HP StorageWorks MSA P2000 G3
• Consists of COMPUTE NODES and STORAGE NODES
COMPUTE NODE
• Each MPP node is a highly tuned symmetric multi-processing (SMP) node with standard interfaces
• Provides dedicated hardware, database, and storage
• Runs SQL Server
• Spare Node provides failover in case of node failure
• Drives are configured as RAID 1
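The "10 active + 1 passive" layout and the spare-node failover bullet can be pictured with a toy model. The Python sketch below is an illustration only: the `DataRack` class and the node names are hypothetical, and real failover is handled by the appliance's clustering, not by application code like this.

```python
class DataRack:
    """Toy model of a data rack with active compute nodes and one passive spare."""

    def __init__(self, active, spare):
        self.active = list(active)
        self.spare = spare  # None once the spare has been used

    def fail_over(self, failed):
        """Swap the spare into the slot of a failed active node."""
        if failed not in self.active:
            raise ValueError(f"{failed} is not an active node")
        if self.spare is None:
            raise RuntimeError("no spare node left")
        slot = self.active.index(failed)
        self.active[slot], self.spare = self.spare, None
        return self.active[slot]

# 10 active nodes plus 1 passive spare, as on the slide (names are made up).
rack = DataRack([f"compute{i:02d}" for i in range(1, 11)], spare="spare01")
replacement = rack.fail_over("compute03")
print(f"{replacement} took over; {len(rack.active)} nodes active")
```

The rack stays at ten active nodes after one failure, but a second failure has no spare to fall back on until the failed node is repaired.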
PDW – Client Connectivity
[Diagram: appliance connection points labeled Client Drivers, ETL Load Interface, Support/Patching, Corporate Backup Solution.]
PDW – Query Processing
Data Layout Approaches
Replicated
A table structure exists as a full copy within each discrete Parallel Data Warehouse node.
Distributed
A table structure is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the database management system (DBMS).
Ultra Shared-Nothing
Provides the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes.
• Small sets of data can be more efficiently stored in full (replicated).
• Certain set operations (such as single-node operations) are more efficient against full sets of data.
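The difference between the two layouts can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the node count, the table contents, and the use of CRC32 as the hash are all illustrative choices, not the appliance's actual hashing scheme.

```python
import zlib
from collections import defaultdict

NODES = 4  # hypothetical number of compute nodes

def node_for(key):
    """Map a distribution-column value to a node with a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % NODES

def distribute(rows, column):
    """Hash-distribute rows: each row lands on exactly one node."""
    placement = defaultdict(list)
    for row in rows:
        placement[node_for(row[column])].append(row)
    return placement

def replicate(rows):
    """Replicate rows: every node holds a full copy."""
    return {node: list(rows) for node in range(NODES)}

# Made-up tables: a larger fact table and a small dimension table.
sales = [{"order_id": i, "store_id": i % 3, "amount": 10 * i} for i in range(1, 13)]
stores = [{"store_id": s, "region": r} for s, r in enumerate(["North", "South", "West"])]

sales_by_node = distribute(sales, "order_id")
stores_by_node = replicate(stores)

def local_join(node):
    """Join the node's sales slice to its local copy of stores; no rows move."""
    lookup = {s["store_id"]: s["region"] for s in stores_by_node[node]}
    return [(row["order_id"], lookup[row["store_id"]]) for row in sales_by_node.get(node, [])]

joined = [pair for node in range(NODES) for pair in local_join(node)]
print(f"{len(joined)} joined rows without moving data between nodes")
```

Because every node already holds the whole small table, each node can finish its part of the join locally, which is the data-movement saving the slide describes.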
Ultra Shared-Nothing Architecture
Extends Traditional Shared-Nothing Design
• Pushes shared-nothing architecture into the SMP node: there is I/O and CPU affinity within SMP nodes
– Eliminates contention for user queries
– Uses full resources for each user query
• Provides multiple physical instances of tables
– Distributes large tables
– Replicates small tables
• Redistributes rows as needed
Provides Fault Tolerance
• All hardware components have redundancy (including CPUs, disks, networks, power, and storage processors)
• Control and compute nodes use failover clustering
• Management nodes have active and standby states
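"Redistributes rows as needed" refers to the case where a query needs rows grouped by a column other than the one the table is distributed on. A rough Python sketch of such a re-hash is shown below. The node count, the sample table, and the CRC32 hash are assumptions made for illustration, and `redistribute` is a hypothetical helper, not a product API.

```python
import zlib

NODES = 4  # hypothetical number of compute nodes

def node_for(key):
    """Stable hash of a column value to a node number."""
    return zlib.crc32(str(key).encode("utf-8")) % NODES

def redistribute(placement, new_column):
    """Re-hash every row on new_column; return the new placement and rows moved."""
    new_placement = {n: [] for n in range(NODES)}
    moved = 0
    for node, rows in placement.items():
        for row in rows:
            target = node_for(row[new_column])
            new_placement[target].append(row)
            if target != node:
                moved += 1
    return new_placement, moved

# Made-up table, initially distributed on order_id.
orders = [{"order_id": i, "customer_id": i % 5} for i in range(20)]
by_order = {n: [] for n in range(NODES)}
for row in orders:
    by_order[node_for(row["order_id"])].append(row)

# A join or aggregate on customer_id needs matching customers on the same node.
by_customer, moved = redistribute(by_order, "customer_id")
print(f"{moved} of {len(orders)} rows crossed the network")
```

Every moved row is network traffic, which is why choosing distribution columns that match common join keys keeps this step rare.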
Administrative Console
https://controlnodeipaddress
• Dashboard
• Query activity
• Load activity
• Backup and restore
• Active locks
• Active sessions
• Alerts
• Appliance state
Parallel Data Warehouse Configuration Manager
• Appliance topology
• Services status
• Network configuration
• Privileges
Flexible Business Alignment
A distributed architecture gives you the flexibility to add or change diverse workloads or user groups while maintaining data consistency across the enterprise
• Parallel database copy technology enables rapid data movement and consistency between EDW and data marts
• Create SQL Server 2008 R2, Fast Track Data Warehouse, and SQL Server Analysis Services data marts
• Supports user groups with very different service-level agreements (SLAs):
– Performance
– Capacity
– Loading
– Concurrency
Distributed Data Warehouse Architectures
[Diagram: hub-and-spoke layout. Labeled components: Landing Zone, ETL Tools, Central EDW Hub, Departmental Reporting, Regional Reporting, High-Performance Reporting, Regional Reporting with Business Decision Appliance, Third-Party RDBMS, Third-Party Data Integration, Mobile Applications.]
Determining the Right Solution
What is the workload?
• Number of concurrent users
• Query complexity
• Query mix
• Load processing
• Performance requirements
What is the customer looking for in a solution?
• Simplicity in the appliance
• 100 percent compatibility with SQL Server 2008 R2
• Enterprise scalability
• Economical hardware
• Incremental expansion and high availability by default
Parallel Data Warehouse
Value to Customer
• Enterprise-class scalability to hundreds of terabytes
• High performance
• Interoperability with leading BI products
• Mission critical support and maintenance
• Mature SQL Server platform with high security and robust engineering process
• Strong data warehouse vision and roadmap that includes industry-leading technologies
Supporting Features
• MPP with ultra shared-nothing architecture
• Distributed query optimization
• Balanced hardware with pre-tested and pre-tuned appliances optimized for data warehousing
• Third-party product integration (for example, MicroStrategy, Business Objects, and Informatica)
• Mission critical support and maintenance
• Road map includes column store, petabyte scalability, real-time data warehousing, MDM,
All up datawarewhouse – from smp to parallel

Contenu connexe

Dernier

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Dernier (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

En vedette

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationErica Santiago
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellSaba Software
 

En vedette (20)

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 

All up datawarewhouse – from smp to parallel

  • 1. All up datawarewhouse – From SMP to Parallel Data warehousing
  • 2.
  • 3.
  • 4. Take 1 big SAN Add a little Server Add a bigger Server Add more networking
  • 5. POTENTIAL PERFORMANCE BOTTLENECKS FC HBA A B FC HBA A B FCSWITCH STORAGE CONTROLLER A B A B CACHE SERVER CACHE SQLSERVER WINDOWS CPUCORES CPU Feed Rate HBA Port Rate Switch Port Rate SP Port Rate A B DISK DISK LUN DISK DISK LUN SQL Server Read Ahead Rate LUN Read Rate Disk Feed Rate
  • 6. It’s all about …. SIZING
  • 8. Transaction processing simplifies and accelerates data capture for accurate business decisions Data warehousing enables common data model for single version of the truth Analysis leads to optimized business processes and improved performance
  • 9. Data Warehouse Scope DataPath Data Warehouse Analysis Services Cubes PerformancePoint Dedicated SAN, Storage Array Reporting Services Web Analytic Tools Integration Services ETL SharePoint Services Microsoft Office SharePoint Data Staging, Bulk Loading Supporting Systems BI Data Storage Systems Presentation Layer Systems Data Warehouse Scope (dashed) PresentationDataPresentationData
  • 10. Data Warehouse Scenarios • No longer exclusive to large enterprises and specialists analysts • Growth of affordable self-service BI tools such as PowerPivot and Reporting Services has created a DW requirement for smaller businesses and individual departments
  • 11. Microsoft Data Warehousing Offerings Scalable and reliable SMP platform for data warehousing on any hardware Scalable and reliable platform for data warehousing on any hardware Reference architectures offering best price performance for data warehousing Appliance for high end MPP Data Warehousing delivering highest scalability and performance Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs) Ideal for large data marts or mid-sized EDWs Ideal for data marts or small to mid-sized data warehouses with scan-centric workloads Ideal for high scale or high performance data marts and EDWs Software only Integrated Appliance (Software and Hardware) Reference Architectures (Software and Hardware) DW Appliance (Fully integrated Software and Hardware) Scale-Up DW Scale-Up DW Scale-Up DW Scale-Out DW with MPP 10s of terabytes <5 terabytes 5–80 terabytes 10s - 100s of TB Software Assurance; Premier Mission Critical Support 3-Year Support Plus 24 Software Assurance; Premier Mission Critical Support Mission Critical Advantage Program Enterprise Fast Track Data Warehouse RA BDW Appliance Parallel Data Warehouse
  • 12. Microsoft Data Warehouse Offerings. Effort to Build: Very High, Very Low, Moderate, Moderate, Moderate, Moderate, Very Low. Capacity: Variable, 5 TB, 14 TB, 20 TB, 40 TB, 40 TB, 500 TB. Concurrency: Variable, Light, Light, Medium, Medium, High, Very High. Query Complexity: Variable, Medium, Medium, Medium, Medium, High, Very High.
  • 14. Business Data Warehouse Appliance Agile • Deploy in hours/days, not in months • Easy to use through built-in dedicated tools to load and manage your data warehouse • Designed for up to 5 TB data warehouses • Fast Track 3.0 compliant, license path to Fast Track Complete • Hardware + Software + Services • Pre-tuned, pre-configured, pre-installed. Turn on and go! • Single point of contact for support Optimized • Specifically for small to medium data warehouse workloads • Designed for performance, energy efficiency, and value by HP and Microsoft’s best engineers • Security and reliability built in
  • 17. Fast Track Data Warehouse Components Software: • SQL Server 2008 R2 Enterprise • Windows Server 2008 R2 Configuration guidelines: • Physical table structures • Indexes • Compression • SQL Server settings • Windows Server settings • Loading Hardware: • Tight specifications for servers, storage and networking • ‘Per core’ building block
  • 18. SQL Server Parallel Data Warehouse
  • 19. SQL Server Parallel Data Warehouse • Tier-1 Enterprise Data Warehouse Appliance Offering – High scalability from tens to hundreds of terabytes – High performance through the MPP system • Flexibility and Choice – Choice of deployment options through distributed architecture • Most Comprehensive Solution – Complete data warehouse solution spanning desktop, enterprise data warehouse, and data marts
  • 22. CONTROL NODE • Client connections always go through the control node • Contains no persistent user data • Parallel Data Warehouse advantages: o Processes SQL requests o Prepares execution plan o Orchestrates distributed execution • Local SQL Server processes final query plan and aggregates results • Provided by DataDirect: o Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), Java Database Connectivity (JDBC), and ActiveX® Data Objects (ADO.NET) client drivers o Wire protocol (SequeLink) o Drivers are available for 32-bit and 64-bit platforms
  • 23. MANAGEMENT NODE • Provides support and patching for the appliance • Holds image for re-deployment of compute node • Holds Active Directory
  • 24. LANDING ZONE • Provides high-capacity storage for data files from ETL processes • Is available as a sandbox for other applications and scripts that run on the internal network • Provides SQL Server Integration Services. Diagram labels: Source; Landing Zone Files; Data Loader; Compute Nodes; DWLoader or SQL Server Integration Services
  • 25. BACKUP NODE • Provides integrated backup solution • Integrates with third-party backup options • Orderable in different sizes
  • 26. Data Rack (consists of COMPUTE NODES and STORAGE NODES) • Servers: 10 active + 1 passive • HP ProLiant DL360 G7 compute nodes • InfiniBand, FC and Ethernet switching, 42U rack • Expansion: grow from 1–4 data racks, storage options, test/dev system • Storage: 10x HP StorageWorks MSA P2000 G3
  • 27. COMPUTE NODE • Data Rack servers: 10 active + 1 passive • HP ProLiant DL360 G7 compute nodes • InfiniBand, FC and Ethernet switching, 42U rack • Expansion: grow from 1–4 data racks, storage options, test/dev system • Storage: 10x HP StorageWorks MSA P2000 G3 • Each MPP node is a highly tuned symmetric multi-processing (SMP) node with standard interfaces • Provides dedicated hardware, database, and storage • Runs SQL Server • Spare node provides failover in case of node failure • Drives are configured as RAID 1
  • 28. PDW – Client Connectivity (diagram labels): Client Drivers; ETL Load Interface; Support/Patching; Corporate Backup Solution
  • 30. Data Layout Approaches. Replicated: A table structure exists as a full copy within each discrete Parallel Data Warehouse node. Distributed: A table structure is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the database management system (DBMS). Ultra Shared-Nothing: Provides the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes. • Small sets of data can be more efficiently stored in full (replicated). • Certain set operations (such as single-node operations) are more efficient against full sets of data.
  • 31. Ultra Shared-Nothing Architecture. Extends Traditional Shared-Nothing Design: • Pushes shared-nothing architecture into the SMP node; there is I/O and CPU affinity within SMP nodes o Eliminates contention for user queries o Uses full resources for each user query • Provides multiple physical instances of tables o Distributes large tables o Replicates small tables • Redistributes rows as needed. Provides Fault Tolerance: • All hardware components have redundancy (including CPUs, disks, networks, power, and storage processors) • Control and compute nodes use failover clustering • Management nodes have active and standby states
  • 32. Administrative Console https://controlnodeipaddress  Dashboard  Query activity  Load activity  Backup and restore  Active locks  Active sessions  Alerts  Appliance state
  • 33. Parallel Data Warehouse Configuration Manager  Appliance topology  Services status  Network configuration  Privileges
  • 34. Flexible Business Alignment. A distributed architecture gives you the flexibility to add or change diverse workloads or user groups while maintaining data consistency across the enterprise • Parallel database copy technology enables rapid data movement and consistency between EDW and data marts • Create SQL Server 2008 R2, Fast Track Data Warehouse, and SQL Server Analysis Services data marts • Supports user groups with very different service-level agreements (SLAs): performance, capacity, loading, concurrency
  • 35. Distributed Data Warehouse Architectures (diagram labels): Landing Zone; ETL Tools; Departmental Reporting; Regional Reporting; High-Performance Reporting; Central EDW Hub; Regional Reporting with Business Decision Appliance; Third-Party RDBMS; Third-Party Data Integration; Mobile Applications
  • 36. Determining the Right Solution What is the workload?  Number of concurrent users  Query complexity  Query mix  Load processing  Performance requirements What is the customer looking for in a solution?  Simplicity in the appliance  100 percent compatibility with SQL Server 2008 R2  Enterprise scalability  Economical hardware  Incremental expansion and high availability by default
  • 37. Parallel Data Warehouse. Value to Customer: • Enterprise-class scalability to hundreds of terabytes • High performance • Interoperability with leading BI products • Mission critical support and maintenance • Mature SQL Server platform with high security and robust engineering process • Strong data warehouse vision and roadmap that includes industry-leading technologies. Supporting Features: • MPP with ultra shared-nothing architecture • Distributed query optimization • Balanced hardware with pre-tested and pre-tuned appliances optimized for data warehousing • Third-party product integration (for example, MicroStrategy, Business Objects, and Informatica) • Mission critical support and maintenance • Road map includes column store, petabyte scalability, real-time data warehousing, MDM,
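
The replicated and distributed layouts described on slide 30 can be illustrated with a small sketch. This is not how Parallel Data Warehouse is implemented: the hash function (MD5), the node count of 10 (borrowed from the "10 active" servers per data rack), and the `sales` and `regions` tables are all assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict

# Assumption: 10 nodes, mirroring the 10 active servers in one data rack.
NUM_NODES = 10


def node_for(key, num_nodes=NUM_NODES):
    """Map a distribution-key value to a node using a stable hash.

    MD5 is only a stand-in; the appliance's real hash function is not
    described in the slides.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribute(rows, key_column, num_nodes=NUM_NODES):
    """Distributed layout: each row is stored on exactly one node."""
    placement = defaultdict(list)
    for row in rows:
        placement[node_for(row[key_column], num_nodes)].append(row)
    return placement


def replicate(rows, num_nodes=NUM_NODES):
    """Replicated layout: every node holds a full copy of the table."""
    return {node: list(rows) for node in range(num_nodes)}


# Hypothetical large fact table, hashed on a single column.
sales = [
    {"sale_id": i, "customer_id": i % 1000, "amount": 10.0 + i}
    for i in range(10_000)
]
# Hypothetical small dimension table, stored in full on every node.
regions = [{"region_id": r, "name": f"region-{r}"} for r in range(5)]

sales_by_node = distribute(sales, "customer_id")
regions_by_node = replicate(regions)

for node in sorted(sales_by_node):
    print(node, len(sales_by_node[node]), len(regions_by_node[node]))
```

Because every row with the same `customer_id` lands on the same node, a join between two tables hashed on that column can run locally on each node, and a join against `regions` never needs data from another node.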

Editor's notes

  1. The HP Business Data Warehouse Appliance is a great solution for data warehouse environments with light concurrency requirements and relatively low data volumes. This workload profile is becoming increasingly common as organizations recognize the business value in using data marts and departmental data warehouses as a platform for the increasing use of business analysis tools by information workers at all levels of the business. No longer are data warehouses and BI solutions the exclusive domain of huge enterprises – they are now an increasingly important capability for small to medium businesses and decentralized departments. A growing number of businesses don’t have the same concurrency, data volumes, or budgets as large enterprises, but still want to create a data warehouse for better reporting, analysis, and decision making.
  2. The HP Business Data Warehouse offers a solution for the customers discussed on the previous slide. It’s a solution that is: Complete – the appliance comes with all the hardware and software you need, pre-configured for a data warehouse workload based on expertise from HP and Microsoft, and includes support services from a single source. Optimized – Experts from Microsoft and HP have designed and tuned the appliance specifically for data warehouse workloads, so you can be sure it will meet your data warehouse requirements with efficient power utilization and built-in security and reliability features. Agile – Because the BDW is a single hardware appliance, you can just plug it in, switch it on, and within a very short period you’ll have a working data warehouse. The wizards included in the appliance make it easy to configure and load, enabling your business to start taking advantage of your data warehouse sooner than with a “self-build” solution. And while the BDW is optimized for relatively low data volumes and concurrency, if your business grows significantly you can transfer your BDW software licenses to a Fast Track solution.
  3. There are two key scenarios for using the HP Business Data Warehouse appliance: (1) a small business or departmental data warehouse for a small group of concurrent users who need to store and analyse up to 5 TB of data; (2) a spoke in an Enterprise Data Warehouse “hub and spoke” architecture, where the BDW is used to deliver a subset of the corporate data warehouse to a specific set of users.
  4. © 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
  5. The appliance is a complete solution with hardware, software, and service that is needed in a mission critical data warehouse. The database is highly scalable and can handle workloads of hundreds of terabytes while maintaining performance. The EDW appliance also works with your existing data warehouses and data marts so you do not have to rip and replace your current investments. Also, you can use familiar tools such as Microsoft Excel to analyze the data in your data warehouse.
  6. Customers will purchase at least two racks for a complete EDW Appliance system. [Click] The control rack will have control nodes, management nodes, the landing zone, and backup nodes. The data rack will have servers that are compute nodes and storage nodes. Each of these racks and node types will be discussed in more detail.
  14. Data layout options: Dimension tables are typically replicated. Parallel Data Warehouse maintains data integrity across all nodes. Fact tables are typically distributed. The data model, table sizes, and workloads must all be considered when choosing between replicated and distributed tables. The following join types are used to achieve distribution compatibility: (a) Shared-nothing join: achieves distribution compatibility by using compatible distribution keys in the SQL join criteria. (b) Ultra shared-nothing join: achieves distribution compatibility through a replicated table; no data movement between nodes is required. (c) Redistribution join: requires data to be dynamically distributed between compute nodes to achieve distribution compatibility.
  15. By taking the traditional idea of a shared-nothing architecture a step farther, the EDW appliance does not need to share any information between compute nodes. Each table is duplicated in several places to help with load balancing and fault tolerance. The hardware is redundant and supports automatic failover to the standby hardware to increase the overall system uptime.
  16. The Administrative Console is an Internet Information Services (IIS) web application for SQL Server Parallel Data Warehouse that displays the appliance’s state information. Users connect to the Administrative Console through Microsoft Internet Explorer.
  17. The Configuration Manager is an appliance administration tool that SQL Server Parallel Data Warehouse system administrators use to perform appliance-level operations and to change appliance-level settings. For example, use the Configuration Manager to reset passwords, set the time zone, change IP addresses, configure SSL certificates, enable remote access through the firewall, start or stop the appliance, and set Instant File Initialization.
  18. A distributed data warehouse solution, such as that supported by SQL Server Parallel Data Warehouse, comprises a centralized EDW and a set of loosely coupled data marts. For many years, this has been the preferred approach for enterprise-wide data warehousing, and numerous studies since 2003 confirm that hub and spoke is the most popular data warehouse architecture among DW professionals. Traditionally, implementing a hub and spoke architecture has been challenging due to practical limitations of the database engine and network resources. [Click to display types of spoke] With SQL Server Parallel Data Warehouse, you can create a diverse range of spoke types: SQL Server Parallel Data Warehouse MPP appliances for user groups that have extreme scalability requirements, Fast Track data warehouse implementations, SQL Server 2008 Enterprise data warehouses, and even SQL Server 2008 Analysis Services OLAP databases. [Click to display parallel database copy point] However, the SQL Server Parallel Data Warehouse parallel database copy technology enables rapid data integration between spokes and the SQL Server Parallel Data Warehouse hub, making it easier to build hub and spoke solutions that integrate your diverse data marts and the enterprise data warehouse. [Click to display multiple-user SLA point] The SQL Server Parallel Data Warehouse hub and spoke architecture enables you to support user groups with very different SLAs; supports hot, warm, and cold data; supports different requirements for data loading; and more.
  19. The EDW appliance can be the central hub in this architecture. The spokes can be anything from a SQL Server departmental data mart to a Fast Track reference implementation, a business decision appliance, or a SQL Server Analysis Services system. EDW is not restricted to any particular model, and the high-speed data copy features enable multiple clients.
  20. With so many choices, there are always questions about which solution is right for the organization. These questions help you to determine the correct solution. While there is rarely any one deciding factor, you can find a solution that is optimized for the things that are most important to you.
  21. The EDW appliance fits in with your existing data warehouse solutions and will enable you to query and report on the large amount of data stored in the appliance.
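
The three join types listed in note 14 can be sketched as a simple decision rule. This is a deliberately simplified illustration under stated assumptions, not the appliance's query optimizer: it considers only a single equality join, and the `Table` class, `classify_join` function, and table names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Table:
    """Hypothetical table descriptor; distribution_key=None means replicated."""
    name: str
    distribution_key: Optional[str] = None

    @property
    def replicated(self) -> bool:
        return self.distribution_key is None


def classify_join(left: Table, right: Table, left_col: str, right_col: str) -> str:
    """Classify a single-column equality join by the data movement it needs."""
    # A replicated table is present in full on every node, so the join
    # can run locally without moving any rows.
    if left.replicated or right.replicated:
        return "ultra shared-nothing"
    # Both tables are hashed on their join columns, so matching rows
    # already sit on the same node.
    if left.distribution_key == left_col and right.distribution_key == right_col:
        return "shared-nothing"
    # Otherwise rows must be re-hashed on the join column at query time.
    return "redistribution"


sales = Table("sales", distribution_key="customer_id")
customers = Table("customers", distribution_key="customer_id")
orders = Table("orders", distribution_key="order_id")
dates = Table("dates")  # small dimension table, replicated

print(classify_join(sales, customers, "customer_id", "customer_id"))
print(classify_join(sales, dates, "date_id", "date_id"))
print(classify_join(sales, orders, "order_id", "order_id"))
```

The sketch reflects the guidance in note 14: replicating dimension tables and distributing fact tables on a commonly joined column keeps most joins in the first two categories, so redistribution is needed only when the join column differs from the distribution key.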