1. Virtualizing Databases: Doing IT Right – The Sequel
VAPP1318
Michael Corey, Ntirety - A Division of Hosting
Jeff Szastak, VMware, Inc
2. Jeff Szastak
MSIA, CISSP, VCP, MCSE, etc.
Manager, Systems Engineering
CTO Ambassador
VMware, Inc.
Microsoft Exchange & SQL virtualization BC/DR SME
@szastak
Blog contributor:
blogs.vmware.com/apps
www.virtualinsanity.com
3. Michael J Corey
Books Include:
Virtualizing SQL Server with VMware Doing IT Right
Oracle Database 12c: Install, Configure & Maintain like a Professional
Oracle 11g A Beginner’s Guide
Oracle 10g A Beginner’s Guide
Oracle 9i - A Beginner's Guide
SQL Server 7 Data Warehousing
Oracle8i - Data Warehousing
Oracle8i - A Beginner's Guide
Oracle8 - Data Warehousing
Oracle8 – Tuning
Oracle8 - A Beginner's Guide
Oracle - Data Warehousing
Oracle - A Beginner's Guide
Tuning Oracle
Key Past/Current Affiliations:
Past President of the IOUG
Founding Board IOUG Virtualization SIG
Past Member IOUG Board of Directors
Past Director of Education IOUG
Founder Professional Association of SQL Server
Talkin’Cloud Top 200 Channel Partner Experts Cloud
Past Member Microsoft Data Warehouse Council
Past Member Oracle Educational Advisory Council
Past Director of Conferences IOUG Alive
Executive Board Massachusetts Robert H. Goddard Council on Science, Technology, Engineering & Mathematics
Started Working with Oracle Version 3.0
Beta Tested Oracle 5, 6, 6.2, 7, 8.X, 9.X, …
Presented on Technology & Business Topics from Brazil to Australia
Worked with Oracle on UNIX, Linux, Windows, MVS, VM, VMS, …
5. Doing Something Different
• Presentation Covers Both Oracle & Microsoft SQL Server
• More & More DBAs are faced with maintaining both
• Many Issues faced are shared
“This is a Database on Virtualized Infrastructure Session, Principles Apply to All Databases”
10. Why Your Company Cares: Virtualization is Strategic
Physical World (1:1)
• 1:1 relationship between applications and hardware
• Relevant cost metric = cost per server
• 8% - 12% utilization is typical
Virtual World (Many:1)
• Many:1 relationship between applications and hardware
• Relevant cost metric = cost per application
• 60% - 80% utilization is typical
• 60% reduction in CapEx
• 30% reduction in OpEx
• 80% reduction in Energy
The New Norm
“Can You Say Right-Sizing”
11. Memory Hot Add / CPU Hot Plug
Reduction in CPU Utilization
Increased processing rate
Adding Memory
13. Oracle - Hot Add Memory
Oracle database memory parameters are defined at instance startup. You will have to restart the database to take advantage of added memory, unless you have set SGA_MAX_SIZE large enough to leave room to grow.
Caution: Shared Resource Environment!
Typically…
SGA_TARGET <= SGA_MAX_SIZE, or you could be wasting memory
http://www.vmware.com/files/pdf/solutions/oracle/Oracle_Databases_VMware_Workload_Characterization_Study.pdf
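The relationship between the two parameters comes down to simple arithmetic. The sketch below is illustrative only: the function name and the GB figures are assumptions made for this example, not Oracle tooling.

```python
def sga_headroom_gb(sga_target_gb: float, sga_max_size_gb: float) -> float:
    """Return how far (in GB) the SGA could grow without an instance restart.

    Assumes SGA_TARGET can be raised online only up to SGA_MAX_SIZE,
    which is fixed when the instance starts.
    """
    if sga_target_gb > sga_max_size_gb:
        raise ValueError("SGA_TARGET must not exceed SGA_MAX_SIZE")
    return sga_max_size_gb - sga_target_gb


# Hypothetical VM: SGA_TARGET = 16 GB, SGA_MAX_SIZE = 24 GB.
headroom = sga_headroom_gb(16, 24)
print(f"SGA can grow by {headroom} GB online after a memory hot add")
```

A large gap gives room for hot-added memory but, in a shared resource environment, may tie up memory that other VMs could use.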
14. 1st Time Goal of Consistency Standardization Can Be Achieved
“Any Resource, Any Server, At Any Time” in the (Pool)
The 10 Millionth Model T was produced on June 4, 1924
19. Very Large ERP System
• 75+ application tiers – VMware/RHEL
• 8 TB database; 8.8 billion rows of data
• 52 million transactions per day
• 79K IOPS
• 40K blocks per second interconnect traffic
• 40,000+ named users
• 4,000+ peak concurrent users
Source: EMC
“Yes This is Virtualized”
20. Performance Test Environment (Topology)
■ VMware vSphere 5.1, Red Hat Enterprise Linux (RHEL) 6.3
■ Oracle 11gR2 (11.2.0.3) Single Instance and RAC
■ 3PAR StoreServ 10400
■ 192 x 15K RPM Fibre Channel Disks
■ 32 x 150K RPM Solid State Disk (SSD)
■ ProLiant DL580 G7 (client)
■ Intel® Xeon® CPU X7560 @ 2.26 GHz (8 cores)
■ 128GB memory
■ ProLiant BL660c Gen8 - 4 sockets / 24 cores (database server)
■ Intel® Xeon® CPU E5-4610 @ 2.40 GHz (6 cores)
■ 64GB memory
■ HP Virtual Connect FlexFabric 10Gb/24-Port Module
Recent “HP” Performance Study
– Choose Your Vendor DU-JOUR
21. Performance Results
• Virtualization has ~5% overhead as compared to native
• The database tps on a virtual machine is 5% less than that on the physical machine.
• 2P represents 12 cores and 4P represents 24 cores
• For 100 users the delta is ~6% and that increases up to ~10% for 1700 users.
• When the system gets busier, native starts to have a slightly larger advantage over virtualization.
22. Performance Results - Continued
• Both virtual and native, by moving from 2P (12 cores) to 4P (24 cores)
• The database tps increases by 40% to 50%
• The CPU utilization drops from 80% to 60%
• For RAC, by moving from 2P (12 cores) to 4P (24 cores)
• The database tps increases by 40% to 60%
• The CPU utilization drops from 75% to 60%
“Who Architects a Database With Less than 5% Overhead - One Busy Day You’re Done”
23. Workload Characteristics
• OLTP type of workload with a read/write ratio of 2:1
• Oracle Database size of 600GB
• Workload is an implementation of an online store
• The driver program simulates users logging in, browsing for products by title or category, adding selected products to their shopping cart, and then purchasing those products
24. Mega vMotion RAC on vSphere Functional Stress Test
VMW, EMC, Cisco
Executed by “Principled Technologies” 2013
WWW.principledtechnologies.com/Vmware/vMotion_oracle_rac_1013.pdf
3 RAC Node, vMotion on all 3 Nodes Simultaneously – Without any network disruption
25. Service Level Agreement/The DBA
Situation: Customer monitors critical medical equipment within a hospital. A SQL Server database is at the core of the system and is having huge performance problems. “Failure is not an option”.
Solution: Need to take the server down and adjust a BIOS setting that is causing SQL Server to only have access to 50% of the available CPU.
Customer: There is never a time they can take the server down, even for 5 minutes.
Stand-alone instance – had it been virtualized, the DBA would have had options.
26. No Win - SLA
Yet this situation points to a bigger issue concerning management’s expectations of the availability of the database and the physical infrastructure’s ability to support those expectations.
27. Have The Conversation
• Get the Resources You Need to meet the expectation
• OR – Reset Expectations concerning Database Uptime
28. Avoid Good Intention BIOS Setting
Check Power Management Settings
• Default on a lot of servers is a “Green” friendly setting
• Saves energy when the server is inactive
• Many times does not ramp up the CPU quickly, and in some cases not completely
• Avoid the Dozing setting
• Slows CPU to half its speed
Proper setting for a server hosting a database is “High Performance”
29. BIOS Settings to Consider
If Your Processors Support it
• Enable “Turbo Mode”
• Enable “Hyper-threading”
Enable all hardware-assisted virtualization features in the BIOS.
30. Fun Facts
• 10 VMs STARTED EVERY MINUTE: faster than the rate of babies born in the U.S.
• 80,000 VMware-certified Professionals in 146 Countries (July 2012)
• 6 vMOTIONS PER SECOND: more VMs are in motion than planes in flight.
• 20 MILLION VMs - 2011: if they were physical machines they would stretch 2x the length of the Great Wall of China
32. Lessons Learned – Tier 1
“What Works in Tier-2 (non-production), will not always work with Tier-1 (production)”
33. Doing It Right 1st Time: Very Conservative
Designed to Ensure You Avoid Common Traps & Pitfalls Associated with Production Databases being Virtualized
35. Doing It Right: Read Best Practices Guides
Read The Documentation From All Your Vendors……
VMware, Microsoft, Storage Vendor, Network Vendor….
Appendix of this deck
36. Professional Association of SQL Server
http://virtualization.sqlpass.org/
“Take Advantage of All resources Available to You”
37. • “Oracle Performance Management with vCenter Operations Manager and Oracle Enterprise Manager Adapter”
• “Virtualizing Oracle 11gR2 RAC on VMware vSphere: Best Practices”
• “Virtualization Bootcamp: Optimizing Oracle Databases on VMware”
Sign up for the NEW VMware SIG and gain access to content, webinars and networking opportunities
40. Installation
• Plan your SQL Server installation
– SLAs, RPOs, RTOs
– Baseline current workload, at least 1 business cycle
– Baseline existing (workload) vSphere implementation
– Estimated growth rates
– I/O requirements (I/O per sec, throughput, latency)
– Storage (disk type/speed, RAID, flash cache solution, etc.)
– Software versions (vSphere, Windows, SQL)
– Product Keys
– Licensing (may determine architecture)
– Workload type (OLTP, Batch, Warehouse)
– Accounts needed for installation / service accounts
– High Availability strategy
– Backup & Recovery strategy
“If you aim at nothing, you will hit it every time” – Zig Ziglar
41. Planning a High Availability Strategy
§ Requirements
• Recovery Time Objective (RTO)
• What does 99.99% availability really mean?
• Recovery Point Objective (RPO)
• Zero data lost?
• HA vs. DR requirements
§ Evaluating a technology
• What’s the cost for implementing the technology?
• What’s the complexity of implementing, and managing the technology?
• What’s the downtime potential?
• What’s the data loss exposure?

Availability %             Downtime / Year    Downtime / Month *    Downtime / Week
"Two Nines" - 99%          3.65 Days          7.2 Hours             1.69 Hours
"Three Nines" - 99.9%      8.76 Hours         43.2 Minutes          10.1 Minutes
"Four Nines" - 99.99%      52.56 Minutes      4.32 Minutes          1.01 Minutes
"Five Nines" - 99.999%     5.26 Minutes       25.9 Seconds          6.06 Seconds
* Using a 30 day month
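The downtime figures in the table follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year, a 30-day month, and a 7-day week (the function and dictionary names are made up for this example):

```python
def downtime_seconds(availability_pct: float, period_hours: float) -> float:
    """Allowed downtime, in seconds, for a period of the given length."""
    return period_hours * 3600 * (1 - availability_pct / 100)


# Period lengths in hours; the 30-day month matches the table's footnote.
PERIODS = {"year": 365 * 24, "month": 30 * 24, "week": 7 * 24}

for pct in (99, 99.9, 99.99, 99.999):
    cells = ", ".join(
        f"{name}: {downtime_seconds(pct, hours) / 60:.2f} min"
        for name, hours in PERIODS.items()
    )
    print(f"{pct}% -> {cells}")
```

Small differences from the table in the last digit come from rounding and from how a week or month is defined.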
42. Is Being Down 3 Days In A Row Ok?
You Had 99% Availability!
44. Baseline, Baseline, Baseline………
Why will making it Virtual make it perform better? If so, how?
– New Hardware?
– Faster CPU?
– Faster Drives?
“There are no silver bullets”
45. “IT” Food Groups: What to Baseline
• Existing Physical Database Infrastructure
• Existing/Proposed vSphere Infrastructure
46. When You Baseline a Database
§ Make Sure The Sample Interval Is Frequent
§ CPU, Memory, Disk (15 Seconds or less)
§ SQL Server TSQL (1 Minute)
“A Lot can happen in a short amount of time”
“SAME Applies to Oracle ! ! ! - A lot Can Happen”
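Why the sample interval matters can be shown with made-up numbers. The sketch below uses hypothetical one-second CPU readings (not data from the deck) and compares a single five-minute average with fifteen-second averages over the same period:

```python
def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


# Hypothetical data: 5 minutes of 10% CPU with a 30-second spike to 100%.
cpu = [10] * 300
cpu[120:150] = [100] * 30

five_minute = window_averages(cpu, 300)    # one coarse sample
fifteen_second = window_averages(cpu, 15)  # twenty fine-grained samples

print(max(five_minute))     # 19.0: the spike is averaged away
print(max(fifteen_second))  # 100.0: the spike is visible
```

With the coarse interval the baseline would show a quiet server, even though the CPU was saturated for half a minute.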
47. Oracle 12c Cloud Control/DB Express
The default thresholds for alerting in Cloud Control 12c are a good starting point
49. Database As A Service – Road Map
Multiple Tier Approach
• Different levels for different DB placement
• Basic and Premium
– Basic = Low utilization, test / dev DBs
– Premium = Moderate to High utilization, production, high visibility
• Different underlying hardware
• Different SLAs, RTO, RPOs and HA between tiers
Center of Excellence
• Assist with migrations, net new DBs and Capacity Management
– Communication, no “throwing it over the wall”
• VMware/SAN/Network/DB teams to discuss DB migrations
– Optional Teams: Security, Procurement
“Few Dedicated Personnel to each Level of Stack – End Users are taking advantage of automation”
50. Understanding Workload Resource Requirements
Basic performance characteristics (CPU, memory, IO, Network)
• Daily average resource usage
• Daily peak resource usage
• Daily peak hours
• Month-end, quarter-end, year-end peaks
Monitoring Tools
• Windows Perfmon (Example)
– Processor(*) → % Processor Time
– Process(sqlservr) → % Processor Time
– SQLServer:Memory Manager → Total Server Memory (KB)
– PhysicalDisk(*) → Disk Reads/sec, Disk Writes/sec
– PhysicalDisk(*) → Disk Read Bytes/sec, Disk Write Bytes/sec
– Network Interface(*) → Bytes Received/sec, Bytes Sent/sec
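Once counters like these have been collected, the daily average, daily peak, and peak hour fall out of a simple aggregation. The sketch below assumes the samples have already been reduced to (hour of day, utilization %) pairs; the data and the `summarize` helper are hypothetical and not part of Perfmon.

```python
from statistics import mean


def summarize(samples):
    """samples: list of (hour_of_day, utilization_pct) tuples for one day."""
    by_hour = {}
    for hour, pct in samples:
        by_hour.setdefault(hour, []).append(pct)
    hourly_avg = {hour: mean(values) for hour, values in by_hour.items()}
    return {
        "daily_average": mean([pct for _, pct in samples]),
        "daily_peak": max(pct for _, pct in samples),
        "peak_hour": max(hourly_avg, key=hourly_avg.get),
    }


# Hypothetical readings: quiet at 2am, moderate at 9am, busiest at 2pm.
samples = [(2, 5), (2, 5), (9, 20), (9, 30), (14, 60), (14, 50)]
summary = summarize(samples)
print(summary)
```

The same aggregation applies to any of the counters above; month-end and quarter-end peaks need a collection window long enough to include those dates.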
52. %MLMTD
§ VM Level - The percentage of time the vCPU was ready to run but deliberately wasn’t scheduled because that would violate the “CPU limit” settings. If larger than 0, the world is being throttled due to the limit on CPU.
57. Migration – Baseline: Virtual (disk) Post
§ Export output to Excel, and graph it using a variety of tools, such as Jonathan Kehayias’ PowerShell script.
§ Compare the results against the required IOPS as measured in the pre-deployment assessment.
60. Don’t keep it a Secret
• DBAs – tell vSphere, Storage, and Network Admins your needs
– Storage: (IOPS / throughput)
– CPU: (MHz)
– Memory: (Total GB)
– Network: Bandwidth
– Features (e.g. Windows clustering)
– Anticipated Growth Rates
– Anticipated Activity
– Other
“They Flunked Mind Reading”
61. Before You Install a Database on New VM
• Do basic throughput testing of the IO subsystem prior to deploying a Database
• Tools you can use
– SQLIO/IOMETER
– SLOB…..
“Check It Before You Wreck it”
-- Jeff Szastak
62. Should You P→V (Via Converter) Production Environments?
Build “New” From Scratch – GI/GO
63. SQL Server - Unattended Installation Options
§ VMware vCAC
§ Command Line
• http://msdn.microsoft.com/en-us/library/ms144259
§ Configuration File
• http://msdn.microsoft.com/en-us/library/dd239405
§ Sysprep
• http://msdn.microsoft.com/en-us/library/ee210664
• FYI – Available as of SQL Server 2008 R2
64. ORACLE - Unattended Installation Options
You At the VMworld Party While your Database is Provisioned
VMware vCAC
DBCA Silent Install
http://docs.oracle.com/cd/E11882_01/install.112/e24321/app_nonint.htm#CIHHFDGG
RAC Silent Install
http://docs.oracle.com/cd/E11882_01/install.112/e24660/cripts.htm#RILIN1119
65. Phone-A-Friend
VMware has stated that it will take the ______ support call if a customer calls ______ Support and ______ Support is being difficult because the customer is running on VMware.
• Hint…….
“TSANET.ORG--- Hardware or Software”
66. Use SQL Server/Oracle recommended installation guidelines for respective operating system – same as physical!
Physical World 1:1
Virtual World Many:1
Same As Physical
67. If your OS and database don’t know they are virtualized, do you need to tell them?
Did You Hear That?
69. Database Workload Types
OLTP
§ Large amount of small queries
§ Sustained CPU utilization during working hours
§ Sensitive to peak contention (slowdowns affect SLA)
§ Generally write intensive
§ May generate many chatty network round trips
Batch / ETL
§ Typically runs during off-peak hours, low CPU utilization during the normal working hours
§ Can withstand peak contention, but sustained activity is key
DSS
§ Small amount of large queries
§ CPU, memory, disk IO intensive
§ Peaks during month end, quarter end, year end
§ Can benefit from inter-query parallelism with large number of threads
70. OLTP vs. Batch Workloads
OLTP Workload (avg. 15%)
§ What this says:
• Average 15% Utilization
• Moderate sustained activity (around 28% during working hours 8am-6pm)
• Minimum activities during non-working hours
• Peak utilization of 58%
Batch Workload (avg. 15%)
§ What this says:
• Average 15% Utilization
• Very quiet during the working day (less than 8% utilization)
• Heavy activity during 1am-4am, with avg. 73%, and peak 95%
71. OLTP vs. Batch Workloads
§ What This Means
• Better Server Utilization
• Improved Consolidation Ratios
• Less Equipment To Patch, Service, Etc
• Saves Money/Less Licensing
OLTP/Batch Combined Workload
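The consolidation argument can be illustrated by adding two hourly profiles whose busy periods do not overlap. The numbers below are illustrative only, loosely based on the figures quoted on the previous slide; they are not the measured data behind the charts.

```python
def hourly_profile(default_pct, overrides):
    """Build a 24-entry utilization profile (percent of one host's CPU)."""
    profile = [default_pct] * 24
    for hours, pct in overrides:
        for hour in hours:
            profile[hour] = pct
    return profile


# Assumed shapes: OLTP busy 8am-6pm, batch busy 1am-4am.
oltp = hourly_profile(5, [(range(8, 18), 28)])
batch = hourly_profile(8, [(range(1, 4), 73)])

combined = [o + b for o, b in zip(oltp, batch)]

print(max(oltp) + max(batch))  # 101: sizing each workload for its own peak
print(max(combined))           # 78: actual peak when sharing one host
```

Because the peaks fall at different times of day, the combined peak stays well below the sum of the individual peaks, which is where the improved consolidation ratio comes from.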
79. More VMs vs. More DB Instances
More VMs
• Better resource isolation
• Better security, patch management
• Better Performance
• Less Risk
Fewer VMs (More instances)
• Less expensive in some licensing models
• No OS isolation (configuration, security, fault)
• No resource isolation
• Less Segmentation (HIPAA, PCI,…..)
Note: Both Work, Both Valid Strategies
80. General Rules of Thumb
• Resource utilization is the basics, but not all
• Consider business, security, management, and other requirements
• Consider workload characteristics
• OLTP workloads can be stacked up to a sustained utilization level
• OLTP workloads that are high usage during daytime, and batch workloads that run during off-peak hours, mix well together
• Batch/ETL workloads with different peak periods share well together
• Consider operational history, e.g. month end, quarter end
• Additional VMs may be added to handle peak periods during month end, quarter end, and year end if scale out is a possibility
• CPU, memory hot-add may be used to handle the peak workload
• Reduce VM density, or add more hosts to the cluster
82. Golden Rules
“Your Database is just an extension of your Storage” Michael Webster
“Your Storage is Just a Set of containers for your database” Don Sullivan
83. Storage
• The fundamental relationship between consumption and supply has not changed
• Spindle count and RAID configuration still rule
• Host demand is an aggregate of VMs
• Factors that affect storage performance
• Storage protocols
• Storage configuration
• VMFS configuration (Separate LUNs, All on one LUN, Does it even matter?)
85. Use VMFS vs. RDM
• VMFS Advantages
– Negligible performance cost and superior functionality
– Ability to take full advantage of future functionality enhancements (Future Awesomeness)
• Align VMFS on 64K boundaries
– Automatic with vCenter
– www.vmware.com/pdf/esx3_partition_align.pdf
• With vSphere 4.1
– Use VAAI (Storage API)*
• With vSphere 5.x
– Use VASA (Storage API)*
[Chart: VMFS Scalability. IOPS (0 to 8000) at 4K, 16K, and 64K IO sizes for VMFS, RDM (virtual), and RDM (physical)]
* Work With Storage Vendor For Details
86. Thin Provisioning Perf / Block Zeroing
[Chart: I/O Throughput in MBs]
§ Use Thick Eager Zeroed disks for best performance
§ Maximum performance happens eventually, but when using lazy zeroing, zeroing needs to occur before you can get maximum performance
§ At minimum for Databases, LOGS, TEMPDB
§ Check with your Storage Vendor to see how they handle Thin Provisioning. Your mileage may vary
§ VAAI capable array can alter config
http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf
88. Optimizations – SQL Server: Disk
§ Disk
• Instant file initialization – add the SQL Server service account to “Perform volume maintenance tasks” under User Rights Assignment in Local Policies of the server’s settings.
• By default, every time the database file needs to grow, the OS will zero-fill the file & block writes until complete
• Adding requires a restart of the SQL Service,
• removal requires a reboot
http://msdn.microsoft.com/en-us/library/ms175935(v=SQL.105).aspx
89. SQL Server: System Databases
Tempdb
• Depending on workload, consider creating multiple tempdb files (see next slide)
• Microsoft recommends 1 datafile per CPU
• Isolate tempdb from database and logs, and consider a dedicated vSCSI adapter
• Verify via testing
Oracle - No Datafile to CPU relationship
90. For those who want to be less conservative (for TempDB)
SQL 2005: 50% the number of cores, up to 8. SQL 2008+: 25%-50% ratio of files to cores, usually up to 8.
The number of data files and tempdb files is important enough that Microsoft has two spots in the Top 10 SQL Server Storage best practices highlighting the number of data files per CPU
TEMPDB: 1 datafile per CPU
(Dual core counts as 2 CPUs)
(RAID 1+0 – Write Intensive)
Data Files: 1 datafile per CPU
200GB DB / 4 vCPU = 4 @ 50GB
Make Equal Size/Grow Equally
http://technet.microsoft.com/en-us/library/cc966534.aspx
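The file-count rules on this slide reduce to a few lines of arithmetic. The sketch below encodes them as stated; the function names, the `conservative` switch, and the choice to round down are assumptions made for this example, and any result should still be verified via testing.

```python
def tempdb_file_count(cores: int, conservative: bool = True,
                      ratio: float = 0.5, cap: int = 8) -> int:
    """Number of tempdb data files.

    conservative=True: one file per core (the "1 datafile per CPU" rule).
    conservative=False: ratio * cores, capped (the "less conservative" rule).
    """
    if conservative:
        return cores
    return max(1, min(cap, int(cores * ratio)))


def data_file_size_gb(db_size_gb: float, vcpus: int) -> float:
    """Equal-size data files, one per vCPU."""
    return db_size_gb / vcpus


print(tempdb_file_count(4))                       # 4
print(tempdb_file_count(32, conservative=False))  # 8 (capped)
print(data_file_size_gb(200, 4))                  # 50.0, i.e. 4 @ 50GB
```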
91. Storage Paravirtual SCSI (PVSCSI) adapters
PVSCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization.
• Up to 30% CPU Savings
• Up to 12% I/O Improvement
Paravirtual Adapter Knows It’s Virtual
* Very Important to Use Most Current Version
92. Paravirtual SCSI (PVSCSI) Storage Adapters
PVSCSI adapters are best suited for environments, especially SAN environments, where hardware or applications drive a very high amount of I/O throughput.
PVSCSI adapters are not suited for DAS (Direct Attached Storage) environments.
93. Always Check Storage Vendor’s Best Practices
“>80% of the issues in a virtualized Environment have to do with Storage misconfigurations”
94. Storage – Putting It All Together
• Work with storage engineer, deliver realistic requirements early in the cycle
• Size for performance, not capacity
• Large number of small drives, not small number of large drives
• More / faster spindles are better for performance
• Understand the I/O requirements of different workloads
• Transactional data vs. log vs. backup
• OLTP vs. DSS
95. Storage – Putting It All Together
• Understand the path to the drives, i.e. throughput, multi-pathing
• Use eagerzeroedthick disk provisioning to avoid lazy zeroing
• Place swap file on separate dedicated drive on SAN, mitigate the impact of swapping with EFD (for high performance workload)
• Can potentially slow down vMotions
• Follow SQL Server storage best practices http://technet.microsoft.com/en-us/library/cc966534.aspx
Work with your SAN Vendor as well, they have Best Practices for running these workloads on your array
96. The Bottom Line
“>80% of performance
problems with
virtualization occur at
the storage layer”
Now that you know, don’t
let it happen to YOU
98. vCPUs – Hyper-Threading
Hyper-threading allows a single physical processor core to appear as two "logical" processors to the host operating system
Still only One Processor
99. vCPUs
• With Databases Avoid Over Commitment of Processor Resources till you have “actionable” performance data you can scale (vCOPs)
• 1-1 Ratio Physical Cores to vCPUs
• Out of the gate!
Hyper-Threaded CPU != Full vCPU
100. Within The VM
In a virtual environment each vCPU is a single thread. There is no virtual equivalent of a hyper-thread.
Guest O/S sees the number of allocated vCPUs
Non-Virtualized O/S – Would see the Hyper-threads.
Oracle: Latches, Parallelism… Based upon visible CPUs. Be Careful How You Set these things.
101. Hardware Generation Matters
• Use the latest processors
• Support for Hardware Assisted Virtualization
• H/W assist for CPU: AMD-V on AMD or VT-x on Intel
• H/W assist for MMU
• NPT* on AMD or EPT on Intel: NPT used in our tests
• Enabled at BIOS level
• Enable NUMA support
• Understand VMM (Virtual Machine Monitor)
Benefits of hardware assistance for CPU and Memory Virtualization
http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf
102. Processor – Putting It All Together
• Leverage hardware-assisted virtualization (enabled by default)
• Consider avg. and peak utilization
• Be aware of hyper-threading, a hyper-thread does not provide the full power of a physical core
• Consider future growth of the system, sufficient head room should be reserved
• In high performance environment, consider adding additional hosts when avg. host CPU utilization exceeds 65%
• Consider increasing CPU resource if guest VM CPU utilization is above 65% on average
• Ensure Power Saving Features are “OFF”
• Use vCOPs for consumption & capacity
104. Optimizations SQL Server: Memory
Memory – Max / Min
§ Min is set to 0
• Only change when the OS is requesting memory for other apps
§ Max is 2,147,483,647 MB by default (effectively unlimited)
• Should not equal or exceed total VM RAM, may lead to OS starvation
• Do not set to 0, may prevent SQL from starting
• If using “Hot Add” remember to modify this setting
SQL Max Memory = VMMem – ThreadStack – OS Mem – VM Overhead
• ThreadStack = NumOfSQLThreads × ThreadStackSize
• ThreadStackSize = 1 MB on x86 | 2 MB on x64
http://msdn.microsoft.com/en-us/library/ms178067.aspx
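The max memory formula on this slide can be worked through in a few lines. The sketch below follows the formula as given; the function name and the example inputs (thread count, OS memory, VM overhead) are hypothetical values chosen for illustration.

```python
def sql_max_server_memory_mb(vm_mem_mb: int, sql_threads: int,
                             os_mem_mb: int, vm_overhead_mb: int,
                             thread_stack_mb: int = 2) -> int:
    """SQL Max Memory = VMMem - ThreadStack - OS Mem - VM Overhead.

    thread_stack_mb defaults to 2, the x64 value quoted on the slide.
    """
    thread_stack_total_mb = sql_threads * thread_stack_mb
    return vm_mem_mb - thread_stack_total_mb - os_mem_mb - vm_overhead_mb


# Hypothetical 32 GB x64 VM: 512 SQL threads, 4 GB for the OS, 500 MB overhead.
print(sql_max_server_memory_mb(32 * 1024, 512, 4 * 1024, 500))  # 27148
```

If memory is hot-added to the VM, the same calculation has to be rerun and the setting updated, as the slide notes.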
105. Max SQL Mem Example
Ntirety Rule**
• 2 Gig + Additional 1 Gig per 16 Gig Physical Memory
**In the context of the VM size or Physical Machine Size
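One way to read this rule of thumb is as the amount of memory to hold back from SQL Server. That reading, the rounding down to whole 16 GB blocks, and the function names below are all assumptions made for this sketch.

```python
def held_back_gb(total_gb: int) -> int:
    """2 GB plus 1 GB per full 16 GB of VM (or physical machine) memory."""
    return 2 + total_gb // 16


def max_sql_memory_gb(total_gb: int) -> int:
    """Memory left for SQL Server max server memory under the assumed reading."""
    return total_gb - held_back_gb(total_gb)


for total in (16, 64, 128):
    print(total, held_back_gb(total), max_sql_memory_gb(total))
```

For a 64 GB VM this holds back 6 GB and leaves 58 GB for max server memory.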
106. Running Multiple Instances on Same VM
Two options, and do nothing is not one of them
Option 1: Use max server memory
• Create max setting for each instance
• Give each instance memory proportional to expected workload / db size
• Do not exceed total RAM allocated to VM
Option 2: Use min server memory
• Create min settings for each instance
• Give each instance memory proportional to expected workload / db size
• The sum should be 1-2 GB less than RAM allocated to VM
§ Settings can be modified without having to restart the instances

Max server memory
  Pro: When a new process or instance starts, memory is available immediately to fulfill the request
  Con: If instances are not running, the running instances cannot access the available RAM
Min server memory
  Pro: Running instances can leverage memory previously used by instances that are no longer running
  Con: When a new process or instance starts, running instances need to release memory
107. SQL Server: Memory
Lock Pages in Memory
■ This keeps SQL more responsive when paging occurs
■ SQL Server Lock Pages in Memory is ON in >= 32/64 bit Standard Edition (2012)
■ Account needs “Lock pages in memory” rights
▪ Give it the RIGHTS
http://msdn.microsoft.com/en-us/library/ms178067.aspx
108. Non-Uniform Memory Access (NUMA)
• NUMA avoids the performance hit when several processors attempt to address the same memory by providing separate memory for each NUMA Node.
• Speeds up Processing
• NUMA Nodes Specific to Each Processor Model
109. Non-Uniform Memory Access (NUMA)
“All Processors Can Use All Memory”
• 4 Sockets, 6 cores.
• 4 NUMA Nodes
• 128 Gig RAM
• Each NUMA Node = 32 Gig RAM
“In this example Optimal Performance: Each VM < 32GB*”
*CPU Overhead Needs to be accounted for. Minimal
*vNUMA – Minimizes Impact when this happens
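The sizing check in this example is plain division. A small sketch using the slide's numbers, assuming memory is spread evenly across nodes (the helper names are made up for illustration):

```python
def numa_node_memory_gb(total_ram_gb: float, numa_nodes: int) -> float:
    """Memory local to each NUMA node, assuming an even split."""
    return total_ram_gb / numa_nodes


def fits_in_one_node(vm_mem_gb: float, vm_vcpus: int,
                     node_mem_gb: float, cores_per_node: int) -> bool:
    """True if the VM can be served by a single node's cores and memory.

    Memory uses a strict "<" to leave room for overhead, as on the slide.
    """
    return vm_mem_gb < node_mem_gb and vm_vcpus <= cores_per_node


node_mem = numa_node_memory_gb(128, 4)         # 32.0 GB per node
print(fits_in_one_node(24, 4, node_mem, 6))    # True
print(fits_in_one_node(48, 4, node_mem, 6))    # False: spans nodes
```

A VM that does not fit still runs; it simply pays for remote memory access, which vNUMA is meant to soften.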
110. Home Node - NUMA
The home node for a virtual machine is first selected considering current CPU and memory load across all NUMA nodes.
Wide NUMA Allows for the use of Multiple NUMA Nodes Efficiently
Hot Add CPU disables vNUMA
**** Properly Size Database/Don’t Need Hot Add CPU *****
112. Memory Allocated to VM Is Determined by……
• DRS Shares/Limits**
• Total Memory of Host
• Reservations
• Memory Load of the Host
** Avoid shares/limits unless you really understand how they work
113. Swapping Occurs Two Places
1. Guest VM Swapping
2. ESXi Host Swapping
Swapping can slow down I/O performance of disks for other VMs
115. Is Google Your Best Friend….
“There is the Google DBA, the GUI DBA, or the DBA that does all the work” Charles Kim
116. Ballooning
• Kicks in – When Physical Host experiencing memory contention
• Balloon Driver Runs on each individual VM
• Communicates with guest O/S to determine what is happening with memory
• Works with the server to reclaim pages that are considered least valuable by the guest OS
117. Exceeding Host Memory can lead to ballooning, Memory Compression or Swapping
Swapping can slow down I/O performance of disks for other VMs
118. Don’t Shut Off Memory Ballooning
Ballooning is Your First Line of Defense
119. How Many VMs can I Put on Host?
§ As many whose active memory will fit in physical RAM, while leaving some room for memory spikes.
120. Total Memory Demand
Active memory (%ACTV) of VMs + Memory Overhead – Page sharing of VMs (De-Duping)
De-Duping = Transparent Page Sharing
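The demand formula, together with the "leave room for spikes" advice from the previous slide, can be sketched as follows. The per-VM figures, the dictionary keys, and the headroom value are hypothetical; real inputs would come from host and VM memory statistics.

```python
def total_memory_demand_gb(vms) -> float:
    """Active memory + memory overhead - page sharing, summed over all VMs."""
    return sum(vm["active_gb"] + vm["overhead_gb"] - vm["shared_gb"] for vm in vms)


def fits_on_host(vms, host_ram_gb: float, spike_headroom_gb: float) -> bool:
    """True if demand plus headroom for memory spikes fits in physical RAM."""
    return total_memory_demand_gb(vms) + spike_headroom_gb <= host_ram_gb


# Three hypothetical, similar VMs on a 64 GB host.
vms = [{"active_gb": 20, "overhead_gb": 0.5, "shared_gb": 1.5} for _ in range(3)]

print(total_memory_demand_gb(vms))  # 57.0
print(fits_on_host(vms, 64, 4))     # True
print(fits_on_host(vms, 64, 8))     # False: not enough room for spikes
```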
121. Transparent Page Sharing is more effective the more similar the VMs are
“Put Like Operating Systems On Same Physical Host”
122. TPS – When It Kicks In
• Before Ballooning
• Always Running on preset cycle looking for opportunity to reclaim memory
• Very Low Overhead
• Runs At HOST Level
123. Myth: Disable Memory TPS
• This is incorrect guidance floating around the Internet – Here’s why:
124. Disable Unnecessary Foreground/Background Processes within Guest O/S
• Windows Example
– Alerter, Automatic Updates, ClipBook, Error Reporting
– Help & Support, Indexing, Messenger, NetMeeting
– Remote Desktop
– Once Established (Clone for reuse by VMware)
Keep VM Footprint as small as Possible: NUMA, Shared Resource Pool
125. Memory Reservations
• VM is only allowed to power on if the CPU & memory reservation is available (Strict admission)
• The amount of memory can be guaranteed even under heavy loads.
• SET CPU/Not Guaranteed
• VMware HA Strict Admission Control – Settings Can Override this behavior
126. Reservations Rock!
• Set the appropriate reservations to guarantee physical memory for the VM.
• In many cases, the configured size and reservation size could be the same
127. Oracle Approximate Memory Architecture
Set the memory reservation to SGA size plus OS.
(Reservation & configured memory might be the same.)
[Diagram: VM Configured Memory, containing: Client sessions and context; SGA (DB buffer cache, and others); Instance (PMON, SMON, DBWR, LGWR, CKPT, others); Operating System]
129. Large Pages/Huge Pages -- Broken Down at Hypervisor Level, Not Guest O/S
“Large/Huge PAGES Do Not Normally SWAP”
In the cases where host memory is overcommitted, ESX may have to swap out pages. Since ESX will not swap out large pages, during host swapping, a large page will be broken into small pages. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation of doing this is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.
http://kb.vmware.com/kb/1021095
130. Oracle – Hugepages
/etc/security/limits.conf to set soft and hard limits.
oracle soft nofile 131072
oracle hard nofile 131072
oracle soft nproc 131072
oracle hard nproc 131072
oracle soft core unlimited
oracle hard core unlimited
# -- The following entries need to be adjusted to match the HugePages settings
# oracle soft memlock 50000000
# oracle hard memlock 50000000
“HUGE PAGES Do Not Normally SWAP”
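The commented memlock entries depend on how many huge pages the SGA needs. A rough sizing sketch, assuming the common 2 MB (2048 KB) huge page size on x86-64 Linux and that memlock is expressed in KB; the function names are made up, and real sizing should follow Oracle's own guidance for the running instance:

```python
import math


def hugepages_needed(sga_gb: float, hugepage_kb: int = 2048) -> int:
    """Number of huge pages required to back an SGA of the given size."""
    return math.ceil(sga_gb * 1024 * 1024 / hugepage_kb)


def memlock_kb(sga_gb: float, hugepage_kb: int = 2048) -> int:
    """memlock limit (KB) large enough to lock all of those huge pages."""
    return hugepages_needed(sga_gb, hugepage_kb) * hugepage_kb


# Hypothetical 16 GB SGA.
print(hugepages_needed(16))  # 8192 pages
print(memlock_kb(16))        # 16777216 KB
```

The page count would go into the kernel's huge page setting and the memlock value into the two commented lines above.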
131. SQL Server In-Guest Memory Best Practices
§ Use large pages in the guest (start SQL Server w/ trace flag -T834)
133. Memory – Putting It ALL Together
• Do not overcommit memory for production, mission critical SQL Server VMs
• Set provisioned memory = reservation = SQL Server max server memory + OS memory + virtualization overhead
• Set provisioned memory = reservation = Oracle SGA + OS memory + virtualization overhead
• To avoid swapping, memory limit should never be set below the provisioned size. Setting memory limit is not recommended in general
• To avoid NUMA remote memory access, size VM memory equal to or less than the memory per NUMA node if possible
135. Jumbo Frames
• Jumbo frames are Ethernet frames with more than 1500 bytes of payload. Conventionally, jumbo frames can carry up to 9000 bytes of payload
136. Jumbo Frames
The original 1500-byte payload size for Ethernet frames was used because of the high error rates and low speed of communications.
“Why The Picture Of A Typewriter Here?”
138. Enable Jumbo Frames
Check to see
Will Succeed
ping -M do -s 8972 -c 2 rac01a-priv
ping -M do -s 8972 -c 2 rac01b-priv
ping -M do -s 8972 -c 2 rac02a-priv
ping -M do -s 8972 -c 2 rac02b-priv
PING rac01a (10.17.33.31) 8972(9000) bytes of data.
8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=1 ttl=64 time=0.017 ms
8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=2 ttl=64 time=0.018 ms
Will Fail
ping -M do -s 8973 -c 2 rac01a-priv
ping -M do -s 8973 -c 2 rac01b-priv
ping -M do -s 8973 -c 2 rac02a-priv
ping -M do -s 8973 -c 2 rac02b-priv
Make sure: switch support is enabled
9000 Bytes
- 20 Bytes IP Header
- 8 Bytes of ICMP Header
= 8972 Bytes of ping payload
“8192/64 = 128”
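The 8972 in the ping commands is the MTU minus the IPv4 and ICMP headers. A minimal sketch of that calculation, assuming an IPv4 header with no options (the constant and function names are made up for this example):

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ping -s value that fits in one unfragmented packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_ping_payload(9000))  # 8972: succeeds with -M do on a jumbo-frame path
print(max_ping_payload(1500))  # 1472: the limit on a standard MTU path
```

With `-M do` fragmentation is prohibited, so a payload one byte larger (8973) fails, which is what makes the test meaningful.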
139. SQL Server: Network
Network
§ Default packet size is 4,096
• If jumbo frames are available for the entire stack, set packet size to 8,192
§ Maximize Data Throughput for Network Applications
• Limit file system cache by OS
• NIC > File & Printer Sharing for Microsoft Networks
• Use Minimize Memory or Balance
http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql-server-checklist.aspx
140. Jumbo Frames
“Cost of Reducing To 1500 Bytes Then Back Again is Very Expensive”
Splitting Is Bad
141. Network – Putting All Together
• Separate SQL workloads with chatty network traffic (Microsoft AlwaysOn – Are you there) from the ones with chunky access onto different physical NICs
• With 10GbE do at VLAN level (4 GigE NICs = 4Gb total vs. 2 10GbE NICs = 20Gb total)
• Separate traffic for vMotion, service console, and SQL Server at physical NIC level
• 10GbE: Sufficient Bandwidth at Host, but separate by VLAN
• Have 4 NICs per host to ensure performance and redundancy of network (Virtualized Environment = Network Heavy)
• Using 4 10GbE NICs is overkill from a redundancy perspective. 2 10GbE NICs are usually enough
• vSphere 5.0 introduced the ability to use more than 1 NIC for vMotion. (More vMotions going at one time. Added specifically for memory intensive applications, i.e. Databases)
• Use VMXNET3 (VMware driver – reduces physical CPU utilization)
142. AlwaysOn Availability Group Cluster Settings
§ Depending on YOUR network, tuning may be necessary – work with Network Team and Microsoft to determine appropriate settings

Cluster Heartbeat Parameters    Default Value
CrossSubnetDelay                1000 ms
CrossSubnetThreshold            5 hb
SameSubnetDelay                 1000 ms
SameSubnetThreshold             5 hb

View: cluster /cluster:<clustername> /prop
Modify: cluster /cluster:<clustername> /prop <prop_name> = <value>
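The delay and threshold values combine into an approximate failure detection time: the delay is the interval between heartbeats and the threshold is how many may be missed. A small sketch of that relationship (the function name and the tuned values are illustrative, not recommendations):

```python
def failure_detection_ms(delay_ms: int, threshold_heartbeats: int) -> int:
    """Approximate time before a node missing heartbeats is considered down."""
    return delay_ms * threshold_heartbeats


# Defaults from the table above.
print(failure_detection_ms(1000, 5))   # 5000 ms

# Hypothetical relaxed settings for a slower network.
print(failure_detection_ms(2000, 10))  # 20000 ms
```

Raising either value makes the cluster more tolerant of brief network interruptions at the cost of slower failover, which is why the slide suggests agreeing on the settings with the network team.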
143. WSFC – Cluster Validation Wizard
§ Use this to validate support for your configuration
• Required by Microsoft Support as a condition of support for YOUR configuration
§ Run this before installing AAG (AlwaysOn Availability Group), and every time you make changes
• Save resulting html reports for reference
§ If running non-symmetrical storage, possible hotfixes required
• http://msdn.microsoft.com/en-us/library/ff878487(SQL.110).aspx#SystemReqsForAOAG
144. SQL Server Best Practice Analyzer
§ Use SQL Server Best Practice Analyzer to check local or remote systems
• If running against remote system, issue Enable-PSRemoting -f via PowerShell on the target system
• In the wizard, don’t click “connect to remote computer” on Home page
• On Enter Parameters link, enter SQL Server under Alternate_Server_to_Scan
• Select options
• Scan