COMMENTS:
-- Without my comments it may be challenging to follow the full content of the presentation.
-- For some of the most controversial slides (e.g. performance, dNFS setup) I put comments in the notes. Please check whether there is a comment from me.
-- I am deliberately using dNFS (lowercase 'd') in the slides to make it look similar to kNFS.
-- Most available documentation uses DNFS.
-- For the AUSOUG conference I will have 45 minutes only.
-- The AUSOUG conference is a relatively small event (it can't compare to OOW or most US events). Therefore there are participants of different levels. I should cover the basics and touch on some advanced topics at the same time. I can't go into as much detail as I would at OOW or Collaborate.
-- I would like to give more details on the performance improvement results; however, I wouldn't have time to cover the rest then.
-- Some slides have animation, and I would advise you to run through them to get an educated guess at what I am going to talk about on those slides.
<number>
<number>
NOTES:
-- This is true only if there are no other bottlenecks
-- We are talking about a 1.2 ms to 0.5 ms difference here
-- Most often systems are limited by HW (network equipment, storage speed), and therefore we will not see a 300% improvement
-- You may expect performance improvements somewhere between 0% and 300%
kNFS
Physical reads (per sec / per txn): 12,042.2 / 65,923.3
db file sequential read (waits / time s / avg ms / % DB time / class): 791,093 / 1,370 / 2 / 97.6 / User I/O
DB CPU (time s / % DB time): 53 / 3.8
1.731781219 ms avg PIO
6.6995916/10000 CPU sec per PIO
dNFS
Physical reads (per sec / per txn): 37,298.0 / 193,878.0
db file sequential read (waits / time s / avg ms / % DB time / class): 2,326,535 / 1,229 / 1 / 92.5 / User I/O
DB CPU (time s / % DB time): 312 / 23.5
0.52825339 ms avg PIO
13.410501/10000 CPU sec per PIO
Direct storage
Physical reads (per sec / per txn): 53,897.5 / 281,475.3
db file sequential read (waits / time s / avg ms / % DB time / class): 3,377,685 / 1,224 / 0 / 91.9 / User I/O
DB CPU (time s / % DB time): 229 / 17.2
0.362378375 ms avg PIO
6.7797915/10000 CPU sec per PIO
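A quick cross-check of the avg-PIO figures on this slide: total "db file sequential read" wait time divided by the number of waits gives the per-read latency. A small sketch with awk, using the AWR numbers above (the `avg_pio_ms` helper name is mine):

```shell
# avg_pio_ms WAITS TIME_S: average physical-read latency in ms,
# i.e. total wait time / number of waits * 1000.
avg_pio_ms() {
  awk -v w="$1" -v t="$2" 'BEGIN { printf "%.3f\n", t / w * 1000 }'
}

avg_pio_ms 791093  1370    # kNFS           -> 1.732
avg_pio_ms 2326535 1229    # dNFS           -> 0.528
avg_pio_ms 3377685 1224    # direct storage -> 0.362
```

The ratio 1.732/0.528 is where the roughly 3x (up to "300%") improvement claim comes from.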
Elapsed: 1.09 (mins)
DB Time(s) (per sec / per txn / per exec / per call): 21.4 / 116.9 / 0.36 / 4.94
DB CPU(s) (per sec / per txn / per exec / per call): 0.8 / 4.4 / 0.01 / 0.19
Logical reads (per sec / per txn): 12,140.1 / 66,458.8
Block changes (per sec / per txn): 41.8 / 228.8
Physical reads (per sec / per txn): 12,042.2 / 65,923.3
db file sequential read (waits / time s / avg ms / % DB time / class): 791,093 / 1,370 / 2 / 97.6 / User I/O
DB CPU (time s / % DB time): 53 / 3.8
awr_0w_22r.20121023_201639.txt
Tue Oct 23 20:16:40 EDT 2012
real 1m13.117s
user 0m0.576s
sys 0m1.281s
Elapsed: 1.04 (mins)
DB Time(s) (per sec / per txn / per exec / per call): 21.3 / 110.7 / 0.13 / 4.68
DB CPU(s) (per sec / per txn / per exec / per call): 5.0 / 26.0 / 0.03 / 1.10
Logical reads (per sec / per txn): 37,408.2 / 194,450.9
Block changes (per sec / per txn): 33.3 / 173.0
Physical reads (per sec / per txn): 37,298.0 / 193,878.0
db file sequential read (waits / time s / avg ms / % DB time / class): 2,326,535 / 1,229 / 1 / 92.5 / User I/O
DB CPU (time s / % DB time): 312 / 23.5
awr_0w_22r.20121023_203540.txt
Tue Oct 23 20:35:40 EDT 2012
Elapsed: 1.04 (mins)
DB Time(s) (per sec / per txn / per exec / per call): 21.3 / 111.0 / 0.09 / 4.69
DB CPU(s) (per sec / per txn / per exec / per call): 3.7 / 19.1 / 0.02 / 0.81
Logical reads (per sec / per txn): 54,761.2 / 285,985.7
Block changes (per sec / per txn): 40.5 / 211.4
Physical reads (per sec / per txn): 53,897.5 / 281,475.3
db file sequential read (waits / time s / avg ms / % DB time / class): 3,377,685 / 1,224 / 0 / 91.9 / User I/O
DB CPU (time s / % DB time): 229 / 17.2
awr_0w_22r.20121023_183221.txt
Tue Oct 23 18:32:21 EDT 2012
<number>
Picture – source URL: http://www.bannerblog.com.au/news/picts/thumbs_up.jpg
Grid DBA - my friend Leighton L. Nelson (@leight0nn) blogs about how he sped up Data Pump using dNFS:
"Direct NFS speeds up Data Pump"
http://blogs.griddba.com/2012/02/direct-nfs-speeds-up-data-pump.html
Any other references are welcome! Let me know about other good examples.
Why ASM in NAS ?
https://twitter.com/yvelikanov/status/260674761380749312
Yury @yvelikanov
Personally I don't see the point building ASM on (d)NFS. ASM suppose to exclude unnecessary layers. In NAS case it adds an additional layer.
Kevin Closson @kevinclosson
. @yvelikanov @netofrombrazil @leight0nn Simple, a) Standard Edition 1 and b) ASM striping between filers. "a" is mandatory.
Leighton L. Nelson @leight0nn
@yvelikanov @netofrombrazil easy storage migration?
Yury @yvelikanov
@leight0nn @netofrombrazil u can use Incr refreshable data files copies to migrate on a FS (short downtime). But I do agree. ASM no downtime
http://bit.ly/RTTkxn
Guenadi Jilevski @gjilevski
@netofrombrazil @yvelikanov @leight0nn Ex. ASM on additional NFS for quorum of the vote disk in extended RAC clusters.
https://twitter.com/simon_haslam/status/260892761102901248
Simon Haslam @simon_haslam
@netofrombrazil @leight0nn @yvelikanov So you're saying ASM on dNFS is good? Or just dNFS with anything (say, OMF instead of ASM)?
Yury @yvelikanov
@netofrombrazil @simon_haslam @leight0nn @NetApp The right question is: would you recommend naked NFS or ASM on top of NFS?
Yury @yvelikanov
@yvelikanov @netofrombrazil @simon_haslam @leight0nn On top of @NetApp of cause. Or it doesn't matter for you as far as #NetApp is in use
neto from Brazil @netofrombrazil
@yvelikanov @simon_haslam @leight0nn @netapp NFS. ASM on top of D or K NFS only in special cases IMHO
REF: https://plus.google.com/u/1/107075205411714880234/posts/G7EPReaJGvF
Direct NFS does not support Oracle Clusterware files. REF: http://docs.oracle.com/cd/E11882_01/install.112/e22489/storage.htm#CDEBGJA
NFS file system on a certified NAS filer => OCR and Voting Disk Files => Yes
kNFS is supported => dNFS isn't for OCRs
Good catch +Arup Nanda
<number>
Simplified - TCP is still involved in the kernel. THX to @fritshoogland from Twitter
=B=== Martin Bach @MartinDBA: dNFS mostly in USER mode (good for VM solutions)
Does it make sense to say that one of the main benefits is that with dNFS you stay (mostly?) in user mode, whereas kNFS requires transitions into the kernel (I want to trace it using perf, but haven't had time).
According to James Morle, dNFS is great for VMware, where all user-mode code is quick, but kernel transitions (requiring interrupts on x86) take longer since the hypervisor has to "trap" the instruction and translate it to be safe for multiple guests.
=E=== Martin Bach @MartinDBA
Additional comment from @fritshoogland: Personally, I would draw an nfsd square which gets the line from the I/O client, partly inside/partly outside of the kernel, and remove that with dNFS.
CAN kNFS work as well as dNFS?
https://twitter.com/yvelikanov/status/260893343150653440
Yury @yvelikanov
@netofrombrazil @kevinclosson Did I get it right? We can tune kNFS to work on dNFS speed if we invest a lot of NFS expert time?
Additional comment from @MartinDBA: The only really cool thing worth mentioning is that you have fewer transitions into kernel code, which is good for virtualized environments. I'm sure if you tested dNFS on a virtual machine it would clearly beat kNFS!
Additional comment from @kevinclosson: DNFS addresses circa-2004 NFS and bonding weaknesses specific to the Oracle I/O profile and OSs. Times change.
http://bit.ly/RTRLQ5
Kevin Closson @kevinclosson
.@netofrombrazil @leight0nn @yvelikanov Even though Solaris needs to die really badly, you might find Sol 11 x64 dNFS and kNFS show parity
neto from Brazil @netofrombrazil
@kevinclosson @leight0nn @yvelikanov :-) knfs well tuned works good :-)
neto from Brazil @netofrombrazil
@kevinclosson @leight0nn @yvelikanov OL works pretty well too. I've got 2GBytes per second with KNFS :-)
@kevinclosson @yvelikanov agreed you can tune well KNFS but DNFS you can have better optimization. But depends IO that you are generating
<number>
=B=== Martin Bach @MartinDBA
You could also use "ldd oracle" to see the libraries compiled in.
Additionally, pmap or /proc/<pid>/smaps in RHEL 6.x.
=E=== Martin Bach @MartinDBA
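Martin's check can be wrapped in a small helper; a minimal sketch (the `is_dnfs_linked` function is mine, not a standard tool, and the libnfsodm library name applies to 11gR2 on Linux):

```shell
# is_dnfs_linked: reads `ldd` output on stdin and reports whether the
# Direct NFS ODM library (libnfsodm*) is compiled into the oracle binary.
is_dnfs_linked() {
  if grep -q 'libnfsodm'; then
    echo "dNFS linked"
  else
    echo "dNFS not linked"
  fi
}

# Usage against a real install (assumes $ORACLE_HOME is set):
#   ldd "$ORACLE_HOME/bin/oracle" | is_dnfs_linked
# To toggle dNFS on 11gR2:
#   cd "$ORACLE_HOME/rdbms/lib" && make -f ins_rdbms.mk dnfs_on   # or dnfs_off
```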
<number>
This slide is intentionally overcrowded. I have 2 goals here:
Provide references and show that the setup is documented with a reasonable level of detail
Make a joke :)
<number>
dNFS = Speed = High Availability = Scalability (reduced CPU, as we are skipping the longer code path?)
http://docs.oracle.com/cd/E11882_01/install.112/e22489/storage.htm#CWLIN274
3.2.3.4 Specifying Network Paths with the Oranfstab File
Direct NFS can use up to four network paths defined in the oranfstab file for an NFS server. The Direct NFS client performs load balancing across all specified paths. If a specified path fails, then Direct NFS reissues I/O commands over any remaining paths.
Could bonding be a good alternative to dNFS?
https://twitter.com/gwenshap/statuses/260668780903026688
Gwen (Chen) Shapira @gwenshap
@yvelikanov you can bond 1Gb nics, but for each client-server pair, you will still only get 1Gb line. Tricky things :)
neto from Brazil @netofrombrazil
@yvelikanov @gwenshap @jantup 2 x 1Gbit max you can get 240MB/s either k or d [nfs]
https://twitter.com/gwenshap/statuses/260958482059112449
Gwen (Chen) Shapira @gwenshap
@netofrombrazil @kevinclosson @yvelikanov so this is one of the things you need to get right when configuring knfs, but dnfs handles for you
Gwen (Chen) Shapira @gwenshap
@netofrombrazil @kevinclosson @yvelikanov how do you get bonded 2Gb with knfs? Linux uses 802.3ad bonding. One link per conversation = 1Gb.
Details
neto from Brazil @netofrombrazil
@gwenshap @kevinclosson @yvelikanov LACP mode 4
neto from Brazil @netofrombrazil
@gwenshap @kevinclosson @yvelikanov and other parameters like backlog, sun rpc etc...
https://twitter.com/kevinclosson/status/260959183900405760
Kevin Closson @kevinclosson
. @gwenshap @netofrombrazil @yvelikanov exactly also dnfs is a combined agg+failover. Simple wire. All my writings from 2006 are sexy now?
Kevin Closson @kevinclosson
. @yvelikanov @gwenshap good lord, gus, read the paper .. I didn't waste ink: http://www.oracle.com/technetwork/articles/directnfsclient-11gr1-twp-129785.pdf … who wrote that?
@netofrombrazil @gwenshap @jantup My questions to you is: Can !!! 1 !!! session get 240MB/s out of 2 x 1Gbit ?
Kevin Closson @kevinclosson
@netofrombrazil @yvelikanov @gwenshap @jantup depends on your database host (nfs client) kernel https://www.google.com/search?num=100&hl=en&site=&source=hp&q=closson+%2Bdirect+NFS&oq=closson+%2Bdirect+NFS&gs_l=hp.3...2255.7134.0.7296.19.18.0.0.0.0.419.2714.4j5j2j3j1.15.0.les%3Bcesh..0.0...1.1.mDyX7Tu3XpI …
Kevin Closson @kevinclosson
@gwenshap @jantup @yvelikanov depends on the form of bonding. dNFS is really better for aggregating NICs.
neto from Brazil @netofrombrazil
@yvelikanov @gwenshap @jantup one tcp session or 1 thread?
Yury @yvelikanov
@netofrombrazil @gwenshap @jantup 1 database session (unix process, foreground process, full table scan)
neto from Brazil @netofrombrazil
@yvelikanov @gwenshap @jantup of course you can :-)
neto from Brazil @netofrombrazil
@netofrombrazil @yvelikanov @gwenshap @jantup one full table scan works if you have the right conf for db multi block read
Kevin Closson @kevinclosson
@yvelikanov @netofrombrazil @gwenshap @jantup a single foregound on modern CPU would have not trouble saturating 2x1GbE (240MB/s)
Kevin Closson @kevinclosson
. @netofrombrazil @yvelikanov @gwenshap @jantup No surprise. But, hold it, Manly Man doesn't use NFS for Oracle https://www.google.com/search?num=100&hl=en&biw=1507&bih=707&q=manly+man+NFS&oq=manly+man+NFS&gs_l=serp.3..0l10.12633532.12640635.0.12640847.13.13.0.0.0.0.453.2518.2j1j5j0j2.10.0.les%3Bcesh..0.0...1.1.6bgcFszDVok …
neto from Brazil @netofrombrazil
@netofrombrazil @kevinclosson @yvelikanov @gwenshap @jantup for TCP the rule is 1 Hz to process 1 bit - 1GHz process 1Gbit got it?
https://twitter.com/Djelibeybi/status/260669539027648512
Djelibeybi @Djelibeybi
@yvelikanov you can get more bandwidth with a LACP (802.3ad) bond of two or more NICs. Not a linear scale, though and needs switch support.
Leighton L. Nelson @leight0nn
@Djelibeybi @yvelikanov Read somewhere multiple paths on diff subnet recommended over LACP vif. Could be wrong.
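For reference, the "LACP mode 4" mentioned in the thread is the Linux 802.3ad bonding mode. A sketch of the module options (illustrative only; file locations and interface setup vary by distro, and the switch side must be configured for LACP too):

# /etc/modprobe.d/bonding.conf (illustrative)
# mode=4: 802.3ad/LACP; xmit_hash_policy=layer3+4 hashes on IP+port,
# but a single TCP conversation still maps to one slave link --
# which is why one kNFS mount tops out at one NIC's bandwidth,
# while dNFS opens multiple sessions across its configured paths.
options bonding mode=4 miimon=100 lacp_rate=1 xmit_hash_policy=layer3+4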
<number>
Local: the IP on the DB server you want connections to go through
Path: the IP on the filer where you want connections to end up
Export, Mount - the pair that finalizes each block of information
You can have more than one (up to 4) Local/Path parameter pairs specified
Use Dontroute if you are about to use several pairs of IPs from the same network
Mnt_timeout is in seconds and defaults to 10 minutes. That seems a bit too high to me; I would set it to 1 minute (disclaimer: I don't have much experience with failing NFS)
You can specify these parameters per mount
I suspect that there is a limit on the number of IP connections each session can have. As of now I haven't hit this limit, but I also haven't used too many connections per session (4 max as of now). If you have experience, please let me know (@yvelikanov)
Special thanks to @pioro for comments on UEK
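Putting the parameters above together, a sketch of an oranfstab (the server name, IPs, and export/mount paths are made-up illustrative values; the annotations after # are mine and would be stripped in a real file):

# $ORACLE_HOME/dbs/oranfstab or /etc/oranfstab (illustrative values)
server: myfiler              # NFS server (filer) name
local: 192.168.10.1          # DB-server-side IP, pair 1
path: 192.168.10.101         # filer-side IP, pair 1
local: 192.168.10.2          # pair 2 (up to 4 local/path pairs)
path: 192.168.10.102
dontroute                    # both pairs are on the same network
mnt_timeout: 60              # seconds (default 600 = 10 minutes)
export: /vol/oradata mount: /u02/oradata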
<number>
<number>
<number>
<number>
<number>
<number>
<number>
Remove limit on the number of diskmon slaves [bug: 9842238]
ORACLE DATABASE WILL NOT OPEN [bug: 14383403]
LRGIONFS RUN IS FAILING ON WIN2K8 R2 [bug: 13689216]
DATABASE STARTUP AND QUERY TAKING HUGE TIME WHEN DNFS IS ENABLED [bug: 13510654]
LGWR hangs for long periods using DNFS - CF waits likely [bug: 9556189]
<number>