1 – Overview
2 – Failure
3 – Second Failure
4 – Usable Space
5 – ASMCMD "lsdg" Output
Emre Baransel – Advanced Support Engineer, Employee ACE – Oracle
A Deep Dive into ASM Redundancy in Exadata

1 – Overview

Storage Servers Notation
The examples consider three storage servers: Storage Server 1, Storage Server 2 and Storage Server 3.

Disks on Storage Servers
[Diagram: the disks on Storage Server 1, Storage Server 2 and Storage Server 3, numbered 1 to 12.]

Physical Disks
[Diagram: Storage Server 1, Storage Server 2 and Storage Server 3, with a physical disk labeled.]

Logical Partitions/Diskgroups
[Diagram: the disks of Storage Server 1, Storage Server 2 and Storage Server 3 divided into system partitions, DBFS DG, RECO DG and DATA DG.]

Grid Disks (Partitions)
[Diagram: the grid (ASM) disks on Storage Server 1, Storage Server 2 and Storage Server 3, belonging to DBFS DG, RECO DG and DATA DG, alongside the system partitions.]

Disks Usage Notation
[Diagram: the notation used for the system partitions, DBFS DG, RECO DG and DATA DG on Storage Server 1, Storage Server 2 and Storage Server 3.]

Normal Redundancy Diskgroups
[Diagram: a normal redundancy diskgroup spread across FAILGROUP 1, FAILGROUP 2 and FAILGROUP 3.]

High Redundancy Diskgroups
[Diagram: a high redundancy diskgroup spread across FAILGROUP 1, FAILGROUP 2 and FAILGROUP 3.]

2 – Failure

Types of Failures
- Disk Failure
  - transient disk failure
  - physical disk failure
- Storage Server Failure
This presentation examines failures in groups for the sake of clarity. There may be exceptional cases.

Transient Disk Failures
1. A disk suffers a transient failure and goes offline.
2. If the failure is corrected, or the disk is replaced, before DISK_REPAIR_TIME expires, the disk is resynced with ASM Fast Mirror Resync.
3. If DISK_REPAIR_TIME expires first, ASM drops the disk and rebalances the data, if there is enough space.

DISK_REPAIR_TIME Attribute
• DISK_REPAIR_TIME is a diskgroup attribute.
• The default is 3.6 hours.
• Example: alter diskgroup data set attribute 'disk_repair_time' = '4.5h';
• Altering the DISK_REPAIR_TIME attribute has no effect on disks that are already offline.
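
The drop decision for a transient failure can be sketched in a few lines of Python. This is only an illustration, not ASM code. The helper names `parse_repair_time` and `dropped_after` are hypothetical, and the rule that a value without a unit is read as hours is an assumption taken from the ASM documentation rather than from these slides.

```python
from decimal import Decimal

def parse_repair_time(value: str) -> Decimal:
    """Return a DISK_REPAIR_TIME value such as '3.6h' or '270m' in minutes.

    Assumption: a bare number without a unit is interpreted as hours.
    """
    text = value.strip().lower()
    if text.endswith("m"):
        return Decimal(text[:-1])
    if text.endswith("h"):
        text = text[:-1]
    return Decimal(text) * 60

def dropped_after(offline_minutes: float, repair_time: str) -> bool:
    """True if a disk that has been offline this long is past the repair window."""
    return Decimal(str(offline_minutes)) > parse_repair_time(repair_time)

print(parse_repair_time("4.5h"))   # the 4.5h setting expressed in minutes
print(dropped_after(200, "3.6h"))  # outage shorter than the default window
print(dropped_after(230, "3.6h"))  # outage longer than the default window
```

The sketch models a disk that went offline while the given setting was already in force; as noted above, changing the attribute later does not affect disks that are already offline.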

Physical Disk Failures
1. A physical disk fails.
2. ASM does not wait for DISK_REPAIR_TIME: it drops the disk and rebalances the data, if there is enough space (Pro-Active Disk Quarantine - 11.2.1.3.1).
3. When the disk is replaced, the grid disks are created and a rebalance starts automatically.

Auto Disk Management
The auto disk management feature in Exadata involves two processes:
• Exadata Automation Manager (XDMG) initiates automation tasks and monitors all configured storage cells for state changes.
• Exadata Automation Worker (XDWK) performs the automation tasks requested by XDMG.
Parameters:
• _AUTO_MANAGE_EXADATA_DISKS controls the auto disk management feature. Set it to FALSE to disable the feature. Values: TRUE (default) or FALSE.
• _AUTO_MANAGE_NUM_TRIES controls the maximum number of attempts to perform an automatic operation. Range: 1-10. Default: 2.
• _AUTO_MANAGE_MAX_ONLINE_TRIES controls the maximum number of attempts to ONLINE a disk. Range: 1-10. Default: 3.
Reference: NOTE:1484274.1 - Auto disk management feature in Exadata

Storage Server Failures
• When a storage server fails, the whole failgroup fails in ASM.
• ASM does not drop the disks before DISK_REPAIR_TIME expires.
• The same applies when a storage server is rebooted.
Outcomes:
1. If the server is back before DISK_REPAIR_TIME expires, the disks are resynced and there is no rebalance.
2. If DISK_REPAIR_TIME expires, ASM rebalances the data, if there is enough space.
3. When the storage server then comes back, there is a second rebalance.

3 – Second Failure

Second Failure / Bad Chance
In normal redundancy, what happens at a second failure depends first on when it occurs:
- If it occurs after the rebalance/resync has completed, the procedure is the same as for the first failure.
- If it occurs before the rebalance/resync has completed, the outcome depends on which disk failed:
  - If the first and second failed disks are not partner disks, a new rebalance takes place, if there is enough space.
  - If the first and second failed disks are partner disks, data loss occurs.
• This is unlikely but needs consideration.
• Partner disks are on different storage servers (failgroups).
• The first incident does not have to be a failure; a storage server reboot has the same effect.
Reference: Exadata Database Machine : How to identify cell failgroups and Partner disks for a grid disk (Doc ID 1431697.1)
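
The decision tree for a second failure in normal redundancy can be written out as a small Python function. This is a sketch for reasoning about the cases only: the `Outcome` enum and the `second_failure_outcome` function are hypothetical names, and the two boolean inputs are assumed to be known to the caller (partner relationships can be looked up as described in Doc ID 1431697.1).

```python
from enum import Enum

class Outcome(Enum):
    SAME_AS_FIRST_FAILURE = "handled like a first failure"
    NEW_REBALANCE = "new rebalance, if there is enough space"
    DATA_LOSS = "data loss"

def second_failure_outcome(first_recovery_complete: bool,
                           disks_are_partners: bool) -> Outcome:
    """Classify a second failure in a normal redundancy diskgroup."""
    if first_recovery_complete:
        # The rebalance/resync from the first incident finished, so the
        # diskgroup is fully mirrored again.
        return Outcome.SAME_AS_FIRST_FAILURE
    if disks_are_partners:
        # Both copies of some extents are unavailable at the same time.
        return Outcome.DATA_LOSS
    return Outcome.NEW_REBALANCE

print(second_failure_outcome(first_recovery_complete=False,
                             disks_are_partners=True).value)
```

The first incident in this model can be a failure or a storage server reboot; only whether its rebalance/resync has completed matters.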

Second Failure / Bad Chance
In high redundancy, there are three copies of each extent, so a second failure does not cause data loss.

A New Feature
"MOUNT RESTRICTED FORCE FOR RECOVERY" feature
Available in:
• >= 11.2.0.4 BP16
• >= 12.1.0.2 BP4
Applicable to NORMAL redundancy diskgroups only.
Potential use cases for this procedure:
1. Exadata cell rolling upgrade/patching and a partner disk failure at the same time.
2. A transient disk failure in a cell followed by a permanent partner disk failure before the first failed disk comes back online.
Reference: NOTE:1968642.1 - Recover from diskgroup failure using the 12.1.0.2 "mount restricted force for recovery" feature - An Example

Example Procedure
"MOUNT RESTRICTED FORCE FOR RECOVERY" example:
1. On cell 1: CellCLI> alter cell shutdown services all;
2. On cell 2: CellCLI> alter physicaldisk <disk> simulate failureType=failed;
   The database crashes.
3. SQL> alter diskgroup datac1 mount restricted force for recovery;
4. CellCLI> alter cell startup services all;
5. SQL> alter diskgroup datac1 online disks in failgroup CELLFG1;
6. Wait until the MODE_STATUS column in v$asm_disk changes from SYNCING to ONLINE for the disks being onlined.
   Do NOT execute the subsequent steps while MODE_STATUS shows SYNCING; doing so leads to data corruption.
   During the resync, because of the second disk failure, ARB0 cannot read some of the required extents (those on the failed second disk) and fills the missing extents with BADFDA7A. The ARB0 trace file shows:
   WARNING: group 1, file 258, extent 100: filling extent with BADFDA7A during recovery
7. SQL> alter diskgroup datac1 dismount;
   SQL> alter diskgroup datac1 mount;
8. Start the database and perform RMAN block media recovery.

4 – Usable Space

What kind of Usable Space?
For an Exadata ASM diskgroup, the following sizes can be distinguished (the redundancy factor is 2 for normal redundancy and 3 for high redundancy):
Total Raw Size (TRS)
Used Raw Size (URS)
Free Raw Size (FRS)
Total Allocatable Size (TAS) = TRS / Redundancy Factor
Used Allocatable Size (UAS) = URS / Redundancy Factor
Free Allocatable Size (FAS) = FRS / Redundancy Factor
Size Needed for Disk Failure Coverage (SNDFC) = size of the largest disk (or of 2 disks for high redundancy)
Size Needed for Cell Failure Coverage (SNCFC) = size of the largest cell (or of 2 cells for high redundancy)
Total Disk Failure Safe Allocatable Size = (TRS - SNDFC) / Redundancy Factor
Total Cell Failure Safe Allocatable Size = (TRS - SNCFC) / Redundancy Factor
Free Disk Failure Safe Allocatable Size = (FRS - SNDFC) / Redundancy Factor
Free Cell Failure Safe Allocatable Size = (FRS - SNCFC) / Redundancy Factor

Calculations for Normal Redundancy
                                                 Normal Redundancy
Total Raw Size (TRS)                             360
Used Raw Size (URS)                              120
Free Raw Size (FRS)                              240
Total Allocatable Size (TAS)                     TRS / 2 = 180
Used Allocatable Size (UAS)                      URS / 2 = 60
Free Allocatable Size (FAS)                      FRS / 2 = 120
Size Needed for Disk Failure Coverage (SNDFC)    10
Size Needed for Cell Failure Coverage (SNCFC)    120
Total Disk Failure Safe Allocatable Size         (TRS - SNDFC) / 2 = 175
Total Cell Failure Safe Allocatable Size         (TRS - SNCFC) / 2 = 120
Free Disk Failure Safe Allocatable Size          (FRS - SNDFC) / 2 = 115
Free Cell Failure Safe Allocatable Size          (FRS - SNCFC) / 2 = 60
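
The same arithmetic can be reproduced with a short Python sketch. The class `DiskgroupSpace` and its field names are hypothetical, chosen only for this illustration. The inputs are the figures used on this slide (360 raw, 120 used, largest disk 10, largest cell 120), and the coverage reserve is assumed to scale as "one disk or cell for normal redundancy, two for high redundancy".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiskgroupSpace:
    total_raw: float        # TRS
    used_raw: float         # URS
    largest_disk: float     # size of the largest disk
    largest_cell: float     # size of the largest cell (failgroup)
    redundancy_factor: int  # 2 = normal redundancy, 3 = high redundancy

    @property
    def free_raw(self) -> float:       # FRS
        return self.total_raw - self.used_raw

    @property
    def disk_coverage(self) -> float:  # SNDFC: 1 disk (normal) or 2 disks (high)
        return self.largest_disk * (self.redundancy_factor - 1)

    @property
    def cell_coverage(self) -> float:  # SNCFC: 1 cell (normal) or 2 cells (high)
        return self.largest_cell * (self.redundancy_factor - 1)

    def allocatable(self, raw: float, reserve: float = 0.0) -> float:
        """Raw space minus a failure-coverage reserve, divided by the redundancy factor."""
        return (raw - reserve) / self.redundancy_factor

normal = DiskgroupSpace(360, 120, largest_disk=10, largest_cell=120, redundancy_factor=2)
print("TAS:", normal.allocatable(normal.total_raw))
print("Total disk-failure safe:", normal.allocatable(normal.total_raw, normal.disk_coverage))
print("Total cell-failure safe:", normal.allocatable(normal.total_raw, normal.cell_coverage))
print("Free disk-failure safe:", normal.allocatable(normal.free_raw, normal.disk_coverage))
print("Free cell-failure safe:", normal.allocatable(normal.free_raw, normal.cell_coverage))
```

Passing `redundancy_factor=3` with the same inputs gives the corresponding high redundancy figures.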

Calculations for High Redundancy
                                                 Normal Redundancy         High Redundancy
Total Raw Size (TRS)                             360                       360
Used Raw Size (URS)                              120                       120
Free Raw Size (FRS)                              240                       240
Total Allocatable Size (TAS)                     TRS / 2 = 180             TRS / 3 = 120
Used Allocatable Size (UAS)                      URS / 2 = 60              URS / 3 = 40
Free Allocatable Size (FAS)                      FRS / 2 = 120             FRS / 3 = 80
Size Needed for Disk Failure Coverage (SNDFC)    10                        20
Size Needed for Cell Failure Coverage (SNCFC)    120                       240
Total Disk Failure Safe Allocatable Size         (TRS - SNDFC) / 2 = 175   (TRS - SNDFC) / 3 = 113.3
Total Cell Failure Safe Allocatable Size         (TRS - SNCFC) / 2 = 120   N/A for Quarter Rack
Free Disk Failure Safe Allocatable Size          (FRS - SNDFC) / 2 = 115   (FRS - SNDFC) / 3 = 73.3
Free Cell Failure Safe Allocatable Size          (FRS - SNCFC) / 2 = 60    N/A for Quarter Rack

5 – ASMCMD "lsdg" Output

What we have in ASMCMD
ASMCMD> lsdg
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   4194304  27942912  16708892  9314304          3697294         0              N             DATAC1/
MOUNTED  NORMAL  N      512     4096   4194304  1038240   1036984   346080           345452          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  11973312  7966060   3991104          1987478         0              N             RECOC1/

Total_MB         = Total Raw Size (TRS)
Free_MB          = Free Raw Size (FRS)
Req_mir_free_MB  = Size Needed for Disk Failure Coverage (SNDFC) in versions ≥ 11.2.0.4.9 and ≥ 12.1.0.2
                 = Size Needed for Cell Failure Coverage (SNCFC) in versions < 11.2.0.4.9 and < 12.1.0.2
Usable_file_MB   = Free Disk Failure Safe Allocatable Size in versions ≥ 11.2.0.4.9 and ≥ 12.1.0.2
                 = Free Cell Failure Safe Allocatable Size in versions < 11.2.0.4.9 and < 12.1.0.2
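
The relationship between these columns can be checked against the sample output with a short Python sketch. It assumes whitespace-separated columns exactly as in the listing above. The names `parse_lsdg` and `expected_usable_mb` are hypothetical helpers, not part of ASMCMD, and the redundancy factor of 1 for EXTERN diskgroups is an assumption not covered in this presentation.

```python
LSDG_OUTPUT = """\
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   4194304  27942912  16708892  9314304          3697294         0              N             DATAC1/
MOUNTED  NORMAL  N      512     4096   4194304  1038240   1036984   346080           345452          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  11973312  7966060   3991104          1987478         0              N             RECOC1/
"""

# Redundancy factor per diskgroup type (EXTERN = 1 is an assumption).
REDUNDANCY_FACTOR = {"EXTERN": 1, "NORMAL": 2, "HIGH": 3}

def parse_lsdg(text: str) -> list[dict[str, str]]:
    """Turn lsdg output into one dict per diskgroup, keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]

def expected_usable_mb(row: dict[str, str]) -> int:
    """(Free_MB - Req_mir_free_MB) / redundancy factor, as in the mapping above."""
    factor = REDUNDANCY_FACTOR[row["Type"]]
    return (int(row["Free_MB"]) - int(row["Req_mir_free_MB"])) // factor

for row in parse_lsdg(LSDG_OUTPUT):
    print(row["Name"], expected_usable_mb(row), row["Usable_file_MB"])
```

In this sample, Req_mir_free_MB is exactly one third of Total_MB for each diskgroup, which is consistent with the cell failure interpretation on a system with three storage servers.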

References
Oracle Exadata Database Machine Maintenance Guide
Automatic Storage Management Administrator's Guide
NOTE:1484274.1 - Auto disk management feature in Exadata
NOTE:443835.1 - ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk
NOTE:1431697.1 - Exadata Database Machine : How to identify cell failgroups and Partner disks for a grid disk
NOTE:1968642.1 - Recover from diskgroup failure using the 12.1.0.2 “mount restricted force for recovery” feature - An Example
NOTE:1386147.1 - How to Replace a Hard Drive in an Exadata Storage Server (Hard Failure)
NOTE:1339373.1 - Operational Steps for Recovery after Losing a Disk Group in an Exadata Environment
NOTE:1551288.1 - Understanding ASM Capacity and Reservation of Free Space in Exadata
NOTE:1319567.1 - ASM Usable Space Calculations in Exadata Environment along with cell failure considerations

A Deep Dive into ASM Redundancy in Exadata

  • 1. 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" Output Emre Baransel – Advanced Support Engineer, Employee ACE- Oracle A Deep Dive into ASM Redundancy in Exadata
  • 2. A Deep Dive into ASM Redundancy in Exadata 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" Output Storage Server 1 Storage Server 2 Storage Server 3 We’ll consider 3 storage servers in examples Storage Servers Notation
  • 3. A Deep Dive into ASM Redundancy in Exadata 12 1 2 3 4 5 6 7 8 9 10 11 Storage Server 1 Storage Server 2 Storage Server 3 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputDisks on Storage Servers
  • 4. A Deep Dive into ASM Redundancy in Exadata Storage Server 1 Storage Server 2 Storage Server 3 PHYSICAL DISC 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputPhysical Disks
  • 5. A Deep Dive into ASM Redundancy in Exadata Storage Server 1 Storage Server 2 Storage Server 3 SYSTEM PARTITIONS DBFS DG RECO DG DATA DG 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputLogical Partitions/Diskgroups
  • 6. A Deep Dive into ASM Redundancy in Exadata Storage Server 1 Storage Server 2 Storage Server 3 RECO DG DATA DG GRID/ASM DISCS 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputGrid Disks (Partitions) SYSTEM PARTITIONS DBFS DG
  • 7. A Deep Dive into ASM Redundancy in Exadata Storage Server 1 Storage Server 2 Storage Server 3 RECO DG DATA DG 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputDisks Usage Notation SYSTEM PARTITIONS DBFS DG
  • 8. A Deep Dive into ASM Redundancy in Exadata FAILGROUP 1 FAILGROUP 2 FAILGROUP 3 NORMAL REDUNDANCY 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputNormal Redundancy Diskgroups
  • 9. A Deep Dive into ASM Redundancy in Exadata HIGH REDUNDANCY FAILGROUP 1 FAILGROUP 2 FAILGROUP 3 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputHigh Redundancy Diskgroups
  • 10. A Deep Dive into ASM Redundancy in Exadata - Disk Failure - transient disk failure - physical disk failure - Storage Server Failure 1 – Overview 2 – Failure 3 – Second Failure 4 – Usable Space 5 – ASMCMD "lsdg" OutputTypes of Failures This presentation examines failures in groups, in order to provide clarity. There may be exceptional cases.
  • 11. Transient Disk Failures. [Diagram: Storage Servers 1, 2, and 3, with regions labeled SYSTEM PARTITIONS, DBFS DG, RECO DG, and DATA DG; a disk is marked TRANSIENT FAILURE (OFFLINE).]
  • 12. Transient Disk Failures. The failure is corrected, or the disk is replaced, before DISK_REPAIR_TIME expires. [Diagram: Storage Servers 1, 2, and 3; label: FAILURE CORRECTED or NEW DISK.]
  • 13. Transient Disk Failures. The disk is resynced with ASM Fast Mirror Resync. [Diagram: Storage Servers 1, 2, and 3.]
  • 14. Transient Disk Failures. If DISK_REPAIR_TIME expires, ASM drops the disks and rebalances the data, provided there is enough space. [Diagram: Storage Servers 1, 2, and 3.]
  • 15. DISK_REPAIR_TIME Attribute
      - DISK_REPAIR_TIME is a diskgroup attribute.
      - The default is 3.6 hours.
      - Example: alter diskgroup data set attribute 'disk_repair_time' = '4.5h';
      - Altering the DISK_REPAIR_TIME attribute has no effect on disks that are already offline.
  • 16. Physical Disk Failures. [Diagram: Storage Servers 1, 2, and 3, with regions labeled SYSTEM PARTITIONS, DBFS DG, RECO DG, and DATA DG; a disk is marked PHYSICAL DISK FAILURE.]
  • 17. Physical Disk Failures. ASM does not wait for DISK_REPAIR_TIME; it drops the disk and rebalances the data, provided there is enough space (Pro-Active Disk Quarantine - 11.2.1.3.1). [Diagram: Storage Servers 1, 2, and 3.]
  • 18. Physical Disk Failures. When the disk is replaced, the grid disks are created and a second rebalance starts automatically. [Diagram: Storage Servers 1, 2, and 3.]
  • 19. Auto Disk Management. The AUTO DISK MANAGEMENT feature in Exadata:
      - Exadata Automation Manager (XDMG) initiates automation tasks and monitors all configured storage cells for state changes.
      - Exadata Automation Worker (XDWK) performs the automation tasks requested by XDMG.
      - _AUTO_MANAGE_EXADATA_DISKS controls the auto disk management feature. Range of values: TRUE (default) or FALSE. Set it to FALSE to disable the feature.
      - _AUTO_MANAGE_NUM_TRIES controls the maximum number of attempts to perform an automatic operation. Range of values: 1-10. Default value: 2.
      - _AUTO_MANAGE_MAX_ONLINE_TRIES controls the maximum number of attempts to ONLINE a disk. Range of values: 1-10. Default value: 3.
      Reference: NOTE:1484274.1 - Auto disk management feature in Exadata
  • 20. Storage Server Failures. [Diagram: Storage Servers 1, 2, and 3; one storage server is marked FAILED.]
  • 21. Storage Server Failures
      - When a storage server fails, the whole failgroup fails in ASM.
      - ASM does not drop the disks before DISK_REPAIR_TIME expires.
      - The same applies when the storage server is rebooted.
  • 22. Storage Server Failures. If the server is back up before DISK_REPAIR_TIME expires, the disks are resynced; no rebalance takes place. [Diagram: Storage Servers 1, 2, and 3.]
  • 23. Storage Server Failures. If DISK_REPAIR_TIME expires, ASM rebalances the data, provided there is enough space. [Diagram: Storage Servers 1, 2, and 3; one storage server is marked FAILED.]
  • 24. Storage Server Failures. When the storage server comes back, there is a second rebalance. [Diagram: Storage Servers 1, 2, and 3.]
  • 25. Second Failure / Bad Chance. In normal redundancy, the effect of a second failure depends first on when it occurs:
      - If it occurs after the rebalance/resync has completed, it is handled the same way as the first failure.
      - If it occurs before the rebalance/resync has completed, the outcome depends on which disk failed:
          - If the first and second failed disks are not partner disks, a new rebalance takes place, provided there is enough space.
          - If the first and second failed disks are partner disks, data loss occurs.
      Notes:
      - The probability is small, but it needs consideration.
      - Partner disks are on different storage servers (failgroups).
      - The first incident does not have to be a failure; a storage server reboot has the same effect.
      Reference: Exadata Database Machine: How to identify cell failgroups and Partner disks for a grid disk (Doc ID 1431697.1)
  • 26. Second Failure / Bad Chance. In high redundancy, there are three copies of each extent, so a second failure never causes data loss.
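The second-failure rules on the two slides above can be read as a small decision table. The following is a minimal sketch of that reading, not an Oracle API: the function name, its parameters, and the returned strings are all hypothetical, and the "not enough space" branch is an assumption, since the slides do not describe that case.

```python
def second_failure_outcome(redundancy, first_recovery_done, partner_disks, enough_space):
    """Classify a second failure according to the rules stated on the slides.

    redundancy          -- "NORMAL" or "HIGH"
    first_recovery_done -- True if the rebalance/resync from the first failure completed
    partner_disks       -- True if the two failed disks are partner disks
    enough_space        -- True if the diskgroup has room for another rebalance
    """
    if redundancy == "HIGH":
        # Three copies of each extent: the slides state no data loss on a second failure.
        return "no data loss"
    if redundancy != "NORMAL":
        raise ValueError(f"unsupported redundancy: {redundancy!r}")
    if first_recovery_done:
        return "handled like a first failure"
    if partner_disks:
        return "data loss"
    if enough_space:
        return "new rebalance"
    # Assumption: the slides do not say what happens without enough space.
    return "rebalance not possible: insufficient space"


if __name__ == "__main__":
    print(second_failure_outcome("NORMAL", False, True, True))
    print(second_failure_outcome("NORMAL", False, False, True))
    print(second_failure_outcome("HIGH", False, True, False))
```

The ordering of the checks mirrors the slides: timing is tested first, then the partner relationship, then free space.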
  • 27. A New Feature. The "MOUNT RESTRICTED FORCE FOR RECOVERY" feature:
      - Available in ≥ 11.2.0.4 BP16 and ≥ 12.1.0.2 BP4.
      - Applicable to NORMAL redundancy diskgroups only.
      Potential use cases for this procedure:
      1. An Exadata cell rolling upgrade/patching and a partner disk failure at the same time.
      2. A transient disk failure in a cell, followed by a permanent partner disk failure before the first failed disk comes back online.
      Reference: NOTE:1968642.1 - Recover from diskgroup failure using the 12.1.0.2 "mount restricted force for recovery" feature - An Example
  • 28. Example Procedure. "MOUNT RESTRICTED FORCE FOR RECOVERY" example:
      o Cell 1: CellCLI> alter cell shutdown services all;
      o Cell 2: CellCLI> alter physicaldisk <disk> simulate failureType=failed; (the database crashes)
      o SQL> alter diskgroup datac1 mount restricted force for recovery;
      o CellCLI> alter cell startup services all;
      o SQL> alter diskgroup datac1 online disks in failgroup CELLFG1;
      o Wait until the MODE_STATUS column in v$asm_disk changes from SYNCING to ONLINE for the disks being onlined.
      o Do NOT execute the subsequent steps while the MODE_STATUS column shows SYNCING. Doing so will lead to data corruption.
      o During the resync, because of the second disk failure, ARB0 cannot read some of the required extents (those on the failed second disk), so it marks the missing extents with BADFDA7A. (arb0 trace file: WARNING: group 1, file 258, extent 100: filling extent with BADFDA7A during recovery)
      o SQL> alter diskgroup datac1 dismount;
      o SQL> alter diskgroup datac1 mount;
      o Start the database and perform RMAN block media recovery.
  • 29. What kind of Usable Space? For an Exadata ASM diskgroup, the following sizes can be distinguished:
      - Total Raw Size (TRS)
      - Used Raw Size (URS)
      - Free Raw Size (FRS)
      - Total Allocatable Size (TAS) = TRS / Redundancy Factor
      - Used Allocatable Size (UAS) = URS / Redundancy Factor
      - Free Allocatable Size (FAS) = FRS / Redundancy Factor
      - Size Needed for Disk Failure Coverage (SNDFC) = largest disk (or 2 disks for high redundancy)
      - Size Needed for Cell Failure Coverage (SNCFC) = largest cell (or 2 cells for high redundancy)
      - Total Disk Failure Safe Allocatable Size = (TRS - SNDFC) / Redundancy Factor
      - Total Cell Failure Safe Allocatable Size = (TRS - SNCFC) / Redundancy Factor
      - Free Disk Failure Safe Allocatable Size = (FRS - SNDFC) / Redundancy Factor
      - Free Cell Failure Safe Allocatable Size = (FRS - SNCFC) / Redundancy Factor
  • 30. Calculations for Normal Redundancy

      Quantity                                        Normal Redundancy
      Total Raw Size (TRS)                            360
      Used Raw Size (URS)                             120
      Free Raw Size (FRS)                             240
      Total Allocatable Size (TAS)                    TRS / 2 = 180
      Used Allocatable Size (UAS)                     URS / 2 = 60
      Free Allocatable Size (FAS)                     FRS / 2 = 120
      Size Needed for Disk Failure Coverage (SNDFC)   10
      Size Needed for Cell Failure Coverage (SNCFC)   120
      Total Disk Failure Safe Allocatable Size        (TRS - SNDFC) / 2 = 175
      Total Cell Failure Safe Allocatable Size        (TRS - SNCFC) / 2 = 120
      Free Disk Failure Safe Allocatable Size         (FRS - SNDFC) / 2 = 115
      Free Cell Failure Safe Allocatable Size         (FRS - SNCFC) / 2 = 60
  • 31. Calculations for High Redundancy

      Quantity                                        Normal Redundancy          High Redundancy
      Total Raw Size (TRS)                            360                        360
      Used Raw Size (URS)                             120                        120
      Free Raw Size (FRS)                             240                        240
      Total Allocatable Size (TAS)                    TRS / 2 = 180              TRS / 3 = 120
      Used Allocatable Size (UAS)                     URS / 2 = 60               URS / 3 = 40
      Free Allocatable Size (FAS)                     FRS / 2 = 120              FRS / 3 = 80
      Size Needed for Disk Failure Coverage (SNDFC)   10                         20
      Size Needed for Cell Failure Coverage (SNCFC)   120                        240
      Total Disk Failure Safe Allocatable Size        (TRS - SNDFC) / 2 = 175    (TRS - SNDFC) / 3 = 113.3
      Total Cell Failure Safe Allocatable Size        (TRS - SNCFC) / 2 = 120    N/A for Quarter Rack
      Free Disk Failure Safe Allocatable Size         (FRS - SNDFC) / 2 = 115    (FRS - SNDFC) / 3 = 73.3
      Free Cell Failure Safe Allocatable Size         (FRS - SNCFC) / 2 = 60     N/A for Quarter Rack
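The calculations on the two slides above can be reproduced with a few lines of Python. This is a minimal sketch using the slide's example figures (units are whatever the slide uses); the class name `DiskgroupSpace` and its field and method names are hypothetical, chosen only for illustration. Cell failure coverage is passed as `None` for the high redundancy case, which the slide marks as N/A for a Quarter Rack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskgroupSpace:
    total_raw: float                # TRS
    free_raw: float                 # FRS
    redundancy_factor: int          # 2 for normal, 3 for high redundancy
    disk_coverage: float            # SNDFC
    cell_coverage: Optional[float]  # SNCFC, or None where cell coverage is N/A

    def _safe(self, raw: float, reserve: Optional[float]) -> Optional[float]:
        # (raw size - reserved size) / redundancy factor, or None if N/A
        if reserve is None:
            return None
        return (raw - reserve) / self.redundancy_factor

    def total_allocatable(self) -> float:
        return self.total_raw / self.redundancy_factor

    def used_allocatable(self) -> float:
        return (self.total_raw - self.free_raw) / self.redundancy_factor

    def free_allocatable(self) -> float:
        return self.free_raw / self.redundancy_factor

    def total_disk_safe(self) -> Optional[float]:
        return self._safe(self.total_raw, self.disk_coverage)

    def total_cell_safe(self) -> Optional[float]:
        return self._safe(self.total_raw, self.cell_coverage)

    def free_disk_safe(self) -> Optional[float]:
        return self._safe(self.free_raw, self.disk_coverage)

    def free_cell_safe(self) -> Optional[float]:
        return self._safe(self.free_raw, self.cell_coverage)


# Example figures from the slides.
normal = DiskgroupSpace(360, 240, 2, disk_coverage=10, cell_coverage=120)
high = DiskgroupSpace(360, 240, 3, disk_coverage=20, cell_coverage=None)

if __name__ == "__main__":
    for name, dg in (("normal", normal), ("high", high)):
        print(name, dg.total_allocatable(), dg.free_disk_safe(), dg.free_cell_safe())
```

Used raw size is derived as TRS minus FRS, which matches the slide's figures (360 - 240 = 120).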
  • 32. What we have in ASMCMD

      ASMCMD> lsdg
      State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
      MOUNTED  NORMAL  N      512     4096   4194304  27942912  16708892  9314304          3697294         0              N             DATAC1/
      MOUNTED  NORMAL  N      512     4096   4194304  1038240   1036984   346080           345452          0              Y             DBFS_DG/
      MOUNTED  NORMAL  N      512     4096   4194304  11973312  7966060   3991104          1987478         0              N             RECOC1/

      - Total_MB = Total Raw Size (TRS)
      - Free_MB = Free Raw Size (FRS)
      - Req_mir_free_MB:
          - ≥ 11.2.0.4.9 and ≥ 12.1.0.2: Size Needed for Disk Failure Coverage (SNDFC)
          - < 11.2.0.4.9 and < 12.1.0.2: Size Needed for Cell Failure Coverage (SNCFC)
      - Usable_file_MB:
          - ≥ 11.2.0.4.9 and ≥ 12.1.0.2: Free Disk Failure Safe Allocatable Size
          - < 11.2.0.4.9 and < 12.1.0.2: Free Cell Failure Safe Allocatable Size
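The mapping above implies that Usable_file_MB equals (Free_MB - Req_mir_free_MB) divided by the redundancy factor. The sketch below checks this against the three sample rows from the slide. The helper names `parse_lsdg` and `usable_file_mb` are hypothetical, and the parser assumes whitespace-separated columns exactly as in this sample; it is not a general-purpose parser for `lsdg` output.

```python
LSDG_SAMPLE = """\
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   4194304  27942912  16708892  9314304          3697294         0              N             DATAC1/
MOUNTED  NORMAL  N      512     4096   4194304  1038240   1036984   346080           345452          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  11973312  7966060   3991104          1987478         0              N             RECOC1/
"""

# Redundancy factor per diskgroup type, as used in the slides' formulas.
REDUNDANCY_FACTOR = {"NORMAL": 2, "HIGH": 3}


def parse_lsdg(text):
    """Turn whitespace-separated lsdg output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def usable_file_mb(row):
    """(Free_MB - Req_mir_free_MB) / redundancy factor, in whole MB."""
    factor = REDUNDANCY_FACTOR[row["Type"]]
    return (int(row["Free_MB"]) - int(row["Req_mir_free_MB"])) // factor


if __name__ == "__main__":
    for row in parse_lsdg(LSDG_SAMPLE):
        print(row["Name"], usable_file_mb(row), "reported:", row["Usable_file_MB"])
```

In this sample, Req_mir_free_MB is exactly one third of Total_MB for every diskgroup, which is consistent with reserving one of three cells.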
  • 33. References
      - Oracle Exadata Database Machine Maintenance Guide
      - Automatic Storage Management Administrator's Guide
      - NOTE:1484274.1 - Auto disk management feature in Exadata
      - NOTE:443835.1 - ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk
      - NOTE:1431697.1 - Exadata Database Machine: How to identify cell failgroups and Partner disks for a grid disk
      - NOTE:1968642.1 - Recover from diskgroup failure using the 12.1.0.2 "mount restricted force for recovery" feature - An Example
      - NOTE:1386147.1 - How to Replace a Hard Drive in an Exadata Storage Server (Hard Failure)
      - NOTE:1339373.1 - Operational Steps for Recovery after Losing a Disk Group in an Exadata Environment
      - NOTE:1551288.1 - Understanding ASM Capacity and Reservation of Free Space in Exadata
      - NOTE:1319567.1 - ASM Usable Space Calculations in Exadata Environment along with cell failure considerations