Never a DULL Moment
How to Avoid Costly Data Recovery

RMOUG QEW
November 2008
Who am I?
Daniel Fink
  Oracle DBA since 1996
  Diagnosis, Optimization, Data Recovery and
  Training
  Member of Oak Table, BAARF and BAAG

www.optimaldba.com
daniel.fink@optimaldba.com
Agenda
  DULs
  Recoveries
  Case Studies
      Worst Practices
      Best Practices

Some case studies provided by Kurt Van Meerbeeck (www.ora600.be)
Never a DULL Moment
DUL – Data UnLoader
  Extract data from a down database
Option of last resort
  Downtime
  Expensive
  May or may not work
Why do you need a DUL?
Seed
 Incorrect Configurations
 Poor Policies/Procedures
 Inflexible Processes
 Lack of Security
Trigger
 Human Error
 Technology Failure
Bullet Proof Backups
Simply don’t exist
  There will always be a point of failure
Keep it simple, but thorough
  Added complexity = Added risk
  Change management
Protect redo
  Once redo is lost, recovery stops
An Unrecovered Backup
Is No Backup At ALL!
Recovery is Job One
 “Contrary to common opinion, a DBA does not have a
 responsibility to back up a database. The DBA’s real
 responsibility is to be able to recover the database.”
            Essential Oracle8i Data Warehousing (Dodge/Gorman)
 “The actual responsibility is to restore or recover the
 database to the point in time and within the downtime window
 determined by the business needs.”
            Real Life Recovery (RMOUG Training Days 1999)
Audience Participation
Today
 Did you check your backup log?
This Week/Month
 Did you check the backup process?
 Did you recover a backup?
Ever
 Did you check your backup log?
 Did you recover a backup?
Best Core Practices
Find Recovery Opportunities
  Environment Refreshes
  Upgrade/Patch Testing
  Disaster Recovery Training
Every case study presented here would
have been avoided if recovery had been
tested
Best Core Practices
You have known, good processes
 This does not mean every backup is good
 Always test after any changes
You have documented the processes
 Helps when thinking is not clear
Best Core Practices
Prevention
  Audits
  Implementation Checklists
Find Opportunities to Recover
  Refreshes
  DBA sandboxes
Case Studies
Situation
  Summary of the issue
Seed
  A condition that is present
Trigger
  An event that causes the failure
Red Flag
  A “recognized” indication of a future
  problem
“Hot” Backups
Files were not being properly backed up
Seed
  DBA did not understand how files were managed
  Backed up the files without putting them into backup
  mode
Trigger
  Media failure
Red Flag
  Lack of desire to learn
Best Practice
Basic Oracle, Backup and Recovery
knowledge
  Oracle Documentation
  DBA Training
Good backup process
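
A minimal sketch of the command sequence a user-managed hot backup has to follow, assuming datafiles are copied at the OS level. The function name, the input dictionary and the paths are hypothetical; only the SQL statements are standard Oracle syntax. It builds the plan as text and does not run anything.

```python
from typing import Dict, List


def hot_backup_plan(tablespaces: Dict[str, List[str]], backup_dir: str) -> List[str]:
    """Return the ordered steps for a user-managed hot backup.

    tablespaces maps a tablespace name to its datafile paths
    (hypothetical input; in practice read from the data dictionary).
    """
    steps: List[str] = []
    for name in sorted(tablespaces):
        # Datafiles are only safe to copy while the tablespace is in backup mode.
        steps.append(f"SQL: ALTER TABLESPACE {name} BEGIN BACKUP;")
        for path in tablespaces[name]:
            steps.append(f"OS: cp {path} {backup_dir}/")
        steps.append(f"SQL: ALTER TABLESPACE {name} END BACKUP;")
    # Archive the redo generated during the copies so the set can be recovered.
    steps.append("SQL: ALTER SYSTEM ARCHIVE LOG CURRENT;")
    steps.append(
        f"SQL: ALTER DATABASE BACKUP CONTROLFILE TO '{backup_dir}/control.bkp' REUSE;"
    )
    steps.append("OS: copy the archived redo logs generated since the first BEGIN BACKUP")
    return steps


if __name__ == "__main__":
    for step in hot_backup_plan({"USERS": ["/u01/oradata/users01.dbf"]}, "/backup"):
        print(step)
```

The point of the ordering is that every copy sits between BEGIN BACKUP and END BACKUP for its tablespace. Copying the file outside that window is the mistake in the case study above.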
No Backup
No backup for production
Seed
 Backups not set up
Trigger
 Media failure
Red Flag
 Production use of a database without
 backup
Best Practice
Backups are part of the implementation
check off/hand over
Test Recovery before implementation
A backup that may work
Backup set does not encapsulate full recovery
set
Seed
  Custom script does not include all commands within
  backup set
Trigger
  Fraud investigation
Red Flag
  Custom hot backup script command sequence
  incorrect
Best Practice
Custom scripts require complete
knowledge
  Full backup set
  Command sequence
Every backup set should be
self-contained
Can you back up your worst-case
recovery scenario?
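
One way to check that a backup set is self-contained is to confirm it holds every archived redo log generated while the backup ran. The sketch below assumes a single redo thread and that the first and last log sequence numbers of the backup window are already known; the function and parameter names are illustrative.

```python
from typing import Iterable, List


def missing_log_sequences(first_seq: int, last_seq: int,
                          archived: Iterable[int]) -> List[int]:
    """List the log sequence numbers in [first_seq, last_seq] absent from the backup set."""
    present = set(archived)
    return [seq for seq in range(first_seq, last_seq + 1) if seq not in present]


if __name__ == "__main__":
    # Sequence 103 was never copied into the set, so recovery would stop there.
    print(missing_log_sequences(101, 105, [101, 102, 104, 105]))
```

An empty result means the redo needed to make the copied datafiles consistent is in the set. It says nothing about whether those log files are readable, which only a test recovery shows.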
Known Bad Backup
Archived redo logs were known to be corrupt
Seed
  Bug in Oracle caused corrupt archived redo logs
  Application owner “could not afford downtime to
  fix”
Trigger
  Rollback segment tablespace went offline
  Monitoring software failed
Red Flag
  Backups known to be unrecoverable
Best Practice
Be careful of complicated application
architectures
Have the political will to do the right
thing
Find an interim solution
User not in the Specs
User level export as only backup
Seed
  User added to database, but not script
Trigger
  Media failure
Red Flag
  Static scripts
  Development responsible for backups
Best Practice
If you are responsible for the database,
for recovery of the database…you are
responsible for the backup!
  Export can only restore a database, not
  perform full recovery
Audit
  schema owners v. users being backed up
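
The audit can be as simple as a set comparison. This is a sketch, assuming the list of schema owners has already been pulled from the data dictionary (for example from the OWNER column of DBA_SEGMENTS) and the list of exported users has been read from the export script. The function name and the result keys are made up for illustration.

```python
from typing import Dict, Iterable, List


def export_audit(schema_owners: Iterable[str],
                 exported_users: Iterable[str]) -> Dict[str, List[str]]:
    """Compare the schemas that own objects with the users the export script covers."""
    owners = {name.strip().upper() for name in schema_owners}
    exported = {name.strip().upper() for name in exported_users}
    return {
        # Schemas holding data that no export would capture.
        "not_backed_up": sorted(owners - exported),
        # Names left in the script that no longer own anything.
        "stale_in_script": sorted(exported - owners),
    }


if __name__ == "__main__":
    print(export_audit(["HR", "SALES", "FINANCE"], ["hr", "sales", "legacy"]))
```

Any name under `not_backed_up` is the "user added to database, but not script" situation from the case study.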
You are…the weakest link
Improper tape management
Seed
 Unskilled, unmotivated operations personnel
Trigger
 Anything…
Red Flag
 Non-technical personnel in charge of tape
 management
Best Practice
If you are responsible for the database,
for recovery of the database…you are
responsible for the backup!
You have to trust those responsible for
operations
We can just Reload
Data warehouse recovery strategy was
to reload
Seed
 Database grew, but backup strategy did not
Trigger
 Current redo log corruption
Red Flag
 Backup strategy not revisited as database
 grew
Best Practice
Periodically revisit non-standard backup
strategies
Better yet…avoid non-standard backup
strategies
We don’t need no stinkin’
SYSTEM tablespace
Default installation on local drive with
additional datafiles on external drives
Seed
  Single database has files on separate
  storage systems
Trigger
  Media Failure
Red Flag
  Never checking backup process
Best Practice
Properly plan and install databases
Verify that all needed parts of the
database are being backed up
  Without SYSTEM tablespace, you lose the
  ‘map’ to tables…and data
  Know what is and is not needed
Security
Table is dropped in production
Seed
  Improper security
  Invalid Backups
Trigger
  Wrong environment
  Wrong action
Red Flag
  Access to production
Best Practice
Appropriate Architecture and Policies
  Schema owner logins
  Non-database tier authentication
Security Audits
  Know who has what and why
  Balance safety v. security
It’s Hammer Time!
Disks failed and user level export was
incomplete
Seed
  Known bad hardware
  Exports not dynamic
Trigger
  Disk crash…finally
Red Flag
  A hammer attached to a storage device is rarely a
  good sign
Best Practice
DON’T USE A HAMMER!!!!!
Use dynamic scripting techniques
  Backups
  Exports
Validate scripting
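
A sketch of what "dynamic" can mean for an export: build the parameter file from the current list of schema owners on every run instead of hard-coding user names. It assumes the original Export utility's OWNER, FILE and LOG parameters; the function name and file names are hypothetical, and the owner list is expected to come from a fresh data dictionary query.

```python
from typing import Iterable


def build_export_parfile(owners: Iterable[str], dump_file: str, log_file: str) -> str:
    """Build export parameter file text from the owners found at run time."""
    names = sorted({owner.strip().upper() for owner in owners if owner.strip()})
    if not names:
        # Validation: an empty owner list means the lookup failed, not that
        # there is nothing to export.
        raise ValueError("no schema owners supplied; refusing to write an empty export")
    lines = [
        f"OWNER=({','.join(names)})",
        f"FILE={dump_file}",
        f"LOG={log_file}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_export_parfile(["scott", "hr"], "nightly.dmp", "nightly.log"), end="")
```

Refusing to produce output for an empty list is a small example of validating the script: a silent, empty export looks like success in a log.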
SOX and Recoveries
7 years of data
Could you recover a 7-year-old backup?
  2001 – Oracle 9i introduced
    Most systems 7.3 and 8.x
    Do you have a 7.3 install?
Do you have 7-year-old
  Hardware?
  O/S and drivers?
How to avoid calling me…
Backups are part of any installation
  Test recovery before turning over to
  user/developer
  Document the process
  Understand the implications of changes
  Adapt the strategy to the system
Monitor backups on a daily basis
  Exception reporting is good, but not perfect
  Know what to do if a backup fails
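
Exception reporting only catches the failures that report themselves, so a daily check should also treat a missing or stale log as a failure. The sketch below assumes the backup writes a text log and that errors appear with the usual ORA-nnnnn or RMAN-nnnnn prefixes; the function name, the messages and the 24-hour threshold are illustrative choices.

```python
import re
from datetime import datetime, timedelta
from typing import List, Optional

# Oracle and RMAN error codes, e.g. ORA-19502 or RMAN-03009.
ERROR_PATTERN = re.compile(r"\b(ORA|RMAN)-\d{5}\b")


def check_backup_log(log_text: Optional[str],
                     finished_at: Optional[datetime],
                     now: datetime,
                     max_age: timedelta = timedelta(hours=24)) -> List[str]:
    """Return a list of problems; an empty list means nothing suspicious was found."""
    if log_text is None or finished_at is None:
        # No log at all: the backup may never have run.
        return ["no backup log found"]
    problems: List[str] = []
    if not log_text.strip():
        problems.append("backup log is empty")
    if now - finished_at > max_age:
        problems.append("backup log is older than expected")
    for line in log_text.splitlines():
        if ERROR_PATTERN.search(line):
            problems.append(f"error reported: {line.strip()}")
    return problems


if __name__ == "__main__":
    now = datetime(2008, 11, 20, 8, 0)
    print(check_backup_log("ORA-19502: write error on file\n",
                           now - timedelta(hours=30), now))
```

A clean result here still only says the log looks healthy. It does not replace recovering the backup.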
The only good recovery is a successful
recovery
  Determine likely, unlikely and worst-case scenarios
  Look for opportunities to perform recoveries
  Understand the implications of changes
  Don’t uncover issues on production systems
Audit security
  Know who can access production and how
  Establish policies and procedures to minimize risk
Annual Reviews
Go Forth
  and
Recover!
