Top 10 DB2 Support Nightmares #9


What do you do when disaster strikes? In part 9 of our DB2 Support Nightmare series we look at another DB2 disaster scenario and how it was resolved by the experts at Triton Consulting.


Top 10 DB2 Support Nightmares & How to Avoid Them #9

Part 9 – In the event of an emergency call
The situation
A customer was running HADR for disaster recovery, with no cluster software for monitoring or automatic failover. Instead, a shell script checked the HADR state at regular intervals.

Then what happened…?
A disk failed on the primary site. Some tablespaces were put into “Rollforward Pending” state, and transactions accessing data in those tablespaces failed.

The last run of the HADR state monitoring script had reported Peer state, so the decision was made to issue the TAKEOVER command on the DR site to switch roles. When the application started, some transactions failed with the same error as on the primary site.
The LIST TABLESPACES command showed a number of tablespaces in Rollforward Pending state. To clear the pending state, a ROLLFORWARD command was issued with the list of affected tablespaces. The rollforward tried to retrieve a log file that was a few thousand logs older than the current one. After a few attempts the ROLLFORWARD was abandoned, and the database was restored from the latest backup image.

Triton Analysis
• We went through the db2diag.log and the notification logs. They showed physical errors reported against some of the tablespaces on the DR site around 100 days before the incident, and the affected tablespaces had been “excluded from the rollforward set”.
• Based on other entries in db2diag.log, we confirmed that the log file requested by the rollforward on the DR site was the one in use when those physical errors occurred.
• HADR continued to apply logs for the other tablespaces and kept reporting “Peer” state. In reality, some of the tablespaces were being ignored.
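The gap described in the analysis, Peer state reported while some tablespaces were no longer being replayed, suggests that a takeover decision should look at tablespace states on the standby as well as the HADR state. The sketch below is one hypothetical way to express that check in Python. The function name, the state strings and the tablespace names are assumptions made for illustration, and collecting the real values from the standby is deliberately left out.

```python
from typing import Dict, List

# Assumption: states arrive as plain strings; how they are collected
# from the standby database is outside the scope of this sketch.
HEALTHY_TABLESPACE_STATE = "Normal"


def takeover_blockers(hadr_state: str, tablespace_states: Dict[str, str]) -> List[str]:
    """Return the reasons why the standby should not be trusted for takeover."""
    blockers = []
    # Peer state alone was not enough in the incident, but it is still required.
    if hadr_state.strip().lower() != "peer":
        blockers.append(f"HADR state is {hadr_state!r}, not Peer")
    # Any tablespace that is not in a normal state is reported by name.
    for name, state in sorted(tablespace_states.items()):
        if state.strip().lower() != HEALTHY_TABLESPACE_STATE.lower():
            blockers.append(f"tablespace {name} is in state {state!r}")
    return blockers


if __name__ == "__main__":
    # Made-up input mirroring the incident: Peer state, one unusable tablespace.
    example = {"USERSPACE1": "Normal", "TS_DATA": "Rollforward Pending"}
    for reason in takeover_blockers("Peer", example):
        print(reason)
```

An empty result means neither check found a problem. It does not prove the standby is healthy, only that these two signals agree.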

The moral of the story
Regular monitoring of the log files is essential. It would have identified the issue on the DR site, and allowed it to be resolved, well in advance of this incident.
Make sure you know who to call before disaster strikes!
Triton Consulting +44 870 2411 550
www.triton.co.uk
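The log monitoring recommended here could start as small as a scheduled scan of the diagnostic log for known warning phrases. The following is a minimal sketch in Python, not a description of what the customer or Triton actually ran. The only phrase it looks for is the one quoted in the analysis; the function names are hypothetical, and matching is a plain case-insensitive substring test that assumes nothing about the log's record layout.

```python
from typing import Iterable, List, Tuple

# Phrase quoted in the analysis; extend the tuple with other messages
# worth alerting on. This list is an assumption, not an exhaustive set.
ALERT_PHRASES = ("excluded from the rollforward set",)


def find_alerts(lines: Iterable[str],
                phrases: Tuple[str, ...] = ALERT_PHRASES) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing an alert phrase."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(phrase.lower() in lowered for phrase in phrases):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_alerts(handle)
```

Run on the DR site as well as the primary, a check like this would flag the message on the day it is written instead of around 100 days later. What to do with the hits (mail, pager, ticket) is left to the surrounding monitoring script.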

