What do you do when disaster strikes? In part 9 of our DB2 Support Nightmare series we look at another DB2 disaster scenario and how it was resolved by the experts at Triton Consulting.
Top 10 DB2 Support Nightmares #9
1. Top 10 DB2 Support Nightmares & How to Avoid Them #9
2. Part 9 – In the event of an emergency call
The situation
A customer was running HADR for disaster recovery, with no cluster software
in place for monitoring or failover. Instead, HADR was monitored at regular
intervals by a shell script.
3. Then what happened…?
Disk failure on the primary site! Some tablespaces were put into
"Rollforward Pending" state, and transactions accessing data in these
tablespaces failed.
4. The last run of the HADR state monitoring script indicated Peer state, so
it was decided to issue the TAKEOVER command on the DR site to switch roles.
When the application started, some transactions failed with the same error as
on the primary site.
The LIST TABLESPACES command showed a number of tablespaces in Rollforward
Pending state. To get out of the pending state, a ROLLFORWARD command was
issued with the list of affected tablespaces.
The rollforward tried to retrieve a log file that was a few thousand logs
older than the current one. After a few tries, the ROLLFORWARD was abandoned.
The database was restored from the latest backup image.
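The tablespace check described above could have been run routinely on the DR site rather than only after the takeover. The following is a minimal Python sketch, not the customer's script: it parses text shaped like LIST TABLESPACES output and reports tablespaces whose state has the Rollforward Pending bit set. The sample text, the tablespace names, and the function name are hypothetical, and the bit value 0x0080 is an assumption that should be checked against the documentation for your DB2 version.

```python
# Hypothetical excerpt shaped like LIST TABLESPACES output.
# Real output contains more fields per tablespace; names here are invented.
SAMPLE_OUTPUT = """\
 Tablespace ID                        = 2
 Name                                 = USERSPACE1
 State                                = 0x0000
 Tablespace ID                        = 3
 Name                                 = TS_DATA01
 State                                = 0x0080
"""

# Assumed bit value for "Roll forward pending"; verify for your DB2 version.
ROLLFORWARD_PENDING = 0x0080


def tablespaces_with_flag(output, flag=ROLLFORWARD_PENDING):
    """Return names of tablespaces whose State value has `flag` set."""
    flagged = []
    name = None
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        key, value = key.strip(), value.strip()
        if key == "Name":
            name = value
        elif key == "State" and name is not None:
            # State is printed as a hexadecimal bit mask, e.g. 0x0080.
            if int(value, 16) & flag:
                flagged.append(name)
            name = None
    return flagged


if __name__ == "__main__":
    print(tablespaces_with_flag(SAMPLE_OUTPUT))
```

A check like this only helps if it is run against the standby as well as the primary, and if its result is acted on before a takeover is attempted.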
5. Triton Analysis
• We went through the db2diag.log and the notification logs. Physical errors had been
reported in some of the tablespaces on the DR site around 100 days prior to the incident.
These errors were recorded in the db2diag.log, and the affected tablespaces were
"excluded from the rollforward set".
• Based on other entries in the db2diag.log, we were able to confirm that the log file
requested for rollforward on the DR site was the one in use at the time the physical
errors occurred there.
• HADR continued to apply logs for the other tablespaces and reported itself as being in
"Peer" state. In reality, some of the tablespaces were being ignored.
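The message quoted in the analysis, "excluded from the rollforward set", is the kind of entry a routine log scan could flag long before a takeover is needed. The following is a minimal Python sketch under stated assumptions: the function names and the sample line are hypothetical, and the only search phrase used is the one quoted above. A real check would likely watch for further messages and for the log's severity levels.

```python
# The only marker here is the phrase quoted in the analysis; extend as needed.
MARKERS = ("excluded from the rollforward set",)


def find_marker_lines(lines, markers=MARKERS):
    """Return (line_number, text) for every line containing a marker phrase."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker.lower() in lowered for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path, markers=MARKERS):
    """Scan a diagnostic log file on disk (the path is supplied by the caller)."""
    with open(path, errors="replace") as handle:
        return find_marker_lines(handle, markers)


if __name__ == "__main__":
    # Hypothetical log lines for illustration only, not real db2diag.log output.
    sample = [
        "routine entry\n",
        "Tablespace TS_DATA01 excluded from the rollforward set\n",
    ]
    for number, text in find_marker_lines(sample):
        print(f"line {number}: {text}")
```

Running such a scan on the DR site on a schedule, and alerting on any hit, addresses the gap in this incident: the HADR state alone reported "Peer" while some tablespaces were no longer being maintained.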
6. The moral of the story
Regular monitoring of the log files is essential. It would have allowed the
issue on the DR site to be identified and resolved well in advance of this incident.
Make sure you know who to call before disaster strikes!
Triton Consulting +44 870 2411 550
www.triton.co.uk