Training netbackup6x2


  1. 1. NetBackup 6.x © 2003 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
  2. 2. Introduction • These modules/sections are designed to help you understand version 6.x of NetBackup. • The training is also intended for L1/L2 staff who are involved in backup/restore incident management
  3. 3. Contents • Section 1: NetBackup essentials – Architecture – NetBackup terminology – Backup components • Section 2: Daemons and processes – Checking necessary daemons on master and media servers – Daemons overview • Section 3: NetBackup management – Reports – Policies • Section 4: Media and device management – Device monitor – Media • Section 5: Restores • Section 6: Troubleshooting – Commands to troubleshoot
  4. 4. Architecture and backup flow
  5. 5. Architecture [Diagram labels: NetBackup client; NetBackup master server; NetBackup media manager server with Drive 1, Drive 2, etc.; NetBackup catalogs (configuration files, image database, and device and volume information held in the relational EMM DB); IP and FC connections]
  6. 6. Backup flow [Diagram labels: NetBackup master server with EMM DB; NetBackup media manager server; NetBackup robotic control server; tape silo with Drive 1, Drive 2, etc.] • The client requests a backup from its master (*). The client can be legacy. • The IRM on the master (#) allocates resources and tells the media manager to start assigning drives and gathering data from the client. • The media manager requests the backup image from the client. • The client generates an image and sends it to the media manager. • The media manager sends the data to the tape. • The media manager tells the robotic control server to find and mount a tape. • The robotic control server tells the robot. • (*) Application backups for SAP/Oracle are triggered from the client as user backups. • (#) Scheduled standard-type backups are triggered by the IRM (nbpem).
  7. 7. Terminology Policy • A policy is a set of rules defined for backing up one or more clients • It specifies the schedule types allowed for backup (automatic, incremental, full, user, etc.) • It specifies the policy type under which the clients are backed up, such as Windows-NT, Standard, RMAN, etc. • It specifies the preferred storage unit to be used for backup Schedule • Defines the backup window and the retention period for tapes • Holds the recurring backup information, with exclude dates/days if any
  8. 8. Terminology Volume pool • The removable media on which NetBackup stores data are called volumes • Volumes can be assigned to a specific application for backup • A volume pool is the logical destination in which volumes reside • Once a volume is assigned to a pool, it cannot be reassigned to another pool until all of its images have expired • The default volume pool is NetBackup Storage unit • The devices that are used to store backups are called storage units • A storage unit can be configured for one or more devices of similar density (DLT, hcart, hcart2, etc.)
  9. 9. Terminology Catalog • The catalog is the set of internal databases maintained by NetBackup • The catalog contains information about the backup configuration, media, storage devices, and the files that were backed up Multiplexing • The ability to write multiple data streams from one or more clients/servers to a single tape drive for optimum performance • With multiplexing, more jobs can be run within the backup window
  10. 10. Backup components Master Server • The master server controls the scheduling of backup, archive, and restore operations • The master server maintains the catalog information, which records the valid images of each client Media server • A media server, with its storage units defined on the master, provides additional resources to the master and increases performance • Load can be shared across the storage devices that the master and media servers manage. This is achieved by setting the storage unit option to “Any Available” for a policy on the master
  11. 11. Backup components Enterprise Media Manager • The EMM server centrally manages the relational databases that contain the media and device information of the master and media servers • The EMM server can run on the master server or as a separate entity Intelligent Resource Manager • With NetBackup 6.x, the IRM replaces the job scheduler bpsched of earlier versions • The IRM handles the scheduling of jobs and allocates resources with the help of the EMM server
  12. 12. NetBackup Master Processes/Daemons Daemons that run on the master server • bpcd – NetBackup Client Service • bpcompatd – NetBackup Compatibility Service (supports legacy services) • bpdbm – NetBackup Database Manager • ltid – NetBackup Device Manager • nbemm – NetBackup Enterprise Media Manager • nbjm – NetBackup Job Manager • nbnos – NetBackup Notification Service
  13. 13. NetBackup Master Processes (contd.) Daemons that run on the master server • nbpem – NetBackup Policy Execution Manager • bprd – NetBackup Request Manager • nbrb – NetBackup Resource Broker • nbsl – NetBackup Service Layer (helps refresh the GUI) • nbsvcmon – NetBackup Service Monitor (restarts daemons if terminated) • vmd – NetBackup Volume Manager • pbx_exchange – VERITAS Private Branch Exchange (port reduction)
  14. 14. Media Manager Processes/Daemons Daemons that run on the media server • bpcd – NetBackup Client Service • bpcompatd – NetBackup Compatibility Service (supports legacy services) • ltid – NetBackup Device Manager • nbnos – NetBackup Notification Service • nbsl – NetBackup Service Layer (helps refresh the GUI) • nbsvcmon – NetBackup Service Monitor (restarts daemons if terminated) • vmd – NetBackup Volume Manager • pbx_exchange – VERITAS Private Branch Exchange (port reduction)
  15. 15. Daemons Overview nbpem • The Policy Execution Manager (nbpem) is responsible for checking which client backups are due • It builds a list of the client backups in each policy that are due to run and submits it to nbjm nbjm • The NetBackup Job Manager • nbjm accepts the jobs submitted by nbpem and obtains the necessary resources by coordinating with nbrb to start the backup
  16. 16. Daemons Overview (contd.) nbrb • The NetBackup Resource Broker (nbrb) is responsible for allocating resources to a job • nbrb accepts resource requests from nbjm and allocates resources with the help of nbemm nbemm • The NetBackup Enterprise Media Manager (nbemm) manages the relational databases centrally • These databases contain the media and device configuration information, stored in EMM_DATA.db
  17. 17. Java Console login
  18. 18. Java Console
  19. 19. Activity Monitor
  20. 20. Policies
  21. 21. Device Monitor
  22. 22. Reports – Status of Client Backups
  23. 23. Reports – All log Entries of a client
  24. 24. Netbackup Configuration File • on MASTER server: • # more /usr/openv/netbackup/bp.conf • SERVER = {name of Master server} • SERVER = {name of Media manager} • EMMSERVER = {mostly Master server} • CLIENT_NAME = {name of master server}
  25. 25. Netbackup Configuration File (cont.) On MEDIA Server: # more /usr/openv/netbackup/bp.conf SERVER = {name of Master Server} SERVER = {name of Media Manager} EMMSERVER = {mostly name of Master Server} … CLIENT_NAME = {name of Media Manager} … List of options
  26. 26. NetBackup Configuration File (cont.) On client: # more /usr/openv/netbackup/bp.conf SERVER = {name of Master Server} SERVER = {name of Media Manager} … CLIENT_NAME = {name of client} … List of options
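
The bp.conf layout shown on the three slides above is a list of `KEY = value` lines in which SERVER may appear more than once. As a minimal sketch, assuming that simple format (no continuation lines), the file could be read as below. The function name `parse_bp_conf` and the host names are hypothetical and are not part of NetBackup.

```python
from collections import defaultdict


def parse_bp_conf(text):
    """Collect bp.conf-style 'KEY = value' lines; keys such as SERVER may repeat."""
    entries = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()].append(value.strip())
    return dict(entries)


# Hypothetical host names, for illustration only.
sample = """\
SERVER = master01
SERVER = media01
EMMSERVER = master01
CLIENT_NAME = media01
"""

conf = parse_bp_conf(sample)
print(conf["SERVER"])        # every SERVER entry, in file order
print(conf["EMMSERVER"][0])  # the EMM server entry
```

On a real host the same function could be fed the contents of /usr/openv/netbackup/bp.conf; the slides list the master server as the first SERVER entry.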
  27. 27. Important Configuration Entries The following entries are essential for NetBackup 6.x clients /etc/services • bpcd 13782/tcp bpcd • bpdbm 13721/tcp bpdbm • bprd 13720/tcp bprd • vopied 13783/tcp vopied • bpjava-msvc 13722/tcp bpjava-msvc • vnetd 13724/tcp vnetd /etc/inetd.conf • bpcd stream tcp nowait root /usr/openv/netbackup/bin/bpcd bpcd • vnetd stream tcp nowait root /usr/openv/netbackup/bin/vnetd vnetd • vopied stream tcp nowait root /usr/openv/netbackup/bin/vopied vopied • bpjava-msvc stream tcp nowait root /usr/openv/netbackup/bin/bpjava-msvc bpjava-msvc -transient
  28. 28. Important Configuration Entries (contd.) bpcd is the only daemon that runs on the client. No special action is required to start it, because it is registered with inetd during installation. To verify that bpcd is listening on TCP port 13782: # netstat -a | grep bpcd tcp 0 0 *.bpcd *.* LISTEN
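
A quick way to confirm that a client carries the /etc/services entries listed on this slide is to compare the file against the expected name and port pairs. The sketch below is only an illustration: the helper name `missing_services` is hypothetical, and the sample text deliberately contains a wrong bpdbm port so the check has something to report.

```python
# Service name -> port/protocol, exactly as listed on the slide.
REQUIRED = {
    "bpcd": "13782/tcp",
    "bpdbm": "13721/tcp",
    "bprd": "13720/tcp",
    "vopied": "13783/tcp",
    "bpjava-msvc": "13722/tcp",
    "vnetd": "13724/tcp",
}


def missing_services(services_text, required=REQUIRED):
    """Return the required services that are absent or mapped to another port."""
    found = {}
    for line in services_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2:
            found[fields[0]] = fields[1]
    return sorted(name for name, port in required.items() if found.get(name) != port)


# Illustrative sample: vopied and bpjava-msvc are missing, bpdbm has the wrong port.
sample = """\
bpcd     13782/tcp  bpcd
bprd     13720/tcp  bprd
vnetd    13724/tcp  vnetd   # local comment
bpdbm    13722/tcp  bpdbm
"""

print(missing_services(sample))
```

On a client, `missing_services(open("/etc/services").read())` would return an empty list when all six entries are present.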
  29. 29. Important Configuration Entries (cont.) • For Media Managers: /etc/services (additional entries) • vmd 13701/tcp vmd • acsd 13702/tcp acsd • tl8cd 13705/tcp tl8cd • odld 13706/tcp odld • vtlcd 13708/tcp vtlcd • bpjobd 13723/tcp bpjobd • NB_dbsrv 13785/tcp NB_dbsrv … • lmfcd 13718/tcp lmfcd
  30. 30. NB process control • Start NB processes • # /usr/openv/netbackup/bin/goodies/netbackup start • Stop NB processes • # /usr/openv/netbackup/bin/goodies/bp.kill_all • # /usr/openv/netbackup/bin/goodies/netbackup stop • Check NB processes on any server • # bpps -a • # bpps -x (also shows shared VERITAS processes)
  31. 31. Important directories • /usr/openv – Contains Netbackup, Volmgr binaries, and Netbackup database • example • /usr/openv/netbackup/bin – Binaries • /usr/openv/db/data – EMM & NBDB databases • /usr/openv/netbackup/db – Database of class, schedules, images etc • /usr/openv/netbackup/logs – Netbackup logs
  32. 32. NetBackup Administration Commands • bpps -a – Shows the NetBackup processes running on a master or media manager • bpps -x – Shows the NetBackup processes running on a master or media server, including shared VERITAS processes • bpadm – Terminal (menu-based) interface, much quicker than jnbSA • jnbSA – Java administration console
  33. 33. Useful commands • # vmoprcmd -d – Shows the list of tape drives that are busy with an active job – Useful to find: – Which tapes are mounted on which drive index – Whether drives are multihost enabled – Status of the drives (TLD, DOWN-TLD, PEND-TLD) – Pending mount requests, if any • # tpconfig -d – Shows the configured tape device files and their status – Helpful in identifying the EMM server • # robtest – Useful for running SCSI pass-through commands – Scanning drives within the library – Scanning library slots to find empty/full slot IDs – Unloading and moving tapes
  34. 34. Additional Commands • # bptestbpcd -host <hostname> – Checks whether communication between client and server is working • # bptestbpcd -host <hostname> -debug – Communication problems can be found easily from the debug output • # vmquery -m <label> – Useful to check: – Tape density – Slot number – Volume pool assigned – Tape location (on-site or off-site) – Vault session ID, if the tape location is off-site
  35. 35. NetBackup Management Domain: Master Server and Media Server
  36. 36. Media Manager Daemons/Services • vmd: Volume Manager daemon. Manages the volume catalog and device configuration • ltid: Logical Tape Interface Daemon (Device Manager). Handles device requests (mount/unmount) • txxd: Individual robotic control daemon (handles robotic drive requests) • avrd: Allows Media Manager to read labeled tape volumes and automatically assign them to requesting processes
  37. 37. Device Configuration • Media and Device Management interface • tpconfig menu-based utility • Configure Storage Devices Wizard
  38. 38. Configure Storage Devices Wizard
  39. 39. Configuring using Admin Console
  40. 40. Required Parameters : • Device Host name • Robotic type • Robotic Number • Volume DB Host • Robotic Ctrl
  41. 41. Adding New Robot
  42. 42. Adding Drives
  43. 43. No Rewind Device : The device remains at its current position on a close operation. In the device file name, 0 (zero) indicates a device, c indicates that compression is on, m specifies medium compression, b specifies Berkeley-style close, and n indicates a no-rewind device. Example output of tpconfig -d: 17 LSM800_D17 hcart3 TLD(800) DRIVE=17 /dev/rmt/c16t0d0cnb /dev/rmt/c16t0d0BESTnb
  44. 44. Robotic Test Tools • Application is located at /usr/openv/volmgr/bin/robtest. This application invokes the device-specific test tool (tl8test,tldtest, and so on). robtest is especially useful for validating drive mappings within a robot. (m-move media,s-Read element status, inquiry-product ID) • Note: Do not leave robtest running in your production environment. Doing so can lead to device errors.
  45. 45. Volume Catalog /usr/openv/volmgr/database • The Volume Catalog is one of the internal catalogs where Media Manager stores information about: Volumes: Individual pieces of media Volume pools: Media used for a common purpose Scratch volume pool: A volume pool that enables Media Manager to logically move tapes into volume pools that do not have media available to use Volume groups: A group used by the administrator and Media Manager for tracking the physical location of volumes
  46. 46. Volume Pools Volume pools identify a logical set of volumes. Volume pools can be used to reserve media for specific NetBackup jobs. A volume pool must exist before volumes may be assigned to it. The following volume pools are automatically created: • NetBackup: For backups of NetBackup catalogs • None: For jobs other than NetBackup –Drive Cleaning • DataStore is the default pool name for DataStore and other types of applications.
  47. 47. Scratch Volume Pool A scratch volume pool is an optional pool. Each media server in your configuration can have one scratch pool configured. A volume is moved by Media Manager from the scratch volume pool to another pool that does not have a volume available. A volume moved from the scratch volume pool becomes a member of the new pool. Select the scratch pool check box in the “Add a New Volume Pool” dialog to create a scratch volume pool.
  48. 48. Volume Group Volume Groups: Identify where the tape is located Are created as a tape is moved into a group Manage groups of tapes for administrative purposes Include two types of groups: • Robotic • Nonrobotic (standalone) A common volume group is offsite. Ex: Vault Volume Group.
  49. 49. Volume Pool & Volume Group
  50. 50. Media Delete Operations The administrator can delete any volume that has not been assigned. The administrator cannot delete any volume that has been assigned. NetBackup treats a volume as assigned while it contains active images.
  51. 51. Troubleshooting Media • Unreadable barcode labels • Misdefined media types • Status 96: Unable to allocate new media. Three possible reasons: – No usable tape – No tape in the specified volume pool – Bad media • Check the Media Logs report
  52. 52. Storage Units A Storage Device is an individual place to write backups: Tape drives Optical disk drives Magnetic disk with file system A Storage Unit (STU) is one or more storage devices on one Media Server to which backups are sent: Includes one or more storage devices Same type and density devices (robotic or stand-alone) Controlled by the same robot A Storage Unit Group is a group used to identify storage units as a group Available for various types of storage units Determine the priority of storage units
  53. 53. Storage Units :
  54. 54. On Demand & Any Available : • On demand only specifies whether the storage unit is available only on demand, that is, only when a policy or schedule is explicitly configured to use this storage unit. • If you do not specify a storage unit for your backup policy, the backup is directed to any available storage unit. Order of storage unit selection: locally attached storage units are tried first; if none are found, the storage units are tried in alphabetical order.
  55. 55. Fragmentation
  56. 56. NetBackup Policies A policy is a template for the backup of a single client or a group of clients with common attributes and a common file list. Parts of a Policy Definition: Attributes Client list File list Schedules Policies determine: What – Backup files and data sets Who – Backup clients How – Backup behavior Where – Backup storage location When – Backup time and type
  57. 57. Creating New Policy – Wizard
  58. 58. New Policy from Admin Console
  59. 59. Policy Attributes
  60. 60. Attributes to be set • Policy Type • Multiple copy • Policy storage unit • Policy Volume Pool • Limit Policy per volume pool • Job Priority • Keyword phrase • Active go into effect at • Allow frozen image clients • Cross mount points • Collect true image restore information • Compression • Encryption (option) • Allow multiple data streams • Exclude/Include File list
  61. 61. Backup & Schedules Backup types • Automatic Scheduled: Full or incremental Incremental includes: Differential Cumulative • Client Requested: Run at user request User backup and user archive Schedule method • Frequency-based • Calendar-based Backup window : Defined by start time and duration
  62. 62. Backup Types : • Full Backup : All files in the specified path are backed up on each client. • Differential Incremental Backup : Backs up all files changed since the last full or incremental (differential or cumulative) backup. • Cumulative Incremental Backup : Backs up all files changed since the last full backup. Generates more files per backup, but the restore is quicker.
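
The remark that a cumulative incremental restores more quickly follows from the definitions on this slide: a differential depends on every backup since the last full, while a cumulative depends only on the last full. A minimal sketch of that reasoning, assuming hypothetical day labels and a helper named `restore_chain` (neither comes from NetBackup):

```python
def restore_chain(history):
    """Return the backups needed to restore the most recent state.

    history: chronological list of (label, kind), kind in {"full", "diff", "cum"}.
    """
    # Start from the most recent full backup.
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    chain = [history[last_full]]
    tail = history[last_full + 1:]
    # The latest cumulative incremental supersedes every incremental before it.
    cum_positions = [i for i, (_, kind) in enumerate(tail) if kind == "cum"]
    if cum_positions:
        chain.append(tail[cum_positions[-1]])
        tail = tail[cum_positions[-1] + 1:]
    # Every differential taken after that point is still required.
    chain.extend(item for item in tail if item[1] == "diff")
    return chain


week_diff = [("Sun", "full"), ("Mon", "diff"), ("Tue", "diff"), ("Wed", "diff")]
week_cum = [("Sun", "full"), ("Mon", "cum"), ("Tue", "cum"), ("Wed", "cum")]

print([day for day, _ in restore_chain(week_diff)])  # ['Sun', 'Mon', 'Tue', 'Wed']
print([day for day, _ in restore_chain(week_cum)])   # ['Sun', 'Wed']
```

With differentials, a Wednesday restore needs four images; with cumulatives it needs two, at the cost of larger nightly backups.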
  63. 63. New Schedule
  64. 64. Schedule Dialog—Attributes Tab
  65. 65. Schedule Attributes Override policy storage unit : The storage unit setting defined here overrides the storage unit specified in the attributes component of the policy. Override policy volume pool : The volume pool setting defined here overrides the volume pool specified in the attributes component of the policy. Media Multiplexing : Media multiplexing is the maximum number of jobs from this schedule that can be multiplexed onto a single tape. Retention : Retention is the time that the image is to be held in the images catalog. By default, NetBackup does not mix retention periods on a single volume. NetBackup has 25 levels of retention, defined in the table below.
  66. 66. Retention : • Image Retention : 1.Master server Properties window- Retention period 2.Schedule Attributes window • Retention levels are set as a number 0 through 24. • Each number corresponds to a time value set in the System Configuration interface. • Backup images are deleted from the Images Catalog when the retention period expires and the corresponding entry is deleted from the Media Catalog.
  67. 67. Schedule attributes :
  68. 68. Exclude Dates Tab :
  69. 69. Calendar Schedule Tab
  70. 70. Specific Dates
  71. 71. Recurring Week Days
  72. 72. Recurring Days of Month
  73. 73. Backup Window
  74. 74. NetBackup 6.x Restore
  75. 75. Restore Process • Restores are not scheduled. • File transfers occur in the background. • The administrator can restore data from any client in the NetBackup domain to any other client in the NetBackup domain (“push”). This is known as a server-directed restore. • Restores can be performed on the client side (“pull”), by client request. • If a client performs a user-directed backup, the client has, by default, the ability to initiate a restore. • This is known as a user-directed restore.
  76. 76. NetBackup 6.0 Restore • Restore of Client Data from Master Server
  77. 77. • For example, if the customer wants the data from 31-Oct-06 restored, you can check the backup status and media ID details in the catalog information. Click Catalog and select the client/host name and date range as shown in the above screen.
  78. 78. • Click on Backup, Archive, and Restore from Admin Console and then click on “Restore Files” Tab to view Restore Screen
  79. 79. • Click the button at the top right to select the master server, source client, and destination client for the restore, then select the policy type that matches the type of backup: Standard, MS-Windows_NT, MS-SQL-Server, or Lotus Notes.
  80. 80. • If the Source/Destination client names are not available you can Click on the Edit Client List box and add the respective client names (note that client names are case sensitive).
  81. 81. • Click on the Calendar button and Select Start Date , End Date and click “OK” to view the client backup details
  82. 82. • Select the respective Date for restore.
  83. 83. • Trace through the Directory Structure and select the files to restore
  84. 84. • Check the availability of the media in the library using the GUI or using vmquery -m <MEDIA ID> on the master
  85. 85. • Click on the Restore button which is shown at the right bottom of the screen .
  86. 86. • On Task Progress Tab you can see the job progress
  87. 87. • Check the job status from the activity monitor
  88. 88. • The restore job status can be opened for details; here the job completed successfully.
  89. 89. Netbackup 6.0 Restore • Restoring Data from Client end
  90. 90. • On the Backup, Archive, and Restore screen, select View > Specify NetBackup Machines and Policy Type
  91. 91. • Select the respective master, source client, and destination client names (be aware that server names are case sensitive). Select the policy type for the restore that matches the type of backup (Standard, MS-Windows_NT, MS-SQL-Server, or Lotus Notes) from the drop-down list.
  92. 92. • Go to the Select for Restore option and click Restore from Normal Backup, as shown above. It shows the status of the backups available to restore.
  93. 93. • Search for the requested files/folder to restore, click on the preview to see the required media id.
  94. 94. • Once you find the files/folder to restore, check for the media IDs in the catalog information on the master server
  95. 95. • Check whether the required media ID is available in the tape library using vmquery -m <media ID> or through the GUI from the master server. • If the media ID is not available in the library, engage the media team to import the media into the library.
  96. 96. • Once the Media is available in the Tape Library , start restore
  97. 97. • Be careful about the restore destination. • Check that the destination path has sufficient free space to complete the restore.
  98. 98. • It is better to create a new folder on the destination path, named with the SC ticket number and the date of the file being restored. • Select the restore options: to overwrite existing files, select the overwrite option; otherwise select “Do not restore the file”. • Click Start Restore
  99. 99. • Monitor the restore job from the Master and the Status of the job on the Client side.
  100. 100. • Monitor the restore job and check its status after completion.
  101. 101. Common Restore Issues • Incorrect restore criteria – Date range – Type of restore – File search criteria • Insufficient disk space for restore • Improper file permissions
  102. 102. NetBackup 6.x Troubleshooting
  103. 103. NetBackup 6.x: Logs
  104. 104. Logs • Location • Levels • Netbackup Logs • System Logs
  105. 105. Logs • Legacy logging location: /usr/openv/netbackup/logs (UNIX), <install_path>\NetBackup\logs (Windows) • Legacy file name format: log.<mmddyy> [root@ethp1032:/usr/openv/netbackup/logs/bpcd] # ll total 67420 -rw-r--r-- 1 root root 1566925 Jan 12 23:59 log.011208 -rw-r--r-- 1 root root 2027747 Jan 13 23:59 log.011308 -rw-r--r-- 1 root root 2729299 Jan 14 23:59 log.011408
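
To find the legacy log for a given day, the mmddyy file name can be built from a date. The sketch below assumes the UNIX location shown on this slide and one subdirectory per process, as in the bpcd listing; the helper name `legacy_log_path` is hypothetical.

```python
import posixpath
from datetime import date

LEGACY_LOG_ROOT = "/usr/openv/netbackup/logs"  # UNIX location from the slide


def legacy_log_path(process, day):
    """Build the legacy log path for one process and one calendar day."""
    return posixpath.join(LEGACY_LOG_ROOT, process, "log." + day.strftime("%m%d%y"))


print(legacy_log_path("bpcd", date(2008, 1, 12)))
# /usr/openv/netbackup/logs/bpcd/log.011208
```

The result matches the first file in the directory listing above.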
  106. 106. Logs • Unified Logging • Unified logging—new in this 6.0 release—creates log file names and messages in a format that is standardized across all VERITAS products. Unified logging is used by certain NetBackup processes, mostly server processes.
  107. 107. Logs Unified Logging location • All unified logs are written to the /usr/openv/logs directory (UNIX) and the <install_path>\NetBackup\logs folder (Windows). Unlike legacy logging, there is no need to create logging subdirectories. Unified Logging File Name Format • Unified logging uses a standardized naming format for log files, as follows: • productID-originatorID-hostID-date-rotation.log • productID identifies the VERITAS product. The NetBackup product ID is 51216. • originatorID identifies the log writing entity, such as a process, service, script, or other software. • hostID identifies the host that created the log file. Unless the file was moved, this is the host where the log resides. • date shows when the log was written, in YYMMDD format.
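
The five-part file name described on this slide can be split back into its fields when sorting or filtering unified logs. This is a minimal sketch under the assumption that no field contains a hyphen. The function name is hypothetical, and the originator, host, and rotation values in the sample are illustrative; only the product ID 51216 and the YYMMDD date format come from the slide.

```python
from datetime import date, datetime
from typing import NamedTuple


class UnifiedLogName(NamedTuple):
    product_id: str
    originator_id: str
    host_id: str
    written: date
    rotation: str


def parse_unified_log_name(filename: str) -> UnifiedLogName:
    """Split productID-originatorID-hostID-date-rotation.log into its fields."""
    stem, _, extension = filename.rpartition(".")
    fields = stem.split("-")
    if extension != "log" or len(fields) != 5:
        raise ValueError(f"not a unified log file name: {filename!r}")
    product_id, originator_id, host_id, yymmdd, rotation = fields
    written = datetime.strptime(yymmdd, "%y%m%d").date()  # YYMMDD per the slide
    return UnifiedLogName(product_id, originator_id, host_id, written, rotation)


# Illustrative file name; only 51216 (NetBackup) is taken from the slide.
parsed = parse_unified_log_name("51216-116-2201360136-041029-0000000000.log")
print(parsed.product_id, parsed.written.isoformat())
```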
  108. 108. Logs • Setting Logging properties - Master
  109. 109. Logs • Setting Logging properties - Client
  110. 110. Logs • The NetBackup Administration Console Global logging level field allows values of 0 to 5: • 0 – Very important low-volume diagnostic and debug messages • 1 – Adds verbose diagnostic and debug messages • 2 – Adds progress messages • 3 – Adds informational dumps • 4 – Adds function entry/exits • 5 – Finest detail: everything is logged
  111. 111. Logs • Commonly referenced logs • bpcd: client • bpbkar: client • bptm/bpdm: media manager • bpbrm: master/media manager • bprd: master • Media manager logging automatically goes to the system logs using the syslogd logging facility.
  112. 112. Logs : System • UNIX syslog location [root@bdhp4364:/var/adm/syslog] # more syslog.log | grep -i LSM Sep 26 03:54:33 bdhp4364 ltid[13870]: Operator/EMM server has DOWN'ed drive LSM203_D3 (device 0) Sep 26 03:54:33 bdhp4364 ltid[13870]: Operator/EMM server has DOWN'ed drive LSM203_D3 (device 0) Sep 26 14:31:22 bdhp4364 ltid[13870]: Operator/EMM server has UP'ed drive LSM203_D3 (device 0) • Windows logs: Control Panel > Administrative Tools > Event Viewer – Application logs
  113. 113. NetBackup 6.x: Troubleshooting
  114. 114. Troubleshooting • Errors • Return Codes • NB Commands • Host Properties
  115. 115. Troubleshooting: Issues • Media Errors • Device Errors • Networking Errors • Access Errors
  116. 116. Troubleshooting: Media Issues • Scratch Media • Media Density • Media retention • Media Read/Write Errors • Media Labeling
  117. 117. Troubleshooting: Media Issues • Return Codes • RC83 :media open error • RC84 :media write error • RC85 :media read error • RC86 :media position error • RC87 :media close error • RC96 :unable to allocate new media for backup, storage unit has none available • RC98 :error requesting media (tpreq)
  118. 118. Troubleshooting: Media Issues • Scratch media • Check the number and density of media in scratch media pool. • Use Commands bpmedialist, vmquery, available_media (goodies script)
  119. 119. Troubleshooting: Media Issues • Media retention • Check retention cycle of media
  120. 120. Troubleshooting: Media Issues • Media retention • Host properties -> Master -> Media
  121. 121. Troubleshooting: Media Issues • Check media error log • [root@bdhp4496:/usr/openv/netbackup/db/media] • # more errors • 09/14/06 20:39:17 UB0867 0 OPEN_ERROR LSM201_D1 • 09/14/06 20:39:22 UB0172 4 OPEN_ERROR LSM201_D5 • 09/14/06 20:39:24 UB0780 2 WRITE_ERROR LSM201_D3 • 09/14/06 20:41:38 UB0879 6 WRITE_ERROR LSM201_D7 • 09/14/06 20:42:17 UB0404 5 OPEN_ERROR LSM201_D6 • 09/14/06 20:55:48 UB0172 3 POSITION_ERROR LSM201_D4 • 09/14/06 21:05:29 UB0867 4 POSITION_ERROR LSM201_D5 • 09/14/06 21:45:27 UB0867 4 POSITION_ERROR LSM201_D5 • 09/14/06 21:57:32 UB0172 3 POSITION_ERROR LSM201_D4 • 09/14/06 22:00:25 UB0867 4 POSITION_ERROR LSM201_D5 • 09/15/06 18:11:17 UB0867 0 POSITION_ERROR LSM201_D1 • 09/17/06 18:10:09 UB0867 5 POSITION_ERROR LSM201_D6 • 09/17/06 18:50:43 UB0867 5 POSITION_ERROR LSM201_D6 • 09/17/06 19:26:55 UB0867 5 POSITION_ERROR LSM201_D6 • 09/18/06 07:47:08 UB0867 3 POSITION_ERROR LSM201_D4
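
When the media errors file grows long, tallying it per media ID shows which tapes fail repeatedly and on how many different drives. The sketch below assumes the six-column layout seen in the listing above (date, time, media ID, drive index, error type, drive name) and uses a few of those lines as sample input; the function name is hypothetical.

```python
from collections import Counter, defaultdict


def summarize_media_errors(text):
    """Count errors per media ID and record the drives each media failed on."""
    per_media = Counter()
    drives_per_media = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that does not match the assumed layout
        _date, _time, media_id, _drive_index, _error, drive_name = fields
        per_media[media_id] += 1
        drives_per_media[media_id].add(drive_name)
    return per_media, drives_per_media


# A subset of the entries from the slide.
sample = """\
09/14/06 20:39:17 UB0867 0 OPEN_ERROR LSM201_D1
09/14/06 20:39:22 UB0172 4 OPEN_ERROR LSM201_D5
09/14/06 20:39:24 UB0780 2 WRITE_ERROR LSM201_D3
09/14/06 21:05:29 UB0867 4 POSITION_ERROR LSM201_D5
09/17/06 18:10:09 UB0867 5 POSITION_ERROR LSM201_D6
09/18/06 07:47:08 UB0867 3 POSITION_ERROR LSM201_D4
"""

counts, drives = summarize_media_errors(sample)
for media_id, total in counts.most_common():
    print(media_id, total, sorted(drives[media_id]))
```

A tape that fails on several different drives, as UB0867 does here, is more likely to be bad media than a sign of one faulty drive, although that is a rule of thumb rather than a diagnosis.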
  122. 122. Troubleshooting: Media Issues • NB Commands # bpmedialist Server Host = bdhp4496 id rl images allocated last updated density kbytes restores vimages expiration last read <------- STATUS -------> -------------------------------------------------------------------------------- UB0000 4 48 01/01/2008 18:36 01/01/2008 20:41 hcart3 622396004 0 MPX 02/16/2008 07:08 N/A FULL UB0025 1 3 01/17/2008 10:05 01/17/2008 10:08 hcart3 81808377 0 3 02/01/2008 10:08 N/A UB0349 4 27 01/11/2008 18:01 01/11/2008 19:06 hcart3 341665402 2 MPX 02/25/2008 20:24 01/22/2008 22:30 FROZEN UB0529 4 18 12/04/2007 18:02 12/06/2007 00:11 hcart3 300763609 0 MPX 01/20/2008 00:11 N/A EXPIRED FROZEN UB0957 4 0 01/21/2008 18:01 N/A hcart3 0 0 0 N/A N/A FROZEN
  123. 123. Troubleshooting: Media Issues • NB Commands • /usr/openv/netbackup/bin/admincmd/bpexpdate -m UB2466 -d 12/24/2007 • bpexpdate - Change the expiration date of backups in the image catalog and media in the media catalog.
  124. 124. Troubleshooting: Media Issues • NB Commands • /usr/openv/netbackup/bin/admincmd/bpmedia -freeze -m AG0412 • bpmedia - Freeze, unfreeze, suspend, or unsuspend NetBackup media.
  125. 125. Troubleshooting: Media Issues • NB Commands [root@bdhp4496:/usr/openv/netbackup/bin/goodies] # available_media media media robot robot robot side/ ret size status ID type type # slot face level KBytes ---------------------------------------------------------------------------- CatalogBackup pool UB0025 HCART3 NONE - - - 1 81808377 ACTIVE UB0062 HCART3 NONE - - - 1 81049494 ACTIVE UB0498 HCART3 NONE - - - 1 80708500 ACTIVE UB0539 HCART3 NONE - - - 1 79542446 ACTIVE UB0741 HCART3 NONE - - - 1 80644961 ACTIVE DataStore pool Eject_W pool BS0013 HCART NONE - - - 4 8121329 ACTIVE BS0083 HCART NONE - - - 4 68277329 ACTIVE BS0086 HCART NONE - - - 4 23408388 ACTIVE
  126. 126. Troubleshooting: Device Issues • Storage Unit unavailable • Drive Down • Robot Down • Robot freezing all media
  127. 127. Troubleshooting: Device Issues • Return Codes • RC129 disk storage unit full. • RC213 no storage units available for use • RC219 the required storage unit is unavailable • RC800 resource request failed
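
For first-line triage it can help to map a job's return code to the media or device category used in this deck. The lookup below contains only the codes listed on the media and device return-code slides; the function name `classify_status` and the fallback wording are hypothetical.

```python
# Return codes and descriptions exactly as listed in this deck.
MEDIA_CODES = {
    83: "media open error",
    84: "media write error",
    85: "media read error",
    86: "media position error",
    87: "media close error",
    96: "unable to allocate new media for backup, storage unit has none available",
    98: "error requesting media (tpreq)",
}
DEVICE_CODES = {
    129: "disk storage unit full",
    213: "no storage units available for use",
    219: "the required storage unit is unavailable",
    800: "resource request failed",
}


def classify_status(code):
    """Return (category, description) for a NetBackup return code."""
    if code in MEDIA_CODES:
        return "media", MEDIA_CODES[code]
    if code in DEVICE_CODES:
        return "device", DEVICE_CODES[code]
    return "other", "not covered in this deck"


for rc in (96, 219, 800):
    print(rc, *classify_status(rc))
```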
  128. 128. Troubleshooting: Device Issues • Check drive status [root@bdhp4499:/root] # tpconfig -d Id DriveName Type Residence Drive Path Status **************************************************************************** 0 LSM204_D5 hcart3 TLD(204) DRIVE=5 /dev/rmt/c8t0d0BESTnb UP 1 LSM204_D6 hcart3 TLD(204) DRIVE=6 /dev/rmt/c8t0d1BESTnb UP Currently defined robotics are: TLD(200) robot control host = bdhp4498 TLD(204) robot control host = bdhp4498 EMM Server = bdhp4498
  129. 129. Troubleshooting: Device Issues • Check drive status • [root@bdhp4499:/root] • # vmoprcmd • HOST STATUS • Host Name Version Host Status • ========================================= ======= =========== • bdhp4498 600000 ACTIVE • bdhp4547 600000 ACTIVE • PENDING REQUESTS • <NONE> • DRIVE STATUS • Drive Name Label Ready RecMID ExtMID Wr.Enbl. Type • Host DrivePath Status • ============================================================================= • LSM200_D1 Yes Yes UA0158 UA0158 Yes hcart3 • bdhp4498 /dev/rmt/c17t0d1BESTnb ACTIVE • LSM200_D10 Yes Yes UA0604 UA0604 Yes hcart3 • bdhp4254 /dev/rmt/c29t0d1BESTnb ACTIVE
  130. 130. Troubleshooting: Device Issues [root@bdhp4496:/usr/openv/netbackup] # vmoprcmd -reset 0 [root@bdhp4496:/usr/openv/netbackup] # vmoprcmd -up 0 [root@bdhp4496:/usr/openv/netbackup] # vmoprcmd -down 0 [root@bdhp4496:/usr/openv/netbackup] # tpconfig -d Id DriveName Type Residence Drive Path Status **************************************************************************** 0 LSM201_D1 hcart3 TLD(201) DRIVE=1 /dev/rmt/c100t0d1BESTnb DOWN
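
The Status column of `tpconfig -d` can be scanned for DOWN drives before deciding which drive index to pass to `vmoprcmd -up`. This is a sketch that assumes drive rows start with a numeric Id and end with the status, as in the outputs shown on these slides; real output may differ between versions. The function name is hypothetical, and the sample combines rows from the two slides.

```python
def drives_by_status(tpconfig_output, status="DOWN"):
    """Return the names of drives whose last column equals the given status."""
    names = []
    for line in tpconfig_output.splitlines():
        fields = line.split()
        # Drive rows start with a numeric Id and end with the Status column.
        if len(fields) >= 4 and fields[0].isdigit() and fields[-1] == status:
            names.append(fields[1])
    return names


# Rows copied from the tpconfig -d outputs on the slides.
sample = """\
Id  DriveName  Type    Residence          Drive Path               Status
****************************************************************************
0   LSM204_D5  hcart3  TLD(204) DRIVE=5   /dev/rmt/c8t0d0BESTnb    UP
1   LSM204_D6  hcart3  TLD(204) DRIVE=6   /dev/rmt/c8t0d1BESTnb    UP
0   LSM201_D1  hcart3  TLD(201) DRIVE=1   /dev/rmt/c100t0d1BESTnb  DOWN
Currently defined robotics are:
TLD(204) robot control host = bdhp4498
EMM Server = bdhp4498
"""

print(drives_by_status(sample))        # drives that are DOWN
print(drives_by_status(sample, "UP"))  # drives that are UP
```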
  131. 131. Troubleshooting: Device Issues • Check Robot status # robtest Configured robots with local control supporting test utilities: TLD(103) robotic path = /dev/rac/c106t0d0 TLD(201) robotic path = /dev/rac/c100t0d0 TLD(800) robotic path = /dev/rac/c110t0d0 Robot Selection --------------- 1) TLD 103 2) TLD 201 3) TLD 800 4) none/quit Enter choice: 2 Robot selected: TLD(201) robotic path = /dev/rac/c100t0d0 Invoking robotic test utility: /usr/openv/volmgr/bin/tldtest -rn 201 -r /dev/rac/c100t0d0 Opening /dev/rac/c100t0d0 MODE_SENSE complete Enter tld commands (? returns help information)
  132. 132. Troubleshooting: Device Issues • Services/Daemons • bpps -a shows the NetBackup processes running on a Master or Media Manager server [root@bdhp4496:/root] # bpps -a NB Processes ------------ MM Processes ------------ root 14324 1 0 Jan 21 ? 1:46 /usr/openv/volmgr/bin/ltid root 25380 1 0 Jan 13 ? 3:14 vmd root 14363 1 0 Jan 21 ? 0:28 tldcd root 14351 14324 0 Jan 21 ? 0:04 tldd root 14352 14324 0 Jan 21 ? 3:57 avrd
  133. 133. Troubleshooting: Device Issues • Check media error log [root@bdhp4496:/usr/openv/netbackup/db/media] # more errors 09/14/06 20:39:17 UB0867 0 OPEN_ERROR LSM201_D1 09/14/06 20:39:22 UB0172 4 OPEN_ERROR LSM201_D5 09/14/06 20:39:24 UB0780 2 WRITE_ERROR LSM201_D3 09/14/06 20:41:38 UB0879 6 WRITE_ERROR LSM201_D7 09/14/06 20:42:17 UB0404 5 OPEN_ERROR LSM201_D6 09/14/06 20:55:48 UB0172 3 POSITION_ERROR LSM201_D4 09/14/06 21:05:29 UB0867 4 POSITION_ERROR LSM201_D5 09/14/06 21:45:27 UB0867 4 POSITION_ERROR LSM201_D5 09/14/06 21:57:32 UB0172 3 POSITION_ERROR LSM201_D4 09/14/06 22:00:25 UB0867 4 POSITION_ERROR LSM201_D5 09/15/06 18:11:17 UB0867 0 POSITION_ERROR LSM201_D1 09/17/06 18:10:09 UB0867 5 POSITION_ERROR LSM201_D6 09/17/06 18:50:43 UB0867 5 POSITION_ERROR LSM201_D6 09/17/06 19:26:55 UB0867 5 POSITION_ERROR LSM201_D6 09/18/06 07:47:08 UB0867 3 POSITION_ERROR LSM201_D4
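The media error log above is easier to read once grouped. This is a minimal sketch, assuming every entry has the six fields visible on the slide (date, time, media ID, drive index, error type, drive name). The function name `summarize_media_errors` is hypothetical. As a rule of thumb, and not a NetBackup-documented rule, a media ID that fails on several drives points at the tape, while a drive that fails with several media IDs points at the drive.

```python
from collections import defaultdict

def summarize_media_errors(lines):
    """Map each media ID to the drives it failed on, and each drive to the media that failed in it."""
    drives_by_media = defaultdict(set)
    media_by_drive = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        _date, _time, media_id, _drive_index, _error, drive_name = fields[:6]
        drives_by_media[media_id].add(drive_name)
        media_by_drive[drive_name].add(media_id)
    return drives_by_media, media_by_drive

# A few entries copied from the slide above.
sample = """\
09/14/06 20:39:17 UB0867 0 OPEN_ERROR LSM201_D1
09/14/06 20:39:22 UB0172 4 OPEN_ERROR LSM201_D5
09/14/06 20:39:24 UB0780 2 WRITE_ERROR LSM201_D3
09/14/06 21:05:29 UB0867 4 POSITION_ERROR LSM201_D5
09/17/06 18:10:09 UB0867 5 POSITION_ERROR LSM201_D6
09/18/06 07:47:08 UB0867 3 POSITION_ERROR LSM201_D4
""".splitlines()

drives_by_media, media_by_drive = summarize_media_errors(sample)
for media_id, drives in sorted(drives_by_media.items()):
    print(media_id, len(drives), sorted(drives))
```

In the sample, UB0867 fails on four different drives, which makes the tape itself the first suspect.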
  134. 134. Troubleshooting: Device Issues • tpclean manages tape drive cleaning: /usr/openv/volmgr/bin/tpclean -C LSM201_D5
  135. 135. Troubleshooting: Device Issues • Check drive status OS Side # ioscan -funC tape Class I H/W Path Driver S/W State H/W Type Description ======================================================== =================== tape 0 1/0/2/0/0.1.43.255.0.0.0 stape CLAIMED DEVICE HP Ultrium 3-SCSI /dev/rmt/0m /dev/rmt/0mn /dev/rmt/c8t0d0BEST /dev/rmt/c8t0d0BESTn /dev/rmt/0mb /dev/rmt/0mnb /dev/rmt/c8t0d0BESTb /dev/rmt/c8t0d0BESTnb tape 2 1/0/2/0/0.1.43.255.0.0.1 stape NO_HW DEVICE HP Ultrium 3-SCSI /dev/rmt/2m /dev/rmt/2mn /dev/rmt/c8t0d1BEST /dev/rmt/c8t0d1BESTn /dev/rmt/2mb /dev/rmt/2mnb /dev/rmt/c8t0d1BESTb /dev/rmt/c8t0d1BESTnb
  136. 136. Troubleshooting: Device Issues • Check robot status OS Side # ioscan -funC autoch Class I H/W Path Driver S/W State H/W Type Description =========================================================================== autoch 0 0/0/6/0/0.1.82.255.0.0.0 schgr CLAIMED DEVICE HP ESL E-Series /dev/rac/c100t0d0 autoch 1 0/0/6/0/1.36.38.255.0.0.0 schgr NO_HW DEVICE STK L700 /dev/rac/c106t0d0
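The two OS-side checks above come down to spotting devices whose S/W State is not CLAIMED. This is a minimal sketch, assuming HP-UX `ioscan -funC tape` and `ioscan -funC autoch` output with the column order shown on the slides. The function name `unclaimed_devices` is hypothetical.

```python
import re

# Class, instance, H/W path, driver, S/W state, H/W type (the description is ignored).
IOSCAN_ROW = re.compile(
    r"\b(?P<cls>tape|autoch)\s+(?P<inst>\d+)\s+(?P<hwpath>\S+)\s+(?P<driver>\S+)"
    r"\s+(?P<state>\S+)\s+(?P<hwtype>\S+)"
)

def unclaimed_devices(ioscan_output):
    """Return (class, instance, H/W path, state) for devices whose S/W state is not CLAIMED."""
    return [
        (m.group("cls"), int(m.group("inst")), m.group("hwpath"), m.group("state"))
        for m in IOSCAN_ROW.finditer(ioscan_output)
        if m.group("state") != "CLAIMED"
    ]

# Rows copied from the slides above.
sample = """
tape 0 1/0/2/0/0.1.43.255.0.0.0 stape CLAIMED DEVICE HP Ultrium 3-SCSI
tape 2 1/0/2/0/0.1.43.255.0.0.1 stape NO_HW DEVICE HP Ultrium 3-SCSI
autoch 0 0/0/6/0/0.1.82.255.0.0.0 schgr CLAIMED DEVICE HP ESL E-Series
autoch 1 0/0/6/0/1.36.38.255.0.0.0 schgr NO_HW DEVICE STK L700
"""
for device in unclaimed_devices(sample):
    print(device)
```

A NO_HW entry means the operating system no longer sees the hardware at that path, so the fault lies below NetBackup (cabling, zoning, or the device itself).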
  137. 137. Troubleshooting: Device Issues • Check Robot & Drive status from Library Side: Telnet, Command View TL, or other vendor-specific tools. Functions: • See status • Diagnose • Move media • Error logs • Events • Generate support ticket • Reboot interface controllers or the library • Firmware update
  138. 138. Troubleshooting: Device Issues • Check Robot & Drive status from Library Side telnet /show>library status Component Status Description --------------------- ------- ------------------------------------------------ Advanced Features Secure Manager Green Secure Manager is operational System Health Green Operational Library Green No additional sense information Robotics: Frame 1 Green Operative; Picker Empty Sensors: Frame 1 Green Operative Drives: Drive 1 Green Operative; Tape Loaded Drive 2 Green Operative; Tape Loaded Drive 3 Green Operative; Tape Loaded Drive 4 Green Operative; Tape Loaded Drive 5 Green Operative; Tape Loaded Drive 6 Green Operative; Tape Loaded
  139. 139. Troubleshooting: Device Issues • Check Robot & Drive status Library Side Command View TL
  140. 140. Troubleshooting: Device Issues • Must use local drive • Relevant when a media manager uses drives attached to another media manager.
  141. 141. Troubleshooting: Networking Issues • Read/Connect time out • Slow throughput • Network Connection Broken
  142. 142. Troubleshooting: Networking Issues • Return Codes • RC40 network connection broken • RC41 network connection timed out • RC42 network read failed
  143. 143. Troubleshooting: Networking Issues [root@bdhp4496:/root] # ping bdhp4264 PING bdhp4264.na.pg.com: 64 byte packets 64 bytes from 192.44.190.160: icmp_seq=0. time=140. ms 64 bytes from 192.44.190.160: icmp_seq=1. time=89. ms 64 bytes from 192.44.190.160: icmp_seq=2. time=89. ms ----bdhp4264.na.pg.com PING Statistics---- 3 packets transmitted, 3 packets received, 0% packet loss round-trip (ms) min/avg/max = 89/106/140 [root@bdhp4496:/root] # nslookup bdhp4264 Using /etc/hosts on: bdhp4496 looking up FILES Trying DNS Name: bdhp4264.na.pg.com Address: 192.44.190.160
  144. 144. Troubleshooting: Networking Issues Hostname Resolution [root@bdhp4264:/root] # bpclntcmd -pn expecting response from server bdhp4497.na.pg.com bdhp4264.na.pg.com bdhp4264 192.44.190.160 64522
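The ping, nslookup, and `bpclntcmd -pn` checks above all ask the same question: does the name resolve to an address, and does that address resolve back to the same host? This is a minimal sketch of that round trip using Python's standard `socket` module. The function name `check_resolution` and the stand-in resolvers are hypothetical; the stand-ins only replay the values shown on the slides so the example runs without DNS.

```python
import socket

def check_resolution(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve name to an address and back; report whether the names agree."""
    address = forward(name)
    reverse_name, aliases, _addresses = reverse(address)
    known = {n.lower() for n in [reverse_name, *aliases]}
    short_names = {n.split(".")[0] for n in known}
    wanted = name.lower()
    consistent = wanted in known or wanted.split(".")[0] in short_names
    return {
        "name": name,
        "address": address,
        "reverse_name": reverse_name,
        "consistent": consistent,
    }

# Stand-in resolvers that replay the values from the slides (no network access needed).
def fake_forward(name):
    return "192.44.190.160"

def fake_reverse(address):
    return ("bdhp4264.na.pg.com", ["bdhp4264"], [address])

print(check_resolution("bdhp4264", fake_forward, fake_reverse))
```

Calling `check_resolution("bdhp4264")` without the stand-ins uses the real resolver of the host it runs on. Run it on both the client and the server, since each side can resolve the other differently.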
  145. 145. Troubleshooting: Networking Issues • Hostname Resolution [root@bdhp4264:/root] # bptestbpcd -host bdhp4497 -debug 00:10:48.230 [22013] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006 00:10:48.233 [22013] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpcd 00:10:48.299 [22013] <2> logconnections: BPCD CONNECT FROM 192.44.190.160.64532 TO 192.44.190.68.13724 00:10:48.305 [22013] <2> vnet_connect_to_vnetd_extra: vnet_vnetd.c.178: msg: VNETD CONNECT FROM 192.44.190.160.64533 TO 192.44.190.68.13724 fd = 5 00:10:48.547 [22013] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.528: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a 00:10:48.654 [22013] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.546: ipc_string: /tmp/vnet-11335201324248642328000000000-a11335 1 1 1 192.44.190.160:64532 -> 192.44.190.68:13724 192.44.190.160:64533 -> 192.44.190.68:13724 <2>bptestbpcd: EXIT status = 0 00:10:49.124 [22013] <2> bptestbpcd: EXIT status = 0
  146. 146. Troubleshooting: Networking Issues • Client read timeout [root@sihp8056:/usr/openv/netbackup] # more bp.conf SERVER = sihp8071.ap.pg.com SERVER = sihp8056 CLIENT_NAME = sihp8056 EMMSERVER = sihp8071 ALLOW_MEDIA_OVERWRITE = TAR ALLOW_MEDIA_OVERWRITE = CPIO ALLOW_MEDIA_OVERWRITE = ANSI CLIENT_READ_TIMEOUT = 900
  147. 147. Troubleshooting: Networking Issues • Client Connect timeout [root@bdhp4496:/usr/openv/netbackup] # more bp.conf SERVER = bdhp4496 SERVER = bdhp4496bk1 SERVER = bdhp4496bk2 SERVER = bdc-nbmedia003 CLIENT_CONNECT_TIMEOUT = 900
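When auditing many hosts it helps to pull the timeout settings out of each bp.conf. This is a minimal sketch, assuming the `KEY = VALUE` layout shown on the two slides above and treating any key ending in `_TIMEOUT` as a timeout in seconds. The function name `read_timeouts` is hypothetical.

```python
def read_timeouts(bp_conf_text):
    """Return the timeout settings found in bp.conf text as {name: seconds}."""
    timeouts = {}
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.endswith("_TIMEOUT"):
            timeouts[key] = int(value.strip())
    return timeouts

# bp.conf content copied from the client read timeout slide above.
sample = """\
SERVER = sihp8071.ap.pg.com
SERVER = sihp8056
CLIENT_NAME = sihp8056
EMMSERVER = sihp8071
ALLOW_MEDIA_OVERWRITE = TAR
CLIENT_READ_TIMEOUT = 900
"""
print(read_timeouts(sample))  # {'CLIENT_READ_TIMEOUT': 900}
```

An empty result means no timeout is set in the file, in which case the product default applies.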
  148. 148. Troubleshooting: Networking Issues • Host Properties Time out values
  149. 149. Troubleshooting: Networking Issues • Slow Backups • Intermittently failing backups
  150. 150. Troubleshooting: Networking Issues Check NIC statistics UNIX [root:/root] # netstat -ian Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll lan3 1500 155.125.115.0 155.125.115.32 778554304 0 257548791 0 0 lan2 1500 155.125.115.0 155.125.115.47 3166263672 0 4193995719 0 0 lan1 1500 155.125.115.0 155.125.115.37 3803779978 0 1919760867 0 0 lan0 1500 192.44.190.0 192.44.190.67 2584132859 5250 934577700 0 0 lo0 4136 127.0.0.0 127.0.0.1 32529802 0 32529802 0 0 lan4 1500 155.125.115.0 155.125.115.36 2225225670 0 1526293377 0 0
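In the NIC statistics above, only lan0 shows input errors. This is a minimal sketch that flags such rows, assuming the nine-column layout shown on the slide (Name, Mtu, Network, Address, Ipkts, Ierrs, Opkts, Oerrs, Coll). The function name `interfaces_with_errors` is hypothetical.

```python
def interfaces_with_errors(netstat_output):
    """Return {interface: (input errors, output errors)} for rows with any error count."""
    flagged = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 9 or not fields[4].isdigit():
            continue  # header row or unexpected layout
        name, ierrs, oerrs = fields[0], int(fields[5]), int(fields[7])
        if ierrs or oerrs:
            flagged[name] = (ierrs, oerrs)
    return flagged

# Rows copied from the slide above.
sample = """\
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan3 1500 155.125.115.0 155.125.115.32 778554304 0 257548791 0 0
lan0 1500 192.44.190.0 192.44.190.67 2584132859 5250 934577700 0 0
lo0 4136 127.0.0.0 127.0.0.1 32529802 0 32529802 0 0
"""
print(interfaces_with_errors(sample))  # {'lan0': (5250, 0)}
```

The counters are cumulative since boot, so compare two samples taken some minutes apart to see whether the errors are still increasing.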
  151. 151. Troubleshooting: Networking Issues • Hostname Resolution
  152. 152. Troubleshooting: Networking Issues • NFS access timeout • Relevant if Follow NFS is enabled
  153. 153. Troubleshooting: Networking Issues • Snapshot settings
  154. 154. Troubleshooting: Networking Issues • The hostname command works on both Windows and UNIX [root@bdhp4264:/root] # hostname bdhp4264 [root@bdhp4264:/root] # C:\>hostname BDC-NOTES206 C:\>
  155. 155. Troubleshooting: Networking Issues • DNS Suffix Search List
  156. 156. Troubleshooting: Access Issues • Client Access denied
  157. 157. Troubleshooting: Access Issues • Return Codes • RC57: client connection refused • RC58: can’t connect to client • RC59: access to the client was not allowed
  158. 158. Troubleshooting: Access Issues • Check that all hostnames of the master server are listed. • UNIX: bp.conf [root@bdhp4496:/usr/openv/netbackup] # more bp.conf SERVER = bdhp4496 SERVER = bdhp4496bk1 SERVER = bdhp4496bk2 SERVER = bdhp4496bk3 SERVER = bdhp4496bk4
  159. 159. Troubleshooting: Access Issues • Check that all media servers are listed. • UNIX: bp.conf [root@bdhp4464:/usr/openv/netbackup] # more bp.conf SERVER = bdhp5250.na.pg.com SERVER = bdhp4464 SERVER = bdc-nbmedia003 SERVER = bdc-nbmedia006 SERVER = bdc-intra600 CLIENT_NAME = bdhp4464 EMMSERVER = bdhp5250
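The two bp.conf checks above can be automated by comparing the SERVER entries against the list of hosts that should be allowed. This is a minimal sketch, assuming the `SERVER = host` layout shown on the slides. The function name `missing_servers` is hypothetical, and so is the host `bdc-nbmedia009`, which is included only to show a missing entry.

```python
def missing_servers(bp_conf_text, expected_hosts):
    """Return the expected hosts that have no SERVER entry in the bp.conf text."""
    listed = set()
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "SERVER":
            listed.add(value.strip().lower())
    return [host for host in expected_hosts if host.lower() not in listed]

# bp.conf content copied from the slide above.
sample = """\
SERVER = bdhp5250.na.pg.com
SERVER = bdhp4464
SERVER = bdc-nbmedia003
SERVER = bdc-nbmedia006
SERVER = bdc-intra600
CLIENT_NAME = bdhp4464
EMMSERVER = bdhp5250
"""
# bdc-nbmedia009 is a made-up name used to demonstrate a missing entry.
print(missing_servers(sample, ["bdhp5250.na.pg.com", "bdc-nbmedia003", "bdc-nbmedia009"]))
# ['bdc-nbmedia009']
```

The comparison is an exact string match, so a short name and its fully qualified form count as different hosts. That is deliberate here: the name a server presents has to match a listed entry, which is why the slides ask for all hostnames of the master to be listed.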
  160. 160. Troubleshooting: Access Issues • Checks • Check that the bpcd port is accessible from the master server. [root@bdhp4496:/root] # telnet bdhp4362 bpcd Trying... Connected to bdhp4362.na.pg.com. Escape character is '^]'.
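The telnet test above only proves that a TCP connection to the bpcd port can be opened, and that can be checked without an interactive session. This is a minimal sketch using Python's standard `socket` module. The function name `port_open` is hypothetical, and the default port 13782 is the port registered for bpcd; that default is an assumption about the site, so pass another port if the local services file differs or if connections go through vnetd (13724, as seen in the bptestbpcd output earlier).

```python
import socket

BPCD_PORT = 13782  # registered bpcd port; an assumption, adjust for the local setup

def port_open(host, port=BPCD_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

# Example call from the master server (host name taken from the slide above):
# port_open("bdhp4362")
```

A False result does not say why the connection failed. Follow up with the hostname resolution and SERVER entry checks to narrow it down.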
  161. 161. Troubleshooting: Access Issues • Host Properties Master
  162. 162. Troubleshooting: Return Codes • RC0 • RC5 • RC13 • RC25
  163. 163. Troubleshooting: Return Code 5 • RC5 the restore failed to recover the requested files • Checks • bp.conf SERVER entries • Ownership and permissions on directories
  164. 164. Troubleshooting: Return Code 13 • RC13 file read failed • Checks • I/O error reading from the file system. • Check the free space in the system. • Network Issues • Read of an incomplete or corrupt file.
  165. 165. Troubleshooting: Return Code 25 • RC25 cannot connect on socket • Checks • Hostname Resolution • Server Busy • Netstat
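For L1/L2 triage, the return codes named in this troubleshooting section can be kept in one lookup table. This is a minimal sketch: the code texts and slide groups are copied from the slides above, while the names `STATUS` and `describe` are hypothetical. Codes not covered by the slides are reported as such instead of being guessed.

```python
# code: (message text from the slides, slide group it appears under)
STATUS = {
    5: ("the restore failed to recover the requested files", "Return Codes"),
    13: ("file read failed", "Return Codes"),
    25: ("cannot connect on socket", "Return Codes"),
    40: ("network connection broken", "Networking Issues"),
    41: ("network connection timed out", "Networking Issues"),
    42: ("network read failed", "Networking Issues"),
    57: ("client connection refused", "Access Issues"),
    58: ("can't connect to client", "Access Issues"),
    59: ("access to the client was not allowed", "Access Issues"),
}

def describe(code):
    """Return a one-line summary for a return code, or a fallback for unlisted codes."""
    entry = STATUS.get(code)
    if entry is None:
        return f"RC{code}: not covered in this section"
    text, group = entry
    return f"RC{code}: {text} ({group})"

for code in (41, 59, 96):
    print(describe(code))
```

The group tells the operator which set of checks to start with: the networking slides for RC40 to RC42, and the bp.conf and bpcd port checks for RC57 to RC59.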
  166. 166. Question and Answer •Q&A
  167. 167. Thanks GCI Team
