
Troubleshooting Oracle RAC

    The majority of RAC issues are caused by one or more of the following:

    • Incorrect network configuration. Remember, the public IP addresses, VIPs and SCAN IPs must all be on the same public network. The private IPs must be on a different network to the public network. The public IPs and the private IPs must all be pingable prior to the installation.
    • Incorrect shared disk configuration. The voting disk and OCR location, as well as all the database files, need to be on shared storage for RAC to function properly. Any problems with the shared disk configuration will cause RAC to fail.
    • Missing prerequisites. There are a lot of prerequisites that must be completed before you can start a RAC installation. It may be tempting to miss steps out, but this will invariably cause problems. Make sure all prerequisites are met before starting the installation.
    • Insufficient available resources. This is especially common with virtual RAC installations. The minimum requirements of 11gR2 RAC are quite significant. Without some clever tricks to free up memory, you will need at least 4 GB of RAM per node to complete a fairly basic installation. Installing RAC on under-specced hardware can lead to unpredictable results.
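
    The network rule in the first bullet can be checked before installation with a little address arithmetic. The following is a minimal sketch, not part of any Oracle tooling: the function names are made up for illustration, and the addresses and the /24 prefix in the usage comments are hypothetical values that should be replaced with your own.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print the network address (as an integer) of IP $1 with prefix length $2.
network_of() (
    ip=$(ip_to_int "$1")
    mask=$(( (4294967295 << (32 - $2)) & 4294967295 ))
    echo $(( $ip & $mask ))
)

# Succeed if addresses $1 and $2 are in the same network for prefix length $3.
same_network() {
    [ "$(network_of "$1" "$3")" -eq "$(network_of "$2" "$3")" ]
}

# Hypothetical usage (addresses are examples only):
#   same_network 192.168.0.101 192.168.0.103 24 && echo "public IP and VIP share a network"
#   same_network 192.168.0.101 192.168.1.101 24 || echo "private IP is on a separate network"
```

    The public IPs, VIPs and SCAN IPs should all pass the first kind of check against each other, while each private IP should fail it against the public addresses.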

    crsctl

    Amongst other things, the crsctl command allows you to check the health of the cluster. The following command displays the top-level view of the cluster.

    # cd /u01/app/11.2.0/grid/bin
    
    # ./crsctl check cluster -all
    **************************************************************
    ol6-112-rac1:
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************
    ol6-112-rac2:
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************
    #
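
    On clusters with many nodes it can help to filter this output down to the lines that need attention. The sketch below assumes the layout shown above (a "nodename:" line followed by CRS-nnnn status lines) and treats any status line that does not end in "is online" as a problem. The function name offline_services is made up for this example, and the wording of the non-online messages is not assumed.

```shell
# Read "crsctl check cluster -all" output on stdin and print every CRS-nnnn
# status line that does not end in "is online", prefixed by its node name.
offline_services() {
    awk '
        # A single token ending in a colon is a node header, e.g. "ol6-112-rac1:"
        NF == 1 && $1 ~ /:$/ { node = substr($1, 1, length($1) - 1); next }
        # Status lines start with a CRS message code
        $1 ~ /^CRS-[0-9]+:$/ && $0 !~ /is online[ \t]*$/ {
            $1 = $1              # rebuild the line to strip leading whitespace
            print node ": " $0
        }
    '
}

# Usage:
#   ./crsctl check cluster -all | offline_services
```

    No output means every reported service on every node is online.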

    The following command gives information about the individual resources.

    # ./crsctl stat res -t
    --------------------------------------------------------------------------------
    NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
    --------------------------------------------------------------------------------
    Local Resources
    --------------------------------------------------------------------------------
    ora.DATA.dg
                   ONLINE  ONLINE       ol6-112-rac1                                 
                   ONLINE  ONLINE       ol6-112-rac2                                 
    ora.LISTENER.lsnr
                   ONLINE  ONLINE       ol6-112-rac1                                 
                   ONLINE  ONLINE       ol6-112-rac2                                 
    ora.asm
                   ONLINE  ONLINE       ol6-112-rac1             Started             
                   ONLINE  ONLINE       ol6-112-rac2             Started             
    ora.gsd
                   OFFLINE OFFLINE      ol6-112-rac1                                 
                   OFFLINE OFFLINE      ol6-112-rac2                                 
    ora.net1.network
                   ONLINE  ONLINE       ol6-112-rac1                                 
                   ONLINE  ONLINE       ol6-112-rac2                                 
    ora.ons
                   ONLINE  ONLINE       ol6-112-rac1                                 
                   ONLINE  ONLINE       ol6-112-rac2                                 
    ora.registry.acfs
                   ONLINE  ONLINE       ol6-112-rac1                                 
                   ONLINE  ONLINE       ol6-112-rac2                                 
    --------------------------------------------------------------------------------
    Cluster Resources
    --------------------------------------------------------------------------------
    ora.LISTENER_SCAN1.lsnr
          1        ONLINE  ONLINE       ol6-112-rac1                                 
    ora.LISTENER_SCAN2.lsnr
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.LISTENER_SCAN3.lsnr
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.cvu
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.oc4j
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.ol6-112-rac1.vip
          1        ONLINE  ONLINE       ol6-112-rac1                                 
    ora.ol6-112-rac2.vip
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.rac.db
          1        ONLINE  ONLINE       ol6-112-rac1             Open                
          2        ONLINE  ONLINE       ol6-112-rac2             Open                
    ora.scan1.vip
          1        ONLINE  ONLINE       ol6-112-rac1                                 
    ora.scan2.vip
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    ora.scan3.vip
          1        ONLINE  ONLINE       ol6-112-rac2                                 
    #
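
    In this listing, a resource is only a concern when its STATE differs from its TARGET. For example, ora.gsd is OFFLINE with a target of OFFLINE, so it is behaving as configured. The following sketch picks out the mismatches. It assumes the tabular layout shown above, and the function name state_mismatches is made up for this example. When a resource is offline the SERVER column may be blank, in which case the server field in the result will be empty or wrong.

```shell
# Read "crsctl stat res -t" output on stdin and print each resource
# instance whose STATE differs from its TARGET.
state_mismatches() {
    awk '
        # Resource name lines, e.g. "ora.rac.db"
        $1 ~ /^ora\./ { name = $1; next }
        # Local resources: TARGET STATE SERVER ...
        $1 == "ONLINE" || $1 == "OFFLINE" {
            if ($1 != $2) print name, $3, "target=" $1, "state=" $2
            next
        }
        # Cluster resources: instance number, then TARGET STATE SERVER ...
        $1 ~ /^[0-9]+$/ && ($2 == "ONLINE" || $2 == "OFFLINE") {
            if ($2 != $3) print name, $4, "target=" $2, "state=" $3
        }
    '
}

# Usage:
#   ./crsctl stat res -t | state_mismatches
```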

    olsnodes

    Run the olsnodes command on all cluster nodes and see that it returns a list of all the nodes in each case.

    # cd /u01/app/11.2.0/grid/bin
    
    # ./olsnodes
    ol6-112-rac1
    ol6-112-rac2
    #
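
    The comparison against the expected node list can be scripted. This is a minimal sketch: missing_nodes is a hypothetical helper, and the expected node names are passed in by you.

```shell
# Usage: ./olsnodes | missing_nodes node1 node2 ...
# Prints each expected node that is absent from the list on stdin.
# Returns non-zero if any expected node is missing.
missing_nodes() {
    mn_actual=$(cat)
    mn_status=0
    for mn_node in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string
        if ! printf '%s\n' "$mn_actual" | grep -qxF -- "$mn_node"; then
            echo "$mn_node"
            mn_status=1
        fi
    done
    return $mn_status
}

# Example, run on each node in turn:
#   ./olsnodes | missing_nodes ol6-112-rac1 ol6-112-rac2 || echo "node list is incomplete"
```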

    cluvfy

    You have probably run the runcluvfy.sh utility from the installation media before installing the clusterware software. Once the Oracle software is installed, the cluvfy utility is available to provide useful post-installation information. Use the “-help” flag for usage information.

    $ cluvfy stage -help
    
    USAGE:
    cluvfy stage {-pre|-post} <stage-name> <stage-specific options>  [-verbose]
    
    SYNTAX (for Stages):
    cluvfy stage -pre cfs -n <node_list> -s <storageID_list> [-verbose]
    cluvfy stage -pre 
                       crsinst -file <config_file> [-fixup [-fixupdir <fixup_dir>]] [-verbose]
                       crsinst -upgrade [-n <node_list>] [-rolling] -src_crshome <src_crshome> -dest_crshome <dest_crshome>
                               -dest_version <dest_version> [-fixup [-fixupdir <fixup_dir>]] [-verbose]
                       crsinst -n <node_list> [-r {10gR1|10gR2|11gR1|11gR2}]
                               [-c <ocr_location_list>] [-q <voting_disk_list>]
                               [-osdba <osdba_group>] [-orainv <orainventory_group>]
                               [-asm [-asmgrp <asmadmin_group>] [-asmdev <asm_device_list>]] [-crshome <crs_home>]
                               [-fixup [-fixupdir <fixup_dir>]] [-networks <network_list>]
                               [-verbose]
    cluvfy stage -pre acfscfg -n <node_list> [-asmdev <asm_device_list>] [-verbose]
    cluvfy stage -pre 
                       dbinst -n <node_list> [-r {10gR1|10gR2|11gR1|11gR2}] [-osdba <osdba_group>] [-d <oracle_home>]
                              [-fixup [-fixupdir <fixup_dir>]] [-verbose]
                       dbinst -upgrade -src_dbhome <src_dbhome> [-dbname <dbname-list>] -dest_dbhome <dest_dbhome> -dest_version <dest_version>
                              [-fixup [-fixupdir <fixup_dir>]] [-verbose]
    cluvfy stage -pre dbcfg -n <node_list> -d <oracle_home> [-fixup [-fixupdir <fixup_dir>]] [-verbose]
    cluvfy stage -pre hacfg [-osdba <osdba_group>] [-orainv <orainventory_group>] [-fixup [-fixupdir <fixup_dir>]] [-verbose]
    cluvfy stage -pre nodeadd -n <node_list> [-vip <vip_list>] [-fixup [-fixupdir <fixup_dir>]] [-verbose]
    cluvfy stage -post hwos -n <node_list> [-s <storageID_list>] [-verbose]
    cluvfy stage -post cfs -n <node_list> -f <file_system> [-verbose]
    cluvfy stage -post crsinst -n <node_list> [-verbose]
    cluvfy stage -post acfscfg -n <node_list> [-verbose]
    cluvfy stage -post hacfg [-verbose]
    cluvfy stage -post nodeadd -n <node_list> [-verbose]
    cluvfy stage -post nodedel -n <node_list> [-verbose]
    
    $

    Two examples are shown below.

    $ cluvfy stage -post crsinst -n ol6-112-rac1,ol6-112-rac2
    $ cluvfy stage -pre dbcfg -n ol6-112-rac1,ol6-112-rac2 -d /u01/app/oracle/product/11.2.0/db_1

    In all cases, check through the output and correct any errors produced.
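
    The -n argument in the examples above is a comma-separated node list, whereas olsnodes prints one node per line. A small helper can convert one into the other so the list does not have to be typed by hand. This is a sketch only: node_list is a made-up name, and it assumes olsnodes and cluvfy are on your PATH.

```shell
# Join the non-blank lines on stdin into a single comma-separated list.
node_list() {
    awk 'NF' | paste -sd, -
}

# Usage:
#   cluvfy stage -post crsinst -n "$(olsnodes | node_list)"
```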

    ORAchk

    Download the ORAchk zip file and install it by unzipping it into a new directory.

    $ mkdir orachk
    $ unzip -d orachk orachk.zip
    $ cd orachk
    $ ./orachk
    
    CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/12.1.0.1/grid?[y/n][y]
    
    Checking ssh user equivalency settings on all nodes in cluster
    
    Node ol6-121-rac2 is configured for ssh user equivalency for oracle user
     
    
    Searching for running databases
    
    List of running databases registered in OCR
    1. cdbrac
    2. None of above
    
    Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1].1
    
    Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
    -------------------------------------------------------------------------------------------------------
                                                     Oracle Stack Status                            
    -------------------------------------------------------------------------------------------------------
    Host Name  CRS Installed  ASM HOME       RDBMS Installed  CRS UP    ASM UP    RDBMS UP  DB Instance Name
    -------------------------------------------------------------------------------------------------------
    ol6-121-rac1 Yes             N/A             Yes             Yes        Yes      Yes      cdbrac1   
    ol6-121-rac2 Yes             N/A             Yes             Yes        Yes      Yes      cdbrac2   
    -------------------------------------------------------------------------------------------------------
    

    RACcheck

    $ unzip raccheck.zip
    $ cd raccheck
    $ chmod 755 raccheck
    $ ./raccheck -a
    
    CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/11.2.0/grid?[y/n][y]
    
    Checking ssh user equivalency settings on all nodes in cluster
    
    Node ol6-112-rac2 is configured for ssh user equivalency for oracle user
     
    
    Searching for running databases
    
    List of running databases registered in OCR
    1. RAC
    2. None
    
    Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1].
    
    
    Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
    -------------------------------------------------------------------------------------------------
                                                     Oracle Stack Status                            
    -------------------------------------------------------------------------------------------------
    Host Name  CRS Installed  ASM HOME       RDBMS Installed  CRS UP    ASM UP    RDBMS UP  DB Instance Name
    -------------------------------------------------------------------------------------------------
    ol6-112-rac1 Yes             Yes             Yes             Yes        Yes      Yes      RAC1      
    ol6-112-rac2 Yes             Yes             Yes             Yes        Yes      Yes      RAC2      
    ------------------------------------------------------------------------------------------------
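
    If the "Oracle Stack Status" table printed by ORAchk or RACcheck has been saved to a file, the rows can be scanned for hosts where part of the stack is not up. The sketch below assumes the eight-column row layout shown above and flags any value other than "Yes" in the three "UP" columns. The function name stack_down, the file name in the usage comment, and the assumption that the second column is always "Yes" or "No" are all illustrative rather than taken from the tools' documentation.

```shell
# Read the "Oracle Stack Status" table on stdin and print each host whose
# CRS UP, ASM UP or RDBMS UP column is anything other than "Yes".
stack_down() {
    awk '
        # Data rows have 8 fields: host, three install columns,
        # three UP columns, instance name
        NF == 8 && ($2 == "Yes" || $2 == "No") {
            if ($5 != "Yes" || $6 != "Yes" || $7 != "Yes")
                print $1, "crs=" $5, "asm=" $6, "rdbms=" $7
        }
    '
}

# Usage (stack_status.txt is a hypothetical file holding the saved table):
#   stack_down < stack_status.txt
```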

    Also See:

    Autonomous Health Framework

    Inspect Problems in Oracle RAC