Update 22/12/2009: After the first comment on this post, I now know that there is an easier way to deal with the problem.
How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation [ID 942166.1] (The note's last update is later than my post, so it might be related 🙂 )
Basically:
Step 1: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on all nodes except the last one.
Step 2: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" on the last node. This command will also zero out the OCR and voting disks.
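The two steps above can be sketched as a small dry-run helper that only prints which command belongs to which node. This is an illustration, not part of the Metalink note: the function name `deconfig_plan` is hypothetical, the default `GRID_HOME` is simply the path that appears in the logs later in this post, and the last argument is assumed to be the node you treat as the last one. Nothing here actually runs rootcrs.pl.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the deconfig command for each node, in order.
# Assumption: GRID_HOME defaults to the path seen in this post's logs.
GRID_HOME=${GRID_HOME:-/u01/grid/11.2.0}

deconfig_plan() {
    total=$#
    i=0
    for node in "$@"; do
        i=$((i + 1))
        cmd="$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force"
        # Only the final node in the list gets -lastnode.
        if [ "$i" -eq "$total" ]; then
            cmd="$cmd -lastnode"
        fi
        echo "$node: $cmd"
    done
}

deconfig_plan solarac1 solarac2
```

Each printed line still has to be run as root on the node it names.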
For the last 3 days I have been busy installing Oracle RAC on Solaris 10 x64 on VMware. I am planning to write detailed documentation, but first I want to describe an issue that I managed to solve during the installation.
During the Grid Infrastructure installation everything went fine until I ran the root.sh script for cluster configuration. The script failed with the error stack below (I truncated the part that worked):
# /u01/app/11.2.0/grid/root.sh
....
....
....
ASM created and started successfully.
DiskGroup DATA created successfully.
Errors in file :
ORA-27091: unable to queue I/O
ORA-15081: failed to submit an I/O operation to a disk
ORA-06512: at line 4
PROT-1: Failed to initialize ocrconfig
Command return code of 255 (65280) from command: /u01/grid/11.2.0/bin/ocrconfig -upgrade grid oinstall
Failed to create Oracle Cluster Registry configuration, rc 255
CRS-2500: Cannot stop resource 'ora.crsd' as it is not running
CRS-4000: Command Stop failed, or completed with errors.
Command return code of 1 (256) from command: /u01/grid/11.2.0/bin/crsctl stop resource ora.crsd -init
Stop of resource "ora.crsd -init" failed
Failed to stop CRSD
CRS-2673: Attempting to stop 'ora.asm' on 'solarac2'
CRS-2677: Stop of 'ora.asm' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'solarac2'
CRS-2677: Stop of 'ora.ctssd' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'solarac2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'solarac2'
CRS-2677: Stop of 'ora.cssd' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'solarac2'
CRS-2677: Stop of 'ora.gpnpd' on 'solarac2' succeeded
CRS-2679: Attempting to clean 'ora.gpnpd' on 'solarac2'
CRS-2681: Clean of 'ora.gpnpd' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'solarac2'
CRS-2677: Stop of 'ora.gipcd' on 'solarac2' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'solarac2'
CRS-2677: Stop of 'ora.mdnsd' on 'solarac2' succeeded
Initial cluster configuration failed. See /u01/grid/11.2.0/cfgtoollogs/crsconfig/rootcrs_solarac2.log for details
I tried to run root.sh again, which I shouldn't have done because the documentation says not to. (I have to confess that I did not read the installation guide carefully.)
This time the error stack was different:
# /u01/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
.........
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2009-12-06 22:57:05: Parsing the host name
2009-12-06 22:57:05: Checking for super user privileges
2009-12-06 22:57:05: User has super user privileges
Using configuration parameter file: /u01/11.2.0/grid/crs/install/crsconfig_params
CRS is already configured on this node for crshome=0
Cannot configure two CRS instances on the same cluster.
Please deconfigure before proceeding with the configuration of new home.
As you can see, it didn't allow me to re-run it. I needed to find a way to deconfigure the existing configuration. After a quick search of the official documentation I found the way here.
According to the documentation, all I needed to do was run the command below and then re-run root.sh:
/crs/install/rootcrs.pl -deconfig
Here is what happened when I ran the deconfigure:
2009-12-07 00:35:17: Parsing the host name
2009-12-07 00:35:17: Checking for super user privileges
2009-12-07 00:35:17: User has super user privileges
Using configuration parameter file: /u01/grid/11.2.0/crs/install/crsconfig_params
Oracle Clusterware stack is not active on this node
Restart the clusterware stack (use /u01/grid/11.2.0/bin/crsctl start crs) and retry
Failed to verify resources
It still wasn't working. I tried the force option, and it seemed to deconfigure successfully (maybe 🙂 ):
# /u01/grid/11.2.0/crs/install/rootcrs.pl -deconfig -force
2009-12-07 00:39:13: Parsing the host name
2009-12-07 00:39:13: Checking for super user privileges
2009-12-07 00:39:13: User has super user privileges
Using configuration parameter file: /u01/grid/11.2.0/crs/install/crsconfig_params
PRCR-1035 : Failed to look up CRS resource ora.cluster_vip.type for 1
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.eons is registered
Cannot communicate with crsd
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node
It says the stack was successfully deconfigured, but when I ran root.sh again I got this:
Disk Group DATA already exists. Cannot be created again
Configuration of ASM failed, see logs for details
Did not succssfully configure and start ASM
CRS-2500: Cannot stop resource 'ora.crsd' as it is not running
CRS-4000: Command Stop failed, or completed with errors.
Command return code of 1 (256) from command: /u01/grid/11.2.0/bin/crsctl stop resource ora.crsd -init
Stop of resource "ora.crsd -init" failed
Failed to stop CRSD
CRS-2500: Cannot stop resource 'ora.asm' as it is not running
CRS-4000: Command Stop failed, or completed with errors.
Command return code of 1 (256) from command: /u01/grid/11.2.0/bin/crsctl stop resource ora.asm -init
Stop of resource "ora.asm -init" failed
Failed to stop ASM
CRS-2673: Attempting to stop 'ora.ctssd' on 'solarac1'
CRS-2677: Stop of 'ora.ctssd' on 'solarac1' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'solarac1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'solarac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'solarac1'
CRS-2677: Stop of 'ora.cssd' on 'solarac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'solarac1'
CRS-2677: Stop of 'ora.gpnpd' on 'solarac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'solarac1'
CRS-2677: Stop of 'ora.gipcd' on 'solarac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'solarac1'
CRS-2677: Stop of 'ora.mdnsd' on 'solarac1' succeeded
Initial cluster configuration failed. See /u01/grid/11.2.0/cfgtoollogs/crsconfig/rootcrs_solarac2.log for details
The log file mentioned above says:
2009-12-07 00:43:26: Executing as grid: /u01/grid/11.2.0/bin/asmca -silent -diskGroupName DATA -diskList /dev/rdsk/c1t1d0s1,/dev/rdsk/c1t2d0s1,/dev/rdsk/c1t3d0s1,/dev/rdsk/c1t4d0s1 -redundancy EXTERNAL -configureLocalASM
2009-12-07 00:43:26: Running as user grid: /u01/grid/11.2.0/bin/asmca -silent -diskGroupName DATA -diskList /dev/rdsk/c1t1d0s1,/dev/rdsk/c1t2d0s1,/dev/rdsk/c1t3d0s1,/dev/rdsk/c1t4d0s1 -redundancy EXTERNAL -configureLocalASM
2009-12-07 00:43:26: Invoking "/u01/grid/11.2.0/bin/asmca -silent -diskGroupName DATA -diskList /dev/rdsk/c1t1d0s1,/dev/rdsk/c1t2d0s1,/dev/rdsk/c1t3d0s1,/dev/rdsk/c1t4d0s1 -redundancy EXTERNAL -configureLocalASM" as user "grid"
2009-12-07 00:43:30: Configuration of ASM failed, see logs for details
Basically, root.sh configures ASM with the asmca command. The asmca utility has no option to drop a disk group, which makes it unusable in this situation. (There is a deleteasm option, but it does not help here because it needs a working ASM instance, which wasn't possible after the failed root.sh.)
I didn't want to delete the whole CRS installation, so I needed a way to remove the disk group information from the ASM disks.
All I needed was the dd command to wipe the disk header information from the devices.
I had 4 disks presented for that disk group, so I ran dd against all of them. (I am not sure; maybe only the first device needed it. I need to check Julian Dyke's invaluable presentation on ASM internals.)
# dd if=/dev/zero of=/dev/rdsk/c1t2d0s1 bs=1024K count=100
dd: bad numeric argument: "1024K"
bash-3.00# dd if=/dev/zero of=/dev/rdsk/c1t2d0s1 bs=1k count=1000000
1000000+0 records in
1000000+0 records out
# dd if=/dev/zero of=/dev/rdsk/c1t1d0s1 bs=1k count=1000000
1000000+0 records in
1000000+0 records out
# dd if=/dev/zero of=/dev/rdsk/c1t3d0s1 bs=1k count=1000000
1000000+0 records in
1000000+0 records out
# dd if=/dev/zero of=/dev/rdsk/c1t4d0s1 bs=1k count=1000000
1000000+0 records in
1000000+0 records out
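The wipe above can be wrapped in a small loop. This is a minimal sketch rather than a tested procedure: the helper name `wipe_plan` and the `DRY_RUN` switch are hypothetical, the device list is the one from the asmca log, and the size simply mirrors the `bs=1k count=1000000` used above (note the lowercase `k`, since dd rejected `1024K`). By default it only prints the commands, because dd against the wrong device destroys data.

```shell
#!/usr/bin/env bash
# Sketch: zero the start of each disk so the old disk group header is gone.
# DRY_RUN=1 (the default here) prints the dd commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

wipe_plan() {
    for dev in "$@"; do
        cmd="dd if=/dev/zero of=$dev bs=1k count=1000000"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            # Destructive: overwrites roughly the first 1 GB of the device.
            $cmd
        fi
    done
}

wipe_plan /dev/rdsk/c1t1d0s1 /dev/rdsk/c1t2d0s1 /dev/rdsk/c1t3d0s1 /dev/rdsk/c1t4d0s1
```

Review the printed commands first, and set DRY_RUN=0 only once you are sure every device belongs to the failed disk group.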
After this wipe I re-ran the deconfigure script and then re-ran root.sh. Everything worked without any problem at all. The story will continue with How to install 11GR2 RAC on Solaris 10 on VMware (give me a bit more time to finish).
Footnote: A similar issue is reported on Metalink for Linux (ML 955550.1).
Sources used
Oracle® Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Solaris Operating System
How to use Files in place of Real Disk Devices for ASM – (Solaris) by Jeff Hunter
How to rerun root.sh during initial installation of GRID Infrastructure. by RACHELP
Please follow the metalink note 942166.1
How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation [ID 942166.1]
Thanks
CRS
Comment by CRS — December 22, 2009 @ 7:01 pm
I followed your steps and was able to complete the Grid Infrastructure installation from the failure point that you mention.
thanks a lot Coskan!!!
Comment by Kundu — July 17, 2010 @ 2:26 am
Brother, you are one of a kind. I ran into the same problem while installing on Linux, and the rootcrs.pl -deconfig -force
command worked like a charm for me.
Thanks.
Comment by Fikrat — July 21, 2010 @ 2:28 pm
Thanks, Coşkan. This was very useful to me.
Comment by Derya — July 29, 2010 @ 12:44 pm
My root.sh was successful during installation. I get “PRCR-1070 : Failed to check if resource ora.asm is registered” when restarting ASM on the nodes.
Our RAC/Grid and database were set up successfully. Due to a network outage (and a hardware upgrade), the nodes were rebooted. We are seeing this error after the reboot on one of the nodes (the first node).
How do I resolve it without deconfiguring the whole crs? I have created the database and created about 80 tablespaces totalling to about 13TB. Hence, I am reluctant to go back to square one.
Comment by Kanwar — December 10, 2010 @ 11:31 am
Greetings, master.
How can I do this with Exadata system ?
Comment by motodreamer — June 17, 2011 @ 1:55 pm
Honestly, if I knew I would tell you right away, but first I would need enough money to buy a quarter rack 🙂
The best thing is to ask Oracle Support.
Comment by coskan — June 17, 2011 @ 1:57 pm
Brother, it's not that much. You sort out the licenses and we will let you have it for 650 thousand dollars.
Besides, what is money anyway :-))
Take care.
Comment by motodreamer — June 17, 2011 @ 2:01 pm
Hi Coşkan,
What about Grid Infrastructure on a single-node server?
I used the command below:
/rootcrs.pl -verbose -deconfig -force
And it worked.
What would have happened if I had issued the command below? Would it also work?
rootcrs.pl -verbose -deconfig -force -lastnode
Thanks.
Derya
Comment by Derya — May 28, 2012 @ 6:39 pm
THANK YOU very much Mr. Coskan.
you made my day man.
Comment by msy — January 24, 2013 @ 1:28 pm