Oracle por Madrid

Tuesday, June 28, 2011

Otro error ORA-00600

ORA-600 [kkoipt:incorrect pwj].

Hay algunos errores ORA-00600 que pueden ser fácilmente encontrados en Google, y hay otros que son un poco más difíciles de encontrar. En este caso M.O.S. reporta este ORA600 como un error relacionado con un bug (9929660) en la plataforma 11.2 y que va a ser arreglado en la version 12.

Recientemente este error aparecio en una plataforma Windows 2003 con Oracle 11.1.0.7.0. Despues de decodificar el casi encriptado archivo de trace relacionado con este error y de diagnosticar los factores circundantes que pudieran causar el problema, encontre que este error estaba siendo originado por la carencia de procesos en la base de datos.

El error se arreglo en este caso en particular cuando se incremento el parametro de instancia PROCESSES.

He de hacer notar que la única autoridad que puede oficialmente emitir diagnósticos de errores ORA-00600 es los servicios de soporte técnico de Oracle Corp. La información emitida aqui tiene como único propósito el de compartir una experiencia personal que ocurrión en un ambiente muy específico.

Tuesday, October 12, 2010

Oracle Magazine Nov-Dec 2010

Deseo agradecer a Oracle Magazine la publicación de la entrevista en la sección Peer-To-Peer de este bimestre.

Ref. http://www.oracle.com/technetwork/issue-archive/2010/10-nov/o60peer-176064.html

"
Which new features in Oracle Database are you currently finding most valuable? Oracle Automatic Storage Management and the Volume Manager. Another nice feature is Secure Files, which improves performance, optimizes storage, and provides an additional security layer.

What advice do you have for those just getting into application development? When designing an application, seriously evaluate the amount of intelligence it’s going to manage. This will define the amount of coding, complexity, round-trips to the server, and scalability. And be aware that the application user is different from the big database user.

Tell us about your role with Oracle University (OU) and the value you see in this program. I’ve been a certified OU instructor for more than 15 years, and I really believe that an OU course is the best way to get acquainted with Oracle technology. The courses provide a good balance between theory and practice, and professionals can start being productive the day the course concludes.
"

Thursday, May 13, 2010

ORA-02019 : connection description for remote database not found

TRADUCCION EN PROGRESO ...

This error has been appearing on a radom base at the alert.log file. There is no apparent reason for this to appear. And when I take a look at the detailes provided on the trace file the only thing I see is the same error meaningless pattern, the connection descriptor was not found.
In this particular case the contents at the alert.log file showed this information:

Tue Mar 09 15:45:43 2010

Errors in file ...\trace\orapro_reco_4892.trc:

ORA-02019: connection description for remote database not found

From the generated trace file name, it states that it is the RECO background process the one responsible for this trace. RECO is used in distributed transactions, and it happens that when a database goes down RECO tries to resend the pending transactions according to the Two Phase Commit procedure.
In this case there it used to be a database attached to the main database, but it was gone long ago, however some non applied transactions remained in the transaction queue, making RECO to keep on retrying forever. The database was deconfigured and all the related files and file descriptors were removed.

By taking a look at the DBA_2PC_PENDING view there there were a couple of transactions that remained in the queue and which were never applied.

SELECT LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,MIXED,COMMIT#
FROM DBA_2PC_PENDING;

LOCAL_TRAN_ID GLOBAL_TRAN_ID STATE MIXED COMMIT#
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
9.21.102540 ORCL.ee36125b.9.21.102540 collecting no 9462810864
8.32.86528 ORCL.ee36125b.8.32.86528 collecting no 9462814781

In this case, and since there is no target database to apply the transactions to, then it is just enough to get rid of them by means of the PURGE_LOST_DB_ENTRY procedure:

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('8.32.86528');
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('9.21.102540');
commit;

That's it, this way the ORA-02019 error is gone.

Wednesday, May 12, 2010

Data Guard ORA-16789 Health Check Warning

TRADUCCION EN PROCESO

After setting up the Physical Dataguard environment some errors still showed up. Even though it looked fine and the Standby database was properly receiving and applying the archivelogs, the Dataguard Broker still showed some errors

DGMGRL> show configuration verbose;

Configuration
Name: SBDBCONFIG
Enabled: YES
Protection Mode: MaxPerformance
Databases:
PRIDB - Primary database
SBDB - Physical standby database

Fast-Start Failover: DISABLED

Current status for "SBDBCONFIG":
Warning: ORA-16608: one or more databases have warnings

Dataguard broker uses an evaluation criteria that is some sort of everything or nothing at all. Either it works fine, or it doesn't but there is no intermediate point. The issue here is that sometimes you have to be patience to find the errors. This is an 11g release, and though the solution found can work on 10g too you should be aware that after the DIAGNOSTIC_DEST parameter in 11g the path to find the log files has changed : ORACLE_BASE/diag/rdbms/ORACLE_SID/ORACLE_SID/trace/drcORACLE_SID.log, the errors displayed there were as follows:

Error Displayed

RSM Error: LogXptMode value 'ASYNC' of requires this database to have status redo logs, but they are not configured.
RSM0: HEALTH CHECK WARNING: ORA-16789: standby redo logs not configured
Operation CTL_GET_STATUS continuing with warning, status = ORA-16789

Solution

In this case it is pretty obvious why it is failing ... there are two solutions, one is to add the required Standby logfiles after the configured LogXptMode ASYNC mode, or change the LogXptMode to ARCH. I choosed the first one,

SQL> ALTER DATABASE ADD STANDBY LOGFILE GROUP 4
2 'C:\ORACLE\APP\ORADATA\SBDB\SBREDO04.RDO' SIZE 16M;
Database altered.

The command is repeated for group 5 and 6. The sizes and paths shown in this example are just for instructional purposes, on the actual production environment the size of the standby redo logs should be the same used for the production logs.

A Second Error Shows Up

The second error shown refers to the fact that since adding the standby redo log files required the recovery process to be stopped, the Data guard manager complained about it.

DGMGRL> show configuration verbose;

Configuration
Name: SBDBCONFIG
Enabled: YES
Protection Mode: MaxPerformance
Databases:
PRIDB - Primary database
SBDB - Physical standby database

Fast-Start Failover: DISABLED

Current status for "SBDBCONFIG":
Warning: ORA-16607: one or more databases have failed

This time the Dataguard logfile reports the following:
DMON: HEALTH CHECK ERROR: ORA-16766: Redo Apply is stopped
Operation CTL_GET_STATUS canceled during phase 1, error = ORA-16766

In this case it was pretty obvious too, the redo apply process was stopped to be able to add the Standby Database while the Active Data Guard was open.

Just restart the Recovery process and that's it.

Solution

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
Database altered.

DGMGRL> show configuration verbose;

Configuration
Name: SBDBCONFIG
Enabled: YES
Protection Mode: MaxPerformance
Databases:
PRIDB - Primary database
SBDB - Physical standby database

Fast-Start Failover: DISABLED

Current status for "SBDBCONFIG":
SUCCESS

It works as it is supposed to.

References

Oracle® Data Guard Broker
11g Release 2 (11.2)
Part Number E10702-01

Thursday, May 6, 2010

Error al actualizar el repositorio de RMAN a 11.1.0.7.0

En proceso de traduccion ...

WARNING.

The following post describes a particular issue I wanted to document and share, the solution found for this issue worked for my particular environment, which does not mean it will work with yours. Dealing with internal Oracle structures without Oracle Support Services approval and supervision will render your database unsupported and it may (and most probably will) compromise its availability.

When performing the upgrade of the recovery catalog database an error showed up. This error has to do with a known issue when upgrading the recovery catalog to 11.1.0.7.0 after applying the patchset against the database.
The upgrade procedure corrupts the DBMS_RCVCAT procedure, so it is enough to get rid of it and have the rman catalog owner to upgrade its catalog from an rman prompt, as follows:

1. Connected to a SQL*Plus prompt as the recovery catalog owner issue the following commands:
sqlplus rcatOwner/rcatPassword
SQL> drop package DBMS_RCVCAT;

2. From a Recovery Manager prompt issue the following command:
rman catalog rcatOwner/rcatPassword
RMAN> UPGRADE CATALOG;

recovery catalog owner is RMAN
enter UPGRADE CATALOG command again to confirm catalog upgrade

RMAN> UPGRADE CATALOG;

recovery catalog upgraded to version 11.01.00.07
DBMS_RCVMAN package upgraded to version 11.01.00.07
DBMS_RCVCAT package upgraded to version 11.01.00.07

At this moment the procedure has successfully upgraded the repository and the DBMS_RCVCAT stored unit has been properly rebuilt.

BUT ... Yes, the big BUT. In my case a particular issue arose. Instead of reading the above successful message this is what showed up instead:

RMAN> connect catalog rman/rman
connected to recovery catalog database
recovery catalog is partially upgraded to 11.01.00.07; UPGRADE CATALOG again

RMAN> upgrade catalog;
recovery catalog owner is RMAN
enter UPGRADE CATALOG command again to confirm catalog upgrade

RMAN> upgrade catalog;
error creating upgcat_57
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-06004: ORACLE error from recovery catalog database: ORA-02293: cannot validate (RMAN.CDF_C_STATUS) - check constraint violated

I hate whe RMAN complains. And it likes to pretty often.

RMAN.CDF_C_STATUS stands for a constraint that is added to the CDF table. This table stores information about the registered datafile copies. This constraint is defined in the ?/rdbms/admin/recover.bsq file and it literally reads as follows:
<<< define upgcat_57
>>>
alter table cdf add constraint cdf_c_status check (status in ('A','U','D','X'))

Notice the clause define upgcat_57 and the error message above, error creating upgcat_57.
Originally the CDF table is created with an offline constraint that can be found in the same file:

CONSTRAINT cdf_c_status CHECK (status in ('A','U','D','X','F')).

But when the upgcat_57 step is executed it magically leaves the 'F' value out.
In my particular case the CDF table, had four registered datafile copies, these copies were already obsoleted, but they were there any way with a 'F' flag. Since the add constraint applied against the repository table it crashed and made the upgrade procedure abort.
The workaround found for this particular case was to update the CDF table and change the 'F' flag in the status column with a 'U' value.
This renders the backups as Unavailable. The flag values are not documented in the file, but as far as I know the 'A' value means Available, meanwhile 'U' stands for unavailable.
After manually updating the column value the procedure to upgrade the repository was run once again from the RMAN prompt, this time it was successful.
In my particular case these backups were taken a very long time ago, and those are no longer available, so there it was no problem at all in my case taking this direction.
However a big issue that comes to my attention is that Oracle either has a bug in this upgrade procedure or it forgot to update the constraint declaration with the proper value set.

Update ...

While I was finding out if this issue could be reproduced on 11g Rel. 2 i found at the recover.bsq file the following lines:

define upgcat_166
<<<>>>

define upgcat_167
<<<>
CHECK (status in ('A','U','D','X','F')) >>>

So it means that someone in Oracle realized about this mistake, then they dropped the current constraint and then add the constraint with all the flags there included.
I can see that if someone has not hit this bug on 11g then most probably they will not realize about it by the time they migrate to 11g Rel. 2.

Monday, May 3, 2010

ORA-42012: error occurred while completing the redefinition

Al momento de intentar utilizar el paquete DBMS_REDEFINITION para redefinir una partición en línea apareció el error descrito a continuación.

dbms_redefinition.finish_redef_table('MADRID', 'a_table', 'an_int_table', 'a_partition');

El siguiente error apareció:

begin
*
ERROR at line 1:
ORA-42012: error occurred while completing the redefinition
ORA-04020: deadlock detected while trying to lock object MADRID.TABLE1_INT
ORA-06512: at "SYS.DBMS_REDEFINITION", line 78
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1680
ORA-06512: at line 2

Cuando intenté eliminar la tabla internia para reiniciar la redefinición en línea del objeto, este procedimiento tuvo un conflicto con un par de objetos creados al vuelo.

SQL> drop table table1_int;
drop table table1_int
*
ERROR at line 1:
ORA-12083: must use DROP MATERIALIZED VIEW to drop "MADRID"."TABLE1_INT"

SQL> DROP MATERIALIZED VIEW "MADRID"."TABLE1_INT";
Materialized view dropped.

SQL> drop table table1_int;
Table dropped.

Este error se debe a un bug en la version 11.2.0 en cualquier plataforma de sistema operativo. Este bug se va a corregir en futuras versiones, entre tanto la manera de rodear el problema es cambiando un parámetro de instancias, solo hay que poner deferred_segment_creation a FALSE y reintentar la operación.

Friday, January 29, 2010

Problema de Performance con mi Estacion de Trabajo

Mi computadora recientemente ha estado muerta, ha habido varios procesos que han consumido una cantidad impresionante de recursos, hasta el limite de dejarla practicamente congelada. Hay un alto consumo de CPU, I/O y casi siempre se encontraba en limites del 100%, y no podia mas porque solo llega al 100. Ciertamente era una locura poder trabajar con una computadora tan lenta. Tal parece que entre mas grande la computadora, mas consumidores de recursos los procesos. Yo uso mi laptop como servidor, de hecho tiene los fierros para poder hacer esta tarea, y en esta tengo instalado VMWare con lo que puedo crear maquinas virtuales para ambientes de prueba. Uso Windows XP, siip, en lo personal no me gusta Windows Vista ni las nuevas actualizaciones, con excepcion del Windows 2003 y Windows 2008.

Uno de los procesos que encontre eran altos consumidores de recursos era el JQS.EXE, un proceso que corre con baja prioridad en background y que consume una gran cantidad de recursos de I/O, este proceso es necesario para las aplicaciones basadas en java a efecto de que puedan lanzarse mas rapido, y solo vale la pena tenerlo si constantemente se estan lanzando estas aplicaciones, de otro modo el consumo de I/O no vale la pena. Ahora, para deshabilitarlo, esto se hace desde el panel de control, alli hay un icono, la clásica taza de cafe. Al levantar el control de java aparece una seccion de avanzado. En esta seccion hay una caja de chequeo. Hay que desmarcarla y listo, el proceso JQS.EXE desaparece unos segundos mas tarde de la lista de tareas.

CCSCHST.EXE es otro proceso de peso completo de los que me quiero deshacer. Este proceso crea fuertes cuellos de botella a nivel de CPU y Entrada/Salida, o sea un muy fuerte cuello de botella. Tengo instalado Norton Antivirus, y de alli es de donde sale este proceso. Ciertamente no fue una buena elección, e NAV es muy ineficiente en cuanto a la administración de recursos y puede volver mortalmente lenta a la maquina. Dicen que esto se debe, entre otras cosas, a un conflicto entre el updater de Windows y el de Norton. Así que procedí a deshabilitar las características de actualizaciones automáticas de windows, ahora aparece una alerta roja en la barra de windows, lo que me hace recordar que debo de manualmente buscar y aplicar las actualizaciones yo mismo. También deshabilité el LiveUpdate de Norton, esto lo hice desde el panel de control, Norton crea un ícono en el panel de control de windows y con un par de clicks es fácil encontrar la opción para deshabilitar el live update. Esta es la referencia en la que me basé para anular las actualizaciones: Norton vs. Microsoft.

Otra aplicación muy bonita y práctica que tengo es el escritorio de The Weather Channel, esta aplicación me mantiene informado de alertas del clima, lo que me es particularmente importante sobre todo en esta época del año donde uno nunca sabe con las tormentas de nieve, hielo y otros imponderables del clima. Sin embargo, al momento de iniciar windows la cantidad de recursos consumidos es exageradamente alta, tanto en I/O como en CPU, así que tuve que deshabilitarla del inicio automático del sistema y revisar las alertas directamente en la página de The Weather Channel.

También tengo instalado VMWare, pero mucho de mi trabajo tiene que ver con pruebas en ambientes controlados y en donde en caso de que algo salga mal lo peor que suceda es que restaure otra semilla del conjunto de maquinas virtuales que tengo para ese efecto. Asi que no hay mucha opcion más que dejarle recursos a esta aplicación.

Y finalmente FireFox, me encanta FireFox y en lo personal lo prefiero mil veces al Internet Explorer, pero no sé que pasó en las últimas liberaciones del producto, particularmente desde el 3.x a la fecha, se ha robustecido bastante, tiene mas funcionalidad y es mas vistoso, pero ahora es una aplicación que consume mas recursos en CPU. De cualquier manera, después de haber adelgazado mi Estación de Trabajo quedaron suficientes recursos para mantener a este adorable navegador de Internet, así que este lo conservo y lo excluyo de la lista de desinstalación