In-Doubt Transactions and ORA-02049 timeout distributed transaction waiting for lock

Distributed transactions perform DML on multiple databases. If any issue occurs during this transaction such as network or system problems etc. The transaction becomes in-doubt. Normally, Oracle’s RECO background process handles this issue. It recovers from failures in distributed transactions. However, in some cases it cannot achieve. Because of that, some errors like ORA-02049 might be occur.

Distributed transactions can become in-doubt in the following ways:

  • A server system running Oracle Database software crashes
  • A network connection between two or more Oracle Databases involved in distributed processing is disconnected
  • An unhandled software error occurs

When such errors are encountered, in order to see waiting transactions you can select to DBA_2PC_PENDING table.

SQL> SELECT LOCAL_TRAN_ID, GLOBAL_TRAN_ID, STATE, MIXED, COMMIT# FROM DBA_2PC_PENDING;

237.21.7515652	CORE.0d9f9683.882.14.323447	forced rollback	no	40514891501

So, 237.21.7515652 is the transaction id of the distributed transaction. In the above example, it is shown as “forced rollback”. However, If the state of the transaction is “prepeared” then you can force to rollback or commit like below.

ROLLBACK FORCE '237.21.7515652';
OR
COMMIT FORCE '237.21.7515652';

if the command hangs, look at the stuck transactions section below.

If you get ORA-02058 use the following command to purge the transaction then check the DBA_2PC_PENDING that the transaction has gone.

SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('237.21.7515652'); 

PL/SQL procedure successfully completed.

The DBA_2PC_PENDING is the lookup view. The actual transaction entry view is X$KTUNE that stands for [K]ernel [T]ransaction [U]ndo Transa[X]tion [E]ntry. In addition, the concatenation of KTUXEUSN, KTUXESLT and KTUXESQN columns gives us the transaction number.

SELECT * FROM X$KTUXE 
WHERE KTUXEUSN=237
AND KTUXESLT=21
AND KTUXESQN =7515652;

No rows returned.
X$KTUNE

Overcome Stuck Transactions

There are some cases in stuck transactions.

Case 1: DBA_2PC_PENDING view have entries but there is not any transaction in reality

If there is not any transaction in X$KTUXE view. However, you see entries in DBA_2PC_PENDING view. You can run the following command.

SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('237.21.7515652'); 

PL/SQL procedure successfully completed.

EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY command does not delete the records in X$KTUXE view. It manipulates the dictionary views. In addition, we have to delete manually, if the state of the transaction is prepared.

SQL> DELETE FROM SYS.PENDING_TRANS$ WHERE LOCAL_TRAN_ID = '237.21.7515652';

SQL> DELETE FROM SYS.PENDING_SESSIONS$ WHERE LOCAL_TRAN_ID ='237.21.7515652' ;

SQL> DELETE FROM SYS.PENDING_SUB_SESSIONS$ WHERE LOCAL_TRAN_ID = '237.21.7515652';

SQL> COMMIT;

Case 2: DBA_2PC_PENDING view does not have entries however there is a transaction

SQL> ROLLBACK FORCE '237.21.7515652' 
ORA-02058: no prepared transaction found with ID 237.21.7515652

In order to recover the transaction, you need to insert dummy records into dictionary tables and try to force a rollback or commit. In addition, no need to change the insert command except transaction id.

SQL> ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;

SQL> INSERT INTO PENDING_TRANS$ (LOCAL_TRAN_ID, GLOBAL_TRAN_FMT, GLOBAL_ORACLE_ID, STATE, STATUS, SESSION_VECTOR, RECO_VECTOR, TYPE#, FAIL_TIME, RECO_TIME)
VALUES('237.21.7515652', 306206, 'XXXXXXX.12345.1.2.3', 'prepared', 'P', HEXTORAW('00000001'), HEXTORAW('00000000'), 0, SYSDATE, SYSDATE);

SQL> INSERT INTO PENDING_SESSIONS$ VALUES('237.21.7515652', 1, HEXTORAW('05004F003A1500000104'), 'C', 0, 30258592, '', 146);

COMMIT;
SQL> ROLLBACK FORCE '237.21.7515652' 
or
SQL> COMMIT FORCE '237.21.7515652' 
SQL> ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;
SQL> ALTER SESSION SET "_smu_debug_mode" = 4;
SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('237.21.7515652'); 
SQL> ALTER SESSION SET "_smu_debug_mode" = 0;

Finally, let’s look at X$KTUXE view whether the transaction has gone.

SELECT * FROM X$KTUXE 
WHERE KTUXEUSN=237
AND KTUXESLT=21
AND KTUXESQN =7515652;

No rows returned.

Case 3: DBA_2PC_PENDING has entry and a transaction exists however commit or rollback hangs

Below command will give an error.

SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('237.21.7515652'); 

ERROR at line 1:
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at "SYS.DBMS_TRANSACTION", line 94
ORA-06512: at line 1

As shown in the following code snippet, what we are going to do is to delete the dictionary records then insert dummy records and force commit. Finally, purge the transaction.

SQL> DELETE FROM SYS.PENDING_TRANS$ WHERE LOCAL_TRAN_ID = '237.21.7515652'; 

SQL> DELETE FROM SYS.PENDING_SESSIONS$ WHERE LOCAL_TRAN_ID ='237.21.7515652'; 

SQL> DELETE FROM SYS.PENDING_SUB_SESSIONS$ WHERE LOCAL_TRAN_ID = '237.21.7515652'; 

SQL> COMMIT;

SQL> ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;

SQL> INSERT INTO PENDING_TRANS$ (LOCAL_TRAN_ID, GLOBAL_TRAN_FMT, GLOBAL_ORACLE_ID,STATE, STATUS, SESSION_VECTOR, RECO_VECTOR, TYPE#, FAIL_TIME,RECO_TIME)
VALUES ('237.21.7515652', 306206, 'XXXXXXX.12345.1.2.3', 'prepared','P', hextoraw( '00000001' ), hextoraw( '00000000' ),0, sysdate, sysdate);
SQL> INSERT INTO PENDING_SESSIONS$
VALUES('237.21.7515652', 1, hextoraw('05004F003A1500000104'),'C', 0, 30258592, '',146);

COMMIT;
SQL> COMMIT FORCE '237.21.7515652';
SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('237.21.7515652');
SQL> ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;
Posted in DBA
Leave a Reply

Your email address will not be published. Required fields are marked *