Sunday, July 12, 2009

DB2 -913 Abend

-913 UNSUCCESSFUL EXECUTION CAUSED BY DEADLOCK OR TIMEOUT. REASON CODE
reason-code, TYPE OF RESOURCE resource-type, AND RESOURCE NAME
resource-name

Explanation: The application was the victim in a deadlock or experienced
a timeout. The reason code indicates whether a deadlock or timeout
occurred.

Refer to message DSNT500I under "DSNT... Messages" in topic 3.14 for an
explanation of 'resource-type' and 'resource-name'. Refer to Table 3 in
Appendix B, "Problem determination" in topic APPENDIX1.2 for an
explanation of resource type codes.

System Action: The SQL statement cannot be executed. If the SQL statement
being executed was a cursor FETCH, DB2 closes the cursor.

SQLERRD(3) also contains the reason-code which indicates whether a
deadlock or timeout occurred. The most common reason codes are:

- 00C90088 - deadlock

- 00C9008E - timeout

Programmer Response: The application should either commit or roll back to
the previous COMMIT. Then, generally, the application should terminate.
See message DSNT376I for possible ways to avoid future deadlocks or
timeouts.

For more information about how CICS and TSO handle deadlocks, see Section
5 of Application Programming and SQL Guide.

Destination: 57033
-PROBLEM_DETERMI


No Time For Timeouts

The performance of a DB2 subsystem depends upon the interaction of many
different factors. However, there is an almost universal acknowledgment
that locking can and does have one of the more significant impacts.

It is important to remember that the job receiving the -911 SQLCODE
is most likely NOT the source of the problem. The source is usually the
job that continues to run. There are five factors to examine in resolving
the -911 conflict. These factors include:

* The lock size parameter that was specified at tablespace/index creation.

* The duration of the locks established by the BIND strategies.
The ACQUIRE and RELEASE strategies for the tablespaces and the
ISOLATION perameters for the page and subpage locks.

* The scope of the COMMIT in the application code.

* The manner in which the tablespace was started (RW or RO).

Most of the time, resource contention and timeouts are due to contention
on indexes, not data. The contention on indexes is something that is
relatively common.

Here are some suggestions to maximize concurrency and reduce the -911
problem:

* Create tables with LOCKSIZE=PAGE to improve concurrency and prevent
automatic lock escalation.

* Try to have the number of RIDs in a clustering index subpage equal to
the number of rows in a data page to limit the scope of a lock on a
leaf page.

* Set the subpage in the index equal to 16, to limit the scope of the
leaf page locking for TABLES WITH HEAVY UPDATE ACTIVITY to the index
columns.

* Start the table in RO for read only tables (rather than RW), and set
the locksize to PAGE. This will maintain flexibility while allowing
DB2 to override the BIND strategies and provide table locking.

* Limit the scope of SQL statements so that a statement locks 150 or
fewer pages on a table or indexspace for each unit of work that is
performed.

* Limit the scope of SQL statements so that the amount of work that is
performd can be completed before the MVS system swaps out the Allied
Address space due to the exaustion of service units as established
by the system programmers. If you're swapped out, you continue to hold
DB2 locks.

No comments:

Post a Comment