Java Developer's Journal Feature: "Deadlocks in J2EE"
Most non-trivial applications involve high degrees of concurrency and many layers of abstraction

Most non-trivial applications involve high degrees of concurrency and many layers of abstraction. Concurrency is associated with resource contention and an increase in deadlock conditions. The multiple layers of abstraction make it more difficult to isolate and fix the deadlock conditions.

Generally, a deadlock happens when two or more concurrent threads of execution each hold a resource and request another resource. Since a thread can't continue until it acquires the resource it requested, we say that thread is blocked; if each thread is blocked on a resource that is held by another thread in the same group, we say the group of threads is deadlocked.

In this article, we'll discuss two broad categories of deadlocks that occur in typical non-trivial J2EE applications: "simple" database deadlocks and cross-resource deadlocks. Although the discussion is based on J2EE, it also applies to other technology platforms.

Database Deadlocks
In a database, one connection can block another if it's holding database locks needed by the other. If two or more connections are blocking each other, none of them can proceed, and they're deadlocked.

Database deadlocks can be tricky because the locks involved often aren't explicit. Typically, updating a row implicitly requires locking that row, doing the update, and then releasing the lock when the enclosing transaction is committed or rolled back. Depending on the database platform, the configured isolation level, and any query hints, the acquired lock can be fine- or coarse-grained and block or not block other queries on the same row, table, or database.

The locks acquired can depend on the internally generated query plan, which can change over time as the data size and distribution change, so a query that acquires one set of locks in one environment may try to acquire a completely different set of locks in another. The database is also free to escalate its locks when necessary - instead of locking 10 rows on the same data page, it may choose to lock the entire page, which may block reads or writes to rows that don't need to be locked.

Depending on the database schema, a read or a write can require traversing or updating multiple indexes, validating constraints, executing triggers, etc., each of which can introduce more locks. Further, other applications may be hitting some of the same database schema objects and acquiring locks differently than your application ever does.

All of these factors conspire to make it practically impossible to eliminate the possibility of database deadlocks. Fortunately, database deadlocks are generally recoverable: the database will notice the deadlock, forcibly kill one of the connections (typically the one that has done the least work), and roll back its transaction. This releases all of the locks associated with the terminated transaction, which should allow at least one of the other connection(s) to acquire the locks on which they were blocking.

Because of this typical deadlock-handling behavior, it often suffices simply to retry the entire transaction in the case of a database deadlock. When a database connection is killed, an exception is emitted that can be caught by the application and identified as a database deadlock condition. If that deadlock exception is allowed to propagate out to the layer of code that initiated the transaction, that code can simply start a new transaction and redo all of the earlier work. For this strategy to be correct, the code inside the transaction must have no side-effects until after the transaction has been successfully committed. Note: You'll want to put a limit on the number of times you retry or else a particularly deadlock-prone piece of code may loop forever.
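The retry strategy above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class and method names (DeadlockRetry, runWithRetry, looksLikeDeadlock) are made up for this example, and the SQLState check is an assumption. SQLState class "40" means "transaction rollback" in the SQL standard, but not every database or driver reports deadlocks that way, so check what yours actually emits.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

public class DeadlockRetry {

    /**
     * Runs one complete transaction per attempt. If the attempt fails with
     * an exception that the predicate classifies as a deadlock, the whole
     * transaction is run again, up to maxAttempts times in total.
     */
    public static <T> T runWithRetry(Callable<T> transaction,
                                     Predicate<Exception> isDeadlock,
                                     int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return transaction.call();
            } catch (Exception e) {
                // Give up on non-deadlock failures and when the limit is hit.
                if (!isDeadlock.test(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Otherwise the database has rolled us back; loop and redo the work.
            }
        }
    }

    /**
     * ASSUMPTION: the driver reports a deadlock as a SQLException whose
     * SQLState starts with "40". Replace with your platform's real signal.
     */
    public static boolean looksLikeDeadlock(Exception e) {
        if (!(e instanceof SQLException)) {
            return false;
        }
        String state = ((SQLException) e).getSQLState();
        return state != null && state.startsWith("40");
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated transaction: the first two attempts are "deadlock victims".
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("simulated deadlock", "40001");
            }
            return "committed";
        }, DeadlockRetry::looksLikeDeadlock, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The Callable must contain the entire transaction, from begin to commit, and must have no side-effects outside the database until the commit succeeds; otherwise a retry can repeat those side-effects.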

This feels a bit clunky - if something goes wrong, we just try again. However, because of the freedom the database has in the locks that it acquires, it's nearly impossible to guarantee that two threads of execution can't cause a database deadlock. At least this approach guarantees that the application behaves correctly in the rare case that the deadlock happens, and is far less clunky than asking your users to retry the operation.

In a J2EE application, developers can set an EJB call to use either bean-managed transactions (BMT), where the developer specifically starts and commits or rolls back the transaction, or container-managed transactions (CMT), where the application server starts a transaction before calling the method and commits or rolls back the transaction after the method completes. It would be nice if the EJB vendors supplied a retry-on-deadlock parameter that would do this automatically with container-managed transactions. Without this automated feature, developers end up forcing EJB calls to use bean-managed transactions just to be able to retry on deadlocks. (One of the disadvantages to making your EJB calls use bean-managed transactions is that it's not obvious how to get the same semantics as a container-managed transaction with "RequiresNew" or "NotSupported." See www.onjava.com/pub/a/onjava/2005/07/20/transactions.html for how it can be done.)

The specifics of how often you'll run into deadlocks and which locks will block other threads depend a lot on your database platform, hardware, database schema, and queries. In databases that use lock-based concurrency control, like MSSQL, uncommitted writes can block reads and uncommitted reads can block writes, which makes them more deadlock-prone. In MVCC (multi-version concurrency control) databases like Oracle, uncommitted writes won't block reads - the read will simply see the old version of the row. That can introduce other problems but doesn't create as many deadlock opportunities. Familiarize yourself with these database locking schemes and be aware of which type you are using.

There are a number of good references on how to find, fix, and avoid database deadlocks, but none of them will eliminate the possibility of deadlocks.

Cross-Resource Deadlocks
When your deadlock condition isn't completely contained within a database, it can be much more difficult to track down. Since the database is aware of the locks held and requested, it can detect deadlocks that are entirely contained in a database; also, because database transactions provide a nice boundary of what things should and shouldn't be atomic, the database can simply roll back a transaction to recover from a deadlock. Deadlocks that are in other environments, like the JVM, or deadlocks that span environments can be more dangerous because the environment can't (or doesn't) detect them and try to recover. Worse, these deadlocks can have a compounding effect - if two threads are deadlocked while holding some set of resources, any other thread that tries to access one of those resources also becomes blocked, along with any resources that thread has already acquired. Often, these deadlocks can be difficult to track down, but some familiarity with the general patterns usually helps to identify and fix the problem.

There are a few questions to ask when an environment gets into a suspected deadlock condition. The answers to these questions will indicate which of the following scenarios you're dealing with, if any, and give more detail about how to fix the underlying problem. Some of the key things to ask are:

  1. Which threads are involved and what are their call stacks? This can take some detailed analysis to separate the threads that are actually deadlocked from those that are simply blocked by the deadlocked threads.
  2. Does this deadlock happen consistently in a particular code path (every time a particular operation is performed), or is it dependent on two or more code paths executing at the same time?
  3. What database connections are involved? What database locks does each connection hold, and what database locks is each connection trying to acquire? Which JVM thread does each database connection correspond to?
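For the first question, the JVM can report part of the answer itself. The sketch below uses the standard java.lang.management API to list threads that are deadlocked on Java monitors or java.util.concurrent locks, along with their call stacks. The class name JvmDeadlockReport is made up for this example. Note the limitation: this only sees cycles that are entirely inside the JVM, so it will report nothing for a deadlock that runs through a database lock or an empty connection pool, and a full thread dump is still needed in those cases.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class JvmDeadlockReport {

    /** Describes any JVM-level deadlock cycles, or says that none were found. */
    public static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked on monitors or
        // ownable synchronizers (such as ReentrantLock).
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return "no JVM-level deadlock detected";
        }
        StringBuilder sb = new StringBuilder();
        // true, true: also include locked monitors and locked synchronizers.
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```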

The sections below illustrate three commonly occurring cross-resource deadlock scenarios.

Cross-Resource Deadlock #1: Pool Exhaustion with Escalating Clients
The first deadlock scenario we'll look at happens only under load, when a resource pool is too small and each thread needs more than the available resources from the pool. For example, consider an EJB call that uses a database connection, then makes a nested EJB call that uses a separate database connection from the same connection pool. This will happen if the nested EJB call is declared as "RequiresNew," for example.

Under normal load, or with a sufficiently sized connection pool, the EJB call will get a database connection from the pool, and then call the nested EJB. The nested EJB call will get another database connection from the pool, commit the inner transaction, and return the connection to the pool. The outer EJB call will then commit its transaction, and return its connection to the pool.

However, suppose the connection pool has a maximum size of 10 connections, and there are 10 concurrent calls to the outer EJB. Each of those threads acquires a database connection, emptying the pool. Now, they each try to make the nested EJB call, which needs to acquire a second database connection. None of them can proceed, and none of them will give up their first database connection, so all 10 threads are deadlocked.
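The scenario can be reproduced without a database or an application server. In the sketch below, a Semaphore stands in for the connection pool and each worker thread stands in for one outer EJB call; all names are made up for this illustration. A real pool would typically block indefinitely on the second acquire, so the sketch uses a timed tryAcquire to show the starvation without actually hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolExhaustionDemo {

    /** Returns how many callers could not get their second "connection". */
    public static int run(int poolSize, int callers) throws InterruptedException {
        if (callers < 1 || poolSize < callers) {
            throw new IllegalArgumentException("demo needs poolSize >= callers >= 1");
        }
        Semaphore pool = new Semaphore(poolSize);   // one permit = one connection
        CountDownLatch allHoldOne = new CountDownLatch(callers);
        CountDownLatch allAttempted = new CountDownLatch(callers);
        AtomicInteger starved = new AtomicInteger();
        ExecutorService exec = Executors.newFixedThreadPool(callers);

        for (int i = 0; i < callers; i++) {
            exec.execute(() -> {
                try {
                    pool.acquire();                 // outer call takes a connection
                    try {
                        allHoldOne.countDown();
                        allHoldOne.await();         // wait until every caller holds one
                        // The nested call needs a second connection from the same pool.
                        if (pool.tryAcquire(200, TimeUnit.MILLISECONDS)) {
                            pool.release();         // nested call finished
                        } else {
                            starved.incrementAndGet();
                        }
                        allAttempted.countDown();
                        allAttempted.await();       // nobody gives up the first one early
                    } finally {
                        pool.release();             // outer call returns its connection
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        exec.shutdown();
        exec.awaitTermination(10, TimeUnit.SECONDS);
        return starved.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("starved callers: " + run(10, 10));
    }
}
```

With a pool of 10 and 10 callers, every caller fails to get its second connection. With one extra connection in the pool, the callers should be able to take turns using the spare one and all complete.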

When investigating a deadlock of this type, you'll see a large number of threads in your thread dump waiting to acquire resources, and the same number of active database connections, all idle and unblocked. If you can inspect the connection pool at runtime while the application is deadlocked, you should be able to verify that it's actually empty.

The fix for a deadlock of this type is either to increase the size of the connection pool or to refactor the code so that a single thread doesn't require as many database connections at the same time. If the maximum number of database connections required by a single thread is M, and the maximum number of possible concurrent calls is N, the minimum number of connections required in the pool to prevent this problem is (N*(M-1))+1. Alternatively, you could set up the inner EJB call to use a different connection pool, so that even if the outer call's connection pool is empty, the inner calls will be able to proceed using their own connection pool.
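The reasoning behind the formula is that in the worst case each of the N threads holds M-1 connections and is waiting for its last one; a single additional connection lets one thread finish and return its connections, which unblocks the rest. As a small worked example (the class and method names are made up for this illustration):

```java
public class PoolSizing {

    /**
     * Minimum pool size that rules out pool-exhaustion deadlock:
     * (N * (M - 1)) + 1, where N is the maximum number of concurrent calls
     * and M is the maximum number of connections one thread holds at once.
     */
    public static int minPoolSize(int maxConcurrentCalls, int maxConnectionsPerThread) {
        if (maxConcurrentCalls < 1 || maxConnectionsPerThread < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return maxConcurrentCalls * (maxConnectionsPerThread - 1) + 1;
    }

    public static void main(String[] args) {
        // The example from the text: 10 concurrent outer calls, each needing
        // 2 connections (its own plus the nested call's).
        System.out.println(minPoolSize(10, 2)); // prints 11
    }
}
```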

Cross-Resource Deadlock #2: Single-Thread, Multiple Conflicting Database Connections
The second cross-resource deadlock scenario can also arise when making nested EJB calls on the same thread, although this case typically happens even in systems that aren't under load. Just as in the example above, the two EJB calls use different connections to the same database. Since the caller can't continue until the nested call completes, the caller's database connection is effectively blocked by the nested call's database connection, although the database isn't aware of this relationship. If the first (outer) connection has acquired a database lock that the second (inner) connection needs, the second connection will block indefinitely waiting for the first connection to be committed or rolled back, and a deadlock arises. Since the database isn't aware of the relationship between the two connections, it won't detect this as a deadlock.
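The shape of this deadlock can be shown on a single thread without a database. In the sketch below, which uses made-up names and is only a simulation, a one-permit Semaphore stands in for a database row lock. A Semaphore is used rather than a ReentrantLock because it has no notion of an owning thread, just as the database has no idea that both connections belong to the same Java thread. A real database would wait indefinitely at the inner step; the sketch gives up after a short timeout so that it terminates.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class NestedConnectionDeadlockDemo {

    // One permit = an exclusive lock on one database row (simulated).
    static final Semaphore rowLock = new Semaphore(1);

    /** The nested call: a second connection that needs the same row lock. */
    static boolean innerUpdate() throws InterruptedException {
        boolean acquired = rowLock.tryAcquire(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            rowLock.release();  // inner transaction commits
        }
        return acquired;
    }

    /** The outer call: locks the row, then makes the nested call before committing. */
    static boolean outerCall() throws InterruptedException {
        rowLock.acquire();      // outer connection updates the row and holds the lock
        try {
            // The outer transaction can't commit until this returns, and this
            // can't proceed until the outer transaction commits.
            return innerUpdate();
        } finally {
            rowLock.release();  // outer transaction commits or rolls back
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inner update got the lock: " + outerCall());
    }
}
```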

As a concrete example, consider a data-loading EJB call. This EJB call takes in a large object and persists it to the database in several stages. As it performs the data-load, it updates a separate table that records the state of the pending data-load operation. We'd like the state-update to be visible immediately, but we don't want the loaded data to be visible in an incomplete state, so the state-update is done with a call to a "RequiresNew" EJB. Roughly, our (flawed) data-load method looks like the code in Listing 1.


About Michael Nonemacher
Michael Nonemacher is a lead software engineer for Lombardi Software. He has worked with Java since 1997, focusing on server-side database interaction and concurrency in Web-based enterprise applications.

Reader Feedback

You mention:
"It would be nice if the EJB vendors supplied a retry-on-deadlock parameter that would do this automatically with container-managed transactions. Without this automated feature, developers end up forcing EJB calls to use bean-managed transactions just to be able to retry on deadlocks. (One of the disadvantages to making your EJB calls use bean-managed transactions is that it's not obvious how to get the same semantics as a container-managed transaction with "RequiresNew" ...)"

Is BMT strictly required to retry? Suppose all business methods pass through an initial method with no transaction attributes. This method, front_bizmethod(), calls the actual business method, which of course has a transaction attribute of Required, RequiresNew, etc.

public void front_bizmethod()
{
    int retry_count = 0;
    boolean retry;
    do
    {
        // Reset on every pass; otherwise one failure followed by a
        // success would keep the loop calling bizmethod() again.
        retry = false;
        try
        {
            bizmethod();
        }
        catch (RetryException re)
        {
            retry = true;
            retry_count++;
        }
    } while (retry && retry_count < 10);
}

/** Business method -- has transaction attribute
 * Required or RequiresNew
 */
public void bizmethod() throws RetryException
{
    // Do business calls and throw a RetryException
    // if a deadlock was detected by the database
}

What I mean is that there is a way around it by coding an extra method.

What are your thoughts on this?

Thanks for this great article, Mike. I wasn't able to make it to JavaOne this year so I missed your BOF at JavaOne that looks like it was talking about this stuff (BOF-0534). Are there any slides from that that I can get ahold of?

With the power and simplicity that CMT (container managed transactions) brings, is there really no simple way to automatically handle database deadlock by retrying the transaction according to some given parameters (ie: number of retries, back off time, etc)? That would really be unfortunate.

Here are a few possible solutions, but they all seem sub-optimal for various reasons. (I would like to stay within the spec (EJB3/JEE5), but I'm not adamant on this.)

1) JBoss has a proprietary extension for this: (org.jboss.ejb.plugins.TxRetryExceptionHandler) However, besides being outside the spec, this implementation does not allow me to easily keep track of any state regarding the retry (ie: retry count numbers, data for back off algorithms, etc.).

2) I can use the EJB3 Interceptor spec, but this is really cumbersome because it ties in *after* the CMT stuff has already been set up.

3) I'm sure I could write a jboss-aop interceptor that would do the right thing here, but again it would be outside the spec. Has anyone done this already? I certainly don't want to reinvent the wheel on this one.

4) I could switch from CMT (container managed transactions) to BMT (bean managed transactions). This would give me more control over the transaction endpoints but then I would have to give up all the niceties that CMT gives me.

I really would like a nice solution for this...
