concurrency+transactions issue

Hi all,

I’m trying to get around a problem I’m having with a concurrent trigger. Here is a simplified scenario: assume we have records with an ID, a name and a score. eg, (1, John, 100).

The trigger receives documents in very quick succession constantly updating the scores of multiple people. On the first instance of seeing their name, I need to insert them into the DB, or otherwise update their score.

My flow’s pseudocode is like this:

pub.art.transaction:startTransaction
personExists = JDBC adapter service select where name=john
if !personExists then
    insert the incoming record into the database
else
    update john's score in the database
end if
pub.art.transaction:commitTransaction

Essentially this works, except sometimes two johns end up inserted into the database. The obvious cause is since the trigger is concurrent, the select executed twice before the first insert occurred.

The select is done with a JDBC adapter service, insert and update likewise, and they all use the same JDBC connection (SQL server) set to LOCAL_TRANSACTION. I have not set the name on the startTransaction, but have used its output for the commit.

I’m sure theres some gap in my understanding here. I thought this would have been quite straightforward (ie, by wrapping the process in a transaction I was under the impression I’d be protected from a scenario like this)

The select-then-insert-if-not-present-else-update approach is not safe in a concurrent environment. This is because the select then insert/update steps are not atomic and do not block other threads from inserting a record–e.g. the transaction boundary is not a lock.

First, you’ll need a primary key on the data you’re inserting/updating.

Then, change the logic to do this:

insert
if primary key constraint violated error
then update

The downside is that this “upsert” approach is noisy in the logs–the PK exception is looged. But it guarantees that only one insert will succeed. A way to avoid the log noise is to make this a stored proc and call that from IS.

HTH

Thanks for the reply Rob. I was really hoping to have a slightly more elegant wM based solution - is there no way to have a synchronised block in wM? I was under the impression thats partly the *transaction services are for.

A sync block of code could help–if this IS instance is the only thing writing to the table. If multiple machines (IS or otherwise) are writing then a sync block doesn’t provide a solution.

To get a sync code block you’d need to use a Java service, which could use doInvoke to call the adapter services.

Search the web for atomic upsert approaches. You may find a technique, other than the one I mentioned, that might be more to your liking.

Thanks again Rob. How about multiple IS nodes? If a Java service has a synchronised block, do multiple nodes observe the block as “one” block (I assume not, as each IS is under a different JVM?)

I’m surprised the IS doesn’t have a native built-in solution! :rolleyes:

Your assumption is correct. Synchronized blocks are available only within a single run-time environment (true for Java, C/C++/C#, etc.)

I’m not aware of any app server (although IS isn’t really an app server like WLS, WS, others it’s kinda close) that provides any support for this. But I may be wrong. I’m surprised databases don’t provide atomic upsert functionality–it’s a very common need.

I believe this is what we need: http://en.wikipedia.org/wiki/Merge_(SQL)

I have in my notes that the merge statement is not atomic either. It is a mechanism to simplify and make more efficient the select-if-none-insert-else-update approach. I don’t recall where I read about this behavior but I do recall that someone stated that it still results in separate select, insert/update steps and is still vulnerable to race conditions. E.g. 2 threads/processes do select at the same time and both think they can do insert–only one will succeed.

Be sure to research this and test it well before committing to using it as a concurrent-safe approach.

Interesting. Unfortunately we’re only on SQL Server 2000 or 2005, so I’ve gone the stored proc path for now!