One of the many important missing pieces of ADO.Net is the inability to do the batch SQL. ADO.Net 2.0 patched in a cheesy batch update support for DataSet only so that when the number of changed rows exceeded certain limit, you get a database exception, not sure about SQL Server but this is at least true for both build in Oracle client and ODP.Net. Bear in mind that this is a pure Microsoft's design flaw by missing out this important feature in their API. It is NOT Oracle's problem. Oracle batch SQL has been working perfectly fine in JDBC world for many many years. In this post, I'm going to share with you a way to make it work with ODP.Net in one use case scenario.
Our project uses Spring.Net framework. In the data access layer, we make mixed use of NHibernate and Spring ADO support. Thus all my code here are based on Spring.Net ADO, if you are not familiar with Spring.Net, you can at least read next section so that 1) there is information for about ODP.Net batch in general; and 2) you get to know some of many benefits that Spring.Net brings.
I have been watching code like below in our project again and again. It loops through a list of value objects and calls AdoTemplate one at a time. This yielded extremely slow performance. In one scenario, it inserted 717 rows in 7.9s.
Obviously, we need to batch those SQL to database for optimal performance. But after looking around the Spring.Net's API as well as ADO.Net's API, to my surprise, there is pretty much nothing out there to help in this situation.
Eventually, I started to look into the extensions that ODP.Net provides. Thanks to Oracle, ODP.Net does provide two ways to execute batch SQL with limitations. One way is to bind the .Net array to PL/SQL associative arrays in stored procedure or PL/SQL block. You execute the stored procedure once to have all the values in the array send to database. Another way is to bind .Net arrays to regular SQL command parameters, you still execute the command once, but ODP.Net automatically create a prepared statement, loop through the arrays, execute the prepared statement multiple times but sending them all in a batch. Both has limitation that you cannot mix different SQL commands in one batch. But it is still a perfect solution to my problem. In the example above, we do execute the exactly same SQL many times in the for loop.
Extension Method to AdoTemplate
While the solution is promising, certainly I don't want to see the ODP.Net proprietary extension and type casting code all over the data access layer. The best is to have Spring.Net API to support this but unfortunately I cannot wait for it to come true and there doesn't seem to be any plan for that. NHibernate has batch support to SQL Server database by hacking the ADO.Net API, but support for Oracle database is still missing. May be they can consider to use of ODP.Net feature I mentioned above.
This is where the extension method come to handy. Hey, we can extend the AdoTemplate or may be better IAdoOperations. Let's extends it with the method below.
I'm not going to explain the first two parameters which are common to all the ExecuteNonQuery methods. The data is a collection of value object that we are going to us as batch SQL parameters. The Converter delegate takes one value object and translate it to Spring's IDbParameters object that can be used to execute the SQL command.
The implementation takes a generic approach by using a factory pattern, so that it can be used with any other database that can provide batch support. When the extension method is called with an implementation of IAdoOperations that also implements the IBatchExecutorFactory interface, it makes use of the factory to execute the SQL in batch, otherwise it falls back to non-batch by looping through the collection and execute the command one by one. The full class can be found here.
And here is the interfaces for IBatchExecutorFactory and IBatchExecutor.
By now, we have extended the Spring.Net framework to support the use of batch SQL. What's left to do is to provide an implementation of IAdoOperations that uses ODP.Net batch SQL feature.
The Batch Executor for ODP.Net
To simplify the task, I decided to let the OracleOdpTemplate inherit from AdoTemplate. The class itself is straightforward, it provides a property for user to set the batch size and defaulted to 100. And it implements the IBatchExecutorFactory interface so that the extension method can detect it and obtain the batch executor from it.
The real worker is the inner class OracleOdpTemplate.BatchExecutor. Basically, it loop through all the data in the collection, calls the converter delegate to get the binding parameters for each value object, accumulate the parameter values in corresponding arrays. When accumulated rows reached the batch size, it flushes them to database using ODP.Net array binding. See the code snippet below.
Putting Things Together
Everything is ready so let's rewrite the example code we had in the beginning of the this post by using the extension method. We create a converter that sets the parameter values for each data object we need to insert then call ExecuteNonQuery extension method once with the list and converter.
In addition to this, we also need to tell Spring.Net to inject the batch capable version of AdoTemplate. So we replace the AdoTemplate in the Spring.Net's configuration file with OracleOdpTemplate.
Now run the test again, the same amount of rows are inserted in 0.7s with batch size of 100. That's ten times faster!
Source Code and Binary
You can find both the specific revision of source code I used in this post and binary in Google Code project. You can also get the latest source code of the SpringExtension project that this piece of function belongs to.
(4/17/2009) I have been struggling to write the unit test for this. It turned out the Spring.Net's AdoTemplate implementation doesn't use virtual. This makes extending the framework a little difficult, especially for writing unit tests.