Program It: April 2009

Monday, April 27, 2009

How to Populate ODP.Net ClientId When Spring.Net ADO is in Use

This post is long overdue but here we go.

One nice benefit of using Spring.Net's abstraction for ADO.Net data access is that I no longer need to write boilerplate code or worry about leaking connections when someone forgets to close a connection object. The problem it brought was that I could no longer set the CLIENT_IDENTIFIER in the Oracle database, because Spring.Net is now responsible for opening and closing the connection.

With the help of Mark Pollack, I started with a wrapper around IDbProvider. Spring already provides DelegatingDbProvider (kudos to the Spring.Net team), so this was indeed very easy.

    public class CurrentPrincipleToOracleClientIdDbProvider : DelegatingDbProvider
    {
        public override IDbConnection CreateConnection()
        {
            OracleConnection conn = (OracleConnection) TargetDbProvider.CreateConnection();
            conn.ClientId = Thread.CurrentPrincipal.Identity.Name;
            return conn;
        }
    }

with the configuration below.

  <object id="DbProvider" type="Example.CurrentPrincipleToOracleClientIdDbProvider">
    <property name="TargetDBProvider" ref="TargetDbProvider"/>
  </object>
  
  <db:provider id="TargetDbProvider" provider="OracleODP-2.0" connectionString="${ConnectionString}"/>

Well, it didn't work: you cannot set the ClientId when the connection is not open. Fine, let's open it.

            OracleConnection conn = (OracleConnection) TargetDbProvider.CreateConnection();
            conn.Open();
            conn.ClientId = Thread.CurrentPrincipal.Identity.Name;

Nope, that doesn't work either. Although the CLIENT_IDENTIFIER was set correctly this time, the Spring.Net framework code threw an exception complaining that the connection was already open when it tried to open it. OK, now I know that IDbProvider.CreateConnection() works differently from Java's DataSource.getConnection().

Stuck? After digging around the members of the OracleConnection class, I realized that it inherits from DbConnection, which has an event called StateChange, and Reflector shows that OracleConnection actually raises it. Great, let's add an event handler.

    public class CurrentPrincipleToOracleClientIdDbProvider : DelegatingDbProvider
    {
        public override IDbConnection CreateConnection()
        {
            OracleConnection conn = (OracleConnection) TargetDbProvider.CreateConnection();
            conn.StateChange += StateChangeEventHandler;
            return conn;
        }

        private void StateChangeEventHandler(object sender, StateChangeEventArgs e)
        {
            if(e.OriginalState == ConnectionState.Closed && e.CurrentState == ConnectionState.Open)
            {
                OracleConnection conn = (OracleConnection)sender;
                conn.ClientId = Thread.CurrentPrincipal.Identity.Name;
            }
        }
    }

Now it works!
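The handler can be tried out without an Oracle database. The sketch below is only an illustration: FakeConnection is a hypothetical stand-in for OracleConnection (its ClientId is a plain property), and it assumes the provider raises StateChange through DbConnection.OnStateChange when the connection is opened.

```csharp
using System;
using System.Data;
using System.Data.Common;
using System.Security.Principal;
using System.Threading;

// Hypothetical stand-in for OracleConnection: just enough of DbConnection
// to raise StateChange when the connection is opened or closed.
public class FakeConnection : DbConnection
{
    private ConnectionState _state = ConnectionState.Closed;

    public string ClientId { get; set; }
    public override string ConnectionString { get; set; }
    public override string Database { get { return "fake"; } }
    public override string DataSource { get { return "fake"; } }
    public override string ServerVersion { get { return "0"; } }
    public override ConnectionState State { get { return _state; } }

    public override void Open() { ChangeState(ConnectionState.Open); }
    public override void Close() { ChangeState(ConnectionState.Closed); }
    public override void ChangeDatabase(string databaseName) { }

    protected override DbTransaction BeginDbTransaction(IsolationLevel isolationLevel)
    {
        throw new NotSupportedException();
    }

    protected override DbCommand CreateDbCommand()
    {
        throw new NotSupportedException();
    }

    private void ChangeState(ConnectionState next)
    {
        ConnectionState previous = _state;
        _state = next;
        OnStateChange(new StateChangeEventArgs(previous, next));
    }
}

public static class ClientIdDemo
{
    // Same shape as the handler in the provider above, applied to the fake.
    public static void SetClientIdOnOpen(object sender, StateChangeEventArgs e)
    {
        if (e.OriginalState == ConnectionState.Closed && e.CurrentState == ConnectionState.Open)
        {
            ((FakeConnection)sender).ClientId = Thread.CurrentPrincipal.Identity.Name;
        }
    }

    // Plays the role of the framework: it opens the connection some time
    // after the provider created it and attached the handler.
    public static string OpenAs(string userName)
    {
        Thread.CurrentPrincipal = new GenericPrincipal(new GenericIdentity(userName), new string[0]);
        FakeConnection conn = new FakeConnection();
        conn.StateChange += SetClientIdOnOpen;
        conn.Open();
        return conn.ClientId;
    }
}
```

Note that the handler reads Thread.CurrentPrincipal at the moment the connection is opened, on whichever thread opens it, not when CreateConnection() was called.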

Update: If you use Spring.Net, here is an easy way out.

Let Oracle Know the Real User of ADO.Net application When Using Connection Pool

When using connection pooling, all connections are made with a fixed user name, so in the database it is difficult to tell which real user is making an update. I could pass the user name to every stored procedure, but that becomes extremely tedious with multiple levels of stored procedure calls, and it still won't work for triggers.

The solution to this problem is to keep the user name in some sort of database session state. For an Oracle database, that is the CLIENT_IDENTIFIER attribute in the USERENV namespace of SYS_CONTEXT. You can set it by calling the built-in package procedure dbms_session.SET_IDENTIFIER and retrieve it with SYS_CONTEXT('USERENV', 'CLIENT_IDENTIFIER').

ODP.Net provides a convenient property, OracleConnection.ClientId, for this. It also resets ClientId automatically before the connection is returned to the pool.

More information about this topic can be found here.
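To check from the .Net side that the identifier actually reached the session, the SYS_CONTEXT query above can be run on the same connection. This is a minimal sketch, not part of the original code: the class and method names are made up, it only uses the provider-neutral System.Data interfaces, and it assumes the connection passed in is already open against an Oracle database.

```csharp
using System;
using System.Data;

public static class ClientIdentifierProbe
{
    // Returns the CLIENT_IDENTIFIER visible to the current Oracle session,
    // or null when none has been set. The connection must already be open.
    public static string Read(IDbConnection openConnection)
    {
        using (IDbCommand cmd = openConnection.CreateCommand())
        {
            cmd.CommandText = "SELECT SYS_CONTEXT('USERENV', 'CLIENT_IDENTIFIER') FROM DUAL";
            object value = cmd.ExecuteScalar();
            return (value == null || value is DBNull) ? null : value.ToString();
        }
    }
}
```

The same query works inside a trigger or stored procedure, which is where the identifier is usually consumed.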

The same technique can be used with any database that provides some kind of session storage, and in many databases a temporary table can serve the same purpose.

For projects that are required to run on different databases, a custom stored procedure can be used for this purpose, and the implementation can vary by database. Actually, even in the case of Oracle, we ended up using a package because:

We can set more information than just CLIENT_IDENTIFIER, for example the name of the application, the client machine name, and so on.
We cache that information in package variables for fast access. A package variable is 100 times faster than SYS_CONTEXT('USERENV', 'CLIENT_IDENTIFIER'), which is about the same speed as the built-in function USER (we cache USER in a package variable as well).

In my next post, I'll discuss how to set the ClientId when Spring.Net ADO support is used to manage the connections.

Saturday, April 25, 2009

Converting Oracle Interval Data Type to Seconds

I have been searching for a way to convert the Oracle INTERVAL data type to a number of seconds. To my double surprise: a) although I know Oracle is a database full of holes and patches, I was still surprised that it doesn't provide such a basic function; b) this time Google failed to find a solution for such a common request. Tom actually had to write a sum function for intervals. The closest fit for my need is this one, which uses the technique below to get seconds.

(TRUNC(SYSDATE) + the_interval - TRUNC(SYSDATE)) * 86400 AS seconds

But that has two major problems:

~~It is rare but can happen that the first sysdate returns day1 23:59 and second sysdate returns day2 00:00. Of course this gives you completely wrong result.~~ See update below.
It loses the fraction of seconds.

After a few tries, I settled on the expression below:

(SYSDATE + the_interval*86400 - SYSDATE) AS seconds

This returns an accurate result with fractions preserved. ~~There is a chance that the 2nd sysdate is one second later than the first sysdate, but that only changes the result by 1/86400 second, which I don't really care about at this moment.~~

~~The sysdate can actually be replaced with any date column, variable or to_date('1-jan-1900') to eliminate the 1/86400 second error.~~

Update (4/28/09): Shammat made a very good point that SYSDATE is read-consistent inside a single SQL statement. Thus my comment about "off by one day" was wrong. But that consistency does not hold in PL/SQL, so please make sure to assign SYSDATE to a variable and then use that variable instead.
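Why the expression yields seconds: subtracting two Oracle DATE values returns a number of days, so adding the interval scaled by 86400 to a date and subtracting the same date gives 86400 times the interval in days, which is the interval in seconds. The sketch below replays that arithmetic with .Net's DateTime and TimeSpan as stand-ins for DATE and INTERVAL. It is an analogy to check the reasoning, not Oracle code, and the names are made up.

```csharp
using System;

public static class IntervalSeconds
{
    // Mirrors (d + the_interval*86400 - d): scale the interval by 86400,
    // add it to an arbitrary anchor date, subtract the anchor again, and
    // read the difference in days, the unit that DATE subtraction returns.
    // In this sketch the scaled value must still fit in DateTime's range,
    // so it only handles intervals up to a few weeks.
    public static double ViaDateArithmetic(TimeSpan interval)
    {
        DateTime anchor = new DateTime(1900, 1, 1);
        TimeSpan scaled = TimeSpan.FromTicks(interval.Ticks * 86400);
        return ((anchor + scaled) - anchor).TotalDays;
    }
}
```

For an interval of 1 hour, 2 minutes and 3.5 seconds, the method returns about 3723.5, the same as TimeSpan.TotalSeconds, with the half second preserved.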

Common.Logging for .Net and Its Use in Unit Test

Background of Common.Logging

Commons Logging was widely adopted by the majority of projects in the Java community. It is an ultra-thin layer between your code and different logging implementations.

A library that uses the commons-logging API can be used with any logging implementation at runtime.
An application can also benefit from the ability to change to a different logging system when requirements change, for example when a company-wide logging policy changes.
It doesn't force you to configure anything, so it is very friendly to unit testing: if you are not testing your logging code, nothing needs to be done.

Two years ago, when I had to choose a .Net logging system for my project, I was lucky to find that the netcommon project has the Common.Logging library, which does exactly the same. Since then it has been my favorite logging choice for .Net libraries and applications.

DevIL's Diary has an interesting blog post comparing Common.Logging and System.Diagnostics Trace. It makes me feel good about the decision I made two years ago :)

Why unit test logging code

In addition to making the coverage report look better, there are actually a few important reasons why I need to test logging code.

Logging is like an assertion: it has side effects (about Java and .Net). So I need to test my code both when logging is on and when it is off.
I need to ensure that important information, at warning level and above, actually gets logged.
I need to test the logic that suspends repeated warnings (see example below).

else if(!_isWrongWrappedReaderTypeWarningGiven && _log.IsWarnEnabled)
{
    _log.Warn(String.Format(
        "Expected original reader to be {0} but got {1}. " + 
        "Note: warning is suspended for subsequent repeated events.", 
        typeof(OracleDataReader).FullName, unwrappedReader.GetType().FullName));
    _isWrongWrappedReaderTypeWarningGiven = true;
}

In memory sink for unit testing

To unit test logging code, I need an in-memory sink to store logging events so I can verify against it. I had been using the log4net MemoryAppender for a while, until recently I came to believe that my life could be better with something other than MemoryAppender, because with it:

I had to configure both log4net and Common.Logging for my unit tests
My unit tests were strongly coupled to log4net
The unit test code did not look clean at all

That drove me to write the InMemoryLoggerFactoryAdaptor for Common.Logging. It is available in the download area. The example below is a test case from OdpNetDataReaderWrapperTests:

        [Test] public void GiveWarningOnceOnlyWhenNonOdpReaderIsEncountered()
        {
            var factory = (InMemoryLoggerFactoryAdaptor) LogManager.Adapter;
            InMemoryLogger logger = factory.GetInMemoryLogger(typeof (OdpNetDataReaderWrapper));
            logger.Level = LogLevel.Warn;
            _mockery.ReplayAll();
            _testee.WrappedReader = _wrapped;
            _testee.RowsExpected = 2313;
            _testee.RowsExpected = 948;
            var query = from entry in logger.LogEntries
                        where entry.Message.ToString().Contains(
                            "Expected original reader to be " + 
                            typeof (OracleDataReader).FullName)
                        select entry;
            Assert.That(query.Count(), Is.EqualTo(1));
            Assert.That(query.First().LogLevel, Is.EqualTo(LogLevel.Warn));
            _mockery.VerifyAll();
        }

In a real situation, the factory and logger should be initialized in the test setup and the log entries should be cleared in the teardown. But the example here shows how easily I can verify that a particular warning was given, and given only once, with the help of LINQ (if you are still writing code targeting .Net 2.0 just like me, check out LinqBridge).
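The idea behind an in-memory sink and the warn-once logic can be shown without any logging library. The sketch below is a simplified illustration with hypothetical types (SketchLevel, InMemorySink, ReaderChecker). It is not the Common.Logging API, nor the InMemoryLoggerFactoryAdaptor from the download area.

```csharp
using System;
using System.Collections.Generic;

public enum SketchLevel { Debug, Info, Warn, Error, Off }

// Hypothetical minimal sink: keeps warnings in a list so a test can inspect them.
public class InMemorySink
{
    public SketchLevel Level = SketchLevel.Debug;
    public readonly List<string> Warnings = new List<string>();

    public bool IsWarnEnabled { get { return Level <= SketchLevel.Warn; } }

    public void Warn(string message)
    {
        if (IsWarnEnabled) Warnings.Add(message);
    }
}

public class ReaderChecker
{
    private readonly InMemorySink _log;
    private bool _warningGiven;

    public ReaderChecker(InMemorySink log) { _log = log; }

    // Warns the first time an unexpected type shows up, then stays quiet.
    public void Check(Type expected, object actual)
    {
        if (expected.IsInstanceOfType(actual)) return;
        if (!_warningGiven && _log.IsWarnEnabled)
        {
            string actualName = actual == null ? "null" : actual.GetType().FullName;
            _log.Warn(String.Format(
                "Expected {0} but got {1}. Further warnings are suspended.",
                expected.FullName, actualName));
            _warningGiven = true;
        }
    }
}
```

A test can then run the checker twice against a sink at Warn level and assert that exactly one entry was recorded, and run it again against a sink whose level is Off to cover the logging-disabled path. As in the snippet earlier in this post, the flag is only set when the warning is actually written, so the warning is still given once if logging is enabled later.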

Thursday, April 16, 2009

Rhino Mocks Strikes to Mock Non-Virtual Interface Implementation

Rhino Mocks has been my favorite mocking framework for a long time now and I'm more than a happy user. Today I ran into a test scenario that seemed straightforward, but no matter how I struggled with it, Rhino Mocks refused to do its work.

It is a non-virtual method on a class that I need to mock. OK, OK, I hear you! Rhino Mocks cannot override a non-virtual method, and declaring it new is meaningless as you can never reference the mock class directly. But the class implements an interface. "Then mock the interface!" I hear you again. Well, a) I don't want to mock every method, or I would like to CallOriginalMethod; b) the class I want to mock is the class I want to test...

Totally confused? Let's jump to the code that will explain it better.

Let's say there is an interface and an implementation that I don't own

    public interface IDoNotOwn {
        int DirtyWork(int x);
        int Outer(int y);
    }

    public class DoNotOwn : IDoNotOwn {
        public int DirtyWork(int x) {
            // Does something that is hard to set up in a unit test.
            throw new Exception("Don't call me in unit test");
        }

        public int Outer(int y) {
            return DirtyWork(y + y);
        }
    }

Unfortunately, it is out of my control that the DoNotOwn implementation didn't declare its methods virtual ("many thanks" to Microsoft for the thoughtful default!).

I need to write my own implementation of IDoNotOwn, but I don't want to rewrite the complex logic in DirtyWork, which works perfectly fine. So naturally I have MyClass inherit from DoNotOwn.

    public class MyClass : DoNotOwn, IDoNotOwn {
        public new int Outer(int y) {
            return ((IDoNotOwn)this).DirtyWork(y*y);
        }
    }

So far everything looks normal and should just work. Let's write a unit test for MyClass. I want to mock the call to DirtyWork. Since my class implements an interface, I hoped Rhino Mocks would be able to do that for me. My first attempt was easy to understand but didn't work.

    [TestFixture] public class MyClassTest {
        [Test] public void UsingRhinoMocks() {
            const int workResult = 293848;
            MockRepository mockery = new MockRepository();
            IDoNotOwn o = mockery.CreateMock<MyClass>();
            Expect.Call(o.DirtyWork(4)).Return(workResult);
            mockery.ReplayAll();
            Assert.That(o.Outer(2), Is.EqualTo(workResult));
            mockery.VerifyAll();
        }
    }

It gave me this exception:

System.Exception: Don't call me in unit test 
at MockDemo.DoNotOwn.DirtyWork(Int32 x) in MockDemoTest.cs: line 14 
at MockDemo.MyClassTest.OuterMethod() in MockDemoTest.cs: line 33

Alright, so Rhino Mocks doesn't generate the method stub even when an interface exists. How about telling it explicitly? Let's give it a second try using a multi mock:

            IDoNotOwn o = mockery.CreateMultiMock<MyClass>(typeof(IDoNotOwn));

Now I got this:

Rhino.Mocks.Exceptions.ExpectationViolationException: IDoNotOwn.Outer(2); Expected #0, Actual #1. 
...... 
at MyClassProxybd7bb9a610da4d3fb0971368461c5b7c.Outer(Int32 y) 
at MockDemo.MyClassTest.OuterMethod() in MockDemoTest.cs: line 35

Fine, I'll set up an expectation and call the original method:

            IDoNotOwn o = mockery.CreateMultiMock<MyClass>(typeof(IDoNotOwn));
            Expect.Call(o.Outer(2)).CallOriginalMethod(OriginalCallOptions.NoExpectation);

Still, it throws:

System.InvalidOperationException: Can't use CallOriginalMethod on method Outer because the method is abstract. 
at Rhino.Mocks.Impl.MethodOptions`1.AssertMethodImplementationExists() 
at Rhino.Mocks.Impl.MethodOptions`1.CallOriginalMethod(OriginalCallOptions options) 
at MockDemo.MyClassTest.OuterMethod() in MockDemoTest.cs: line 34

Now I'm out of ideas for using Rhino Mocks to achieve this. If anybody has made this work with Rhino Mocks, please let me know.

I ended up writing my own stub, which worked very well for me. I believe the same kind of stub could easily be generated by any mocking system.

        [Test] public void UsingMyStub() {
            const int workResult = 293848;
            MyClassStub stub = new MyClassStub();
            IDoNotOwn o = stub.ExpectCallDirtyWork(4).WillReturn(workResult);
            Assert.That(o.Outer(2), Is.EqualTo(workResult));
            stub.VerifyAll();
        }

        private class MyClassStub : MyClass, IDoNotOwn {
            private int _expectedI, _returnValue;
            private bool _isCalled;
            internal MyClassStub ExpectCallDirtyWork(int i) {
                _expectedI = i; return this;
            }

            internal MyClassStub WillReturn(int value) {
                _returnValue = value; return this;
            }

            internal void VerifyAll() {
                Assert.IsTrue(_isCalled, "Call to DirtyWork was not made.");
            }

            public new int DirtyWork(int i) {
                Assert.That(i, Is.EqualTo(_expectedI));
                Assert.IsFalse(_isCalled, "Duplicated call to DirtyWork.");
                _isCalled = true;
                return _returnValue;
            }
        }
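The stub works because of a C# rule that is easy to miss: a derived class that lists the interface again in its base list re-implements it, so a call made through the interface reaches the derived class's new method even though the base method is not virtual. A minimal sketch with hypothetical names (IGreeter, Owned, Relisted, HiddenOnly) shows the difference:

```csharp
using System;

public interface IGreeter
{
    string Greet();
    string GreetTwice();
}

public class Owned : IGreeter
{
    // Non-virtual, like DoNotOwn.DirtyWork.
    public string Greet() { return "base"; }

    // This call is bound to Owned.Greet at compile time.
    public string GreetTwice() { return Greet() + Greet(); }
}

// Lists IGreeter again: the interface is re-implemented and
// IGreeter.Greet now maps to Relisted.Greet.
public class Relisted : Owned, IGreeter
{
    public new string Greet() { return "derived"; }
}

// Hides Greet without listing the interface again:
// IGreeter.Greet still maps to Owned.Greet.
public class HiddenOnly : Owned
{
    public new string Greet() { return "hidden"; }
}

public static class ReimplementationDemo
{
    public static string ViaInterface(IGreeter greeter) { return greeter.Greet(); }
}
```

ReimplementationDemo.ViaInterface(new Relisted()) returns "derived", while ViaInterface(new HiddenOnly()) returns "base". GreetTwice() on a Relisted instance still returns "basebase", because the inherited method calls Owned.Greet directly. That is why MyClass.Outer calls DirtyWork through ((IDoNotOwn)this): only a call through the interface can be intercepted by MyClassStub.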

Well, the example here is rather fictional, but it all came out of a real-world problem I hit when writing the test cases for my OracleOdpTemplate.

Update

(4/19/2009) I asked on the RhinoMocks mailing list. Tim Barcz reminded me about the adapter pattern. I hadn't considered it because, in my real-world problem, I would have to write a hundred methods plus their test cases if I attempted to adapt the class. But after giving it another thought, I found that I can use a variation of the adapter to help testing. Here is the changed class.

        public class MyClass : DoNotOwn, IDoNotOwn {
            internal IDoNotOwn self;

            public MyClass() {
                self = this;
            }

            public new int Outer(int y) {
                return self.DirtyWork(y * y);
            }
        }

And the test case for it.

        [Test] public void UsingRhinoMocks() {
            const int workResult = 293848;
            MockRepository mockery = new MockRepository();
            var o = new MyClass {self = mockery.CreateMock<IDoNotOwn>()};
            Expect.Call(o.self.DirtyWork(4)).Return(workResult);
            mockery.ReplayAll();
            Assert.That(o.Outer(2), Is.EqualTo(workResult));
            mockery.VerifyAll();
        }

That works fine for now, but I still wish I could easily mock any interface method using RhinoMocks regardless of whether the implementation is declared virtual.

Monday, April 13, 2009

ODP.Net Batch Update With Spring.Net

One of the many important missing pieces of ADO.Net is the ability to execute SQL in batches. ADO.Net 2.0 patched in cheesy batch update support for DataSet only, and when the number of changed rows exceeds a certain limit you get a database exception. I'm not sure about SQL Server, but this is at least true for both the built-in Oracle client and ODP.Net. Bear in mind that this is purely Microsoft's design flaw, leaving this important feature out of their API. It is NOT Oracle's problem: Oracle batch SQL has been working perfectly fine in the JDBC world for many years. In this post, I'm going to share a way to make it work with ODP.Net in one use case.

Project Background

Our project uses the Spring.Net framework. In the data access layer, we make mixed use of NHibernate and Spring ADO support, so all my code here is based on Spring.Net ADO. If you are not familiar with Spring.Net, you can at least read the next section, so that 1) you get information about ODP.Net batching in general; and 2) you get to know some of the many benefits that Spring.Net brings.

Inception

I have seen code like the example below in our project again and again. It loops through a list of value objects and calls AdoTemplate one at a time. This yields extremely slow performance: in one scenario, it inserted 717 rows in 7.9s.

Slow loop insert example

        public int SaveXyzDetail(IList<Xyz> xyzList)
        {
            string sql = RetrieveSQL("SaveXyzDetail");
            int count = 0;
            foreach (Xyz xyz in xyzList)
            {
                IDbParameters param = CreateDbParameters();

                param.Add("pxyzId", DbType.Int32).Value = xyz.XyzId;
                ......
                // One database round trip per row: this is what makes the loop slow.
                count += AdoTemplate.ExecuteNonQuery(CommandType.Text, sql, param);
            }
            return count;
        }

Obviously, we need to batch those SQL statements to the database for optimal performance. But after looking around Spring.Net's API as well as ADO.Net's API, to my surprise, there is pretty much nothing out there to help in this situation.

Eventually, I started to look into the extensions that ODP.Net provides. Thanks to Oracle, ODP.Net does provide two ways to execute batch SQL, with limitations. One way is to bind .Net arrays to PL/SQL associative arrays in a stored procedure or PL/SQL block; you execute the stored procedure once and all the values in the arrays are sent to the database. The other way is to bind .Net arrays to regular SQL command parameters. You still execute the command once, but ODP.Net automatically creates a prepared statement, loops through the arrays and executes the prepared statement multiple times, sending all executions in one batch. Both have the limitation that you cannot mix different SQL commands in one batch. But it is still a perfect solution to my problem: in the example above, we execute exactly the same SQL many times in the loop.
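For readers who have not seen the second approach, here is a rough sketch of ODP.Net array binding on a plain OracleCommand. The table xyz_detail, its columns and the class name are hypothetical, and the snippet needs the Oracle.DataAccess assembly and a live database, so treat it as an outline rather than tested code.

```csharp
using System;
using System.Data;
using Oracle.DataAccess.Client;

public static class ArrayBindSketch
{
    // Sends all rows with a single ExecuteNonQuery call.
    // xyz_detail, xyz_id and name are made-up names for illustration.
    public static int InsertAll(OracleConnection openConnection, int[] ids, string[] names)
    {
        if (ids.Length != names.Length)
            throw new ArgumentException("Both arrays must have the same length.");
        if (ids.Length == 0) return 0;

        using (OracleCommand cmd = openConnection.CreateCommand())
        {
            cmd.CommandText = "INSERT INTO xyz_detail (xyz_id, name) VALUES (:id, :name)";

            // Number of array elements to bind, i.e. rows in this batch.
            cmd.ArrayBindCount = ids.Length;

            // ODP.Net binds by position unless BindByName is set, so the
            // parameters are added in the order they appear in the SQL.
            cmd.Parameters.Add("id", OracleDbType.Int32).Value = ids;
            cmd.Parameters.Add("name", OracleDbType.Varchar2).Value = names;

            return cmd.ExecuteNonQuery();
        }
    }
}
```

Every parameter value is an array of the same length, one element per row, which is why the same SQL text has to be used for the whole batch.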

Extension Method to AdoTemplate

While the solution is promising, I certainly don't want to see ODP.Net proprietary extensions and type-casting code all over the data access layer. The best would be for the Spring.Net API to support this, but unfortunately I cannot wait for that to come true, and there doesn't seem to be any plan for it. NHibernate has batch support for SQL Server by hacking the ADO.Net API, but support for Oracle is still missing. Maybe they could consider using the ODP.Net feature I mentioned above.

This is where extension methods come in handy. Hey, we can extend AdoTemplate, or maybe better, IAdoOperations. Let's extend it with the method below.

Extension Method Signature


    public static int ExecuteNonQuery<T>(
        this IAdoOperations operation,
        CommandType cmdType,
        string cmdText,
        ICollection<T> data,
        Converter<T, IDbParameters> dataToParameters)

I'm not going to explain the cmdType and cmdText parameters, which are common to all the ExecuteNonQuery methods. The data parameter is a collection of value objects that we are going to use as batch SQL parameters. The Converter delegate takes one value object and translates it to a Spring IDbParameters object that can be used to execute the SQL command.

The implementation takes a generic approach by using a factory pattern, so that it can be used with any other database that provides batch support. When the extension method is called with an implementation of IAdoOperations that also implements the IBatchExecutorFactory interface, it uses the factory to execute the SQL in batch; otherwise it falls back to non-batch mode, looping through the collection and executing the command one row at a time. The full class can be found here.
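The detect-and-fall-back logic can be sketched in a few lines. To keep the example free of dependencies, it uses made-up stand-ins (IOperations, IBatchRunner, IBatchRunnerFactory) in place of IAdoOperations, IBatchExecutor and IBatchExecutorFactory, and a single string in place of IDbParameters. It shows the shape of the idea and is not the actual class linked above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for IAdoOperations, reduced to one string parameter per row.
public interface IOperations
{
    int ExecuteNonQuery(string sql, string parameter);
}

// Stand-ins for IBatchExecutor and IBatchExecutorFactory.
public interface IBatchRunner
{
    int ExecuteNonQuery<T>(IOperations operations, string sql,
        ICollection<T> data, Converter<T, string> toParameter);
}

public interface IBatchRunnerFactory
{
    IBatchRunner GetExecutor();
}

public static class BatchExtensions
{
    public static int ExecuteNonQuery<T>(this IOperations operations, string sql,
        ICollection<T> data, Converter<T, string> toParameter)
    {
        // Batch-capable implementations advertise themselves through the factory interface.
        IBatchRunnerFactory factory = operations as IBatchRunnerFactory;
        if (factory != null)
        {
            return factory.GetExecutor().ExecuteNonQuery(operations, sql, data, toParameter);
        }

        // Fallback: one command execution per row.
        int total = 0;
        foreach (T row in data)
        {
            total += operations.ExecuteNonQuery(sql, toParameter(row));
        }
        return total;
    }
}

// Plain implementation without batch support, used to exercise the fallback path.
public class CountingOperations : IOperations
{
    public int Calls;

    public int ExecuteNonQuery(string sql, string parameter)
    {
        Calls++;
        return 1;
    }
}
```

Calling the extension on a CountingOperations with three rows executes the command three times. An implementation that also implements the factory interface would receive the whole collection in one call instead, and calling code would not need to change.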

And here are the interfaces IBatchExecutorFactory and IBatchExecutor.

IBatchExecutorFactory


    public interface IBatchExecutorFactory
    {
        /// <summary>
        /// Get an instance of <see cref="IBatchExecutor"/>.
        /// </summary>
        /// <returns>An instance of <see cref="IBatchExecutor"/>.</returns>
        IBatchExecutor GetExecutor();
    }

IBatchExecutor


    public interface IBatchExecutor
    {
        /// <summary>
        /// Executes batch of non queries with common command and different
        /// parameters.
        /// </summary>
        /// <typeparam name="T">
        /// The type of the data object.
        /// </typeparam>
        /// <param name="operation">
        /// An <see cref="Spring.Data.IAdoOperations"/> object to perform
        /// database updates.
        /// </param>
        /// <param name="cmdType">
        /// The type of command.
        /// </param>
        /// <param name="cmdText">
        /// The text of command.
        /// </param>
        /// <param name="data">
        /// A collection of data object to be updated in batch.
        /// </param>
        /// <param name="dataToParameters">
        /// Delegate that converts data object to parameters.
        /// </param>
        /// <returns>
        /// The total updated count if 0 or positive. When -1 is returned,
        /// it indicates that the update count cannot be obtained due to
        /// the limitation of the batch implementation.
        /// </returns>
        int ExecuteNonQuery<T>(
            IAdoOperations operation,
            System.Data.CommandType cmdType,
            string cmdText,
            System.Collections.Generic.ICollection<T> data,
            System.Converter<T, Common.IDbParameters> dataToParameters);
    }

By now, we have extended the Spring.Net framework to support batch SQL. What's left to do is to provide an implementation of IAdoOperations that uses the ODP.Net batch SQL feature.

The Batch Executor for ODP.Net

To simplify the task, I decided to let OracleOdpTemplate inherit from AdoTemplate. The class itself is straightforward: it provides a property for the user to set the batch size, which defaults to 100, and it implements the IBatchExecutorFactory interface so that the extension method can detect it and obtain the batch executor from it.

The real worker is the inner class OracleOdpTemplate.BatchExecutor. Basically, it loops through all the data in the collection, calls the converter delegate to get the binding parameters for each value object, and accumulates the parameter values in corresponding arrays. When the accumulated rows reach the batch size, it flushes them to the database using ODP.Net array binding. See the code snippet below.

Code snippet of OracleOdpTemplate.BatchExecutor

            #region IBatchExecutor Members

            public int ExecuteNonQuery<T>(
                IAdoOperations operation,
                CommandType cmdType,
                string cmdText,
                ICollection<T> data,
                Converter<T, IDbParameters> dataToParamters)
            {
                int totalRows = data.Count;
                int batchSize = _odpTemplate.BatchSize;
                if (totalRows < batchSize) batchSize = totalRows;

                int count = 0, bindCount = 0, result = 0;
                object[][] valueBuffer = null;
                
                foreach (T row in data)
                {
                    IDbParameters parameters = dataToParamters(row);
                    if (parameters != null)
                    {
                        if (valueBuffer == null)
                        {
                            valueBuffer = InitBatchParameters(parameters, batchSize);
                        }

                        string error = ValidateAndCopyParams(parameters, valueBuffer, bindCount++);
                        if (error != null)
                        {
                            throw new InvalidDataAccessApiUsageException(error + " for row: " + row);

                        }
                    }
                    ++count;
                    if (bindCount == batchSize || (count == totalRows) && bindCount > 0)
                    {
                        _bindCount = bindCount;
                        result += operation.ExecuteNonQuery(cmdType, cmdText, this);
                        bindCount = 0;
                    }

                }
                return result;
            }

            #endregion
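The snippet relies on helpers that are not shown here (InitBatchParameters and ValidateAndCopyParams). The core idea, turning row-wise values into one array per parameter and cutting them at the batch size, can be sketched on its own. ColumnBatcher below is a hypothetical helper written for this illustration. Unlike the snippet, it allocates a fresh buffer for every batch instead of reusing one.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnBatcher
{
    // Turns row-wise values into column-wise arrays holding at most batchSize
    // rows each, which is the shape array binding expects (one array per parameter).
    public static IEnumerable<object[][]> Batch(
        IEnumerable<object[]> rows, int columnCount, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        object[][] columns = NewBuffer(columnCount, batchSize);
        int filled = 0;
        foreach (object[] row in rows)
        {
            if (row.Length != columnCount)
                throw new ArgumentException("Every row must supply " + columnCount + " values.");

            for (int c = 0; c < columnCount; c++) columns[c][filled] = row[c];

            if (++filled == batchSize)
            {
                yield return columns;                       // full batch: flush it
                columns = NewBuffer(columnCount, batchSize);
                filled = 0;
            }
        }

        if (filled > 0)
        {
            // Last, partial batch: shrink each column to the rows actually filled.
            for (int c = 0; c < columnCount; c++) Array.Resize(ref columns[c], filled);
            yield return columns;
        }
    }

    private static object[][] NewBuffer(int columnCount, int rowCount)
    {
        object[][] buffer = new object[columnCount][];
        for (int c = 0; c < columnCount; c++) buffer[c] = new object[rowCount];
        return buffer;
    }
}
```

With five rows and a batch size of two, the helper yields three batches holding two, two and one row. Each yielded batch would then be bound to the command parameters and executed once.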

Putting Things Together

Everything is ready, so let's rewrite the example code from the beginning of this post using the extension method. We create a converter that sets the parameter values for each data object we need to insert, then call the ExecuteNonQuery extension method once with the list and the converter.

Fast batch insert example

        public int SaveXyzDetail(IList<Xyz> xyzList)
        {
            string sql = RetrieveSQL("SaveXyzDetail");
            IDbParameters param = CreateDbParameters();

            var paramXyzId = param.Add("pxyzId", DbType.Int32);
            ......

            Converter<Xyz, IDbParameters> converter = delegate(Xyz xyz)
            {
                paramXyzId.Value = xyz.XyzId;
                ......
                return param;
            };
            return AdoTemplate.ExecuteNonQuery(CommandType.Text, sql, xyzList, converter);
        }

In addition to this, we also need to tell Spring.Net to inject the batch-capable version of AdoTemplate. So we replace the AdoTemplate in Spring.Net's configuration file with OracleOdpTemplate.

Modified Spring.Net configuration file

  <db:provider id="DbProvider" provider="OracleODP-2.0" connectionString="${DB.ConnectionString}"/>

  <object id="AdoTemplate" type="Spring.Data.Generic.OracleOdpTemplate, Spring.Extension">
    <property name="DbProvider" ref="DbProvider"/>
    <property name="BatchSize" value="${DB.MaxBatchSize}"/> <!-- number of rows -->
  </object>

  <object id="AdoXyzDao" type="Example.Dao.AdoXyzDao, Example.Dao">
        <property name="AdoTemplate" ref="AdoTemplate"/>
  </object>

Now run the test again: the same number of rows is inserted in 0.7s with a batch size of 100. That's more than ten times faster!

Source Code and Binary

You can find both the specific revision of the source code I used in this post and the binary in the Google Code project. You can also get the latest source code of the SpringExtension project that this functionality belongs to.

Updates

(4/17/2009) I have been struggling to write the unit test for this. It turned out that Spring.Net's AdoTemplate implementation doesn't declare its methods virtual. This makes extending the framework a little difficult, especially when writing unit tests.
