Thursday, February 28, 2008

ACID Principles Applied to Testing

Sometimes you don't realize that you have built your house on a fault line. Then the tectonic plates shift and you are left in a pile of rubble wondering, WTF just happened?

I recently found myself in that very position as it applies to testing applications. We had been focused on writing unit, integration, and functional tests, and on automating them, for a .NET application that was really nothing more than CRUD with some significantly complex business rules thrown in. We made some mistakes along the way, though, that finally came to a head earlier this week: 3 out of 3 of our builds were broken, we were running a day late releasing an application to UAT, and we had made a significant number of refactoring changes that could very well have broken 50% of our functionality, but we couldn't tell for sure. Amidst the rubble, I took a moment to reflect on what got us to that point so that I could share with you, kind reader, the error of our ways.

Our biggest mistake in writing tests was not making sure that they were Isolated. I've always been kind of a persistence geek, though, so I naturally started noticing how well the ACID principles of database transactions apply to testing. For reference, the ACID properties are Atomic, Consistent, Isolated, and Durable; see Wikipedia.

Atomic - Can't be broken into smaller parts.

Good unit and integration tests should be atomic. Functional tests, not so much, as they are typically the assembly of multiple atomic parts into one whole. Unit and integration tests, however, should each be focused on testing one feature of their target. Nothing more, nothing less. This makes the tests cleaner, it makes uncovering the causes of failures easier, and it helps to differentiate functional tests from unit and integration tests.
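To make that concrete, here's a minimal sketch of what I mean by atomic, using a made-up `DiscountCalculator` (the class and its rules are invented for illustration). Each test pins down exactly one rule, so a red bar points straight at the rule that broke:

```java
// Hypothetical class under test: three discount rules, nothing else.
class DiscountCalculator {
    double discountFor(int quantity) {
        if (quantity >= 100) return 0.10; // bulk discount
        if (quantity >= 10)  return 0.05; // small-order discount
        return 0.0;                       // no discount
    }
}

class DiscountCalculatorTest {
    // One atomic test per rule: each asserts a single behavior and
    // touches nothing outside the class under test.
    static void testNoDiscountBelowTen() {
        assert new DiscountCalculator().discountFor(9) == 0.0;
    }

    static void testSmallDiscountAtTen() {
        assert new DiscountCalculator().discountFor(10) == 0.05;
    }

    static void testBulkDiscountAtHundred() {
        assert new DiscountCalculator().discountFor(100) == 0.10;
    }
}
```

If the bulk-discount threshold changes, exactly one of these fails, which is the whole point.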

Consistent - Repeated execution results in the same outcome.

Consistency, or the lack thereof, typically rears its head in the integration and functional tests, and in my experience the cause is often broken dependencies. Database integration tests mysteriously start failing because another developer went in and updated some data that your test depends on. Making integration tests consistent requires some work. Typically it involves automatically prepping a database with all of the values and foreign keys populated so that your test follows its directed path. This can be made much easier with a framework like DBUnit in Java, which preps and removes data as needed.
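The DBUnit idea boils down to: the test owns its data, inserting it in setup and removing it in teardown. Here's a toy sketch of that pattern with an in-memory map standing in for the real table (the `FakeCustomerTable` and `CustomerQueryTest` names are invented; in real life the setup would go through JDBC or a DBUnit dataset):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a real database table, so the sketch is self-contained.
class FakeCustomerTable {
    final Map<Integer, String> rows = new HashMap<>();
}

class CustomerQueryTest {
    FakeCustomerTable db;

    // setUp inserts every row this test depends on, so another
    // developer editing shared data can't break it.
    void setUp() {
        db = new FakeCustomerTable();
        db.rows.put(1, "ACME");
        db.rows.put(2, "Initech");
    }

    // tearDown removes exactly what setUp created, leaving the
    // "database" the way we found it.
    void tearDown() {
        db.rows.clear();
    }

    void testFindsAcme() {
        assert "ACME".equals(db.rows.get(1));
    }
}
```

Run the same test a hundred times in any order and it sees the same two rows every time — that's consistency.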

Isolated - Concurrent operations do not have an effect on one another.

As mentioned earlier, this one was a biggie for us. We found that any person running tests while the build was also running them caused both to fail. This was due to some poor choices in how we were trying to clean up after ourselves (blindly searching by a non-primary-key field that picked up data from other tests). The easiest solution in many instances is to use a pattern provided by the bright fellas over at Spring: start a transaction in your setup method and then roll that puppy back in your teardown method. This allows everything running in that test to see changes made by the code (assuming it uses the same connection as your tests), but it removes the sometimes complex logic of undoing your changes by hand. Additionally, it ensures that even if the same two tests are running at the same time, each will be on a separate connection and totally unaware of the other's changes. See AbstractTransactionalSpringContextTests in Spring.
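Stripped of Spring, the pattern is just begin-in-setup, rollback-in-teardown. Here's a toy model of it — `TxStore`, `OrderDaoTest`, and their methods are all invented, and the two-map "transaction" is a crude stand-in for a real connection's uncommitted state:

```java
import java.util.HashMap;
import java.util.Map;

// Toy transactional store: committed state plus this connection's
// uncommitted writes. A second test would get its own TxStore, so
// neither ever sees the other's pending changes.
class TxStore {
    final Map<Integer, String> committed = new HashMap<>();
    final Map<Integer, String> pending = new HashMap<>();

    void insert(int id, String value) { pending.put(id, value); }

    // Reads on this connection see its own pending writes first.
    String read(int id) {
        return pending.containsKey(id) ? pending.get(id) : committed.get(id);
    }

    void rollback() { pending.clear(); }
}

class OrderDaoTest {
    TxStore store = new TxStore();

    void setUp()    { /* transaction implicitly begun on this connection */ }
    void tearDown() { store.rollback(); } // undo everything, no hand-written cleanup

    void testInsertVisibleInsideTransaction() {
        store.insert(42, "widget");
        assert "widget".equals(store.read(42));
    }
}
```

After teardown the store is exactly as it started, without the test ever having to know what to delete.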

Durable - Once successful it is permanent.

It should really take a change in business logic or interface design to break a test once it's in a passing state. This goes hand in hand with the previously mentioned anti-pattern: don't let data or systems outside of your control muck with your tests. Make your functional tests fully dependent on mocks, stubs, or dummies instead of external systems or databases. Make sure your integration tests are completely responsible for setting up all of the data that they need before running. This will lower the probability of environmental issues throwing up a false negative on your CI server.

I've made a distinction throughout this article with Unit/Functional/Integration tests. Make sure you know the difference between them and be adamant about treating them differently.

Unit tests should never interact with a database, email server, web service, or other disparate system. They should be automated and run upon every code delivery. They should be extremely fast, to keep ADD developers from staring blankly out the window or surfing the net while waiting for the tests to complete.

Integration tests should never mock the integrated system, and in many cases should only be run on a daily or semi-daily basis. Unless your database or external system is being modified multiple times throughout the day, just set these to run when the lights go off at night. If someone is actively changing your entire persistence layer, they should have the common sense to run these manually before checking in.

Functional tests should rely almost solely on mocks, dummies, or stubs instead of external dependencies, and they'll be pretty zippy if you write your mocks correctly. If they fall into the "runs in less than 5 minutes" category, then run them at every check-in so that the staff is well aware of any issues as soon as a delivery is made.
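Here's what "stub instead of external dependency" looks like in practice — a hand-rolled stand-in for a payment gateway so the functional test never touches the network. All of the names (`PaymentGateway`, `CheckoutService`, etc.) are invented for the sketch:

```java
// The seam: the service depends on an interface, not on the real gateway.
interface PaymentGateway {
    boolean charge(String account, double amount);
}

// Stub with a canned answer: no network, no credentials, no flakiness.
class AlwaysApprovesGateway implements PaymentGateway {
    public boolean charge(String account, double amount) { return true; }
}

class CheckoutService {
    private final PaymentGateway gateway;

    CheckoutService(PaymentGateway gateway) { this.gateway = gateway; }

    String checkout(String account, double total) {
        return gateway.charge(account, total) ? "CONFIRMED" : "DECLINED";
    }
}

class CheckoutFunctionalTest {
    // Exercises the whole checkout flow, but against the stub,
    // so it stays fast and immune to gateway outages.
    void testSuccessfulCheckout() {
        CheckoutService service = new CheckoutService(new AlwaysApprovesGateway());
        assert "CONFIRMED".equals(service.checkout("acct-1", 99.95));
    }
}
```

A second stub that always declines gives you the failure path just as cheaply.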

Keep in mind the ACID principles when testing your code. They'll hopefully keep you away from the fault lines and in the happy state of green builds.

Wednesday, February 6, 2008

Failing to Fail Early

A coworker and I stumbled across some weird behavior with NHibernate today that totally surprised me. I come from the Hibernate world which tends to be about a version to a version and a half ahead of NHibernate. So I'm used to going to look for excellent support for stored procedures and realizing I'm SOL.

Today, though, we found an issue that wasn't a lack of support, but just a horrible design decision that differs between the two frameworks. In a nutshell, Session.Load was being called with an ID that didn't exist and we weren't receiving an exception. Huh? The difference between Get and Load in my mind was always: same behavior if the PK passed in exists in the database; if it doesn't, Get returns null silently, and Load throws ObjectNotFoundException, violently alerting you that your PK is not what you think it is. In this case we were getting back a proxied object that would immediately throw ObjectNotFoundException whenever one of its getter methods was called. Nice. I assumed that we were misusing the Interceptor functionality or something to cause this, but a little googlage revealed this description of the ObjectNotFoundException class in the NHibernate docs:

"Thrown when ISession.Load() fails to select a row with the given primary key (identifier value). This exception might not be thrown when Load() is called, even if there was no row on the database, because Load() returns a proxy if possible..."

Come again? What use do I have for a proxied object that is just going to throw ObjectNotFoundException the moment I start accessing it? Even worse, what if I have a business method that checks a private attribute of this object for null? You shouldn't be able to do anything with this object because, according to the database, IT DOESN'T EXIST. When all else fails, go to the pragmatic programmers for advice and listen when they tell you to FAIL EARLY. Give me null or give me exceptions; don't fool me into thinking my row is there when it isn't.
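The fail-early behavior I wanted is easy to state in code. Here's a sketch in Java (to match the Hibernate side of the comparison), with an invented `WidgetRepository` standing in for a session: the lookup happens at the call site, so a missing row blows up immediately instead of hiding behind a proxy:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Made-up repository illustrating the two contracts I expected:
// get() -> null right away, load() -> exception right away.
class WidgetRepository {
    final Map<Integer, String> rows = new HashMap<>();

    // Get-style: silent null if the row is missing.
    String get(int id) {
        return rows.get(id);
    }

    // Load-style done right: check existence now and fail early,
    // rather than returning a proxy that explodes on first access.
    String load(int id) {
        String row = rows.get(id);
        if (row == null) {
            throw new NoSuchElementException("no row with id " + id);
        }
        return row;
    }
}
```

Either contract is fine; the point is that both report the missing row at the line where you asked for it.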