Friday, July 6, 2007

Unit Testing from the trenches

Unit testing is one of the cornerstones of modern software development. After working on several projects with Spring and Hibernate, I have come to the conclusion that, at least in that setting, there are actually three types of unit testing:

1. Basic Unit testing
2. DAO Layer Unit testing
3. Component Integration testing

As the name of the last type suggests, the categories lie on a scale that runs towards integration testing. This suggests that the boundary between unit and integration testing is not very sharp, which is interesting, since in our build systems (Maven) and best practices the two are distinct steps.

In this post I’ll discuss the differences between the three types and describe some guidelines for creating tests.

Basic Unit testing


Basic unit testing is where you test the smallest units in your system in isolation. The units are usually single classes; in rare cases a class can be considered a sub-unit of another class that does not merit a separate test. In my experience a satisfactory level of isolation can only be achieved by mocking, because only mocking ensures that when the functionality of a class changes, the change doesn’t cascade through the tests for all classes using that functionality. This is what isolation is all about. We use RMock, although rumor has it that on a Java 5 enabled project the latest version of EasyMock would be superior.
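As a minimal sketch of this kind of isolation, the test below replaces a collaborator with a hand-written mock instead of RMock or EasyMock, so that it runs with nothing but the JDK. All names in it (ExchangeRateSource, PriceConverter, RecordingRateSource) are hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collaborator; in production this might call a remote service.
interface ExchangeRateSource {
    double rateFor(String currency);
}

// The unit under test: it only knows the collaborator's interface.
class PriceConverter {
    private final ExchangeRateSource rates;

    PriceConverter(ExchangeRateSource rates) {
        this.rates = rates;
    }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Hand-written mock: returns a canned rate and records every call it receives.
class RecordingRateSource implements ExchangeRateSource {
    final List<String> requested = new ArrayList<>();
    private final double cannedRate;

    RecordingRateSource(double cannedRate) {
        this.cannedRate = cannedRate;
    }

    @Override
    public double rateFor(String currency) {
        requested.add(currency);
        return cannedRate;
    }
}

class PriceConverterTestSketch {
    // Returns normally when the expectations hold, throws AssertionError otherwise.
    static void convertsUsingTheRateOfTheRequestedCurrency() {
        RecordingRateSource mock = new RecordingRateSource(2.0);
        PriceConverter converter = new PriceConverter(mock);

        double result = converter.convert(10.0, "USD");

        if (result != 20.0) {
            throw new AssertionError("expected 20.0 but got " + result);
        }
        // The converter should have asked for exactly one rate: the one for USD.
        if (!mock.requested.equals(Collections.singletonList("USD"))) {
            throw new AssertionError("unexpected calls: " + mock.requested);
        }
    }
}
```

Because PriceConverter only sees the interface, a change in the behaviour of the real rate source cannot break this test. A mocking library generates the equivalent of RecordingRateSource for you.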

A serious drawback of this approach, in my opinion, is that it doesn’t really inspire TDD. One of the basics of TDD is that the design of the solution should follow from the test and the test should follow directly from a requirement. Most requirements are not easily interpreted as “add this functionality to this class”. To arrive at the class level, some (upfront) design is necessary. So instead of writing a test that encapsulates the requirement, the requirement is broken down into functionality that is assigned to classes (this is the upfront design) and this functionality is tested.

DAO layer testing


When using Hibernate or any other framework for your DAO layer, a lot of business logic ends up in units or artifacts that are not testable by basic unit testing. With Hibernate the business logic gets injected into the mapping files and the queries.
Hibernate mappings basically allow the application to do CRUD operations on entities. A basic test would start with inserting an entity, check that it can be retrieved, update it, check again and then delete it. Envisioning a test framework for this is not that hard. But Hibernate configuration files also define complex properties such as cascading and inverse-ness. Correct configuration of these properties is just as important in your application. Testing these properties is a bit more complicated, involving navigating associations and collections.
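
The insert, retrieve, update, retrieve, delete cycle described above could be sketched roughly as follows. Customer and CustomerDao are hypothetical names, and InMemoryCustomerDao is only a stand-in so that the sketch runs without Hibernate or a database; in a real project the DAO passed to roundTrip would be the Hibernate-backed one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity with an assigned identifier.
class Customer {
    final long id;
    String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical DAO contract; a real implementation would delegate to a Hibernate Session.
interface CustomerDao {
    void save(Customer customer);
    Customer findById(long id);
    void update(Customer customer);
    void delete(Customer customer);
}

// Stand-in implementation (an assumption for this sketch, not Hibernate).
// It stores and returns copies, so stored state only changes through the DAO.
class InMemoryCustomerDao implements CustomerDao {
    private final Map<Long, Customer> rows = new HashMap<>();

    public void save(Customer c) {
        rows.put(c.id, new Customer(c.id, c.name));
    }

    public Customer findById(long id) {
        Customer row = rows.get(id);
        return row == null ? null : new Customer(row.id, row.name);
    }

    public void update(Customer c) {
        rows.put(c.id, new Customer(c.id, c.name));
    }

    public void delete(Customer c) {
        rows.remove(c.id);
    }
}

class CrudRoundTripSketch {
    // Walks one entity through the full CRUD cycle and checks each step.
    static void roundTrip(CustomerDao dao) {
        Customer customer = new Customer(1L, "Alice");
        dao.save(customer);
        check("Alice".equals(dao.findById(1L).name), "insert then retrieve");

        customer.name = "Bob";
        dao.update(customer);
        check("Bob".equals(dao.findById(1L).name), "update then retrieve");

        dao.delete(customer);
        check(dao.findById(1L) == null, "delete then retrieve");
    }

    private static void check(boolean condition, String step) {
        if (!condition) {
            throw new AssertionError("failed at step: " + step);
        }
    }
}
```

With a Hibernate-backed DAO it would probably be necessary to flush and clear the session between the steps, so that each retrieve really hits the database instead of the session cache.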

Testing queries is more complicated still. Based upon an HQL, SQL or Criteria query, Hibernate generates SQL that can be executed against a specific RDBMS. One approach towards testing the query would be checking that the generated query is correct. This involves parsing the SQL produced and then checking certain features, such as the presence of tables in the FROM clause, the correct restrictions in the WHERE clause and the columns in the SELECT clause. This seems pretty laborious. Another approach would be to actually execute the SQL against a database and check the results. Using an in-memory database such as HSQLDB, this is quite an easy task. One important caveat is that Hibernate generates different SQL for different types of RDBMS. When the test setup uses a different type of RDBMS than the deployed application, the queries that are used in the deployed situation are never actually tested during the unit test phase.

Data is another problem when running tests against an actual RDBMS. To be able to check the logic of a query, it should be executed against a number of different datasets. Usually the query is sensitive to parameters as well as to the actual data in the system. Both sensitivities should be tested with different variations. One way of getting data into the system to enable testing is using DbUnit. This framework reads XML files and creates insert statements that can be executed against the RDBMS. The problem with this approach is that test expectations and results for different tests tend to get correlated, since the data for the whole test case is read from the same XML file. This file tends to get very big and hard to understand. The different variations that are tested are hardly documented and not evident from a quick scan of the file.

One solution would be to use a separate file for each test. In this situation, tests will not be correlated and a comment at the top of the file could describe the variation that is tested. This would probably lead to a large number of very similar files in the project. Due to foreign key restrictions, inserting a single record in the database often cascades through a lot of related tables that are reference data. This could be resolved by using a hierarchical structure, in which reference data is loaded first and used with a number of smaller and more specific datasets.

A different solution would be to create the objects necessary for the test in code and persist them using Hibernate. The resulting tests would be easier to refactor and would be more independent. It could lead to a lot of code duplication, but this might be solved by using the ObjectMother pattern. An extra advantage of this approach would be that the mapping files are tested indirectly. This might actually be a bad thing, since when a mapping is corrupted, completely unrelated tests might fail.
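
To make the ObjectMother idea a bit more concrete, here is a minimal sketch. Department, Employee and EmployeeMother are hypothetical classes that stand in for mapped entities; the point is only that the knowledge of how to build a complete, valid object graph lives in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes, standing in for mapped Hibernate entities.
class Department {
    final String name;

    Department(String name) {
        this.name = name;
    }
}

class Employee {
    final String name;
    final Department department;
    final List<String> roles = new ArrayList<>();

    Employee(String name, Department department) {
        this.name = name;
        this.department = department;
    }
}

// ObjectMother: named factory methods that return complete, valid test objects.
class EmployeeMother {
    static Department engineering() {
        return new Department("Engineering");
    }

    // A plain developer, including the reference data (department) it needs.
    static Employee developerNamed(String name) {
        Employee employee = new Employee(name, engineering());
        employee.roles.add("DEVELOPER");
        return employee;
    }

    // Variations build on each other instead of duplicating the setup code.
    static Employee teamLeadNamed(String name) {
        Employee employee = developerNamed(name);
        employee.roles.add("TEAM_LEAD");
        return employee;
    }
}
```

A query test would then ask for something like EmployeeMother.teamLeadNamed("Lee"), persist it through the Hibernate session, run the query and check the result. Every call returns a fresh object, so tests do not share state.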

As always, reality isn’t simple and the best solution will be a mix of both. I’ve actually never tried mixing both strategies (DbUnit and ObjectMother) on the same project. I’m very curious what the pitfalls will be. A problem that will probably remain with every strategy is that test coverage cannot be determined (or can it?).

Component Integration testing


When using Spring, units are glued together in an application context into ‘components’ or top-level beans. The application context is never really used in the unit test phase, since its sole purpose is gluing together units that are tested in isolation in this phase. Yet the application context is one of the most important files in your application. Starting up an application with a typo in its application context will fail miserably, though luckily early.

Some of the properties and structures that are defined in the application context can have a far-reaching impact on your application. Transaction boundaries are usually defined here, as are other cross-cutting concerns. Not testing the application context should in my opinion be considered a serious shortcoming of the automatic testing of an application. Some of the components that are used in the deployed situation (and thus configured in the application context) cannot be used in a test situation; interfaces to backend systems are an example. To make testing possible, these components should be defined in a separate application context that can be left out or replaced when testing.

On the other hand, these types of tests are very similar to integration tests. I think the difference should be that the “Component Integration” tests hardly check the results of the application. If a certain service method can be executed without running into a NullPointerException and it outputs something that at first glance appears to be meaningful, the test has succeeded. Further testing should be considered part of the functional and integration tests.
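
A rough sketch of such a smoke test is given below. To keep it runnable without Spring, the wiring is done by hand in a class called TestWiring, which stands in for a test application context in which the backend interface has been replaced by a stub. All class names are hypothetical.

```java
// Hypothetical interface to a backend system that is not available during tests.
interface BackendGateway {
    String fetchStatus(String orderId);
}

// Replacement used in the test wiring instead of the real backend client.
class StubBackendGateway implements BackendGateway {
    public String fetchStatus(String orderId) {
        return "UNKNOWN";
    }
}

// A top-level component that depends on the backend interface.
class OrderService {
    private final BackendGateway gateway;

    OrderService(BackendGateway gateway) {
        this.gateway = gateway;
    }

    String describe(String orderId) {
        return "Order " + orderId + ": " + gateway.fetchStatus(orderId);
    }
}

// Stand-in for the test application context (an assumption of this sketch):
// the same components as in production, with the backend swapped for the stub.
class TestWiring {
    static OrderService orderService() {
        return new OrderService(new StubBackendGateway());
    }
}

class ComponentSmokeTestSketch {
    // Succeeds when the service call completes and returns something non-empty.
    static void serviceMethodRunsAndReturnsSomething() {
        OrderService service = TestWiring.orderService();

        String description = service.describe("42");

        if (description == null || description.isEmpty()) {
            throw new AssertionError("service returned nothing meaningful");
        }
    }
}
```

In a Spring project the service would be looked up from the loaded application context files rather than from TestWiring, which is exactly what makes the test cover the wiring itself. The check is deliberately shallow: it does not verify what the description says.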

Conclusion


In my opinion there are several steps that should be executed sequentially in the unit test phase. A failure in an early step should terminate the build without executing the later steps, since their failures might merely be a consequence of the error already found. One of my colleagues suggested using TestNG to create test groups. The types of tests described above can be used as starting points for these steps. There might be other types of tests that deserve their own group. The methods described above for writing the tests seem promising, but are not yet proven technology. Do you think the grouping of unit tests is useful? Do you recognize the groups and guidelines described above?
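
For what it is worth, the fail-fast ordering of steps could be sketched in plain Java as follows. StagedTestRunner is a hypothetical helper written for this post, not part of TestNG or Maven; in TestNG the groups and dependsOnGroups attributes of the @Test annotation would be the natural place to express the same ordering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: runs groups of tests in order and stops after the first failing group.
class StagedTestRunner {
    // LinkedHashMap keeps the stages in the order in which they were registered.
    private final Map<String, List<Runnable>> stages = new LinkedHashMap<>();

    StagedTestRunner stage(String name, Runnable... tests) {
        List<Runnable> list = new ArrayList<>();
        for (Runnable test : tests) {
            list.add(test);
        }
        stages.put(name, list);
        return this;
    }

    // Returns the name of the first stage with a failing test, or null if all stages passed.
    // All tests of the failing stage are still run; later stages are not started.
    String run() {
        for (Map.Entry<String, List<Runnable>> stage : stages.entrySet()) {
            boolean failed = false;
            for (Runnable test : stage.getValue()) {
                try {
                    test.run();
                } catch (RuntimeException | AssertionError e) {
                    failed = true;
                }
            }
            if (failed) {
                return stage.getKey();
            }
        }
        return null;
    }
}
```

Registering the stages as "basic", "dao" and "component", in that order, would mean that a broken mapping stops the run before any component test gets the chance to fail for the same reason.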
By Maarten Winkels
