Friday, July 23, 2010

Logging and (Problems with) Test Runners: GSoC 2010 Weeks 6, 7, 8, and 9

It has been four weeks since my last "weekly" update, but only because I've been too busy working to write about it! In the intervening time, I wrote, rewrote, and re-rewrote the logging code for the Cabal test agent. (It seems the third time's the charm.) Finally happy with it, I implemented test runners for the most common Haskell test frameworks: HUnit and both major versions of QuickCheck.

Test Logging

The logging code and format have gone through several iterations, but finally seem to have settled into their final, usable form. By default, all test logs are located in the dist/test directory, unless the user sets a different dist path. Cabal produces two types of logs: human- and machine-readable. The human-readable logs record the standard output of the test executables (for both exitcode-stdio and detailed type tests). As I discussed in my last update, the human-readable logs are named based on the same type of path templates used elsewhere in Cabal. By default, each test suite is logged to its own human-readable file to aid debugging; the default path template is "$pkgid-$test-suite.log". However, a machine should have no problem parsing through all the test results, so there is only one machine-readable log file per package; also, the machine-readable log file stores platform and architecture information, so this reduces the amount of duplicate data stored.

Using path templates to name the log files, it is not always possible to determine in advance the name of the log file, or even if two test suites will be logged to different files or the same file. This can make the issue of overwriting vs. appending logs between test runs somewhat complicated. To settle the issue, Cabal overwrites the dist/testdirectory by default, but will preserve its contents and append any human-readable log files if the --append-human-logs option is specified. This applies only to the human-readable logs; the machine-readable log is always overwritten.

Test Runners

In the past week, I have focused on writing test runners for HUnit and QuickCheck which implement the detailed test interface. The test runners are separate packages that provide an interface between the selected test framework and Cabal's new detailed test interface. The primary obstacle to this effort is the fact that these libraries are designed to output test result information to the terminal, which is obviously unsuitable for the detailed test interface. Here, at least, HUnit provides performTest, allowing developers to collect additional information. However, the (unnecessary) use of IO still prevents me from writing a pure test interface.

Writing test runners for the QuickCheck versions has been more difficult. For QuickCheck 1, I have had to rewrite part of the test runner to have a pure interface, but this was a relatively simple process. However, QuickCheck 2 has been more of a problem: the test code is written in an imperative style where results are reported to the terminal and then discarded. I can access certain data directly from the API, such as the overall result of a test, but the API does not expose information such as the failing inputs determined by the shrink loop: these are reported to the terminal, then discarded. Initially, I sought to rewrite the test runner to expose this information, but I quickly found myself rewriting the bulk of the library.

This leaves me in quite a bind; I can either 1) rewrite the library (which is so silly a suggestion that it's not even really an option) or 2) write a parser for QuickCheck's terminal output (which I do not want to do, on the principle that parsing text like this is inelegant and likely to break later on). I find the latter option particularly abhorrent because the test runner code cannot interact directly with QuickCheck's output, since it is written immediately to the terminal, so I would have to integrate the QuickCheck output parser directly into Cabal itself! Until I can resolve this problem, I have written a bare-bones test runner that unfortunately omits useful information for simplicity's sake.

What's Next?

I will probably have to be satisfied with my simplified QuickCheck 2 test runner for now. Hopefully, once my patches to Cabal are committed, I can engage its maintainer in a discussion about the pitfalls I have encountered. Other than that, the remaining weeks of my Google Summer of Code will be spent adding a few more flags to polish off Cabal's test support and documenting my work.