Cabal Testing Summer of Code: 2010

Friday, July 23, 2010

Logging and (Problems with) Test Runners: GSoC 2010 Weeks 6, 7, 8, and 9

It has been four weeks since my last "weekly" update, but only because I've been too busy working to write about it! In the intervening time, I wrote, rewrote, and re-rewrote the logging code for the Cabal test agent. (It seems the third time's the charm.) Finally happy with it, I implemented test runners for the most common Haskell test frameworks: HUnit and both major versions of QuickCheck.

Test Logging

The logging code and format have gone through several iterations, but finally seem to have settled into their final, usable form. By default, all test logs are located in the dist/test directory, unless the user sets a different dist path. Cabal produces two types of logs: human- and machine-readable. The human-readable logs record the standard output of the test executables (for both exitcode-stdio and detailed type tests). As I discussed in my last update, the human-readable logs are named based on the same type of path templates used elsewhere in Cabal. By default, each test suite is logged to its own human-readable file to aid debugging; the default path template is "$pkgid-$test-suite.log". However, a machine should have no problem parsing through all the test results, so there is only one machine-readable log file per package; also, the machine-readable log file stores platform and architecture information, so this reduces the amount of duplicate data stored.

Because the log files are named using path templates, it is not always possible to determine in advance the name of a log file, or even whether two test suites will be logged to different files or the same file. This complicates the question of overwriting versus appending logs between test runs. To settle the issue, Cabal overwrites the dist/test directory by default, but will preserve its contents and append to any existing human-readable log files if the --append-human-logs option is specified. This applies only to the human-readable logs; the machine-readable log is always overwritten.

Test Runners

In the past week, I have focused on writing test runners for HUnit and QuickCheck which implement the detailed test interface. The test runners are separate packages that provide an interface between the selected test framework and Cabal's new detailed test interface. The primary obstacle to this effort is the fact that these libraries are designed to output test result information to the terminal, which is obviously unsuitable for the detailed test interface. Here, at least, HUnit provides performTest, allowing developers to collect additional information. However, the (unnecessary) use of IO still prevents me from writing a pure test interface.

Writing test runners for the QuickCheck versions has been more difficult. For QuickCheck 1, I had to rewrite part of the test runner to give it a pure interface, but this was a relatively simple process. QuickCheck 2 has been more of a problem: the test code is written in an imperative style in which results are reported to the terminal and then discarded. I can access certain data directly from the API, such as the overall result of a test, but the API does not expose information such as the failing inputs found by the shrink loop. Initially, I sought to rewrite the test runner to expose this information, but I quickly found myself rewriting the bulk of the library.

This leaves me in quite a bind; I can either 1) rewrite the library (which is so silly a suggestion that it's not even really an option) or 2) write a parser for QuickCheck's terminal output (which I do not want to do, on the principle that parsing text like this is inelegant and likely to break later on). I find the latter option particularly abhorrent because the test runner code cannot interact directly with QuickCheck's output, since it is written immediately to the terminal, so I would have to integrate the QuickCheck output parser directly into Cabal itself! Until I can resolve this problem, I have written a bare-bones test runner that unfortunately omits useful information for simplicity's sake.

What's Next?

I will probably have to be satisfied with my simplified QuickCheck 2 test runner for now. Hopefully, once my patches to Cabal are committed, I can engage its maintainer in a discussion about the pitfalls I have encountered. Other than that, the remaining weeks of my Google Summer of Code will be spent adding a few more flags to polish off Cabal's test support and documenting my work.

Wednesday, June 30, 2010

GSoC 2010 Weeks 4 and 5: Test Suite Logging

Despite the title, I have spent most of the last two weeks working on library test suite support. However, I will be talking mostly about test suite logging, where more user-visible developments have taken place.

Controlling Test Suite Output and Logging

My main project for week 4 was to implement command-line options for Cabal controlling console output and log file names. During week 3, I implemented a set of eight output filter options allowing the user to specify separately which test suites' output would be displayed on the console and which would be logged to file. We eventually settled on a simpler scheme: output from test suites is always logged to file, and a single option, --test-filter={summary,failures,all}, controls console output. These choices have the following effects:

summary: indicate whether each test suite was successful or not
failures: display summary information and the output from any unsuccessful test suites
all: display summary information and the output from all test suites
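
The three filter settings can be sketched as a small pure function. This is only an illustration of the behavior described above, not the code in my patches; the type and function names (TestFilter, SuiteRun, consoleLines) are made up for the example, and only the option values summary, failures, and all come from the actual interface.

```haskell
-- Hypothetical types for illustration; not Cabal's own definitions.
data TestFilter = Summary | Failures | All
    deriving (Eq, Show)

-- | One finished test suite: its name, whether it passed, and its output.
data SuiteRun = SuiteRun
    { suiteName   :: String
    , suitePassed :: Bool
    , suiteOutput :: String
    }

-- | Parse the argument given to --test-filter.
parseFilter :: String -> Maybe TestFilter
parseFilter "summary"  = Just Summary
parseFilter "failures" = Just Failures
parseFilter "all"      = Just All
parseFilter _          = Nothing

-- | Lines shown on the console for one suite. The summary line is always
-- present; the suite's own output is added depending on the filter.
consoleLines :: TestFilter -> SuiteRun -> [String]
consoleLines f run = summary : details
  where
    summary = suiteName run ++ ": "
           ++ (if suitePassed run then "pass" else "fail")
    showOutput = case f of
        Summary  -> False
        Failures -> not (suitePassed run)
        All      -> True
    details = if showOutput then lines (suiteOutput run) else []
```

Note that the log file is unaffected by the filter in this scheme: only the console view changes.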

The names of the test suite log files may now be specified by a path template with the option --test-log. Path templates are already in use for build reports, but several variables have been added for test log path templates; a comprehensive listing follows.

$pkg: the name of the package
$version: the version of the package
$pkgid: the name and version of the package
$compiler: the compiler which built the package
$os: the operating system the package was built on
$arch: the machine architecture the package was built on
$test-suite: the name of the test suite; if this variable is not used, all test suites are logged to the same file
$result: the result of the test suite, one of pass or fail
$stdio: the output channel being logged, one of stdout or stderr; if this variable is not used, test suite stdout and stderr are logged to the same file
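
To make the substitution concrete, here is a minimal sketch of how such a template might be expanded. It is not Cabal's actual path template code; TemplateEnv and substTemplate are names invented for this example, and leaving unknown variables untouched is simply a choice made for the sketch.

```haskell
import Data.List (isPrefixOf, sortBy)
import Data.Ord (comparing)

-- | Variable bindings, e.g. [("pkgid", "foo-1.0"), ("test-suite", "unit")].
type TemplateEnv = [(String, String)]

-- | Replace each "$name" with its value. Longer names are tried first so
-- that "$pkgid" is not read as "$pkg" followed by "id". Variables with no
-- binding are left in place.
substTemplate :: TemplateEnv -> String -> String
substTemplate env = go
  where
    vars = sortBy (comparing (negate . length . fst)) env
    go [] = []
    go ('$' : rest) =
        case [ (v, drop (length k) rest) | (k, v) <- vars, k `isPrefixOf` rest ] of
            ((v, rest') : _) -> v ++ go rest'
            []               -> '$' : go rest
    go (c : cs) = c : go cs
```

With bindings for pkgid and test-suite, the template "$pkgid-$test-suite.log" expands to a different file for each suite, while a template that omits $test-suite, such as "$pkgid.log", expands to the same file for every suite in the package.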

Security Improvements

Because of the $result path template variable, test suite output must be logged first to a temporary file until the result is determined. I wasn't thinking much about security, and being unfamiliar with openTempFile, I naively (and quite insecurely) reinvented the wheel. After Duncan pointed out my error and the existence of a standard library solution, I fixed the bug.
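
For readers who, like me, had not met openTempFile before, the pattern looks roughly like this. It is a sketch under assumptions rather than the code in Cabal: logViaTempFile and the file naming scheme are invented for the example, and creating the temporary file in the destination directory is a choice made here so that the final rename stays on one filesystem.

```haskell
import System.Directory (getTemporaryDirectory, removeFile, renameFile)
import System.FilePath ((</>))
import System.IO (hClose, hPutStr, openTempFile)

-- | Write a suite's output to a freshly created temporary file, then move
-- it to a name that includes the result once the result is known.
-- The "<suite>-<result>.log" naming is hypothetical.
logViaTempFile :: FilePath -> String -> String -> Bool -> IO FilePath
logViaTempFile dir suite output passed = do
    -- openTempFile picks an unused name and creates the file for us.
    (tmpPath, h) <- openTempFile dir (suite ++ ".tmp")
    hPutStr h output
    hClose h
    let result    = if passed then "pass" else "fail"
        finalPath = dir </> (suite ++ "-" ++ result ++ ".log")
    renameFile tmpPath finalPath
    return finalPath

-- | Example: log to the system temporary directory, read the log back,
-- and clean up afterwards.
example :: IO String
example = do
    dir      <- getTemporaryDirectory
    path     <- logViaTempFile dir "unit" "1 test failed\n" False
    contents <- readFile path
    -- Force the lazy read before deleting the file.
    length contents `seq` removeFile path
    return contents
```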

Detailed Test Suites

In week 5, I submitted to Duncan a draft of support for detailed test suites, where individual test cases are exposed to Cabal. These patches included support for building the library-based detailed test suites as well as for running detailed test suites alongside the other type of test suite. There are two challenging details in this aspect of the project: building and registering the test suite libraries and exposing the individual test case results to the parent process (Cabal, in this case).

Building Library Test Suites

There are a couple of challenges here. The source for a stub executable--to run all the test cases listed in the test library--must be written during the preprocessing stage. The stub executable is relatively simple because it is nearly the same for every test suite. During the build stage, Cabal must build this stub executable along with the library for the test suite. I chose to construct a fake Library and PackageDescription, named after the test suite, for the library component of the test because Cabal does not support multiple libraries in the same package, and thus derives the library name from the package name. The library is registered in the in-place package database before building the stub executable, which is named after the test suite with "Stub" appended. Because of these choices, name conflicts between the package and the test suite, and between the test suites and executables, must be avoided.
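
Generating the stub amounts to filling two names into a fixed piece of source text. The sketch below shows the idea only; it is not the stub Cabal writes. The runner module and its runTests function are hypothetical placeholders, and the only details taken from the design are the "Stub" suffix and a test module that exports tests.

```haskell
-- | Name of the stub executable for a test suite.
stubName :: String -> String
stubName suite = suite ++ "Stub"

-- | Source text of a stub that runs every test exported by the test module.
-- The first argument names a hypothetical runner module assumed to export
-- a function runTests; it stands in for whatever the real stub imports.
stubSource :: String -> String -> String
stubSource runnerModule testModule = unlines
    [ "module Main ( main ) where"
    , "import qualified " ++ testModule ++ " as Suite ( tests )"
    , "import " ++ runnerModule ++ " ( runTests )"
    , "main :: IO ()"
    , "main = runTests Suite.tests"
    ]

-- | Test suites whose stub name collides with an executable in the package.
conflicts :: [String] -> [String] -> [String]
conflicts executables suites =
    [ s | s <- suites, stubName s `elem` executables ]
```

A check like conflicts is one way the naming restriction could be reported at configure time instead of surfacing as a confusing build failure.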

Running Library Test Suites

The problem here is deciding how to pass data between Cabal and the stub executable when running the test suite. In particular, the stub executable needs the log file template, the path template environment, and the location of dist. The calling process also needs a detailed list of test results from the stub. In my latest patches, Cabal stores all this information in an intermediate structure and writes it, serialized with show, to the standard input of the stub. The stub runs and logs the test cases, then writes the list of results, again serialized with show, to its standard output. Cabal reads this and decides what information to display to the user on the console. There is no support for running only selected test cases from a test suite at this time; this functionality is not a high priority and may be left to third-party test agents.
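
The exchange between Cabal and the stub can be pictured with derived Show and Read instances. The records below are hypothetical stand-ins for the intermediate structure, chosen only to mirror the three pieces of data listed above; decode and stubMain are likewise names invented for this sketch.

```haskell
import Data.Char (isSpace)

-- | Hypothetical stand-in for what the parent sends to the stub.
data StubInput = StubInput
    { logTemplate :: String
    , templateEnv :: [(String, String)]
    , distDir     :: FilePath
    } deriving (Eq, Read, Show)

-- | Hypothetical per-test-case result sent back by the stub.
data CaseResult = CasePass | CaseFail String
    deriving (Eq, Read, Show)

-- | Parse a value written with show, rejecting trailing garbage.
decode :: Read a => String -> Maybe a
decode s = case reads s of
    [(x, rest)] | all isSpace rest -> Just x
    _                              -> Nothing

-- | Shape of a stub's main: read the input from standard input, run the
-- test cases with the supplied action, and show the results on standard
-- output for the parent process to read.
stubMain :: (StubInput -> IO [(String, CaseResult)]) -> IO ()
stubMain runCases = do
    raw <- getContents
    case decode raw of
        Nothing    -> ioError (userError "stub: could not parse input")
        Just input -> runCases input >>= print
```

On the parent side, the same decode function would be applied to the stub's captured standard output to recover the list of results.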

Next Steps

There are still decisions to be made about the log file format, specifically about how to balance the advantages of human- versus machine-readable logs. The ideal test log format would be readable both by human users and, e.g., Hackage. The current patch set simply dumps the standard output to file as a convenient, if temporary, response to this indecision. Designing a better log format will occupy the rest of week 6. Once the format is settled upon, a test or tests of the test runners will be included in the Cabal test suite.

Tuesday, June 15, 2010

GSoC 2010 Weeks 2 and 3: More Parsing and Improvements to Cabal's Test Suite Runner

My focus this week has been on submitting my executable test suite patches. These patches have just been added to the head repository; although there is ongoing discussion about cosmetic issues in the .cabal file format, executable test suite support is probably approaching its final incarnation.

The most notable of the changes is our conscious decision to use "test suite" instead of "testsuite" everywhere in Cabal, and to emphasize the distinction between individual tests and test suites. As a result, the test stanza has become the test-suite stanza. We have also decided to accept only single versions for the test suite interface type in the .cabal file, instead of version ranges as I previously wrote. As a result, the new test-suite stanza looks like this:

test-suite foo
    type: exitcode-stdio-1.0
    main-is: main.hs
    hs-source-dirs: tests

test-suite bar
    type: library-1.0
    test-module: Bar
    hs-source-dirs: tests

I have also implemented a set of options (--log-{success,failure}-{file,terminal,both,none}) controlling how Cabal logs test suite output. Output logged to file goes in a uniquely named file in the system temporary directory; the other options should be self-explanatory. The exit code is also set depending on the success or failure of the package test suites, making it possible to do things like:

$ cabal configure --enable-tests && cabal build && cabal test --log-success-none --log-failure-file && release-software

in order to have a (nearly) completely automated testing process.
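
The exit code logic behind this is simple enough to sketch. This is an illustration rather than the code in the patches; overallExitCode and finish are invented names, and the specific failure code 1 is an assumption.

```haskell
import System.Exit (ExitCode(..), exitWith)

-- | Exit code for a whole test run, given each suite's name and whether
-- it passed: success only if every suite succeeded.
overallExitCode :: [(String, Bool)] -> ExitCode
overallExitCode results
    | all snd results = ExitSuccess
    | otherwise       = ExitFailure 1

-- | End the process with the code for the given results, which is what
-- lets a shell "&&" chain stop at the first failing step.
finish :: [(String, Bool)] -> IO a
finish = exitWith . overallExitCode
```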

From here, the next step is to create the test interface for the detailed (library) test suite type. As I have written before, the interface must support setting various options for tests from different frameworks, including setting the seed used to generate random values--e.g., with QuickCheck tests--so that tests are reproducible. Ideally, the interface would also distinguish between tests that must actually be run in IO and otherwise pure tests that use random values, which are actually deterministic given the seed. This latter point isn't actually necessary (as shown by the lack of similar support in existing test runners), but it would be a beneficial guarantee of parallelizability of tests.

Monday, May 31, 2010

GSoC 2010 Week 1: Parsing, Building, and Running Testsuites

The first week of the Google Summer of Code 2010 has been a productive one for me! I've been working on:

parsing test stanzas from .cabal files,
building executable-type testsuites, and
running testsuites and collecting results.

On the first two points, I have patches ready and awaiting approval from Duncan Coutts, Cabal's maintainer; hopefully, they will be available in the repository soon.

As I have worked on the implementation, there has been one notable change to the design of the test stanza in the .cabal file: rather than indicate the testsuite interface version in its own field, it is now indicated in the same field as the interface type in the same way versions are indicated in the build dependencies. If the description I just gave is difficult to understand, then a picture may be worth a thousand words:


Test foo
    test-is: foo.hs
    type: executable == 1

A few notes about this interface:

Version 1 is the only version of the executable interface at this time.
No guarantee of compatibility is made between interface versions, so although this style would allow package authors to specify version ranges, they are encouraged to do so carefully. In fact, there is no reason to do so, as newer versions of Cabal will always support older testsuite interface versions.
Because all versions of the interface will always be available and be mutually incompatible, there is no sane default version; therefore, the version must be specified. This differs from the build-depends field.

In week 2, I will be focusing on running testsuites and collecting their results. The standard output and standard error will be captured, along with the exit code, which will indicate the success or failure of the testsuite. Hopefully, I will have patches submitted and reviewed by the end of the week, and executable testsuite support will be (more or less) complete!
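
As a rough picture of what running an executable testsuite involves, here is a sketch built on readProcessWithExitCode from the process package. It is not the implementation I will submit; SuiteOutcome, interpret, and runSuite are names made up for the example.

```haskell
import System.Exit (ExitCode(..))
import System.Process (readProcessWithExitCode)

-- | What is collected from one run of a testsuite executable.
data SuiteOutcome = SuiteOutcome
    { passed     :: Bool
    , stdoutText :: String
    , stderrText :: String
    } deriving (Eq, Show)

-- | A zero exit code means success; anything else is a failure.
interpret :: (ExitCode, String, String) -> SuiteOutcome
interpret (code, out, err) = SuiteOutcome (code == ExitSuccess) out err

-- | Run the executable with no arguments and empty standard input,
-- capturing its standard output, standard error, and exit code.
runSuite :: FilePath -> IO SuiteOutcome
runSuite exe = fmap interpret (readProcessWithExitCode exe [] "")
```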

Tuesday, May 11, 2010

A Prototype Test Suite Library Interface

This post is Literate Haskell.

> {-# LANGUAGE ExistentialQuantification #-}
> module Distribution.Testsuite where

This requires base >= 4 for extensible exceptions.

> import Control.Exception ( SomeException, evaluate, try )
> import Control.Monad ( liftM )
> import Data.Dynamic ( Dynamic() )
> import Data.Function ( on )
> import Data.List ( unionBy )
> import Data.Monoid ( Monoid(..) )

The use of unsafePerformIO is an unfortunate consequence of the method used to catch exceptions below.

> import System.IO.Unsafe ( unsafePerformIO )

Tests are separated based on their purity. Although it appears that any test using the RNG, i.e. all tests using QuickCheck, must be impure, the use of the RNG is actually pure if the starting seed is specified in the Options. Therefore, only tests using other sorts of IO need be impure. Hopefully, with this purity information, test agents can make more informed decisions about which tests can and cannot be run in parallel.

> data Test
>     = forall p. PureTestable p => PureTest p
>     | forall i. ImpureTestable i => ImpureTest i
> 
> class TestOptions t => ImpureTestable t where
>     getResult :: t -> Options -> IO Result
> 
> class TestOptions t => PureTestable t where
>     result :: t -> Options -> Result
> 
> class TestOptions t where
>     name :: t -> Name
>     options :: t -> [String]
>     defaultOptions :: t -> IO Options

The defaultOptions are returned in IO because it may be necessary to generate a random seed for some types of tests.

> type Name = String
> 
> newtype Options = Options [(String, Dynamic)]
>
> data Result = Pass | Fail Reason | Error SomeException
>     deriving Show
> 
> type Reason = String

The instances of the PureTestable, ImpureTestable and TestOptions classes will be left to the test libraries, e.g. HUnit and QuickCheck. It is not our purpose to reinvent the features they already provide, but to specify a uniform interface between them and test agents.

We provide a sensible instance of Monoid to allow the combination of sets of Options with the default options.

> instance Monoid Options where
>     mempty = Options []
> 
>     mappend (Options a) (Options b) =
>         Options $ unionBy ((==) `on` fst) a b

Default options go in the right argument of mappend and are overridden by the Options on the left.

What remains are some helper functions for handling options and exceptions. Exceptions of any type are caught with try and reported as an Error result. The pure runner forces the result with evaluate, so that an exception raised by a pure test is thrown inside wrapException rather than escaping when the result is inspected later:

> wrapException :: IO Result -> IO Result
> wrapException go = liftM (either Error id) (try go)
> 
> mergeOptions :: TestOptions t => t -> Options -> IO Options
> mergeOptions test opts = liftM (mappend opts) $ defaultOptions test
> 
> runImpureTest :: ImpureTestable t => t -> Options -> IO Result
> runImpureTest = (wrapException .) . getResult
> 
> runPureTest :: PureTestable t => t -> Options -> Result
> runPureTest test opts = unsafePerformIO $ wrapException $ evaluate $ result test opts

The use of unsafePerformIO in runPureTest is actually safe, since the function we are catching exceptions from is pure.
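
One subtlety when catching exceptions thrown by pure code is that the value must actually be evaluated while the handler is in place. The following self-contained sketch, separate from the module above and using invented names (caught, lazyAttempt, strictAttempt), contrasts return, which hands back an unevaluated value, with evaluate, which forces it to weak head normal form inside the IO action.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- | Run an action, turning any exception into a Left with its message.
caught :: IO a -> IO (Either String a)
caught act = do
    r <- try act
    return $ case r of
        Left e  -> Left (show (e :: SomeException))
        Right x -> Right x

-- | return does not force the division, so a division by zero is not
-- raised here; the caller receives Right wrapping an unevaluated value.
lazyAttempt :: Int -> IO (Either String Int)
lazyAttempt n = caught (return (100 `div` n))

-- | evaluate forces the division while try is still watching, so a
-- division by zero is reported as a Left.
strictAttempt :: Int -> IO (Either String Int)
strictAttempt n = caught (evaluate (100 `div` n))
```

Note that evaluate only goes as far as the outermost constructor. For a type like Result that is enough to learn whether the test passed, although a lazily built failure message could in principle still hide an exception.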

Changes to Cabal Command-Line Interface

Configuring and Building

In the configure stage, Cabal will have to recognize the --{en,dis}able-tests options. With tests disabled, the Test sections can be ignored entirely, but with tests enabled, Cabal will need to generate stub executables and do dependency resolution for the Test sections and any stubs.

During the build stage, Cabal will build any test suite executables, test suite libraries, and stub executables. Test suite libraries and executables may depend on the library exported by the package (if any), and therefore must be built after the rest of the package's contents.

Running Tests

The test stage can be invoked by the command cabal test. If the name of a test suite is supplied, then only that test suite will be run. Otherwise, all test suites named in the package description file will be run. For shell-type test suites, Cabal will run the test suite executable, collecting any output and the exit code, and report these to the user, indicating the success or failure of the test suite. For library-type test suites, Cabal will run the stub executable linked to the test suite library which will run each test in the suite and report on the individual results.

Library-type test suites may also be invoked by external agents. In this case, the method of invocation will depend on the test agent. The test agent will be responsible for generating and compiling a stub executable. The behavior of this executable will also be agent-specific, allowing for functionality to be extended beyond what Cabal's basic test runner will support. The choice of test agent is left to the user and the use of one external test agent will not prevent the use of Cabal's basic test runner or of other external agents.

Monday, May 10, 2010

Changes to .cabal File Format

This project will allow for a new type of section in Cabal's package description files: the aptly named Test section. The Test section will contain the field type indicating the test suite interface and the field test-is, the meaning of which will depend on the type of test suite. Test suite types will support versions to maintain backwards compatibility if improvements are made to the interface types.

Test Section (Shell Interface)

If the type field is set to shell-1 (indicating a test suite using the first version of the shell type), the test-is field should refer to a file that compiles into a valid executable. The rest of the section should contain any of the usual build info fields necessary. For example:

Test test-foo
    type: shell-1
    test-is: TestFoo.hs
    ...

Test Section (Library Interface)

If the type field is set to library-1, the test-is field should be a module that exports a symbol tests. The rest of the section should contain any of the necessary build info fields. For example:

Test test-bar
    type: library-1
    test-is: Tests.Bar
    ...

In the file "Tests/Bar.hs" we should find:

module Tests.Bar (tests) where
import Distribution.Testsuite (Test(..))

tests :: [Test]
tests = [ test1, test2, ... ]

Build Systems

For packages using the Simple build system, Cabal will automatically handle building test suite executables or libraries. For library test suites, Cabal will also build a stub executable allowing the user to selectively run tests with results reported to the standard output. (This is the default test agent.) To allow other test agents to use the test suite library, Cabal will install it to a local package database reserved for this purpose. In order to be compatible with this interface, packages with custom build systems must likewise handle test suite executables and libraries, including the installation of the test suite libraries to a local database.

Introductions

Who am I?

I am Thomas Tuegel from Indianapolis, USA. I have just completed my B.S. in physics and mathematics at Butler University. In August, I will begin studying for my Ph.D. in physics at the University of Illinois at Urbana-Champaign. I have been accepted as a student in the 2010 Google Summer of Code to work on testing support in Cabal.

Overview

The immediate goal of this project is to implement testing support in Cabal so that users have access to useful information about package test suite results presented in a uniform format. Initially, this will make it easy for users to run test suites locally, but eventually Hackage will collect test suite results for uploaded packages. By making this information readily available to the public, we hope to incline more authors to provide useful, working test suites for their packages, improving the quality of software available on Hackage.

Cabal will support two test suite interfaces: a shell interface designed for easy integration with existing test suites and a library interface that will be the standard for integrating new test suites. For tests using the shell interface, Cabal will collect the exit code (indicating success or failure of the test suite) and standard output from the test suite executable. The shell interface will allow us to collect meaningful information from existing test suites without modification; existing test suites need only be designated as such in the .cabal file.

The library interface is really the interesting part of this project. A library test suite will export an entry point of a particular type which Cabal (or other test agents) can import into a stub executable. Then, the stub executable can access information about each test in the suite, run selected tests, and report the results. Cabal will specify a uniform interface for library test suites so that test agents can be written which are compatible with the many packages we hope will have test suites implementing the library interface in the future.

Cabal will be one test agent providing the capability to run selected tests from a library test suite and report the results on the standard output. To facilitate other test agents compiling stub executables linking to the test suite library, Cabal will install the library to a local package database reserved for this purpose.
