Automated tests will be run on a configurable schedule, driven by data in Soyuz, on particular SourcePackageReleases (and perhaps eventually on branches working toward a release), and the results will be made available to those who are involved with, or interested in, those aspects of the branches being tested. For some software, tests might be carried out on each new changeset, which in many cases would immediately highlight the problematic change. After committing a new feature, the upstream developer and Debian maintainer might receive notification that it has broken functionality or buildability on a particular platform (e.g. Debian), and be able to fix the problem before release.
Types of tests
- Package file tests
- These are carried out on the binary and source package files themselves, without installing them
- Package self-tests
- These are executed on a system which has the target package installed. They are generally independent of the platform, though there will be exceptions where distributions diverge in functionality. Usually they will be package-specific, but common themes will be factored out into generic tests and test building blocks
- Packaging system tests
- These require a system (real or virtual) where packages can be installed and upgraded. They are platform-specific insofar as they interface with the platform's packaging system, but will generally apply to any package
- Macro tests
- These will combine other types of tests in interesting ways. This may require changing the state of the system as a whole (such as a reboot, or system installation testing), and as such, some of them may only be feasible to automate on a virtual system.
The testing framework will almost certainly need a Sandbox where it can freely install and remove packages and otherwise abuse the system.
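One way to provide such a sandbox (a sketch only: it assumes debootstrap and a Hoary mirror, and the chroot path is arbitrary) is a throwaway chroot:

```shell
# Build a minimal chroot that the framework may freely abuse.
sudo debootstrap hoary /srv/test-chroot http://archive.ubuntu.com/ubuntu

# Install and exercise packages inside it; nothing outside the chroot changes.
sudo chroot /srv/test-chroot apt-get install -y some-package

# Throw the whole environment away between runs.
sudo rm -rf /srv/test-chroot
```

A virtual machine would serve the same purpose, and is the only option for the macro tests that need reboots or full system installs.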
Something similar to Tinderbox to show when something breaks and give some idea of blame
- Must take into account packages other than the one being tested: if not testing with the same versions of libraries, for example, the breakage might not be due to a change in the target package at all. Must store a list of all software in the test environment
Maybe do some extra work to isolate breakage in some cases, along the lines of the paper "Yesterday, my program worked. Today, it does not. Why?" (Zeller)
- Testing framework should measure time as well; if the tests are designed for it, this may also provide a framework for performance measurement (which will share similar reporting requirements)
- Should integrate results from builds as well as tests
- Build failures
JamesTroup proposed testing for questionable compiler flags during the build
- DafyddHarries pointed out that Skolelinux is already doing some upgrade testing: http://developer.skolelinux.no/~pere/upgrade-testing/
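Storing the list of all software in the test environment, as suggested above, can be as simple as a dpkg-query manifest per run (a sketch; the manifest file names are assumptions):

```shell
# Record every package and version present in the test environment, so a
# later failure can be correlated with changes in dependencies rather than
# in the target package itself.
dpkg-query -W -f '${Package} ${Version}\n' | sort > environment.manifest

# When a previously-passing test breaks, diff the manifests of the last
# good and first bad runs to see what actually changed on the system:
diff -u good/environment.manifest bad/environment.manifest
```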
Package file tests
Binary package tests
- Sanity checks
- Test presence, extractability, syntax of control data
- Test presence, extractability of contents
- Package signatures?
- Comparison-based tests
- Compare package file list to previous version (debdiff), report differences; automatically generate a warning when the package changed considerably (previously: 100 files, now 10)
- Test for file overlaps with any other known binary package, report undeclared conflicts
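A rough sketch of the sanity and comparison checks above, using dpkg-deb; the package file names and the "changed considerably" threshold (half the previous file count) are placeholders:

```shell
old_deb=foo_1.1-1_i386.deb
new_deb=foo_1.2-1_i386.deb

# Sanity: control data and contents must be present and extractable.
dpkg-deb --info "$new_deb" > /dev/null || echo "control data unreadable"
dpkg-deb --contents "$new_deb" > /dev/null || echo "contents unreadable"

# Comparison: warn when the file list shrinks drastically between versions
# (previously 100 files, now 10 usually means a broken build).
old=$(dpkg-deb --contents "$old_deb" | wc -l)
new=$(dpkg-deb --contents "$new_deb" | wc -l)
[ "$new" -lt $((old / 2)) ] && echo "WARNING: file count dropped from $old to $new"
```

debdiff (from devscripts) already produces the file-list comparison itself; the threshold logic would sit on top of its output.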
Source package tests
We already test buildability, which is the most important source package test. Ideally, the reporting for build failures should feed into the same presentation layer as test failures (probably via Soyuz)
- Test functionality of debian/rules clean target
- Test for presence of required, optional debian/rules targets
- Especially those we add as extensions
- Useful for generating "to do" lists across the archive when implementing packaging extensions
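The target-presence checks can be done without running a build at all, by inspecting debian/rules (a sketch; which targets count as required follows Debian policy, and our extension targets would be appended to the list):

```shell
rules=debian/rules
# clean, build and binary are the targets Debian policy requires.
for target in clean build binary; do
    grep -Eq "^$target[[:space:]]*:" "$rules" \
        || echo "missing required target: $target"
done

# The clean target must also succeed on an already-clean tree,
# so run it twice in a row:
fakeroot debian/rules clean && fakeroot debian/rules clean
```

The grep is only a first approximation: it misses targets defined via pattern rules or included makefiles, so actually invoking each target would be more robust.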
Package self-tests
- Even simplistic tests, such as testing that a binary links and loads (foo --version), are very useful
- Packages providing a network service should confirm that it is running
- Packages running a daemon should confirm that it is running
- Add test cases for bugs as they are fixed, to prevent regressions
- Is there any way to make use of upstream-supplied testing frameworks?
- Usually designed to run from the source tree, so usually not installed at all; however, results could be included in the package, similar to binutils
- Testing at install time is better; it will also catch library transition problems, etc.
- -dev packages could compile and link a trivial test program; this would catch many common errors
For the kernel, LTP (the Linux Test Project) looks excellent
- Run self-tests under profiling/debugging tools like valgrind
JeffWaugh: maybe use click-testing of X programs with JNEE
Find a general testing framework which can be utilized; packages should install a test script in a hook directory (e.g. /usr/share/selftest/package)
Packaging system tests
- Package installation
- Permute Debconf answers?
- Check which Debconf questions are actually shown (needs a noninteractive but logging Debconf frontend)
- Package removal
- Package purge
- Compare system file list before install and after purge, report differences
- Package upgrades
- Test upgrade from previous version of package
- Test upgrade from latest stable version of package
- Package installation, followed by self-tests
- Install on bare minimal system and test (catch missing deps)
- Install on huge system and test (catch conflicts)
- Install on old system (stable?) and test (catch missing versioned deps)
- Permute Debconf answers?
- Package removal
- Test install-purge, install-remove-purge cycles
- System startup
- Reboot system with package installed, test for any startup errors (how?)
- Test that any daemons are running (self-test integration?)
- LSB compliance testing
- System installation testing
- Collect data:
- debconf questions asked
- installed system disk space requirements
- size of download?
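The install-purge cycle and the before/after file-list comparison above could be combined into one script (a sketch; the snapshot scope and file names are assumptions, and it must run inside the disposable sandbox since it really installs and purges):

```shell
#!/bin/sh
pkg="$1"

# Snapshot the directories a package is likely to touch.
snapshot() {
    find /etc /usr /var -xdev 2>/dev/null | sort
}

snapshot > before.list
apt-get install -y "$pkg"
apt-get purge -y "$pkg"
snapshot > after.list

# comm -13 prints lines only in after.list: files the install/purge cycle
# left behind, which the package should have cleaned up.
comm -13 before.list after.list
```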
Assignments (Hoary goals)
JamesTroup: decide on and install sandbox
ColinWatson: packaging system tests