Property-based testing

A time-honored techniques to avoid introducing bugs while modifying software is to set up automated tests which assert that code still does what it’s supposed to do (e.g. produces expected results). But it is not so easy to get this right:

A test is most useful when it targets a small software component (e.g. single function, class…) because when that fails, it gives you a precise indication of where the problem is, which saves you debugging time. But that means you need to write a lot of tests, which is tedious.
Because writing tests is tedious, manually written tests tend to be very unimaginative: they always test against the same inputs, do not try many different configurations… and that increases the odds that bugs which only affect more exotic edge cases will be missed.
The above problem is magnified by the fact that in most scientific projects, the same person writes the code being tested and the test that exercises it: what edge case you missed while writing the code, you are likely to also miss when writing the associated test.

Thankfully, there is a way to make fine-grained tests a little easier to write and much more imaginative, and that is to write tests that operate over random inputs, and assert properties that should be true of any output. This is called property-based testing.

Defining a property

The concept of property-based testing is borrowed from functional programming. To summarize, the idea is no longer to predict the output knowing a particular input, but to determine and test the properties of our function. But what do we mean by property? Mathematically, a property p for all elements of a set X is usually defined as a function p: X → {true, false}. Well, it is not always obvious to determine in practice.

Let’s try to reason on a simple example that takes two numbers and adds them:

def add_numbers(a: int, b: int) -> int:
    return a + b

What particular property(ies) characterizes the add_numbers function?

Looking at the function, we see that it is based around the addition operator. Unlike subtraction, addition is commutative, so add_numbers(a, b) == add_numbers(b, a). But we could have made a bug by writing the operator * instead of + and the test would still be valid… So what other property distinguishes addition from multiplication? The neutral element! add_numbers(a, 0) == a is not satisfied for multiplication.

We have therefore isolated at least two properties which allow us to test our function (its implementation) without having to resort to explicit input/output values (well almost, see later). Finding properties is the real difficulty of this type of test. Here are some properties that often come up in our scientific developments (the list is endless!):

The invariants (e.g. size of a collection when applying a map, or the contents of a collection when sorting it)
Inverses (invertible function)
Idempotence (applying twice the function left the results identical than applying it once)
Properties of internal composition laws: associativity, commutativity…
Structural induction (solve a smaller problem first)
Hard to prove, easy to verify (think of a maze)

A less trivial example

Let us assume, for example, that you are writing a C++ routine that takes an integer as input and decomposes it into prime factors. It could have the following signature:

std::vector<unsigned> prime_factors(unsigned input);

What can we tell about such a function without looking at the implementation?

All output integers should be prime numbers.
The product of all output integers should be equal to input.

So we can write a test that, given random unsigned integers, passes them through prime_factors and asserts that the result has these two properties:

void assert(bool expectation, std::string message) {
    if (!expectation) {
        throw std::runtime_error(message);
    }
}

bool is_prime(unsigned number) {
    for (unsigned i = 2; i <= number / 2; ++i) {
        if (number % i == 0) {
            return false;
        }
    }
    return true;
}

int main() {
    for (int i = 0; i < 100; ++i) {
        // NOTE: This does not cover all possible values of "unsigned", in the
        //       exercises we will use a better randomness source that does.
        unsigned input = rand();

        std::vector<unsigned> factors = prime_factors(input);

        unsigned product = 1;
        for (const unsigned factor: factors) {
            assert(is_prime(factor), "Returned factor is not a prime number");
            product *= factor;
        }
        assert(product == input, "Factor product is not original input");
    }
}

And that, in a nutshell, is the idea of property-based testing. Of course, real world property-based testing frameworks have more features, such as printing out what inputs made the test fails or searching for the simplest input that makes the test fail. But the basic principle is simple: unless you are explicitly testing for a known tricky edge case, write your tests to operate over random inputs and assert things that should be true of any output.

Now that you are hopefully convinced that this is a testing technique worth learning, please choose in which programming language you would like to experiment with it:

Robust Programming

Property-based testing

Defining a property

A less trivial example