Notes: Property-based testing

The following is a rough draft of links and notes I have put together while researching property-based testing.

Example-based testing

Most common form of testing.
Tests whether the code works for a predefined set of examples.
Very easy and cheap to write.
The number of test cases needed can grow exponentially with the number of paths a program can take.

Example:

test("sum(a, b)", () => {
  expect(sum(1, 3)).toBe(4)
  expect(sum(1, 0)).toBe(1)
  expect(sum(0, 0)).toBe(0)
})

Property-based testing

Shifts the focus to the properties of a program.
Checks rules that bound the outputs across many generated inputs instead of a limited set of instances.
Example-based vs. property-based: “there exists” vs. “for all”.

Some definitions:

Property based testing is the construction of tests such that, when these tests are fuzzed, failures in the test reveal problems with the system under test that could not have been revealed by direct fuzzing of that system. - David MacIver

Property based testing works like this: first, you describe the arguments of your program, then you describe the result that you expect from those arguments. After that the computer does multiple attempts to prove your code wrong. - Someone on the internet

In property based testing, rather than checking the results with specific input, properties are asserted - “for any possible input, [some condition] should hold” - and a test runner searches for counter-examples. - NarendraC
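
To make the definitions above concrete, here is a minimal hand-rolled sketch of what a test runner roughly does: generate arguments, check the property, and report the first counter-example. The names forAll, randomInt and anyInt are hypothetical helpers written for this note and are not part of any library. Real tools add shrinking and seeding on top of this loop.

```javascript
// Hypothetical helper: a uniformly random integer in [min, max].
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Hypothetical runner: generate inputs, check the property, and stop at the
// first counter-example.
function forAll(generators, property, runs = 100) {
  for (let i = 0; i < runs; i++) {
    const args = generators.map((generate) => generate());
    if (!property(...args)) {
      return { ok: false, counterExample: args, runs: i + 1 };
    }
  }
  return { ok: true, runs };
}

const anyInt = () => randomInt(-1000, 1000);

// Describe the arguments (two integers), then the expected relationship.
const commutative = forAll([anyInt, anyInt], (a, b) => a + b === b + a);

// A property that does not hold: subtraction is not commutative, so the
// runner should come back with some pair where a !== b.
const broken = forAll([anyInt, anyInt], (a, b) => a - b === b - a);

console.log(commutative.ok, broken.ok, broken.counterExample);
```

Passing 100 runs is evidence rather than proof: the runner only searches for counter-examples among the inputs it happened to generate.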

Properties

Predicates that have to hold true for all inputs generated.
Invariants - they hold no matter what the input is.
- They follow the same idea as mathematical properties.
Coming up with properties is the hard part of property-based testing; coming up with examples is comparatively easy.

Example - testing the sum function with jsverify:

const jsc = require("jsverify")

// sum is the function under test, defined elsewhere.
describe("sum(a, b)", () => {
  jsc.property(
    "sum is commutative (a + b = b + a)",
    jsc.integer,
    jsc.integer,
    (a, b) => sum(a, b) === sum(b, a)
  )
})

Patterns

Update: I’ve implemented some of the patterns mentioned below. See it here.

Choosing properties is hard, but we have some patterns to help:

Inverse functions
- decode(encode(x)) === x
Fuzzing
- Looks for unexpected crashes (500 status code, unexpected exceptions, etc.)
Test Oracle
- Use an alternative implementation to check the result.
- myCode() === oracleCode()
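
A rough sketch of two of the patterns above, written without a library so it runs on its own. The helpers (randomInt, randomArray, randomAscii, findCounterExample) and the insertionSort function under test are assumptions made up for illustration. The inverse pair is the built-in encodeURIComponent/decodeURIComponent, and the oracle is the built-in Array.prototype.sort.

```javascript
// Hypothetical generators built on Math.random.
const randomInt = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min;
const randomArray = (gen, maxLength = 20) =>
  Array.from({ length: randomInt(0, maxLength) }, gen);
// Printable ASCII only, so encodeURIComponent never sees a lone surrogate.
const randomAscii = () => String.fromCharCode(...randomArray(() => randomInt(32, 126)));

// Returns the first input that breaks the property, or null if none was found.
function findCounterExample(generate, property, runs = 200) {
  for (let i = 0; i < runs; i++) {
    const input = generate();
    if (!property(input)) return input;
  }
  return null;
}

// Inverse functions: decode(encode(x)) === x
const roundTrips = (s) => decodeURIComponent(encodeURIComponent(s)) === s;

// Hypothetical code under test: a hand-written insertion sort.
function insertionSort(xs) {
  const out = [];
  for (const x of xs) {
    let i = out.length;
    while (i > 0 && out[i - 1] > x) i--;
    out.splice(i, 0, x);
  }
  return out;
}

// Test oracle: myCode() === oracleCode(), with the built-in sort as the oracle.
const sameAsOracle = (xs) => {
  const expected = [...xs].sort((a, b) => a - b);
  const actual = insertionSort(xs);
  return actual.length === expected.length && actual.every((x, i) => x === expected[i]);
};

const roundTripFailure = findCounterExample(randomAscii, roundTrips);
const oracleFailure = findCounterExample(() => randomArray(() => randomInt(-100, 100)), sameAsOracle);

console.log(roundTripFailure, oracleFailure);
```

The oracle pattern is most useful when the alternative implementation is slow or naive but obviously correct, so any disagreement points at the code under test.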

More patterns can be found on the links:

Benefits

Some benefits of property-based testing:

Forces you to reason about your code in a way you are not used to.
Good for finding and tracking corner cases that weren’t considered.
Good for finding bad inputs of a program.

Tooling

QuickCheck:

Invented in 2000 by John Hughes and Koen Claessen.
First tool created to test the properties of programs.
QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs
Ported to many languages since then.

JavaScript:

Shrinking:

A mechanism to simplify failing inputs of a test case.
Tries to find a minimal reproducible case of a failing test.
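
A minimal sketch of the idea for integers, assuming a simple greedy strategy: try a few simpler candidates and keep any that still fails. shrinkInt and shrink are hypothetical names, and real libraries use more elaborate strategies and shrink structured data as well.

```javascript
// Candidate "simpler" values for an integer: zero, half the value,
// and one step closer to zero.
function shrinkInt(n) {
  if (n === 0) return [];
  const towardZero = n > 0 ? n - 1 : n + 1;
  return [...new Set([0, Math.trunc(n / 2), towardZero])].filter((c) => c !== n);
}

// Greedy shrinking: keep moving to the first simpler candidate that still
// fails the property, and stop when every candidate passes.
function shrink(failing, property) {
  let current = failing;
  let improved = true;
  while (improved) {
    improved = false;
    for (const candidate of shrinkInt(current)) {
      if (!property(candidate)) {
        current = candidate;
        improved = true;
        break;
      }
    }
  }
  return current;
}

// Hypothetical buggy property: it fails for every n >= 42.
const property = (n) => n < 42;

// Suppose the random generator first stumbled on 9731.
const minimal = shrink(9731, property);
console.log(minimal); // 42
```

Reporting 42 instead of 9731 makes the boundary of the bug much easier to spot.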

Generators/Seed:

A mechanism to produce random inputs of a given type.
Random values are generated from a seed.
- You can use the seed to reproduce failing tests later with the same input generated before.
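
A sketch of why a seed makes failures reproducible, assuming generators are plain functions that draw from a seeded pseudo-random number generator. The small mulberry32-style PRNG and the integer, arrayOf and sample helpers are illustrative assumptions and not how any particular library is implemented.

```javascript
// Small deterministic PRNG (mulberry32-style): same seed, same sequence.
function mulberry32(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical generators: each takes the PRNG and returns a value of its type.
const integer = (min, max) => (rng) => Math.floor(rng() * (max - min + 1)) + min;
const arrayOf = (gen, maxLength) => (rng) =>
  Array.from({ length: integer(0, maxLength)(rng) }, () => gen(rng));

// Draw `count` values from a generator, starting from the given seed.
function sample(generator, seed, count) {
  const rng = mulberry32(seed);
  return Array.from({ length: count }, () => generator(rng));
}

const seed = 1234;
const smallArrays = arrayOf(integer(-50, 50), 5);

const firstRun = sample(smallArrays, seed, 3);
const replay = sample(smallArrays, seed, 3);

// Replaying with the same seed regenerates exactly the same inputs.
console.log(JSON.stringify(firstRun) === JSON.stringify(replay)); // true
```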

Who’s using it?

Volvo
- Used for testing third-party car components.
LevelDB
- Found a bug in a sequence of 17 operations with stateful testing.
Clojure

Fuzzing vs. Property based testing

Definition of Fuzzing:

Fuzzing is feeding a piece of code (function, program, etc.) data from a large corpus, possibly dynamically generated, possibly dependent on the results of execution on previous data, in order to see whether it fails.

Fuzz testing is a form of property-based testing.
Tests the property “it doesn’t crash”:
- No 500 status code returned from API.
- All responses are valid JSON.
- No unexpected exceptions are thrown.
What’s the difference between fuzz-testing and property-based testing?
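
The “it doesn’t crash” property from the list above can be sketched like this. handleRequest is a hypothetical request handler invented for the example, and the input generator is a naive random-string fuzzer plus a small hand-picked corpus. Real fuzzers are considerably smarter about how they pick inputs.

```javascript
// Hypothetical handler under test: it should answer 400 for bad input
// rather than crash.
function handleRequest(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: JSON.stringify({ error: "invalid JSON" }) };
  }
  if (payload === null || typeof payload !== "object" || typeof payload.name !== "string") {
    return { status: 400, body: JSON.stringify({ error: "name is required" }) };
  }
  return { status: 200, body: JSON.stringify({ greeting: `Hello, ${payload.name}` }) };
}

// Random strings over a JSON-flavoured alphabet, so some inputs get past the parser.
const alphabet = '{}[]":, \\nametrulf0123456789-.';
function randomBody() {
  const length = Math.floor(Math.random() * 30);
  let s = "";
  for (let i = 0; i < length; i++) {
    s += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return s;
}

// The property: whatever the input, the handler "doesn't crash".
function doesNotCrash(rawBody) {
  try {
    const response = handleRequest(rawBody);
    JSON.parse(response.body); // all responses are valid JSON
    return response.status !== 500; // no 500 status code
  } catch {
    return false; // no unexpected exceptions
  }
}

const corpus = ['{"name":"Ada"}', "null", "[]", '{"name":1}', ""];
const inputs = [...corpus, ...Array.from({ length: 500 }, randomBody)];
const crashes = inputs.filter((input) => !doesNotCrash(input));

console.log(crashes.length);
```

Note that this property says nothing about whether the 200 responses are correct. It only checks that the handler fails gracefully.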

Stateful Properties

Complex systems have state!!
- E.g. a key-value database.
With Stateful Property testing, you define a set of possible actions and the framework will try to find sequences of those actions that result in a failure.
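
A rough sketch of the idea, assuming a model-based approach: the same random sequence of actions is applied to the system under test and to a trivially correct model (a Map), and the two are compared after every step. BuggyStore is a hypothetical key-value store with a deliberate bug, and all helper names are made up for this note. Real frameworks also shrink the failing sequence.

```javascript
// Hypothetical system under test: put() appends instead of replacing,
// and delete() removes only one entry, so stale values can resurface.
class BuggyStore {
  constructor() { this.entries = []; }
  put(key, value) { this.entries.push([key, value]); }
  get(key) {
    for (let i = this.entries.length - 1; i >= 0; i--) {
      if (this.entries[i][0] === key) return this.entries[i][1];
    }
    return undefined;
  }
  delete(key) {
    const index = this.entries.findIndex(([k]) => k === key);
    if (index !== -1) this.entries.splice(index, 1);
  }
}

const keys = ["a", "b", "c"];
const pick = (items) => items[Math.floor(Math.random() * items.length)];

// An action is plain data, so a failing sequence can be printed and replayed.
function randomAction() {
  const key = pick(keys);
  return pick([
    { type: "put", key, value: Math.floor(Math.random() * 10) },
    { type: "delete", key },
  ]);
}

// Apply the actions to both the store and the model, comparing every key
// after each step.
function agreesWithModel(actions, createStore) {
  const store = createStore();
  const model = new Map();
  for (const action of actions) {
    if (action.type === "put") {
      store.put(action.key, action.value);
      model.set(action.key, action.value);
    } else {
      store.delete(action.key);
      model.delete(action.key);
    }
    if (keys.some((key) => store.get(key) !== model.get(key))) return false;
  }
  return true;
}

// Search random action sequences for one where store and model disagree.
function findFailingSequence(createStore, runs = 200, length = 20) {
  for (let i = 0; i < runs; i++) {
    const actions = Array.from({ length }, randomAction);
    if (!agreesWithModel(actions, createStore)) return actions;
  }
  return null;
}

const failing = findFailingSequence(() => new BuggyStore());
console.log(failing);
```

The bug only shows up after a put, another put and a delete on the same key, which is the kind of interaction a single-call property would not exercise.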

Misc

Local vs. CI:

A good setup: run only a few iterations during local development and raise the number on CI.
- E.g. 25 iterations on local environments and 500 on CI.
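
One way to wire this up, as a sketch: pick the iteration count from the environment. It assumes the CI provider sets a CI environment variable (many do, but check yours), and PROPERTY_TEST_RUNS is a hypothetical override variable invented for this example.

```javascript
// Decide how many generated inputs each property should be checked against.
function iterationCount(env) {
  // Hypothetical explicit override, e.g. PROPERTY_TEST_RUNS=1000 for a nightly job.
  const override = Number.parseInt(env.PROPERTY_TEST_RUNS ?? "", 10);
  if (Number.isInteger(override) && override > 0) return override;

  // Assumption: the CI provider sets CI to a non-empty value.
  return env.CI ? 500 : 25;
}

const runs = iterationCount(process.env);
console.log(`running each property ${runs} times`);
```

The resulting number is then passed to whatever option the property-testing library uses for the number of runs.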

Unit vs. Property:

They both go well together.
Example-based unit tests are still better for regression tests, since they pin down the exact input that once failed.