calvinsadewa

Property-based testing with StreamData

Recently, I have been (re)implementing a KSUID library in Elixir, called ExKsuid. KSUID is a way to generate identifiers (like UUID) that can be sorted by creation time. I have been using KSUIDs a lot, and I like how, when used as a primary key, they make DB operations such as "find all notifications after timestamp X" really efficient. There is an older KSUID library (elixir-ksuid), but some features I want, such as generating a KSUID from a given timestamp, are not implemented there yet.

While implementing ExKsuid, I remembered that Elixir has StreamData, a library for property-based testing, and I took a fancy to using it in my tests.

To my surprise, StreamData helped me tremendously in finding edge cases during the implementation.

What is Property-based testing

When we think of testing, what usually comes to mind is testing by example: we feed the code an example input and expect the result to match an example output, like:

Describe Find Longest Pair Point of Set

Given set of 2D-point [{1.3, 1.6}, {4.2, 1.0}, {0.3, 0.6}, {2.0, 4.5}, {5.3, 4.3}]
When Find Longest Pair Point of the set
Then return [{0.3, 0.6}, {5.3, 4.3}]

or maybe

Describe Transfer

Given User A with Account Balance 20000,
  AND User B with Account Balance 10000
When User A transfer to User B with amount 5000
Then User A should have Balance 15000,
  AND User B should have Balance 15000

In contrast, property-based testing leverages a property of the tested code or model: it generates a large number of inputs and tries to find one for which the property is broken. A property can be any rule that the code/model is expected to uphold if it is implemented correctly.

For example, if we have a rectangle with height A and width B, then the pair of opposite corners {(0, A), (B, 0)} is at least as far apart as any pair of points taken from inside the rectangle.

Using StreamData, we can generate lists of points inside the rectangle like this (here generating 10 lists):

a = 4 # height: bounds the y coordinate
b = 5 # width: bounds the x coordinate
StreamData.list_of(
  StreamData.tuple({
    StreamData.float(min: 0, max: b),
    StreamData.float(min: 0, max: a)
  })
) |> Enum.take(10)
# result
[
  [],
  [],
  [{2.5, 0.0}],
  [{1.875, 2.0}, {2.5, 4.0}, {1.25, 0.0}, {0.625, 2.0}],
  [{0.0, 4.0}],
  [{3.75, 0.0}, {4.375, 1.5}, {5.0, 1.125}, {1.25, 0.0}],
  [
    {3.046875, 4.0},
    {4.375, 2.3125},
    {2.5, 1.875},
    {0.0, 2.0},
    {0.390625, 4.0},
    {0.46875, 0.5}
  ],
  [
    {0.68359375, 0.21875},
    {3.4375, 4.0},
    {0.78125, 0.0},
    {0.9765625, 2.0},
    {0.0, 4.0},
    {1.25, 1.484375},
    {4.2578125, 2.8125},
    {0.0, 3.5}
  ],
  [{5.0, 3.6875}, {2.5, 0.5}, {4.765625, 3.625}, {2.44140625, 1.25}],
  [
    {0.46875, 2.4765625},
    {3.125, 1.2734375},
    {1.7578125, 2.484375},
    {2.5, 0.46875},
    {0.9375, 2.25}
  ]
]

Then, from this generated test data, we can test whether find_longest_pair_point maintains the property:

defmodule TestFindLongestPair do
  use ExUnit.Case
  use ExUnitProperties

  property "pair corner of a rectangle is longer than all pair of point inside rectangle" do
    check all a <- StreamData.positive_integer(),
              b <- StreamData.positive_integer(),
              inside_points <- StreamData.list_of(
                StreamData.tuple({
                  # x is bounded by the width b, y by the height a
                  StreamData.float(min: 0, max: b),
                  StreamData.float(min: 0, max: a)
                })
              ) do
      # Mix the two corners in with the generated inside points
      points = Enum.shuffle([{0, a}, {b, 0}] ++ inside_points)
      # After sorting, {0, a} comes first because 0 < b
      assert [{x1, y1}, {x2, y2}] = find_longest_pair_point(points) |> Enum.sort()
      assert a == y1
      assert b == x2
      assert x1 == 0
      assert y2 == 0
    end
  end
end
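
The property above calls find_longest_pair_point, which this post does not show. As a point of reference, a brute-force version might look like the sketch below. The module name LongestPairSketch and the squared-distance comparison are assumptions made for illustration, not the implementation that was actually tested.

```elixir
defmodule LongestPairSketch do
  # Hypothetical brute-force reference: check every unordered pair of points
  # and keep the pair with the largest squared Euclidean distance.
  # Expects at least two points; Enum.max_by/2 raises on an empty list of pairs.
  def find_longest_pair_point(points) do
    indexed = Enum.with_index(points)

    pairs =
      for {p, i} <- indexed, {q, j} <- indexed, i < j do
        [p, q]
      end

    Enum.max_by(pairs, fn [{x1, y1}, {x2, y2}] ->
      dx = x1 - x2
      dy = y1 - y2
      dx * dx + dy * dy
    end)
  end
end

# The point set from the example-based test earlier in the post
points = [{1.3, 1.6}, {4.2, 1.0}, {0.3, 0.6}, {2.0, 4.5}, {5.3, 4.3}]
IO.inspect(LongestPairSketch.find_longest_pair_point(points))
# prints [{0.3, 0.6}, {5.3, 4.3}]
```

Being quadratic in the number of points, this is only suitable as a simple oracle to compare a faster implementation against.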

If find_longest_pair_point is not implemented correctly, the property fails and the report lists the generated values that broke it.

We can also derive a property from the Transfer process: a transfer should keep the total account balance the same before and after it.

Describe Transfer Maintain Total Account Balance

Given User A with Account Balance X,
  AND User B with Account Balance Y
When User A transfer to User B with amount Z
Then X + Y == currentBalance(User A) + currentBalance(User B)

However, Transfer is what we usually call a stateful process, since it changes the state of User A and User B. Sadly, StreamData doesn't have much built-in support for testing stateful processes, although some other property-based testing libraries (PropEr, for example) provide state-machine testing.
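
One way to still check the balance rule with StreamData is to model the transfer as a pure function over plain data, so there is no external state to manage. The sketch below assumes a hypothetical TransferSketch.transfer/4 that works on a map of balances. It is not code from ExKsuid or from any real system, and it needs the stream_data dependency available in the test environment of a Mix project.

```elixir
defmodule TransferSketch do
  # Hypothetical pure model of a transfer: balances live in a plain map,
  # so no database or process state is involved.
  def transfer(balances, from, to, amount) do
    balances
    |> Map.update!(from, fn balance -> balance - amount end)
    |> Map.update!(to, fn balance -> balance + amount end)
  end
end

defmodule TransferSketchTest do
  use ExUnit.Case
  use ExUnitProperties

  property "transfer keeps the total balance unchanged" do
    check all x <- StreamData.integer(0..100_000),
              y <- StreamData.integer(0..100_000),
              # the amount never exceeds what User A has
              z <- StreamData.integer(0..x) do
      result = TransferSketch.transfer(%{a: x, b: y}, :a, :b, z)
      assert result.a + result.b == x + y
    end
  end
end
```

This only covers the pure model. A transfer that goes through a database or a process would still need its state set up and torn down around each generated case.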

Example in Real Life

There are two main modules in ExKsuid: ExKsuid (which handles generating and parsing KSUIDs) and ExKsuid.Base62 (which handles encoding and decoding data in Base62 format).

Helping clarify by giving edge cases

The property I would like to test for Base62 is that data which is encoded and then decoded should come back unchanged, so I wrote:

  property "encode and decode binary data return same binary data" do
    check all(data <- StreamData.binary()) do
      assert data == Base62.decode(Base62.encode(data))
    end
  end

but it fails. It turns out the failing binary is <<0, 245>>, which after encoding and decoding comes back as <<245>>. After considering it, I decided to care only about the integer value of the binary data (245 in both cases), because matching the position of leading zeros between the binary form and the Base62 form seemed like a hard problem, so I updated the test to reflect that:

  property "encode and decode binary data return same binary data" do
    check all(data <- StreamData.binary()) do
      x = :binary.decode_unsigned(data)
      y = :binary.decode_unsigned(Base62.decode(Base62.encode(data)))
      assert x == y
    end
  end
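
To see why the leading zero byte disappears, it helps to look at the integer view of the two binaries. The snippet below is a small illustration that assumes the codec treats the whole binary as one big unsigned integer, which is a common way to implement Base62 but not necessarily how ExKsuid.Base62 does it.

```elixir
# Both binaries represent the same unsigned integer.
with_zero = :binary.decode_unsigned(<<0, 245>>)
without_zero = :binary.decode_unsigned(<<245>>)
IO.inspect({with_zero, without_zero})
# prints {245, 245}

# Going back from the integer to bytes cannot recover the dropped zero byte.
IO.inspect(:binary.encode_unsigned(with_zero))
# prints <<245>>
```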

Actually finding bugs in edge cases

On the main ExKsuid module, I test the property "a KSUID generated from a lower timestamp is lower than a KSUID generated from a higher timestamp":

  property "Generated KSUID from earlier time lexicographically smaller than later time" do
    check all(int1 <- StreamData.positive_integer(), int2 <- StreamData.positive_integer()) do
      assert ExKsuid.generate(timestamp: int1 + @epoch) <
        ExKsuid.generate(timestamp: int1 + int2 + @epoch)
    end
  end

and it immediately fails with:

  1) property Generated KSUID from earlier time lexicographically smaller than later time (ExKsuidTest)
     test/ex_ksuid_test.exs:37
     Failed with generated values (after 4 successful runs):

         * Clause:    int1 <- StreamData.positive_integer()
           Generated: 5

         * Clause:    int2 <- StreamData.positive_integer()
           Generated: 3

     Assertion with < failed
     code:  assert ExKsuid.generate(timestamp: int1 + @epoch) < ExKsuid.generate(timestamp: int1 + int2 + @epoch)
     left:  "e3RlO2qT8eNY7rPCfQ8Ku0"
     right: "10OPH1xqiF0vG3XHXcAAFYj"
     stacktrace:
       test/ex_ksuid_test.exs:39: anonymous fn/3 in ExKsuidTest."property Generated KSUID from earlier time lexicographically smaller than later time"/1 

After looking closely at the test data, I realized that a KSUID should have 27 characters, and the problem seemed connected to how Base62 encodes zeros. It turns out that the binary data for a low timestamp has a lot of leading zero bytes:

ExKsuid.generate_raw(timestamp: 1400000008)
#result
<<0, 0, 0, 8, 40, 147, 107, 12, 82, 97, 52, 48, 149, 155, 109, 162, 194, 203,
  98, 182>>

When it is encoded to Base62, those leading zeros are dropped. The fix is simple: pad the encoded string with leading 0 characters until it is 27 characters long, and the test goes green.
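
The effect of the padding can be reproduced with a small standalone encoder. This is a minimal sketch, not ExKsuid's actual code: the module name PaddedBase62Sketch, the alphabet order (digits, uppercase, lowercase) and the two sample binaries are assumptions chosen to mirror the failing case above, with timestamp offsets 5 and 8.

```elixir
defmodule PaddedBase62Sketch do
  # Hypothetical module for illustration, not ExKsuid's actual code.
  # Assumed alphabet: digits, then uppercase, then lowercase, which is also
  # ASCII order, so equal-length strings sort like the integers they encode.
  @alphabet ~c"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
  @ksuid_length 27

  # Plain encoding: leading zero bytes vanish because only the integer value is kept.
  def encode(binary) when is_binary(binary) do
    binary
    |> :binary.decode_unsigned()
    |> Integer.digits(62)
    |> Enum.map(fn digit -> Enum.at(@alphabet, digit) end)
    |> List.to_string()
  end

  # Padded encoding: left-pad with "0" so every result has the same length.
  def encode_padded(binary) do
    binary |> encode() |> String.pad_leading(@ksuid_length, "0")
  end
end

# 4 timestamp bytes followed by 16 payload bytes, like the raw KSUID above
earlier = <<0, 0, 0, 5>> <> :binary.copy(<<255>>, 16)
later = <<0, 0, 0, 8>> <> :binary.copy(<<0>>, 16)

IO.inspect({String.length(PaddedBase62Sketch.encode(earlier)),
            String.length(PaddedBase62Sketch.encode(later))})
# prints {22, 23}

IO.inspect(PaddedBase62Sketch.encode(earlier) < PaddedBase62Sketch.encode(later))
# prints false, because strings of different lengths are compared character by character

IO.inspect(PaddedBase62Sketch.encode_padded(earlier) < PaddedBase62Sketch.encode_padded(later))
# prints true
```

Once every encoded value has the same length, string comparison agrees with integer comparison, which is what the ordering property relies on.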

Conclusion

Does property-based testing help find bugs and give confidence with low effort? Absolutely!

Can it wholly replace example-based tests, though? Not in my case. I still use example tests to cross-check my implementation against an existing reference library, which gives even more confidence.

Confidence and ease are what matter to me in testing, after all.
