Tomasz Wegrzanowski

Posted on Mar 7, 2022

Open Source Adventures: Episode 11: Bit Vectors support for Crystal Z3

#ruby #crystal #z3

Time to add the last common sort - bit vectors. Z3 has some other sorts, like floating point numbers, so you can prove why Quake inverse square root works, as well as arrays, functions, and so on, but they're really complicated, and very rarely used.

Singleton issue

BoolSort, IntSort, and RealSort are all singletons, while BitvecSort(N) is a different sort for each N.

The principled way would be to treat none of them as singletons, and that's how Ruby Z3 does it (so you can do IntSort.new.new(7)), but for now I'll leave their interfaces different.

If I ever wanted to add composite types like arrays and functions, I'd really need to make all sorts use consistent interface.

It's tempting to go full C++ templates and make it a generic type parametrized by N, so BitvecExpr(8) and BitvecExpr(12) are different Crystal types, but that's a bad idea as N is possibly unknown at compile time anyway, and can be very large. And I'm not even sure if Crystal would support that.

Operator issues

Other types could get away with basic operators, bit vectors cannot - signed > and unsigned > are fundamentally different, and we can't really use > for either of these. If Crystal allowed custom operators like >+ and >- or something (having same priority as their base versions) it would be awesome for this use case.

I'd also want two >>+ and >>- for signed and unsigned right shift, and maybe >>> and <<< for rotate right and rotate left.

Most languages don't support adding new operators, but a few do, notably OCaml and Haskell.

OCaml simply couldn't function without additional operators, as it has zero polymorphism, so float addition (2.5 +. 3.5) needs a different operator from int addition (2 + 3).

Haskell is not in quite as dire need as OCaml, but operators are tied to type classes, so you often can't reuse existing operators in new contexts.

It would definitely be a nice-to-have, and I don't think it would cause any issues.

Signed and Unsigned Bitvectors

Computers generally work with bitvectors, or N-bit integers. These can be "unsigned" (like 0 to 255) or "signed" (like 0 to 127, then continuing -128 to -1).

This works so well, as almost every operation, like +, -, * and so on, works exactly identically on both. But a few like / need to know if they're dealing with signed or unsigned bitvectors. Most notably .to_s doesn't know if 0xFF should mean 255 or -1 without knowing signedness.

There are two ways to solve it. The more common way is to make some bitvectors signed and others unsigned, and then / means one thing or the other depending on types of arguments. Nearly every programming language with bitvectors does it this way.

Alternatively, you can have only one type of bitvector, and then any ambiguous operation needs to specify if it wants signed or unsigned version. That's the way Z3 works.

Back with Ruby Z3, I thought about hacking this, and having SigendBitvec and UnsignedBitvec classes (in addition to plain Bitvec for things like results coming out of Z3, which you could then .signed or .unsigned), so all the common operators like / would work, and .to_s would do the smart things. I think this is a viable idea (in both Ruby and Crystal), but it would be a lot of additional complexity.

Should Not Respond

Overall I think this is the biggest issue with Crystal I ran into so far, as there aren't really any sensible workarounds.

With multimethods burried many layers down the call stack, Crystal has more magic that Ruby, so it's really likely that some combinations "work" which really shouldn't.

There's nothing along the lines of:

expect{ anInt + "x" }.to not_compile
expect{ anInt + "x" }.to not_be_a_method
anInt.should_not respond_to(:+, String)

or such.

Spawning new compiler process for every such test assertion isn't really sensible, especially considering Crystal compile times aren't great, so that x100 would be pretty dreadful.

I thought about adding some kind of wildcard Object + Object just in spec helper, and making that raise FakeMethod exception, but I'm not sure that it wouldn't mess up other things somehow.

Now arguably I could just limit that by doing a lot of explicit type annotations, but I really don't want to do that.

`to_s` vs `inspect`

For sake of debugging I added #inspect separate from #to_s. This is especially important with more complex sorts like bit vectors:

it "to_s and inspect" do
  t = Z3::BoolSort[true]
  f = Z3::BoolSort[false]
  t.to_s.should eq("true")
  f.to_s.should eq("false")
  a.to_s.should eq("a")
  t.inspect.should eq("BoolExpr<true>")
  f.inspect.should eq("BoolExpr<false>")
  a.inspect.should eq("BoolExpr<a>")
end

`Bitvec` spec

The implementation isn't all that interesting, so here's just the spec - largely adapted from the Ruby version:

describe Z3::BitvecExpr do
  a = Z3.bitvec("a", 8)
  b = Z3.bitvec("b", 8)
  c = Z3.bitvec("c", 8)
  d = Z3.bitvec("d", 12)
  e = Z3.bitvec("e", 4)
  x = Z3.bool("x")
  bv8 = Z3::BitvecSort.new(8)

  it "==" do
    [a == 2, b == -254, x == (a == b)].should have_solution({x => true})
    [a == 2, b == 2, x == (a == b)].should have_solution({x => true})
    [a == 2, b == 3, x == (a == b)].should have_solution({x => false})
  end

  it "!=" do
    [a == 2, b == -254, x == (a != b)].should have_solution({x => false})
    [a == 2, b == 2, x == (a != b)].should have_solution({x => false})
    [a == 2, b == 3, x == (a != b)].should have_solution({x => true})
  end

  it "+" do
    [a == 2, b == 40, c == (a + b)].should have_solution({c => 42})
    [a == 200, b == 40, c == (a + b)].should have_solution({c => 240})
    [a == -1, b == -1, c == (a + b)].should have_solution({c => 254})
  end

  it "-" do
    [a == 50, b == 8, c == (a - b)].should have_solution({c => 42})
    [a == 200, b == 40, c == (a - b)].should have_solution({c => 160})
    [a == 40, b == 200, c == (a - b)].should have_solution({c => 96})
  end

  it "*" do
    [a == 3, b == 40, c == (a * b)].should have_solution({c => 120})
    [a == 30, b == 42, c == (a * b)].should have_solution({c => 236})
  end

  it "/" do
    expect_raises(Z3::Exception) { a / b }
    [a == 200, b == 20, c == a.unsigned_div(b)].should have_solution({c => 10})
    [a == 200, b == 20, c == a.signed_div(b)].should have_solution({c => 254})
  end

  it "%" do
    expect_raises(Z3::Exception) { a % b }
    [a == 200, b == 20, c == a.signed_mod(b)].should have_solution({c => 4})
    [a == 200, b == 20, c == a.signed_rem(b)].should have_solution({c => 240})
    [a == 200, b == 20, c == a.unsigned_rem(b)].should have_solution({c => 0})
  end

  it "&" do
    [a == 50, b == 27, c == (a & b)].should have_solution({c => 18})
  end

  it "|" do
    [a == 50, b == 27, c == (a | b)].should have_solution({c => 59})
  end

  it "^" do
    [a == 50, b == 27, c == (a ^ b)].should have_solution({c => 41})
  end

  it "xnor" do
    [a == 50, b == 27, c == a.xnor(b)].should have_solution({c => 214})
  end

  it "nand" do
    [a == 50, b == 27, c == a.nand(b)].should have_solution({c => 237})
  end

  it "nor" do
    [a == 50, b == 27, c == a.nor(b)].should have_solution({c => 196})
  end

  it "unary -" do
    [a == 50, b == -a].should have_solution({b => 206})
  end

  # ! can't be overriden in Crystal
  it "~" do
    [a == 50, b == ~a].should have_solution({b => 205})
  end

  it ">> (sign-dependent)" do
    [a == 234, b == 2, c == a.unsigned_rshift(b)].should have_solution({c => 58})
    [a == 234, b == 2, c == a.signed_rshift(b)].should have_solution({c => 250})
    expect_raises(Z3::Exception) { a.rshift(b) }
    expect_raises(Z3::Exception) { a >> b }
  end

  # There's only one way to shift left so these are all aliases
  # but it would be confusing API to have different names for left and right shifts
  it "<< (sign-independent)" do
    [a == 234, b == 2, c == a.signed_lshift(b)].should have_solution({c => 168})
    [a == 234, b == 2, c == a.unsigned_lshift(b)].should have_solution({c => 168})
    [a == 234, b == 2, c == a.lshift(b)].should have_solution({c => 168})
    [a == 234, b == 2, c == (a << b)].should have_solution({c => 168})
  end

  it ">" do
    # expect_raises(Z3::Exception) { a > b }
    [a == 100, b ==  20, x == a.unsigned_gt(b)].should have_solution({x => true})
    [a == 100, b == 100, x == a.unsigned_gt(b)].should have_solution({x => false})
    [a == 100, b == 120, x == a.unsigned_gt(b)].should have_solution({x => false})
    [a == 100, b == 200, x == a.unsigned_gt(b)].should have_solution({x => false})
    [a == 100, b ==  20, x == a.signed_gt(b)].should have_solution({x => true})
    [a == 100, b == 100, x == a.signed_gt(b)].should have_solution({x => false})
    [a == 100, b == 120, x == a.signed_gt(b)].should have_solution({x => false})
    [a == 100, b == 200, x == a.signed_gt(b)].should have_solution({x => true})
  end

  it ">=" do
    # expect_raises(Z3::Exception) { a >= b }
    [a == 100, b ==  20, x == a.unsigned_ge(b)].should have_solution({x => true})
    [a == 100, b == 100, x == a.unsigned_ge(b)].should have_solution({x => true})
    [a == 100, b == 120, x == a.unsigned_ge(b)].should have_solution({x => false})
    [a == 100, b == 200, x == a.unsigned_ge(b)].should have_solution({x => false})
    [a == 100, b ==  20, x == a.signed_ge(b)].should have_solution({x => true})
    [a == 100, b == 100, x == a.signed_ge(b)].should have_solution({x => true})
    [a == 100, b == 120, x == a.signed_ge(b)].should have_solution({x => false})
    [a == 100, b == 200, x == a.signed_ge(b)].should have_solution({x => true})
  end

  it "<" do
    # expect_raises(Z3::Exception) { a < b }
    [a == 100, b ==  20, x == a.unsigned_lt(b)].should have_solution({x => false})
    [a == 100, b == 100, x == a.unsigned_lt(b)].should have_solution({x => false})
    [a == 100, b == 120, x == a.unsigned_lt(b)].should have_solution({x =>  true})
    [a == 100, b == 200, x == a.unsigned_lt(b)].should have_solution({x =>  true})
    [a == 100, b ==  20, x == a.signed_lt(b)].should have_solution({x => false})
    [a == 100, b == 100, x == a.signed_lt(b)].should have_solution({x => false})
    [a == 100, b == 120, x == a.signed_lt(b)].should have_solution({x =>  true})
    [a == 100, b == 200, x == a.signed_lt(b)].should have_solution({x => false})
  end

  it "<=" do
    # expect_raises(Z3::Exception) { a <= b }
    [a == 100, b ==  20, x == a.unsigned_le(b)].should have_solution({x => false})
    [a == 100, b == 100, x == a.unsigned_le(b)].should have_solution({x =>  true})
    [a == 100, b == 120, x == a.unsigned_le(b)].should have_solution({x =>  true})
    [a == 100, b == 200, x == a.unsigned_le(b)].should have_solution({x =>  true})
    [a == 100, b ==  20, x == a.signed_le(b)].should have_solution({x => false})
    [a == 100, b == 100, x == a.signed_le(b)].should have_solution({x =>  true})
    [a == 100, b == 120, x == a.signed_le(b)].should have_solution({x =>  true})
    [a == 100, b == 200, x == a.signed_le(b)].should have_solution({x => false})
  end

  it "zero_ext / sign_ext" do
    [a ==  100, d ==  a.zero_ext(4)].should have_solution({d => 100})
    [a == -100, d ==  a.zero_ext(4)].should have_solution({d => 2**8-100})
    [a ==  100, d ==  a.sign_ext(4)].should have_solution({d => 100})
    [a == -100, d ==  a.sign_ext(4)].should have_solution({d => 2**12-100})
  end

  it "rotate_left / rotate_right" do
    [a == 0b0101_0110, b == a.rotate_left(1)].should have_solution({b => 0b101_0110_0})
    [a == 0b0101_0110, b == a.rotate_left(4)].should have_solution({b => 0b0110_0101})
    [a == 0b0101_0110, b == a.rotate_right(1)].should have_solution({b => 0b0_0101_011})
    [a == 0b0101_0110, b == a.rotate_right(4)].should have_solution({b => 0b0110_0101})
  end

  # This (hi, lo) API from Z3 feels bad in Crystal
  # we should probably drop it and accept Range instead
  it "extract" do
    [a == 0b0101_0110, e == a.extract(3, 0)].should have_solution({e => 0b0110})
    [a == 0b0101_0110, e == a.extract(7, 4)].should have_solution({e => 0b0101})
    expect_raises(Z3::Exception) { a.extract(8, 4) }
    expect_raises(Z3::Exception) { a.extract(2, 3) }
    # expect_raises(Z3::Exception) { a.extract(2, -1) }
  end

  it "concat" do
    [a == 0b0101_0110, e == 0b1101, d == a.concat(e)].should have_solution({d => 0b0101_0110_1101})
    [a == 0b0101_0110, e == 0b1101, d == e.concat(a)].should have_solution({d => 0b1101_0101_0110})
  end

  it "simplify" do
    u = bv8[100]
    v = bv8[50]
    ((u+v).to_s).should eq("(bvadd #x64 #x32)")
    ((u+v).simplify.to_s).should eq("150")
  end

  it "to_s and inspect" do
    u = bv8[5]
    v = bv8[-3]
    u.to_s.should eq "5"
    v.to_s.should eq "253"
    a.to_s.should eq "a"
    u.inspect.should eq "BitvecExpr(8)<5>"
    v.inspect.should eq "BitvecExpr(8)<253>"
    a.inspect.should eq "BitvecExpr(8)<a>"
  end

  it "const?" do
    bv8[100].const?.should be_true
    bv8[-100].const?.should be_true
    (bv8[100] + bv8[50]).const?.should be_false
    a.const?.should be_false
    (a + b).const?.should be_false
  end

  it "to_i" do
    bv8[5].to_i.should eq(5)
    bv8[-10].to_i.should eq(246)
    (bv8[2] + bv8[2]).to_i.should eq(4)
    expect_raises(Z3::Exception){ a.to_i }
    expect_raises(Z3::Exception){ (a + b).to_i }
  end
end

Story so far

All the episode code is in this repo.

Coming next

There's a ton of functionality we could add, like notably:

install Z3 error handler to turn Z3 errors into Crystal exceptions, instead of "print and exit program" default handler
better to_s, as current one just falls back to S-Expressions unless AST is a constant
I'm sure half the safe conversions are missing, and there's probably some questionable ones allowed
overall only approximately 10% of Z3 APIs are implemented
I could port some examples from Ruby Z3 to Crystal Z3, that could validate that there isn't any major missing functionality in the Solver/Model/etc. APIs

But I think it's time to package what's done so far into something others can use.

DEV Community

Open Source Adventures: Episode 11: Bit Vectors support for Crystal Z3

Singleton issue

Operator issues

Signed and Unsigned Bitvectors

Should Not Respond

`to_s` vs `inspect`

`Bitvec` spec

Story so far

Coming next

Top comments (0)

Singleton issue

Operator issues

Signed and Unsigned Bitvectors

Should Not Respond

to_s vs inspect

Bitvec spec

Story so far

Coming next

`to_s` vs `inspect`

`Bitvec` spec