DEV Community

Jeff Lindsay

Parsing bridgesupport schema files

Tuesday I started on the potentially arduous effort of generating a Go API for AppKit, the Apple framework for building apps. With the core bridge working, I want to have native bindings for all the Apple APIs. There are a lot of them!

Luckily, as I mentioned a while back, every framework ships with a gigantic XML file describing every part of its API. These files exist specifically for generating bindings and/or header files. So the first step is to parse them into Go structures.

Although it's not super well documented, Go has amazing support for parsing XML into structs. You just lay out the data types and map them to elements and attributes using struct tags. There is some documentation on the structure of these XML files, but I also looked through a few of the files themselves to figure out the data model.
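To make the struct tag mapping concrete, here is a minimal sketch using the standard encoding/xml package. The struct names (Signatures, Class, Method, Arg and so on) are my own guesses for illustration, they cover only a small subset of the elements, and the sample XML is a hand-written fragment rather than something copied from a real .bridgesupport file.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Signatures models the root element. Only a few child elements are mapped;
// anything not listed here is silently skipped by encoding/xml.
type Signatures struct {
	XMLName   xml.Name   `xml:"signatures"`
	Version   string     `xml:"version,attr"`
	Constants []Constant `xml:"constant"`
	Enums     []Enum     `xml:"enum"`
	Classes   []Class    `xml:"class"`
}

type Constant struct {
	Name string `xml:"name,attr"`
	Type string `xml:"type,attr"` // still the raw encoded type string here
}

type Enum struct {
	Name  string `xml:"name,attr"`
	Value string `xml:"value,attr"`
}

type Class struct {
	Name    string   `xml:"name,attr"`
	Methods []Method `xml:"method"`
}

type Method struct {
	Selector string `xml:"selector,attr"`
	Args     []Arg  `xml:"arg"`
	RetVal   *Arg   `xml:"retval"` // nil when the element is absent
}

type Arg struct {
	Index int    `xml:"index,attr"`
	Type  string `xml:"type,attr"`
}

// sample is a hand-written fragment for illustration only.
const sample = `<signatures version="1.0">
  <constant name="NSAppKitVersionNumber" type="d"/>
  <enum name="NSAlertFirstButtonReturn" value="1000"/>
  <class name="NSWindow">
    <method selector="setTitle:">
      <arg index="0" type="@"/>
      <retval type="v"/>
    </method>
  </class>
</signatures>`

// parseSignatures unmarshals a whole document into the structs above.
func parseSignatures(data []byte) (*Signatures, error) {
	var s Signatures
	if err := xml.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

func main() {
	s, err := parseSignatures([]byte(sample))
	if err != nil {
		panic(err)
	}
	m := s.Classes[0].Methods[0]
	fmt.Println(s.Classes[0].Name, m.Selector, m.Args[0].Type, m.RetVal.Type)
}
```

Repeated child elements map onto slices, attributes use the ",attr" tag option, and optional elements work well as pointers so that absence shows up as nil.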

There was one twist. When the schema needs to describe the data type of a value or variable, the type is encoded into a strange string. Luckily these are... mostly documented. I had to poke around to find out why methods on informal protocols have some extra numbers, which it turns out I can ignore, and there are a few examples where I just don't know what they are.

On top of this, the encoding is not uniform: most types are represented as single characters, but a few like bitfield masks and pointers carry extra information, and compound types like structs and arrays are of course more complicated structures. It took me a while to figure out how I should parse these. I ended up using bufio.Scanner, a sort of programmable tokenizer you can use to relatively easily throw together a lexer. After a bunch of experimentation I got a system that seems to work and that I can extend and customize as I run into specific scenarios. I use this to create a TypeInfo struct that has a more friendly representation of the type data.
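For a rough idea of what a bufio.Scanner based lexer can look like, here is a sketch built around a custom bufio.SplitFunc. This is not my actual implementation; the function names splitEncoding and tokenize and the token rules are assumptions made for illustration. It emits runs of digits as one token, quoted field names as one token, the opening of a struct or union together with its name, and every other character on its own.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"strings"
)

// splitEncoding is a bufio.SplitFunc that cuts a type encoding into tokens.
func splitEncoding(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if len(data) == 0 {
		return 0, nil, nil
	}
	switch c := data[0]; {
	case c >= '0' && c <= '9':
		// Numbers: array lengths, bitfield widths, or stack offsets.
		i := 1
		for i < len(data) && data[i] >= '0' && data[i] <= '9' {
			i++
		}
		if i == len(data) && !atEOF {
			return 0, nil, nil // the number may continue in the next read
		}
		return i, data[:i], nil
	case c == '"':
		// Quoted field names inside structs, for example "x".
		end := bytes.IndexByte(data[1:], '"')
		if end < 0 {
			if atEOF {
				return 0, nil, errors.New("unterminated field name")
			}
			return 0, nil, nil
		}
		return end + 2, data[:end+2], nil
	case c == '{' || c == '(':
		// Struct or union opener plus its name, up to and including '='.
		// Opaque types such as {name} have no '=', so stop before the closer.
		end := bytes.IndexAny(data, "=})")
		if end < 0 {
			if atEOF {
				return 0, nil, errors.New("unterminated struct or union")
			}
			return 0, nil, nil
		}
		if data[end] == '=' {
			end++
		}
		return end, data[:end], nil
	default:
		// Everything else is a single-character token.
		return 1, data[:1], nil
	}
}

// tokenize runs the scanner over one encoded type string.
func tokenize(enc string) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(enc))
	sc.Split(splitEncoding)
	var toks []string
	for sc.Scan() {
		toks = append(toks, sc.Text())
	}
	return toks, sc.Err()
}

func main() {
	for _, enc := range []string{`{_NSPoint="x"f"y"f}`, `^v`, `[12i]`, `v12@0:4@8`} {
		toks, err := tokenize(enc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-22s %q\n", enc, toks)
	}
}
```

With these rules an informal protocol signature like v12@0:4@8 splits into v, 12, @, 0, :, 4, @, 8, so a parser sitting on top of the token stream can drop number tokens that do not follow an array opener or a bitfield marker.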

The best part is, I can have the XML parser automatically unmarshal into the TypeInfo type for those encoded fields. There aren't many examples of this, but it can be done and it works great. At this point I can parse a couple of these XML files without error, and although there are plenty of holes, this is enough to start generating some Go bindings. Next week!
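The hook that makes this possible in encoding/xml is the xml.UnmarshalerAttr interface: any type with an UnmarshalXMLAttr(attr xml.Attr) error method gets to decode its own attribute value. The sketch below uses a hypothetical, heavily simplified TypeInfo (the fields Raw, PointerDepth and Kind are invented for this example) that only understands pointers and a handful of scalar codes; a fuller version would call into a real lexer instead.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// TypeInfo is a simplified stand-in for illustration, not the real struct.
type TypeInfo struct {
	Raw          string // the original encoded string
	PointerDepth int    // number of leading '^' markers
	Kind         string // friendly name of the base type, or "unknown"
}

// scalarKinds covers a few Objective-C type encoding characters.
var scalarKinds = map[byte]string{
	'v': "void",
	'c': "char",
	'i': "int",
	'q': "long long",
	'f': "float",
	'd': "double",
	'B': "bool",
	'@': "object",
	'#': "class",
	':': "selector",
	'*': "C string",
}

// UnmarshalXMLAttr implements xml.UnmarshalerAttr, so encoding/xml calls it
// for every attribute mapped to a TypeInfo field.
func (t *TypeInfo) UnmarshalXMLAttr(attr xml.Attr) error {
	t.Raw = attr.Value
	base := strings.TrimLeft(attr.Value, "^")
	t.PointerDepth = len(attr.Value) - len(base)
	t.Kind = "unknown" // structs, arrays, etc. are not handled in this sketch
	if len(base) > 0 {
		if k, ok := scalarKinds[base[0]]; ok {
			t.Kind = k
		}
	}
	return nil
}

// Arg uses TypeInfo directly as the type of an attribute field.
type Arg struct {
	Index int      `xml:"index,attr"`
	Type  TypeInfo `xml:"type,attr"`
}

// parseArg decodes a single <arg> element.
func parseArg(doc string) (Arg, error) {
	var a Arg
	err := xml.Unmarshal([]byte(doc), &a)
	return a, err
}

func main() {
	a, err := parseArg(`<arg index="1" type="^d"/>`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", a)
}
```

The nice property is that the rest of the schema structs stay purely declarative: the field is typed as TypeInfo, tagged as an attribute, and the decoding happens during the normal xml.Unmarshal call.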

Top comments (2)

Yoandy Rodriguez Martinez

[From your robot overlords 🤖]: We are very pleased with this post. We would also like to know if there's a GitHub repo for us to learn from and hopefully collaborate on your code.
