One of the advantages of RakuAST is that you can build executable code in an object oriented style. One of the areas that has had the attention of yours truly in that respect, is handling of
Historically the C programming language provided the first implementation of
The printf family of functions in the C programming language are a set of functions that take a format string as input among a variable sized list of other values and produce as output a string that corresponds to the format specifier and given input values.
Of course, the Raku Programming Language also has an implementation of
printf, with a set of supported options. However the Rakudo implementation of the Raku Programming Language does not have an implementation of
printf formats of its own, but borrows it from
NQP. Which has its pros and its cons.
The pros are of course being that NQP in general is much faster than Raku (as NQP is sort of the "assembly language" of Rakudo). However, the NQP implementation of
sprintf is written using a grammar (written in NQP). And each time a call is made to this
sprintf functionality (e.g, when printing a line in a report using
sprintf), the whole grammar is run again, with the associated actions building the string to be returned. And running a grammar is costly in terms of memory and CPU.
This makes Rakudo's
sprintf and associated functions (such as
.fmt) one of the slowest features of Raku in Rakudo. Surely with RakuAST there must be a better way to do this?
Almost two years ago, yours truly started on the RakuAST journey. First by writing tests, later trying out RakuAST features. And one of the first things attempted, was to port the original NQP grammar for
sprintf to Raku. And create a separate set of actions, that would not produce a string, but instead would produce an AST.
Such an AST can then be converted to executable code, and then run whenever a set of arguments would need to be converted to a string. So instead of needing to run the same grammar over and over again, it would be possible to just call some (static) code, which then, like any other piece of code in Raku, would get run-time optimized for better performance.
This then became the
Formatter class. And since this was a completely new feature, it only became available for use by opting into the
6.e.PREVIEW language version. And then it went largely unnoticed and uncared for the next 1.5 year. As clearly the time wasn't right for it yet.
One of the obstacles that was encountered while working on
Formatter, was that there were no comprehensive tests for
sprintf functionality for Raku. Yes, there was a test-file that tested some features, but no comprehensive tests for all possible combinations of format features, combined with possible values that they should operate upon.
This became one of the first things that needed to be done, to be able to say the new implementation would be matching the old. During the development of these tests, it became clear there were some inconsistencies in the existing implementation, and worse: outright bugs.
So the question became: should the new implementation follow the behaviour of the old implementation to the letter, or not?
This then became part of discussions at the first Raku Core Summit. There it was decided that the new implementation should not adhere to the old one, because the new
sprintf functionality would require a language version bump anyway. And that the current
sprintf tests should be frozen for language level
6.d. And so it was done.
Another decision that was made, was that the functionality offered by the
Formatter class (which converts a format string to a
Callable), should be embedded into a
Format class. Which should normally act as a string, but when used as a
Callable should perform the processing according to the format given. This has now also been implemented. Which means you can now do:
my $f = Format.new("%5s");
dd $f; # Format.new("%5s")
say $f; # %5s
dd $f("foo"); # " foo"
say "'$f'"; # '%5s'
say "'$f("foo")'"; # ' foo'
As you can see, the
Format object acts very much so as a string. Which isn't so strange when you realize the method resolution order of
say Format.^mro; # ((Format) (Str) (Cool) (Any) (Mu))
Only when used as a
Callable (by putting
() after it with arguments), will it show its special functionality.
At the RCS it was suggested that the
Format.new("%5s") would be too clunky in common usage. More seriously it would prevent compile time creation of the
Format object because the lookup of methods are a runtime action in Raku (generally). A way to get around that, would be to introduce a new string quoting construct / adverb / processor. This became the
"format" adverb (or
"o" for the short version):
use v6.e.PREVIEW; # RAKUDO_RAKUAST=1 also required for now
my $format = q:format/%5s/;
say "'$format("foo")'"; # ' foo'
or more directly and shorter using the "o" alternate adverb:
dd q:o/%5s/("foo"); # " foo"
"o" for the short version? Because
"f" was already taken to enable/disable *f*unction interpolation. And
q:o felt visually closer to
fo than anything else.
So what is the performance of this new way of formatting values into a string?
Without any optimization of the RakuAST version yet, upto a 30x speed increase has been reported. So that's quite an improvement. And potentially better than many other programming languages that also depend on some kind of runtime state-machine to create the string, rather than depending on executable code.
A new way of creating strings from a given set of values and format string (in other words
printf functionality) has been implemented using RakuAST in the Raku Programming Language, making it up to 30x as fast.
This functionality will be available from release 2023.06 of Rakudo and selecting the
6.e language level by specifying
use v6.e.PREVIEW. The new quote adverb will for now only be available when compiled with the new RakuAST grammar, which can be activated by specifying the
RAKUDO_RAKUAST environment variable.