It's time for the next installment in the Create Your Own Programming Language series. This time we're going to add conditionals to the Wanda language.
We're actually going to add 3 different kinds of conditionals: if
, cond
, and when
.
if
is a standard if expression, with 2 branches based on whether a test comes back truthy or falsy.
cond
lets you chain together multiple conditions and expressions, with an :else
branch at the end in case nothing matches. The :else
branch is always required. cond
expressions desugar to a chain of if
expressions.
A cond
expression looks like this. Note that each branch is enclosed within its own set of parentheses:
(cond
((= x 0) "zero")
((< x 10) "less than 10")
((< x 0) "less than zero")
(:else "10 or greater"))
when
is imperative and lets you execute a block of code if the test is truthy. It evaluates to nil
either way, so you wouldn't want to use it to assign a toplevel variable. It's more of an imperative construct, ideally for containing side effects.
When also allows you to nest any number of expressions inside it, whereas each branch of an if
expression can only take a single expression. And since when
doesn't have an else branch, you can use it in cases where you don't need an alternate execution path if the main one isn't taken.
Here's an example of a when
expression:
(when (= 1 1)
(println "Wow, one equals one?")
(println "Who woulda thunk it???"))
We're also going to add a new form, the LogicalExpression
. This is for expressions with the and
and or
forms.
Some Lisps will let you use any number of arguments to the and
and or
forms, but we're going to restrict users to 2 arguments for each form to simplify things.
Finally, we'll add 2 new node types to help the type checker that will both desugar to call expressions: BinaryExpression
which will include the equal?
and not-equal?
operations, and UnaryExpression
which is only used with not
.
These typechecker-only nodes will help with a process known as type narrowing, which allows the type checker to better understand things like which arm of a union to check against in branches of a conditional.
As always, if you haven't read the previous article in which we added several new types to the type checker, you should do that before reading ahead.
Ok? Let's go!
Changes to The Lexer and Reader
None! Both are fine just the way they are for today.
Changes to The Core Library
We need to add a not-equal?
function to the core library to match up with the not-equal?
binary expression.
Ordinarily in a Lisp when you want to check if a value is not equal you'd just prefix equal?
with not
, as in (not (equal? x y))
, but to give the type checker some more help with narrowing we're adding this additional form.
In lib/js/core.js
go down to the equal?
function and add this between it and is?
:
"not-equal?": rt.makeFunction(
(a, b) => {
if (rt.hasDict(a) && rt.hasDict(b)) {
return !equal(rt.getMetaField(a, "dict"), rt.getMetaField(b, "dict"));
}
return !equal(a, b);
},
{ contract: "(any, any) -> boolean", name: "not-equal?" }
),
That's it for changes to the core library, so now we can move on to changes to the parser.
Changes to The Parser
We need to define new AST node types and constructors for those nodes.
In src/parser/ast.js
add the following to the ASTTypes
enum:
export const ASTTypes = {
// other members...
IfExpression: "IfExpression",
CondExpression: "CondExpression",
WhenExpression: "WhenExpression",
BinaryExpression: "BinaryExpression",
LogicalExpression: "LogicalExpression",
UnaryExpression: "UnaryExpression",
};
And in the AST
object add the following node constructors:
export const AST = {
// other constructors...
IfExpression(test, then, elseBranch, srcloc) {
return {
kind: ASTTypes.IfExpression,
test,
then,
else: elseBranch,
srcloc,
};
},
CondExpression(clauses, elseBranch, srcloc) {
return {
kind: ASTTypes.CondExpression,
clauses,
else: elseBranch,
srcloc,
};
},
WhenExpression(test, body, srcloc) {
return {
kind: ASTTypes.WhenExpression,
test,
body,
srcloc,
};
},
BinaryExpression(left, op, right, srcloc) {
return {
kind: ASTTypes.BinaryExpression,
left,
op,
right,
srcloc,
};
},
LogicalExpression(left, op, right, srcloc) {
return {
kind: ASTTypes.LogicalExpression,
left,
op,
right,
srcloc,
};
},
UnaryExpression(op, operand, srcloc) {
return {
kind: ASTTypes.UnaryExpression,
op,
operand,
srcloc,
};
},
}
We'll need 6 new parse functions in src/parser/parse.js
:
parseIfExpression
parseCondExpression
parseWhenExpression
parseBinaryExpression
parseLogicalExpression
parseUnaryExpression
Add the following cases to the switch
statement in parseList
:
// other cases
case "if":
return parseIfExpression(form);
case "cond":
return parseCondExpression(form);
case "when":
return parseWhenExpression(form);
case "equal?":
case "not-equal?":
return parseBinaryExpression(form);
case "and":
case "or":
return parseLogicalExpression(form);
case "not":
return parseUnaryExpression(form);
// default case
parseIfExpression
is pretty simple. Just grab the test
, then
, and elseBranch
forms off the main form, parse each form, and pass them to the constructor.
const parseIfExpression = (form) => {
const [_, test, then, elseBranch] = form;
const srcloc = form.srcloc;
const parsedTest = parseExpr(test);
const parsedThen = parseExpr(then);
const parsedElse = parseExpr(elseBranch);
return AST.IfExpression(parsedTest, parsedThen, parsedElse, srcloc);
};
parseCondExpression
is a little more complicated. We have to grab all the clauses off the main form, parse each one individually, and then when we get to the else branch handle that one separately since it gets its own property on the AST node object. Also, since the else branch is required, we throw an error if there isn't one.
const parseCondExpression = (form) => {
const [_, ...clauses] = form;
const srcloc = form.srcloc;
/** @type {import("./ast.js").CondClause[]} */
const parsedClauses = [];
let hasElse = false;
if (!(clauses.length >= 2)) {
throw new Exception(
`Cond expression must have at least 2 clauses; ${clauses.length} given at ${srcloc.file}, (${srcloc.line}:${srcloc.col})`
);
}
for (let clause of clauses) {
const [test, expr] = clause;
if (test.type === TokenTypes.Keyword && test.value === ":else") {
hasElse = true;
break;
}
const parsedTest = parseExpr(test);
const parsedExpr = parseExpr(expr);
parsedClauses.push({ test: parsedTest, expression: parsedExpr });
}
if (!hasElse) {
throw new Exception(
`Cond expression expects an :else clause; none given at ${srcloc.file}, (${srcloc.line}:${srcloc.col})`
);
}
/** @type {Cons} */
const elseClause = clauses[clauses.length - 1];
// we need the first element of the tail of the else clause
const parsedElse = parseExpr(elseClause.cdr.car);
return AST.CondExpression(parsedClauses, parsedElse, srcloc);
};
The other 4 parse functions, parseWhen
, parseBinaryExpression
, parseLogicalExpression
, and parseUnaryExpression
, are all simple:
const parseWhenExpression = (form) => {
const [_, test, ...body] = form;
const srcloc = form.srcloc;
const parsedTest = parseExpr(test);
const parsedBody = body.map(parseExpr);
return AST.WhenExpression(parsedTest, parsedBody, srcloc);
};
const parseBinaryExpression = (form) => {
const [op, left, right] = form;
const srcloc = form.srcloc;
const parsedLeft = parseExpr(left);
const parsedRight = parseExpr(right);
return AST.BinaryExpression(parsedLeft, op.value, parsedRight, srcloc);
};
const parseLogicalExpression = (form) => {
const [op, left, right] = form;
const srcloc = form.srcloc;
const parsedLeft = parseExpr(left);
const parsedRight = parseExpr(right);
return AST.LogicalExpression(parsedLeft, op.value, parsedRight, srcloc);
};
const parseUnaryExpression = (form) => {
const [op, operand] = form;
const srcloc = form.srcloc;
const parsedOperand = parseExpr(operand);
return AST.UnaryExpression(op.value, parsedOperand, srcloc);
};
Changes to The Type Checker
Obviously we need to infer and check types for the new nodes, but the most significant change we'll be implementing in the type checker today is the addition of type narrowing.
Type narrowing allows the type checker to check certain expressions as if a certain condition were true or false, which allows it to make deductions about types it otherwise would not be able to make.
For example, let's say we have a discriminated union type:
(type Coord
{ type: "cartesian", x: number, y: number }
|| { type: "polar", radius: number, magnitude: number })
(def (point: Coord) { type: "cartesian", x: 5, y: 2.5 })
Given this when
expression:
(when (equal? point.type "cartesian")
; assume point is the bottom-right point of
; a square that begins at the origin (0, 0)
(def area (* point.x point.y))
; continue
Without type narrowing, the above when
expression would have failed to type check because due to the union type we couldn't be certain the object would have x
and y
properties. Thanks to narrowing, since we know type
is "cartesian" we know which arm of the union the object satisfies.
As Jake Donham says in his article about type narrowing, it can be helpful to think of the type environment as a logical statement that asserts facts about the types of values. Narrowing allows us to supplement the known facts with assumptions guaranteed to be correct if the right conditions are met.
Adding The Negation Type
To implement narrowing we're going to add a type that is only used in the type checker: the negation type, or not
. The not
type is only used in the type checker, it can't be annotated, and in fact it is only used in type narrowing and it's limited just to that functionality so we don't have to rewrite the entire checker just to account for it.
Let's add it in src/typechecker/types.js
to the TypeTypes
enum:
export const TypeTypes = {
// other members...
Not: "Not",
};
We'll also need a constructor in src/typechecker/constructors.js
:
export const not = (base) => ({ kind: TypeTypes.Not, base });
and a validator in src/typechecker/validators.js
:
export const isNot = (type) => {
return type.kind === TypeTypes.Not;
};
We'll also need some utility functions and types for checking truthiness and falsiness of a value, when it can be known at compile time. We'll create those in src/typechecker/utils.js
:
export const isTruthy = (type) => {
switch (type.kind) {
case TypeTypes.Nil:
return false;
case TypeTypes.Singleton:
if (type.value === "false") {
return false;
}
default:
if (type.kind !== TypeTypes.Boolean) {
return true;
}
}
};
export const isFalsy = (type) => {
switch (type.kind) {
case TypeTypes.Nil:
return true;
case TypeTypes.Singleton:
if (type.value === "false") {
return true;
}
default:
if (type.kind !== TypeTypes.Boolean) {
return false;
}
}
};
export const falsy = union(singleton("Boolean", "false"), nil);
export const truthy = intersection(
not(singleton("Boolean", "false")),
not(nil)
);
Remember that in Wanda all values except false
and nil
are truthy.
Let's add our utils to the Type
module in src/typechecker/Type.js
:
import { TypeTypes } from "./types.js";
import * as Validators from "./validators.js";
import * as Constructors from "./constructors.js";
import { typeToString } from "./typeToString.js";
import { union } from "./union.js";
import { map } from "./map.js";
import { intersection } from "./intersection.js";
import { isTruthy, isFalsy, truthy, falsy } from "./utils.js";
export const Type = {
Type: TypeTypes,
...Constructors,
...Validators,
toString: typeToString,
union,
map,
intersection,
isTruthy,
isFalsy,
truthy,
falsy,
};
Just a quick note: I had a weird circular reference bug that popped up sometimes in Node when trying to use the type checker. The solution turned out to be removing the import of the Type
module into src/typechecker/isSubtype.js
and instead importing each of the type validators used in isSubtype
one at a time and removing the Type.
prefix for validators within isSubtype
. If you have errors like "Cannot use Type before its initialization" after adding the rest of the code in this section that's how you fix them.
Narrowing
Now let's actually implement the narrowing functions. It's actually a lot of functions! But the thing most of them have in common is they restrict the applications of a type within an environment and return that modified environment to be used by the caller.
For a fuller explanation of what I'm about to show you, see Donham's post about type narrowing.
First is the narrow
function.
When we type check a branch of a conditional, we call narrow
with the test expression, the current environment, and a flag according to whether we're assuming the test to be true or false. It returns a new environment incorporating the information of the test and assumption.
When we deduce that a subexpression must satisfy a type, we call narrowUnary
, narrowLogical
, narrowBinary
, or narrowPath
to incorporate the appropriate expression type into the environment. When we reach a variable we call narrowType
, which performs an operation similar to subtype checking between the current type from the environment and the deduced type in order to update the variable's type with the result.
Here are the narrowing functions, including the function widenNots
that keeps not
types from being returned in the environments produced by narrowing, in src/typechecker/narrow.js
:
import { AST, ASTTypes } from "../parser/ast.js";
import { Exception, TypeException } from "../shared/exceptions.js";
import { Type } from "./Type.js";
import { TypeEnvironment } from "./TypeEnvironment.js";
import { infer } from "./infer.js";
import { isSubtype } from "./isSubtype.js";
import { propType } from "./propType.js";
import { TypeTypes } from "./types.js";
export const narrow = (ast, env, assume) => {
switch (ast.kind) {
case ASTTypes.UnaryExpression:
return narrowUnary(ast, env, assume);
case ASTTypes.LogicalExpression:
return narrowLogical(ast, env, assume);
case ASTTypes.BinaryExpression:
return narrowBinary(ast, env, assume);
default:
return narrowPath(ast, env, assume ? Type.truthy : Type.falsy);
}
};
const narrowUnary = (ast, env, assume) => {
switch (ast.op) {
case "not":
return narrow(ast.operand, env, !assume);
default:
throw new Exception(`Unknown unary operator ${ast.op}`);
}
};
const narrowLogical = (ast, env, assume) => {
switch (ast.op) {
case "and":
if (assume) {
env = narrow(ast.left, env, true);
return narrow(ast.right, env, true);
} else {
if (Type.isTruthy(infer(ast.left, env))) {
return narrow(ast.right, env, false);
} else if (Type.isTruthy(infer(ast.right, env))) {
return narrow(ast.left, env, false);
} else {
return env;
}
}
case "or":
if (!assume) {
env = narrow(ast.left, env, false);
return narrow(ast.right, env, false);
} else {
if (Type.isFalsy(infer(ast.left, env))) {
return narrow(ast.right, env, true);
} else if (Type.isFalsy(infer(ast.right, env))) {
return narrow(ast.left, env, true);
} else {
return env;
}
}
default:
throw new Exception(`Unknown logical operator ${ast.op}`);
}
};
const narrowBinary = (ast, env, assume) => {
const left = infer(ast.left, env);
const right = infer(ast.right, env);
if ((ast.op === "equal?" && assume) || (ast.op === "not-equal?" && !assume)) {
env = narrowPath(ast.left, env, right);
return narrowPath(ast.right, env, left);
} else if (
(ast.op === "not-equal?" && assume) ||
(ast.op === "equal?" && !assume)
) {
if (Type.isSingleton(right)) {
env = narrowPath(ast.left, env, Type.not(right));
}
if (Type.isSingleton(left)) {
env = narrowPath(ast.right, env, Type.not(left));
}
return env;
} else {
return env;
}
};
const narrowPath = (ast, env, type) => {
switch (ast.kind) {
case ASTTypes.Symbol:
return narrowPathSymbol(ast, env, type);
case ASTTypes.MemberExpression:
return narrowPathMember(ast, env, type);
default:
return env;
}
};
const narrowPathSymbol = (ast, env, type) => {
const idType = env.get(ast.name);
if (!idType) {
throw new TypeException(
`Expected bound identifier, got undefined`,
ast.srcloc
);
}
env.set(ast.name, narrowType(idType, type));
return env;
};
const narrowPathMember = (ast, env, type) => {
return narrowPath(
ast.object,
env,
Type.object([{ name: ast.property.name, type: type }])
);
};
export const narrowType = (x, y) => {
if (Type.isAny(x) || Type.isAny(y)) return Type.any;
if (Type.isUndefined(x) && Type.isUndefined(y)) return Type.undefinedType;
if (Type.isUndefined(x)) return widenNots(y);
if (Type.isUndefined(y)) return x;
if (Type.isNever(x) || Type.isNever(y)) return Type.never;
if (Type.isUnknown(x)) return widenNots(y);
if (Type.isUnknown(y)) return x;
if (Type.isTuple(x)) {
return Type.tuple(x.types.map((a) => narrowType(a, y)));
}
if (Type.isTuple(y)) {
return Type.tuple(y.types.map((b) => narrowType(x, b)));
}
if (Type.isUnion(x)) {
return Type.union(...x.types.map((a) => narrowType(a, y)));
}
if (Type.isUnion(y)) {
return Type.union(...y.types.map((b) => narrowType(x, b)));
}
if (Type.isIntersection(x)) {
return Type.intersection(...x.types.map((a) => narrowType(a, y)));
}
if (Type.isIntersection(y)) {
return Type.intersection(...y.types.map((b) => narrowType(x, b)));
}
if (Type.isNot(y)) {
if (isSubtype(x, y.base)) {
return Type.never;
} else if (
Type.isBoolean(x) &&
Type.isSingleton(y.base) &&
Type.isBoolean(y.base.base)
) {
return Type.singleton(
"Boolean",
y.base.value === "true" ? "false" : "true"
);
} else {
return x;
}
}
if (Type.isSingleton(x) && Type.isSingleton(y)) {
return x.value === y.value ? x : Type.never;
}
if (Type.isSingleton(x)) {
return x.base === y.kind ? x : Type.never;
}
if (Type.isSingleton(y)) {
return y.base === x.kind ? y : Type.never;
}
if (Type.isObject(x) && Type.isObject(y)) {
const properties = x.properties.map(({ name, type: xType }) => {
const yType = propType(y, name);
const type = yType ? narrowType(xType, yType) : xType;
return { name, type };
});
if (properties.some(({ type }) => Type.isNever(type))) {
return Type.never;
} else {
return Type.object(properties);
}
}
return Type.intersection(x, y);
};
const widenNots = (type) => {
switch (type.kind) {
case TypeTypes.Not:
return Type.unknown;
case TypeTypes.Union:
return Type.union(...type.types.map(widenNots));
case TypeTypes.Object:
return Type.object(
type.properties.map(({ name, type }) => ({
name,
type: widenNots(type),
}))
);
default:
return type;
}
};
That's actually a lot of code! Please check out Jake's article if you don't understand anythingβhe explains the details quite well.
Inference
We need to make some changes to the map
function in src/typechecker/map.js
before we move on to the inference functions. Namely, we need to make it so map
can accept either 2 or 3 arguments based on whether we're mapping over 1 type or 2. That way we can use map
with binary expressions.
Rename map
to map1
, then add this new function in as map
and export it:
export const map = (...args) => {
switch (args.length) {
case 2:
return map1(args[0], args[1]);
case 3:
return map2(args[0], args[1], args[2]);
default:
throw new Exception(`Unexpected args length ${args.length}`);
}
};
Here's the map2
function:
export const map2 = (t1, t2, fn) => {
if (isUnion(t1) || isUnion(t2)) {
const t1s = isUnion(t1) ? t1.types : [t1];
const t2s = isUnion(t2) ? t2.types : [t2];
const ts = [];
for (const t1 of t1s) {
for (const t2 of t2s) {
ts.push(map(t1, t2, fn));
}
}
return union(...ts);
} else if (isIntersection(t1) || isIntersection(t2)) {
const t1s = isIntersection(t1) ? t1.types : [t1];
const t2s = isIntersection(t2) ? t2.types : [t2];
const ts = [];
let error = undefined;
for (const t1 of t1s) {
for (const t2 of t2s) {
try {
ts.push(fn(t1, t2));
} catch (e) {
if (!error) error = e;
}
}
}
if (ts.length === 0) {
throw error;
} else {
return intersection(...ts);
}
} else {
return fn(t1, t2);
}
};
Now we need 6 new cases in the switch
statement in infer
in src/typechecker/infer.js
:
// other cases...
case ASTTypes.IfExpression:
return inferIfExpression(ast, env, constant);
case ASTTypes.WhenExpression:
return inferWhenExpression(ast, env, constant);
case ASTTypes.CondExpression:
return inferCondExpression(ast, env, constant);
case ASTTypes.UnaryExpression:
return inferUnaryExpression(ast, env, constant);
case ASTTypes.BinaryExpression:
return inferBinaryExpression(ast, env, constant);
case ASTTypes.LogicalExpression:
return inferLogicalExpression(ast, env, constant);
// default case
For inferring if expressions, we first infer the test. Then we use the test to narrow for each of the consequent and alternate branches, wrapping each in a thunk so inference only happens if it's needed.
If the test is known to be truthy or falsy, we infer only the appropriate branch. Otherwise, we unify the two branches.
TypeScript constructs a union type of the two branches if they produce different types, but I want to enforce some degree of similarity between the branches so we're unifying the two types instead of going with a simple union. That means the type of one branch must be the same as or a subtype of the other.
Here's inferIfExpression
:
const inferIfExpression = (node, env, constant) => {
const test = infer(node.test, env, constant);
const consequent = () => infer(node.then, narrow(node.test, env, true));
const alternate = () => infer(node.else, narrow(node.test, env, false));
if (Type.isTruthy(test)) {
return consequent();
} else if (Type.isFalsy(test)) {
return alternate();
} else {
return unify(consequent(), alternate());
}
};
inferWhenExpression
is a little bit strange. It returns the nil
type no matter what. But we still infer a type for the test then loop over the body block and infer a type for each expression to make sure there are no inference errors:
const inferWhenExpression = (node, env, constant) => {
const test = infer(node.test, env, constant);
// if the test is falsy, this will never run
if (!Type.isFalsy(test)) {
// check expressions in body
for (let expr of node.body) {
infer(expr, narrow(node.test, env, true), constant);
}
}
return Type.nil;
};
inferCondExpression
loops over all the clauses. If one of the clause tests produces a known truthy type, we return the type of its expression. Otherwise we collect the types of all clauses, including the :else
clause, and unify them all. That means the common subtype is produced as the type for the expression so, like with an if expression, the types for each branch need to be subtypes of each other.
const inferCondExpression = (node, env, constant) => {
let types = [];
for (let clause of node.clauses) {
const test = infer(clause.test, env, constant);
const type = infer(
clause.expression,
narrow(clause.test, env, true),
constant
);
if (Type.isTruthy(test)) {
// the first truthy condition type will trigger the clause's expression
return type;
}
types.push(type);
}
const elseType = infer(node.else, env, constant);
return unifyAll(...types, elseType);
};
Since not
is the only unary expression, that's the only case inferUnaryExpression
handles. If the type of its operand is truthy or falsy it returns a singleton type of the opposite boolean. Otherwise, if it can't be determined at compile time, it returns the type boolean.
const inferUnaryExpression = (node, env, constant) => {
const operand = infer(node.operand, env, constant);
return Type.map(operand, (operand) => {
switch (node.op) {
case "not":
if (Type.isTruthy(operand)) {
return Type.singleton("Boolean", "false");
} else if (Type.isFalsy(operand)) {
return Type.singleton("Boolean", true);
} else {
return Type.boolean;
}
default:
throw new Exception(`Unknown unary operator ${node.op}`);
}
});
};
inferBinaryExpression
is similar, but it has 2 cases to handle and each case has 2 operands:
const inferBinaryExpression = (node, env, constant) => {
const left = infer(node.left, env, constant);
const right = infer(node.right, env, constant);
return Type.map(left, right, (left, right) => {
switch (node.op) {
case "equal?":
if (Type.isSingleton(left) && Type.isSingleton(right)) {
return Type.singleton(left.value === right.value ? "true" : "false");
} else {
return Type.boolean;
}
case "not-equal?":
if (Type.isSingleton(left) && Type.isSingleton(right)) {
return Type.singleton(left.value !== right.value ? "true" : "false");
} else {
return Type.boolean;
}
default:
throw new Exception(`Unknown binary operator ${node.op}`);
}
});
};
Finally, with inferLogicalExpression
we can make assumptions to narrow the types based on the truthiness or falsiness of the left operand. For and
, we unify the right with the result of narrowing the left type with the falsy
type union, and for or
we do the opposite:
const inferLogicalExpression = (node, env, constant) => {
const left = infer(node.left, env, constant);
const right = () => infer(node.right, env, constant);
switch (node.op) {
case "and":
if (Type.isFalsy(left)) {
return left;
} else if (Type.isTruthy(left)) {
return right();
} else {
return unify(narrowType(left, Type.falsy), right());
}
case "or":
if (Type.isTruthy(left)) {
return left;
} else if (Type.isFalsy(left)) {
return right();
} else {
return unify(narrowType(left, Type.truthy), right());
}
default:
throw new Exception(`Unknown logical operator ${node.op}`);
}
};
Checking Against If and Cond Expressions
We need to add cases to the check
function in src/typechecker/check.js
for if and cond expressions:
// other cases
if (ast.kind === ASTTypes.IfExpression) {
return checkIfExpression(ast, type, env);
}
if (ast.kind === ASTTypes.CondExpression) {
return checkCondExpression(ast, type, env);
}
// const inferredType etc...
The checking functions work similiarly to their respective inference functions:
const checkIfExpression = (ast, type, env) => {
const test = infer(ast.test, env);
const consequent = () => check(ast.then, type, narrow(ast.test, env, true));
const alternate = () => check(ast.else, type, narrow(ast.test, env, false));
if (Type.isTruthy(test)) {
consequent();
} else if (Type.isFalsy(test)) {
alternate();
} else {
consequent();
alternate();
}
};
const checkCondExpression = (ast, type, env) => {
for (let clause of ast.clauses) {
const test = infer(clause.test, env);
check(clause.expression, type, narrow(ast.test, env, true));
if (Type.isTruthy(test)) {
// we can stop here if test is truthy
return;
}
}
check(ast.else, type, env);
};
Checking with The Type Checker
First we need to add the relevant cases to the switch
statement in checkNode
in src/typechecker/TypeChecker.js
:
// other cases
case ASTTypes.UnaryExpression:
return this.checkUnaryExpression(node, env);
case ASTTypes.BinaryExpression:
return this.checkBinaryExpression(node, env);
case ASTTypes.LogicalExpression:
return this.checkLogicalExpression(node, env);
case ASTTypes.IfExpression:
return this.checkIfExpression(node, env);
case ASTTypes.CondExpression:
return this.checkCondExpression(node, env);
case ASTTypes.WhenExpression:
return this.checkWhenExpression(node, env);
// default case
checkUnaryExpression
checks the operand then returns a new node annotated with the inferred node type:
checkUnaryExpression(node, env) {
const operand = this.checkNode(node.operand, env);
return { ...node, operand, type: infer(node, env) };
}
checkBinaryExpression
does the same thing except with left and right operands:
checkBinaryExpression(node, env) {
const left = this.checkNode(node.left, env);
const right = this.checkNode(node.right, env);
return { ...node, left, right, type: infer(node, env) };
}
checkLogicalExpression
is the same as checkBinaryExpression
:
checkLogicalExpression(node, env) {
const left = this.checkNode(node.left, env);
const right = this.checkNode(node.right, env);
return { ...node, left, right, type: infer(node, env) };
}
`checkIfExpression is similar, but with all 3 expressions:
`js
checkIfExpression(node, env) {
const test = this.checkNode(node.test, env);
const then = this.checkNode(node.then, env);
const elseBranch = this.checkNode(node.else, env);
return { ...node, test, then, else: elseBranch, type: infer(node, env) };
}
`
The checkCondExpression
method loops over the node's clauses and checks each test and expression, then checks the else branch and returns the fully checked node:
`js
checkCondExpression(node, env) {
/** @type {TypedAST[]} */
let clauses = [];
for (let clause of node.clauses) {
const test = this.checkNode(clause.test, env);
const expression = this.checkNode(clause.expression, env);
clauses.push({ test, expression });
}
const elseBranch = this.checkNode(node.else, env);
return { ...node, clauses, else: elseBranch, type: infer(node, env) };
}
`
Finally, checkWhenExpression
checks the test, then checks each node of the body, then returns the annotated node:
`js
checkWhenExpression(node, env) {
if (!node.env) {
node.env = env.extend("WhenExpression");
}
const whenEnv = node.env;
const test = this.checkNode(node.test, env);
let body = [];
for (let expr of node.body) {
body.push(this.checkNode(expr, whenEnv));
}
return { ...node, test, body, type: infer(node, whenEnv) };
}
`
And that's it for changes to the type checker!
Changes to The Visitor
We need to add new cases and methods for each of the new AST nodes in src/visitor/Visitor.js
. First, let's add the cases to the switch
statement in visit
:
js
// other cases...
case ASTTypes.IfExpression:
return this.visitIfExpression(node);
case ASTTypes.CondExpression:
return this.visitCondExpression(node);
case ASTTypes.WhenExpression:
return this.visitWhenExpression(node);
case ASTTypes.UnaryExpression:
return this.visitUnaryExpression(node);
case ASTTypes.BinaryExpression:
return this.visitBinaryExpression(node);
case ASTTypes.LogicalExpression:
return this.visitLogicalExpression(node);
// default case
Here's visitIfExpression
:
`js
visitIfExpression(node) {
const test = this.visit(node.test);
const then = this.visit(node.then);
const elseBranch = this.visit(node.else);
return { ...node, test, then, else: elseBranch };
}
`
and visitCondExpression
:
`js
visitCondExpression(node) {
let clauses = [];
for (let clause of node.clauses) {
const test = this.visit(clause.test);
const expression = this.visit(clause.expression);
clauses.push({ test, expression });
}
const elseBranch = this.visit(node.else);
return { ...node, clauses, else: elseBranch };
}
`
and visitWhenExpression
:
`js
visitWhenExpression(node) {
/** @type {AST[]} */
let body = [];
const test = this.visit(node.test);
for (let expr of node.body) {
body.push(this.visit(expr));
}
return { ...node, test, body };
}
`
and visitUnaryExpression
:
js
visitUnaryExpression(node) {
const operand = this.visit(node.operand);
return { ...node, operand };
}
and visitBinaryExpression
:
`js
visitBinaryExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);
return { ...node, left, right };
}
`
and, finally, visitLogicalExpression
:
`js
visitLogicalExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);
return { ...node, left, right };
}
`
With the new Visitor
methods in place, let's look at the desugarer.
Changes to The Desugarer
We need to desugar binary expressions, unary expressions, and cond expressions, so we need methods for each of them in src/desugarer/Desugarer.js
. The first two are simple.
Here's visitBinaryExpression
:
`js
visitBinaryExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);
const func = AST.Symbol(Token.new(TokenTypes.Symbol, node.op, node.srcloc));
return AST.CallExpression(func, [left, right], node.srcloc);
}
`
and visitUnaryExpression
:
`js
visitUnaryExpression(node) {
const operand = this.visit(node.operand);
return AST.CallExpression(
AST.Symbol(Token.new(TokenTypes.Symbol, node.op, node.srcloc)),
[operand],
node.srcloc
);
}
`
visitCondExpression
is a bit more complicated. We're going to create a recursive helper function called condVisitHelper
that takes in an array of clauses and the accumulated nested IfExpression
node.
If the clauses array has more than one element, we return the accumulated IfExpression
(or, if it's the first time calling the function, we create it) with a recursive call to condVisitHelper
in the else
branch position.
Otherwise, if clauses has one element, we return the base case which is an IfExpression
node with the else
branch from the original CondExpression
node in the else
branch position.
Here's visitCondExpression
:
`js
visitCondExpression(node) {
const srcloc = node.srcloc;
/**
* Recursive helper to construct IfExpression from CondExpression
* @param {import("../parser/ast.js").CondClause[]} clauses
* @param {import("../parser/ast.js").IfExpression} accum
*/
const condVisitHelper = (clauses, accum) => {
if (clauses.length > 1) {
const clause = clauses[0];
accum = AST.IfExpression(
clause.test,
clause.expression,
condVisitHelper(clauses.slice(1), accum),
srcloc
);
return accum;
} else if (clauses.length === 1) {
const clause = clauses[0];
const elseBranch = node.else;
return AST.IfExpression(
clause.test,
clause.expression,
elseBranch,
srcloc
);
}
};
return condVisitHelper(node.clauses, {});
}
`
That's it for changes to the desugarer. Now let's look at changes to the emitter.
Changes to The Emitter
Since we're desugaring away 3 of the new AST node types, we only need emitter methods for IfExpression
, WhenExpression
, and LogicalExpression
nodes.
First let's add the cases to the switch
statement in emit
in src/emitter/Emitter.js
:
js
// other cases...
case ASTTypes.LogicalExpression:
return this.emitLogicalExpression(node, ns);
case ASTTypes.IfExpression:
return this.emitIfExpression(node, ns);
case ASTTypes.WhenExpression:
return this.emitWhenExpression(node, ns);
// default case
emitLogicalExpression
switch
es on the operator and calls a helper method depending on which operator the node has:
`js
emitLogicalExpression(node, ns) {
switch (node.op) {
case "and":
return this.emitLogicalAnd(node, ns);
case "or":
return this.emitLogicalOr(node, ns);
default:
throw new Exception(`Unknown logical operator ${node.op}`);
}
}
emitLogicalAnd(node, ns) {
const left = this.emit(node.left, ns);
const right = this.emit(node.right, ns);
return `rt.isTruthy(${left}) ? ${right} : ${left}`;
}
emitLogicalOr(node, ns) {
const left = this.emit(node.left, ns);
const right = this.emit(node.right, ns);
return `rt.isFalsy(${left}) ? ${left} : ${right}`;
}
`
We have to compile to a ternary expression and use rt.isTruthy
and rt.isFalsy
since the runtime semantics of Wanda are slightly different from those of JavaScript with regards to the truthiness and falsiness of values.
emitIfExpression
emits a ternary expression:
js
rt.isTruthy(${this.emit(node.test, ns)}) ? ${this.emit(
emitIfExpression(node, ns) {
return
node.then,
ns
)} : ${this.emit(node.else, ns)};
}
Since a when expression is imperative, it returns nil
instead of a type based on the expression itself. Because of that, we need to emit a little bit of ternary expression magic to cover the case where the test is falsy so the body code is not executed. We wrap the consequent branch of the ternary statement in an IIFE so we can execute its contents and then return nil
from the expression body. Here's emitWhenExpression
:
js
(rt.isTruthy(${this.emit(node.test, ns)}))
emitWhenExpression(node, ns) {
const whenNs = ns.extend("WhenExpression");
let code =
? (() => {\n`;
for (let expr of node.body) {
code += ` ${this.emit(expr, whenNs)};\n`;
}
code += " return null";
code += " })()\n";
code += " : null\n";
return code;
}
`
That's all the changes to the emitter, so let's look at printing the new AST nodes.
Changes to The Printer
We don't have any new types of values to print, but we do have 6 new node types.
First, let's add them to be dispatched from the switch
statement in print
in src/printer/printAST.js
:
js
// other cases...
case ASTTypes.IfExpression:
return this.printIfExpression(node, indent);
case ASTTypes.CondExpression:
return this.printCondExpression(node, indent);
case ASTTypes.WhenExpression:
return this.printWhenExpression(node, indent);
case ASTTypes.BinaryExpression:
return this.printBinaryExpression(node, indent);
case ASTTypes.LogicalExpression:
return this.printLogicalExpression(node, indent);
case ASTTypes.UnaryExpression:
return this.printUnaryExpression(node, indent);
// default case
The methods themselves are pretty simple, so I won't belabor the point. Here they are:
js
${prIndent(indent)}IfExpression:\n
printIfExpression(node, indent) {
let prStr =;
${prIndent(indent + 2)}Test:\n
prStr +=;
${prIndent(indent + 2)}Then:\n
prStr += this.print(node.test, indent + 4) + "\n";
prStr +=+ "\n";
${prIndent(indent + 2)}Else:\n`;
prStr += this.print(node.then, indent + 4) + "\n";
prStr +=
prStr += this.print(node.else, indent + 4) + "\n";
return prStr;
}
printCondExpression(node, indent) {
let prStr = ${prIndent(indent)}CondExpression:\n
;
for (let clause of node.clauses) {
prStr += `${prIndent(indent + 2)}Test:\n`;
prStr += this.print(clause.test, indent + 4) + "\n";
prStr += `${prIndent(indent + 2)}Expression:\n`;
prStr += this.print(clause.expression, indent + 4) + "\n";
}
prStr += `${prIndent(indent + 2)}Else:\n`;
prStr += this.print(node.else, indent + 4) + "\n";
return prStr;
}
printWhenExpression(node, indent) {
let prStr = ${prIndent(indent)}WhenExpression:\n
;
prStr += ${prIndent(indent + 2)}Test:\n
;
prStr += this.print(node.test, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Body:
;
for (let expr of node.body) {
prStr += this.print(expr, indent + 4) + "\n";
}
return prStr;
}
printBinaryExpression(node, indent) {
let prStr = ${prIndent(indent)}BinaryExpression:\n
;
prStr += ${prIndent(indent + 2)}Left:\n
;
prStr += this.print(node.left, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Operator:\n
;
prStr += ${prIndent(indent + 4)}${node.op}\n
;
prStr += ${prIndent(indent + 2)}Right:\n
;
prStr += this.print(node.right, indent + 4) + "\n";
return prStr;
}
printLogicalExpression(node, indent) {
let prStr = ${prIndent(indent)}LogicalExpression:\n
;
prStr += ${prIndent(indent + 2)}Left:\n
;
prStr += this.print(node.left, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Operator:\n
;
prStr += ${prIndent(indent + 4)}${node.op}\n
;
prStr += ${prIndent(indent + 2)}Right:\n
;
prStr += this.print(node.right, indent + 4) + "\n";
return prStr;
}
printUnaryExpression(node, indent) {
let prStr = ${prIndent(indent)}UnaryExpression:\n
;
prStr += ${prIndent(indent + 2)}Operator:\n
;
prStr += ${prIndent(indent + 4)}${node.op}\n
;
prStr += ${prIndent(indent + 2)}Operand:\n
;
prStr += this.print(node.operand, indent + 4) + "\n";
return prStr;
}
`
That's all the changes to the printer. Now we're going to make some changes to the CLI to make it more flexible.
Changes to The CLI
We're going to make 3 major changes to the CLI:
- Allow running a Wanda file directly from the command line (compiling and executing it in one step)
- Interactively load a Wanda file as a script and use its definitions in the REPL
- Output a compiled file and optionally build the whole bundle
In the case of a compiled file, you have the option to either use ES2015 imports to import the global library or to use ESBuild to bundle everything into one giant JavaScript file.
Changes to The REPL
First, we're going to change the arguments the REPL function takes. Instead of a single string argument mode
, we're going to give it an options object with properties mode
(string) and path
(string). Everything has defaults, so it can still be called simply by running the bare function with no arguments.
The way mode
works isn't changing. We will, however, enable the ability to turn off printing the AST from within the REPL if you've enabled it. Not doing that sooner was a silly oversight on my part.
The big change is if there's a value for the path
option it will grab the contents of that file, compile it, and execute the result in the global environment so you'll have access to any toplevel definitions in that file within the REPL.
Here's the new version of the repl
function in src/cli/repl.js
:
`js
import os from "os";
import vm from "vm";
import fs from "fs";
import readlineSync from "readline-sync";
import { pprintAST, pprintDesugaredAST } from "./pprint.js";
import { println } from "../printer/println.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";
import { build } from "./build.js";
import { compile } from "./compile.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";
import { countIndent, inputFinished } from "./utils.js";
const read = (prompt) => readlineSync.question(prompt);
export const repl = ({ mode = "repl", path = "" } = {}) => {
// Create global compile environment
const compileEnv = makeGlobalNameMap();
const typeEnv = makeGlobalTypeEnv();
// Build global module and instantiate in REPL context
// This should make all compiled global symbols available
const globalCode = build(emitGlobalEnv());
vm.runInThisContext(globalCode);
if (path) {
// load file in REPL interactively
const fileContents = fs.readFileSync(path, { encoding: "utf-8" });
const compiledFile = compile(fileContents, path, compileEnv, typeEnv);
vm.runInThisContext(compiledFile);
}
let prompt = "wanda> ";
let input = "";
let indent = 0;
while (true) {
try {
input += read(prompt + " ".repeat(indent));
switch (input) {
// If it's a command, execute it
case ":quit":
process.exit(0);
case ":print-ast":
mode = "printAST";
input = "";
break;
case ":print-desugared":
mode = "printDesugared";
input = "";
break;
case ":no-print-ast":
mode = "repl";
input = "";
break;
// If it's code, compile and run it
default:
if (inputFinished(input)) {
let compiled = compile(input, "stdin", compileEnv, typeEnv);
let result = vm.runInThisContext(compiled);
if (mode === "printAST") {
console.log(pprintAST(input));
} else if (mode === "printDesugared") {
console.log(pprintDesugaredAST(input));
}
println(result);
input = "";
indent = 0;
} else {
input += os.EOL;
indent = countIndent(input);
}
}
} catch (e) {
console.error(e.stack);
input = "";
indent = 0;
}
}
};
`
Changes to The run
Function
I've removed the print
command from the run
function, now if you want to print the AST from within the REPL you have to enable it inside the REPL. You can decide whether or not you want to do the sameβit's all the same to me either way.
We need to add a function that can compile and run a Wanda file, and parse the run
command to do it. I've also replaced throwing exceptions with logging an error message and passing 1
to process.exit
to fall more in line with UNIX shell script conventions. Here's the new run
function and the runFile
function that goes with it in src/cli/run.js
:
`js
import vm from "vm";
import fs from "fs";
import { join } from "path";
import { repl } from "./repl.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";
import { build } from "./build.js";
import { compile } from "./compile.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";
export const run = () => {
switch (process.argv[2]) {
case "-i":
if (!process.argv[3]) {
console.log(-i option requires file path as argument
);
process.exit(1);
}
const path = join(process.cwd(), process.argv[3]);
repl({ path });
break;
case undefined:
case "repl":
return repl();
case "run":
if (!process.argv[3]) {
console.log(run command requires file path as argument
);
process.exit(1);
}
return runFile(join(process.cwd(), process.argv[3]));
default:
console.log("Invalid command specified");
process.exit(1);
}
};
const runFile = (path) => {
const fileContents = fs.readFileSync(path, { encoding: "utf-8" });
const globalNs = makeGlobalNameMap();
const typeEnv = makeGlobalTypeEnv();
const globalCode = build(emitGlobalEnv());
const compiledCode = compile(fileContents, path, globalNs, typeEnv);
vm.runInThisContext(globalCode);
return vm.runInThisContext(compiledCode);
};
`
Compiling and Building Files
We also need the ability to compile Wanda files and output the results, with the optional step of building the whole program instead of just using ES2015 import
statements for the global library.
We'll call the main function wandac
after compilers like javac
and rustc
, and add an auxiliary compileFile
function to handle the actual compilation. Here are those functions, in src/cli/wandac.js
:
`js
import os from "os";
import fs from "fs";
import { join, basename } from "path";
import { compile } from "./compile.js";
import { build } from "./build.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";
const compileFile = (path) => {
const contents = fs.readFileSync(path, { encoding: "utf-8" });
return compile(contents, path, makeGlobalNameMap(), makeGlobalTypeEnv());
};
export const wandac = () => {
if (!process.argv[2]) {
console.log(wandac requires either a file path or command argument
);
process.exit(1);
}
switch (process.argv[2]) {
case "build": {
const pathname = join(process.cwd(), process.argv[3]);
const compiledFile = compileFile(pathname);
const globals = emitGlobalEnv();
const code = globals + os.EOL + os.EOL + compiledFile;
const bName = basename(pathname).split(".")[0];
const outfile = bName + ".build" + ".js";
const built = build(code, outfile, bName);
fs.writeFileSync(outfile, built, { encoding: "utf-8" });
break;
}
default: {
// should be a file path
const pathname = join(process.cwd(), process.argv[2]);
const compiledFile = compileFile(pathname);
const globals = emitGlobalEnv();
const code = globals + os.EOL + os.EOL + compiledFile;
const outfile = basename(pathname).split(".")[0] + ".js";
fs.writeFileSync(outfile, code, { encoding: "utf-8" });
break;
}
}
};
`
Now let's create a new Node shell script in the bin
directory (not src/bin
) and link it up to run as a global system command. First, add this to bin/wandac.js
:
`js
!/usr/bin/env node
import { wandac } from "../src/cli/wandac.js";
wandac();
`
Next, we need to make a slight change to package.json
to account for the fact that there are now 2 executable scripts. Change
json
"bin": "bin/wanda.js",
to:
json
"bin": {
"wanda": "bin/wanda.js",
"wandac": "bin/wandac.js"
},
Now make sure you're in the compiler's root directory and run the command npm link
. Now you should have both wanda
and wandac
available as global shell commands.
Example Files
Finally, we need a couple of example files to load into the REPL and play with! I've created a couple of simple ones just to get you started.
I'm using .wanda
as the file extension.
Create a new directory in the compiler root called examples
and add the following 2 files:
examples/inc.wanda
:
clojure
(def inc (x: number) -> number (+ x 1))
examples/fib.wanda
:
clojure
(def fib (n: number) -> number
(if (= n 1)
n
(* n (fib (- n 1)))))
Now if you run the command wanda -i examples/inc.wanda
you can run the inc
function from within the REPL! You can also compile them if you want to see the output.
Conclusion
We've made some major strides today! While our language was technically Turing complete with just functions (it's amazing what you can do with just functions), it's a lot easier to consider it so now that we have branching enabled through conditional expressions. Recursion allows for repeating code, so the combination of conditionals and functions makes the language fully Turing complete.
The CLI is also now much more fully functional compared to just being able to run a simple REPL.
Please feel free to add your own examples. If you come up with anything cool, by all means send a pull request and I'll incorporate it into the examples I've got!
As always, you can find the code as of this point on the GitHub repo under the relevant tag.
Next time we're going to add 2 iterative constructs: loop
and for
. I'll see you then!
Top comments (0)