Jason Barr

Posted on Jul 1, 2023

Create Your Own Programming Language 8: Conditionals

#programminglanguages #compilers #interpreters #javascript

It's time for the next installment in the Create Your Own Programming Language series. This time we're going to add conditionals to the Wanda language.

We're actually going to add 3 different kinds of conditionals: if, cond, and when.

if is a standard if expression, with 2 branches based on whether a test comes back truthy or falsy.

cond lets you chain together multiple conditions and expressions, with an :else branch at the end in case nothing matches. The :else branch is always required. cond expressions desugar to a chain of if expressions.

A cond expression looks like this. Note that each branch is enclosed within its own set of parentheses:

(cond
  ((= x 0) "zero")
  ((< x 10) "less than 10")
  ((< x 0) "less than zero")
  (:else "10 or greater"))

when is imperative and lets you execute a block of code if the test is truthy. It evaluates to nil either way, so you wouldn't want to use it to assign a toplevel variable. It's more of an imperative construct, ideally for containing side effects.

When also allows you to nest any number of expressions inside it, whereas each branch of an if expression can only take a single expression. And since when doesn't have an else branch, you can use it in cases where you don't need an alternate execution path if the main one isn't taken.

Here's an example of a when expression:

(when (= 1 1)
  (println "Wow, one equals one?")
  (println "Who woulda thunk it???"))

We're also going to add a new form, the LogicalExpression. This is for expressions with the and and or forms.

Some Lisps will let you use any number of arguments to the and and or forms, but we're going to restrict users to 2 arguments for each form to simplify things.

Finally, we'll add 2 new node types to help the type checker that will both desugar to call expressions: BinaryExpression which will include the equal? and not-equal? operations, and UnaryExpression which is only used with not.

These typechecker-only nodes will help with a process known as type narrowing, which allows the type checker to better understand things like which arm of a union to check against in branches of a conditional.

As always, if you haven't read the previous article in which we added several new types to the type checker, you should do that before reading ahead.

Ok? Let's go!

Changes to The Lexer and Reader

None! Both are fine just the way they are for today.

Changes to The Core Library

We need to add a not-equal? function to the core library to match up with the not-equal? binary expression.

Ordinarily in a Lisp when you want to check if a value is not equal you'd just prefix equal? with not, as in (not (equal? x y)), but to give the type checker some more help with narrowing we're adding this additional form.

In lib/js/core.js go down to the equal? function and add this between it and is?:

    "not-equal?": rt.makeFunction(
      (a, b) => {
        if (rt.hasDict(a) && rt.hasDict(b)) {
          return !equal(rt.getMetaField(a, "dict"), rt.getMetaField(b, "dict"));
        }
        return !equal(a, b);
      },
      { contract: "(any, any) -> boolean", name: "not-equal?" }
    ),

That's it for changes to the core library, so now we can move on to changes to the parser.

Changes to The Parser

We need to define new AST node types and constructors for those nodes.

In src/parser/ast.js add the following to the ASTTypes enum:

export const ASTTypes = {
  // other members...
  IfExpression: "IfExpression",
  CondExpression: "CondExpression",
  WhenExpression: "WhenExpression",
  BinaryExpression: "BinaryExpression",
  LogicalExpression: "LogicalExpression",
  UnaryExpression: "UnaryExpression",
};

And in the AST object add the following node constructors:

export const AST = {
  // other constructors...
  IfExpression(test, then, elseBranch, srcloc) {
    return {
      kind: ASTTypes.IfExpression,
      test,
      then,
      else: elseBranch,
      srcloc,
    };
  },

  CondExpression(clauses, elseBranch, srcloc) {
    return {
      kind: ASTTypes.CondExpression,
      clauses,
      else: elseBranch,
      srcloc,
    };
  },

  WhenExpression(test, body, srcloc) {
    return {
      kind: ASTTypes.WhenExpression,
      test,
      body,
      srcloc,
    };
  },

  BinaryExpression(left, op, right, srcloc) {
    return {
      kind: ASTTypes.BinaryExpression,
      left,
      op,
      right,
      srcloc,
    };
  },

  LogicalExpression(left, op, right, srcloc) {
    return {
      kind: ASTTypes.LogicalExpression,
      left,
      op,
      right,
      srcloc,
    };
  },

  UnaryExpression(op, operand, srcloc) {
    return {
      kind: ASTTypes.UnaryExpression,
      op,
      operand,
      srcloc,
    };
  },
}

We'll need 6 new parse functions in src/parser/parse.js:

parseIfExpression
parseCondExpression
parseWhenExpression
parseBinaryExpression
parseLogicalExpression
parseUnaryExpression

Add the following cases to the switch statement in parseList:

    // other cases
    case "if":
      return parseIfExpression(form);
    case "cond":
      return parseCondExpression(form);
    case "when":
      return parseWhenExpression(form);
    case "equal?":
    case "not-equal?":
      return parseBinaryExpression(form);
    case "and":
    case "or":
      return parseLogicalExpression(form);
    case "not":
      return parseUnaryExpression(form);
    // default case

parseIfExpression is pretty simple. Just grab the test, then, and elseBranch forms off the main form, parse each form, and pass them to the constructor.

const parseIfExpression = (form) => {
  const [_, test, then, elseBranch] = form;
  const srcloc = form.srcloc;
  const parsedTest = parseExpr(test);
  const parsedThen = parseExpr(then);
  const parsedElse = parseExpr(elseBranch);

  return AST.IfExpression(parsedTest, parsedThen, parsedElse, srcloc);
};

parseCondExpression is a little more complicated. We have to grab all the clauses off the main form, parse each one individually, and then when we get to the else branch handle that one separately since it gets its own property on the AST node object. Also, since the else branch is required, we throw an error if there isn't one.

const parseCondExpression = (form) => {
  const [_, ...clauses] = form;
  const srcloc = form.srcloc;
  /** @type {import("./ast.js").CondClause[]} */
  const parsedClauses = [];
  let hasElse = false;

  if (!(clauses.length >= 2)) {
    throw new Exception(
      `Cond expression must have at least 2 clauses; ${clauses.length} given at ${srcloc.file}, (${srcloc.line}:${srcloc.col})`
    );
  }

  for (let clause of clauses) {
    const [test, expr] = clause;

    if (test.type === TokenTypes.Keyword && test.value === ":else") {
      hasElse = true;
      break;
    }

    const parsedTest = parseExpr(test);
    const parsedExpr = parseExpr(expr);

    parsedClauses.push({ test: parsedTest, expression: parsedExpr });
  }

  if (!hasElse) {
    throw new Exception(
      `Cond expression expects an :else clause; none given at ${srcloc.file}, (${srcloc.line}:${srcloc.col})`
    );
  }

  /** @type {Cons} */
  const elseClause = clauses[clauses.length - 1];
  // we need the first element of the tail of the else clause
  const parsedElse = parseExpr(elseClause.cdr.car);

  return AST.CondExpression(parsedClauses, parsedElse, srcloc);
};

The other 4 parse functions, parseWhen, parseBinaryExpression, parseLogicalExpression, and parseUnaryExpression, are all simple:

const parseWhenExpression = (form) => {
  const [_, test, ...body] = form;
  const srcloc = form.srcloc;
  const parsedTest = parseExpr(test);
  const parsedBody = body.map(parseExpr);

  return AST.WhenExpression(parsedTest, parsedBody, srcloc);
};

const parseBinaryExpression = (form) => {
  const [op, left, right] = form;
  const srcloc = form.srcloc;
  const parsedLeft = parseExpr(left);
  const parsedRight = parseExpr(right);

  return AST.BinaryExpression(parsedLeft, op.value, parsedRight, srcloc);
};

const parseLogicalExpression = (form) => {
  const [op, left, right] = form;
  const srcloc = form.srcloc;
  const parsedLeft = parseExpr(left);
  const parsedRight = parseExpr(right);

  return AST.LogicalExpression(parsedLeft, op.value, parsedRight, srcloc);
};

const parseUnaryExpression = (form) => {
  const [op, operand] = form;
  const srcloc = form.srcloc;
  const parsedOperand = parseExpr(operand);

  return AST.UnaryExpression(op.value, parsedOperand, srcloc);
};

Changes to The Type Checker

Obviously we need to infer and check types for the new nodes, but the most significant change we'll be implementing in the type checker today is the addition of type narrowing.

Type narrowing allows the type checker to check certain expressions as if a certain condition were true or false, which allows it to make deductions about types it otherwise would not be able to make.

For example, let's say we have a discriminated union type:

(type Coord
  { type: "cartesian", x: number, y: number }
  || { type: "polar", radius: number, magnitude: number })

(def (point: Coord) { type: "cartesian", x: 5, y: 2.5 })

Given this when expression:

(when (equal? point.type "cartesian")
  ; assume point is the bottom-right point of
  ; a square that begins at the origin (0, 0)
  (def area (* point.x point.y))
  ; continue

Without type narrowing, the above when expression would have failed to type check because due to the union type we couldn't be certain the object would have x and y properties. Thanks to narrowing, since we know type is "cartesian" we know which arm of the union the object satisfies.

As Jake Donham says in his article about type narrowing, it can be helpful to think of the type environment as a logical statement that asserts facts about the types of values. Narrowing allows us to supplement the known facts with assumptions guaranteed to be correct if the right conditions are met.

Adding The Negation Type

To implement narrowing we're going to add a type that is only used in the type checker: the negation type, or not. The not type is only used in the type checker, it can't be annotated, and in fact it is only used in type narrowing and it's limited just to that functionality so we don't have to rewrite the entire checker just to account for it.

Let's add it in src/typechecker/types.js to the TypeTypes enum:

export const TypeTypes = {
  // other members...
  Not: "Not",
};

We'll also need a constructor in src/typechecker/constructors.js:

export const not = (base) => ({ kind: TypeTypes.Not, base });

and a validator in src/typechecker/validators.js:

export const isNot = (type) => {
  return type.kind === TypeTypes.Not;
};

We'll also need some utility functions and types for checking truthiness and falsiness of a value, when it can be known at compile time. We'll create those in src/typechecker/utils.js:

export const isTruthy = (type) => {
  switch (type.kind) {
    case TypeTypes.Nil:
      return false;
    case TypeTypes.Singleton:
      if (type.value === "false") {
        return false;
      }
    default:
      if (type.kind !== TypeTypes.Boolean) {
        return true;
      }
  }
};

export const isFalsy = (type) => {
  switch (type.kind) {
    case TypeTypes.Nil:
      return true;
    case TypeTypes.Singleton:
      if (type.value === "false") {
        return true;
      }
    default:
      if (type.kind !== TypeTypes.Boolean) {
        return false;
      }
  }
};

export const falsy = union(singleton("Boolean", "false"), nil);

export const truthy = intersection(
  not(singleton("Boolean", "false")),
  not(nil)
);

Remember that in Wanda all values except false and nil are truthy.

Let's add our utils to the Type module in src/typechecker/Type.js:

import { TypeTypes } from "./types.js";
import * as Validators from "./validators.js";
import * as Constructors from "./constructors.js";
import { typeToString } from "./typeToString.js";
import { union } from "./union.js";
import { map } from "./map.js";
import { intersection } from "./intersection.js";
import { isTruthy, isFalsy, truthy, falsy } from "./utils.js";

export const Type = {
  Type: TypeTypes,
  ...Constructors,
  ...Validators,
  toString: typeToString,
  union,
  map,
  intersection,
  isTruthy,
  isFalsy,
  truthy,
  falsy,
};

Just a quick note: I had a weird circular reference bug that popped up sometimes in Node when trying to use the type checker. The solution turned out to be removing the import of the Type module into src/typechecker/isSubtype.js and instead importing each of the type validators used in isSubtype one at a time and removing the Type. prefix for validators within isSubtype. If you have errors like "Cannot use Type before its initialization" after adding the rest of the code in this section that's how you fix them.

Narrowing

Now let's actually implement the narrowing functions. It's actually a lot of functions! But the thing most of them have in common is they restrict the applications of a type within an environment and return that modified environment to be used by the caller.

For a fuller explanation of what I'm about to show you, see Donham's post about type narrowing.

First is the narrow function.

When we type check a branch of a conditional, we call narrow with the test expression, the current environment, and a flag according to whether we're assuming the test to be true or false. It returns a new environment incorporating the information of the test and assumption.

When we deduce that a subexpression must satisfy a type, we call narrowUnary, narrowLogical, narrowBinary, or narrowPath to incorporate the appropriate expression type into the environment. When we reach a variable we call narrowType, which performs an operation similar to subtype checking between the current type from the environment and the deduced type in order to update the variable's type with the result.

Here are the narrowing functions, including the function widenNots that keeps not types from being returned in the environments produced by narrowing, in src/typechecker/narrow.js:

import { AST, ASTTypes } from "../parser/ast.js";
import { Exception, TypeException } from "../shared/exceptions.js";
import { Type } from "./Type.js";
import { TypeEnvironment } from "./TypeEnvironment.js";
import { infer } from "./infer.js";
import { isSubtype } from "./isSubtype.js";
import { propType } from "./propType.js";
import { TypeTypes } from "./types.js";

export const narrow = (ast, env, assume) => {
  switch (ast.kind) {
    case ASTTypes.UnaryExpression:
      return narrowUnary(ast, env, assume);

    case ASTTypes.LogicalExpression:
      return narrowLogical(ast, env, assume);

    case ASTTypes.BinaryExpression:
      return narrowBinary(ast, env, assume);

    default:
      return narrowPath(ast, env, assume ? Type.truthy : Type.falsy);
  }
};

const narrowUnary = (ast, env, assume) => {
  switch (ast.op) {
    case "not":
      return narrow(ast.operand, env, !assume);

    default:
      throw new Exception(`Unknown unary operator ${ast.op}`);
  }
};

const narrowLogical = (ast, env, assume) => {
  switch (ast.op) {
    case "and":
      if (assume) {
        env = narrow(ast.left, env, true);
        return narrow(ast.right, env, true);
      } else {
        if (Type.isTruthy(infer(ast.left, env))) {
          return narrow(ast.right, env, false);
        } else if (Type.isTruthy(infer(ast.right, env))) {
          return narrow(ast.left, env, false);
        } else {
          return env;
        }
      }

    case "or":
      if (!assume) {
        env = narrow(ast.left, env, false);
        return narrow(ast.right, env, false);
      } else {
        if (Type.isFalsy(infer(ast.left, env))) {
          return narrow(ast.right, env, true);
        } else if (Type.isFalsy(infer(ast.right, env))) {
          return narrow(ast.left, env, true);
        } else {
          return env;
        }
      }

    default:
      throw new Exception(`Unknown logical operator ${ast.op}`);
  }
};

const narrowBinary = (ast, env, assume) => {
  const left = infer(ast.left, env);
  const right = infer(ast.right, env);

  if ((ast.op === "equal?" && assume) || (ast.op === "not-equal?" && !assume)) {
    env = narrowPath(ast.left, env, right);
    return narrowPath(ast.right, env, left);
  } else if (
    (ast.op === "not-equal?" && assume) ||
    (ast.op === "equal?" && !assume)
  ) {
    if (Type.isSingleton(right)) {
      env = narrowPath(ast.left, env, Type.not(right));
    }

    if (Type.isSingleton(left)) {
      env = narrowPath(ast.right, env, Type.not(left));
    }

    return env;
  } else {
    return env;
  }
};

const narrowPath = (ast, env, type) => {
  switch (ast.kind) {
    case ASTTypes.Symbol:
      return narrowPathSymbol(ast, env, type);

    case ASTTypes.MemberExpression:
      return narrowPathMember(ast, env, type);

    default:
      return env;
  }
};

const narrowPathSymbol = (ast, env, type) => {
  const idType = env.get(ast.name);

  if (!idType) {
    throw new TypeException(
      `Expected bound identifier, got undefined`,
      ast.srcloc
    );
  }

  env.set(ast.name, narrowType(idType, type));
  return env;
};

const narrowPathMember = (ast, env, type) => {
  return narrowPath(
    ast.object,
    env,
    Type.object([{ name: ast.property.name, type: type }])
  );
};

export const narrowType = (x, y) => {
  if (Type.isAny(x) || Type.isAny(y)) return Type.any;
  if (Type.isUndefined(x) && Type.isUndefined(y)) return Type.undefinedType;
  if (Type.isUndefined(x)) return widenNots(y);
  if (Type.isUndefined(y)) return x;
  if (Type.isNever(x) || Type.isNever(y)) return Type.never;
  if (Type.isUnknown(x)) return widenNots(y);
  if (Type.isUnknown(y)) return x;

  if (Type.isTuple(x)) {
    return Type.tuple(x.types.map((a) => narrowType(a, y)));
  }

  if (Type.isTuple(y)) {
    return Type.tuple(y.types.map((b) => narrowType(x, b)));
  }

  if (Type.isUnion(x)) {
    return Type.union(...x.types.map((a) => narrowType(a, y)));
  }

  if (Type.isUnion(y)) {
    return Type.union(...y.types.map((b) => narrowType(x, b)));
  }

  if (Type.isIntersection(x)) {
    return Type.intersection(...x.types.map((a) => narrowType(a, y)));
  }

  if (Type.isIntersection(y)) {
    return Type.intersection(...y.types.map((b) => narrowType(x, b)));
  }

  if (Type.isNot(y)) {
    if (isSubtype(x, y.base)) {
      return Type.never;
    } else if (
      Type.isBoolean(x) &&
      Type.isSingleton(y.base) &&
      Type.isBoolean(y.base.base)
    ) {
      return Type.singleton(
        "Boolean",
        y.base.value === "true" ? "false" : "true"
      );
    } else {
      return x;
    }
  }

  if (Type.isSingleton(x) && Type.isSingleton(y)) {
    return x.value === y.value ? x : Type.never;
  }

  if (Type.isSingleton(x)) {
    return x.base === y.kind ? x : Type.never;
  }

  if (Type.isSingleton(y)) {
    return y.base === x.kind ? y : Type.never;
  }

  if (Type.isObject(x) && Type.isObject(y)) {
    const properties = x.properties.map(({ name, type: xType }) => {
      const yType = propType(y, name);
      const type = yType ? narrowType(xType, yType) : xType;
      return { name, type };
    });

    if (properties.some(({ type }) => Type.isNever(type))) {
      return Type.never;
    } else {
      return Type.object(properties);
    }
  }

  return Type.intersection(x, y);
};

const widenNots = (type) => {
  switch (type.kind) {
    case TypeTypes.Not:
      return Type.unknown;

    case TypeTypes.Union:
      return Type.union(...type.types.map(widenNots));

    case TypeTypes.Object:
      return Type.object(
        type.properties.map(({ name, type }) => ({
          name,
          type: widenNots(type),
        }))
      );

    default:
      return type;
  }
};

That's actually a lot of code! Please check out Jake's article if you don't understand anything—he explains the details quite well.

Inference

We need to make some changes to the map function in src/typechecker/map.js before we move on to the inference functions. Namely, we need to make it so map can accept either 2 or 3 arguments based on whether we're mapping over 1 type or 2. That way we can use map with binary expressions.

Rename map to map1, then add this new function in as map and export it:

export const map = (...args) => {
  switch (args.length) {
    case 2:
      return map1(args[0], args[1]);
    case 3:
      return map2(args[0], args[1], args[2]);
    default:
      throw new Exception(`Unexpected args length ${args.length}`);
  }
};

Here's the map2 function:

export const map2 = (t1, t2, fn) => {
  if (isUnion(t1) || isUnion(t2)) {
    const t1s = isUnion(t1) ? t1.types : [t1];
    const t2s = isUnion(t2) ? t2.types : [t2];
    const ts = [];
    for (const t1 of t1s) {
      for (const t2 of t2s) {
        ts.push(map(t1, t2, fn));
      }
    }
    return union(...ts);
  } else if (isIntersection(t1) || isIntersection(t2)) {
    const t1s = isIntersection(t1) ? t1.types : [t1];
    const t2s = isIntersection(t2) ? t2.types : [t2];
    const ts = [];
    let error = undefined;
    for (const t1 of t1s) {
      for (const t2 of t2s) {
        try {
          ts.push(fn(t1, t2));
        } catch (e) {
          if (!error) error = e;
        }
      }
    }
    if (ts.length === 0) {
      throw error;
    } else {
      return intersection(...ts);
    }
  } else {
    return fn(t1, t2);
  }
};

Now we need 6 new cases in the switch statement in infer in src/typechecker/infer.js:

    // other cases...
    case ASTTypes.IfExpression:
      return inferIfExpression(ast, env, constant);
    case ASTTypes.WhenExpression:
      return inferWhenExpression(ast, env, constant);
    case ASTTypes.CondExpression:
      return inferCondExpression(ast, env, constant);
    case ASTTypes.UnaryExpression:
      return inferUnaryExpression(ast, env, constant);
    case ASTTypes.BinaryExpression:
      return inferBinaryExpression(ast, env, constant);
    case ASTTypes.LogicalExpression:
      return inferLogicalExpression(ast, env, constant);
    // default case

For inferring if expressions, we first infer the test. Then we use the test to narrow for each of the consequent and alternate branches, wrapping each in a thunk so inference only happens if it's needed.

If the test is known to be truthy or falsy, we infer only the appropriate branch. Otherwise, we unify the two branches.

TypeScript constructs a union type of the two branches if they produce different types, but I want to enforce some degree of similarity between the branches so we're unifying the two types instead of going with a simple union. That means the type of one branch must be the same as or a subtype of the other.

Here's inferIfExpression:

const inferIfExpression = (node, env, constant) => {
  const test = infer(node.test, env, constant);
  const consequent = () => infer(node.then, narrow(node.test, env, true));
  const alternate = () => infer(node.else, narrow(node.test, env, false));

  if (Type.isTruthy(test)) {
    return consequent();
  } else if (Type.isFalsy(test)) {
    return alternate();
  } else {
    return unify(consequent(), alternate());
  }
};

inferWhenExpression is a little bit strange. It returns the nil type no matter what. But we still infer a type for the test then loop over the body block and infer a type for each expression to make sure there are no inference errors:

const inferWhenExpression = (node, env, constant) => {
  const test = infer(node.test, env, constant);

  // if the test is falsy, this will never run
  if (!Type.isFalsy(test)) {
    // check expressions in body
    for (let expr of node.body) {
      infer(expr, narrow(node.test, env, true), constant);
    }
  }

  return Type.nil;
};

inferCondExpression loops over all the clauses. If one of the clause tests produces a known truthy type, we return the type of its expression. Otherwise we collect the types of all clauses, including the :else clause, and unify them all. That means the common subtype is produced as the type for the expression so, like with an if expression, the types for each branch need to be subtypes of each other.

const inferCondExpression = (node, env, constant) => {
  let types = [];

  for (let clause of node.clauses) {
    const test = infer(clause.test, env, constant);
    const type = infer(
      clause.expression,
      narrow(clause.test, env, true),
      constant
    );

    if (Type.isTruthy(test)) {
      // the first truthy condition type will trigger the clause's expression
      return type;
    }

    types.push(type);
  }

  const elseType = infer(node.else, env, constant);

  return unifyAll(...types, elseType);
};

Since not is the only unary expression, that's the only case inferUnaryExpression handles. If the type of its operand is truthy or falsy it returns a singleton type of the opposite boolean. Otherwise, if it can't be determined at compile time, it returns the type boolean.

const inferUnaryExpression = (node, env, constant) => {
  const operand = infer(node.operand, env, constant);

  return Type.map(operand, (operand) => {
    switch (node.op) {
      case "not":
        if (Type.isTruthy(operand)) {
          return Type.singleton("Boolean", "false");
        } else if (Type.isFalsy(operand)) {
          return Type.singleton("Boolean", true);
        } else {
          return Type.boolean;
        }

      default:
        throw new Exception(`Unknown unary operator ${node.op}`);
    }
  });
};

inferBinaryExpression is similar, but it has 2 cases to handle and each case has 2 operands:

const inferBinaryExpression = (node, env, constant) => {
  const left = infer(node.left, env, constant);
  const right = infer(node.right, env, constant);

  return Type.map(left, right, (left, right) => {
    switch (node.op) {
      case "equal?":
        if (Type.isSingleton(left) && Type.isSingleton(right)) {
          return Type.singleton(left.value === right.value ? "true" : "false");
        } else {
          return Type.boolean;
        }

      case "not-equal?":
        if (Type.isSingleton(left) && Type.isSingleton(right)) {
          return Type.singleton(left.value !== right.value ? "true" : "false");
        } else {
          return Type.boolean;
        }

      default:
        throw new Exception(`Unknown binary operator ${node.op}`);
    }
  });
};

Finally, with inferLogicalExpression we can make assumptions to narrow the types based on the truthiness or falsiness of the left operand. For and, we unify the right with the result of narrowing the left type with the falsy type union, and for or we do the opposite:

const inferLogicalExpression = (node, env, constant) => {
  const left = infer(node.left, env, constant);
  const right = () => infer(node.right, env, constant);

  switch (node.op) {
    case "and":
      if (Type.isFalsy(left)) {
        return left;
      } else if (Type.isTruthy(left)) {
        return right();
      } else {
        return unify(narrowType(left, Type.falsy), right());
      }

    case "or":
      if (Type.isTruthy(left)) {
        return left;
      } else if (Type.isFalsy(left)) {
        return right();
      } else {
        return unify(narrowType(left, Type.truthy), right());
      }

    default:
      throw new Exception(`Unknown logical operator ${node.op}`);
  }
};

Checking Against If and Cond Expressions

We need to add cases to the check function in src/typechecker/check.js for if and cond expressions:

  // other cases
  if (ast.kind === ASTTypes.IfExpression) {
    return checkIfExpression(ast, type, env);
  }

  if (ast.kind === ASTTypes.CondExpression) {
    return checkCondExpression(ast, type, env);
  }
  // const inferredType etc...

The checking functions work similiarly to their respective inference functions:

const checkIfExpression = (ast, type, env) => {
  const test = infer(ast.test, env);
  const consequent = () => check(ast.then, type, narrow(ast.test, env, true));
  const alternate = () => check(ast.else, type, narrow(ast.test, env, false));

  if (Type.isTruthy(test)) {
    consequent();
  } else if (Type.isFalsy(test)) {
    alternate();
  } else {
    consequent();
    alternate();
  }
};

const checkCondExpression = (ast, type, env) => {
  for (let clause of ast.clauses) {
    const test = infer(clause.test, env);
    check(clause.expression, type, narrow(ast.test, env, true));

    if (Type.isTruthy(test)) {
      // we can stop here if test is truthy
      return;
    }
  }

  check(ast.else, type, env);
};

Checking with The Type Checker

First we need to add the relevant cases to the switch statement in checkNode in src/typechecker/TypeChecker.js:

      // other cases
      case ASTTypes.UnaryExpression:
        return this.checkUnaryExpression(node, env);
      case ASTTypes.BinaryExpression:
        return this.checkBinaryExpression(node, env);
      case ASTTypes.LogicalExpression:
        return this.checkLogicalExpression(node, env);
      case ASTTypes.IfExpression:
        return this.checkIfExpression(node, env);
      case ASTTypes.CondExpression:
        return this.checkCondExpression(node, env);
      case ASTTypes.WhenExpression:
        return this.checkWhenExpression(node, env);
      // default case

checkUnaryExpression checks the operand then returns a new node annotated with the inferred node type:

  checkUnaryExpression(node, env) {
    const operand = this.checkNode(node.operand, env);
    return { ...node, operand, type: infer(node, env) };
  }

checkBinaryExpression does the same thing except with left and right operands:

  checkBinaryExpression(node, env) {
    const left = this.checkNode(node.left, env);
    const right = this.checkNode(node.right, env);
    return { ...node, left, right, type: infer(node, env) };
  }

checkLogicalExpression is the same as checkBinaryExpression:

  checkLogicalExpression(node, env) {
    const left = this.checkNode(node.left, env);
    const right = this.checkNode(node.right, env);
    return { ...node, left, right, type: infer(node, env) };
  }

`checkIfExpression is similar, but with all 3 expressions:

`js
checkIfExpression(node, env) {
const test = this.checkNode(node.test, env);
const then = this.checkNode(node.then, env);
const elseBranch = this.checkNode(node.else, env);

return { ...node, test, then, else: elseBranch, type: infer(node, env) };

}
`

The checkCondExpression method loops over the node's clauses and checks each test and expression, then checks the else branch and returns the fully checked node:

`js
checkCondExpression(node, env) {
/** @type {TypedAST[]} */
let clauses = [];

for (let clause of node.clauses) {
  const test = this.checkNode(clause.test, env);
  const expression = this.checkNode(clause.expression, env);

  clauses.push({ test, expression });
}

const elseBranch = this.checkNode(node.else, env);

return { ...node, clauses, else: elseBranch, type: infer(node, env) };

}
`

Finally, checkWhenExpression checks the test, then checks each node of the body, then returns the annotated node:

`js
checkWhenExpression(node, env) {
if (!node.env) {
node.env = env.extend("WhenExpression");
}

const whenEnv = node.env;
const test = this.checkNode(node.test, env);
let body = [];

for (let expr of node.body) {
  body.push(this.checkNode(expr, whenEnv));
}

return { ...node, test, body, type: infer(node, whenEnv) };

}
`

And that's it for changes to the type checker!

Changes to The Visitor

We need to add new cases and methods for each of the new AST nodes in src/visitor/Visitor.js. First, let's add the cases to the switch statement in visit:

js // other cases... case ASTTypes.IfExpression: return this.visitIfExpression(node); case ASTTypes.CondExpression: return this.visitCondExpression(node); case ASTTypes.WhenExpression: return this.visitWhenExpression(node); case ASTTypes.UnaryExpression: return this.visitUnaryExpression(node); case ASTTypes.BinaryExpression: return this.visitBinaryExpression(node); case ASTTypes.LogicalExpression: return this.visitLogicalExpression(node); // default case

Here's visitIfExpression:

`js
visitIfExpression(node) {
const test = this.visit(node.test);
const then = this.visit(node.then);
const elseBranch = this.visit(node.else);

return { ...node, test, then, else: elseBranch };

}
`

and visitCondExpression:

`js
visitCondExpression(node) {
let clauses = [];

for (let clause of node.clauses) {
  const test = this.visit(clause.test);
  const expression = this.visit(clause.expression);

  clauses.push({ test, expression });
}

const elseBranch = this.visit(node.else);

return { ...node, clauses, else: elseBranch };

}
`

and visitWhenExpression:

`js
visitWhenExpression(node) {
/** @type {AST[]} */
let body = [];
const test = this.visit(node.test);

for (let expr of node.body) {
  body.push(this.visit(expr));
}

return { ...node, test, body };

}
`

and visitUnaryExpression:

js visitUnaryExpression(node) { const operand = this.visit(node.operand); return { ...node, operand }; }

and visitBinaryExpression:

`js
visitBinaryExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);

return { ...node, left, right };

}
`

and, finally, visitLogicalExpression:

`js
visitLogicalExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);

return { ...node, left, right };

}
`

With the new Visitor methods in place, let's look at the desugarer.

Changes to The Desugarer

We need to desugar binary expressions, unary expressions, and cond expressions, so we need methods for each of them in src/desugarer/Desugarer.js. The first two are simple.

Here's visitBinaryExpression:

`js
visitBinaryExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);
const func = AST.Symbol(Token.new(TokenTypes.Symbol, node.op, node.srcloc));

return AST.CallExpression(func, [left, right], node.srcloc);

}
`

and visitUnaryExpression:

`js
visitUnaryExpression(node) {
const operand = this.visit(node.operand);

return AST.CallExpression(
  AST.Symbol(Token.new(TokenTypes.Symbol, node.op, node.srcloc)),
  [operand],
  node.srcloc
);

}
`

visitCondExpression is a bit more complicated. We're going to create a recursive helper function called condVisitHelper that takes in an array of clauses and the accumulated nested IfExpression node.

If the clauses array has more than one element, we return the accumulated IfExpression (or, if it's the first time calling the function, we create it) with a recursive call to condVisitHelper in the else branch position.

Otherwise, if clauses has one element, we return the base case which is an IfExpression node with the else branch from the original CondExpression node in the else branch position.

Here's visitCondExpression:

`js
visitCondExpression(node) {
const srcloc = node.srcloc;

/**
 * Recursive helper to construct IfExpression from CondExpression
 * @param {import("../parser/ast.js").CondClause[]} clauses
 * @param {import("../parser/ast.js").IfExpression} accum
 */
const condVisitHelper = (clauses, accum) => {
  if (clauses.length > 1) {
    const clause = clauses[0];
    accum = AST.IfExpression(
      clause.test,
      clause.expression,
      condVisitHelper(clauses.slice(1), accum),
      srcloc
    );

    return accum;
  } else if (clauses.length === 1) {
    const clause = clauses[0];
    const elseBranch = node.else;

    return AST.IfExpression(
      clause.test,
      clause.expression,
      elseBranch,
      srcloc
    );
  }
};

return condVisitHelper(node.clauses, {});

}
`

That's it for changes to the desugarer. Now let's look at changes to the emitter.

Changes to The Emitter

Since we're desugaring away 3 of the new AST node types, we only need emitter methods for IfExpression, WhenExpression, and LogicalExpression nodes.

First let's add the cases to the switch statement in emit in src/emitter/Emitter.js:

js // other cases... case ASTTypes.LogicalExpression: return this.emitLogicalExpression(node, ns); case ASTTypes.IfExpression: return this.emitIfExpression(node, ns); case ASTTypes.WhenExpression: return this.emitWhenExpression(node, ns); // default case

emitLogicalExpression switches on the operator and calls a helper method depending on which operator the node has:

`js
emitLogicalExpression(node, ns) {
switch (node.op) {
case "and":
return this.emitLogicalAnd(node, ns);

  case "or":
    return this.emitLogicalOr(node, ns);

  default:
    throw new Exception(`Unknown logical operator ${node.op}`);
}

}

emitLogicalAnd(node, ns) {
const left = this.emit(node.left, ns);
const right = this.emit(node.right, ns);

return `rt.isTruthy(${left}) ? ${right} : ${left}`;

}

emitLogicalOr(node, ns) {
const left = this.emit(node.left, ns);
const right = this.emit(node.right, ns);

return `rt.isFalsy(${left}) ? ${left} : ${right}`;

}
`

We have to compile to a ternary expression and use rt.isTruthy and rt.isFalsy since the runtime semantics of Wanda are slightly different from those of JavaScript with regards to the truthiness and falsiness of values.

emitIfExpression emits a ternary expression:

js emitIfExpression(node, ns) { returnrt.isTruthy(${this.emit(node.test, ns)}) ? ${this.emit(
node.then,
ns
)} : ${this.emit(node.else, ns)}; }

Since a when expression is imperative, it returns nil instead of a type based on the expression itself. Because of that, we need to emit a little bit of ternary expression magic to cover the case where the test is falsy so the body code is not executed. We wrap the consequent branch of the ternary statement in an IIFE so we can execute its contents and then return nil from the expression body. Here's emitWhenExpression:

js emitWhenExpression(node, ns) { const whenNs = ns.extend("WhenExpression"); let code =(rt.isTruthy(${this.emit(node.test, ns)}))
? (() => {\n`;

for (let expr of node.body) {
  code += `    ${this.emit(expr, whenNs)};\n`;
}

code += "    return null";
code += "  })()\n";
code += "  : null\n";
return code;

}
`

That's all the changes to the emitter, so let's look at printing the new AST nodes.

Changes to The Printer

We don't have any new types of values to print, but we do have 6 new node types.

First, let's add them to be dispatched from the switch statement in print in src/printer/printAST.js:

js // other cases... case ASTTypes.IfExpression: return this.printIfExpression(node, indent); case ASTTypes.CondExpression: return this.printCondExpression(node, indent); case ASTTypes.WhenExpression: return this.printWhenExpression(node, indent); case ASTTypes.BinaryExpression: return this.printBinaryExpression(node, indent); case ASTTypes.LogicalExpression: return this.printLogicalExpression(node, indent); case ASTTypes.UnaryExpression: return this.printUnaryExpression(node, indent); // default case

The methods themselves are pretty simple, so I won't belabor the point. Here they are:

js printIfExpression(node, indent) { let prStr =${prIndent(indent)}IfExpression:\n; prStr +=${prIndent(indent + 2)}Test:\n; prStr += this.print(node.test, indent + 4) + "\n"; prStr +=${prIndent(indent + 2)}Then:\n+ "\n"; prStr += this.print(node.then, indent + 4) + "\n"; prStr +=${prIndent(indent + 2)}Else:\n`;
prStr += this.print(node.else, indent + 4) + "\n";

return prStr;

}

printCondExpression(node, indent) {
let prStr = ${prIndent(indent)}CondExpression:\n;

for (let clause of node.clauses) {
  prStr += `${prIndent(indent + 2)}Test:\n`;
  prStr += this.print(clause.test, indent + 4) + "\n";
  prStr += `${prIndent(indent + 2)}Expression:\n`;
  prStr += this.print(clause.expression, indent + 4) + "\n";
}

prStr += `${prIndent(indent + 2)}Else:\n`;
prStr += this.print(node.else, indent + 4) + "\n";

return prStr;

}

printWhenExpression(node, indent) {
let prStr = ${prIndent(indent)}WhenExpression:\n;
prStr += ${prIndent(indent + 2)}Test:\n;
prStr += this.print(node.test, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Body:;

for (let expr of node.body) {
  prStr += this.print(expr, indent + 4) + "\n";
}

return prStr;

}

printBinaryExpression(node, indent) {
let prStr = ${prIndent(indent)}BinaryExpression:\n;
prStr += ${prIndent(indent + 2)}Left:\n;
prStr += this.print(node.left, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Operator:\n;
prStr += ${prIndent(indent + 4)}${node.op}\n;
prStr += ${prIndent(indent + 2)}Right:\n;
prStr += this.print(node.right, indent + 4) + "\n";

return prStr;

}

printLogicalExpression(node, indent) {
let prStr = ${prIndent(indent)}LogicalExpression:\n;
prStr += ${prIndent(indent + 2)}Left:\n;
prStr += this.print(node.left, indent + 4) + "\n";
prStr += ${prIndent(indent + 2)}Operator:\n;
prStr += ${prIndent(indent + 4)}${node.op}\n;
prStr += ${prIndent(indent + 2)}Right:\n;
prStr += this.print(node.right, indent + 4) + "\n";

return prStr;

}

printUnaryExpression(node, indent) {
let prStr = ${prIndent(indent)}UnaryExpression:\n;
prStr += ${prIndent(indent + 2)}Operator:\n;
prStr += ${prIndent(indent + 4)}${node.op}\n;
prStr += ${prIndent(indent + 2)}Operand:\n;
prStr += this.print(node.operand, indent + 4) + "\n";

return prStr;

}
`

That's all the changes to the printer. Now we're going to make some changes to the CLI to make it more flexible.

Changes to The CLI

We're going to make 3 major changes to the CLI:

Allow running a Wanda file directly from the command line (compiling and executing it in one step)
Interactively load a Wanda file as a script and use its definitions in the REPL
Output a compiled file and optionally build the whole bundle

In the case of a compiled file, you have the option to either use ES2015 imports to import the global library or to use ESBuild to bundle everything into one giant JavaScript file.

Changes to The REPL

First, we're going to change the arguments the REPL function takes. Instead of a single string argument mode, we're going to give it an options object with properties mode (string) and path (string). Everything has defaults, so it can still be called simply by running the bare function with no arguments.

The way mode works isn't changing. We will, however, enable the ability to turn off printing the AST from within the REPL if you've enabled it. Not doing that sooner was a silly oversight on my part.

The big change is if there's a value for the path option it will grab the contents of that file, compile it, and execute the result in the global environment so you'll have access to any toplevel definitions in that file within the REPL.

Here's the new version of the repl function in src/cli/repl.js:

`js
import os from "os";
import vm from "vm";
import fs from "fs";
import readlineSync from "readline-sync";
import { pprintAST, pprintDesugaredAST } from "./pprint.js";
import { println } from "../printer/println.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";
import { build } from "./build.js";
import { compile } from "./compile.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";
import { countIndent, inputFinished } from "./utils.js";

const read = (prompt) => readlineSync.question(prompt);

export const repl = ({ mode = "repl", path = "" } = {}) => {
// Create global compile environment
const compileEnv = makeGlobalNameMap();
const typeEnv = makeGlobalTypeEnv();

// Build global module and instantiate in REPL context
// This should make all compiled global symbols available
const globalCode = build(emitGlobalEnv());
vm.runInThisContext(globalCode);

if (path) {
// load file in REPL interactively
const fileContents = fs.readFileSync(path, { encoding: "utf-8" });
const compiledFile = compile(fileContents, path, compileEnv, typeEnv);
vm.runInThisContext(compiledFile);
}

let prompt = "wanda> ";
let input = "";
let indent = 0;

while (true) {
try {
input += read(prompt + " ".repeat(indent));

  switch (input) {
    // If it's a command, execute it
    case ":quit":
      process.exit(0);
    case ":print-ast":
      mode = "printAST";
      input = "";
      break;
    case ":print-desugared":
      mode = "printDesugared";
      input = "";
      break;
    case ":no-print-ast":
      mode = "repl";
      input = "";
      break;
    // If it's code, compile and run it
    default:
      if (inputFinished(input)) {
        let compiled = compile(input, "stdin", compileEnv, typeEnv);
        let result = vm.runInThisContext(compiled);

        if (mode === "printAST") {
          console.log(pprintAST(input));
        } else if (mode === "printDesugared") {
          console.log(pprintDesugaredAST(input));
        }

        println(result);
        input = "";
        indent = 0;
      } else {
        input += os.EOL;
        indent = countIndent(input);
      }
  }
} catch (e) {
  console.error(e.stack);
  input = "";
  indent = 0;
}

}
};
`

Changes to The `run` Function

I've removed the print command from the run function, now if you want to print the AST from within the REPL you have to enable it inside the REPL. You can decide whether or not you want to do the same—it's all the same to me either way.

We need to add a function that can compile and run a Wanda file, and parse the run command to do it. I've also replaced throwing exceptions with logging an error message and passing 1 to process.exit to fall more in line with UNIX shell script conventions. Here's the new run function and the runFile function that goes with it in src/cli/run.js:

`js
import vm from "vm";
import fs from "fs";
import { join } from "path";
import { repl } from "./repl.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";
import { build } from "./build.js";
import { compile } from "./compile.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";

export const run = () => {
switch (process.argv[2]) {
case "-i":
if (!process.argv[3]) {
console.log(-i option requires file path as argument);
process.exit(1);
}
const path = join(process.cwd(), process.argv[3]);
repl({ path });
break;
case undefined:
case "repl":
return repl();
case "run":
if (!process.argv[3]) {
console.log(run command requires file path as argument);
process.exit(1);
}
return runFile(join(process.cwd(), process.argv[3]));
default:
console.log("Invalid command specified");
process.exit(1);
}
};

const runFile = (path) => {
const fileContents = fs.readFileSync(path, { encoding: "utf-8" });
const globalNs = makeGlobalNameMap();
const typeEnv = makeGlobalTypeEnv();
const globalCode = build(emitGlobalEnv());
const compiledCode = compile(fileContents, path, globalNs, typeEnv);

vm.runInThisContext(globalCode);
return vm.runInThisContext(compiledCode);
};
`

Compiling and Building Files

We also need the ability to compile Wanda files and output the results, with the optional step of building the whole program instead of just using ES2015 import statements for the global library.

We'll call the main function wandac after compilers like javac and rustc, and add an auxiliary compileFile function to handle the actual compilation. Here are those functions, in src/cli/wandac.js:

`js
import os from "os";
import fs from "fs";
import { join, basename } from "path";
import { compile } from "./compile.js";
import { build } from "./build.js";
import { makeGlobalNameMap } from "../runtime/makeGlobals.js";
import { makeGlobalTypeEnv } from "../typechecker/makeGlobalTypeEnv.js";
import { emitGlobalEnv } from "../emitter/emitGlobalEnv.js";

const compileFile = (path) => {
const contents = fs.readFileSync(path, { encoding: "utf-8" });
return compile(contents, path, makeGlobalNameMap(), makeGlobalTypeEnv());
};

export const wandac = () => {
if (!process.argv[2]) {
console.log(wandac requires either a file path or command argument);
process.exit(1);
}

switch (process.argv[2]) {
case "build": {
const pathname = join(process.cwd(), process.argv[3]);
const compiledFile = compileFile(pathname);
const globals = emitGlobalEnv();
const code = globals + os.EOL + os.EOL + compiledFile;
const bName = basename(pathname).split(".")[0];
const outfile = bName + ".build" + ".js";
const built = build(code, outfile, bName);

  fs.writeFileSync(outfile, built, { encoding: "utf-8" });
  break;
}
default: {
  // should be a file path
  const pathname = join(process.cwd(), process.argv[2]);
  const compiledFile = compileFile(pathname);
  const globals = emitGlobalEnv();
  const code = globals + os.EOL + os.EOL + compiledFile;
  const outfile = basename(pathname).split(".")[0] + ".js";

  fs.writeFileSync(outfile, code, { encoding: "utf-8" });
  break;
}

}
};
`

Now let's create a new Node shell script in the bin directory (not src/bin) and link it up to run as a global system command. First, add this to bin/wandac.js:

`js

!/usr/bin/env node

import { wandac } from "../src/cli/wandac.js";

wandac();
`

Next, we need to make a slight change to package.json to account for the fact that there are now 2 executable scripts. Change

json "bin": "bin/wanda.js",

to:

json "bin": { "wanda": "bin/wanda.js", "wandac": "bin/wandac.js" },

Now make sure you're in the compiler's root directory and run the command npm link. Now you should have both wanda and wandac available as global shell commands.

Example Files

Finally, we need a couple of example files to load into the REPL and play with! I've created a couple of simple ones just to get you started.

I'm using .wanda as the file extension.

Create a new directory in the compiler root called examples and add the following 2 files:

examples/inc.wanda:

clojure (def inc (x: number) -> number (+ x 1))

examples/fib.wanda:

clojure (def fib (n: number) -> number (if (= n 1) n (* n (fib (- n 1)))))

Now if you run the command wanda -i examples/inc.wanda you can run the inc function from within the REPL! You can also compile them if you want to see the output.

Conclusion

We've made some major strides today! While our language was technically Turing complete with just functions (it's amazing what you can do with just functions), it's a lot easier to consider it so now that we have branching enabled through conditional expressions. Recursion allows for repeating code, so the combination of conditionals and functions makes the language fully Turing complete.

The CLI is also now much more fully functional compared to just being able to run a simple REPL.

Please feel free to add your own examples. If you come up with anything cool, by all means send a pull request and I'll incorporate it into the examples I've got!

As always, you can find the code as of this point on the GitHub repo under the relevant tag.

Next time we're going to add 2 iterative constructs: loop and for. I'll see you then!

DEV Community

Create Your Own Programming Language 8: Conditionals

Changes to The Lexer and Reader

Changes to The Core Library

Changes to The Parser

Changes to The Type Checker

Adding The Negation Type

Narrowing

Inference

Checking Against If and Cond Expressions

Checking with The Type Checker

Changes to The Visitor

Changes to The Desugarer

Changes to The Emitter

Changes to The Printer

Changes to The CLI

Changes to The REPL

Changes to The `run` Function

Compiling and Building Files

!/usr/bin/env node

Example Files

Conclusion

Top comments (0)

Changes to The Lexer and Reader

Changes to The Core Library

Changes to The Parser

Changes to The Type Checker

Adding The Negation Type

Narrowing

Inference

Checking Against If and Cond Expressions

Checking with The Type Checker

Changes to The Visitor

Changes to The Desugarer

Changes to The Emitter

Changes to The Printer

Changes to The CLI

Changes to The REPL

Changes to The run Function

Compiling and Building Files

!/usr/bin/env node

Example Files

Conclusion

Changes to The `run` Function