Functions vs. Procedures: Keep them separate.

#functional #architecture #computerscience #javascript

Did you know that keeping functions and procedures separate preserves functional purity (freedom from side effects), and in some languages also allows compiler optimizations?

"We figured out how to combine in a single programming metaphor effectful computations and effect-free ones, without making them pollute each other. The type system keeps them apart." -- Simon Peyton Jones, creator of Haskell, in this lucid interview from Dec 2011.

But could we achieve the same thing (avoid side-effect pollution) without relying on the type system at all? Could we even do it in JavaScript/TypeScript, and without Haskell-style wizardry like effect-ts?

The definitions

A procedure is a named block of code in a program that performs a specific task or a series of tasks. It is typically used to group related code together and make it easier to read and maintain. Procedures may take arguments as input, but they do not return a value to the caller. They typically affect the environment outside of the program (I/O, like sending requests, writing to disk, etc.).

A function is a named block of code in a program that performs a specific computation or calculation and returns a value to the caller. It takes one or more arguments as input and uses them to calculate the return value. Functions can be used as part of expressions, assigned to variables, or passed as arguments to other functions or procedures.

Some languages may use different terminology, such as methods, subroutines, or subprograms, but the basic distinction between functions and procedures remains the same.
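
As a minimal TypeScript sketch of the two definitions (the names addVat and logTotal are made up for illustration and do not come from any library):

```typescript
// A function: computes a value from its arguments and returns it.
// It reads nothing and changes nothing outside itself.
const addVat = (netPrice: number, vatRate: number): number =>
  netPrice * (1 + vatRate);

// A procedure: performs a task for its effect and returns nothing.
// Here the effect is writing to the console (I/O).
const logTotal = (total: number): void => {
  console.log(`Total: ${total}`);
};

// A function call can be used inside an expression ...
const orderTotal = addVat(100, 0.25) + addVat(200, 0.25);

// ... while a procedure is called as a statement, purely for its effect.
logTotal(orderTotal);
```

Note that nothing in the language stops logTotal from being called inside addVat; in TypeScript the distinction is only visible in the return type.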

The problem

Many languages merge the two concepts and implement procedures as functions that return void. This blurs (complects) the distinction and makes it easy to call procedures from within functions, which turns those functions impure: they now affect the world outside themselves through side effects such as I/O or state mutation. Avoid this, especially if you care about debuggability and a Functional Core, Imperative Shell architecture (see Gary Bernhardt's Boundaries talk at 31:56), which makes your system easier to test without mocking.
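
To make the problem concrete, here is a small hedged sketch. All names in it (CartItem, auditLog, cartTotalImpure, cartTotal, reportCartTotal) are hypothetical and exist only for this illustration:

```typescript
type CartItem = { name: string; price: number };

// Hypothetical procedure: it exists only for its effect (here, logging).
const auditLog = (message: string): void => {
  console.log(`[audit] ${message}`);
};

// Looks like a function, but calling the procedure makes it impure:
// every call also writes to the log, so a test must capture or mock that.
const cartTotalImpure = (items: CartItem[]): number => {
  const sum = items.reduce((acc, item) => acc + item.price, 0);
  auditLog(`cart total computed: ${sum}`);
  return sum;
};

// Separated: the calculation stays a pure function ...
const cartTotal = (items: CartItem[]): number =>
  items.reduce((acc, item) => acc + item.price, 0);

// ... and the procedure that needs the effect calls the function,
// never the other way around.
const reportCartTotal = (items: CartItem[]): void => {
  auditLog(`cart total computed: ${cartTotal(items)}`);
};
```

Both versions compute the same number; the difference is that cartTotal can be called anywhere, any number of times, without anything observable happening.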

A history lesson

Pascal kept functions and procedures distinct from one another.

When did programming languages start to allow functions and procedures to be mixed together? In particular, when did they start to allow calling procedures from within functions (which causes the now much dreaded side effects, and so loses functional purity and referential transparency)?

Procedures can be called from within functions in many programming languages, and this has been possible for a long time. In fact, this capability has been a fundamental part of many programming languages since their inception.

For example, in the C programming language, it's possible to call a function from within another function, and a function can also call a void function (i.e., a procedure) if needed. This feature has been present in C since its early versions, which were developed in the 1970s.

Similarly, in the Pascal programming language, which was developed in the late 1960s and early 1970s, procedures and functions are distinct but can be mixed together. A function can call a procedure, and a procedure can call a function, as long as the callee is declared (or forward-declared) before the caller.

Many other programming languages, including modern ones like Python, Java, and C++, also allow procedures and functions to call each other. So there was no single point in time when languages started to allow procedures to be called from within functions: the capability has been around since the early days of programming languages.

Why did Pascal treat procedures and functions as distinct then?

By treating procedures and functions as distinct, Pascal allows programmers to clearly indicate the purpose of each subprogram and use them in a way that reflects their intended purpose. This can make programs easier to read, understand, and maintain.

Moreover, Pascal was developed in an era when computer resources were relatively limited, and there was a need for a programming language that was efficient and precise. Knowing up front whether a subprogram returns a value tells the compiler what calling code to generate for it. Note, though, that the distinction is syntactic: Pascal did not enforce purity, so a Pascal function could still cause side effects.

Overall, the decision to treat procedures and functions as distinct in Pascal was a design choice made to promote clarity, efficiency, and precision in programming.

How did it allow the Pascal compiler to optimize the code for maximum efficiency?

The Pascal compiler can optimize the code for maximum efficiency by taking advantage of the fact that procedures and functions have distinct properties and usage patterns. Here are some examples of how this optimization can be done:

Stack management: For both kinds of subprogram, a call places the parameters and the return address on the stack (or in registers), and they are removed again when the subprogram returns. A function additionally needs a place for its result. A procedure has no result, so the compiler does not need to reserve one.
Return value handling: When a function returns a value, the compiler generates code to store the return value in a specific location, such as a register or memory location, and code in the caller to read it. A procedure has no return value, so the compiler can omit those instructions.
Memory allocation: Both functions and procedures need memory for local variables and temporaries, typically in a stack frame. Apart from the result slot of a function, the allocation strategy is essentially the same for both.
Inlining: In some cases, the compiler can replace a call with the actual code of the subprogram, a technique called inlining. This can eliminate the overhead of the call and result in faster code. It applies to procedures as well as functions, and it is not always desirable, as it can increase code size.

Overall, by treating procedures and functions as distinct, the Pascal compiler can skip result handling for procedures. These gains are modest. The bigger optimizations, such as reusing the result of a repeated call, require the compiler to know that a function is pure, which Pascal's syntax alone does not guarantee.

What can we gain from this in JavaScript?

JavaScript has no separate procedure construct: a procedure is simply a function that returns nothing (undefined at runtime, typed as void in TypeScript). It is advisable to keep procedures and regular functions (that do return a result) distinct, and to avoid calling procedures from within functions (because procedures perform side effects like I/O or mutation, and are thus impure since they affect the world outside of themselves).

One neat way to treat functions and procedures as distinct in JavaScript is to represent procedures as async functions that return nothing (their type is Promise<void>). Normal synchronous functions are then reserved for calculations. That way, you can build your program out of hierarchies of procedures (async void functions), which are allowed to have hierarchies of normal sync functions within them. You effectively gain the Functional Core, Imperative Shell architecture which was so eloquently detailed by Gary Bernhardt in his Boundaries talk (at 31:56).

JavaScript even helps enforce part of this: await is only allowed inside async functions (and at the top level of modules), so a normal, sync function cannot wait for a procedure (async function) to finish or use its result. It can still call the procedure and ignore the returned promise, and it can still perform synchronous side effects, so the separation remains a convention. A lint rule such as typescript-eslint's no-floating-promises can flag calls whose promise is ignored.
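
A minimal sketch of that boundary, assuming nothing beyond standard async/await semantics. The names recordVisit, slugify and visitPost are hypothetical, and the effects array merely stands in for real I/O:

```typescript
// Stand-in for the outside world, so the sketch runs without real I/O.
const effects: string[] = [];

// Hypothetical procedure: async, returns Promise<void>, exists for its effect.
const recordVisit = async (page: string): Promise<void> => {
  effects.push(page); // imagine a network request or a disk write here
};

// A regular (sync) function cannot await the procedure. This is rejected
// before the program even runs:
//
//   const badSlugify = (title: string): string => {
//     await recordVisit(title); // error: await needs an async function
//     return title.toLowerCase();
//   };

// A sync caller CAN still call the procedure, but it only gets a Promise
// back and cannot wait for completion. Linters can flag this pattern.
const pendingVisit: Promise<void> = recordVisit("home");

// A pure function: a sync calculation with no effects.
const slugify = (title: string): string =>
  title.trim().toLowerCase().replace(/\s+/g, "-");

// An async procedure may freely use pure functions, and await other procedures.
const visitPost = async (title: string): Promise<string> => {
  const slug = slugify(title);
  await recordVisit(slug);
  return slug;
};
```

The direction of the dependency is the point: procedures call functions, and the await keyword makes the reverse direction awkward enough to stand out in review.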

This quote sums it up quite nicely:

"I generally restrict my impure code to async functions. Hence, promises represent both asynchronous computations as well as impure computations, similar to the IO monad in Haskell. It works excellently because you can use pure computations in async functions but you can't directly use async computations in regular functions." -- Aadith M. Shah

Here is some JavaScript/TypeScript code illustrating this architecture:


// FUNCTIONAL CORE («Functions», which are pure and synchronous)

type BlogPost = {
  title: string;
  content: string;
  status: 'draft' | 'published';
};

const createDraftPost = (title: string, content: string): BlogPost => ({
  title,
  content,
  status: 'draft'
});

const publishPost = (post: BlogPost): BlogPost => ({
  ...post,
  status: 'published'
});

const formatPostForFile = (post: BlogPost): string => `# ${post.title}\n\n${post.content}`;

// Note: formatPostForFile does not write the status to the file,
// so every post parsed from a file comes back as a draft.
const parsePostFromFile = (fileContent: string): BlogPost | undefined => {
  const [title, ...contentLines] = fileContent.split('\n');
  const content = contentLines.join('\n').trim();
  return title.startsWith('# ') ? createDraftPost(title.slice(2), content) : undefined;
};

// IMPERATIVE SHELL («Procedures», which are async and have external effects, often called side-effects, but here they are explicit and expected)

import fs from 'fs/promises';
import path from 'path';

const postsDir = path.join(__dirname, 'posts');

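const savePostToFile = async (post: BlogPost): Promise<void> => {
  const fileName = `${post.title.replace(/\s/g, '-')}.md`;
  const filePath = path.join(postsDir, fileName);
  const fileContent = formatPostForFile(post);
  // Create the posts directory on first use; writeFile fails if it is missing.
  await fs.mkdir(postsDir, { recursive: true });
  await fs.writeFile(filePath, fileContent);
};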

const loadPostFromFile = async (fileName: string): Promise<BlogPost | undefined> => {
  const filePath = path.join(postsDir, fileName);
  try {
    const fileContent = await fs.readFile(filePath, 'utf-8');
    return parsePostFromFile(fileContent);
  } catch {
    return undefined;
  }
};

// Orchestration aka. Controller code:

const createAndPublishPost = async (title: string, content: string): Promise<void> => {
  const draftPost = createDraftPost(title, content);
  const publishedPost = publishPost(draftPost);
  await savePostToFile(publishedPost);
  console.log(`Published post: ${title}`);
};

const loadAndPublishPost = async (fileName: string): Promise<void> => {
  const post = await loadPostFromFile(fileName);
  if (post && post.status === 'draft') {
    const publishedPost = publishPost(post);
    await savePostToFile(publishedPost);
    console.log(`Published post: ${post.title}`);
  } else {
    // Parsed posts are always drafts (the file format stores no status),
    // so this branch means the file could not be read or had no "# " title line.
    console.log(`Post ${fileName} does not exist or could not be parsed.`);
  }
};

The Imperative Shell (savePostToFile and loadPostFromFile) relies on the Functional Core for data transformation and validation.
The orchestration functions (createAndPublishPost and loadAndPublishPost) are in the Imperative Shell, as they handle the application flow and side effects.

By moving the core logic into the Functional Core, we have made it easier to test and maintain. The Functional Core functions can be unit tested in isolation, without worrying about side effects or external dependencies.
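
As a hedged sketch of what such tests might look like, the block below copies the Functional Core from the code above so it runs on its own, and checks it with Node's built-in assert module. The choice of assert is an assumption made for illustration; any test runner would do.

```typescript
import { deepStrictEqual, strictEqual } from "node:assert";

// Functional Core, copied from the article so this sketch is self-contained.
type BlogPost = { title: string; content: string; status: "draft" | "published" };

const createDraftPost = (title: string, content: string): BlogPost =>
  ({ title, content, status: "draft" });
const publishPost = (post: BlogPost): BlogPost => ({ ...post, status: "published" });
const formatPostForFile = (post: BlogPost): string =>
  `# ${post.title}\n\n${post.content}`;
const parsePostFromFile = (fileContent: string): BlogPost | undefined => {
  const [title, ...contentLines] = fileContent.split("\n");
  const content = contentLines.join("\n").trim();
  return title.startsWith("# ") ? createDraftPost(title.slice(2), content) : undefined;
};

// Tests: plain values in, plain values out. No file system, no mocks.
const draft = createDraftPost("Hello", "First post");
strictEqual(draft.status, "draft");

const published = publishPost(draft);
strictEqual(published.status, "published");
strictEqual(draft.status, "draft"); // the input post was not mutated

strictEqual(formatPostForFile(draft), "# Hello\n\nFirst post");

// Formatting and then parsing a draft gives back an equal draft.
deepStrictEqual(parsePostFromFile(formatPostForFile(draft)), draft);

// Content without a "# " title line is rejected.
strictEqual(parsePostFromFile("no heading here"), undefined);
```

The procedures in the Imperative Shell still need integration tests against a real (or temporary) directory, but they contain little logic of their own.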

Magne • Jun 19

I just found this, which describes the same principle in other words (what we call Functions they call Mechanisms, and what we call Procedures they call Policies).

Mechanism vs. Policy

The basic idea is very simple: make a clear distinction between code that implements primitive functionality (mechanism), and code that implements application and business logic (policy).

Mechanism are things like parsing JSON, sending an email, or GUI rendering primitives;

Policy are things like a checkout flow, or a bulk data processing task that reads and writes from an S3 bucket, or the implementation of the look and feel of your application.