DEV Community

Cover image for Simulating SIMD in JS
Tracy Gilmore
Tracy Gilmore

Posted on • Edited on

Simulating SIMD in JS

SIMD is typically a feature of the processor as it is down at the hardware level where the greatest performance benefit can be gained. However, I think there remains benefit in a software implementation even in a relatively slow-running language like, my beloved, JavaScript.

So what is SIMD?

Wikipedia has it defined as:

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

Now you might have heard that JS is a single-threaded language but we can still run code, almost, concurrently using asynchronous techniques like promises and async/await, to leverage some performance gains.

Disclaimers

  1. We will not be simulating the hardware implementation faithfully, instead we will be simulating the concept of Single-Instruction-Multiple-Data. If you want to investigate SIMD on the web you might want to look at:

    • The TC39 proposal for SIMD.js, which has recently been relegated from stage three to stage two of their process because "SIMD operations exposed to the web are under active development within WebAssembly".
    • The work of John McCutchan from 2013 on "Bringing SIMD to the web via Dart".
  2. JavaScript is rarely a good choice for computationally intensive operations but sometimes there is no alternative. Processing data in the main thread should always be avoided to prevent it disrupting the User eXperience (UX). That is, if computation needs to be performed in the browser, an alternative to the main thread should be employed.

From a performance perspective, general recommendations regarding computationally intensive operations might be as follows:

  • Don't write it in JavaScript but use a lower-level language; there are loads to choose from.
  • Don't perform such operations in the browser but move it to the backend/server. But if you have use the browser consider using WASM (web assembly) to improve performance in the browser.
  • Don't perform such operations in the primary thread as this could impact UX. Consider using a Web Worker instead.
  • Don't perform such operations synchronously as this will almost certainly impact UX, use asynchronous techniques instead, although this might make the code slightly more complicated.

So why do this?

There are two main reasons for trying to simulate the concept of SIMD in JS, which, as I stated earlier, is more a hardware feature.

  1. Academic interest. There is much to be learned from such an exercise including:

  2. Also there is utility in implementing the concept of SIMD in software; even in the browser, in JS, in the main thread, but using asynchronous techniques.

If you are still interested, the source code for this post can be found in my GitHub repo.

The technical use case

We want to take a regular function and apply it to data through a concurrent operation. We want to be able to execute the SIMD version of the function whenever we want, as many times as needed, and with as much data as we have available. We also want to prepare the SIMD version of the function once before use and have it apply the original function to all supplied data at, nearly, the same time. The SIMD'ed function will return an array containing the results of all the operations once all have complete, irrespective of the success/failure of any given operation.

Partial application

For my implementation, given a base function, I will generate a 'SIMD' version of the function. This is a technique call partial-application where execution is split into two or more calls, each being provided with more arguments. In the first call we supply the base function and in the second (and subsequent) call we supply the data (multiple data items) to be processed.

The first call returns a reusable function. The second performs the computation returning the results as an array.

function simd(instruction) {
  return function (...data) {
    // Implementation to follow
  };
}

// Usage
const simdFn = simd(fn);

const results1 = simdFn( datum1, datum2, datum3);

const results2 = simdFn( datum4, datum5);
Enter fullscreen mode Exit fullscreen mode

Implementation using Promises and async/await

The simdFn function illustrated above would be asynchronous so in order to obtain the array of results instead of an array of promises the calls would have to be awaited as follows.

const results1 = await simdFn( datum1, datum2, datum3);

const results2 = await simdFn( datum4, datum5);
Enter fullscreen mode Exit fullscreen mode

Now let us look 'under the hood'.

function simd(instruction) {
  return function (...data) {
    const executions = data.map(
      datum =>
        new Promise(resolve =>
          resolve(instruction(datum)))
        )
    );
    return Promise.allSettled(executions);
  };
}
Enter fullscreen mode Exit fullscreen mode

As stated earlier, the simd function will take in the base function and return a new 'simd-ised' version (aka simdFn). The simdFn function can be called several times as required, with as much or as little data as needs, given as individual arguments. The arguments will be bound into a single data array using the rest parameter.

Execution of the 'simd' operations is performed using the map method over the data array where a new Promise is created and returned in the output array.

A try/catch within the Promise is used to protect the 'simd' operation and will execute the resolve or reject call-back functions accordingly.

We then await completion of the last function call to compile an array of results using the allSettled method. The returned array will contain an object for each datum. Each object will contain a status property with a value of 'fulfilled' if execution of the base function with the given datum was successful, or 'rejected' if execution resulted in an exception being thrown.

Fulfilled executions will be accompanied by a value property containing the result. Rejected execution objects will contain a reason property reporting the error.

For a demonstration of its use, may I kindly invite you to clone my repo, navigate to the HTML file in a browser and view the console log.

Conclusion

This implementation is based on the concept of SIMD rather than the specific nature of the hardware definition. Whilst it is interesting from an educational perspective alone, I also think there is potential for it to be of use in production code.

One consideration: If the array of source data becomes quite large and/or the computation too intensive, creating all the promises in one go could have a detrimental effect on resources, thus slowing down the very thing we are trying to optimise. It might be necessary/advisable to define a limit to the number of data items processed and batch the execution.

I would welcome any questions or comments about this post in the discussion form below. Thank you for reading.

Update April 1st, 2024 (no fooling)

In response to a comment I received on this topic, I had previously confused the term parallel with concurrent in the context of processing models. I have, hopefully, now corrected the error in this and the subsequent article but it also raises an interesting point for this article.

Whilst JS is, by default, single threaded, especially in the browser where it is primarily concerned with updating the DOM and responding to user actions, there is a Web API called Web Workers that can provide additional threads of execution.

Web Worker enable computationally intensive activity to be conducted away from the main thread, which reduces the potential impact on the user experience by having the browser blocking interaction. I would advise that the architecture of an application should first try to relocate such processing to the server but this might be a reasonable trade-off in some circumstances.

In order to make use of Web Workers in a SIMD fashion, it would be necessary to replace the function parameter with the URL of a JS script, prepared in a particular way. The SIMD wrapper could then fame out the work over a number of threads. This would utilise more than one processor on the machine, adopting a more parallel module of execution rather than concurrent.

If you would like an article (and code) demonstrating how this could be achieved, please leave a comment below.

Top comments (1)

Collapse
 
tracygjg profile image
Tracy Gilmore • Edited

If you found this post of interest you might also be interested in the next post in this series on "Streaming SIMD in JS" that is inspired by SSE see this Wikipedia article for details.