
Sensible Promises - Initial Considerations

Joey Hernández ・14 min read

This post concerns thoughts, notes and considerations for establishing a respectable set of promise implementations based on sound reasoning.

I intend this to provide background for some promise implementations and standards I will publish in a few weeks. Existing libraries will also be measured up against these ideas.

There is a distinct separation between theory and practice. Implementing promises as a concept (theory) has no meaning unless it is intended to address a problem. While the theory is not to be ignored, it must inevitably be weighed against relevant use cases.

A promise does not have an exact definition that I can find. It is one side of a coin, or one end of a fuse, the whole of which provides a mechanism for managing values with variable (dynamic) availability (presence, existence). A future is used to represent the other side. Whether or not the invocation of the action is included with that varies. For simplicity, it can be left out in most cases.

I do not give a damn about the pedantic matters around this because even the figureheads for it offer, as a simple and elegant explanation of a future and promise, a riddle so vague that it renders the concept meaningless for all practical purposes. As such the term is used here very loosely.

For the sake of simplicity the whole, rather than the parts, can be referred to as a promise in most cases. If a specific end needs to be referred to then terms such as producer/consumer, parent/child, write/read, set/get or master/slave can be relied upon depending on the context, whichever is most relevant to the situation being described. If something involves representing a value with variable availability then we just call that a promise in general.

Because the objectives are practical, what is referred to here as promises does not have to conform entirely to theory.

This sets out to achieve:

  • A mechanism to replace bare callbacks passed preemptively as parameters, behaving the same as far as is reasonably achievable while delivering additional capabilities.
  • Basic structures that can assist with implementing an await capability.
  • A set of protocols for describing promises to improve compatibility as opposed to resorting to a single potentially flawed or inadequate implementation.
  • Eventual combination with generators while avoiding full dependency for compatibility with older engines.

Minimalism

A very simple manifestation of a promise: []

This is the minimum or close to the minimum needed to sensibly represent a value with variable availability. Generally, once the value is available both its availability and the value itself are treated as immutable. This is not necessarily a strict requirement for all implementations but serves as a common starting point.

The value must be boxed as it has a dynamic element to it (availability). This requires one then two components. When the value is not available only the one component is needed as there is no value. When the value is available two components may be used. A single component cannot be used as it could not otherwise be ruled out from being the value itself. For example, [[]] is valid. If a single component were used it could go from [] to [] as it goes from being unavailable to available. This holds no meaning unless the references are tracked, and even then a promise can technically return itself (p = []; p[0] = p).

The exception to this is where the value and the status are effectively the same thing, a boolean to know if something finished or not. This may be without a value or may provide the value through another channel. If the boolean is to be passed around it may still need to be wrapped to be able to read the latest value or to update it. JS does not have pointers for scalars so instead wrapping a value in a type that does use pointers (references) allows passing a value by reference.
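The point about wrapping a boolean so it can be passed by reference can be sketched roughly as follows. The names `finishBare`, `finish`, `isFinished` and the `done` property are assumptions made for illustration only.

```javascript
// Hypothetical names throughout; this only illustrates boxing a status flag.

// A bare boolean is copied when passed, so the callee cannot update the
// caller's copy.
const finishBare = flag => { flag = true; };
let bare = false;
finishBare(bare);
// bare is still false here.

// Wrapping the boolean in an object passes a reference to shared state.
const finish = box => { box.done = true; };
const isFinished = box => box.done;
const boxed = {done: false};
finish(boxed);
// boxed.done is now true and every holder of `boxed` sees the latest value.
```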

Examples:

// Unavailable (has no value).
[]

// Immediately available (same as Promise.resolve(value)):
[value]

// Make available:
[].push(value)

// Check if available:
[].length === 1

// Read value:
[value][0]

// Check if unavailable:
[].length === 0

With this example it is up to the user to ensure that constraints are maintained such as value must not change and that length must not exceed one.
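If those constraints were to be checked rather than left to the user, a minimal sketch might look like the following. The helper names `isAvailable`, `read` and `makeAvailable` are hypothetical and the choice to throw is only one of several possible reactions to a violation.

```javascript
// Hypothetical helpers around the bare array form; not from any library.
const isAvailable = p => p.length === 1;

const read = p => {
    if (!isAvailable(p)) throw new Error('Value not available');
    return p[0];
};

const makeAvailable = (p, value) => {
    // Enforce that the length never exceeds one and the value never changes.
    if (p.length !== 0) throw new Error('Value already available');
    p.push(value);
};

const p = [];
makeAvailable(p, 123);
// isAvailable(p) is now true and read(p) gives 123.
```

Note that [undefined] still counts as available here, which is why availability is tracked by length rather than by inspecting the value.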

There are many variations of this. Traditionally a JavaScript developer would not concern themselves with concepts such as promises and would instead use constructs such as arrays to their full potential. In this case the array is used as a tiny buffer and queue/stack.

Although it is common to consider the value immutable, in reality the value may be a reference and its contents may be altered.

This method is only useful in narrow cases as it's not possible to efficiently take action only once the value becomes available. Either it must be polled continuously, blocking the CPU, or polled after a delay, adding latency. It does however present a promise as a very minimal concept.

In many cases promises will be used to represent returned or passed values, often interchangeably. It must be kept in mind that parameters do not entirely match return values.

// These do not match:
f(); // args = []
f(undefined); // args = [undefined]
// These match (both return undefined):
return;
return undefined;
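The mismatch can be checked directly. This is a small sketch with hypothetical function names (`countArgs`, `bareReturn`, `explicitReturn`).

```javascript
// Hypothetical names; shows that arguments carry a count but returns do not.
const countArgs = (...args) => args.length;
const bareReturn = () => { return; };
const explicitReturn = () => { return undefined; };

countArgs();          // 0
countArgs(undefined); // 1

// Both forms of return produce undefined and cannot be told apart.
bareReturn() === explicitReturn(); // true
```

This is one reason the array form distinguishes [] from [undefined]: availability cannot be inferred from the value alone.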

In reality, to make a promise reactive we add a callback. Some may take a callback list but that is not in keeping with a minimalist example. A single callback can invoke a list of callbacks.
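As a rough sketch of a single callback invoking a list, assuming a hypothetical helper named `fanOut` that calls each callback in the order given:

```javascript
// Hypothetical helper: combines many callbacks into one, invoked in order.
const fanOut = callbacks => value => {
    for (const callback of callbacks) callback(value);
};

const seen = [];
const single = fanOut([
    value => seen.push(['first', value]),
    value => seen.push(['second', value]),
]);

// The promise only ever needs to know about `single`.
single(42);
// seen is now [['first', 42], ['second', 42]]
```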

Even this simple form has a great scope for variation in how it's used and implemented. There may not always be a universal best solution, instead it depends on what needs to be achieved as well as potentially what natural capabilities expand from the implementation.

Basics

Any implementations that describe themselves as promises should also clearly state capability and behaviours such as those relating to execution flow. This should be standard for all systems that might invoke actions against your codebase as though they were using it as a library! Any reactive or side effect invoking mechanism should detail those effects. State machines should be clearly described, detailing which state can lead to which state and what is emitted.

To describe the capabilities and interface of a promise according to common practices in JS it is easier to start with a base implementation as a point of reference.

A base implementation should invoke reactions immediately in a standard recursive fashion. The call order must match the order that would result if callbacks were used, traversing the reaction tree depth first. If there are multiple reactions queued up for the same action then they should be processed in the order they were added.

A base implementation should take a single reaction, either before or after a value is made available, buffering whichever arrives first, and apply the value to the reaction only once before disposing of both the action and reaction. As a constraint an action or reaction should not be added twice. It should also represent a single channel (path). Errors should be thrown as soon as they happen and should propagate up the call stack the implementation is currently executing on.

This base implementation is a mixture of minimalism and covering some common use cases in replacing callbacks with promises.

An example of capabilities or characteristics (which may be static or dynamic):

  • Alternate orders.
  • Alternate stacks.
  • Chains.
  • Additional channels.
  • Indirect value propagation (bypasses).
  • Multiple reactions - alternate order, queue control, etc.
  • Introspection (available state details).
  • Reaction allowed before, after or both.
  • Hard and soft constraints.
  • Buffered (no reaction if value was already delivered).
  • Multiple actions - value/availability mutates or queue of values.
  • Error handling.
  • Abortion, cancellation, etc.
  • On demand behaviour (takes initiator).
  • Other features and characteristics.

Channels

There are two main return channels, return and throw. Yield may also be considered a return channel that needs to be reproduced though significantly different to others. A standard function can only exit by returning or throwing and it can only exit once so it cannot both throw and return.

These two are almost identical except that throw usually has to be explicitly captured otherwise it will be returned to the next caller/returnee. Return deals with immediate parent and child but throw may bypass many parents to act at a distance. In normal circumstances this will still depend on the same linked list or stack.

An open mind should be maintained for promises and return channels. In many cases the two return channels can use two separate promise chains, exceptions can use the standard native stack, or a single channel can represent either given that it's only possible for one to happen at a time. Implementing a throw channel may in some cases require a stack that could be optimised out otherwise. This has not been ruled out.
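The option of a single channel representing either outcome can be sketched as below. The names `capture` and `release` and the `{threw, value}` shape are assumptions for illustration, not a proposed standard.

```javascript
// Hypothetical helpers: box whichever channel a function exited through.
const capture = f => {
    try {
        return {threw: false, value: f()};
    } catch (error) {
        return {threw: true, value: error};
    }
};

// Unbox the outcome back onto the native channel it came from.
const release = outcome => {
    if (outcome.threw) throw outcome.value;
    return outcome.value;
};

const ok = capture(() => 1 + 1);
const failed = capture(() => { throw new Error('boom'); });
// ok is {threw: false, value: 2}; failed.threw is true.
```

Because a function can only exit once, a single boxed outcome loses nothing compared with two separate channels.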

Promises don't have to entirely map to existing functional channels or be restricted to those. On their own they can be treated as raw channels to be used for any purpose.

Reference implementation A

Untested.

// States: Open -> Waiting(Reaction|Action) -> Closed

// Open -> (act -> WaitingReaction)|(react -> WaitingAction)
// WaitingReaction -> react -> Closed
// WaitingAction -> act -> Closed

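// A relay holds at most one buffered value and one pending callback.
function Relay() {}

// Deliver the value once both ends have shown up, clearing both first so
// that neither is applied twice.
Relay.close = relay => {
    if('value' in relay && 'callback' in relay) {
        const {value, callback} = relay;
        delete relay.callback;
        delete relay.value;
        callback(value);
    }
};

// Producer end: supply the value.
Relay.prototype.act = function(value) {
    this.value = value;
    Relay.close(this);
};

// Consumer end: supply the callback.
Relay.prototype.react = function(callback) {
    this.callback = callback;
    Relay.close(this);
};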

This is a very basic implementation imposing soft restrictions that only cover a basic use case.

In this case it is important that it matches the behaviour of a callback that is called only once, if it is called at all:

// Traditional:
const viaCallback = callback => callback(666);
viaCallback(console.log);

// Alternative:
const viaRelay = relay => relay.act(666);
const r = new Relay();
r.react(console.log);
viaRelay(r);

These two cases should behave the same. It should be possible to replace callbacks with this mechanism wherever the callback is called at most once, with the only differences being a bump in stack depth and some change in performance. It should not change the deterministic results of the program.

This implementation is not meant to handle any more complex cases nor rely on hard restrictions. It starts out doing what callbacks do, then a little more, then a little less.

Its one hard limitation is that it does not support multiple calls to the callback function, though this can be worked around, in violation of the soft restrictions, if the callback reattaches itself. Its additional ability is that it can either be passed or returned and the callback can be attached at any point in time.
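Attaching the callback at either point in time can be sketched as follows. The Relay is repeated from reference implementation A so the example stands alone; the `log`, `early` and `late` names are only for illustration.

```javascript
// Relay repeated from reference implementation A so this is self-contained.
function Relay() {}
Relay.close = relay => {
    if ('value' in relay && 'callback' in relay) {
        const {value, callback} = relay;
        delete relay.callback;
        delete relay.value;
        callback(value);
    }
};
Relay.prototype.act = function(value) { this.value = value; Relay.close(this); };
Relay.prototype.react = function(callback) { this.callback = callback; Relay.close(this); };

const log = [];

// Reaction first, then action: the callback runs inside act().
const early = new Relay();
early.react(value => log.push(['early', value]));
log.push('before act');
early.act(1);

// Action first, then reaction: the value is buffered until react().
const late = new Relay();
late.act(2);
log.push('before react');
late.react(value => log.push(['late', value]));

// log is ['before act', ['early', 1], 'before react', ['late', 2]]
```

In both orders the callback runs synchronously on the stack of whichever call completed the pair, which is what keeps the behaviour in line with plain callbacks.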

A weakness is that the Closed and Open states are in practice the same. At the same time this inadvertently creates an opportunity to optimise through reuse.

Soft restrictions mean that you can reuse this even though it's not intended. It won't stop someone from using it in a way that's invalid. It won't perform an emergency stop and instead may crash right into a tree or drive down the wrong side of the motorway. Hard restrictions might consist of checks, ignores or asserts that calls are appropriate for the state. Hard restrictions intend to make it impossible to use the implementation other than in the way intended.
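For contrast, a variant with hard restrictions might track the state explicitly and refuse calls that are invalid for it. This is a sketch only: `StrictRelay` is a hypothetical name, the state names follow the state comments of reference implementation A, and throwing is just one of the possible reactions (checks, ignores or asserts).

```javascript
// Hypothetical variant of the relay with hard restrictions on state.
class StrictRelay {
    constructor() {
        this.state = 'Open';
    }

    act(value) {
        if (this.state === 'Open') {
            // Buffer the value until a reaction arrives.
            this.value = value;
            this.state = 'WaitingReaction';
        } else if (this.state === 'WaitingAction') {
            const callback = this.callback;
            delete this.callback;
            this.state = 'Closed';
            callback(value);
        } else {
            throw new Error(`act is not permitted in state ${this.state}`);
        }
    }

    react(callback) {
        if (this.state === 'Open') {
            // Hold the callback until an action arrives.
            this.callback = callback;
            this.state = 'WaitingAction';
        } else if (this.state === 'WaitingReaction') {
            const value = this.value;
            delete this.value;
            this.state = 'Closed';
            callback(value);
        } else {
            throw new Error(`react is not permitted in state ${this.state}`);
        }
    }
}
```

Here Closed is distinct from Open, so reuse is no longer possible, which trades away the optimisation opportunity for enforcement.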

Even with hard restrictions in place, documentation should be much the same as it would be with soft restrictions. If a user has to test doing something in a state to see if it causes an error for a trivial case among a small set of combinations then the documentation is likely incomplete. Even if an error does not result the action may still be unsupported and permitted without error as a soft restriction. Systems with a mixture of the two approaches can't be ruled out.

Soft restrictions require discipline and often take time for those familiar with hard restrictions to adjust to.

It is common to remove hard restrictions from production builds for performance, though few people realise that this should not be done when those restrictions are relied upon and it has not been proven that the code would never fail a removed check. Conversely there are developers that can write code conforming to soft restrictions up front. It does not matter how restrictions are applied, through automatic enforcement or operator discipline, as long as they are specified somewhere and adhered to when relevant (there are cases where someone might make a valid decision to ignore soft restrictions but it should not be done idly).

Soft restrictions may represent not only a contract between each end of the channel and the middle man but also, as a subset, a contract directly between the two ends of the channel. Even a simple implementation may present many combinations, only some of which will actually occur at run time given the logic of the code, creating a subset of guarantees.

Good Practices

Define Restrictions

A specification should be clear about soft and hard restrictions. Any specification or documentation should also explain characteristics not immediately apparent from the interface alone.

Focus on the problem

Specifications and implementations should be kept clear of political and social interference. A specification should directly deal with the concern itself. Thought should be put into use cases, but intelligently. It is common to see a switch toggled to and fro as different users have incompatible requirements that a single implementation cannot cater to.

Arguments such as "but users will make such a mistake" should be carefully scrutinised to ensure that they're not fixing the problem in the wrong place. The technical solution should not be compromised by user error or excessive attempts to foolproof. Such concerns are often better addressed with better packaging and the included documentation.

Avoid misapplied arguments, such as "implementation details should not be relied upon". Everything is an implementation detail! A common mistake is to assume anything not expressed by the prototype is implementation detail. That add(2, 2) === 4 results from the implementation, is not formally expressed by the prototype, and yet is relied upon.

Execution order is among the most overlooked aspects of results, yet it is often important. The top to bottom flow of a function, which executes statements in order, and loops, which execute iterations in order, betray the necessity of order; otherwise they would execute in arbitrary order. This is a common pitfall when implementing inversion of control, specifically in this case inversion of flow control. Order is very important for flow control and that does not go away entirely by adding layers of indirection in front of the flow control.

Procedure should not dictate action alone.

Case Study: This happened with HTTP streams where it was decided, based on procedure, that data still left on the buffer should be discarded in the event of an error. It should not be assumed that a user would not want to consume what data they can in some scenarios. This might be common if there is a resume mechanism. This results in streaming implementations that are inefficient when errors occur. There is a reason the underlying libraries give the option of reacting to an error immediately or continuing to drain the buffer. It should have been immediately obvious this was likely a questionable technical decision requiring greater scrutiny, given that it deviated from the underlying library, raising the question of whether said libraries were in error or whether there was a reason they behaved that way. Offloading decision making to procedure alone is an example of a lack of leadership and cannot reliably solve problems.

The real problem was user confusion, closing a file on error rather than on end. This may have been made worse by the documentation not sufficiently describing the state machine behind the stream implementation.

Concerns

Relevant concerns should be categorised, considered and included where relevant. This includes performance impacts, compatibility, etc as well as logical behaviours and guarantees.

Relevant known unknowns should be included when expressing knowns. The scope of any implementation should be expressed to make it clear what particular use cases a solution is intended to address.

Compare

Mechanisms that can replace others or derive from others should be compared to that which they are replacing and use that as a point of reference. Mismatches should be actively sought out and either explained or corrected.

This also includes considering the full scope of what's being compared too. For functions it means considering all relevant aspects of the functions (params, returns, throws, yield, etc).

It is also useful to consider the way in which other existing concurrency mechanisms do things. Given that the context is single threaded and the aim is to make it possible to program as close to synchronously as possible, this in particular includes mechanisms that achieve the same. That includes poll, epoll, select, the standard operations on IO resources and some thread synchronisation methods.

Multiplexing is the main way to fully write a program that's single threaded, written synchronously but is in fact asynchronous. This allows turning applications into main loops.

It's worth taking note that operations often don't do one thing but let you choose whether to block or not. If they would have blocked then they will usually let you know. Although there are differences to the JavaScript approach of using callbacks or generators the basic overall concepts are important.

Those behaviours also indicate concerns when running asynchronous functions. They may run always sync, always async or maybe either. When this arises it should be made clear which will happen if that is known beforehand. Consideration should also be given to whether the user should be able to choose, and to whether they can find out either which it will do beforehand or which it did at run time.

Natural Order

By default the simplest, least assuming and most immediate thing should be done. The simplest does not always mean the least but leans in that general direction. Minimal implementations reproducing the behaviour of A but using B will make the natural behaviour of B dominant and the most immediate. It often requires additional steps to coerce B into matching the nature of A.

Case Study: This may have been a factor in promise designs that fail to perform the extra step of flipping a temporary queue onto a persistent stack and instead simply use a temporary queue which appeared the most natural but was not in the correct domain being replicated.

If the natural behaviour of a mechanism is to return maybe async then it should conform to this by default. If there is a need to potentially normalise the behaviour then it should either take a flag to allow the user to request this or return a flag of whether it ran immediately or not so that the user can decide what course of action to take.
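Returning a flag for whether the callback ran immediately could look something like this. The `readCached` function and its `cache` are hypothetical and stand in for any maybe async operation.

```javascript
// Hypothetical maybe-async operation backed by an in-memory cache.
const cache = new Map([['a', 1]]);

// Calls back immediately on a cache hit and later on a miss.
// Returns true if the callback already ran before this function returned.
const readCached = (key, callback) => {
    if (cache.has(key)) {
        callback(cache.get(key));
        return true;
    }
    // Stand-in for a real asynchronous lookup.
    setTimeout(() => callback(undefined), 0);
    return false;
};

const results = [];
const hitRanNow = readCached('a', value => results.push(['a', value]));
const missRanNow = readCached('b', value => results.push(['b', value]));
// hitRanNow is true, missRanNow is false, and only 'a' has been delivered so far.
```

The caller is then free to defer its own follow up work only in the case that actually needs it, which avoids deferring twice.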

Deferring twice should be avoided or eliminated entirely for the sake of performance and simplicity in maintaining order.

Consideration should be given to whether the user can solve this problem instead. It is possible to detect if something returned immediately or later, though this may be unwieldy in some circumstances. It's not recommended to make the user work something out if that information is readily available in the library but not being communicated.
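User side detection can be sketched with a flag that flips once the call has returned. The helper name `detectImmediate` and the convention of passing the result as a second argument are assumptions for illustration.

```javascript
// Hypothetical helper: tells the callback whether `start` invoked it before
// returning (immediate) or at some later point.
const detectImmediate = (start, callback) => {
    let returned = false;
    start(value => callback(value, !returned));
    returned = true;
};

const observed = [];

// A start function that calls back straight away.
detectImmediate(cb => cb(5), (value, immediate) => observed.push([value, immediate]));

// A start function that holds on to the callback and calls it later.
let stored = null;
detectImmediate(cb => { stored = cb; }, (value, immediate) => observed.push([value, immediate]));
stored(6);

// observed is [[5, true], [6, false]]
```

This works but every call site has to carry the wrapper, which is the unwieldiness referred to above.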

Error handling

This area has some special concerns.

  • Should it be possible to catch top level errors resulting from calling resolve?
  • Behaviour should be described for when other listeners are not executed because of an error on a top level resolve. It may be desirable for the queue to be preserved (meaning the executed items must be removed) so that continuation is permitted.

The simplest thing should be done, but it should then be considered how that would behave in each case. Where errors come from and go to (which stack, etc) is important to know to be able to handle errors appropriately.
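One way to preserve the queue so that continuation is possible is to remove each listener before invoking it. This is a sketch under that assumption; `drain` is a hypothetical name and whether the caller should resume at all is a separate decision.

```javascript
// Hypothetical sketch: run listeners in order, removing each one before it is
// invoked, so that if a listener throws the rest stay queued.
const drain = (queue, value) => {
    while (queue.length > 0) {
        const listener = queue.shift();
        listener(value);
    }
};

const calls = [];
const queue = [
    value => calls.push(['a', value]),
    () => { throw new Error('listener failed'); },
    value => calls.push(['c', value]),
];

try {
    drain(queue, 7);
} catch (error) {
    // The error surfaced on the resolver's stack. The failed listener has
    // already been removed, so resuming only runs what is left.
    drain(queue, 7);
}

// calls is [['a', 7], ['c', 7]] and the queue is empty.
```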

Cases where errors need to be steered in a different direction can't be ruled out.

Cancellation

This is a complex subject so can only briefly be mentioned. There is more than one form of cancellation (abortion).

There is a reference implementation for this that can be used though it must be extracted from archives and fixed.

A promise can itself be cancelled in various ways. It might become a noop or throw errors. Some people may simply reject the promise in some cases, though this might not always solve the problem or may become unwieldy. Generally these methods prohibit further use of, and further invocations by, the promise.

Cancellation also has another involvement. It can be linked to error handling or the equivalent of destroy. Many of these mechanisms are especially relevant to all and race. In this case we don't only have promises representing a result but also a task.

Propagation makes things interesting. Abortions can be chained. Cancelling something can itself fail and require cleanup.

Consider forms and purposes of aborts, such as rollback, free, destroy, undo, etc. There's a difference between stopping something, cleaning something up and undoing something (which is a small can of worms).

Double free tends to be a common problem. This should be avoided where possible. In some cases this may not be possible or practical and may need the ability to try to free without raising an error.
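The two behaviours, raising on a double free and trying to free without raising, can sit side by side. This is a sketch with hypothetical names (`makeResource`, `free`, `tryFree`).

```javascript
// Hypothetical resource wrapper; `release` is whatever cleanup is required.
const makeResource = release => {
    let freed = false;
    return {
        // Strict free: a second call is reported as an error.
        free() {
            if (freed) throw new Error('Double free');
            freed = true;
            release();
        },
        // Lenient free: returns whether this call actually freed the resource.
        tryFree() {
            if (freed) return false;
            freed = true;
            release();
            return true;
        },
    };
};

let releases = 0;
const resource = makeResource(() => { releases += 1; });
resource.tryFree(); // true, cleanup runs
resource.tryFree(); // false, cleanup does not run again
```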

Destroy chains tend to invert certain flows which means extra care is needed for managing memory. Without an automatic destruct further consideration is needed to avoid dereferencing prior to destructing.

This can complicate situations as it can sometimes require the equivalent of a reference graph on top of that of the underlying language for some use cases. It also imposes the limitation that the logic must know when the object would be dereferenced rather than being able to pass it around and then forget about it. Weak references can improve some situations though raise compatibility concerns. When you need or really benefit from destructors but the language doesn't implement them then you end up implementing your own garbage collection.

Stratification

Many libraries allow the injection of a promise as long as it conforms to Promises/A+. The utility of this is sometimes questionable where the Promises/A+ specification intends to offer one set of specifics.

Libraries returning promises may only ever resolve so may only ever need half an interface to execute their end of the bargain.
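Handing each party only its half of the interface might be sketched like this. The `makeEnds` name and the `producer`/`consumer` shape are assumptions; the behaviour loosely follows the soft restricted relay of reference implementation A.

```javascript
// Hypothetical factory returning the two ends of one channel separately.
const makeEnds = () => {
    const box = [];      // buffered value, if any
    let pending = null;  // pending reaction, if any

    const producer = {
        act(value) {
            if (pending) {
                const callback = pending;
                pending = null;
                callback(value);
            } else {
                box.push(value);
            }
        },
    };

    const consumer = {
        react(callback) {
            if (box.length === 1) callback(box.pop());
            else pending = callback;
        },
    };

    return {producer, consumer};
};

// A library that only ever resolves needs nothing but the producer end.
const {producer, consumer} = makeEnds();
const received = [];
producer.act('done');
consumer.react(value => received.push(value));
// received is ['done']
```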

When implementations are kept natural and minimal with no side effects then it also allows a base implementation that can be expanded upon. Promises/A+ would be an extension of a base type for compatibility. It would have virtually no other use cases.

Specifications must allow the use of only that which is needed.
