DEV Community

Solving Puzzles With High-Performance JavaScript

Andrew Healey on May 19, 2019

Premature optimization is the root of all evil. It's also the root of this article. I like programming puzzles. I also like to go fast. We're goin...

Read full post

StefanT123 • May 21 '19

For the fibonacci puzzle, here is a solution that runs in a constant time using little math 😄

// 0(1) time complexity
var fib = function(N) {
    return Math.round(
        (Math.pow((1 + Math.sqrt(5)) / 2, N) - 
        Math.pow(-2 / (1 + Math.sqrt(5)), N)) / 
        Math.sqrt(5)
    );
}

Abhishek Gupta • Oct 19 '19 • Edited

This solution is faster for large N, slower for small N.
jsben.ch/mOl4l


/**
 * @param {number} N
 * @return {number}
 */

const pow = function (base, n) {
    if (n==0)
        return 1

    if (n==1)
        return base

    if (n&1) 
        return base*pow(base, n-1)
    else {
        const half = pow(base, Math.floor(n/2))
        return half*half
    }
}

var fib = function(N) {
    const sqrt5 = 2.23606797749979
    const num = 1.618033988749895
    const num2 = -0.6180339887498948
    return Math.round(
        ( pow(num, N) - pow(num2, N) ) / sqrt5
    )
};

fib(NUM);

Using precomputed values for constants and using matrix expo for calculating a^b (which is log(n))

Andrew Healey • May 21 '19

Thanks for sharing this cool solution! I actually found this was slower on LeetCode than the post’s iterative solution. Crazy right? They must use low values of N in the test cases. I wonder if it’s quicker in other languages 🤔.

StefanT123 • May 21 '19

Hmm, I've run some benchmark tests (I ran them online on Chrome so I don't know how reliable they are) and I got two different results on two different sites. On the first site (jsben.ch/) it seems that for smaller numbers, the iterative way is faster, the real power of the pure math function is when the N >= 200. While on the other site (perf.zone/), the math function is way faster even for smaller numbers. I got this results on the perf.zone/quick:

for N = 20, math -> 705,738,258 op/second; iterative -> 66,518,777 op/second
for N = 50, math -> 725,140,230 op/second; iterative -> 16,581,587 op/second
for N = 100, math -> 700,447,370 op/second; iterative -> 5,875,773 op/second
for N = 200, math -> 706,141,355 op/second; iterative -> 2,562,412 op/second

But anyway I wonder what time complexity does Javascript Math functions have 🤔.

Andrew Healey • May 21 '19

The variance is very interesting. I wonder if Chrome throttles certain calculations.

StefanT123 • May 21 '19

I think it does.

Damian Rivas • May 23 '19

Came here to say this! Did you also learn this from a discrete mathematics class? 😄

StefanT123 • May 23 '19

I did, long time ago and it was forgotten until I saw this function somewhere (might be on dev.to) and that stuck in my brain, so every time I see a fibonacci I get a flash of this solution.😄

KrusaderCell • May 23 '19

Nice article. Could you please elaborate more on this sentence: "...Looking up a Set in JavaScript is constant time, O(1)...".
As far as I know if you implement Set using hash-table you could reduce most of the operations to the O(1) in an average-case but O(n) in the worst-case. Moreover in ECMAScript2015 specification for the Set.prototype.has it says:

.
4. Let entries be the List that is the value of S’s [[SetData]] internal slot.
5. Repeat for each e that is an element of entries,
If e is not empty and SameValueZero(e, value) is true, return true.
.

So I'm not sure that we could say that Set.prototype.has is an O(1) time complexity.

Andrew Healey • May 23 '19 • Edited

Thanks 😊. Yes, in the worst case the look-up time can be O(N). I was using Big O notation, where a Set look-up can be reduced down to O(1). I assumed that Chromium would be using an optimal data structure even though the spec only mandates sublinear access time.

Map object must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

PamProg • Sep 10 '19 • Edited

I know I'm late to the party, but thanks for this reply, as you provide some links that make me understand why it "should be around O(1)" ^{^}

If you reach this reply, how do you know/learn how to use the Set instead of doing a nested loop ?

Edit : what if in the first exemple; instead of the nested tab, we use jewels.indexOf(S[i]) !== -1 ? Exactly like using jewels.has(S[i]) ?

Andrew Healey • Sep 10 '19

(Last time I checked) indexOf uses linear search to check for an element. So it's like doing a for loop and checking every element until it's found. It doesn't benefit from the fast look-up time of a hashtable.

I learned the benefits and use cases of different data-structures by reading a book called Grokking Algorithms and watching Introduction to Algorithms on the MIT website (MIT Course Number 6.006). Textbooks can help as well as programming puzzle websites like hackerrank and leetcode where you can test the performance of your code and read peoples' discussions of solutions. Looking up (and learning) what 'Big O' is helps too!

ZaneHannanAU • May 23 '19

Possible idea with the jewels thing:

var numJewelsInStones = function(J, S) {
    const jewels = new Uint8Array(8);
    for (let i = 0, l = J.length; i < l;) {
        const j = J.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] |= 1 << (j & 7);
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        const j = S.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] & (1 << (j & 7)) && ++myJewels;
    }
    return myJewels;
};

This makes heavy use of CPU caching, so should be quite fast for longer strings.

ZaneHannanAU • May 23 '19 • Edited

Could also use A and B types

var numJewelsInStones = function(J, S) {
    let a = 0|0, b = 0|0, j = 0|0;
    for (let i = 0, l = J.length; i < l;) {
        j = J.charCodeAt(i++);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61)
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        j = S.charCodeAt(i++);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    }
    return myJewels;
};

This can be done entirely on the stack and registers, so avoids other possible slowdowns related to allocation. Only issue is that of shifting...

ZaneHannanAU • May 23 '19 • Edited

Another, even faster:

var numJewelsInStones = function(J, S) {
    if (!J || !S) return 0; // empty string, fast return
    let a = 0|0, b = 0|0, j = 0|0, i = J.length, l = S.length, myJewels = 0 >>> 0;
    do {
        j = J.charCodeAt(--i);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61);
    } while (i);
    do {
        j = S.charCodeAt(--l);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    } while (l);
    return myJewels;
};

This is basically just the fast ASM.js compatible form I guess. Not too sure if it will work though...

Edit: leetcode.com/submissions/detail/23...

tl;dr not as fast

ZaneHannanAU • May 23 '19

A similar thing works perfectly fine in rust, though.

leetcode.com/submissions/detail/23...

impl Solution {
    pub fn num_jewels_in_stones(j: String, s: String) -> i32 {
        let mut m = 0u64;
        for i in j.bytes() {
            m |= 1 << (i - 0x41);
        }
        let mut w = 0i32;
        for i in s.bytes() {
            let g = 1 << (i - 0x41);
            if g & m != 0 { w += 1 };
        }
        w
    }
}

Andrew Healey • May 23 '19

These are all so cool! I’m going to have to pick through them later 👍

Theofanis Despoudis • Oct 22 '19

This is probably because all the optimisations you may do in JS land may be thrown to the bin once it touches native code. I Rust you have more control over the memory layout.

Thibault ROHMER • May 22 '19

Great post!

For the second problem, i'm wondering if "caching" the char helps:

        var c = s[front];
        if (c !== 'a' &&
            c !== 'e' &&
            c !== 'i' &&
            c !== 'o' &&
            c !== 'u' &&
            c !== 'A' &&
            c !== 'E' &&
            c !== 'I' &&
            c !== 'O' &&
            c !== 'U') {
            front++;
            continue;
        }

Rather than accessing s[front] for each test, especially when the char is not a vowel.
Maybe the js engine already optimizes your code 🤔.
Sure there's an extra affectation which cost time, but s[front] has to be "computed" each time. It's a direct memory access, but first an addition is in order to add front index to the s array adress, right ?
So what is faster:

10 additions + 10 read
or 1 write + 10 read ?

Difference is probably marginal, but wondering how much the assembler code changes. And if the jit optimizes that.

Andrew Healey • May 22 '19

Interesting proposal. I ran it ten times on LeetCode and didn't get a faster answer (not that that means anything!).

I ran both functions one million times on a 445-char lorem ipsum text in Chrome latest. It seems that caching the variable is marginally faster 😊

Ezra Schwepker • May 22 '19

Fun article, thank-you!

I wrote a few solutions of my own before reading over yours just to do a comparison, and while I didn't hit 100% on any of them, they were fairly performant. However, something that was really interesting is just how inconsistent leetcode's solution runner is. My reverseVowels solution would swing between the 98th and 50th percentile with no changes to the code itself.

Did you run your solutions multiple times and experience a similar variation, or were they consistently in the 100th percentile?

Andrew Healey • May 23 '19

Yes, I experienced a lot of variation! Since the only way I could compare my answers to existing answers was the recorded 100th percentile, that’s what I chose to measure my own.

Eugene Karataev • May 20 '19

Thanks for the illustration how to solve problems iteratively: from the basic solution to the most performant.

Before reading this post I'd solve the "jewels and stones" problem by creating a js object and populating it with "jewels" keys. But using a Set collection with has method is more elegant and straightforward solution, thanks 😄

Andrew Healey • May 20 '19

Glad you found it helpful! I always appreciate when articles build up to the optimal solution 👍

Sets are also very cool 😎

sunflowerseed • Feb 25 '20 • Edited

you used myJewels as the count of how many jewels... that is kind of fuzzy... when 10 other programmer read other programmers' code, when they see... myJewels... is it an array? Is it a string, or it is a number. Sure, you can imply it from the context, but sometimes when the code expand from 10 lines to 20 and to 30, it becomes difficult to track down.

I'd suggest using count or countJewels to exactly denote what it means. Otherwise Peter has a myJewels and is an array, Michael has a myJewels and it is a string... count has very little chance of confusion.

Yuri Predborskiy • May 23 '19

"close-down-your-IDE-and-talk-a-walk gross"
Are you kidding? It was wonderful! Simple, readable, all cases covered wonderful! Loved your gross solution. Gross optimization for the masses!
Jokes aside, thanks for your solutions! You implemented some interesting new ways to approach some problems, which made me realize there are things I'm not yet aware of! In short: Thanks!

Andrew Healey • May 26 '19

Thanks for the kind words, Yuri!

sunflowerseed • Feb 25 '20

Hm... with LeetCode, this time it could be 96% faster, and then I tried again, it became 42% faster, and then 10 seconds later, I tried again, it was 98% faster...

using a ASCII map... what if the program is subject to 8-bit ISO-8859-1 characters... and then to unicode?

KHUSHBU THAKUR • May 20 '19

What about this solution for jewels problem?

function findTotal(J, S) {
return Array.from(S).filter(item => J.indexOf(item) !== -1).length;
}

Andrew Healey • May 20 '19 • Edited

That's a great solution 👍. However, as we see here, indexOf has a complexity of O(N). This means that for every stone, we might have to check the whole jewels string rather than a data structure we've constructed.

For example, say we are passed 100 stones and 1000 jewels. With indexOf, the rough number of operations is stones * jewels, or 100 * 1000. What we really want is stones * Set#has, or 100 * 1. Set#has cost.