DEV Community

Cover image for Solving Puzzles With High-Performance JavaScript

Solving Puzzles With High-Performance JavaScript

Andrew Healey on May 19, 2019

Premature optimization is the root of all evil. It's also the root of this article. I like programming puzzles. I also like to go fast. We're goin...
Collapse
 
stefant123 profile image
StefanT123

For the fibonacci puzzle, here is a solution that runs in a constant time using little math πŸ˜„

// 0(1) time complexity
var fib = function(N) {
    return Math.round(
        (Math.pow((1 + Math.sqrt(5)) / 2, N) - 
        Math.pow(-2 / (1 + Math.sqrt(5)), N)) / 
        Math.sqrt(5)
    );
}
Collapse
 
abhishek97 profile image
Abhishek Gupta • Edited

This solution is faster for large N, slower for small N.
jsben.ch/mOl4l


/**
 * @param {number} N
 * @return {number}
 */

const pow = function (base, n) {
    if (n==0)
        return 1

    if (n==1)
        return base

    if (n&1) 
        return base*pow(base, n-1)
    else {
        const half = pow(base, Math.floor(n/2))
        return half*half
    }
}

var fib = function(N) {
    const sqrt5 = 2.23606797749979
    const num = 1.618033988749895
    const num2 = -0.6180339887498948
    return Math.round(
        ( pow(num, N) - pow(num2, N) ) / sqrt5
    )
};

fib(NUM);

Using precomputed values for constants and using matrix expo for calculating ab (which is log(n))

Collapse
 
healeycodes profile image
Andrew Healey

Thanks for sharing this cool solution! I actually found this was slower on LeetCode than the post’s iterative solution. Crazy right? They must use low values of N in the test cases. I wonder if it’s quicker in other languages πŸ€”.

Collapse
 
stefant123 profile image
StefanT123

Hmm, I've run some benchmark tests (I ran them online on Chrome so I don't know how reliable they are) and I got two different results on two different sites. On the first site (jsben.ch/) it seems that for smaller numbers, the iterative way is faster, the real power of the pure math function is when the N >= 200. While on the other site (perf.zone/), the math function is way faster even for smaller numbers. I got this results on the perf.zone/quick:

  • for N = 20, math -> 705,738,258 op/second; iterative -> 66,518,777 op/second
  • for N = 50, math -> 725,140,230 op/second; iterative -> 16,581,587 op/second
  • for N = 100, math -> 700,447,370 op/second; iterative -> 5,875,773 op/second
  • for N = 200, math -> 706,141,355 op/second; iterative -> 2,562,412 op/second

But anyway I wonder what time complexity does Javascript Math functions have πŸ€”.

Thread Thread
 
healeycodes profile image
Andrew Healey

The variance is very interesting. I wonder if Chrome throttles certain calculations.

Thread Thread
 
stefant123 profile image
StefanT123

I think it does.

Collapse
 
damianrivas profile image
Damian Rivas

Came here to say this! Did you also learn this from a discrete mathematics class? πŸ˜„

Collapse
 
stefant123 profile image
StefanT123

I did, long time ago and it was forgotten until I saw this function somewhere (might be on dev.to) and that stuck in my brain, so every time I see a fibonacci I get a flash of this solution.πŸ˜„

Collapse
 
krusadercell profile image
KrusaderCell

Nice article. Could you please elaborate more on this sentence: "...Looking up a Set in JavaScript is constant time, O(1)...".
As far as I know if you implement Set using hash-table you could reduce most of the operations to the O(1) in an average-case but O(n) in the worst-case. Moreover in ECMAScript2015 specification for the Set.prototype.has it says:

.
4. Let entries be the List that is the value of S’s [[SetData]] internal slot.
5. Repeat for each e that is an element of entries,
If e is not empty and SameValueZero(e, value) is true, return true.
.

So I'm not sure that we could say that Set.prototype.has is an O(1) time complexity.

Collapse
 
healeycodes profile image
Andrew Healey • Edited

Thanks 😊. Yes, in the worst case the look-up time can be O(N). I was using Big O notation, where a Set look-up can be reduced down to O(1). I assumed that Chromium would be using an optimal data structure even though the spec only mandates sublinear access time.

Map object must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

Collapse
 
pamprog profile image
PamProg • Edited

I know I'm late to the party, but thanks for this reply, as you provide some links that make me understand why it "should be around O(1)" ^

If you reach this reply, how do you know/learn how to use the Set instead of doing a nested loop ?

Edit : what if in the first exemple; instead of the nested tab, we use jewels.indexOf(S[i]) !== -1 ? Exactly like using jewels.has(S[i]) ?

Thread Thread
 
healeycodes profile image
Andrew Healey

(Last time I checked) indexOf uses linear search to check for an element. So it's like doing a for loop and checking every element until it's found. It doesn't benefit from the fast look-up time of a hashtable.

I learned the benefits and use cases of different data-structures by reading a book called Grokking Algorithms and watching Introduction to Algorithms on the MIT website (MIT Course Number 6.006). Textbooks can help as well as programming puzzle websites like hackerrank and leetcode where you can test the performance of your code and read peoples' discussions of solutions. Looking up (and learning) what 'Big O' is helps too!

Collapse
 
zanehannanau profile image
ZaneHannanAU

Possible idea with the jewels thing:

var numJewelsInStones = function(J, S) {
    const jewels = new Uint8Array(8);
    for (let i = 0, l = J.length; i < l;) {
        const j = J.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] |= 1 << (j & 7);
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        const j = S.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] & (1 << (j & 7)) && ++myJewels;
    }
    return myJewels;
};

This makes heavy use of CPU caching, so should be quite fast for longer strings.

Collapse
 
zanehannanau profile image
ZaneHannanAU • Edited

Could also use A and B types

var numJewelsInStones = function(J, S) {
    let a = 0|0, b = 0|0, j = 0|0;
    for (let i = 0, l = J.length; i < l;) {
        j = J.charCodeAt(i++);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61)
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        j = S.charCodeAt(i++);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    }
    return myJewels;
};

This can be done entirely on the stack and registers, so avoids other possible slowdowns related to allocation. Only issue is that of shifting...

Collapse
 
zanehannanau profile image
ZaneHannanAU • Edited

Another, even faster:

var numJewelsInStones = function(J, S) {
    if (!J || !S) return 0; // empty string, fast return
    let a = 0|0, b = 0|0, j = 0|0, i = J.length, l = S.length, myJewels = 0 >>> 0;
    do {
        j = J.charCodeAt(--i);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61);
    } while (i);
    do {
        j = S.charCodeAt(--l);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    } while (l);
    return myJewels;
};

This is basically just the fast ASM.js compatible form I guess. Not too sure if it will work though...

Edit: leetcode.com/submissions/detail/23...

tl;dr not as fast

Thread Thread
 
zanehannanau profile image
ZaneHannanAU

A similar thing works perfectly fine in rust, though.

leetcode.com/submissions/detail/23...

impl Solution {
    pub fn num_jewels_in_stones(j: String, s: String) -> i32 {
        let mut m = 0u64;
        for i in j.bytes() {
            m |= 1 << (i - 0x41);
        }
        let mut w = 0i32;
        for i in s.bytes() {
            let g = 1 << (i - 0x41);
            if g & m != 0 { w += 1 };
        }
        w
    }
}
Thread Thread
 
healeycodes profile image
Andrew Healey

These are all so cool! I’m going to have to pick through them later πŸ‘

Thread Thread
 
theodesp profile image
Theofanis Despoudis

This is probably because all the optimisations you may do in JS land may be thrown to the bin once it touches native code. I Rust you have more control over the memory layout.

Collapse
 
titi profile image
Thibault ROHMER

Great post!

For the second problem, i'm wondering if "caching" the char helps:

        var c = s[front];
        if (c !== 'a' &&
            c !== 'e' &&
            c !== 'i' &&
            c !== 'o' &&
            c !== 'u' &&
            c !== 'A' &&
            c !== 'E' &&
            c !== 'I' &&
            c !== 'O' &&
            c !== 'U') {
            front++;
            continue;
        }

Rather than accessing s[front] for each test, especially when the char is not a vowel.
Maybe the js engine already optimizes your code πŸ€”.
Sure there's an extra affectation which cost time, but s[front] has to be "computed" each time. It's a direct memory access, but first an addition is in order to add front index to the s array adress, right ?
So what is faster:

  • 10 additions + 10 read
  • or 1 write + 10 read ?

Difference is probably marginal, but wondering how much the assembler code changes. And if the jit optimizes that.

Collapse
 
healeycodes profile image
Andrew Healey

Interesting proposal. I ran it ten times on LeetCode and didn't get a faster answer (not that that means anything!).

I ran both functions one million times on a 445-char lorem ipsum text in Chrome latest. It seems that caching the variable is marginally faster 😊

Collapse
 
moresaltmorelemon profile image
Ezra Schwepker

Fun article, thank-you!

I wrote a few solutions of my own before reading over yours just to do a comparison, and while I didn't hit 100% on any of them, they were fairly performant. However, something that was really interesting is just how inconsistent leetcode's solution runner is. My reverseVowels solution would swing between the 98th and 50th percentile with no changes to the code itself.

Did you run your solutions multiple times and experience a similar variation, or were they consistently in the 100th percentile?

Collapse
 
healeycodes profile image
Andrew Healey

Yes, I experienced a lot of variation! Since the only way I could compare my answers to existing answers was the recorded 100th percentile, that’s what I chose to measure my own.

Collapse
 
karataev profile image
Eugene Karataev

Thanks for the illustration how to solve problems iteratively: from the basic solution to the most performant.

Before reading this post I'd solve the "jewels and stones" problem by creating a js object and populating it with "jewels" keys. But using a Set collection with has method is more elegant and straightforward solution, thanks πŸ˜„

Collapse
 
healeycodes profile image
Andrew Healey

Glad you found it helpful! I always appreciate when articles build up to the optimal solution πŸ‘

Sets are also very cool 😎

Collapse
 
sunflower profile image
sunflowerseed • Edited

you used myJewels as the count of how many jewels... that is kind of fuzzy... when 10 other programmer read other programmers' code, when they see... myJewels... is it an array? Is it a string, or it is a number. Sure, you can imply it from the context, but sometimes when the code expand from 10 lines to 20 and to 30, it becomes difficult to track down.

I'd suggest using count or countJewels to exactly denote what it means. Otherwise Peter has a myJewels and is an array, Michael has a myJewels and it is a string... count has very little chance of confusion.

Collapse
 
yuripredborskiy profile image
Yuri Predborskiy

"close-down-your-IDE-and-talk-a-walk gross"
Are you kidding? It was wonderful! Simple, readable, all cases covered wonderful! Loved your gross solution. Gross optimization for the masses!
Jokes aside, thanks for your solutions! You implemented some interesting new ways to approach some problems, which made me realize there are things I'm not yet aware of! In short: Thanks!

Collapse
 
healeycodes profile image
Andrew Healey

Thanks for the kind words, Yuri!

Collapse
 
sunflower profile image
sunflowerseed

Hm... with LeetCode, this time it could be 96% faster, and then I tried again, it became 42% faster, and then 10 seconds later, I tried again, it was 98% faster...

using a ASCII map... what if the program is subject to 8-bit ISO-8859-1 characters... and then to unicode?

Collapse
 
khushbu_thakur profile image
KHUSHBU THAKUR

What about this solution for jewels problem?

function findTotal(J, S) {
return Array.from(S).filter(item => J.indexOf(item) !== -1).length;
}

Collapse
 
healeycodes profile image
Andrew Healey • Edited

That's a great solution πŸ‘. However, as we see here, indexOf has a complexity of O(N). This means that for every stone, we might have to check the whole jewels string rather than a data structure we've constructed.

For example, say we are passed 100 stones and 1000 jewels. With indexOf, the rough number of operations is stones * jewels, or 100 * 1000. What we really want is stones * Set#has, or 100 * 1. Set#has cost.