loading...

Loop detection - Google interview question

rodiongork profile image Rodion Gorkovenko ・3 min read

My friend (really former student) wrote me about interview he had recently - as I understand for "intern QA" position at some of Google departments (probably Adwords/Adsense but not sure). One of the questions puzzled him the most, though I think it was more due to slight panic, rather than difficulty. Here I'll share this question and then put few details he told about interview itself for those curious.

Detecting Loop in Numeric Sequence

It was a kind of brief "practical exercise" after long "theoretical part". Fellow was proposed to write small snippet of code in Python or Java (languages he manifested to know).

Suppose, we develop function, which on every call yields some number (it is ok to think of it as of random generator) - and we need to check how many calls we can do until sequence starts to repeat.

I.e. we want to:

  • make sure sequence won't enter loop (repetitions) too soon;
  • if possible, tell at which step it enters loop and how many steps loop take.

Some "fake" example of such function was provided (it was not remembered well, but by his recollections I reconstructed it as von Neumann random generator with improvement:

x = ... # some starting value, e.g. 1776

n = 100
nn = n * n
w = 0
s = nn // 2 - 1917

def next_num():
    w = (w + s) % nn
    x = ((x * x + w) // n) % nn
    return x

Now what my friend was proposing and their objections:

  1. Let us create dict and put every value x into it, and after each call check, if new x is in this dict already. (they pointed out the state of sequence depends on w either - but this is rather minor detail)

  2. Let's do the same, but use tuples (x, w) as keys of our dict. (they said it works for n = 100 but he should increase to n = 10000 - his program was not ending)

  3. He said program is not finishing because of too large number of iterations. However they explained it is about nn iterations and so would end in a few minutes unless something bad happens.

  4. After some thought he understand dict becomes very large (up to 100 mln elements here) - and process takes all memory and (as I suppose, hits the swap and performance degrades greatly).

  5. He offered to save memory, by replacing tuples with w * nn + x and using bitset instead of dict.

They agreed it is workaround, but said with n = 100000 or more program anyway may eat up all memory. I thought it may too long to calculate to be reasonable anyway - but then thought, this could be rewritten in Go or Java with 10 mln iterations per second speed - so running time may be about hour, not too long to be "unreasonable".

So he was told though he did well enough, they expected solution which won't depend on having gigabytes of memory.

What is your opinion? I think I understand what they mean. My idea is described in the end of this post.


About Interview

This happened in Google office in Canada (haven't asked where exactly but perhaps it could be googled), this fellow moved here with his parents year ago and completed education here or something like this. They had some Google employees visiting their college (?) and telling about possible internship or something like this. I believe there was some application process and eventually he was invited for interview.

Interview had the following parts:

  • many questions about languages he said he know (Java, Python) - though only about standard libraries and syntax, no tools, frameworks etc
  • few very shallow questions about other technologies, mainly SQL and HTML - i.e. checking "general understanding" about them
  • logic puzzles (I believe at least some are well-known, e.g. "how to put digits on two dice so that any day number between 01 and 31 could be represented)
  • practical exercise (single question, as described above)

I'm not sure about order of all other parts, but he said they, besides exercise, took somewhat over 1 hour and then exercise was the last and he spent "almost hour" on it solely.

He haven't shared anything about interview results yet - I suppose he have no response still.

My idea

Here it is, encoded in ROT13 (google to decode if you don't know it) - just to avoid you seeing it unintentionally and spoiling your fun.

Jr arrq gb fgber pheerag (j,k), gura eha sbe x vgrengvbaf, vs jr qba'g rapbhagre gur fnzr, jr fgber arj pheerag (j,k) naq eha sbe x*2 vgrengvbaf, naq fb ba - fb jr qba'g eryl ba qvpg ohg gel qrgrpgvat ybbc ol fvatyr "fgngr", vg znl gnxr gjb gvzrf zber vgrengvbaf, ohg jr fnir gvzr ba trggvat evq bs qvpg.

Posted on by:

rodiongork profile

Rodion Gorkovenko

@rodiongork

Developer, Dreamer, School-Teacher, author of CodeAbbey and few other silly projects :)

Discussion

pic
Editor guide
 

This cycle detection in functional graph problem can be solved by the Floyd's Tortoise and Hare algorithm in O(P+L) time and O(1) memory, where P is the start point of the loop, and L is the length of the loop: en.wikipedia.org/wiki/Cycle_detection

TL;DR?
The basic idea is, for all i >= P, we have f(i) = f(i+k*L), because all points after P is in the loop, so:

  1. f(k*L) = f(2*k*L) => there must exist a point x=k*L in the loop that f(x) = f(2*x) => if a hare and a tortoise start their race at 0, and the hare moves 2 times as fast as the tortoise, they will meet at the point x=k*L in the loop.
  2. f(P) = f(P+k*L) => when the hare and the tortoise meet at x=k*L, if we reset the hare's position to 0 and set its speed to be the same as the tortoise, then after P steps, they will meet again at P, or the starting point of the loop => this is the point we want to find.
 

Thanks a lot! It never occurred to me that such question even has several named algorithms. Hare-and-Tortoise obviously couldn't be applied as is (because we have single function to call and it has some state) - but obviously it will fit in some modified version, especially if we allow restarting generator.

I'll send the link to your comment to my fellow, so thanks from him too, beforehand :)