My friend (really former student) wrote me about interview he had recently - as I understand for "intern QA" position at some of Google departments (probably Adwords/Adsense but not sure). One of the questions puzzled him the most, though I think it was more due to slight panic, rather than difficulty. Here I'll share this question and then put few details he told about interview itself for those curious.
Detecting Loop in Numeric Sequence
It was a kind of brief "practical exercise" after long "theoretical part". Fellow was proposed to write small snippet of code in Python or Java (languages he manifested to know).
Suppose, we develop function, which on every call yields some number (it is ok to think of it as of random generator) - and we need to check how many calls we can do until sequence starts to repeat.
I.e. we want to:
- make sure sequence won't enter loop (repetitions) too soon;
- if possible, tell at which step it enters loop and how many steps loop take.
Some "fake" example of such function was provided (it was not remembered well, but by his recollections I reconstructed it as von Neumann random generator with improvement:
x = ... # some starting value, e.g. 1776
n = 100
nn = n * n
w = 0
s = nn // 2 - 1917
def next_num():
w = (w + s) % nn
x = ((x * x + w) // n) % nn
return x
Now what my friend was proposing and their objections:
Let us create
dict
and put every valuex
into it, and after each call check, if newx
is in this dict already. (they pointed out the state of sequence depends onw
either - but this is rather minor detail)Let's do the same, but use tuples
(x, w)
as keys of ourdict
. (they said it works forn = 100
but he should increase ton = 10000
- his program was not ending)He said program is not finishing because of too large number of iterations. However they explained it is about
nn
iterations and so would end in a few minutes unless something bad happens.After some thought he understand
dict
becomes very large (up to 100 mln elements here) - and process takes all memory and (as I suppose, hits the swap and performance degrades greatly).He offered to save memory, by replacing tuples with
w * nn + x
and usingbitset
instead ofdict
.
They agreed it is workaround, but said with n = 100000
or more program anyway may eat up all memory. I thought it may too long to calculate to be reasonable anyway - but then thought, this could be rewritten in Go
or Java
with 10 mln iterations per second speed - so running time may be about hour, not too long to be "unreasonable".
So he was told though he did well enough, they expected solution which won't depend on having gigabytes of memory.
What is your opinion? I think I understand what they mean. My idea is described in the end of this post.
About Interview
This happened in Google office in Canada (haven't asked where exactly but perhaps it could be googled), this fellow moved here with his parents year ago and completed education here or something like this. They had some Google employees visiting their college (?) and telling about possible internship or something like this. I believe there was some application process and eventually he was invited for interview.
Interview had the following parts:
- many questions about languages he said he know (Java, Python) - though only about standard libraries and syntax, no tools, frameworks etc
- few very shallow questions about other technologies, mainly SQL and HTML - i.e. checking "general understanding" about them
- logic puzzles (I believe at least some are well-known, e.g. "how to put digits on two dice so that any day number between
01
and31
could be represented) - practical exercise (single question, as described above)
I'm not sure about order of all other parts, but he said they, besides exercise, took somewhat over 1 hour and then exercise was the last and he spent "almost hour" on it solely.
He haven't shared anything about interview results yet - I suppose he have no response still.
My idea
Here it is, encoded in ROT13 (google to decode if you don't know it) - just to avoid you seeing it unintentionally and spoiling your fun.
Jr arrq gb fgber pheerag (j,k), gura eha sbe x vgrengvbaf, vs jr qba'g rapbhagre gur fnzr, jr fgber arj pheerag (j,k) naq eha sbe x*2 vgrengvbaf, naq fb ba - fb jr qba'g eryl ba qvpg ohg gel qrgrpgvat ybbc ol fvatyr "fgngr", vg znl gnxr gjb gvzrf zber vgrengvbaf, ohg jr fnir gvzr ba trggvat evq bs qvpg.
Top comments (2)
This cycle detection in functional graph problem can be solved by the Floyd's Tortoise and Hare algorithm in O(P+L) time and O(1) memory, where P is the start point of the loop, and L is the length of the loop: en.wikipedia.org/wiki/Cycle_detection
TL;DR?
The basic idea is, for all
i >= P
, we havef(i) = f(i+k*L)
, because all points after P is in the loop, so:f(k*L) = f(2*k*L)
=> there must exist a pointx=k*L
in the loop thatf(x) = f(2*x)
=> if a hare and a tortoise start their race at 0, and the hare moves 2 times as fast as the tortoise, they will meet at the pointx=k*L
in the loop.f(P) = f(P+k*L)
=> when the hare and the tortoise meet atx=k*L
, if we reset the hare's position to 0 and set its speed to be the same as the tortoise, then after P steps, they will meet again at P, or the starting point of the loop => this is the point we want to find.Thanks a lot! It never occurred to me that such question even has several named algorithms. Hare-and-Tortoise obviously couldn't be applied as is (because we have single function to call and it has some state) - but obviously it will fit in some modified version, especially if we allow restarting generator.
I'll send the link to your comment to my fellow, so thanks from him too, beforehand :)