DEV Community

Mage
Mage

Posted on

Solving Interview Problems with Deep Learning

Image descriptionPhoto credit: Godzilla vs. Kong

It’s been a while since I’ve practiced programming interview questions, and I worry that my skills are lacking. It’s always important to work through some every now and then to stay sharp so here we go:

Let’s start with Fizz Buzz:

Write a program that prints the numbers from 1 to 100. But for multiples of 3print “Fizz” instead of the number and for the multiples of 5 print “Buzz”. For numbers which are multiples of both 3 and 5 print “FizzBuzz”.

This solution was actually inspired by Joel Grus. We can represent numbers with binary encoding:

import tensorflow as tf
import numpy as np
INPUT_SIZE = 10 # We can encode up to 2^15 numbers with binary encoding.
HIDDEN_SIZE = 100 # Hidden layer of size 100
OUTPUT_SIZE = 4 # One hot encoding for the 4 possible outputs: "Fizz" "Buzz" "FizzBuzz", ""
NUM_EPOCHS = 10000
def binary_encode(number):
    data = np.zeros(INPUT_SIZE)
    for bitshift in range(INPUT_SIZE):
        data[bitshift] = number >> bitshift & 1 # Get the bit at position bitshift
    return data
def fizz_buzz_encode(i):
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])
def fizz_buzz_decode(encoding):
    max_idx = np.argmax(encoding)
    if max_idx == 0:
        return ""
    if max_idx == 1:
        return "fizz"
    if max_idx == 2:
        return "buzz"
    if max_idx == 3:
        return "fizzbuzz"
Enter fullscreen mode Exit fullscreen mode

Now, we can generate some training data (we will generate training data from 101 to 1024) since our actual problem will solve fizzbuzz for 1–100.

X_train = np.array([binary_encode(i) for i in range(101, 2 ** INPUT_SIZE)])
y_train = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** INPUT_SIZE)])
Enter fullscreen mode Exit fullscreen mode

And now, let’s define our multilayer perceptron model that will actually perform the bulk of the work. First, we define the inputs to the model:

X = tf.placeholder('float', [None, INPUT_SIZE])
Y = tf.placeholder('float', [None, OUTPUT_SIZE])
Enter fullscreen mode Exit fullscreen mode

Next, let’s define the weights and biases that we will learn via backpropagation:

w1 = tf.get_variable("w1", [INPUT_SIZE, HIDDEN_SIZE], initializer=tf.random_normal_initializer())
b1 = tf.get_variable("b1", [HIDDEN_SIZE])
w2 = tf.get_variable("w2", [HIDDEN_SIZE, OUTPUT_SIZE], initializer=tf.random_normal_initializer())
b2 = tf.get_variable("b2", [OUTPUT_SIZE])
Enter fullscreen mode Exit fullscreen mode

And now, let’s feed our input through the model:

z1 = tf.add(tf.matmul(X, w1), b1)
a1 = tf.nn.relu(z1)
z2 = tf.add(tf.matmul(a1, w2), b2)
Enter fullscreen mode Exit fullscreen mode

Then, let’s compute the loss and tell tensorflow to minimize that loss:

cost = tf.nn.softmax_cross_entropy_with_logits(logits=z2, labels=Y)
optimizer = tf.train.GradientDescentOptimizer(0.001).minimize(cost)
Enter fullscreen mode Exit fullscreen mode

Now, let’s actually feed our real training data into that model:

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(NUM_EPOCHS):
        c, o = sess.run([cost, optimizer], feed_dict={X: X_train, Y: y_train})
        print(np.sum(c))
    X_test = np.array([binary_encode(i) for i in range(1, 100)])
    y_test = np.array([fizz_buzz_encode(i) for i in range(1, 100)])
    pred = sess.run(z2, feed_dict={X:X_test, Y:y_test})
Enter fullscreen mode Exit fullscreen mode

And print out the results:

def real_fizz_buzz(i):
    txt = ""
    if i % 3 == 0:
        txt += "fizz"
    if i % 5 == 0:
        txt += "buzz"
    return txt

for i in range(len(X_test)):
    text = fizz_buzz_decode(pred[i])
    true_text = real_fizz_buzz(i + 1)
    print("{}: {} ({})".format(i + 1, text, "✔" if text == true_text else "x"))
Enter fullscreen mode Exit fullscreen mode

Results:

1:  (✔)
2:  (✔)
3:  (x)
...
97:  (✔)
98:  (✔)
99: fizz (✔)
Total Accuracy: 89.89%
Enter fullscreen mode Exit fullscreen mode

For all that work, we achieved an embarrassingly low 90% accuracy. I was going to try to fix this but then quickly lost interest. Let’s move onto something a bit more interesting:

Write a program that counts the number of unique characters in a string.

This sounds like a problem that can be solved with an LSTM. Let’s generate some data.

import tensorflow as tf
import random
import numpy as np
CHARS = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
STRING_LENGTH = 12
num_examples = 10000
# Args:
#   n: Number of examples to generate.
# Returns:
#   strings_v: numpy array of the form (n, STRING_LENGTH, len(CHARS)). One hot encoding of sequences of text
#   strings: Array of actual generated random text:
#   uniques_v: numpy array of the form (n, len(CHARS)). One hot encoding of number of unique characters
#   uniques: numpy array of length n, number of unique characters for each sequence.
def generate_data(n=num_examples):
    chars_to_idx = { c: i for i, c in enumerate(CHARS)}

    strings_v = np.zeros([n, STRING_LENGTH, len(CHARS)])
    strings = [''] * n
    uniques = np.zeros(n)
    uniques_v = np.zeros([n, len(CHARS)])
    for x in range(n):
        for y in range(STRING_LENGTH):
            random.shuffle(CHARS)
            char = CHARS[0]

            strings_v[x][y][chars_to_idx[char]] = 1
            strings[x] += char

        uniques[x] = len(set(strings[x]))
        uniques_v[x][len(set(strings[x])) - 1] = 1

    return strings_v, strings, uniques_v, uniques
Enter fullscreen mode Exit fullscreen mode

Next let’s create our LSTM model:

HIDDEN_LAYERS = 64
X = tf.placeholder("float", [None, STRING_LENGTH, len(CHARS)])
y = tf.placeholder("float", [None, len(CHARS)])
X_seq = tf.unstack(X, STRING_LENGTH, 1)
lstm_cell = tf.contrib.rnn.BasicLSTMCell(HIDDEN_LAYERS)
#sequence of 12 chars to output of 7
outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, X_seq, dtype=tf.float32)
final_output = outputs[-1]
weights = tf.get_variable("weights", [HIDDEN_LAYERS, len(CHARS)], initializer=tf.random_normal_initializer())
biases = tf.get_variable("biases", [len(CHARS)], initializer=tf.random_normal_initializer())
prediction = tf.add(tf.matmul(final_output, weights), biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)
Enter fullscreen mode Exit fullscreen mode

Now, let’s go ahead and run our model:

init = tf.global_variables_initializer()
costs = []
EPOCHS = 300
with tf.Session() as sess:
    sess.run(init)
    for i in range(EPOCHS):
        _, c, _ = sess.run([train_op, cost, prediction], feed_dict={X:X_train, y: y_train})
        costs.append(c)
        if i % 10 == 0:
            print('cost: {}, epoch: {}'.format(c, i))

            X_test, X_test_strings, y_test, y_test_strings = generate_data(100)
            p = sess.run(prediction, feed_dict={X: X_test, y: y_test})
            prediction_idxs = np.argmax(p, axis=1)
            prediction_vals = prediction_idxs + 1
correct = 0.0
            for i in range(len(y_test_strings)):
                string = X_test_strings[i]
                actual_val = y_test_strings[i]
                predicted_val = prediction_vals[i]
                # Print the first 5 examples
                if i < 5:
                    print('string: {}, pred: {}, actual: {}'.format(string, predicted_val, actual_val))

                if predicted_val == actual_val:
                    correct += 1
            print("{}% accuracy\n\n".format(correct * 100 / len(y_test_strings)))
Enter fullscreen mode Exit fullscreen mode
cost: 0.0496655665338, epoch: 280
string: eeaccbdeagac, pred: 6, actual: 6.0
string: gefaddbbcfac, pred: 7, actual: 7.0
string: acedcbgdcagf, pred: 7, actual: 7.0
string: aadebcdacefg, pred: 7, actual: 7.0
string: abeeaebcbbag, pred: 5, actual: 5.0
100% accuracycost: 0.0413268692791, epoch: 290
string: ebbfbfaebede, pred: 5, actual: 5.0
string: fgaccdbabedg, pred: 7, actual: 7.0
string: dgdcbaefcdad, pred: 7, actual: 7.0
string: ffagdfedccad, pred: 6, actual: 6.0
string: gfggfgbebcdb, pred: 6, actual: 6.0
99% accuracy
Enter fullscreen mode Exit fullscreen mode

Hopefully the interviewer won’t be disappointed that we cannot solve this problem for any string that’s greater than MAX_LENGTH, but overall, it achieved a 99% accuracy. I didn’t deal with variable length inputs in this case, we could’ve easily done that by adding padding.

Now, instead of just printing out the number of unique characters, print out the actual unique characters in the order they appear.

Ex: “acabdb” => “acbd”

Oof. Thats a much tougher problem. In this case, the value we need to return is a sequence of characters instead of a single character or number so we will need some form of sequence to sequence model in order to learn this relationship.

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn
import random
#"abc" => "abc"
#"aabbac" => "abc"
#"abacd" => "abcd"
MAX_LENGTH = 6 # Max length of 6 
chars = ["a", "b", "c", "d", "e", "f"]
all_chars = chars + [' '] # Space for padding
NUM_EXAMPLES = 50000
# Args:
#   n: number of examples to generate
# Returns:
#   strings: list of strings that may contain duplicates
#   solutions: strings without duplicates
#   strings_v: One hot encoding of strings with duplicates (without padding)
#   solutions_v: One hot encoding of solutions (with padding)
def generate_data(n=NUM_EXAMPLES):
    all_chars_to_idx = { c:i for i, c in enumerate(all_chars) }
    strings_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))
    solutions_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))

    strings = [''] * NUM_EXAMPLES
    solutions = [''] * NUM_EXAMPLES

    for i in range(NUM_EXAMPLES):
        for l in range(MAX_LENGTH):
            char = random.choice(chars) # only sample from valid characters
            strings[i] += char
            if char not in solutions[i]:
                solutions[i] += char

        # Pad solutions strings
        num_missing = MAX_LENGTH - len(solutions[i])
        solutions[i] += ' ' * num_missing

    for x in range(len(strings)):
        for y in range(MAX_LENGTH):
            string_char = strings[x][y]
            strings_v[x][y][all_chars_to_idx[string_char]] = 1

            solution_char = solutions[x][y]
            solutions_v[x][y][all_chars_to_idx[solution_char]] = 1

    return strings, solutions, strings_v, solutions_v
Enter fullscreen mode Exit fullscreen mode

Again, we can one hot encode the sequences, but this time, we will add some padding since the length of the output string is variable. Instead of trying to return a variable length sequence, we will just return a sequence that is equal to the length of the input, and pad the output with spaces.

For example, “abdab” would map to a string of the same length, but with spaces as padding: “abd “.

Let’s create a test and training split.

strings, solutions, strings_v, solutions_v = generate_data()
split_at = len(strings) - (len(strings) // 10)
strings_train = strings[:split_at]
solutions_train = solutions[:split_at]
X_train = strings_v[:split_at]
y_train = solutions_v[:split_at]
strings_test = strings[split_at:]
solutions_test = solutions[split_at:]
X_test = strings_v[split_at:]
y_test = solutions_v[split_at:]
Enter fullscreen mode Exit fullscreen mode

Now we can set up our model. Let’s start with the encoder:

encoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
decoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
with tf.name_scope("basic_rnn_seq2seq") as scope:
    encoded_sequence = tf.unstack(encoded_input, MAX_LENGTH, 1)
    encoder_cell = rnn.BasicLSTMCell(128, forget_bias=1.0)
    encoded_outputs, states = rnn.static_rnn(encoder_cell, encoded_sequence, dtype=tf.float32)
Enter fullscreen mode Exit fullscreen mode

And now, the decoder:

with tf.name_scope("lstm_decoder") as scope:
    decoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
    decoder_cell = rnn.BasicLSTMCell(128, reuse=True)
    decoded_outputs, _ = rnn.static_rnn(decoder_cell, decoded_sequence, initial_state=states, dtype=tf.float32)
Enter fullscreen mode Exit fullscreen mode

Now, we can compute the predictions by multiplying the decoder’s hidden layer output with a fully connected layer:

with tf.name_scope("fully_connected") as scope:
    weights = tf.get_variable('weights', (128, len(all_chars)), initializer=tf.random_normal_initializer())
    biases = tf.get_variable('biases', (len(all_chars)), initializer=tf.random_normal_initializer())

    predictions = []
    encoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
for output in decoded_outputs:
        prediction = tf.add(tf.matmul(output, weights), biases)
        predictions.append(prediction)
concatenated_outputs = tf.stack(predictions, 0)
    concatenated_outputs = tf.transpose(concatenated_outputs, perm=[1, 0, 2])
    concatenated_inputs = tf.concat(decoded_input, 0)
Enter fullscreen mode Exit fullscreen mode

Now, we can compute the loss:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=concatenated_outputs, labels=concatenated_inputs))
# FC Layer
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)
Enter fullscreen mode Exit fullscreen mode

And now let’s run our model:

def decode_guess(one_hot):
    return ''.join([all_chars[m] for m in np.argmax(one_hot, axis=1)])
init = tf.global_variables_initializer()
costs = []
with tf.Session() as sess:
    sess.run(init)
    for i in range(30):
        e, r, c, t, c_out, c_in = sess.run([encoded_outputs, predictions, cost, train_op, concatenated_outputs, concatenated_inputs], feed_dict={encoded_input: X_train, decoded_input: y_train})
        costs.append(c)
        if i % 5 == 0:
            print('training cost: {}, epoch: {}'.format(c, i))

            results = sess.run(predictions, feed_dict={encoded_input: X_test, decoded_input: y_test})
            guesses = np.array(results).transpose(1, 0, 2)
            for i in range(5):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                print("{}: {} - {}".format(string, solution, guess_decoded))

            correct = 0.0
            for i in range(len(strings_test)):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                if i < 5:
                    print("input: {}, solution: {}, prediction: {}".format(string, solution, guess_decoded))
if solution == guess_decoded:
                    correct += 1

            print("{} % accuracy".format(correct / len(strings_test) * 100))
Enter fullscreen mode Exit fullscreen mode

And here are the results:

training cost: 2.07600975037, epoch: 0
cebbdf: cebdf  -       
dffcbc: dfcb   -       
aadeab: adeb   -       
faceec: face   -       
bfaeec: bfaec  -       
0.0 % accuracy...training cost: 0.219934388995, epoch: 25
cebbdf: cebdf  - cebdf 
dffcbc: dfcb   - dfcb  
aadeab: adeb   - adeb  
faceec: face   - face  
bfaeec: bfaec  - bfaec 
100.0 % accuracy
Enter fullscreen mode Exit fullscreen mode

It looks like by 25 epochs, our model has a fairly good understanding of how to solve this problem.
Image description

Please do not solve real interview problems in this way.

Discussion (0)