Interview Programming Problems Done Right

Introduction

Why 37signals Doesn't Hire Programmers Based on Brainteasers and my comment on HN generated a lot of responses, so much so that I'm writing this post to properly explain the essence of a good (IMHO) interview programming problem.

Pascal's Triangle

Pascal's Triangle is a shortcut for getting coefficients most often used binomial probability. The root element is 1. Every other element is the sum of the one or two above it (diagonally left and diagonally right).

There are several variations of the problem:

  • Print out the triangle to a specific row;
  • Return a given row of the triangle;
  • Return a given element (by row and index) of the triangle.

All of them use the same basic logic. You explain this to the interviewee and ask them to solve it on paper or on a whiteboard.

Recursive Solution

The simplest version is a recursive solution, something like:

#!/usr/bin/python
def value(row, index):
  if index < 0 or index > row:
    return 0
  if index == 0 or index == row:
    return 1
  return value(row-1, index-1) + value(row-1, index)

def row(n):
  return [value(n, x) for x in xrange(0, n+1)]

for i in xrange(10):
  print row(i)

If a candidate can produce this it is at least a working solution even though the performance (for non-trivial n) is prohibitive. Ideally they would be able to point this out (plus the exponential big-O performance).

On my Macbook Pro (2010) this runs for n=20 in about 0.8 seconds.

Memoization

Some candidates will improve this solution by identifying and caching the repeated calculations. Something like this:

import collections

values = collections.defaultdict(dict)

def value(row, index):
  result = values[row].get(index)
  if result is not None:
    return result
  if index < 0 or index > row:
    return 0
  if index == 0 or index == row:
    return 1
  result = value(row-1, index-1) + value(row-1, index)
  values[row][index] = result
  return result

Real time for n=20 is 0.03 seconds. Bonus points if the candidate correctly states this as "memorization" (rather than the more generic "caching").

Iterative Solution

A more common optimization is to use an iterative rather than recursive solution. The simplest version of this is something like:

#!/usr/bin/python
rows = [[1]]

for row in xrange(1, 20):
  values = [1]
  prev = rows[-1]
  for index in xrange(1, row):
    values.append(prev[index-1] + prev[index])
  values.append(1)
  rows.append(values)

for row in rows:
  print row

Performance is similar to the previous one (0.028 seconds). There are lots of subtle variations of this.

Dynamic Programming

The astute candidate will realize that you don't need to store the rows at all. You only ever need the current and the previous rows. This reduces space for O(n2) to O(n).

#!/usr/bin/python
prev = []
for row in xrange(20):
  curr = [1]
  for index in xrange(1, row):
    curr.append(prev[index-1] + prev[index])
  curr.append(1)
  print curr
  prev = curr

0.027 seconds.

Bonus Points

Assuming f(r, i) returns the value for row r and index i, a candidate may well point out that the triangle is symmetrical, specifically that f(r, i) == f(r, r-i), meaning you, at most, only have to calculate half of the triangle. This is particularly relevant if they are asked to return a specific value (ie f(117, 116) == f(117, 1) == 117).

The second optimization one could make is that since f(r, i) == f(r, i-1) + f(r, i) then to calculate f(r, i) you only need to calculate up to the i'th element of each row.

Why is this a good question?

As demonstrated, the code solutions are short. They're longer in, say, C, C++ or Java rather than Python but not that much longer. The point here isn't to get a perfect solution from the interviewee (meaning that deducting points for a missing colon or a typo would be silly). The purpose of the exercise is to demonstrate the they can turn a relatively simple algorithm into code. They have the thought process to do so. In doing so, their familiarity with their chosen language should be obvious.

Also, there are several degrees of solutions. Any solution trumps no solution but the quality of the solution will give you a useful signal (IMHO).

What should a coding test tell you?

A coding test like this is a negative filter. Assuming your chosen problem is sufficiently simple, if an interviewee can't turn it into at least the outline of code in a reasonable length of time then that's a red flag. If they can do it in record time it doesn't mean they're a rock star. You're just trying to weed out candidates who should've already been screened out.

What's a reasonable length of time? IMHO 5 minutes or less is fast (arguably blazing fast). Anything under 10 minutes is fine. If someone is taking more than say 15-20 minutes, that is possibly cause for concern.

Conclusion

The key qualities in a coding test are:

  1. It needs to be relatively simple. If it takes more than about 20-30 lines of Python to solve it's probably too complex;
  2. It needs to be easy to explain. If someone doesn't know what Pascal's Triangle is, that's not a problem. Explain it; and
  3. The goal is to put thought into code. That code doesn't need to be perfect. It just needs to be sufficiently expressive.

0 comments:

Post a Comment