I thought it would be fun to write up solutions to the Microsoft interview questions that Andrew Wagner (chessguy) posted to the Haskell Cafe mailing list. I read about his post in the Haskel Sequence weekly news of June 25, 2008. You can read his post here.
> module Questions
> where
> import Data.Array
> import Data.List
> import Random
1.) A “rotated array” is an array of integers in ascending order, after which for every element i, it has been moved to element (i + n) mod sizeOfList. Write a function that takes a rotated array and, in less-than-linear time, returns n (the amount of rotation).
First we’ll write a simple function to rotate a given array (xs) by a given amount (n).
> rotate n xs = b ++ a
> where n' = n `mod` (length xs)
> (a, b) = splitAt ((length xs) - n') xs
Now we turn to the function that computes how much the array was rotated. The key to the solution is realizing that you need to find an inflection point where consecutive numbers in the rotated list decrease. To do this, we find the first, largest sub-sequence that monotonically increases. We can find that subsequence using binary search, testing the endpoints of subsequences. The running time is O(log(n)), where n is the length of the input list (ignoring the O(n) time to construct our array).
> rotateAmount xs = _ra 0 ((length xs) - 1) (listArray (0, ((length xs) - 1)) xs)
> where _ra s e ys = if (e - s) == 1
> then (if ((ys ! s) < (ys ! e)) then s else e) -- base case
> else let h = ys ! s -- first item
> l = ys ! e -- last item
> mi = s + ((e - s) `div` 2) -- middle index
> m = ys ! mi -- middle item
> in if (h < l)
> then s -- return start index
> else if (h > m)
> then _ra s mi ys
> else _ra mi e ys
2.) You are given a list of Ball objects. Each Ball is either Red or Blue. Write a function that partitions these balls so that all of the balls of each color are contiguous. Return the index of the first ball of the second color (your result can be Red balls, then Blue balls, or the other way around). In haskell, you’ll probably want to return a ([Ball],Int).
This one is dead-simple in Haskell.
> data Ball = Red | Blue deriving (Eq, Ord, Show, Read)
> groupBalls bs = let gs = (group . sort) bs
> in (concat gs, length $ head gs)
3.) Live Search is a search engine. Suppose it was to be tied into an online store. Now you’re given two lists. One is a [(SessionId, NormalizedQuery)]. That is, when a particular user performs a query, it is turned into some consistent format, based on their apparent intent, and stored in this logfile. The second list is a [(SessionId, ProductId)]. This indicates the product bought by a particular user. Now, you want to take these two (potentially very long) lists, and return some structure that will make it easy to take a query and return a list of the most popular resulting purchases. That is, of people who have run this query in the past, what are the most common products they’ve wound up buying? The interviewer said that this is an instance of a well-known problem, but I didn’t catch the name of it.
I’m skipping this for now because its more of a rambling thought experiment. I’d love to chat about it, but I’m not going to immediately dive into writing some code.
4.) You’re given an array which contains the numbers from 1 to n, in random order, except there is one missing. Write a function to return the missing number.
Obviously sorting and walking through would work, but we can do better because the sum of the numbers 1 through n is (n * (n+1) / 2). So we can get it in one pass. This takes O(n) time. Neat.
> missingInt xs = let n = length xs
> in ((n + 1) * (n + 2) `div` 2) - (sum xs)
5.) Write a function to reconstruct a binary tree from its preorder traversal and inorder traversal. Take into account that the traversals could be invalid.
First define a tree data structure.
> data Tree a = Tree a (Tree a) (Tree a) | Leaf a | Empty
> deriving (Eq, Ord, Show, Read)
For the pre-order list we need to split it up by what elements are less than or greater than the head of the list (thinking recursively here). So in the initial list, the first element is the root, and then the subsequent items that are less than the first element are in the left branch, and the other items are in the right branch. I don’t catch invalid traversals, although they would show up if ‘rest2′ in the below function was not empty.
> preorderTree xs = f xs
> where f [] = Empty
> f (h:[]) = Leaf h
> f (h:tl) = let (left, rest1) = span (< h) tl
> (right, rest2) = span (> h) rest1
> in Tree h (f left) (f right)
> po1 = [4, 2, 1, 3, 5, 6] -- test
Now, for the in-order traversal, every input will simply be an in-order list of numbers. So we create a balanced binary tree using that traversal by recursively splitting the list in half.
> inorderTree xs = f xs
> where f [] = Empty
> f (h:[]) = Leaf h
> f (h1:h2:[]) = Tree h2 (Leaf h1) Empty
> f ys = let (left, right) = splitAt ((length ys) `div` 2) ys
> in Tree (head right) (f left) (f $ tail right)
> pi1 = [1..6]
6.) You have a [(WeatherStationId, Latitude, Longitude)]. Similar to #3, write a function which will, off-line, turn this into a data structure from which you can easily determine the nearest Weather Station, given an arbitrary Latitude and Longitude.
I’m fairly stumped by this one. I can think of good ways to to solve this problem conceptualy, but I can’t think of a simple data structure to return. I’d like to return a data structure of polygons, where each polygon encapsulates what points are closest to each weather station. I don’t feel like coding that up though.
7.) Write a function for scoring a mastermind guess. That is, in a game of mastermind (http://en.wikipedia.org/wiki/Mastermind_(board_game)), given a guess, and the actual answer, determine the number correct, the number wrong,and the number out of place.
For this solution, we zip up the answer and the guess so we can examine the elements pair-wise. We then consider each pair. If they are equal, we can increment a correct counter. Otherwise, we must accumulate a list of incorrect values. As we build up the incorrect values, we check if each guess and answer has already been accumulated in the list of incorrect values. When we find it in the accumulated list, we know we’ve found a guess that was in the wrong position.
> data ColoredPeg = W | G | B | R | P | O deriving (Eq, Ord, Show, Read)
> scoreMM as gs = let (crct, wrg, _) = foldl f (0, [], []) (zip as gs)
> wrongPosition = (length as) - crct - (length wrg)
> in (crct, (length wrg), wrongPosition)
> where f (c, oA, oG)
> (a, g) = case g == a of -- update an accumulator after examining each pair
> True -> (c+1, oA, oG) -- match, so simply update the count of correct
> False -> case ((a `elem` oG), (g `elem` oA)) of
> (True,True)->(c, (rem g oA), (rem a oG))
> (False,True)->(c, a:(rem g oA), (rem a oG))
> (True,False)->(c, (rem g oA), g:(rem a oG))
> (False,False)->(c, a:(rem g oA), g:(rem a oG))
> rem a [] = []
> rem a (b:bs) = if (a == b) then bs else b:(rem a bs)
> testAnswer1 = [W, G, B, R]
> testGuess1 = [W, G, R, B]
> testGuess2 = [W, G, R, P]
8.) Implement a trie (http://en.wikipedia.org/wiki/Trie) data structure. Write a function add, which takes a word, and a trie, and adds the word to the trie. Write another function lookup, which takes a prefix and a trie, and returns all the words in the trie that start with that prefix.
This is a fairly straightforward problem. We create the Trie data structure and define the recursive functions. I wish that we didn’t have a special data value for the Trie root, but I couldn’t think of an alternate solution besides having a special null character as the root.
> data Trie = TrieRoot { tries :: [Trie] }
> | Trie { char :: Char, tries :: [Trie] }
> deriving (Eq, Ord, Show, Read)
> trieAdd t [] = t
> trieAdd (TrieRoot ts) ws = TrieRoot{tries = (_tahelp ts ws)}
> trieAdd (Trie c ts) ws = Trie{char = c, tries = (_tahelp ts ws)}
> _tahelp ts (w:ws) = case find (\t -> (char t) == w) ts of
> Nothing -> let newTrie = trieAdd Trie{char=w,
> tries=[]} ws
> in newTrie:ts
> Just mtch -> let otherTries = filter (\t -> (char t) /= w) ts
> newTrie = trieAdd mtch ws
> in newTrie:otherTries
> trieLookup (TrieRoot ts) ws = concatMap (\t -> trieLookup t ws) ts
> trieLookup (Trie c ts) [] = case ts of
> [] -> [[c]]
> otherwise -> recurse
> where recurse = map (\str -> c:str) $ concatMap (\t -> trieLookup t []) ts
> trieLookup (Trie c ts) (w:ws) = if (c == w)
> then case ts of -- we matched the character
> [] -> [[c]]
> otherwise -> recurse
> else [] -- stop the search down this trie branch
> where recurse = map (\str -> c:str) $ concatMap (\t -> trieLookup t ws) ts
> testTrie = trieAdd (trieAdd (trieAdd (TrieRoot []) "hi") "hot") "gr"
> testTrie2 = trieAdd (TrieRoot []) "h"
9.) Write an algorithm to shuffle a deck of cards. Now write a function to perform some kind of evaluation of “how shuffled” a deck of cards is.
To shuffle I swap element i with a random element at position [i, (n-1)].
> shuffle gen xs = let n = length xs
> a = listArray (0, (n - 1)) xs -- create an Array
> swapPairs = (zip [0..(n - 1)] (swaps gen n)) -- compute a sequence of swaps to perform
> in elems $ foldl f a swapPairs
> where f a' (i1, i2) = a' // [(i1, a'!i2), (i2, a'!i1)] -- swap element at i1 with element at i2
Here’s a function which generates a sequence of swaps. We swap element at index ‘i’ into the range [i, (n-1)].
> swaps gen n = unfoldr f (gen, 0)
> where f (g, i) = if (i < n)
> then let (randInt, g') = next g
> swapIndex = (randInt `mod` (n - i)) + i -- produce a number in [i, (n-1)]
> in Just (swapIndex, (g', (i+1))) -- add 'swapIndex' to the generated list
> else Nothing -- stop generating elements
Here are functions that generates a sequence of random numbers or random generators. Can be used to call shuffle multiple times.
> randomGems = map mkStdGen randomNums
> randomNums = unfoldr (Just . next) (mkStdGen 1)
One way we might measure how good the suffle is (although for a single shuffle, the question is a bit preposterous), is to compute the min, max, and average distance that each item moved in the list.
> shuffleQuality xs ys = let xsPos = sortBySnd $ zip [1..(length xs)] xs
> ysPos = sortBySnd $ zip [1..(length ys)] ys
> diffFun = (\ (xp, yp) -> abs ((fst xp) - (fst yp)))
> posDiffs = map diffFun (zip xsPos ysPos)
> maxDiff = maximum posDiffs
> minDiff = minimum posDiffs
> avgDiff = (sum posDiffs) `div` (length posDiffs)
> in (maxDiff, minDiff, avgDiff)
> where sortBySnd ps = sortBy (\ p1 p2 -> compare (snd p1) (snd p2)) ps