Dissecting an interview question: reconstructing a tree

College recruiting season is almost upon us again, and I’ve been doing a little research to try to find a new question. My current question is OK, but it’s a little clunky, and I’ve been sniffing around for something new.

In my research I’ve been going through Programming Problems. It’s a good compendium of the different types of problems you’re likely to run into, focusing mostly on different data structures (lists, trees, queues, stacks, hash tables, and so on) and algorithms (searching, sorting, etc.). It’s interesting how common some of the questions are – even many of the “problem solving” style questions are ones I’ve seen before.

So I was a little surprised when I ran into an unfamiliar, difficult question in the early chapters (“This question was a favorite of interviewers at Google for some time,” according to Green), with a complicated answer that left me scratching my head. OK, it happens, some people ask tough questions and if you see your way through, great, if not, well, that’s the way it goes.

Except that it gnawed at me. This is the mark of a good question – it looks hard at first, but as you think about it the way opens up. And I decided that the answer in the book was, if correct, horribly over-complicated. It also had a terrible big-O running time.

And so, I present to you the problem and my preferred solution. I hope you find this useful, or at least interesting. I encourage you to try to solve the problem before reading the walkthrough.

The problem

Reconstruct a binary search tree (BST) from a list (or array) that contains the results of a traversal of the tree. You may use any traversal you wish.

Traversals

First, a little background. There are three canonical ways to traverse a BST:

in-order: in which you first traverse the left branch, then the root, then the right branch, recursively. This gives you an ordered list of values.
pre-order: in which you first traverse the root, then the left branch, then the right branch.
post-order: in which you first traverse the left branch, then the right branch, then the root.

These traversals can be demonstrated with the following tree:

  10
 /  \
5   20

in-order: [5, 10, 20]
pre-order: [10, 5, 20]
post-order: [5, 20, 10]

Now, on to the problem.

How to think about the problem

If you were guaranteed that this were a balanced, complete tree, then you could easily use the in-order traversal to solve the problem. This, however, is the first trap – every time you catch yourself adding a constraint to a problem (“if the tree is balanced and complete…”), you need to test this with the interviewer. In this case, the answer is no, it could be any valid BST, even a terribly unbalanced one.

The next step is to figure out an approach. In this problem, our first question is which traversal to use. Can we use in-order? Even with the simple tree up above, there are five different trees the in-order traversal could have come from. So no, in-order is out.

What about pre-order? With pre-order, we first push the root node, then do the left sub-tree, then the right, so we’re guaranteed that the 10 is the root, the 5 is to the left, and the 20 is to the right. Seems like a possibility.

Post-order is very similar to pre-order, only it’s the last element in the buffer that goes into the root spot. This also seems like a possibility.

Ultimately, simpler seems better, so I’m going to decide to go with pre-order.

At this point, it’s helpful to draw a representation of what your list looks like. This isn’t just for fun – research shows that it’s hard to keep multiple things in your mind simultaneously, and putting things (especially pictures) onto the whiteboard can clear space for other thoughts. Don’t be macho and think that you’re special in this regard – it’s true of everyone, and using the whiteboard to cache important visual ideas is like giving yourself a RAM upgrade.

In this case, we have the ROOT, then the Left sub-tree, then the Right sub-tree. So the elements look like this:

ROOT L L L L L L L L R R R R R R R

Crucially, the same is true of both the left and right sub-trees. I.e., the first element of each is the root of that sub-tree, and so on. This will be important later.

Set up your data structure

We’re going to go simple here – no need for a container Tree class, everything will be a Node:

public class Node
{
    Node left, right;
    int value;
    public Node(int value)
    {
        this.value = value;
    }
}

At this point, we need to think about the algorithm. Let’s just start writing, and see where it gets us.

public static Node reconstruct(int[] values)
{
    // TODO
}

We could use an ArrayList<Integer> for the values, but in this case an array is simple, and gets us what we want.

Next, we know that the first element is the root, so let’s take care of that.

public static Node reconstruct(int[] values)
{
    Node root = new Node(values[0]);
    root.left = get_left_subtree_somehow(values);
    root.right = get_right_subtree_somehow(values);
    return root;
}

OK, this is a good next step, but now we’re stuck. How are we going to get the sub-trees? Remember what we said before about each of the sub-tree sections (“L L L”, “R R R”) following the same pattern, recursively? Because of that, we can do the exact same thing, over again:

public static Node reconstruct(int[] values)
{
    Node root = new Node(values[0]);
    root.left = reconstruct(values);
    root.right = reconstruct(values);
    return root;
}

Unfortunately, as I’m sure you’ve already seen, this will just recurse infinitely. You need to restrict the method to only work on the part of the array that you care about. For that, we’ll need some new parameters. “start” will be the first index, and “end” will be one after the last index (this is a common pattern in Java – c.f., String.substring()).

public static Node reconstruct(int[] values, int start, int end)
{
    Node root = new Node(values[start]);
    int left_subtree_start = start + 1;
    int right_subtree_start = ? // we have to figure out where the right subtree starts
    root.left = reconstruct(values,left_subtree_start,right_subtree_start);
    root.right = reconstruct(values,right_subtree_start,end);
    return root;
}

OK, we’re getting closer. Given that we know that everything between indices “start+1” and “end” is a sub-element of values[start], we can just see what’s smaller, and what’s bigger, than values[start].

public static Node reconstruct(int[] values, int start, int end)
{
    Node root = new Node(values[start]);
    int left_subtree_start = start + 1;
    int right_subtree_start = left_subtree_start;
    // find the start of the right sub-tree
    // also, make sure we don't run off the end
    while (right_subtree_start < end && values[right_subtree_start] < root.value)
    {
        right_subtree_start++;
    }
    root.left = reconstruct(values,left_subtree_start,right_subtree_start);
    root.right = reconstruct(values,right_subtree_start,end);
    return root;
}

OK, looking pretty good! All of our pseudo-code and question marks are gone, and the code looks done. This is the point at which you should go down a mental checklist. First and foremost is, have you covered all the base cases? No, you haven’t. What if there’s nothing in a sub-tree (i.e., start == end)? It’s going to happen eventually, at which point this code will break. So let’s cover that case:

public static Node reconstruct(int[] values, int start, int end)
{
    if (start == end)
    {
        return null;
    }
    Node root = new Node(values[start]);
    int left_subtree_start = start + 1;
    int right_subtree_start = left_subtree_start;
    while (right_subtree_start < end && values[right_subtree_start] < root.value)
    {
        right_subtree_start++;
    }
    root.left = reconstruct(values,left_subtree_start,right_subtree_start);
    root.right = reconstruct(values,right_subtree_start,end);
    return root;
}

The next item on the checklist is “have I answered the question?” Again, the answer is no. You may look at this and think you’re done, but this isn’t the kind of code you would expose to an end-user. They shouldn’t have to specify start and end. So, it’s time to turn our method into a helper method, and use the externally-visible method to set things up (this is a very common practice when doing recursion).

public static Node reconstruct(int[] values)
{
    return _reconstruct(values,0,values.length);
}

private static Node _reconstruct(int[] values, int start, int end)
{
    if (start == end)
    {
        return null;
    }
    Node root = new Node(values[start]);
    int left_subtree_start = start + 1;
    int right_subtree_start = left_subtree_start;
    while (right_subtree_start < end && values[right_subtree_start] < root.value)
    {
        right_subtree_start++;
    }
    root.left = _reconstruct(values,left_subtree_start,right_subtree_start);
    root.right = _reconstruct(values,right_subtree_start,end);
    return root;
}

Now, finally, it’s time to go through a sample, prove to yourself that the code works, look for anything else you’ve missed, and call it done.

End notes

If you find thinking through these kinds of problems fun and interesting, you should send me your resume! TripAdvisor is growing, and we’re looking for people who love to code in a collaborative, friendly, fun environment.

If you liked this post, you’ll also like:

Dissecting an interview question

Dissecting an interview question: reconstructing a tree

The problem

Traversals

How to think about the problem

Set up your data structure

End notes

Recommend

独立游戏开发者发现 Linux 玩家递交了更多 Bug 报告

奇客Solidot | 微软的开源项目办公室

手把手教你在线制作亚马逊要的发票(形式发票)

我们是否应该为开采月球水资源投资建造基础设施

The best iPhone 13 mini deals for October 2021 | TechRadar

奇客Solidot | 来自国企的排放

Black Friday arrived early for Blink cameras at Amazon, with prices from $24.99

流量重定向 Rootkit 使用了微软签发的数字签名

管理企业微信的工具有哪些？怎么用好企业微信？

亚马逊卖家3年老店商标被抢注，不给钱产品就下架，还被要7W天价商标费...

About Joyk