Java quirks and interview gotchas

 2 years ago
source link: https://dandreamsofcoding.com/2015/01/05/java-quirks-and-interview-gotchas/

Interviewers are a diverse lot. Some care about this, others about that, each has her own set of biases, and short of being perfect, there’s really no way to please everyone. The worst is when you’re doing well, then get hung up on an obscure language feature that the interviewer decides is make-or-break. This says more about the interviewer than you, but it can easily cost you an offer if you blank or aren’t prepared.

So, as a public service announcement, and in the interests of moving the conversation past some of the annoying gotcha questions, here’s a grab bag of things you should know about Java – some more important, some less so, some just plain annoying. But, well, interviews.

  • StringBuilder

Seriously, this is one of the foundational classes you use all the time, and yet I frequently run across candidates who’ve never heard of it. College students, in particular, since string manipulation presumably doesn’t come up in their classes. It’s a critically important class, though, which can be demonstrated by the following code snippets:

// Concatenation without StringBuilder
String result = "";
for (int i = 0; i < 100; i++)
  result = result + i;

// Concatenation with StringBuilder
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; i++)
  sb.append(i);
String result = sb.toString();

In the first example, you’re creating a new String and copying every character accumulated so far each time you concatenate result and i. This works, but it turns the loop into an O(n²) operation. In the latter case, you append to a char buffer each time through the loop (O(n) overall), only creating the result String at the end.

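Caveat 1: String concatenation is so common that the compiler rewrites each + expression into StringBuilder calls for you behind the scenes (newer JDKs use a different mechanism to similar effect). That helps for a single expression such as a + b + c, but in a loop like the first example a fresh builder is created on every iteration, so the repeated copying remains. You can see it happen if you step through your code in the debugger.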

Caveat 2: If you don’t specify the initial size of the StringBuilder (as I fail to do above), then it will probably have to resize (and copy) its buffer multiple times.
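
To make Caveat 2 concrete, here is a minimal sketch of pre-sizing the buffer. The class name DigitJoiner and the capacity estimate are my own assumptions for illustration; any reasonable upper bound on the final length would do.

```java
// Hypothetical helper for illustration; the name and the sizing rule are assumptions.
final class DigitJoiner {
    // Concatenates the integers 0..n-1, reserving buffer space up front.
    static String join(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // Upper bound: no number below n has more digits than n itself.
        int capacity = n * String.valueOf(n).length();
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < n; i++) {
            sb.append(i); // never triggers a resize, since capacity is an upper bound
        }
        return sb.toString();
    }
}
```

The estimate overshoots (300 chars reserved for the 190 actually needed when n is 100), which trades a little memory for skipping the intermediate resize-and-copy steps.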

Bonus history lesson: StringBuilder was introduced in Java 1.5 as a drop-in replacement for StringBuffer. The two have the same API, with the exception that StringBuffer synchronizes all operations (which is slower, and usually unnecessary).

  • String.substring()

This is one of those things that seems horribly pedantic and terribly unfair, yet is kind of interesting, and important for you to know (both because you should understand what’s going on under the covers, and because an unusually picky interviewer might spring it on you). As of Java 7u6, the behavior and performance characteristics of the String.substring() method changed. In the old days, String contained four fields: a char[], an offset, a length, and a cached hash. The idea was that multiple String objects could point to the same char[], but have different offset, length, and hash fields. This made substring() an O(1) operation, but had a couple of problems:

  • Memory leaks. Consider the following code:
    String pi_to_a_million_digits = "3.14159265358979323846264...";
    String pi_approx = pi_to_a_million_digits.substring(0,4);
    pi_to_a_million_digits = null;

    You allocate a really big String, take a very small piece of it, and throw away the original String. Now you’re carrying around the full char[], but only using four characters.

  • Serialization. Consider what happens in the above example when you try to serialize pi_approx. You only want four characters, but you end up serializing the million character char[].

Taken together, these are why you sometimes see the following in legacy Java code, with the goal of forcing creation of a new char[] for the new String:

String pi_approx = new String(pi_to_a_million_digits.substring(0,4));

As of 7u6, substring() creates a copy of the sub-string – this fixes the above problems, and removes the need for offset and length. The good news is that this now works more intuitively, and substring() no longer has a weird set of side effects. The bad news is that substring() is now O(n) in the length of the sub-string rather than O(1), and frequently takes more memory (because you’re keeping multiple copies of identical data).
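
The trade-off between the two designs can be sketched with a toy class. SharedString and its method names are hypothetical and exist only for illustration; this is not the JDK's String source.

```java
import java.util.Arrays;

// Hypothetical class for illustration only; not the JDK's actual String.
final class SharedString {
    private final char[] value; // backing array, possibly shared with other instances
    private final int offset;   // index of this string's first character in value
    private final int count;    // number of characters in this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Old-style substring: O(1), no copy, but keeps the whole backing array alive.
    SharedString sharingSubstring(int begin, int end) {
        checkRange(begin, end);
        return new SharedString(value, offset + begin, end - begin);
    }

    // New-style substring: copies only the characters it needs.
    SharedString copyingSubstring(int begin, int end) {
        checkRange(begin, end);
        char[] copy = Arrays.copyOfRange(value, offset + begin, offset + end);
        return new SharedString(copy, 0, copy.length);
    }

    private void checkRange(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
    }

    // Size of the array this instance keeps reachable.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

Both methods return the same text, but the sharing version still pins every character of the original array, while the copying version retains only the four it uses.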

  • String.intern() considered harmful?

String.intern() is a quirky little method that’s caused a lot of trouble over the years. When “interned”, common Strings are stored internally, then re-used instead of being allocated for each use. So, if you’re reading an address from a database:

String state = resultSet.getString(1).intern();
String country = resultSet.getString(2).intern();

If the state and country values have already been stored internally, then references to the previously allocated Strings are returned (and the Strings returned by resultSet become eligible for garbage collection). If they don’t exist yet, the new Strings are stored and returned. This saves on memory in cases where the same Strings are going to be used over and over again, and speeds comparisons a little (even so, don’t use ==, use equals(), which will check for pointer equality anyway).

Prior to Java 7, intern() got a bad rap because it put Strings into PermGen – a fixed-size area of memory, separate from the main heap, that is garbage collected rarely, if at all. Unfortunately, PermGen is usually pretty small when compared to the total heap allocation, so it was easy to hit an OutOfMemoryError when you still had plenty of heap.

As of Java 7, interned Strings are stored in the heap, so this is no longer an issue. However, there’s still a performance penalty for calling intern(), so you shouldn’t unless you have a good reason to, and know what you’re doing.
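
A small sketch of what interning does to object identity. The class and method names are hypothetical; the only facts relied on are that string literals are interned automatically and that new String(...) always allocates a fresh object.

```java
// Hypothetical demo class; names are for illustration only.
final class InternDemo {
    // Reference comparison, used here only to expose identity. Use equals() in real code.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "NY";           // literals are interned automatically
        String fresh = new String("NY"); // a distinct object with the same contents

        System.out.println(sameObject(literal, fresh));          // false
        System.out.println(sameObject(literal, fresh.intern())); // true
        System.out.println(literal.equals(fresh));               // true
    }
}
```

The middle line is the whole point of intern(): equal contents collapse to a single shared object.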

  • Double-checked locking

The following used to be a common pattern for implementing lazy initialization:

class Foo { 
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null) 
          helper = new Helper();
      }
    }
    return helper;
  }
...
}

Alternatively, the static case, frequently used for singletons:

class Foo { 
  private static Foo instance = null;
  public static final Foo getInstance() {
    if (instance == null) {
      synchronized(Foo.class) {
        if (instance == null) 
          instance = new Foo();
      }
    }
    return instance;
  }
...
}

This makes logical sense, and if you didn’t know better, you might even have come up with this idiom on your own (ahem). There’s a pretty amusing article on why this fails non-deterministically in all sorts of creative ways. You should know the fixes, both so that you can use them as necessary, and understand why they’re in someone else’s code:

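  • volatile. By declaring the helper variable volatile (i.e., saying that it can be changed by another thread at any time), you force every read to see the most recent write rather than a stale cached value. As of the Java 5 memory model, it also guarantees that a thread which sees a non-null helper sees a fully constructed Helper.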
    class Foo { 
      private volatile Helper helper = null;
    ...
    
  • SingletonHolder. Because of the way static initialization works (see below), using a private static nested class as a holder for a singleton instance enforces good behavior on the part of the JVM.
    class Foo {
      private final static class FooHolder {
        private static final Foo instance = new Foo();
      }
      public static Foo getInstance() {
        return FooHolder.instance;
      }
    ...
    }
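
For reference, here is a complete version of the volatile fix, assuming Java 5 or later. LazyHelperOwner and its nested Helper are placeholder names of my own, and the local variable is an optional refinement rather than part of the fix itself.

```java
// Hypothetical names; a sketch of double-checked locking with volatile (Java 5+).
final class LazyHelperOwner {
    static final class Helper { } // stand-in for an expensive-to-build object

    private volatile Helper helper; // volatile is what makes the idiom safe

    Helper getHelper() {
        Helper local = helper;          // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = helper;         // re-check under the lock
                if (local == null) {
                    local = new Helper();
                    helper = local;     // publish the fully constructed object
                }
            }
        }
        return local;
    }
}
```

Only the first callers pay for the lock; once helper is set, every call is a single volatile read.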
    
  • Static class initialization

Static fields in a class aren’t initialized until the class is referenced for the first time, which could happen when a static field is referenced, an object is instantiated, a static method is called, etc. The details of how this works can be found here, but one key point to keep in mind is that the JVM synchronizes on this initialization, which is what allows the SingletonHolder pattern described above to work. Unfortunately, it also means that classes with time-consuming initialization can block your main thread. Sometimes, the best strategy is to trivially reference the class at start-up time to trigger static initialization.
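
The lazy part is easy to observe. In this sketch (all names hypothetical) the nested Holder class is not initialized when the outer class is loaded, only when getInstance() first touches it.

```java
// Hypothetical demo; the names and the counter are for illustration only.
final class LazySingleton {
    static int constructed = 0; // counts constructor calls

    private static final class Holder {
        // Runs once, under the JVM's class-initialization lock, on first use of Holder.
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    private LazySingleton() {
        constructed++;
    }

    static LazySingleton getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(constructed); // 0: the outer class is loaded, Holder is not
        getInstance();
        getInstance();
        System.out.println(constructed); // 1: Holder was initialized exactly once
    }
}
```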

  • HashMap

HashMaps are such a normal part of web development that it’s easy to forget that they aren’t used much in other domains. Video game development, for instance, doesn’t tend to use them that much (though of course that depends on the game). They’re absolutely essential data structures to know, however, so you should just get up close and personal with them. Know what a hash function is, the difference between a Map and a Set, and (because some interviewers are pedantic) the difference between HashMap and Hashtable (HashMap is unsynchronized – faster! – and permits null keys and values – a frequent source of bugs). Be able to explain the underlying details of how key/value pairs get stored (i.e., the key gets hashed to an int, which is mod’d to the size of the array and used as an index, then the key/value pair is stored in a linked list pointed to from the indexed cell in the array), and why you have to store the key along with the value (i.e., multiple keys may land in the same bucket, so you need to be able to tell their entries apart).
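
The storage scheme just described fits in a few lines. TinyChainedMap is a hypothetical teaching class, not java.util.HashMap: it has a fixed number of buckets and never resizes.

```java
import java.util.Objects;

// Hypothetical teaching class; fixed bucket count, no resizing, no removal.
final class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final K key;      // stored so colliding entries can be told apart
        V value;
        Node<K, V> next;  // next entry in the same bucket

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = (Node<K, V>[]) new Node[16];

    // Hash the key to an int, then reduce it to a valid array index.
    private int indexFor(Object key) {
        int hash = (key == null) ? 0 : key.hashCode();
        return Math.floorMod(hash, buckets.length);
    }

    // Returns the previous value for key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // prepend to the chain
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

The strings "Aa" and "BB" have the same hashCode(), so they share a bucket; comparing the stored key is what lets get() return the right value for each.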

Caveat: there are multiple ways of resolving collisions in hash maps. java.util.HashMap uses “chaining”, described above. “Open addressing” is an alternate method for resolving index collisions. You should know both.
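
For comparison, here is a minimal open-addressing sketch using linear probing, one of several probing strategies. TinyProbingMap is a hypothetical class that assumes non-null keys, a fixed capacity, and no removal (deleting from a probed table needs extra care that is out of scope here).

```java
import java.util.Objects;

// Hypothetical teaching class; linear probing, fixed capacity, no removal.
final class TinyProbingMap<K, V> {
    private final Object[] keys = new Object[16];
    private final Object[] values = new Object[16];
    private int size;

    // Returns the slot holding key, or the first empty slot on its probe path;
    // returns -1 if the table is full and the key is absent.
    private int slotFor(Object key) {
        int start = Math.floorMod(key.hashCode(), keys.length);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length; // try the next cell on a collision
            if (keys[i] == null || keys[i].equals(key)) {
                return i;
            }
        }
        return -1;
    }

    void put(K key, V value) {
        int i = slotFor(Objects.requireNonNull(key));
        if (i < 0) {
            throw new IllegalStateException("table full");
        }
        if (keys[i] == null) {
            size++;
        }
        keys[i] = key;
        values[i] = value;
    }

    @SuppressWarnings("unchecked")
    V get(Object key) {
        int i = slotFor(Objects.requireNonNull(key));
        return (i < 0 || keys[i] == null) ? null : (V) values[i];
    }

    int size() {
        return size;
    }
}
```

Instead of hanging a list off each cell, a colliding entry simply moves to the next free cell, so everything lives in the arrays themselves.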

Bonus points: it’s also useful to know the LinkedHashMap class (and how it works), since that can sometimes short-circuit an otherwise difficult problem.
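
One example of that short-circuit is the classic "implement an LRU cache" question. A sketch, assuming a simple size bound is all you need: LinkedHashMap's access-order constructor and its removeEldestEntry hook are real JDK API, while the LruCache class itself is my own illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class built on LinkedHashMap's access-order mode.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, not by insertion
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting a and b, reading a, then putting c evicts b, because the read moved a to the most-recently-used end of the list.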

Advanced topics

Garbage collection, concurrency, and Java 8 lambda expressions are key topics that you should understand at least at a basic level, but which deserve more space than I can devote to them in this post. I’ll try to do some basic primers (nothing fancy) soon…

If you liked this…

If you’re a Java programmer, then you should absolutely read Effective Java by Josh Bloch. Every language has its quirks, best practices, and idioms, and this book is hands down the best way I’ve found to move beyond the awkward novice period in which you know just enough to be dangerous.

Let me know if you’ve run into any other obscure details – somehow, it’s the fiddly bits that are the most interesting, and cause the most trouble.

Updated: changed “stop the world event” to “block your main thread”. Thanks ldan for the comment!
