Avoid too much sorting

“Java is slow” is a sentence I heard very often when I began studying computer science, and fortunately I never really believed it. But why the prejudice? Well, Java CAN be slow if it’s handled wrong. Often it’s just convenience, or missing knowledge of the underlying implementations, that makes code slow, so I’ll try to post once in a while whenever I come across such code in my hobby programming or at work.

So my first issue is about sorting and autoboxing. Last week we profiled some code that felt just sloooow. It turned out that we lost most of the time within a certain loop that was executed very often. The critical part of the code (stripped of everything else) looked like this:

ArrayList<Double> list = new ArrayList<Double>(20); // keep 20 smallest calculated values
while (condition) {
  double value = calculate(args);
  if (list.size() < 20 || value < list.get(19)) {
    list.add(value);
    Collections.sort(list); // re-sorts the whole list on every insert
  }
  }
  // strip elements if size is > 20
}

So what’s the issue here?

  1. condition holds true for a LOT of iterations (well, can’t change this)
  2. the list is small (just 20 elements) BUT it is re-sorted completely on every insert
  3. could autoboxing be an issue here?

Okay, what did we change?

We changed the ArrayList to a SortedDoubleArray (an implementation that I coded some time ago) that inserts each value directly at the correct position using Arrays#binarySearch() and System.arraycopy(). As I wasn’t quite sure whether or not autoboxing could be an issue here, I also created a copy of the class that operates on Double objects instead of double primitives.
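
The SortedDoubleArray class itself isn’t shown in this post, so what follows is only a minimal sketch of the idea, not the actual implementation. The class name BoundedSortedDoubles, its methods, and the fixed-capacity behaviour (the largest value is dropped once the array is full) are assumptions made for illustration:

```java
import java.util.Arrays;

/** Hypothetical sketch: keeps the smallest `capacity` values in ascending order. */
final class BoundedSortedDoubles {
    private final double[] values;
    private int size;

    BoundedSortedDoubles(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        values = new double[capacity];
    }

    /** Inserts value at its sorted position; returns false if it was rejected. */
    boolean add(double value) {
        // Full and not smaller than the current maximum: nothing to do.
        // Double.compare matches the ordering used by Arrays.binarySearch.
        if (size == values.length && Double.compare(value, values[size - 1]) >= 0) {
            return false;
        }
        int pos = Arrays.binarySearch(values, 0, size, value);
        if (pos < 0) {
            pos = -(pos + 1); // decode the insertion point
        }
        // Shift the tail one slot to the right; when full, the last element falls off.
        int last = Math.min(size, values.length - 1);
        System.arraycopy(values, pos, values, pos + 1, last - pos);
        values[pos] = value;
        if (size < values.length) {
            size++;
        }
        return true;
    }

    int size() {
        return size;
    }

    double get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return values[index];
    }

    /** Returns a copy of the stored values in ascending order. */
    double[] toArray() {
        return Arrays.copyOf(values, size);
    }
}
```

With something like this, the loop body shrinks to a single call such as kept.add(calculate(args)), and each insert costs one binary search plus one array shift instead of a full sort.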

The Test

In order to compare the 3 methods (Collections.sort(), and the two SortedDoubleArray variants using double and Double), I inserted 1,000,000 random double values into each structure and measured the times. The results are:

  • Collections.sort(): 2907 ms (=100%)
  • SortedDoubleArray (with Double-autoboxed values):  93 ms (~3%)
  • SortedDoubleArray (with double primitives):  94 ms (~3%)
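
The test code isn’t part of this post, but a comparison along these lines can be sketched with plain JDK classes. Everything here is an assumption for illustration: the class name SmallestKBenchmark, its method names, the seed, and the use of Collections.binarySearch() on an ArrayList<Double> as a stand-in for the Double-based variant. The timing is naive (no JIT warm-up, a single run), so treat the numbers it prints as a rough indication only:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical harness: keeps the 20 smallest values using two different strategies. */
final class SmallestKBenchmark {
    static final int KEEP = 20;

    /** The loop from the post: append, sort everything, trim. */
    static List<Double> withFullSort(double[] input) {
        List<Double> list = new ArrayList<>(KEEP + 1);
        for (double value : input) {
            if (list.size() < KEEP || value < list.get(KEEP - 1)) {
                list.add(value);
                Collections.sort(list);
                if (list.size() > KEEP) {
                    list.remove(list.size() - 1); // drop the largest
                }
            }
        }
        return list;
    }

    /** Same result, but the insert position is found by binary search. */
    static List<Double> withBinarySearch(double[] input) {
        List<Double> list = new ArrayList<>(KEEP + 1);
        for (double value : input) {
            if (list.size() < KEEP || value < list.get(KEEP - 1)) {
                int pos = Collections.binarySearch(list, value);
                if (pos < 0) {
                    pos = -(pos + 1); // decode the insertion point
                }
                list.add(pos, value);
                if (list.size() > KEEP) {
                    list.remove(list.size() - 1); // drop the largest
                }
            }
        }
        return list;
    }

    /** Reproducible random input. */
    static double[] randomInput(int n, long seed) {
        Random random = new Random(seed);
        double[] input = new double[n];
        for (int i = 0; i < n; i++) {
            input[i] = random.nextDouble();
        }
        return input;
    }

    /** Runs both variants once on the same input and prints the elapsed times. */
    static void run(int n) {
        double[] input = randomInput(n, 42L);
        long t0 = System.nanoTime();
        List<Double> sorted = withFullSort(input);
        long t1 = System.nanoTime();
        List<Double> searched = withBinarySearch(input);
        long t2 = System.nanoTime();
        System.out.printf("full sort:     %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("binary search: %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("same result:   " + sorted.equals(searched));
    }
}
```

Calling SmallestKBenchmark.run(1_000_000) from a main method executes both variants. Note that this sketch keeps the value < list.get(19) filter from the original loop, so only a small fraction of the inputs actually trigger an insert; the figures above may have been measured differently.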

Conclusion

  • Using Collections.sort() is convenient and in most cases absolutely okay! But if you use it in critical locations within the code (for example in loops that are executed very often), you might want to check if there isn’t a better solution.
  • Autoboxing does not hurt in our case: the Double and double variants took practically the same time (93 ms vs. 94 ms).

But never forget: Profile first, then tune. Otherwise you might tune code that has almost no impact on the overall execution time (for example, if the loop above is executed just 10 times). Also, change only one thing at a time and measure between each step so that you can identify the changes with the most impact.
If you have no profiler at hand, you might want to try the NetBeans profiler.
