Java Article 15 - Experiments and compareTo()

In this article I am going to talk very briefly about experimental design while covering the workings of Java's compareTo() method.

I was in a discussion the other day about Java's compareTo() method. compareTo() is useful for searching and sorting collections (or containers such as arrays) of strings, and for comparing one string with another.

In the discussion there was a misunderstanding about compareTo() that was causing some bugs. A Java textbook was pulled out and I was shown some incorrect examples of how compareTo() worked. Sadly, this textbook had the return values reversed (in two different chapters even). To prove my point I had to design a simple experiment for people to see how compareTo() really worked.

Showing someone a simple program that is easy to read and understand is a great way of demonstrating how something works.

As a programmer you learn a lot from experimenting. The documentation for a language cannot always explain things sufficiently, and often you just have to play around with a method, an algorithm, or a design pattern to gain an understanding of how it works.

compareTo() compares two strings character by character, based on the Unicode value of each of the characters in the strings. If, at the first position where the strings differ, the string object (firstString as shown below) has a lower Unicode value than the argument (secondString), then a negative value is returned.

So given the following:

String firstString = "a";
String secondString = "b";
int result = firstString.compareTo(secondString);

result = -1

"a" has a lower Unicode value (97 in decimal) than "b" (98). One way to remember this is to think of "a" as being less than "b" (a < b), so a value of < 0 is returned. Since "a" has a lower value than "b", "a" will appear before "b" in a lexicographically sorted container. The Java API refers to this as "a" preceding "b".

If the string object (firstString) has a higher Unicode value than the argument (secondString) then a positive value is returned as shown below.

String firstString = "b";
String secondString = "a";
int result = firstString.compareTo(secondString);

result = 1

The way I remember this is to think of "b" as being greater than "a" (b > a) so a value of > 0 is returned.
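The -1 and 1 above are not fixed "less than" and "greater than" flags. For String, compareTo() returns the difference between the two characters at the first position where the strings differ. The following is a minimal sketch to show this; the class name CharDifferenceSketch and the compare() helper are made up for illustration and are not part of the original experiment.

```java
public class CharDifferenceSketch {

    // Hypothetical helper: returns whatever compareTo() reports for the two strings.
    static int compare(String first, String second) {
        return first.compareTo(second);
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so the difference is -1.
        System.out.println(compare("a", "b"));
        // 'a' is 97 and 'd' is 100, so the difference is -3, not -1.
        System.out.println(compare("a", "d"));
        // 'd' minus 'a' gives 3.
        System.out.println(compare("d", "a"));
        // "ap" matches in both strings; 'p' (112) minus 'r' (114) is -2.
        System.out.println(compare("apple", "apricot"));
    }
}
```

This is why it is safer to think of the result as negative or positive rather than as exactly -1 or 1.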

If the two strings are the same, then result = 0.
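In practice the three cases are usually told apart by the sign of the result rather than by its exact value, since the general Comparable contract only promises a negative value, zero, or a positive value. A small sketch of that pattern follows; the class name SignCheckSketch and the describe() method are hypothetical names chosen for this example.

```java
public class SignCheckSketch {

    // Hypothetical helper: turns the sign of compareTo() into a word.
    static String describe(String first, String second) {
        int result = first.compareTo(second);
        if (result < 0) {
            return "precedes";   // first sorts before second
        } else if (result > 0) {
            return "follows";    // first sorts after second
        } else {
            return "equals";     // the strings are the same
        }
    }

    public static void main(String[] args) {
        System.out.println("a " + describe("a", "b") + " b");
        System.out.println("b " + describe("b", "a") + " a");
        System.out.println("a " + describe("a", "a") + " a");
    }
}
```

Testing with < 0 and > 0 instead of == -1 and == 1 avoids exactly the kind of bug that came up in the discussion.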

If one string runs out before any characters differ, the string lengths are compared, with shorter strings preceding longer strings. The result is the difference in lengths:

String firstString = "a";
String secondString = "aaa";
int result = firstString.compareTo(secondString);

result = -2
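Length only matters when one string is a prefix of the other. If a character differs first, that difference decides the result and the lengths are never looked at. Here is a short sketch of both situations; the class name LengthRuleSketch and its compare() helper are assumptions made for this example.

```java
public class LengthRuleSketch {

    // Hypothetical helper: returns whatever compareTo() reports for the two strings.
    static int compare(String first, String second) {
        return first.compareTo(second);
    }

    public static void main(String[] args) {
        // "a" is a prefix of "aaa": the result is the length difference, 1 - 3.
        System.out.println(compare("a", "aaa"));
        // "ab" is a prefix of "abc", and "abc" is longer by one character.
        System.out.println(compare("abc", "ab"));
        // The first characters differ, so 'b' (98) minus 'a' (97) decides
        // the result even though "b" is the shorter string.
        System.out.println(compare("b", "aaa"));
    }
}
```

So "b" still sorts after "aaa", even though it is shorter.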

One counter-intuitive pitfall is that you would think a < B, but this is not the case: in the Unicode table the upper-case letters A to Z (65 to 90) are all listed together and come before the lower-case letters a to z (97 to 122). This means upper-case letters have lower Unicode values than lower-case letters, and will precede lower-case letters when sorting with compareTo(). One way around this is to use compareToIgnoreCase(), which ignores case.
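The effect is easiest to see when sorting a mixed-case list. The sketch below sorts the same words twice: once with the default compareTo() ordering, and once with String.CASE_INSENSITIVE_ORDER, the standard comparator that orders strings the way compareToIgnoreCase() does. The class name CaseOrderSketch, the two helper methods, and the sample words are made up for illustration.

```java
import java.util.Arrays;

public class CaseOrderSketch {

    // Sorts a copy of the words using compareTo(), the natural ordering of String.
    static String sortCaseSensitive(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return String.join(" ", copy);
    }

    // Sorts a copy of the words while ignoring case.
    static String sortIgnoringCase(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return String.join(" ", copy);
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'B' is 66, so "a" follows "B" with a result of 31.
        System.out.println("a".compareTo("B"));
        // Upper-case "Cherry" jumps to the front: Cherry apple banana
        System.out.println(sortCaseSensitive("banana", "Cherry", "apple"));
        // Ignoring case gives the order most people expect: apple banana Cherry
        System.out.println(sortIgnoringCase("banana", "Cherry", "apple"));
    }
}
```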

Below is the short experiment I wrote. It was designed to be easy to read and follow, with output that explains what is going on during execution. If I were using this in a real program I would have written a second method to do the comparing and printing, which would save some typing and make the code more modular and reusable. You could also put all of the string values in arrays, but that would add complexity and cost the people reviewing the code extra time, which would be counter-productive here.

public class CompareToExample
{

	public static void main(String[] args)
	{

		// "a" precedes "b": the result is negative (-1)
		String firstString = "a";
		String secondString = "b";
		int compareReturn = firstString.compareTo(secondString);
		System.out.println(firstString + ".compareTo(" + secondString + ") = " + compareReturn);

		// "b" follows "a": the result is positive (1)
		firstString = "b";
		secondString = "a";
		compareReturn = firstString.compareTo(secondString);
		System.out.println(firstString + ".compareTo(" + secondString + ") = " + compareReturn);

		// Identical strings: the result is 0
		firstString = "a";
		secondString = "a";
		compareReturn = firstString.compareTo(secondString);
		System.out.println(firstString + ".compareTo(" + secondString + ") = " + compareReturn);

		// "a" is a prefix of "aaa": the result is the length difference (-2)
		firstString = "a";
		secondString = "aaa";
		compareReturn = firstString.compareTo(secondString);
		System.out.println(firstString + ".compareTo(" + secondString + ") = " + compareReturn);

		// Be careful: "B" has a lower Unicode value than "a", so the result is positive (31)
		firstString = "a";
		secondString = "B";
		compareReturn = firstString.compareTo(secondString);
		System.out.println(firstString + ".compareTo(" + secondString + ") = " + compareReturn);

		// You can also use compareToIgnoreCase(), which gives a negative result here
		firstString = "a";
		secondString = "B";
		compareReturn = firstString.compareToIgnoreCase(secondString);
		System.out.println(firstString + ".compareToIgnoreCase(" + secondString + ") = " + compareReturn);
	}
}
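
For comparison, the version with a second method to do the comparing and printing might look something like the sketch below. This is only one possible refactoring, not the code used in the discussion; the class name CompareToHelperSketch and the describe() method are hypothetical.

```java
public class CompareToHelperSketch {

    // Hypothetical helper: builds one line of output for a pair of strings.
    static String describe(String first, String second) {
        return first + ".compareTo(" + second + ") = " + first.compareTo(second);
    }

    public static void main(String[] args) {
        // Each inner array holds one pair of strings to compare.
        String[][] pairs = {
            {"a", "b"},
            {"b", "a"},
            {"a", "a"},
            {"a", "aaa"},
            {"a", "B"}
        };

        for (String[] pair : pairs) {
            System.out.println(describe(pair[0], pair[1]));
        }
    }
}
```

It is shorter, but a reader now has to follow the loop and the helper method to see which strings are being compared, which is the trade-off described above.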