Friday, March 28, 2014

Base64 in Java 8 - It's Not Too Late To Join In The Fun

Finally, Java 8 is out. Finally, there's a standard way to do Base64 encoding. For too long we have relied on Apache Commons Codec (which is great anyway). Memory-conscious coders have resorted to sun.misc.BASE64Encoder and sun.misc.BASE64Decoder just to avoid adding extra JAR files to their programs, provided they were sure of running only on the Sun/Oracle JDK. These classes are still lurking around in Java 8.

To try things out, I've put together a JUnit test that shows how to encode with the following APIs:

  • Commons Codec: org.apache.commons.codec.binary.Base64
  • Java 8's new java.util.Base64
  • The sort-of evergreen internal code of Sun/Oracle's JDK: sun.misc.BASE64Encoder

package org.gizmo.util;

import java.util.Random;

import org.apache.commons.codec.binary.Base64;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import static org.junit.Assert.assertArrayEquals;

import sun.misc.BASE64Encoder;

public class Base64Tests {

 private static byte[] randomBinaryData = new byte[5000000];
 private static long durationCommons = 0;
 private static long durationJava8 = 0;
 private static long durationSun = 0;
 
 private static byte[] encodedCommons;
 private static byte[] encodedJava8;
 private static String encodedSun;
 
 @BeforeClass
 public static void setUp() throws Exception {
  
  //We want to test the APIs against the same data
  new Random().nextBytes(randomBinaryData);  
 }

 @Test
 public void testSunBase64Encode() throws Exception {
  
  BASE64Encoder encoder = new BASE64Encoder();

  long before = System.currentTimeMillis();

  encodedSun = encoder.encode(randomBinaryData);
  
  long after = System.currentTimeMillis();
  durationSun = after-before;
  System.out.println("Sun: " + durationSun);
 } 
 
 @Test
 public void testJava8Base64Encode() throws Exception {
  
  long before = System.currentTimeMillis();

  java.util.Base64.Encoder encoder = java.util.Base64.getEncoder();
  encodedJava8 = encoder.encode(randomBinaryData);
  
  long after = System.currentTimeMillis();
  durationJava8 = after-before;
  System.out.println("Java8: " + durationJava8);
 }
 
 @Test
 public void testCommonsBase64Encode() throws Exception {
  
  long before = System.currentTimeMillis();
  
  encodedCommons = Base64.encodeBase64(randomBinaryData);
  
  long after = System.currentTimeMillis();
  durationCommons = after-before;
  System.out.println("Commons: " + durationCommons);
 }

 @AfterClass
 public static void report() throws Exception {

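  //Sanity check: Commons and Java 8 must produce identical bytes.
  //encodedSun is left out: BASE64Encoder.encode() returns a String broken into lines.
  assertArrayEquals(encodedCommons, encodedJava8);
  //How many times faster Java 8 was than Commons
  System.out.println(durationCommons*1.0/durationJava8);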
 }
}



What about the performance of these 3 ways? Base64 encoding is a small enough routine that there are fewer ways to screw it up, but you never know what lies beneath the surface. From rough timing (in the JUnit tests), the 3 methods can be arranged like this, from fastest to slowest: Java 8, Commons, Sun. A sample of the timing in milliseconds (encoding a byte array of size 5,000,000):

Sun: 521
Commons: 160
Java8: 37

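Java 8's method ran about 4x faster than Commons, and 14x faster than Sun. But this sample is simplistic: each encoder is timed exactly once with System.currentTimeMillis(), with no JIT warm-up and no repeated runs, so whichever test happens to run first also pays for class loading. Do benchmark for yourselves and come to your own conclusions.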
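
If you want to repeat the timing a little more carefully, here is a minimal sketch, assuming you only care about java.util.Base64 and are happy with a best-of-N figure. The class name Base64Timing and the method bestEncodeNanos are made up for this example and are not part of the test class above. It warms up the encoder first and measures with System.nanoTime(). A proper harness such as JMH would be more rigorous still.

```java
import java.util.Base64;
import java.util.Random;

// Hypothetical helper for illustration; not part of the original JUnit test.
final class Base64Timing {

    // Results are accumulated here so the encode calls have an observable effect.
    static long sink;

    // Returns the lowest encode time, in nanoseconds, over the measured runs.
    static long bestEncodeNanos(byte[] data, int warmupRuns, int measuredRuns) {
        if (measuredRuns < 1) {
            throw new IllegalArgumentException("measuredRuns must be at least 1");
        }
        Base64.Encoder encoder = Base64.getEncoder();

        // Warm-up: give the JIT a chance to compile the hot path before timing.
        for (int i = 0; i < warmupRuns; i++) {
            sink += encoder.encode(data).length;
        }

        long best = Long.MAX_VALUE;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink += encoder.encode(data).length;
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] data = new byte[5000000];
        new Random(42).nextBytes(data); // fixed seed so every run encodes the same bytes
        long nanos = bestEncodeNanos(data, 5, 10);
        System.out.println("Java8 best of 10: " + (nanos / 1000000.0) + " ms");
    }
}
```

The same loop could be pointed at Commons Codec or the Sun encoder by swapping the encode call, which keeps the comparison on equal footing.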

So, which API to use? As any expert will tell you... it depends. If you have enough power to dictate that your code should only run on Java 8 and above, then by all means use the new java.util.Base64. If you need to support multiple JDK versions and vendors, you can stick with Commons Codec or some other 3rd party API. Or wait until the older Javas are out of circulation, and then rewrite your precious codebase. Or move on to another programming language.
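
The JUnit test only exercises encoding, so here is a small sketch of a full round trip with java.util.Base64, plus the URL-safe variant. The class Base64RoundTrip and its method names are invented for this example. The Base64 calls themselves (getEncoder, getDecoder, getUrlEncoder, withoutPadding, encodeToString, decode) are the standard Java 8 API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical wrapper class for illustration only.
final class Base64RoundTrip {

    // Standard alphabet (A-Z, a-z, 0-9, +, /) with '=' padding.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Reverses encode(); throws IllegalArgumentException on invalid input.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), padding dropped.
    static String encodeUrlSafe(byte[] raw) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(raw);
        System.out.println(encoded);                               // aGVsbG8=
        System.out.println(Arrays.equals(raw, decode(encoded)));   // true
        System.out.println(encodeUrlSafe(new byte[] {(byte) 0xfb, (byte) 0xff})); // -_8
    }
}
```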

Note: I did not even mention sun.misc.BASE64Encoder as an option. Avoid it when possible. Perhaps one day this class will be removed in another (allos) version of the JDK... and it isn't present in other (heteros) JDKs from other vendors.
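
One thing to watch if you do move off sun.misc.BASE64Encoder is line wrapping. The Sun encoder breaks its output into lines, while Base64.getEncoder() emits one continuous line. The sketch below, using a made-up class Base64LineWrapping, shows the difference between the basic encoder and Base64.getMimeEncoder(), which wraps at 76 characters with "\r\n". Whether the MIME encoder matches your old Sun output exactly is an assumption to verify, since the line separator may differ.

```java
import java.util.Base64;

// Hypothetical class for illustration only.
final class Base64LineWrapping {

    // Basic encoder: one continuous line, no separators.
    static String encodeSingleLine(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // MIME encoder: lines of at most 76 characters, separated by "\r\n".
    static String encodeWrapped(byte[] raw) {
        return Base64.getMimeEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[100]; // 100 zero bytes encode to 136 characters
        String wrapped = encodeWrapped(raw);
        String single = encodeSingleLine(raw);

        System.out.println(wrapped.contains("\r\n"));                    // true
        System.out.println(wrapped.replace("\r\n", "").equals(single));  // true

        // The MIME decoder skips line separators, so it reads the wrapped form back.
        System.out.println(Base64.getMimeDecoder().decode(wrapped).length); // 100
    }
}
```

If existing consumers expect wrapped output, the MIME encoder is the closer fit. If they expect a single line, use the basic encoder and check anything that used to strip line breaks.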

References: