Java vs. C

From Simson Garfinkel
Revision as of 13:44, 28 May 2009 by Simson (talk | contribs)

Several people have told me recently that Java runs as fast as C. After repeating this claim a few times myself, I decided to test it on a computer forensics workload.

The test

I constructed a test file of 4238912226 bytes. The test reads the file in 4 KB (4096-byte) blocks and computes the SHA1 hash of each block.

Here is the C program I used:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/sha.h>

int main(int argc,char **argv)
{
    
    FILE *f = 0;
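    if(argc!=2){
	/* %s needs a matching argument; exit so argv[1] is never read when missing */
	fprintf(stderr,"usage: %s <file> - compute block hashes (but don't print them)\n",argv[0]);
	exit(1);
    }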
    f = fopen(argv[1],"r");
    if(!f) {
	perror(argv[1]);
	exit(1);
    }
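    while(1){
	char buf[4096];
	unsigned char md[SHA_DIGEST_LENGTH];
	size_t count = fread(buf,1,sizeof(buf),f);
	SHA_CTX c;
	if(count==0) break;	/* EOF or read error: don't hash an empty block */
	SHA1_Init(&c);		/* SHA1_* are OpenSSL's SHA-1 routines */
	SHA1_Update(&c,buf,count);
	SHA1_Final(md,&c);
    }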
    fclose(f);
}

I ran the test 3 times on my Mac Pro (2x 2.66 GHz Dual-Core Intel Xeons, 12 GB of 667 MHz DDR2 FB-DIMM memory, 1 TB hard drive):

12:59 PM m:~/nps/speedtest$ time ./ctest /realistic.aff 

real	0m53.443s
user	0m25.459s
sys	0m6.113s
01:00 PM m:~/nps/speedtest$ time ./ctest /realistic.aff 

real	0m31.137s
user	0m25.327s
sys	0m5.650s
01:01 PM m:~/nps/speedtest$ time ./ctest /realistic.aff 

real	0m31.694s
user	0m25.392s
sys	0m5.920s
01:02 PM m:~/nps/speedtest$

On the first run the file was being read off the disk; on the second and third runs it was already in memory.

Interestingly, the entire file can be read in around 8 seconds on this hardware:

time dd if=/realistic.aff of=/dev/null bs=4096
1034890+1 records in
1034890+1 records out
4238912226 bytes transferred in 7.786416 secs (544398372 bytes/sec)

real	0m7.979s
user	0m1.455s
sys	0m6.335s
01:22 PM m:~/nps/speedtest$ 
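
To put these timings on a common scale, here is a small sketch that converts them to throughput and counts the blocks being hashed. The class and method names are hypothetical; the file size and the wall-clock times are copied from the runs above, and 1 MB is taken to mean 1,000,000 bytes.

```java
public class Throughput {
    // Number of reads needed to cover fileBytes with blockSize-byte blocks
    // (the last block may be partial).
    static long blockCount(long fileBytes, int blockSize) {
        return (fileBytes + blockSize - 1) / blockSize;
    }

    // Throughput in MB/s, with 1 MB = 1,000,000 bytes.
    static double megabytesPerSecond(long fileBytes, double seconds) {
        return fileBytes / seconds / 1e6;
    }

    public static void main(String[] args) {
        long fileBytes = 4238912226L; // test file size from the text

        // 1034890 full blocks plus one partial block, matching dd's "1034890+1".
        System.out.println("blocks hashed: " + blockCount(fileBytes, 4096));

        // "real" times from the dd run and the second (cached) C run.
        System.out.printf("dd, no hashing: %.1f MB/s%n", megabytesPerSecond(fileBytes, 7.979));
        System.out.printf("C with SHA1:    %.1f MB/s%n", megabytesPerSecond(fileBytes, 31.137));
    }
}
```

Since the cached C runs take about four times as long as dd alone, most of the time in the C test goes to hashing rather than to reading the file.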

So how fast is Java? Here is my Java program:

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.io.*;
	
public class jtest {
    public static void main(String[] args){
	long t0 = System.currentTimeMillis();
	try {
	    System.out.println("Start");
	    FileInputStream fis = new FileInputStream(new File(args[0]));
	    MessageDigest md = MessageDigest.getInstance("SHA");
	    while(true){
		md.reset();
		byte[] buf = new byte[4096];
		int count = fis.read(buf);
		if(count==-1) break;
		md.update(buf,0,count);
		byte[] f = md.digest();
	    }
	    System.out.println("Done");
	}
	catch (IOException e){
	    System.out.println(e);
	}
	catch (NoSuchAlgorithmException e){
	    System.out.println(e);
	}
	long t1 = System.currentTimeMillis();
	System.out.printf("Miliseconds to execute: %d\n",t1-t0);
    }
}

Notice that I have the program report how long it takes to run the benchmark, so we can factor out the cost of JVM startup.
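
A side note on the timing itself: System.currentTimeMillis() reports wall-clock time, which can shift if the system clock is adjusted during a run, while System.nanoTime() is intended for measuring elapsed time. Here is a minimal sketch of the same in-program timing pattern using nanoTime. It is not the code that produced the numbers on this page, and the ElapsedTimer class, the timeMillis helper and the placeholder workload are all hypothetical.

```java
public class ElapsedTimer {
    // Runs the task once and returns the elapsed time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Placeholder workload; the real benchmark would read and hash the file here.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long ms = timeMillis(() -> System.out.println("checksum: " + workload()));
        System.out.printf("Milliseconds to execute: %d%n", ms);
    }
}
```

For a run lasting more than a minute the two clocks should agree closely, so this would not change the results here; it matters more for short measurements.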

01:38 PM m:~/nps/speedtest$ time java jtest /realistic.aff 
Start
Done
Miliseconds to execute: 98012

real	1m38.611s
user	1m26.193s
sys	0m7.176s
01:40 PM m:~/nps/speedtest$ time java jtest /realistic.aff 
Start
Done
Miliseconds to execute: 92977

real	1m34.298s
user	1m26.149s
sys	0m6.701s
01:42 PM m:~/nps/speedtest$ 

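So those are pretty disappointing numbers. Compared with the cached C runs (about 31 seconds), Java takes 94 to 99 seconds of wall-clock time, so it seems to be running roughly 3x slower.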
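
As a rough check on both the slowdown and the JVM startup cost, the arithmetic can be sketched in a few lines of Java. The class and method names are hypothetical; the figures are copied from the timing output above, with the second (cached) C run as the baseline.

```java
public class Slowdown {
    // Ratio of Java wall-clock time to C wall-clock time.
    static double ratio(double javaSeconds, double cSeconds) {
        return javaSeconds / cSeconds;
    }

    // Time spent outside the timed region (JVM startup and shutdown), in ms:
    // the "real" time from the shell minus the time the program reported itself.
    static long outsideTimedRegionMillis(long realMillis, long reportedMillis) {
        return realMillis - reportedMillis;
    }

    public static void main(String[] args) {
        // Run 1: real 1m38.611s, reported 98012 ms.
        System.out.printf("run 1 JVM overhead: %d ms%n",
                outsideTimedRegionMillis(98611, 98012));
        // Run 2: real 1m34.298s, reported 92977 ms.
        System.out.printf("run 2 JVM overhead: %d ms%n",
                outsideTimedRegionMillis(94298, 92977));
        // Java run 2 (94.298 s) against C run 2 (31.137 s).
        System.out.printf("slowdown: %.2fx%n", ratio(94.298, 31.137));
    }
}
```

The time outside the timed region comes to roughly a second per run, so JVM startup does not account for the gap.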

Second Java Try

It's possible that most of the Java overhead was in creating the new hash object each time through. So I tried this version, which makes one hash object and then clones it: