Measurements for Mobile Agent Performance Modeling Paper

2/11/00 Plot of Beta for tests with 3 second interval between agent sends using agent Meetings.
2/11/00 Plot of bandwidths for tests with 3 second interval between agent sends using agent Meetings.
Same with another data point.

2/11/00 Plot of Beta for tests with 3 second interval between agent sends using agent Meetings.
Added Bob's improvements to platSigioImp and javaImp.
2/11/00 Plot of bandwidths for tests with 3 second interval between agent sends using agent Meetings.
Added Bob's improvements to platSigioImp and javaImp.

2/9/00 Plot of Beta for tests with 3 second interval between agent sends using agent Meetings.
2/9/00 Plot of bandwidths for tests with 3 second interval between agent sends using agent Meetings.
(segmentation violations and meeting failures occur when the number of senders is above 10)

2/2/00 Plot of bandwidths for tests with 10 second interval between agent sends using agent Meetings.
(The code wasn't modified to handle more than 10 agents, so it maxes out at 10.)

2/1/00 Plot of bandwidths for tests with 10 second interval between agent sends using agent Events.

Background

Beta is calculated using 2 Mb/s (2,097,152 bits/s) as the raw bandwidth. Each agent sends about 50K bytes of documents every T seconds, for a total of 500 copies of the 50K document block, so each test run takes about 500 x T seconds to complete. The data bandwidth is the total size of the documents transmitted divided by the total time it took to transmit them. The total bandwidth is the SNMP octet count from the switch, which all messages to the client have to pass through, divided by the same total time.
A description of the hardware might be useful:
The client agent runs on one of the Gateway Solo 2300s (200MHz Pentium CPU w/MMX, 48MB RAM), which is wirelessly linked to one of the Tecras (133MHz Pentium CPU, 16MB RAM). The Tecra acts as a gateway (using gated) between the 10Mbps wired and 2Mbps wireless networks. The 10Mbps line from the Tecra goes through a 10Mbps hub that connects to a port on a switch, and that switch has a 100Mbps line to the switch in the cluster, which the cluster PCs are plugged into (also via 100Mbps interfaces).
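The bandwidth calculation described above can be sketched roughly as follows. This is only an illustration: the function name, the input values, and the choice to report Beta for both the data and the total bandwidth are assumptions, since the notes do not say which of the two figures Beta is taken from.

```python
RAW_BANDWIDTH_BPS = 2 * 1024 * 1024  # 2,097,152 bits/s, the raw figure used for Beta


def bandwidths(doc_bytes_total, snmp_octets, elapsed_s, raw_bps=RAW_BANDWIDTH_BPS):
    """Return (data_bps, total_bps, beta_data, beta_total).

    doc_bytes_total: total size of all documents transmitted, in bytes
    snmp_octets:     octet count reported by the switch over the same run
    elapsed_s:       total time of the run, in seconds
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    data_bps = doc_bytes_total * 8 / elapsed_s
    total_bps = snmp_octets * 8 / elapsed_s
    return data_bps, total_bps, data_bps / raw_bps, total_bps / raw_bps


# Illustrative inputs only (NOT measurements from these tests):
# 4 agents, each sending 500 blocks of 50,000 bytes with T = 3 seconds.
doc_bytes = 4 * 500 * 50_000
octets = 110_000_000      # hypothetical SNMP octet-counter difference
elapsed = 500 * 3.0       # roughly 500 x T seconds

data_bps, total_bps, beta_data, beta_total = bandwidths(doc_bytes, octets, elapsed)
print(f"data:  {data_bps:,.0f} bits/s (fraction of raw: {beta_data:.3f})")
print(f"total: {total_bps:,.0f} bits/s (fraction of raw: {beta_total:.3f})")
```

The total bandwidth is at least the data bandwidth, since the switch counts every octet sent to the client, not just the document payload.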

Related

Here's a link to an article that describes the results from a performance comparison between a Java program (using a JIT compiler) and C++ code.


Broadcast Beta

Simple sender and receiver programs were written to broadcast 4999 data blocks of 50,000 bytes each across the 2 Mb/sec wireless link. The transfer took 1135 seconds, which results in the following numbers:
Broadcast Beta = (1,761,762 bits/sec) / (2,097,152 bits/sec) = 0.840
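The arithmetic behind the Broadcast Beta figure can be checked with a few lines. The variable names are just for illustration; the numbers are the ones given above.

```python
blocks = 4999            # data blocks broadcast
block_bytes = 50_000     # bytes per block
elapsed_s = 1135         # seconds taken for the whole transfer
raw_bps = 2_097_152      # raw bandwidth of the wireless link, bits/sec

achieved_bps = blocks * block_bytes * 8 / elapsed_s
broadcast_beta = achieved_bps / raw_bps

print(f"{achieved_bps:,.0f} bits/sec, Broadcast Beta = {broadcast_beta:.3f}")
# prints: 1,761,762 bits/sec, Broadcast Beta = 0.840
```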


Comparison of C and Java performance

A C program was written which reads in an 8 bit BMP image file, processes the image to detect edges, then writes the resulting image to another file. This program was then ported to Java. Shown below are an example image before processing and the images that result from the C and Java code after processing. (I converted the images to GIF format for better browser compatibility.)

Original image file
Edge processed image file produced by C code
Edge processed image file produced by Java code

Here are copies of the C and Java code.

C edge detection program
Java edge detection program
ancillary Java class
ancillary Java class
ancillary Java class
ancillary Java class
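
The source files listed above are not reproduced in these notes, and the notes do not say which edge operator they use. Purely to illustrate the kind of per-pixel work being timed, here is a minimal sketch that assumes a Sobel operator on an 8 bit grayscale image; the actual C and Java programs may do something different.

```python
def sobel_edges(img):
    """Apply a Sobel operator to img, a list of rows of 0..255 integers.

    Returns an image of the same size. Border pixels are left at 0 and the
    gradient magnitude |gx| + |gy| is clamped to 255.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


# Tiny test image: dark left half, bright right half, so there is one vertical edge.
image = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_edges(image)
for row in edges:
    print(row)
```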

Because of some features of Java (it has no unsigned integers, and its byte and word order are the opposite of Unix/C's), I initially did a byte and word swap on the BMP image before running it through the Java code, which also has to do some swapping itself. This externally performed swapping isn't taken into account in the performance numbers below. I think this is valid if you assume that the images would already be in the format appropriate for the language that is processing them, as seems likely in a real application. The swapping inside the Java code results in four extra assignments between variables inside two loops that process the whole bitmap. Removing these might improve the Java processing time and make the comparison more valid. Jon H. has suggested a good way to do swapping in Java, so I'll try implementing that.
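
For readers unfamiliar with the two issues mentioned above, here is a small sketch of what the swapping and the unsigned handling amount to. It is not the code used in these tests and not necessarily the method Jon H. suggested; the function names are made up for illustration. The masking with 0xFF mirrors the usual `b & 0xFF` idiom for treating a signed Java byte as an unsigned value.

```python
def to_unsigned_byte(b):
    """Map a signed byte value (-128..127) to 0..255, like `b & 0xFF` in Java."""
    return b & 0xFF


def swap16(value):
    """Swap the two bytes of a 16-bit unsigned value."""
    return ((value & 0x00FF) << 8) | ((value & 0xFF00) >> 8)


def swap32(value):
    """Reverse the four bytes of a 32-bit unsigned value (byte swap plus word swap)."""
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))


print(to_unsigned_byte(-1))        # a pixel stored as signed -1 is really 255
print(hex(swap16(0x1234)))         # 0x3412
print(hex(swap32(0x12345678)))     # 0x78563412
```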

Running these two programs on oddjob (233MHz Pentium II, 190MB RAM) using Java1.0.2 (agentjava) for the Java program resulted in the following execution times, averaged over 100 runs:

C Program:

READ IMAGE: 6 msec
PROCESSING: 155 msec
WRITE IMAGE: 8 msec

Java1.0.2 Program:

READ IMAGE: 29 msec
PROCESSING: 6034 msec
WRITE IMAGE: 23 msec

Running the same programs on bald (200MHz Pentium, 64MB RAM) using Java2.0 (an early Blackdown Java2 port: jdk1.2-PreRelease V2) for the Java program resulted in the following execution times:

C Program:

READ IMAGE: 14 msec
PROCESSING: 247 msec
WRITE IMAGE: 15 msec

Java2.0 Program:

READ IMAGE: 703 msec
PROCESSING: 415 msec
WRITE IMAGE: 80 msec

Thus for Java1, C has about a 39 times speed advantage when executing the edge detection algorithm (6034 msec vs. 155 msec processing time).
For Java2, C has about a 1.7 times speed advantage when executing the algorithm (415 msec vs. 247 msec).


C/Java performance comparison results with rewritten Java code

The Java code was rewritten to eliminate the need to do byte and word swapping as a pre- and post-processing step. This gets rid of four assignment statements inside two of the pixel processing loops, but adds byte and word swapping to the code that reads in the header information. The altered Java code is here: Java edge detection program
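
To show why only the header needs swapping, here is a rough sketch of reading the multi-byte BMP header fields, which are stored little-endian in the file. It uses the standard 14-byte file header followed by the start of a 40-byte info header. The 640x480 image and the helper name are hypothetical, and this is not the rewritten Java code itself.

```python
import struct

# "<" selects little-endian, which is how BMP stores its header fields.
FILE_HEADER = "<2sIHHI"        # magic, file size, reserved, reserved, pixel-data offset
INFO_HEADER_START = "<IiiHH"   # info header size, width, height, planes, bits per pixel


def parse_bmp_header(data):
    """Extract a few fields from the first 30 bytes of a BMP file."""
    magic, file_size, _, _, offset = struct.unpack_from(FILE_HEADER, data, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    _, width, height, _, bpp = struct.unpack_from(INFO_HEADER_START, data, 14)
    return {"file_size": file_size, "offset": offset,
            "width": width, "height": height, "bits_per_pixel": bpp}


# Synthetic header for a hypothetical 640x480, 8 bit image:
# 14-byte file header + 40-byte info header + 256-entry palette = 1078 bytes.
header = (struct.pack(FILE_HEADER, b"BM", 1078 + 640 * 480, 0, 0, 1078)
          + struct.pack(INFO_HEADER_START, 40, 640, 480, 1, 8))
info = parse_bmp_header(header)
print(info)

# Reading the same four width bytes in big-endian order gives a wrong value.
wrong_width = struct.unpack(">I", struct.pack("<I", 640))[0]
print(wrong_width)
```

The 8 bit pixel data itself is one byte per pixel, so once the header is decoded correctly there is nothing left to swap inside the pixel loops.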

Running the Java program on oddjob, averaged over 100 runs, resulted in the following times:

Java1.0.2 (agentjava)

READ IMAGE: 229 msec
PROCESSING: 6446 msec
WRITE IMAGE: 23 msec

Java2.0 Sun Linux Port jdk1.2.2

READ IMAGE: 198 msec
PROCESSING: 2521 msec
WRITE IMAGE: 25 msec

Java2.0 Blackdown Linux Port jdk1.2.2

READ IMAGE: 421 msec
PROCESSING: 2683 msec
WRITE IMAGE: 38 msec

It turns out that the Sun and Blackdown ports no longer include a JIT compiler, hence the longer processing time compared to the same code running on bald (a slower machine) which has a pre-release Blackdown port that -does- include a JIT compiler:

Java1.0.2 (agentjava)

READ IMAGE: 421 msec
PROCESSING: 7834 msec
WRITE IMAGE: 57 msec

Java2.0 Blackdown Linux Port jdk1.2-PreRelease V2

READ IMAGE: 865 msec
PROCESSING: 403 msec
WRITE IMAGE: 76 msec

So the new Java code is slightly faster when using the JIT compiler and is actually somewhat slower when run under Java 1.0.2.
For the purposes of the paper I'd suggest using the best Java2 processing time (403 msec) to compare against the C processing time (247 msec), which gives a ratio of Alphas equal to 1.63 in favor of C. Using a JIT compiler seems the most likely scenario in any real-world application, and the read and write times depend on factors that have more to do with disk drive speed and the efficiency of the image format than with processing speed.
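
The ratios quoted in this and the previous section follow directly from the PROCESSING times listed above. A short check (the labels are only descriptive):

```python
# (Java processing msec, C processing msec), taken from the tables above.
processing_ms = {
    "oddjob, Java1.0.2 vs C": (6034, 155),
    "bald, Java2 with JIT vs C": (415, 247),
    "bald, rewritten Java2 with JIT vs C": (403, 247),
}

ratios = {label: java / c for label, (java, c) in processing_ms.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}")
# prints:
# oddjob, Java1.0.2 vs C: 38.93
# bald, Java2 with JIT vs C: 1.68
# bald, rewritten Java2 with JIT vs C: 1.63
```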


Here are the results from running the tests on one of the cluster nodes.

Java2 times (averaged over 100 runs):

READ IMAGE: 271 msec
PROCESSING: 111 msec
WRITE IMAGE: 2008 msec

Times are in milliseconds. I ran the test using the Blackdown Linux Port jdk1.2-PreRelease V2 Java which has a JIT compiler.
Here are the times for the C code as well (averaged over 100 runs):

READ IMAGE: 5 msec
PROCESSING: 83 msec
WRITE IMAGE: 587 msec

The tests were run on agentc07 (PII 450MHz, 256MB RAM) under RedHat Linux 6.1 (Linux kernel 2.2.12-20).
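
The notes do not describe how the per-phase averages were collected. One plausible way, sketched here under that caveat, is to time each phase separately on every run and divide the accumulated time by the number of runs. The phase functions below are stand-ins, not the real read, process, and write code.

```python
import time


def time_phases(phases, runs=100):
    """Run each (name, func) phase in order `runs` times.

    Returns a dict mapping phase name to average time per run in milliseconds.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    totals = {name: 0.0 for name, _ in phases}
    for _ in range(runs):
        for name, func in phases:
            start = time.perf_counter()
            func()
            totals[name] += time.perf_counter() - start
    return {name: total * 1000.0 / runs for name, total in totals.items()}


# Stand-in phases (hypothetical workloads, just to exercise the harness).
phases = [
    ("READ IMAGE", lambda: bytes(1000)),
    ("PROCESSING", lambda: sum(range(1000))),
    ("WRITE IMAGE", lambda: bytearray(1000)),
]
averages = time_phases(phases, runs=100)
for name, msec in averages.items():
    print(f"{name}: {msec:.3f} msec")
```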


Running the C code on the client (Gateway Solo 2300 200MHz MMX, 48MB RAM, Linux 2.0.36) resulted in a processing time of 236 milliseconds when averaged over 100 test runs.