SWC: A Small Framework for WebComputing
David Arnow, Gerald Weiss, Kevin Ying, and Dayton Clark
Department of Computer and Information Science
Brooklyn College and CUNY Graduate Center
2900 Bedford Avenue
Brooklyn, NY 11210, USA
{kevin,arnow,weiss,dayton}@sci.brooklyn.cuny.edu
Abstract:
WebComputing is
an approach to parallel computing that uses Java applets to automatically
distribute a computation across the Internet. The promise of WebComputing is
the potential for achieving an unprecedented degree of parallelism — in
principle, a computation could harness every computer that is connected to the
Internet. There have been a number of
ambitious projects, including Charlotte, Javelin, and Bayanihan, that have explored
this platform.
SWC is a small,
simple WebComputing framework whose design differs in several respects from
that of the other WebComputing systems.
The primary objective of the SWC system is to provide a layered system
design that separates the programming interface from the underlying
WebComputing architecture. This design
approach shields application developers from the underlying dynamic and
unreliable execution environment, makes the application programs portable, yet
at the same time provides system developers freedom in architecture
implementation.
The framework
provided by the SWC system along with its simple programming interface allows
WebComputing application developers to simply and conveniently implement and
deploy a range of scientific programs and computationally demanding
applications.
Keywords: WebComputing, distributed computing, Internet computing,
Metacomputing, Parallel computing, Java applet, Java-enabled Web browser.
The advent of
the Java programming language, with its support for web-deliverable applets,
has created a new, promising, though peculiar parallel-computing platform that
some call WebComputing. The essential idea is that a master server,
or collection thereof, in league with a collection of web servers coordinates
the execution of tasks by applets running in parallel on an ever-changing set
of unreliable, heterogeneous client machines (see Figure 1). The promise of WebComputing is the potential
for achieving an unprecedented degree of parallelism — in principle, a
computation could harness every computer that is connected to the
Internet. There have been a number of
ambitious projects, including Charlotte [BKKW,96], Javelin [CCINSW,97], and
Bayanihan [Sarmenta,98], that have explored this distributed computing platform.

SWC is a small,
simple framework, originally developed as a tool for students, that simplifies
developing and deploying WebComputing applications. The objective of the SWC system is to provide a layered
WebComputing system design that separates the programming interface from the
underlying WebComputing architecture.
Figure 2 illustrates the SWC architecture. The large middle box represents the SWC system proper and
contains two levels. The upper level,
shown in the dashed box, provides the SWC programming API. The lower level provides the support for
this API. This design approach shields
application developers from the underlying dynamic and unreliable execution
environment and makes the application programs portable, yet at the same time provides
system developers freedom in architecture design.

The SWC
framework provides a master-worker MIMD parallel programming model. At the programming interface level, the
master process (SWCMaster) defines an initial set of tasks and integrates the
results of those tasks, possibly defining new tasks along the way. The tasks comprising the computation are
defined by lightweight data objects, received and carried out by workers
(SWCWorker) that run as Java applets, and generated dynamically by both the
master and the workers themselves. The
master and any worker can also broadcast control information through the server
(SWCRouter) to all workers. The workers
do not communicate directly with each other — they simply return the results of
the tasks and request new ones to carry out.
The
master-worker programming model has a number of attributes that make it
suitable for WebComputing:
1. It has been widely used and has proven effective in many distributed
   computing systems.
2. It fits well with object-oriented design methodology. In the SWC system, it
   is supported by the Java programming language.
3. It adapts well to an environment where communication is relatively
   expensive. On the Internet, network latency is high and bandwidth is limited.
4. It is a versatile and general parallel programming model and can be
   transformed into many other programming models, including the Linda
   tuple-space model [CG,90].
5. A large set of problems can be parallelized using this model.
To use the SWC
framework, an application programmer must extend a set of framework-defined
abstract classes (see Figure 3). The
SWCMaster class defines the master’s actions, while the SWCWorker class
defines the task computation. The
SWCWorkUnit and SWCResultUnit interfaces label the data needed to
define a particular task or result, a mechanism for recognizing task
equivalence, and a mechanism for recognizing task completion. Using these classes, the framework itself
oversees the entire computation and handles all communications.
    public interface SWCUnit extends Serializable {}
    public interface SWCWorkUnit extends SWCUnit {}
    public interface SWCResultUnit extends SWCUnit {}

    abstract class SWCMaster {
        final void sendWork(SWCWorkUnit swu) {…}
        final SWCResultUnit receiveResult() {…}
        abstract void beTheBoss();
    }

    abstract class SWCWorker {
        final synchronized void send(SWCUnit su) {…}
        final synchronized SWCWorkUnit receiveWork() {…}
        abstract void doTheWork();
    }

Figure 3. SWC Framework Programming API
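To make the API in Figure 3 concrete, the following is a minimal sketch of an application written against it. The bodies of sendWork, receiveResult, send, and receiveWork are elided in the figure, so the versions here are our own stand-ins built on two in-memory queues (the hypothetical LocalRouter); they are not SWC's implementation. RangeTask, PartialSum, Stop, SumMaster, SumWorker, and SumDemo are likewise illustrative names, and the sketch assumes a modern Java version rather than the JDK 1.1 used by SWC.

```java
import java.io.Serializable;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Interfaces as in Figure 3.
interface SWCUnit extends Serializable {}
interface SWCWorkUnit extends SWCUnit {}
interface SWCResultUnit extends SWCUnit {}

// Stand-in for the lower layer: two in-memory queues (assumption, not SWC's code).
final class LocalRouter {
    final BlockingQueue<SWCWorkUnit> work = new LinkedBlockingQueue<>();
    final BlockingQueue<SWCResultUnit> results = new LinkedBlockingQueue<>();

    static <T> T take(BlockingQueue<T> q) {
        try {
            return q.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

abstract class SWCMaster {
    LocalRouter router; // wired up by SumDemo below
    final void sendWork(SWCWorkUnit swu) { router.work.add(swu); }
    final SWCResultUnit receiveResult() { return LocalRouter.take(router.results); }
    abstract void beTheBoss();
}

abstract class SWCWorker {
    LocalRouter router; // wired up by SumDemo below
    final synchronized void send(SWCUnit su) {
        // New task definitions go back into the pool; results go to the master.
        if (su instanceof SWCWorkUnit) router.work.add((SWCWorkUnit) su);
        else if (su instanceof SWCResultUnit) router.results.add((SWCResultUnit) su);
    }
    final synchronized SWCWorkUnit receiveWork() { return LocalRouter.take(router.work); }
    abstract void doTheWork();
}

// ---- Application code: the part an SWC programmer would write ----
final class RangeTask implements SWCWorkUnit {
    final int lo, hi; // sum k*k for lo <= k < hi
    RangeTask(int lo, int hi) { this.lo = lo; this.hi = hi; }
}

final class Stop implements SWCWorkUnit {} // hypothetical shutdown marker

final class PartialSum implements SWCResultUnit {
    final long value;
    PartialSum(long value) { this.value = value; }
}

final class SumMaster extends SWCMaster {
    long total;
    @Override void beTheBoss() {
        int tasks = 4;
        for (int i = 0; i < tasks; i++) sendWork(new RangeTask(i * 25, (i + 1) * 25));
        for (int i = 0; i < tasks; i++) total += ((PartialSum) receiveResult()).value;
    }
}

final class SumWorker extends SWCWorker {
    @Override void doTheWork() {
        for (SWCWorkUnit u = receiveWork(); !(u instanceof Stop); u = receiveWork()) {
            RangeTask t = (RangeTask) u;
            long sum = 0;
            for (int k = t.lo; k < t.hi; k++) sum += (long) k * k;
            send(new PartialSum(sum));
        }
    }
}

final class SumDemo {
    // Runs the master in the calling thread and each worker in its own thread.
    static long run(int workers) throws InterruptedException {
        LocalRouter router = new LocalRouter();
        SumMaster master = new SumMaster();
        master.router = router;
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            SumWorker w = new SumWorker();
            w.router = router;
            threads[i] = new Thread(w::doTheWork);
            threads[i].start();
        }
        master.beTheBoss();
        for (int i = 0; i < workers; i++) router.work.add(new Stop());
        for (Thread t : threads) t.join();
        return master.total;
    }
}
```

Calling SumDemo.run(3) sums the squares of 0 through 99 across three worker threads. Only the four application classes below the marker would change from one application to another; the stand-in lower layer could in principle be swapped for a process-based or applet-based one without touching them.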
Because its architecture is independent of the programming interface, SWC is
able to provide three implementations of its architecture (see Figure 4). One
is thread-based and runs on an SMP platform; it serves as a development
environment when a web-based one is not available or is inconvenient. Another implementation is based on
independent Unix processes. The third implementation serves our main purpose,
WebComputing. It consists of a
multithreaded, servlet-enhanced HTTP server that provides an application
control page, creates the master process, downloads the applets and handles all
communication. It also provides the
classes that define, for both the master process and the applets, their
computational structure and their communication tools.

An SWC computation is
initiated using a form in a web page provided by the server. A servlet responds by creating the necessary
objects within the HTTP server along with an external master process. The master process and the server
communicate using the connection-oriented TCP protocol and need not reside on
the same machine. The master process,
using programmer-provided classes, creates an initial set of task definitions,
which are sent off to the server. The
server maintains a collection of task definitions and uses eager scheduling,
as is commonly done in WebComputing [AISS,97] [BKKW,96] [CCINSW,97], to assign
them to applets that have been downloaded into volunteering web clients.
Server-applet communication uses the connectionless UDP protocol, for four
reasons: the applets themselves are unreliable; the most obvious candidates for
WebComputing do not require large quantities of data to define tasks;
system-imposed limits on the number of connections would otherwise make the
server a potential bottleneck; and performance considerations favor it [YAC,99].
Because the server does not consider a task
complete until the results are actually received, lost packets do not compromise
the integrity of the application. To
keep a communication failure from leaving an applet unused,
applets use a timeout mechanism to repeatedly send the server their most recent
response until the server provides additional work or instructs them to
terminate.
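The applet-side timeout mechanism can be sketched as follows. This is a minimal illustration under our own assumptions, not SWC's wire protocol: the message strings "RESULT 1" and "WORK 2", the class ResendDemo, and the bound maxTries (added so that the demo terminates) are all hypothetical. A toy server on the loopback interface ignores the first two datagrams to simulate loss.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

final class ResendDemo {
    /** Applet side: resend the most recent response until the server answers or tries run out. */
    static String sendUntilAnswered(DatagramSocket sock, InetAddress host, int port,
                                    String response, int timeoutMs, int maxTries)
            throws IOException {
        byte[] out = response.getBytes(StandardCharsets.UTF_8);
        byte[] in = new byte[512];
        sock.setSoTimeout(timeoutMs); // receive() now throws after timeoutMs of silence
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            sock.send(new DatagramPacket(out, out.length, host, port));
            try {
                DatagramPacket reply = new DatagramPacket(in, in.length);
                sock.receive(reply);
                return new String(reply.getData(), 0, reply.getLength(), StandardCharsets.UTF_8);
            } catch (SocketTimeoutException lost) {
                // Either the response or the reply was lost: loop and resend.
            }
        }
        return null; // no answer within maxTries attempts
    }

    /** Toy server: ignores the first `drop` datagrams, then answers once with new work. */
    static void startLossyServer(DatagramSocket server, int drop) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[512];
            try {
                for (int seen = 0; ; seen++) {
                    DatagramPacket p = new DatagramPacket(buf, buf.length);
                    server.receive(p);
                    if (seen < drop) continue; // simulate a lost packet
                    byte[] work = "WORK 2".getBytes(StandardCharsets.UTF_8);
                    server.send(new DatagramPacket(work, work.length, p.getAddress(), p.getPort()));
                    return;
                }
            } catch (IOException closed) {
                // Socket closed by the demo: nothing left to do.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    static String demo() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket applet = new DatagramSocket(0, loopback)) {
            startLossyServer(server, 2);
            return sendUntilAnswered(applet, loopback, server.getLocalPort(), "RESULT 1", 100, 10);
        }
    }
}
```

Because the same response is resent unchanged, the server may receive it more than once; this is harmless as long as the server treats repeated results for an already completed task as duplicates.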
As the server receives responses from applets, it determines whether
these are additional task definitions, in which case they are added to its
collection of task definitions, or results from completed tasks, in which case
they are passed to the master. The
master may, in response, generate new task definitions or control information
for the server to distribute to the applets.
Control information is immediately broadcast to applets and made
available to all future ones. Figure 5
illustrates SWC communication between different modules.
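The server-side bookkeeping described above, together with eager scheduling, can be sketched as a small task pool. This is a minimal sketch under our own assumptions: the class EagerPool, its method names, integer task identifiers, and string-valued definitions and results are all hypothetical, and control broadcasts are omitted. A task stays eligible for assignment until its first result arrives, so idle applets may be handed tasks that are already assigned elsewhere, and later duplicate results are discarded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal eager-scheduling task pool (hypothetical names, not SWC's classes). */
final class EagerPool {
    private final Map<Integer, String> pending = new LinkedHashMap<>(); // id -> uncompleted task
    private final Deque<Integer> rotation = new ArrayDeque<>();         // hand-out order
    private final List<String> forMaster = new ArrayList<>();           // results passed upward
    private int nextId = 0;

    /** A new task definition arrives, from the master or from an applet. */
    int addTask(String definition) {
        int id = nextId++;
        pending.put(id, definition);
        rotation.addLast(id);
        return id;
    }

    /** An applet asks for work: returns an uncompleted task id, or -1 if none remain. */
    int assign() {
        while (!rotation.isEmpty()) {
            int id = rotation.pollFirst();
            if (pending.containsKey(id)) {
                rotation.addLast(id); // eager scheduling: keep it eligible for reassignment
                return id;
            }
            // Otherwise the task has completed and simply drops out of the rotation.
        }
        return -1;
    }

    /** A result arrives: true if it is the first for this task, false if a duplicate. */
    boolean complete(int id, String result) {
        if (pending.remove(id) == null) return false;
        forMaster.add(result);
        return true;
    }

    String definition(int id) { return pending.get(id); }
    List<String> resultsForMaster() { return forMaster; }
    boolean done() { return pending.isEmpty(); }
}
```

With two tasks in the pool, three consecutive calls to assign() hand out the first task, the second, and then the first again; once complete() has accepted a result for a task, that task is never assigned again and any further result for it is rejected.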

The efficiency of a
WebComputing application depends on many elements beyond the control of the
implementer of a WebComputing system or framework. Among these are the appropriateness of the application to the
platform, given the extreme network latency and bandwidth limitations of
the Internet, the implementation of the Java Virtual Machines (JVMs) that
support the applets, and the network conditions that affect latency and
bandwidth. To provide a sensible
estimate of the system’s performance, we need to quantify the communication
overhead. Data transfers between
different system modules constitute the dominant portion of the WebComputing
system overhead. Data flow of the SWC
system can be classified into three categories:
- MW: The SWCMaster sends newly defined tasks to a SWCWorker.
- WM: The SWCWorker sends SWCResultUnits to the SWCMaster.
- WW: The SWCWorker sends SWCWorkUnits to the SWCRouter for rescheduling
  to another SWCWorker.
Evaluating the communication
costs for these three categories can provide insight into the system's
behavior. A WebComputing application
developer may use this information and the communication pattern of the
specific application to estimate its communication overhead.
To study these costs,
performance measurements were carried out using the JDK 1.1.7 Solaris Production
Release and its default just-in-time (JIT) compiler, on two Sun
Ultra-Sparc-10 workstations with 128 MB of RAM each, running the Solaris 7
operating system. One machine was
employed as a server, the other as a client.
The machines were connected by a 100 Mbit Ethernet LAN. The network latency was below 1 millisecond
(ms), as measured with the UNIX ping facility. We also used Netscape Communicator 4.51
with a 1.1.5 Java run-time environment when the applet implementation of the SWC
system was tested.
Table 1 shows the
communication costs for different categories of application data flow using an
object with a 45-byte internal state.
For both thread and process measurements, SWCMaster, SWCWorker, and
SWCRouter ran on a single workstation as three threads and three processes
respectively. For the Web Applet
measurement, two workstations were employed.
Both SWCMaster and SWCRouter ran on the server as different processes,
while a SWCWorker ran as a Java applet in a Netscape browser on the
client. In the first experiment (MW and
WM), a 45-byte object was bounced back and forth between the SWCMaster and the
SWCWorker. The cost of a round trip was
recorded. The second experiment (WW)
shows the data flow cost of the third communication category.
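For orientation, the sketch below shows what moving such an object involves if units are shipped with standard Java serialization. That is an assumption on our part: the Serializable bound in Figure 3 suggests it, but the wire format is not described here. The class PingUnit, with its 45-byte array, and the helper class WireSize are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

/** Hypothetical unit carrying 45 bytes of state (illustrative only). */
final class PingUnit implements Serializable {
    private static final long serialVersionUID = 1L;
    final byte[] state = new byte[45];
}

final class WireSize {
    /** Serializes a unit into the byte array that would travel in a message. */
    static byte[] toBytes(Serializable unit) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(unit);
        }
        return buf.toByteArray();
    }

    /** Rebuilds the unit on the receiving side. */
    static Object fromBytes(byte[] wire) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        PingUnit ping = new PingUnit();
        Arrays.fill(ping.state, (byte) 7);
        byte[] wire = toBytes(ping);
        PingUnit back = (PingUnit) fromBytes(wire);
        System.out.println("state bytes: " + ping.state.length
                + ", serialized bytes: " + wire.length);
        System.out.println("round trip intact: " + Arrays.equals(ping.state, back.state));
    }
}
```

The serialized form is larger than the 45 bytes of state because the stream also carries a header and a class descriptor, so the per-message cost covers more than the raw state.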
| Data flow (45-byte state data object) | Thread | Process | Web Applet |
|---------------------------------------|--------|---------|------------|
| MW and WM (round trip)                | 50 ms  | 50 ms   | 60 ms      |
| WW                                    | 7 ms   | 8 ms    | 19 ms      |

Table 1. Communication Costs of Different Application Data Flow Paths
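As a worked example of how a developer might use these figures, the sketch below multiplies an application's message counts by the per-message costs in Table 1. The linear model, the enum Platform, and the class OverheadEstimate are our own illustration: the model adds costs serially, ignores any overlap between workers, and applies only to small objects on a network similar to the one measured.

```java
/** Per-message costs in milliseconds, copied from Table 1. */
enum Platform {
    THREAD(50, 7), PROCESS(50, 8), WEB_APPLET(60, 19);

    final int roundTripMs; // one MW transfer plus one WM transfer
    final int wwMs;        // one WW transfer

    Platform(int roundTripMs, int wwMs) {
        this.roundTripMs = roundTripMs;
        this.wwMs = wwMs;
    }
}

final class OverheadEstimate {
    /** Crude serial estimate: counts times unit costs, with no overlap assumed. */
    static long serialOverheadMs(Platform p, long roundTrips, long wwTransfers) {
        return roundTrips * p.roundTripMs + wwTransfers * p.wwMs;
    }

    public static void main(String[] args) {
        // 1000 master-worker round trips and 200 worker-generated tasks as applets:
        // 1000 * 60 ms + 200 * 19 ms = 63800 ms.
        System.out.println(serialOverheadMs(Platform.WEB_APPLET, 1000, 200) + " ms");
    }
}
```

An application whose tasks each compute for much longer than 60 ms would spend most of its time computing; one with very short tasks would be dominated by this overhead.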
Since Java's
inception, many distributed computing research projects have been designed to
implement distributed systems using Java technologies. These research projects may be classified
into two categories based on whether the system uses the World Wide Web as the
distributed platform or not. Some
projects in the first category, which do not use the Web, such as JavaPVM [Thurman,96] and mpiJava
[BCHL,98], are direct extensions of traditional message-passing distributed
systems like PVM [GBDJMS,94] and MPI [SOHWD,96]. Others, like ATLAS [BBB,96], extend the concept of a thread,
which was initially developed for SMP machines, to distributed computing by
spawning “threads” onto remote hosts. These
projects are built solely on Java applications running in a LAN
environment. They cannot easily be
adapted to large-scale Internet computing [YA,99]. Projects in the second category use Java-enabled Web browsers
as distributed platforms and deploy Java applets as primary computing units
across networks. They are WebComputing
systems.
The Charlotte project [BKKW,96] pioneered the research on WebComputing
in 1996. It uses the remote
thread-programming model with distributed shared memory (DSM) that is
implemented at the Java application programming level. The high cost of migrating threads across
JVM boundaries calls the performance and scalability of Charlotte into
question. On the other hand, Javelin
[CCINSW,97] was developed from a system-development point of view, using
brokers to match computing resource donors with their consumers. It provides limited support for high level
programming interfaces and expects WebComputing application developers to
absorb most or all of the programming complexity imposed by the WebComputing
programming environments. In addition,
Javelin's ad hoc run-time library components use expensive Applet-to-Applet
communication for load balancing between applet workers. This communication must be routed through
the server because of the web-browser sandbox security restriction. The Bayanihan project [Sarmenta,98] provides
a highly object-oriented framework with replaceable components for its
applications. Compared to Bayanihan,
the programming interface of the SWC framework is less complex and more
coherent, and the system itself is small and compact.
Although SWC applications
must be written in Java and therefore can easily be developed in an
object-oriented fashion, the framework is equally "friendly" to a
procedural or imperative style. In
particular, it is not terribly difficult to port C or even Fortran applications
to run as SWC programs. Several
projects (including Monte Carlo computations and classical Operational Research
projects) are underway using SWC. Two
CS courses in the spring of 1999 were taught using SWC and it is worth noting
that students — graduate and undergraduate — took to it easily.
The SWC system provides a small framework for WebComputing. The system's source code is less than 80K
and the total size of the compiled Java classes is less than 50K. Its layered design makes its
applications portable and gives system developers freedom to develop a robust underlying
architecture. At the upper layer, the
SWC system provides a simple, coherent programming interface. WebComputing application programmers simply
extend four predefined classes of the framework using the master-worker programming
paradigm. Because of the robust lower-layer
implementation, the same application program can run as threads, as processes,
or as applets for WebComputing.
This provides a useful environment for developing distributed programs.
This paper also provides communication performance measurements for
three categories of application data flow.
Currently, we are extending the SWC framework and implementing more
applications for this system. At the
same time, we are extending the current single-SWCRouter architecture by adding
multiple-SWCRouter capability to the framework. This extension will greatly enhance the scalability of the
system while remaining transparent to SWC WebComputing applications.
This research is supported
by ONR grant N00014-96-1-1057 and National Science Foundation's CISE program
#CDA-9522537.
[AISS,97] A. D. Alexandrov, M. Ibel, K. E. Schauser, and C. J. Scheiman, SuperWeb: Research Issues in Java-Based Global Computing, Concurrency: Practice and Experience, June 1997.
[BCHL,98] Mark Baker, Bryan Carpenter, Sung Hoon Ko, and Xinying Li, mpiJava: A Java interface to MPI, presented at the First UK Workshop on Java for High Performance Network Computing, Europar 1998. http://www.npac.syr.edu/projects/pcrc/mpiJava/mpiJava.html
[BBB,96] J. Baldeschweiler, R. Blumofe, and E. Brewer, ATLAS: An Infrastructure for Global Computing, Proceedings of the Seventh ACM SIGOPS European Workshop: Systems Support for Worldwide Applications, Connemara, Ireland, September 1996.
[BKKW,96] A. Baratloo, M. Karaul, Z. Kedem, and P. Wyckoff, Charlotte: Metacomputing on the Web, in Proceedings of the 9th International Conference on Parallel and Distributed Computing Systems, 1996.
[CCINSW,97] P. Cappello, B. Christiansen, M. F. Ionescu, M. O. Neary, K. E. Schauser, and D. Wu, Javelin: Internet-based parallel computing using Java, the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1997.
[CG,90] Nicholas Carriero and David Gelernter, How to Write Parallel Programs: A First Course, MIT Press, 1990.
[GBDJMS,94] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, B. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine - A User's Guide and Tutorial for Network Parallel Computing, MIT Press, 1994.
[Sarmenta,98] Luis Sarmenta, Bayanihan: Web-Based Volunteer Computing Using Java, 2nd International Conference on World-Wide Computing and its Applications (WWCA'98), Tsukuba, Japan, March 1998.
[SOHWD,96] M. Snir, S. Otto, S. Huss-Lederman, D. Walker, and J. Dongarra, MPI: The Complete Reference, MIT Press, 1996.
[Thurman,96] D. Thurman, JavaPVM: The Java to PVM Interface, December 1996. JavaPVM has been renamed jPVM; URL: http://www.isye.gatech.edu/chmsr/jPVM/
[YA,99] Kevin Ying and David Arnow, A WebComputing Overview, Brooklyn College Computer Science Department Technical Report #TR-3-99, March 1999.
[YAC,99] Kevin Ying, David Arnow, and Dayton Clark, Evaluating Communication Protocols for WebComputing, in Proceedings of the 1999 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'99), CSREA Press, June 1999.