DistrIT's power lies in its simplicity. The package contains only four essential classes:
- distrit.Task
- distrit.InteractiveTask
- distrit.InteractiveTaskServer
- distrit.client.InteractiveTaskClient
All you must do is implement a couple of interfaces and you're away. Read on for a walkthrough of the first steps.
The first thing you must know is what a Task is. The Task interface defines a single method called run. This method is completely generic (i.e. it can stand in for any method you like) because it takes an Object as its parameter and returns another Object. For example, if you wanted a task that found the first n primes, you could write something like:
import distrit.Task;
import java.util.ArrayList;

public class PrimeFinderTask implements Task
{
    public Object run( Object howManyObj )
    {
        int howMany = ( ( Integer ) howManyObj ).intValue();
        ArrayList rv = new ArrayList();
        for( int currentTest = 2; rv.size() < howMany; currentTest++ )
        {
            boolean foundFactor = false;
            for( int fl = 2; fl < currentTest; fl++ )
            {
                if( currentTest % fl == 0 )
                {
                    foundFactor = true;
                }
            }
            if( !foundFactor )
            {
                rv.add( new Integer( currentTest ) );
            }
        }
        return rv;
    }
}
In the example you can see how, by using containers such as ArrayList, you can pack as many arguments and results as you need. You could even pack in other sub-tasks for this task to complete. Your task's run method could also open a GUI frame and return the results of the work done within that frame.
Now what we have here is an object containing custom executable code. Since the Task interface extends java.io.Serializable, all objects implementing it can be transferred across a network. So this custom executable object can be sent to any other PC, which can execute it and collect its results by calling the run method.
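To make the "transferred across a network" point concrete, here is a minimal sketch that serializes a task to bytes and reads it back before running it, which is roughly what happens on a network hop. The nested Task interface is only a stand-in with the shape described above (an assumption, so the example compiles without the DistrIT JAR), and FirstPrimes and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TaskRoundTrip {
    // Stand-in for distrit.Task (assumption: same shape as described in the text).
    interface Task extends Serializable {
        Object run(Object param);
    }

    // Hypothetical task: returns the first n primes as a list.
    static class FirstPrimes implements Task {
        private static final long serialVersionUID = 1L;

        public Object run(Object param) {
            int howMany = (Integer) param;
            List<Integer> primes = new ArrayList<>();
            for (int n = 2; primes.size() < howMany; n++) {
                boolean prime = true;
                for (int d = 2; d * d <= n; d++) {
                    if (n % d == 0) { prime = false; break; }
                }
                if (prime) primes.add(n);
            }
            return primes;
        }
    }

    // Serialize the task to bytes and read it back, as a network transfer would.
    static Task roundTrip(Task task) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Task) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Task received = roundTrip(new FirstPrimes());
        System.out.println(received.run(5)); // prints [2, 3, 5, 7, 11]
    }
}
```

Note that only the object's state travels in the byte stream. The receiving machine still needs access to the class file itself, which is why a codebase web server comes into play later in this walkthrough.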
In some cases you may want to interact with your task while it is running. For example, you may want to update some parameters it uses during calculation, or retrieve the list of results completed so far. You may even want to halt the task because another client has found a solution. You may also want to send new data to display on a GUI or retrieve changes made to a document. In the case of a distributed evolutionary algorithm you may want to exchange individuals between neighbouring populations distributed across a 2D plane.
The InteractiveTask interface extends Task, adding two methods: public void set( Object params ) and public Object get( Object params ). For example, in a strategy game you may want to update the running task with the positions of other players using the set method and then retrieve the player's latest moves using the get method.
Returning to our prime number finder we could extend it so we could stop its execution and retrieve a list of partial results like this:
import distrit.InteractiveTask;
import java.util.ArrayList;

public class InteractivePrimeFinderTask implements InteractiveTask
{
    protected ArrayList resultsSoFar;
    // volatile: set() is called from a different thread than run()
    protected volatile boolean stopNow = false;

    public Object run( Object howManyObj )
    {
        int howMany = ( ( Integer ) howManyObj ).intValue();
        resultsSoFar = new ArrayList();
        for( int currentTest = 2; !stopNow && ( resultsSoFar.size() < howMany ); currentTest++ )
        {
            boolean foundFactor = false;
            for( int fl = 2; fl < currentTest; fl++ )
            {
                if( currentTest % fl == 0 )
                {
                    foundFactor = true;
                }
            }
            if( !foundFactor )
            {
                resultsSoFar.add( new Integer( currentTest ) );
            }
        }
        return resultsSoFar;
    }

    public void set( Object stopNowObj )
    {
        stopNow = ( ( Boolean ) stopNowObj ).booleanValue();
    }

    public Object get( Object param )
    {
        return resultsSoFar; // Ignores param; null until run() has started
    }
}
Now we could run this task asking it for a practically unlimited number of primes (for example Integer.MAX_VALUE), keep checking its progress now and then by calling get( null ), and stop it when we need to by calling set( new Boolean( true ) ).
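The following is a minimal sketch of that usage pattern: the task runs on a worker thread while the caller stops it after a short delay and collects the partial results. The nested InteractiveTask interface is a stand-in with the shape described above (an assumption), and StoppablePrimes and runBriefly are hypothetical names. The stop flag is volatile and the list is guarded by a lock because two threads touch them.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class InteractiveDemo {
    // Stand-in for distrit.InteractiveTask (assumed shape, taken from the text).
    interface InteractiveTask extends Serializable {
        Object run(Object param);
        void set(Object params);
        Object get(Object params);
    }

    static class StoppablePrimes implements InteractiveTask {
        private final List<Integer> results = new ArrayList<>();
        // volatile so the worker thread sees a stop request made by another thread
        private volatile boolean stopNow = false;

        public Object run(Object param) {
            int howMany = (Integer) param;
            for (int n = 2; !stopNow && count() < howMany; n++) {
                if (isPrime(n)) {
                    synchronized (results) { results.add(n); }
                }
            }
            return get(null);
        }

        public void set(Object stop) { stopNow = (Boolean) stop; }

        // Returns a snapshot copy so callers never see the list mid-update.
        public Object get(Object ignored) {
            synchronized (results) { return new ArrayList<>(results); }
        }

        private int count() {
            synchronized (results) { return results.size(); }
        }

        private static boolean isPrime(int n) {
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) return false;
            }
            return n >= 2;
        }
    }

    // Runs the task in the background for roughly `millis` ms, then stops it.
    @SuppressWarnings("unchecked")
    static List<Integer> runBriefly(long millis) throws InterruptedException {
        StoppablePrimes task = new StoppablePrimes();
        Thread worker = new Thread(() -> task.run(Integer.MAX_VALUE));
        worker.start();
        Thread.sleep(millis);
        task.set(Boolean.TRUE);   // ask the task to stop
        worker.join();            // wait for run() to return
        return (List<Integer>) task.get(null);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> primes = runBriefly(100);
        System.out.println("Found " + primes.size() + " primes before stopping");
    }
}
```

How many primes are found depends on the machine and the delay, so the example prints only the count.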
Since in most cases your run method will only be executed once, all the information it requires can be passed in when the object is constructed. Why you may want to do this will become clear later on. Also, we may want the prime finder to look for all primes within a specified range. Then we could tell different clients to check different ranges for primes and collate all the information at the end. We could also implement a toString() method so this object prints nicely. Let's see how we could implement both these changes:
import distrit.InteractiveTask;
import java.util.ArrayList;

public class RangePrimeFinder implements InteractiveTask
{
    protected ArrayList resultsSoFar;
    // volatile: set() is called from a different thread than run()
    protected volatile boolean stopNow = false;
    protected int startSearchAt, stopSearchAt;

    public RangePrimeFinder( int startSearchAt, int stopSearchAt )
    {
        this.startSearchAt = startSearchAt;
        this.stopSearchAt = stopSearchAt;
    }

    public Object run( Object ignored )
    {
        resultsSoFar = new ArrayList();
        // Start no lower than 2: 0 and 1 have no factor in [2, n) but are not prime
        for( int currentTest = Math.max( 2, startSearchAt ); !stopNow && ( currentTest < stopSearchAt ); currentTest++ )
        {
            boolean foundFactor = false;
            for( int fl = 2; fl < currentTest; fl++ )
            {
                if( currentTest % fl == 0 )
                {
                    foundFactor = true;
                }
            }
            if( !foundFactor )
            {
                resultsSoFar.add( new Integer( currentTest ) );
            }
        }
        return resultsSoFar;
    }

    public void set( Object stopNowObj )
    {
        stopNow = ( ( Boolean ) stopNowObj ).booleanValue();
    }

    public Object get( Object param )
    {
        return resultsSoFar; // Ignores param; null until run() has started
    }

    public String toString()
    {
        return "RangePrimeFinder with range = [" + startSearchAt + ", " + stopSearchAt + ").";
    }
}
Now the run method can be called with a null parameter, and we could give RangePrimeFinder objects covering different ranges to various clients.
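As a rough illustration of splitting the work and collating the results, the sketch below divides a range into contiguous sub-ranges, searches each one (here in a plain loop standing in for separate clients), and merges the results. It does not use DistrIT at all. RangeSplit, split and collate are hypothetical names, and the half-open [start, stop) convention is assumed to match RangePrimeFinder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RangeSplit {
    // Primes in the half-open range [start, stop).
    static List<Integer> primesInRange(int start, int stop) {
        List<Integer> primes = new ArrayList<>();
        for (int n = Math.max(2, start); n < stop; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) { prime = false; break; }
            }
            if (prime) primes.add(n);
        }
        return primes;
    }

    // Split [start, stop) into at most `parts` contiguous sub-ranges (parts must be > 0).
    static List<int[]> split(int start, int stop, int parts) {
        List<int[]> ranges = new ArrayList<>();
        int chunk = (stop - start + parts - 1) / parts; // ceiling division
        for (int lo = start; lo < stop; lo += chunk) {
            ranges.add(new int[] { lo, Math.min(lo + chunk, stop) });
        }
        return ranges;
    }

    // Simulate several clients locally and merge their results in sorted order.
    static List<Integer> collate(int start, int stop, int parts) {
        TreeSet<Integer> merged = new TreeSet<>();
        for (int[] r : split(start, stop, parts)) {
            merged.addAll(primesInRange(r[0], r[1]));
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        System.out.println(collate(0, 30, 3)); // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}
```

Because the sub-ranges do not overlap, the merge step is a simple union. A sorted set is used only so the combined output comes back in order.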
Once you've written your InteractiveTask you're halfway there. All you need now is to implement your InteractiveTaskServer to tell your clients what to do and to manage the flow of information between them.
DistrIT adopts a Client-Server model in which clients contact a server, request a task to execute, and while running it contact the server regularly to interact with it. All communication is initiated from the client using Remote Method Invocation (RMI). Basic knowledge of RMI (see tutorial) is recommended but not required to use DistrIT. Upon interaction with clients the server will manage whatever exchange of information is required or update their task being executed.
A generic client capable of connecting to any InteractiveTaskServer and then running any InteractiveTask has been implemented for you and will be sufficient for all purposes. In fact, it has been packaged for you into a 5 KB JAR file ready to distribute to your client PCs. The InteractiveTaskClient (ITC) has been implemented to be robust to server failure and is capable of dynamically switching from one task to the next. If a client disconnects from the server it will not stop executing its task. It will keep trying to connect, and upon reconnection it will request an ID and a task again. If the task it was previously running yields the same toString() value as the task it downloaded, it simply continues running the previous task. Otherwise it stops it and runs the new one. toString() is used instead of equals() to provide robustness to recompilation of task classes. In effect, once you install ITC on your client PCs you can leave it there unattended and stop worrying about it. This is because all customization of the DistrIT system is done by implementing the InteractiveTask and InteractiveTaskServer interfaces, which reside on your server. ITC acts as the intermediary between the interactive task it is running and the server it got the task from. Once it downloads the task it will execute run( null ), so any configuration of the task object must be done through the constructor, as mentioned above. Even though ITC is sufficient for all purposes, it is nevertheless possible to use the DistrIT architecture with a custom client.
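The toString() comparison described above can be pictured with a small sketch. This is not ITC's actual code; keepCurrent and RangeTask are hypothetical names used only to show why two task objects that are not equal() can still be treated as the same piece of work.

```java
public class TaskSwitch {
    // Hypothetical task descriptor; equals() is deliberately NOT overridden.
    static class RangeTask {
        final int start, stop;

        RangeTask(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return "RangePrimeFinder with range = [" + start + ", " + stop + ").";
        }
    }

    // Keep the running task when the freshly downloaded one describes the same work.
    static boolean keepCurrent(Object running, Object downloaded) {
        return running != null && downloaded != null
            && running.toString().equals(downloaded.toString());
    }

    public static void main(String[] args) {
        RangeTask running = new RangeTask(0, 10000);
        System.out.println(keepCurrent(running, new RangeTask(0, 10000)));     // prints true
        System.out.println(keepCurrent(running, new RangeTask(10000, 20000))); // prints false
    }
}
```

A practical consequence is that a task's toString() should include every parameter that distinguishes one unit of work from another, otherwise a client may keep running a stale task.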
Now we come to the second and last interface you must implement to get your distributed processing up and running: the InteractiveTaskServer interface. Your server must be able to give tasks out to clients and then interact with them. For this it will most likely need a way to identify each client. This is why the first thing your server must provide is a (probably unique) ID object for each client. Once the client has been given this ID, it will use it in all subsequent communication with the server to identify itself. The next thing the server must provide to the client is an InteractiveTask for it to run. The last behaviour to be described for the server is what happens when it interacts with the client. The server would likely want to stop the client if it has reached a solution, it may want to exchange information between clients, and so on. All this behaviour is coded by implementing three methods in the InteractiveTaskServer interface:
- public Object getID( Object initialParameters ): This is the first method ITC calls on the server, akin to a registration step. ITC provides initialParameters in the form of a Vector containing the Strings which were passed to the client as command line parameters. For example, ITC might have been started with the parameters Bob 2.8 modem, which could tell the server that this client belongs to Bob, who has a 2.8 GHz processor connected to the internet through a modem, and this could influence future behaviour towards this client. The server must return some form of ID Object to be used by the client in future communications. For example, the ID could be a unique Integer.
- public InteractiveTask getTask( Object id ): Here ITC is asking for a task to run, referring to itself by the id it was given when it registered. The server returns the task it wants this client to execute. For example: return new InteractivePrimeFinderTask();
- public Object interact( Object id, Object clientTaskOutput ): Here ITC is sending information from the task it is running to the server and expects some information in return to give to the task. In our prime finder task, ITC would forward the list of primes and the server would return a Boolean signalling whether to stop execution. ITC internally calls: task.set( server.interact( id, task.get( null ) ) ). So first it gets the client task output (task.get( null )). Then it sends this as a parameter to the interact method on the server. The value returned from the server's interact method is then passed into the task using the set method.
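The call chain just described can be sketched in a few lines. The two nested interfaces below are stand-ins with the method shapes given in the text (an assumption, so the example compiles without DistrIT), while ToyTask, ToyServer and interactOnce are hypothetical names. Everything runs in one process with no RMI involved.

```java
import java.util.ArrayList;
import java.util.List;

public class InteractStep {
    // Stand-ins for the DistrIT interfaces (assumed shapes, per the text).
    interface InteractiveTask {
        Object run(Object param);
        void set(Object params);
        Object get(Object params);
    }

    interface InteractiveTaskServer {
        Object getID(Object initialParameters);
        InteractiveTask getTask(Object id);
        Object interact(Object id, Object clientTaskOutput);
    }

    // Toy task: exposes a result list and a stop flag.
    static class ToyTask implements InteractiveTask {
        final List<Integer> results = new ArrayList<>();
        volatile boolean stopNow = false;

        public Object run(Object param) { return results; }
        public void set(Object stop) { stopNow = (Boolean) stop; }
        public Object get(Object ignored) { return new ArrayList<>(results); }
    }

    // Toy server: tells a client to stop once it reports at least `target` results.
    static class ToyServer implements InteractiveTaskServer {
        final int target;

        ToyServer(int target) { this.target = target; }

        public Object getID(Object initialParameters) { return 0; }
        public InteractiveTask getTask(Object id) { return new ToyTask(); }
        public Object interact(Object id, Object clientTaskOutput) {
            return ((List<?>) clientTaskOutput).size() >= target;
        }
    }

    // One interaction step: task output goes to the server, the reply goes back in.
    static void interactOnce(InteractiveTaskServer server, Object id, InteractiveTask task) {
        Object output = task.get(null);
        Object reply = server.interact(id, output);
        task.set(reply);
    }

    public static void main(String[] args) {
        ToyServer server = new ToyServer(2);
        Object id = server.getID(null);
        ToyTask task = (ToyTask) server.getTask(id);

        task.results.add(2);
        interactOnce(server, id, task);
        System.out.println(task.stopNow); // prints false

        task.results.add(3);
        interactOnce(server, id, task);
        System.out.println(task.stopNow); // prints true
    }
}
```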
Here is a server for the prime finder example:
import distrit.InteractiveTask;
import distrit.InteractiveTaskServer;
import java.util.Vector;
import java.util.ArrayList;
import java.util.HashSet;

public class PrimeFinderServer implements InteractiveTaskServer
{
    protected static final int RANGE_SIZE = 10000;
    protected ArrayList clientNames = new ArrayList();
    protected HashSet primesFoundSoFar = new HashSet();

    // synchronized: several clients may call in at the same time
    public synchronized Object getID( Object initialParameters )
    {
        Vector clientArgs = ( Vector ) initialParameters;
        String clientName = ( String ) clientArgs.get( 0 ); // first command line argument
        int id = clientNames.indexOf( clientName );
        if( id < 0 )
        {
            // New client: give it the next free id
            id = clientNames.size();
            clientNames.add( clientName );
        }
        return new Integer( id );
    }

    public InteractiveTask getTask( Object id )
    {
        int rangeBase = ( ( Integer ) id ).intValue() * RANGE_SIZE;
        return new RangePrimeFinder( rangeBase, rangeBase + RANGE_SIZE );
    }

    public synchronized Object interact( Object id, Object clientTaskOutput )
    {
        // The task returns null from get() until its run() method has started
        if( clientTaskOutput != null )
        {
            primesFoundSoFar.addAll( ( ArrayList ) clientTaskOutput );
        }
        return new Boolean( false ); // tell client to keep going
    }
}
The getID method gives out unique Integer objects as client identifiers. The getTask method uses this integer to give each client a RangePrimeFinder with a unique range in which to search for primes. The interact method merges the primes found by the calling client into the global set primesFoundSoFar. Then it signals this client to keep running by always returning new Boolean( false ), which sets the stopNow variable in the RangePrimeFinder through the set method.
getID assumes a name is passed as a command line argument to ITC. It then makes sure that a client connecting twice with the same name is given the same id. This could happen, for example, if the connection is lost and regained. This in turn ensures that ITC is given a RangePrimeFinder with the same range, yielding the same toString() as before. This means ITC will not stop and restart the task, but will simply submit all the results it has been working on while disconnected upon the next interaction.
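The name-to-ID behaviour can be isolated in a small sketch. ClientRegistry, idFor and describeRange are hypothetical names, the range size of 10000 is taken from the example server, and the description string mirrors the toString() format used by RangePrimeFinder.

```java
import java.util.ArrayList;
import java.util.List;

public class ClientRegistry {
    private final List<String> clientNames = new ArrayList<>();

    // Returns the existing ID for a known name, or assigns the next free one.
    public synchronized int idFor(String clientName) {
        int id = clientNames.indexOf(clientName);
        if (id < 0) {
            id = clientNames.size();
            clientNames.add(clientName);
        }
        return id;
    }

    // Describes the range a client with this ID would be asked to search.
    public static String describeRange(int id, int rangeSize) {
        int base = id * rangeSize;
        return "RangePrimeFinder with range = [" + base + ", " + (base + rangeSize) + ").";
    }

    public static void main(String[] args) {
        ClientRegistry registry = new ClientRegistry();
        int first = registry.idFor("Bob");
        registry.idFor("Alice");
        int again = registry.idFor("Bob"); // Bob reconnects
        System.out.println(first == again); // prints true
        System.out.println(describeRange(again, 10000));
        // prints RangePrimeFinder with range = [0, 10000).
    }
}
```

Since the reconnecting client gets the same ID, it also gets a task with the same description, which is what lets the client carry on with the work it already has.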
For ITC to find your server, it must be a Remote Object registered with the RMI registry. A simple wrapper class, distrit.server.SingleServerWrapper, has been written that will do this for you. For remote ITCs to find the classes they need to run your task code, you must have a webserver up and running at your codebase. For your server to register with the RMI registry you must run rmiregistry from the command prompt. IMPORTANT: you must do this from somewhere where it cannot find your class files, i.e. not from your codebase. Next you need a policy file giving your server the necessary rights. Save the following text as a file called java.policy:
grant
{
    permission java.net.SocketPermission "*:1024-65535", "connect,accept,resolve";
    permission java.net.SocketPermission "*:80", "connect";
    permission java.util.PropertyPermission "java.rmi.server.codebase", "read";
};
You may want to add other rights for your server, such as file system permissions, to this file as well.
Once you have done these things you can run your server like this:
java -Djava.rmi.server.codebase=http://SOURCEHOST:SOURCEPORT/ -Djava.rmi.server.hostname=HOST -Djava.security.policy=POLICYFILE distrit.server.SingleServerWrapper PrimeFinderServer BINDINGNAME
where SOURCEHOST and SOURCEPORT are the hostname and port of the webserver providing the class files, HOST is the hostname of the machine where rmiregistry was run and your server is running (probably the same as SOURCEHOST), POLICYFILE is the name of the Java policy file described above, and BINDINGNAME is an identifier for this server in the RMI registry.
Now, if the clients have the ITC client JAR file along with a sample client Java policy file (in which all hostnames should be replaced by SOURCEHOST and HOST above), we can run the client like this:
java -Djava.security.policy=java.policy -jar ITClient.jar HOST BINDINGNAME [PARAMETERS...]
where HOST and BINDINGNAME are the same as above and PARAMETERS are any parameters we want to send to the server's getID method, such as the name of the client.
To use ITC with another server, replace PrimeFinderServer in the server command above with any other class implementing the InteractiveTaskServer interface. Command line arguments can be passed to this class by giving it a constructor with a single ArrayList parameter. This ArrayList will contain all the Strings passed on the command line after the BINDINGNAME when launching the server.
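As a sketch of that constructor shape, the class below reads an optional range size from the first command line String and falls back to a default. ConfigurableServerSketch and its rangeSize field are hypothetical, and the InteractiveTaskServer methods are left out so the example compiles on its own. A real server would implement them as shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;

public class ConfigurableServerSketch {
    static final int DEFAULT_RANGE_SIZE = 10000;

    final int rangeSize;

    // Constructor shape described in the text: one ArrayList of command line Strings.
    public ConfigurableServerSketch(ArrayList<String> args) {
        int size = DEFAULT_RANGE_SIZE;
        if (args != null && !args.isEmpty()) {
            try {
                size = Integer.parseInt(args.get(0));
            } catch (NumberFormatException e) {
                // Not a number: keep the default range size
            }
        }
        rangeSize = size > 0 ? size : DEFAULT_RANGE_SIZE;
    }

    public static void main(String[] args) {
        ConfigurableServerSketch server =
            new ConfigurableServerSketch(new ArrayList<>(Arrays.asList(args)));
        System.out.println("Range size per client: " + server.rangeSize);
    }
}
```

Validating the arguments in the constructor means a mistyped value degrades to the default rather than stopping the server at start-up. Whether that is preferable to failing fast is a design choice.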
Note that any objects exchanged between the clients and the server over RMI must implement the java.io.Serializable interface.
Please do not hesitate to contact the project administrator for troubleshooting or advice on using DistrIT.