Next: 7 Testing the System Up: Getting Started with HAFTA Previous: 5 Simple Example

6 Overlord Script

Once we have a couple of programs we want to describe, we need to create the Overlord Runtime (olrt) script which describes the system. In this case we have a simple server and client, but the concept can be extended as far as is required to support your system.

6.1 Choosing modules to use

To keep this example fairly simple, we will only monitor the existence of the two processes. In other words, we will only react if a process dies for some reason, and will not worry about a process getting ``stuck'' or ending up in some other unexpected state.

For this example, we will use four modules. Remember that modules extend the functionality of the language in the same way that the C library adds to the C language.

The following modules will be used: exec, getpidbyname, log and logf1.

6.2 Examining the script

In the same way we include headers in C, we include the files which were generated when the modules were created.

// Prototype modules required by this script.

include "exec.include";
include "getpidbyname.include";
include "log.include";
include "logf1.include";

For the same reasons we want macros in C, there is a define operator which allows us to define constants.

Each process in the system can be polled to see whether it is still responding; you define what this means. In this case we want to check the system every 100 milliseconds.

define POLL_INTERVAL 100

The actual system definition starts with the keyword system, followed by a list of its dependencies. In this case, we only depend on ``client''.

system: client
{

Using the modules we have included, we want to declare global nebuloids which we will use in the various actions.

  // Create global instances of the modules required
  exec launch();
  getpidbyname getpid();
  log logerr(0, "test");
  logf1 logserverpid(0, "test", "server pid is %s");
  logf1 logclientpid(0, "test", "client pid is %s");

Without getting into a lot of detail about how the various modules work, you can see that we can initialize the various nebuloids with different parameters.

Some nebuloids are very generic. For example, launch() just takes the executable name and can be used to launch any executable. Others, like the log nebuloids, are initialized with the format string, which makes each nebuloid specific to a particular use.
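As a loose analogy only (this is Python, not olrt, and it says nothing about how the modules are really implemented), a nebuloid can be pictured as an object whose declaration arguments are fixed once and whose call arguments vary with each use. The class names and attribute names below are assumptions made for illustration; the meaning of the first two declaration arguments is not covered in this section, so the sketch simply stores them.

```python
class Exec:
    """Generic: nothing is fixed at declaration, so one instance serves any executable."""

    def __init__(self):
        self.launched = []

    def __call__(self, path):
        # A real module would start the program; this sketch only records the request.
        self.launched.append(path)


class Logf1:
    """Specific: the format string is fixed when the instance is declared."""

    def __init__(self, arg0, name, fmt):
        # arg0 and name mirror the first two declaration arguments in the script.
        # Their meaning is not described here, so they are stored untouched.
        self.arg0 = arg0
        self.name = name
        self.fmt = fmt
        self.lines = []

    def __call__(self, first_arg, value):
        # first_arg mirrors the 5 passed in the script; the single "%s" in the
        # format string is filled with the call-time value.
        self.lines.append((first_arg, self.fmt % value))


launch = Exec()
logserverpid = Logf1(0, "test", "server pid is %s")
logclientpid = Logf1(0, "test", "client pid is %s")

launch("./server")          # the same instance launches either program
launch("./client")
logserverpid(5, 1234)       # each logger is tied to its own message
logclientpid(5, 1235)
```

The point of the split is that launch carries no per-use state, while each logf1 instance bakes in one message and so needs one declaration per message.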

The next step is to define to the system how you want to monitor the process called ``client'' and what you want to do when starting, or restarting it.

  // check once per POLL_INTERVAL milliseconds to see if client is alive
  process  client POLL_INTERVAL: server
  {

This says we have a process in the system which we will call ``client''. This name does not have to be the same as the executable name, and if you had more than one client, you could call the first client1 and the second client2.

This line also says we want to check this process every POLL_INTERVAL milliseconds and that it depends on something called ``server''.

      // run this once to start it
      start
        {
          // check to see if it was started by someone else in which case we
          // just record the pid.
          client = getpid( "client" );
          if( client == 0)
            {
              launch("./client");
              logerr(5, "Starting client");
              client = getpid("client");
            }
          endif;
          logclientpid(5, client );
        }

This action, called the start action, is executed the first time the overlord checks client and the test fails.

Note: Think of client as a variable and it passes if it is non-zero.

The first thing we do is test to see if the process is already running. This is VERY important: if you don't do this, you will not be able to start or restart the overlord without it failing to work as you expect, i.e. it will perform your actions regardless of whether they have been done before. It is up to you, the script writer, to make sure your scripts work as you expect.

If the client is not running, i.e. getpid could not find a pid for ``client'', we then call launch to start it. We log this fact and then look up the pid again to see whether it actually did start. We don't test whether the launch failed, because the overlord will notice and call the restart action if it did.
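The check-before-launch pattern can be modelled outside olrt. The following Python sketch is an illustration under assumptions, not the overlord's implementation: start_action, fake_getpid, fake_launch and the pid 4321 are all hypothetical, and a dictionary stands in for the operating system's process table.

```python
def start_action(name, path, getpid, launch, log):
    """Model of the start action: adopt a running process, else launch it.

    getpid(name) returns 0 when no process is found, as in the script.
    """
    pid = getpid(name)
    if pid == 0:
        launch(path)
        log("Starting " + name)
        pid = getpid(name)  # may still be 0 if the launch failed
    return pid


# A fake process table stands in for the operating system.
table = {}
launched = []
messages = []


def fake_getpid(name):
    return table.get(name, 0)


def fake_launch(path):
    launched.append(path)
    table[path.split("/")[-1]] = 4321  # pretend the program started


# The first call launches the client; the second finds it and only records the pid.
first = start_action("client", "./client", fake_getpid, fake_launch, messages.append)
second = start_action("client", "./client", fake_getpid, fake_launch, messages.append)
```

Running the action twice launches the program once, which is the property that lets the overlord itself be stopped and started without disturbing processes that are already up.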

      restart
	{
	  client = getpid("client");
	  if( client == 0 )
	    {
	      launch("./client");
	      logerr(5, "Restarting client");
	      client = getpid("client");
	    }
	  else
	    {
	      logerr( 5, "No restart required");
	    }
	  endif;
	  logclientpid(5, client );
	}

In a similar way, this action is called when the overlord notices the process has failed after it has run the start action.
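How the overlord chooses between the two actions can be pictured with a small Python model. This is a sketch under assumptions: the class Watched and its attributes are hypothetical, and how the overlord detects that a running process has died is not described in this section, so the example simulates it by resetting the value to 0 by hand.

```python
class Watched:
    """Model of one monitored process; a value of 0 means the test fails."""

    def __init__(self, start, restart):
        self.value = 0        # like the variable client in the script
        self.started = False  # has the start action been run yet?
        self.start = start
        self.restart = restart

    def poll(self):
        """One check, as would happen once per poll interval."""
        if self.value != 0:
            return "ok"
        if not self.started:
            # First failed test: run the start action.
            self.started = True
            self.value = self.start()
            return "start"
        # Any later failed test: run the restart action.
        self.value = self.restart()
        return "restart"


client = Watched(start=lambda: 100, restart=lambda: 200)
events = [client.poll(), client.poll()]
client.value = 0              # simulate the process dying
events.append(client.poll())
```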

      dependfail
	{
	  logerr(5, "client needs server running");
	}
  }

This action is called if something you depend on fails. In this case, if the server fails, this action is called.
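The relationship between a failure and the dependfail actions it triggers can be sketched in Python. This is a model under assumptions, not the overlord's code: notify_dependents and the two dictionaries are hypothetical names, and the order in which dependents are visited is simply dictionary order.

```python
def notify_dependents(failed, depends_on, dependfail):
    """Call the dependfail action of every process that lists `failed` as a dependency."""
    called = []
    for name, deps in depends_on.items():
        # Processes without a dependfail action are skipped, since actions are optional.
        if failed in deps and name in dependfail:
            dependfail[name]()
            called.append(name)
    return called


log = []
depends_on = {"client": ["server"], "server": []}
dependfail = {"client": lambda: log.append("client needs server running")}

# The server fails: the client's dependfail action runs.
called = notify_dependents("server", depends_on, dependfail)

# The client fails: nothing depends on it, so no dependfail action runs.
none_called = notify_dependents("client", depends_on, dependfail)
```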

  process  server POLL_INTERVAL:
    {
      // run this once to start it
      start
	{
	  // check to see if it was started by someone else in which case we
	  // just record the pid.
	  server = getpid( "server" );
	  if( server == 0)
	    {
	      launch("./server");
	      logerr(5, "Starting server");
	      server = getpid("server");
	    }
	  endif;
	  logserverpid(5, server );
	}
      
      restart
	{
	  server = getpid("server");
	  if( server == 0 )
	    {
	      launch("./server");
	      logerr(5, "Restarting server");
	      server = getpid("server");
	    }
	  else
	    {
	      logerr( 5, "No restart required");
	    }
	  endif;
	  logserverpid(5, server );
	}
    }
}

This section describes the server process and is similar to the client description. The only real difference is that the server doesn't depend on anything, so a dependfail action would not make sense.

In either case the actions are optional, although it is hard to see why you would watch a process if you are not going to do anything when it fails.

6.3 Compiling the Script

The overlord runtime (olrt) is really a virtual machine (vm) which runs a pseudo code (pcode) we created for this purpose. In order to ``run'' the script you just created, you need to compile it. The compilation is a two step process: the compiler translates the script, and the assembler then turns the compiler's output into binary pcode.

The default installation includes precompiled binaries of the compiler, assembler and runtime. Obviously, as you modify the runtime you will want to use your own version.

The script from above is available in the same place as the other example code. You can copy it over and compile it as follows:

$ cd ~/hafta
$ cp /opt/hafta/examples/getstart/example.script .
$ olc -f example.script | ola -q
$

This produces a binary version of the pcode which the runtime uses. You will see a file called p.out, which was produced by the assembler.


2003-01-03