HP-PVM operation

Start-up procedures - Tasks, Daemons and Remote shell daemons

There are three types of program used in an HP-PVM application.
rshd
This program sits on each Windows computer and allows daemon programs to be started remotely. Under Unix the standard remote shell daemon or rexec daemon mechanisms are used instead.
daemon
A daemon runs on each computer while the system is running, and forms the backbone of the communication system.
application
The application itself runs on each computer. These are often categorised as master or slave tasks.
The executing HP-PVM application consists of a master task, started by the user, and a group of slave tasks spawned on remote computers at the request of the master task.
The PVM slaves to be spawned in the application can be placed anywhere in the path. Be careful to ensure that there are not multiple versions of slaves in different directories. This is a common cause of confusing software bugs.

While the HP-PVM application is running, a daemon process runs on each computer to allow remote tasks to be started and to co-ordinate communications. When a PVM application starts, if the daemons are not already running, they will be started automatically by the application. The daemons can be run manually by simply running the 'daemon' program.
The master daemon starts on your computer and then creates a daemon on each of the remote computers listed in the mc_list file. It does this by communicating with the remote shell daemon running on each remote computer. Under Windows it is important that rshd.exe is running before the application is started. You may want to add it to the Windows 'Startup' group.
The mc_list and config files are read from each computer by the daemon as it starts up. Changes to these files will not take effect until the daemon process is restarted.

The mc_list file
The machine list file (mc_list) contains the names of the computers to be used in the HP-PVM system. Only the mc_list on the computer which starts up the daemon network is used. It can be edited prior to each run, before starting the daemons.
Sample mc_list file:
stone
woodwind
stone
stone
parasys.co.uk

Full domain names can be used. If any machine name appears several times in the mc_list, that computer will have several daemons run on it. It will effectively emulate a number of different computers. This allows you to test applications and HP-PVM on larger networks than are actually available.
Communications between tasks on the same computer would normally make use of fast shared memory communications. Tasks running off different daemons on the same computer will not, as that would not realistically emulate the TCP/IP communications that would be used between different computers. Because of this performance penalty, you should generally avoid using repeated names in the mc_list file as a quick and dirty way of making applications that were designed to spawn only one task per computer work with multi-processor computers.
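As a small illustration of the repeated-name rule, the following Python sketch counts how many daemons each host would run for the sample mc_list shown above. It is not part of HP-PVM. The function name daemons_per_host is hypothetical, and the sketch assumes that the host name is the first whitespace-separated field on each line.

```python
from collections import Counter

def daemons_per_host(mc_list_text):
    """Count how many times each host name appears in an mc_list.

    Assumptions: one host per line, the host name is the first field,
    and blank lines are ignored.
    """
    hosts = [line.split()[0] for line in mc_list_text.splitlines() if line.strip()]
    return Counter(hosts)

# The sample mc_list from this section.
sample = """stone
woodwind
stone
stone
parasys.co.uk
"""

counts = daemons_per_host(sample)
for host, n in counts.items():
    print(host, n)
```

For the sample file, stone is listed three times and so would run three daemons, while woodwind and parasys.co.uk would run one each.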
There are several options that can be included on the line, after the slave host name.
       dx=path            Full path of the directory which contains the daemon executable
       lo=login name      Login name to use on the remote computer, if different from the local name
       pw                 Ask for a password, rather than relying on the .rhosts mechanism on remote Unix machines
Example:
phoenix dx=/home/fred/hppvm lo=fred pw
If you use the dx=path option in the mc_list file, then the config file option 'exe_path' will usually be needed as well. This specifies the path the remote daemon should search when trying to spawn application tasks.
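The line format described above can be read with a few lines of Python. This is only a sketch of the documented syntax, not HP-PVM's own parser. The function name parse_mc_list_line is hypothetical, and the sketch assumes that fields are separated by whitespace, so a path containing spaces would not be handled.

```python
def parse_mc_list_line(line):
    """Split one mc_list line into (host, options).

    Options of the form key=value are stored as strings; a bare
    flag such as 'pw' is stored as True.
    """
    fields = line.split()
    host, options = fields[0], {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        options[key] = value if sep else True
    return host, options

host, options = parse_mc_list_line("phoenix dx=/home/fred/hppvm lo=fred pw")
print(host, options)
```

A line with no options, such as "stone", yields the host name and an empty option set.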
The daemons will close down automatically after a few minutes of inactivity, regardless of how they were started.

Config file options

speed = 1000
Each computer is given a speed setting which describes the relative speed of that computer. This information can be used by the system to decide which computers to spawn slave tasks on.

procs = 2
The number of processors on the computer. This information can be used by the system to decide which computer to spawn slave tasks on, and the information is available to the application through the extended pvm_config command (pvme_config_ex).

error_level = all
This determines the severity of errors that will be displayed in the HP-PVM task debug windows.
no - no errors at all
env - errors in the set-up of the environment, such as PATH problems
fatal - fatal errors which cause HP-PVM to close down
warning - all warnings, except problems like sending data to dead tasks, which can reasonably occur
all - all errors

verify = no | yes
Turn on communications verification. This will cause the system to verify all communications between HP-PVM tasks, using check-sums and message sequence numbers. If HP-PVM seems to be corrupting messages in your application, try running it with this option set to 'yes'. This will slow down communications.
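To show the general idea of verifying messages with check-sums and sequence numbers, here is a minimal Python sketch. It does not describe HP-PVM's actual wire format, which is not documented here. The 8-byte header layout, the use of CRC-32 as the check-sum, and the function names wrap and unwrap are all assumptions made for illustration.

```python
import struct
import zlib

def wrap(seq, payload):
    """Prefix the payload with a sequence number and a CRC-32 of the payload."""
    return struct.pack("!II", seq, zlib.crc32(payload)) + payload

def unwrap(expected_seq, frame):
    """Return the payload, or raise ValueError if the frame is out of
    sequence or its contents do not match the stored check-sum."""
    seq, checksum = struct.unpack("!II", frame[:8])
    payload = frame[8:]
    if seq != expected_seq:
        raise ValueError("unexpected sequence number")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload

frame = wrap(7, b"hello")
print(unwrap(7, frame))
```

The extra header and the check-sum calculation on both sides are the kind of overhead that makes a verified run slower than a normal one.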

debug_pvm_commands = no | yes
Turn on HP-PVM command debugging. A list of the HP-PVM commands called by your application will be written to the tmp directory, to help in debugging applications.

debug_pvm_data = no | yes
Dump all data packed and unpacked. This can be used instead of the verify mode to manually verify that the correct data is being unpacked from received messages.

shmem = no | yes
Controls whether shared memory is used for communications between tasks on the same computer. This will usually allow significantly faster communications. Shared memory should only be created once a communication between two slaves on the same computer occurs.

shmem_total =    2000000
shmem_link_size = 30000
These options control the total amount of shared memory that the system should use, and the amount used for each uni-directional link between two tasks.
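As a rough worked example, the Python sketch below estimates how many links fit in the shared memory pool with the sample values. It assumes that every link takes its buffer from the shmem_total pool and that nothing else is stored there; the manual does not confirm either point, so treat the result as an upper bound at best.

```python
shmem_total = 2000000      # value from the sample config above
shmem_link_size = 30000    # value from the sample config above

# Assumption: each uni-directional link takes one buffer from the pool.
max_links = shmem_total // shmem_link_size

# Two tasks exchanging data in both directions need two uni-directional links.
max_bidirectional_pairs = max_links // 2

print(max_links, max_bidirectional_pairs)
```

Under these assumptions the sample values allow 66 uni-directional links, or 33 pairs of tasks talking in both directions.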

auto_socket_port = yes
Allow HP-PVM to choose inet port numbers automatically. There is no reason to switch this off.

tmp_file_buffering = no | full | line
This option controls the stream buffering of output by HP-PVM into the log files in the tmp directory. Turning on buffering will increase performance, but an abrupt failure of an application may leave the files in an incomplete state.

machine = machine_name_1
machine = machine_name_2
This option allows you to list the computers that will be allowed to act as a master, using your computer as a slave. If you are connected to the internet this is important for security. You can set machine to "all" to allow any computer to make use of it. If no machine list is given, no other computer will be allowed to use it.
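The access rule described above can be summarised in a short Python sketch. This is an illustration of the rule only, not HP-PVM code. The function name master_allowed is hypothetical, and the exact, case-sensitive name comparison is an assumption.

```python
def master_allowed(master_name, machine_entries):
    """Apply the 'machine' rule from the config file.

    An empty list allows no other computer, an entry of 'all' allows
    any computer, and otherwise the master must be listed by name.
    """
    if "all" in machine_entries:
        return True
    return master_name in machine_entries

print(master_allowed("machine_name_1", ["machine_name_1", "machine_name_2"]))
print(master_allowed("stranger", []))
```

The first call reports that the listed machine may act as a master, and the second shows that an empty list refuses everyone.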

slave_window = no | yes | popup
This option controls the use of debug windows for slave processes, to display errors or warnings. 'no' will prevent windows being created. 'yes' will allow them to be created but they will remain minimised. 'popup' will cause them to appear as normal windows. In any case, the windows will not appear until an error or warning occurs.

exe_path = /home/fred/apps1:/home/fred/app2
This option specifies a colon (or semi-colon) separated list of additional paths to be searched for executables started through pvm_spawn. For Bourne shells, where a .cshrc cannot be used to set up the PATH, this is necessary for PVM to be able to find your executables.
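To show how the options in this section fit together, the Python sketch below reads a small config in the "key = value" form used above. It is not HP-PVM's own reader. The function name parse_config is hypothetical, and the sketch assumes one option per line, with repeated machine lines collected into a list. All values are kept as strings.

```python
def parse_config(text):
    """Parse 'key = value' lines; repeated 'machine' keys become a list."""
    config = {"machine": []}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and lines without '='
        key, value = key.strip(), value.strip()
        if key == "machine":
            config["machine"].append(value)
        else:
            config[key] = value
    return config

# Option names and values taken from the examples in this section.
sample = """speed = 1000
procs = 2
error_level = all
verify = no
shmem = yes
machine = machine_name_1
machine = machine_name_2
exe_path = /home/fred/apps1:/home/fred/app2
"""

config = parse_config(sample)
search_dirs = config["exe_path"].split(":")  # Unix-style, colon-separated list
print(config["machine"], search_dirs)
```

Splitting exe_path on a colon suits Unix paths only. A semi-colon separated list would need a different split, since a path that begins with a drive letter contains a colon of its own.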

Users and Daemons
Under Windows, once a network of daemons is running on a set of computers, an HP-PVM application started on one of those computers will use that existing network of daemons. Many applications run by many different users can all use the same daemon network.
Under Unix, each user has a separate and isolated network of daemons. This is because of the Unix permissions system. The daemons must be owned by you if they are going to spawn your slave programs. If a user runs several applications they will share the same network of daemons.
An application can insist that it has its own unique network of daemons by calling the pvme_uniq_daemon() function before it connects into HP-PVM by running pvm_mytid, or any other main PVM commands (most PVM commands will run pvm_mytid to connect into HP-PVM if you have not already done so).

Using unique daemon networks has several advantages.

Each daemon can only support a limited number of tasks on a computer. Using unique daemons, this becomes an application-wide rather than a machine-wide limitation.
Any reliability problem that occurs in a daemon could bring down all of the tasks running from it, so with shared daemons a rogue application which manages to damage the daemon can affect other applications.
Any intermittent problems may be easier to reproduce using unique daemons, since the communications patterns of the application will not be affected by the communications activities of other applications sharing the daemons.

Tmp files
Temporary files containing error information and debug information appear under Windows in c:\tmp\pvm\*, and under Unix in /tmp/pvm_<user_name>/*
A default directory ‘out’ is used to store the files. If your application requests unique daemons, then it will use a different directory name. The file list might look like this.

149.pid 152.pid 59.pid
‘pid’ files are created by the application tasks as they start up, in case they break before they get around to opening their main output file, the u*.txt file. They can be used by HP-PVM for debug information but will generally be empty.

d_out.0
d_out.* files are the daemon output files. These will contain error and warning messages, and any debug information generated by the HP-PVM daemon.

startup
An initial daemon start-up file, normally empty.

u1.txt u2.txt u3.txt
Output files, named u<tid>.txt. Any HP-PVM debug output, and any errors, warnings or debug information requested through the config file debug options, will be generated in these files. An application can make use of these files through the file handle obtained from the pvme_tmp_file function.

0.err 1.err 2.err 3.err
Errors and warnings will be produced in these files, for daemons and tasks, as they appear in the d_out.* or u*.txt files, but without the surrounding debug information that occurs in the .txt files.
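When looking through a tmp directory, it can help to sort the files by the naming scheme described above. The Python sketch below does this. It is an illustration only: the function name classify and the category labels are hypothetical, and the patterns assume that the variable parts of the names are always numeric, as they are in the sample listing.

```python
import re

# Patterns follow the sample file names in this section.
PATTERNS = [
    (re.compile(r"^\d+\.pid$"), "task start-up file"),
    (re.compile(r"^d_out\.\d+$"), "daemon output"),
    (re.compile(r"^startup$"), "daemon start-up file"),
    (re.compile(r"^u\d+\.txt$"), "task output"),
    (re.compile(r"^\d+\.err$"), "errors and warnings only"),
]

def classify(name):
    """Return a description of a tmp file based on its name."""
    for pattern, kind in PATTERNS:
        if pattern.match(name):
            return kind
    return "unknown"

for name in ["149.pid", "d_out.0", "startup", "u1.txt", "0.err"]:
    print(name, "->", classify(name))
```

When tracking down a problem, the .err files are usually the quickest place to start, since they hold the errors and warnings without the surrounding debug output.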