Re: [ptp-dev] Questions about PTP SDM debugger


On Aug 25, 2008, at 11:00 AM, Dave Wootton wrote:

Greg
Some additional questions
1) It looks like I don't pass the name of the application executable as a parameter on the top level SDM instance since the top level instance isn't
directly invoking the SDM instances required for individual tasks.

No, this isn't necessary. The debugger protocol supplies the executable name and the application arguments.


2) What are the invocation parameters of the individual SDM? I'm sort of guessing I need the hostname and port of the top SDM, the pathname of the application and any parameters the application requires. I'm guessing then
the individual SDM starts, starts a debugger instance and the debugger
instance starts the application instance.

The master sdm should be invoked as 'sdm --host=address --port=port --debugger=gdb-mi --numprocs=n', where address is the address of the machine running Eclipse and port is a port number assigned by PTP. The servers will be started with something like 'mpirun sdm --debugger=gdb-mi --numprocs=n'.
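
For illustration only, here is a minimal Python sketch of how those two command lines might be assembled. The flag names (--host, --port, --debugger, --numprocs) are the ones given above; the helper names, the example address and port, and the use of a bare 'mpirun' with no further options are assumptions, not what PTP actually generates.

```python
import shlex


def master_sdm_argv(host, port, numprocs, debugger="gdb-mi"):
    """Argument vector for the master sdm (hypothetical helper)."""
    return [
        "sdm",
        f"--host={host}",          # machine running Eclipse
        f"--port={port}",          # port number assigned by PTP
        f"--debugger={debugger}",
        f"--numprocs={numprocs}",
    ]


def server_sdm_argv(numprocs, debugger="gdb-mi", launcher=("mpirun",)):
    """Argument vector for the sdm servers (hypothetical helper).

    The launcher prefix is an assumption; a real launch would normally
    carry launcher-specific options as well.
    """
    return [*launcher, "sdm", f"--debugger={debugger}", f"--numprocs={numprocs}"]


# Example values are placeholders, not real PTP-assigned values.
print(shlex.join(master_sdm_argv("192.0.2.10", 12345, 4)))
print(shlex.join(server_sdm_argv(4)))
```

Note that the servers are not given the host or port; only the master is.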



3) Is the routing file on a node a list of all tasks in the application or
only the tasks running on that node?

A list of all tasks.


4) How does the routing file get loaded onto each individual node?

At the moment it is assumed there is a shared filesystem. This requirement will be removed in a later version, and the sdms themselves will be used to propagate the routing file.

5) How does each individual SDM know how to connect back to the top SDM if
the top SDM host/port is not a parameter?

Connections propagate up the tree (starting from the master). Each sdm knows the index of its children (computed as a binomial tree) so it just attempts to connect to its children using the address/port obtained from the routing file.
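
To make the binomial tree idea concrete, here is a rough Python sketch of how a node could work out its children from nothing but its own index and the total number of sdms. The exact numbering the sdm uses is not spelled out here, so this is an assumption: it uses the common layout in which the master is index 0 and a node's children are found by adding each power of two that is larger than the node's own index. The function names are hypothetical.

```python
def binomial_children(index, size):
    """Children of `index` in a binomial tree over indices 0..size-1.

    Assumed layout: root is 0, and the children of a node are
    index + 2**k for every power of two greater than index,
    as long as the result is still a valid index.
    """
    step = 1
    while step <= index:        # find the first power of two above index
        step <<= 1
    children = []
    while index + step < size:
        children.append(index + step)
        step <<= 1
    return children


def binomial_parent(index):
    """Parent of `index` under the same assumed layout (None for the root)."""
    if index == 0:
        return None
    # Clearing the highest set bit undoes the "add a power of two" step.
    return index - (1 << (index.bit_length() - 1))


# Show which children each node would try to connect to for 8 sdms.
for i in range(8):
    print(i, binomial_children(i, 8))
```

With a scheme like this, each sdm only needs the routing file to turn the computed child indices into address/port pairs; it never needs to be told about the master explicitly.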


6) If the individual SDM is passed the host/port it uses to connect to the
top SDM, how do I find out what that top level SDM port is?

There is no easy way to do this at the moment, since it is generated internally and passed to the submitJob command as an argument. The easiest way would be to print out the arguments to the submitJob command either in the Java side of the RM or in your proxy.



I think I understand how this is supposed to work, and it seems reasonable for the case where the user specifies a host list file. In the case where
we use LoadLeveler to allocate nodes, I'm not sure how this will work
since we have no way of knowing what nodes are allocated until the poe job
(the SDMs) starts.

The SDMs do nothing until they get the routing file. Would it be possible to launch the SDMs, get the node information from LL, then create the routing file? This is how the new OMPI RM works.

Greg

