Re: [ptp-dev] Questions about PTP SDM debugger
On Aug 25, 2008, at 11:00 AM, Dave Wootton wrote:
Greg
Some additional questions
1) It looks like I don't pass the name of the application executable as
a parameter on the top level SDM instance since the top level instance
isn't directly invoking the SDM instances required for individual tasks.
No, this isn't necessary. The debugger protocol supplies the executable
name and the application arguments.
2) What are the invocation parameters of the individual SDM? I'm sort of
guessing I need the hostname and port of the top SDM, the pathname of
the application and any parameters the application requires. I'm
guessing then the individual SDM starts, starts a debugger instance and
the debugger instance starts the application instance.
The master sdm should be invoked as
'sdm --host=address --port=port --debugger=gdb-mi --numprocs=n', where
address is the address of the machine running Eclipse and port is a
port number assigned by PTP. The servers will be started with something
like 'mpirun sdm --debugger=gdb-mi --numprocs=n'.
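To make the two invocations concrete, here is a minimal Python sketch that only builds the argument lists described above. The function names, the example address and port, and the choice of `mpirun` as launcher are illustrative assumptions; the server flag is assumed to be spelled `--debugger`, as for the master.

```python
def master_command(address, port, numprocs, debugger="gdb-mi"):
    """Argument list for the master sdm (hypothetical helper).

    address/port identify the machine running Eclipse and the port
    assigned by PTP, as described in the message.
    """
    return [
        "sdm",
        f"--host={address}",
        f"--port={port}",
        f"--debugger={debugger}",
        f"--numprocs={numprocs}",
    ]


def server_command(numprocs, debugger="gdb-mi", launcher="mpirun"):
    """Argument list for the sdm servers (hypothetical helper).

    The launcher is assumed to be mpirun; other resource managers
    would substitute their own launch command.
    """
    return [launcher, "sdm", f"--debugger={debugger}", f"--numprocs={numprocs}"]


if __name__ == "__main__":
    # Example values only; the real port is generated by PTP.
    print(" ".join(master_command("192.168.0.10", 4000, 4)))
    print(" ".join(server_command(4)))
```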
3) Is the routing file on a node a list of all tasks in the application
or only the tasks running on that node?
A list of all tasks.
4) How does the routing file get loaded onto each individual node?
At the moment it is assumed there is a shared filesystem. This
requirement will be removed in a later version, and the sdms
themselves will be used to propagate the routing file.
5) How does each individual SDM know how to connect back to the top SDM
if the top SDM host/port is not a parameter?
Connections propagate up the tree (starting from the master). Each sdm
knows the index of its children (computed as a binomial tree) so it
just attempts to connect to its children using the address/port
obtained from the routing file.
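As an illustration of how each sdm could derive its children from its own index, here is a small Python sketch of one common binomial-tree numbering. The message does not spell out the exact scheme the sdm uses, so the indexing below (root at 0, children at index + 2^k) is an assumption, and the function names are hypothetical.

```python
def children(index, numprocs):
    """Children of `index` in a binomial tree over 0..numprocs-1.

    Assumed convention: the root is 0 and a node's children are
    index + 2**k for every power of two larger than index.
    """
    step = 1
    while step <= index:
        step <<= 1  # smallest power of two strictly greater than index
    kids = []
    while index + step < numprocs:
        kids.append(index + step)
        step <<= 1
    return kids


def parent(index):
    """Parent of `index` under the same convention (None for the root)."""
    if index == 0:
        return None
    # Clear the highest set bit.
    return index & ~(1 << (index.bit_length() - 1))


if __name__ == "__main__":
    for i in range(8):
        print(i, children(i, 8))
```

Under this convention every non-root index appears as the child of exactly one node, so connecting parent to child along these edges reaches every sdm from the master.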
6) If the individual SDM is passed the host/port that it connects to the
top SDM, how do I find out what that top level SDM port is?
There is no easy way to do this at the moment, since it is generated
internally and passed to the submitJob command as an argument. The
easiest way would be to print out the arguments to the submitJob
command, either on the Java side of the RM or in your proxy.
I think I understand how this is supposed to work, and it seems
reasonable for the case where the user specifies a host list file. In
the case where we use LoadLeveler to allocate nodes, I'm not sure how
this will work since we have no way of knowing what nodes are allocated
until the poe job (the SDMs) starts.
The SDMs do nothing until they get the routing file. Would it be
possible to launch the SDMs, get the node information from LL, then
create the routing file? This is how the new OMPI RM works.
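The "launch first, write the routing file afterwards" order could look roughly like the Python sketch below. The routing file layout used here (an entry count followed by one "index host port" line per task) is a placeholder assumption, not the documented sdm format, and the host names and ports are made up; in practice the node list would come from LoadLeveler once the job is running.

```python
def format_routing_file(nodes):
    """Render a routing table covering all tasks.

    nodes: list of (hostname, port) tuples in task order.
    The layout is hypothetical: first the number of entries,
    then one 'index host port' line per task.
    """
    lines = [str(len(nodes))]
    for index, (host, port) in enumerate(nodes):
        lines.append(f"{index} {host} {port}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example allocation; real values would be queried after the
    # SDMs have been launched and the nodes are known.
    allocated = [("node01", 5000), ("node01", 5001), ("node02", 5000)]
    print(format_routing_file(allocated), end="")
```

Because the SDMs do nothing until the routing file exists, writing it as the last step keeps the startup order safe.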
Greg