Re: [ptp-dev] Questions about PTP SDM debugger

From: Greg Watson <g.watson@xxxxxxxxxxxx>
Date: Fri, 29 Aug 2008 09:21:37 -0400

Dave,

I think connecting to random applications could be solved by having each SDM perform a simple handshake protocol. If it doesn't get the correct response, it can immediately drop the connection and try again. Since someone could spoof this, the handshake could also include some kind of authentication so that both the server and client SDM could be validated.
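
As an illustration only, a mutually authenticated handshake of this kind could be sketched as an HMAC challenge-response. Everything here is an assumption made for the example and not part of the SDM: the shared per-session `secret`, the function names, the nonce length, and the use of Python.

```python
import hashlib
import hmac
import os
import socket

NONCE_LEN = 16
MAC_LEN = hashlib.sha256().digest_size  # 32 bytes


def _recv_exact(sock, n):
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed during handshake")
        buf += chunk
    return buf


def _mac(secret, role, parent_nonce, child_nonce):
    """HMAC over the role and both nonces, keyed by the shared secret."""
    msg = role + parent_nonce + child_nonce
    return hmac.new(secret, msg, hashlib.sha256).digest()


def handshake_parent(sock, secret):
    """Connecting side: challenge the peer, then prove our own identity."""
    my_nonce = os.urandom(NONCE_LEN)
    sock.sendall(my_nonce)
    reply = _recv_exact(sock, NONCE_LEN + MAC_LEN)
    peer_nonce, peer_mac = reply[:NONCE_LEN], reply[NONCE_LEN:]
    expected = _mac(secret, b"child", my_nonce, peer_nonce)
    if not hmac.compare_digest(peer_mac, expected):
        return False  # not one of our SDMs: caller drops the connection
    sock.sendall(_mac(secret, b"parent", my_nonce, peer_nonce))
    return True


def handshake_child(sock, secret):
    """Listening side: answer the challenge, then verify the parent."""
    peer_nonce = _recv_exact(sock, NONCE_LEN)
    my_nonce = os.urandom(NONCE_LEN)
    sock.sendall(my_nonce + _mac(secret, b"child", peer_nonce, my_nonce))
    try:
        peer_mac = _recv_exact(sock, MAC_LEN)
    except ConnectionError:
        return False  # parent rejected us and hung up
    expected = _mac(secret, b"parent", peer_nonce, my_nonce)
    return hmac.compare_digest(peer_mac, expected)
```

If either side returns False, the caller would close the socket and try the next candidate port. How the secret reaches each SDM is left open here.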

Startup completion/error would be signaled by an event packet that is propagated down the tree (to the master). This would be triggered by a leaf SDM (i.e. one with no children) or failure to connect to a child.
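
A minimal sketch of that propagation, modelled as a plain in-memory tree instead of real SDM processes. The `Node` type, the `reachable` flag and the event tuples are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rank: int
    children: List["Node"] = field(default_factory=list)
    reachable: bool = True  # False models a failed connect to this SDM


def startup_event(node):
    """Return the event this SDM would forward toward the master.

    ("ok", n) means all n SDMs in this subtree are connected;
    ("error", rank) names the first child that could not be reached.
    A leaf has no children, so it reports ("ok", 1) immediately.
    """
    count = 1
    for child in node.children:
        if not child.reachable:
            return ("error", child.rank)
        kind, value = startup_event(child)
        if kind == "error":
            return (kind, value)
        count += value
    return ("ok", count)
```

The master would treat the tree as built once the event for its own subtree is ("ok", total number of SDMs).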

Another approach that might be worth considering would be to generate some number of random ports (say 10) and pass these on the command line to the SDMs (e.g. --ports=3234,43534,4547,6900...). Each SDM would then try binding to each port in turn until it is successful (a failing process could exit, which would terminate the poe/mpirun command). When connecting to its children, the SDM could try each port number in turn until it was successful, or continue to cycle if not. Attempting to connect to an unbound port is a fast operation, so the overall delay should be minimal. A handshake would still need to be used in case a port was in use by another application.
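
The port-list idea could look roughly like the following. This is a sketch under assumptions: the helper names are invented for the example, the host defaults to localhost, and the handshake that must follow a successful connect is omitted.

```python
import socket


def parse_ports(arg):
    """Parse an argument of the form --ports=3234,43534,4547 into a list."""
    _, _, value = arg.partition("=")
    return [int(p) for p in value.split(",") if p]


def bind_first_free(ports, host="127.0.0.1"):
    """Child side: bind to the first port in the list that is free."""
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port taken, try the next one
            continue
        sock.listen()
        return sock, port
    return None, None  # nothing free: the process could exit here


def connect_first_listening(ports, host="127.0.0.1", timeout=1.0):
    """Parent side: one pass over the list, returning the first connection."""
    for port in ports:
        try:
            conn = socket.create_connection((host, port), timeout=timeout)
            return conn, port
        except OSError:
            continue  # nothing listening there (or timed out)
    return None, None  # caller may sleep briefly and cycle again
```

A successful connect only shows that something is listening on that port, which is why the handshake is still needed before the connection is trusted.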


Greg

On Aug 28, 2008, at 2:08 PM, Dave Wootton wrote:

I also prefer a base + mod(rank)/n approach. My problem with the random
port selection with retry is that you introduce delays in the startup
process as you build the tree. Depending on tree depth, this could
cause debugger startup to be slow. I don't think 65536 is the correct
'n' since you then end up scattering PTP ports across the entire user
port range, and also, as Daniel points out, because of the multiple
debugger instances case. I was thinking 'n' might be 256 or maybe 512
since it's not very likely a user would ever run that many tasks on a
node. There's still the slight possibility of port collisions because
of other applications, so I'd 'reserve' a few more ports above base+n
to use for collisions.
I'm not entirely sure why the parent SDM which is trying to build the
tree downwards needs to use a random port connection approach. I'm
also afraid that if you make 'n' 65536, you will accidentally connect
to a random port from some other application and either get yourself
hung because whatever you connected to doesn't understand your
protocol, or worse, you hang or crash the other application.
You can at least partially solve this by telling the SDMs what 'base'
and 'n' (if not hardcoded) are when you start the SDMs. Each task picks
its port number using the base + mod(rank)/n calculation. The proxy
also knows what 'base' and 'n' are, as well as task rank, so as it
builds the routing_file it can fill in the port number for each SDM.
Then as SDMs build the tree, they try to connect to that port.
The question is what do you do if you have a collision on a port? The
parent SDM can try connecting to ports in the spare port range until it
connects to the proper SDM. You still have the problem of connecting to
random applications. Also, what's the timeout when you try to connect
to a port where nothing is listening? Could it be long enough to make
startup time a problem?

Your handshake to validate that a legitimate SDM is connecting is
important, especially since whatever connects (wrong user's SDM,
malicious user) to the SDM could get control of the debugger on some
tasks in the application. If you were to build the tree bottom-up
instead of top-down, then as long as an invalid connection could do no
worse than send bad data upstream to the GUI and not grab control of
the application, the risk is less.
I'm not sure I'd rely on users to pick 'base'. If two users pick the
same base, or one that causes overlap of port numbers, then you still
have a problem.
A simplistic way to pick base would be a calculation based on the
user's uid number (uid # * something mod(256)?). The only other way I
know how to solve the problem is what we did in DPCL, where we had what
we called a 'super daemon' that ran as root and handled the issue of
starting multiple instances of DPCL daemons, but that gets a little
complicated.
How do you tell when the tree is built? Is this by each parent SDM
keeping track of how many child SDMs it started and counting responses
then reporting 'done' up the tree?
Dave



On 08/28/2008 01:18 PM, Daniel Felix Ferber <dfferber@xxxxxxxxxxxxxxxxxx> wrote:

Greg and Dave,

I think that Greg's suggestion to launch SDM is reasonable. But are we
considering race conditions? I am afraid that this approach might
present several failure patterns depending on how long each SDM takes
to start.

For example: The servers and the master are started at nearly the same
time. All servers bind to a port as you described. The master receives
the routing file and starts connecting to children, which in turn
connect to grandchildren and so on. What happens if a child is slow to
start up for some reason? Its parent will try to connect (but the
child will not be listening yet) and the parent will try the next
ports, but will never retry the port that the child is actually
listening on. I saw this happening, and that is the reason why the
launcher currently starts the master after the servers, instead of the
opposite order described in the specification. I think other race
conditions might be possible.

There is another issue in the strategy to launch the SDM master. After
starting the SDM master, the launcher starts listening on a socket
where the SDM master is expected to connect. The port number is passed
as a parameter to the SDM master. However, it may happen that the SDM
master starts before the launcher has created the socket. The SDM
master will try to connect, and on failure try the next ports. This
approach does not make sense in this situation, since the port number
passed to the SDM master is guaranteed to be the port where the
launcher is listening. Therefore, the SDM master should not try the
next ports, but try the same port again.
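
Retrying the same port could be sketched like this. The function name, the attempt count and the delays are placeholder values chosen for the example, not SDM settings.

```python
import socket
import time


def connect_with_retry(host, port, attempts=20, delay=0.1, timeout=1.0):
    """Keep trying one known port until the listener appears.

    Returns a connected socket, or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            # Listener not up yet (or connect timed out): wait, then
            # try the same port again.
            if attempt + 1 < attempts:
                time.sleep(delay)
    return None
```

The same loop would cover the case of a slow child: the parent keeps retrying the port it was told about for a bounded time before declaring an error.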
Another concern: Does the handshake consider the job ID? There could
be a scenario where two users start debug sessions on the same machine
at the same time. Then, one might connect to the SDM server of the
other, by accident, if they are listening on the same port range.

I agree that using a base port number is better than using a random
number for each process. I think it is enough that the base port
number is pseudo-random. I would avoid using a fixed port number
because that would potentially cause port number collisions between
two simultaneous debugging sessions. I understand that SDM servers
will know how to handle this collision, but starting the SDM servers
will take more time. By choosing the base port randomly, we reduce the
probability of causing collisions.
My comments about who should write the hostfile: I see Dave's
concerns. I really did not consider that the amount of data to be
transmitted would become that large.
Couldn't we establish a standard file format to be used for all
debuggers? Then the file could be written by the proxy, regardless of
which debugger is being used. I don't have a really good idea for this
issue yet.

Best regards,
Daniel Felix Ferber


Greg Watson wrote:
Good, I'm glad we're in agreement :-). Daniel, do you have any
comments on this?

Regarding the port numbers, this is not how I had intended the
debugger startup to work, so I want to change this at some point. My
approach is as follows, but any other suggestions would be welcome.

1. The SDM servers are given a "base" port number. At startup, they
attempt to bind to this port. If that fails, they try to bind to
base_port+1 after waiting a short random period (this is to avoid
servers started on the same node from chasing each other up the port
numbers). An alternative to this would be to bind to
((base_port+rank)%65536)+1024. A third alternative would be to use a
pseudo-random number generator seeded by the rank.

2. When the SDM master receives the routing file, it can then
determine the location of its children, so it attempts to connect to
each in turn using the same port generation mechanism as in #1.

3. Once the connection is established to the server, a handshake is
used to swap credentials, etc., then the routing file is sent. The
routing file could be successively pruned as it propagates up the tree
to reduce bandwidth.

4. Once the server receives the routing file, it does the same as #2.

5. This continues until all connections have been established, or
there was a timeout or some other error.
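
The three port generation options from step 1 can be sketched as follows, so that a server and the parent connecting to it derive the same candidates. The function names are invented for the example, and the random wait between sequential attempts is left out. Note that the rank-offset expression as written can exceed 65535 for some inputs (for example base_port 65000 with rank 535 gives 66559), so a real implementation would presumably wrap it into the valid port range.

```python
import random
from itertools import islice


def sequential_ports(base_port):
    """Option 1: base_port, base_port+1, ... up to the last valid port."""
    port = base_port
    while port <= 65535:
        yield port
        port += 1


def rank_offset_port(base_port, rank):
    """Option 2: the expression exactly as written in the message."""
    return ((base_port + rank) % 65536) + 1024


def seeded_ports(rank, low=1024, high=65535):
    """Option 3: a PRNG seeded by rank, so both ends get the same sequence."""
    rng = random.Random(rank)
    while True:
        yield rng.randint(low, high)


# Example: first three candidates a server with rank 7 would try.
first_three = list(islice(seeded_ports(7), 3))
```

With option 3, the parent recreates `seeded_ports(rank)` for each child and walks the same sequence the child used when binding.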

Greg

On Aug 28, 2008, at 8:46 AM, Dave Wootton wrote:
Greg
I think the proxy should be responsible for building the routing file,
in order to keep the traffic on the connection between the GUI and the
proxy down. With the current approach, you are sending node
information across the connection twice, once to populate the PTP
runtime model, then a second time to create the routing file on the
nodes where the SDMs are running. I'm not sure what the message length
for the messages from the proxy to the GUI is, but for the remote_file
you have strlen(task_index) + strlen(hostname) + strlen(port_number) +
3 bytes per node. In my case that's close to 20 bytes per task,
minimum. With large numbers of tasks, this could be a lot of data, and
since all of these interactions between the GUI, the proxy, and the
SDMs are a serial process, they slow down debugger startup.
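
As a rough illustration of that size estimate, the per-entry formula can be evaluated directly. The hostnames and port numbers used with it are made up, and the 3 extra bytes are taken from the formula as given without assuming what they encode.

```python
def routing_entry_bytes(task_index, hostname, port):
    """strlen(task_index) + strlen(hostname) + strlen(port_number) + 3."""
    return len(str(task_index)) + len(hostname) + len(str(port)) + 3


def routing_file_bytes(entries):
    """Total size for an iterable of (task_index, hostname, port) tuples."""
    return sum(routing_entry_bytes(*entry) for entry in entries)


# Hypothetical job: 1000 tasks, 8-character hostnames, 5-digit ports.
entries = [(i, "node0001", 20000 + i) for i in range(1000)]
total = routing_file_bytes(entries)  # 18890 bytes, about 19 per task
```

At this rate a job with 100,000 tasks would need on the order of 2 MB for the routing data, which is sent in addition to the node information already transferred for the runtime model.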
The down side to this is the need for each proxy to implement support
for each of the unique debugger startup sequences it is willing to
support, where you could end up with some proxies not supporting a
debugger. If you implement all of the code in the GUI resource manager
side though, I'm not sure you don't have the same problem, where the
RM needs to be aware of the details of both the debugger startup
sequence and the details of a particular runtime environment/proxy.
The other question I have after seeing the contents of the routing
file you generate is the generation of random port numbers. If you end
up actually using these port numbers, do you run the risk of
accidentally using a port number reserved for some other application,
unless you block out a range of port numbers and only use that range?
Even if port numbers are up for grabs with no expectation of reserved
port numbers, what happens if something else is using your port
number?
Dave
_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev

