Re: [ptp-dev] Re: Re: About sdm and mpi jobs
On May 24, 2006, at 9:26 PM, yang ke wrote:
Thanks, Greg.
I still have a few questions :-):
Happy to answer them.
When the debug flag is on, we start a debug job. In
orte_server.c, debug_spawn() allocates two job structures: one
for the application, the other for sdm, and then the sdms are launched
in debug_spawn() (am I right?).
Correct.
1. Does orte launch the application at the same time as it
launches sdm?
No. orte just makes a process allocation for the application and
assigns it a jobid. The sdm then sets this information in environment
variables prior to starting gdb. It is gdb that ultimately starts
the application process, but because the environment has been set
correctly the MPI application can communicate with orte to set up
the task IDs. Since orte normally uses the environment to pass this
information, there is virtually no difference between the sdm-
launched application and a normal application. Because gdb started
the application, it is also possible to get control very early in the
application execution (possibly even in shared library loading),
which will improve the debugger functionality.
2. sdm servers internally fork a gdb to launch the program
(execlp gdb_path, gdb, -q, -tty, /dev/null, -mi, program, NULL). Is
this program a newer one?
I'm not sure what you mean. Each sdm process starts a local gdb that
forks/execs the application process.
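The launch sequence described in answers 1 and 2 can be sketched roughly as follows. This is only an illustration, not the actual sdm source: the variable name EXAMPLE_ORTE_JOBID and the helpers spawn_with_jobid() and wait_for_exit() are hypothetical, since the thread does not say which environment variables orte reads. The point is that anything exported before the exec is inherited by gdb, and by the application that gdb later starts.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical name: the real variables orte reads are not given in this thread. */
#define EXAMPLE_JOBID_VAR "EXAMPLE_ORTE_JOBID"

/* Fork a child, export the job id into the child's environment, then exec
 * argv[0] (looked up on PATH). Returns the child's pid, or -1 if fork failed. */
pid_t spawn_with_jobid(char *const argv[], const char *jobid)
{
    assert(argv != NULL && argv[0] != NULL && jobid != NULL);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: set the environment first, then replace the process image. */
        if (setenv(EXAMPLE_JOBID_VAR, jobid, 1) != 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* reached only if the exec failed */
    }
    return pid;
}

/* Wait for the child and return its exit code, or -1 if it did not exit normally. */
int wait_for_exit(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An sdm-like caller would pass the argument list quoted in question 2, that is {"gdb", "-q", "-tty", "/dev/null", "-mi", program, NULL}, so that gdb and the process it starts both see the exported job id.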
3. If so, then it seems that each gdb is just debugging a single local
program. Yet the PTP parallel debugger works quite well.
Thanks :-). Although the debugger only provides basic commands at the
moment, we think it's a great platform for developing some really
interesting parallel debugger features.
You are correct, each gdb is debugging a single process. The sdm uses
gdb just for the low-level debug operations (such as setting a
breakpoint, etc.). The sdm is responsible for managing communication
between Eclipse and the many copies of gdb that are running. I'm
currently working on some optimizations that will make this more
efficient (at the moment we get linear scaling, but by using well-known
techniques we should be able to get log scaling). This will mean
that debugging 10,000+ processes will be feasible. I've tested the
current architecture to 1000 processes with acceptable performance.
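To give a feel for the difference between linear and log scaling, here is a small sketch. It assumes a doubling (binomial-tree) broadcast, which is only one of the well-known techniques; the message does not say which one is planned. The names sequential_sends() and doubling_rounds() are made up for the illustration, and n counts every process including the originator.

```c
#include <assert.h>
#include <limits.h>

/* Linear scheme: the originator contacts every other process one at a time. */
unsigned long sequential_sends(unsigned long n)
{
    assert(n >= 1);
    return n - 1;
}

/* Doubling scheme: in each round, every process that already has the command
 * forwards it to one process that does not, so coverage doubles per round.
 * Returns the number of rounds until all n processes are covered. */
unsigned doubling_rounds(unsigned long n)
{
    assert(n >= 1 && n <= ULONG_MAX / 2); /* keeps the doubling from overflowing */

    unsigned rounds = 0;
    unsigned long reached = 1; /* the originator already has the command */
    while (reached < n) {
        reached *= 2;
        rounds++;
    }
    return rounds;
}
```

Under this assumption, 1000 processes need 999 sequential sends but only 10 doubling rounds, and 10,000 processes need 14 rounds.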
4. Will the new resource manager part in a future PTP provide
an interface for slurm?
Absolutely. Randy has designed the resource manager to work with
virtually any type of job scheduler. He's looked at both LSF and
slurm to make sure that they will work with the RM system. We haven't
yet decided which scheduler to interface to first, but if there was
interest in using PTP with slurm then we could look at that option.
Cheers,
Greg
Thank you.
BTW, my gdb version is 6.0post-0.20031117.6rh. I will ask
the administrator about a newer version.
Integration with resource managers is something we're working on right
now, but is not currently available in PTP. At the moment, you have
to do the allocation request manually before you launch the job
(possibly even before you launch Eclipse if environment variables are
used.)
In any case, the allocation you need to make for debugging an MPI job
with the SDM will depend on your resource management system. The SDM
itself requires an allocation of n+1 processes (where n is the number
of processes in the job being debugged). It will then fork/exec n
copies of gdb (one per process being debugged) and n copies of the
program being debugged. If your resource manager somehow manages the
allocation of forked processes, you may potentially need to request
an allocation of 3n+1 processes to debug the job. The usual way to
deal with this would be to provide a debug or interactive queue that
these types of jobs can be submitted to.
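The arithmetic above can be written out as a small sketch. The function names are hypothetical; the figures come straight from the paragraph: n+1 for the SDM, plus n copies of gdb and n copies of the program if the resource manager also counts forked processes.

```c
#include <assert.h>
#include <limits.h>

/* Allocation the SDM itself requires for a job of n processes: n + 1. */
unsigned long sdm_allocation(unsigned long n)
{
    assert(n <= ULONG_MAX - 1);
    return n + 1;
}

/* Worst case when forked processes are counted as well:
 * (n + 1) for the SDM, n for the gdb copies, n for the program copies. */
unsigned long counted_fork_allocation(unsigned long n)
{
    assert(n <= (ULONG_MAX - 1) / 3); /* guard against overflow of 3n + 1 */
    return sdm_allocation(n) + n + n;
}
```

For a 4-process job this gives an SDM allocation of 5, and 13 if the forked gdb and program processes have to be requested too.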
Threading support has been in gdb for a while, but I'm not sure if it
was available before version 6. Also, it depends on threading support
being available on your machine architecture and operating system.
What version of gdb do you have (gdb -v), and what type of system
is it?
Regards,
Greg