Re: [ptp-dev] Re: Re: About sdm and mpi jobs


On May 24, 2006, at 9:26 PM, yang ke wrote:

Thanks, Greg.

I still have a few questions :-):

Happy to answer them.

When the debug flag is on, we start a debug job. In orte_server.c, debug_spawn() allocates two job structures: one for the application and the other for sdm. The sdms are then launched in debug_spawn() (am I right?).

Correct.

1. Does orte launch the application at the same time as it launches sdm?

No. orte just makes a process allocation for the application and assigns it a jobid. The sdm then sets this information in environment variables prior to starting gdb. It is gdb that ultimately starts the application process, but because the environment has been set correctly the MPI application can communicate with orte to set up the task id's. Since orte normally uses the environment to pass this information, there is virtually no difference between the sdm-launched application and a normal application. Because gdb started the application, it is also possible to get control very early in the application execution (possibly even in shared library loading), which will improve the debugger functionality.
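To make the sequence above concrete, here is a minimal Python sketch of the idea (the real sdm is written in C). The environment variable name is a placeholder, since this message does not say which variables orte reads, and the helper names are hypothetical. The gdb arguments are the ones from the execlp call quoted in question 2.

```python
import os

# Hypothetical variable name: the thread does not name the environment
# variables that orte actually reads, so this is a placeholder only.
JOBID_VAR = "EXAMPLE_ORTE_JOBID"


def build_debug_env(base_env, jobid):
    """Return a copy of base_env with the allocated jobid added."""
    env = dict(base_env)
    env[JOBID_VAR] = str(jobid)
    return env


def gdb_argv(program):
    """Argument list matching the execlp call quoted in the thread."""
    return ["gdb", "-q", "-tty", "/dev/null", "-mi", program]


def launch_under_gdb(gdb_path, program, jobid):
    """Fork, then exec gdb in the child with the prepared environment.

    gdb inherits the environment and later passes it on to the
    application it starts, which is how the job information would
    reach the MPI program.
    """
    pid = os.fork()
    if pid == 0:
        env = build_debug_env(os.environ, jobid)
        # execvpe searches PATH for gdb_path and replaces the child process.
        os.execvpe(gdb_path, gdb_argv(program), env)
    return pid
```

The point of the sketch is only the ordering: the environment is prepared first, gdb is started second, and the application is started by gdb last.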

2. sdm servers internally fork a gdb to launch the program ( execlp gdb_path,gdb, -q, -tty, /dev/null, -mi, program,NULL). Is this program a newer one?

I'm not sure what you mean. Each sdm process starts a local gdb that forks/execs the application process.

3. If 2 is the case, then it seems like gdb is debugging a single local program. But in fact, the ptp parallel debugger works quite well.

Thanks :-). Although the debugger only provides basic commands at the moment, we think it's a great platform for developing some really interesting parallel debugger features.

You are correct, each gdb is debugging a single process. The sdm uses gdb just for the low-level debug operations (such as setting a breakpoint, etc.). The sdm is responsible for managing communication between Eclipse and the many copies of gdb that are running. I'm currently working on some optimizations that will make this more efficient (at the moment we get linear scaling, but by using well-known techniques we should be able to get log scaling). This will mean that debugging 10,000+ processes will be feasible. I've tested the current architecture to 1000 processes with acceptable performance.
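As a rough illustration of linear versus log scaling, the sketch below compares the number of sequential steps needed to reach n debug servers one at a time with the depth of a forwarding tree. The message does not say which technique is planned, so the tree layout and the function names here are assumptions chosen only to show the arithmetic.

```python
def linear_steps(n):
    """Sequential steps if one controller contacts each of n servers in turn."""
    return n


def tree_depth(n, fanout=2):
    """Forwarding levels needed to reach n servers in a tree.

    Each server that has received a command forwards it to `fanout`
    children, so the number of servers reachable per level grows by a
    factor of `fanout`.
    """
    depth = 0
    reach = 1
    while reach < n:
        reach *= fanout
        depth += 1
    return depth


for n in (1000, 10000):
    print(n, linear_steps(n), tree_depth(n))
```

With a fan-out of 2, going from 1000 to 10,000 processes adds only a few levels to the tree, whereas the linear count grows tenfold.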

4. Will the new resource manager part of a future PTP provide an interface for slurm?

Absolutely. Randy has designed the resource manager to work with virtually any type of job scheduler. He's looked at both LSF and slurm to make sure that they will work with the RM system. We haven't yet decided which scheduler to interface to first, but if there was interest in using PTP with slurm then we could look at that option.

Cheers,

Greg


Thank you.

BTW, my gdb version is 6.0post-0.20031117.6rh. I will consult the administrator for a higher version.

Integration with resource managers is something we're working on right
now, but is not currently available in PTP. At the moment, you have
to do the allocation request manually before you launch the job
(possibly even before you launch Eclipse if environment variables are
used.)

In any case, the allocation you need to make for debugging an MPI job
with the SDM will depend on your resource management system. The SDM
itself requires an allocation of n+1 processes (where n is the number
of processes in the job being debugged). It will then fork/exec n
copies of gdb (one per process being debugged) and n copies of the
program being debugged. If your resource manager somehow manages the
allocation of forked processes, you may potentially need to request
an allocation of 3n+1 processes to debug the job. The usual way to
deal with this would be to provide a debug or interactive queue that
these types of jobs can be submitted to.
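The allocation arithmetic above can be written out as a small helper. The function names are hypothetical; the counts come straight from the paragraph (n+1 for the SDM, plus n copies of gdb and n copies of the program when forked processes are counted).

```python
def sdm_allocation(n):
    """Processes the SDM itself needs for an n-process job: n + 1."""
    return n + 1


def full_allocation(n):
    """Total if the resource manager also counts forked processes.

    n + 1 SDM processes, n gdb processes, and n application processes,
    which gives 3n + 1.
    """
    return sdm_allocation(n) + n + n


print(sdm_allocation(4), full_allocation(4))
```

For a 4-process MPI job this gives 5 processes for the SDM alone and 13 if gdb and the application are counted as well.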

Threading support has been in gdb for a while, but I'm not sure if it
was available before version 6. Also, it depends on threading support
being available on your machine architecture and operating system.
What version of gdb do you have (gdb -v), and what type of system
is it?

Regards,

Greg





