Re: [ptp-dev] Orte jobs do not stop

Wyatt,

I haven't tried PTP 2.0 with Open MPI 1.2.6 (only 1.2.5), so it's possible that something has broken. I'll install it on my Linux VM and let you know how it goes.

Greg

On Jul 9, 2008, at 6:08 PM, wspear wrote:

This is openmpi 1.2.6 built with gnu 4.1.2.  It's running on x86_64
Linux.  I have been using the PTP 2.0 available from the update site
(2.0.0.200806061515).  The behavior is the same in both Europa and
Ganymede.

-Wyatt

On Wed, Jul 9, 2008 at 5:35 AM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
Wyatt,

What version of Open MPI are you using? What type of system is it? Is this
PTP 2.0 or from CVS?

PTP 2.0 has not been tested with Ganymede, but it sounds like this is a problem with Open MPI. Can you try with Europa to see if you have the same
problem?

Thanks,

Greg

On Jul 8, 2008, at 11:36 PM, wspear wrote:

Greetings,

When I try to execute an MPI application with PTP via ORTE, it
seems to run successfully, but after what should be the final output
is printed, PTP continues to list the job status as running, and
the orte process's processor usage shoots up to 100% in top. If I try
to stop the job or shut down the ORTE resource manager manually,
Eclipse freezes solid and I need to kill the orte process from the
command line.

Three possibly relevant factors are that I'm using a version of
Open MPI configured for use with PBS (though I'm just running on the
head node at the moment), I'm running these tests in the Ganymede
Eclipse release, and I get a warning about oversubscribed nodes (which
is also normal when running with mpirun on the head node in this case).

I don't know if any of those could explain why the application would
run successfully while the orte fails to stop, though.

When I run it on a back-end node, where interactive jobs are allowed, the execution completes without the warning, but the output only shows
up on the command line where Eclipse was launched, and there is no
sign that the start of the process or individual jobs were detected or
handled by the PTP.  The orte process still freezes as described
above.

Any ideas how I might fix this? Has anyone been working on a PBS
resource manager for PTP?

Thanks,

Wyatt
_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev

