Re: [ptp-dev] PBS remote resource manager fails on NERSC's Hopper
On Tue, Oct 4, 2011 at 12:39 PM, Wyatt Spear
<wspear@xxxxxxxxxxxxxx> wrote:
For some reason apstat requires me to enter my password again on Hopper; I'm guessing that would throw off the RM's query.
I can run apstat without having to enter my password again (on both Hopper and Jaguar). It works both with and without ssh-agent forwarding. Only if I use ssh-agent forwarding while my private key is not unlocked do I get asked for the private key's password. But even then it works without providing it (pressing cancel).
Roland
To what extent is this node info needed for the remote build/launch functionality? Right now, when this check fails, the resource manager fails to start altogether, but it seems like the remote build and launch commands should still be able to work even if we can't see what's happening on the system.
=Wyatt
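
The behavior Wyatt asks about could be sketched roughly as follows. This is only an illustration of the idea, not PTP's actual resource manager code: `ResourceManagerSketch`, `NodeQueryError`, and the `query_nodes` callable are hypothetical names, and it assumes (as suggested above, not confirmed) that build and launch do not need the node info.

```python
class NodeQueryError(Exception):
    """Raised by the hypothetical node query when no node data is available."""


class ResourceManagerSketch:
    """Illustrative stand-in only; not PTP's resource manager class."""

    def __init__(self, query_nodes):
        self.query_nodes = query_nodes  # callable returning a list of nodes
        self.started = False
        self.monitoring_enabled = False
        self.nodes = []
        self.warning = None

    def start(self):
        # Assumption from the thread: build/launch can work without node
        # info, so a failed node query should not block startup.
        self.started = True
        try:
            self.nodes = self.query_nodes()
            self.monitoring_enabled = True
        except NodeQueryError as err:
            # Degrade: keep running, just without the system view.
            self.nodes = []
            self.monitoring_enabled = False
            self.warning = f"monitoring disabled: {err}"
        return self.started
```

With this shape, a failure such as the empty node list on Hopper would leave the manager started with monitoring switched off and a warning recorded, instead of aborting startup.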
Then we need to provide a modified lml.da, right? (I'm trying to figure out where this is plumbed in; I believe it is packaged in a plugin.) I don't think there is a place yet for configuring this in the resource manager XML (really the control XML), but I'm not certain.
Jay
On Cray the pbsnodes command never gives useful information. On Jaguar it only shows the batch nodes.
"apstat -v -n" gives the required information instead.
Roland
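
A minimal sketch of how a monitoring script might choose between the two commands, assuming it prefers `apstat -v -n` where it is available (Cray) and falls back to `pbsnodes` elsewhere. The helper names `run_command` and `pick_node_query` are made up for illustration and are not part of lml.da or PTP; the output would still need to be parsed separately.

```python
import subprocess

# Commands to try in order. "apstat -v -n" is the invocation quoted in this
# thread; it goes first because pbsnodes is not useful on Cray systems.
NODE_QUERIES = [["apstat", "-v", "-n"], ["pbsnodes"]]


def run_command(argv):
    """Run argv and return (exit_code, stdout); a missing binary counts as failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return 1, ""
    return proc.returncode, proc.stdout


def pick_node_query(run=run_command, queries=NODE_QUERIES):
    """Return (argv, output) for the first query that yields output, else None."""
    for argv in queries:
        code, out = run(argv)
        if code == 0 and out.strip():
            return argv, out
    return None
```

Passing the runner in as `run` lets the selection logic be exercised without access to a Cray login node.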
On Tue, Oct 4, 2011 at 12:08 PM, Wyatt Spear <wspear@xxxxxxxxxxxxxx> wrote:
On Hopper pbsnodes returns:
pbsnodes: Server has no node list MSG=node list is empty - check 'server_priv/nodes' file
I'll ask the NERSC-ies about this...
Wyatt
On Tue, Oct 4, 2011 at 8:48 AM, Jay Alameda <jalameda@xxxxxxxxxxxxxxxxx> wrote:
Well, I'm slowly coming up the steep learning curve on the configurable RM. I know that the monitoring code, lml.da, is looking at pbsnodes output on machines that support PBS. Maybe start on Hopper and see what pbsnodes returns?
There could also be a missing Perl module that would be needed to convert the raw output of pbsnodes into the XML that the client monitoring code expects to see. We saw this on one system here at NCSA.
Jay
I would be happy to dig around and check. Is there a spot where I can do a sysout on what PBS is returning?
Wyatt
On Tue, Oct 4, 2011 at 7:51 AM, Jay Alameda <jalameda@xxxxxxxxxxxxxxxxx> wrote:
I think we tried this out on an older Cray system, Kraken, at NICS. I
seem to recall that it worked, including the LML display. I wonder what
may be different here?
Jay
-----Original Message-----
From: ptp-dev-bounces@xxxxxxxxxxx [mailto:ptp-dev-bounces@xxxxxxxxxxx] On
Behalf Of Greg Watson
Sent: Tuesday, October 04, 2011 6:37 AM
To: Parallel Tools Platform general developers
Subject: Re: [ptp-dev] PBS remote resource manager fails on NERSC's Hopper
_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev
--
ORNL/UT Center for Molecular Biophysics
cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309