Re: [ptp-user] Job submission problems with PTP 5.0.3 and Torque RM
Beth,
I'll have to go back and try. I believe it was working with 5.0.1 (that was in the June 2011 time frame). This system's Torque installation has also been updated since then, but I have no control over that, so I would like to get 5.0.3 working with whatever is installed.
Phil
On Nov 1, 2011, at 16:10 , Beth Tibbitts wrote:
>
> I can't help with the RM problem but can you tell us, what was the previous
> version that worked for you?
> 5.0.2? Was this a breakage from 5.0.2 to 5.0.3?
>
>
> ...Beth
>
> Beth Tibbitts
> Eclipse Parallel Tools Platform http://eclipse.org/ptp
> IBM STG - High Performance Computing Tools
> Mailing Address: IBM Corp., 745 West New Circle Road, Lexington, KY 40511
>
>
> From: "Roth, Philip C." <rothpc@xxxxxxxx>
> To: PTP User list <ptp-user@xxxxxxxxxxx>
> Date: 11/01/2011 03:50 PM
> Subject: [ptp-user] Job submission problems with PTP 5.0.3 and Torque RM
> Sent by: ptp-user-bounces@xxxxxxxxxxx
>
> Hello all,
>
> After the recent 5.0.3 update, I'm back to trying to get PTP working well
> on a cluster to which I have access. The cluster uses Torque 2.5.7 for
> batch queue software. PTP 5.0.3 and Eclipse 3.7.1 don't seem to be able to
> completely talk to this Torque installation, and there are other problems
> that I can't seem to diagnose. Perhaps someone else has figured these out
> already, or has some suggestions about how to fix or work around these
> problems?
>
> For the following description I'm in the Parallel Runtime perspective, and
> working with a local MPI-based C++ project.
>
> The first hint something is wrong is that it isn't clear whether the
> PBS-Generic-Batch resource manager is fully started or not. I created a
> stock PBS-Generic-Batch RM. I use the context menu to start the RM, and
> after a second or so the RM icon changes from grey to green. However, if I
> select it and look at the properties view, the properties view still
> indicates the RM is in the STOPPED state, with "num machines" and "num
> queues" both 0. Furthermore, nothing shows up in the "Machines" or "Jobs
> list" views. These views seem contradictory - the RM view showing me it
> has started but others suggesting not.
>
> I'm able to create a Run Configuration for my program. Interestingly, the
> set of available queues in the "Run Configuration" dialog's Resources tab
> is the set of queues on our system, so PTP must have been able to obtain
> the correct set of queues.
>
> If I attempt to Run my Run Configuration (i.e., submit it to the batch
> queue), the progress view gets as far as saying 'submit-batch' with a large
> hex number (a GUID?) but sticks at 75%. PTP creates a file in my home
> directory named $(GUID)managed_file_for_script, but with a different GUID
> than the one shown with the submit-batch progress bar.
>
> Eventually I have to cancel the submit-batch operation in the progress
> view. When I do so, a message is displayed to the screen and written to
> the workspace log file saying that the qsub command failed because it
> couldn't find the batch script. The message shows the command was trying
> to use the path $HOME$HOME$GUIDmanaged_file_for_script (i.e., the home
> directory path is listed twice). I can't see how to modify the
> PBS-Generic-Batch XML file to keep it from building the path with $HOME
> twice.
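
To make the doubled-path symptom concrete, here is a minimal Python sketch. It is an illustration only: how PTP actually builds the script path is not known from this thread, and the home directory, the file name, and all three function names are hypothetical. It shows one way a doubled prefix can arise (plain string concatenation of a home directory with an already-absolute path), how a path join avoids it, and a small helper that spots and collapses the doubling in a path copied from the log.

```python
import posixpath


def build_by_concatenation(home, script):
    # Naive concatenation: if `script` is already absolute and starts with
    # `home`, the result contains the home directory twice.
    return home + script


def build_by_join(home, script):
    # posixpath.join discards earlier components when a later one is absolute,
    # so an already-absolute script path is returned unchanged.
    return posixpath.join(home, script)


def collapse_doubled_prefix(path, home):
    # Diagnostic helper: if `path` starts with home + home, drop one copy.
    if path.startswith(home + home):
        return path[len(home):]
    return path


if __name__ == "__main__":
    home = "/home/user"  # hypothetical home directory
    script = "/home/user/1234managed_file_for_script"  # hypothetical file name
    broken = build_by_concatenation(home, script)
    print(broken)
    print(build_by_join(home, script))
    print(collapse_doubled_prefix(broken, home))
```

This only helps confirm what the failing qsub command was given; it does not change what the resource manager generates.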
>
> Just to explore, I created the directories and a symlink so that $HOME$HOME
> existed and pointed to $HOME. PTP was able to submit the job but it
> produced no output to the Console view nor did it change the Machines or
> Jobs list views.
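
The symlink experiment described above can be sketched in Python as follows. This is a rough sketch under assumptions: the function name link_doubled_home is made up, it assumes a POSIX file system where the user may create symlinks, and the demo runs inside a temporary directory rather than a real home directory.

```python
import os
import tempfile


def link_doubled_home(home):
    """Make the path home + home resolve to home by way of a symlink."""
    doubled = home + home  # e.g. /home/user/home/user
    # Create the intermediate directories, then link the last component.
    os.makedirs(os.path.dirname(doubled), exist_ok=True)
    if not os.path.lexists(doubled):
        os.symlink(home, doubled)
    return doubled


if __name__ == "__main__":
    # Demo in a scratch directory standing in for the real file system.
    root = tempfile.mkdtemp()
    home = os.path.join(root, "home", "user")
    os.makedirs(home)
    doubled = link_doubled_home(home)
    open(os.path.join(home, "batch_script"), "w").close()
    # The script is now reachable through the doubled path as well.
    print(os.path.exists(os.path.join(doubled, "batch_script")))
```

As noted above, this only got the submission past qsub; it did not fix the missing console output or the empty views.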
>
> Does anyone have any ideas?
>
> Phil Roth
>
> P.S. Using diagnostic advice given previously on this list, I found that
> the LML_da_driver.pl script is not correctly finding the version of Torque.
> When given the --version flag, the qstat command with Torque 2.5.7 writes
> its output on stderr, and that script sends stderr to /dev/null. If I
> change the script so that it sends stderr to stdout for this test, the
> script determines the version correctly but it has no effect on the
> problems I describe above.
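
The stderr issue in the P.S. can be illustrated with a short Python sketch. It is not the logic of LML_da_driver.pl (which is a Perl script); the function name detect_version and the version-matching regular expression are assumptions made for illustration. The idea is to merge stderr into stdout instead of discarding it, so that a version string is found no matter which stream the tool writes it to.

```python
import re
import subprocess
import sys


def detect_version(cmd):
    """Run cmd and return the first dotted version number it prints, or None."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, do not drop it
        universal_newlines=True,
        check=False,
    )
    match = re.search(r"\d+(?:\.\d+)+", result.stdout)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Stand-in for a tool that reports its version on stderr only; on a real
    # cluster the command would be something like ["qstat", "--version"].
    fake_tool = [sys.executable, "-c",
                 "import sys; sys.stderr.write('version: 2.5.7')"]
    print(detect_version(fake_tool))
```

The exact text that qstat prints is assumed here; the regular expression only looks for the first dotted number in the combined output.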
>
>
> --
> Philip C. Roth | +1 865 241-1543 | http://ft.ornl.gov/~rothpc
>
>
>
> _______________________________________________
> ptp-user mailing list
> ptp-user@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/ptp-user
>
--
Philip C. Roth | +1 865 241-1543 | http://ft.ornl.gov/~rothpc