Hey all -
On a related note, I am working on BG/P support for Intrepid/Challenger/Surveyor at Argonne. They don't use LoadLeveler; they use PBS Cobalt. So, to get job info I am using qstat, and to get node info I am using partlist.
One problem I have encountered is that partlist returns the partition blocks, not the actual machine nodes. From the partlist results, I can either return the partition blocks (which are not mutually exclusive: a single node can exist in many different partitions, configured in different ways) or I can reconstruct the machine based on the partition names, and then map the jobs to the machine. For example, the 1024 nodes of Challenger are organized into 230 different partitions, with 16 to 512 nodes per partition.
The jobs themselves are allocated to partition blocks, so they have to be remapped too.
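To make the two mappings concrete, here is a minimal Python sketch. It assumes the partition-to-node membership has already been reconstructed from the partition names (that parsing is site-specific and not shown), and every partition name, node ID, and job ID below is made up for illustration.

```python
from collections import defaultdict


def nodes_per_job(partition_nodes, job_partition):
    """Remap each job from its partition block to the nodes in that block."""
    return {job: set(partition_nodes[part]) for job, part in job_partition.items()}


def partitions_per_node(partition_nodes):
    """Invert partition -> nodes, showing that one node can sit in many partitions."""
    membership = defaultdict(set)
    for part, nodes in partition_nodes.items():
        for node in nodes:
            membership[node].add(part)
    return dict(membership)


# Hypothetical data: two overlapping partitions and one running job.
partition_nodes = {
    "P-small": {"n0", "n1"},
    "P-large": {"n0", "n1", "n2", "n3"},
}
job_partition = {"job1": "P-large"}

job_nodes = nodes_per_job(partition_nodes, job_partition)
node_parts = partitions_per_node(partition_nodes)
```

Here job1 ends up mapped to n0 through n3, and n0 is reported as belonging to both partitions, which is the overlap that makes a partition-only view ambiguous.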
Neither of these mappings is that difficult, but I just wonder if it is more meaningful to show the partitions in the Eclipse IDE, or the BG/P hardware hierarchy.
Any suggestions?
Thanks - Kevin
-- Kevin A. Huck ParaTools, Inc (541) 359-2261
On Jan 13, 2012, at 6:32 AM, Greg Watson wrote:
I think this is a question for the LML developers: How can the layout of the nodedisplay be customized for a particular resource manager?
Greg
On Jan 12, 2012, at 9:00 PM, Dave Wootton wrote:
I'm trying to figure out what I need
to do in order for my PE JAXB resource manager to effectively handle LML
status views when the user's application is using a large number of nodes
in the cluster, and where displaying status for all those nodes will affect
Eclipse performance.
With PE, I have no information about
which nodes are in use until the application starts since PE uses either
a host file or a back end resource manager such as LoadLeveler to handle
node allocation at job submit time. I also have no information at all about
node topology, at least in the hostfile case.
So I think the default action is that
I need to arbitrarily group nodes into groups of 100 nodes, so that if
I was using 500 nodes, the initial view would be 5 elements representing
groups of 100 nodes, and where the user could zoom in to see detail of
those 100 nodes.
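A minimal Python sketch of that default grouping, for illustration only. The function name and node names are hypothetical, and it says nothing about how the groups would be handed to the LML layout.

```python
def group_nodes(nodes, size=100):
    """Split a flat node list into consecutive groups of at most `size` nodes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


# Hypothetical node names for a 500-node job.
nodes = [f"node{i:04d}" for i in range(500)]
groups = group_nodes(nodes)
```

With 500 nodes this yields 5 groups of 100. A node count that is not a multiple of 100 simply leaves a shorter last group.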
One idea that I had about better grouping
was to do something where I apply a regular expression pattern to the list
of nodes, where node names may be representative of what frame they reside
in, and group nodes in the same logical set into a single display element.
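The regular expression idea could look roughly like the Python sketch below. The node naming scheme (f<frame>n<node>) and the pattern are assumptions for the example; a real site would supply its own pattern, and names that do not match fall into a catch-all group.

```python
import re
from collections import defaultdict


def group_by_pattern(nodes, pattern, fallback="ungrouped"):
    """Group node names by the first capture group of `pattern`."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for name in nodes:
        match = regex.match(name)
        # Names that do not match the pattern go into the fallback group.
        key = match.group(1) if match else fallback
        groups[key].append(name)
    return dict(groups)


# Hypothetical naming scheme: f<frame>n<node>, e.g. "f01n02".
nodes = ["f01n01", "f01n02", "f02n01", "login1"]
grouped = group_by_pattern(nodes, r"(f\d+)n\d+$")
```

Each key of the result would become one display element, with the member nodes shown when the user zooms in.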
I'm looking for suggestions about what
I need to do in my resource manager or elsewhere to make this work.
Thanks.
Dave
_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev