[ptp-dev] My Commit

I have finally committed my changes. It took a few days to merge them with the changes that Clement put in, but I like the changes overall. I've tested this extensively on my Mac and it works great. I have NOT yet tested it on bproc, but intend to in the next day or two. I do expect there to be problems there that I have not foreseen.

The changes are pretty major, touching a lot of files, but basically they move us to a more discovery-oriented mechanism for getting system status information. Everything is now pushed up from the runtime system level. If the runtime system takes 10 minutes to tell us anything about the cluster, we'll have an empty model for 10 minutes, but it WILL NOT BLOCK waiting on a response as it did before. If the runtime system sends us blocks of data about the machine over time, we'll update as we get more data. So if we have a 10,000 node cluster which takes 60 seconds to completely update, but the proxy is set up to send back info asynchronously, then the model will fill in over time as the data comes in.
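To make the push model concrete, here is a minimal sketch in Python. It is not the committed code; `ClusterModel`, `fake_runtime`, and `pump` are hypothetical names invented for illustration, and a queue stands in for the proxy connection. The point is only that the model starts empty and fills in as batches arrive, with nobody waiting on a complete response.

```python
import queue
import threading


class ClusterModel:
    """Hypothetical model: starts empty and fills in as updates arrive."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = {}  # node id -> dict of attributes known so far

    def apply_update(self, node_id, attrs):
        # Merge whatever the runtime system told us; never assume completeness.
        with self._lock:
            self._nodes.setdefault(node_id, {}).update(attrs)

    def node_count(self):
        with self._lock:
            return len(self._nodes)

    def get_attribute(self, node_id, key):
        # Returns None when the node or the attribute is not known yet.
        with self._lock:
            return self._nodes.get(node_id, {}).get(key)


def fake_runtime(events, batches):
    """Stand-in for the runtime proxy: pushes batches, then a None sentinel."""
    for batch in batches:
        events.put(batch)
    events.put(None)


def pump(events, model):
    """Drain pushed batches into the model until the sentinel arrives."""
    while True:
        batch = events.get()
        if batch is None:
            return
        for node_id, attrs in batch:
            model.apply_update(node_id, attrs)


model = ClusterModel()
events = queue.Queue()
empty_at_start = model.node_count()  # the model is usable while still empty

batches = [
    [(0, {"name": "node0"}), (1, {"name": "node1"})],  # names arrive first
    [(0, {"state": "up"})],                            # state for node 0 later
]
producer = threading.Thread(target=fake_runtime, args=(events, batches))
consumer = threading.Thread(target=pump, args=(events, model))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

After both threads finish, node 1 still has no state because the fake runtime never sent one, which is exactly the partially filled situation the model has to tolerate.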

As such, ANYTHING that touches the model must recognize that the model may be in various states of completeness at any time. Do not expect certain attributes (such as a node name or the status of a node) to be there. All code that touches the model must accept that it may ask for an attribute and get back a 'null', and it will have to adjust accordingly. Much of the UI code displays a "?" icon in these cases, later changing to an appropriate icon once the data is fully available.
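As a rough illustration of what "adjust accordingly" means for a caller, here is a small Python sketch, with Python's `None` playing the role of 'null'. The attribute keys, the icon strings, and the helper names `icon_for` and `label_for` are assumptions made up for this example, not the names used in the UI code.

```python
UNKNOWN_ICON = "?"
# Hypothetical mapping from a node state to a display icon.
STATE_ICONS = {"up": "[+]", "down": "[-]"}


def icon_for(node_attrs):
    """Pick an icon, falling back to "?" while the state is still unknown."""
    state = node_attrs.get("state")  # may be None if not reported yet
    if state is None:
        return UNKNOWN_ICON
    # An unrecognized state is also shown as unknown rather than failing.
    return STATE_ICONS.get(state, UNKNOWN_ICON)


def label_for(node_id, node_attrs):
    """Build a display label without assuming the node name has arrived."""
    name = node_attrs.get("name")
    if name is None:
        return f"node {node_id} (name pending)"
    return name
```

A view that calls `icon_for` again on every model update gets the "?" first and the real icon later, with no special casing needed at the call site.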

This asynchronous updating has not yet been done for job launching - that part still blocks. It is (was?) my intention to do this but I felt it more important to commit something so that I didn't stay too far afield in my own branch for much longer. Additionally, I hope that this allows Randy to stop blocking on my commit.

One final note. There are compile errors in both the simulation and mpich trees due to major structural changes to the core model. Greg told me not to worry about these and I agree. Firstly, I didn't write the mpich one, and these changes should make that implementation a lot easier now. Additionally, I think the simulation tree is defunct, so maybe we'll just delete it?

--
-- Nathan
Correspondence
---------------------------------------------------------------------
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndebard@xxxxxxxx
---------------------------------------------------------------------


