I have just committed, finally, my changes. It took a few days to merge
back into the changes that Clement put in but I like the changes
overall. I've tested this extensively on my Mac and it works great. I have
NOT yet tested it on bproc, but intend to in the next day or two - I do
expect there to be problems that I had not foreseen.
The changes are pretty major, touching a lot of files, but basically
they move us to a more discovery-oriented mechanism for getting system
status information. Everything now is pushed up from the runtime system
level. If the runtime system takes 10 minutes to tell us anything about
the cluster, we'll have an empty model for 10 minutes but it WILL NOT
BLOCK like it would before, waiting on a response. If the runtime
system sends us blocks of data about the machine over time, we'll update
as we get more data. So if we have a 10,000 node cluster which takes 60
seconds to completely update but the proxy is set up to send back info
asynchronously then it will fill in over time as the data comes in.
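To make the push-style flow concrete, here is a rough sketch of what I mean. The class and method names (NodeModel, onRuntimeUpdate, getAttribute) are made up for illustration and are not the actual classes in the tree; it only assumes the proxy can call back into the model whenever a block of data arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; this is not the real core model.
class NodeModel {
    // nodeId -> (attribute name -> value). Starts empty and fills in over time.
    private final Map<Integer, Map<String, String>> nodes = new ConcurrentHashMap<>();

    // Called by the runtime proxy whenever it has a block of data for a node.
    // Later blocks for the same node are merged into what is already known.
    void onRuntimeUpdate(int nodeId, Map<String, String> attrs) {
        nodes.computeIfAbsent(nodeId, id -> new ConcurrentHashMap<>()).putAll(attrs);
    }

    // Never waits on the runtime: returns null if the node or the
    // attribute has not been reported yet.
    String getAttribute(int nodeId, String name) {
        Map<String, String> attrs = nodes.get(nodeId);
        return attrs == null ? null : attrs.get(name);
    }

    // Number of nodes the runtime has told us about so far.
    int knownNodeCount() {
        return nodes.size();
    }
}
```

The point is that readers only ever see whatever has been pushed so far; nothing in the read path waits on the runtime system.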
As such, ANYTHING that touches the model must recognize that the
model may be in various states of completeness at any time. Do not
expect certain attributes (such as a node name) to be there, or the
status of a node. All code that touches the models must accept that you
may ask for an attribute and get back a 'null', so it will have to
adjust accordingly. Much of the UI code displays a "?" icon in these
cases, later changing to an appropriate icon once the data is fully
available.
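As an illustration of the kind of null handling I mean, something along these lines would do. The attribute keys ("name", "status"), the status values, and the icon names are assumptions for the example, not what the UI code actually uses.

```java
import java.util.Map;

// Hypothetical helper; the attribute keys and icon names are illustrative only.
class NodeDisplay {
    // Pick an icon for a node, falling back to "?" while the status is unknown.
    static String iconFor(Map<String, String> attrs) {
        String status = attrs.get("status");
        if (status == null) {
            return "?"; // the runtime has not reported a status yet
        }
        switch (status) {
            case "up":
                return "node_up";
            case "down":
                return "node_down";
            default:
                return "?"; // a status we do not recognize is treated as unknown
        }
    }

    // Build a label without assuming the node name has arrived.
    static String labelFor(int nodeId, Map<String, String> attrs) {
        String name = attrs.get("name");
        return name != null ? name : "node " + nodeId;
    }
}
```

When a later update fills in the missing attribute, calling the same methods again yields the real icon and label, so the view just needs to refresh.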
This asynchronous updating has not yet been done for job launching -
that part still blocks. It is (was?) my intention to do this but I felt
it more important to commit something so that I didn't stay too far
afield in my own branch for much longer. Additionally, I hope that this
allows Randy to stop blocking on my commit.
One final note. There are compile errors in both the simulation and
mpich trees due to major structural changes to the core model. Greg
told me not to worry about these and I agree. Firstly, I didn't write
the mpich one and these changes should make that implementation a lot
easier now. Additionally, the simulation tree I think is defunct, so
maybe we'll just delete it?
--
-- Nathan
Correspondence
---------------------------------------------------------------------
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndebard@xxxxxxxx
---------------------------------------------------------------------