SUSPEND and RESUME sound good to me. The proxy will presumably still need to deal with the situation where more than one SUSPEND is sent before a corresponding RESUME?
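One way the proxy could cope with repeated SUSPENDs is to count them and only release events once every SUSPEND has been matched by a RESUME. The sketch below is only an illustration of that idea: the class and method names are hypothetical and not taken from any existing proxy code, and treating a repeated SUSPEND as a no-op would be an equally valid choice.

```python
class EventGate:
    """Hypothetical proxy-side gate that holds events while suspended."""

    def __init__(self, send):
        self._send = send          # callable that delivers one event to the client
        self._suspend_depth = 0    # number of SUSPENDs not yet matched by a RESUME
        self._held = []            # events generated while suspended

    def suspend(self):
        # Each SUSPEND deepens the nesting; nothing is sent while depth > 0.
        self._suspend_depth += 1

    def resume(self):
        if self._suspend_depth == 0:
            return                 # unmatched RESUME: ignored in this sketch
        self._suspend_depth -= 1
        if self._suspend_depth == 0:
            # Last outstanding SUSPEND released: flush held events in order.
            held, self._held = self._held, []
            for event in held:
                self._send(event)

    def emit(self, event):
        if self._suspend_depth > 0:
            self._held.append(event)
        else:
            self._send(event)
```

With this counting approach, two SUSPENDs followed by one RESUME leave the proxy suspended; events only flow again after the second RESUME.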
I can now detect thresholds for flow
control and send START_EVENTS and STOP_EVENTS commands to the proxy. As
I started testing this I found what I think are a couple of problems, at least
with how the PE and LoadLeveler proxies were implemented. The START_EVENTS
command indicates a state change between sending attribute definitions
to the client and sending machine, node and job/process information to
the client. In the LoadLeveler case a couple threads get created to monitor
LoadLeveler node and job objects. The transaction id passed across with
the START_EVENTS command is also used for sending asynchronous events such
as job state and process state notifications that take place outside the
scope of a run command. When a STOP_EVENTS command is processed, an OK
ack is then sent for the START_EVENTS command, which closes the scope of that
command and invalidates the transaction id. This is probably ok, since the
OK ack will be behind any event notifications issued within
the scope of that START_EVENTS/STOP_EVENTS pair.
However, if the proxy generates more
events between the time that the STOP_EVENTS command is processed and the
next START_EVENTS command is processed, those events will be tagged with
the old START_EVENTS transaction id and since they follow the OK ack for
the START_EVENTS command will be invalid.
Also, if we allow multiple START_EVENTS
commands then proxies need to be able to distinguish between first and
subsequent START_EVENTS commands, and process them differently.
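To make the two problems concrete, here is a small model of the transaction id lifecycle. It is a sketch under assumptions, not how the PE or LoadLeveler proxies work: the class, the "INITIAL_STATE" message, and the choice to queue late events until the next START_EVENTS are all hypothetical.

```python
class EventScope:
    """Hypothetical model of START_EVENTS/STOP_EVENTS transaction scoping."""

    def __init__(self):
        self.tid = None            # transaction id of the open START_EVENTS, if any
        self.started_once = False  # distinguishes first from later START_EVENTS
        self.pending = []          # events generated while no scope is open
        self.wire = []             # (tid, message) tuples "sent" to the client

    def start_events(self, tid):
        self.tid = tid
        if not self.started_once:
            # Only the first START_EVENTS triggers the initial
            # machine/node/job report (and would start monitor threads).
            self.started_once = True
            self.wire.append((tid, "INITIAL_STATE"))
        # Events that arrived between STOP and START get the new, valid id.
        pending, self.pending = self.pending, []
        for event in pending:
            self.wire.append((tid, event))

    def stop_events(self):
        if self.tid is None:
            return
        # The OK ack closes the scope and invalidates the transaction id.
        self.wire.append((self.tid, "OK"))
        self.tid = None

    def async_event(self, event):
        if self.tid is None:
            self.pending.append(event)   # never tag with a stale id
        else:
            self.wire.append((self.tid, event))
```

In this model an event raised after STOP_EVENTS is held and later sent under the id of the next START_EVENTS, so nothing follows the OK ack with an invalidated id.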
I'm afraid that using START_EVENTS and STOP_EVENTS for flow control
is going to cause more problems. I'm thinking we introduce new commands
SUSPEND_EVENTS and RESUME_EVENTS for event notifications, where all they
do is request flow control, alongside SET_FILTERS and CLEAR_FILTERS, which
handle flow control on process change events and are currently limited to
stdio flow control.
Or do we just use SET_FILTERS and CLEAR_FILTERS for everything?
Dave