Re: [cdt-dev] [DSF] Making Sequence more robust

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

Re: [cdt-dev] [DSF] Making Sequence more robust

From: Pawel Piech <pawel.piech@xxxxxxxxxxxxx>
Date: Thu, 20 Oct 2011 10:40:17 -0700
Delivered-to: cdt-dev@xxxxxxxxxxx
List-archive: <https://dev.eclipse.org/mailman/private/cdt-dev>
List-help: <mailto:cdt-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/cdt-dev>, <mailto:cdt-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/options/cdt-dev>, <mailto:cdt-dev-request@eclipse.org?subject=unsubscribe>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.23) Gecko/20110921 Lightning/1.0b2 Thunderbird/3.1.15

On 10/20/2011 05:49 AM, Vladimir Prus wrote:

On Monday, October 17, 2011 20:20:13 Pawel Piech wrote:

Pawel,

thanks for guidance here. I have a couple of questions:

1. I haven't quite worked our through the complexities of catching runtime
exceptions by DsfExecutor, but I wanted to clarify what the goal should be.
At least for launch code, the workflow is that GdbLaunch delegates executes
startup sequences at waits for them using Query, so I guess the right goal
is that all runtime exceptions thrown while we're 'inside' the execute
method of a query end up in 'get' method of the query throwing?

Yes, I think that's perfectly reasonable. Adding a guard in Query.run()to catch runtime exceptions is a simple fix for a limited set of cases.The only downside I can think of is that it may give the caller animpression that all cases where the rm is not completed will be caught.Another remote possibility would be that a runtime exception could bethrown after the RM is passed down to some other service. Then whenthat service eventually completes, the rm.done() will have been calledtwice. Overall though I think the benefit is worth these two risks.

2. While a general repository for outstanding RM's might be a good idea,
what would you say if we start small, and implement the solution that would
remember outstanding RequestMonitor instances for some limited cases (for
example, 'target *-remote' commands in DSF-GDB, and then cancel those
RequestMonitors after a timeout? This is somewhat orthogonal from the
initial goal of catching lost RMs, but seems sufficiently related and
immediately useful.

This is something that could be done at the service level, where thesecommands are issued. If we expect that the back-end may never completesome commands then in those cases a timeout guard should even berequired. For example, we queue commands with the command control, butwe count on the process monitor to notify us if the back end processdies. At that point we complete any outstanding command in queue withan error. So our process monitor is the guard.

3. What do you think about changing the interface of RequestMonitor
along the lines I have originally suggested, so that the implementor
is forced to declare success or failure using return value?

While the sequence is designed to perform async operations in each step,in reality most steps return immediately. So your suggestion couldsimplify majority of those steps. I would suggest to introduced anextension: SyncStep, which would then include the syncExecute() or(xexecute) method.


Cheers,
Pawel

Thanks,

I think there's at least a couple things we could do to improve this:

*Report runtime exceptions thrown in DSF executor.*
Currently a quirk in the implementation of ScheduledThreadPoolExecutor
keeps us from catching these exceptions unless assertions are enabled.
If a brave souls steps up to do a clean-room reimplement this class we
could get around this bug.  It wouldn't need to be as efficient or as
complicated implementation because we only need a single thread.  Or
maybe someone more clever can think of a way to get around this problem
without reimplementing or degrading performance. (bug 236915)
*
Create a registry for outstanding request monitors.*
Most request monitors are passed around between functions, sometimes
directly as parameters to functions, and sometimes as parents to other
RMs.  There's only a few clients outside of the DSF framework that
collect request monitors and don't complete them immediately: caches and
things like the GDB command control.  We could catch the lost RMs if we
create a registry where RMs are added when created and removed when
collected by a service or another component.  So a few example flows
would be:

_Simple completion_
- An RM is created and added to the registry.
- A service calls RM.done() and the RM is removed from the registry.

_Waiting on async completion_
- An RM is created and added to the registry.
- A service sends a command to an asynchronous data provider (command
control)
- The service manually removes the RM from registry.

_RM cached, waiting on another RM to complete
_- An RM is created and added to the registry.
- A service finds that another RM is already waiting in cache for the
same result
- Cache removes the RM from registry.
_
Stacked RM waiting on async completion_
- An RM is created and added to the registry.
- Another method takes the RM and adds it as a parent to another RM.
- A service sends a command to async. data provider:
- Service removes RM from registry.
- RM removes its parent from registry too.

At the end of any sequence there should be no RMs in the registry.  So a
post-executor if a DSF executor could check for RMs in the registry and
log exception or even immediately halt execution.

We'd chance a lot of false positives at first and we may have some hard
to resolve edge cases, but with some work we'd  end up with a more
predictable system.  Also, for backward compatibility, this tracking
would need to be optional.

Cheers,
Pawel

On 10/16/2011 03:07 AM, Vladimir Prus wrote:

Hi,

presently, if any step of a Sequence fails to call 'done' or 'cancel',
the whole Sequence just hangs, and it's rather painful to debug.

Can we do something to improve this long-standing problem? For example:
      abstract public static class Step {

          ....

          public final void execute(RequestMonitor rm) { // note: final.
			
			xexecute(rm);
			if (!rm.isDone()) // isDone would have to be added
			{
			
				rm.setStatus(new Status(IStatus.ERROR, ..., "Misbehaving step",
				....));
				
	            rm.done();
			
			}
			
          }
		
		abstract protected void xececute(RequestMonitor rm);
		
      }

Or, we can take take this even further:
      abstract public static class Step {

          ....

          public final void execute(RequestMonitor rm) { // note: final.
			
			rm.setStatis(xexecute(rm))
			rm.done();
			
          }
		
		abstract protected IStatus xececute(RequestMonitor rm);
		
      }

In this case, the compiler will object if a step fails to return a value,
although there's a risk that a step might fail to handle failure of any
operation it does and still return IStatus.OK anyway.

Comments?

Follow-Ups:
- [cdt-dev] Timeouts for GDB/MI commands (Was: [DSF] Making Sequence more robust)
  - From: Vladimir Prus
- Re: [cdt-dev] [DSF] Making Sequence more robust
  - From: Marc Khouzam

References:
- [cdt-dev] [DSF] Making Sequence more robust
  - From: Vladimir Prus
- Re: [cdt-dev] [DSF] Making Sequence more robust
  - From: Pawel Piech
- Re: [cdt-dev] [DSF] Making Sequence more robust
  - From: Vladimir Prus

Prev by Date: Re: [cdt-dev] [DSF] Making Sequence more robust
Next by Date: Re: [cdt-dev] Fix for bug 307311
Previous by thread: Re: [cdt-dev] [DSF] Making Sequence more robust
Next by thread: Re: [cdt-dev] [DSF] Making Sequence more robust
Index(es):
- Date
- Thread

Breadcrumbs