Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[jetty-dev] Jetty-7 stress test failures.


All,

We've been having some intermittent failures in the jetty-7 stress tests that
have been holding us up from doing an M3 release.

But we've not been able to ping anything down and I'm not even convinced
the problem is not in the test harnesses themselves.

However, if the problem is in the server code, then it has all the hall
marks of a race condition, so I would not mind if a few more people looking
over the code with an eye to how we have synchronized it (or not).


The key holder of state is the HttpConnection class with it's contained
HttpGenerator and HttpParser classes.

These classes are not synchronized on the basis that only a single
thread should be accessing them at any given time.   This is controlled
by the dispatching of threads performed by the SelectChannelEndPoint,
which is a well synchronized class.

Specifically the SelectChannelEndPoint._dispatched field is protected
and a thread is dispatched to handle the connection only if _dispatched
is false.  The act of _dispatching the thread set's _dispatched to true.

_dispatched is only set to false by a call to undispatched() which is
only called in a finally block of the run method of the dispatched
thread (which calls _connection.handle()).

So with this synchronization, I am convinced that only a single thread
will be in the handle method at once, and there is a definite happens-after
relationship between one thread dispatch and the next, so memory barriers
should be ok.


Now the only slight exception to all this are the close/reset methods that
may be called when shutting down a server or if the connection unexpectedly
closed.     Thus the methods HttpParser.reset(boolean) and
AbstractGenerator.reset(boolean) have synchronized blocks to make sure that
only a single reset is called at once, and thus buffers etc. are only
returned to pools once.     But mmmm   I don't think this is good
logic, because that will protect setting the buffer fields to null, but
they might be set to non null out side of a synchronized by another
thread.  So this logic needs to be improved - but I think the only risk
that some buffers might not be returned to the pool - which is no big
deal.

I think there is also a minor issue that a dispatched thread might not
see the reset/close of connection (because of lack of synchronization)
and thus it could see a strange state - and throw strange exceptions.
This does indeed occasionally occur, but it is only for a close/reset
connection - so this should not be fatal - just annoying.  Or at least
I'm trying to convince myself of that.

So currently I don't think the reset issue is anything fundamentally
wrong, but I am thinking that perhaps it should be cleaned up so
that there is no close race.  Perhaps all close/resets need to be
dispatched to the connection.

But all this pre-supposes that there are no other threads accessing
the connection other than via handle() and reset().  This looks to
be true, but needs to be verified.

Perhaps the best thing to do is to fix the reset issue and then
there are no excuses for the occasional strange state.

But if anybody wants to caste their eye over the code with
these kinds of issues in mind - feedback much appreciated.


cheers

























Back to the top