Hi Jan,
Thanks for the response; I've updated my code accordingly. Thanks also for the insight on clients. I have a related question that may be at least partially explained by what you told me about clients being heavyweight.
In my application, I own both the client and the server. The server is a secure embedded Grizzly server, configured to "want" and "need" client authentication, and it uses a simple self-signed certificate. It does, however, use a dynamic X.509 trust store of my own design. The trust store is reasonably simple: it merely accepts modifications to an internal keystore, which it reapplies to a new trust manager before satisfying any TrustManager interface requests. I've used this technique in the past and it seems to work well.
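For context, the trust store is conceptually along these lines (a heavily simplified sketch, not my actual code; the class name is just illustrative):

    import java.security.KeyStore;
    import java.security.KeyStoreException;
    import java.security.NoSuchAlgorithmException;
    import java.security.cert.CertificateException;
    import java.security.cert.X509Certificate;
    import javax.net.ssl.TrustManager;
    import javax.net.ssl.TrustManagerFactory;
    import javax.net.ssl.X509TrustManager;

    // Simplified sketch: every TrustManager call delegates to a manager freshly
    // built from the current contents of a mutable internal KeyStore, so additions
    // and removals take effect on the next handshake.
    public class DynamicTrustManager implements X509TrustManager {

        private final KeyStore trustStore; // modified elsewhere as certs are added/removed

        public DynamicTrustManager(KeyStore trustStore) {
            this.trustStore = trustStore;
        }

        private X509TrustManager delegate() throws CertificateException {
            try {
                TrustManagerFactory tmf =
                        TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
                tmf.init(trustStore);
                for (TrustManager tm : tmf.getTrustManagers()) {
                    if (tm instanceof X509TrustManager) {
                        return (X509TrustManager) tm;
                    }
                }
            } catch (NoSuchAlgorithmException | KeyStoreException e) {
                throw new CertificateException(e);
            }
            throw new CertificateException("No X509TrustManager available");
        }

        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            delegate().checkClientTrusted(chain, authType);
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            delegate().checkServerTrusted(chain, authType);
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            try {
                return delegate().getAcceptedIssuers();
            } catch (CertificateException e) {
                return new X509Certificate[0];
            }
        }
    }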
The client is a Jersey 2.35 client (I can't upgrade to 3.x yet due to the Jakarta dependency issue). It's configured with an SSLContext that uses a client-side private key and certificate, plus a simple trust-all trust manager. (The application is more interested in using TLS to encrypt the pipe and to identify clients than in guarding against MITM attacks, so we're not really concerned about server spoofing.) I initially used the default HttpUrlConnector, but I was getting intermittent SSL handshake failures with "SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" errors.
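The client-side setup is roughly as follows (simplified; key-material loading is elided, and the helper method name is just for illustration):

    import java.security.KeyStore;
    import java.security.SecureRandom;
    import java.security.cert.X509Certificate;
    import javax.net.ssl.KeyManagerFactory;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.TrustManager;
    import javax.net.ssl.X509TrustManager;
    import javax.ws.rs.client.Client;
    import javax.ws.rs.client.ClientBuilder;

    // Simplified sketch: clientKeyStore holds the client's private key and cert chain.
    public static Client buildClient(KeyStore clientKeyStore, char[] keyPassword) throws Exception {
        KeyManagerFactory kmf =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(clientKeyStore, keyPassword);

        // Trust-all manager: we only need TLS for encryption and client identification,
        // so the client does not validate the server's (self-signed) certificate.
        X509TrustManager trustAll = new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        };

        SSLContext sslContext = SSLContext.getInstance("TLS");
        sslContext.init(kmf.getKeyManagers(), new TrustManager[] { trustAll }, new SecureRandom());

        // Default connector (HttpUrlConnector): this is the configuration that produced
        // the intermittent PKIX errors.
        return ClientBuilder.newBuilder()
                .sslContext(sslContext)
                .build();
    }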
Generally, in my experience, such errors either occur all the time or not at all. I've never heard of them being intermittent before.
I played with this configuration for a solid week, but every tweak I made seemed only to make the problem occur more often, rather than less. Finally, on a whim, I decided to try a different connector provider. I plugged in the ApacheConnectorProvider and, suddenly, all the SSL handshake failures disappeared! Weird! How could a different connector provider affect how the underlying JSSE component works?
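Plugging in the Apache connector amounted to this (again simplified; sslContext is the same context built above):

    import javax.net.ssl.SSLContext;
    import javax.ws.rs.client.Client;
    import javax.ws.rs.client.ClientBuilder;
    import org.glassfish.jersey.apache.connector.ApacheConnectorProvider;
    import org.glassfish.jersey.client.ClientConfig;

    // Simplified sketch: same SSLContext as before, but routed through the
    // Apache HttpClient-based connector instead of HttpUrlConnector.
    public static Client buildApacheClient(SSLContext sslContext) {
        ClientConfig config = new ClientConfig();
        config.connectorProvider(new ApacheConnectorProvider());

        return ClientBuilder.newBuilder()
                .withConfig(config)
                .sslContext(sslContext)
                .build();
    }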
Well, it seemed to work at first, but then I realized it wasn't working perfectly. The problem I was now having was that the Apache connector provider seemed to be limiting the number of connections I could make between my clients and servers.
My application establishes a separate Jersey client for each server it connects to. These clients are cached and reused for every message sent to the associated host. In my test setup, I have four hosts, each acting as both a client and a server, so each client component talks to all four hosts (including itself). The client retries some failures with an exponential backoff delay.
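The caching itself is nothing fancy; conceptually it's just this (names are illustrative):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.function.Function;
    import javax.ws.rs.client.Client;

    // Illustrative sketch of the per-host client cache: one long-lived Jersey Client
    // per destination host, created lazily and reused for every message to that host.
    public class ClientCache {

        private final ConcurrentMap<String, Client> clientsByHost = new ConcurrentHashMap<>();
        private final Function<String, Client> clientFactory; // builds a client as shown above

        public ClientCache(Function<String, Client> clientFactory) {
            this.clientFactory = clientFactory;
        }

        public Client clientFor(String host) {
            return clientsByHost.computeIfAbsent(host, clientFactory);
        }
    }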
The problem is that a dozen or so connections would be made, and then nothing would happen. I have a timeout reaper thread that monitors outstanding requests and times them out after 10 minutes. Ten minutes after the hang, all my threads come back with timeout errors, which tells me the Apache connector provider is simply freezing up and not sending the additional requests.
So, I thought, why not try the pooling connection manager? (I'm just complicating the situation now, I know.) Here's the strange part: when I plug the pooling connection manager in, I start getting the SSL handshake errors again!
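The pooling setup I tried looks roughly like this (simplified; the limits shown are illustrative, not my actual values):

    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
    import org.glassfish.jersey.apache.connector.ApacheClientProperties;
    import org.glassfish.jersey.apache.connector.ApacheConnectorProvider;
    import org.glassfish.jersey.client.ClientConfig;

    // Simplified sketch: hand the Apache connector a pooling connection manager
    // via the ApacheClientProperties.CONNECTION_MANAGER property.
    public static ClientConfig pooledApacheConfig() {
        PoolingHttpClientConnectionManager connectionManager =
                new PoolingHttpClientConnectionManager();
        connectionManager.setMaxTotal(40);           // illustrative limits only
        connectionManager.setDefaultMaxPerRoute(10);

        ClientConfig config = new ClientConfig();
        config.connectorProvider(new ApacheConnectorProvider());
        config.property(ApacheClientProperties.CONNECTION_MANAGER, connectionManager);
        return config;
    }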
If you (or anyone else in the community here) have any thoughts on what might be causing this strange behavior, I would very much appreciate your insights. I've been using Jersey and Grizzly for years now, and I can usually solve any issues on my own at this point, but this one has me totally stumped.
Regarding how this connects with the previous question: given what you said about clients being heavyweight and not really needing to be closed because of their long-lived nature, should I be using a single client to connect to all hosts? If using a few clients (one for each host) is OK, should I be using a single ApacheConnectorProvider instance for all clients? A single pooling connection manager?
Kindest regards,
John