On 12/02/2021 17:11, Greg Wilkins wrote:
> So this gives us two problems. Firstly I now cannot find any current
> specification of HTTP URL path parameters and how they should be
> parsed. So I think the servlet spec could be relying on the
> grandfathered definition from RFC2396. Is this true? Does anybody know
> of a current specification for them?
I'm not aware of anything better than the references you have provided.
Then I think we should note in future specs that we are relying on an obsoleted RFC to define what path parameters are. I'll ask contacts in the IETF whether there is any process to get path parameters formalized there.
> In Jetty, we already handle ambiguous URIs like /foo/%2e%2e/bar with a
> 400 and intend to do the same with /foo/..;/bar and /foo/%2f/bar. How
> do the other containers handle such URIs?
In Tomcat we process URIs with the following sequence:
- separate the query string (if any)
- parse the path parameters (we are only interested in jsessionid or
equivalent) and remove them from the URI
- %nn decode the URI
- normalize the URI
- map the URI to virtual host, web app, filters, security constraints,
servlet etc.
That is also how Jetty did it until recently: specifically, decode and then normalize.
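The decode-then-normalize sequence described above could be sketched roughly as follows. This is only an illustration of the ordering, not Tomcat's or Jetty's actual code: the class and method names are hypothetical, the raw path is assumed to be ASCII, and trailing slashes and empty segments are deliberately simplified away.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of "strip query, strip params, decode, normalize".
final class DecodeThenNormalize {

    // Step 1: drop the query string, if any.
    static String stripQuery(String target) {
        int q = target.indexOf('?');
        return q < 0 ? target : target.substring(0, q);
    }

    // Step 2: remove ";name=value" path parameters from every segment.
    static String stripPathParams(String path) {
        StringBuilder out = new StringBuilder();
        boolean skipping = false;
        for (char c : path.toCharArray()) {
            if (c == '/') skipping = false;      // a new segment starts
            else if (c == ';') skipping = true;  // parameters run to the next '/'
            if (!skipping) out.append(c);
        }
        return out.toString();
    }

    // Step 3: %nn decode (assumes an ASCII raw path, decodes bytes as UTF-8).
    static String percentDecode(String path) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c > 127) throw new IllegalArgumentException("non-ASCII character in path");
            if (c != '%') { bytes.write(c); continue; }
            if (i + 2 >= path.length()) throw new IllegalArgumentException("truncated escape");
            int hi = Character.digit(path.charAt(i + 1), 16);
            int lo = Character.digit(path.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) throw new IllegalArgumentException("bad escape");
            bytes.write((hi << 4) | lo);
            i += 2;
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    // Step 4: resolve "." and ".." segments; empty segments are dropped here.
    static String normalize(String path) {
        Deque<String> kept = new ArrayDeque<>();
        for (String segment : path.split("/", -1)) {
            if (segment.isEmpty() || segment.equals(".")) continue;
            if (segment.equals("..")) {
                if (kept.pollLast() == null) {
                    throw new IllegalArgumentException("path escapes the root");
                }
            } else {
                kept.addLast(segment);
            }
        }
        return "/" + String.join("/", kept);
    }

    static String process(String target) {
        return normalize(percentDecode(stripPathParams(stripQuery(target))));
    }

    public static void main(String[] args) {
        for (String t : new String[] {"/foo;v=1/bar?x=1", "/foo/%2e%2e/bar", "/foo/..;/bar"}) {
            System.out.println(t + " -> " + process(t));
        }
    }
}
```

With this ordering, "/foo/%2e%2e/bar" and "/foo/..;/bar" both end up as "/bar", because by the time normalization runs nothing distinguishes them from a literal "/foo/../bar".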
However, that is not compliant with RFC 3986, which says that normalization should happen before decoding. That is fine if you remember the segment boundaries, so that segments like "%2e%2e", "%2f" and "..;" pass through normalization untouched and are only afterwards seen as the decoded segments "..", "/" and "..". But we don't remember segment boundaries; we put the segments back into a single string, so following the RFC order of normalize and then decode throws that information away and leaves those segments ambiguous.
Our plan is now to follow the RFC and 400 any requests that contain such segments, but to offer a compliance mode for apps that want to work on the raw URI.
I don't think that is a huge difference from what Tomcat is doing, other than for a URI like "/foo/%2e%2e/bar". Jetty will 400 such requests unless it is in the compliance mode, in which case the path "/foo/../bar" will be passed to the application (which will have to have its big boy pants on and look at the raw URI). I think Tomcat will handle that as "/bar"? Jetty used to do that, but we had problems with intermediaries putting constraints on "/foo/*".
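One way to picture the "reject ambiguous segments" idea is a check on the raw path before any decoding. The sketch below is an assumption-laden illustration, not Jetty's implementation: the class and method names are made up, and it only flags the three cases mentioned in this thread (an encoded "." or "..", a "." or ".." carrying path parameters, and an encoded "/").

```java
import java.util.Locale;

// Hypothetical check for segments that become ambiguous once decoded.
final class AmbiguousSegments {

    // True if any raw segment would warrant a 400 under this sketch.
    static boolean isAmbiguous(String rawPath) {
        for (String segment : rawPath.split("/", -1)) {
            if (isAmbiguousSegment(segment)) return true;
        }
        return false;
    }

    static boolean isAmbiguousSegment(String segment) {
        int semi = segment.indexOf(';');
        boolean hasParams = semi >= 0;
        // Only the part before any path parameters is inspected.
        String name = (hasParams ? segment.substring(0, semi) : segment)
                .toLowerCase(Locale.ROOT);

        // An encoded separator would create a new boundary after decoding.
        if (name.contains("%2f")) return true;

        // Decode only the dots to see whether this is a disguised "." or "..".
        String dotsDecoded = name.replace("%2e", ".");
        boolean isDotSegment = dotsDecoded.equals(".") || dotsDecoded.equals("..");
        boolean wasEncoded = !dotsDecoded.equals(name);

        // A literal "." or ".." is unambiguous; an encoded or parameterised one is not.
        return isDotSegment && (wasEncoded || hasParams);
    }

    public static void main(String[] args) {
        String[] samples = {"/foo/%2e%2e/bar", "/foo/..;/bar", "/foo/%2f/bar", "/foo;v=1/bar"};
        for (String s : samples) {
            System.out.println(s + " ambiguous=" + isAmbiguous(s));
        }
    }
}
```

Under these assumptions "/foo;v=1/bar" and a literal "/foo/../bar" are not flagged, while the three problem URIs from this thread are.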
Given all of the above when receiving a request of the form
"/foo;v=1/bar" there were several ways to handle it:
No disagreement on the handling of that one. The spec is clear that we strip path params and treat it as "/foo/bar". It is only when the decoded segment is ".", ".." or includes "/" that I think there is a problem.
The downside with approach 3 is that when used behind a reverse proxy,
and depending on how the reverse proxy does the request mapping, it may
be possible to bypass path based security constraints defined in the
reverse proxy. If the reverse proxy is using path matching as a security
control, URIs containing "/..;/" may bypass that control. The Tomcat
provided proxy modules (mod_jk, ISAPI redirector) map requests using the
same algorithm as Tomcat to avoid this potential issue.
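The bypass described above can be illustrated with a toy pair of matchers. Everything here is hypothetical: the "/admin/" rule, the class name, and both methods are invented for the example, and the backend simply strips path parameters and then resolves dot segments.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical proxy and backend that disagree about what a path means.
final class ProxyBypassDemo {

    // Toy proxy rule: block anything whose raw path starts with /admin/.
    static boolean proxyBlocks(String rawPath) {
        return rawPath.startsWith("/admin/");
    }

    // Toy backend: strip path parameters, then resolve "." and ".." segments.
    static String backendPath(String rawPath) {
        Deque<String> kept = new ArrayDeque<>();
        for (String segment : rawPath.split("/")) {
            int semi = segment.indexOf(';');
            String name = semi < 0 ? segment : segment.substring(0, semi);
            if (name.isEmpty() || name.equals(".")) continue;
            if (name.equals("..")) {
                kept.pollLast();            // step up one level, if possible
            } else {
                kept.addLast(name);
            }
        }
        return "/" + String.join("/", kept);
    }

    public static void main(String[] args) {
        String raw = "/public/..;/admin/secret";
        System.out.println("proxy blocks: " + proxyBlocks(raw));
        System.out.println("backend sees: " + backendPath(raw));
    }
}
```

The raw path does not match the proxy's prefix rule, yet the backend resolves it to "/admin/secret", which is why both sides need to apply the same algorithm.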
Using the same algorithm is key to avoiding security problems. Given that the RFCs are very poor in this area, perhaps the servlet spec needs to specify the algorithm we expect to be applied?
cheers