I asked Markus about the flattening some time ago
because it has caused problems more than once.
The general reason, which I think makes reasonable
sense, is to consider how many internet round trips
are needed for loading. For any repo, p2 must
access the p2.index first. Then it knows to try the
composite content/artifact jar/xml first, each of which
it must then load. Then it must load the one child repo
of the composite, again for both content and
artifacts. So it's double the number of round trips.
And p2 itself doesn't cache the p2.index, so it actually
loads each one twice for this pattern of a colocated
composite with a single colocated child. (Oomph's
transport layer caches the p2.index.)
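To make the round-trip arithmetic concrete, here is a rough
sketch of the requests involved. The exact file names (and
whether the jar or xml variant is fetched) vary by repository,
so treat this as an illustration rather than an exact trace.

  Composite with a single colocated child, per load:

      GET <repo>/p2.index
      GET <repo>/compositeContent.jar
      GET <repo>/compositeArtifacts.jar
      GET <child>/p2.index
      GET <child>/content.jar
      GET <child>/artifacts.jar

  Flattened simple repository, per load:

      GET <repo>/p2.index
      GET <repo>/content.jar
      GET <repo>/artifacts.jar

Hence the flattening: one URI, roughly half the requests, and no
second p2.index to fetch.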
Ed, would you be interested in providing a p2.index
cache at the p2 level?
It's actually a bit of a tricky issue. The file is very tiny and
we do caching primarily for offline support and to avoid accessing
the resource at all. I don't think a HEAD request will be
significantly faster than a GET request, so there's not much time
to be saved from caching if it's going to be used in the same way
as it's currently used by p2. We generate a p2.index in the cache
even if one is not present in the repository to avoid trying all
the possible combinations the next time the repo is loaded. In
the Oomph layer, we avoid doing anything with the remote p2.index
(neither a HEAD nor a GET request) once it's present in the cache. This
works well in general as long as the p2 repo doesn't change its
"type". If the type changes, we must try all the combinations and
then record it in the cached p2.index for the next time. This too
works well, as long as the server doesn't tell us that a deleted
file exists...
This is all quite different from what "bare" p2 does, but it
works very well to avoid additional delays in accessing the
p2.index at all, let alone multiple times.
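For reference, the p2.index is just a tiny Java properties file.
For a simple repository it typically looks roughly like this
(a composite lists compositeContent.xml and compositeArtifacts.xml
instead):

      version = 1
      metadata.repository.factory.order = content.xml,\!
      artifact.repository.factory.order = artifacts.xml,\!

The trailing \! tells p2 to stop guessing other formats once the
listed file has been tried. This is essentially what the Oomph
cache records even when the server has no p2.index at all, so the
next load can skip the trial-and-error entirely.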
Much of what we added in Oomph could be pushed into p2. (Lots of
mirror improvements as well.) All that would be a very good
thing. But I checked my budget just now and it says there are
zero dollars for this, and my list of zero-dollar work items is
enough for more than a full-time job.
So yes, I'm interested, but it's not sustainable.
Why doesn't this same argument apply to the release
train composite? Because it composes the one simple
release repository and the EPP repository itself, i.e.,
it's always a composite with at least two children.
The catalog generator optimizes this to reference the
one simple child of the train, and the one simple EPP
repo. Even if the EPP repo were a composite with a
simple child on the release day, the generator could
still optimize this to refer to its one simple child.
Even the train composite could be optimized to refer to
the one simple child of the EPP composite.
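For illustration only (the name and child locations here are
placeholders, not the real URLs), the train composite's
compositeContent.xml would look something like this, with a
matching compositeArtifacts.xml listing the same children:

      <?xml version='1.0' encoding='UTF-8'?>
      <?compositeMetadataRepository version='1.0.0'?>
      <repository name='Release Train'
          type='org.eclipse.equinox.internal.p2.metadata.repository.CompositeMetadataRepository'
          version='1.0.0'>
        <children size='2'>
          <!-- the one simple child of the release train -->
          <child location='https://example.org/releases/simple'/>
          <!-- the EPP package repository -->
          <child location='https://example.org/epp/packages'/>
        </children>
      </repository>

Because there are always at least two children here, the composite
itself carries real information and can't simply be flattened
away; the catalog generator instead points directly at the simple
children.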
So I don't really care so much which approach is taken,
but that's the reasoning behind it, i.e., it's optimal
for users updating their IDE via a single URI.
The CI job that generates everything makes that
composite - but as per the checklist, it isn't used.
I'm a bit concerned that neither looks right now as
I would expect based on the patterns of the past.
I think everything is in place. I still have to
satisfy myself about the make-visible procedures
and ensure the mirrors are "loaded". This part of
the checklist has not been automated yet for EPP.
What I don't get - but so far I am "just doing
it" - is why the repo is flattened. Looking at
past releases (before quarterly releases), it was
done that way too. I don't know why EPP is not
handled the same way as the main downloads.