Hello,
This is a re-send of an email I sent to jgit-dev on 6/19. I don't see my message in the archives, and I sent it immediately after subscribing, so I thought maybe it was rejected or otherwise lost.
We are using JGit primarily via Gerrit Code Review v2.14.1. I'm emailing this list as prior emails to the Gerrit list have pointed out that this is a JGit question.
Since upgrading to Gerrit 2.14.0, we've noticed poor push performance for tags and commits, especially later in the workday. Push times often exceed 1 minute - I've personally seen as high as 6m30s - which is a significant regression for us. We initially suspected the recently introduced autogc capability, so we disabled it by setting receive.autogc to false in all of our repos, but push performance is still very bad with it disabled.
One thing I have noticed is that every new patchset or tag pushed to Gerrit causes a new pack file to be created. In the case of a tag pushed after a patchset, a new pack file with the patchset content + the tag object is created, i.e. we have one pack with the commit, tree, and blob objects, plus another pack with all of that plus a tag object. I'm not sure if this is new or if this is how JGit/Gerrit have always worked.
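The overlap between packs described above could be checked with something like the following. This is only a rough sketch, not anything taken from JGit or Gerrit: it assumes SHA-1 object IDs and version-2 pack index files, and the function names (read_idx_object_ids, duplicated_objects) are made up for illustration.

```python
import struct
from collections import Counter
from pathlib import Path

IDX_MAGIC = b"\xfftOc"  # signature at the start of a version-2 pack index


def read_idx_object_ids(idx_path):
    """Return the object IDs (hex) listed in a version-2 .idx file.

    Assumes 20-byte SHA-1 object IDs.
    """
    data = Path(idx_path).read_bytes()
    if data[:4] != IDX_MAGIC or struct.unpack(">I", data[4:8])[0] != 2:
        raise ValueError(f"{idx_path}: not a version-2 pack index")
    # A 256-entry fanout table follows the 8-byte header; its last
    # entry is the total number of objects in the pack.
    names_start = 8 + 256 * 4
    count = struct.unpack(">I", data[names_start - 4:names_start])[0]
    return [
        data[names_start + i * 20:names_start + (i + 1) * 20].hex()
        for i in range(count)
    ]


def duplicated_objects(objects_dir):
    """Map object ID -> number of packs holding it, for IDs in 2+ packs."""
    seen = Counter()
    for idx in Path(objects_dir, "pack").glob("pack-*.idx"):
        seen.update(set(read_idx_object_ids(idx)))
    return {oid: n for oid, n in seen.items() if n > 1}
```

Running duplicated_objects() against the objects directory of an affected repo (the path depends on the Gerrit site layout) would show how many objects are stored in more than one pack.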
Having the additional pack files doesn't seem to affect small repos where there is not a lot of change during the day. For one of our larger, busier repos, however, we end up with hundreds of pack files by the end of the workday. We do a single GC run at about 10pm (GC on this repo takes ~45 minutes since Gerrit 2.14), which reduces the number of pack files to 2, but then dozens of nightly builds (each of which is tagged) plus whatever developers push during the day bring the total back up into the hundreds by the next evening. This affects our developers when they try to push, and also slows down any automated push activity (e.g. build tagging).
Regular git doesn't seem to create these additional pack files; instead, it writes new loose objects, which gc then gathers into a pack once a loose object threshold is crossed.
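The pack file and loose object counts mentioned above could be sampled over the course of a day with something like this. It is a minimal sketch that assumes the standard on-disk layout (an objects/pack/ directory plus two-character loose object directories) and SHA-1 object names; object_store_stats is a made-up helper name, not part of git, JGit, or Gerrit.

```python
import re
from pathlib import Path

LOOSE_DIR = re.compile(r"[0-9a-f]{2}")    # e.g. objects/ab/
LOOSE_FILE = re.compile(r"[0-9a-f]{38}")  # remaining 38 hex digits of a SHA-1


def object_store_stats(objects_dir):
    """Count pack files, their total size, and loose objects in a repo."""
    objects_dir = Path(objects_dir)
    packs = list((objects_dir / "pack").glob("pack-*.pack"))
    loose = sum(
        1
        for d in objects_dir.iterdir()
        if d.is_dir() and LOOSE_DIR.fullmatch(d.name)
        for f in d.iterdir()
        # Skip temporary files and anything else that is not an object.
        if f.is_file() and LOOSE_FILE.fullmatch(f.name)
    )
    return {
        "pack_files": len(packs),
        "pack_bytes": sum(p.stat().st_size for p in packs),
        "loose_objects": loose,
    }
```

Recording these numbers alongside observed push times, for example once an hour, would show whether the slowdown tracks the growth in pack count.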
Is JGit (or Gerrit's use of JGit) supposed to be creating all these additional pack files? Would a high number of small pack files, some with redundant data, contribute to push performance issues? If so, is there any information I can put together which would help narrow down what the problem is?
Thanks,
-Will