Hey,
It seems that copying .gitignore files from /src/ to /bin/ directories was the cause of the recalculation: it forced a full re-index even if the directory where the .gitignore resides (/bin) is ignored. We don’t have autocrlf anywhere (phew... ;)). The rest of the long-lasting indexing seems to be caused by this:
1) The delta visitor gathers deltas, even for ignored files.
2) If there are too many deltas (regardless of whether they are ignored or not), a full re-index is triggered, which is bad for large workspaces... :(
3) If, by chance, there are not too many deltas, the update job is triggered and discovers that there is nothing to do. It does seem to have a base overhead of ~200-300 milliseconds. Since we have hundreds of such update jobs all the time during builds, it seems easier on the CPU to catch ignored files early (see https://git.eclipse.org/r/#/c/37880/ ).
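To make the three points above concrete, here is a rough sketch of the idea of dropping ignored paths before counting deltas. It is not the EGit code and not the content of the Gerrit change; the class name, the `Action` enum, the `MAX_DELTAS` value and the `isIgnored` predicate are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class DeltaFilterSketch {
    /** Possible outcomes after looking at one batch of resource changes. */
    enum Action { NOTHING, INCREMENTAL_UPDATE, FULL_REINDEX }

    // Hypothetical limit; the real threshold is not given in this thread.
    static final int MAX_DELTAS = 3;

    /**
     * Decides what to do for a batch of changed paths. Ignored paths are
     * dropped first, so they neither count towards the limit nor cause
     * an update job to be scheduled.
     */
    static Action decide(List<String> changedPaths, Predicate<String> isIgnored) {
        List<String> relevant = new ArrayList<>();
        for (String path : changedPaths) {
            if (!isIgnored.test(path)) {
                relevant.add(path);
            }
        }
        if (relevant.isEmpty()) {
            return Action.NOTHING; // no job, so no per-job base overhead
        }
        if (relevant.size() > MAX_DELTAS) {
            return Action.FULL_REINDEX;
        }
        return Action.INCREMENTAL_UPDATE;
    }

    public static void main(String[] args) {
        // Assumption for the demo: everything below bin/ is ignored.
        Predicate<String> ignored = p -> p.startsWith("bin/");
        System.out.println(decide(List.of("bin/A.class", "bin/.gitignore"), ignored));
        System.out.println(decide(List.of("src/A.java", "bin/A.class"), ignored));
    }
}
```

With a filter like this, a build that only writes into an ignored /bin would neither hit the full re-index limit nor schedule an update job that finds nothing to do.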
Cheers,
Markus
From: Halstrick, Christian [mailto:christian.halstrick@xxxxxxx]
Sent: Wednesday, December 10, 2014 09:35
To: Duft Markus; EGit developer discussion (egit-dev@xxxxxxxxxxx)
Subject: RE: re-indexing running for nothing
I guess you have core.autocrlf set to true, right? If you have set it, could you do me a favor and test again with core.autocrlf temporarily set to false for all repos opened in Eclipse? I would:
- Stop Eclipse
- Set the property core.autocrlf=false in all repos being opened in Eclipse
- Go to the root of each repo and run “git status”
The reason I ask: ideally it shouldn’t matter too much how often we reindex. If the index is set up correctly and has only clean (unsmudged) entries, we should never compute SHA1s during a reindex. EGit/JGit should detect from the index and its last-modification-time info that all files are clean. We compute SHA1s only if the last-modified time of a file tells us it was really modified (unlikely in your case) or if an index entry is smudged (its length was intentionally set to 0 to mark that this file can be racily clean; we can’t trust lastmodified+length and have to recompute the SHA1). And what I know is that when autocrlf is on, we always modify files in a checkout just before writing a new index, leading to a lot of smudged entries.
All this is only about finding the reason why we compute SHA1s of versioned files. It does not explain why we compute SHA1s on non-versioned/ignored files.
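For illustration, the clean/smudged decision described above could be sketched roughly like this. It is a simplified stand-in, not JGit's actual code: `IndexEntry`, `smudgeIfRacy` and `needsHash` are hypothetical names, and the rule that an entry is smudged when its timestamp is not older than the index write time is my assumption based on the usual "racy git" handling. A genuinely empty file also has length 0, which this sketch ignores.

```java
public class RacyCleanSketch {
    /** Simplified stand-in for an index entry (not JGit's DirCacheEntry). */
    static final class IndexEntry {
        final long length;
        final long lastModified;

        IndexEntry(long length, long lastModified) {
            this.length = length;
            this.lastModified = lastModified;
        }

        /** A length of 0 is used here as the "smudged" marker. */
        boolean isSmudged() {
            return length == 0;
        }
    }

    /**
     * Assumed smudging rule: when the index is written, an entry whose file
     * timestamp is not older than the index write time may be racily clean,
     * so its length is reset to 0.
     */
    static IndexEntry smudgeIfRacy(IndexEntry entry, long indexWriteTime) {
        if (entry.lastModified >= indexWriteTime) {
            return new IndexEntry(0, entry.lastModified);
        }
        return entry;
    }

    /**
     * Hashing is needed only if the entry is smudged or the file's
     * timestamp differs from the one recorded in the index.
     */
    static boolean needsHash(IndexEntry entry, long fileLastModified) {
        return entry.isSmudged() || entry.lastModified != fileLastModified;
    }

    public static void main(String[] args) {
        IndexEntry clean = new IndexEntry(10, 100);
        System.out.println(needsHash(clean, 100)); // unchanged file
        System.out.println(needsHash(clean, 200)); // file touched later
        System.out.println(needsHash(smudgeIfRacy(clean, 100), 100)); // racy entry
    }
}
```

If many entries end up smudged on every checkout, as suspected with autocrlf, then every reindex falls into the hashing branch for those files even though nothing changed.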
Ciao
Chris