Re: [egit-dev] re-indexing running for nothing

From: Duft Markus <Markus.Duft@xxxxxxxxxxxxxxxx>
Date: Wed, 10 Dec 2014 13:26:18 +0000

Hey,

It seems that copying .gitignore files from /src/ to /bin/ directories was the cause of the recalculation: it forced a full re-index even though the directory containing the copied .gitignore (/bin) is itself ignored. We don’t have autocrlf set anywhere (phew... ;)). The rest of the long-running indexing seems to be caused by this:

1) The delta visitor gathers deltas, even for ignored files.

2) If there are too many deltas (regardless of whether they are ignored or not), a full re-index is triggered, which is bad for large workspaces... :(

3) If, by chance, there are not too many deltas, the update job is triggered and discovers that there is nothing to do. It still seems to have a base overhead of ~200-300 milliseconds. Since we have hundreds of such update jobs all the time during builds, it seems to be easier on the CPU to catch ignored files early (see https://git.eclipse.org/r/#/c/37880/ ).
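To illustrate the idea of catching ignored files early, here is a minimal sketch in plain Java. It is not the EGit code and not the change in the Gerrit link above; the class name, the `decide` method, the `Action` enum, and the threshold parameter are all hypothetical, and the "ignored" check is just a predicate supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: drop ignored paths before deciding how to refresh.
public class DeltaFilterSketch {

    enum Action { NOTHING, INCREMENTAL_UPDATE, FULL_REINDEX }

    // changedPaths: paths reported by a resource delta (assumed input)
    // isIgnored:    caller-supplied check, e.g. "is under /bin" (assumption)
    // maxDeltas:    hypothetical threshold above which a full re-index is done
    static Action decide(List<String> changedPaths,
                         Predicate<String> isIgnored,
                         int maxDeltas) {
        List<String> relevant = new ArrayList<>();
        for (String path : changedPaths) {
            // Filter here, so ignored files never count against the threshold
            // and never cause an update job to be scheduled.
            if (!isIgnored.test(path)) {
                relevant.add(path);
            }
        }
        if (relevant.isEmpty()) {
            return Action.NOTHING;
        }
        if (relevant.size() > maxDeltas) {
            return Action.FULL_REINDEX;
        }
        return Action.INCREMENTAL_UPDATE;
    }

    public static void main(String[] args) {
        Predicate<String> underBin = p -> p.startsWith("bin/");
        List<String> buildOutput = Arrays.asList("bin/A.class", "bin/.gitignore");
        System.out.println(decide(buildOutput, underBin, 1));
    }
}
```

With the filter in place, a build that only touches ignored output results in no work at all, instead of either a full re-index (point 2) or an update job that finds nothing to do (point 3).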

Cheers,

Markus

From: Halstrick, Christian [mailto:christian.halstrick@xxxxxxx]
Sent: Wednesday, 10 December 2014 09:35
To: Duft Markus; EGit developer discussion (egit-dev@xxxxxxxxxxx)
Subject: RE: re-indexing running for nothing

I guess you need core.autocrlf to be true, don't you? If you have it set, could you do me a favor and test again with core.autocrlf temporarily set to false for all repos opened in Eclipse? I would:

- Stop Eclipse

- Set the property core.autocrlf=false in all repos opened in Eclipse

- Go to the root of each repo and run “git status”

The reason why I ask: Ideally it shouldn’t matter too much how often we reindex. If the index is set up correctly and has only clean (unsmudged) entries, we should never compute SHA1s during a reindex. EGit/JGit should detect from the index and its last-modification-time info that all files are clean. We compute the SHA1s only if the last-modification time of a file tells us it was really modified (unlikely in your case) or if an index entry is smudged (its length was intentionally set to 0 to mark that this file can be racily clean; we can’t trust lastmodified+length and have to recompute the SHA1). And what I know is that when autocrlf is on, we always modify files in a checkout just before writing a new index, leading to a lot of smudged entries.
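The decision described above can be sketched in a few lines of plain Java. This is a simplified illustration only, not JGit's implementation: `IndexEntry`, `isSmudged`, and `needsHash` are hypothetical names, and the sketch treats any recorded length of 0 as smudged, which also catches genuinely empty files (hashing those is cheap).

```java
// Hypothetical sketch of "when do we have to recompute the SHA1?".
public class CleanCheckSketch {

    // Minimal stand-in for what the index records about one file.
    static final class IndexEntry {
        final long length;        // recorded file length in bytes
        final long lastModified;  // recorded modification time

        IndexEntry(long length, long lastModified) {
            this.length = length;
            this.lastModified = lastModified;
        }

        // A smudged entry has its length deliberately set to 0, meaning
        // the recorded stat data cannot be trusted (racily clean).
        boolean isSmudged() {
            return length == 0;
        }
    }

    // Returns true if the file content must be hashed to decide whether
    // it differs from the index; false if the entry can be trusted as clean.
    static boolean needsHash(IndexEntry entry, long fileLastModified) {
        if (entry.isSmudged()) {
            return true;
        }
        return entry.lastModified != fileLastModified;
    }

    public static void main(String[] args) {
        IndexEntry clean = new IndexEntry(120, 1000L);
        IndexEntry smudged = new IndexEntry(0, 1000L);
        System.out.println(needsHash(clean, 1000L));
        System.out.println(needsHash(smudged, 1000L));
    }
}
```

Under this model, a reindex over an index with only unsmudged entries and unchanged timestamps never hashes anything, while an index full of smudged entries forces a hash for every one of them.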

All this is only for finding the reason why we compute SHA1s of versioned files. It doesn’t explain why we compute SHA1s on non-versioned/ignored files.

Ciao

Chris
