Re: [jgit-dev] Improving ObjectIdSubclassMap performance
I took this as an opportunity to practice implementing Cuckoo
hashing[1]. There is a paper[2] describing several variations of the
basic data structure, so I wrote my ObjectIdSubclassMap implementation
in two slightly different ways. The first implementation uses four
different hash functions; the second uses only two hash functions but
four cells per bucket. With subclasses of ObjectId we already have
five really cheap hash codes (w1...w5), so let's take advantage of
that!
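To show the idea behind the second variant, here is a stripped-down
sketch (this is not my actual patch; Key merely stands in for an
ObjectId-like value carrying the five SHA-1 words, and the eviction
policy is the simplest one I could write down):

final class Key {
    final int w1, w2, w3, w4, w5; // the five 32-bit words of a SHA-1

    Key(int w1, int w2, int w3, int w4, int w5) {
        this.w1 = w1; this.w2 = w2; this.w3 = w3;
        this.w4 = w4; this.w5 = w5;
    }

    boolean sameAs(Key o) {
        return w1 == o.w1 && w2 == o.w2 && w3 == o.w3
                && w4 == o.w4 && w5 == o.w5;
    }
}

final class BucketizedCuckooMap {
    private static final int CELLS = 4;      // four cells per bucket
    private static final int MAX_KICKS = 64; // evictions before we give up and grow

    private Key[] table;    // CELLS consecutive slots form one bucket
    private int bucketMask; // bucket count is a power of two

    BucketizedCuckooMap(int buckets) {
        table = new Key[buckets * CELLS];
        bucketMask = buckets - 1;
    }

    Key get(Key key) {
        Key k = scan(key.w1, key);                // first hash function: w1
        return k != null ? k : scan(key.w2, key); // second hash function: w2
    }

    // Assumes the key is not already present in the map.
    void add(Key key) {
        Key cur = key;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            if (placeIn(cur.w1, cur) || placeIn(cur.w2, cur))
                return;
            // Both candidate buckets are full: evict an entry from the
            // first bucket, take its slot, and re-home the victim.
            int slot = (cur.w1 & bucketMask) * CELLS + (kick % CELLS);
            Key victim = table[slot];
            table[slot] = cur;
            cur = victim;
        }
        grow(cur); // table is too crowded; rebuild with twice the buckets
    }

    private Key scan(int h, Key key) {
        int base = (h & bucketMask) * CELLS;
        for (int i = 0; i < CELLS; i++) {
            Key k = table[base + i];
            if (k != null && k.sameAs(key))
                return k;
        }
        return null;
    }

    private boolean placeIn(int h, Key key) {
        int base = (h & bucketMask) * CELLS;
        for (int i = 0; i < CELLS; i++) {
            if (table[base + i] == null) {
                table[base + i] = key;
                return true;
            }
        }
        return false;
    }

    private void grow(Key pending) {
        Key[] old = table;
        table = new Key[old.length * 2];
        bucketMask = (bucketMask << 1) | 1;
        for (Key k : old)
            if (k != null)
                add(k);
        add(pending);
    }
}

The nice part is that both candidate buckets come straight from words
of the SHA-1, so the hash functions themselves are essentially free.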
In theory this should be faster. My results, however, are
disappointing. I compared the original map with my two new classes;
for a set of 1,500,000 objects I get the following numbers.
Insertion of 1,500,000 unique objects into the map:

ORIGINAL IMPLEMENTATION
iterations: 10
time: 3.188780 sec
waste: 2.796203

CUCKOO HASHING, FOUR HASH FUNCTIONS
iterations: 10
time: 7.552806 sec
waste: 1.398101
--------------------------------
speedup: 0.422198

CUCKOO HASHING, TWO HASH FUNCTIONS, FOUR CELLS PER BUCKET
iterations: 10
time: 3.894506 sec
waste: 1.398101
--------------------------------
speedup: 0.818789

Retrieval of 1,500,000 objects:

ORIGINAL IMPLEMENTATION
iterations: 100
time: 3.964279 sec

CUCKOO HASHING, FOUR HASH FUNCTIONS
iterations: 100
time: 13.294283 sec
--------------------------------
speedup: 0.298194

CUCKOO HASHING, TWO HASH FUNCTIONS, FOUR CELLS PER BUCKET
iterations: 100
time: 17.257221 sec
--------------------------------
speedup: 0.229717
My Cuckoo hash tables are slower than the original map. The one that
uses four hash functions inserts objects significantly slower, and
both maps retrieve objects significantly slower, which is very
surprising to me. My guess is that this is due to the randomized
memory access pattern. The good news, however, is that the new maps
waste less space: 1.4 array elements per object versus 2.8 for the
original map.
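(In other words, for 1,500,000 objects that works out to roughly a
4,194,304-slot backing array for the original map versus 2,097,152
slots for either Cuckoo variant.)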
Anybody interested in looking at my code?
[1] http://en.wikipedia.org/wiki/Cuckoo_hashing
[2] http://www.ru.is/faculty/ulfar/CuckooHash.pdf
On 9 March 2011 03:36, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> One of the profiling tools I have access to at $DAY_JOB is telling me
> the obj_hash table within ObjectIdSubclassMap is one of the major
> sources of garbage when enumerating all objects in the linux-2.6
> repository. This makes some sense as the obj_hash table needs to be
> about 4 million entries large to store the entire 1.8 million object
> list that PackWriter is discovering during the enumeration phase (as
> the table size is always a power of 2, doubles on each expansion, and
> expands when it's at 50% capacity). As a source of garbage, the GC is
> spending
> an inordinate amount of time trying to allocate the next largest table
> size during expansions.
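> (Concretely: 1.8 million objects exceeds half of a 2,097,152 entry
> table, so it doubles one last time to 4,194,304 entries, which is the
> ~4 million figure above.)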
>
> Unfortunately I can't do better. So I'm looking for ideas.
>
> My extendible hashing patch that I abandoned today[1] was slower by
> several seconds (~37s current code, ~42s extendible hashing) despite
> having the property that it should never generate garbage. The extra
> array lookups and branches during get() and add() completely
> overshadowed the GC savings.
>
> I tried using plain java.util.HashMap just to compare, and our current
> code is faster (~37s current code, ~39s HashMap). So moving to HashMap
> isn't a solution. :-)
>
> I tried implementing linear hashing[2], but the table has a bug that
> is causing some objects to be lost during a resize. Even if I fix that
> (it's only 6%) I don't think it's going to beat the current code; it's
> too complex and uses a unique ArrayList inside each hash bucket. I
> don't think it's useful to try to fix the resize bug; removing the 6%
> duplicates may not save enough work to drop the time, and the overall
> memory footprint is quite a bit bigger.
>
> I hardcoded ObjectIdSubclassMap to initially allocate its obj_hash at
> 3,600,000 entries, 2x the final size, thus ensuring it never grows.
> This shaved about 500 milliseconds off the running time, but it's
> difficult to get an estimate for the table size in advance. If
> PackWriter is doing a full repacking, we can guess close by summing up
> the object counts from the PackFile instances of the source
> Repository, but this doesn't correctly include the loose objects. I'm
> not sure saving 1.3% of the running time is worth the code complexity
> to define APIs exposing an estimated object count from ObjectDatabase
> up to PackWriter. Getting that count from the PackFiles may require
> 500 milliseconds itself if there are more than a handful of pack
> files.
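> For illustration, the shape of that API would be roughly the following
> (the estimate method and the capacity-taking constructor are both
> hypothetical, neither exists today, and the variable names are made up):
>
>     // Inside PackWriter: ask the object database for a rough count
>     // (e.g. summed from the pack indexes, so loose objects are not
>     // included) and pre-size the map at 2x so it never has to grow.
>     long estimate = db.estimateObjectCount();            // hypothetical
>     int cap = (int) Math.min(Integer.MAX_VALUE, 2L * estimate);
>     ObjectIdSubclassMap<ObjectToPack> objects =
>         new ObjectIdSubclassMap<ObjectToPack>(cap);      // hypothetical ctor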
>
> So I think I've hit a wall here; PackWriter is about as fast as it's
> going to get anytime soon. But I'd still like to improve that counting
> phase. :-(
>
>
> Profiling shows our real hot spots are ~30% inflate() and ~30%
> ObjectIdSubclassMap (the other 40% is in tiny pockets that aren't
> worth major mention). Dropping inflate() would require switching the
> pack format to store commit tree/parent pointers and trees
> uncompressed. This isn't going to happen anytime soon. Hence my
> attempt to speed up the other 30%.
>
>
> [1] http://egit.eclipse.org/r/2670
> [2] http://citeseer.ist.psu.edu/griswold93design.html
>
> --
> Shawn.