Re: [geomesa-users] Sorting

Hi Marcel,

Our colleagues from GeoWave were down for a meeting on Monday, where we discussed the soon-to-be-announced benchmarking project.

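At the heart of the issue is that sorting Java objects is rather slow compared to sorting simple structs in C/C++.  It boils down to data locality in each language's memory model: an array of C structs is one contiguous block of memory, whereas a Java array of objects holds references to objects that may be scattered across the heap.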
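
To make the locality point a bit more concrete, here is a minimal sketch in plain Java.  The class and method names are made up for illustration and are not GeoMesa or GeoWave code.  It sorts the same values once as a primitive array, which the JVM stores contiguously, and once as an array of small objects, where every comparison has to follow two references.  It makes no claim about actual timings.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of GeoMesa or GeoWave.
final class LocalitySketch {

    // A small heap-allocated object standing in for a feature.
    static final class Point {
        final double distance;
        Point(double distance) { this.distance = distance; }
    }

    // Sorts values that live side by side in one primitive array.
    static double[] sortPrimitives(double[] distances) {
        double[] copy = distances.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts an array of references; each comparison dereferences
    // two objects that may sit anywhere on the heap.
    static Point[] sortObjects(Point[] points) {
        Point[] copy = points.clone();
        Arrays.sort(copy, Comparator.comparingDouble((Point p) -> p.distance));
        return copy;
    }
}
```

Both methods produce the same ordering; the difference is only in how the data being compared is laid out in memory.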

In terms of parallelization of sorting, there isn't really a good way to do this in the Accumulo model*.  For an analytic, a user may make a query which returns millions of points from a dozen or more servers.  Streaming the data back unsorted, as it arrives, is going to be the faster option.  Accumulo iterators don't really provide a sane way to perform sorting separately.  Consequently, all sorting is performed on the machine making the request; for large datasets, this requires gathering and holding the entire result set in memory.
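
As a rough sketch of what sorting on the requesting machine amounts to (a simplified stand-in with hypothetical names, not the actual GeoMesa code path): the client drains the unsorted stream into a buffer, and only once the last result has arrived can it sort and hand anything back.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of GeoMesa.
final class ClientSideSort {

    // Buffers every streamed result in memory, then sorts the buffer.
    // Nothing can be returned until the stream is exhausted.
    static <T> List<T> collectAndSort(Iterator<T> results, Comparator<? super T> order) {
        List<T> buffer = new ArrayList<>();
        while (results.hasNext()) {
            buffer.add(results.next());
        }
        buffer.sort(order);
        return buffer;
    }
}
```

The buffer grows with the size of the result set, which is why this approach stops working once the results no longer fit in memory on one machine.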

All of that assumes that you are interested in a web services response.  In the context of Spark or MapReduce, you may be able to perform some distributed sorting operations.  That would cover the situation where your result set is too large for main memory on one machine.
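
For a feel of how a distributed sort can avoid holding everything on one machine, here is a minimal single-process sketch of range partitioning, the general idea behind sorts in frameworks like Spark.  It is plain Java rather than any Spark or MapReduce API, and the class name and the hand-picked boundaries are assumptions for illustration.  Values are routed into ranges, each range is sorted on its own, and the sorted ranges are concatenated in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; a stand-in for a cluster framework.
final class RangePartitionSortSketch {

    // Routes each value to a partition. Boundaries must be ascending;
    // partition i holds values <= boundaries[i], the last holds the rest.
    static List<List<Double>> partition(List<Double> values, double[] boundaries) {
        List<List<Double>> parts = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            parts.add(new ArrayList<>());
        }
        for (double v : values) {
            int i = 0;
            while (i < boundaries.length && v > boundaries[i]) {
                i++;
            }
            parts.get(i).add(v);
        }
        return parts;
    }

    // Sorts each partition independently and concatenates them in range order.
    static List<Double> sort(List<Double> values, double[] boundaries) {
        List<Double> out = new ArrayList<>();
        for (List<Double> part : partition(values, boundaries)) {
            Collections.sort(part); // on a cluster, each worker would sort its own partition
            out.addAll(part);
        }
        return out;
    }
}
```

Because every value in one partition is no larger than every value in the next, no final merge step is needed, and no single worker ever has to hold more than its own range.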

Cheers,

Jim

*  For this discussion, I'm assuming that one is requesting data in an order different from the lexical order of the keys.

On 09/23/2015 09:27 AM, Marcel wrote:
Hello,
I received from the GeoWave mailing-list: "Keep scanning Location Tech's web site for upcoming projects.  We are in the process of creating project around a Geo Benchmarking suite.  Part of the evaluation criteria is sorting, which will come up as a deficiency for GeoWave and GeoMesa."

I know that GeoMesa can sort a query:
Query query = new Query("name", spatialFilter, attributeSubset);
SortBy sort = ff.sort(distance, SortOrder.ASCENDING);
query.setSortBy(new SortBy[]{sort});

Is sorting for this query done in parallel?
So could you imagine what is meant by this statement?

Thanks in advance,
Marcel Jacob.



