Hi Marcel,
Our colleagues from GeoWave were down for a meeting on Monday, where
we discussed the soon-to-be-announced benchmarking project.
At the heart of the issue is that sorting Java objects is rather
slow compared to sorting simple structs in C/C++. The fundamental
issue boils down to data locality in each language's memory model.
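
A minimal sketch of the locality point, for illustration only. The class and method names (LocalitySketch, Point, sortObjects, sortPrimitives) are hypothetical and are not part of GeoMesa or GeoWave. It contrasts sorting an array of references to heap objects with sorting a contiguous primitive array, which is the closest Java gets to a C array of simple structs. It makes no timing claims; actual performance depends on the JVM and the data.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; none of these names come from GeoMesa or GeoWave.
final class LocalitySketch {

    // Each Point is its own heap object, so a Point[] stores only references.
    static final class Point {
        final double distance;

        Point(double distance) {
            this.distance = distance;
        }
    }

    // Sorting objects: each comparison follows two references to read the keys.
    static Point[] sortObjects(Point[] points) {
        Point[] copy = points.clone();
        Arrays.sort(copy, Comparator.comparingDouble((Point p) -> p.distance));
        return copy;
    }

    // Sorting primitives: the keys sit next to each other in one array.
    static double[] sortPrimitives(double[] distances) {
        double[] copy = distances.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

Both methods return the same ordering; the difference is only in how the keys are laid out in memory while the sort runs.
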
In terms of parallelization of sorting, there isn't really a good
way to do this in the Accumulo model*. For an analytic, a user may
make a query which returns millions of points from a dozen or more
servers. Streaming the data back unsorted, as it arrives, is going
to be the faster option. Accumulo iterators don't really provide a
sane way to perform sorting separately. Consequently, all sorting
is performed on the machine making the request; for large datasets,
this would require gathering and holding the entire result set in
memory.
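
The client-side pattern described above can be sketched roughly as follows. This is not GeoMesa's actual code: Hit, gatherThenSort and the "distance" field are hypothetical stand-ins, and the Iterator plays the role of the unsorted stream coming back from the servers. The point it shows is that nothing can be returned in order until every result has been gathered into memory on the requesting machine.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not taken from the GeoMesa code base.
final class ClientSideSortSketch {

    // Stand-in for one returned feature and the attribute we sort on.
    static final class Hit {
        final String id;
        final double distance;

        Hit(String id, double distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    // Drains the whole unsorted stream into a list, then sorts that list.
    // Memory use grows with the size of the result set.
    static List<Hit> gatherThenSort(Iterator<Hit> unsortedResults) {
        List<Hit> all = new ArrayList<>();
        while (unsortedResults.hasNext()) {
            all.add(unsortedResults.next());
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.distance));
        return all;
    }
}
```
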
All of that was assuming that you were interested in a web services
response. In the context of Spark or MapReduce, you may be able to
perform some distributed sorting operations. That would cover the
situation where your result set was too large for main memory on one
machine.
Cheers,
Jim
* For this discussion, I'm assuming that one is requesting data in
an order different from the lexical order of the keys.
On 09/23/2015 09:27 AM, Marcel wrote:
Hello,
I received from the GeoWave mailing-list: "Keep scanning Location
Tech's web site for upcoming projects. We are in the process of
creating project around a Geo Benchmarking suite. Part of the
evaluation criteria is sorting, which will come up as a deficiency
for GeoWave and GeoMesa."
I know that GeoMesa can sort a query:
Query query = new Query("name", spatialFilter, attributeSubset);
SortBy sort = ff.sort(distance, SortOrder.ASCENDING);
query.setSortBy(new SortBy[]{sort});
Is sorting for this query done in parallel?
So could you imagine what is meant by this statement?
Thanks in advance,
Marcel Jacob.
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
http://www.locationtech.org/mailman/listinfo/geomesa-users