Re: [geomesa-users] issue with KNN efficiency


From: Michael Ronquest <ronquest@xxxxxxxx>
Date: Thu, 12 Mar 2015 10:44:12 -0400

Dear Elad,

Thanks for writing in! The KNN algorithm is a relatively new addition to the analytics built into GeoMesa, and I believe this is the first time we've received community feedback regarding it.

The KNNQuery does indeed exploit the geohash index in order to increase efficiency and avoid pulling large numbers of false positives. Starting from a geohash containing your reference point, the process executes a series of small queries in an outward spiral of geohashes until K neighbors are found. All remaining geohashes that may contain nearer neighbors are then swept as well, to ensure the true nearest neighbors have been found.
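
To make the idea concrete, here is a minimal sketch of an expanding search of that kind. It is not the GeoMesa implementation: it assumes a plain square grid in place of geohashes, planar Euclidean distance, and an in-memory dictionary in place of Accumulo queries. The names `CELL`, `build_index`, `ring`, and `knn` are hypothetical and exist only for this illustration.

```python
import heapq
import math
from collections import defaultdict

CELL = 1.0  # hypothetical cell size; a square grid cell stands in for a geohash


def cell_of(x, y):
    """Return the grid cell (column, row) containing the point."""
    return (math.floor(x / CELL), math.floor(y / CELL))


def build_index(points):
    """Group points by cell; a stand-in for the spatial index."""
    index = defaultdict(list)
    for p in points:
        index[cell_of(*p)].append(p)
    return index


def ring(center, r):
    """Cells at Chebyshev distance exactly r from the center cell."""
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if max(abs(dx), abs(dy)) == r]


def knn(index, query, k, max_ring=50):
    """Return (sorted [(distance, point)], number of cells visited)."""
    center = cell_of(*query)
    best = []  # max-heap on distance, stored as (-distance, point)
    visited = 0
    for r in range(max_ring + 1):
        # Every point in ring r is at least (r - 1) * CELL from the query,
        # because r - 1 whole cells lie between it and the center cell.
        min_possible = max(0, r - 1) * CELL
        if len(best) == k and -best[0][0] <= min_possible:
            break  # no unvisited cell can hold a nearer neighbor
        for c in ring(center, r):
            visited += 1
            for p in index.get(c, ()):
                d = math.dist(p, query)
                if len(best) < k:
                    heapq.heappush(best, (-d, p))
                elif d < -best[0][0]:
                    heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), visited
```

The loop keeps going after K candidates are in hand and stops only once the next ring cannot contain anything closer than the current K-th neighbor, which corresponds to the final sweep described above. With reasonably dense data, the number of cells visited depends on K and the local density rather than on the size of the layer.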

As this is a WPS process, the issue could be with GeoMesa core, the plugin, or GeoServer. I'd like to ask a few questions to collect some more information to assist you in debugging the problem:


1) What are the indications that the entire layer is being extracted?

2) Did you build GeoMesa from source, and if so, what is the git commit hash? If you're using a jar pulled from Nexus, when was it pulled?

3) Can you please send the relevant portions of your GeoServer logs, showing the log output from the KNNQuery process?

4) This is somewhat redundant, but can you please send the parameters you fed into the KNN process? The XML file would be fabulous.

5) Can you please describe the geographic distribution of the data a bit? For example, how widely distributed are those 200k Features? And given K, how large an area would you expect the KNN to reside in?

6) Can you please send the details of your layer? In particular, I'd like to know if caching is enabled in the Accumulo Feature Store, and the fields in the Coordinate Reference Systems portion of the layer info.


Best regards, and thanks again for the feedback,
Michael Ronquest





On 03/12/2015 09:47 AM, Elad Katz wrote:

Hi,
I am trying to use KNNQuery to find the KNN of a point from a big layer (~200,000 features), and it seems GeoMesa extracts the entire layer from the server in order to find the nearest neighbor. This makes the algorithm impossible to use with big layers, and makes no sense: the geohash index should be used to find the nearest neighbors efficiently (in O(output size), not O(layer size)).
I think the relevant line is:
https://github.com/locationtech/geomesa/blob/accumulo1.5.x/1.x/geomesa-core/src/main/scala/org/locationtech/geomesa/core/process/knn/KNNQuery.scala#L77
Thanks

