Hi,
We are familiarizing ourselves with GeoMesa and evaluating its
suitability for spatial analyses of several medium-to-large sets of
point data, ranging from 50K to 6-7 million points, stored in HBase.
Some of the analysis patterns are:
1. Finding the intersection of a given set with one or more of
   several static sets of polygons stored in HBase, and sorting the
   results by ANY chosen attribute (each point has ~200+ attributes).
2. Computing statistics on the point sets based on any of the
   attributes.
These analyses will be done interactively
over REST calls.
The design questions we would like your help with are:
1. Since we will be filtering and sorting by any of the ~200
   attributes, do we need to add an attribute index for each of them?
2. When we retrieve the top 100 records sorted by an indexed
   attribute, the query takes ~17 seconds (on a 600K point set),
   whereas the same retrieval with a BBOX filter takes less than
   1 second. Is this because GeoMesa fetches all the data to the
   client and sorts it there?
Thanks,
Rama Sundaram