Hi,
We are familiarizing ourselves with GeoMesa and evaluating its suitability for spatial analysis of several medium-to-large point data sets, ranging from 50K to 6-7 million points, stored in HBase. Some of the analysis patterns are:
1. Find the intersection of a given point set with one or more of several static polygon sets stored in HBase, and sort the results by ANY chosen attribute (each point has ~200+ attributes).
2. Compute statistics for the point sets based on any of the attributes.
These analyses will be done interactively over REST calls.
The design questions we would like your help with are:
1. Since we will be filtering and sorting by any of the ~200 attributes, do we need to add an attribute index for each of them?
2. When we retrieve the top 100 records sorted by an indexed attribute, queries take ~17 seconds (on a 600K point set), whereas with a BBOX filter the same query returns in under 1 second. Is this because GeoMesa is fetching all the data to the client and sorting it there?
Thanks,
Rama Sundaram