Re: [geomesa-users] Any suggestions for indexing data with a duration?

Beau,

You are, of course, correct as to the query structure.  Thanks for that.

As to the indexing, the query's effectiveness will depend quite a bit on
how dense your data are in the various dimensions.  If your query
polygon is huge, or if you have billions of records whose start-date
precedes "dt1", then the query may not perform as well as you'd like.
On the other hand, GeoMesa distributes data uniformly across all of the
tablet servers in your cluster, in part so that the work of applying CQL
filters to the records that pass the (potentially coarse) geographic and
temporal index filters is spread across the whole cluster.
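
To make the query structure concrete, here is a minimal sketch of the
temporal logic only, using the abstract example from the original question
quoted below (a feature lasting from time 5 to time 10, queried with the
range 6-7). This is plain Python, not GeoMesa code; the function names are
hypothetical, and the strict comparisons assume that BEFORE and AFTER
exclude the end-points.

```python
def overlaps(start, end, dt0, dt1):
    """Overlap test: start BEFORE dt1 AND end AFTER dt0."""
    return start < dt1 and end > dt0


def start_during(start, dt0, dt1):
    """Start-only test: start DURING dt0/dt1 (start strictly inside the window)."""
    return dt0 < start < dt1


# Abstract example from the thread: data spans 5..10, query window is 6..7.
feature_start, feature_end = 5, 10
dt0, dt1 = 6, 7

print(overlaps(feature_start, feature_end, dt0, dt1))  # True
print(start_during(feature_start, dt0, dt1))           # False
```

The second test shows why constraining the start time to fall inside the
query window misses features that began before dt0 but are still in
progress during it.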

Another capability that we are prototyping internally (but that won't
make it into the upcoming 1.0, sadly) is secondary indexing, which is
specifically designed to facilitate the type of query you describe.

I hope this helps; if not, please just let us know.

Thanks!

Sincerely,
 -- Chris


On Thu, 2014-06-26 at 18:20 +0000, Beau Lalonde wrote:
> Chris,
> 
> As always, thanks for the quick reply.
> 
> I have not tested, but I don't think the logic you supplied will work if query time, dt0, is after the indexed start-time.  
> 
> What I really want is (simplified for presentation, assuming start is the special time-indexed attribute):
> geom INTERSECTS polygon
> AND start BEFORE dt1
> AND end AFTER dt0
> 
> But my concern is that the "start BEFORE dt1" portion, the temporal portion that takes advantage of the GeoMesa indexing, is unbounded on one side and thus may take a long time to query.
> 
> Thanks,
> 
> Beau
> 
> 
> -----Original Message-----
> From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [mailto:geomesa-users-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Chris Eichelberger
> Sent: Thursday, June 26, 2014 1:04 PM
> To: Geomesa User discussions
> Subject: Re: [geomesa-users] Any suggestions for indexing data with a duration?
> 
> Beau,
> 
> GeoMesa should be able to get you most of the way through this use case.
> By indexing based on location and one end-point of your time range, the remainder of the range query can be handled via CQL.  That is, if I have a feature that contains these fields "geom:Geometry,start:Date,end:Date", then a query of the form (simplified for presentation):
> 
>   geom INTERSECTS polygon
>   AND start DURING dt0/dt1
>   AND end BEFORE dt0/dt1
> 
> would both take advantage of the index on the (location, start-time) pair, and would return all of the records whose time-span intersects the
> dt0/dt1 interval.
> 
> If you find that's not working for you, please just let us know.
> 
> Thanks!
> 
> Sincerely,
>   -- Chris
> 
> 
> P.S.  We removed explicit support for the end-time attribute relatively recently, but only because that field was never actually used in the index.  Whether we add this field explicitly to the geo-time index in the future probably depends on how our work on secondary indexes gels.
> 
> 
> On Thu, 2014-06-26 at 16:21 +0000, Beau Lalonde wrote:
> > All,
> > 
> > As I am using the latest GeoMesa, I am running into some issues that hopefully others have already thought about.
> > 
> > Namely, I have data that inherently has a time duration (e.g. a start and end time), and I want to index that data using GeoMesa.  In an older version of GeoMesa I could index the data using both a start and end time, but the current version of GeoMesa only indexes data using a single time parameter.  Since time ranges do not seem to be supported by current GeoMesa, does anyone have a suggested approach?
> > 
> > Here is an abstract example for my problem:
> > - I index data that inherently lasts from time 5 to time 10
> > - I want to be able to perform a query that will return results if the 
> > query time range at all intersects/overlaps with the indexed data
> > -- For example, I want to perform a query using the time range 6-7 and 
> > still get a result
> > 
> > My only thoughts are that since I can no longer index a time range, I must discretize my data and index each discretized portion - each with its own indexed time.  This may work in a practical sense, but will always succumb to the above abstract problem where a query that should return results does not return results because the indexed data is discretized.
> > 
> > Does anyone have any thoughts?
> > 
> > Is GeoMesa going to bring back support for indexing data that has a duration?
> > 
> > Thanks,
> > 
> > Beau
> > 
> > _______________________________________________
> > geomesa-users mailing list
> > geomesa-users@xxxxxxxxxxxxxxxx
> > http://www.locationtech.org/mailman/listinfo/geomesa-users
> 


