Hi Emilio,
thanks for the explanation. I mainly want to run queries of the form: “All records within a bbox and between t0 and t1 that belong to simulation run x”. It sounds like a z3 or xz3
index is just the right choice for me.
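A query of this shape could be written as an ECQL filter string. The sketch below only assembles that string with plain Java. The attribute names (geom, dtg, runId), the class and method names, and the example values are assumptions for illustration and would have to match the actual schema.

```java
import java.time.Instant;
import java.util.Locale;

public class RunQuerySketch {
    // Builds an ECQL filter: bounding box AND time range AND simulation run id.
    // The attribute names geom, dtg and runId are assumed, not taken from a real schema.
    static String filter(double minX, double minY, double maxX, double maxY,
                         Instant t0, Instant t1, String runId) {
        return String.format(Locale.ROOT,
            "BBOX(geom, %s, %s, %s, %s) AND dtg DURING %s/%s AND runId = '%s'",
            minX, minY, maxX, maxY, t0, t1,
            runId.replace("'", "''")); // escape single quotes inside the literal
    }

    public static void main(String[] args) {
        System.out.println(filter(13.0, 52.3, 13.8, 52.7,
            Instant.parse("2020-07-08T06:00:00Z"),
            Instant.parse("2020-07-08T09:00:00Z"),
            "run-42"));
    }
}
```

The resulting string could then be parsed into a filter and passed to the data store query.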
I have an additional question regarding indexing. My simulation data also contains records which have no spatial or temporal attributes, like a list of cars used by the simulated persons.
My first attempt was to also store them as SimpleFeatures with GeoMesa, but without a geometry. This way I wouldn’t have to use an additional storage technology. However, it feels a little hacky, since I would be treating a SimpleFeature as a plain database
row. Is this something I can do, or would you recommend a different approach for storing such data?
Best Regards
Janek
From: geomesa-users-bounces@xxxxxxxxxxx <geomesa-users-bounces@xxxxxxxxxxx>
On Behalf Of Emilio Lahr-Vivaz
Sent: Wednesday, July 8, 2020 21:42
To: geomesa-users@xxxxxxxxxxx
Subject: Re: [geomesa-users] Z-Index Time Interval of hours
The time interval is used to break an unbounded dimension (time) into a bounded dimension (e.g. seconds in a week). The bounded dimension is then used to create the z-index curve, which is used to generate the
scan ranges. Generally you would want to query over only a few time intervals (z-curves), so an hourly interval would only be appropriate if you planned to query at most 1-2 hours at a time.
Even at a day interval, data will be indexed down to the millisecond. You can see the breakdowns for different time intervals here:
https://github.com/locationtech/geomesa/blob/main/geomesa-z3/src/main/scala/org/locationtech/geomesa/curve/BinnedTime.scala#L17
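As a rough illustration of the binning described above, a timestamp can be split into a week number since the epoch (the unbounded part) and a second offset inside that week (the bounded part). This is a simplified sketch with hypothetical names, not the code from BinnedTime.scala.

```java
public class WeekBinSketch {
    static final long SECONDS_PER_WEEK = 7L * 24 * 60 * 60;

    // Number of whole weeks since 1970-01-01T00:00:00Z (the time interval, or bin).
    static long bin(long epochSeconds) {
        return Math.floorDiv(epochSeconds, SECONDS_PER_WEEK);
    }

    // Seconds elapsed inside that week: always in [0, SECONDS_PER_WEEK).
    static long offset(long epochSeconds) {
        return Math.floorMod(epochSeconds, SECONDS_PER_WEEK);
    }

    public static void main(String[] args) {
        long t = 3 * SECONDS_PER_WEEK + 5; // five seconds into the fourth week
        System.out.println("bin=" + bin(t) + " offset=" + offset(t));
    }
}
```

A query spanning several weeks touches one bin per week, which is why querying over only a few intervals keeps the number of scan ranges small.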
We only use a signed 2-byte short to store the interval offset from the Java epoch (1970), so we didn't implement hourly intervals, because the max date would be in 1973. However, that is an implementation detail, which could be revisited if someone wanted
to implement hourly intervals.
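The 1973 limit follows from simple arithmetic: a signed short holds at most 32767 intervals, and 32767 hours is a little under four years. The sketch below reproduces that calculation with java.time. The class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class MaxDateSketch {
    // Latest instant reachable when the count of intervals since the epoch
    // must fit into a signed 2-byte short (at most 32767).
    static Instant maxInstant(ChronoUnit unit) {
        return Instant.EPOCH.plus(Short.MAX_VALUE, unit);
    }

    static int maxYear(ChronoUnit unit) {
        return maxInstant(unit).atZone(ZoneOffset.UTC).getYear();
    }

    public static void main(String[] args) {
        System.out.println("hourly intervals end in " + maxYear(ChronoUnit.HOURS)); // 1973
        System.out.println("daily intervals end in " + maxYear(ChronoUnit.DAYS));   // 2059
    }
}
```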
Whether to use an attribute index vs a z3 index will really depend on your query patterns. If you plan to query with both a spatial and temporal component, then the z3 index will be much more effective. Otherwise, you could consider an attribute index on your
date, but keep in mind you would not be able to leverage a secondary spatial index unless your temporal predicate was an equality filter, i.e. you are querying for a specific second if you store time as a simple long. You could potentially store the hour in
the day, and create an attribute index with a secondary z-index - then you would be able to query for a particular hour.
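Deriving such an hour attribute could look like the sketch below. It assumes the timestamp is kept as seconds since the start of the simulation, as described in the quoted message further down. The class and method names are hypothetical.

```java
public class HourOfDaySketch {
    // Maps seconds since simulation start to an hour in [0, 24).
    // Values past 24 hours wrap around to the next day's hour.
    static int hourOfDay(long secondsSinceStart) {
        long hours = Math.floorDiv(secondsSinceStart, 3600L);
        return (int) Math.floorMod(hours, 24L);
    }

    public static void main(String[] args) {
        System.out.println(hourOfDay(8 * 3600L + 120)); // two minutes past hour 8
    }
}
```

The resulting integer would be stored as its own attribute, so that an equality filter on it can be combined with the secondary spatial index.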
Thanks,
Emilio
On 7/8/20 3:00 PM, Laudan, Janek wrote:
Hi Emilio,
Thanks for the quick reply. I will investigate this further next week when I'm back from my vacation.
Since smaller intervals are not supported, will it even make sense to use this type of index in my case or will the database have to perform a sequential scan on every temporal query?
The temporal component of my data is actually just seconds from the start of the simulation represented as a simple long. Would it make more sense in my case to use an attribute index - of this very number - in combination with a spatial one?
Thanks again for your help
Janek
--
Janek Laudan
Research Associate
Transport System Planning and Telematics
TU Berlin
Wednesday, 08 July 2020, 03:51 PM +02:00 from Emilio Lahr-Vivaz
elahrvivaz@xxxxxxxx:
Hello,
I'm actually surprised that you didn't get an error. There isn't any implementation for hourly intervals, so it either didn't index the time, or it fell back to the default weekly interval. You might be able to use the 'describe-schema' CLI command to see if
the user data was persisted as you specified.
Thanks,
Emilio
On 7/8/20 4:14 AM, Laudan, Janek wrote:
Hi,
I just started working with geomesa. I’m planning to use it for storing traffic simulation data generated with MATSim (https://matsim.org).
Because a simulation run usually covers only a single day, I thought it would make sense to have a Z-Index Time Interval of ‘hour’. Now, the documentation (https://www.geomesa.org/documentation/user/datastores/index_config.html#configuring-z-index-time-interval)
says that only ‘day’, ‘week’, ‘month’ and ‘year’ are supported. I tried to set the index of my schema to hourly intervals like so:
sft.getUserData().put("geomesa.z3.interval", "hour");
and my test set up stored and retrieved the feature I submitted to the store just fine.
Will geomesa silently fall back to another form of indexing or are smaller intervals than covered by the documentation supported?
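One way to avoid relying on whatever happens with an unsupported value is to check the interval before writing it into the user data. The sketch below uses a plain map as a stand-in for sft.getUserData(). The helper name is hypothetical, and the set of allowed values is simply the one listed in the documentation linked above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class IntervalCheckSketch {
    // Intervals the documentation lists as supported.
    static final Set<String> SUPPORTED = Set.of("day", "week", "month", "year");

    // Stores the interval only if it is one of the documented values.
    static void putInterval(Map<Object, Object> userData, String interval) {
        if (!SUPPORTED.contains(interval)) {
            throw new IllegalArgumentException("Unsupported z3 interval: " + interval);
        }
        userData.put("geomesa.z3.interval", interval);
    }

    public static void main(String[] args) {
        Map<Object, Object> userData = new HashMap<>(); // stand-in for sft.getUserData()
        putInterval(userData, "day");
        System.out.println(userData.get("geomesa.z3.interval"));
    }
}
```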
Best Regards
Janek Laudan
-----------------------------------------------------------------------------------------------------------------------------------
Janek Laudan
Research Associate
Transport Systems Planning and Transport Telematics, TU Berlin
Website:
https://vsp.tu-berlin.de
E-Mail:
laudan@xxxxxxxxxxxx
Skype: live:janek.laudan
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxx
To unsubscribe from this list, visit https://dev.eclipse.org/mailman/listinfo/geomesa-users