Hello,
The filesystem data store is more rudimentary than most of our data
stores, and it doesn't support explain planning yet. In general, the
planning is not very interesting, as typically the partitions are
broad (e.g. when using something like 'daily,z2:2-bits', there are 4
spatial quadrants per day). If you are using multi-threading (the
default, controllable via data store parameter "fs.read-threads"),
then you can see which partitions are being hit for a given query by
enabling debug logging on 'org.locationtech.geomesa.fs'.
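To illustrate why the planning is coarse, here is a minimal sketch of how a scheme like 'daily,z2:2-bits' could map a point and a date to a partition. This is not GeoMesa's actual code: the function names, the "yyyy/MM/dd/z" path layout, and the bit ordering (longitude in the low bit, latitude in the high bit) are assumptions made for illustration only.

```python
from datetime import date


def quadrant(lon: float, lat: float) -> int:
    """Return a 2-bit Z-order value: one bit per dimension (assumed ordering)."""
    x_bit = 1 if lon >= 0.0 else 0   # west half = 0, east half = 1
    y_bit = 1 if lat >= 0.0 else 0   # south half = 0, north half = 1
    return (y_bit << 1) | x_bit      # interleave: y in the high bit, x in the low bit


def partition(day: date, lon: float, lat: float) -> str:
    """Hypothetical partition name: one directory per day, then one per quadrant."""
    return f"{day:%Y/%m/%d}/{quadrant(lon, lat)}"


def partitions_for_bbox(day: date, min_lon: float, min_lat: float,
                        max_lon: float, max_lat: float) -> list:
    """Partitions touched by a bounding box on a single day.

    With only two halves per axis, checking the four corners is enough
    to find every quadrant the box overlaps.
    """
    return sorted({
        partition(day, lon, lat)
        for lon in (min_lon, max_lon)
        for lat in (min_lat, max_lat)
    })


if __name__ == "__main__":
    d = date(2018, 12, 16)
    print(partition(d, -77.0, 38.9))
    print(partitions_for_bbox(d, -10.0, 30.0, 10.0, 50.0))
```

With only four spatial partitions per day, any query narrower than a quadrant still has to scan the whole quadrant for each day it covers, which is why there is little for a query plan to show.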
Which data store implementation to pick depends on your use case.
The filesystem data store is very easy to get started with, and
provides extremely low operating costs if you run it on something
like S3. Using a full-fledged database like HBase or Accumulo can
increase your setup complexity and operating costs considerably, but gives
you much more fine-grained query capabilities, as well as
distributed push-down processing.
If you generally code to the GeoTools data store API (and/or the
Spark API), then it should be fairly trivial (from a client
perspective) to switch between data store implementations. Roughly,
they all offer the same capabilities, but some operations may be
(much) slower depending on the implementation, and a few things may
not be implemented everywhere (e.g. the explain command). Since you
are in the exploratory phase, the filesystem data store should give
you a decent idea of the *kind* of things that you can do, with the
understanding that it may not be the *fastest* you could do them.
Thanks,
Emilio
On 12/16/18 1:49 PM, Andrew Ames wrote:
I have been following the quickstart guide using
the geomesa-fs command line tools.
It seems that there is no "explain" command when using
bin/geomesa-fs. However, I see the "explain" command
throughout parts of the documentation. (I was curious to find
out why some spatio-temporal queries were slower than others.)
Is this because I chose to use geomesa-fs and not some
other command line tools package? (I have been using HDFS and
installed the hadoop deps in order to work through tutorials
and work with some big data I have.)
Is there another package I should be using if I am just
getting started with indexing "big" spatio-temporal data? Like
Accumulo? (Ultimately, I want to use this stack to experiment
with heatmap generation and flow analysis of moving entities
and so on. Balancing transformations across nodes is also
something I want to play with.)