Note that the reason we need to "reingest" the data is to build
the spatial index. If the data is in HDFS, though, it's fairly easy
to run a MapReduce ingest job to get it into Accumulo or HBase.
Just wondering what you would be interested in doing with the
data if it were only in HDFS: is your main goal querying/making
maps, or doing bulk analysis with something like Spark?
Andrew
On 03/06/2017 09:27 AM, Emilio
Lahr-Vivaz wrote:
Hi Lee,
In general, no: you would want to ingest it into a data store, which
will write the data in an indexed format so that spatial queries
are fast. That said, we do have some Spark SQL integration
points that let you load up a spatial RDD from arbitrary input files
by defining a converter to simple features. See the example here
(which is for Accumulo):
http://www.geomesa.org/documentation/user/spark/sparksql.html
but use the spark provider here:
https://github.com/locationtech/geomesa/tree/master/geomesa-spark/geomesa-spark-converter
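To give a feel for what "defining a converter to simple features" looks like, here is a rough sketch of a SimpleFeatureType and converter definition in HOCON for a hypothetical tab-delimited file sitting in HDFS. The type name, attribute names, and column positions are all made up for illustration; check the converter documentation for the exact syntax and transform functions:

```hocon
# Hypothetical schema: one point event per line of a TSV file
geomesa.sfts.events = {
  attributes = [
    { name = "eventId", type = "String" }
    { name = "dtg",     type = "Date"   }
    { name = "geom",    type = "Point", srid = 4326, default = true }
  ]
}

# Converter mapping TSV columns onto the attributes above
geomesa.converters.events = {
  type     = "delimited-text"
  format   = "TSV"
  id-field = "$eventId"
  fields = [
    { name = "eventId", transform = "$1" }
    { name = "dtg",     transform = "date('yyyyMMdd', $2)" }
    { name = "lat",     transform = "$3::double" }
    { name = "lon",     transform = "$4::double" }
    { name = "geom",    transform = "point($lon, $lat)" }
  ]
}
```

With definitions like these on the classpath, tools that understand converters (the command-line ingest, or the Spark converter provider linked above) can reference the SFT and converter by name instead of requiring the data to be reingested into a different layout first.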
Others might be able to chime in with more details.
Thanks,
Emilio
On 03/06/2017 04:34 AM, li she wrote:
Hi, all:
I'm very new to GeoMesa, and I want to know if it is
possible to reuse an existing data set in an HDFS cluster
without importing it into some feature store?
Thanks,
Lee
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users