Re: [geomesa-users] How to use data not managed by Geomesa without reimporting

Note that the reason we need to "reingest" the data is to build the spatial index. If the data is already in HDFS, though, it's fairly easy to run a MapReduce ingest job to get it into Accumulo or HBase.
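To illustrate what building a spatial index involves, here is a minimal Python sketch of a Z-order (Morton) key, the general kind of space-filling-curve key that GeoMesa's point indexes are based on. The function names and the 16-bit grid resolution are assumptions chosen for illustration; this is not GeoMesa's actual implementation.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bits land on even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bits land on odd positions
    return z


def z_value(lon: float, lat: float, bits: int = 16) -> int:
    """Snap a lon/lat pair to a grid cell and return that cell's Z-order key."""
    cells = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * cells)   # normalize longitude to [0, cells]
    y = int((lat + 90.0) / 180.0 * cells)    # normalize latitude to [0, cells]
    return interleave_bits(x, y, bits)


if __name__ == "__main__":
    # Hypothetical sample points (name, lon, lat), sorted by their index key.
    points = [("a", -77.04, 38.90), ("b", -77.03, 38.91), ("c", 116.40, 39.90)]
    for name, lon, lat in sorted(points, key=lambda p: z_value(p[1], p[2])):
        print(name, z_value(lon, lat))
```

Rows stored in key order can answer a bounding-box query by scanning a few key ranges instead of reading every record, which is why raw files sitting in HDFS need an ingest step before spatial queries become fast.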

Just wondering what you would be interested in doing with the data if it were only in HDFS? Is your main goal querying/making maps or doing bulk analysis with something like Spark?

Andrew

On 03/06/2017 09:27 AM, Emilio Lahr-Vivaz wrote:
Hi Lee,

In general, no: you would want to ingest it into a data store, which writes the data in an indexed format so that spatial queries are fast. That said, we do have some Spark SQL integration points that let you load a spatial RDD from any input files by defining a converter to simple features. See the example here (which is for Accumulo):

http://www.geomesa.org/documentation/user/spark/sparksql.html

but use the spark provider here:

https://github.com/locationtech/geomesa/tree/master/geomesa-spark/geomesa-spark-converter

Others might be able to chime in with more details.

Thanks,

Emilio

On 03/06/2017 04:34 AM, li she wrote:
Hi, all:
I'm very new to GeoMesa, and I want to know whether it is possible to reuse an existing data set in an HDFS cluster without importing it into a feature store.

Thanks,

Lee

_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users


