Note that the reason we need to "reingest" the data is to build
the spatial index. If the data is in HDFS, though, it's fairly easy
to run a MapReduce ingest job to get it into Accumulo or HBase.
Just wondering what you would be interested in doing with the
data if it were only in HDFS: is your main goal querying/making
maps, or doing bulk analysis with something like Spark?
Andrew
On 03/06/2017 09:27 AM, Emilio
Lahr-Vivaz wrote:
Hi Lee,
In general, no: you would want to ingest it into a data store, which
will write the data in an indexed format so that spatial queries
are fast. That said, we do have some Spark SQL integration
points that let you load up a spatial RDD from arbitrary input files
by defining a converter to simple features. See the example here
(which is for Accumulo):
http://www.geomesa.org/documentation/user/spark/sparksql.html
but use the spark provider here:
https://github.com/locationtech/geomesa/tree/master/geomesa-spark/geomesa-spark-converter
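To give a feel for what "defining a converter to simple features" looks like, here is a rough sketch of a SimpleFeatureType and converter definition in HOCON for a hypothetical tab-delimited file sitting in HDFS. The type name, attribute names, and column positions are all made up for illustration; check the converter documentation for the exact syntax and transform functions:

```hocon
# Hypothetical schema: one point event per line of a TSV file
geomesa.sfts.events = {
  attributes = [
    { name = "eventId", type = "String" }
    { name = "dtg",     type = "Date"   }
    { name = "geom",    type = "Point", srid = 4326, default = true }
  ]
}

# Converter mapping TSV columns onto the attributes above
geomesa.converters.events = {
  type     = "delimited-text"
  format   = "TSV"
  id-field = "$eventId"
  fields = [
    { name = "eventId", transform = "$1" }
    { name = "dtg",     transform = "date('yyyyMMdd', $2)" }
    { name = "lat",     transform = "$3::double" }
    { name = "lon",     transform = "$4::double" }
    { name = "geom",    transform = "point($lon, $lat)" }
  ]
}
```

With definitions like these on the classpath, tools that understand converters (the command-line ingest, or the Spark converter provider linked above) can reference the SFT and converter by name instead of requiring the data to be reingested into a different layout first.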
Others might be able to chime in with more details.
Thanks,
Emilio
On 03/06/2017 04:34 AM, li she wrote:
Hi, all:
I'm very new to GeoMesa, and I want to know if it is
possible to reuse an existing data set in an HDFS cluster
without importing it into some feature store?
Thanks,
Lee
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users