Yeah, in general non-splittable files don't work well with
map/reduce. We have some code that prevents files from being split
up, so that a single mapper sees the entire file. We use that
for XML and, I believe, SHP files.
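
To illustrate the idea, here is a rough sketch of how input splits end up being planned in each case. This is a simplified model in Python, not our actual code: the function name plan_splits, the 128 MB block size and the 300 MB file size are all made up for the example. In Hadoop itself the usual hook for this is overriding FileInputFormat.isSplitable to return false.

```python
# Simplified model of input-split planning (hypothetical helper, not
# GeoMesa or Hadoop code). Each (offset, length) pair goes to one mapper.

def plan_splits(file_size, block_size, splittable=True):
    """Return the list of (offset, length) byte ranges for a file."""
    if file_size <= 0:
        return []
    if not splittable:
        # One split covering the whole file: a single mapper reads it all,
        # so an XML document is never cut in the middle of an element.
        return [(0, file_size)]
    # Otherwise cut the file into block-sized ranges, one per mapper.
    return [(offset, min(block_size, file_size - offset))
            for offset in range(0, file_size, block_size)]


MB = 1024 * 1024  # assumed sizes below are for illustration only

csv_splits = plan_splits(300 * MB, 128 * MB)
xml_splits = plan_splits(300 * MB, 128 * MB, splittable=False)

print(len(csv_splits), "mappers for the splittable file")
print(len(xml_splits), "mapper for the non-splittable file")
```

The trade-off is that a non-splittable file gets no parallelism within the file, so many medium-sized files tend to work better than one huge one.
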
The tools are usually a relatively quick way for people to
get started, but feel free to use whatever works best for you.
We have lots of users with custom ingest setups.
For large data sets, you might try the NYC Taxi data, which has
1.2B records available for download. I'm not sure how much that
would translate to in size on disk, though. We already have a
converter for it, and there are more details here:
https://github.com/locationtech/geomesa/tree/master/geomesa-tools/conf/sfts/nyctaxi
Thanks,
Emilio
On 01/26/2017 05:37 PM, Damiano Albani wrote:
Hi Emilio,