Re: [geomesa-users] About performance on writing data to HBase
Hello,
I can't really say how long I would expect it to take - a lot of
factors affect it, including the size and hardware of your HBase
cluster, your splitting-related configuration, how many indices you
are creating, your data locality, the speed of your backing storage,
and any other concurrent load on your cluster.
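For a rough sense of scale, the figures in the quoted message below (1 TB in 10 hours) can be turned into an average write rate. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and that the 1 TB is the input size rather than the size on disk in HBase, and the function name is just illustrative.

```python
def average_throughput_mb_s(total_bytes: float, hours: float) -> float:
    """Average write rate in MB/s, with 1 MB = 10**6 bytes."""
    seconds = hours * 3600
    return total_bytes / seconds / 1e6


TB = 10**12  # assumption: decimal terabyte

# Figures taken from the quoted message: 1 TB written in 10 hours.
rate = average_throughput_mb_s(1 * TB, 10)
print(f"{rate:.1f} MB/s")  # prints "27.8 MB/s"
```

That is the aggregate rate across the whole cluster; whether it is reasonable still depends on the factors listed above.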
Some things that may help:
* Pre-splitting your tables[1]
* Disabling any indices you don't need[2]
* Enabling table compression[3]
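As a sketch of what those three settings might look like together, the snippet below collects them as schema user-data hints. The key names are written from memory of the GeoMesa documentation and should be treated as assumptions to verify against links [1]-[3] for your version; the helper function, the index names and the split values are purely illustrative. The hints would need to be set on the feature type before the schema is created.

```python
def ingest_tuning_hints(indices, splitter_options, compression):
    """Hypothetical helper: build schema user-data hints as key/value strings.

    Key names are assumptions recalled from the GeoMesa docs; check them
    against the documentation for your version before relying on them.
    """
    return {
        # only create the indices you actually query [2]
        "geomesa.indices.enabled": ",".join(indices),
        # pre-split the index tables [1]
        "table.splitter.options": ",".join(
            f"{key}:{value}" for key, value in splitter_options.items()
        ),
        # compress the HBase files [3]
        "geomesa.table.compression.enabled": "true",
        "geomesa.table.compression.type": compression,
    }


hints = ingest_tuning_hints(
    indices=["z3", "id"],                                       # illustrative
    splitter_options={"z3.bits": 2, "id.pattern": "[0-9a-f]"},  # illustrative
    compression="snappy",
)
for key, value in hints.items():
    print(f"{key}={value}")
```

The right number of splits depends on the number of region servers and on how the data is distributed, so the values here are placeholders rather than recommendations.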
The fastest approach will usually be to ingest offline[4] and then bulk
import[5] the files, instead of using Spark.
Thanks,
Emilio
[1]: https://www.geomesa.org/documentation/user/datastores/index_config.html#configuring-index-splits
[2]: https://www.geomesa.org/documentation/user/datastores/index_config.html#customizing-index-creation
[3]: https://www.geomesa.org/documentation/user/hbase/index_config.html#setting-file-compression
[4]: https://www.geomesa.org/documentation/user/hbase/commandline.html#bulk-ingest
[5]: https://www.geomesa.org/documentation/user/hbase/commandline.html#bulk-load
On 3/18/20 11:03 PM, Yifan Wang wrote:
Hi,
I tried to write 1 TB of data to HBase through GeoMesa. A snapshot
of the resources is as follows:
[image not preserved in the archive]
It took 10 hours to finish the whole writing process. I'm
wondering whether 10 hours is normal in this situation, or whether
there is anything I can do to improve the efficiency. Thanks!
Best Regards,
Evan