Hi Maria,
Great question. DataStoreFinder.getDataStore scans the JVM classpath
for every DataStoreFactory it can find, instantiates them, and holds
them in a registry object (1). It sounds like the classpath scanning
and class loading is what takes several seconds on the first call.
Getting a DataStore through DataStoreFinder.getDataStore is a general
approach, which is great for building general, reusable code. Given the
performance concern, you can instead create an AccumuloDataStoreFactory
(2) directly and call createDataStore. That may be a little quicker at
loading the necessary classes, etc.
If you have data coming in frequently, NiFi may be a fit. GeoMesa has
NiFi adapters (3) which would manage the DataStore connections, etc.
If you can cache, share, or reuse the DataStore connection in the
client app, that might help. DataStore objects tend to be somewhat
heavyweight, so creating them frequently has some downsides. As another
option, could you set up a small server that the client posts the
incoming data to?
If none of those suggestions help out with your client app, it is worth
noting that DataStore objects should be cleaned up by calling the
dispose() method.
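
To make the reuse-and-dispose advice concrete, here is a minimal sketch
in plain Java. SharedResource is a hypothetical helper, not part of
GeoTools or GeoMesa. It creates an expensive object on first use, hands
the same instance to later callers, and releases it once when closed. In
your app the type parameter would be DataStore, the creator would wrap
your existing DataStoreFinder.getDataStore(parameters) call (catching
its IOException), and the disposer would call dispose().

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical helper (an assumption for illustration, not a GeoTools or
 * GeoMesa class): lazily creates one shared instance and disposes it once.
 */
final class SharedResource<T> implements AutoCloseable {
    private final Supplier<T> creator;  // e.g. a wrapper around DataStoreFinder.getDataStore(parameters)
    private final Consumer<T> disposer; // e.g. store -> store.dispose()
    private T instance;                 // null until the first get(), and again after close()

    SharedResource(Supplier<T> creator, Consumer<T> disposer) {
        this.creator = creator;
        this.disposer = disposer;
    }

    /** Returns the shared instance, running the creator only while none is held. */
    synchronized T get() {
        if (instance == null) {
            instance = creator.get();
        }
        return instance;
    }

    /** Disposes the instance if one was created; calling it again is a no-op. */
    @Override
    public synchronized void close() {
        if (instance != null) {
            disposer.accept(instance);
            instance = null;
        }
    }
}
```

This only helps if the client process stays alive between uploads. If
each upload starts a fresh JVM, the first-call cost is paid again every
time, which is where the small server or NiFi options come in.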
I hope that helps; let us know if you have any other questions.
Cheers,
Jim
1. https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/DataStoreFinder.java#L113-L131
2. https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloDataStoreFactory.scala
3. https://github.com/geomesa/geomesa-nifi/
On 2018-04-13 10:03, Maria Krommyda wrote:
> Hello everyone,
>
> I am dealing with a very weird and unexpected problem that I would
> like to share with you in case you have any suggestions.
>
> I have set up my system with Zookeeper 3.4.6 on localhost, Hadoop
> 2.2.0, Accumulo 1.7.3, Geomesa 2.11-1.3.5 and Geotools 15.1.
>
> I have written a very simple script that connects to a GeoMesa
> Accumulo DataStore and uploads some data.
>
> I use the line:
>
> DataStore dataStore = DataStoreFinder.getDataStore(parameters); in my
> code
>
> I am importing the org.geotools.data.DataStoreFinder accordingly.
>
> The first time that I call the function, with the above line, it takes
> around 4 to 5 secs to find the DataStore and less than 300 msecs to
> upload the data.
>
> If I create a loop and call this function more than once, even with
> some delay between the calls, from the second time onward it takes
> less than 20msecs to find the datastore and approximately the same
> time (300 msecs) to upload the data. I am not sure if this has
> something to do with Java optimization, and the connection is
> maintained from the first call or with anything else.
>
> The problem is that I want to call the function from a client app that
> will call quite often but only once each time, making the 4 secs a
> serious problem.
>
> I have tried searching for any related problems but I couldn't find
> anything helpful. So any ideas and thoughts on what might be the
> problem are highly appreciated.
>
> Thank you very much for your time.
>
> Best regards,
> Maria.