Hi Maria,
Great question. The DataStoreFinder.getDataStore call scans the JVM
classpath for all the DataStoreFactory implementations it can find,
instantiates them, and holds them in a registry object (1). It sounds
like that classpath scanning and classloading is where the several
seconds are going.
The DataStoreFinder.getDataStore approach for getting
a DataStore is a
general one which is great for building up general,
re-usable code.
Given the performance concern, you can opt instead to create an
AccumuloDataStoreFactory (2) directly and call createDataStore. That may
be a little quicker at loading up the necessary classes, etc.
If you have data coming in frequently, NiFi may be a fit. GeoMesa has
NiFi adapters (3) which would manage the DataStore connections, etc.
If you can cache/share/re-use the DataStore connection in the client
app, that might be helpful. DataStore objects tend to be somewhat
heavy-weight, so creating them frequently has some downsides. As
another option, could you set up a small server to post the incoming
data to?
If none of those suggestions help out with your client
app, it is worth
noting that DataStore objects should be cleaned up by
calling the
dispose() method.
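
To make the caching idea a bit more concrete, here is a minimal sketch in
plain Java of a holder that opens something expensive once, hands the same
instance back on later calls, and cleans it up on close. SharedResource and
its opener/disposer arguments are names made up for illustration; they are
not part of GeoTools or GeoMesa.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: opens a resource lazily, shares it, disposes it on close. */
final class SharedResource<T> implements AutoCloseable {
    private final Supplier<T> opener;     // how to create the resource (the slow part)
    private final Consumer<T> disposer;   // how to clean it up
    private T instance;                   // guarded by "this"

    SharedResource(Supplier<T> opener, Consumer<T> disposer) {
        this.opener = opener;
        this.disposer = disposer;
    }

    /** Returns the shared instance, running the opener on the first call only. */
    synchronized T get() {
        if (instance == null) {
            instance = opener.get();
        }
        return instance;
    }

    /** Disposes the instance if one was opened; calling close() again is a no-op. */
    @Override
    public synchronized void close() {
        if (instance != null) {
            disposer.accept(instance);
            instance = null;
        }
    }
}
```

In your client, T would be DataStore, the opener would wrap your
DataStoreFinder.getDataStore(parameters) call (it throws a checked
IOException, so the lambda has to catch and rethrow it), and the disposer
would call dispose(). Note that this only helps if the process stays alive
between uploads; a client that starts a fresh JVM each time pays the
classpath scan again.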
I hope that helps; let us know if you have any other
questions.
Cheers,
Jim
1. https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/DataStoreFinder.java#L113-L131
2. https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloDataStoreFactory.scala
3. https://github.com/geomesa/geomesa-nifi/
On 2018-04-13 10:03, Maria Krommyda wrote:
> Hello everyone,
>
> I am dealing with a very weird and unexpected problem that I would
> like to share with you in case you have any suggestions.
>
> I have set up my system with Zookeeper 3.4.6 on localhost, Hadoop
> 2.2.0, Accumulo 1.7.3, Geomesa 2.11-1.3.5 and Geotools 15.1.
>
> I have written a very simple script that connects to a Geomesa
> Accumulo DataStore and uploads some data.
>
> I use the line:
>
> DataStore dataStore = DataStoreFinder.getDataStore(parameters); in my
> code
>
> I am importing the org.geotools.data.DataStoreFinder accordingly.
>
> The first time that I call the function, with the above line, it takes
> around 4 to 5 secs to find the DataStore and less than 300 msecs to
> upload the data.
>
> If I create a loop and call this function more than once, even with
> some delay between the calls, from the second time onward it takes
> less than 20 msecs to find the datastore and approximately the same
> time (300 msecs) to upload the data. I am not sure if this has
> something to do with Java optimization, and the connection is
> maintained from the first call or with anything else.
>
> The problem is that I want to call the function from a client app that
> will call quite often but only once each time, making the 4 secs a
> serious problem.
>
> I have tried searching for any related problems but I couldn't find
> anything helpful. So any ideas and thoughts on what might be the
> problem are highly appreciated.
>
> Thank you very much for your time.
>
> Best regards,
> Maria.