Hi Maria,
Great question. The DataStoreFinder.getDataStore call scans the JVM
classpath for every DataStoreFactory it can find, instantiates each
one, and holds them in a registry object (1). It sounds like the
classpath scanning and class loading are what is taking several
seconds.
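To illustrate the "scan once, then reuse" behaviour, here is a rough sketch built on the JDK's own ServiceLoader. It is only an analogy and not the GeoTools code; the class name FactoryRegistrySketch and the choice of FileSystemProvider as the looked-up service are assumptions made for the example. It would also be consistent with later calls in the same JVM being fast, while every fresh JVM pays the scanning cost again.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a factory registry that scans once and then reuses
// the result. ServiceLoader is a JDK stand-in here; this is not GeoTools code.
class FactoryRegistrySketch {
    private static final Map<Class<?>, List<?>> REGISTRY = new ConcurrentHashMap<>();

    // The first call for a type scans for providers and instantiates each one;
    // later calls in the same JVM return the cached list.
    @SuppressWarnings("unchecked")
    static <T> List<T> factories(Class<T> type) {
        return (List<T>) REGISTRY.computeIfAbsent(type, t -> {
            List<Object> found = new ArrayList<>();
            for (Object provider : ServiceLoader.load(t)) {
                found.add(provider);
            }
            return found;
        });
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        factories(FileSystemProvider.class); // slow path: scan and instantiate
        long t1 = System.nanoTime();
        factories(FileSystemProvider.class); // fast path: served from the registry
        long t2 = System.nanoTime();
        System.out.printf("first: %d us, second: %d us%n",
                (t1 - t0) / 1000, (t2 - t1) / 1000);
    }
}
```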
The DataStoreFinder.getDataStore approach for getting a DataStore is a
general one, which is great for building general, reusable code.
Given the performance concern, you can instead create an
AccumuloDataStoreFactory (2) directly and call createDataStore. That
may be a little quicker at loading the necessary classes, etc.
If you have data coming in frequently, NiFi may be a fit. GeoMesa has
NiFi adapters (3) which would manage the DataStore connections, etc.
If you can cache, share, or reuse the DataStore connection in the
client app, that might be helpful. DataStore objects tend to be
somewhat heavyweight, so creating them frequently has some downsides.
As another option, could you set up a small server to post the
incoming data to?
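As a rough sketch of the caching idea, a small holder can create the expensive object on first use and hand the same instance to every later caller. SharedResource is a hypothetical helper written for this example, not part of GeoTools or GeoMesa, and the string "store" merely stands in for a real DataStore.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: builds an expensive object once and shares it.
class SharedResource<T> {
    private final Supplier<T> factory;
    private T instance;

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    // synchronized so that concurrent callers cannot trigger two creations
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    public static void main(String[] args) {
        AtomicInteger creations = new AtomicInteger();
        // Stand-in supplier; a real client would create its DataStore here.
        SharedResource<String> shared = new SharedResource<>(() -> {
            creations.incrementAndGet();
            return "store";
        });
        shared.get();
        shared.get();
        System.out.println("creations: " + creations.get());
    }
}
```

In the client, the supplier could wrap the DataStoreFinder.getDataStore(parameters) call. Since that call declares IOException, the lambda would need to catch it and rethrow an unchecked exception.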
If none of those suggestions help out
with your client app, it is worth
noting that DataStore objects should
be cleaned up by calling the
dispose() method.
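One way to make sure the cleanup is never skipped, even when the upload throws, is a try/finally wrapper. This is a minimal sketch: Disposables.use is a hypothetical helper, and the strings stand in for a real DataStore. With a real store, DataStore::dispose could be passed as the cleanup step.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper: open a resource, use it, and always clean it up.
class Disposables {
    static <T, R> R use(Supplier<T> open, Consumer<T> dispose, Function<T, R> work) {
        T resource = open.get();
        try {
            return work.apply(resource);
        } finally {
            // Runs whether work returned normally or threw.
            dispose.accept(resource);
        }
    }

    public static void main(String[] args) {
        String result = use(
                () -> "store",                               // stand-in for creating a DataStore
                s -> System.out.println("disposed " + s),    // stand-in for calling dispose()
                s -> s + ": uploaded");                      // stand-in for the upload
        System.out.println(result);
    }
}
```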
I hope that helps; let us know if you
have any other questions.
Cheers,
Jim
1. https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/DataStoreFinder.java#L113-L131
2. https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloDataStoreFactory.scala
3. https://github.com/geomesa/geomesa-nifi/
On 2018-04-13 10:03, Maria Krommyda wrote:
> Hello everyone,
>
> I am dealing with a very weird and unexpected problem that I would
> like to share with you in case you have any suggestions.
>
> I have set up my system with Zookeeper 3.4.6 on localhost, Hadoop
> 2.2.0, Accumulo 1.7.3, Geomesa 2.11-1.3.5 and Geotools 15.1.
>
> I have written a very simple script that connects to an Geomesa
> Accumulo DataStore and uploads some data.
>
> I use the line:
>
> DataStore dataStore = DataStoreFinder.getDataStore(parameters);
> in my code
>
> I am importing the org.geotools.data.DataStoreFinder accordingly.
>
> The first time that I call the function, with the above line, it
> takes around 4 to 5 secs to find the DataStore and less than 300
> msecs to upload the data.
>
> If I create a loop and call this function more than once, even with
> some delay between the calls, from the second time onward it takes
> less than 20msecs to find the datastore and approximately the same
> time (300 msecs) to upload the data. I am not sure if this has
> something to do with Java optimization, and the connection is
> maintained from the first call or with anything else.
>
> The problem is that I want to call the function from a client app
> that will call quite often but only once each time, making the 4
> secs a serious problem.
>
> I have tried searching for any related problems but I couldn't find
> anything helpful. So any ideas and thoughts on what might be the
> problem are highly appreciated.
>
> Thank you very much for your time.
>
> Best regards,
> Maria.