Hello,
I don't think you'd want to use NiFi as a query platform. NiFi is
designed to manage flows of data, which in my experience means
ingestion pipelines. And indeed, we only offer ingestion processors
for geomesa.
If you want to query over http, I'd strongly suggest looking into
geoserver. It implements an OGC standard interface for querying,
which means that there are lots of client libraries available that
you can use to call it, including ones for JavaScript and Java.
Geomesa provides plugins that let you access it through geoserver:
http://www.geomesa.org/documentation/user/accumulo/geoserver.html
If you want a custom endpoint, then you will need to translate your
request into a Query object and call the data store
featureReader/featureSource methods in code. The geotools
documentation has a lot of cruft in it (related to the UI framework
used), but covers the basics of querying a data store:
http://docs.geotools.org/latest/userguide/tutorial/filter/query.html
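As a rough sketch of the first half of that translation step, the snippet below turns bounding box parameters from an http request into an ECQL BBOX predicate string. The class name BboxFilter, the parameter names (minx, miny, maxx, maxy) and the geometry attribute name geom are assumptions made for illustration; they are not part of GeoMesa or GeoTools. With GeoTools on the classpath, the resulting string could then be parsed with ECQL.toFilter and wrapped in a Query, as hinted in the trailing comments.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: builds an ECQL BBOX predicate from request parameters. */
final class BboxFilter {

    private BboxFilter() {}

    /** Returns a predicate of the form BBOX(attribute, minX, minY, maxX, maxY). */
    static String toEcql(String geomAttribute, double minX, double minY,
                         double maxX, double maxY) {
        // Only allow plain attribute names, so request input cannot alter the filter
        if (!geomAttribute.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("invalid attribute name: " + geomAttribute);
        }
        for (double v : new double[] {minX, minY, maxX, maxY}) {
            if (!Double.isFinite(v)) {
                throw new IllegalArgumentException("coordinates must be finite");
            }
        }
        if (minX > maxX || minY > maxY) {
            throw new IllegalArgumentException("min corner must not exceed max corner");
        }
        return "BBOX(" + geomAttribute + ", " + plain(minX) + ", " + plain(minY)
                + ", " + plain(maxX) + ", " + plain(maxY) + ")";
    }

    /** Reads the four (assumed) parameter names from a request parameter map. */
    static String fromParams(String geomAttribute, Map<String, String> params) {
        return toEcql(geomAttribute,
                number(params, "minx"), number(params, "miny"),
                number(params, "maxx"), number(params, "maxy"));
    }

    private static double number(Map<String, String> params, String name) {
        String raw = params.get(name);
        if (raw == null) {
            throw new IllegalArgumentException("missing parameter: " + name);
        }
        // Throws NumberFormatException (an IllegalArgumentException) on bad input
        return Double.parseDouble(raw.trim());
    }

    /** Formats without exponent notation, e.g. 1.0E-5 becomes 0.000010. */
    private static String plain(double v) {
        return BigDecimal.valueOf(v).toPlainString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("minx", "-10");
        params.put("miny", "35");
        params.put("maxx", "5");
        params.put("maxy", "45");

        String ecql = fromParams("geom", params);
        System.out.println(ecql); // prints BBOX(geom, -10.0, 35.0, 5.0, 45.0)

        // With GeoTools available, the next step would be roughly:
        //   Filter filter = ECQL.toFilter(ecql);
        //   Query query = new Query("my-type", filter); // "my-type" is a placeholder
    }
}
```

Validating the attribute name and the numbers before building the string keeps arbitrary client input out of the filter text.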
Thanks,
Emilio
On 04/17/2018 04:20 AM, Maria Krommyda wrote:
Hello,
I would like to give NiFi a go before looking into JSP.
Let's just assume for now that I have figured out the HTTP request and
that I can get the parameters that I need from the client. So now I
have a bounding box that I want to use in a query.
Can you please let me know if there is a Processor to query GeoMesa
using parameters? The Processors that are available are only to ingest
data, or have I understood something wrong?
Thank you for your time!
Best regards,
Maria.
Hello,
You should be able to have some persistent state in
a jsp page, so possibly the easiest thing for you is
to modify your current jsp to use a shared data
store instance.
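A minimal sketch of what a shared instance could look like is given below. SharedInstance is a hypothetical helper written for this illustration, and the String it holds is only a stand-in for a real DataStore; the point is that the expensive creation runs once and every later request gets the same object.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical holder that creates an expensive object once and then shares it. */
final class SharedInstance<T> {
    private final Supplier<T> factory;
    private volatile T instance; // volatile so the double-checked locking below is safe

    SharedInstance(Supplier<T> factory) {
        this.factory = Objects.requireNonNull(factory);
    }

    /** Returns the shared object, creating it on the first call only. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = Objects.requireNonNull(factory.get(), "factory returned null");
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Stand-in for an expensive call such as DataStoreFinder.getDataStore(parameters)
        SharedInstance<String> store =
                new SharedInstance<>(() -> "store-" + created.incrementAndGet());

        System.out.println(store.get());                           // prints store-1
        System.out.println(store.get());                           // prints store-1
        System.out.println("created " + created.get() + " time(s)"); // prints created 1 time(s)
    }
}
```

In a servlet container, such a holder would typically live in a static field or an application-scoped attribute, so that the startup cost is paid once per deployment rather than once per request.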
If you want to continue with nifi, I'd suggest
asking on the nifi user forums for ways to accept data
from http requests. The geomesa nifi integration is
mainly through our converter framework, which
converts different data formats into geotools simple
features that geomesa can ingest. So you would need
to set up a flow that will generate data in e.g. csv
format, in order to pass it to the geomesa nifi
processor. Once you get to that point, we can help
out with configuring the geomesa processor
appropriately.
Thanks,
Emilio
On 04/14/2018 02:52 PM, Maria Krommyda wrote:
Hello,
I have no control over the client side of the application, so WFS-T
is not an option for me. Also, the client does not have access to my
file system.
What has been set up currently (as a test while trying to figure out
how to handle data and queries with GeoMesa) is a JavaServer Page
(JSP), which is called with an http request and the relevant
parameters and returns a JSON object.
The preferable solution for me would include keeping the http request
as the communication method.
Reading through the NiFi documentation, I thought that something like
that is possible, but I am still unsure of how it should be properly
set up to read the parameters, call GeoMesa using them and return the
object to the client.
It would be great if you could point me in the right direction.
Best regards,
Maria.
Hello,
You really have a lot of different options. At a basic level, it
sounds like you want some persistent service that will have a data
store instance and respond to client requests. How do you plan to
communicate between your client and your service? That will probably
inform what solution is best, and how you configure NiFi will depend
on how you plan to send messages. For example, if your client can
just write files to your filesystem, it is fairly easy to configure
NiFi to monitor a folder and ingest data written there.
In addition to NiFi, another option you might consider is using
geoserver with WFS-T, which might be easier to set up. I believe
there are a variety of clients that you could then use to send
requests over http, depending on where your client runs.
I would guess that most of the time spent getting the data store is
from class loading. But still, 5 seconds seems high. Using the
geomesa command-line tools, I'm able to invoke java, get the
datastore and retrieve the current feature types in ~2 seconds:
$ time ./geomesa-accumulo get-type-names -c geomesa -u user -p pass -i instance -z zoo
Current feature types:
example-csv
real 0m2.071s
user 0m3.332s
sys 0m0.164s
(user time is higher because it counts time spent in multiple cpu
cores - my machine has 4 cores)
Thanks,
Emilio
On 04/13/2018 03:58 PM, Maria Krommyda wrote:
Hello Jim,
Let me start by thanking you for taking the time to give me such a
detailed response.
I am still surprised that it needs 5 secs to find the only existing
datastore, and very curious as to what would have happened if there
were many more, but at least I now understand the reason.
I tried the AccumuloDataStoreFactory but, as you predicted, it didn't
improve the time.
Thank you for the dispose() tip. I had misunderstood it to mean the
destruction of the schema (deleting all data), so I was not using it.
NiFi seems to be the only solution for what I am trying to accomplish.
Can I ask you for some examples, as detailed as possible, of how I
can configure NiFi, if you know of any available?
I searched for them, but all the examples that I found assume a deep
understanding of how NiFi works, which I do not have, and give only
an overview of the steps that should be followed.
What I would like to achieve is to receive a request from the client,
either to insert data into the DataStore or to run a query, pass it
to GeoMesa and send back a response, either confirming the data
storage or returning the properly formatted query results.
Thank you once more for your time!
Best regards,
Maria.
Hi Maria,
Great question. The DataStoreFinder.getDataStore call reads through
the JVM classpath for all the DataStoreFactory implementations it can
find, instantiates them, and holds them in a registry object (1). It
sounds like the classpath scanning and classloading is what is taking
several seconds.
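To get a feel for that first-call cost, here is a small sketch that uses the JDK's own ServiceLoader as a stand-in for the GeoTools registry; it is not the GeoTools code itself. It discovers and instantiates providers of a JDK service (FileSystemProvider, picked only because it ships with the JDK) several times in a row and prints how long each pass takes. The class name ServiceLookupTiming is made up for this example, and the actual timings will vary by machine and classpath.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ServiceLoader;

/** Illustrative only: times repeated service discovery with the JDK ServiceLoader. */
final class ServiceLookupTiming {

    private ServiceLookupTiming() {}

    /** Discovers and instantiates every provider of the service; returns how many were found. */
    static <S> int discover(Class<S> service) {
        int count = 0;
        // Iterating forces the lookup: provider classes are located, loaded and instantiated
        for (S provider : ServiceLoader.load(service)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int pass = 1; pass <= 3; pass++) {
            long start = System.nanoTime();
            int found = discover(FileSystemProvider.class);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("pass " + pass + ": " + found
                    + " provider(s) in " + micros + " us");
        }
    }
}
```

Later passes build a fresh ServiceLoader but reuse classes the JVM has already loaded, which is the same kind of effect that would make a second getDataStore call in one JVM much cheaper than the first.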
The DataStoreFinder.getDataStore approach for getting a DataStore is
a general one, which is great for building up general, re-usable
code. Given the performance concern, you can opt instead to create an
AccumuloDataStoreFactory (2) directly and call createDataStore. That
may be a little quicker for loading up the necessary classes, etc.
If you have data coming in frequently, NiFi may be a fit. GeoMesa has
NiFi adapters (3) which would manage the DataStore connections, etc.
If you can cache/share/re-use the DataStore connection in the client
app, that might be helpful. DataStore objects tend to be somewhat
heavy-weight, so creating them frequently has some downsides. As
another option, could you set up a small server to post the incoming
data?
If none of those suggestions help out with your client app, it is
worth noting that DataStore objects should be cleaned up by calling
the dispose() method.
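The usual shape for that cleanup is a try/finally block, sketched below. To keep the example runnable without GeoTools, Store is a hypothetical one-method stand-in for the dispose() method of a DataStore, and FakeStore only records whether it was disposed; neither is a real GeoTools or GeoMesa type.

```java
/** Illustrative only: always dispose a store after use, even when the work fails. */
final class DisposeDemo {

    private DisposeDemo() {}

    /** Hypothetical stand-in for the dispose() method of a GeoTools DataStore. */
    interface Store {
        void dispose();
    }

    /** Minimal fake store that records whether dispose() was called. */
    static final class FakeStore implements Store {
        boolean disposed = false;

        @Override
        public void dispose() {
            disposed = true;
        }
    }

    /** Runs the work and disposes the store afterwards, whether or not the work throws. */
    static void useOnce(Store store, Runnable work) {
        try {
            work.run();
        } finally {
            store.dispose();
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        useOnce(store, () -> System.out.println("writing features..."));
        System.out.println("disposed = " + store.disposed); // prints disposed = true
    }
}
```

This pattern fits a client that creates a store per invocation; a long-lived service that shares one store would instead call dispose() once at shutdown.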
I hope that helps; let us know if you have any other questions.
Cheers,
Jim
1. https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/DataStoreFinder.java#L113-L131
2. https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloDataStoreFactory.scala
3. https://github.com/geomesa/geomesa-nifi/
On 2018-04-13 10:03, Maria Krommyda wrote:
> Hello everyone,
>
> I am dealing with a very weird and unexpected problem that I would
> like to share with you in case you have any suggestions.
>
> I have set up my system with Zookeeper 3.4.6 on localhost, Hadoop
> 2.2.0, Accumulo 1.7.3, Geomesa 2.11-1.3.5 and Geotools 15.1.
>
> I have written a very simple script that connects to a Geomesa
> Accumulo DataStore and uploads some data.
>
> I use the line:
>
> DataStore dataStore = DataStoreFinder.getDataStore(parameters);
> in my code.
>
> I am importing org.geotools.data.DataStoreFinder accordingly.
>
> The first time that I call the function, with the above line, it
> takes around 4 to 5 secs to find the DataStore and less than 300
> msecs to upload the data.
>
> If I create a loop and call this function more than once, even with
> some delay between the calls, from the second time onward it takes
> less than 20 msecs to find the datastore and approximately the same
> time (300 msecs) to upload the data. I am not sure if this has
> something to do with Java optimization, and the connection is
> maintained from the first call, or with anything else.
>
> The problem is that I want to call the function from a client app
> that will call quite often but only once each time, making the 4
> secs a serious problem.
>
> I have tried searching for any related problems but I couldn't find
> anything helpful. So any ideas and thoughts on what might be the
> problem are highly appreciated.
>
> Thank you very much for your time.
>
> Best regards,
> Maria.
>
> _______________________________________________
> geomesa-dev mailing list
> geomesa-dev@xxxxxxxxxxxxxxxx
> To change your delivery options, retrieve your password, or
> unsubscribe from this list, visit
> https://dev.locationtech.org/mailman/listinfo/geomesa-dev