Re: [geomesa-users] Is there available an asynchronous or lazy Java CQL driver for on-demand return of large resultsets from Accumulo?

Thank you, Emilio,


this is very helpful.


Ben




From: geomesa-users-bounces@xxxxxxxxxxxxxxxx <geomesa-users-bounces@xxxxxxxxxxxxxxxx> on behalf of Emilio Lahr-Vivaz <elahrvivaz@xxxxxxxx>
Sent: 11 January 2017 18:20
To: geomesa-users@xxxxxxxxxxxxxxxx
Subject: Re: [geomesa-users] Is there available an asynchronous or lazy Java CQL driver for on-demand return of large resultsets from Accumulo?
 
Hi Ben,

Our feature iterators are wrappers around Accumulo BatchScanners, so they do fetch data in a lazy fashion. Generally they will pre-fetch a chunk of data, then wait for the iterator to be consumed before fetching more. There are some Accumulo settings that you can use to tweak this, but I can't find them at the moment...
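To illustrate the "pre-fetch a chunk, then wait for the consumer" behaviour described above, here is a minimal sketch using only the Java standard library. It is not GeoMesa's or Accumulo's actual implementation: the class name `PrefetchingIterator` and the `bufferSize` parameter are hypothetical, and a bounded queue merely stands in for whatever buffering the BatchScanner does internally.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Illustration only (hypothetical class): a lazy iterator fed by a background
 * producer through a bounded buffer. Assumes the source yields no null elements.
 */
final class PrefetchingIterator<T> implements Iterator<T>, AutoCloseable {
    private static final Object END = new Object(); // end-of-data marker

    private final BlockingQueue<Object> buffer;
    private final Thread producer;
    private Object next; // element taken from the buffer but not yet returned

    PrefetchingIterator(Iterator<T> source, int bufferSize) {
        this.buffer = new ArrayBlockingQueue<>(bufferSize);
        this.producer = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    // put() blocks while the buffer is full, so the producer
                    // cannot run further ahead than the consumer allows.
                    buffer.put(source.next());
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = buffer.take(); // waits until the producer supplies data
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                next = END;
            }
        }
        return next != END;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T value = (T) next;
        next = null;
        return value;
    }

    @Override
    public void close() {
        producer.interrupt(); // stop fetching if the consumer gives up early
    }
}
```

In this sketch, memory use is bounded by `bufferSize` rather than by the size of the result set, which is the property that matters when streaming large results to a client.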

One very large caveat is if you are sorting your result sets. In order to sort, we have to load the entire data set into memory.
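The reason sorting defeats lazy consumption can be shown with plain Java streams. This is only an analogy, not GeoMesa code: the class and method names below are hypothetical, and `Stream.sorted()` stands in for any sort that has to see every element before it can emit the first one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only (hypothetical class): counts how many source elements are read. */
final class SortDrainsSource {

    /** Elements read from the source to produce the first result, without sorting. */
    static int pulledUnsorted(List<Integer> data) {
        AtomicInteger pulled = new AtomicInteger();
        data.stream()
            .peek(x -> pulled.incrementAndGet()) // count each element read
            .findFirst();                        // stop as soon as one is available
        return pulled.get();
    }

    /** Elements read from the source to produce the first result, with sorting. */
    static int pulledSorted(List<Integer> data) {
        AtomicInteger pulled = new AtomicInteger();
        data.stream()
            .peek(x -> pulled.incrementAndGet()) // count each element read
            .sorted()                            // must buffer everything first
            .findFirst();
        return pulled.get();
    }

    /** Builds the list n-1, n-2, ..., 0 so the input is not already sorted. */
    static List<Integer> descending(int n) {
        List<Integer> data = new ArrayList<>(n);
        for (int i = n - 1; i >= 0; i--) {
            data.add(i);
        }
        return data;
    }
}
```

For a list built with `descending(1000)`, the unsorted pipeline reads a single element before returning, while the sorted one reads all of them; the same trade-off applies to a sorted query over a large result set.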

I can't tell from your message: are you wrapping the feature iterators in some way? The standard dataStore.getFeatureReader(query, Transaction.AUTO_COMMIT) and dataStore.getFeatureSource.getFeatures.features both return the same lazy iterator.

Thanks,

Emilio

On 01/11/2017 12:37 PM, Benjamin Weaver wrote:

Hi all,


We are returning large result sets from Accumulo (1.7.2) while running GeoMesa 1.2.1. Our system frequently runs slowly and locks up. We believe our problems stem from our use, in our queries, of the non-lazy FeatureIterator. The problem seems to be that our query returns the entire result set and loads it into the FeatureIterator before we traverse that FeatureIterator, converting features into protobuf and returning them to the client.


Our NATS server does not appear to be the bottleneck. Our metrics show that the query itself takes a substantial portion of the time.


Is there an asynchronous or lazy GeoMesa driver or iterator that enables returning large result sets as they arrive from Accumulo? Is such asynchronous/lazy functionality available in GeoMesa 1.2.1 or any newer version of GeoMesa?


Any perspective is greatly appreciated!


Ben

This email (and any attachments) may contain confidential information and is intended solely for the recipient(s) to whom the email is addressed. If you received this email in error, please inform us immediately and delete the email and all attachments without further using, copying or disclosing the information. This email and any attachments are believed to be, but cannot be guaranteed to be, secure or virus-free. Satellite Applications Catapult Limited is registered in England & Wales. Company Number: 7964746. Registered office: Electron Building, Fermi Avenue, Harwell Oxford, Didcot, Oxfordshire OX11 0QR.

_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://www.locationtech.org/mailman/listinfo/geomesa-users
