Re: [geomesa-users] number of threads constantly increasing

Hi Emilio,

Using a single data store instance shared between all the threads solved the problem: the number of threads remains stable during the ingest.

Thank you very much!

Rémi

From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [geomesa-users-bounces@xxxxxxxxxxxxxxxx] on behalf of Emilio Lahr-Vivaz [elahrvivaz@xxxxxxxx]
Sent: Wednesday, 30 August 2017 15:26
To: geomesa-users@xxxxxxxxxxxxxxxx
Subject: Re: [geomesa-users] number of threads constantly increasing

Hi Rémi,

I gave your code a quick look and I don't see any obvious resource leaks. Generally, if you use getFeatureStore.addFeatures(), you don't need to clean up any feature writers. If you use getFeatureWriter*, then you need to close it yourself. Calling .dispose on the datastore should clean up the metadata connections, but won't close any feature writers.

You might try keeping a single data store instance around for the entire ingest - generally a data store object is fairly expensive to create, so only one is created and shared across threads. Possibly the datastore.dispose() method is not cleaning things up appropriately, but that may not be apparent unless many datastore instances are created (as you're doing). I will check into that.
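A minimal sketch of the pattern described above: one store instance is created up front and shared by every task in the pool, and each task closes its own writer. FakeStore and FakeWriter are hypothetical stand-ins invented for illustration; they are not the GeoMesa or GeoTools API, and the pool size of 4 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SharedStoreSketch {

    // Hypothetical stand-in for an expensive-to-create data store (NOT the GeoMesa API).
    static final class FakeStore {
        static final AtomicInteger created = new AtomicInteger();
        final AtomicInteger openWriters = new AtomicInteger();
        final AtomicInteger written = new AtomicInteger();

        FakeStore() { created.incrementAndGet(); }

        FakeWriter openWriter() {
            openWriters.incrementAndGet();
            return new FakeWriter(this);
        }

        // Models a dispose() that releases store-level resources only, not writers.
        void dispose() { }
    }

    // Hypothetical stand-in for a feature writer that the caller must close.
    static final class FakeWriter implements AutoCloseable {
        private final FakeStore store;

        FakeWriter(FakeStore store) { this.store = store; }

        void write(String record) { store.written.incrementAndGet(); }

        @Override
        public void close() { store.openWriters.decrementAndGet(); }
    }

    // Runs the ingest and returns {stores created, writers still open, records written}.
    static int[] run(int tasks, int recordsPerTask) throws Exception {
        FakeStore.created.set(0);
        FakeStore store = new FakeStore();              // created once for the whole ingest
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                futures.add(pool.submit(() -> {
                    // try-with-resources closes the writer even if a write fails
                    try (FakeWriter writer = store.openWriter()) {
                        for (int i = 0; i < recordsPerTask; i++) {
                            writer.write("record-" + i);
                        }
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get();                                // wait for tasks, surface failures
            }
        } finally {
            pool.shutdown();
            store.dispose();                            // disposed once, at the very end
        }
        return new int[] { FakeStore.created.get(), store.openWriters.get(), store.written.get() };
    }

    public static void main(String[] args) throws Exception {
        int[] r = run(8, 100);
        System.out.println("stores=" + r[0] + " openWriters=" + r[1] + " written=" + r[2]);
    }
}
```

The point of the sketch is only the lifecycle: the store outlives the tasks, while each writer lives inside a single task and is always closed there.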

Thanks,

Emilio

On 08/30/2017 03:58 AM, GRISOT, REMI wrote:
Hi Emilio,

Thank you for your quick answer. I tested an ingest based on the Quickstart example, and the problem definitely comes from my implementation: in the test, the number of threads stayed stable.
For my insertions, I do not use a distributed setup: I have only one machine, with a lot of RAM and CPUs. Later, I will probably split it into VMs, but for now, I would like to make it work on a single node.
The ingest program works as follows:
 - Data is collected, and when a limit is reached, a thread is created (in a ThreadPoolExecutor).
 - That thread creates a DataStore instance, sends the data, and then destroys the DataStore instance (which I thought would close the feature writers).

It is really nice of you to offer to check the code for resource cleanup! The code is not public (I am writing it as part of my internship), but I changed some variable names, so it should be OK. I attached an archive containing the relevant files of the project.

I will have a look at the documentation links and try to improve my configuration with them.

Thank you,

Rémi Grisot

From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [geomesa-users-bounces@xxxxxxxxxxxxxxxx] on behalf of Emilio Lahr-Vivaz [elahrvivaz@xxxxxxxx]
Sent: Tuesday, 29 August 2017 17:10
To: geomesa-users@xxxxxxxxxxxxxxxx
Subject: Re: [geomesa-users] number of threads constantly increasing

Hi Rémi,

What is your ingest setup like? If you're seeing thread exhaustion, my first guess would be that you are not cleaning up resources correctly (e.g. closing feature writers). Are you distributing your ingest? Usually for data sets that large you would distribute your ingestion across a cluster using spark or map/reduce, so I'm not sure we've tested billions of points from a single JVM.

The metadata batch writer is configured to use 2 threads, so there may be others generated by Accumulo but I wouldn't expect it to swamp the JVM by itself. It will also only write mutations when you call 'flush' or 'close' on a feature writer, so minimizing those calls would reduce the writes.

There are other batch writers associated with each feature writer you create; those are more likely to be causing issues than the metadata writer as there are more of them, with more threads and greater throughput.

You can control the underlying batch writer configuration via system properties as described here [1], although the metadata writer overrides max memory and max threads [2].

[1] http://www.geomesa.org/documentation/user/accumulo/configuration.html#batch-writer-properties
[2] https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloBackedMetadata.scala#L30

If your code is public, I'd be glad to give it a quick check for resource cleanup.

Thanks,

Emilio

On 08/29/2017 09:55 AM, GRISOT, REMI wrote:

Hello,

I'm using GeoMesa to store a large amount of data, and I noticed that while the insertion is running, the number of threads run by the application increases constantly (especially those named "BatchWriterLatencyTimer"; see the attached picture). Looking in JConsole, it seems that these threads are created in the file AccumuloBackedMetadata.scala (https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/data/AccumuloBackedMetadata.scala), which, through the method "write", calls an Accumulo method that creates the threads (https://github.com/Sciumo/Accumulo/blob/master/src/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java).
On a big insertion (billions of points), the number of threads was so high it caused an OutOfMemoryError, as no more threads could be created.
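As a complement to JConsole, the same growth can be watched from inside the JVM by counting live threads whose name starts with a given prefix. The sketch below uses only the standard Thread API; the class name, the helper names, and the "DemoLatencyTimer" prefix are made up for illustration, and in a real ingest one would presumably pass "BatchWriterLatencyTimer" instead.

```java
import java.util.concurrent.CountDownLatch;

class ThreadCountSketch {

    // Counts live threads in this JVM whose name starts with the given prefix.
    static int countThreadsNamed(String prefix) {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    // Starts n parked threads named "<prefix>-<i>", counts them, then lets them exit.
    static int demo(String prefix, int n) throws InterruptedException {
        CountDownLatch started = new CountDownLatch(n);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                started.countDown();
                try {
                    release.await();                    // stay alive until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, prefix + "-" + i);
            threads[i].setDaemon(true);
            threads[i].start();
        }
        started.await();                                // every thread is running now
        int seen = countThreadsNamed(prefix);
        release.countDown();
        for (Thread t : threads) {
            t.join();                                   // wait until they have terminated
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("live demo threads: " + demo("DemoLatencyTimer", 3));
        System.out.println("after release: " + countThreadsNamed("DemoLatencyTimer"));
    }
}
```

Logging such a count at intervals during an ingest would show whether the number levels off or keeps climbing.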

Do you have any idea of what could prevent this behavior?

Thank you.

Rémi Grisot
This message and any attachments (the "message") are intended solely for the addressee(s). It contains confidential information, that may be privileged. If you receive this message in error, please notify the sender immediately and delete the message. Any use of the message in violation of its purpose, any dissemination or disclosure, either wholly or partially is strictly prohibited, unless it has been explicitly authorized by the sender. As its integrity cannot be secured on the internet, Atos and its subsidiaries decline any liability for the content of this message. Although the sender endeavors to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted.


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users

