Hi GeoMesa Users,
We are using GeoMesa with an S3 file system datastore and are experiencing extremely slow response times when we access our data - even with a "moderate" number of files stored in it (say, 10,000).
Our setup:
* GeoMesa 2.3.0
* Filesystem datastore pointing to an S3 URL
** encoding: orc
** partition scheme: daily,xz2-8bits
** leaf-storage: true
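For reference, this is roughly how our microservice connects to the store - a minimal sketch only, assuming the FSDS factory parameter keys "fs.path" and "fs.encoding"; the bucket path here is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class FsdsParams {
    // Builds the connection parameter map for the file system datastore.
    public static Map<String, String> params() {
        Map<String, String> params = new HashMap<>();
        // Root of the datastore; "s3a://" selects Hadoop's S3A filesystem
        // (bucket name is a placeholder, not our real bucket)
        params.put("fs.path", "s3a://example-bucket/geomesa/");
        // ORC encoding, matching the existing store
        params.put("fs.encoding", "orc");
        return params;
    }
    // In the actual service this map is passed to
    // org.geotools.data.DataStoreFinder.getDataStore(params),
    // which is where we then call getTypeNames() etc.
}
```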
We’re accessing that data store using different “clients”:
* a Java microservice which uses the GeoTools GeoMesa API and is running in the same AWS region as the S3 bucket
* GeoServer (2.14) running in the same AWS region as the S3 bucket
* geomesa-fs CLI running in the same AWS region as the S3 bucket
All of them are really slow (responses take minutes, sometimes hours). While debugging our microservice, we found that even operations like org.geotools.data.DataStore.getTypeNames() take a very long time, because all of the metadata files appear to be scanned. That does not seem necessary, since reading the top-level per-feature storage.json files should be sufficient. Is that works-as-designed, or might it be a bug in the GeoMesa FSDS implementation?
Is there anything (besides switching the actual data store) we can do to improve the performance?
We run "geomesa-fs compact …" from time to time, which restores fairly acceptable query performance - but the compaction itself takes hours, sometimes even days, to complete.
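For completeness, the compaction we run looks roughly like this - an illustrative command only; the path is a placeholder, and the exact flag names may differ by GeoMesa version:

```
# Compact small data files per partition (path and type name are placeholders)
geomesa-fs compact \
  --path s3a://example-bucket/geomesa/ \
  --feature-name our-feature-type
```

Even for our ~10,000 files this regularly runs for hours, so it is not something we can run frequently enough to keep query latency down.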
Thanks,
Christian
Kind regards
Christian Sickert
Crowd Data & Analytics for Automated Driving
Daimler AG - Mercedes-Benz Cars Development - RD/AFC
+49 176 309 71612
christian.sickert@xxxxxxxxxxx