[geomesa-users] Improving geomesa performance with MapReduce

Problem Description:

Our platform currently uses GeoMesa to store large amounts of geographical and time-sensitive metadata, and we are seeing very poor performance (specifically, high latency) with our system's current configuration. The primary bottleneck is the large amount of data returned by GeoMesa, so we are actively pursuing ways to shrink the size of the responses. We have been investigating the use of MapReduce within the system, but have run into some knowledge gaps due to the lack of documentation. The idea behind our MapReduce use case is either to intercept queries coming into our cluster, or to run jobs periodically that combine and reduce the primary dataset and place the results in a separate table. Ideally we would intercept the queries, since the data reduction depends on the parameters of each query and is therefore difficult to perform in advance.

 

MapReduce Options

·        Intercept queries coming into our cluster and have them trigger jobs that combine and reduce each query's raw metadata into a smaller set of formatted/processed data points, which are then returned to our backend services as the result of the query.

·        Periodically, or in response to events such as a write to a table, trigger a job that processes and reduces the primary dataset and writes the result to our new “query” table.

 

Questions

·        Can MapReduce jobs be triggered by events in the database?

·        Can one intercept the queries sent to a GeoMesa instance?

·        How are MapReduce jobs initiated, and can they be triggered programmatically?

·        Can we send back the results of a MapReduce job as the result of a query?

·        Are there any other options to reduce the latency incurred by large responses from the database?

 

We were hoping that you'd be able to give us some insight into our problems and help us assess the feasibility of our MapReduce and GeoMesa use case.

 

