Re: [geomesa-dev] Kafka datastore contribution request

Hi Chad,

This sounds like a great feature.  And thanks for starting a discussion first!  For general contributing requirements, check out https://github.com/locationtech/geomesa/blob/master/CONTRIBUTING.md

Ideally, we want a little more control than 'start now' and 'start at the beginning'.  From a quick read, there are two separate configurations which could be exposed.  Your suggestion to expose the Kafka consumer's 'auto.offset.reset' setting is one of them, and that's a concrete piece of work.
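
To make the first option concrete, here is a minimal sketch of how a user-facing setting might be validated and passed through to the consumer configuration.  The function name and the `new_consumer` flag are hypothetical, for illustration only; they are not part of the GeoMesa API.  The one thing to keep in mind is that the older Kafka consumer takes "largest"/"smallest" while the newer consumer takes "latest"/"earliest" for the same behaviour.

```python
# Legacy consumer values discussed in this thread, mapped to the names the
# newer Kafka consumer uses for the same behaviour.
LEGACY_TO_NEW = {"largest": "latest", "smallest": "earliest"}


def consumer_props(offset_reset="largest", new_consumer=False):
    """Build the offset-related consumer properties (hypothetical helper).

    Defaults to "largest" so that existing behaviour is preserved: the live
    feature source only sees messages written after it starts.
    """
    if offset_reset not in LEGACY_TO_NEW:
        raise ValueError(
            "offset_reset must be 'largest' or 'smallest', got %r" % (offset_reset,)
        )
    value = LEGACY_TO_NEW[offset_reset] if new_consumer else offset_reset
    return {"auto.offset.reset": value}
```

Calling `consumer_props("smallest")` would ask the consumer to start from the beginning of the topic when it has no committed offset, which is the cache-repopulating case described in the original request.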

It'd also be great to specify a 'read-behind' number or time period.  This is slightly separate from the Kafka config mentioned above, and would allow for fine-grained control.  For instance, a user could specify an offset of 1000 messages, meaning the consumer starts 1000 messages before the end of the topic.  If they specify a time window, it is easy enough to use a binary search through the Kafka WAL to find an appropriate offset.  (This is already written in the Kafka ReplayDataStore bits.)
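
A rough sketch of both read-behind flavours follows.  This is not the ReplayDataStore code, just an illustration of the idea; the function names and the `timestamp_at` callback (which stands in for reading one message and returning its timestamp) are assumptions.  It also assumes message timestamps never decrease as offsets increase, which is what makes a binary search valid.

```python
def read_behind_by_count(earliest, latest, count):
    """Start `count` messages before the end of the log.

    `latest` is the end offset (one past the last message).  The result is
    clamped to `earliest` in case fewer than `count` messages are retained.
    """
    return max(earliest, latest - count)


def read_behind_by_time(earliest, latest, timestamp_at, start_ms):
    """Binary search for the first offset whose timestamp is >= start_ms.

    Searches offsets in [earliest, latest).  Returns `latest` if every
    retained message is older than `start_ms`.
    """
    lo, hi = earliest, latest
    while lo < hi:
        mid = (lo + hi) // 2
        if timestamp_at(mid) < start_ms:
            lo = mid + 1  # message at mid is too old; look to the right
        else:
            hi = mid      # message at mid qualifies; look for an earlier one
    return lo


# Toy log: offsets 0..4 with these timestamps (milliseconds).
log = [100, 200, 300, 400, 500]
start = read_behind_by_time(0, len(log), lambda offset: log[offset], 250)
```

With the toy log above, `start` is the offset of the message stamped 300, since that is the first one at or after 250.  Each probe costs one message read, so a time-based start needs only about log2(n) reads even on a large topic.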

The two ideas are related, but can be worked out separately.  (That is, I'm not trying to suggest that any contribution would have to solve both.)  Thoughts?

Cheers,

Jim

On 3/15/2017 2:31 AM, Chad Phillips wrote:
I'd like to contribute a feature for the geomesa kafka datastore libraries that exposes the auto offset reset configuration in the Kafka consumer when using the live feature source.  To preserve the existing default behaviour, this would be set to "largest", but a user could also set it to "smallest" in order to start reading from the beginning of the topic, re-populating the cache with any existing features upon initialization of the feature source.  This is useful in situations where a user is expecting low volume data, or highly volatile data where the data in the Kafka topic backing the feature source is only being kept for a short period of time.  In both of those situations, having the cache repopulate upon initialization (and continuously receive updates as they come in after that) makes it easy for a user to continue consuming data after restarting their system, or restarting GeoServer (for example), without having to use the replay feature source in addition to the live feature source.


_______________________________________________
geomesa-dev mailing list
geomesa-dev@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-dev