Re: [hono-dev] Changes in settlement of telemetry and dispatch router config?

On 14/09/17 13:58, Hudalla Kai (INST/ECS4) wrote:
We are currently sending telemetry messages to multicast addresses pre-settled. Doing so, we are seeing many (telemetry) messages being dropped when we run our JMeter tests, i.e. we often receive fewer messages than we have sent. Hono Messaging doesn't seem to drop the messages (we can tell from the metrics), so we assume that the Dispatch Router drops them. The problem here seems to be that the Dispatch Router starts to discard messages aggressively if the upstream senders send more messages than the downstream consumers can handle (i.e. if any of the downstream consumers is not providing enough credits). The number of credits the Dispatch Router hands out to Hono Messaging does not seem to be linked to the credits the DR receives from the downstream consumers. Note that this is true for both multicast and balanced target addresses.

So, if we want to make sure that the Protocol Adapters only get as many credits as the DR can forward to its downstream consumers, we need to send telemetry messages unsettled, because only this way the DR is able to provide end-to-end flow control. However, if we do so, we also need to make sure that the telemetry addresses are balanced (not multicast), as Marc pointed out.
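To make the difference between the two modes concrete, here is a deliberately simplified toy model of a single router hop. It is only a sketch of the behaviour described above, not of the Dispatch Router's actual implementation; the function name `simulate` and its parameters are made up for illustration.

```python
def simulate(sent, consumer_credit, presettled):
    """Toy model of one router hop.

    Returns a tuple (delivered, dropped_by_router, held_back_at_sender).
    """
    delivered = min(sent, consumer_credit)
    excess = sent - delivered
    if presettled:
        # Assumption: upstream credit is not tied to downstream credit, so the
        # sender transmits everything and the router discards the excess.
        return delivered, excess, 0
    # Assumption: for unsettled deliveries the sender only gets as much credit
    # as the consumer has granted, so the excess never leaves the sender.
    return delivered, 0, excess


print(simulate(100, 60, presettled=True))   # (60, 40, 0)
print(simulate(100, 60, presettled=False))  # (60, 0, 40)
```

In both cases the consumer receives the same 60 messages; what changes is whether the remaining 40 are lost in the router or stay with the sender, which can then decide what to do with them.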

Do you want to allow multiple independent subscribers to the telemetry stream? If so, and if you want at-least-once, you need to route them through a broker topic. You can do this by establishing link routes for that address, configured to route the links to the broker hosting the topic.

In general, if your consumers can't keep up with the rate at which telemetry messages are produced, there are really only two things you can do: queue the messages up somewhere or drop some of them.

Queuing the messages up somewhere only makes sense if the situation is temporary, and the consumers will catch up (meaning messages have to actually be processed faster than they are produced for some amount of time, to allow the backlog to be cleared). If queuing makes sense, it can be done either at the producers (by throttling them) or by the messaging infrastructure. The router deliberately only has a fixed (and relatively small) amount of buffer space. For handling larger backlogs, a broker is a better choice.
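As a rough illustration of why the backlog only clears when consumption outpaces production for a while, and of what a small fixed buffer means, here is a minimal sketch. The rates, buffer sizes and the function name `run_backlog` are assumptions chosen for the example, not measurements from the router or any broker.

```python
def run_backlog(produce_rates, consume_rate, buffer_size):
    """Step a bounded buffer through time.

    produce_rates: messages produced in each time step.
    consume_rate:  messages the consumers can process per time step.
    buffer_size:   maximum backlog that can be held between steps.
    Returns (backlog after each step, total messages dropped).
    """
    backlog, dropped, history = 0, 0, []
    for produced in produce_rates:
        backlog += produced
        backlog -= min(backlog, consume_rate)  # consumers drain what they can
        if backlog > buffer_size:              # overflow has to be discarded
            dropped += backlog - buffer_size
            backlog = buffer_size
        history.append(backlog)
    return history, dropped


# A burst of 15 msgs/step for 3 steps, then 5 msgs/step; consumers handle 10.
rates = [15, 15, 15, 5, 5, 5, 5, 5]
print(run_backlog(rates, consume_rate=10, buffer_size=100))
# ([5, 10, 15, 10, 5, 0, 0, 0], 0)
print(run_backlog(rates, consume_rate=10, buffer_size=8))
# ([5, 8, 8, 3, 0, 0, 0, 0], 7)
```

With the large buffer the burst is absorbed and drained once production falls below the consume rate; with the small buffer the same burst costs 7 messages. If production never dropped below 10 per step, the backlog would grow without bound and no buffer size would be enough.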

The decision to drop messages could likewise be taken either at the producers (provided they receive some indication of the problem) or by the messaging infrastructure. For telemetry data, the producers might reduce the frequency at which they send updates, or more aggressively batch any data that needs to be archived.

However, where it is not desirable (or not possible) to put the responsibility for this onto the producers, the messaging infrastructure could be given hints about how to drop messages so as to minimise the impact, e.g. TTLs on messages whose data becomes stale (this saves the consumers from having to process these stale messages, and gives them more chance to catch up with the recent data) or 'last-value' type semantics, where a new update effectively removes any older messages. At present the router doesn't do anything like this (though modifying it to use the TTL when dropping messages should be possible), but brokers do.
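The two hints mentioned above can be sketched as follows. This is only an illustration of the semantics, not how any particular broker implements them; the message fields (`key`, `sent_at`, `ttl`, `value`) and the function names are hypothetical.

```python
def drop_expired(messages, now):
    """Keep only messages whose time-to-live has not yet elapsed."""
    return [m for m in messages if now - m["sent_at"] < m["ttl"]]


def last_value(messages):
    """Keep only the newest message per key ('last-value' semantics)."""
    latest = {}
    for m in messages:
        latest.pop(m["key"], None)  # discard the older update for this key
        latest[m["key"]] = m        # re-insert so the newest sits at the end
    return list(latest.values())


backlog = [
    {"key": "device-1", "sent_at": 0, "ttl": 5, "value": 20.1},
    {"key": "device-2", "sent_at": 3, "ttl": 5, "value": 18.4},
    {"key": "device-1", "sent_at": 4, "ttl": 5, "value": 20.7},
]

print([m["value"] for m in drop_expired(backlog, now=6)])  # [18.4, 20.7]
print([m["value"] for m in last_value(backlog)])           # [18.4, 20.7]
```

Either way the slow consumer ends up with two messages to process instead of three, and the ones that remain are the ones most likely still to be of interest.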

