Re: [hono-dev] Changes in settlement of telemetry and dispatch router config?
On 14/09/17 13:58, Hudalla Kai (INST/ECS4) wrote:
We are currently sending telemetry messages to multicast addresses
pre-settled. Doing so, we are seeing many (telemetry) messages being
dropped when we run our JMeter tests, i.e. we often receive fewer
messages than we have sent. Hono Messaging doesn't seem to drop the
messages (we can tell because of the metrics) so we assume that the
Dispatch Router drops them. The problem here seems to be that the
Dispatch Router starts to discard messages aggressively, if the upstream
senders send more messages than the downstream consumers can handle
(i.e. if any of the downstream consumers is not providing enough
credits). The number of credits the Dispatch Router hands out to Hono
Messaging seems to not be linked to the credits the DR receives from the
downstream consumers. Note that this is true for both multicast and
balanced target addresses.
So, if we want to make sure that the Protocol Adapters only get as many
credits as the DR can forward to its downstream consumers, we need to
send telemetry messages unsettled, because only this way the DR is able
to provide end-to-end flow control. However, if we do so, we also need
to make sure that the telemetry addresses are balanced (not multicast),
as Marc pointed out.
Do you want to allow multiple independent subscribers to the telemetry
stream? If so, and if you want at-least-once, you need to route them
through a broker topic. You can do this by establishing link routes for
that address, configured to route the links to the broker hosting the topic.
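
As a rough illustration of the link-route setup described above, a router
configuration fragment might look like the following. This is a sketch under
assumptions, not a tested configuration: the broker host name, the port, the
connector name "broker" and the "telemetry" prefix are all placeholders, and
the attribute names follow the Qpid Dispatch Router 1.x configuration format
(older releases spell the direction attribute "dir").

```
# Hypothetical connection to the broker that hosts the topic.
connector {
    name: broker
    host: broker.example.org
    port: 5672
    role: route-container
}

# Route links for the (assumed) telemetry prefix to that broker,
# one entry per link direction.
linkRoute {
    prefix: telemetry
    direction: in
    connection: broker
}

linkRoute {
    prefix: telemetry
    direction: out
    connection: broker
}
```

With link routing the router forwards the link attach to the broker rather
than routing individual messages, so settlement and credit are exchanged
between the client and the broker.
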
In general, if your consumers can't keep up with the rate at which
telemetry messages are produced, there are really only two things you can
do: queue the messages up somewhere or drop some of them.
Queuing the messages up somewhere only makes sense if the situation is
temporary, and the consumers will catch up (meaning messages have to
actually be processed faster than they are produced for some amount of
time, to allow the backlog to be cleared). If queuing makes sense, it
can be done either at the producers (by throttling them) or by the
messaging infrastructure. The router deliberately only has a fixed (and
relatively small) amount of buffer space. For handling larger backlogs,
a broker is a better choice.
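
To make the point about backlogs concrete, here is a small toy simulation. It
is not how the router or any broker is implemented; the function name, the
rates and the capacities are made up for illustration. It shows that a backlog
only clears once the consume rate exceeds the produce rate, and that a small
fixed buffer turns a temporary burst into dropped messages.

```python
def simulate(produce_rates, consume_rate, capacity):
    """Toy model of a bounded buffer between producers and a consumer.

    produce_rates: messages produced in each time step
    consume_rate:  messages the consumer can process per time step
    capacity:      maximum number of messages the buffer can hold

    Returns (backlog after each step, total messages dropped).
    """
    backlog = 0
    dropped = 0
    history = []
    for produced in produce_rates:
        backlog += produced
        # Anything that does not fit into the buffer is discarded.
        if backlog > capacity:
            dropped += backlog - capacity
            backlog = capacity
        # The consumer takes what it can in this step.
        backlog -= min(backlog, consume_rate)
        history.append(backlog)
    return history, dropped


# A burst of 10 msgs/step for three steps, then 2 msgs/step.
burst = [10, 10, 10, 2, 2, 2, 2]

# Large buffer (broker-like): nothing dropped, backlog drains after the burst.
print(simulate(burst, consume_rate=5, capacity=100))

# Small buffer (router-like): the burst overflows and messages are dropped.
print(simulate(burst, consume_rate=5, capacity=8))
```

In the first call the backlog grows during the burst and shrinks afterwards
because the consumer is then faster than the producers. In the second call the
buffer never holds more than 8 messages, so the excess is lost.
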
The decision to drop messages could likewise be taken either at the
producers (providing they receive some indication of the problem) or by
the messaging infrastructure. For telemetry data, the producers might
reduce the frequency at which they send updates, or more aggressively
batch any data that needs to be archived.
However, where it is not desirable (or not possible) to put the
responsibility for this onto the producers, the messaging infrastructure
could be given hints about how to drop messages so as to minimise the
impact. Examples are TTLs on messages whose data becomes stale (this
saves the consumers from having to process stale messages, and gives them
a better chance of catching up with the recent data), or 'last-value'
semantics, where a new update effectively replaces any older messages.
At present the router doesn't do anything like this (though modifying it
to use TTL when dropping messages should be possible), but brokers do.
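
The two hints mentioned above can be sketched in a few lines. This is only an
illustration of the semantics, not the behaviour of any particular broker; the
class name, the per-device key and the explicit "now" parameter are
assumptions made to keep the example deterministic.

```python
from collections import OrderedDict


class LastValueBuffer:
    """Toy buffer combining 'last-value' semantics with a TTL.

    Each key (e.g. a device id) holds at most one pending message;
    a newer update replaces the older one. Messages older than
    ttl_seconds are discarded when the consumer drains the buffer.
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = OrderedDict()  # key -> (timestamp, payload)

    def put(self, key, payload, now):
        # Remove any older message for this key, then append the new one
        # so that iteration order reflects the most recent update time.
        self._entries.pop(key, None)
        self._entries[key] = (now, payload)

    def drain(self, now):
        # Hand over only messages that have not outlived their TTL.
        fresh = [
            (key, payload)
            for key, (stamp, payload) in self._entries.items()
            if now - stamp <= self.ttl
        ]
        self._entries.clear()
        return fresh


buf = LastValueBuffer(ttl_seconds=10)
buf.put("device-1", 20.5, now=0)
buf.put("device-2", 18.0, now=5)
buf.put("device-1", 21.0, now=8)   # replaces the reading from now=0
print(buf.drain(now=16))           # device-2's reading is older than 10s
```

A slow consumer that drains at time 16 sees only the latest reading for
device-1; the superseded and the stale readings never reach it.
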