feat: add qos_reliablity param #329

MrBlenny · 2024-12-05T01:09:01Z

Changelog

Adds a qos_reliablity param to the ros2 bridge that allows the default reliability to be set via param. Defaults to "automatic" to retain existing behavior.

Docs

N/A

Description

The web-socket bridge runs on TCP which causes issues on flakey internet connections. There is some discussion here:
https://github.com/orgs/foxglove/discussions/15

This PR adds the ability to specify "qos_reliability" of "best_effort" which is useful when running a bridge on a different computer to the robot for example:

ROBOT COMPUTER -> DDS/UDP -> REMOTE COMPUTER -> Websocket/TCP -> Foxglove / Browser

jtbandes · 2024-12-05T01:11:26Z

when running a bridge on a different computer to the robot

Just curious, why are you doing this rather than running the bridge on the robot?

MrBlenny · 2024-12-05T03:06:11Z

Running the bridge on the robot means the connection from the Robot to Foxglove is TCP.
When comms is poor (for example satellite internet on a field robot) the TCP connection can go down for a few seconds which will lead to a build-up of messages, potentially up to the send_buffer_limit. When connection is established there will then be a large flood of messages, some of which may no longer be relevant.

In my use-case, it is fine if messages are dropped via the DDS middleware. Adding WebRTC or WebTransport would be a nicer solution but this param provides a quick fix.

achim-k · 2024-12-05T13:25:14Z

README.md

@@ -84,6 +84,7 @@ Parameters are provided to configure the behavior of the bridge. These parameter
 * (ROS 2) __num_threads__: The number of threads to use for the ROS node executor. This controls the number of subscriptions that can be processed in parallel. 0 means one thread per CPU core. Defaults to `0`.
 * (ROS 2) __min_qos_depth__: Minimum depth used for the QoS profile of subscriptions. Defaults to `1`. This is to set a lower limit for a subscriber's QoS depth which is computed by summing up depths of all publishers. See also [#208](https://github.com/foxglove/ros-foxglove-bridge/issues/208).
 * (ROS 2) __max_qos_depth__: Maximum depth used for the QoS profile of subscriptions. Defaults to `25`.
+ * (ROS 2) __qos_reliability__: The default QoS reliability setting for subscriptions the bridge creates. Can be 'reliable', 'best_effort', or 'automatic'. Defaults to `automatic`.


Would be good to explain what the automatic setting does under the hood.

I've added a description of all options and another best_effort_if_volatile.
I know this is borderline but... this is a very useful default if running the bridge remote as transient_local topics are typically not things that should be thrown away.

A better option would probably be to have a regex matcher on topics that specifies which QoS foxglove should use when in subscribes.

defunctzombie · 2024-12-10T15:05:28Z

README.md

+ * (ROS 2) __qos_reliability__: The default QoS reliability setting for subscriptions the bridge creates. Defaults to `automatic`.
+   * `reliable`: ALWAYS subscribe with a "reliable" QoS profile.
+   * `best_effort`: ALWAYS subscribe with a "best effort" QoS profile.
+   * `best_effort_if_volatile`: subscribe as "best effort" if all the participants are "volatile". If any are "transient_local", subscribe as "reliable".


Why is it wrong to subscribe with "best effort" if any of the participants are "transient_local" ?

I'm not sure I can fully answer this question but the following info may be helpful: https://docs.ros.org/en/rolling/Concepts/Intermediate/About-Quality-of-Service-Settings.html#qos-compatibilities

It's a common pattern to publish a topic with "transient_local" durability whenever that value changes. For example a /robot/healthy Bool topic may be published only when the value changes from true<->false.

If the bridge subscribes to a transient_local topic with best_effort over a flakey connection. The historic transient_local message may be lost. These sorts of transient_local topics are typically not re-published so the bridge will never get the value.

best_effort_if_volatile is a useful default in this case but as mentioned, it's probably better to add a regex matcher param instead of this qos_reliability param.

i.e.
best_effort_qos_topic_whitelist: List of regular expressions (ECMAScript) for topics that should use be forced to use 'best_effort' QoS. Unmatched topics will use 'reliable' QoS if ALL publishers are 'reliable', 'best_effort' if any publishers are 'best_effort'. Defaults to ["(?!)"] (match nothing).

I might have a look at adding this today.

I have made said change. Ready for review again.

👍 we'll take a look

It's a common pattern to publish a topic with "transient_local" durability whenever that value changes. For example a /robot/healthy Bool topic may be published only when the value changes from true<->false.

FWIW my general advice on this is to to pick a cadence where you publish things like this even if they have not changed. It helps with data recording and debugging afterwards because you have the state information present without having to go back minutes or hours in history to lookup the topic value (which is hard in some workflows).

feat: add qos_reliablity param

d789f66

jtbandes requested review from achim-k and defunctzombie December 5, 2024 01:12

defunctzombie requested a review from jtbandes December 5, 2024 02:39

achim-k approved these changes Dec 5, 2024

View reviewed changes

feat: add a "best_effort_if_volatile" option

7e3aa4b

defunctzombie reviewed Dec 10, 2024

View reviewed changes

MrBlenny added 3 commits December 11, 2024 11:07

feat: add bestEffortQosTopicWhiteList

57b8418

chore: update comment

b29b0d2

chore: remove nested if

a061d63

defunctzombie requested a review from achim-k January 3, 2025 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add qos_reliablity param #329

feat: add qos_reliablity param #329

MrBlenny commented Dec 5, 2024

jtbandes commented Dec 5, 2024

MrBlenny commented Dec 5, 2024 •

edited

Loading

achim-k Dec 5, 2024

MrBlenny Dec 10, 2024

defunctzombie Dec 10, 2024

jtbandes Dec 10, 2024

MrBlenny Dec 10, 2024 •

edited

Loading

MrBlenny Dec 11, 2024 •

edited

Loading

defunctzombie Dec 12, 2024

feat: add qos_reliablity param #329

Are you sure you want to change the base?

feat: add qos_reliablity param #329

Conversation

MrBlenny commented Dec 5, 2024

Changelog

Docs

Description

jtbandes commented Dec 5, 2024

MrBlenny commented Dec 5, 2024 • edited Loading

achim-k Dec 5, 2024

Choose a reason for hiding this comment

MrBlenny Dec 10, 2024

Choose a reason for hiding this comment

defunctzombie Dec 10, 2024

Choose a reason for hiding this comment

jtbandes Dec 10, 2024

Choose a reason for hiding this comment

MrBlenny Dec 10, 2024 • edited Loading

Choose a reason for hiding this comment

MrBlenny Dec 11, 2024 • edited Loading

Choose a reason for hiding this comment

defunctzombie Dec 12, 2024

Choose a reason for hiding this comment

MrBlenny commented Dec 5, 2024 •

edited

Loading

MrBlenny Dec 10, 2024 •

edited

Loading

MrBlenny Dec 11, 2024 •

edited

Loading