CouchDB replications cannot be configured with 'heartbeat' #5043

Open
JKDingwall opened this issue Apr 29, 2024 · 1 comment

Comments

@JKDingwall
Contributor

Summary

In CouchDB 1.7.2 the internal replicator allowed use of the heartbeat parameter, so that the query parameters looked like:

client_ip - [httpuser] [18/Apr/2024:15:53:37 +0000] "GET /remote/_changes?feed=continuous&style=all_docs&since=0&heartbeat=10000 HTTP/1.1 @ 127.0.0.1:5984" 200 283510 "-" "CouchDB/1.7.2" "couchdb-master" "127.0.0.1:5984"

Since upgrading to CouchDB 3.x.x the replication uses the timeout parameter instead:

client_ip - [httpuser] [29/Apr/2024:11:24:49 +0000] "GET /remote/_changes?feed=continuous&style=all_docs&since=872-g1AAAACheJzLYWBgYMpgTmEQTM4vTc5ISXIwNDLXMwBCwxyQVCJDUv3___-zMpiTGBgYW3OBYuxJKYZJqUaG2PTgMSmPBUgyNACp_wgDH4MNNElMTU02NsKmNQsAHRwqOg&timeout=10000 HTTP/1.1 @ 127.0.0.1:5984" 200 180 "-" "CouchDB-Replicator/3.3.3" "couchdb-master" "127.0.0.1:5984"

This does not affect the correctness of the replication, but it does mean a new connection is established each time the timeout elapses. With thousands of replications defined, as in our environment, this results in significant activity and a large HTTP proxy access log. We would like to be able to configure the replication to use the heartbeat parameter via the replication document. I have tried using query_params on the replication document, but it looks like these parameters are only used if a filter function is configured.
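To give a rough sense of scale, the sketch below estimates how many `_changes` requests (and therefore access-log lines) are produced per day if every idle continuous replication reconnects once per timeout. The figure of 1000 replications and the assumption of exactly one request per timeout interval are illustrative, not measured values, and `requests_per_day` is a hypothetical helper name.

```python
def requests_per_day(replications: int, timeout_ms: int = 10_000) -> int:
    """Estimate _changes requests per day, assuming each idle
    replication reconnects exactly once per timeout interval."""
    ms_per_day = 24 * 60 * 60 * 1000
    reconnects_per_replication = ms_per_day // timeout_ms
    return replications * reconnects_per_replication


if __name__ == "__main__":
    # Illustrative input: 1000 replications with the 10s timeout seen in the log.
    print(requests_per_day(1000))
```

Under those assumptions, each replication accounts for 8640 requests per day on its own, so the total grows linearly with the number of replication documents.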

Desired Behaviour

The desired behaviour is that a single persistent connection is opened per continuous replication, rather than a new connection being established every 10s (the default).

Possible Solution

Support a heartbeat parameter in the replication settings which will be used in preference to timeout if it is defined: https://docs.couchdb.org/en/stable/json-structure.html#replication-settings

I have modified couch_replicator_api_wrap.erl in the source to provide a heartbeat parameter in changes_since():

diff --git a/src/couch_replicator/src/couch_replicator_api_wrap.erl b/src/couch_replicator/src/couch_replicator_api_wrap.erl
index a44a79da1..384f0135b 100644
--- a/src/couch_replicator/src/couch_replicator_api_wrap.erl
+++ b/src/couch_replicator/src/couch_replicator_api_wrap.erl
@@ -546,7 +546,10 @@ changes_since(
             false ->
                 [{"feed", "normal"}];
             true ->
-                [{"feed", "continuous"}]
+                [
+                    {"feed", "continuous"},
+                    {"heartbeat", "true"}
+                ]
         end ++
             [
                 {"style", atom_to_list(Style)},

This change appears in the access.log, but it looks like there is more to it than this, as the connections are still closing (although not after 10s; perhaps on receipt of a heartbeat line?).
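One thing that may be worth checking here: with heartbeat set, CouchDB sends an empty line on the continuous feed at each interval, so whatever consumes the feed has to treat blank lines as keep-alives rather than as an end of stream or a parse error. The following is a minimal client-side sketch in Python of that handling. It is not the replicator's actual Erlang code, `iter_changes` is a hypothetical helper name, and the sample rows are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_changes(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded change rows from a continuous _changes feed,
    skipping the empty lines that act as heartbeats."""
    for line in lines:
        line = line.strip()
        if not line:
            # Empty line: heartbeat only, the connection is alive
            # but there is no change to report.
            continue
        yield json.loads(line)


if __name__ == "__main__":
    # Made-up sample feed: two change rows separated by heartbeat lines.
    sample = [
        '{"seq": "1-abc", "id": "doc1", "changes": [{"rev": "1-x"}]}',
        "",
        "\n",
        '{"seq": "2-def", "id": "doc2", "changes": [{"rev": "1-y"}]}',
    ]
    for row in iter_changes(sample):
        print(row["id"])
```

If the consumer instead treats the blank line as a complete response, it would close the connection on the first heartbeat, which would match the symptom described above. Whether that is what happens in the replicator is an assumption that would need confirming in the source.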

Additional context

The current behaviour generates GBs of log data that is largely irrelevant to the functioning of our environment. I suspect that a heartbeat connection uses fewer bytes to maintain than repeatedly opening and closing connections, so it may also be cheaper when bandwidth usage is metered. We can use SO_KEEPALIVE so that dead connections get closed, which covers a noted advantage of the 10s timeout. I did find this commit (added for 2.0.0) in the history, but I can't find the relevant bug tracking system to see if there is any discussion of the rationale for the change.

commit e13f5b51511f1c8d773c3eb7f8f304a508c8f346
Author: Bob Dionne <bob@cloudant.com>
Date:   Wed Feb 27 09:16:55 2013 -0500

    Replace heartbeat in continuous feed with timeout
    
    BugzID: 17662
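
For reference on the SO_KEEPALIVE point above, enabling the option on a TCP socket looks roughly like this in Python. This only illustrates the socket option itself; the replicator is written in Erlang and would need the equivalent setting in its own connection options, which is not shown here. `make_keepalive_socket` is a hypothetical name.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create an unconnected TCP socket with SO_KEEPALIVE enabled, so the
    OS probes an idle peer and eventually closes a dead connection."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    # A non-zero value means keepalive is enabled (the exact value is platform dependent).
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

The probe timing (idle time before the first probe, interval, retry count) comes from operating system defaults unless tuned separately, so detection of a dead peer can be much slower than a 10s timeout.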
@big-r81
Contributor

big-r81 commented Jan 2, 2025

Why don't you use the heartbeat parameter for continuous replication anymore to keep the connection open? timeout will hold the connection open for the specified number of milliseconds and then close the socket. More info in the docs.
