incorrect lastLsn #30

Open
exe-dealer opened this issue Sep 26, 2022 · 6 comments

Comments

@exe-dealer
Member

The endLsn of a PrimaryKeepaliveMessage is greater than the lsn of the XLogData messages. This moves the replication slot to that position before the XLogData messages have actually been consumed.

pgwire/mod.js

Line 1611 in 12826ad

if (lastLsn < msg.endLsn) lastLsn = msg.endLsn;

@exe-dealer
Member Author

Postgres no longer sends empty transactions, so we need to use the endLsn of PrimaryKeepaliveMessage to advance the slot position.

@exe-dealer
Member Author

The endLsn of PrimaryKeepaliveMessage runs ahead of the XLogData lsn, but never further than commitLsn. So acking endLsn does not cause message loss, because replication restarts from the start of the transaction. My problem was that I mistakenly thought I could use the slot's confirmed_flush_lsn to filter out already handled messages in the case of big transactions. But message lsn is neither monotonic nor unique.
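Since per-message lsn values cannot be used for deduplication, one alternative is to decide per transaction instead. The following is only a minimal sketch of that idea, not pgwire API: the `Msg` shape, `compareLsn`, and `makeTxFilter` are hypothetical names, and it assumes that the `begin` message carries the transaction's `commitLsn`, that transactions arrive in commit order, and that the application persists the commit LSN of the last fully handled transaction itself.

```typescript
// Hypothetical message shape; field names are assumptions for illustration.
type Msg =
  | { tag: "begin"; commitLsn: string }
  | { tag: "commit"; commitLsn: string }
  | { tag: "insert" | "update" | "delete"; lsn: string };

// Compare two textual LSNs of the form "HI/LO" (two 32-bit hex halves).
// Returns a negative number, zero, or a positive number.
function compareLsn(a: string, b: string): number {
  const [aHi, aLo] = a.split("/").map((h) => parseInt(h, 16));
  const [bHi, bLo] = b.split("/").map((h) => parseInt(h, 16));
  return aHi !== bHi ? aHi - bHi : aLo - bLo;
}

// Build a predicate that drops every message of a transaction whose
// commit LSN is not greater than the last handled commit LSN.
function makeTxFilter(lastHandledCommitLsn: string): (msg: Msg) => boolean {
  let skipTx = false;
  return (msg) => {
    if (msg.tag === "begin") {
      // Decide once per transaction, never per row message.
      skipTx = compareLsn(msg.commitLsn, lastHandledCommitLsn) <= 0;
    }
    return !skipTx;
  };
}
```

The decision is taken on `begin` and then applied to every following message until the next `begin`, so the lsn of individual row messages is never consulted.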

@rkistner
Contributor

I'm also running into the issue where endLsn from PrimaryKeepaliveMessage is not available anymore.

On an idle instance on AWS RDS, the WAL grows by 64MB every 5 minutes. When it's idle, there are no messages received on the WAL other than the PrimaryKeepaliveMessage. This means I can't ack anything, and the WAL just keeps on growing until the instance runs out of space.

I've worked around this by periodically using pg_logical_emit_message, but this only works on Postgres 14+.
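A rough sketch of such a periodic heartbeat is shown below. It is not taken from the commenter's code: `RunQuery`, `emitHeartbeat`, and `startHeartbeat` are hypothetical names, the `'heartbeat'` prefix is arbitrary, and the query function is injected so the sketch does not assume any particular client API. It also assumes the replication stream was started with pgoutput's `messages` option enabled, so that the emitted message actually reaches the consumer.

```typescript
// Anything that can run one SQL statement; injected so the sketch does
// not depend on a specific client library.
type RunQuery = (sql: string) => Promise<unknown>;

// Non-transactional logical message with an empty payload.
const HEARTBEAT_SQL = "SELECT pg_logical_emit_message(false, 'heartbeat', '')";

// Emit a single heartbeat message into the WAL.
function emitHeartbeat(runQuery: RunQuery): Promise<unknown> {
  return runQuery(HEARTBEAT_SQL);
}

// Emit a heartbeat every intervalMs milliseconds.
// Returns a function that stops the timer.
function startHeartbeat(runQuery: RunQuery, intervalMs: number): () => void {
  const timer = setInterval(() => {
    emitHeartbeat(runQuery).catch((err) => {
      // A failed heartbeat only delays the next ack; log and carry on.
      console.error("heartbeat failed", err);
    });
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Each heartbeat gives the consumer a message whose lsn it can ack, which lets the slot advance on an otherwise idle instance.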

@rkistner
Contributor

Thanks, the change in the commit above solves this issue for me.

FWIW, I only use this LSN when not inside a transaction, just to be safe. Something like the following:

      // Tracks whether we are between a 'begin' and its 'commit'.
      let inTx = false;
      // Last LSN acknowledged to the server.
      let ackedLsn = lastPersistedLsn;

      for await (const chunk of replicationStream.pgoutputDecode()) {
        await touch();

        const { messages, lastLsn } = chunk;

        for (const msg of messages) {
          if (msg.tag === 'begin') {
            inTx = true;
          } else if (msg.tag === 'commit') {
            inTx = false;
            // (flush data here)
            ackedLsn = msg.lsn!;
            await replicationStream.ack(msg.lsn!);
          } else {
            // (write data here)
          }
        }

        // Only ack the chunk's lastLsn when no transaction is in progress.
        if (!inTx && lastLsn > ackedLsn) {
          ackedLsn = lastLsn;
          await replicationStream.ack(lastLsn);
        }
      }

@jmealo

jmealo commented Jan 19, 2024

Is this issue addressed by #30? 9532a39

@rkistner
Contributor

Yes, the issue is fixed for me.
