Description of the Issue:
We are experiencing a data replication latency issue in our PostgreSQL setup managed by Patroni. The issue was reported by the application team when querying a view on the replica node. The data was not initially visible on the replica but was accessible on the leader (primary). After a delay, the data became available on the replica.
The observed latency is a concern, especially for application functionalities dependent on near real-time data replication.
Setup Details:
- Environment: Patroni-managed PostgreSQL High Availability (HA) Cluster
- Replication Mode: Asynchronous
- Primary-Replica Configuration: 1 leader (primary) and 2 replicas
- Patroni Version: [Specify Version]
- PostgreSQL Version: [Specify Version]
- Replication Slots: Enabled/Disabled (specify)
- Network Setup: Nodes are within the same data center / across regions (specify).
Observations:
pg_last_xact_replay_timestamp:
- The timestamp on the replica node indicated it was lagging behind the leader.
- The last replayed transaction timestamp was older than expected, even when active transactions were occurring on the primary.
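For reference, the way this timestamp is usually turned into a lag figure can be sketched as follows. This is a minimal sketch, not the exact check we ran: it assumes PostgreSQL 10 or later function names, and the helper name `replay_delay_seconds` is hypothetical. The caveat it encodes is that `now() - pg_last_xact_replay_timestamp()` keeps growing while the primary is idle, so the value is only meaningful when the replica has received WAL it has not yet replayed.

```python
from datetime import datetime, timezone

# Query to run on the replica (function names as of PostgreSQL 10+).
REPLICA_LAG_SQL = """
SELECT pg_last_wal_receive_lsn()       AS receive_lsn,
       pg_last_wal_replay_lsn()        AS replay_lsn,
       pg_last_xact_replay_timestamp() AS last_replay_ts,
       now()                           AS checked_at;
"""

def replay_delay_seconds(receive_lsn, replay_lsn, last_replay_ts, checked_at):
    """Estimate replay delay from one row of REPLICA_LAG_SQL (hypothetical helper).

    Returns None when no transaction has been replayed yet, 0.0 when all
    received WAL has already been replayed, and otherwise the age of the
    last replayed commit in seconds.
    """
    if last_replay_ts is None:
        # pg_last_xact_replay_timestamp() is NULL until a commit is replayed.
        return None
    if receive_lsn == replay_lsn:
        # Nothing received is waiting for replay; an old timestamp here may
        # only mean the primary has been idle.
        return 0.0
    return max(0.0, (checked_at - last_replay_ts).total_seconds())
```

Note that a result of 0.0 only says the replica has applied everything it received; WAL that has not yet arrived from the leader is not covered by this check.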
Behavior:
- Data committed on the leader was visible immediately.
- A noticeable delay occurred before the data was available on the replica.
- No significant load or downtime was observed during this period.
WAL Replication:
- WAL records appear to be streaming, but their replay (apply) on the replica may be delayed.
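To separate "WAL is not arriving" from "WAL arrives but is replayed late", the LSN columns of `pg_stat_replication` on the leader can be compared stage by stage. The sketch below assumes PostgreSQL 10 or later column names; the helper names `lsn_to_int` and `lag_breakdown` are hypothetical and only illustrate the arithmetic.

```python
# Query to run on the leader (column names as of PostgreSQL 10+).
LEADER_LAG_SQL = """
SELECT application_name, state,
       pg_current_wal_lsn() AS current_lsn,
       sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;
"""

def lsn_to_int(lsn):
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lag_breakdown(current_lsn, sent_lsn, write_lsn, flush_lsn, replay_lsn):
    """Split total lag (in bytes) into the stage where the WAL is waiting."""
    cur, sent, write, flush, replay = (
        lsn_to_int(x)
        for x in (current_lsn, sent_lsn, write_lsn, flush_lsn, replay_lsn)
    )
    return {
        "not_yet_sent": cur - sent,        # generated on leader, not sent
        "sent_not_written": sent - write,  # in flight or in replica buffers
        "written_not_flushed": write - flush,
        "flushed_not_replayed": flush - replay,  # received but not applied
        "total": cur - replay,
    }
```

If most of the bytes sit in `flushed_not_replayed`, the delay is on the apply side of the replica rather than in the network or on the leader, which would match the behaviour described above.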
System Logs:
- No errors or warnings related to replication in PostgreSQL logs or Patroni logs during the observed delay.
Actions Taken So Far:
- Verified the status of replication using pg_stat_replication.
- Monitored resource utilization (CPU, memory, I/O) on replica nodes; found no significant bottlenecks.
- Checked network latency between the leader and replicas—no anomalies detected.
- Ensured that WAL archiving and replication settings align with recommended configurations.
Request for Assistance:
We would like assistance from the EDB team in investigating the following:
- Possible causes of replication delays in this setup.
- Guidance on advanced tuning for replication in Patroni with PostgreSQL to ensure minimal latency.
- Suggestions for any additional diagnostic steps or logs to examine for identifying root causes.
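As one additional diagnostic we can run ourselves, the standby settings that are able to hold back WAL replay could be reviewed. The sketch below is an assumption about what is worth checking, not a confirmed cause: `recovery_min_apply_delay` and `max_standby_streaming_delay` are real PostgreSQL parameters, but the helper `replay_delay_suspects` and its input dictionary are hypothetical, with values expected in milliseconds as reported by `pg_settings`.

```python
# Query to run on each replica; both parameters are reported in milliseconds.
STANDBY_SETTINGS_SQL = """
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('recovery_min_apply_delay', 'max_standby_streaming_delay');
"""

def replay_delay_suspects(settings):
    """Return (parameter, note) pairs for settings that can postpone replay.

    `settings` maps parameter names to integer millisecond values
    (hypothetical input shape, e.g. built from STANDBY_SETTINGS_SQL).
    """
    notes = []
    apply_delay = settings.get("recovery_min_apply_delay", 0)
    if apply_delay > 0:
        notes.append(("recovery_min_apply_delay",
                      "replay is deliberately delayed by %d ms" % apply_delay))
    conflict_wait = settings.get("max_standby_streaming_delay", 30000)
    if conflict_wait == -1:
        notes.append(("max_standby_streaming_delay",
                      "replay waits indefinitely for conflicting queries"))
    elif conflict_wait > 0:
        notes.append(("max_standby_streaming_delay",
                      "replay may wait up to %d ms for conflicting queries"
                      % conflict_wait))
    return notes
```

The second parameter matters here because the issue was noticed while the application was querying the replica: a query on the standby that conflicts with incoming WAL can pause replay until the query finishes or the configured limit is reached.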
Please let us know if you need further details about our environment or configurations.
Urgency Level: Medium / High (based on application needs)
Thank you for your assistance in resolving this issue.




