Monday, 7 October 2024

pg

 Title: High Severity Issue: PostgreSQL Database Growth and Crash

Description: The PostgreSQL database grew rapidly and eventually ran out of space, causing a high-severity incident. I had emailed the application team earlier about the unusual growth pattern, but my recommendations were not acted on.

I noticed that the database was growing at an alarming rate and asked the application team for more details about their operations. In parallel, I raised a case with the vendor to investigate the issue further.
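To show how growth like this can be turned into a concrete deadline, here is a minimal Python sketch that estimates the number of days until a volume fills up from a few size readings. It is not the tooling used in this incident. The function name `days_until_full`, the capacity, and the readings are all made-up assumptions for illustration. The readings could come from something like `SELECT pg_database_size(current_database());`, although that function does not count WAL files, so filesystem usage may be a better input when WAL is the part that is growing.

```python
from datetime import datetime


def days_until_full(samples, capacity_bytes):
    """Estimate days until the volume fills, from (timestamp, size_bytes) samples.

    Uses the average growth rate between the earliest and latest sample.
    Returns None when the size is flat or shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    ordered = sorted(samples)
    (t0, s0), (t1, s1) = ordered[0], ordered[-1]
    elapsed_days = (t1 - t0).total_seconds() / 86400
    if elapsed_days <= 0:
        raise ValueError("samples must span a positive time interval")
    growth_per_day = (s1 - s0) / elapsed_days
    if growth_per_day <= 0:
        return None
    # Free space left after the latest reading, divided by the daily growth.
    return max(capacity_bytes - s1, 0) / growth_per_day


GIB = 1024 ** 3

# Hypothetical readings, not data from the incident described in this post.
readings = [
    (datetime(2024, 1, 1), 400 * GIB),
    (datetime(2024, 1, 3), 440 * GIB),
    (datetime(2024, 1, 5), 480 * GIB),
]

remaining = days_until_full(readings, capacity_bytes=500 * GIB)
print(f"Estimated days until full: {remaining:.1f}")
```

A straight-line estimate like this is crude, but a number such as "about one day left" is much harder to overlook than a general warning about unusual growth.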

However, after a week of monitoring, just as I was about to send more information to the vendor, I discovered that the database had already crashed because it had run out of space.

In response, I opened an incident channel, raised another case with the EDB vendor, shared all relevant logs, and reactivated the database for application use. Since there was a lag in the database, I waited for the application team to confirm that they could tolerate slight data loss.

The application ran successfully for a day, but the database crashed again during APAC hours. I contacted the EDB vendor, explained the severity of the situation, and reactivated the database before business hours began.

I am currently following up with the EDB vendor to determine the root cause of the database growth and to explore possible solutions. This sequence of events demonstrates a proactive approach to mitigating risks and ensuring system stability.
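While the root cause is still open, one generic way to narrow down where growth is coming from is to compare per-relation sizes between two points in time. The Python sketch below is only an illustration under assumptions. The helper `top_growers`, the table names, and the sizes are hypothetical, and the snapshots are assumed to map each relation name to its `pg_total_relation_size()` in bytes. Growth that happens outside relations, such as WAL, would not show up in a comparison like this.

```python
def top_growers(before, after, limit=5):
    """Return (relation, bytes_grown) pairs, largest growth first.

    Relations that exist only in the later snapshot count as grown from zero.
    """
    growth = {
        name: size - before.get(name, 0)
        for name, size in after.items()
    }
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


MIB = 1024 ** 2

# Hypothetical snapshots, not data from the incident described in this post.
snapshot_day1 = {
    "orders": 900 * MIB,
    "audit_log": 2000 * MIB,
    "users": 50 * MIB,
}
snapshot_day2 = {
    "orders": 950 * MIB,
    "audit_log": 9000 * MIB,
    "users": 50 * MIB,
    "tmp_export": 300 * MIB,
}

for name, grown in top_growers(snapshot_day1, snapshot_day2, limit=3):
    print(f"{name}: +{grown / MIB:.0f} MiB")
```

A ranking like this does not explain why a relation is growing, but it gives the application team and the vendor a specific object to look at.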


Comments (Action Taken)

  1. Initial Monitoring: Noticed unusual growth in the PostgreSQL database and sent email notifications to the application team with recommendations and a request for further details on their operations.

  2. Vendor Case Raised: Opened a case with the vendor for investigation while gathering relevant data.

  3. Database Crash: Discovered the database had crashed after running out of space, just as I was preparing to send additional details to the vendor.

  4. Incident Channel Opened: Created an incident channel and raised another case with the EDB vendor, sharing all necessary logs.

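  5. Database Reactivated: Reactivated the database for application use and waited for the application team to confirm that slight data loss was acceptable.

  6. Second Crash: The database crashed again during APAC hours after the application had run for a day. Contacted the EDB vendor and reactivated the database before business hours began.

  7. Ongoing Follow-up: Following up with the EDB vendor for root cause analysis and potential solutions for the rapid database growth.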

I will keep this issue updated as I receive more information from the vendor. Thank you for your attention to this matter!
