Monday, 7 October 2024

sev 4

GitLab Issue Description

Title: High Severity Issue: Database Server Disruption

Description: We faced a high-severity issue with our critical database server during US hours. A disk group was dismounted, which made the server unstable and significantly degraded both database and application performance.

Handover Notes from US Team:

  1. Raise a case with the Oracle vendor to investigate the root cause of the disk group dismount.
  2. Perform manual synchronization of the standby database to prevent overloading the impacted primary database.
  3. Collaborate with the UNIX team and application support to ensure system stability and mitigate further risks.
  4. Address the authentication errors reported by the application team.

Comments (Action Taken)

  1. Case Raised: I raised a case with Oracle and uploaded the relevant logs. I also installed the TFA (Trace File Analyzer) utility to gather additional logs for the vendor's investigation.

  2. Database Synchronization: I found no reason to justify the suggested manual sync of the standby database, so I left the automatic (default) sync in place. On a call with the application team, I explained how Data Guard synchronization works and assured them it would not impact the primary database, which addressed their concerns.

  3. UNIX Team Collaboration: I discussed the findings with the UNIX team and recommended further investigation on their end, as the issue appeared to stem from the OS rather than the database.

  4. Application Team Coordination: Working with the application team, I found that the authentication errors they reported were caused by the server restart and were not affecting ongoing jobs. I advised them not to raise the severity of this issue further.
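
For the Data Guard synchronization in item 2, one way to keep an eye on the automatic sync is to watch the transport and apply lag reported on the standby. The sketch below is a minimal helper, not what was run during this incident. It assumes the lag values have already been fetched (for example from V$DATAGUARD_STATS) as interval strings of the form "+DD HH:MI:SS". The function names, the 300-second threshold, and the sample values are all hypothetical and for illustration only.

```python
import re

# Matches interval strings such as "+00 00:05:12" (days hours:minutes:seconds).
_LAG_PATTERN = re.compile(r"^\+?(\d+) (\d{2}):(\d{2}):(\d{2})$")


def lag_to_seconds(value: str) -> int:
    """Convert a '+DD HH:MI:SS' lag string into whole seconds."""
    match = _LAG_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised lag value: {value!r}")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def lagging_metrics(stats: dict[str, str], threshold_seconds: int = 300) -> list[str]:
    """Return the names of metrics whose lag exceeds the (hypothetical) threshold."""
    return [
        name
        for name, value in stats.items()
        if lag_to_seconds(value) > threshold_seconds
    ]


# Illustrative values only; these are not figures from the incident.
sample = {"transport lag": "+00 00:00:04", "apply lag": "+00 00:12:30"}
print(lagging_metrics(sample))
```

If the returned list stays empty across checks, the standby is keeping up within the chosen threshold and there is no sign that a manual sync is needed.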

I’ll continue to monitor the situation and update as necessary. Thank you for your support!
