
Thursday, 26 March 2026

Filed under:

Proposal to Streamline Server Access During Incident Handling

Hi Team,

Currently, during alert handling, a significant amount of time is spent raising separate access requests after identifying the host from the incident ticket, followed by waiting for approvals where applicable. This creates delays for the Infra team in responding to incidents.

I would like to propose an enhancement to streamline this process. For alert-generated incident tickets, the system can automatically identify the host and grant server access to the assigned engineer at the time of ticket assignment. This would eliminate the need for raising a separate access invocation request.
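
To make the idea concrete, the proposed auto-grant step could be sketched roughly as below. This is an illustration only; the ticket field names (`source`, `host`, `assignee`, `mas_relevant`) are assumptions, not our actual ticketing schema:

```python
# Hypothetical sketch of the proposed auto-grant step. Field names
# (source, host, assignee, mas_relevant) are illustrative only.

def build_access_grant(ticket):
    """Build an access-grant record when an alert-generated incident
    ticket is assigned; return None if the ticket does not qualify."""
    if ticket.get("source") != "alert" or not ticket.get("assignee"):
        return None
    return {
        "host": ticket["host"],
        "engineer": ticket["assignee"],
        "reason": "incident " + str(ticket["id"]),
        # MAS-relevant servers would still go through the in-ticket approval
        "needs_approval": bool(ticket.get("mas_relevant", False)),
    }
```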

Additionally, for MAS-relevant servers, an approval option (e.g., a button) can be embedded directly within the incident ticket to enable quicker approvals when required.

This approach can help reduce response time and improve overall efficiency during incident handling.

Please let me know your thoughts.

Posted By Nikhil at 00:51

Monday, 23 March 2026

Filed under:

 

Dear Team,

During a recent review, we identified a non-standard Oracle database associated with your application. The database has two pluggable databases (PDBs), which is not in line with our organizational standards, and as confirmed by Issac, there is no record of its creation.

Please validate this setup, involve the DBA who created it, and take appropriate action, including dropping the database if it is not required.
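
For quick reference, the kind of check involved could be sketched as below. The one-user-PDB standard is inferred from this note, and the input is assumed to be names as returned by Oracle's `v$pdbs` view; both are illustrative assumptions:

```python
# Sketch only: flag a CDB whose user PDB count exceeds the assumed
# standard of one. Input is a list of PDB names, e.g. from
# "SELECT name FROM v$pdbs" (PDB$SEED is Oracle's seed, not a user PDB).

def nonstandard_pdbs(pdb_names, max_user_pdbs=1):
    """Return the user PDBs if their count exceeds the standard, else []."""
    user_pdbs = [p for p in pdb_names if p.upper() != "PDB$SEED"]
    return user_pdbs if len(user_pdbs) > max_user_pdbs else []
```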

Please treat this as a priority and share an update as soon as possible.

Best regards,
[Your Name]

Posted By Nikhil at 22:23

Thursday, 19 March 2026

Filed under:

As part of the weekend activity for Grid listener certificate renewal, I have started performing pre-checks. Although the activity is expected to be a manual listener restart, I encountered issues on a few hosts during validation, indicating it may not be as straightforward as expected.

Additionally, the Confluence document does not currently reflect these scenarios, so the pre-checks have been helpful in identifying these gaps early. I’ll connect with you separately to understand the possible causes for these issues.
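
As a rough illustration of how the pre-check results feed into splitting the work (the host names and pass/fail representation here are hypothetical):

```python
# Hypothetical triage of pre-check results: hosts that validated
# cleanly need only the manual listener restart; the rest need
# investigation before the weekend activity.

def partition_hosts(precheck_results):
    """Split {host: passed} results into (simple_restart, investigate)."""
    simple = sorted(h for h, ok in precheck_results.items() if ok)
    investigate = sorted(h for h, ok in precheck_results.items() if not ok)
    return simple, investigate
```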

From a resourcing perspective, there is a wide scope on the Blue side (~400 servers as informed by Vinit), so his availability will be limited. Out of 74 APAC hosts, I’ve taken ownership of ~35. Vinit has agreed to handle 5–10 hosts where only a simple listener restart is required.

For the remaining scope, we may need support from Navneeth or the Shift SRE team to ensure smooth execution and avoid spillover to the next shift.

Please let me know a convenient time to discuss the pre-check findings.

Posted By Nikhil at 03:55
Filed under:

 

Hi Team,

As part of the change to disable AUTOEXTEND, we understand there are concerns around tablespace utilization.

The DB team will continue to monitor and send alerts for cleanup and capacity planning. To avoid back-and-forth and delays, we propose setting up a dedicated coordination channel for handling critical queries.

Please also note that space-addition requests raised over weekends may face approval delays, which could lead to unexpected issues by Monday.
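
To make the capacity-planning point concrete, a simple projection of the kind the monitoring could apply is sketched below; the lead-time figure is an assumption for illustration, not an agreed value:

```python
# With AUTOEXTEND disabled, a tablespace cannot grow on its own, so we
# project usage forward and raise a space request before it fills.
# lead_days models the approval turnaround (longer over weekends).

def needs_space_request(size_mb, used_mb, growth_mb_per_day, lead_days=3):
    """True if the tablespace is projected to fill within lead_days."""
    projected_mb = used_mb + growth_mb_per_day * lead_days
    return projected_mb >= size_mb
```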

We request your support in aligning on this approach.

Thanks,
DB Team

Posted By Nikhil at 02:00

Wednesday, 18 March 2026

Filed under:

Hi OEM Admins,


Sharing a finding from a recent check and requesting your validation of OEM alerting.


The database FRA was full and out of sync for ~2 months, but no GSNow alerts were generated during this period. Alerts were present in OEM; however, they do not appear to have been forwarded to GSNow. Notably, an alert was triggered in GSNow today when we started fixing the sync issue.
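
For context on why an alert should have fired, the effective usage behind Oracle's `V$RECOVERY_FILE_DEST` figures can be computed as below; the warning/critical thresholds here are illustrative, not our configured OEM metric values:

```python
# Sketch of the usage arithmetic behind the FRA-full condition.
# Thresholds are illustrative; actual values live in the OEM metric.

def fra_usage_pct(space_limit, space_used, space_reclaimable):
    """Effective FRA usage percentage: reclaimable space does not
    count against the limit until it is actually needed."""
    return 100.0 * (space_used - space_reclaimable) / space_limit

def fra_alert_level(pct, warning=85.0, critical=97.0):
    """Map a usage percentage to the alert level that should fire."""
    if pct >= critical:
        return "critical"
    if pct >= warning:
        return "warning"
    return None
```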


Please validate whether there were any issues with alert forwarding, notification rules, or configuration gaps.


Given the criticality, ensuring consistent alert propagation is important.


Thanks & Regards,

[Your Name]

Posted By Nikhil at 22:21
Filed under:

Backup failure alerts must be treated as high priority, especially for Production, and should not remain open beyond one week.

Please ensure timely acknowledgment, clear ownership, and regular communication. Refer to the available documentation and engage SMEs/SREs when needed to drive resolution promptly.
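
A minimal sketch of the triage rule above; the seven-day window comes from this note, while the environment labels and return shape are illustrative:

```python
from datetime import timedelta

# Sketch of the stated rule: Production backup-failure alerts are high
# priority, and no alert should stay open beyond one week.

def triage_backup_alert(opened_at, now, environment):
    """Return (priority, overdue) for a backup-failure alert."""
    priority = "high" if environment == "Production" else "normal"
    overdue = (now - opened_at) > timedelta(days=7)
    return priority, overdue
```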

Posted By Nikhil at 06:14