Friday, 27 February 2026

Filled under:

 I’ve enhanced the internal BCM automation articles to be more structured and AI-friendly.

Now the internal AI can accurately handle queries like task failures or escalation contacts and provide the right responses directly — improving self-service and reducing manual handling.

Isn’t this exactly the direction we’re heading with AI & automation integration? 👍🏼🙂

Posted By Nikhil06:52
Filled under:

 Assist has now been fed with all FAQs (user queries). Going forward, if users have any queries related to our BCM automation, please redirect them to goto/red or Copilot. In most cases, AI should be able to provide the appropriate response directly.

Let’s guide users there to reduce manual interactions and query handling.

Thanks to Ranjit for the suggestion — this has now been implemented.

Posted By Nikhil05:55

Thursday, 26 February 2026

Filled under:

 Objective 1: Implement Unified Grafana Dashboard (LGTM Project)

Objective Statement:

Design and implement centralized Grafana dashboards for Oracle and PostgreSQL databases under the LGTM initiative to improve real-time visibility and operational decision-making.

Key Deliverables:

Integrate Oracle and PostgreSQL metrics into the LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus).

Develop standardized dashboards covering availability, performance, replication lag, storage, sessions, and alert trends.

Enable role-based access for application and operations teams.

Conduct at least 1 knowledge-sharing/demo session for stakeholders.

Success Metrics:

Dashboards live in production by [target quarter].

≥ 80% of database operational metrics accessible via Grafana.

Reduction in ad-hoc monitoring requests by 30%.

Positive feedback from users on accessibility and usability.

Business Impact:

Improves transparency, reduces dependency on DBAs for basic health checks, and enhances proactive monitoring.

Objective 2: Develop Actionable Data Mesh Reporting Framework

Objective Statement:

Design and implement a structured reporting solution using Data Mesh principles to generate meaningful, actionable insights supporting daily database operations and decision-making.

Key Deliverables:

Identify key operational data domains (backup status, replication health, performance trends, storage growth).

Develop automated report generation using centralized data sources.

Ensure reports are standardized, easy to interpret, and aligned with operational KPIs.

Implement scheduled distribution or dashboard-based access.

Success Metrics:

Deliver at least 3 core operational reports.

Reduce manual data compilation effort by 40%.

Improve response time for operational decision-making.

Adoption by operations/application teams for day-to-day reference.

Business Impact:

Enables data-driven decisions, reduces manual effort, and increases operational efficiency.

Posted By Nikhil20:32
Filled under:

 Hi Pawel, thanks for reaching out. I had already signed off yesterday.

Happy to connect today — just let me know a time that works for you, and I’ll be around.

Posted By Nikhil01:41

Wednesday, 25 February 2026

Filled under:

 Hi Team,

Thank you for raising the change request for the database switchover as part of your BCM plan.

As per the standard process, BCM switchovers are executed via automation. Please raise one RITM ticket (Automated – BCM Database Switchover). Once submitted, it will trigger the automation workflow and give you the control needed to initiate the database switchover independently.

You can find the detailed steps and process documentation here:

[Insert Link Here]

Kindly ensure the RITM is raised well in advance of the planned implementation window to avoid any delays.

On the day of implementation, if you encounter any issues while executing the BCM activity, the Database Team will be available on #channel-name for immediate support.

Please let us know once the RITM is raised.

Posted By Nikhil17:33

Monday, 16 February 2026

Filled under:

 Hi Team,


I’ve noticed that in recent automated PostgreSQL provisioning runs, the "pg_krb5.conf" file is no longer present on newly built instances.


From an SRE standpoint, this should not impact us since:


- Our primary authentication mechanism is SSL certificate-based.

- LDAP usage (in our case) does not rely on Kerberos/GSSAPI.

- "krb5.conf" is only required when Kerberos (GSSAPI) authentication is enabled via "pg_hba.conf".


Could you please confirm whether this removal is intentional?

Just want to ensure there are no implications for any edge cases or future auth integrations.


Thanks,

Nikhil



Hi Team,


I hope you’re doing well.


I wanted to highlight an observation from the recent PostgreSQL server provisioning runs (automated builds in our environment). I’ve noticed that the "pg_krb5.conf" file is no longer being provisioned on newly created instances.


As part of a quick technical review from the SRE perspective:


- Our authentication model is predominantly certificate-based (SSL client certs), which does not depend on Kerberos configuration.

- LDAP authentication is used occasionally, and standard LDAP (without GSSAPI/Kerberos binding) also does not require a "krb5.conf" file.

- The "krb5.conf" file becomes relevant only if GSSAPI/Kerberos authentication is enabled (e.g., "gss" entries in "pg_hba.conf") or if LDAP is configured with SASL/GSSAPI.

- In the absence of Kerberos-based authentication, the file should not have any functional impact on PostgreSQL connectivity.


That said, I wanted to confirm:


- Is the omission of "pg_krb5.conf" intentional as part of a security hardening or configuration simplification effort?

- Or should it still be provisioned for fallback / future Kerberos-based integrations?


Just seeking confirmation to ensure there are no unintended side effects in edge cases or future auth model changes.


Thanks in advance for the clarification.


Best regards,

Nikhil
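
The point about "pg_hba.conf" can be checked mechanically before asking for confirmation. Below is a minimal sketch, assuming a simple whitespace-separated "pg_hba.conf" without quoted fields that contain spaces or "#" characters. The function name `kerberos_rules` and the sample rules are hypothetical and are not taken from our builds. It flags rules that use the `gss` method or the `hostgssenc` connection type, which are the cases where a Kerberos configuration would matter.

```python
# Auth method keywords accepted in pg_hba.conf (PostgreSQL documentation).
AUTH_METHODS = {
    "trust", "reject", "scram-sha-256", "md5", "password", "gss", "sspi",
    "ident", "peer", "pam", "ldap", "radius", "cert", "bsd",
}


def kerberos_rules(pg_hba_text):
    """Return the rules that depend on GSSAPI/Kerberos.

    Simplification (assumption): fields are split on whitespace, so quoted
    values containing spaces or '#' are not handled.
    """
    hits = []
    for raw in pg_hba_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        conn_type = fields[0]
        # The method follows the address (and optional netmask), so take the
        # first known method keyword after the type/database/user columns.
        method = next((f for f in fields[3:] if f in AUTH_METHODS), None)
        if method == "gss" or conn_type == "hostgssenc":
            hits.append(line)
    return hits


# Hypothetical sample, for illustration only.
SAMPLE = """
# TYPE  DATABASE  USER      ADDRESS        METHOD
local   all       postgres                 peer
hostssl all       all       10.0.0.0/8     cert
host    all       all       10.1.0.0/16    ldap ldapserver=ldap.example.net
host    all       all       10.2.0.0/16    gss include_realm=0
"""

for rule in kerberos_rules(SAMPLE):
    print("Kerberos-dependent rule:", rule)
```

An empty result for an instance's "pg_hba.conf" would support the view that the missing file has no functional impact there. LDAP configured with SASL/GSSAPI is outside what this check can see.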

Posted By Nikhil06:33

Wednesday, 11 February 2026

Filled under:

 Meeting to Discuss Improving Aged Request Handling & Service Delivery

Hi [SDM Name],

I’d like to schedule a short meeting to discuss ways we can better manage aged requests, reduce escalations, minimize user follow-up emails, and enhance overall service delivery. Your insights will be valuable in shaping an effective approach.

Please let me know a suitable time for you.




Current scenario


Wait for user email

No concept of on-hold review

DBA gets a change of work for a day.

Shift lead concept 

Not assigned to me, so why should I work on it?

Posted By Nikhil03:54
Filled under:

 Your dedication, curiosity to learn, and proactive approach really help in building a strong reputation. I appreciate how you’re always eager to fix open issues, identify improvements, and address things straight to the point. Your efforts truly make a difference!

Posted By Nikhil03:21

Tuesday, 10 February 2026

Filled under:

 Hi Team,


This ticket has been reassigned back to us twice without any comments. As this is a decommission/migration task, we understand it should be handled by IMS.


Could you please clarify the reason for the reassignment and confirm whether such tasks are no longer managed by IMS?


Thanks for your support.


Regards,

Nikhil

Posted By Nikhil18:56
Filled under:

 Hi Team,


Hope you are doing well.


The mentioned ticket has been reassigned back to our queue twice by the migration team, but there were no comments or updates added to indicate the reason for the reassignment.


As per our understanding, this task falls under the migration/decommission activity and was expected to be handled by IMS. Could you please help clarify the reason for assigning it back to us?


Additionally, please confirm if such decommission tasks are no longer being performed by IMS, so we can align accordingly and plan the next steps.


Appreciate your support and clarification on this.


Thanks and regards,

Nikhil

Posted By Nikhil18:54
Filled under:

 I’m currently working on the etcd upgrade activity for PostgreSQL clusters covering 39 servers. The plan is to execute this during the upcoming weekend maintenance window. Will keep you posted on progress.

Posted By Nikhil17:30
Filled under:

 Great – here is the updated GitLab issue draft incorporating your inputs:

Title

etcd Upgrade for PostgreSQL Clusters – 39 Servers (Weekend Maintenance)

Description

We need to perform an upgrade of the etcd version used by PostgreSQL Patroni clusters across our environment to address security vulnerabilities and improve stability.

The current etcd version deployed on multiple clusters is outdated and requires upgrade to a supported and secure release.

This activity will cover validation and upgrade of 39 servers hosting etcd for PostgreSQL clusters.

Scope

Total servers in scope: 39

Environments: Production / Staging / Non-Production

Upgrade of etcd on all PostgreSQL Patroni clusters

Health validation of PostgreSQL services post-upgrade

Testing of failover and cluster stability

Reason for Upgrade

Mitigation of known vulnerabilities in existing etcd versions

Stability and performance improvements

Compatibility with recommended Patroni/PostgreSQL HA configurations

Alignment with global infrastructure standards

Reference:

https://dev.oud.ubc.net/bbs/fs/hosting/database/apac-db-reliability/issues/224

Maintenance Window

Activity to be performed during weekend maintenance window

Exact schedule to be coordinated with application teams

Changes will be executed in a phased and controlled manner

Implementation Plan

Identify current etcd version on all 39 servers

Notify stakeholders and finalize maintenance schedule

Take configuration and data backups

Perform rolling upgrade of etcd nodes cluster by cluster

Validate:

etcd cluster health

Patroni status

PostgreSQL connectivity

Perform controlled failover tests

Post-upgrade monitoring and verification
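
The rolling step in the plan above can be sketched as a small ordering helper. This is a minimal sketch under stated assumptions, not the actual runbook: the cluster and host names are hypothetical, the cluster sizes are illustrative, and upgrading followers before the leader is a chosen convention here rather than something the issue specifies. The quorum arithmetic (a majority of members must stay up) is standard for etcd.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be down while the cluster keeps quorum."""
    return members - quorum(members)


def rolling_plan(clusters):
    """Order hosts for a one-node-at-a-time upgrade, cluster by cluster.

    clusters: dict mapping cluster name -> list of (host, is_leader) tuples.
    Assumption: followers are upgraded first and the leader last.
    """
    steps = []
    for name in sorted(clusters):
        nodes = clusters[name]
        if tolerated_failures(len(nodes)) < 1:
            # Taking any node down would break quorum, so do not plan it.
            raise ValueError(f"{name}: cannot lose a member without losing quorum")
        followers = [host for host, is_leader in nodes if not is_leader]
        leaders = [host for host, is_leader in nodes if is_leader]
        steps.extend((name, host) for host in followers + leaders)
    return steps


# Hypothetical inventory, for illustration only.
EXAMPLE = {
    "pg-cluster-a": [("etcd-a1", True), ("etcd-a2", False), ("etcd-a3", False)],
}

for cluster, host in rolling_plan(EXAMPLE):
    print(f"{cluster}: upgrade {host}, then validate health before the next node")
```

Each step would be followed by the validation items listed above (etcd cluster health, Patroni status, PostgreSQL connectivity) before moving to the next host.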

Rollback Plan

Revert etcd to previous version if any instability occurs

Restore prior configuration backups

Revalidate cluster functionality

Teams Involved

Database Operations Team

Global Infrastructure Team

Respective Application Owners (for validation)

Acceptance Criteria

All 39 servers successfully upgraded to target etcd version

PostgreSQL clusters fully operational post upgrade

Patroni leader election and failover working as expected

No unplanned downtime or data inconsistency

Priority

High

Timeline

Planning and communication: 

Execution window: Weekend 

Post validation: Following business day

If you share the current etcd version and target version, I can refine the issue further with exact technical details 👍

Posted By Nikhil17:25
Filled under:

 Hi,


Thanks for sharing the details.


On an initial review, the connection string provided appears to be incorrect or incomplete, which could be causing the issue you are facing.


Looping in Sonali here as she is already looking into this and will be able to assist further with a detailed review.


Just wanted to share my two cents based on a quick glance.


Thanks,

Nikhil

Posted By Nikhil00:51

Thursday, 5 February 2026

Filled under:

 Hi Team,


This is to inform you that the BCM test activity has been successfully completed using the automation developed and provided by the Database team.


The end-to-end testing was performed along with your team, and the automation executed as expected. The only exception was the Oracle database switch, which was carried out through the traditional process.


Thank you for your coordination and support during the testing.


Please feel free to reach out in case any further clarification is required.


Thanks & Regards,

[Your Name]

Database Team

Posted By Nikhil23:29

Wednesday, 4 February 2026

Filled under:

 Please be informed that we will be performing a BCM activity for a critical application on [date] from [time] to [time].

This is a significant and planned change to validate business continuity readiness.


All necessary stakeholders are aligned, and monitoring will be in place throughout the exercise.

Posted By Nikhil20:28

Tuesday, 3 February 2026

Filled under:

 The idea behind this form is simple – to make communication easier, structured, and more effective. Instead of scattered emails and ad-hoc discussions, this form will help capture your inputs, concerns, and suggestions in a clear and organized way.

Your feedback plays a key role in improving the BCM process, and this tool is designed to ensure every voice is heard and every issue is tracked properly.

Posted By Nikhil23:24
Filled under:

 Regarding the authentication issue – I have already shared all necessary information with you. I compared the access, provided the relevant links, and shared all available details from my side.

For your question on how you were previously able to access the system – I’m not in a position to verify that. I do not have elevated access to check historical details such as when your BBS access got deleted. I can only see that it is currently deleted.

Kindly proceed with raising the required access request accordingly.

If you face any further issues, please raise a ticket with the DB team for investigation.

Thanks.

Posted By Nikhil23:02
Filled under:

 Hi Team,


Thank you for the discussion today regarding BCM automation.


As agreed during the call, we will proceed with the proposed automation approach. Since this activity is only for testing purposes, it is not required to raise the request in GSNOW-UAT, and the testing can be performed directly in GSNOW itself.


Accordingly, the necessary tickets have already been raised to proceed further.


The planned date of execution is tomorrow. Once the implementation is completed, we can perform the required tests anytime after tomorrow based on your availability.


Please feel free to reach out in case of any questions or if any additional information is required from our side.


Thanks and Regards,

[Your Name]

Posted By Nikhil20:38
Filled under:

 Dear [HR/Employer Name],


Regarding the Dependant Pass application for my spouse, I would like to inform you that she is currently outside Singapore. She can plan to travel to Singapore around mid-March, which would be the most feasible timeline for us. If required earlier, she can also arrange to arrive by the end of February, although mid-March would be preferable. I can confirm that she will only enter Singapore after the DP is fully approved.


Once the IPA letter is issued, could you please advise how the appointment date for formalities is scheduled? If she plans to arrive in Singapore around mid-March, would it be possible to obtain an appointment slot during that period?


Kindly let me know if any additional information is required from my side.


Thank you for your support.


Best regards,

Nikhil K

Posted By Nikhil15:55
Filled under:

 Hi Team,


As discussed earlier, the specific backup alerts were occurring due to an underlying permission issue. This topic has been under review for some time, and after detailed analysis and coordination, the required fix has now been implemented. Please refer to the attached email for reference.


With this change in place, we expect the issue to be resolved going forward, and these errors should no longer appear from this month onward.


Sonali – could you please proceed with fixing the current occurrences using the documented steps available at the below link:

[link]


Please let us know in case any further support is required.


Thanks and regards,

[Your Name]

Posted By Nikhil06:13
Filled under:

 Hi Team,

Thanks for the discussion today regarding the upcoming BCM activity. As explained, the newly introduced BCM automation is fully operational and enables the application team to trigger the required DB activities in a self-service, controlled, and transparent manner. The DB team will remain available for monitoring and support in case of any exceptions.

Please let us know if you need a quick walkthrough or demo before the actual BCM execution.

Posted By Nikhil04:25
Filled under:

 How You Can Explain on the Call

You can say something along these lines:

1. Start with Purpose

“Thanks for the update on the upcoming BCM activity. From the DB team side, we are already well prepared to support BCM using the newly introduced automation.”

2. Explain What Has Changed

“I would like to highlight that we have recently implemented a BCM automation framework for databases.

This automation is designed to make the BCM process more:

Efficient

Transparent

Controlled

And self-service for application teams”

3. Highlight Key Benefits

“With this automation:

Application teams can trigger the BCM database activities on their own through the approved interface.

No manual DBA intervention is required for standard BCM operations.

The process is fully logged and auditable.

Execution is faster and less error-prone compared to the earlier manual approach.”

4. Explain Control and Safety

“Although it is self-service, it is not uncontrolled:

Proper validations are built in

Role-based access is implemented

Pre-checks and post-checks are automated

DB team still has full visibility and governance”

5. Clarify DB Team Involvement

“So for the upcoming BCM activity:

Application team can use the automation to trigger the required DB steps.

DB team involvement will mainly be for:

Initial coordination

Any exceptional scenarios

Monitoring and support if something unexpected happens”

6. Offer Support

“We can arrange a short demo or walkthrough if required, so your team is comfortable using the automation before the actual BCM date.”

Keep These Points Ready if They Ask Questions

If they ask “What exactly can we trigger?”:

Database switchover/failover

Start/stop sequences

Health checks

Validation steps

If they ask “Do we still need DBAs?”:

Only for non-standard cases

Troubleshooting

First-time onboarding

One Short Summary Line (if you need to be very brief)

“In short – BCM DB activities are now automated, self-service, safe, and transparent, so application teams can execute them directly with minimal dependency on DBAs.”

Posted By Nikhil04:24

Monday, 2 February 2026

Filled under:

 Backups for the Mobilepas database have been failing due to low throughput over the last few days. No changes have been observed on the DB side.

The backup team has raised a case with the vendor (Commvault) for investigation.

The filesystem and backups are being closely monitored under Incident_86.

Posted By Nikhil23:29
Filled under:

 Strengths / Role Specific Competencies

Candidate demonstrated basic awareness of Oracle DBA concepts and terminology.

Able to provide short responses to fundamental questions, indicating some exposure to Oracle database administration.

Appears to have theoretical familiarity with common DBA activities such as backup, recovery, and performance tuning.

Development Areas

Technical depth was insufficient for the level expected for this role. Most answers were very brief and lacked detailed explanation or practical examples.

Responses did not reflect hands-on production experience; explanations were generic and not scenario-based.

Difficulty in articulating concepts clearly and logically. Follow-up questions often resulted in repeated or unclear answers.

Limited ability to elaborate on troubleshooting approaches, real-time problem solving, or past project experience.

Overall engagement during the interview was low, with minimal interaction and enthusiasm.

Additional Feedback / Areas to Probe (if considered further)

Validate actual hands-on experience in core areas such as RMAN backups, ASM, performance tuning, and incident handling.

Probe deeper into real-life scenarios handled by the candidate rather than theoretical knowledge.

Assess practical understanding through situational questions or a technical exercise.

Clarify whether responses were based on personal experience or learned material, as there were indications of prepared or read-out answers.

Overall Assessment

At this stage, the candidate does not meet the expected competency level for the Oracle DBA position due to lack of technical depth, limited communication, and insufficient practical exposure.

Posted By Nikhil20:02