Friday, 31 January 2025

Filled under:

 As requested, I performed a series of test cases as part of the preemptive troubleshooting effort for the PostgreSQL issue scheduled this weekend. The goal was to simulate database restoration and identify possible resolutions. Below are the findings and key steps documented:

Posted By Nikhil23:52

Findings on PostgreSQL Restore Simulation and Troubleshooting

Filled under:

As part of the preemptive troubleshooting effort for the PostgreSQL issue scheduled this weekend, I conducted a database restoration simulation to identify possible resolutions. Below are the findings and key steps documented:

  1. Significance of postgresql.conf_with_recovery_command:

    • This file is a new addition in PostgreSQL versions greater than 12. When a database is restored, it generates this configuration file with critical recovery parameters.
    • To ensure a successful startup, these parameters need to be manually copied into the primary postgresql.conf file.
    • Official documentation reference: Commvault PostgreSQL Restore Documentation
    • Detailed simulation and steps are documented in Confluence: [internal link].
  2. Base Configuration Update Requirement:

    • If the database is opened using pg_ctl without copying the recovery parameters, it fails to initialize properly. Updating the base configuration manually resolves this issue.
  3. Recovery Mode Persistence and WAL File Validation:

    • In cases where the database remains in recovery mode (both with pg_ctl and Patroni), a background process validation can pinpoint the missing WAL file.
    • If the WAL file no longer exists, the database gets stuck in a replica+starting mode.
    • Resolution: Manually promoting the instance using pg_ctl successfully fixed the issue. Patroni, however, did not permit the promotion. This behavior appears analogous to an "incomplete cancel-based recovery."
    • Additional details are available in [internal page].

Next Steps and Recommendations:
It would be helpful if more SREs could test these scenarios across various clusters. If we observe a consistent pattern in solutions, we can standardize a common SOP for PostgreSQL recovery. This would greatly enhance efficiency during future restoration efforts.

Please let me know if you'd like to discuss this further or need additional details.

Posted By Nikhil23:50

Tuesday, 28 January 2025

Subject: Database Restoration Issue - Patroni Starts in Standby Mode Post Restore

Filled under:

Description:

We have a 3-node Patroni cluster with a PostgreSQL HA setup. As part of a recovery process, we restored the database using a tape backup. However, post-restore, when we attempt to start the Patroni service on the restored site, the node consistently appears as a replica (standby mode). Conversely, if we bypass Patroni and start PostgreSQL directly using pg_ctl, the database successfully starts in read-write (primary) mode without any issues.

Given this discrepancy, we are unsure why Patroni is forcing the database into standby mode, even though the same database appears fully functional as primary with manual pg_ctl startup.

Current Status:

  • The site is currently running in standalone mode with PostgreSQL started using pg_ctl instead of Patroni.
  • We have scheduled maintenance over the weekend to reinitialize and bring all nodes back into HA with proper cluster resilience.
  • We need clear steps to ensure a seamless switch from pg_ctl startup to Patroni without errors and to reintegrate the restored site into the cluster properly.

Request for Assistance:

  1. Understanding the Issue: What could be causing Patroni to perceive the restored node as a replica (standby mode), despite it being in read-write mode when started manually? Is it possible that incorrect DCS metadata or configuration within Patroni could be the reason?
  2. Recommended Steps: What steps should we follow to safely stop the PostgreSQL instance running via pg_ctl and switch back to Patroni startup without triggering replication or standby mode?
  3. HA Restoration: Guidelines on how to reinitialize and reintegrate this site into the Patroni cluster without any inconsistency, especially post-restoration from tape backup.

We are looking forward to your assistance in diagnosing the root cause and providing a validated approach to restore HA functionality for the cluster.

Please advise on any logs or additional details you may require from our side.

Posted By Nikhil20:31

Monday, 27 January 2025

Filled under:

 

Subject: Request for Approval to Decommission Cloned Database

Dear [Leads' Names or Team],

I hope this email finds you well.

As part of our database optimization efforts, I recently identified a cloned database, which appears to have been created long ago and has not been in use. To confirm its redundancy, I maintained a cooling period of one week by shutting down the database. During this period, no dependencies or issues were reported, further verifying that the database is no longer actively used.

In light of this, I have now raised a request to decommission the cloned database to free up resources and improve efficiency. I kindly request your review and approval to proceed with the decommissioning process.

Please let me know if you require any additional details or have concerns regarding this activity.

Thank you for your support.

Best regards,
[Your Name]
[Your Designation]
[Your Contact Information]

Posted By Nikhil23:03

Friday, 24 January 2025

Filled under:

SELECT 

    p.proname AS function_name,

    r.rolname AS owner_name

FROM 

    pg_proc p

JOIN 

    pg_roles r ON p.proowner = r.oid

WHERE 

    p.proname = 'my_function';



SELECT 
    p.proname AS function_name,
    r.rolname AS owner_name
FROM 
    pg_proc p
JOIN 
    pg_roles r ON p.proowner = r.oid
WHERE 
    p.proname = 'your_function_name';

Posted By Nikhil18:37

Tuesday, 21 January 2025

Filled under:

 Vector databases excel at semantic search, unstructured data handling, and powering AI-driven applications. Explore their features, use cases, and how they compare to traditional databases. Read more: https://oracle-dba-help.blogspot.com/2024/12/vector-databases-explained.html

Posted By Nikhil04:39

Friday, 17 January 2025

Filled under:

Thank you for submitting the change request for assistance with the database switchover. Before we proceed with the approval and planning, I would like to confirm a few details to ensure a smooth and coordinated switchover process.

  1. List of Databases: Please provide the exact names of the databases involved in the switchover.
  2. Tentative Time: Let us know the preferred time for initiating the switchover. Kindly mention if there are any specific constraints or considerations we need to be aware of during this time.
  3. Downtime Window: If applicable, please share the expected downtime window for this activity.

Once we receive your confirmation, we will review and finalize the plan accordingly. If you have any additional requirements or queries, please feel free to share them.

Looking forward to your response.

Posted By Nikhil21:45

a

Filled under:

 I want to bring to your attention the ongoing issue regarding [briefly describe the issue, e.g., "the pending database access request" or "the resolution of ticket #12345"]. Despite multiple follow-ups, including my most recent email on [date], there has been no response or update on this matter.

This issue is critical as it [explain the impact, e.g., "affects the project timeline" or "has caused delays in operations"], and a prompt resolution is required to avoid further disruption.

Please provide an update at the earliest and let me know the expected timeline for resolution. If any additional information is needed from my side to expedite the process, feel free to reach out.

Looking forward to your prompt response.

Posted By Nikhil17:02

Wednesday, 15 January 2025

Filled under:

 Subject: Database Creation Confirmation


Dear [User's Name],


The requested database has been successfully created and is ready for use. Here are the details:


Database Name: [Database_Name]


Environment: [Environment_Details]



Please let us know if you need any assistance.


We value your feedback—share your experience here: [Feedback_Link].


Best regards,

[Your Name]

Database Administrator


Posted By Nikhil17:09

Saturday, 11 January 2025

Filled under:

 The database patching activity has been successfully completed. I hope there are no issues with any ongoing restore operations as a result of this activity.

Please do let me know if you encounter any concerns or require further assistance.

Posted By Nikhil19:06
Filled under:

 nohup bash -c "rman target=/ <<EOF connect catalog username/password upgrade catalog; EOF" > rman_upgrade.log 2>&1 &

Posted By Nikhil05:11

b

Filled under:

SELECT table_name, num_rows FROM all_tables WHERE owner = '<RMAN_CATALOG_OWNER>' AND table_name LIKE 'RC%';

Posted By Nikhil04:34

a

Filled under:

 -- Check for long-running queries

SELECT sid, serial#, sql_id, status, event, wait_class, seconds_in_wait, sql_text

FROM v$session s

JOIN v$sql q ON s.sql_id = q.sql_id

WHERE s.status = 'ACTIVE';


-- Check for blocking sessions

SELECT blocking_session, sid, serial#, event, seconds_in_wait 

FROM v$session

WHERE blocking_session IS NOT NULL;


Posted By Nikhil04:33

Friday, 10 January 2025

Subject: Assistance Required: Unable to View Deployment Procedures in OEM Despite TPAI Access

Filled under:

 Subject: Assistance Required: Unable to View Deployment Procedures in OEM Despite TPAI Access

Hi OEM Admins Team,

I hope you’re doing well.

We’ve recently encountered an issue where, despite raising and being granted TPAI access, we are still unable to view the deployment procedures in OEM. This is impacting our ability to perform certain planned tasks via OEM.

For the time being, we have proceeded with a manual switchover for one of the planned changes to ensure minimal disruption. However, resolving this issue is crucial to streamline future activities and avoid manual intervention.

Could you please assist in identifying the cause of this issue? If there are specific configurations or permissions that need to be verified or updated, kindly let us know.

Your prompt support in addressing this matter would be greatly appreciated.

Posted By Nikhil19:58
Filled under:

 


1. You must create a report that shows all security events over the last 20 days for a Windows virtual machine (VM) in your Azure environment. What tools should you consider using?

Log Analytics and Power BI


2. You want to delete all managed disks that a virtual machine (VM) is not using. How would you find these?

In All Services, select Disks. In the following blade, look for resources that have nothing listed in the Owner column.


3. In Microsoft Defender for Cloud, which security policy applies secure settings?

Enforce


4. You want to be notified when there are at least 10 failed requests on your web application and when the number of total failed requests exceeds 5,000 in a 30-minute timeline. How would you configure the alert rules?

Add one alert rule on metrics for requests and an alert rule on events with a failed status.


5. Which would you use to programmatically consume metrics data?

Azure Monitor representational state transfer (REST) APIs


6. Which Azure tools can you use to establish a workload baseline for an application server running on a virtual machine?

Application Insights, metric charts


7. How can you create detailed visual reports for billing data?

Import the usage report into Power BI.


9. Which Azure feature automatically reviews your environment for cost savings based on resource utilization?

Azure Advisor


10. You must create an activity log alert rule with a JSON template. You must design the template using the Azure Resource Manager (ARM). How will you code the template's resource type?

      "type": "Microsoft.Insights/activityLogAlerts",


11. Which is a requirement to set up an Azure Security Center event dashboard?

Security Center’s standard tier plan


12. Which service should you route metric data to for App Services performance analytics?

Application Insights


13. Where can you easily view all resources that have diagnostics settings enabled for a subscription?

Azure Monitor


14. You have a Windows based web application that you are not monitoring for performance. Some users report that it behaves slowly intermittently. What tools can you use to validate baseline performance?

Application Insights


15. How can you view the cost of a previous day's log analytics in Azure Cost Management + Billing?

Select Cost analysis, yesterday's date, and add a filter with Log Analytics service.


16. Which query language does Azure Monitor use?

Kusto


17. What is a requirement for monitoring Azure virtual machines (VMs) from a third-party tool running outside of Azure?

Network security group (NSG) rules that allows the necessary traffic between the Virtual Network (VNet) and the monitoring tool


18. To adhere to geo-compliance requirements, you want to restrict the creation of resources to specific Azure regions only. Which built-in Azure policy would you use to enforce this?

Allowed locations


19. You must plot a chart with 48 data points consisting of the Network Out Total metric based on its Count aggregation over the span of one day. How will you set up Azure Monitor to collect each data point over a regular period of time?

Set the Time range to Last 24 hours and set Time granularity to 30 minutes.


1. Which section in the Azure Security Center displays security recommendations?

Resource security hygiene


3. Your company's finance department reports that there has been considerable variance in Azure billing over the course of the holiday season. How would you begin your investigation?

Compare existing baseline usage of compute instances with auto-scale and total egress bandwidth with the latest utilization report.


2. What is the minimum number of days that you can store a classic guest operating system (OS) metric on an Azure storage account?

14


4. Which query can you use to create alerts in Azure Monitor?

| where Level != 3



5. You have a saved search that monitors processor metrics for all virtual machines (VMs). After you create an alert rule from within the workspace, how would you alert on processor metrics for only Windows VMs?

In the SQL Server Log Analyzer option, select the saved search. Use the search query field to edit the query.


6. What type of anomalous activity report is included as a risk detection type in Microsoft Entra ID?

Sign-ins from unknown sources



7. Which statement is true about creating alert rules in Azure?

You cannot create an alert for events in the alert category of the activity log.



You have an on-premises application server with a SQL Server backend. Your first step in migrating the environment into Azure platform as a service (PaaS) is to move the SQL database for the application to Azure SQL. How do you confirm that baseline performance for the application remains consistent after the move?

Prior to migration, configure App Insights to establish baseline performance metrics for the application. Compare App Insights before and after data.



10. You are working with your accounting department to validate an Azure invoice against that month's usage report. What are the equivalent terms from each document?

Name = Meter category;

Resource = Meter name;

Billable = Overage quantity



11. You want to create a query that looks at the variance of processor usage on all virtual machines (VMs) in your workspace. It should also display the top two results in a line chart that you can pin to your dashboard. What are two necessary steps to accomplish this?

Create a shared dashboard;

In the advanced analytics portal, create a query that uses the extend function.


12. You have two web applications behind an application gateway. What steps are necessary to view firewall logs for this environment?

Enable Web Application Firewall (WAF) and configure the logs that you want to archive to a storage account.


13. You are responsible for a number of App Services that have autoscale enabled. How would you view the weekly cost trends of these resources?

In the cost analysis tool, create a custom timespan. In the Cost History page filter by Meter.


14. When configuring an alert rule, which step defines the metrics to use?

Configure signal logic


15. Your company has multiple data centers and multiple environments across several regions in Azure. You must provide a centralized monitoring solution. What would be your first step in achieving this?

Install NPM agents on all endpoints in all locations.


16. How would you add notable events to your event dashboard?

Use Log Analytics query language.


17. You must create a Network Watcher in the Central-US region with the resource group DemoRG1. How will you create this using the command-line interface?

az network watcher configure \

  --resource-group DemoRG1 \

  --locations centralus \ <<<<<<<<<<<<

  --enabled

  

  

  18. When viewing the event dashboard for a client you manage, the EVENT column is blank for certain workspaces. What could be the cause?

The client has not upgraded the workspaces with the blank EVENT column to the standard tier.


19. What would you use baseline policies for in Azure?

To understand resource utilization


20. Who is responsible for establishing a connection from the on-premises network to Azure?

The customer only

Posted By Nikhil17:10

ca

Filled under:

 SELECT DB_KEY, DBID, NAME, DB_UNIQUE_NAME, RESETLOGS_TIME FROM RC_DATABASE;

Posted By Nikhil05:06
Filled under:

 Subject: Notification: Upcoming Catalog Database Patching Schedule


Dear Team,


We would like to inform you about the upcoming catalog database patching activity scheduled for this weekend. As part of this activity, we aim to ensure a seamless experience and avoid any potential disruptions to ongoing or scheduled restores.


To mitigate any issues:


1. Scheduled Restores: Please ensure that all restores scheduled to run over the weekend are reviewed and aligned accordingly to avoid conflicts with the patching activity.



2. Ongoing Restores in EMEA: Kindly review the logs of ongoing restores in the EMEA region on Saturday to confirm there are no failures or anomalies.




While we strive to ensure there are no residual issues post-patching, this email serves as a heads-up and a precautionary reminder to double-check and validate the ongoing and scheduled activities. Your cooperation in this regard will help us maintain a smooth operational flow.


Should you encounter any issues or need support during this process, please feel free to reach out.


Thank you for your attention to this matter.


Best regards,

Daidipya Upalekar

Oracle Database Administrator


Posted By Nikhil02:43

Thursday, 9 January 2025

rmancat

Filled under:

  Request for Information on OEM Changes Post-RMAN Catalog Database Upgrade


I’m in the process of patching the RMAN catalog database, and I want to ensure everything runs smoothly post-upgrade. Could you please provide information on any necessary changes at the OEM level after upgrading both the database and the catalog version?

Specifically, I want to confirm:

  • If there are any specific configurations or settings that need to be updated in OEM post-upgrade.
  • Are there any internal jobs or workflows triggered via OEM that might need adjustments to align with the new database and catalog version?
  • Any particular considerations for monitoring or alerting in OEM to avoid job failures post-patch?

Your guidance would help me ensure everything remains stable and functioning properly after the upgrade.

Looking forward to your advice.

Posted By Nikhil06:00

PEM Questions

Filled under:

What are the key metrics PEM monitors out-of-the-box, and can we customize these metrics?

How does PEM handle alerting, and what are the options for email, SMS, or webhook integrations?

Are there any templates or best practices for configuring alerts for high-availability setups?


Can PEM help identify slow queries, deadlocks, or blocking sessions?

How does PEM handle real-time performance monitoring and historical data analysis?

Are there any automated tuning or recommendation features in PEM?


  • How does PEM ensure secure communication between the PEM server and agents?
  • Can PEM integrate with external authentication providers like LDAP or Active Directory?
  • What are the access control options for restricting visibility or actions for specific users or teams?


  • Can PEM integrate with external monitoring solutions like Prometheus, Grafana, or Nagios?

  • What troubleshooting tools or logs does PEM provide for resolving connectivity or performance issues?

  • What is the upgrade process for PEM, and are there considerations for avoiding downtime?
  • How frequently are PEM updates released, and what is the typical lifecycle of a version?
  • Posted By Nikhil05:24

    mail

    Filled under:

     "Thank you for the opportunity to assist with your load balancing setup and testing. I hope the scenarios we simulated and the insights shared during these sessions were helpful. For further discussions or to close this topic, could you kindly loop in my manager and our Reliability Lead in your follow-up emails? This will ensure they are aligned with the progress and the assistance provided. 



    It was great collaborating with you on the load balancing setup and testing. 

    Please let me know if you need any further assistance or have additional queries regarding the scenarios we discussed and simulated.


    Thank you for sharing the MoM. It was a pleasure assisting with the load balancing setup and testing, including addressing your queries and simulating scenarios. 

    Please feel free to reach out if you need further assistance or have any additional questions.




    Posted By Nikhil05:14

    Wednesday, 8 January 2025

    Filled under:

     Subject: Request to Add Raised GitLab Issue to Backlog


    Dear [Team/Recipient's Name],


    I hope this email finds you well.


    I wanted to inform you that I have raised a GitLab issue addressing the restricted access for the patronictl utility to enhance security and prevent unauthorized actions affecting PostgreSQL clusters. The details of the issue are outlined as follows:



    ---


    Issue Summary:


    Problem: Non-Postgres users currently have access to patronictl.


    Resolution: Restrict access to the postgres user or explicitly authorized users via sudo.




    ---


    Tasks Proposed:


    1. Audit server permissions for patronictl.



    2. Update ownership and permissions estate-wide.



    3. Restrict access to the postgres user.



    4. Validate compliance post-deployment.



    5. Document changes for future reference.




    This has been categorized as a high-priority issue, as it directly impacts the security and reliability of our PostgreSQL clusters.



    ---


    Request:


    Please add this issue to the backlog for prioritization and planning. Let me know if any additional details are needed, or if a discussion is required to refine the proposed solution.


    Looking forward to your confirmation.


    Best regards,

    Daidipya Upalekar


    Posted By Nikhil00:07

    Tuesday, 7 January 2025

    Filled under:

     Here’s a draft for a GitLab issue to highlight the problem and the required remediation:



    ---


    Title: Restrict Execution of patronictl Commands to Authorized Users Only



    ---


    Description


    It has been observed that across multiple servers, the patronictl command can be executed by normal (non-postgres) users. This poses a potential security risk as patronictl is used to manage Patroni clusters for PostgreSQL, which could lead to unauthorized actions affecting the database cluster.


    Current Issue


    The patronictl executable has permissions that allow execution by users who are not part of the postgres group or have not switched to the postgres user (su - postgres).


    This issue affects the security and integrity of the database clusters managed by Patroni.




    ---


    Impact


    Unauthorized access to patronictl commands may result in:


    Unintended changes to cluster state (e.g., failover, demotion, etc.).


    Potential disruption of database services.


    Violation of security and compliance policies.





    ---


    Proposed Solution


    1. Modify Ownership and Permissions for patronictl:


    Ensure patronictl is owned by the postgres user and group:


    chown postgres:postgres /path/to/patronictl


    Restrict permissions so that only the postgres user can execute it:


    chmod 700 /path/to/patronictl




    2. Restrict Access to postgres User:


    Verify and restrict the ability of users to switch to the postgres user (su - postgres) by reviewing /etc/sudoers and other access control mechanisms.




    3. Validation Steps:


    After applying the changes, confirm that:


    Non-postgres users cannot execute patronictl.


    Only the postgres user (or users explicitly authorized via sudo) can operate patronictl.





    4. Apply Changes Estate-Wide:


    Roll out these changes across all servers in the estate where Patroni is deployed.


    Perform validation checks after rollout to ensure compliance.






    ---


    Tasks


    1. Audit servers to identify current permissions of the patronictl executable.



    2. Update ownership and permissions for patronictl on all servers.



    3. Restrict access to the postgres user via sudoers or other mechanisms.



    4. Perform validation testing on all servers.



    5. Document changes for future reference.





    ---


    Priority: High


    This issue requires immediate attention to prevent potential misuse or unauthorized actions affecting PostgreSQL clusters.



    ---


    References


    Patroni Documentation


    PostgreSQL Best Practices




    ---


    Does this look good, or would you like any further modifications?


    Posted By Nikhil21:40

    Friday, 3 January 2025

    Filled under:

     am writing to follow up on my recent claims, which have already been approved. However, I have not yet received the claim receipt or the approved amount in my account.

    Could you kindly provide an update on the status of the payment and share the claim receipt for my records? If there are any additional steps required from my side, please let me know.

    Here are the details for your reference:

    • Claim ID/Reference Number: [Insert Claim ID]
    • Date of Approval: [Insert Date]
    • Claim Amount: [Insert Amount, if known]

    Your assistance in resolving this matter at the earliest would be greatly appreciated.

    Looking forward to your prompt response.

    Thank you for your attention.

    Best regards,

    Posted By Nikhil19:28

    Wednesday, 1 January 2025

    Filled under:

     Subject: Action Required: Password Reset Needed for Database Accounts Before Patching


    Dear [Application/Account Owner],


    It has been identified that some database accounts have not had their passwords reset for a significant amount of time. This poses a risk to the success of the upcoming database patching activity. If the patching proceeds without addressing this, the following accounts may fail to authenticate:


    [List of Accounts]


    To avoid any disruptions, we request you to reset the passwords for these accounts before [specific date/time] to ensure a smooth patching process.


    Please confirm once the passwords have been reset. If you need assistance or have any questions, feel free to reach out to [Contact Person/Team] at [Contact Details].


    Your prompt action on this matter is highly appreciated.


    Best regards,

    [Your Name]

    [Your Role]

    [Your Contact Information]


    Posted By Nikhil20:39