DataGuard Switchover in Oracle Cloud Infrastructure and Oracle Cloud
Acronym:
DR – Disaster Recovery
Organizations are investing a colossal sum of money in DR (Disaster Recovery Solutions) because it has a direct impact on business continuity. Many companies implement the DR solution and do not test the DR functionality frequently. It’s the business and IT infrastructure team’s responsibility to test DR functionality at least twice a year. Some companies run their load in the DR site for 6 months and change it back to the primary site. The company should have a proper disaster recovery plan to be back online within a short period of time.
This link has really good information about DR strategy and planning https://www.iternalnetworks.com/what-is-the-best-method-for-disaster-recovery/
Building a Disaster Recovery Plan
Before you choose the best method for disaster recovery, you must first have a Disaster Recovery Plan in place.
To ensure your business continuity, you’ll want to make sure that these best practices are followed:
- Have an updated written or printed version of your Disaster Recovery Plan in an easy-to-access location before a disaster occurs
- Knowing where the plan is located will help you to quickly start the process without wasting precious time searching for the document
- Ensure that your recent system backups are stored offsite, be it in the cloud or at another physical location. This helps to ensure that your backups aren’t affected by the same disaster which can lead to longer recovery times
- Plan for how your business will return to normalcy post-disaster
- Update your Disaster Recovery Plans frequently to reflect changes in your business
- Test your plan to make sure it’s viable against disasters that are likely in your area
Once your Disaster Plan meets these criteria, you will be far more suited to choose a method of recovery that fits your business. You’ll also need to choose a method of backup storage, be it the cloud, local, removable storage media, or a mix of both.
Recovery Time Objective (RTO) vs Recovery Point Objective
The organization must understand the differences between recovery time objective (RTO) and recovery point objective (RPO) in disaster recovery. RTO is how long it will take to get your critical infrastructure up and running after a disaster while RPO is the timeframe between the disaster and your last backup.
When RPO is zero, there is no data loss because your system is continually backing up the newest versions of data.
When RTO is zero, there is no downtime – you barely notice that your systems went down in the first place.
OCI enables greater reflexibility to enable DR solution via DG (DataGuard), this enables the creation of a standby database in the same region or different region. Also, this enables the easy option for switchover and failover.
In this article, I will illustrate how we can test the switchover in OCI Environments.
First, let’s gather primary and standby database details.
Gather Database Details
Primary
########## database details
SQL> show parameter db_uni
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string PWSH01_yyz16x
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PWHSE_PDB READ WRITE NO
Database Status
SQL> select INST_ID,FLASHBACK_ON from gv$database;
INST_ID FLASHBACK_ON
---------- ------------------
1 YES
2 YES
select instance_name,status,HOST_NAME,to_char(startup_time,'dd/mm/yyyy hh24:mi') startup_time from gv$instance;
INSTANCE_NAME STATUS HOST_NAME STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011 OPEN dbsdpl21 26/11/2021 19:39
PWSH012 OPEN dbsdpl22 26/11/2021 19:39
Standby Database
########## standby server
SQL> how parameter db_uni
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string PWSH01_yyz1k6
SQL> select open_mode,database_role from gv$database;
OPEN_MODE DATABASE_ROLE
-------------------- ----------------
READ ONLY WITH APPLY PHYSICAL STANDBY
READ ONLY WITH APPLY PHYSICAL STANDBY
select instance_name,status,HOST_NAME,to_char(startup_time,'dd/mm/yyyy hh24:mi') startup_time from gv$instance;
INSTANCE_NAME STATUS HOST_NAME STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011 OPEN dbsdpl251 26/11/2021 19:43
PWSH012 OPEN dbsdpl252 26/11/2021 19:43
Standby guaranteed to restore point creation
Note: First create a restore point on the standby side, before that perform a few log switches from primary to make it consistent.
alter system archive log current;
Error
This error comes because the MRP process is recovering the standby database. To create GRP we need to stop the MRP process.
SQL> create restore point Before_Switchover_stby guarantee flashback database;
create restore point Before_Switchover_stby guarantee flashback database
*
ERROR at line 1:
ORA-38784: Cannot create restore point 'BEFORE_SWITCHOVER_STBY'.
ORA-01153: an incompatible media recovery is active
Solution
########### solution
DGMGRL> edit database PWSH01_yyz1k6 set state='APPLY-OFF';
Succeeded.
DGMGRL> show database PWSH01_yyz1k6
Database - PWSH01_yyz1k6
Role: PHYSICAL STANDBY
Intended State: APPLY-OFF
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 16 seconds (computed 1 second ago)
Average Apply Rate: (unknown)
Real Time Query: OFF
Instance(s):
PWSH011 (apply instance)
PWSH012
Database Status:
SUCCESS
SQL> create restore point Before_Switchover_stby guarantee flashback database;
Restore point created.
Primary guaranteed restore point creation and verification
create restore point Before_Switchover_stby guarantee flashback database;
SCN GUA TIME NAME
---------- --- --------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
5940139 YES 26-NOV-21 08.29.09.000000000 PM BEFORE_SWITCHOVER_PRI
Pre-checks before switchover
Verify the connectivity between primary and standby.
############### standby connection
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION=(SDU=65535)(SEND_BUF_SIZE=10485760)(RECV_BUF_SIZE=10485760)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.0.0.10)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=10.0.0.47)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=PWSH01_yyz1k6.sub08101943420.cnvcn01.oraclevcn.com)(UR=A)))
OK (0 msec)
[oracle@dbsdpl21 admin]$
Validate the DataGuard Status
Note: It’s really important to validate the data guard status before the switchover, dgmgrl utility has features to check the database readiness for the switchover.
[oracle@dbsdpl21 admin]$ dgmgrl sys/CHana88#_@PWSH01_YYZ1K6
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Fri Nov 26 20:34:49 2021
Version 19.11.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected to "PWSH01_yyz1k6"
Connected as SYSDBA.
DGMGRL>
DGMGRL> show database verbose
Object "verbose" was not found
DGMGRL> show configuration;
Configuration - PWSH01_yyz16x_PWSH01_yyz1k6
Protection Mode: MaxPerformance
Members:
PWSH01_yyz16x - Primary database
PWSH01_yyz1k6 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 39 seconds ago)
Validate the Database is ready for the Switchover
############# Validate
DGMGRL> edit database PWSH01_yyz1k6 set state='APPLY-ON';
Succeeded.
DGMGRL>
DGMGRL> validate database 'PWSH01_yyz1k6';
Database Role: Physical standby database
Primary Database: PWSH01_yyz16x
Ready for Switchover: Yes
Ready for Failover: Yes (Primary Running)
Managed by Clusterware:
PWSH01_yyz16x: YES
PWSH01_yyz1k6: YES
Standby Apply-Related Information:
Apply State: Running
Apply Lag: 2 seconds (computed 0 seconds ago)
Apply Delay: 0 minutes
DGMGRL>
Switchover
OCI gives flexibility to switchover databases using GUI and commands links. As a DBA I would prefer on the command line and it gives more control.
Note: Make sure to run a few log switches before the switchover and check standby database is fully synced. If everything looks perfect, perform the Switchover.
Switchover using GUI:
Commands Line:
DGMGRL> switchover to 'PWSH01_yyz1k6';
Performing switchover NOW, please wait...
New primary database "PWSH01_yyz1k6" is opening...
Oracle Clusterware is restarting database "PWSH01_yyz16x" ...
Connected to "PWSH01_yyz16x"
Connected to "PWSH01_yyz16x"
Switchover succeeded, new primary is "PWSH01_yyz1k6"
DGMGRL>
Validate Database role after the Switchover
Once the switchover is successful, verify the database role on the new primary and standby.
SQL> show parameter db_uni
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string PWSH01_yyz1k6
SQL> select open_mode,database_role from gv$database;
OPEN_MODE DATABASE_ROLE
-------------------- ----------------
READ WRITE PRIMARY
READ WRITE PRIMARY
SQL> set lines 600
SQL> /
INSTANCE_NAME STATUS HOST_NAME STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011 OPEN dbsdpl251 26/11/2021 19:43
PWSH012 OPEN dbsdpl252 26/11/2021 19:43
SQL>