Oracle 12c - causing problem by solving it!?!
Regular readers of our blog probably already know that for most of our databases we're using two storage layers to keep our backups - NAS volumes as a primary layer and tapes as secondary one - please check "Datafile without backups - how to restore?" for more details. If you read another post "Importance of testing yours backup strategy", you also know that we're constantly running automatic recoveries to verify restorability and recoverability of our databases. Every 4th recovery is done from tapes, the others are performed from disks. Recently, I've got such errors after running automatic recovery from tapes:
RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of restore command at 07/08/2015 13:08:22 RMAN-06026: some targets not found - aborting restore RMAN-06100: no channel to restore a backup or copy of datafile 111 RMAN-06100: no channel to restore a backup or copy of datafile 110 RMAN-06100: no channel to restore a backup or copy of datafile 109 (...) RMAN-06100: no channel to restore a backup or copy of datafile 1 Recovery Manager complete.
This time it has been different from the case described in the first blog post mentioned above, as we got errors for every datafile, not only for one or small subset of them, which in turn, pointed me to the script used to search for incremental level 0 backupsets to be sent to tape (only every 4th backup of this type is sent to tapes). I suspected that either no backups were sent to tapes or they were somehow deleted. After few tests I've found that the former was the case, which has been caused by below condition in query looking for incremental level 0 backupsets:
and a.backup_type = 'D'
where backup_type is the column from v$backup_set_details. Quick look in the documentation pointed me to the definition of v$backup_set where I found:
BACKUP_TYPE VARCHAR2(1) Type of files that are in this backup. If the backup contains archived redo logs, the value is L. If this is a data file full backup, the value is D. If this is an incremental backup, the value is I.
At this point it is worth reminding subtle difference between full and incremental level 0 backups in Oracle terminology:
The only difference between a level 0 incremental backup and a full backup is that a full backup is never included in an incremental strategy. Thus, an incremental level 0 backup is a full backup that happens to be the parent of incremental backups whose level is greater than 0.
In the meantime I've also realized that the problem could be seen on other Oracle 126.96.36.199 databases, where everything was fine on 188.8.131.52 and even on 184.108.40.206. So I started to think that something introduced in new version of Oracle 12c messed things up. But wait a minute... After reading documentation statement again, I was surprised that we are looking for backups of type 'D', where we should be looking for backups of type 'I' instead, as normally we don't do any full backups, but incremental level 0 ones. After quick check everything was clear:
220.127.116.11 and 18.104.22.168: SQL> select distinct backup_type, incremental_level from v$backup_set where incremental_level is not null; B INCREMENTAL_LEVEL - ----------------- D 0 I 1 22.214.171.124: SQL> select distinct backup_type, incremental_level from v$backup_set where incremental_level is not null; B INCREMENTAL_LEVEL - ----------------- I 0 I 1
I've checked 11g documentation regarding description of backup_type column and it was the same as in 12c. This means that the new Oracle 12c version hasn't introduced the problem, but rather the solution to the problem existing in 11g, which of course caused our script to work incorrectly. The script has been written according to real life, not to documentation, which I think still could be a little bit misleading taking into account small full and incremental level 0 backups difference. So, beware, new versions could cause a problem actually by solving it :), which only reminds us that any change of software behaviour could have significant impact. If you rely on backup_type column in your backup or reporting scripts, please don't forget to change them after migrating to Oracle 126.96.36.199.