Reaching the end of my internship as an Openlab student at CERN, I think it’s appropriate to write a summary of my project in our blog. I will try to give a good overview for the past weeks.
Databases at CERN blog
Topic: Counting the number of distinct values (NDV) for a table column has important applications in the database domain, ranging from query optimization to optimizing reports for large data warehouses. However the legacy SQL method of using SELECT COUNT (DISTINCT <COL>) can be very slow. This is a well known problem and Oracle 18.104.22.168 provides a new function APPROX_COUNT_DISTINCT implemented with a new-generation algorithm to address this issue by providing fast and scalable cardinality estimates.
Oracle patch set 22.214.171.124 has been released and with it comes an important new feature: the In-Memory option. CERN has been involved in the testing of this feature since an early stage so I'd like to take the occasion to share my experience with you!
What is it?
It is a new static pool in the System Global Area, keeping a copy of the data stored In-Memory in Columnar format:
What is the best way to handle archived logs deletion in environments with standby and downstream capture databases? One could use own scripts, to delete for example all backed up archived logs, older than n days. But better way, will be to set RMAN archived log deletion policy, because then, additional options could be specified, to delete archived logs which are not only backed up n times, but also applied or shipped to other databases in the environment.
Having the datasource password in my version control systems is an issue that has run after me since the beginning of the time. It is classic that you are always postponing it in the development process untill somebody from the security team comes to your office and tells you "what the @#$ are these passwords doing in the svn/git???" To avoid this embarrasing situation you have different choices:
Maybe you are wondering why a java entry in a databases blog? The answer is simple, at CERN, the Infrastucture and Middleware Services section (aka IMS) is part of the Database Services , and, among many other things, we take care of the Java application servers for our users.
Are you calling external C libraries from inside your Oracle SQL? Especially if those calls are made on a row-by-row basis you might see a significant performance benefit from calling your external C libraries via Java!
If you use ASM on a regular basis, you may find that knowledge of the internals can be very useful for advanced troubleshooting and in general to acquire familiarity with the technology. In particular this is even more relevant when using ASM capabilities for data redundancy/mirroring (i.e. when deploying disk groups with normal or high redundancy).
Topic: This post introduces the latest changes to OraLatencyMap and PyLatencyMap, two custom tools for collecting and displaying Oracle wait event latency details using heatmaps.
OraLatencyMap is a SQL*Plus tool, with a core written in PL/SQL, aimed at studying Oracle random I/O by displaying the latency drill-down of the wait event 'db file sequential read' using heatmaps. The tool can also be used to collect and display event latency histograms for any other Oracle wait event, for example: log file sync to study commit time latency.
In my previous blog post (http://db-blog.web.cern.ch/blog/szymon-skorupinski/2014-05-migrating-oracle-database-12c-what-do-auditing) I provided you with number of reasons why unified auditing looks very promising and should be seriously considered while migrating to 12c. Nonetheless, I was not talking at all about performance – which also seems to be greatly improved.