Databases at CERN blog

Hadoop performance troubleshooting with stack tracing, an introduction.

Topic: This post is about profiling and performance tuning of distributed workloads, in particular Hadoop applications. You will learn about a profiler we have developed and how it has been successfully applied to tuning Sqoop, improving the throughput of data transfer from Oracle to Hadoop.
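
The core idea of stack-trace profiling can be illustrated with a minimal sketch (this is not the profiler described in the post; the tool choice, sampling rate and output parsing are all assumptions): repeatedly capture thread dumps of a JVM process with jstack and count which top frames appear most often.

    import collections
    import subprocess
    import sys
    import time

    def sample_stacks(pid, samples=100, interval=0.1):
        """Sample a JVM's thread dumps with jstack and count the top
        stack frame of each thread, a rough view of where time is spent."""
        counts = collections.Counter()
        for _ in range(samples):
            out = subprocess.run(["jstack", str(pid)],
                                 capture_output=True, text=True).stdout
            for block in out.split("\n\n"):
                # in a thread dump, stack frames are lines starting with "at "
                frames = [l.strip() for l in block.splitlines()
                          if l.strip().startswith("at ")]
                if frames:
                    counts[frames[0]] += 1
            time.sleep(interval)
        return counts

    if __name__ == "__main__":
        for frame, n in sample_stacks(int(sys.argv[1])).most_common(10):
            print(n, frame)

Aggregating such samples across the executors of a distributed job is what turns this simple idea into a cluster-wide profiler.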

Tool to visualise block distribution on Hadoop (HDFS) cluster

Distributed systems always bring new challenges for administrators and users. This is the case with HDFS, the default distributed file system that Hadoop uses for storing data.

To face these challenges, tools that facilitate the administration and usage of these systems are developed. At CERN we provide a Hadoop service and have developed and deployed several such tools on our clusters; today we present one of them.
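
As a sketch of the underlying idea (not the CERN tool itself; the parsing below assumes the block-location format printed by a typical Hadoop version), a block distribution can be derived by asking the namenode for block locations and counting replicas per datanode:

    import collections
    import re
    import subprocess

    def block_counts(path="/"):
        """Count block replicas per datanode by parsing `hdfs fsck` output."""
        out = subprocess.run(
            ["hdfs", "fsck", path, "-files", "-blocks", "-locations"],
            capture_output=True, text=True).stdout
        counts = collections.Counter()
        for line in out.splitlines():
            # block lines list every datanode (ip:port) holding a replica
            if "blk_" in line:
                counts.update(re.findall(r"\d+\.\d+\.\d+\.\d+:\d+", line))
        return counts

    if __name__ == "__main__":
        for node, n in block_counts("/user").most_common():
            print(node, n)

A skewed replica count per datanode is exactly the kind of imbalance such a visualisation makes obvious at a glance.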

Datafile copy not found in control file during RMAN recovery

The Problem: database restore fails with ORA-19571: datafile copy RECID xxx STAMP yyy not found in control file

Our typical setup of Oracle databases consists of a primary RAC cluster along with a standby database, also in RAC configuration. We take RMAN database backups from the standby, while archivelogs are backed up from the primary database. Typically we back up everything to DISK (NAS) and further transfer some backups to TAPE. We also run regular automated recoveries to test our backups.
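
As an illustration of such automated backup testing (a hedged sketch only: the way RMAN is driven here is an assumption, and a restore validation is a lighter check than the full recoveries mentioned above):

    import subprocess

    # Check that database and archivelog backups are restorable,
    # without actually restoring anything.
    RMAN_SCRIPT = """
    RESTORE DATABASE VALIDATE;
    RESTORE ARCHIVELOG ALL VALIDATE;
    """

    def validate_backups():
        """Feed a validation script to RMAN and fail loudly on errors."""
        proc = subprocess.run(["rman", "target", "/"], input=RMAN_SCRIPT,
                              capture_output=True, text=True)
        if proc.returncode != 0 or "RMAN-" in proc.stdout:
            raise RuntimeError("backup validation reported errors:\n" + proc.stdout)
        return proc.stdout

    if __name__ == "__main__":
        print(validate_backups())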

Integrating Hadoop and Elasticsearch - Part 1 - Loading into and Querying Elasticsearch from Apache Hive

Introduction

As more and more organisations deploy Hadoop and Elasticsearch in tandem to satisfy batch analytics, real-time analytics and monitoring requirements, the need for tighter integration between Hadoop and Elasticsearch has never been greater. In this series of blog posts we look at how these two distributed systems can be tightly integrated and how each can exploit the features of the other to meet ever more demanding analytics and monitoring needs.
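
As a taste of what this integration looks like in practice (a minimal sketch using the elasticsearch-hadoop storage handler through the PyHive client; the host names, jar path and table/index names are all hypothetical):

    from pyhive import hive  # client-side illustration; assumes PyHive is installed

    conn = hive.Connection(host="hiveserver.example.org", port=10000)
    cur = conn.cursor()

    # Register the elasticsearch-hadoop connector (jar path is an assumption)
    cur.execute("ADD JAR /opt/es-hadoop/elasticsearch-hadoop-hive.jar")

    # An external Hive table whose storage handler reads from and
    # writes to an Elasticsearch index instead of HDFS
    cur.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS logs_es (ts TIMESTAMP, host STRING, msg STRING)
        STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
        TBLPROPERTIES ('es.resource' = 'logs/entry',
                       'es.nodes' = 'es-node.example.org:9200')
    """)

    # Loading into and querying Elasticsearch now look like plain HiveQL
    cur.execute("INSERT INTO TABLE logs_es SELECT ts, host, msg FROM logs_raw")
    cur.execute("SELECT host, count(*) FROM logs_es GROUP BY host")
    print(cur.fetchall())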

Using SQL Developer to access Apache Hive with Kerberos authentication

With Hadoop implementations moving into the mainstream, many Oracle DBAs have to access SQL-on-Hadoop frameworks such as Apache Hive in their day-to-day operations. What better way to achieve this than with the familiar, functional and trusted Oracle SQL Developer!

Oracle SQL Developer (since version 4.0.3) allows access to Apache Hive to create, alter and query Hive tables and much more. Let's look at how to set up connections to Hive, both with and without Kerberos, and, once connected, at the available functionality.
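
While SQL Developer itself is configured through its GUI, the same kind of kerberised HiveServer2 connection can be sketched programmatically (an illustration with PyHive, not SQL Developer; the host is hypothetical, and a valid ticket from kinit plus the conventional 'hive' service principal are assumed):

    from pyhive import hive

    # Assumes `kinit user@REALM` has already been run on this machine
    conn = hive.Connection(
        host="hiveserver.example.org",   # hypothetical HiveServer2 host
        port=10000,
        auth="KERBEROS",
        kerberos_service_name="hive",    # conventional Hive service principal
    )

    cur = conn.cursor()
    cur.execute("SHOW TABLES")
    for (table,) in cur.fetchall():
        print(table)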
