Blog

Oracle Cloud: start/stop automatically the Autonomous Databases

Submitted by fpachot on

In the previous post I set up the environment to easily control OCI services without dealing with the sign-in headers, and without installing anything. In this post I'll use the oci-curl() function to stop all my Autonomous Database services. In that previous post I also set the environment variables for the private and public keys, and for the user, tenant and compartment OCIDs.
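
As an illustration of the same idea, here is a minimal sketch that stops the running Autonomous Databases in a compartment. Note that the post itself uses the oci-curl() shell function against the REST API; this sketch uses the OCI Python SDK instead, and the compartment OCID is a placeholder.

```python
# Minimal sketch: stop every AVAILABLE Autonomous Database in a compartment.
# The post uses the oci-curl() shell function; this version uses the OCI
# Python SDK instead. The compartment OCID below is a placeholder.
import oci

config = oci.config.from_file()            # key, user and tenancy OCIDs from ~/.oci/config
db = oci.database.DatabaseClient(config)

compartment_id = "ocid1.compartment.oc1..xxxx"   # placeholder

for adb in db.list_autonomous_databases(compartment_id=compartment_id).data:
    if adb.lifecycle_state == "AVAILABLE":
        print("stopping", adb.db_name, adb.id)
        db.stop_autonomous_database(adb.id)
```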

Oracle Cloud Infrastructure API Keys and OCID

Submitted by fpachot on

As you may have read in the news, CERN is testing some Oracle Cloud services. When a large organisation uses its Cloud Credits, there's a need to control the service resources. This requires automation, and the GUI of the Cloud portal is not sufficient for that. We can control Oracle Cloud Infrastructure through the REST API, the OCI CLI, or the OCI SDKs, and all those methods require an RSA key for signing the requests and some OCIDs (Oracle Cloud Identifiers) to identify the user, the tenant, the compartment, the service,...

Optimizer Statistics Gathering - pending and history

Submitted by fpachot on

How do you proceed when you need to gather statistics on some tables in a critical environment? Some queries are slow because of stale statistics, but other queries on the same tables are fine. You cannot leave the initial problem unfixed, and adding hints or SQL Profiles for the identified queries is not the right solution when stale statistics are the root cause. At the same time, you want to minimize the risk of regression for the other queries.
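
The kind of workflow discussed in the post can be sketched with pending statistics and the statistics history. Below is a minimal, hedged example driven from python-oracledb; the connection details, schema and table names are placeholders, and this is not the exact procedure from the post.

```python
# Sketch: gather statistics as "pending", test them, then publish.
# Driven from python-oracledb; connection details, schema and table
# names are placeholders.
import oracledb

conn = oracledb.connect(user="app", password="secret", dsn="dbhost/pdb1")  # placeholders
cur = conn.cursor()

# Keep the freshly gathered statistics unpublished for this table
cur.execute("begin dbms_stats.set_table_prefs('APP', 'ORDERS', 'PUBLISH', 'FALSE'); end;")
cur.execute("begin dbms_stats.gather_table_stats('APP', 'ORDERS'); end;")

# In a test session, let the optimizer use the pending statistics and
# re-run the identified queries to check their plans
cur.execute("alter session set optimizer_use_pending_statistics = true")
# ... run the critical queries here and compare execution plans ...

# Publish once validated; dbms_stats.delete_pending_stats() would discard them,
# and dbms_stats.restore_table_stats() can roll back using the statistics history
cur.execute("begin dbms_stats.publish_pending_stats('APP', 'ORDERS'); end;")
```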

Install Kubernetes on Oracle Cloud Infrastructure

Submitted by anappi on

Over the last year Oracle has changed a lot, moving with determination into the Cloud business and extending its portfolio with IaaS, PaaS and SaaS solutions. In the context of the openlab collaboration between Oracle and CERN, we have been testing some of these cloud solutions. Oracle Cloud Infrastructure (OCI) is one of them, and in this post I'll show how to install and run a Kubernetes cluster on it.

HAProxy Canary Deployment

Submitted by anappi on

Canary deployment is a way to test a new release of a piece of software by rolling it out to only a small subset of users. In this post I'll show how, in the Middleware section of the Database group at CERN, we configure our HAProxy setup for canary deployments. I'll give a brief introduction to what a canary deployment is, and then we will see how to configure HAProxy.

HAProxy High Availability Setup

Submitted by anappi on

In the modern world where everyone wants to be always connected, High Availability has become one of the most important features of a system. If you are running a system, you don't want a failure in one piece of your architecture to impact the whole system, so you have to make all the components of your architecture highly available. In this post we will present how, in the Middleware section of the Database group at CERN, we set up a highly available HAProxy based on CentOS 7.

Apache Spark and CERN Open Data Analysis, an Example

Submitted by canali on

This is a short post introducing a notebook that you can use to play with a simple analysis of High Energy Physics (HEP) data using CERN open data and Apache Spark. The idea for this work started as a technology demonstrator of some recent developments in using Spark for data analysis in the context of HEP.
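
To give an idea of the kind of code involved, here is a minimal PySpark sketch of a simple cut-and-aggregate step; it is not taken from the notebook, and the Parquet path and column names are hypothetical.

```python
# Hypothetical PySpark sketch of a simple cut-and-aggregate on event data.
# The Parquet path and column names are placeholders, not the notebook's.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("hep-open-data").getOrCreate()

events = spark.read.parquet("/data/opendata/events.parquet")    # placeholder path

(events
    .filter(F.col("nMuon") == 2)                                # hypothetical selection
    .withColumn("pt_bin", F.floor(F.col("muon_pt") / 10) * 10)  # 10 GeV bins
    .groupBy("pt_bin")
    .count()
    .orderBy("pt_bin")
    .show())
```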

Performance comparison of different file formats and storage engines in the Hadoop ecosystem

Submitted by zbaranow on

This post reports performance tests for a few popular data formats and storage engines available in the Hadoop ecosystem: Apache Avro, Apache Parquet, Apache HBase and Apache Kudu. This exercise evaluates space efficiency, ingestion performance, analytic scans and random data lookup for a workload of interest at the CERN Hadoop service.
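
As a rough illustration of how such a comparison can be set up (not the actual benchmark code from the post), the sketch below writes the same synthetic DataFrame in Avro and Parquet and compares the on-disk footprint. It assumes a Spark installation with the external spark-avro module on the classpath and uses placeholder local paths.

```python
# Rough sketch: write the same synthetic DataFrame as Avro and Parquet and
# compare the on-disk footprint. Assumes Spark with the spark-avro module on
# the classpath; local paths are placeholders. Not the benchmark from the post.
import os
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-comparison").getOrCreate()

df = spark.range(0, 10_000_000).selectExpr(
    "id",
    "cast(id % 1000 as string) as category",
    "rand() as value",
)

df.write.mode("overwrite").format("avro").save("/tmp/bench/avro")
df.write.mode("overwrite").parquet("/tmp/bench/parquet")

def dir_size(path):
    """Total size in bytes of all files under path."""
    return sum(os.path.getsize(os.path.join(root, f))
               for root, _, files in os.walk(path) for f in files)

for path in ("/tmp/bench/avro", "/tmp/bench/parquet"):
    print(path, dir_size(path), "bytes")
```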

Distributed Deep Learning with Apache Spark and Keras

Submitted by jhermans on

In the following blog post we study the topic of Distributed Deep Learning, or rather, how to parallelize gradient descent using data-parallel methods. We start by laying out the theory while supplying some intuition into the techniques we applied. At the end of the post, we conduct some experiments to evaluate how different optimization schemes perform in identical situations.
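
To make the idea of data-parallel gradient descent concrete, here is a toy NumPy sketch of synchronous gradient averaging on a linear model. The workers are only simulated sequentially; the real experiments in the post train Keras models on Spark.

```python
# Toy synchronous data-parallel gradient descent on a linear model:
# each "worker" computes a gradient on its own data shard, the gradients
# are averaged, and the shared parameters are updated. Workers are
# simulated sequentially; the post does this for real with Spark and Keras.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4096, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=4096)

n_workers, lr = 4, 0.1
w = np.zeros(10)                                  # shared model parameters
shards = np.array_split(np.arange(len(X)), n_workers)

for step in range(100):
    grads = [2 * X[s].T @ (X[s] @ w - y[s]) / len(s) for s in shards]
    w -= lr * np.mean(grads, axis=0)              # average and apply the update

print("parameter error:", np.linalg.norm(w - true_w))
```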

Disclaimer

The views expressed in this blog are those of the authors and cannot be regarded as representing CERN’s official position.
