Last week I've investigated how does OAuth2 protocol works and developed a Proof of Concept (PoC) in Java. In this post I would like to show you how effortlessly develop simple client-server application using OAuth 2.0 standard for authorization of protected resources placed on a server.
Before we start developing our first secured web application with OAuth2 let's understand how it works.
What is it and how does it work?
Alluxio refers to itself as an "Open Source Memory Speed Virtual Distributed Storage" platform. It sits between the storage and processing framework layers in the distributed computing ecosystem and claims to heavily improve performance when multiple jobs are reading/writing from/to the same data. This post will cover some of the basic features of Alluxio and will compare its performance for accessing data against caching in Spark.
Topic: In this post, you will find an example of how to build and deploy a basic artificial neural network scoring engine using PL/SQL.
At CERN we run multiple Hadoop clusters to satisfy demanding requirements from our experiments and accelerator communities. The usage and criticality of the clusters are increasing dramatically as more users are looking at Hadoop to process and archive the vast amounts of data coming out of LHC.
Topic: In this short post you can find examples of how to use IPython/Jupyter notebooks for running SQL on Oracle.
Topic: In this post you will find a short discussion and pointers to the code of a few sample scripts that I have written using Linux BPF/bcc and uprobes for Orac
In the part 2 of 'Integrating Hadoop and Elasticsearch' blogpost series we look at bridging Apache Spark and Elasticsearch. I assume that you have access to Hadoop and Elasticsearch clusters and you are faced with the challenge of bridging these two distributed systems. As spark code can be written in scala, python and java, we look at the setup, configuration and code snippets across all these three languages both in batch and interactively.
Topic: in this post you can find examples of how to get started with using IPython/Jupyter notebooks for querying Apache Impala.
Topic: This post is about profiling and performance tuning of distributed workloads and in particular Hadoop applications. You will learn of a profiler application we have developed and how it has successfully been applied to tuning Sqoop to improve the throughput of data transfer from Oracle to Hadoop.