Performance

Distributed Deep Learning for Physics with TensorFlow and Kubernetes

Summary: This post details a solution for distributed deep learning training for a High Energy Physics use case, deployed using cloud resources and Kubernetes. You will find the results for training using CPU and GPU nodes. This post also describes an experimental tool that we developed, TF-Spawner, and how we used it to run distributed TensorFlow on a Kubernetes cluster.

 

SystemTap into Oracle

Topic: This post is about using SystemTap for investigating and troubleshooting Oracle RDBMS. In particular you will learn how to probe Oracle processes and their userspace functions. These techniques aim to be useful as well as fun to learn for those keen into peeking under the hood of the technology and improve their effectiveness in troubleshooting and performance investigations.