Using Tiered Storage in Alluxio

Introduction

Alluxio is an open-source, memory-speed, virtual distributed storage system. A brief overview of Alluxio was given in a previous blog post. This post covers one of its most powerful features: tiered storage.

Tiered storage allows the Alluxio volume to expand beyond memory alone. Additional tiers of storage are defined, each mounted on a different directory and potentially backed by different hardware. The simplest use of this feature extends Alluxio's storage space onto a second medium, such as HDD or SSD. This lets the extremely high throughput of memory be combined with slower but cheaper, higher-capacity technologies, which can greatly increase the practicality of an Alluxio-based cluster.

Configuration

Tiered storage is configured by editing the alluxio-site.properties file on each node in the cluster:

# Define 2 storage levels.
alluxio.worker.tieredstore.levels=2
# Define MEM as the top tier (level 0).
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=${alluxio.worker.memory.size}
alluxio.worker.tieredstore.level0.reserved.ratio=0.1
# Define SSD as the second tier (level 1).
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/ssd
alluxio.worker.tieredstore.level1.dirs.quota=30GB
alluxio.worker.tieredstore.level1.reserved.ratio=0.1

This configuration is largely self-explanatory. For simplicity, the change was made on the master machine and then copied to each of the workers:

alluxio copyDir conf
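The copyDir utility ships with Alluxio and copies a directory to every worker listed in conf/workers. The cluster then needs to be restarted to pick up the new tier; with a default install this looks something like the following (the exact mount option may vary with your Alluxio version and ramdisk setup):

./bin/alluxio-stop.sh all
./bin/alluxio-start.sh all Mount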

Once Alluxio is restarted, the web UI shows that the Alluxio volume has been extended into the new SSD directory.

Behaviour

The MEM/SSD separation is completely internal to Alluxio: any program that interfaces with Alluxio works exactly as before. Internally, Alluxio uses allocators and evictors to organise data across the tiers of storage. Allocators choose which storage level and which directory data is written to within a worker. Evictors choose which blocks are pushed down a level when the tier an allocator wants to write to runs short of space.

All default allocators prioritise maximising the usage of the top tier of storage (memory in this case). I have found that this behaviour does not load files into Alluxio optimally: new blocks are written to memory ahead of older blocks, and those older blocks are then pushed down to a lower tier. The result is a constant cycle of blocks being written to memory and then evicted down to SSD storage. Custom allocators and evictors could, however, be written to improve this.

By default, Alluxio uses the MaxFreeAllocator and the LRUEvictor. MaxFreeAllocator prioritises writing to memory, choosing the storage directory with the most free space within a worker. LRUEvictor evicts the block that has gone unused for the longest time in a storage tier.
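Both are pluggable. As a sketch for Alluxio 1.x, the implementations are selected in alluxio-site.properties; the defaults are equivalent to the lines below, and a custom allocator or evictor would be dropped in by its fully qualified class name:

alluxio.worker.allocator.class=alluxio.worker.block.allocator.MaxFreeAllocator
alluxio.worker.evictor.class=alluxio.worker.block.evictor.LRUEvictor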

Files can also be pinned through the filesystem API to prevent evictors from moving them out of memory.
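For example, pinning can be done from the Alluxio command-line filesystem interface (the path here is a placeholder):

alluxio fs pin /pagecounts
alluxio fs unpin /pagecounts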

Testing

Much of the distributed processing at CERN is done on slow, high-capacity HDD clusters. We will therefore compare a standard HDD cluster, where data is stored on HDFS, with a MEM + SSD cluster where data is stored in Alluxio (with HDFS as the underFS). We will compare both the speed difference and the ease of transitioning from this traditional style of cluster to the newer architecture.

Each node was given 2500 MB of memory storage and 30 GB of SSD storage. 100 GB of Wikipedia pagecount data was loaded into each filesystem.
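For reference, a sketch of loading the data (all paths are placeholders): the HDFS cluster can be loaded with the standard Hadoop tooling, and Alluxio with its command-line interface.

hadoop fs -put /data/pagecounts /pagecounts
alluxio fs copyFromLocal /data/pagecounts /pagecounts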

A standard count was performed in each case.
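The count itself was a simple Spark job. A minimal sketch of this kind of job from the spark-shell, assuming the Alluxio client jar is on the Spark classpath and using placeholder hostnames with default ports, might look like this; only the filesystem URI changes between the two runs:

// Count the records of the pagecount dataset stored in Alluxio.
val alluxioCount = sc.textFile("alluxio://master:19998/pagecounts").count()
// The same count against the plain HDFS cluster.
val hdfsCount = sc.textFile("hdfs://namenode:8020/pagecounts").count()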

The MEM + SSD cluster is clearly faster, although in this case only about twice as fast. We would have expected a greater improvement. We suspect this is because each node had only 4 cores, so the jobs were CPU-bound; given more cores, Alluxio should be able to utilise the full SSD (and memory) throughput.

Conclusions

We saw a significant increase in performance by using higher-throughput storage through tiered storage in Alluxio. While the improvement was smaller than expected, this was not a fault of the Alluxio software.

Furthermore, implementing the tiered storage architecture was incredibly easy. Configuration takes no more than a few lines per node (assuming you have the hardware ready to go).

Switching the test between HDFS and Alluxio was just a matter of referencing hdfs://hostname:hdfs_port/path versus alluxio://hostname:alluxio_port/path.

Tiered storage is an incredibly powerful feature that can make Alluxio a very viable solution for high-capacity distributed storage. It allows a mixture of high-throughput and cheaper hardware to store data in a distributed manner, giving control over the trade-off between hardware speed and practicality.
