Tuesday, April 10, 2018

Azure @ Enterprise - Tuning the HDI Clusters programmatically

HDInsight Cluster

HDInsight, HDI for short, is Microsoft's wrapper around Hadoop and other open source data analytics technologies such as Spark. It is built on the Hortonworks platform. It can be installed on-premises and is also available in Azure as a platform service.

In Azure, the advantage is that scaling is easy, though it takes around 15 minutes. We can create a cluster for a specific workload and delete it once the work is done. This saves a lot of money, since a cluster is costly for as long as it is running.

HDInsight @ Enterprise

In an enterprise the workloads differ, and there can be several application teams that want to use HDI clusters for their own workloads. Either every application writes its own code to create an HDI cluster using the Azure management APIs, or there is a common service that serves all the applications. When we have a common service and the application workloads have different cluster demands, we need to adjust the cluster properties per workload.

Setting the cluster properties is really complex, since the properties are spread across different levels. There are properties at the cluster level (such as the number of worker nodes), at the Node Manager level, at the Livy job submission level, worker JVM properties, and so on. Getting all of these under control is a big challenge.

Sometimes we may want to reuse a cluster instead of deleting it, to save the time spent on cluster creation. At the time of writing this post, it takes around 15-20 minutes to get a new cluster created. If the common service can hand existing clusters to subsequent consumers, it saves a good amount of time.


Manually, we can easily adjust the properties from the Azure portal and the Ambari views of the specific cluster. Some links are given below.


After some properties are set, the cluster needs a restart. The portal shows whether a restart is required, based on which property was changed.


It is easy to adjust the properties at the cluster level using the Azure APIs. But when it comes to properties inside the cluster, such as the Node Manager heap size, we have to rely on the Ambari API. Below are some links to do the same.
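
As an illustration of a cluster-level change, here is a minimal sketch of a request that resizes the worker nodes through the Azure Resource Manager REST endpoint for HDInsight. The helper name build_resize_request, the api-version value and the way the bearer token is obtained are assumptions for this sketch; check them against the current Azure documentation. The function only builds the request and does not send it.

```python
import json
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"


def build_resize_request(subscription_id, resource_group, cluster_name,
                         target_count, bearer_token,
                         api_version="2018-06-01-preview"):
    """Build (but do not send) a request that resizes the worker-node role.

    api_version is an assumption; use whichever version your subscription supports.
    """
    url = (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.HDInsight/clusters/{cluster_name}"
        f"/roles/workernode/resize?api-version={api_version}"
    )
    # The request body carries only the desired number of worker nodes.
    body = json.dumps({"targetInstanceCount": target_count}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be passed to urllib.request.urlopen once a valid Azure AD token is available. The resize itself is asynchronous, so the caller still has to wait for the cluster to return to a running state.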


Using these APIs is among the toughest things in the API world. We have to get the current settings, apply our changes to them, and send the whole set back with a new tag (label). It is similar to how we work with source control: get latest, make the change, and commit the change set.
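
The get-modify-commit cycle can be sketched roughly as below. This is a minimal sketch, not a complete client: it assumes basic authentication with the cluster admin credentials, a base URL of the form https://CLUSTERNAME.azurehdinsight.net, and the helper names (ambari_request, build_config_update, update_config) are made up for this example. It also does not carry over the properties_attributes section that some configuration types have.

```python
import base64
import json
import time
import urllib.request


def ambari_request(base_url, user, password, path, method="GET", payload=None):
    """Call the Ambari REST API with basic auth; return parsed JSON or None."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        base_url + path, data=data, method=method,
        headers={"Authorization": f"Basic {token}", "X-Requested-By": "ambari"},
    )
    with urllib.request.urlopen(req) as resp:
        text = resp.read().decode()
        return json.loads(text) if text else None


def build_config_update(config_type, current_properties, changes, new_tag):
    """Merge the changes into the current properties and wrap them under a new tag."""
    merged = dict(current_properties)
    merged.update(changes)
    return {"Clusters": {"desired_config": {
        "type": config_type, "tag": new_tag, "properties": merged}}}


def update_config(base_url, cluster, user, password, config_type, changes):
    """Get latest, apply the changes, and commit them under a fresh tag."""
    root = f"/api/v1/clusters/{cluster}"
    # 1. Find the tag of the configuration that is currently in effect.
    desired = ambari_request(base_url, user, password,
                             f"{root}?fields=Clusters/desired_configs")
    tag = desired["Clusters"]["desired_configs"][config_type]["tag"]
    # 2. Fetch the full property set stored under that tag.
    current = ambari_request(
        base_url, user, password,
        f"{root}/configurations?type={config_type}&tag={tag}")
    props = current["items"][0]["properties"]
    # 3. Send everything back under a new, unique tag.
    new_tag = f"version{int(time.time() * 1000)}"
    payload = build_config_update(config_type, props, changes, new_tag)
    ambari_request(base_url, user, password, root, method="PUT", payload=payload)
    return new_tag
```

A call such as update_config(base_url, "mycluster", "admin", password, "yarn-site", {"yarn.nodemanager.resource.memory-mb": "8192"}) would then change one YARN property while leaving the rest as they were. The cluster name and value here are placeholders.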

If the jobs are submitted using Livy, there is an option to send some parameters at the job level. One example of such a parameter is executorCores.
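
As a rough sketch, the body of a Livy batch submission with job-level parameters could be assembled as below. The JSON field names (file, className, args, executorCores, executorMemory, numExecutors, conf) follow the Livy batch REST API; the helper name build_livy_batch and the sample values are assumptions for illustration only.

```python
import json


def build_livy_batch(file, class_name=None, args=None, executor_cores=None,
                     executor_memory=None, num_executors=None, conf=None):
    """Build the JSON body for a Livy batch (POST /batches), skipping unset options."""
    body = {"file": file}
    optional = {
        "className": class_name,
        "args": args,
        "executorCores": executor_cores,
        "executorMemory": executor_memory,
        "numExecutors": num_executors,
        "conf": conf,
    }
    # Only send what the caller set, so cluster defaults apply to the rest.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


# Hypothetical job: the jar path and class name are placeholders.
payload = build_livy_batch(
    "wasb:///example/jars/job.jar",
    class_name="com.example.Job",
    executor_cores=2,
    executor_memory="4g",
    num_executors=4,
)
request_body = json.dumps(payload)
```

The resulting JSON would be posted to the Livy endpoint of the cluster (on HDInsight, typically under /livy/batches) with the cluster credentials.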


Handle restarts

As mentioned earlier, some properties require a restart to take effect. The UI shows a warning to restart. What do we do when we use the API? The simplest answer is to restart the service after setting the properties, regardless of whether a restart is needed or not :)
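
With the Ambari API, a restart is expressed as two state changes on the service: first to INSTALLED (stopped), then to STARTED. Below is a minimal sketch of the two request bodies. The helper names and the context strings are assumptions for this sketch, and each body would be sent with PUT to /api/v1/clusters/CLUSTERNAME/services/SERVICENAME.

```python
def build_state_change(state, context):
    """Build the Ambari request body that moves a service to the given state."""
    return {
        "RequestInfo": {"context": context},
        "Body": {"ServiceInfo": {"state": state}},
    }


def restart_steps(cluster, service):
    """Return the (method, path, body) steps that stop and then start a service."""
    path = f"/api/v1/clusters/{cluster}/services/{service}"
    return [
        # INSTALLED means "installed but not running", which stops the service.
        ("PUT", path, build_state_change("INSTALLED", f"Stop {service} via REST")),
        ("PUT", path, build_state_change("STARTED", f"Start {service} via REST")),
    ]
```

Both calls are asynchronous on the Ambari side, so the caller should wait for the stop request to complete before sending the start request.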


Since the usage is fairly straightforward, this post does not walk through a full implementation. If anyone is facing issues with these APIs, please comment on this post.
