Tuesday, August 14, 2018

Functional Programming - Finding valleys

If we come from the ASP.Net Web Forms world and have tasted ASP.Net MVC at least once, it is very difficult to go back. It is similar when we move from the imperative world to the functional programming world. This post is a continuation of the Functional Programming series; the advantages of FP have already been discussed in earlier posts.
The problems are mainly taken from HackerRank and solved using FP methods. The main intention is to understand how to solve problems functionally which are otherwise considered 'solvable only imperatively'.

The Problem

The input is a sequence of D and U characters: D means one step down and U means one step up. A person named Gary hikes starting from sea level and ending at sea level. We need to find out how many valleys he covered. If he is in a deep valley, climbs a small hill, and goes down again, it is still counted as one valley. A detailed description can be found at the original link.

Traditional solution

The traditional approach (FP has been around for long, but not in the mainstream), i.e. how someone from the imperative programming world sees it, is as a state machine problem. There are states Gary reaches after each step, and the state has to be mutated based on the rules.

In functional programming, mutation is not an appreciated word. So let's see how we can do this without mutating the state.

Functional way

Language used is JavaScript

function countingValleys(n, s) {

  var res = s.split('').reduce((context, value) => {
    //console.log(context)
    if (value === 'D') {
      // Stepping down from sea level means a new valley begins.
      if (context.s === 0) { return { v: context.v + 1, s: context.s - 1 } }
      else return { v: context.v, s: context.s - 1 }
    }
    if (value === 'U') {
      return { v: context.v, s: context.s + 1 }
    }
    // Any other character: carry the state forward unchanged.
    return context
  }, {
    v: 0,
    s: 0
  });
  return res.v
}

Here n means the number of steps and s is the input character sequence (n is not actually used by the logic). If we enter this into HackerRank's browser based editor, with their auto generated wiring code, it returns the number of valleys covered.

How it runs

It uses the fold concept of FP, which is implemented in JavaScript by the reduce() function. An initial state is given to the reduce function; as the function progresses through each element, it creates a new state (the context variable holds the state) from the existing state with the required modification. As per FP, mutation is evil, not the creation of one object from another. Also notice that + and - are not mutating the state; each step is a transformation that computes the values used when creating the new state object.

The commented console.log lines will help to understand the flow if they are allowed to run.
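Since this series targets readers from the .Net world, here is the same fold sketched in C# using LINQ's Aggregate(), which is the .Net counterpart of reduce(). This is only an illustration, not the HackerRank submission; it assumes the input contains only 'U' and 'D':

using System.Linq;

static class ValleyCounter
{
    // Same fold as the JavaScript version: carry (valleys, level) through each step.
    internal static int CountingValleys(string s)
    {
        var result = s.Aggregate(
            (v: 0, level: 0),
            (context, step) => step == 'D'
                ? (context.level == 0
                    // Stepping down from sea level starts a new valley.
                    ? (v: context.v + 1, level: context.level - 1)
                    : (v: context.v, level: context.level - 1))
                : (v: context.v, level: context.level + 1));
        return result.v;
    }
}

// Usage: CountingValleys("UDDDUDUU") returns 1, the HackerRank sample answer.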

Happy functional coding...

Tuesday, July 31, 2018

Azure @ Enterprise - Time bound SAS for WebJob to dequeue ServiceBus Queue messages

Context

Enterprise in the 'Azure @ Enterprise' series refers to companies or projects which have stringent security measures and process guidelines that a normal developer may not think of, or that are not expected in the self service cloud world. Often the security measures are taken to move responsibility to some other party or vendor so that people at the Enterprise are free from the low level details of those concerns.

In non enterprise projects, developing queued back end processing in Azure is super cool with Azure Queue and WebJobs. But in the enterprise that is not the case: we have to check whether the service supports virtual networks, whether it supports encryption at rest as well as in transit, and if it encrypts, whether the enterprise can provide the key, etc... Basically the enterprise doesn't want to fully trust the cloud vendor, though in reality the vendor owns everything and has full control.

Problem

An enterprise may evaluate that the Azure ServiceBus Queue is better than the Azure Storage Queue as per the current feature sets and the standards it uses to evaluate; it may also come out vice versa. If it demands key rotation with expiry for the ServiceBus connection string, that is a little difficult if we have used only configuration (web.config or app.config) based connection strings.

Even if it is not an enterprise project, it is good practice to rotate the keys with expiry as long as ServiceBus doesn't support hosting inside a vNet. If it supported hosting inside a vNet, the attack surface would be smaller. Currently any Tom, Dick or Harry can launch a brute force attack against the ServiceBus endpoints.

Key rotation will not remove the attack surface, but it reduces the possibility of a successful attack.

https://feedback.azure.com/forums/216926-service-bus/suggestions/15619302-add-service-bus-to-vnet

Solution

It is possible to have key rotation even with the attribute based WebJob functions, and we can have expiry on the keys as well. Let's see it step by step.

Generating time based SAS (Shared Access Signature)

Time bound SAS is supported in ServiceBus. In order to generate one we need a Shared Access Policy. Normally, when we create the SB instance, there will be a 'RootManageSharedAccessKey' policy with primary and secondary keys.
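For illustration, a time bound SAS can be generated from such a policy using SharedAccessSignatureTokenProvider from Microsoft.ServiceBus.dll. The sketch below uses placeholder values, and the 7 day lifetime is picked arbitrarily:

using System;
using Microsoft.ServiceBus;

static class SasGenerator
{
    // Generates a SAS token that expires after 7 days, scoped to one queue.
    // Policy name, key, namespace and queue name are placeholders.
    internal static string GenerateTimeBoundSas()
    {
        string keyName = "RootManageSharedAccessKey";
        string key = "<primary or secondary key of the policy>";
        string resourceUri = "https://<name of SB instance>.servicebus.windows.net/<queue name>";

        return SharedAccessSignatureTokenProvider.GetSharedAccessSignature(
            keyName, key, resourceUri, TimeSpan.FromDays(7));
    }
}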

Ideally this is supposed to be done outside of the main application, such as in Azure Automation code, and the newly generated time based SAS has to be stored in KeyVault. This way only the Automation account knows the high privileged Shared Access Policy key, and the WebJobs know only the KeyVault secret name where the time based Shared Access Signature is stored by the Automation Runbook. If the application has access to the KV, it can retrieve the SAS.

Since SAS rotation using an Azure Automation Runbook is not in the scope of this post, that code is omitted. If anyone requests it through the comments section, the code will be provided.

WebJob to read SAS and dequeue

Now let's come to the WebJob code side. Before we start the JobHost listening on the thread using RunAndBlock(), there is an option to override the ServiceBus connection string so that it uses the time based SAS. Below goes the code.

private static void RunAndBlock()
{
    var config = new JobHostConfiguration();
    config.UseServiceBus(ServiceBusConfigurationFactory.Get());

    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }
    var host = new JobHost(config);            
    host.RunAndBlock();
}

static class ServiceBusConfigurationFactory
{

    /// <summary>
    /// Returns the SB configuration
    /// </summary>
    /// <returns></returns>
    /// <remarks>Change this to read from Azure KV</remarks>
    internal static ServiceBusConfiguration Get()
    {
        return new ServiceBusConfiguration() {
            ConnectionString = BuildsSBConnectionString(GetITimeSensitiveSASTokenProvider().Get()),
        };
    }

    private static string BuildsSBConnectionString(string sharedAccessSignatureToken)
    {
        ServiceBusConnectionStringBuilder builder = new ServiceBusConnectionStringBuilder();
        builder.SharedAccessSignature = sharedAccessSignatureToken;
        builder.Endpoints.Add(new Uri("https://<name of SB instance>.servicebus.windows.net/"));
        string finalCon = builder.ToString();
        return finalCon;
    }

    private static ITimeSensitiveSASTokenProvider GetITimeSensitiveSASTokenProvider()
    {
return new KVBasedTimeSensitiveSASTokenProvider();
    }
}

The code is mostly self explanatory. When the main() of the WebJob starts for the first time, it gets the ServiceBusConfiguration from the factory class. The factory uses a provider class which knows how to talk to Azure KV or some other store where the SAS is present, as sketched below. Once the SAS is available, the connection string can be built from it.
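The provider itself is not shown in the post; a minimal sketch of one, assuming the Microsoft.Azure.KeyVault and Microsoft.Azure.Services.AppAuthentication packages and placeholder vault/secret names, could look like this:

using Microsoft.Azure.KeyVault;
using Microsoft.Azure.Services.AppAuthentication;

internal interface ITimeSensitiveSASTokenProvider
{
    string Get();
}

internal class KVBasedTimeSensitiveSASTokenProvider : ITimeSensitiveSASTokenProvider
{
    // Vault URL and secret name are placeholders; the Automation Runbook
    // is expected to keep the secret refreshed with a fresh time bound SAS.
    public string Get()
    {
        var tokenProvider = new AzureServiceTokenProvider();
        var client = new KeyVaultClient(
            new KeyVaultClient.AuthenticationCallback(tokenProvider.KeyVaultTokenCallback));
        return client.GetSecretAsync(
                "https://<vault name>.vault.azure.net/", "<SAS secret name>")
            .GetAwaiter().GetResult().Value;
    }
}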

One thing to remember is that the SAS is not a connection string. The connection string has to be built from the SAS by adding the endpoint, which is exactly what BuildsSBConnectionString() above does.
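For completeness, a typical attribute based WebJob function that dequeues from the ServiceBus queue would look like the sketch below; the queue name is a placeholder and this is not tied to any particular application:

using System.IO;
using Microsoft.Azure.WebJobs;

public static class Functions
{
    // The JobHost invokes this for every message in the queue,
    // listening via the SAS based connection string built above.
    public static void ProcessQueueMessage(
        [ServiceBusTrigger("<queue name>")] string message,
        TextWriter log)
    {
        log.WriteLine("Dequeued: " + message);
    }
}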

Things to remember when working with ServiceBus

API Collision

There are 2 dependencies when we work with Azure ServiceBus: Microsoft.Azure.ServiceBus.dll & Microsoft.ServiceBus.dll, and they both have classes with the same names. Sometimes it is very difficult to get the code to compile if it was downloaded as a snippet.

What if the token expires after RunAndBlock() is invoked?

At least in testing there were no issues with the dequeued messages. Every new instance of the WebJob process picks up the new connection string.

Tuesday, July 17, 2018

Starting Scala Spark - Streaming via TCP network socket in Windows

This is a simple tutorial on a basic Scala Spark word count via streaming. There are already so many samples achieving the same. Then why this post?

One reason is that all the posts talk about how it works on Linux machines; very rarely do we find posts where the authors are using Windows machines. Another reason is to simplify the code so it is understandable by a Scala beginner, for example using proper names instead of reduceByKey(_+_), which is very difficult to understand at first. Let's get started.

Spark

The code is pretty much straightforward, as given below.

package org.apache.spark.examples.streaming

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Counts words in input stream.
object SparkStreaming_HdfsWordCount {
  def main(args: Array[String]) {
    //Initialization
    Logger.getLogger("org").setLevel(Level.ERROR) //Remove the noise which is useful sometimes
    var master = "local[2]" //var, not val, so the -master argument below can override it. Helps to run easily on a local machine and in cluster mode.
    args.sliding(2, 2).toList.collect {
      case Array("-master", arg: String) => master = arg
    }
    val sparkConf = new SparkConf().setAppName("HdfsWordCount").setMaster(master)
    // Setup streaming
    val ssc = new StreamingContext(sparkConf, Seconds(10))
    val lines:ReceiverInputDStream[String] =ssc.socketTextStream("localhost",8081)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(word => {
      //println(word) //Debug purpose. Can see each individual word
      (word, 1) // 1 means one instance of word.
    }).reduceByKey((x,y) => x + y) //Summing

    wordCounts.print() // Display the output

    ssc.start()
    ssc.awaitTermination()//Wait
  }
}

Some changes were made to the standard sample versions during troubleshooting. If we create a sample Spark project in the IntelliJ IDE, we get a streaming program which has a file location as the stream source. It can be a little tedious to get the file system working on Windows machines, with things such as prefixing file:// or file:/// and the confusion over backslash versus forward slash, etc... So it is better to use the network as the source.

If the environment is set up right, running the above program will try to connect to localhost:8081 and process the network stream.

PowerShell

The above Scala code makes a connection to port 8081 on localhost. But where is the server, i.e. the producer of events into the stream? Most posts talk about a utility called nc which can be used to generate content into the stream.

But that is a Linux specific utility named netcat (nc). Though there are some equivalents available for Windows, it may be a little difficult to get things working unless we download binaries from known or unknown sites. For simplicity let's have the producer in PowerShell.

$port=8081
$endpoint = new-object System.Net.IPEndPoint ([system.net.ipaddress]::any, $port) 
$listener = new-object System.Net.Sockets.TcpListener $endpoint
$listener.server.ReceiveTimeout = 5000

$listener.start() 

try {
    Write-Host "Listening on port $port, press CTRL+C to cancel"

    While ($true){

        if (!$listener.Pending())
        {
            Start-Sleep -Seconds 1;
            continue;
        }
        $client = $listener.AcceptTcpClient()
        #$client.client.RemoteEndPoint | Add-Member -NotePropertyName DateTime -NotePropertyValue (get-date) -PassThru

        $id = New-Guid
        $data = [text.Encoding]::ASCII.GetBytes("joymon $id")
        $client.GetStream().Write($data,0,$data.length)
        "Sent message - joymon $id - $(get-date)"

        $client.close()
   }
}
catch {
        Write-Error $_       
}
finally{
            $listener.stop()
            Write-host "Listener Closed Safely"
}

This is again straightforward. Inside a loop it waits for an incoming connection. Once the connection is established, it sends a message suffixed with a GUID. The GUID is for making sure all the messages are reaching Spark streaming. The PowerShell code is mainly composed from these link1 & link2. Thanks to the authors. Just copy paste the code into PowerShell ISE and run.

Real world scenarios could be far more complex than this word count, but this gives us an idea of how Spark Streaming works.

Tuesday, July 10, 2018

Azure @ Enterprise - How does the AppInsight SDK hook into the WebClient and HttpClient classes?

This is a continuation of the previous post related to the .Net SDK for Azure AppInsight and end to end cross process correlation. One of the open items in that post is how the AppInsight SDK tracks the http dependency calls. Without us doing anything, how can AppInsight subscribe or hook into the http calls going out of the process?

Capture the outgoing http traffic from a .Net process in C#

Capturing or intercepting the http traffic of a .Net process by a component inside that same process is independent of the Azure AppInsight SDK. The SDK is just one use case of capturing outgoing http traffic; there could be many others such as security, modification, etc...

Anyway, let's start from the AppInsight source code and dig down to the place where the hook happens. The source code of the AppInsight SDK is available at the GitHub repository below.

https://github.com/Microsoft/ApplicationInsights-dotnet-server

Navigating Source code of AppInsight SDK

In order to find out how the AppInsight SDK for .Net Server (i.e. legacy .Net, as .Net Core is the latest) hooks into the http traffic, the entry point can be the ApplicationInsights.config file. If we search for the word 'Dependency' we find 'DependencyTrackingTelemetryModule'. This resembles the HttpModule, which is powerful enough to do anything in the ASP.Net web world. The first thing to do is comment out that section and see whether the dependencies still get logged, to ensure that is the right thing we are looking at.

Anyway, let's move ahead. Once we locate the source code of the DependencyTrackingTelemetryModule class, we can see the below code snippet inside:

#if NET45
// Net40 does not support framework event source

private HttpDesktopDiagnosticSourceListener httpDesktopDiagnosticSourceListener;
private FrameworkHttpEventListener httpEventListener;
private FrameworkSqlEventListener sqlEventListener;

#endif

https://github.com/Microsoft/ApplicationInsights-dotnet-server/blob/develop/Src/DependencyCollector/Shared/DependencyTrackingTelemetryModule.cs#L26

This tells us that the AppInsight SDK behaves differently on different .Net frameworks. The next thing it tells us is that there are 2 http listening mechanisms. Let's start with the HttpDesktopDiagnosticSourceListener class and come back to the other class if required. Once we locate that class, we can see its constructor as below:

        internal HttpDesktopDiagnosticSourceListener(DesktopDiagnosticSourceHttpProcessing httpProcessing, ApplicationInsightsUrlFilter applicationInsightsUrlFilter)
        {
            this.httpDesktopProcessing = httpProcessing;
            this.subscribeHelper = new HttpDesktopDiagnosticSourceSubscriber(this, applicationInsightsUrlFilter);
            this.requestFetcherRequestEvent = new PropertyFetcher("Request");
            this.requestFetcherResponseEvent = new PropertyFetcher("Request");
            this.responseFetcher = new PropertyFetcher("Response");

            this.requestFetcherResponseExEvent = new PropertyFetcher("Request");
            this.responseExStatusFetcher = new PropertyFetcher("StatusCode");
            this.responseExHeadersFetcher = new PropertyFetcher("Headers");
        }


https://github.com/Microsoft/ApplicationInsights-dotnet-server/blob/develop/Src/DependencyCollector/Shared/Implementation/HttpDesktopDiagnosticSourceListener.cs#L29

Now focus on the subscribeHelper, which is initialized with an HttpDesktopDiagnosticSourceSubscriber instance. What is in there?

internal HttpDesktopDiagnosticSourceSubscriber(
            HttpDesktopDiagnosticSourceListener parent,
            ApplicationInsightsUrlFilter applicationInsightsUrlFilter)
        {
            this.parent = parent;
            this.applicationInsightsUrlFilter = applicationInsightsUrlFilter;
            try
            {
                this.allListenersSubscription = DiagnosticListener.AllListeners.Subscribe(this);
            }

It is as simple as subscribing to DiagnosticListener.AllListeners.

Redirecting the analysis to DiagnosticListener

Now we have a big clue. Let's understand what DiagnosticListener is and what it can listen to. Can it listen to outgoing http traffic?

The answer is yes, on the .Net versions that support it; meaning in older .Net versions we cannot use this mechanism to capture http traffic. Below goes sample code for hooking into http traffic using the DiagnosticListener class.

https://github.com/joymon/dotnet-demos/tree/master/diagnostics/HttpCapture
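For reference, a minimal sketch of the same technique is below: subscribe to AllListeners, then subscribe to the http specific listener. To the best of my knowledge the listener names are 'System.Net.Http.Desktop' on full .Net (4.7.1+) and 'HttpHandlerDiagnosticListener' on .Net Core:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class HttpListenerObserver : IObserver<DiagnosticListener>
{
    public void OnNext(DiagnosticListener listener)
    {
        // Listener names vary by runtime; both are subscribed here.
        if (listener.Name == "HttpHandlerDiagnosticListener" ||
            listener.Name == "System.Net.Http.Desktop")
        {
            listener.Subscribe(new HttpEventObserver());
        }
    }
    public void OnCompleted() { }
    public void OnError(Exception error) { }
}

class HttpEventObserver : IObserver<KeyValuePair<string, object>>
{
    public void OnNext(KeyValuePair<string, object> evt)
    {
        // evt.Value carries the Request/Response objects that the
        // SDK reads via PropertyFetcher, as seen in the constructor above.
        Console.WriteLine(evt.Key);
    }
    public void OnCompleted() { }
    public void OnError(Exception error) { }
}

// Wire up once at startup:
// DiagnosticListener.AllListeners.Subscribe(new HttpListenerObserver());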

More details

If anybody is interested in following up on the topic, below are some links:
https://www.azurefromthetrenches.com/capturing-and-tracing-all-http-requests-in-c-and-net/

The AppInsight SDK changed its capture model:
https://github.com/Microsoft/ApplicationInsights-dotnet-server/issues/548

Disclaimer

The above analysis was done by looking at the code and recreating a sample using the code logic. It might not work with other source repos, where the names of the classes or functions may not resemble what they do internally. It is easy to debug the source to find out what is happening.

Tuesday, July 3, 2018

Azure @ Enterprise - .Net AppInsight version and dependency correlation

AppInsight Correlation

When we develop for Azure, normally there will be a lot of services interacting together to complete one meaningful business function. Business functions such as loading a web page or a queued operation may have a lot of dependencies which are expected to work together for success. Often when we use traditional logging frameworks, they tend to log only up to the boundary of a process; we may not be able to correlate with what is happening in the next process when troubleshooting. We could do that by adding some custom code, but AppInsight brings it out of the box. Out of the box means the internal data structure supports it and most of the SDKs support it too. More details on end to end correlation can be found in my previous post.

Dependency correlation

When we enable AppInsight, we want all the events happening inside correlated together. One major item is dependency telemetry. If we use an AppInsight SDK below 2.4, outgoing http requests made via HttpClient do not get correlated, while at the same time requests made via WebClient do get correlated as dependencies.

Problem

The problem is that when the HttpClient calls are not logged as dependencies, our troubleshooting may go wrong or take more time. One possible reason why the AppInsight SDK is not able to hook into the HttpClient class is that AppInsight versions below 2.4 don't know about the HttpClient class; HttpClient is relatively new in .Net's list of APIs used to make http calls.

It would be interesting to identify how the AppInsight SDK subscribes to the WebClient and HttpClient classes so that it gets notified when there are outbound requests. Something for a new post.

Solution

The better solution is to upgrade the AppInsight SDK to the latest version and use HttpClient, which is recommended for modern async programming. If the AppInsight SDK change is difficult, change the HttpClient usage to WebClient.
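For instance, a call like the HttpClient usage below would be swapped for the WebClient form (a minimal sketch; the URL is a placeholder):

using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class DependencyCallSample
{
    // Not correlated as a dependency by AppInsight SDKs below 2.4:
    static async Task<string> ViaHttpClient()
    {
        using (var client = new HttpClient())
        {
            return await client.GetStringAsync("https://example.com/api");
        }
    }

    // Correlated as a dependency even on the older SDKs:
    static string ViaWebClient()
    {
        using (var client = new WebClient())
        {
            return client.DownloadString("https://example.com/api");
        }
    }
}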

Happy Coding...

Tuesday, June 26, 2018

Azure @ Enterprise - Finding the usage of vNet and Subnet IPs

There is a common myth that networking is easy in a cloud environment. At least considering Azure, it is not true. If we are in an Enterprise and want to implement security at the networking level, we have to deal with vNets, subnets and their associated rule mechanisms such as NSGs and much more. If it is a small deployment, there will be less confusion about the vNet, the subnets inside it, how many IPs are used and free, etc... Even re-balancing subnets is easy.

But that may not be the situation in an Enterprise where many systems or departments share one subscription or the same networking infrastructure. Things may often go out of control and end up in a situation where there are no more IPs or subnets for new applications.

The first challenge is to identify the usage of the current vNets and the subnets inside them. We can get the details from the Azure portal, but it is difficult if we want to consolidate them into one view to take action.

Below is a simple script to list the subnets inside a particular vNet, how many IPs are possible and how many are used:

Get-AzureRmVirtualNetwork -Name <Name of vNet> -ResourceGroupName <Name of vNet's RG > `
| Get-AzureRmVirtualNetworkSubnetConfig `
| Sort -Property Name `
| Select -Property Name, `
                   AddressPrefix, `
                   @{Name='Available IPs';Expression={[Math]::Pow(2,32-$_.AddressPrefix.split('/')[1])}}, `
                   @{Name='Used IPs';Expression = {$_.IpConfigurations.Count}}

Please note that the 'Available IPs' column shows the raw size of the CIDR block; it does not exclude the IPs Azure reserves in each subnet. For example, a /24 prefix gives 2^(32-24) = 256 addresses, of which Azure reserves 5, leaving 251 usable. If we are familiar with networking, we can easily work out from the AddressPrefix how many IPs are really available, what the start and end IPs are, etc... There are a lot of hosted tools available to interpret CIDR notation.

Once we know there are issues, such as fragmented vNets, we have to think about solutions. We can easily suggest changing the policy to allot a vNet per department, system or application. There are trade offs in both approaches: if we allocate a big vNet to a department and they don't have enough applications, all those IPs go unused, and every department then needs experts to manage its networks. The decision has to be made case by case.

Happy Networking...

Tuesday, June 19, 2018

Azure @ Enterprise - HDInsight name availability check in portal

Whether it is good or not, Azure HDInsight needs a name that is unique across Azure to create a new instance. This leads to the requirement for a name availability check feature. There are multiple mechanisms, like calling an API or checking the URL https://<proposed hdi name>.azurehdinsight.net for existence.

Let's see how the Azure portal handles this check.

Azure Portal using CheckNameAvailability API for HDInsight cluster

The best friend here is the F12 browser tools, which show how a web app like the Azure portal works. It is clear that the portal is trying to check the name using the API partially documented below.

https://docs.microsoft.com/en-us/rest/api/cdn/checknameavailabilitywithsubscription/checknameavailabilitywithsubscription

But the above article only talks about checking the name using the Microsoft.Cdn provider; check name availability for HDInsight seems undocumented. Below is the URL format the portal uses:

POST https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.HDInsight/checkNameAvailability

Why does the API call go to a West US endpoint while the resource is to be created in East US?

This is the magic of the portal. It seems the portal is hosted in West US and thus it uses the API endpoints there. When the screenshot was taken, the portal was opened from New Jersey, USA without any VPN, and the ISP seems to be in East US itself, so there is no chance Azure redirected the client to West US by recognizing the location.

Why is there an Http 504 in the screenshot?

This story seems related to an outage that happened last Wednesday (13Jun2018) in the South Central US Azure region. During that time, if we tried to create an HDInsight cluster, it would show that the name is not available.

Under the hood it tries to reach the West US endpoint to check for name availability, and it errors out with a gateway timeout, possibly because internally it is not able to contact the South Central region. When the timeout happens, the portal assumes the name is not available and displays the message. What logic!

Ideally it should have understood the http 504 and acted accordingly. As per Azure's, or any cloud's, design philosophy, failures are expected. Why not accept that the failure occurred in the portal itself?

As mentioned, the issue seems related to the outage in South Central US, but there is no proof of causation; only the time ranges coincide. A screenshot of the outage details is below.

It was fun to debug this sitting next to a Microsoft ADM. The issue has been communicated to Microsoft via him; hopefully they will fix it soon.

Happy debugging...