Tuesday, April 18, 2017

Serverless JoymonOnline.in - Dealing with GitHub API and its limits

Introduction

This is a continuation of my Serverless journey with my personal site www.Joymononline.in. As mentioned in previous posts, the main motive behind going Serverless is cost. The principle is, 'Spend only for the domain name. Nothing else'. Now the site is completely Serverless. But like anything else, Serverless is not a perfect solution which can be applied everywhere.

One of the drawbacks of Serverless is the lack of options to store developer keys at the client side. In Serverless, we don't maintain any server but utilize many third party services. Some services have limits on anonymous usage and some don't work at all without authentication. This can be overcome only by adding our own server side component (service) which acts as a proxy between our client and the third party service. Our server side component can hold the developer key or API secret and utilize the higher quota from the third party service.

GitHub API limits

The open source projects from GitHub are listed in the site with some details. For showing that, the GitHub API is consumed from the client side. But the GitHub API has limits. At the time of writing this post, if the request is not authenticated, it allows 60 requests per IP per hour. When authenticated, the limit is 5,000. Ideally JoymonOnline.in will never need that many requests for a single user. But below are the situations where the API limits matter.
  • The site is opened from corporate machines where the external IP is the same, and more than 60 people from the company open it within the same hour.
  • The user has opened another site which does heavy anonymous GitHub activity and after that opens JoymonOnline.in.
The first scenario may be rare, as my site at this point doesn't have that much reach. But the second is valid. So this issue is something to be addressed.
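Before addressing it, we can see from the client itself how much quota is left. Below is a minimal sketch, assuming the browser supports the fetch API; GitHub's /rate_limit endpoint reports the quota and does not itself count against it.

// Check how much anonymous quota is left for the current IP.
fetch('https://api.github.com/rate_limit')
  .then(function (response) { return response.json(); })
  .then(function (data) {
    // resources.core covers the normal REST calls such as /repos/...
    var core = data.resources.core;
    console.log('Used ' + (core.limit - core.remaining) + ' of ' + core.limit);
    console.log('Quota resets at ' + new Date(core.reset * 1000));
  });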

Overcome free GitHub API limits using proxy service

As mentioned above, the way to overcome this is to place our own service in the middle which holds the developer key, and let that service call the GitHub API. The best way to create such a service is to use a Function as a Service offering. There are many such as Azure Functions, AWS Lambda, Google Cloud Functions, Webtask etc... The selection criteria are based on cost i.e. it should be free (not even a 1$ setup fee) and on the ability to make outbound calls to GitHub. The below code shows how the service can be written for Webtask. Why Webtask was selected will be covered in another post.

const https = require('https');
var express    = require('express');
var Webtask    = require('webtask-tools');
var bodyParser = require('body-parser');
var app = express();

app.use(bodyParser.json());

app.get('/', function (req, res) {
  res.json({ message: 'Welcome to my api! Currently it supports api/joymon/<name of github repo>. Not open for all repos' });
});
app.get('/joymon/:projectName', function (req, res) {
  console.log(req.params);
  var callback = function (response) {
    var str = '';
    response.on('data', function (chunk) {
      str += chunk;
    });
    response.on('end', function () {
      res.json(JSON.parse(str)); // res.json() ends the response; no separate res.end() needed
    });
  };

  https.request(getOptions(req.params.projectName), callback)
    .on('error', function (err) {
      console.log(err);
      res.json({ message: 'Something went wrong when contacting GitHub API.' });
    })
    .end();
});
module.exports = Webtask.fromExpress(app);

function getOptions(projectName) {
  return {
    method: 'GET',
    host: 'api.github.com',
    path: '/repos/joymon/' + projectName,
    headers: { 'User-Agent': 'Custom Header Demo works' } // GitHub rejects requests without a User-Agent; the dev key goes here too
  };
}
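The header comment above is where the developer key comes in. A minimal sketch of an authenticated variant is below; GITHUB_TOKEN is an assumed secret name, and Webtask exposes secrets to Express apps via req.webtaskContext.secrets.

// Hypothetical variant of getOptions that authenticates the GitHub call.
function getAuthenticatedOptions(projectName, secrets) {
  return {
    method: 'GET',
    host: 'api.github.com',
    path: '/repos/joymon/' + projectName,
    headers: {
      'User-Agent': 'Custom Header Demo works',
      'Authorization': 'token ' + secrets.GITHUB_TOKEN // assumed secret; lifts the limit to the authenticated quota
    }
  };
}

The route handler would then call https.request(getAuthenticatedOptions(req.params.projectName, req.webtaskContext.secrets), callback), so the token never reaches the browser.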

If we have a Webtask account, we can simply copy-paste the above code and enjoy the free service.

Client side changes

The GitHub API calls return headers which tell about quota and usage. The client side has to check those and call the proxy service accordingly. It is a kind of retry logic which was never needed prior to Serverless.
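A minimal sketch of that fallback is below; GitHub reports the quota in the X-RateLimit-Remaining response header, and PROXY_URL is a hypothetical placeholder for the deployed Webtask endpoint.

var PROXY_URL = 'https://<our-webtask-endpoint>/joymon/'; // hypothetical; fill with the real Webtask URL

function getRepoDetails(repoName) {
  // Try GitHub directly first to save the proxy's own quota.
  return fetch('https://api.github.com/repos/joymon/' + repoName)
    .then(function (response) {
      if (response.ok) return response.json();
      // A 403 with X-RateLimit-Remaining: 0 means the hourly anonymous quota is over.
      if (response.headers.get('X-RateLimit-Remaining') === '0') {
        return fetch(PROXY_URL + repoName).then(function (r) { return r.json(); });
      }
      throw new Error('GitHub API returned ' + response.status);
    });
}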

http://jsfiddle.net/joygeorgek/arp6bmrj/5/

Overall, Serverless brings benefits, but at the same time we have to do more, such as retries and service orchestration. It also brings many more points of failure, and we have to design for them. If one third party service is down, we should continue serving the other features as normal.

Tuesday, April 11, 2017

JavaScript Access inner objects with string expression

JavaScript is becoming more and more important for developers. It can even be used as a DSL with the help of an embedded V8 runtime engine. Below is one scenario which may occur when using JavaScript as a DSL.

The DSL is another program fragment which our program has to execute.
Suppose there is a JavaScript object structure as follows.

var complexObject = {
  prop0: "prop0 value",
  prop1: {
    prop1prop1: {
      prop1prop2prop3: "prop1prop2prop3 value"
    },
    prop1prop2: "prop1prop2 Value"
  },
  prop2: [
    { prop2prop1: "prop2[prop2prop1] value" },
    { prop2prop2: "prop2[prop2prop2] value" }
  ]
};

Let's see how this can be accessed.

Normal JavaScript property access

Now let's see how we can access the values using the normal property dot (.) notation.


var testInnerObjectProperty = function() {
  alert(complexObject.prop0);
  alert(complexObject.prop1.prop1prop2);
  alert(complexObject.prop2[0].prop2prop1);
};


This works perfectly.

Access JavaScript property using string property path

But in a DSL, the path to the value will be available as a string. So how do we access the value using a string property path?



var testInnerObjectPropertyWithIndexer  = function(){
  alert(complexObject["prop0"]);
  alert(complexObject["prop1.prop1prop2"]);
  alert(complexObject["prop2[0].prop2prop1"]);
}


We can see the first alert returns the property value, as it is a direct string key access. But the next 2 will not work, as the JavaScript runtime is not able to find a property named "prop1.prop1prop2" on complexObject. In other words, it doesn't have the intelligence to understand the hierarchy; bracket notation treats the whole string as one literal key, as the small sketch below shows. Maybe it's by design.
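A small made-up demo of that literal-key behavior:

// Bracket notation looks up one single key; a dot inside the string is not special.
var demo = { 'prop1.prop1prop2': 'literal key value' };
alert(demo['prop1.prop1prop2']);          // works; the key itself contains a dot
alert(complexObject['prop1.prop1prop2']); // undefined; no single key with that name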
Let's see another way to access the value using a string path.

Access property using string path via Evil Eval()

This is the easiest method that pops up in an average JavaScript programmer's mind, though he knows it is not the good way.


var testInnerObjectPropertyWithIndexerEval = function() {
  alert(eval('complexObject.' + 'prop0'));
  alert(eval('complexObject.' + 'prop1.prop1prop2'));
  alert(eval('complexObject.' + 'prop2[0].prop2prop1'));
}


It works, but since it is not the good way, what is next?

Access property using string path via parsing

This is the hard way. We must parse the path and test each property ourselves. Let's see one implementation.


var testInnerObjectPropertyByParsingPath = function() {
  alert('Going to parse prop path');
  alert(Object.getValue(complexObject, 'prop0'));
  alert(Object.getValue(complexObject, 'prop1.prop1prop2'));
  alert(Object.getValue(complexObject, 'prop2[0].prop2prop1'));
}


Above is the expected API interface. Below goes the implementation of getValue().


Object.getValue = function(obj, path) {
    path = path.replace(/\[(\w+)\]/g, '.$1'); // convert indexer access [0] to property access .0
    path = path.replace(/^\./, '');           // trim leading dot
    var props = path.split('.');

    for (var propIndex = 0, noOfProps = props.length; propIndex < noOfProps; ++propIndex) {
        var propName = props[propIndex];
        if (propName in obj) {
            obj = obj[propName]; // walk one level deeper
        } else {
            return; // invalid path; returns undefined
        }
    }
    return obj;
}


This can be written using recursion too. It replaces the array access [] with property access by putting the corresponding number there, then walks into the object hierarchy till it finds the required value.
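For completeness, a recursive variant could look like the sketch below. This is my illustration, not part of the fiddle; Object(obj) wraps primitives so the 'in' check does not throw on them.

// Hypothetical recursive variant: take the first path segment, then recurse on the rest.
Object.getValueRecursive = function (obj, path) {
  path = path.replace(/\[(\w+)\]/g, '.$1').replace(/^\./, ''); // same normalization as above
  var dotIndex = path.indexOf('.');
  if (dotIndex === -1) {
    return (obj != null && path in Object(obj)) ? obj[path] : undefined;
  }
  var head = path.substring(0, dotIndex);
  if (obj == null || !(head in Object(obj))) return undefined; // invalid path
  return Object.getValueRecursive(obj[head], path.substring(dotIndex + 1));
};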

Please note this is not battle-tested code. It is rather to give the idea.
https://jsfiddle.net/joygeorgek/s5egwa5o/8/


Tuesday, April 4, 2017

ASP.Net IIS AppPool Recycling is a hack

From the beginning it was known to me that IIS AppPool recycling is a hack. If a technology is mature enough to handle web traffic, it doesn't need periodic recycling of its processes. To me there is no difference between support teams asking to restart machines to solve issues and app pool recycling. In both cases, the memory is cleaned up.

Whenever I was involved in IIS applications, I instructed the teams to disable app pool recycling from development time itself. But it never happened. According to most developers and their managers, it is a feature of IIS, so why should they change the defaults and bring unnecessary issues? Whenever I had direct delivery responsibility, I ensured that app pool recycling was not enabled. There was a mistake from my side as well: I never invested time to find a valid Microsoft link which says not to recycle the app pool process. But yesterday the time arrived. There was a client question on why app pool recycling is disabled on some applications when the majority have recycling. Those were framework level applications and under my direct control during development.
"If you have a problematic application and you cannot easily correct the code that causes the problems, you can limit the extent of these problems by periodically recycling the worker process that services the application." (Microsoft)
Finally I got the link where Microsoft talks about app pool recycling. But the scenario got complicated for us again. Only 3-4 web apps had disabled app pool recycling and the other 50+ web apps had not. Were those created with no quality and difficult to fix?

Stateful v/s Stateless IIS web apps

Here come 2 different types of web apps. Some apps just receive a request and send a response back. If we recycle these stateless apps, there is no harm to the behavior of the system, as every request-response pair is independent of other request-response pairs. So it is 'ok' to recycle. In case something is going wrong with memory and it goes unnoticed during dev and QA, it won't cause trouble in production either. Since extensive memory, load and long-running testing are not needed, we can develop at low cost.

Other apps will be maintaining some sort of state in memory, and that state might evolve over time. Their behavior will be controlled by the current state. If we recycle the process, the state will be lost and their behavior will be wrong after recycling. Yes, there is the answer.
It is 'ok' to recycle the stateless web apps but never recycle the stateful apps.
'Ok' here means recycling is acceptable only when there is no way to fix the issue in production, or when there was no chance to properly test for long hours and for LOH fragmentation during development. Again, it is not recommended to keep a big state in memory, as that may cause scaling issues.

It is better to write programs properly and avoid app pool recycling.

LOH and recycling

Without mentioning the Large Object Heap, any article about .Net memory management is incomplete. As everyone knows, the LOH was never compacted in earlier .Net versions. That causes the large object heap to become fragmented and finally leads to OutOfMemoryException. In those days, enabling AppPool recycling was not a choice but a necessity. I agree that there are programmatic mechanisms to avoid LOH fragmentation. But those are all costly during development and testing, and they require high skills.

Links

https://weblogs.asp.net/owscott/why-is-the-iis-default-app-pool-recycle-set-to-1740-minutes