Tuesday, April 18, 2017

Serverless JoymonOnline.in - Dealing with GitHub API and its limits

Introduction

This is a continuation of my Serverless journey with my personal site www.Joymononline.in. As mentioned in previous posts, the main motive behind going Serverless is cost. The principle is, 'Spend only for the domain name. Nothing else'. Now the site is completely Serverless. But like anything else, Serverless is not a perfect solution that can be applied everywhere.

One of the drawbacks of Serverless is the lack of options to store developer keys on the client side. In Serverless, we don't maintain any server but utilize many third party services. Some services have limits on anonymous usage and some don't work at all without authentication. This can be overcome only by adding our own server side component (service) which acts as a proxy between our client and the third party service. Our server side component can hold the developer key or API secret and utilize the higher quota from the third party service.

GitHub API limits

My open source projects on GitHub are listed on the site with some details. To show them, the GitHub API is consumed from the client side. But the GitHub API has limits. At the time of writing this post, unauthenticated requests are limited to 60 per hour per IP address. When authenticated, the limit is 5,000 per hour. Ideally JoymonOnline.in will never need that many requests for a single user. But below are the situations where the API limits matter:
  • The site is opened from corporate machines that share the same external IP, and more than 60 people from the company open it within the same hour.
  • The user has opened another site which makes heavy anonymous use of the GitHub API and opens JoymonOnline.in after that.
The first scenario may be rare as my site at this point doesn't have that much reach. But the second is valid. So this issue is something to be addressed.
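
To make the limits concrete, here is a minimal sketch of reading the quota from a GitHub API response. It assumes the headers arrive as a plain object with lower-case keys (the way Node exposes them), and the function name parseRateLimit and the sample values are only for illustration.

```javascript
// Reads GitHub's rate-limit response headers into a plain object.
// Assumption: `headers` is a plain object with lower-case keys,
// as Node's http.IncomingMessage.headers provides.
function parseRateLimit(headers) {
  var limit = parseInt(headers['x-ratelimit-limit'], 10);
  var remaining = parseInt(headers['x-ratelimit-remaining'], 10);
  // The reset header is a UTC timestamp in epoch seconds.
  var resetEpochSeconds = parseInt(headers['x-ratelimit-reset'], 10);
  return {
    limit: limit,
    remaining: remaining,
    resetAt: new Date(resetEpochSeconds * 1000),
    exhausted: remaining === 0
  };
}

// Illustrative header values, not captured from a real response.
var sample = parseRateLimit({
  'x-ratelimit-limit': '60',
  'x-ratelimit-remaining': '0',
  'x-ratelimit-reset': '1492500000'
});
console.log(sample.exhausted, sample.resetAt.toISOString());
```

When exhausted is true, an anonymous caller from that IP has to wait until resetAt or go through an authenticated route.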

Overcome free GitHub API limits using proxy service

As mentioned above, the way to overcome this is to place our own service in the middle, which has the developer key, and let that service call the GitHub API. The best way to create such a service is to use any Function as a Service offering. There are many, such as Azure Functions, AWS Lambda, Google Cloud Functions, Webtask etc. The selection criteria are cost, i.e. it should be free (not even a $1 setup fee), and the ability to make outbound calls to GitHub. The code below shows how the service can be written for Webtask. Why Webtask was selected will be covered in another post.

const https = require('https');
var express = require('express');
var Webtask = require('webtask-tools');
var bodyParser = require('body-parser');
var app = express();

app.use(bodyParser.json());

app.get('/', function (req, res) {
  res.json({ message: 'Welcome to my api! Currently it supports api/joymon/<name of github repo>. Not open for all repos.' });
});

// Proxies GET /joymon/<repo> to the GitHub repos API.
app.get('/joymon/:projectName', function (req, res) {
  console.log(req.params);
  var callback = function (response) {
    var str = '';
    response.on('data', function (chunk) {
      str += chunk;
    });
    response.on('end', function () {
      // Send the response exactly once; res.json() already ends it.
      try {
        res.status(response.statusCode).json(JSON.parse(str));
      } catch (err) {
        res.status(502).json({ message: 'GitHub API returned a response that is not valid JSON.' });
      }
    });
  };

  https.request(getOptions(req.params.projectName), callback)
    .on('error', function (err) {
      console.log(err);
      res.status(502).json({ message: 'Something went wrong when contacting GitHub API.' });
    })
    .end();
});
module.exports = Webtask.fromExpress(app);

// Builds the https.request options for a repository under the joymon account.
function getOptions(projectName) {
  return {
    method: 'GET',
    host: 'api.github.com',
    path: '/repos/joymon/' + encodeURIComponent(projectName),
    headers: { 'User-Agent': 'Custom Header Demo works' } // Add dev key via header
  };
}

If we have a Webtask account, we can simply copy and paste the above code and enjoy a free service.
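
The comment in getOptions leaves the developer key as an exercise. A minimal sketch of one way to add it follows. The function name getAuthenticatedOptions and the token parameter are hypothetical. Where the token comes from is also an assumption: with webtask-tools it might be read from the secrets on req.webtaskContext, but verify that against the Webtask documentation before relying on it.

```javascript
// Builds request options for the GitHub repos API and adds an
// Authorization header only when a token is supplied.
// Assumption: `token` is a GitHub personal access token handed in by the
// caller (for example from a Webtask secret); it is never hard-coded here.
function getAuthenticatedOptions(projectName, token) {
  var headers = { 'User-Agent': 'Custom Header Demo works' };
  if (token) {
    headers['Authorization'] = 'token ' + token;
  }
  return {
    method: 'GET',
    host: 'api.github.com',
    path: '/repos/joymon/' + encodeURIComponent(projectName),
    headers: headers
  };
}

// Placeholder token for illustration only.
console.log(getAuthenticatedOptions('some-repo', 'PLACEHOLDER_TOKEN').headers);
```

Keeping the token out of the source means the code can still be shared publicly without exposing the key.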

Client side changes

The GitHub API calls return headers which describe the quota and usage. The client side has to check them and call the proxy service accordingly. It is a kind of retry which was never needed prior to Serverless.

http://jsfiddle.net/joygeorgek/arp6bmrj/5/
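
For readers who cannot open the fiddle, the check described above can be sketched roughly as follows. This is not the fiddle's code. It assumes the browser fetch API, that the X-RateLimit-Remaining header is readable from the page, and a hypothetical proxy URL in PROXY_BASE that must be replaced with the real Webtask endpoint.

```javascript
// Hypothetical proxy URL; replace with the real Webtask endpoint.
var PROXY_BASE = 'https://example.invalid/api/joymon/';
var GITHUB_BASE = 'https://api.github.com/repos/joymon/';

// True when GitHub's response says the anonymous quota is used up.
function quotaExhausted(response) {
  var remaining = response.headers.get('X-RateLimit-Remaining');
  var limited = response.status === 403 || response.status === 429;
  return limited && remaining === '0';
}

// Tries GitHub directly and falls back to the proxy when the quota is gone.
// `fetchImpl` is an optional stand-in for fetch, useful for testing.
function getRepo(projectName, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  var name = encodeURIComponent(projectName);
  return doFetch(GITHUB_BASE + name).then(function (response) {
    if (quotaExhausted(response)) {
      return doFetch(PROXY_BASE + name);
    }
    return response;
  }).then(function (response) {
    if (!response.ok) {
      throw new Error('Request failed with status ' + response.status);
    }
    return response.json();
  });
}
```

Calling GitHub first keeps the proxy's own quota for the cases where the visitor's IP has already run out.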

Overall Serverless brings benefits but at the same time, we have to do more, such as retry and service orchestration. It also introduces many points of failure, and we have to design for them. If one third party service is down, we should continue providing the other features as normal.
