Problem
Dealing with legacy code is always a burden in software development. Whenever a new technology arrives or an existing one is upgraded, we have to spend time making sure that our existing code can use the new features. Most of the time the business / stakeholders don't see value in this, especially for an enterprise application where people can already do data entry and see reports without the new technology.
Azure WebJobs is a good technology for offloading long-running processes from the mainstream web apps. Microsoft provides the WebJobs SDK for writing functions that are called automatically on events. Events can be the arrival of a new message in a storage queue, a new blob, a Service Bus message, etc. This model uses a single process (.exe) to host the processing functions, and for each invocation the SDK runs the processing function on a separate thread. This is good because there is no need to create a new process or AppDomain for every request, which is time consuming. Since different invocations run on different threads, they can run in parallel.
All will work fine if we are starting a new project and writing functions to handle the back-end operations. But what about legacy systems that use a one process per request model? If those were coded with enough design and best practices, they can work even when running side by side in the same process. Still, we need to test all the back-end processors, as QA is not going to certify based on the programmers' assurance of good coding practice. That needs more QA time and hence more budget, and there is a chance that stakeholders will deny or delay the migration to Azure WebJobs.
At least in the above case, we as developers are sure that it will work and it is only a matter of QA certifying it. But what about the other case? The code was developed years back, when developers didn't expect their code to run side by side in the same process, and they used static variables extensively to ease their development. They also didn't care about detaching events or disposing unmanaged objects, on the assumption that the process would be closed after the task was done. We, the present development team, are screwed.
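To make the static variable problem concrete, here is a minimal sketch. The `LegacyOrderProcessor` class and its `currentOrderId` field are hypothetical names invented for illustration, not code from any real system. It shows how state that is harmless with one process per request becomes shared once two invocations run on different threads in the same process.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical legacy processor written for a one process per request model.
static class LegacyOrderProcessor
{
    // Harmless when each process handles exactly one request;
    // shared by every thread once invocations run side by side.
    static string currentOrderId;

    public static string Process(string orderId)
    {
        currentOrderId = orderId;
        Thread.Sleep(50);        // simulate some work
        return currentOrderId;   // may now hold the other invocation's id
    }
}

static class Program
{
    static void Main()
    {
        // Two invocations in the same process, as a thread-based host would run them.
        Task<string> a = Task.Run(() => LegacyOrderProcessor.Process("A"));
        Task<string> b = Task.Run(() => LegacyOrderProcessor.Process("B"));
        Task.WaitAll(a, b);

        Console.WriteLine("Invocation A saw: " + a.Result);
        Console.WriteLine("Invocation B saw: " + b.Result);
    }
}
```

Depending on timing, one of the invocations can report the other's order id. The same code run as two separate processes would never see this.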
Solution
As passionate developers who want to produce quality code, our answer will always be "Let's rewrite the code with thread safety".
Revisit
If there are only a couple of back-end processors, this might work out. But think about 100 different types of back-end services running without any issue in the existing system. Even if developers rewrite the code, it needs to go through the same testing cycle, and load testing has to be performed. Can we deliver this change quickly in this Agile world?
There are chances of memory leaks even in the new code. What if a leak is first observed only under very high load? A memory leak reported in production takes considerably more time to deal with than one found in development. Why should we take the risk if we can have one process handle one message in the WebJobs world? We are freed from many of these concerns and can deliver phase 1 faster. Maybe in phase 2 we can think about changing to the thread-based approach, if there are performance complaints from the field or our Azure budget is being exceeded.
So let's see if there are any ways to make one exe process one message from an Azure storage queue.
One exe to process one request message
Batch size to 1
The easy option that comes to mind is to set the batch size to 1. But this won't work if there are static variables used by different types of queue messages / handler functions, because the same process keeps running and its static state carries over from one message to the next.
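For reference, setting the batch size could look like the sketch below. It assumes the 1.x/2.x WebJobs SDK, where the host is configured through `JobHostConfiguration`, and it assumes the storage connection strings are already present in the app configuration. Later SDK versions configure this differently.

```csharp
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();

        // Fetch and process one queue message at a time per queue.
        // The host process itself stays alive between messages.
        config.Queues.BatchSize = 1;

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
```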
BatchSize to 1, JobHost.Stop() & Call Environment.Exit after completing function
The SDK deletes the message from the queue only after our WebJob function returns. So if we stop the JobHost from inside the function, the message will not be deleted from the queue. I am still trying to figure out from the source code what the relationship is between the JobHost running state and deleting the message.
Start another exe from WebJob Host exe for each message
Here we invoke the same or another exe from the WebJob function and pass it the required parameters. We listen to its console / standard output and relay it to the Azure WebJob log stream. Once the child processing exe completes, it just goes away along with whatever memory it held.
The parameters required by the child processor exe can be sent from the WebJob SDK exe over standard input, as described in the link below.
http://joymonscode.blogspot.com/2015/09/ipc-via-standard-in-and-out-in-cnet.html
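A rough sketch of this approach follows. The queue name `requests` and the executable name `LegacyProcessor.exe` are placeholders, and the sketch assumes the child exe reads its whole input from standard input before it starts writing output. It is one way to wire this up, not the only one.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Microsoft.Azure.WebJobs;

public class Functions
{
    // "requests" and "LegacyProcessor.exe" are placeholder names for illustration.
    public static void ProcessQueueMessage(
        [QueueTrigger("requests")] string message,
        TextWriter log)
    {
        var startInfo = new ProcessStartInfo("LegacyProcessor.exe")
        {
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (Process child = Process.Start(startInfo))
        {
            // Pass the parameters to the child over standard input,
            // then close the stream so the child sees end of input.
            child.StandardInput.Write(message);
            child.StandardInput.Close();

            // Relay the child's console output to the WebJob log stream.
            string line;
            while ((line = child.StandardOutput.ReadLine()) != null)
            {
                log.WriteLine(line);
            }

            child.WaitForExit();

            // Surface a failed child as a failed function invocation.
            if (child.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "Child process exited with code " + child.ExitCode);
            }
        }
    }
}
```

Because the function returns only after the child exits, the SDK deletes the message only once the child has finished. Throwing on a non-zero exit code leaves the message to the SDK's normal failure handling.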