Tuesday, November 18, 2014

State pattern v/s Orchestrator

Requirement

In programming we often encounter scenarios where many operations must be carried out in sequence, and the sequence can change whenever the client's business requirements change. For now, assume that no parallelism is required. How do we code such a scenario efficiently?

To tackle this, we can either write everything in a single function, or split the operation into step functions and have a controller function call them in the required order. People who are inspired by design patterns go for pattern-oriented programming, and most of them end up with the state pattern.

Before we go further: if anybody is still wondering what the problem is with writing the code in sequence or in a single function, I would suggest getting a better understanding of the programming concepts below.

  • Object Oriented Programming
  • Single Responsibility Principle
  • Re-usability
  • Separation of concerns

State pattern

The state pattern models the system as discrete states of an object, where each state knows what its next possible states are. It also has a mechanism for changing the state upon an event. Events can be anything: keyboard input, a UI action such as a button click, a service invocation, a timer elapsed event, etc.

There are well-defined ways to implement the state pattern, and many libraries are available to help. Some libraries support encapsulating different states in different classes which we can then connect. But because the basic nature of the pattern is that one state knows about another, it is very difficult to change the sequence without changing the state classes.
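To make that coupling concrete, here is a minimal sketch of a hand-written state pattern. The order-processing scenario and every name in it (IOrderState, ValidateState, BillState, ShipState, OrderStateMachine) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: all type and member names are made up for illustration.
public interface IOrderState
{
    // Does the work of this state and returns the state to run next (null when finished).
    IOrderState Handle(List<string> log);
}

public class ValidateState : IOrderState
{
    public IOrderState Handle(List<string> log)
    {
        log.Add("Validate");
        // The coupling: this class hard-codes that billing follows validation.
        return new BillState();
    }
}

public class BillState : IOrderState
{
    public IOrderState Handle(List<string> log)
    {
        log.Add("Bill");
        return new ShipState();
    }
}

public class ShipState : IOrderState
{
    public IOrderState Handle(List<string> log)
    {
        log.Add("Ship");
        return null; // terminal state
    }
}

public static class OrderStateMachine
{
    // Runs states until one of them returns null and reports the order they ran in.
    public static List<string> Run(IOrderState start)
    {
        var log = new List<string>();
        IOrderState state = start;
        while (state != null)
        {
            state = state.Handle(log);
        }
        return log;
    }
}
```

Calling OrderStateMachine.Run(new ValidateState()) runs Validate, Bill and Ship in that order. To ship before billing, both ValidateState and ShipState would have to be edited, which is exactly the difficulty described above.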

Below are some links which contain implementations of the state pattern and related libraries.

State pattern v/s state machine

Now most of us will have a question in mind: what is the difference between the state pattern and a state machine? As per my understanding, the state pattern is just another state machine implementation, one that is clean and dead simple. We can also build a state machine using iterators / yield, Windows Workflow Foundation, or the simpler Stateless library.

Orchestrator / Sequencer pattern

We have seen what the state pattern is. But is it suitable for the requirement mentioned at the beginning of this post? According to the requirement, the flow can change whenever the business requirements change. In the state pattern each state has to know about other states, so if we use it we cannot build independent, interchangeable steps. In most cases the steps will depend on each other, but our requirement is specifically about independent steps. So what is the alternative that honors all the programming principles?

The solution leads towards an orchestrator or sequencer pattern, where we define the sequence and then start it by passing in the context which the steps are going to manipulate. The orchestrator takes the steps one by one and executes them. I was not able to relate this requirement to any of the GoF design patterns, which is why I had to call it an orchestrator. The Template Method pattern in GoF is somewhat similar, but there the implementation class contains all the operation methods, which looks like a violation of the SRP in SOLID.

What the orchestrator pattern requires

  • It needs to provide an abstraction (preferably an interface) for creating concrete step implementation classes.
  • Support for passing a data context into the steps.
  • Support for defining / sequencing our steps.
  • A StartExecution method
  • Preferably 2 modes of execution
    • Pipeline mode - The value returned by one step is fed as the input of the next step.
    • Normal - All the execution methods in the steps get the context that was passed to the StartExecution method.

A sample implementation is in progress. Hopefully I can share it in the next post.
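In the meantime, the requirements listed above can be read as the following minimal sketch. It is not the author's implementation: IStep, ExecutionMode, Orchestrator, Then, AddTax and AddShipping are all hypothetical names, and only StartExecution comes from the list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: only the StartExecution name comes from the requirement list.
public interface IStep<TContext>
{
    // Receives a context and returns the value the next step may consume.
    TContext Execute(TContext context);
}

public enum ExecutionMode
{
    Normal,   // every step receives the context passed to StartExecution
    Pipeline  // every step receives the value returned by the previous step
}

public class Orchestrator<TContext>
{
    private readonly List<IStep<TContext>> _steps = new List<IStep<TContext>>();

    // The sequence is defined here, outside the steps, so it can change without touching them.
    public Orchestrator<TContext> Then(IStep<TContext> step)
    {
        _steps.Add(step);
        return this;
    }

    public TContext StartExecution(TContext context, ExecutionMode mode)
    {
        TContext current = context;
        foreach (IStep<TContext> step in _steps)
        {
            if (mode == ExecutionMode.Pipeline)
            {
                current = step.Execute(current);
            }
            else
            {
                // Normal mode: return values are ignored; steps work on the shared context.
                step.Execute(context);
            }
        }
        return current;
    }
}

// Two independent steps; neither knows that the other exists.
public class AddTax : IStep<decimal>
{
    public decimal Execute(decimal amount) { return amount * 1.10m; }
}

public class AddShipping : IStep<decimal>
{
    public decimal Execute(decimal amount) { return amount + 5m; }
}
```

With this shape, swapping the order of AddTax and AddShipping is a change at the place where the orchestrator is assembled, not inside either step class. Normal mode is mainly useful when the context is a mutable object that the steps update in place.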

Difference between state pattern and orchestrator (Sequencer)

To summarize the differences between the state pattern and the sequencer: the state pattern requires one state to know about another, so if the logic changes we need to modify the concrete state classes. In the orchestrator the states / steps are independent.

Tuesday, November 11, 2014

Simple interface based programming

For a long time I was searching for a good article which explains how to use interfaces in .Net programming. It is not meant for a developer who is passionate about programming, has good knowledge of object oriented programming and holds a computer science degree, but for people who came into the field because others compelled them and who do not have a computer science background. At some point I thought of writing one myself, but then I was able to find a good one. The link is below. I will update this post if I come across another one.

https://www.simple-talk.com/dotnet/.net-framework/designing-c-software-with-interfaces/

Tuesday, November 4, 2014

.Net Access modifiers - Revisited

This is intended mainly for beginners, to reinforce their understanding of the access modifiers used in .Net. We are going to discuss it in a question and answer fashion. Read the MSDN links below, which explain the topic, and then try to answer the questions.

http://msdn.microsoft.com/en-us/library/6tcf2h8w.aspx
http://msdn.microsoft.com/en-us/library/ms173121.aspx
  1. Is there any way, even a hack, to access private members from another class?
  2. If we can access private members via some mechanism, what is the point of all these access modifiers?
  3. We use access modifiers to restrict access to fields and functions. Do we need a user name and password to access the variables?
  4. Can't we program if the language has no concept of access modifiers? What are the advantages of restricting access?
  5. Can we have the same namespace in 2 different assemblies? If so, can I access internal classes of the other assembly which are in the same namespace?
  6. Can we have a public method inside an internal class? What is special about it?
  7. Can a public method return an internal type?
  8. Can we have a protected class in any scenario? Where can we use it?
  9. There is a public class named A. Can I have a protected class with the same name A inside class A?
  10. If we make the constructor private, will that class be unusable forever?
  11. There is a public method which returns a public interface. In the method body can we have 'return new ImplClass();' where ImplClass is internal / protected / private?
  12. Can a public class inherit an internal class?
  13. Can a class inherit a class which is declared inside itself?
  14. Can a public class expose a property whose type is an internal class?
  15. Are the access modifiers mutually exclusive in .Net? Can I have more than one access modifier on a single member?
  16. Does the 'sealed' access modifier seal the class forever? Will that class create dead code in the assembly?
  17. Can we have protected static methods in a non-static class?
  18. Can we have a protected method inside a static class?
  19. Can we have a public static constructor in a non-static class?
  20. Can I declare a static variable inside a method?
  21. Can a class have a member variable of its own type? Will that cause an infinite recursive loop?
  22. When I tried the above scenario, I got a StackOverflowException. Why is that so?
Initially I thought I should write 2 posts, so that while trying to answer these questions you would not be tempted to look at the answers. Later I changed my mind. We are all professionals, and a true professional never lies to his profession. We are not doing this to get marks or pass an exam. It is just for gaining knowledge and understanding our level of OOP knowledge. So the answers are in this same post.
  1. Yes, there is. Using reflection we can access the private properties or call the private methods of another class / object.
  2. .Net is a managed environment, where the runtime knows all the types, which is what makes reflection possible. But object orientation started before managed environments, and in that world the modifiers make sense. Even in .Net there are some framework classes whose members cannot be invoked via reflection.
  3. This may sound childish, but when I asked it in an interview the candidate really started explaining ASP.Net membership. He was a developer born into ASP.Net, and according to him .Net is for ASP.Net. The access modifiers control the callers / consumers of a class, in object creation and in the usage of its members through objects. When we say via objects, it might also be in a method in the same inheritance hierarchy using the 'this' or 'base' keywords.
  4. We can program even if there are no access modifiers in the language. In that case everything will be public. Below are the advantages of access modifiers.
    1. Refactoring - Suppose we need to refactor a method by adding one more argument. If the method is private, we need to care about only one class to get a successful compilation. If you are working in a huge legacy code base (I am now working in a code base which is 10+ years old and contains more than 300 projects) and the method is public, getting it compiled in the first shot is a nightmare.
    2. Library development - Suppose there are no access modifiers in the language and we are developing a library with a method WeatherAtLocation GetWeatherAsync(location), which gets weather information from a web service asynchronously and returns a structure holding the location parameter and the temperature. The method temporarily stores the location in a class level variable, which is necessarily public. Before our async web request to the actual web service completes, the caller can change the value of that member variable, so the result we return carries a different location.
  5. We can have the same namespace in 2 different assemblies, but no, that does not give access to the internal classes of the other assembly. Namespaces have nothing to do with the internal access modifier.
  6. Yes, we can. But effectively the access of that method will be internal, as the containing type is internal. There is one blog post by Eric on this.
  7. No, assuming the method's own class is public. A public method is supposed to be callable by classes in other assemblies. If it returned an internal type, which is hidden from classes in other assemblies, we would be violating the internal restriction.
  8. Yes. If we declare a class inside another class, it can be protected. We can create objects of this protected class in the members of the containing class and of all its derived classes, regardless of assembly.
  9. No. A member, including a nested class, cannot have the same name as its enclosing type. The C# compiler rejects it with error CS0542.
  10. No. Private means accessible to all the members of the same class. If the class is public / internal, we can have a public / internal static method which creates the object and returns it. Singleton uses this technique.
  11. Yes. The public method returns a public interface type, and the interface members are permitted to be called from classes in other assemblies.
  12. No. If we did so, we would be expanding the reach of the internal class.
  13. No. A class cannot inherit a class nested inside itself.
  14. No. The property is public but its type is internal, so the type cannot be exposed through the property. Similar to Qn 7.
  15. Mostly they are mutually exclusive, but one combination is allowed: protected internal. It means that the member can be accessed in any inherited class regardless of assembly OR in any class inside the same assembly.
  16. Sealed is not an access modifier; it is a modifier on the declaration. If applied to a class, it means the class cannot be inherited. If applied to a method, the method cannot be overridden.
  17. Yes, we can have static protected members. Those will be accessible in the same class and in inherited classes.
  18. We cannot have protected instance methods in a static class. In fact it cannot have any instance members. We also cannot have protected static methods inside a static class, because static classes do not support inheritance.
  19. No. Access modifiers are not allowed on a static constructor, as it is not callable from our code. Static constructors execute when the type is first used, regardless of where the call originated and which member it targets. So an access modifier doesn't make sense.
  20. No. This was supported in C, but it is not supported in C#.
  21. Yes, a class can have a member variable of its own type. That won't cause any infinite loop. Otherwise a linked list would not have been possible.
  22. If we just declare the member, there won't be any issue. But if we instantiate that object in the constructor without any condition check, or initialize it on the same line as the declaration, creating one object creates another without end and we get a StackOverflowException.
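Some of the answers above are easier to trust after seeing them compile. The sketch below covers answers 1, 10 and 11 in one place; Vault, IGreeter, HiddenGreeter and AccessDemo are hypothetical types invented for the illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical types used only to illustrate answers 1, 10 and 11.
public interface IGreeter
{
    string Greet();
}

public class Vault
{
    private int _secret = 42;                 // answer 1: private, yet reachable via reflection

    private Vault() { }                       // answer 10: private constructor...

    public static Vault Create()              // ...still usable through a static factory method
    {
        return new Vault();
    }

    public static IGreeter CreateGreeter()    // answer 11: public interface, private implementation
    {
        return new HiddenGreeter();
    }

    private class HiddenGreeter : IGreeter
    {
        public string Greet() { return "hello"; }
    }
}

public static class AccessDemo
{
    // Reads the private field from outside the class by asking for non-public instance members.
    public static int ReadSecret(Vault vault)
    {
        FieldInfo field = typeof(Vault).GetField(
            "_secret", BindingFlags.Instance | BindingFlags.NonPublic);
        return (int)field.GetValue(vault);
    }
}
```

Callers can write Vault.Create() and Vault.CreateGreeter().Greet() even though they can neither call the constructor nor name HiddenGreeter, while AccessDemo.ReadSecret shows that reflection can still read the private field.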

When I started, I thought there would be a limited number of questions. But there are many. Maybe the compiler team at Microsoft has a near complete list as their unit test cases. Once I get some time I will surely tag / group these questions as Static, Inheritance, etc.

Tuesday, October 28, 2014

Stateful or Stateless classes?

What is meant by the state of an object?

Before we discuss stateless or stateful classes, we should have a better understanding of what is meant by the state of an object. It is the same as the English meaning of state: "the particular condition that someone or something is in at a specific time."

When we come to programming and think about the condition of an object at a specific time, it is nothing but the values of its properties or member variables at that point in time. Who decides what the properties of an object are? The class. Who decides what properties and members go inside a class? The programmer who coded that class. Who is the programmer? Everybody who reads this blog, including me who is writing this post. Are we all experts at deciding what properties are needed for each class?

I don't think so. At least that is true of programmers in India who come into the software industry looking only at the salary and treat programming as a daily job. First of all, it is not something that can be taught in colleges the way other engineering disciplines are. It needs to come through experience, because programming is in its early stages compared to other engineering disciplines and it is more like art than engineering. Engineering can sometimes have hard rules, but art cannot. Even after being in programming for around 15 years (sorry, I count my college days as programming experience as well), I still take a considerable amount of time to decide what properties a class needs, and even the name of the class itself.

Can we bring some rules to decide what properties are needed? In other words, what properties should the state of an object include? Or should objects always be stateless? Below are some thoughts on this area.

Entity classes / Business Objects

There are multiple names, such as entity classes and business objects, given to classes which represent a clear state of something. If we take the example of an Employee class, its sole purpose is to hold the state of an employee. What can that state contain? EmpId, Company, Designation, JoinedDate, etc. I hope there is no confusion up to this point. Everybody agrees without much argument that this type of class should be stateful, because this is taught in college.

But how should we do the salary calculation?
  • Should CalculateSalary() be a method inside the Employee class?
  • Should there be a SalaryCalculator class which contains the Calculate() method?
  • In case there is a SalaryCalculator class
    • Should it have properties such as BasicPay, DA, HRA, etc.?
    • Or should the Employee object be a private member variable of SalaryCalculator which is injected via the constructor?
    • Or should SalaryCalculator expose a public Employee property (Get & SetEmployee methods in Java)?

Helper / Operation / Manipulator classes

These are the classes which do a task. SalaryCalculator falls into this type. Classes of this type, which perform actions, can be found in programs under many names, with prefixes and suffixes such as
  • class SomethingCalculator eg:SalaryCalculator
  • class SomethingHelper eg: DBHelper
  • class SomethingController eg: DBController
  • class SomethingManager 
  • class SomethingExecutor
  • class SomethingProvider
  • class SomethingWorker
  • class SomethingBuilder
  • class SomethingAdapter
  • class SomethingGenerator
A long list can be found here. People have different opinions about which suffix to use in which situation. But our interest is something else.

Can we add state to this type of class? I would suggest keeping them stateless. Let's examine why I say so in the rest of this post.

Hybrid classes

According to Wikipedia, encapsulation in object oriented programming "is the packing of data and functions into a single component". Does this mean all the methods which manipulate an object should be in the entity class? I don't think so. The entity class can have state accessor methods such as GetName(), SetName(), GetJoiningDate(), GetSalary(), etc.

But CalculateSalary() should be outside. Why is that so?

According to the Single Responsibility Principle in SOLID, "A class should change only for one reason". If we keep the CalculateSalary() method inside the Employee class, that class will change for either of the 2 reasons below, which is a violation.
  • A state change in the Employee class, eg: a new property has been added to Employee
  • A change in the calculation logic
I hope it is clear. Now we have 2 classes in this context: the Employee class and the SalaryCalculator class. How do they connect to each other? There are multiple ways. One is to create an object of the SalaryCalculator class inside the GetSalary method and call Calculate() to set the salary variable of the Employee class. If we do so, Employee becomes a hybrid, because it acts like an entity class and also initiates operations like a helper class. I really don't encourage this type of hybrid class. But in situations such as a Save method on an entity, this is kind of OK as long as the operation itself is delegated.

Whenever you feel that your class is falling into this hybrid category, think about refactoring. If you feel that your classes are not falling into any of these categories, stop coding.

State in Helper / Manipulator class

What is the problem if our helper classes keep state? Before answering that, let's look at the different kinds of state a SalaryCalculator class can hold. Below are some examples.

Scenario 1 - Primitive values


    class SalaryCalculator
    {
        public double Basic { get; set; }
        public double DA { get; set; }
        public string Designation { get; set; }

        public double Calculate()
        {
            // Calculate from the properties above and return
            throw new NotImplementedException();
        }
    }

Cons

Nothing stops the Basic salary from being that of an accountant while the Designation is "Director", which does not match at all. There is no enforced way to make sure that the values belong together, so the SalaryCalculator cannot work reliably on its own.

Similarly, if one instance is shared between threads, it can produce wrong results.

Scenario 2 - Object as state


    class SalaryCalculator
    {
        public Employee Employee { get; set; }

        public double Calculate()
        {
            // Calculate from the Employee property and return
            throw new NotImplementedException();
        }
    }

Cons

If one SalaryCalculator object is shared by 2 threads and each thread works on a different employee, the sequence of execution might be as follows, which causes logical errors.
  • Thread 1 sets the employee1 object
  • Thread 2 sets the employee2 object
  • Thread 1 calls the Calculate method and gets the salary for employee2
We can argue that the Employee dependency can be injected via the constructor and the property made read only. But then we need to create a SalaryCalculator object for each and every employee object. So it is better not to design your helper classes this way.
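The interleaving above can be replayed on a single thread, which keeps the outcome deterministic. This is only a sketch: StatefulSalaryCalculator and InterleavingDemo are hypothetical names, the Employee class is cut down to two properties, and returning Basic stands in for the real calculation.

```csharp
using System;

// Cut-down Employee, just enough for the demonstration.
public class Employee
{
    public string Name { get; set; }
    public double Basic { get; set; }
}

// The stateful design from scenario 2, with placeholder logic.
public class StatefulSalaryCalculator
{
    public Employee Employee { get; set; }

    public double Calculate()
    {
        return Employee.Basic; // stand-in for the real calculation
    }
}

public static class InterleavingDemo
{
    // Replays the three steps of the bullet list in order, without real threads.
    public static double SalarySeenByThread1()
    {
        var shared = new StatefulSalaryCalculator();
        var employee1 = new Employee { Name = "E1", Basic = 1000 };
        var employee2 = new Employee { Name = "E2", Basic = 9000 };

        shared.Employee = employee1;   // "thread 1" sets employee1
        shared.Employee = employee2;   // "thread 2" sets employee2
        return shared.Calculate();     // "thread 1" calculates, but for employee2
    }
}
```

InterleavingDemo.SalarySeenByThread1() returns employee2's 9000 instead of the 1000 that the first caller expected. With real threads the same thing happens only occasionally, which makes the bug much harder to find.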

Scenario 3 - No state


    class SalaryCalculator
    {
        public double Calculate(Employee input)
        {
            // Calculate from the input and return
            throw new NotImplementedException();
        }
    }


This is a near perfect situation. But here we can argue that if none of the methods use any member variable, what is the use of keeping it as a non-static class?

The second principle in SOLID says "Open for extension and closed for modification". What does it mean? When we write a class, it should be complete. There should be no reason to modify it, but it should be extensible via subclassing and overriding. So how would our final version look?

    interface ISalaryCalculator
    {
        double Calculate(Employee input);
    }
    class SimpleSalaryCalculator:ISalaryCalculator
    {
        public virtual double Calculate(Employee input)
        {
            return input.Basic + input.HRA;
        }
    }
    class TaxAwareSalaryCalculator : SimpleSalaryCalculator
    {
        public override double Calculate(Employee input)
        {
            return base.Calculate(input)-GetTax(input);
        }
        private double GetTax(Employee input)
        {
            //Return tax
            throw new NotImplementedException();
        }
    }

As I mentioned in my previous posts, always program to an interface. In the above code snippet I implemented the interface implicitly, only to save space here. Always implement explicitly. The calculation logic should be kept in a protected function so that inherited classes can call it when required.
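One possible reading of that advice is sketched here: the interface member is implemented explicitly and only forwards to a protected virtual method that holds the logic. The name CalculateCore and the flat 10% tax are assumptions made for the example, not part of the original snippet.

```csharp
using System;

public class Employee
{
    public double Basic { get; set; }
    public double HRA { get; set; }
}

public interface ISalaryCalculator
{
    double Calculate(Employee input);
}

public class SimpleSalaryCalculator : ISalaryCalculator
{
    // Explicit implementation: callable only through an ISalaryCalculator reference.
    double ISalaryCalculator.Calculate(Employee input)
    {
        return CalculateCore(input);
    }

    // The logic lives in a protected virtual method so subclasses can reuse or override it.
    // CalculateCore is a hypothetical name.
    protected virtual double CalculateCore(Employee input)
    {
        return input.Basic + input.HRA;
    }
}

public class TaxAwareSalaryCalculator : SimpleSalaryCalculator
{
    protected override double CalculateCore(Employee input)
    {
        return base.CalculateCore(input) - GetTax(input);
    }

    // Flat 10% of basic pay: an assumption for the example, not a real tax rule.
    private double GetTax(Employee input)
    {
        return input.Basic * 0.10;
    }
}
```

Consumers are forced to hold an ISalaryCalculator, for example ISalaryCalculator calc = new TaxAwareSalaryCalculator();, because Calculate is not visible on the concrete class. The subclass never re-implements the interface; it only overrides the protected method.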

Below is the way how this Calculator class should be consumed.

    class SalaryCalculatorFactory
    {
        internal static ISalaryCalculator GetCalculator()
        {
            // Dynamic logic to create the ISalaryCalculator object
            return new SimpleSalaryCalculator();
        }
    }
    class PaySlipGenerator
    {
        void Generate()
        {
            Employee emp = new Employee() { };
            double salary =SalaryCalculatorFactory.GetCalculator().Calculate(emp);
        }
    }

The factory class encapsulates the logic of deciding which child class is to be used. It can be static as above, or dynamic using reflection. As long as the only reason for this class to change is object creation, we are not violating the "Single responsibility principle".
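For the dynamic variant, a factory could resolve the concrete type by name at run time. The sketch below assumes the type name is supplied by the caller (in practice it would probably come from configuration); the string parameter and the validation are additions for the illustration, not the original factory.

```csharp
using System;

public class Employee
{
    public double Basic { get; set; }
    public double HRA { get; set; }
}

public interface ISalaryCalculator
{
    double Calculate(Employee input);
}

public class SimpleSalaryCalculator : ISalaryCalculator
{
    public virtual double Calculate(Employee input)
    {
        return input.Basic + input.HRA;
    }
}

public static class SalaryCalculatorFactory
{
    // typeName is assumed to be a full or assembly-qualified type name, e.g. read from configuration.
    public static ISalaryCalculator GetCalculator(string typeName)
    {
        // Throws if the type cannot be found.
        Type type = Type.GetType(typeName, true);

        if (!typeof(ISalaryCalculator).IsAssignableFrom(type))
        {
            throw new ArgumentException(typeName + " does not implement ISalaryCalculator.");
        }

        // Requires a public parameterless constructor on the concrete calculator.
        return (ISalaryCalculator)Activator.CreateInstance(type);
    }
}
```

A call such as SalaryCalculatorFactory.GetCalculator(typeof(SimpleSalaryCalculator).AssemblyQualifiedName) returns a SimpleSalaryCalculator. Adding a new calculator then needs a new class and a configuration change, with no edit to the factory or to its callers.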

In case you are going for a hybrid class and need to invoke the calculator from the Employee class, it can be done through a Salary property as below.

    class Employee
    {
        public string Name { get; set; }
        public int EmpId { get; set; }
        public double Basic { get; set; }
        public double HRA { get; set; }

        public double Salary
        {
            //NOT RECOMMENDED
            get { return SalaryCalculatorFactory.GetCalculator().Calculate(this); }
        }
    }

This way we ensure that even if the salary calculation logic changes, the Employee class will not change.

Conclusion

Don't code when we are thinking. Don't think when we are coding

  • Spend some time on class design before coding. Show the class diagram to 2-3 fellow programmers and get their opinions.
  • Name the class wisely. There is no hard rule, but below are some rules I follow.
    • Entity classes should be named with nouns which represent a type of object - eg: Employee
    • Helper / worker class names should reflect that the class is a worker - eg: SalaryCalculator, PaySlipGenerator, etc.
    • A verb should never be used as a class name - eg: class CalculateSalary{}

Tuesday, October 21, 2014

Delete a SQL Server database schema with all its objects

Recently, as part of R&D, I had to delete all the database schemas in a SQL Server database. The major pain I foresaw was identifying the objects associated with each schema and deleting them in order. I was confident that somebody must have faced the same problem earlier and that a script would be available as is. That was correct. I got a good link on the first Google search itself. It is given below.

http://ranjithk.com/2010/01/31/script-to-drop-all-objects-of-a-schema/#comment-428

Many thanks to the author. But when I tried deleting the schema in my database using this SP, I got an error saying that the schema cannot be dropped because there are user defined table types inside it. The technique the author used to get the objects of a schema is to query sys.objects, and that never returns the user defined table types inside the schema.

SELECT *
FROM sys.objects SO
WHERE SO.schema_id = schema_id(@SchemaName) order by name

Other people might have faced this too, so I read through the comments, but no luck. So I had to spend some time on the query and added the code to delete UDTTs too.

--Add DROP TYPE statements into table
INSERT INTO #dropcode
SELECT 'DROP TYPE '+ @SchemaName + '.'+name
FROM   sys.types
WHERE  is_table_type = 1 and schema_id=schema_id(@SchemaName)

File can be downloaded from here
 
Once again, thanks to Ranjith, the author of the original post. I hope he won't mind me changing his work and redistributing it.

Tuesday, October 7, 2014

SQL Server internals - How to see where my data record stored

As I mentioned in many of my previous posts, it is very difficult for me to learn something without seeing how it is done internally. For example, you can see how I explored the working of the .Net GC in one of my previous posts. This time I am trying to learn how SQL Server stores data internally.

Where SQL Server stores our tables & records?

As everybody knows, it is on the disk. But in which file? Where is it located? At least 2 files are required for each database, and we can see the file paths in the properties tab of the SQL Server database or query the details.

How the data records, tables are organized

We can see that the data is stored in normal files with the extensions .mdf, .ldf and .ndf. Does that mean we can open them in Notepad and read them? Does SQL Server just open the file and write into it, like we did in the C/C++ labs in college?

Absolutely not. SQL Server is production-ready software, so it cannot work like academic code. It has more levels which optimize the storage techniques for maximum performance. One level is the filegroup, where we can specify more than one file for a group and associate it with a partition. Another level is the page. SQL Server considers a page the atomic unit of storage. The page size is 8KB. It does IO operations such as reads / caching at page level only. Even if we need one record from a page, it reads the entire page.

Let's get into how the records are stored. As we know, the physical storage order of records in a SQL Server table is based on the clustered index, and normally the primary key is the clustered index. We cannot have more than one physical storage order for the data records. That is why only one clustered index is allowed per table.

How to inspect SQL Server pages

But there are also non-clustered indexes. If the records cannot be physically stored in more than one order, how do they help us? They are separate data structures which describe the order of the rows in a different way. Before going to "how non-clustered indexes work", let's get a full understanding of how the clustered index works and how to see the data inside a page.

I am glad to say that people before me already thought the same way and did enough hard work to explain the storage with good pictures. So why do I need to do the task again? I just read their blogs and understood how it works. So I am sharing the same via my blog.

Below is the blog post where the storage is explained using the undocumented SQL Server commands DBCC IND & DBCC PAGE
http://www.mssqltips.com/sqlservertip/1578/using-dbcc-page-to-examine-sql-server-table-and-index-data/

References

http://www.practicalsqldba.com/2012/04/sql-server-index-fragmentation.html
https://www.simple-talk.com/sql/database-administration/sql-server-storage-internals-101/

My interest was in index fragmentation. So I did some more research on it and am preparing my own post, where we can see how fragmentation can be created and solved.

Tuesday, September 23, 2014

Leveraging PerfMon to monitor application / server health

Here I would like to discuss a business scenario and the solution we provided. Sorry to say that this post contains little code.

I am not saying that this is the ultimate solution. Anybody is welcome to criticize it so that I can improve the solution.

Background

We have a generic queuing system in our project to offload long running processes. The web layer, which includes the web site and web services, just puts the request in the queue so that it can return a response within 2 seconds, allowing the processing machines to process it at a later time. The processing is done in a pull fashion instead of a push fashion. Each processing machine (Windows 2008 virtual machines) knows its processing capacity and how many queue messages are currently running on it. Based on the currently available processing capacity, it de-queues requests / messages from the queue and processes them. There is a proprietary algorithm which we developed for de-queuing messages based on other factors too, such as the priority of an individual message or the message type.

There is a web portal to monitor this environment. It tells how many messages are waiting in the queue, which messages are in progress, how many failed, etc.

I would say it is a cloud setup, where we can add more processing servers to scale out based on the processing power required. But I don't know why nobody else in our project calls it cloud :( . Maybe those poor guys don't know what cloud is? Or they didn't get any chance to experiment with a commercial cloud and see how it scales :)

The problem here is to provide a solution for visualizing this queuing system in correlation with system performance. In other words, the admins of this environment need to see the processor utilization of each processing machine while it is processing its maximum number of messages. Based on that, they can decide whether the currently configured "maximum number of messages which can be processed concurrently" really utilizes the full system resources. If the configured number of concurrent requests is higher than the machine's capacity, ie it is over utilized, we can expect more failures (due to timeouts or memory problems) in the processing servers. If it is lower, the system will be in an underutilized state.

The solution

Approach 1

As I mentioned earlier, there is a web portal which displays details about the queue system. The initial suggestion was to have a capturing mechanism which takes processor and memory snapshots of all the processing servers and displays them in the web portal using third party graph libraries.

Pros

  • Full control over data collection and display

Cons

  • Needs a license for the third party library. We could use open source, but the client doesn't allow that
  • More development effort

Is this the best solution?

Approach 2

If we think about alternatives, we can come up with 3-4 of them. The one which got major support is "Using perfmon to show the queue details".

The solution is simple. Microsoft has already invested enough in creating a beautiful graph based UI for monitoring system performance. It is called Perfmon and it is bundled with the Windows operating system. We can create our own performance counters and get them displayed there as well. When we have such a nice tool, why can't we leverage it to show the queue details and correlate them with system performance such as processor utilization, memory, etc.?

Pros

  • No need of third party for visualization.
  • Simple to implement

Cons

  • If Microsoft discontinues perfmon tomorrow, we need to find an alternative
  • Needs special permission if we are creating a new perfmon category.

Code snippet

Without code, it is very difficult for me to end the post. So I am just sharing the links on working with PerformanceCounters using .Net.


Registry permission for creating perfmon
http://bytes.com/topic/net/answers/501945-requested-registry-access-not-allowed-performance-counte
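As a starting point, publishing a queue metric as a custom performance counter could look roughly like the sketch below. It uses the System.Diagnostics performance counter classes. The category name "MyQueueSystem", the counter name "Messages In Progress" and the QueueCounters class are hypothetical, and creating the category needs the elevated permission mentioned above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical category and counter names, for illustration only.
public static class QueueCounters
{
    public const string Category = "MyQueueSystem";
    public const string InProgress = "Messages In Progress";

    // Describes the counters that belong to the category.
    public static CounterCreationDataCollection BuildDefinitions()
    {
        var counters = new CounterCreationDataCollection();
        counters.Add(new CounterCreationData(
            InProgress,
            "Number of queue messages currently being processed on this machine.",
            PerformanceCounterType.NumberOfItems32));
        return counters;
    }

    // Run once at install time with administrative rights (it writes to the registry).
    public static void EnsureCategory()
    {
        if (!PerformanceCounterCategory.Exists(Category))
        {
            PerformanceCounterCategory.Create(
                Category,
                "Counters published by the queue processing service.",
                PerformanceCounterCategoryType.SingleInstance,
                BuildDefinitions());
        }
    }

    // Called by the processing service whenever the number of running messages changes.
    public static void Publish(long messagesInProgress)
    {
        // readOnly: false, because we are writing to the counter.
        using (var counter = new PerformanceCounter(Category, InProgress, false))
        {
            counter.RawValue = messagesInProgress;
        }
    }
}
```

Once the category exists and the service calls Publish, the counter can be added in Perfmon next to the built-in processor and memory counters, which gives the correlation the admins asked for. A long running service would normally keep the PerformanceCounter instance alive instead of creating one per call.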