Monday, December 30, 2013

Merging 2 collections without duplicates in C# .Net using Linq

Before Linq, the usual way to join 2 collections without duplicate elements was to loop through the first list, check whether each element exists in the second list and, if not, add it to the second list. Linq makes this requirement easier to achieve, mainly through the Union method. Let's see how a simple integer list can be merged without duplicates.

            int[] list1 = {2,3,1 };
            int[] list2 = { 2, 3, 10 };
            foreach (int number in list1.Union<int>(list2))
            {
                Console.WriteLine(number);
            }

Output
2
3
1
10
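For comparison, the loop based approach described at the start can be sketched as below. This is only an illustration; the ManualMerge class and its Merge method are names made up for this sketch and are not part of the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, written only to illustrate the pre-Linq approach.
static class ManualMerge
{
    // Copies 'second' and appends every element of 'first' that is not already in the copy.
    public static List<int> Merge(IEnumerable<int> first, IEnumerable<int> second)
    {
        List<int> merged = new List<int>(second);
        foreach (int number in first)
        {
            // Contains scans the whole list, so this is slow for large inputs.
            if (!merged.Contains(number))
            {
                merged.Add(number);
            }
        }
        return merged;
    }

    static void Main()
    {
        int[] list1 = { 2, 3, 1 };
        int[] list2 = { 2, 3, 10 };
        Console.WriteLine(string.Join(",", Merge(list1, list2))); // prints 2,3,10,1
    }
}
```

Note that the order differs from the Union output because the second list comes first here, and that duplicates already present inside the second list are not removed by this loop.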

By default Union removes the duplicates. It's easy to remove duplicates in a primitive type collection. What about dealing with a custom type such as an Employee class with 2 properties, EmpId and Name? How does the .Net Linq library function Union know that 2 Employee objects are equal? Basically, when we say 2 objects are the same, one or more of their properties should be equal. Let's see how a custom type can tell the framework about its equality.

Object.GetHashCode()

This method returns an integer hash code for an object. When we use Union, it internally calls this method on our Employee class. If 2 objects return the same value, the framework treats them as candidates for being equal. If the values differ, it treats them as different objects without checking any further.

Object.Equals()

Whenever GetHashCode returns a value which is the same as another object's hash code, the framework calls this method to confirm the object equality. The Equals method receives the other object, so we can do the comparison here and return a boolean value which tells whether the objects are equal or not. If we return true, the framework discards the second object from the union result. So our Employee class becomes as follows.

    class Employee
    {
        public string Name { get; set; }
        public int EmpId { get; set; }
        public override bool Equals(object obj)
        {
            bool isEqual = false;
            Employee emp = obj as Employee;
            if (emp != null)
            {
                isEqual = emp.EmpId == this.EmpId;
            }
            return isEqual;
        }
        public override int GetHashCode()
        {
            return this.Name.GetHashCode(); 
        }
    }

It first checks the hash code. In this case the hash code is computed from the Name property only. If the hash codes returned are the same, the framework calls the Equals method and decides based on its return value.
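To watch this sequence happen, a small probe class can record every call that Union makes. This is a rough sketch and not part of the Employee sample: Probe, ProbeDemo and the Calls list are hypothetical names, and here both methods use the same Id property so that the two calls agree with each other.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only to log which methods Union calls.
class Probe
{
    public static readonly List<string> Calls = new List<string>();
    public int Id { get; set; }
    public string Name { get; set; }

    public override int GetHashCode()
    {
        Calls.Add("GetHashCode(" + Name + ")");
        return Id.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        Probe other = obj as Probe;
        Calls.Add("Equals(" + Name + ", " + (other == null ? "null" : other.Name) + ")");
        return other != null && other.Id == Id;
    }
}

static class ProbeDemo
{
    public static List<Probe> Run()
    {
        Probe.Calls.Clear();
        Probe[] first = { new Probe { Id = 1, Name = "a" } };
        Probe[] second = { new Probe { Id = 1, Name = "b" }, new Probe { Id = 2, Name = "c" } };
        // ToList forces the lazy Union query to execute.
        return first.Union(second).ToList();
    }

    static void Main()
    {
        foreach (Probe p in Run())
        {
            Console.WriteLine("{0},{1}", p.Id, p.Name);
        }
        foreach (string call in Probe.Calls)
        {
            Console.WriteLine(call);
        }
    }
}
```

The result keeps "a" and "c". The recorded calls should show GetHashCode being invoked for each element, and Equals only for the pair that shares a hash code.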

At first look it may seem unclear why we need both these methods. But if 2 objects return different values from GetHashCode, the framework will not consider them equal, and Equals is never called for that pair. That is why 2,george and 2,mon both appear in the output below even though their EmpId is the same. Let's see how this can be used to combine 2 Employee lists.

            IList<Employee> empList1 = new List<Employee>() 
            { 
                new Employee(){EmpId=1,Name="joy"},
                new Employee(){EmpId=2,Name="george"}
            };
            IList<Employee> empList2 = new List<Employee>() 
            { 
                new Employee(){EmpId=1,Name="joy"},
                new Employee(){EmpId=2,Name="mon"} 
            };
            foreach (Employee emp in empList1.Union<Employee>(empList2))
            {
                Console.WriteLine("{0},{1}",emp.EmpId,emp.Name);
            }

Output
1,joy
2,george
2,mon

This is the simplest method. If you are a strict follower of the SOLID principles, you can move the comparison part to a different class. The only thing you need is to implement the IEqualityComparer interface in it.

    class Employee
    {
        public string Name { get; set; }
        public int EmpId { get; set; }
    }
    class EmployeeComparer : IEqualityComparer<Employee>
    {
        bool IEqualityComparer<Employee>.Equals(Employee x, Employee y)
        {
            return x.EmpId == y.EmpId;
        }
 
        int IEqualityComparer<Employee>.GetHashCode(Employee obj)
        {
            return obj.Name.GetHashCode();
        }
    }
///////////////////////////////////////////////////////////////////////////////////////////
//Testing
            foreach (Employee emp in empList1.Union<Employee>(empList2,new EmployeeComparer()))
            {
                Console.WriteLine("{0},{1}",emp.EmpId,emp.Name);
            }
        

If you want to alter the comparison so that it considers only the EmpId, we can put that logic inside the GetHashCode & Equals methods. I have included only those methods below for reference. This works with the earlier simple sample as well, where we haven't used IEqualityComparer.
        

    class EmployeeComparer : IEqualityComparer<Employee>
    {
        bool IEqualityComparer<Employee>.Equals(Employee x, Employee y)
        {
            return x.EmpId == y.EmpId;
        }
 
        int IEqualityComparer<Employee>.GetHashCode(Employee obj)
        {
            return obj.EmpId.GetHashCode();
        }
    }

Output
1,joy
2,george        

Happy coding.

Tuesday, December 17, 2013

Object is Nothing but can access the members without NullReferenceException in VB.Net

All .Net developers will be familiar with NullReferenceException. This exception must have hit our code at least once. When we try to access a member of an object variable which is null (Nothing in VB.Net), it throws NullReferenceException. In almost all scenarios when we get such an exception, it's a little difficult to identify which object is null. If it's in Visual Studio, we mouse over all the object variables in the method and look at their values.

In our project there is one place where people occasionally see an object that shows Nothing in the watch window, yet the method call on it works. Since it was written in the initial days of the project, around 8 years back, by a group of people who strived for generic implementations with strict SOLID principles, it's a kind of Bermuda triangle for the new generation of programmers. Since it was written properly using the Open-Closed principle, normally no situation demanded editing that class. Whenever somebody encountered this during debugging, they shared the experience with each other. People treated it as a kind of fairy place. For them it worked almost all the time. Nobody spent time uncovering the mystery, or the people who tried were not good discoverers. See below for how this looks in Visual Studio.



I met with the situation yesterday. Unfortunately, in my case the object really was Nothing and it started throwing NullReferenceException. When I checked with the others, they told me that this is the normal behaviour: they all see it as Nothing when they mouse over it or put it in the watch window, but it works for them. After some more debugging I could see that the object is loaded from a file and that the file was not present on my machine. I copied the file and, out of curiosity, debugged again. Yes, I too reached the same page as the others. In the watch window the object shows Nothing, but I can call the methods inside it.

After an hour of struggle, one of my attempts shed some light on it. I tried to get the type by invoking the ToString() method. It returned the correct type name. That clicked it.

When we put an object into the watch window or mouse over it, it's actually showing the ToString() value of that object.

I could easily figure out that one of the base classes overrides the ToString method without a method body. Since VB.Net lets a Function end without a Return statement (the compiler only raises a warning), it returns Nothing, and that is the Nothing shown in the watch window and other debugger views.

Public Class Employee
    Private _name As String
    Public Property Name() As String
        Get
            Return _name
        End Get
        Set(ByVal value As String)
            _name = value
        End Set
    End Property
    Private _Id As Integer
    Public Property Id() As Integer
        Get
            Return _Id
        End Get
        Set(ByVal value As Integer)
            _Id = value
        End Set
    End Property
    Public Overrides Function ToString() As String
        'A C# method without a return would not compile; VB.Net only warns and returns Nothing
    End Function
End Class
Yes the unsafe VB.Net programmers are really unsafe.
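The same effect can be reproduced in C# if the null is returned explicitly. The sketch below assumes nothing beyond the base class library; SilentEmployee and ToStringDemo are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical class; mirrors the VB.Net ToString override that returns Nothing.
class SilentEmployee
{
    public string Name { get; set; }

    // C# insists on an explicit return, so the null has to be spelled out.
    public override string ToString()
    {
        return null;
    }
}

static class ToStringDemo
{
    static void Main()
    {
        SilentEmployee emp = new SilentEmployee { Name = "joy" };
        Console.WriteLine(emp != null);            // True: the object itself exists
        Console.WriteLine(emp.ToString() == null); // True: only its text form is null
        Console.WriteLine(emp.Name);               // joy: members are still accessible
    }
}
```

So the reference is perfectly valid and only its string representation is null, which is what a debugger view based on ToString() ends up displaying.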

Monday, December 9, 2013

Finding odd or even number in VB.Net - different ways

Normally, most developers will have written the odd or even program while learning programming. The basic idea is to divide by 2; if the remainder is 0 the number is even, else odd. A simple program is given below.

        'Compare with 0 so that negative odd numbers (where Mod returns -1) are handled too
        If (inputNumber Mod 2) <> 0 Then
            Console.WriteLine("Odd")
        Else
            Console.WriteLine("Even")
        End If

This is the normal behaviour of people who follow the "do things after learning" principle. But what happens if the person follows the "learn by doing" principle? We can see a program like the one below.

        Dim total As String = Convert.ToString(inputNumber / 2, Globalization.CultureInfo.InvariantCulture)
        If total.Contains(".") Then
            Console.WriteLine("Odd")
        Else
            Console.WriteLine("Even")
        End If

Interesting, isn't it? It works. But if we look at the performance, it is not good: it does a floating point division, creates a string and searches it, where a single remainder operation would do. This may not cause a big issue on today's machines, which have 4 cores & 8GB RAM, but in terms of machine cycles it's a considerable factor. Surprisingly, this code came from a team of 2 people who are well reputed in their organization. They started their programming careers with .Net technology (as far as I know). This again reinforces my viewpoint about programmers.
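For reference, the checks discussed above can be put side by side in C#. This is only a sketch: the Parity class and its method names are made up for the comparison, and the bitwise variant is an extra alternative that relies on the lowest bit of an integer being set for odd numbers.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that puts the different parity checks side by side.
static class Parity
{
    // Remainder check; comparing with 0 also covers negative numbers (-3 % 2 is -1).
    public static bool IsOddByRemainder(int n)
    {
        return n % 2 != 0;
    }

    // Bitwise check; the lowest bit is set for every odd int, negative ones included.
    public static bool IsOddByBit(int n)
    {
        return (n & 1) == 1;
    }

    // String based check from the post: divide, format, then search for the decimal point.
    public static bool IsOddByString(int n)
    {
        string half = (n / 2.0).ToString(CultureInfo.InvariantCulture);
        return half.Contains(".");
    }

    static void Main()
    {
        foreach (int n in new[] { -3, 0, 7, 10 })
        {
            Console.WriteLine("{0}: {1} {2} {3}", n,
                IsOddByRemainder(n), IsOddByBit(n), IsOddByString(n));
        }
    }
}
```

All three agree on these inputs. The first two need only a single integer operation, while the string version needs a floating point division, a string allocation and a search.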

As a serious programmer one should first learn how the hardware and software work together. 

Without that knowledge they can still create and deliver good looking web sites & desktop apps in managed technologies such as .Net or Java and flourish in their careers. But at a later point they will get stuck.

Monday, December 2, 2013

Unable to debug 64bit .net app in VS 2010, because sos is not getting loaded

Recently I had to debug one of our research applications for memory leaks. It is developed in Visual Studio 2010 as a 64 bit .Net application with a considerable amount of unmanaged (Win32 API) calls. Since I have Visual Studio 2010 Pro, I thought of debugging in VS itself, just for a change.

How to debug in Visual Studio using sos extensions

Initially I enabled unmanaged debugging in the project properties, which is the first thing to do when starting sos debugging in Visual Studio. Then I tried loading sos as usual in the Immediate window.

.load "C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll"

But it ended up as

Error during command: extension c:\windows\microsoft.net\framework64\v4.0.30319\sos.dll could not load (error 193)

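Oh, I forgot one thing: we cannot load the 64 bit sos.dll in VS 2010 because the VS 2010 process itself runs as 32 bit, even when the OS and machine are 64 bit. Error 193 is the Win32 code ERROR_BAD_EXE_FORMAT, which is what we get when a 32 bit process tries to load a 64 bit DLL. I can see that devenv.exe is a 32 bit process in Task Manager. I am not sure of the reason behind it, but that is what people say.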
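To confirm from code which bitness our own application runs with, a quick check like the one below can help. It is a sketch that assumes .Net 4.0 or later, where Environment.Is64BitProcess and Environment.Is64BitOperatingSystem are available; BitnessCheck and Describe are hypothetical names.

```csharp
using System;

// Hypothetical helper that reports the bitness of the current process and OS.
static class BitnessCheck
{
    public static string Describe()
    {
        return string.Format("process: {0} bit, OS: {1} bit, pointer size: {2} bytes",
            Environment.Is64BitProcess ? 64 : 32,
            Environment.Is64BitOperatingSystem ? 64 : 32,
            IntPtr.Size);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

A process that reports 32 bit here can load only 32 bit DLLs, which matches the sos.dll under the Framework folder and not the one under Framework64.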

That's fine. I changed the application to x86, i.e. 32 bit. Then I tried to load sos, but was surprised to see another error message.

.load C:\Windows\Microsoft.NET\Framework\v4.0.30319\SOS.dll
Error during command: IDebugClient asked for unimplemented interface
Error during command: Exception c0000005 occurred at 2A5E7EAE

This is interesting. I had no other way except googling the same error message. The first result itself said that there is an issue reported to Microsoft on this.

If there is .Net 4.5 installed we cannot load sos in VS 2010

http://connect.microsoft.com/VisualStudio/feedback/details/742882/after-installing-visual-studio-11-beta-load-sos-fails-in-visual-studio-2010

I have .Net 4.5 installed on my machine, which I cannot uninstall for this silly thing. Yes, I had to go back to WinDbg, the command line world, to do my job.