Tuesday, May 10, 2022

Some thoughts on obfuscation

Long, long ago I was given the special task of hiding code. Hiding code... what? Yes, we had to deliver code in such a way that nobody could reverse engineer it.

Background

There are many scenarios where software needs to be resistant to reverse engineering. Some of them are below.
  • Some product companies (often referred to as client companies) don't want to hire engineers; instead they outsource to vendors. The vendor is normally one of the consulting companies in India or a US-based consulting company that has offices in India. The clients want competition among vendors to drive the price down. That competition sometimes reaches the point where one vendor points out issues in another vendor's code, of course with the help of reverse engineering powered by decompilers.
  • When a product company releases a game that has a monetization feature, it needs to make the game resistant to reverse engineering, regardless of whether it follows the old pay-to-play model or the modern web3-based play-to-earn model.
  • When a product is licensed with a key that must be purchased. Most of us remember the days when cracks were floating around for everything from Photoshop to the Windows OS itself.
Hence distributing software that is resistant to reverse engineering is critical for many businesses.

Approaches

Hosted service

Someone can reverse engineer the code only if the code is given to the client and the client shares it. Stop that in the first place by
  • Hosting the core logic as a service run by the vendor.
  • Creating a client app for the client to use those services. This app will not contain any critical IP.
  • If the core logic is an algorithm, the client's data will just transit through the vendor's premises.
  • Otherwise, the client's data will be stored on the vendor's premises.
Client companies may not like this, as they would like to own the code, not the engineers.

But if we are a product company, we decide how to architect and deliver our product. We never share the source with consumers, and they don't have any say in it.

Give installer binaries

Another way is to give only binaries to the client, never the source code. Some client companies will agree to this model, but some may not.
Still, the vendor is not protected from reverse engineering. The client company can give the binaries to other vendors, who can then reverse engineer them. It's easy with managed languages such as Java, C#, etc.

If we are a product company, we can decide to give only binaries, but we are still vulnerable to reverse engineering.

Just google for "Java decompiler" or ".NET decompiler" to get started on the reverse engineering journey. If the application is built using JavaScript or Python, then it is typically shipped as plain source code already, because those languages are interpreted.
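To see why decompilers have such an easy time, here is a small Python sketch. The function compute_license_fee and its parameters are made up for illustration. The point is only that compiled bytecode still carries the original names, which is roughly the situation with Java class files and .NET IL as well.

```python
import dis


# Hypothetical business logic, invented for this illustration.
def compute_license_fee(seats, price_per_seat):
    discount = 0.1 if seats > 100 else 0.0
    return seats * price_per_seat * (1 - discount)


# The compiled code object still remembers every name from the source.
code = compute_license_fee.__code__
recovered_function_name = code.co_name
recovered_variable_names = code.co_varnames  # arguments first, then locals

print(recovered_function_name)
print(recovered_variable_names)

# The disassembly lists the bytecode instructions together with those names.
dis.dis(compute_license_fee)
```

With the names intact, reading the logic back is mostly a matter of patience.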

Obfuscation

How do we protect Java and C# binaries from reverse engineering? Obfuscation comes to help. What is obfuscation?

Obfuscation is the act of making a message difficult to understand. Applied to code, it should not change the behavior of the code.

How does it help address the problem?

Compile the code to an obfuscated form, then deliver those binaries. Even if other vendors get them, they need to spend an enormous amount of time to reverse engineer the code and find flaws.

Obfuscation does not make the code immune to reverse engineering; it only delays the reverse engineering.

The decompiler will still work on obfuscated code, but the names of variables, methods, and classes will all have been renamed to cryptic words.
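A toy version of that renaming can be sketched in Python. This is not how .NET obfuscators work internally (they operate on IL and metadata, not on source text), and the sample function total_price is invented for the example. It only shows the idea: names become meaningless while behavior stays the same. It assumes Python 3.9 or later for ast.unparse.

```python
import ast

# Hypothetical readable source; it uses no builtins, so every name is safe to rename.
SOURCE = """
def total_price(quantity, unit_price):
    subtotal = quantity * unit_price
    return subtotal
"""


class Renamer(ast.NodeTransformer):
    """Replace every identifier with a short cryptic alias (a0, a1, ...)."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"a{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


renamer = Renamer()
obfuscated_source = ast.unparse(renamer.visit(ast.parse(SOURCE)))
print(obfuscated_source)

# Run both versions and check that the behavior is unchanged.
original_ns, obfuscated_ns = {}, {}
exec(SOURCE, original_ns)
exec(obfuscated_source, obfuscated_ns)

new_name = renamer.mapping["total_price"]
same_result = original_ns["total_price"](3, 4) == obfuscated_ns[new_name](3, 4)
print(same_result)
```

Someone reading the renamed version can still follow the arithmetic, but in a large code base losing every meaningful name makes that far slower.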

Obfuscation in .NET

That's kind of the introduction. Now let us talk about real programming stuff, as the title says. How do we obfuscate .NET assemblies (which contain IL code, not really native binaries)?

There are a lot of tools for obfuscation.
A detailed comparison of the different tools is not in the scope of this post. Please refer to other resources.

Problems with obfuscation

Obfuscation brings its own problems.
  • We can't reverse engineer it either - If we are not sure which assembly is in production and want to check that the intended code is present, we ourselves will be in trouble.
  • Reflection will break - If we create an object of a class using its name held in a string variable, it will break, because the class names are changed by the obfuscation process.
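The reflection problem can be simulated in a few lines of Python. This is only a sketch: the class InvoicePrinter, the helper create_by_name, and the dictionaries standing in for an assembly's type table are all made up. In .NET the same thing happens when a name string is passed to something like Type.GetType after the type has been renamed.

```python
# Hypothetical class that the application creates by name.
class InvoicePrinter:
    def render(self):
        return "invoice"


def create_by_name(type_table, class_name):
    """Reflection-style factory: look the class up by its name string."""
    return type_table[class_name]()


# Before obfuscation the type table still uses the original name.
type_table = {"InvoicePrinter": InvoicePrinter}
printer = create_by_name(type_table, "InvoicePrinter")
print(printer.render())

# After obfuscation the same class is registered under a cryptic name,
# but the string literal in the calling code was never updated.
obfuscated_type_table = {"a": InvoicePrinter}
try:
    create_by_name(obfuscated_type_table, "InvoicePrinter")
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
```

The usual way around this is to exclude such classes from renaming, at the cost of leaving their names readable.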
I started writing this post long ago. Nowadays, in the world of open source, I am not sure anyone is obfuscating code unless it's really necessary.
