Tuesday, May 28, 2019

IIS Logging time-taken = server request processing time + NETWORK TIME

When we do application development and hosting ourselves its an easy life. If something we miss, our nights and weekends are simply gone. A different kind of easy life. But what if we develop and someone else host and maintain? It adds additional efforts to bring both the parties into same page regarding root cause of errors, performance etc...

Most of the enterprise world works as 2 separate teams for development and hosting. One of the issue  in syncing both operations and development regarding performance often arise is around the IIS logs. IIS logs shows 5 digits or 6 digits values in time-taken field. It may simply reach to development as application issue. Most of the administrators know time-taken field includes the network time to move the payloads along with execution time. But still there are admins who just pass over those issues to development without thinking what is going on the network devices, ActiveDirectory, security gateways and what not in the route.

If its clear and not expecting any of these type of issues from operation admins, please feel free to stop reading further on this post. This post is mainly intended to developers and admins who needs to know little in depth about the time-taken field and how to find the application processing time from the time-taken field.

Oh yes its simply to look at the IIS Failed Request tracing(FREB here onwards). Again feel free to stop reading as its simple as looking at FREB logs to find the network time separated from application execution time.

The below is for people who wants to reproduce a network delay and see what FREB shows. Truly for hands on enthusiasts.

Sample server application

The network delay mainly happens in below situations
  • If the payload is really large
  • Security gateways placed between the user and servers to prevent attack takes extra time in inspecting packets or if the bandwidth is really slow. 
  • If the Active Directory or other identity providers which are placed before the application is taking time.
Most of the small responses would be cached so IIS shows proper time in the time-taken field.

The test application which we are going to use has a simple web api as below.
http://localhost/TimeTakenTestServer/api/files/2

It gives random 2 MB text data back. The last number in the URL shows how many MBs of text to be returned. The WebAPI caches the random string so that subsequent requests don't need the time to generate the data. Its just reading from cache.

How to create such a request is out of this post. But for those who are interested can check out a sample in below repo location.

<Code location>

Simulate delay

Delay can be simulated in different ways.

Chrome F12 network profile

Chrome dev tools has option to introduce artificial delays. It can be used to create delays. But please note that this will not work for smaller payloads as it gets buffered in HTTP driver or IIS levels.

.Net app to slow reads using TCP API

If we use TCP classes in .Net, we can simulate delays. If we use higher level APIs, either they may not have the option to put delays or they might be reading from network level buffers.

Chrome F12 tools to demo the problem

Lets use Chrome to simulate delay. Below is the screenshot.

Chrome view



The first request is made after an IIS reset. It took around ~2 seconds. IIS needs some extra time to warm up the worker process after its reset. Note that the Fast 3G network throttle profile at the top left was no turned on at this time. 

Then tried 2 Ctrl+F5 which clear browser cache and loaded again. But this time we can see that it completed in milliseconds. This is because the server is already up and running also it has the string in memory cache (in this case a static variable)

Then the Fast 3G network throttle profile is turned on. The subsequent requests took more time.

IIS log

Let us see what IIS says about this test.

#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(Referer) sc-status sc-substatus sc-win32-status time-taken 
2019-05-21 23:23:14 ::1 GET /TimeTakenTestServer/api/files/2 - 80 - ::1 - 200 0 0 929 
2019-05-21 23:23:17 ::1 GET /favicon.ico - 80 - ::1 http://localhost/TimeTakenTestServer/api/files/2 404 0 2 17 
2019-05-21 23:23:21 ::1 GET /TimeTakenTestServer/api/files/2 - 80 - ::1 - 200 0 0 715 
2019-05-21 23:23:22 ::1 GET /favicon.ico - 80 - ::1 http://localhost/TimeTakenTestServer/api/files/2 404 0 2 6 
2019-05-21 23:23:24 ::1 GET /TimeTakenTestServer/api/files/2 - 80 - ::1 - 200 0 0 641 
2019-05-21 23:23:27 ::1 GET /favicon.ico - 80 - ::1 http://localhost/TimeTakenTestServer/api/files/2 404 0 2 4 
2019-05-21 23:23:39 ::1 GET /TimeTakenTestServer/api/files/2 - 80 - ::1 - 200 0 0 6389 
2019-05-21 23:23:47 ::1 GET /favicon.ico - 80 - ::1 http://localhost/TimeTakenTestServer/api/files/2 404 0 2 3 
2019-05-21 23:23:57 ::1 GET /TimeTakenTestServer/api/files/2 - 80 - ::1 - 200 0 0 6505

Note that the Chrome and IIS traces are not taken from exact same trial. But they are done without any changes.

FREB to rescue

Now lets see how can we understand that the transfer really took time or something wrong with application. FREB can be enabled for http 200 responses though it says failed request tracing. The FREB logs can clearly tell how much time it took for data transfer and it is displayed in the below image.


The above image shows the comparison of second request and the last request. The second one don't have any F12 Networking throttling but the last one has hence the transfer time is high.

More reading

HTTP.sys buffering at TCP layer

https://support.microsoft.com/en-us/help/820129/http-sys-registry-settings-for-windows
https://blogs.msdn.microsoft.com/wndp/2006/08/15/buffering-in-http-sys/

Tool to simulate network delay

http://jagt.github.io/clumsy/index.html

References

https://blogs.msdn.microsoft.com/tess/2008/08/19/troubleshooting-a-performance-issue-with-failed-request-tracing-and-appcmd-in-iis7/
https://blogs.msdn.microsoft.com/docast/2016/04/28/troubleshooting-iis-request-performance-slowness-issues-using-freb-tracing/

Tuesday, May 21, 2019

Increase LOG_FILE_MAX_SIZE_TRUNCATE using INETMGR

We might have encountered the below message in FREB logs when the log size goes above the size limit.


Simple google tells us how to increase the size by executing the below command.

cd /d "%windir%\system32\inetsrv" appcmd set config /section:sites -siteDefaults.traceFailedRequestsLogging.maxLogFileSizeKB:1024

This works for most of us. This post is to show how this can be done via INETMGR application which is just a UI around the commands to manage IIS Web server.
There is no direct way to edit it with customized view. But we can select the server node and navigate to the settings using Configuration Editor icon under Management section.

Tuesday, May 14, 2019

Hard drive shows full but size not reflected in the folder properties

Recently, we were doing load testing of an enterprise application. We use VisualStudio load testing project now known as Azure DevOps with 4 agents having 16 cores, 64GB each. Due to some enterprise restrictions we are not able to use the VSO / DevOps hosted agents instead we have to use 'bring your own agent' method. It was fine when we were below 3000 virtual users limit. But when we targeted 5000 users, we ended up in load test agent issues than the application.

One of the issues was a failure in load test run saying there was not enough hard disk space. When I went into the machine using my login, I could see the "This PC" page shows the c:\ drive is full. But when I select all folders inside c:\ and check the properties, it doesn't shows the same disk usage!!!

Which folder is filled up? Any magic going on here?

It was interesting as the properties shows low usage where drive shows full even I am admin in the load agent machine.

Fortunately, we have Microsoft support and they were in the meeting. They pointed out to windirstat tool which shows where the files and who use the space in reality.

It was the temp folder of the user who runs the VSTS load test agent service. The location is below which is filled by the load test runs.

C:\Users\<user who runs the VSTS load test service>\AppData\Local\Temp\

It seems many people faced the same drive and folder size mismatch in different scenarios. Some experiences are here and here.

Tuesday, May 7, 2019

Node-PlantUML package to automate generation of mind map diagrams

Recently I was doing some experiments with PlantUML. PlantUML is awesome. Write code and get diagrams. It is especially great tool for engineers like me who are not that good at art. One of the main trend now a days is to automate anything and obviously we can automate diagram generation from puml file. The technology to do is obviously NodeJS.

Package to generate images from .puml files

As normal in Node world, NodeJS has a package to convert puml files into images and much more. Below is the link to the NPN package.
https://www.npmjs.com/package/node-plantuml

From here, this post assumes the reader has knowledge about NodeJS and its NPM package ecosystem.

A simple problem

If there is no problem this post is not needed. The node-plantuml package generated the diagrams for most for the puml files. But not for the mind map pumls. ie which are starting with @startmindmap.

@startmindmap
* root
** 1
**_ 3
@endmindmap

Root cause

The node-plantuml uses plantuml.jar file to bundle with it is not the latest. That jar file don't have the capability to understand @startmindmap. So it fails.

Solution

node-platuml package to include the latest plantuml.jar file. As a responsible developer, who make use of other's open source work, I should give them a PR which I will be doing soon.

Workarounds

Set the path of plantuml.jar

As any other developer, the developers of this package also left option for configuring the path to jar file. They selected the environment variable mechanism to set the path. The name of environment variable is PLANTUML_HOME. If the environment variable is not there, it will take jar file from vendor folder. Below line is from the node-plant package source code.

The code to set the environment variable to a different .jar file is as follows. If we obtain the latest plantuml.jar file, we can place it somewhere and point to the same.

process.env.PLANTUML_HOME = "node_modules/node-plantuml/vendor/plantuml_new.jar";

If there are any troubles setting the path, try the below approach.

Replace old file with new

Get the latest plantuml.jar file from VS code extension called PlantUML. The location of the platuml.jar file is below.

%userprofile%\.vscode\extensions\jebbs.plantuml-2.10.9

Put the file into the node-plantuml modules folder. Location below.

..\node_modules\node-plantuml\vendor