Tuesday, August 1, 2017

Architecting Azure limits - Azure Search

Azure evangelism often projects the platform as a place where we can scale to infinity. Some go further and make scale sound like magic: just put the code in Azure and it scales automatically, without us worrying about anything. That is not entirely true. When we design on the Azure platform, we are limited by the scaling characteristics of each service. That often requires us to design the application, or its support tools, around the nature of the Azure services we selected. To do that efficiently we need to know the limits of those services. In other words, our Azure architecture revolves around Azure limits.

Let's take Azure Search in this post. Please note that the Azure platform may change these limits in the future, so the numbers here might not stay accurate for long.

Limits of Azure Search

Total services per subscription - 43 (within this there are per-tier limits, e.g. only 6 services in the S3 tier)

Data storage

Maximum number of indexes in search service - 200 (S2 & S3 tiers)
Partitions per service - 12
Max partition size - 200 GB (S3 tier)
Together, the two limits above define the max index size = 200 GB * 12 = 2.4 TB
Maximum documents - 120 million per partition, or about 1.4 billion per service
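
As a quick check of the storage arithmetic above, here is a minimal sketch in Python. The constants are the S3-tier numbers quoted in this post, which may be out of date; the function names are made up for illustration and are not part of any Azure SDK.

```python
# Limits quoted in this post (S3 tier, 2017); they may have changed since.
PARTITIONS_PER_SERVICE = 12
MAX_PARTITION_SIZE_GB = 200
MAX_DOCS_PER_PARTITION = 120_000_000


def max_index_size_gb(partitions=PARTITIONS_PER_SERVICE):
    """Upper bound on index storage for the given number of partitions."""
    return partitions * MAX_PARTITION_SIZE_GB


def max_documents(partitions=PARTITIONS_PER_SERVICE):
    """Upper bound on document count for the given number of partitions."""
    return partitions * MAX_DOCS_PER_PARTITION


print(max_index_size_gb())  # 2400 GB, i.e. 2.4 TB
print(max_documents())      # 1440000000, i.e. about 1.4 billion
```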

Query 

Replicas per service - 12 (S1, S2 & S3)
Estimated queries per replica - 60/second
Together, the limits above cap a service at 12 * 60 = 720 queries per second.
The latest limits can be found at the link below.
https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits#search-limits

This clearly indicates that we cannot build another Google using one subscription. Maybe Azure will increase the limits later. One advantage is that the index applies some kind of compression, so the source data size does not translate directly to index size.

Azure Search in SaaS / multi-tenancy scenarios

If we really want isolation of tenant data, we should put each tenant's data into its own service. But there is a limit of 6 services in the higher tiers. The next level of isolation is one search index per tenant. Each search index has its own URL, which gives better isolation. But with this approach we are limited to 200 indexes per service. When we get the 201st tenant, we have to add one more service. Handling that takes either a change in the application or a support tool that knows the limits of the Azure Search service.
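
To make the index-per-tenant limit concrete, here is a minimal sketch of the bookkeeping an application or support tool would need. It assumes tenants are numbered from 1 in the order they sign up and that services are filled one after another; `place_tenant` is a hypothetical helper, not an Azure API.

```python
MAX_INDEXES_PER_SERVICE = 200  # S2/S3 limit quoted in this post; may change


def place_tenant(tenant_number):
    """Map a 1-based tenant number to (service_number, index_slot), both 1-based."""
    if tenant_number < 1:
        raise ValueError("tenant_number must be >= 1")
    service_number, index_slot = divmod(tenant_number - 1, MAX_INDEXES_PER_SERVICE)
    return service_number + 1, index_slot + 1


print(place_tenant(200))  # (1, 200): last slot of the first service
print(place_tenant(201))  # (2, 1): the 201st tenant needs a second service
```

A real tool would also have to provision the new service and remember which tenant lives where; the sketch only shows where the 200-index boundary falls.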

The last resort is to tag the documents by tenant, but that is less secure. More design guidelines related to multi-tenancy can be found below.

If we distribute the data across multiple indexes, the querying has to change, since the out-of-the-box query targets a single index. If we want to search for one term in multiple indexes, it translates to one HTTP call per index.
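
A minimal sketch of that fan-out is below. It only builds the request URLs, one per index, and leaves the HTTP calls and the merging of results to the caller. The URL shape follows the Azure Search REST convention of addressing documents under an index; the `api-version` value, the service name and the index names are assumptions for illustration, and a real request also needs an `api-key` header.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2016-09-01"  # assumption: use whichever version your service supports


def build_query_urls(service_name, index_names, search_term):
    """Return one search URL per index for the same search term."""
    query = urlencode({"api-version": API_VERSION, "search": search_term})
    return [
        f"https://{service_name}.search.windows.net/indexes/{quote(name, safe='')}/docs?{query}"
        for name in index_names
    ]


# Hypothetical service and index names.
urls = build_query_urls("my-service", ["tenant-a", "tenant-b"], "red bike")
for url in urls:
    print(url)
```

Two indexes mean two URLs and therefore two HTTP calls; the application has to combine and rank the results itself.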

The Azure Search service has some really nice features that would otherwise be difficult to code ourselves. But like any other abstraction, it comes with limits. We should be careful when designing the application not to fall for marketing materials and magic claims.

Is Azure Search really pay per use?

Since it is offered as PaaS, we may assume that it is pay per use. But that is not true for Azure Search. Once we provision a search service, it starts charging regardless of usage (data stored and search queries issued). More details on pricing can be found at the link below.


The advantage here is that we get one more level of scaling inside the service, within its defined pricing tier. The unit of scaling inside a search service is called a search unit.

Is Azure Search auto scaling?

Not fully automatic. When we create a search service it gets a default search unit. When we want to scale, we have to increase the search units explicitly. This is more knowledge that our application or support tool must have in order to scale. Otherwise human intervention is required when a scaling need arises, such as more data being added or more users being expected in a busy season. More details below.


Put simply: when the data volume increases, increase the partitions; when the query load increases, increase the replicas, since the query limit is per replica.
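
That rule of thumb can be written down as a small sizing helper. This is only a sketch under the limits quoted earlier in this post (200 GB per partition, roughly 60 queries per second per replica, at most 12 of each); `required_scale` is a hypothetical function, and the service may impose further constraints on valid partition and replica combinations that it ignores.

```python
import math

# Limits quoted earlier in this post; they may have changed since.
MAX_PARTITION_SIZE_GB = 200
QUERIES_PER_REPLICA_PER_SECOND = 60
MAX_PARTITIONS = 12
MAX_REPLICAS = 12


def required_scale(data_gb, peak_qps):
    """Return (partitions, replicas) needed for the given data size and query load."""
    partitions = max(1, math.ceil(data_gb / MAX_PARTITION_SIZE_GB))
    replicas = max(1, math.ceil(peak_qps / QUERIES_PER_REPLICA_PER_SECOND))
    if partitions > MAX_PARTITIONS or replicas > MAX_REPLICAS:
        raise ValueError("load exceeds what a single search service can handle")
    return partitions, replicas


print(required_scale(500, 150))  # (3, 3)
```

An application or support tool could run a check like this against current usage and raise an alert, or call the management API, before the limits are hit.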
