As I mentioned in many of my previous posts, it's very difficult for me to learn something without seeing how it's done internally. For example, you can see how I explored how the .NET GC works in one of my previous posts. This time I am trying to learn how SQL Server stores its data internally.
Where does SQL Server store our tables and records?
As everybody knows, it's on disk. But in which file, and where is it located? Each database needs at least two files (a data file and a transaction log file). We can see the file paths in the properties of the database, or query them, for example from the sys.database_files catalog view.
How the data records and tables are organized
We can see that the data is stored in normal files with the extensions .mdf, .ndf and .ldf. Does that mean we can open them in Notepad and read them? Does SQL Server just open the file and write into it, the way we did in our C/C++ labs in college?
Absolutely not. SQL Server is production-grade software, so it cannot work like academic code. It has several more levels that optimize storage for maximum performance. One level is the filegroup: we can put more than one file in a group and associate the group with a partition. Another level is the page. SQL Server treats the page as the atomic unit of storage, and the page size is 8 KB. It performs I/O operations such as reads and caching at page level only. Even if we need just one record from a page, it reads the entire page.
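To make the "whole page or nothing" idea concrete, here is a minimal Python sketch. It is not how SQL Server is implemented; it only assumes that pages are 8 KB and sit one after another in the data file, so page N starts at byte N * 8192. The helper names and the 5-byte record at byte 100 are made up for illustration (a real page also has a header and a row offset table, which are ignored here).

```python
import io

PAGE_SIZE = 8 * 1024  # SQL Server page size: 8 KB = 8192 bytes


def page_offset(page_id: int) -> int:
    """Byte offset at which a page starts inside a data file."""
    return page_id * PAGE_SIZE


def read_page(data_file, page_id: int) -> bytes:
    """Read one whole page; there is no smaller unit to ask for."""
    data_file.seek(page_offset(page_id))
    page = data_file.read(PAGE_SIZE)
    if len(page) != PAGE_SIZE:
        raise ValueError(f"page {page_id} is not fully present in the file")
    return page


def read_record(data_file, page_id: int, start: int, length: int) -> bytes:
    """Fetch one record: the full page is read first, then the record is sliced out."""
    page = read_page(data_file, page_id)
    return page[start:start + length]


# Toy in-memory "data file" with 4 pages.
# A hypothetical 5-byte record is placed at byte 100 of page 2.
toy = bytearray(4 * PAGE_SIZE)
toy[page_offset(2) + 100:page_offset(2) + 105] = b"hello"
toy_file = io.BytesIO(bytes(toy))

record = read_record(toy_file, page_id=2, start=100, length=5)
print(page_offset(2), record)
```

Even though only 5 bytes are wanted, read_record always pulls the full 8192 bytes first, which is the behaviour described above.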
Let's get into how the records are stored. As we know, the physical storage order of records in a SQL Server table is based on the clustered index, and by default the primary key becomes the clustered index. We cannot have more than one physical storage order for the data records. That is why only one clustered index is allowed per table.
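A toy way to picture this in Python: keep the rows themselves in one list that is always sorted by the key. This is only a sketch of the idea. SQL Server uses a B-tree of pages, not a Python list, and the table contents and function names here are invented for the example.

```python
import bisect

# Toy table: the rows themselves live in one list, ordered by the clustered key (id).
rows = []  # each row is a tuple (id, name)


def insert_row(row_id: int, name: str) -> None:
    """Insert a row at the position its key dictates, keeping the list sorted."""
    bisect.insort(rows, (row_id, name))


def find_row(row_id: int):
    """Binary search on the clustered key; returns the row, or None if absent."""
    i = bisect.bisect_left(rows, (row_id,))
    if i < len(rows) and rows[i][0] == row_id:
        return rows[i]
    return None


# Rows arrive in arbitrary order but are stored in key order.
insert_row(30, "Zoya")
insert_row(10, "Meera")
insert_row(20, "Arun")

print(rows)
print(find_row(20))
```

Because the list holds the actual rows, it can be sorted by only one key at a time, which mirrors the "only one clustered index" rule.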
How to inspect SQL Server pages
But there is also something called a non-clustered index. If the records cannot be physically stored in more than one order, how do these indexes help us? They are separate data structures that describe the order of the rows in a different way. Before going into how non-clustered indexes work, let's get a full understanding of how the clustered index works and how to see the data inside a page.
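As a rough Python picture of "a different data structure", the sketch below keeps the rows in one list ordered by id and builds a second, separate list ordered by name whose entries only point back to the id. This is an illustration under simplifying assumptions: both structures are plain sorted lists instead of B-trees, and all names and data are invented.

```python
import bisect

# The rows are stored once, ordered by the clustered key (id).
rows = [(10, "Meera"), (20, "Arun"), (30, "Zoya")]

# Separate structure: (name, id) pairs sorted by name.
# It holds no full rows, only the key needed to find them.
name_index = sorted((name, row_id) for row_id, name in rows)


def find_by_id(row_id: int):
    """Look up a row through the clustered order."""
    i = bisect.bisect_left(rows, (row_id,))
    if i < len(rows) and rows[i][0] == row_id:
        return rows[i]
    return None


def find_by_name(name: str):
    """Search the secondary index first, then follow the id back to the row."""
    i = bisect.bisect_left(name_index, (name,))
    if i < len(name_index) and name_index[i][0] == name:
        return find_by_id(name_index[i][1])
    return None


print(name_index)
print(find_by_name("Arun"))
```

The rows never move: the second list gives another ordering without a second physical copy of the data, and we can build as many such lists as we like.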
I am glad to say that people before me already thought the same way and did the hard work of explaining the storage with good pictures. So why should I do the task again? I just read their blogs and understood how it works, so I am sharing the same via my blog.
Below is the blog post where the storage is explained using the undocumented SQL Server commands DBCC IND and DBCC PAGE.
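Independently of that post, here is a small Python sketch that only builds the T-SQL text for the two commands, so the parameters are easy to see. Both commands are undocumented, so treat the parameter meanings in the comments as the commonly described usage and not as an official contract. The database name, table name and page number are placeholders, and the helper functions are my own.

```python
def dbcc_ind(database: str, table: str, index_id: int = 1) -> str:
    """Command that lists the pages of an index (index_id 1 = clustered index)."""
    return f"DBCC IND ('{database}', '{table}', {index_id});"


def dbcc_page(database: str, file_id: int, page_id: int, print_option: int = 3) -> str:
    """Command that dumps one page (print_option 3 = header plus per-row details)."""
    return f"DBCC PAGE ('{database}', {file_id}, {page_id}, {print_option});"


# Trace flag 3604 sends DBCC output to the client instead of the error log.
TRACE_ON = "DBCC TRACEON (3604);"

# Placeholder names: 'StorageDemo' and 'dbo.Customers' are hypothetical,
# and page 89 stands for a page number taken from the DBCC IND output.
# The strings are built without escaping, so this is for manual exploration only.
script = "\n".join([
    TRACE_ON,
    dbcc_ind("StorageDemo", "dbo.Customers"),
    dbcc_page("StorageDemo", file_id=1, page_id=89),
])
print(script)
```

The printed script can be pasted into a query window on a test database: DBCC IND gives the file and page numbers, and DBCC PAGE then shows what is inside one of those pages.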
My main interest was index fragmentation, so I did some more research on it and am preparing my own post where we can see how fragmentation is created and how it can be fixed.