Published on: March 8, 2010

Once again it is that time: T-SQL Tuesday.  This month it is hosted by Mike Walsh, who has left the topic open for interpretation, as was done last month.  So this month we get to incorporate IO into our articles however we wish – so long as we can tie it back to SQL somehow.  This time, I will be discussing a topic I have already blogged about once.  I had planned a three-part series; instead, I will insert this little excerpt into that series as part 1.5.  My intent in that series was to evaluate several different solutions, compare them, and eventually present my own solution for comparison.

In this article, I hope to bridge a gap created by some information omitted from the first article: a comparison of the IO impact of a poorly performing query versus a quicker, better performing query.  Some setup for this is necessary.  I needed a routine to monitor file growth (since this is about IO, after all).  I elected to use a method I have used in the past, which is the crux of an article to be published on SQLServerCentral on March 11.  Because of that publication, I will not be discussing the method here.
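While the actual tracking method is reserved for the SQLServerCentral article, a minimal sketch of one way to snapshot file sizes (so that growth can be compared between runs) might look like the following.  This is an illustrative assumption, not the author's method:

```sql
-- Hypothetical sketch only: snapshot the current size of each data and log
-- file in the database.  Run before and after a test and diff the results.
SELECT  DB_NAME()       AS DatabaseName,
        name            AS LogicalFileName,
        type_desc       AS FileType,      -- ROWS or LOG
        size * 8 / 1024 AS SizeMB        -- size is reported in 8KB pages
FROM    sys.database_files;
```

Capturing this into a tracking table on a schedule would yield the growth timelines shown in the tables below.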


After establishing the tracking for these experiments, I attempted to create a one-billion-row numbers table.  This table was meant to generate plenty of IO as a baseline for comparison and testing.  The creation failed for lack of hard drive space after running for about an hour, and the transaction rolled back.  I was still able to capture some vital information, though.  We can see the growth and timeline for the first and final entries of this attempt in the following table.
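The original creation script is not reproduced here, but a common pattern for building a large numbers table (and one that generates exactly this kind of IO) is to cross join a system view with itself and cap the result with TOP.  A sketch, assuming a table named dbo.Numbers:

```sql
-- Hypothetical sketch of a numbers-table build.  Each CROSS JOIN multiplies
-- the row count, and TOP caps the output at the desired size.
SELECT TOP (10000000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
INTO   dbo.Numbers
FROM   sys.all_columns AS a
CROSS JOIN sys.all_columns AS b
CROSS JOIN sys.all_columns AS c;

-- A clustered primary key makes subsequent range scans efficient.
ALTER TABLE dbo.Numbers
  ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (N);
```

At one billion rows, both the table itself and the logging of the SELECT INTO drive the data- and log-file growth recorded above.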

Notice that the data file grew from 123MB to 5461MB and the log grew from 120MB to 17207MB.  This also caused 53 log growths under the current log growth setting (10%).  The observed IO strain on my laptop was substantial but unmeasured beyond the above table.  Under these conditions, tempdb grew by only about 5MB total.

Since the one-billion-row table was a no-go, I scaled back to a 10-million-row test.  Populating the table took 2 minutes 18 seconds this time.  The table below shows the database growth and log growths for this test.

Now, with a numbers table in place, I can proceed to the next step in this test.  I will use some of the scripts from the FizzBuzz article to demonstrate some of the IO associated with this table (beyond the IO required to create it).
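The specific scripts from the FizzBuzz article are not shown here, but a representative FizzBuzz-style query against the numbers table, with IO statistics enabled so the reads are visible, might look like this (dbo.Numbers is an assumed table name):

```sql
-- Hypothetical FizzBuzz-style query against the numbers table.
-- SET STATISTICS IO ON reports the logical and physical reads incurred.
SET STATISTICS IO ON;

SELECT N,
       CASE WHEN N % 15 = 0 THEN 'FizzBuzz'
            WHEN N % 3  = 0 THEN 'Fizz'
            WHEN N % 5  = 0 THEN 'Buzz'
            ELSE CAST(N AS VARCHAR(10))
       END AS FizzBuzz
FROM   dbo.Numbers
WHERE  N <= 1000000;

SET STATISTICS IO OFF;
```

Every row returned must first be read from the table on disk (or from cache), so the reads reported by STATISTICS IO scale with the size of the range scanned.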

And another query to demonstrate IO.

And here are the specific IO statistics for this query.

Notice that the IO is better, and might even be considered acceptable for 1 million rows.

The IO-less Solution

What if we could create a numbers table without IO?  What if we could perform several different kinds of queries without IO?  Itzik Ben-Gan proposed a cascading CTE solution that does exactly this sort of thing.  Since then, I have seen other solutions that perform just about as well, also without the IO.  I will demonstrate the cascading CTE solution here.
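The original script is not reproduced here, but the cascading CTE pattern popularized by Itzik Ben-Gan looks like the following: each CTE level cross joins the previous one, so the row count squares at every step, and no base table is ever touched on disk.

```sql
-- Cascading CTE numbers generator: rows are manufactured in memory,
-- so STATISTICS IO reports no reads against any table.
WITH L0 AS (SELECT 1 AS c UNION ALL SELECT 1),                -- 2 rows
     L1 AS (SELECT 1 AS c FROM L0 AS a CROSS JOIN L0 AS b),   -- 4 rows
     L2 AS (SELECT 1 AS c FROM L1 AS a CROSS JOIN L1 AS b),   -- 16 rows
     L3 AS (SELECT 1 AS c FROM L2 AS a CROSS JOIN L2 AS b),   -- 256 rows
     L4 AS (SELECT 1 AS c FROM L3 AS a CROSS JOIN L3 AS b),   -- 65,536 rows
     L5 AS (SELECT 1 AS c FROM L4 AS a CROSS JOIN L4 AS b),   -- 4 billion+ rows
     Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
              FROM   L5)
SELECT N
FROM   Nums
WHERE  N <= 10000000;   -- only as many rows as requested are materialized
```

Because the optimizer stops generating rows once the TOP/WHERE limit is satisfied, the same query scales from a handful of rows to billions without any change.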

The IO statistics show no disk IO for this query.  On my laptop, execution time is quite similar to that of the last query, even though this query builds the numbers table and then runs a query against it.


We have seen how some queries can create a substantial amount of IO and disk pressure when run.  With a little work, we can relieve that IO pressure and come up with a better query.

2 Comments
  1. […] Jason Brimhall reminds us that reducing the IO cost of a query is a worthwhile exercise. He uses a fun example to illustrate this to us. Check out his post and go hunt for the higher IO queries in your own environment. You can do it! […]

