IO IO IO

Comments: 2 Comments
Published on: March 8, 2010

Once again it is that time.  It is TSQL Tuesday.  This time it is being hosted by Mike Walsh.  Mike has decided to leave the topic open for interpretation like was done last month.  So this month we get to incorporate IO into our articles however we wish – so long as we can tie it back to SQL somehow.  This time, I will be discussing a topic that I had already once blogged.  I had planned on a three part series.  Instead, I will write this little excerpt into that series as part 1.5.  In that series, my intent was to evaluate several different solutions and compare them, eventually showing my solution and compare it.

In this article, I hope to bridge a gap that I realized would occur due to some omitted information in the first article.  Thus a comparison of the IO impact of a poorly performing query and a quick and better performing query.  Some setup for this is necessary.  I needed a routine to monitor file growth (since this is about IO afterall).  I elected to use a method that I have used in the past that is the crux of an article that is to be published on SQLServerCentral March 11.  Due to the publication, I will not be discussing that method here.

Baseline

After establishing the tracking for these experiments, I attempted to create a 1 billion row numbers table.  This table was to establish plenty of IO as a baseline for comparison and testing.  The table creation failed due to lack of hard drive space.  The table creation failed after running for about an hour and then rolled back the transaction.  I was able to capture some vital information though.  We can see the growth and timeline for the first entry and final entry of this attempt, in the following table.

Notice that the data file grew from 123Mb to 5461MB and the Log grew from 120MB to 17207MB.  This also caused 53 log growths with the current log growth settings (10%).  The observed IO strain on my laptop was substantial but unmeasured beyond the above table.  Under these conditions, the tempdb grew by only about 5MB total.

Since the 1 billion row table was a nogo, I reset to test for a 10 million row test.  The table population took 2m18sec this time.  The below table shows the database growth and LogGrowths for this test.

Now with a numbers table, I can proceed with the next step in this test.  I will use some of the scripts from the FizzBuzz article to demonstrate some of the IO associated with this table (beyond the IO required to create it).

And another query to demonstrate IO.

And here is the specific IO stats for this query.

Notice that the IO is better, and might even be considered acceptable for 1 million rows.

The IO-less Solution

What if we could create a numbers table without IO?  What if we could perform several different kinds of queries without IO?  Itzik Ben Gan proposed a cascading CTE solution that does this sort of thing.  Since then, I have seen other solutions that can perform just about as well and without the IO as well.  I will demonstrate the cascading CTE solution here.

IO statistics show no Disk IO for this query.  Execution is quite similar on my laptop to the last query – but this query builds both the numbers table and then runs a query against it.

Conclusion

We can see how some queries can create a substantial amount of IO and disk pressure when run.  A little work, and we can relieve this IO pressure and come up with a better query.

Worst Thing to Do to your DBA

Categories: News, Professional
Tags: ,
Comments: No Comments
Published on: March 5, 2010

Recently Andy Warren Blogged about things that could drive your DBA mad.  There was a lot of feedback on SQLServerCentral about various things that could be done to drive the DBA nuts.  Well, I have one that could drive a DBA mad (this one in particular) and any IT professional for that matter.

What drives me Mad is not the things done in ignorance, so much.  It is the malicious things that drive me batty.  Malicious acts or attacks against a Database, or computer, are downright dirty.  They take a substantial amount of resources, time and energy to correct.

Why did I pick this one?

I think that any who is the Computer geek of the family (or who has friends) can empathize on this one.  Recently I was enlisted to assist with a family computer.  The computer was borrowed and returned with nasty malware and a virus.  It was a tricky virus in that it disabled virus scanners and disabled the layers of spyware removal apps I had previously installed.  Several hours later and still working on removing the nasty bugger.  It is somewhat difficult to do this from across the internet – thus I will be making a trek to the computer to pay a repair visit.

Personally, I have only gotten a virus twice on a personal machine.  Both times it really ticked me off.  The first time was when I connected via VPN to the work network.  The network was infected already and downloaded the virus to my machine.  That was easily eradicated.  The second time was when I connected a media player to my laptop.  The player had a proprietary format and thus I was converting the files from owned media to be usable on the media player.  This was my nephew’s media player and I have no idea where else he had plugged it in.  As soon as I plugged it into my laptop, the virus was uploaded.  I figured the media player was causing the quirky behavior but decided to run a virus scan anyway.  That was annoying.  Of all places to get a virus – from a media player.

There are people out there that get a kick out of writing these kinds of apps.  It takes a certain level of talent to do it – but I don’t much like the way it is being used.  This is the type of stuff that drives a DBA mad.

Index your FK

Categories: News, Professional, Scripts
Comments: No Comments
Published on: March 4, 2010

A few months ago I read an article from SQLServerCentral.com about some Foreign Key gotchas.  Since that article, I have seen several people asking about whether a foreign key should be indexed or not.  I have seen the benefits of having FKs indexed as well as the agony of not indexing them.  The article discusses some of the benefits.  In summary, it will help to avoid deadlocks, reduce locks, and improve performance.  My intent is not to go into depth on the article, and to focus on a couple of scripts to help identify which tables need indexes on the foreign keys.  This article is a part of my Key documentation series.

If you browse the internet, you will likely be able to find several scripts to accomplish the same goal.  There is one script that I have seen that will generate the script for you to create the index on the foreign key.  Many scripts are similar – some perform slightly different things.  I thought about evaluating some of the scripts and comparing them.  I have found some to be less accurate than others.  I decided after some thought that I would just share my scripts.  I will show the evolution of the script that I now use.

Rev 1

This was just to hammer out a solution that would get me the results I sought.  This version uses some objects that are to be deprecated.  Thus should only really be used on Servers that are SQL 2000.

[codesyntax lang=”tsql”]

[/codesyntax]

This works fine and with limitations.  Obviously there was the limitation of using objects that are scheduled to be deprecated.  The second limitation is that it is not 100% accurate.  This script does not pull all of the Foreign Keys that are missing Indexes.  This is the same sort of issue that I found with several scripts on the internet.  That said, this query is rather efficient.  When compared to future versions, it was 8% less costly and about 150ms faster.  However, it does consume more disk IO and more CPU time.

With these limitations in mind, I decided to create a second query.

Rev 2

[codesyntax lang=”tsql”]

[/codesyntax]

This query is less limited than the first.  Here, I have improved the performance substantially over the first query and the execution plan cost is lower.  I have implemented the use of SQL 2005 objects, decreased Disk IO, and decreased run time.  CPU time remains about the same as the previous example.  Still one limitation that is pretty glaring.  I am not returning all Foreign Keys that are missing an index.

Moving onto attempt number three.

Rev 3

As I was testing version three of this query, I noticed there was another item that I needed.  I ran the tests in a database with multiple schemas and had difficulty locating the tables that were in the list.  When looking at the expanded view in SSMS, the tables are grouped by schema and listed alphabetically within that schema.  When I realized where the table was, it dawned on me the need to include the schema in the query results.  Adding this little piece of information will save a few seconds when trying to verify the information in the report.

[codesyntax lang=”tsql”]

[/codesyntax]
In the code, note that I have opted to use the schema_name() function rather than use the Join on the sys.schemas view.  By using the function, the script performs a little better.  There is another addition I made to the script.

[codesyntax lang=”tsql”]

[/codesyntax]

An additional condition was added to the Join.  With the addition of this condition, I am now able to pull back the additional foreign keys that are missing indexes.  In addition to this additional condition, I have added an Order By, which will slow it down a bit.  The offset to the decrease there is that the data is more manageable for verification.

Conclusion

There are many methods to arrive at this data.  The difference in performance between the methods I wrote and evolved was negligible.  The most important concept in the end is having a solution that is accurate and timely to provide this data.  All of these solutions finish in under a second for me.  None of them place a significant impact on the server, and thus I can choose for accuracy over performance in this instance.

This exercise is a part of getting to know your database.  Finding and documenting is very helpful when it comes time to troubleshoot.  Once this information is garnered, one needs to evaluate whether or not to place an index on that foreign key.

page 2 of 2»

Calendar
March 2010
M T W T F S S
« Feb   Apr »
1234567
891011121314
15161718192021
22232425262728
293031  

Welcome , today is Tuesday, July 16, 2019