SQLFool IDX Defrag Scripts

Categories: News, Professional, Scripts
Comments: No Comments
Published on: January 26, 2010

I just found out that Michelle Ufford (@sqlfool) has updated her Index Defrag scripts.  Right now she is looking for beta testers.  Anyone who is interested can find more info at:

http://sqlfool.com/2010/01/index-defrag-script-updates-beta-testers-needed/

Indexes Out the Window II

Categories: News, Professional
Comments: 2 Comments
Published on: January 25, 2010

In Part I, I discussed the troubleshooting done in relation to a peculiar execution plan.  In this post, I will discuss some of what I did to further tune the query.

First, a little more background on the design of this query.  The query uses dynamic SQL to build a where clause based on the value passed to a single parameter.  I need to be able to create three different where clauses, each of which could essentially create a range-scan on a date field.  Two of the three cases require an inequality date comparison rather than an equality comparison.  The specific requirement is for the report to return one of the following: 1) 2010 data only, 2) < 2010 data, or 3) all data.  This was initially accomplished with a query similar to:

[codesyntax lang=”sql” title=”Initial Where”]

[/codesyntax]
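As a rough sketch of that approach (the @DateSelection parameter, table name, and column list here are assumptions for illustration, not the production code):

[codesyntax lang=”sql” title=”Initial Where (sketch)”]
DECLARE @DateSelection VARCHAR(10);
DECLARE @sql NVARCHAR(MAX);

SET @DateSelection = '2010Only';  -- hypothetical parameter values: 2010Only, Pre2010, All
SET @sql = N'SELECT SalesOppID, RecordDate FROM dbo.SalesOpportunity ';

IF @DateSelection = '2010Only'        -- 1) 2010 data only
    SET @sql = @sql + N'WHERE RecordDate >= ''20100101''';
ELSE IF @DateSelection = 'Pre2010'    -- 2) < 2010 data
    SET @sql = @sql + N'WHERE RecordDate < ''20100101''';
-- 3) all data: no WHERE clause is appended

EXEC sp_executesql @sql;
[/codesyntax]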

One of the first comments might be, “That query is not SARG-able.”  True, it is not SARG-able, but that will be the last item to be evaluated.  The slowness of the query in this case has less to do with its SARG-ability than with the inequality scan.  The puzzle was how to make the query an equality comparison rather than a < or > comparison.  The above query uses the indexes as discussed in the prior post; it is just not as fast as it could be.

It is at this point that I am reminded of an integral part of my last two presentations at my local PASS meetings, and it dawns on me that an understanding of the data is truly necessary.  Most of the RecordDate values are NULL since there was no backfill on the column after its implementation, and prior to its implementation there was no date control on the table.  The table is in a warehouse and only contains data since the beginning of 2009.  It is also necessary to understand that there is minimal chance of record additions to the table after 2010.  Thus we now have a method to get the equality comparisons.  The following might satisfy this need:

[codesyntax lang=”sql” title=”Next Rev”]

[/codesyntax]
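The idea, sketched below: since every pre-2010 row has a NULL RecordDate (the column was never backfilled) and nothing is expected after 2010, the two inequality branches collapse into NULL tests (names are again assumptions):

[codesyntax lang=”sql” title=”Next Rev (sketch)”]
DECLARE @DateSelection VARCHAR(10);
DECLARE @sql NVARCHAR(MAX);

SET @DateSelection = '2010Only';
SET @sql = N'SELECT SalesOppID, RecordDate FROM dbo.SalesOpportunity ';

IF @DateSelection = '2010Only'        -- only rows written since the column existed carry a date
    SET @sql = @sql + N'WHERE RecordDate IS NOT NULL';
ELSE IF @DateSelection = 'Pre2010'    -- older rows were never backfilled, so they are all NULL
    SET @sql = @sql + N'WHERE RecordDate IS NULL';
-- all data: still no WHERE clause

EXEC sp_executesql @sql;
[/codesyntax]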

This change now uses a Clustered Index Seek.  That seek is the most costly piece of the execution plan; however, the overall runtime of the query has improved by one second.

Thinking I could still improve the query, I decided that a new column called ReportYear would help the cause.  The change would allow the query to be SARG-able: the new column would be populated with the first day of the year recorded in the RecordDate field.  I figured this idea would also still permit me to do an equality search in my where clause.
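One way to sketch the idea is a persisted computed column (a column default cannot reference RecordDate, so the first-of-year value has to come from a computed column or from the load process):

[codesyntax lang=”sql” title=”ReportYear (sketch)”]
-- first day of RecordDate's year, e.g. 2010-03-15 becomes 2010-01-01
ALTER TABLE dbo.SalesOpportunity
    ADD ReportYear AS DATEADD(YEAR, DATEDIFF(YEAR, 0, RecordDate), 0) PERSISTED;

-- the where clause then becomes a SARG-able equality:
-- WHERE ReportYear = '20100101'
[/codesyntax]

Let’s see the results: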

Wow, I have just come full circle back to the first post in this series.  I just lost the Clustered Index Seek, but I did not lose any of the performance.  To see if I gained anything by making the query SARG-able, let’s look at the IO stats between the two.

First IO Stats

Second IO Stats

The top image represents the first query and the bottom image represents the second query.  We have an improvement of about 600,000 logical reads.  That is a substantial improvement, and it comes even with an Index Scan on a non-covering index.  Now, to tune the index.  Much the same as in the first article, I added an index to the SalesOpportunity table on LastUpdate and SalesOppID.  This creates a covering index and replaces the hash match join with a merge join in the execution plan.  Another effect shows in the IO stats displayed below.

Second Query Index

We can see here that there was once again an improvement in the logical reads, another ~10,000 or so.  Overall, both queries run in the same amount of time, but the second query uses fewer resources.  Again, if I change the clustered index on this table, I am certain I would see an improvement.

I am still in search of the reason behind the use of a bad index for this query.

Execution Plans can be downloaded here.

Indexes Out the Window

Categories: News, Professional
Comments: 1 Comment
Published on: January 22, 2010

I ran into a peculiarly puzzling situation with a query I have been tuning of late.  The query runs some aggregations and performs a range scan based on a date value passed to the query.  The primary target for this query is to be consumed by Reporting Services.  The query executes (from a stored proc) three times in a single report.  This is another case where the report currently outperforms the previous version severalfold.  I just want it to run a little faster; it is an executive report after all.

While tuning the query itself, I removed some of the aggregations (not consumed by the report and not needed for accuracy), and that bought me a small performance gain.  Next was to remove a couple of “or” operations.  Again, that only bought me a small performance gain.

The next step was to check my indexing.  Looking at the execution plan for the query, indexes were being used, or so it seemed.  The query was performing two Index Scans prior to a Hash Match.  In the first Index Scan, the query was expecting an output of two columns (SalesOppID and LastUpdate) from IDX_SalesOpportunity_SalesStatusIDSalesContactID.  This is depicted in the next two images.

This makes a little sense since the index does contain the LastUpdate column.  But the index is not covering, and we do not have a Key lookup.

[codesyntax lang=”sql” title=”IDX_SalesOpportunity_SalesStatusIDSalesContactID”]

[/codesyntax]
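Judging by the index name and the columns discussed, the definition was along these lines (a sketch, not the actual DDL; the key order and the INCLUDE are assumptions):

[codesyntax lang=”sql” title=”IDX_SalesOpportunity_SalesStatusIDSalesContactID (sketch)”]
CREATE NONCLUSTERED INDEX IDX_SalesOpportunity_SalesStatusIDSalesContactID
    ON dbo.SalesOpportunity (SalesStatusID, SalesContactID)
    INCLUDE (LastUpdate);  -- contains LastUpdate, yet does not cover the query
[/codesyntax]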

Out of curiosity, I decided to see what would happen if I commented out the lines that required the LastUpdate field.  I ended up with the following:

Interestingly, the query optimizer decided to use IDX_LeadTypeID, which has the following definition:

[codesyntax lang=”sql” title=”IDX_LeadTypeID”]

[/codesyntax]
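Presumably something as simple as this (sketched from the name; table and column are assumptions):

[codesyntax lang=”sql” title=”IDX_LeadTypeID (sketch)”]
CREATE NONCLUSTERED INDEX IDX_LeadTypeID
    ON dbo.SalesOpportunity (LeadTypeID);
[/codesyntax]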

Once again, I have a query that is using a bad index.  There is no key lookup and the index is not covering.  I find this very peculiar, so I proceeded to test a little more.  I dropped that index and reran the query.  It is important to note that with each group of tests I cleared the procedure cache.  After dropping the index, my execution plan showed no performance increase and still showed a funny result: the query moved on to a new index that was not covering, and there was no key lookup.

Now, let’s make this a little more interesting.  I do have a Clustered Index on SalesOppID – defined as follows:

[codesyntax lang=”sql” title=”PK_SalesOpportunity”]

[/codesyntax]
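That is, something like the following (a sketch of the constraint; the table name is assumed):

[codesyntax lang=”sql” title=”PK_SalesOpportunity (sketch)”]
ALTER TABLE dbo.SalesOpportunity
    ADD CONSTRAINT PK_SalesOpportunity
    PRIMARY KEY CLUSTERED (SalesOppID);
[/codesyntax]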

I am puzzled as to why my query will not use the clustered index.  I decided to do the standard stuff: DBCC CHECKDB, DBCC UPDATEUSAGE, etc.  There is no corruption, and the query still behaves the same way.  I could recreate the clustered index on the LastUpdate field, but since there are foreign key relationships that I do not want to drop and recreate, I will just create an NC index on the appropriate two fields (LastUpdate and SalesOppID).  After doing that, I get the following:

The index is covering and pulling the correct columns.  With this index, the query is about three times faster than previously.  Some of that is due to the change from a Hash Match to a Merge Join between the two tables (there is still a hash match, but it comes after the merge and is now less costly).
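For reference, the index I added amounts to this (the index name is made up for the sketch):

[codesyntax lang=”sql” title=”Covering NC Index (sketch)”]
CREATE NONCLUSTERED INDEX IDX_SalesOpportunity_LastUpdateSalesOppID
    ON dbo.SalesOpportunity (LastUpdate, SalesOppID);
[/codesyntax]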

The moral is that even though an index may be used by the query, it may not be a good index choice.  I am still working on tuning this query; I think I can squeeze better performance out of it.

Execution Plans can be downloaded here.

Primary Key Discovery II

Categories: News, Professional
Comments: 3 Comments
Published on: January 12, 2010

In Part I, I discussed a couple of scripts that could be used to help discover some vital information about all of the PKs in the database very quickly.  I also delved into a performance analysis of the two queries.

The scripts presented showed a very basic discovery of Primary Key information.  I decided that I could use more information at my fingertips than the data set provided in that article.  So, in this article, I want to demonstrate a few more scripts to be able to achieve my goals.  These scripts will cover some clustered index information.

In SQL Server, when you create a Primary Key, the default behavior is to create a Clustered Index on the Primary Key.  Though this is the default, it can be overridden.  Thus, one should know which tables have Clustered Indexes created on them.  It is also desirable to know upon which columns those Clustered Indexes have been created.
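A quick illustration of that default and its override (demo tables for illustration only):

[codesyntax lang=”sql” title=”PK Defaults”]
-- a PRIMARY KEY becomes the clustered index by default...
CREATE TABLE dbo.DemoDefault (ID INT NOT NULL PRIMARY KEY);

-- ...but the default can be overridden
CREATE TABLE dbo.DemoOverride (ID INT NOT NULL PRIMARY KEY NONCLUSTERED);
[/codesyntax]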

To get to the clustered index information, the query is not much different from the one I used to retrieve the Primary Key information.  One could make the assertion that this is due to the default behavior of creating a Primary Key and Clustered Index in one fell swoop.  They are certainly closely knit together.
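The full script is in the download at the end of this post; the shape of it is something like this sketch:

[codesyntax lang=”sql” title=”Clustered Index Discovery (sketch)”]
SELECT OBJECT_NAME(i.object_id) AS TableName
    ,i.name AS IndexName
    ,c.name AS ColumnName
FROM sys.indexes i
    INNER JOIN sys.index_columns ic
        ON i.object_id = ic.object_id
        AND i.index_id = ic.index_id
    INNER JOIN sys.columns c
        ON ic.object_id = c.object_id
        AND ic.column_id = c.column_id
WHERE i.index_id = 1             -- index_id 1 = the clustered index
    --AND i.is_primary_key = 1   -- uncomment to see PK-backed clustered indexes;
                                 -- flip to <> 1 for clustered indexes not on a PK
ORDER BY TableName, IndexName;
[/codesyntax]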

See the difference?

A simple little change gets slightly more detailed information and something a little more intuitive about the database.  What does this tell me?  Well, I can now map these results to my PK results and see which PKs are also Clustered Indexes.  This is achieved by uncommenting the is_primary_key line in the script, which limits the results to Primary Keys and their Clustered Indexes.  Even better, the script can be altered to pull back only the Clustered Indexes that are not on Primary Keys: uncomment the is_primary_key line and change it from an equality to an inequality evaluation.

So, now we know more about the Primary Keys and Clustered Indexes.  That is, we know that information for any table that has a Clustered Index.  It isn’t unheard of to have tables present in the database that do not have a Clustered Index at all, and knowing that is essential as well.  Again, this is all information that can be garnered by tedious investigation into each table.  Or, we could run a little script and quickly gain insight into the data we manage.
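A sketch of such a heap-discovery script:

[codesyntax lang=”sql” title=”Heap Discovery (sketch)”]
SELECT OBJECT_NAME(i.object_id) AS TableName
FROM sys.indexes i
    INNER JOIN sys.objects o
        ON i.object_id = o.object_id
WHERE i.index_id = 0    -- index_id 0 = heap (no clustered index)
    AND o.type = 'U'    -- user tables only
ORDER BY TableName;
[/codesyntax]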

If any results are returned, then it would also be useful to know record counts for each of the tables in the result set.  Knowing this gives some indication of the usage of each heap and the impact of queries against it.  I was able to achieve this with the following:
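The downloadable script is the real thing; a sketch of the approach:

[codesyntax lang=”sql” title=”Heap Row Counts (sketch)”]
SELECT OBJECT_NAME(p.object_id) AS TableName
    ,SUM(p.rows) AS RecordCount   -- sum across partitions
FROM sys.partitions p
    INNER JOIN sys.objects o
        ON p.object_id = o.object_id
WHERE p.index_id = 0    -- heaps only
    AND o.type = 'U'
GROUP BY p.object_id
ORDER BY RecordCount DESC;
[/codesyntax]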

Armed with this information, it is a little easier to go into a new environment and speak to the databases and data.  The learning curve is shortened, and one can spend more time on other tasks pertinent to maintaining and learning the database environment.

Download scripts: PKDiscover2_scripts
