Space Used

Comments: 1 Comment
Published on: May 25, 2010

At long last I am bringing the next installment in this mini-series.  You can find the rest of the articles by perusing the last article I wrote on the topic.  That article was a correction to the TableSpace script about which I had already blogged.

As you will recall, this series was designed to compare and contrast two MS provided stored procedures.  Those procedures are sp_spaceused and sp_MStablespace.  They both calculate table sizes and information but they do not always arrive at the same result.  It was the difference in results that prompted this little series of blogs.  In this article I will be examining sp_spaceused, or at least the re-write I did for this proc.  There will likely be one more article in the series.  That article will be to compare performance and wrap up the series.

Problem

As was stated in the Tablespace article, there are two main problems with this procedure.  The first is that some measure of looping must be used, if one is to use the stored procedure as is, to gather all of the information for all of the tables in a database.  This problem can be rectified by understanding the inner workings of the stored procedure.  The second issue is that it does not return data that is consistent with that of sp_MStablespace.  I would like for the data to be consistent between the two scripts that may be used.  If the data is the same, then it will only be a matter of preference as to which script one decides to use.  This should help eliminate doubt.

Looping

Much the same as needs to be done with the sp_MStablespace procedure, a loop can be used to return the data for all tables.  Based on my experience, the looping query does not take long to execute.  It really is somewhat unnecessary though.  Here is a sample of how such a loop might be done.

[codesyntax lang=”tsql”]

[/codesyntax]

In this version of the loop, it must be noted that an additional step must be taken due to the nature of the data that is returned by sp_spaceused.  This procedure returns ‘ KB’ in the output.  This additional data must be treated so proper calculations can be performed.

Inconsistencies

As stated in the aforementioned article on sp_MStablespace, the data returned between the two does not always match (in regards to table size).  The reason for this is due to a difference in calculation and it seems intent.  Thus both procs can be deemed accurate and correct.  However, that does not breed any higher level of confidence in the return data from both procedures.  I hope to elaborate somewhat on these subtle differences.

Discovery

I am using the stored procedure from SQL 2008 as the basis for my analysis.  One of the first things I have noted is that the procedure has been mostly updated in support of the new SQL 2008 objects.  The one exception is that it still uses dbo.sysfiles to try and determine the database size and logsize.  Beyond that, the proc makes use of the system views sys.partitions, sys.allocation_units, sys.internal_tables, sys.objects and sys.dm_db_partition_stats.

sys.objects

The use of this view is primarily to get the object_id of the table that was passed in to the stored procedure.  For this particular version, this can be eliminated since we are trying to get a solution that does not require looping.  For a later version, I will look into reincorporating that view into the script to allow either a single table or all tables to be returned.

sysfiles

As mentioned already, sysfiles is present in order to retrieve the database size and the logfile size.  This can be replaced by sys.database_files.  In testing, I have found that using sys.database_files is more consistent and it is also more accurate than using sysfiles.

Revision

Since this procedure was already mostly up to date with the new objects for 2005 / 2008, most of it is reusable and the changes are less dramatic than they were for the counterpart sp_MStablespace.  Much the same as the script for MStablespace, I did employ the use of CTEs to get the data that I wanted.  I use three CTEs to retrieve all of the necessary information.  I am also using the same calculations that were used in the original procedure that determined table size.  The exception being in the database size and the log size.  For ease of consumption, I will once again break the script up into smaller chunks.

Part I

[codesyntax lang=”tsql”]

[/codesyntax]

This is pretty straightforward here.  I am retrieving and summing the values needed for the database size and the log size.

Part II

[codesyntax lang=”tsql”]

[/codesyntax]

Note here that I have left code to remove MSShipped objects from the total.  This is a placeholder simply for if somebody would like to test that result.  The only thing other than that to explain is the internal_types.  I am checking for a range of internal table types just as is done in the original stored procedure.  A quick rundown of what those types is as follows (and can be found on MSDN):  202 = xml index nodes, 204 = fulltext_catalog_map, 211 = FULLTEXT_AVDL, 213 = FULLTEXT_DOCID_STATUS, 214 = FULLTEXT_INDEXED_DOCID, 215 = FULLTEXT_DOCID_FILTER, 216 = FULLTEXT_DOCID_MAP, and I was unable to locate what type 212 is.  I presume it is another FullText type.  For those types, the data is not counted as part of the data but is aggregated into the index size.

The final point with this script to note is in regards to an additional comment I left in the script.  The comment is in regards to how the size is calculated and I wanted to be certain that it was explained how that relates.  In the MStablespace script, I use sys.dm_db_index_physical_stats to help determine the index and table size.  My comment here relates to that DMF in that the spaceused version is equivalent to the max_record_size field in that DMF.  This procedure and script determine the size of a table based on the maximum size and of a page and then multiplies that across all pages.  The MStablespace script is more equivalent to calculating based on average fill of those pages.  The MStablespace script can easily be modified to support the max size and thus bring both scripts into harmony.  Beyond those points, I think the rest of the script is pretty self-explanatory.

Results

This script now permits us to get the full size of all tables in a database without the need for looping through a single stored procedure.  We also see that at this point we have outlined why the two Microsoft stored procedures returned different results.  With both scripts now returning the same data, and doing so consistently we can now begin to evaluate performance measures.

1 Comment - Leave a comment
  1. quoted on Data Compression | SQL RNNR says:

    […] to compress was to select any table larger than 1GB in size.  I used a script I showed in the table space series to determine which tables to target.  The script can be found as follows. Code […]

Leave a comment

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">








Calendar
May 2010
M T W T F S S
« Apr   Jun »
 12
3456789
10111213141516
17181920212223
24252627282930
31  
Content
SQLHelp

SQLHelp


Welcome , today is Saturday, December 20, 2014