BLOB Report T-SQL Tuesday #005- Reporting

This month for my second submission into T-SQL Tuesday, I decided to go with something that provides some insight into the data that is being stored and the impact it may have on storage. If you haven’t seen it by now, I made a bone-headed mistake last week by submitting my entry one week early. That entry was about trying to create a report that would join information between some DMVs and the Default Trace. I posted an admission about the mistake as well. That brings us to the current post.

This month T-SQL Tuesday (brainchild of Adam Machanic (blog | Twitter) is being hosted by Aaron Nelson, and the topic is reporting. It looks like there are a lot of submissions this month, and I am a little behind at this point. I hope this is useful to somebody. I know from reading the submission of Brad Schulz, I knew of some immediate uses for his handy little report.

Background

This new idea was born out of necessity for me about a year ago. I was having a hard time finding why my primary file group was still so large after moving all of the User objects out of that filegroup and into new filegroups. I couldn’t find the tying factor – until I checked for BLOBs that were created in the Primary Filegroup by default. To move them was a manual process and is a different topic from the crux of this one. Since the initial report to find the BLOBs, I have evolved it somewhat to try and find the size of each object, as well as to report on the usage related with each. To do this, I needed to use a method I posted about in a different post about finding some Index Information.

Pre-Req

This is a simple requirement for later consumption by the reporting queries. In favor of time, I am creating a populating a staging table that can be disposed of later, after the queries have been executed and the necessary information has been gleaned from them.

CREATE TABLE #indstats (
	indstatsid INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
	,database_id BIGINT
	,index_id BIGINT
	,IndexSizeMB DECIMAL(16,1)
)

INSERT INTO #indstats (database_id,index_id,IndexSizeMB)
	SELECT database_id,index_id
		,CONVERT(DECIMAL(16,1)
		,(SUM(ps.avg_record_size_in_bytes * ps.record_count) / (1024.0 * 1024))) AS IndexSizeMB
	FROM sys.dm_db_index_physical_stats(DB_ID(),null,NULL,NULL,'DETAILED') ps
	GROUP BY database_id,index_id
;

CREATE TABLE #indstats (

indstatsid INT IDENTITY(1,1) PRIMARY KEY CLUSTERED

,database_id BIGINT

,index_id BIGINT

,IndexSizeMB DECIMAL(16,1)

)

INSERT INTO #indstats (database_id,index_id,IndexSizeMB)

SELECT database_id,index_id

,CONVERT(DECIMAL(16,1)

,(SUM(ps.avg_record_size_in_bytes * ps.record_count) / (1024.0 * 1024))) AS IndexSizeMB

FROM sys.dm_db_index_physical_stats(DB_ID(),null,NULL,NULL,'DETAILED') ps

GROUP BY database_id,index_id

;

To get things rolling I want to show what the initial query looked like. The only point here was to show what objects held BLOB data.

Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.partition_id = a.container_id
			And a.type = 2					--LOB data is stored in pages of type Text/Image
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And a.data_pages &gt; 0
		--And filegroup_name(a.data_space_id) = 'Primary'
Union
Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.hobt_id = a.container_id
			And a.type = 3					--Overflow data is stored in pages of type Text/Image
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And filegroup_name(a.data_space_id) = 'Primary'
		--And a.data_pages &gt; 0
	Order By TableName asc

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

From sys.allocation_units a

Inner Join sys.partitions p

On p.partition_id = a.container_id

And a.type = 2 --LOB data is stored in pages of type Text/Image

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And a.data_pages > 0

--And filegroup_name(a.data_space_id) = 'Primary'

Union

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

From sys.allocation_units a

Inner Join sys.partitions p

On p.hobt_id = a.container_id

And a.type = 3 --Overflow data is stored in pages of type Text/Image

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And filegroup_name(a.data_space_id) = 'Primary'

--And a.data_pages > 0

Order By TableName asc

The query requires a union select in order to find all of the BLOBS involved. Depending on whether it is LOB or it is OverFlow_Data, the query will be different and needs to account for both types. The views relevant to finding this information and being queried are sys.partitions and sys.allocation_units. This query was adequate for the initial needs in determining what was causing the bloat in my primary filegroup. This is why it is also the base query. Now, I would like to build on it and find more information concerning these BLOBs.

Up a Step

Using the same views as the base query and just expanding the results returned, I can calculate an approximate size for each BLOB. This additional information is more useful to me. This gives me a greater insight into my database objects. The changes are quite simple, just to iterate, since the pertinent information is obtainable from the present structure of the query.

Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	,LOBUsedPages = a.used_pages
	,LOBTotalPages = a.total_pages
	,LOBDataSizeMB = a.used_pages * 8/1024
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.partition_id = a.container_id
			And a.type = 2					--LOB data is stored in pages of type Text/Image
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And a.data_pages &gt; 0
		--And filegroup_name(a.data_space_id) = 'Primary'
Union
Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	,LOBUsedPages = a.used_pages
	,LOBTotalPages = a.total_pages
	,LOBDataSizeMB = a.used_pages * 8/1024
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.hobt_id = a.container_id
			And a.type = 3					--Overflow data is stored in pages of type Text/Image
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And filegroup_name(a.data_space_id) = 'Primary'
		--And a.data_pages &gt; 0
	Order By TableName asc

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

,LOBUsedPages = a.used_pages

,LOBTotalPages = a.total_pages

,LOBDataSizeMB = a.used_pages * 8/1024

From sys.allocation_units a

Inner Join sys.partitions p

On p.partition_id = a.container_id

And a.type = 2 --LOB data is stored in pages of type Text/Image

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And a.data_pages > 0

--And filegroup_name(a.data_space_id) = 'Primary'

Union

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

,LOBUsedPages = a.used_pages

,LOBTotalPages = a.total_pages

,LOBDataSizeMB = a.used_pages * 8/1024

From sys.allocation_units a

Inner Join sys.partitions p

On p.hobt_id = a.container_id

And a.type = 3 --Overflow data is stored in pages of type Text/Image

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And filegroup_name(a.data_space_id) = 'Primary'

--And a.data_pages > 0

Order By TableName asc

Besides the additional fields pulled into the query, you can see that I also left a couple of comment lines in the query. This is for informational purposes. Sometimes, I like to filter out the BLOBs by specific filegroup or by whether or not they have any data in them.

Taking it to the Xtreme

OK, cheesy subheading but it has been getting ingrained in my head from Tony Horton. This is not an advertisement for that product, but it is kinda like how I feel about the next query. In this query I will be querying that table that was pre-populated as well as adding a view. The view that has been added is sys.dm_db_index_usage_stats. I am also querying these via a Left Outer Join. I want to return the additional reporting fields for the BLOB related items even if there is not index statistics associated with it. With these additional objects added to the base query, I have also added a handful of fields to return in the report.

And now the beef of the query.

Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	,LOBUsedPages = a.used_pages
	,LOBTotalPages = a.total_pages
	,LOBDataSizeMB = a.used_pages * 8/1024
	, ps.IndexSizeMB
	, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests
	, (us.user_updates) AS UserUpdates
	, us.last_user_update AS LastUpdate
	, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)
		/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates
	,a.type_desc as AllocUnitType
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.partition_id = a.container_id
			And a.type = 2					--LOB data is stored in pages of type Text/Image
		Left Outer Join sys.dm_db_index_usage_stats us
			On us.object_id = p.object_id
			And us.index_id = p.index_id
			And us.database_id = db_id()
		Left Outer Join #indstats ps
			ON us.index_id = ps.index_id
			And us.database_id = ps.database_id
			And us.object_id = ps.object_id
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And a.data_pages &gt; 0
		--And filegroup_name(a.data_space_id) = 'Primary'
Union
Select FileGroupName = filegroup_name(a.data_space_id)
	,TableName = object_name(p.object_id)
	,IndexName = i.name
	,LOBUsedPages = a.used_pages
	,LOBTotalPages = a.total_pages
	,LOBDataSizeMB = a.used_pages * 8/1024
	, ps.IndexSizeMB
	, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests
	, (us.user_updates) AS UserUpdates
	, us.last_user_update AS LastUpdate
	, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)
		/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates
	,a.type_desc as AllocUnitType
	From sys.allocation_units a
		Inner Join sys.partitions p
			On p.hobt_id = a.container_id
			And a.type = 3					--Overflow data is stored in pages of type Text/Image
		Left Outer Join sys.dm_db_index_usage_stats us
			On us.object_id = p.object_id
			And us.index_id = p.index_id
			And us.database_id = db_id()
		Left Outer Join #indstats ps
			ON us.index_id = ps.index_id
			And us.database_id = ps.database_id
			And us.object_id = ps.object_id
		Left Outer Join sys.indexes i
			On i.object_id = p.object_id
			And i.index_id = p.index_id
	Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0
		--And filegroup_name(a.data_space_id) = 'Primary'
		--And a.data_pages &gt; 0
	Order By TableName asc,a.type_desc

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

,LOBUsedPages = a.used_pages

,LOBTotalPages = a.total_pages

,LOBDataSizeMB = a.used_pages * 8/1024

, ps.IndexSizeMB

, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests

, (us.user_updates) AS UserUpdates

, us.last_user_update AS LastUpdate

, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)

/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates

,a.type_desc as AllocUnitType

From sys.allocation_units a

Inner Join sys.partitions p

On p.partition_id = a.container_id

And a.type = 2 --LOB data is stored in pages of type Text/Image

Left Outer Join sys.dm_db_index_usage_stats us

On us.object_id = p.object_id

And us.index_id = p.index_id

And us.database_id = db_id()

Left Outer Join #indstats ps

ON us.index_id = ps.index_id

And us.database_id = ps.database_id

And us.object_id = ps.object_id

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And a.data_pages > 0

--And filegroup_name(a.data_space_id) = 'Primary'

Union

Select FileGroupName = filegroup_name(a.data_space_id)

,TableName = object_name(p.object_id)

,IndexName = i.name

,LOBUsedPages = a.used_pages

,LOBTotalPages = a.total_pages

,LOBDataSizeMB = a.used_pages * 8/1024

, ps.IndexSizeMB

, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests

, (us.user_updates) AS UserUpdates

, us.last_user_update AS LastUpdate

, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)

/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates

,a.type_desc as AllocUnitType

From sys.allocation_units a

Inner Join sys.partitions p

On p.hobt_id = a.container_id

And a.type = 3 --Overflow data is stored in pages of type Text/Image

Left Outer Join sys.dm_db_index_usage_stats us

On us.object_id = p.object_id

And us.index_id = p.index_id

And us.database_id = db_id()

Left Outer Join #indstats ps

ON us.index_id = ps.index_id

And us.database_id = ps.database_id

And us.object_id = ps.object_id

Left Outer Join sys.indexes i

On i.object_id = p.object_id

And i.index_id = p.index_id

Where OBJECTPROPERTY(p.object_id,'IsMSShipped') = 0

--And filegroup_name(a.data_space_id) = 'Primary'

--And a.data_pages > 0

Order By TableName asc,a.type_desc

The fields that I added to the above query are listed below.

	, ps.IndexSizeMB
	, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests
	, (us.user_updates) AS UserUpdates
	, us.last_user_update AS LastUpdate
	, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)
		/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates
	,a.type_desc as AllocUnitType

, ps.IndexSizeMB

, (us.user_seeks + us.user_scans + us.user_lookups) AS UserRequests

, (us.user_updates) AS UserUpdates

, us.last_user_update AS LastUpdate

, CAST(us.user_seeks + us.user_scans + us.user_lookups AS REAL)

/ CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL) AS RatioRequestsToUpdates

,a.type_desc as AllocUnitType

The field that I considered omitting is the ps.IndexSizeMB. This field reports the size of the Index and not necessarily the size of BLOB data. I decided to keep it because it can prove useful when trying to determine where the bloat may or may not be. I also added the type_desc field so I could determine the type of data reported in the query. Without this field, I think it would be too easy to dismiss some of the entries as duplicates when they really should be present in the report. The others are information on the usage of the data related to the index and BLOB data.

Conclusion

As you run these queries you will find that there are useful tidbits of information for administrative reporting purposes. It should also be noted that running the final query will produce the same index usage info for an Index involved in the two types of BLOB data being reported. It will not split out the UserRequests (for instance) that are performed against LOB_DATA versus ROW_OVERFLOW_DATA. This can create a reporting problem / mis-perception if one is not aware of it first.

Scripts can be downloaded here.

Edit: Added the scripts for download.

6 thoughts on “BLOB Report T-SQL Tuesday #005- Reporting”

Pingback: T-SQL Tuesday #005 – Reporting – The Round-Up – SQLvariations: SQL Server, a little PowerShell, maybe some Hyper-V
Todd Harrison says:

April 26, 2010 at 15:55

I can’t seem to run the 3rd query. The temp table creates and fills with 12 rows of data but the 3rd query state above only gives me errors on SQL Server 2005

Msg 207, Level 16, State 1, Line 15

Invalid column name ‘OBJECT_ID’.

Msg 207, Level 16, State 1, Line 15

Invalid column name ‘OBJECT_ID’.

What I was really looking for thou was a query that would show me all BLOB columns that are indexed so I can plan for the offline BLOB index rebuilds. Any help would be nice. Thanks.

Pingback: BLOB Index Columns | SQL RNNR
Pingback: T-SQL Tuesday #006: A Blobbing We Will Go | SQL RNNR
Pingback: TSQL Tuesday Indexes and Blobs | SQL RNNR
Pingback: T-SQL Tuesday Participation Over the Years | SQL RNNR

M	T	W	T	F	S	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30

Background

Pre-Req

Up a Step

Taking it to the Xtreme

Conclusion

6 thoughts on “BLOB Report T-SQL Tuesday #005- Reporting”

Leave a Reply Cancel reply