Locks, Blocks, and Deadlocks Oh My!

Categories: News, Professional, SSC, SSSOLV
Published on: February 12, 2014

Abstract

Managing concurrency is one of the most challenging aspects of working with any enterprise DBMS.  There is much confusion out there about locking, blocking, and deadlocks.  In this demo-heavy session we will clear up the confusion.

BIO

Randy Knight, MCM, is an IT professional who has worked with Microsoft technology for over 20 years, focusing on SQL Server for the last 16. He has worked in a variety of settings, including 6 years as a Database Architect for Match.com, one of the largest and most successful dot-coms. He is currently the President and Principal Consultant for SQL Server Solutions Group LLC, a Utah-based SQL Server consultancy.

LiveMeeting Info

Attendee URL:

https://www.livemeeting.com/cc/UserGroups/join?id=9KSDPD&role=attend

Meeting ID: 9KSDPD

January 2014 S3OLV and Tabular Models

Categories: News, Professional, SSC, SSSOLV
Published on: January 9, 2014

This is really late notice.  We had some extenuating circumstances this month and then a last-minute cancellation.  Now that everything is squared away, we have some great news.

If you are in the Las Vegas area, you are welcome to attend our monthly meeting which is now being held at the south end of the strip.  This should put you a bit out of the way of the CES 2014 crowds – though you will probably have a fantastic time getting through the crowd.

Then again, maybe you just can’t peel yourself away from these types of things at CES.


Or maybe you are just too busy trying to find the right pair of speakers for your car at CES 2014.


Well good news!  S3OLV holds the meetings in the evening and the main show at CES should be pretty close to wrapping up.

Come join us this month for a presentation by Julie Koesmarno.  She hails from down under and is making time in her schedule last minute to help us out.  You can read all about the presentation and Julie at our Meetup site here.

Guess what else you can do by visiting our Meetup site.  You can find out about how we have changed venues to the South end of the Strip.  That’s right, S3OLV has moved.  We are now meeting at the Tahiti Village.  Details of the location are in the meetup.  Check it out.

Oh, and don’t forget.  We still offer an online virtual meeting space for the user group just in case you can’t make it in person.  Details for joining the virtual meeting are also in the Meetup page.

SQL Claus and BI at S3OLV

Categories: News, Professional, SSC, SSSOLV
Published on: December 9, 2013

Tis the season once again.  We have SQL gifts aplenty for any and all who wish to partake.

This month, we have Stacia Misner (twitter), who will be playing SQL Claus for the Las Vegas chapter of PASS.  In this season of giving, it is nice to have a little SQL giving too.  Then again, we get SQL giving all year long.

You can read all about the meeting and what Stacia would like to present to us at our Meetup site.  The link is here.

Besides the meeting, we have some news for the group.  Stacia will also be assuming the role of President of S3OLV.  Charley started the group and had a long run.  Now Stacia and I will continue the group.  If you see Charley, tell him thanks for everything he did.

Las Vegas UG September 2013

Categories: News, Professional, SSC, SSSOLV
Published on: September 9, 2013


Are you ready for some SQL?

If you are like me, you are probably almost always ready for some SQL.  Well, we have just what you are hungry for – more SQL.

Thursday evening we have the distinct pleasure of having Bob Hovious present to the Las Vegas SQL User Group.  If I am not mistaken, this is his first time presenting to a SQL audience outside of a work setting.

Bob is going to try to teach us about good design and proper datatype use.  I think many a data professional could benefit greatly from this type of presentation.

As usual, we will have the presentation available for those interested in showing up in person as well as those who can only attend via the virtual world.

Here are the details on how to attend.

With the move to Meetup, all of the meeting information (abstract and bio) is posted there.

http://www.meetup.com/SQL-Server-Society-of-Las-Vegas/events/114687742/

LiveMeeting Information:

 

 
Attendee URL: https://www.livemeeting.com/cc/UserGroups/join?id=RH8QKM&role=attend
Meeting ID: RH8QKM

 

 

VERY IMPORTANT

The meeting location has changed.  We will no longer be meeting at The Learning Center.  The new meeting location is M Staff Solutions & Training, 2620 Regatta Drive Suite 102, Las Vegas, NV 89128.

May 2013 Las Vegas UG

Categories: News, Professional, SSC, SSSOLV
Published on: May 14, 2013

Spring is in the air, I think.

With that scent in the air, we have a nice juicy topic coming up this month for any and all who are interested.

Chad Crawford will be presenting to the group on the tastiness that is Service Broker.

Service Broker in Action

SQL Server Service Broker is a messaging framework built into the SQL Server engine. It enables SQL Server to handle messaging between servers and applications with light setup and overhead. The flexibility of the framework enables Service Broker to queue event notifications, task execution requests or other messages while leveraging the strength of SQL Server transaction management, reliability and recoverability. In this session we will see how to set up Service Broker, discuss case studies where it has been implemented in industry, and step through a practical example implementing an audit log.

Session Level: Intermediate

Chad Crawford’s BIO

Chad has been working with database engines for 14 years, the last 12 focused specifically on SQL Server. He has filled a variety of roles spanning architecture, development and administration. Chad is currently the Database Architect at Henry Schein Practice Solutions in American Fork. When he isn’t optimizing a query, you will find him running, dreaming about airplanes, or looking for a new strategy board game.

Meeting Details

Attendee URL: https://www.livemeeting.com/cc/UserGroups/join?id=KWRMQ3&role=attend

Meeting ID: KWRMQ3

 

March 2013 LV UG Meeting

Categories: News, Professional, SSC, SSSOLV
Published on: March 11, 2013

Another month and we have another installment (meeting) for the Data Professionals of the Las Vegas area.

This month we will be getting back on schedule with the usual meeting time and place.  We had a slight divergence last month due to that heartsy holiday.

We are excited about some of the upcoming speakers and feedback we have been getting of late.  We are feverishly working to get some presenters on Big Data and Hadoop.  If you are in the market to present on these topics, ping me.

Here are some of the details for the meeting.


Presenting is Jason Kassay

Jason will be presenting to us on the topic of “The Accidental Programmer.”

Most people in the database world have heard the term “The Accidental DBA” – those programmers who have to work with databases.  How about the other side of the coin?  What about DBAs and devs who have to write code or work closely with programmers?  This presentation is a best-practices guide for working with SQL Server in a .Net environment.  You will learn how to recognize bad code that interacts with the database, how to track it down, and most importantly how to fix it.  On top of that, you will also receive an introduction to object-oriented programming concepts such as data abstraction, encapsulation, tier architecture, and class objects so that you can better communicate with your programmers.

BIO

Jason has been a software developer for over 10 years and currently works at EZFacility, providing scheduling, management, full-service billing, and membership solutions for the sports, health, and fitness industries. He has worked primarily with .Net (C# and VB) and SQL Server as well as HTML, CSS, and Javascript. When he is not coding or spending time with his awesome family, you will find him either running (to train for the zombie apocalypse) or stopping hockey pucks as a goaltender.

 

LiveMeeting Information:

Attendee URL: https://www.livemeeting.com/cc/UserGroups/join?id=DC769N&role=attend
Meeting ID: DC769N

 

VERY IMPORTANT

The meeting location has changed.  We will no longer be meeting at The Learning Center.  The new meeting location is M Staff Solutions & Training, 2620 Regatta Drive Suite 102, Las Vegas, NV 89128.
We are also moving to meetup for meeting management. Please join us for free at http://www.meetup.com/SQL-Server-Society-of-Las-Vegas/

January S3OLV 2013

Published on: January 9, 2013

I hope you are ready for some free SQL learning.  Coming up on Thursday, January 10th at 6:30 PM Pacific (GMT-8), we will have a stellar opportunity to do just that – learn!

Edwin Sarmiento will be presenting for us at the upcoming monthly Las Vegas User Group meeting.  Edwin has chosen (well, he left it up to me to decide) to present on PowerShell.

Title: Windows PowerShell for the SQL Server DBA

Abstract

Windows PowerShell is becoming the scripting language of choice for managing Microsoft servers and workstations. And while T-SQL is still the scripting language for DBAs to administer and manage SQL Server, there are tons of stuff that can be easily done with Windows PowerShell. In this session, learn Windows PowerShell from the ground up and how you can use it with SQL Server Management Objects (SMO) to administer and manage a SQL Server instance.

Here is the bonus behind all of this – we will be broadcasting the presentation to the virtual sphere via LiveMeeting.


On the Ninth Day…

It’s the end of the world as we know it.  And as the song goes…I feel fine!  But hey, we didn’t start the fire.

Those are a couple of songs that pop into my head every time somebody starts talking doomsday and doomsday prophecies.

If you are reading this, I dare say that the world is still turning.  And that is a good thing because that gives us a chance to talk about the 12 days of pre-Christmas.

Today we will be talking about a tool that can be at the DBA’s disposal to help in tracking performance as well as problems.

But first, there are a couple of items of housekeeping.  The first item is that I only realized with this post that the post on the first day was wrong.  I had miscalculated my twelve days to end on Christmas Day.  That is wrong!  Counting down from that first post on the 13th means the 12th day will end up on December 24th.  Maybe I will throw in a bonus 13th-day post, or delay a post for a day, or just go with it.  It will be a mystery!

Second item is naturally the daily recap of the 12 days to date.

On the Ninth Day of pre-Christmas…

My DBA gave to me a Data Collection!

If only my DBA had told me that I would need to have the Management Data Warehouse (MDW) preconfigured.  Well, that is not a problem.  We can handle that too.  For a tutorial on how to set up MDW, I am going to provide a link to one that has been very well written by Kalen Delaney.

The article written by Kalen covers the topic of MDW very well all the way from setting up the MDW, to setting up the Data Collectors, and all the way down to the default reports you can run for MDW.  The MDW and canned Data Collectors can provide some really useful information for your environment.

What I want to share though is a means to add custom data collections to your MDW.  To create a custom collection, we need to rely on two stored procedures provided to us by Microsoft.  Those stored procedures are: sp_syscollector_create_collection_item and sp_syscollector_create_collection_set.  Both of these stored procedures are found in the, you guessed it, msdb database.

Each of these stored procedures has a number of parameters to help in the creation of an appropriate data collection.  When creating a data collection, you will first create the collection set, and then you add collection items to that set.

There are a few notable parameters for each stored procedure that I will cover.  Otherwise, you can refer back to the links for each of the stored procedures to see more information about the parameters.

Starting with the sp_syscollector_create_collection_set stored procedure, I want to point out the @schedule_name, @name, and @collection_mode parameters.  The name is pretty obvious – make sure you have a distinguishable name that is descriptive (my opinion), or at least have good documentation.  The collection mode takes one of two values (0 = cached, 1 = non-cached).  As noted in the documentation, you will want to choose one value over the other depending on the intended use of this data collector.  If running continuously, just remember to run in cached mode.  And lastly is the schedule name, which determines how frequently the job runs.

Unfortunately, the schedule names are not found in the documentation; rather, you are directed to query the sysschedules table in msdb.  To help you find those schedules, here is a quick query.

SELECT schedule_id, name
	FROM msdb.dbo.sysschedules
	WHERE name LIKE '%collector%';

Now on to the sp_syscollector_create_collection_item stored procedure.  There are three parameters that I want to lightly touch on; for the rest, you can refer back to the documentation.  The parameters of interest here are @parameters, @frequency, and @collector_type_uid.  Starting with the frequency parameter: this tells the collector how often to upload the data to the MDW if running in cached mode.  Be careful here to select an appropriate interval.  Next is the parameters parameter, which is really the workhorse of the collection item.  In the case of the custom data collector that I will show in a bit, this is where the T-SQL query will go.

Last parameter to discuss is the collector type uid.  Like the schedule for the previous proc, the documentation for this one essentially refers you to a system view - syscollector_collector_types.  Here is a quick query to see the different collector types.

SELECT name
	FROM msdb.dbo.syscollector_collector_types;

The collector type that I will be using for this example is Generic T-SQL Query Collector Type.  A discussion on the four types of collectors can be reserved for another time.

Let’s move on to the example now.  This custom data collector is designed to help troubleshoot deadlock problems.  I want to accomplish that by querying the system_health Extended Events session.

I can query for deadlock information directly from the system_health session using a query like the following.

-- The two REPLACE string pairs below were stripped when the post was rendered;
-- they are restored here using the commonly circulated SQL 2008 fix for the
-- malformed deadlock graph XML produced by system_health.
SELECT CAST(
                  REPLACE(
                        REPLACE(XEventData.XEvent.value('(data/value)[1]', 'varchar(max)'), 
                        '<victim-list>', '<deadlock><victim-list>'),
                  '<process-list>', '</victim-list><process-list>')
            AS VARCHAR(4000)) AS DeadlockGraph
FROM
(SELECT CAST(target_data AS XML) AS TargetData
FROM sys.dm_xe_session_targets st
JOIN sys.dm_xe_sessions s ON s.address = st.event_session_address
WHERE name = 'system_health') AS Data
CROSS APPLY TargetData.nodes ('//RingBufferTarget/event') AS XEventData (XEvent)
WHERE XEventData.XEvent.value('@name', 'varchar(4000)') = 'xml_deadlock_report';

You may notice that I have converted to varchar(4000) from XML.  This is in large part to make sure the results will play nicely with the data collector.  Now to convert that to a query that can be used in the @parameters parameter (doubling the single quotes and XML-escaping the angle brackets), we get the following.

<ns:TSQLQueryCollector xmlns:ns="DataCollectorType"><Query><Value>
 
SELECT CAST(
                  REPLACE(
                        REPLACE(XEventData.XEvent.value(''(data/value)[1]'', ''varchar(max)''), 
                        ''&lt;victim-list&gt;'', ''&lt;deadlock&gt;&lt;victim-list&gt;''),
                  ''&lt;process-list&gt;'', ''&lt;/victim-list&gt;&lt;process-list&gt;'')
            AS VARCHAR(4000)) AS DeadlockGraph
FROM
(SELECT CAST(target_data AS XML) AS TargetData
FROM sys.dm_xe_session_targets st
join sys.dm_xe_sessions s ON s.address = st.event_session_address
WHERE name = ''system_health'') AS DATA
CROSS APPLY TargetData.nodes (''//RingBufferTarget/event'') AS XEventData (XEvent)
WHERE XEventData.XEvent.value(''@name'', ''varchar(4000)'') = ''xml_deadlock_report'' 
 
</Value><OutputTable>systemhealthdeadlock</OutputTable></Query></ns:TSQLQueryCollector>

With this query, we are loading the necessary schema nodes that correlate to the Data Collector Type that we chose.  Since this parameter is XML, the schema must match or you will get an error.  We are now ready to generate a script that can create a deadlock data collector.

BEGIN TRANSACTION
BEGIN Try
DECLARE @collection_set_id_1 INT
DECLARE @collection_set_uid_2 UNIQUEIDENTIFIER
EXEC [msdb].[dbo].[sp_syscollector_create_collection_set]
	  @name=N'systemhealthdeadlock'
	, @collection_mode=1
	, @description=N'systemhealthdeadlock'
	, @logging_level=1
	, @days_until_expiration=14
	, @schedule_name=N'CollectorSchedule_Every_15min'
	, @collection_set_id=@collection_set_id_1 OUTPUT
	, @collection_set_uid=@collection_set_uid_2 OUTPUT
 
SELECT @collection_set_id_1, @collection_set_uid_2
 
DECLARE @collector_type_uid_3 UNIQUEIDENTIFIER
SELECT @collector_type_uid_3 = collector_type_uid 
	FROM [msdb].[dbo].[syscollector_collector_types] 
	WHERE name = N'Generic T-SQL Query Collector Type';
 
DECLARE @collection_item_id_4 INT
-- The angle-bracket tag pairs in the nested REPLACEs below were stripped when the
-- post was rendered; they are restored XML-escaped, per the same SQL 2008 fix.
EXEC [msdb].[dbo].[sp_syscollector_create_collection_item]
	  @name=N'systemhealthdeadlock'
	, @PARAMETERS=N'<ns:TSQLQueryCollector xmlns:ns="DataCollectorType"><Query><Value>
 
SELECT CAST(
                  REPLACE(
                        REPLACE(XEventData.XEvent.value(''(data/value)[1]'', ''varchar(max)''), 
                        ''&lt;victim-list&gt;'', ''&lt;deadlock&gt;&lt;victim-list&gt;''),
                  ''&lt;process-list&gt;'', ''&lt;/victim-list&gt;&lt;process-list&gt;'')
            AS varchar(4000)) AS DeadlockGraph
FROM
(SELECT CAST(target_data AS XML) AS TargetData
from sys.dm_xe_session_targets st
join sys.dm_xe_sessions s on s.address = st.event_session_address
where name = ''system_health'') AS Data
CROSS APPLY TargetData.nodes (''//RingBufferTarget/event'') AS XEventData (XEvent)
where XEventData.XEvent.value(''@name'', ''varchar(4000)'') = ''xml_deadlock_report'' 
 
</Value><OutputTable>systemhealthdeadlock</OutputTable></Query></ns:TSQLQueryCollector>'
	, @collection_item_id=@collection_item_id_4 OUTPUT
	, @frequency=30
	, @collection_set_id=@collection_set_id_1
	, @collector_type_uid=@collector_type_uid_3
 
SELECT @collection_item_id_4
 
COMMIT TRANSACTION;
END Try
BEGIN Catch
ROLLBACK TRANSACTION;
DECLARE @ErrorMessage NVARCHAR(4000);
DECLARE @ErrorSeverity INT;
DECLARE @ErrorState INT;
DECLARE @ErrorNumber INT;
DECLARE @ErrorLine INT;
DECLARE @ErrorProcedure NVARCHAR(200);
SELECT @ErrorLine = ERROR_LINE(),
       @ErrorSeverity = ERROR_SEVERITY(),
       @ErrorState = ERROR_STATE(),
       @ErrorNumber = ERROR_NUMBER(),
       @ErrorMessage = ERROR_MESSAGE(),
       @ErrorProcedure = ISNULL(ERROR_PROCEDURE(), '-');
RAISERROR (14684, @ErrorSeverity, 1 , @ErrorNumber, @ErrorSeverity, @ErrorState, @ErrorProcedure, @ErrorLine, @ErrorMessage);
 
END Catch;
 
GO

Upon creation, a SQL Agent job with the defined schedule is also created.  Since this is a non-cached data collector set, the Agent job will adhere to the schedule specified and upload data on that interval.
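
Note that creating the collection set does not start it.  A minimal sketch for kicking it off, assuming the collection set name used in the script above:

USE msdb;
GO
-- Start the collection set so the Agent jobs begin collecting and uploading.
EXEC dbo.sp_syscollector_start_collection_set @name = N'systemhealthdeadlock';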

Now all we need to do is generate a deadlock to see if it is working.  It is also a good idea to introduce the table that will be created by this data collector.  Once we create this collector set, a new table will be created in the MDW database.  In the case of this collector set, we get a table with the schema and name of custom_snapshots.systemhealthdeadlock.

This new table will have three columns.  One column represents the DeadlockGraph as we retrieved from the query we provided to the @parameters parameter.  The remaining columns are data collector columns for the collection date and the snapshot id.

Now that we have covered all of that, your favorite deadlock query has had enough time to finally fall victim to a deadlock.  We should also have some information recorded in the custom_snapshots.systemhealthdeadlock table relevant to the deadlock information (if not, it will be there once the agent job runs, or you can run a snapshot from SSMS of the data collector).  With a quick query, we can start looking into the deadlock problem.

SELECT collection_time, CAST(DeadlockGraph AS XML) AS DeadlockGraph, snapshot_id
	FROM mdw.custom_snapshots.systemhealthdeadlock;	-- mdw is the name of my MDW database; substitute your own

This query will give me a few entries (since I went overkill and created a bunch of deadlocks).  If I click the DeadlockGraph cell in the result set, I can then view the XML of the DeadlockGraph, as in the following.

<deadlock>
  <victim-list>
    <victimProcess id="process5a4ebc8" />
  </victim-list>
  <process-list>
    <process id="process5a4ebc8" taskpriority="0" logused="0" waitresource="KEY: 26:72057594048020480 (b4903b2250cc)" waittime="609" ownerId="2803145" transactionname="user_transaction" lasttranstarted="2012-12-20T22:32:09.987" XDES="0x8008d950" lockMode="X" schedulerid="8" kpid="13656" status="suspended" spid="87" sbid="0" ecid="0" priority="0" trancount="4" lastbatchstarted="2012-12-20T22:38:50.020" lastbatchcompleted="2012-12-20T22:38:50.020" lastattention="2012-12-20T22:38:24.217" clientapp="Microsoft SQL Server Management Studio - Query" hostname="SomeServer" hostpid="7604" loginname="SomeLogin" isolationlevel="serializable (4)" xactid="2803145" currentdb="26" lockTimeout="4294967295" clientoption1="673187936" clientoption2="390200">
      <executionStack>
        <frame procname="" line="10" stmtstart="40" sqlhandle="0x02000000eac1af36f412db4e21d9dcc86feb261fa6bcd230" />
        <frame procname="" line="10" stmtstart="356" stmtend="518" sqlhandle="0x0200000095b4ee32a25e9724dd73fd6894c60748af6c136b" />
      </executionStack>
      <inputbuf>
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
WHILE 1=1 
BEGIN 
BEGIN TRAN
UPDATE Purchasing.Vendor
SET CreditRating = 2
WHERE BusinessEntityID = 1492;
 
UPDATE Purchasing.Vendor
SET CreditRating = 1
WHERE BusinessEntityID = 1494;
 
COMMIT TRAN 
END
   </inputbuf>
    </process>
    <process id="process5a44bc8" taskpriority="0" logused="0" waitresource="KEY: 26:72057594048020480 (ade87e3a717c)" waittime="609" ownerId="2878446" transactionname="user_transaction" lasttranstarted="2012-12-20T22:38:50.020" XDES="0xa9abd950" lockMode="X" schedulerid="7" kpid="15008" status="suspended" spid="86" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2012-12-20T22:38:47.887" lastbatchcompleted="2012-12-20T22:38:47.887" lastattention="2012-12-20T22:36:21.247" clientapp="Microsoft SQL Server Management Studio - Query" hostname="SomeServer" hostpid="7604" loginname="SomeLogin" isolationlevel="serializable (4)" xactid="2878446" currentdb="26" lockTimeout="4294967295" clientoption1="673187936" clientoption2="390200">
      <executionStack>
        <frame procname="" line="9" stmtstart="40" sqlhandle="0x02000000eac1af36f412db4e21d9dcc86feb261fa6bcd230" />
        <frame procname="" line="9" stmtstart="352" stmtend="510" sqlhandle="0x020000000c4b9412577ec884cbd51882e5310dd340216739" />
      </executionStack>
      <inputbuf>
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
WHILE 1=1 
BEGIN 
BEGIN TRAN
UPDATE Purchasing.Vendor
SET CreditRating = 1
WHERE BusinessEntityID = 1494;
UPDATE Purchasing.Vendor
SET CreditRating = 2
WHERE BusinessEntityID = 1492;
COMMIT TRAN 
END
   </inputbuf>
    </process>
  </process-list>
  <resource-list>
    <keylock hobtid="72057594048020480" dbid="26" objectname="" indexname="" id="lock5aeec80" mode="X" associatedObjectId="72057594048020480">
      <owner-list>
        <owner id="process5a44bc8" mode="X" />
      </owner-list>
      <waiter-list>
        <waiter id="process5a4ebc8" mode="X" requestType="wait" />
      </waiter-list>
    </keylock>
    <keylock hobtid="72057594048020480" dbid="26" objectname="" indexname="" id="lock6164d80" mode="X" associatedObjectId="72057594048020480">
      <owner-list>
        <owner id="process5a4ebc8" mode="X" />
      </owner-list>
      <waiter-list>
        <waiter id="process5a44bc8" mode="X" requestType="wait" />
      </waiter-list>
    </keylock>
  </resource-list>
</deadlock>

Code to generate the deadlock is courtesy of “SQL Server 2012 T-SQL Recipes: A Problem-Solution Approach” by Jason Brimhall, Wayne Sheffield et al (Chapter 12, pages 267-268).  If you examine the deadlock graph, you will see that code in the inputbuf nodes.

Since this is being pulled from the RingBuffer target of the system_health session, it can prove useful to store that data in a table as I have done here.  The ring buffer can be overwritten, so with good timing on the data collector we can preserve this information for later retrieval and troubleshooting.  Deadlocks don’t always happen at the most opportune time, and they are even less likely to occur while we are staring at the screen waiting for them to happen.

As you read more about the stored procedures used to create a data collector, you will see that there is a retention parameter.  This helps prevent the table from getting too large on us.  We can also ensure that an appropriate retention is set for these custom collectors, as sketched below.
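
If the 14-day retention in the script turns out to be wrong for your environment, it can be adjusted after the fact.  A quick sketch, again assuming the collection set name used earlier:

USE msdb;
GO
-- Look up the collection set and adjust its retention period.
DECLARE @csid INT;
SELECT @csid = collection_set_id
	FROM dbo.syscollector_collection_sets
	WHERE name = N'systemhealthdeadlock';
 
-- If this complains about the running state, stop the set first with
-- sp_syscollector_stop_collection_set and start it again afterward.
EXEC dbo.sp_syscollector_update_collection_set
	  @collection_set_id = @csid
	, @days_until_expiration = 7;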

Conclusion

Creating a custom data collector can be very handy for a DBA, especially in times of troubleshooting.  These collectors are also quite useful for trending and analysis.  Consider this a tool in the poor man’s toolchest. ;)

Enjoy and stay tuned!

All scripts and references were for SQL 2008 R2.  The Deadlock script was pulled from the 2012 book, but the script runs the same for the 2008 version of the AdventureWorks database.

On the Fifth Day…

Today we will have a quick change of pace.  This will be less about maintenance and more about internals.

Today’s topic is one of particular interest to me as well.  The topic is a small section of a presentation I like to give at User Groups and at SQL Saturdays.

Before we take this diversion from the under-carriage to something more related to the engine, let’s have a brief recap of the previous four days of pre-Christmas.

So far the gifts of pre-Christmas have included the following articles.

  1. Exercise for msdb – Day 4
  2. Backup, Job and Mail History Cleanup - 3rd Day
  3. Service Broker Out of Control - 2nd Day
  4. Maint Plan Logs - 1st Day

On the Fifth Day of pre-Christmas…

My DBA gave to me a much more compact database.  Surprisingly, it can be as much as 20% of the original size.

I know, I know.  I’ve heard it before, but this is not the compression used by DoubleSpace and DriveSpace that we were given many years ago.  This really is much better.

And yes I have heard about the performance factor too.  That is a topic for another discussion.  As an aside, when it comes to performance, I always tell people that they must test for themselves because mileage will vary.

No, what I want to talk about is much different.  I want to talk about the CD Array (at the page level) and the new data types specific to compression that you may encounter within the CD Array.

CD Array Data Types

SQL Server introduces us to 13 data types that are used within the CD Array when Compression has been enabled.  Twelve of these data types can be seen when Row Compression has been enabled on an object.  The thirteenth data type is only applicable when page compression has been implemented.

There is no guarantee that any or all of these data types will be present on a page related to an object that has been compressed using either Row Compression or Page Compression.  If you want to find these data types on a compressed page, you may have to do a little hunting.

To demonstrate that these data types exist and that they can be found, I have a sample script.

USE CompressTest	-- assumes a sandbox database named CompressTest already exists
GO
 
IF EXISTS(SELECT NAME FROM sys.tables WHERE name=N'CDTypes')
BEGIN
	DROP TABLE dbo.CDTypes
END
 
IF EXISTS(SELECT NAME FROM sys.tables WHERE name=N'CDTypes2')
BEGIN
	DROP TABLE dbo.CDTypes2
END
 
CREATE TABLE dbo.CDTypes( 
	SomeNull INT
	,SomeBit BIT
	,Some1Byte INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
	,Some2Byte CHAR(2)
	,Some3Byte CHAR(3)
	,Some4Byte CHAR(4)
	,Some5Byte CHAR(5)
	,Some6Byte CHAR(6)
	,Some7Byte CHAR(7)
	,Some8Byte CHAR(8)
	,SomeLong CHAR(9)
	,SomeBit2 BIT
) WITH (DATA_COMPRESSION=ROW) ;	
 
 SELECT TOP 20000
        SomeID       = IDENTITY(INT,1,1),
        SomeInt      = ABS(CHECKSUM(NEWID()))%50000+1,
        SomeLetters2 = REPLICATE(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                     + CHAR(ABS(CHECKSUM(NEWID()))%26+65),10),
        SomeMoney    = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
        SomeDate     = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
        SomeHex12    = RIGHT(NEWID(),12)
   INTO dbo.CDTypes2
   FROM Master.dbo.SysColumns t1,
        Master.dbo.SysColumns t2;
GO
 
/*
Let's create a Clustered Index
*/
CREATE CLUSTERED INDEX CompressSTable ON  dbo.CDTypes2(SomeID) WITH (DATA_COMPRESSION = PAGE);
GO
 
/* Insert Data */
INSERT INTO dbo.CDTypes
        ( SomeNull, SomeBit,  Some2Byte, Some3Byte,Some4byte, Some5byte
        , Some6byte, Some7byte, Some8byte, SomeLong, Somebit2)
VALUES  (NULL, 0,  '12', '123', '1234', '12345', '123456', '1234567', '12345678', '123456789', 1)
Go
 
DBCC IND(CompressTest, 'CDTypes', 1)
go
 
/* Make sure CDTypes2 is page compressed */
ALTER INDEX CompressSTable ON CDTypes2 REBUILD WITH (DATA_COMPRESSION = PAGE);
 
DBCC IND(CompressTest, 'CDTypes2', 1)
go
 
DBCC TRACEON(3604)
go
/*
Set Trace 3604 To Get Output to SSMS
Then take a look at the Page
 
--or--
Use the tableresults method
Both methods shown here
 
*/
 
-- The page numbers below came from the DBCC IND output on my system; substitute your own
DBCC PAGE('CompressTest', 1, 20392, 3) WITH tableresults
go 
DBCC PAGE('CompressTest', 1, 24889, 3) --in this exercise the second page of type 1 is usually page compressed
go

Now let’s take a look at the different data types, starting with the 12 available with Row Compression.

0x00 NULL SomeNull
0x01 EMPTY SomeBit
0x02 ONE_BYTE_SHORT Some1Byte
0x03 TWO_BYTE_SHORT Some2Byte
0x04 THREE_BYTE_SHORT Some3Byte
0x05 FOUR_BYTE_SHORT Some4Byte
0x06 FIVE_BYTE_SHORT Some5Byte
0x07 SIX_BYTE_SHORT Some6Byte
0x08 SEVEN_BYTE_SHORT Some7Byte
0x09 EIGHT_BYTE_SHORT Some8Byte
0x0a LONG SomeLong
0x0b BIT_COLUMN SomeBit2

If we look at page 20392 (yours will likely be different), we will find all of these data types present.  We will also note that this page should show (COMPRESSED) PRIMARY_RECORD – which indicates that the page is Row Compressed.  When hunting for these data types, it is a good idea to make sure the page is compressed first.  In the supplied table, you can see what data type matches to which column in the table we created via the script.  The table also provides a short description of what that data type represents (as you would see in the CD Array).

If we now want to explore and find the 13th data type, we need to look at the second table we created in the attached script – CDTypes2.  Notice that this table has been page compressed.  I even did that twice.  I did this to make sure the data was page compressed (occasionally when testing I could not easily find a page compressed page unless I page compressed a second time).

Much the same as was done with Row Compression, we need to verify that a page is Page Compressed before searching up and down trying to find this 13th type.  To find a Page Compressed page, we need to look in the CompressionInfo section of the page for CI_HAS_DICTIONARY.  If this notation is present, the page is Page Compressed.
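
Rather than eyeballing the raw page dump for that flag, the tableresults output can be captured to a table and searched.  A rough sketch, using the database name and a page id from the demo script (your page ids will differ):

-- Capture DBCC PAGE output and search it for the CI_HAS_DICTIONARY flag.
CREATE TABLE #PageOutput (
	ParentObject VARCHAR(255),
	Object VARCHAR(255),
	Field VARCHAR(255),
	VALUE VARCHAR(MAX)
);
 
INSERT INTO #PageOutput
	EXEC ('DBCC PAGE(''CompressTest'', 1, 24889, 3) WITH tableresults');
 
SELECT ParentObject, Object, Field, VALUE
	FROM #PageOutput
	WHERE Field LIKE '%CI_HAS_DICTIONARY%'
	   OR VALUE LIKE '%CI_HAS_DICTIONARY%';
 
DROP TABLE #PageOutput;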

Here is a page snippet with the pertinent data type.

The CD array entry for Column 4 is what we are interested in here.  In this particular record, we see the thirteenth data type of 0x0c (ONE_BYTE_PAGE_SYMBOL).  This data type can appear for any of the columns in the CD array and can be different for each record on the page.  This is due to how Page Compression works (which is also a different discussion).

For this last example, I want to backtrack a bit and point out that I used page 24889 to query for the Symbol data type in the CD Array.  This page was different every time I re-ran the test script.  One thing that was consistent with how I set up the exercise is that the page compressed page with this data type was always the second page of type 1 (data page) in the results from DBCC IND.

Conclusion

I hope this helped demystify compression in SQL Server – a little.  This is only one piece of compression but an interesting one in my opinion.  Use the provided script and play with it a little bit to get some familiarity.

Stay tuned for the 6th day of pre-Christmas.

Let’s Talk About Joins

Published on: December 11, 2012

T-SQL Tuesday #37

This month, please join us in the T-SQL blog party that happens on the second Tuesday of the month.  It is hosted this month by Sebastian Meine (@sqlity).

Sebastian has a month-long blog series going this month, all about joins.  You can read all about that fun in his post titled A Join A Day – Introduction.

 

This is a good topic, and I pondered for a bit what to write this month.  I immediately went to a topic that I had on my to-blog list.  Unfortunately, I had already covered that topic once upon a time.  But since it was the first thing my mind went to, I feel it would be good to re-hash it.  I may change it up a bit, I may not – we’ll just have to see where this post takes us.  So jump on board the Crazy Train for a little Salt-N-Pepa remix – Let’s Talk About Joins.

Business Requirement

I have some tables that I need to query.  One of the tables has lookup information with a bitmask applied to the id field.  Another table references this table but the ids can be a direct match or an indirect match to the id of the lookup table.  For this case, they will always only differ by the value of 1 if the two values do not directly match.  No other match possibility is considered for this example.

Based on this need, the solution dictates some sort of bitwise math.  I have several examples of how a join could be written to accomplish this primary objective.  I am only going to show the possible ways of performing this.  In my environment these all yield the same results and the data is unique and large enough (4.2 million records).  I will compare performance of these different queries in a later post as I demonstrate a query tuning method to drive the query down from nine seconds to 100ms or less.  For the record, I would choose any of queries 5, 6, or 7 for this particular need based on plan cost and performance.
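
If you want to play along at home, the queries below reference only two objects: a SourceType lookup table and a RumorView that references it, each with an integer SourceID.  The actual schema is not shown in this post, but a minimal, hypothetical scaffold might look like this:

-- Hypothetical scaffold only; the real schema behind these queries is not shown in the post.
CREATE TABLE dbo.SourceType (
	SourceID INT NOT NULL PRIMARY KEY,	-- bitmasked id (low bit used as a flag)
	Source VARCHAR(50) NOT NULL
);
 
CREATE TABLE dbo.RumorView (	-- stand-in table for the view used in the post
	RumorID INT NOT NULL,
	PersonRumorID INT NOT NULL,
	PersonID INT NOT NULL,
	SourceID INT NOT NULL	-- matches SourceType.SourceID exactly, or is greater by 1
);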

The JOINS

First up is the ANSI style INNER Join using addition in one of the conditions as well as an OR to fulfill the second portion of the business requirement.

PRINT 'Query 1 -- Join with Or and source+1'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV
		INNER Join SourceType US
			ON (PPV.SourceID = US.SourceID
				Or PPV.SourceID = US.SourceID+1)

This is probably the easiest to understand and it performs well enough.  Until running into this business requirement, I hadn’t considered putting an OR in the JOIN conditions.  But it makes sense considering that an AND can be used there.

Next is a NON-ANSI style of JOIN.

PRINT 'Query 2 -- Non-Ansi Join with Or and source+1'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV, SourceType US
	WHERE (PPV.SourceID = US.SourceID
				Or PPV.SourceID = US.SourceID+1)

Through 2008 R2, this works just as well as the ANSI JOIN already shown.  I haven’t tested in SQL 2012 but I do know that the NON-ANSI syntax of *= (for example) no longer works.  I am not a big fan of this style JOIN because it is far too easy to end up with a Cartesian product.

Another type of JOIN that I like is the use of APPLY.

PRINT 'Query 3 -- Cross Apply with Or and source+1'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV
		Cross Apply SourceType US
	WHERE (PPV.SourceID = US.SourceID
				Or PPV.SourceID = US.SourceID+1)

This particular code segment is the equivalent of the first query shown, and it is the last in the set that uses basic math and an OR in the JOIN conditions.  The remaining queries all rely on bitwise operations to perform the JOIN.  Again, until this particular need, I had never even considered using a bitwise operation in a JOIN.  First in this series is the NON-ANSI style JOIN.

PRINT 'Query 4 -- Non-Ansi with COALESCE and Bit compare'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV, SourceType US
	WHERE (PPV.SourceID|1 = COALESCE(US.SourceID|1,Us.SourceID))

The big change here is in the WHERE clause.  Notice the | operator applied to SourceID on both sides of the comparison, including inside the COALESCE.  That | is a bitwise OR.  From MSDN: “The bits in the result are set to 1 if either or both bits (for the current bit being resolved) in the input expressions have a value of 1; if neither bit in the input expressions is 1, the bit in the result is set to 0.”

So I am comparing the bit values of 1 and the SourceID.  The SourceID from RumorView will create a match meeting the requirements put forth, thanks in large part to the bitwise OR being performed on both sides of the equality in the WHERE clause.  It is also worth mentioning that the COALESCE is completely unnecessary in this query, but I am leaving it as a pseudo reference point for the performance tuning article that will be based on these same queries.
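
To see why OR-ing both sides with 1 satisfies the off-by-one requirement, here is a quick scratch query: setting the low bit maps an even id and the odd id directly above it to the same value.

-- Scratch example: setting the low bit with |1 maps an even id and the odd id
-- one above it to the same value, which is exactly the off-by-one match we need.
SELECT 4 | 1 AS FourOrOne,  -- 5
       5 | 1 AS FiveOrOne,  -- 5
       6 | 1 AS SixOrOne;   -- 7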

Next on tap is the CROSS Apply version.

PRINT 'Query 5 -- Cross with Bit compare'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV
		Cross Apply SourceType US
	WHERE (PPV.SourceID|1 = US.SourceID|1)

And finally, here are the last two queries, which the optimizer equates to the same query.

PRINT 'Query 6 -- Join with Bit compare'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV
		INNER Join SourceType US
			ON (PPV.SourceID|1 = US.SourceID|1)
------
PRINT 'Query 7 -- Join with ISNULL and Bit compare'
------
SELECT TOP 1000
		PPV.RumorID,PPV.PersonRumorID,PPV.PersonID
		,US.Source,Us.SourceID, PPV.SourceID
	FROM	RumorView PPV
		INNER Join SourceType US
			ON (PPV.SourceID|1 = ISNULL(US.SourceID|1,Us.SourceID))

The query optimizer in this case is smart and eliminates the ISNULL.  These two queries use the same exact plan, have the same cost and the same execution statistics.  The version with COALESCE is considered more expensive and takes longer to run than these queries.  It is also important to note that the Cross Apply Join also produces the exact same plan as these two queries.
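
If you want to verify the equivalence on your own system, runtime statistics alongside the actual execution plans will confirm it; a simple harness:

-- Wrap any pair of the queries above to compare logical reads and CPU time,
-- and enable "Include Actual Execution Plan" in SSMS to compare the plans.
SET STATISTICS IO, TIME ON;
-- ... run Query 6 here ...
-- ... run Query 7 here ...
SET STATISTICS IO, TIME OFF;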

Conclusion

So there you have it.  Many different ways to write the JOIN for this little query.  Performance and results may vary.  It is good to have a few different ways of writing this particular JOIN.  During my testing, it was evident that various methods performed better under different circumstances (such as how the indexes were configured – which will be discussed in the follow-up article).

Yes, I did re-use my previous post on this particular topic.  That said, I want to add another tidbit.

This Post just Goes On and On

When joining objects, you should take care as to the data type used in the join.  Without going into implicit conversions in joins, I just want to discuss briefly the impact of data type choice in a join.  This is where knowing your data, the workload, and the usage patterns is going to be an absolute necessity.

That said, you should not be surprised by a change in the performance of your queries if you were to change from a varchar() datatype to a varbinary() datatype.  I’m not saying that the query is going to perform better – only that the performance may change.  As an example, I was able to turn a well-performing query into a very poorly performing query by changing from varchar to varbinary.  On the flip side, I have seen the reverse also prove true.  It all boils down to proper data types for the data.
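
As a contrived illustration of the point (my own sketch, not the workload from the post), you can store the same keys once as varchar and once as varbinary and compare the join plans and statistics:

-- Same key values stored as varchar vs. varbinary; compare the plans and stats
-- of the two self-joins. Results will vary with indexes and data distribution.
CREATE TABLE #AsVarchar (KeyCol VARCHAR(20) NOT NULL PRIMARY KEY);
CREATE TABLE #AsVarbinary (KeyCol VARBINARY(20) NOT NULL PRIMARY KEY);
 
INSERT INTO #AsVarchar (KeyCol)
SELECT TOP (10000) CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(20))
	FROM sys.all_columns a CROSS JOIN sys.all_columns b;
 
INSERT INTO #AsVarbinary (KeyCol)
SELECT CAST(KeyCol AS VARBINARY(20)) FROM #AsVarchar;
 
SELECT COUNT(*) FROM #AsVarchar v JOIN #AsVarchar v2 ON v.KeyCol = v2.KeyCol;
SELECT COUNT(*) FROM #AsVarbinary b JOIN #AsVarbinary b2 ON b.KeyCol = b2.KeyCol;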
