Extended Events and Data Types

Published on: April 14, 2015

TSQL2sDay150x150

Today is another installment in the long-running monthly TSQL Party (a.k.a. TSQL2SDAY).

This month we have an open invitation from Mike Donnelly (blog | twitter), asking us to talk about something new we have learned and then to teach about it. You can read the invitation in Mike’s own words on his blog.

The topic is both pretty straightforward and easy on the one hand, while somewhat difficult on the other. Mike said: “The topic this month is straight forward, but very open ended. You must learn something new and then write a blog post explaining it.” I find the topic difficult because I will usually blog about a topic when I have learned something new anyway. On the other hand, sharing new stuff is pretty straightforward and enjoyable. Enter the brain split!

So, what have I learned recently?

Quite a bit.  But what would I really like to share on this occasion?

For today, I would like to share more information about extended events.  XEvents are great.  There is a wealth of information to be garnered from XEvents.  Furthermore, XEvents provide a great opportunity to keep learning.

While researching some events to help troubleshoot a specific issue, it dawned on me that there was some info that I had only looked at when I was looking at specific events.  I started wondering how much of that info was out there.  So here I will share some of this information that is available to you via queries within SQL Server.  Much of this info is attainable through the re-purposing of some scripts I shared previously – here.

Custom Data Types

This wasn’t too much of a surprise because I had seen these throughout and taken advantage of the custom data types to get better information. That said, I might consider these custom data types to be more of the EAV model coming through than true custom data types. One can expose them through an evaluation of the data in the sys.dm_xe_map_values DMV. Let’s take a look at a script that lays the groundwork to see these data types.
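A sketch of such a script (a rough reconstruction rather than any exact original) joins sys.dm_xe_object_columns to sys.dm_xe_map_values:

[codesyntax lang="tsql"]
/* Sketch: expose the "custom data types" (map values) behind event columns.
   This is a reconstruction, not necessarily the original script. */
SELECT oc.object_name AS event_name,
       oc.name        AS column_name,
       oc.type_name   AS data_type,
       mv.map_key,
       mv.map_value
FROM sys.dm_xe_object_columns oc
INNER JOIN sys.dm_xe_map_values mv
        ON mv.name = oc.type_name
       AND mv.object_package_guid = oc.type_package_guid
WHERE oc.column_type = 'data'
ORDER BY oc.object_name, oc.name, mv.map_key;
[/codesyntax]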

Evaluating this data, one will see that in addition to the “standard” data types such as integer, there is a “wait_types” data type. This data type maps to all of the wait types available through Extended Events. Additionally, the event associated with each of these custom data types is exposed through this query. When getting ready to use an Extended Event, knowing the kinds of data that will be exposed through a data point in the session will make the session data more usable. Knowing there is a custom data type (yes, it is really just a key-value pair) can be extremely helpful.

Collection Flags

Many of the available events have “customizable” collection flags exposed.  Understanding that these collection flags can be on or off is essential to saving some hair.  Not all data is automatically collected for all events.  Some prime examples of such events that do not automatically collect certain pieces of data are sp_statement_completed and object_created.  The nice thing about these flags is that they have a value of “customizable” in the column_type field.  Another good thing with these flags is that the description field gives a little documentation on what the behavior should be for the “on” and “off” states.
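A query such as the following sketch lists those customizable flags along with their default values and descriptions:

[codesyntax lang="tsql"]
/* Sketch: list the customizable collection flags exposed by each event. */
SELECT o.name          AS event_name,
       oc.name         AS flag_name,
       oc.column_type,
       oc.column_value AS default_value,
       oc.description
FROM sys.dm_xe_objects o
INNER JOIN sys.dm_xe_object_columns oc
        ON oc.object_name = o.name
       AND oc.object_package_guid = o.package_guid
WHERE o.object_type = 'event'
  AND oc.column_type = 'customizable'
ORDER BY o.name, oc.name;
[/codesyntax]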

There is a good reason that some of those may be off by default.  The addition of this information may cause an additional load or may be information overload.  It is up to the consumer to determine if the data is going to be of significant importance.  Once determined, enable or disable the flag as appropriate.

These queries provide a good amount of information about the extent of custom data types as well as the collection flags that may be available to use when creating event sessions in SQL Server.  Understanding that this data and these options are there is important to capturing better event info.

Audit who Dropped the Database

Published on: April 7, 2015

In the first article on this topic (which can be read here), I discussed the problem of having a database get dropped and the need to find out who dropped the database and when they dropped it.  In that article, I demonstrated how the information (at least some of it) could be found through querying the default trace.  It is a good enough solution if you have not disabled the default trace.

On the other hand, what if you need to do something more flexible?  What if you wanted to track this down through a more robust tool such as extended events?  XEvents has what you need to be able to properly track these types of events.  In this article, I share how to use Extended Events to capture this kind of audit data.

There are a few things to note with this event session. First, I am trapping two separate events: sqlserver.object_deleted and sqlserver.object_created. Next, I have to enable a flag on both events to trap the database name. That is done like this: SET collect_database_name = (1). The last note is something you will notice after enabling the session and performing a couple of trials: the events I am using will fire twice for every DROP or CREATE operation. This happens because of the ddl_phase field on each event. One event fires for the start of the operation and another fires when the operation commits or hits a rollback. Because of this, I am outputting the ddl_phase in my query that reads the session data. Let’s run a quick test and see how this data can be queried.

Also of importance is to note the file path for the output file. If the path does not exist or you do not have permissions on the directory, you will get an error message and the session will not be created.
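First, a sketch of the event session itself, followed by a quick create/drop test. The session name, actions, and file path here are placeholders; adjust them for your environment.

[codesyntax lang="tsql"]
/* Sketch only: session name, actions, and file path are placeholders.
   Without predicates this captures all object create/drop events, not just databases. */
CREATE EVENT SESSION [AuditDatabaseDDL] ON SERVER
ADD EVENT sqlserver.object_created (
    SET collect_database_name = (1)
    ACTION (sqlserver.server_principal_name, sqlserver.client_hostname, sqlserver.sql_text)
),
ADD EVENT sqlserver.object_deleted (
    SET collect_database_name = (1)
    ACTION (sqlserver.server_principal_name, sqlserver.client_hostname, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'C:\Database\XE\AuditDatabaseDDL.xel')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [AuditDatabaseDDL] ON SERVER STATE = START;
GO

/* Quick test: create and drop a throwaway database. */
CREATE DATABASE DropMeTest;
GO
DROP DATABASE DropMeTest;
GO
[/codesyntax]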

Nice and simple for a test. This should be enough to have trapped some data for our XEvent session.  Let’s parse the data and take a look.
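A sketch of that parsing query follows; the file path must match whatever the session uses, and you could further filter on the object_type data element to see only database-level events.

[codesyntax lang="tsql"]
/* Sketch: read and shred the session output file. Adjust the path to match your session. */
SELECT ev.value('(@timestamp)[1]', 'datetime2')                                AS event_time,
       ev.value('(@name)[1]', 'varchar(64)')                                   AS event_name,
       ev.value('(data[@name="database_name"]/value)[1]', 'nvarchar(128)')     AS database_name,
       ev.value('(data[@name="ddl_phase"]/text)[1]', 'varchar(16)')            AS ddl_phase,
       ev.value('(action[@name="server_principal_name"]/value)[1]', 'nvarchar(128)') AS login_name,
       ev.value('(action[@name="client_hostname"]/value)[1]', 'nvarchar(128)') AS host_name,
       ev.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)')        AS sql_text
FROM (
        SELECT CAST(event_data AS XML) AS event_data
        FROM sys.fn_xe_file_target_read_file(N'C:\Database\XE\AuditDatabaseDDL*.xel', NULL, NULL, NULL)
     ) AS src
CROSS APPLY src.event_data.nodes('/event') AS x(ev)
ORDER BY event_time;
[/codesyntax]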

With the XEvent session started, and a database that has been created and dropped, this last query will produce some data to help track who has been adding or removing databases from your SQL Instance.  If you have run the example code, you should see data very similar to what is shown in this next image.

xe_results

In the attached image, I have separated the pair of statements for the DB create from the pair of statements for the DB drop (recall that I mentioned each will fire twice). I hope this serves you well in your attempts to rein in control of your environment and discover who may be creating rogue databases or even dropping the databases that are essential to the business. Enjoy!!

Audit Schema Change Report

Published on: March 11, 2015

In a recent article on SSG, I discussed how to use Extended Events to function in a Profiler-like fashion. You can read about that here. I recommend reading that article first because it lays some of the groundwork for this article.

Within Management Studio, from a right-click context menu, there is an option to run a report called “Schema Changes History”. The name of this report suggests that there is some means to find when a change occurred in the schema, and potentially what the change was as well as who made the change. It does say “Schema Changes History” after all.

If you run the report, you would be able to see a report similar to the following.

Change_Report

 

This looks like it could be a pretty useful report.  If I drill into the collapsed items, I can get better information.  But, how exactly is this report producing this kind of data?  I don’t have a database audit running to trap this information.  I also don’t have SSDT-BI installed, so I can’t export the report and check the report definition.

I could run a trace and refresh the report data and see what pops there.  I would really hate to use Profiler though, and I really don’t want to go through the hassle of creating a server side trace to figure out the source data definitions.  Then the lightbulb moment occurs.  I already have an XE session that can trace the SQL queries just like Profiler.  Why not use a better tool to do the job and satisfy my curiosity at the same time?

So, go read that article I wrote for SSG to get the XE session definition and follow along a little better. With that XEvent session running, I can refresh the report data and start to evaluate what is happening in order to produce the Audit report.

By this point, you have probably thought to yourself that the source of the data must be coming from the default trace.  If you thought that, then you are correct.  It is a good presumption based on a knowledge of what kind of data is being tracked in the default trace.  But I really wanted to know for certain what the source of the data was.  As I perused the XEvent session data, I noticed several queries similar to the following in the results.

XE_Capture

Yes, I am using the GUI (*shrug*) in this image to explore the data from the session.  That is just for display friendliness.  But, as you look through those queries you start to see a few patterns and some obvious signs that the source of the data is the default trace.  Here is a cleaner look at the code being used to get the data-set.

Now, I must confess that I altered it a bit to make sure it was doing what I thought. Just a little thing like swapping the missing table variable definition for a quick insert into a temp table. But the reality is, it is pretty much the source code of the report. It is not the prettiest code, but it works.

Due to the lack of prettiness to the code and the seemingly overly complex means to get to the end result, I decided I wanted to “tune” it up a little bit.
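Here is a sketch of that tuned-up approach. It is a rough reconstruction of the idea rather than the exact query; the event class IDs map to Object:Created, Object:Deleted, and Object:Altered in the default trace.

[codesyntax lang="tsql"]
/* Sketch: a simplified schema-change history query against the default trace.
   Assumes the default trace is still enabled. */
DECLARE @TraceFile NVARCHAR(260);

SELECT @TraceFile = t.path
FROM sys.traces t
WHERE t.is_default = 1;

SELECT te.name AS EventName,
       gt.DatabaseName,
       gt.ObjectName,
       gt.ObjectType,
       gt.StartTime,
       gt.LoginName,
       gt.HostName,
       gt.ApplicationName
FROM sys.fn_trace_gettable(@TraceFile, DEFAULT) gt
INNER JOIN sys.trace_events te
        ON te.trace_event_id = gt.EventClass
WHERE gt.EventClass IN (46, 47, 164)    -- Object:Created, Object:Deleted, Object:Altered
  AND gt.ObjectType <> 21587            -- ignore auto-created statistics objects
ORDER BY gt.StartTime DESC;
[/codesyntax]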

Now, I have something that is easier to read and maintain (my opinion) and works well.  It can also be used easily enough in an RDL should you wish to create an additional report different from the canned report in SSMS.  Or just continue to use it from within Management Studio and check the results without the overhead of the report presentation.

Effects of Max Mem on Plan Cache

While reading a Microsoft article (which can be found here) to make sure I understood what can cause a query plan to be recompiled or removed from the plan cache, I got to thinking a bit.

The article lists the following as things that can cause a plan to be removed from cache or to be recompiled.

  • Execution plans remain in the procedure cache as long as there is enough memory to store them. When memory pressure exists, the Database Engine uses a cost-based approach to determine which execution plans to remove from the procedure cache. To make a cost-based decision, the Database Engine increases and decreases a current cost variable for each execution plan.
  • Changes made to a table or view referenced by the query (ALTER TABLE and ALTER VIEW).
  • Changes made to a single procedure, which would drop all plans for that procedure from the cache (ALTER PROCEDURE).
  • Changes to any indexes used by the execution plan.
  • Updates on statistics used by the execution plan, generated either explicitly from a statement, such as UPDATE STATISTICS, or generated automatically.
  • Dropping an index used by the execution plan.
  • An explicit call to sp_recompile.
  • Large numbers of changes to keys (generated by INSERT or DELETE statements from other users that modify a table referenced by the query).
  • For tables with triggers, if the number of rows in the inserted or deleted tables grows significantly.
  • Executing a stored procedure using the WITH RECOMPILE option.

The first item removes a plan from cache, while the rest mark the plan as invalid so a recompile will be forced on the next execution of the SQL statement. But then the question comes: why make the distinction between the two? Doesn’t removing a plan from cache in essence equate to a recompile further down the road when the query is executed again? While semantically it is not a recompile, it sure sounds a lot like one. The query was compiled, invalidated and removed from cache, and then compiled again when executed the next time around.

When thinking about all of this, how can we see that memory pressure can “invalidate” query plans?  That is actually quite simple.  We can see the same net effects with the following experiment.

WARNING: DO NOT DO THIS ON ANY ENVIRONMENT THAT IS NOT A SANDBOX ENVIRONMENT.

In this setup, I have a Sandbox server with a simulated load that keeps plenty of plans in cache and SQL Server is happy with about eight gigabytes of memory allocated to it.  To show that memory can cause a bunch of plans to be invalidated (and removed from the plan cache), I am going to take memory away from SQL Server and in essence simulate a memory leak or memory pressure.
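Here is a sketch of that experiment. The memory value is a placeholder (and drastic on purpose), and it assumes 'show advanced options' is already enabled. Again: sandbox only.

[codesyntax lang="tsql"]
/* SANDBOX ONLY. Sketch of the experiment; the memory value is a placeholder. */
SELECT COUNT(*) AS PlansInCacheBefore
FROM sys.dm_exec_cached_plans;

EXEC sp_configure 'max server memory (MB)', 1024;   -- far lower than the ~8 GB starting point
RECONFIGURE;

/* Confirm the new setting is in effect immediately. */
SELECT c.value_in_use                      AS MaxServerMemoryMB,
       pm.physical_memory_in_use_kb / 1024 AS SQLMemoryInUseMB
FROM sys.configurations c
CROSS JOIN sys.dm_os_process_memory pm
WHERE c.name = 'max server memory (MB)';

SELECT COUNT(*) AS PlansInCacheAfter
FROM sys.dm_exec_cached_plans;
[/codesyntax]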

This shows that the settings have taken effect on SQL Server, and all of this without a service restart (despite the myth I keep hearing that one is required).

Max Memory in SQL Server

Seeing that the memory settings are taking immediate effect, we can turn our attention to that query that was run.  In the query to experiment with this, I took a count of the number of plans in the plan cache first, then changed the max memory, and then took a count of the number of plans in cache again.  This next image shows the results of those two plan count queries.  It is obvious here that the number of plans in cache was severely impacted by this change in Max memory when taking TOO much memory away from SQL Server (this is why you should only do this on a sandbox server).

Max Mem effect on Plan Cache

 

If you happen to take too much memory away from SQL Server, you could end up in a severe memory pressure situation where connections are limited.  Even trying to increase max memory could be severely hampered due to insufficient resources in the default resource pool.  Here is a blog post that demonstrates the effects on connections due to this error (even as bad as SQL Server not starting up).

Since the problem is in the default resource pool, you could try a connection to the DAC and modify the max memory there. If the pressure is severe enough, you may be limited in your ability to connect to the DAC as well. In that case, you may need to start up in single-user mode as referenced in that prior blog post. Sometimes, it is enough to reduce the number of connections or resource requirements to regain access to the server and change the max memory appropriately. Suffice it to say, do not run this on a production type of box unless you are prepared to cause a bit of an outage. The intent here is solely to show that max memory can have an immediate impact on compiles and plans in cache.

Ghosts – an eXtrasensory Experience

ghostrip_fire

This is the last article in a mini-series diving into the existence of ghosts and how to find them within your database.

So far this has been a fun and rewarding dive into Elysium to see and chat with these entities.  We have unearthed some means to be able to see these things manifesting themselves in the previous articles.  You can take a look at the previous articles here.

For this article, I had planned to discuss another undocumented method to look into the ghost records and their existence based on what was said on an msdn blog.  But after a lot of research, testing and finally reaching out to Paul Randal, I determined that won’t work.  So that idea was flushed all the way to Tartarus.

Let it be made very clear that DBTABLE does not offer a means to see the ghosts.  Paul and I agree that the other article that mentioned DBTABLE really should have been referring to DBCC Page instead.

Despite flushing the idea to Tartarus, it was not a fruitless dive.  It just was meaningless for the purpose of showing ghosts via that DBCC command.  I still gained value from the dive!!

All of that said, the remainder of the plan still applies and it should be fun.

Really, at this point what is there that hasn’t been done about the ghosts?  Well, if you are well tuned to these apparitions, you may have received the urge to explore them with Extended Events – sometimes called XE for short.

As has been done in the past, before we board Charon’s boat to cross the River Styx to Hades to find these ghosts in Elysium, one really needs to run the setup outlined here.

With the framework in place, you are now ready to explore with XE.
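A quick query of the XE metadata (a sketch) reveals the candidates:

[codesyntax lang="tsql"]
/* Sketch: list the events with "ghost" in the name. */
SELECT p.name AS package_name,
       o.name AS event_name,
       o.description
FROM sys.dm_xe_objects o
INNER JOIN sys.dm_xe_packages p
        ON p.guid = o.package_guid
WHERE o.object_type = 'event'
  AND o.name LIKE '%ghost%'
ORDER BY o.name;
[/codesyntax]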

Look at that! There are several possible events that could help us track these ghosts.  Or at the least we could get to know how these ghosts are handled deep down in the confines of Hades, err I mean the database engine.

Ghost_XE

 

From these possible events, I opted to work with ghost_cleanup and ghost_cleanup_task_process_pages_for_db_packet.  The sessions I defined to trap our ghost tracks are as follows.
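Here is a sketch of those two sessions. The database_id value and the histogram bucketing column are assumptions; adjust them for your demo database.

[codesyntax lang="tsql"]
/* Sketch only: the database_id value and the histogram source column are assumptions. */
CREATE EVENT SESSION [GhostHunt] ON SERVER
ADD EVENT sqlserver.ghost_cleanup (
    WHERE (sqlserver.database_id = (10))            -- id of the demo database
)
ADD TARGET package0.histogram (
    SET filtering_event_name = N'sqlserver.ghost_cleanup',
        source = N'database_id',                    -- bucket the cleanups by database
        source_type = (0)                           -- 0 = event column, 1 = action
)
WITH (STARTUP_STATE = OFF);
GO

CREATE EVENT SESSION [SoulSearch] ON SERVER
ADD EVENT sqlserver.ghost_cleanup_task_process_pages_for_db_packet (
    WHERE (sqlserver.database_id = (10))
)
ADD TARGET package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [GhostHunt] ON SERVER STATE = START;
ALTER EVENT SESSION [SoulSearch] ON SERVER STATE = START;
[/codesyntax]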

You can see there are two sessions defined for this trip down the Styx.  Each session aptly named for our journey.  The first (GhostHunt) is defined to trap ghost_cleanup and sends that information to a histogram target.  The second (SoulSearch) is defined to use the other event, and is configured to send to the ring_buffer.  Since the second event has a “count” field defined as a part of the event, it will work fine to just send it to the ring buffer for later evaluation.

Once I have the traps, I mean event sessions, defined, I can now resume the test harness from the delete step as was done in the previous articles. The following delete is what I will use.

Prior to running that delete though, I checked the Event Session data to confirm a starting baseline.  Prior to the delete, I had the following in my histogram target.

 

predelete_count

 

After running the delete, and checking my histogram again, I see the following results.

post_count

 

You can see from this that in addition to the 25 pre-existing ghosts, we had another 672 ghosts (666 of which were from the delete).

This is how I was able to investigate the GhostHunt Extended Event Histogram.
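That investigation boils down to querying the histogram target, along the lines of this sketch (using the GhostHunt session name from above):

[codesyntax lang="tsql"]
/* Sketch: read the histogram target for the GhostHunt session. */
SELECT s.name                                   AS session_name,
       hs.slot.value('(value)[1]', 'int')       AS database_id,
       hs.slot.value('(@count)[1]', 'int')      AS ghost_cleanup_count
FROM sys.dm_xe_sessions s
INNER JOIN sys.dm_xe_session_targets t
        ON t.event_session_address = s.address
CROSS APPLY (SELECT CAST(t.target_data AS XML) AS target_data) AS td
CROSS APPLY td.target_data.nodes('/HistogramTarget/Slot') AS hs(slot)
WHERE s.name = N'GhostHunt'
  AND t.target_name = N'histogram';
[/codesyntax]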

But what about looking at the other event session?

Let’s look at how we can go and investigate that session first and then look at some of the output data.
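A sketch of that investigation against the ring_buffer target follows. The field names for this event (database_id, page_id, count) are assumptions; check sys.dm_xe_object_columns for the exact names.

[codesyntax lang="tsql"]
/* Sketch: shred the ring_buffer for the SoulSearch session. */
SELECT x.ev.value('(@timestamp)[1]', 'datetime2')                     AS event_time,
       x.ev.value('(data[@name="database_id"]/value)[1]', 'int')      AS database_id,
       x.ev.value('(data[@name="page_id"]/value)[1]', 'bigint')       AS page_id,
       x.ev.value('(data[@name="count"]/value)[1]', 'int')            AS ghost_count
FROM (
        SELECT CAST(t.target_data AS XML) AS target_data
        FROM sys.dm_xe_sessions s
        INNER JOIN sys.dm_xe_session_targets t
                ON t.event_session_address = s.address
        WHERE s.name = N'SoulSearch'
          AND t.target_name = N'ring_buffer'
     ) AS rb
CROSS APPLY rb.target_data.nodes('/RingBufferTarget/event') AS x(ev);
[/codesyntax]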

ghostclean

 

Cool!  Querying the SoulSearch session has produced some information for various ghosts in the database.  Unlike the histogram session that shows how many ghosts have been cleaned, this session shows us some page ids that could contain some ghosts – in the present.  I can take page 1030111 for instance and examine the page with DBCC PAGE as follows.
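Something along these lines, where the database name and file id are placeholders:

[codesyntax lang="tsql"]
/* Sketch: database name and file id are placeholders; the page id came from the SoulSearch output. */
DBCC TRACEON(3604);                        -- route DBCC PAGE output to the client
DBCC PAGE('GhostDemoDB', 1, 1030111, 3);
[/codesyntax]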

 

 

pagealt

Look at that page and result!! We have found yet another poltergeist.

RIP

Once again we have been able to journey to the depths of the Database engine and explore the ghosts that might be there.  This just happens to illustrate a possible means to investigate those ghosts.  That said, I would not necessarily run these types of event sessions on a persistent basis.  I would only run these sessions if there seems to be an issue with the Ghost cleanup or if you have a strong penchant to learn (on a sandbox server).

Some good information can be learned.  It can also give a little insight into how much data is being deleted on a routine basis from your database.  As a stretch, you could even possibly use something like this to get a handle on knowing the data you support.  Just be cautious with the configuration of the XE and understand that there could be a negative impact on a very busy server.  And certainly proceed at your own risk.

Database Ghosts

phantasmrip

Recently you may have seen my first article on this topic over at SQL Solutions Group. If not, here is the link so you can read that article first before proceeding here. This article is intended as the first of a two-part follow-up to that article.

Now that you have read that other article, we can recap it a little bit.

You will recall that in that article, I discussed the presence of ghosts in your database. I also discussed that those ghosts are a good thing. They have some benefits, one of which is a bit of a performance boost for some operations, such as the rollback of a delete.

In that article I discussed one method with which you could see these ghost records.  In this article, I would like to share an alternate method to be able to see these ghosts.

In order to explore this alternate means, let’s go ahead and follow the first few steps from the other article to get the setup complete so our second ghost hunting foray can begin.

Now with the setup complete, we should once again confirm that we have appropriate data available for the hunt.  Once again a simple query can suffice to show the data.

Upon execution of the check script, we should see something similar to the following data-set.

datasample

 

Great, the data is confirmed and we should have a wonderful chance to find some ghosts once again. In the previous attempt, we needed an additional plasma blaster in the form of a trace flag. In this attempt we will hunt these ghosts without that tool and see what we can find. Why attempt it this way? Well, simply because I’d rather not use a trace flag if it is not necessary. If you recall, that trace flag had a couple of noted effects, one of which was that it turned off the ghost cleanup process. If I can avoid it, I’d rather leave the containment unit intact.

Now, due to that clerical error of putting Halloween on the wrong date, we need to follow the prescribed policy to delete records prior to creating the proper records.
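The delete looks something like the following sketch; the table and column names are stand-ins for whatever the setup script created.

[codesyntax lang="tsql"]
/* Sketch: table and column names are stand-ins for the setup referenced above. */
BEGIN TRANSACTION;

DELETE FROM dbo.SpookyGhosts
WHERE CelebrationDate = '20131101';   -- remove the records created on the wrong date
[/codesyntax]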

Notice that we issued the delete in a transaction this time around.  We are going to leave that transaction open for a bit while we hunt those ghosts.

The first step is to verify that some ghosts might be present.  To do that we should run a query such as the following.
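A sketch of such a query, using sys.dm_db_index_physical_stats (the table name is a stand-in):

[codesyntax lang="tsql"]
/* Sketch: check for ghost records via index physical stats. */
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.index_id,
       ips.index_type_desc,
       ips.ghost_record_count,
       ips.version_ghost_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SpookyGhosts'), NULL, NULL, 'DETAILED') ips
WHERE ips.ghost_record_count > 0;
[/codesyntax]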

From that query we should see something like the following.

idxstats_ghosts

Once again we are on the right track.  We can further confirm the existence of these ghosts through a little more investigation and monitoring.  Let’s try the trick with the dblog function again.
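That trick is a query against the undocumented fn_dblog function, filtered to the ghost context. A sketch:

[codesyntax lang="tsql"]
/* Sketch: peek at log records that mark rows as ghosts (fn_dblog is undocumented). */
SELECT [Current LSN],
       Operation,
       Context,
       [Page ID],            -- file:page in hex, e.g. 0001:00000f88
       [Transaction ID],
       AllocUnitName
FROM fn_dblog(NULL, NULL)
WHERE Context = 'LCX_MARK_AS_GHOST';
[/codesyntax]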

With that query, we should see something like the following result set.

dblog_ghosts

 

Very cool.  We once again can see that these phantasms are in the database.  We have enough information that we can proceed on to the next stage.  We can pass the PageID into DBCC PAGE in order to investigate the ghosts on the page.  If we use the PageID that is circled with the green in the preceding result set, we can get a better feel for these specters.
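Converting the page portion of that Page ID from hex to decimal, the DBCC PAGE call looks something like this sketch (database name and page number are placeholders):

[codesyntax lang="tsql"]
/* Sketch: database name and page number are placeholders. */
DBCC TRACEON(3604);                  -- route DBCC PAGE output to the client
DBCC PAGE('GhostDemoDB', 1, 3976, 3);
[/codesyntax]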

And we may see results such as the following.

page_ghosts2

Recall that the log says this page has ghost records on it. When we check the page with DBCC PAGE we can definitely see that there are ghosts on the page. This is very cool. Now, had we tried to check for ghost records on the PFS page, we would not have been able to see the ghost count as we could when the trace flag was enabled.

Once again we have been able to display the existence of ghosts in the database.  In order to get these ghosts to move on from the database to their afterlife, we merely need to commit the transaction or roll back the transaction.

Stay tuned for the next article in this mini-series about ghosts.  Who knows, we may even have a bonus Halloween article since this is Halloween month.

Part III of the series can now be found here.

Effects of sp_rename on Stored Procedures

There comes a time when mistakes are made.  Sometimes those mistakes can be as annoying as a spelling mistake during the creation of a stored procedure.  When a mistake such as that happens, we are given a few choices.  One could either rename the stored procedure, drop and recreate the stored procedure or simply leave the mistake alone.

When choosing to rename the stored procedure, one may quickly reach for the stored procedure that can be readily used for renaming various objects. That procedure was provided by Microsoft after all and is named sp_rename. Reaching for that tool, however, might be a mistake. Here is what is documented about the use of sp_rename to rename a stored procedure. That documentation can be read at this link on MSDN.

We recommend you do not use this statement to rename stored procedures, triggers, user-defined functions, or views; instead, drop the object and re-create it with the new name.

And later in the same documentation, one can read the following.

Renaming a stored procedure, function, view, or trigger will not change the name of the corresponding object name in the definition column of the sys.sql_modules catalog view. Therefore, we recommend that sp_rename not be used to rename these object types. Instead, drop and re-create the object with its new name.

Now, a chief complaint against dropping and recreating the stored procedure, as recommended, is that the process can cause permissions issues. I am less concerned about the permissions issues and see that as more of a nuisance that is easily overcome thanks to good documentation and a few quick script executions to restore the permissions. Despite that, I think we might have a means to address both the rename and the permissions issue, which will be shared later in this article.

Using sp_rename

When using sp_rename, it would be good to understand what happens and what one might expect to see.  Let’s use the following script to create a stored procedure to step through an exercise to rename a stored procedure and evaluate the results.
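Here is a sketch of that exercise; the object names are placeholders, and the misspelling in the original name is deliberate.

[codesyntax lang="tsql"]
/* Sketch: create a procedure with a misspelled name, rename it, then inspect the metadata. */
CREATE PROCEDURE dbo.RenameMee            -- deliberate spelling mistake
AS
BEGIN
    SELECT TOP (3) name, object_id, create_date
    FROM sys.objects
    ORDER BY create_date DESC;
END
GO

EXEC sp_rename 'dbo.RenameMee', 'RenameMe';
GO

SELECT o.name AS object_name,
       m.definition                       -- still shows the old (misspelled) name
FROM sys.objects o
INNER JOIN sys.sql_modules m
        ON m.object_id = o.object_id
WHERE o.name = 'RenameMe';
GO
[/codesyntax]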

When I execute that series of batches, I will get an output that matches the following.

renameme

 

When looking at the results we can see that the use of sp_rename does indeed change the name of the stored procedure as it is represented via sys.objects and metadata.  We can also see that the definition of the stored procedure does not change as it is held within the metadata.

If I choose to check the definition through the use of OBJECT_DEFINITION()  instead of sys.sql_modules, you will be pleased to know that sys.sql_modules calls OBJECT_DEFINITION() to produce the definition that is seen in the catalog view.
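For example, using the sketch object from above, both of these return the same (still misspelled) definition text:

[codesyntax lang="tsql"]
SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.RenameMe')) AS proc_definition;

SELECT definition
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.RenameMe');
[/codesyntax]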

Well, that does pose a potential problem.  We see that the object definition is unchanged and may report the name as being different than what the object name truly is.  What happens if I execute the stored procedure?  Better yet, if I can execute the stored procedure and then capture the sql text associated to that plan, what would I see?
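Executing it is as simple as:

[codesyntax lang="tsql"]
EXEC dbo.RenameMe;   -- the renamed procedure runs despite the stale definition in metadata
[/codesyntax]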

Yes!  The renamed stored procedure does indeed execute properly.  I even get three results back for that execution.  Better yet, I get an execution plan which I can pull a plan_hash from in order to evaluate the sql text associated to the plan.  In case you are wondering, the execution plan does contain the statement text of the procedure.  But for this case, I want to look at the entire definition associated to the plan rather than the text stored in the plan.  In this particular scenario, I only see the body of the procedure and not the create statement that is held in metadata.

plan

For this particular execution and plan, I can see a plan_hash of 0xE701AFB2D865FA71.  I can now take this and provide it to the following query to find the full proc definition from metadata.
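A sketch of that query follows; the hash shown is from my run, so substitute your own.

[codesyntax lang="tsql"]
/* Sketch: resolve the full sql text associated with the plan via its query_plan_hash. */
SELECT qs.query_plan_hash,
       st.text AS full_definition
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE qs.query_plan_hash = 0xE701AFB2D865FA71;
[/codesyntax]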

And after executing that query, I can see results similar to the following.

execplancache_text

 

Now is that because in some way the query that was just run was also running OBJECT_DEFINITION()?  Let’s look at the execution plan for both OBJECT_DEFINITION() and the query that was just run.

obj_def_plan

 

Looking at the XML for that particular plan, we see the XML supporting that plan. There is no further function call-out, and the plan is extremely simple.

Now looking at the plan for the query involving the query_plan_hash we will see the following.

fngetsql

 

Looking at this graphical plan, we can see that we are calling FNGETSQL. Looking at the XML for this plan, we can verify that FNGETSQL is the only function call used to retrieve the full sql text associated to this plan execution. FNGETSQL is an internal function for SQL Server used to build internal tables that might be used by various DMOs. You can read just a bit more about that here.

What now?

After all of that, it really looks pessimistic for sp_rename. The procedure renames the object but does not properly handle the metadata and stored procedure definition. So does that mean we are stuck with drop and create as the Microsoft documentation suggests?

If you have access to the full procedure definition you could issue an alter statement.  In the little example that I have been using, I could issue the following statement.
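For the sketch procedure used above, that would look like:

[codesyntax lang="tsql"]
/* Re-issue the definition under the correct name so the metadata catches up. */
ALTER PROCEDURE dbo.RenameMe
AS
BEGIN
    SELECT TOP (3) name, object_id, create_date
    FROM sys.objects
    ORDER BY create_date DESC;
END
GO
[/codesyntax]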

After executing that script, I could check sys.sql_modules once again and find a more desirable result.

And my results…

finallymatching

 

If you don’t have the text to create the proc, you could use SSMS to script it out for you. It is as simple as right-clicking the proc in question, selecting Modify, and then executing the script. It should script out with the correct proc name (the beauty of SMO) and then you can get the metadata all up to snuff in your database.

Of course, if you prefer, you could just drop and recreate the procedure.  Then reapply all of the pertinent permissions.  That is pretty straight forward too.

Scaring a Database Near you

Published on: October 31, 2013

r2

Something I have a hard time by-passing is a good R2 unit. I have R2 units in so many different forms, including a standing R2 cake one year for my birthday. So when I came across this R2 unit, I just had to share it.

That is a pumpkin carved into the resemblance of R2-D2.  I think it is a mighty fine job too.  It’s amazing how many good Star Wars related pumpkin carvings there are out there.  You probably wouldn’t have too difficult a time finding three or four hundred if you tried a little google-fu.

Each year I try to have something for the Halloween Holiday such as this one or this one.  I failed to provide something in 2012, and this is getting back on the right track.

Despite the ease to find haunting Halloween effects related to SQL Server, I am amazed at how few have even heard of “Halloween Protection” which stems from the “Halloween Problem.”

I am not going to dive into the problem or the protection of it.  I think that has been covered plenty and even quite masterfully by Paul White (blog | twitter).  I recommend that you read his four part series on the topic starting here.

With all of the cosplay going on here in the States, I find some of the scarier things to be stuff I have previously fixed, or about which I have written or tweeted, or all of the above.

Take for instance this article about the blatant disregard by some vendors and clients in regards to security.  I still can’t figure out why the public role would ever need to be dbo for a database – at least not a legitimate reason.

Or we can take the recent time I tweeted about a cursor that I fixed. I took that scary cursor down from a 30+ hour run time to a mere 50 seconds. Here is a segment of the execution plan (the plan is roughly 4 MB in size, to give a little scale) zoomed out to 1/5th.

optimized_segment

 

The query was much uglier than that originally.  Imagine that beast looping through on your server for 30 hrs, and that is not even the entire thing.  It is little wonder why things started to drag on the server.

Another scary item I like is the effect of implicit conversions.  That is a topic that can be viewed pretty easily through the use of google-fu.  Here is a short demo on the implications of implicit conversions.

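A rough sketch of that demo follows. The names and loop mechanics are assumptions rather than the original script, and your timings will certainly differ.

[codesyntax lang="tsql"]
/* Sketch of the demo: three temp tables, 10,000 rows each, then a join with and
   without an implicit conversion. */
SET NOCOUNT ON;

CREATE TABLE #T1 (SomeInt INT);
CREATE TABLE #T2 (SomeInt INT);
CREATE TABLE #T3 (SomeReal NVARCHAR(20));    -- named for a datatype it does not use

DECLARE @i INT = 1;
WHILE @i <= 10000
BEGIN
    INSERT INTO #T1 (SomeInt)  VALUES (@i);
    INSERT INTO #T2 (SomeInt)  VALUES (@i);
    INSERT INTO #T3 (SomeReal) VALUES (@i);  -- integer implicitly converted on the way in
    SET @i += 1;
END

SET STATISTICS TIME ON;

/* No implicit conversion: INT joined to INT. */
SELECT COUNT(*)
FROM #T1 t1
INNER JOIN #T2 t2 ON t2.SomeInt = t1.SomeInt;

/* Implicit conversion: INT compared to NVARCHAR. */
SELECT COUNT(*)
FROM #T1 t1
INNER JOIN #T3 t3 ON t3.SomeReal = t1.SomeInt;

SET STATISTICS TIME OFF;
[/codesyntax]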

In this demo I have created three temp tables. Each is pretty simple in nature and each is to receive 10,000 records. The insert statement just inserts an integer into each field of each table through the while loop. Notice that I intentionally named a column in #T3 SomeReal even though the datatype is NVARCHAR. This is to underscore a pet peeve of mine that I have seen over and over again – naming a field after a datatype that doesn’t even match.

When this query runs, I get the following timing results.

timing

 

The thing that stands out to me is the huge difference in time between the implicit-free query and the query replete with an implicit conversion. The implicit conversion query was about 930 times slower than the query free of implicit conversions. Granted, that query was against a cold cache, so let’s see what happens with an average of five runs each against a warm cache.

grim

With a warm cache I see an average of 51 ms for the implicit-free query. On the other hand, the implicit conversion query runs at an average of 84,525 ms. That equates to about 1644 times slower.

Sure this was a contrived example.  But keep in mind the table sizes, the datatypes and the number of records in each table.  Had this been a more true to life example with larger tables and millions of records, we could be seeing a query that is far more devastating due to the implicit conversions.  Let’s just call it the grim reaper of your database. (Grim Reaper from www.mysticalpassage.com/grim_reaper.html)

With these horrifying things to haunt your database, I leave you with this new hope as you battle the dark side and the grim reaper lurking in your database.

swbattle

Database In Recovery

Published on: June 4, 2012

What do we do?

Have you ever run into a database that is in the “In Recovery” state?

If that has happened, have the bosses and/or end users come to you asking “What do we do?” or “When will it be done?”  They probably have – it is inevitable.

The question is, what do you do when you run into a database that is in this state?

We all know that it doesn’t help much if we are panicked about the issue – that just feeds the already growing anxiety. If you feel anxiety, that’s OK; just don’t show it to the end users or to the boss. You need to portray to them that you are on top of the issue.

While trying to keep everybody calm and apprised of the situation, you would probably like some assurances for yourself that the database is progressing to a usable state.  That is what I want to share today – a little query that I wrote for this very instance.

Anxiety Tranquilizer

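Here is a sketch of that query, trimmed to the essentials described below; the database name is a placeholder.

[codesyntax lang="tsql"]
/* Sketch: watch a database that is "In Recovery". Database name is a placeholder. */
SELECT d.transaction_id,
       a.name                                   AS transaction_name,   -- 'Recovery Allocation Locks' for the numbered transaction
       DB_NAME(d.database_id)                   AS database_name,
       d.database_transaction_begin_time,
       d.database_transaction_state,
       d.database_transaction_log_record_count, -- watch this change between executions
       d.database_transaction_next_undo_lsn     -- and this
FROM sys.dm_tran_database_transactions d
LEFT OUTER JOIN sys.dm_tran_active_transactions a
        ON a.transaction_id = d.transaction_id
WHERE d.database_id = DB_ID('SomeRecoveringDB');
[/codesyntax]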

Unfortunately, this query does not show the time remaining for the rollback, nor the percent complete, without querying the error log. Those would be awesome additions if you know how to get them some other way (and please let me know). Thanks to a blog post by Tim Loqua for the base info on querying the error log for the percent complete.

I think the key component on this query is the LEFT OUTER JOIN to sys.dm_tran_active_transactions.  This is essential since the recovery is shown in two transactions.  One transaction is numbered and is the placeholder for the un-numbered transaction where the work is actually being done.  In the numbered transaction, you should see a transaction name of “Recovery Allocation Locks” and nothing for the unnumbered transaction.

Now, unnumbered is not entirely accurate because that transaction has an id of 0, but you will not find a correlating transaction for that in the sys.dm_tran_active_transactions DMV.

The transactions displayed here will be displayed until recovery is complete.  That also means that if you really wanted to, you could create a table to log the recovery process by inserting the results from this query into it.  Then you could revisit the table and examine at closer detail what happened during recovery.

The anxiety killer in this query is to watch two columns in the unnumbered transaction: database_transaction_log_record_count and database_transaction_next_undo_lsn. I rerun the query multiple times throughout the recovery process and check those columns to ensure the data in them is changing. Changing values in those fields means you are seeing actual progress, which can provide some comfort (even though we know in the back of our heads that it is progressing).

Move the T-log file of the Mirror Database

Published on: April 2, 2012

On more than one occasion, I have run into a case where a large database with a large transaction log was being mirrored.  Then a hiccup comes along – the transaction log on the primary server grows to a point where the transaction log on the secondary server fills the drive.  Now you have a problem and you need to reclaim some space or you need to find more disk.

Well, just because the database is mirrored and the drive is out of space doesn’t mean there is nothing that can be done without significantly impacting the primary server – or the mirror.

SQLCat has a nice write-up on what can be done.  I have used this method a few times, and felt that it needs to be brought up.  You can read the SQLCat article here.

In short (here are the steps from that article), you can do the following:

  1. On the secondary server, use ALTER DATABASE ... MODIFY FILE to move the log file:

ALTER DATABASE <db_name> MODIFY FILE (NAME = LOG_FILE, FILENAME = 'new location');

  2. Stop the SQL Server Service for the instance which has the mirrored (secondary) database.
  3. Move the log file to the new location specified in the MODIFY FILE script already run.
  4. Start the SQL Server Service for the instance which has the mirrored (secondary) database.

Performing these steps can be just the trick needed to save the day.  Performing this move this way has saved me an outage on more than one occasion.  Also, this has saved me hours of work that could come along with having to break and rebuild the mirror.
