IP and Default Trace…T-SQL Tuesday #005

Comments: 9 Comments
Published on: April 6, 2010

As chance would have it, I had been checking Adam’s blog daily for the last few days to find the next T-SQL Tuesday.  Not having seen it, I started working on my next post yesterday evening.  The fortunate thing is that the post I was working on fits well with this month’s topic.  T-SQL Tuesday #005 is being hosted by Aaron Nelson.  And now I have one more blog to bookmark since I didn’t have his bookmarked previously.

This month the topic is Reporting.  Reporting is a pretty wide open topic but not nearly as wide open as the previous two months.  Nonetheless, I think the topic should be handled pretty easily by most of the participants.  My post deals with reporting on user connections.  I have to admit that this one really just fell into my lap as I was helping somebody else.

The Story

Recently I came across a question on how to find the IP address of a connection.  From experience I had a couple of preconceptions about what could be done to find this information.  SQL Server gives us a few avenues for finding that data, and there are some good reasons to track it as well.  However, the context of the question was different than I had envisioned.  The person already knew how to obtain the IP address and was well on their way with a well-formed query.  All that was needed was the final piece of the puzzle: integrating sys.dm_exec_connections with the existing query, which pulls information from trace files that exist on the server.

After exploring the query a bit, it also became evident that other pertinent information could prove quite useful in a single result set.  The trace can be used to pull back plenty of useful information depending on the needs.  Here is a query that you can use to explore that information and determine for yourself the pertinent information for your requirements, or if the information you seek is even attainable through this method.
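A sketch of such an exploratory query follows. It reads the default trace files via fn_trace_gettable and maps event IDs to names; the exact column list you need will depend on your requirements.

```sql
/* Explore what the server-side trace files contain.
   sys.traces supplies the file path; fn_trace_gettable reads it. */
SELECT  TE.name AS EventName
       ,TT.DatabaseName
       ,TT.LoginName
       ,TT.HostName
       ,TT.ApplicationName
       ,TT.SPID
       ,TT.StartTime
FROM sys.traces T
CROSS APPLY fn_trace_gettable(T.path, DEFAULT) TT
INNER JOIN sys.trace_events TE
    ON TT.EventClass = TE.trace_event_id
WHERE T.is_default = 1
ORDER BY TT.StartTime DESC;
```

Browsing these results should make it clear which columns are attainable through this method for your particular report.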

Now, I must explain that there is a problem with joining the trace files to the DMVs.  The sys.dm_exec_connections DMV maintains only current connection information, as does the sys.dm_exec_sessions DMV.  Thus mapping the trace file to these DMVs can be problematic when looking for historical information.  So now the conundrum is how to make this work.  For the data returned in the report to be meaningful, some additional constraints would have to be placed on the query.  But let’s first evaluate the first go-around with this requirement.

Query Attempts
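The first attempt simply joined the trace data to the connections DMV on SPID alone. A sketch of that shape (columns trimmed for brevity):

```sql
/* Naive attempt: join historical trace rows to current connections on SPID.
   This cross-populates current IP addresses onto old trace rows. */
SELECT  TT.LoginName
       ,TT.HostName
       ,TT.StartTime
       ,C.client_net_address
FROM sys.traces T
CROSS APPLY fn_trace_gettable(T.path, DEFAULT) TT
LEFT JOIN sys.dm_exec_connections C
    ON TT.SPID = C.session_id
WHERE T.is_default = 1;
```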

While evaluating this query, one may spot the obvious problem.  If not, a quick execution of the query will divulge it.  SPIDs are not unique; they are re-used.  Thus when joining historical information against currently running connections, one is going to get inaccurate results.  Essentially, for this requirement we have no certain knowledge of what the IP address was for the connections showing up in the trace files.  The IP addresses of the current connections will cross-populate the historical rows and render the results inaccurate.

My next step gets us a little closer.  I decided to include more limiting information in the join to the sys.dm_exec_connections view.  The way I determined to do this was to also include the login name as a condition.  In order to get that, I needed to include sys.dm_exec_sessions in my query.  To make it all come together, I pulled that information into a CTE.  Here is the new version.
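A sketch of that CTE-based version, joining on both SPID and login name:

```sql
/* Tighter join: require the login name to match as well as the SPID. */
;WITH SessionConn AS (
    SELECT  S.session_id
           ,S.login_name
           ,C.client_net_address
    FROM sys.dm_exec_sessions S
    INNER JOIN sys.dm_exec_connections C
        ON S.session_id = C.session_id
)
SELECT  TT.LoginName
       ,TT.HostName
       ,TT.StartTime
       ,SC.client_net_address
FROM sys.traces T
CROSS APPLY fn_trace_gettable(T.path, DEFAULT) TT
LEFT JOIN SessionConn SC
    ON TT.SPID = SC.session_id
   AND TT.LoginName = SC.login_name
WHERE T.is_default = 1;
```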

The information pulled back is much cleaner now.  But wait: this mostly reflects current connections, or connections from the same person who happens to have the same SPID as a previous time that person connected.  Yes, this is an inherent problem with combining historical information with the current connection information on the server.

Solution

My recommendation for capturing IP address information, along with the person who connected, their computer hostname, and the time that they connected, is to do so pre-emptively.  (I know this diverges from the report for a minute, but it is necessary setup for the report.)  A solution I have implemented in the past is to use a logon trigger that records the pertinent information to a table.

Tables
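The original table definitions were not preserved here; below is a minimal sketch of what they might look like. The database name, table names, and columns are illustrative, not the original implementation.

```sql
/* Hypothetical audit tables; names and columns are illustrative. */
USE AdminDB;  -- assumed administrative database
GO
CREATE TABLE dbo.LogonAudit (
     AuditID    INT IDENTITY(1,1) PRIMARY KEY
    ,LoginName  SYSNAME
    ,HostName   VARCHAR(128)   -- ClientHost from EventData()
    ,LogonTime  DATETIME DEFAULT GETDATE()
);
GO
CREATE TABLE dbo.LogonViolations (
     ViolationID INT IDENTITY(1,1) PRIMARY KEY
    ,LoginName   SYSNAME
    ,HostName    VARCHAR(128)
    ,LogonTime   DATETIME DEFAULT GETDATE()
);
GO
```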

Trigger
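A sketch of a logon trigger along these lines follows. It pulls the ClientHost from EventData() and refuses logons for a sensitive account from unapproved hosts. The account name, host list, table names, and the EXECUTE AS login are all assumptions for illustration.

```sql
/* Logon trigger sketch: log all logons and block a hypothetical account
   from unapproved hosts, logging the violation as well. */
CREATE TRIGGER LogonAuditTrigger
ON ALL SERVER WITH EXECUTE AS 'sa'  -- needs rights to insert into the audit tables
FOR LOGON
AS
BEGIN
    DECLARE @data  XML          = EVENTDATA();
    DECLARE @login SYSNAME      = @data.value('(/EVENT_INSTANCE/LoginName)[1]', 'SYSNAME');
    DECLARE @host  VARCHAR(128) = @data.value('(/EVENT_INSTANCE/ClientHost)[1]', 'VARCHAR(128)');

    IF @login = N'AppLogin'  -- hypothetical sensitive account
       AND @host NOT IN ('<local machine>', '10.0.0.50')  -- hypothetical approved sources
    BEGIN
        ROLLBACK;  -- refuse the connection; writes after this point still persist
        INSERT INTO AdminDB.dbo.LogonViolations (LoginName, HostName, LogonTime)
        VALUES (@login, @host, GETDATE());
    END
    ELSE
        INSERT INTO AdminDB.dbo.LogonAudit (LoginName, HostName, LogonTime)
        VALUES (@login, @host, GETDATE());
END;
GO
```

Note the ROLLBACK comes before the violation insert: in a logon trigger, a rollback undoes any prior writes, while writes after it persist.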

This solution gets the IP address from the ClientHost field of EventData(), which is the host where the logon originated.  That should be enough to trace back to the source of the connection.  Also note that the application was not included here; that is information that could be gained from the trace files.  What this does is create a structure for logging logon events as well as preventing logons from unauthorized sources using an account that has been compromised (while logging that action as well).  With this, I now have a solution that complements the trace file.  I could query both and create a more accurate report of the logon activity that has occurred historically (depending on retention policy, etc.).  This last method is somewhat limited by the EventData function and the XML schema that comes with it.  You can check it out here.

Reports

So here are a couple of parting solutions that would be able to give us the relevant data for tracking these historical logons.

The above is a simple query to pull back the information from both the violations and successful logon audit tables.
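Assuming audit tables along the lines of the logon trigger solution (table names here are illustrative), that combined report might look like this:

```sql
/* Combined report over the violation and successful-logon audit tables. */
SELECT 'Success' AS LogonResult, LoginName, HostName, LogonTime
FROM AdminDB.dbo.LogonAudit
UNION ALL
SELECT 'Violation' AS LogonResult, LoginName, HostName, LogonTime
FROM AdminDB.dbo.LogonViolations
ORDER BY LogonTime DESC;
```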

The following is a solution for combining the previous requirement (as I have translated it to truly correlate to Host rather than IP).

The last such report that I wish to share related to this topic is to find current activity.  Since the original idea was to combine the trace with the DMV, I decided that I needed to at least look at the DMV for a relevant query.  These DMVs provide current activity and need to be combined to provide a quick snapshot of the activity that is occurring on the server at a given moment.  One quick snapshot can tell me who is logged in, from what IP address, the application, read and write activity, and the SQL being executed.
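A sketch of such a snapshot query, combining the sessions, connections, and requests DMVs:

```sql
/* Current activity snapshot: who, from where, which application,
   read/write activity, and the SQL currently executing. */
SELECT  S.login_name
       ,C.client_net_address
       ,S.host_name
       ,S.program_name
       ,R.reads
       ,R.writes
       ,ST.text AS current_sql
FROM sys.dm_exec_sessions S
INNER JOIN sys.dm_exec_connections C
    ON S.session_id = C.session_id
LEFT JOIN sys.dm_exec_requests R
    ON S.session_id = R.session_id
OUTER APPLY sys.dm_exec_sql_text(R.sql_handle) ST
WHERE S.is_user_process = 1;
```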

There are several queries out there similar to this last one.  The idea is to capture a quick glimpse based on the DMVs which in essence are a report of a subset of server relevant information.

Conclusion

When it is necessary to provide reports on activity occurring on the server, it pays to do a little prep work.  Be prepared.  It is not an easy task to go back in time and report on data that wasn’t captured.  The little prep work that one may need to do is well worth the effort in the end.

Edit:  Corrected 1 detail and fixed some formatting.

SQL Servers Black Box Recorder – Def Trace

Across many professions and industries there is often a need for some sort of device that “audits” everything that happens with a device or process. We are probably all quite familiar with the infamous black box recorders used by the passenger airline and train industries. It is also fairly common knowledge that many pacemakers perform the same sort of functionality – on a smaller scale. Various metrics and data points are registered and recorded somewhere. Whether to a local storage device or by phoning home to some other remote location, the data is recorded.

We often pontificate about the what-ifs for a black box recorder within SQL Server, right? We wish and dream and go about creating our own little recorder to suit the requirements we deem necessary for successful monitoring of the server for the just-in-case scenario. Well, the truth of the matter is that we really don’t need to go to such great lengths to create a “black box recorder” because Microsoft has already done all of that for us.

Wait, what? Yes, that’s right! Truth be told this shouldn’t be much news for most Senior Level data professionals working with SQL Server for the past few years. But if you are new to the product, this might be good news for you. Then again, it might just be bad news depending on your take.

Very much like what you may find with the airline industry, the black box recorder in SQL Server is not just a single method (device) implemented to capture all of the desired data points. On a passenger jet, you may find that there are three or more such devices that contribute to the capture and recording of the in-flight data. In SQL Server, there are three major processes that help capture our in-flight data. Over the next few articles I will discuss each of these processes. These processes include:

  • Default Trace
  • system_health Extended Event Session
  • sp_server_diagnostics procedure

This multi-tiered approach does provide a sort of fail-safe system. If one should be disabled, there still remain up to two more processes that may be running. That said, each of these can be disabled but it does take a bit of an overt effort. And since it does require somebody to put out effort to try and disable each of the black box components, you could potentially capture the culprit via audits from the other components or via an explicit audit that you create for these types of purposes.

Since this will be something to discuss over a few articles, I will break out each process into an individual article. For today, I will discuss the default trace.

Default Trace

The default trace by itself is something that can be turned off via configuration option. There may be good reason to disable the default trace. Before disabling the default trace, please consider the following that can be captured via the default trace. I will use a query to demonstrate the events and categories that are configured for capture in the default trace.
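A query along these lines will list the events and categories the default trace captures:

```sql
/* Events and categories configured for capture in the default trace. */
SELECT  TE.trace_event_id
       ,TE.name AS EventName
       ,TC.name AS CategoryName
FROM sys.traces T
CROSS APPLY fn_trace_geteventinfo(T.id) EI
INNER JOIN sys.trace_events TE
    ON EI.eventid = TE.trace_event_id
INNER JOIN sys.trace_categories TC
    ON TE.category_id = TC.category_id
WHERE T.is_default = 1
GROUP BY TE.trace_event_id, TE.name, TC.name
ORDER BY TC.name, TE.name;
```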

In this query, I have requested a few more data points than strictly necessary.  That is to help illustrate an additional point: the default trace isn’t filtering out any data for these events.  If the event fires (in this trace), it is recorded.  Let’s get right back to the events and categories for now.  The results of that query produce the following sample list of events for me on SQL Server 2014:

That is quite a range of events covered by this particular recording device. From changes to objects down to security related events and even errors and warnings. If somebody drops an object, the default trace can catch it. If one of the various DBCC statements is executed, it will be trapped in this trace.

There is one thing captured by this trace that is not overly obvious. In fact, it is rather annoying in how it is recorded, in my opinion. Server configuration settings such as “cost threshold for parallelism” or the lesser utilized “user options” are not registered as an “Object:Alter” event but rather as an ErrorLog event. I wouldn’t call this type of “change” an error and don’t necessarily like seeing these registered with other legitimate errors. It would be nice to see these logged differently (they are currently logged the same way with Extended Events). So, if somebody is dorking around with server configurations, it becomes a bit more convoluted to figure it out, but we can get there. Let’s see how that works:
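A sketch of digging those configuration changes out of the default trace, filtering the ErrorLog event class on the message text:

```sql
/* Configuration changes land in the default trace as ErrorLog events
   (EventClass 22) with a recognizable message in TextData. */
SELECT  TT.StartTime
       ,TT.LoginName
       ,TT.HostName
       ,CAST(TT.TextData AS NVARCHAR(MAX)) AS MessageText
FROM sys.traces T
CROSS APPLY fn_trace_gettable(T.path, DEFAULT) TT
WHERE T.is_default = 1
  AND TT.EventClass = 22  -- ErrorLog
  AND CAST(TT.TextData AS NVARCHAR(MAX)) LIKE N'Configuration option%'
ORDER BY TT.StartTime DESC;
```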

Looking at my system, I get the following sample results:

You can see there are plenty of settings that I have been playing with captured by the default trace. If you happen to have a junior DBA or maybe a cowboy DBA that loves to shoot from the hip and make changes, this will help you find all the necessary details to bring to that DBA for discussion.

The default trace comprises one third of the black box recorder. I have shown a quick use for the default trace and have also shown some of the various events that are captured from this trace. I recommend getting to know your default trace just a little bit better. You never know when you may need to resort to the flight recorder data held in the default trace. Being familiar with the default trace before you need to use it will help improve your comfort level when under the stress of trying to figure out what happened just before the server went belly up!

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the fourth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Capture the Flag – The Trace Flag

Many people work in a tightly controlled environment. Every change requires a change control form and approval to make a change. In some environments, things are more lax, but you still wish you could tell when certain changes were made. That is even true in some of those tightly controlled environments. You know what I am talking about. Despite the controls in place there always seems to be some change that is unknown or even some cowboy that makes changes as s/he sees fit and forgets to notify people or follow process.

Then come the times when you are new to an environment and you really need to know some basic information about the servers you have just inherited. Well, truth be told, you don’t necessarily have to be new to a job to inherit some new servers. Sometimes, black-box servers end up getting dropped into your realm of responsibility.

When you run into something like this, you really need to have some means to figure out what changed, when it changed and who changed it. Having that information as a base, you would then have the tools to inquire as to why it might have been changed. One of the things that would be nice to know is around the trace flags that can be enabled in SQL Server. There are trace flags (or knobs) for so many different things within SQL Server that it would be nice to get a handle on them if they do happen to be enabled or even disabled.

As a DBA in SQL Server, you should know by now that there are multiple different ways to do all sorts of things within SQL Server. Today we will discuss three different methods that you could use to perform this investigation for yourself.

Report Style

Did you know that Management Studio comes with a dashboard report? It does! In that report, one will find a pretty decent amount of overview information as a first pass to get to know your server. If you were to right click the instance within SSMS, you can browse to the Reports/Standard Reports submenus. From there you can select the Dashboard report. The path to the reports is similar to what is displayed in the following screen capture.

The report that is of interest is circled in red in the preceding image.

If you have any trace flags enabled (globally) you can see those listed in the report similarly to the following image:

Once again, I have encircled the relevant data in red. From this instance that I queried, I currently have one trace flag enabled. As it stands, this particular trace flag is the one that disables the noise of successful backups from being recorded to your log.

This is one particularly easy method to retrieve that information. A benefit from this method is that you will be able to see some graphs and charts and so forth by using the report.

SQL Query

As logic would dictate, it would seem that we should be able to trap that information through the use of our most frequently used tool – T-SQL. Case in point: the report itself requires T-SQL to produce the dataset behind the nice display we see when the report is rendered.

Using similar code to what is used to produce the Dashboard report, we can get a decent understanding of the trace flags that might be employed on your particular instance.  Let’s take a quick look at what we could do.
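A sketch of that approach, dumping the DBCC TRACESTATUS output into a table so it can be filtered and consumed later:

```sql
/* Capture trace flag status into a table for later consumption,
   then filter for globally enabled flags. */
DECLARE @TraceFlags TABLE (
     TraceFlag INT
    ,Status    INT
    ,Global    INT
    ,Session   INT
);

INSERT INTO @TraceFlags
EXECUTE ('DBCC TRACESTATUS WITH NO_INFOMSGS');

SELECT TraceFlag, Status
FROM @TraceFlags
WHERE Global = 1;
```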

As you can see, that code is terribly simple. The crux of it boils down to a DBCC command to return the results of TRACESTATUS. Granted, this is a bit over the top and designed to dump the data into a table for later consumption (for good reason). You can legitimately get away with simply running DBCC TRACESTATUS. For the same instance of SQL Server shown in the Server Dashboard report we already explored, the results look like the following.


So far, so good. If I had several trace flags enabled globally on this instance then I would see every single one of them in the result set just displayed. Cool!

To this point, this is wonderful for anything that might be currently enabled on the instance. Should any of those trace flags be disabled shortly after being enabled, or even be disabled months after being enabled but before you run either of the two preceding methods, then you are up a creek without a paddle. That is a problem. How would you trap that cowboy sysadmin or cowboy DBA in that case? Well, you could go with a full blown audit. Or you could try a similar approach but on a somewhat smaller scale.

If you are paying close attention, you are probably thinking to yourself that the query just demonstrated seems more complex than it need be.  After all, the documentation for DBCC TRACESTATUS says that I can run that command with a -1 parameter and just have it return the global trace flags.  And in the script I am showing, I filter on Global = 1.  Well, as it turns out, the -1 option in DBCC TRACESTATUS does not work as documented.  The -1 parameter does not return the appropriate result set and that is reproducible.  If you were curious, the MSDN reference is here.  The use of that parameter value still returns session flags as well.

Extended Event

Oh my, we are talking about something really abstract and difficult now! No, not really.  This is a pretty straightforward approach and easy enough to implement. Within XE, there is an event called trace_flag_changed. If you were curious, you could use a query such as the following to find any events that might be related. The key is finding the correct key terms to find what you need/want.
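Searching the XE metadata for that event can be done with a query such as this:

```sql
/* Search the Extended Events metadata for trace-flag related events. */
SELECT  P.name AS package_name
       ,O.name AS event_name
       ,O.description
FROM sys.dm_xe_objects O
INNER JOIN sys.dm_xe_packages P
    ON O.package_guid = P.guid
WHERE O.object_type = 'event'
  AND O.name LIKE '%trace_flag%';
```

Widening the filter to `O.description LIKE '%flag%'` will surface a few more candidates, as noted below.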

Running that query will return just one result. You might find a few more if you expand your search criteria to include the term “flag” within the description field. It’s up to you to explore a bit more. As the description and name imply for this particular event, you can presume correctly that it tracks when a TF is either enabled or disabled (the state of the TF changes). Let’s call this a mini audit for a very targeted type of event.

Knowing the name of the event, let’s go ahead and look at a possible setup for an event session to trap data:
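A possible session definition follows. The target file path and the chosen actions are assumptions; adjust them to your environment.

```sql
/* A session to trap trace flag changes, with who/where/what context. */
CREATE EVENT SESSION TraceFlagChange
ON SERVER
ADD EVENT sqlserver.trace_flag_changed (
    ACTION (sqlserver.username, sqlserver.client_hostname, sqlserver.sql_text)
)
ADD TARGET package0.event_file (
    SET filename = N'C:\Database\XE\TraceFlagChange.xel'  -- assumed path
)
WITH (STARTUP_STATE = ON);
GO
ALTER EVENT SESSION TraceFlagChange ON SERVER STATE = START;
GO
```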

This gets us to a point where we can now trap all changes to a trace flag’s status (enabled/disabled). One bit of caution at this point: the event appears to fire twice for every occurrence. There is one case where it will not produce a double result for each change. That case is when the TF is already in the same state to which you are attempting to change it. So if it is enabled, trying to enable it again causes only one event to fire. However, enabling or disabling the TF when it is in the opposite state will produce two occurrences of the event. Here is a sample output to help show what happens.


Displayed in the preceding image, you will notice both scenarios. In the scenario where the TF was not enabled and I attempted to enable it, you will see two entries. And then when I disabled that TF after enabling it, I once again got a double entry. A corresponding example is encircled in green. When I tried to disable a TF that was not currently enabled, you see there is a single entry. That occurrence is marked with the orange circle.

A problem does present itself here with this result set and the use of XE.  How in the heck did I get the results from the XE Session so I could peruse it? Here is a sample script that will help parse this session data.
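A sketch of such a parsing script follows. The file path matches the session definition assumed earlier, and the payload element names (`trace_flag`, `new_state`) are assumptions; inspect the raw event XML on your version to confirm them.

```sql
/* Read and shred the event_file target for the TraceFlagChange session. */
SELECT  event_data.value('(event/@timestamp)[1]', 'DATETIME2')                    AS event_time
       ,event_data.value('(event/data[@name="trace_flag"]/value)[1]', 'INT')      AS trace_flag
       ,event_data.value('(event/data[@name="new_state"]/value)[1]', 'VARCHAR(16)') AS new_state
       ,event_data.value('(event/action[@name="username"]/value)[1]', 'SYSNAME')  AS username
FROM (
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file(
        N'C:\Database\XE\TraceFlagChange*.xel', NULL, NULL, NULL)
) X
ORDER BY event_time;
```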

Now isn’t that really helpful?

Conclusion

Today I covered three different means to help isolate the trace flags in use or the changes to the state of those trace flags. This is good information that any DBA should want to know. It is information to help get an understanding of the health and operating environment of your instance. It is also a means to help identify what might be changing under your nose!

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Need a little help with extensive audits of your servers, check out these articles.

Common Tempdb Trace Flags – Back to Basics

Once in a while I come across something that sounds fun or interesting and decide to dive a little deeper into it. That happened to me recently and caused me to preempt my scheduled post and write up something entirely different. Why? Because it seemed both fun and useful.

So what is it I am yammering on about that was fun?

I think we can probably concede that there are some best practices flying around in regards to the configuration of tempdb. One of those best practices is in regards to two trace flags within SQL Server. These trace flags are 1117 and 1118. Here is a little bit of background on the trace flags and what they do.

A caveat I have now for the use of trace flags is that I err on the same side as Kendra (author of the article just mentioned). I don’t generally like to enable trace flags unless it is very warranted for a very specific condition. As Kendra mentions, TF 1117 will impact more than just the tempdb data files. So use that one with caution.

Ancient Artifacts

With the release of SQL Server 2016, these trace flags were rumored to be a thing of the past and hence completely unnecessary. That is partially true. The trace flags are unneeded and SQL Server 2016 does behave differently, but does that mean you have to do nothing to get the benefits of these trace flags as implemented in 2016?

As it turns out, these trace flags no longer do what they did in previous editions. SQL Server now pretty much has it baked into the product. Buuuuut, do you have to do anything slightly different to make it work? This was something I came across while reading this post and wanted to double check everything. After all, I was also under the belief that it was automatically enabled. So let’s create a script that checks these things for me.
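A simplified sketch of such a check follows; the original script was considerably longer. It branches on the server version and generates (but does not run) the appropriate action statements. Note that SERVERPROPERTY('ProductMajorVersion') requires a reasonably recent build.

```sql
/* Version-aware check for TF 1117/1118; generates action scripts only. */
DECLARE @Version INT = CONVERT(INT, SERVERPROPERTY('ProductMajorVersion'));

DECLARE @TraceFlags TABLE (TraceFlag INT, Status INT, Global INT, Session INT);
INSERT INTO @TraceFlags
EXECUTE ('DBCC TRACESTATUS WITH NO_INFOMSGS');

IF @Version >= 13  -- SQL Server 2016 and later
BEGIN
    -- The flags have no effect here; flag any that are still enabled.
    SELECT  TraceFlag
           ,'DBCC TRACEOFF(' + CAST(TraceFlag AS VARCHAR(5)) + ', -1);' AS ActionScript
    FROM @TraceFlags
    WHERE TraceFlag IN (1117, 1118)
      AND Global = 1;
END
ELSE
BEGIN
    -- Older versions: generate enable statements for any missing flags.
    SELECT  TF.Flag AS TraceFlag
           ,'DBCC TRACEON(' + CAST(TF.Flag AS VARCHAR(5)) + ', -1);' AS ActionScript
    FROM (VALUES (1117), (1118)) TF(Flag)
    WHERE NOT EXISTS (SELECT 1 FROM @TraceFlags T
                      WHERE T.TraceFlag = TF.Flag AND T.Global = 1);
END
```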

Holy cannoli, Batman – that is more than a simple script, right? Well, it may be a bit of overkill. I wanted it to work for versions before, including, and after SQL Server 2016 (when these sweeping changes went into effect). You see, I am checking for versions where the TF was required to make the change and also for versions after the change where the TF has no effect. In 2016 and later, these settings are database scoped and the TF is unnecessary.

The database scoped settings can actually be queried in 2016 more specifically with the following query.
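Those checks boil down to something like the following:

```sql
/* TF 1118 equivalent: mixed page allocations (sys.databases, 2016+).
   OFF means uniform extents only, matching the old TF 1118 behavior. */
SELECT name, is_mixed_page_allocation_on
FROM sys.databases
WHERE name = 'tempdb';

/* TF 1117 equivalent: grow all files together (sys.filegroups, 2016+). */
USE tempdb;
SELECT name, is_autogrow_all_files
FROM sys.filegroups;
```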

In this query, I am able to determine if mixed_page_allocations and if is_autogrow_all_files are enabled. These settings can be retrieved from sys.databases and sys.filegroups respectively. If I run this query on a server where the defaults were accepted during the install, I would see something like the following.

You can see here that the default settings on my install show something different than the reported behavior.  While autogrow all files is enabled, mixed_page_allocations is disabled.  This matches what we would expect from enabling trace flags 1117 and 1118 – for the tempdb database at least.  If I look at a user database, I will find that mixed page allocations is still disabled by default, but autogrow_all_files is also disabled.

In this case, you may or may not want a user database to have all data files grow at the same time. That is a great change to have implemented in SQL Server with SQL 2016. Should you choose to enable it, you can do so on a database by database basis.

As for the trace flags? My query checks to see if maybe you enabled them on your instance or if you don’t have them enabled for the older versions of SQL Server. Then the script generates the appropriate action scripts and allows you to determine if you want to run the generated script or not. And since we are changing trace flags (potentially) I recommend that you also look at this article of mine that discusses how to audit the changing of trace flags. And since that is an XEvent based article, I recommend freshening up on XEvents with this series too!

The Wrap

In this article I have introduced you to some basics in regards to default behaviors and settings in tempdb along with some best practices. It is advisable to investigate some of these recommendations from time to time and confirm what we are really being told so we can avoid confusion and misinterpretation.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Changing Default Logs Directory – Back to Basics

Every now and then I find a topic that seems to fit perfectly into the mold of the theme of “Back to Basics”. A couple of years ago, there was a challenge to write a series of posts about basic concepts. Some of my articles in that series can be found here.

Today, my topic to discuss is in regards to altering the default logs directory location. Some may say this is no big deal and you can just use the default location used during install. Fair enough, there may not be massive need to change that location.

Maybe, just maybe, there is an overarching need to change this default. Maybe you have multiple versions of SQL Server in the enterprise and just want a consistent folder to access across all servers so you don’t have to think too much. Or possibly, you want to copy the logs from multiple servers to a common location on a central server and don’t want to have to code for a different directory on each server.

The list of reasons can go on and I am certain I would not be able to list all of the really good reasons to change this particular default. Suffice it to say, there are some really good requirements out there (and probably some really bad ones too) that mandate the changing of the default logs directory to a new standardized location.

Changes

The logs that I am referring to are not the transaction logs for the databases – oh no no no! Rather, I am referring to the error logs, the mini dumps, and the many other logs that may fall into the traditional “logs” folder during the SQL Server install. Let’s take a peek at a default log directory after the install is complete.

I picked a demo server that has a crap load of stuff available (and yeah not so fresh after install) but where the installation placed the logs by default. You can see I have traces, default XE files, some SQL logs, and some dump files. There is plenty going on with this server. A very fresh install would have similar files but not quite as many.

If I want to change the Log directory, it is a pretty easy change but it does require a service restart.

In SQL Server Configuration Manager, navigate to services then to “SQL Server Service”. Right click that service and select properties. From properties, you will need to select the “Startup Parameters” tab. Select the parameter with the “-e” and errorlog in the path. Then you can modify the path to something more appropriate for your needs and then simply click the update button. After doing that, click the ok button and bounce the SQL Service.

After you successfully bounce the service, you can confirm that the error logs have been migrated to the correct folder with a simple check. Note that this change impacts the error logs, the default Extended Events logging directory, the default trace directory, the dumps directory, and many other things.

See how easy that was? Did that move everything over for us? As it turns out, it did not. The old directory will continue to have the SQL Agent logs. We can see this with a check from the Agent log properties like the following.

To change this, I can execute a simple stored procedure in the msdb database and then bounce the SQL Agent service.
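A sketch of that change; the directory path here is an example. The sp_set_sqlagent_properties procedure is undocumented but commonly used for this purpose.

```sql
/* Point the SQL Agent error log at the new directory (example path). */
USE msdb;
GO
EXEC msdb.dbo.sp_set_sqlagent_properties
    @errorlog_file = N'C:\Database\Logs\SQLAGENT.OUT';
GO
-- Then restart the SQL Server Agent service for the change to take effect.
```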

The agent logs now write to the new directory, as verified after the agent service restart and shown here.

At this point, all that will be left in the previous folder will be the files that were written prior to the folder changes and the service restarts.

The Wrap

In this article I have introduced you to an easy method to move the logs for SQL Server and the SQL Server Agent to a custom directory that better suits your enterprise needs. This concept is a basic building block for some upcoming articles – stay tuned!

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Default Sessions – Back to Basics

Comments: No Comments
Published on: September 1, 2016

Remember When…

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Default Sessions

SQL Server is full of some pretty cool stuff. Some of it is rather basic. Some of it is a bit complex. Whether basic or complex, there is a lot that seems to get overlooked because it just might be hidden from plain sight or maybe it is overlooked just because it is already done for you.

Today, I am going to talk about some of the stuff that is already done for you. It is pre-baked and ready to roll. Since it is pre-baked, it may be a good idea to get familiar with it and understand what kinds of data it may offer to you when troubleshooting.

It is still somewhat surprising to me to discover how many people have never heard of the default trace in SQL Server. Slightly less surprising is how many people have never heard of the default Extended Events system_health session. What if I told you there were a bunch more default sessions than the default trace or system_health? I wrote about some of those default sessions previously.

With the evolution of SQL Server, there is an evolution in Extended Events. New features in SQL Server also means there may be some additional default sessions. I want to touch on some of these other default sessions in brief.


Phone Home

First, let’s start with an easy-to-find default session – telemetry_xevents. This session is visible and behaves like a user-defined session, with the caveat that it is installed with SQL Server 2016 and is a system default session. I have another post queued up to discuss this session in a bit more detail, so I will keep it brief here.

This session is there to basically capture specific data and phone it home to Microsoft. You can presume that based on the definition of the word telemetry:

“is an automated communications process by which measurements and other data are collected at remote or inaccessible points and transmitted to receiving equipment for monitoring.”

Before you get your tin-foil hat in a crumple over sending data back to the mother-ship, understand that this data is not sensitive data. The type of data being gathered is usage metrics. Think of it in terms of gathering data about how many times a specific feature was used or how many times a group of errors occurred within the environment. This is in an effort to help improve the product.

While the session data is designed to be sent back to Microsoft, it could also be of use to the Enterprising DBA who is looking to get a better grasp of the environment or to troubleshoot various issues.
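Since the session is visible, its definition can be inspected through the standard catalog views. Here is a sketch of such a metadata query (assuming a SQL Server 2016 instance where the session exists):

```sql
-- List the events collected by the telemetry_xevents default session
SELECT s.name AS session_name,
       e.name AS event_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
WHERE s.name = 'telemetry_xevents'
ORDER BY e.name;
```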

It’s a Stretch

With the release of SQL Server 2016 there is another default session that gets installed. This session is probably the most difficult to find, as it is a private, hidden session. In addition, it is not present by default with SQL 2016 RTM, though it was present by default in various CTP versions. The session is specific to the new Stretch feature and is called rdaxesession.

Breaking down the name reveals pretty easily which feature the session is designed to support. The name rdaxesession breaks down to the following: rda = remote data archive, and the rest is self-explanatory. If I investigate the metadata for the session, I discover that the following events are traced in the session:

stretch_database_enable_completed
stretch_database_disable_completed
stretch_table_codegen_completed
stretch_table_remote_creation_completed
stretch_table_row_migration_results_event
stretch_table_validation_error
stretch_table_unprovision_completed
stretch_index_reconciliation_codegen_completed
stretch_remote_index_execution_completed
stretch_database_events_submitted
stretch_table_query_error

Well, that certainly clears it up! This session is all about the stretch feature. If you are on RTM and don’t use the stretch feature, then you won’t see this session even if you do query the metadata.
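Where the session does exist, its event list can be confirmed with a metadata query along these lines (a sketch, using the session name rdaxesession as discovered above):

```sql
-- Enumerate the events in the stretch-related rdaxesession session
SELECT e.name AS event_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
WHERE s.name = 'rdaxesession';
```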

Don’t be a HADR

The last of the default sessions I want to cover both can be combined into a category about HA and DR – loosely at least. One of the sessions deals with backups and the second of the sessions deals with clusters.

When dealing with a cluster, it is important to know about the hidden trace file that records events related to the cluster. The session target data can be found in “%PROGRAMFILES%\Microsoft SQL Server\MSSQLxx.MSSQLSERVER\MSSQL\Log” by default and cannot be seen within SQL Server metadata at all (at least not that I have been able to find). For a few details about how this trace works, I recommend reading this article. This session can be called the SQLDiag session.
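Although the session itself is hidden from the metadata, the XEL files it produces can still be read with fn_xe_file_target_read_file. A sketch follows; the path and file name pattern here are assumptions to adjust for your instance:

```sql
-- Read the SQLDiag session files from the default log directory
-- (path and file pattern are assumptions; adjust for your instance)
SELECT object_name,
       CAST(event_data AS xml) AS event_data
FROM sys.fn_xe_file_target_read_file(
    'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\*_SQLDIAG_*.xel',
    NULL, NULL, NULL);
```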

The session related to backups is actually broken down into two log files and is similar in nature to the SQLDiag session: it is hidden, found in the same directory, read in the same fashion, and equally invisible from within SQL Server. The backup session deals specifically with managed backup.

What is managed backup? Well, it is a type of backup performed to Azure and which relies upon the “smart_admin” procedures, functions and process.

If you don’t perform managed backups or you don’t have a cluster configured, you will not need either of these sessions (and potentially not see the smart_admin files depending on SQL Server version). If you have either of these features, you will definitely want to ensure you become acquainted with them and how to retrieve the data.

Recap

There are plenty of mechanisms in place to help troubleshoot some of the more complex features of SQL Server. Getting to know the tools will prepare you for that moment when problems arise and you are under the gun.

I did not dive deep into any of these sessions, holding that for later articles. I wanted to keep this article to an introductory level. If you are interested in the more detailed articles, please stay tuned for those upcoming articles.

If you are interested in reading more about Extended Events, I recommend reading my series which can be found here.

Database Offline – Redux

I have written a couple of articles showing how to audit database offline events via the default trace as well as via Extended Events. These are great for a more proactive approach. Sometimes the proactive approach is just not plausible for the given circumstances. A good example of this is when you inherit servers (e.g. new client, new project, new job) and there are databases on the server that were taken offline prior to your inheritance.

In a situation like this, you need to find out when the databases were taken offline, but you're somewhat limited in the data available for your research purposes. The default trace has rolled beyond the retention point for the events that were trapped. Setting up an Extended Events session would be less than helpful given the databases are already offline. So we must find an alternative means of determining approximately when the databases were taken offline.

Alternatives

If your predecessor leaves you with a database that has been taken offline and no documentation about it, there is an alternative for finding the approximate time it was taken offline – using T-SQL. Granted, you could explore the file system and make the same sort of guess based on the file modified date. Data people, however, generally prefer some sort of script, and a script that is native to their language (T-SQL).

So, let’s take a look at a viable option for figuring out the mystery behind your database that has been set offline with no good info surrounding it.
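Here is a sketch of the kind of query involved. The exact query used on the test server may differ; this version assumes a full backup exists for the offline database, since differential_base_time is only populated after one:

```sql
-- Approximate when an offline database was taken offline by using the
-- differential_base_time (UTC) of its data files as a proxy.
SELECT d.name,
       d.state_desc,
       mf.differential_base_time,
       DATEDIFF(DAY, mf.differential_base_time, GETUTCDATE()) AS days_since_diff_base
FROM sys.databases d
INNER JOIN sys.master_files mf
    ON d.database_id = mf.database_id
WHERE d.state_desc = 'OFFLINE'
    AND mf.type_desc = 'ROWS';
```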

Running this query, I see the following on my test server:

Note that this query uses GETUTCDATE. The differential_base_time column in sys.master_files is in UTC time. Thus, in order to compare properly, we need to ensure we use a UTC datestamp for comparison. Also, of note, this query doesn’t work if there is no full database backup for the database in question. Who doesn’t backup their databases though, right? No seriously, you really should.

Since this is my test server and I am well prepared, I just so happen to have an XEvent session running that tracks the offline events that I can use to compare the results.

Yes, there is a bit of variance in the actual database offline event and the differential_base_time column. This variance is due to timing of the backup and the actual offline event. Again, this is about finding an approximate time of when the database was taken offline as a viable alternative when other data is not really available.

 

Conclusion

We all aspire to having a perfect database environment where nothing surprising or unexpected happens. Unfortunately, that is the desire of dreams and fairy tales. The unexpected will happen. A database can be offline at the time that you take over responsibility of the database. You will be asked when it was taken offline. This script will help you get a reasonable deduction for that offline event in the absence of better data.

Extended Events is a powerful tool to help in troubleshooting and tuning your environment. I recommend investing a little time in reading the 60 day series about Extended Events. This is not a short series but is designed to provide an array of topics to help learn the tool over time. Don’t forget to go back and read the companion article showing how to audit these events via the default trace.

SQLs Black Box Recorder – sp_server_diagnostics

Yesterday I shared the first article in a three-part series about the SQL Black Box Recorder, which you can read here, with a promise of at least two more posts. The first article in the series laid the groundwork about the black box recorder in SQL Server. Furthermore, there was a brief introduction into the three components of the black box recorder. The three components as laid out in that article are:

  • Default Trace
  • system_health Extended Event Session
  • sp_server_diagnostics procedure

The first article went into deeper detail about the first leg of this three-legged black box recorder – the default trace – and the second article covered the system_health Extended Event Session. If by some measure you are in the dark about what Extended Events is, then I recommend you read my personal blog series on the subject. There are numerous articles stepping through the subject in easy-to-digest fashion. You can find many of the articles in the following table of contents – here.

The focus of this article will take on the topic of the third piece of the black box recorder, sp_server_diagnostics. I will show how this procedure is critical in gathering pertinent information that is crucial to gathering an overall picture of the system—from a default black box recorder perspective.

SQL Black Box Recorder

What is the sp_server_diagnostics process?

Beyond being a component of the black box for SQL Server, what exactly is this diagnostics process? The sp_server_diagnostics is much as the name implies—it is a “diagnostics” service that attempts to gather information about various events that may affect the overall health of the SQL Server instance.

The diagnostics process will trap various server related health (diagnostics) information related to the SQL Server instance in an effort to try and detect potential failures and errors. This diagnostics session/process traps information for five different categories by default. There is a sixth category of information for those special servers that happen to be running an Availability Group.

The Six

  • system: overall health of the server including information about spinlocks, severe processing conditions, non-yielding tasks, page faults, and CPU usage.
  • resource: health of resources, specifically including physical and virtual memory, buffer pools, pages, cache and other memory objects
  • query_processing: captures query processing data about worker threads, tasks, wait types, CPU intensive sessions, and blocking tasks.
  • io_subsystem: health data related to IO.
  • events: errors and events of interest recorded by the server, including details about ring buffer exceptions, ring buffer events about memory broker, out of memory, scheduler monitor, buffer pool, spinlocks, security, and connectivity.
  • AvailabilityGroup: basic information about the AvailabilityGroup if there is an AvailabilityGroup enabled on the server.

This is fantastic introductory information. As luck would have it, you can also read more about it from my blog – here.

How to use this process?

What good is any level of information if we don’t quite know how to use or consume that data? Do we consume it via trace? Do we execute the stored procedure? Do we capture the information from some other means? The answer is yes! Wait, yes to what? Yes to all of it! Let’s explore how to consume this data.

Option 1

The first step to learning how to consume this data is to start at probably the most basic of levels. As you have already noted, this is a stored procedure. It was probably the sp_ in the name that gave it away, right? Since it comes in the form of a stored procedure, it makes it pretty easy to figure out the first means of getting the data from it: just execute it. Actually, let me clarify that just a bit. sp_server_diagnostics is a CLR stored procedure. Nonetheless, all that needs to be done at this level is to execute it to get to the data.
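At its simplest, that looks like this:

```sql
-- Run once and return a point-in-time health snapshot
EXEC sp_server_diagnostics;
```

The procedure also accepts a @repeat_interval parameter (in seconds); when supplied, it runs continuously and emits a new result set at each interval.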

Look at that! Pretty slick! I execute the stored procedure and can see a point-in-time result of the “health check” from this component of the black box recorder. Notice here that I only have five of the six categories that I had mentioned. This is due to the lack of a configured availability group (AlwaysOn) on this particular instance. Another point of interest is that the data column is not in a very friendly format for DBA consumption: it is flat text that just goes on and on, and it is painful to consume in this manner.

Tidy that Code

Looking at the code, you can see quickly the difference. I have changed the data column to be XML-based to better match the data being output from the stored procedure. Let’s see that in action:
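A sketch of that tidied approach, capturing the output into a table with an XML-typed data column:

```sql
-- Capture the procedure output with the data column typed as XML
DECLARE @diag TABLE (
    create_time    datetime,
    component_type sysname,
    component_name sysname,
    [state]        int,
    state_desc     sysname,
    data           xml
);

INSERT INTO @diag
EXEC sp_server_diagnostics;

SELECT component_name, state_desc, data
FROM @diag;
```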

That is much better. Now I can simply click one of the data results and see a nicely formatted easy to read XML output of that component. Here is an example from the query_processing component:

Option 2

Cool! I have just shown one means of exploring this facet of the black box recorder. Now, let’s take a look at other means to get at this particular data, beyond just executing the stored procedure. Did you know that this procedure is very closely related to Extended Events? (I am sure that you picked up on that nugget due to reading the previous article in the series or possibly the other article I mentioned here). With such a close relationship to Extended Events, it makes sense that sp_server_diagnostics also has a couple of events and is consumed by multiple sessions.

SQL Server comes with various default Extended Event sessions, of those sp_server_diagnostics is consumed by the system_health session as well as the hidden session for SQLDiag that is used by Windows Clustering and Availability Groups to determine server health. You already heard about the system_health (from the previous article), but few have heard about the hidden session. I will take a look at the events as well as the sessions that consume the sp_server_diagnostics data.

Events

First, I will share the details for each of the events related to the sp_server_diagnostics process.
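A sketch of a metadata query to find those events:

```sql
-- Find the Extended Events related to sp_server_diagnostics
SELECT o.name AS event_name,
       o.description
FROM sys.dm_xe_objects o
WHERE o.object_type = 'event'
    AND o.name LIKE 'sp_server_diagnostics%';
```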

As I have shown here, you can see that these events are a part of the debug channel. In the next couple of images, I want to show some of the payload attached to these events.

These events show what I have shown previously with regard to the output of the stored procedure. There is a consistency in the output of the diagnostics process. What I have yet to show in this data is one more piece of information that could be helpful down the road, and it pertains specifically to the Extended Events data. That piece of information is the keyword associated with each of these events. When looking for related events, I need you to understand that the sp_server_diagnostics events are a part of the scheduling classification. There, I have said it. I just gave you a big clue to help you in your troubleshooting endeavors and the use of the diagnostics processes/events.

Which Sessions then?

Now that I have shown which events are related to diagnostics, I can finally show which sessions are dependent upon the diagnostics data. To show that kind of information, I have this next query to help:
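A sketch of that query, checking the session definitions in the catalog views:

```sql
-- Which defined event sessions consume the sp_server_diagnostics events?
SELECT s.name AS session_name,
       e.name AS event_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
WHERE e.name LIKE 'sp_server_diagnostics%';
```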

When I execute the statements in the preceding query, I will receive a single result—system_health. That said, do not believe everything I just showed you with that query. SQL Server does a decent job of hiding some of the details. When I run the next query, I will see a little bit more information.
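This time the sketch checks the running sessions through the DMVs rather than the catalog views:

```sql
-- Which running sessions are consuming the sp_server_diagnostics events?
SELECT s.name AS session_name
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_events e
    ON s.address = e.event_session_address
WHERE e.event_name LIKE 'sp_server_diagnostics%'
GROUP BY s.name;
```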

When I run the preceding query, I will see the following two sessions (given that I have not changed the default settings):

Take note of the names of those two sessions. I showed that the system_health session is reliant upon the diagnostics events. Now, I am showing that there is another event session called “sp_server_diagnostics session”. Wait, there is more! SQL Server continues to hide stuff from us.

Hidden Sessions

When you have AlwaysOn configured, there is an additional hidden session that is not visible within the sessions lists inside of SQL Server. To view this hidden session, you must follow the directions shown in this article on msdn. Once you have navigated to the folder containing the hidden XEL files, you can open them in Management Studio and view the data. It is while viewing the data that you will be able to see the relationship between sp_server_diagnostics and this newly discovered trace file.

The payload from this image demonstrates the relationship between sp_server_diagnostics and the hidden trace file. I challenge you to take some time to explore this hidden trace file on one of your servers where AlwaysOn is configured. I know you will be thankful for the investment at some point down the road.

Conclusion

I have just shown a ton of info about the third component of the black box recorder—sp_server_diagnostics. This procedure may be lesser known than the first two components, but it is highly integral into the overall monitoring and health of your SQL Server instance. In this article, I have not only shown how to use sp_server_diagnostics (albeit quickly), but I have also shown that the diagnostics are heavily relied upon by other components of the black box recorder. In addition, I have shown how this component is integrated into other hidden traces that are essential in the monitoring and troubleshooting of your instance.

The black box recorder is real. You need to invest time in getting to know the black box recorder components and how they can be of help to you in becoming a superhero DBA.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the sixth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

SQL Servers Black Box Recorder – system_health

Yesterday I shared the first article in a three-part series about the SQL Black Box Recorder, which you can read here, with a promise of at least two more posts. The first article in the series laid the groundwork about the black box recorder in SQL Server. Furthermore, there was a brief introduction into the three components of the black box recorder. The three components as laid out in that article are:

  • Default Trace
  • system_health Extended Event Session
  • sp_server_diagnostics procedure

The first article went into deeper detail about the first leg of this three-legged black box recorder – the default trace. In this article, the focus will be directed to the next leg of the black box recorder – or the system_health Extended Event Session. If by some measure you are in the dark about what Extended Events is, then I recommend you read my personal blog series on the subject. There are numerous articles stepping through the subject in easy-to-digest fashion. You can find many of the articles in the following table of contents – here.

The focus of this article will not be to introduce the topic of Extended Events. Rather, it will be to introduce the system_health session and dive into it a fair bit.

SQL Black Box Recorder

What is the system_health session?

Beyond being a component of the black box for SQL Server, what exactly is this event session? The system_health is much as the name implies – it is a “trace” that attempts to gather information about various events that may affect the overall health of the SQL Server instance.

The event session will trap various events related to deadlocks, waits, clr, memory, schedulers, and reported errors. To get a better grasp of this, let’s take a look at the event session makeup based on the available metadata in the dmvs and catalog views.
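A sketch of that metadata query, joining the session events to their actions:

```sql
-- Events and actions defined for the system_health session
SELECT e.name AS event_name,
       a.name AS action_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
JOIN sys.server_event_session_actions a
    ON e.event_session_id = a.event_session_id
   AND e.event_id = a.event_id
WHERE s.name = 'system_health'
ORDER BY e.name, a.name;
```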

In addition to the types of events I mentioned, there are also a few more interesting things to discover from the results of this particular query. For instance, the actions that are being employed help to gather even more information such as the callstack, tsql_callstack, database id, and sql text of the query that was executing at the moment of the event that is trapped.

One very peculiar event that is trapped is the sp_server_diagnostics_component_result event. Remember from earlier in this article that I noted the third leg of the black box recorder is the very similarly named component – the “sp_server_diagnostics procedure”. When I cover this third leg, it will make a little more sense. Understand, for now, that your suspicion of it being a bit more integral to the black box (than merely 1/3 of the overall black box) is confirmed.

This is a good set of information that can be trapped. Furthermore, this is a good set of information which can give a fairly good diagnosis of several different potential problems within your SQL Server Instance! (Yes it may seem repetitive, but I wanted to underscore the importance of the data that can be trapped.) That said, compare the types of events to those of the default trace. Take note of how many events actually overlap between the two components of the black box recorder.

For ease of comparison, let’s simplify that previous query to get a set of events to read without the duplication due to the actions on each event.
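The simplified version might look like this (a sketch):

```sql
-- Distinct events in the system_health session, without the action rows
SELECT DISTINCT e.name AS event_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
WHERE s.name = 'system_health';
```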

And since I am feeling particularly generous, I have lumped the events from the two traces together in the following query:
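A sketch of that combined query, pulling the default trace events via fn_trace_geteventinfo (assuming the default trace is trace id 1):

```sql
-- Events from the system_health session and the default trace, side by side
SELECT 'system_health' AS source,
       e.name AS event_name
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events e
    ON s.event_session_id = e.event_session_id
WHERE s.name = 'system_health'
UNION
SELECT 'default trace',
       te.name
FROM fn_trace_geteventinfo(1) ti
JOIN sys.trace_events te
    ON ti.eventid = te.trace_event_id
ORDER BY source, event_name;
```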

That should help to compare the 17 events from the system_health session and the 34 from the default trace with a little more ease and speed. The events from the two sessions are rather complementary to each other, as they help to construct the SQL Server Black Box recorder.

How to use this session?

With the events trapped by this session, there are a few ways to use the trapped data that should stand out pretty easily. To me, the quick application of the data from this session can come from the review of deadlocks, reported errors, or even waits analysis. Using the deadlock as probably the most common problem people will want to troubleshoot, this information is suddenly much more valuable and easy to access (no more need for that trace flag). Since the system_health is now trapping the deadlock by default, one could quickly access the deadlock graph from the event data and proceed to troubleshooting. More information on getting that deadlock graph can be found here and here.
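As a quick illustration, a deadlock graph can be pulled from the system_health event file along these lines (a sketch; the bare file pattern assumes SQL Server 2016 or later, where it resolves against the instance log directory):

```sql
-- Extract deadlock graphs captured by the system_health session
SELECT CAST(event_data AS xml).query('(event/data/value/deadlock)[1]') AS deadlock_graph
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL)
WHERE object_name = 'xml_deadlock_report';
```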

Learn to use the data made available through this member of the black box recorder. Becoming familiar with this tool (and additionally with Extended Events) will make you a rock-star in the office. The information needed to troubleshoot most problems is right at your fingertips through the use of the different pieces of the black box recorder. If it is not available there, then more than likely it can be made available through a bit of tinkering with Extended Events. Get to know the tools and be that rock-star DBA.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the fifth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

T-SQL Tuesday Participation Over the Years

Comments: No Comments
Published on: December 19, 2018

Loads of TSQL Tuesdays

TSQL2sDay150x150

Years ago the TSQL Tuesday party was started by Adam Machanic (b|t). The premise of the monthly event is to give people a topic each month, both to ease some of the difficulty with writing (figuring out what to write) and to get more people involved in the community. The party started in December 2009 and is now embarking on its 10th year. The party is running strong and will likely continue for many years to come.

I have personally participated in quite a large number of these events. I am far from a perfect participation record like Rob Farley (b | t), but I do participate when I can. Every now and again I think about what I might have participated in or what I have written in the past for a TSQL Tuesday. This post will serve as a table of contents for all of those articles, to make them easier for me to find (as well as give me a chance to review and fix problems with really old posts). It also provides a quick synopsis of my article (if one exists), along with the topic and links to the original invite and roundup for each monthly party.

T-SQL Tuesday #001: Date/Time Tricks (Dec. 2009)

Invite and roundup

My Article: N/A

T-SQL Tuesday #002: A Puzzling Situation (January 2010)

Invite and roundup

My Article: TSQL Tuesday – But I was late

This was the first time I participated in TSQL Tuesday. I was very new to the blogging concept – and it shows. The puzzling problem I encountered in the article was a fun little job that kept giving different results than running the same code from SSMS. Check it out!

T-SQL Tuesday #003: RELATIONSHIPS (February 2010)

Invitation and summary

My Article: Relationships

I covered all sorts of different relationships that affect SQL Server from work, to marital, to table relationships in a database. It all ties together in requiring an individual to constantly tune the different types of relationships.

T-SQL Tuesday #004: IO (March 2010)

Invitation and roundup.

My Article: IO IO IO

What if we could create a numbers table without IO?  What if we could perform several different kinds of queries without IO?  Itzik Ben Gan proposed a cascading CTE solution that does this sort of thing. This article shows that method in use to help improve certain IO conditions.

T-SQL Tuesday #005: Reporting (April 2010)

Invitation and roundup.

My Article: IP and Default Trace…T-SQL Tuesday #005

Having been introduced to a requirement to report on IP addresses of connections, I dive into a solution that will help show hostname, IP address and a few more tidbits.

When it is necessary to provide reports on activity occurring on the server, it pays to do a little prep work.  Be Prepared.  It is not an easy task to be able to go back in time and report on data that isn’t captured.  The little prep work that one may need to do is well worth the effort in the end.

My 2nd Article: BLOB Report T-SQL Tuesday #005 – Reporting

Learning how to decrypt the individual object sizes broken down into type and filegroup. Then report on the data that was retrieved from the DMOs.

T-SQL Tuesday #006: “What About BLOB?” (May 2010)

Invitation and roundup.

My Article: T-SQL Tuesday #006: A Blobbing We Will Go

In the DMVs for SQL 2005 and SQL 2008, there is more than one place you can find information about the size of your LOB data. And it goes without saying that there is more than one way to find information about LOBs in your database.

T-SQL Tuesday #007 – Summertime in the SQL (June 2010)

Invitation and roundup.

My Article: Data Compression

I see database compression, as offered with SQL 2008, to be more like these file compression utilities than DriveSpace.  Data compression in SQL 2008 is not an all or none implementation.  You get to pick and choose what gets compressed.  That is a big time bonus for me.

T-SQL Tuesday #008: Gettin’ Schooled

Invitation and roundup.

My Article: Gettin’ Skewled

I am learning that learning is not just formalized education in a classroom or in specific settings.  There are things to be learned from all aspects of life, if only a little observation is used.

T-SQL Tuesday #009: Beach Time (August 2010)

Invitation and roundup.

I hosted this particular event.

My Article: R & R

I find it completely useless to go on vacation if I am going to be checking email or project statuses every 10 minutes.  There is no rest or relaxation in doing those things while I am supposed to be doing something else.  Vacation should be fun and enjoyable.  Thus, if I am to enjoy vacation, I need to do a few extra things in the office prior to leaving.

T-SQL Tuesday #010 – Indexes (September 2010)

Invitation and roundup.

My Article: TSQL Tuesday Indexes and Blobs

How does one find which columns in the database are LOB columns?  I knew I had some past blog posts about various aspects of LOBs, but I had never broken it down to find the columns in the LOB.  Even better, I wanted to know which BLOB columns were in which indexes.

T-SQL Tuesday #011 – Misconceptions in SQL Server (October 2010)

Invitation and roundup.

My Article: A Haunting TSQL Tuesday Tale

I chose the myth that truncate is unrecoverable. Check it out and see how a truncate is most certainly capable of a rollback or being recoverable.

T-SQL Tuesday #012 – Why are DBA skills necessary? (November 2010)

Invitation and roundup.

My Article: T-SQL Tuesday #012 – Skills

As a DBA, we occasionally have the opportunity of attracting a new client or a new job or a new database.  It seems that more often than not, some of the skills requisite (at least they should be) of owning a database are missing. (Nunchuku skills could come in handy from time to time too!)

T-SQL Tuesday #13 – What the Business Says Is Not What the Business Wants (December 2010)

Invitation and roundup.

My Article: T-SQL Tuesday #13 – Business Requirements

I think a common area that is easily overlooked when it comes to requirements and interpretation of requirements is report creation.  A common problem is that there are no defined or written requirements for the creation of a report.

T-SQL Tuesday #014 – RESOLUTIONS (January 2011)

Invitation and roundup.

My Article: TSQL Tuesday 14: Committed

This month was all about resolutions and goals. My list of goals was pretty good and entirely profession-based.

T-SQL Tuesday #015 – Automation in SQL Server (February 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #15 DBA Automaton

I shared a script that will load a sample of data from every table in every database for every column and give you the length of the pertinent columns (I have excluded obvious columns such as numeric types and certain LOB types).

T-SQL Tuesday #016 – Aggregate Functions (March 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #016: Aggregates and Statistics

Super geeky article on stats, quartiles and the like. For my data analysis and trending, I wanted to find a simple distribution across quartiles.

T-SQL Tuesday #017 – APPLY Knowledge (April 2011)

Invitation and roundup – unavailable on origin site.

My Article: T-SQL Tuesday #17 – APPLY Knowledge

In this article I take an unexpected twist in my application of the APPLY operator. In that twist, I show a quick and simple script to grab role members.

T-SQL Tuesday #018 – CTEs (May 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #18 – CTEs

I shared a rather complex series of CTEs that are used in the same way a hierarchy and ancestry tree would be generated – except for Foreign Key relationships. That script was a load of fun.

T-SQL Tuesday #019 – Disasters and Recovery (June 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #19 – Disasters & Recovery

Who needs a single disaster when you can enjoy multiple disasters in a single sitting? I share three different types of disasters. I am sure many have had experiences with all three and probably many more types of disasters.

T-SQL Tuesday #020 – T-SQL Best Practices (July 2011)

Invitation and roundup.

My Article: N/A

I missed the announcement or something like that.

T-SQL Tuesday #021 – A Day Late and Totally Full Of It. (Aug. 2011)

Invitation and roundup.

My Article: TSQL Tuesday 21 – FAIL FAIL FAIL

I don’t hide much as I share my feelings about crap code and then show my own crap code followed by a little bit of explanation on an improved version of the same code.

T-SQL Tuesday #022 – Data Presentation (September 2011)

Invitation and roundup

My Article: T-SQL Tuesday #22 – Data Presentation

I chose to touch on several aspects of data presentation – Performance, Accuracy, Display, and Business Requirements.

T-SQL Tuesday #023 – Joins (October 2011)

Invitation and roundup.

My Article: N/A

D’oh – missed another one in 2011.

T-SQL Tuesday #024 – Prox ‘n’ Funx (November 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #024: Prox ‘n’ Funx

Big takeaway from this month was the value of a very good string splitter.

T-SQL Tuesday #025 – Invitation to Share Your Tricks (Dec. 2011)

Invitation and roundup.

My Article: T-SQL Tuesday #025 – Holiday Gifts

In the spirit of the Holidays, I wanted to share some tricks and tips.  They can be my gifts to you during the holidays.  And maybe they can help you give to somebody else.

T-SQL Tuesday #026 – Second Chances (January 2012)

Invitation and roundup. (Original site toasted).

My Article: TSQL Tuesday #26 or #23 – Identity Crisis

Having missed T-SQL Tuesday #23, and with the topic being about second chances, I chose to write about JOINs, sort of fulfilling the requirements for both T-SQL Tuesdays in one shot.

T-SQL Tuesday #027 – Invitation to The Big Data Valentine’s Edition (February 2012)

Invitation and roundup.

My Article: N/A

After that second chance I blew it the very next month. Luckily I have a good reason – Valentine’s Day!

T-SQL Tuesday #028 – Jack of All Trades or Master of None (March 2012)

Invitation and no roundup.

My Article: T-SQL Tuesday #028 – Jack of All Trades, Master of None?

This one brings up bad memories of when, as the DBA, I was also required to be the janitor.

 

More updates coming soon!

What is T-SQL Tuesday?

T-SQL Tuesday is a monthly blog party hosted by a different blogger each month. This blog party was started by Adam Machanic (blog|twitter). You can take part by posting your own participating post that fits the topic of the month and follows the requirements below. Additionally, if you are interested in hosting a future T-SQL Tuesday, contact Steve Jones via the tsqltuesday website – here.

How to Participate

  • Your post must be published between 00:00 GMT Tuesday and 00:00 GMT Wednesday.
  • Your post must contain the T-SQL Tuesday logo from above and the image should link back to this blog post.
  • Trackbacks should work. But, please do add a link to your post in the comments section below so everyone can see your work.
  • Tweet about your post using the hash tag #TSQL2sDay.