Disk Space in SQL Server

One of the frequently required job functions of the database administrator is to track disk space consumption. Whether this requirement comes from management or from a learning opportunity after a production outage, the need exists.

As a hard-working DBA, you want to make sure you hit all of the notes to make management sing your praises. Knowing just when the database may fill the drives, so that you can prevent a production outage, just happens to be one of those sharp notes that could result in a raise and management singing hallelujah. The problem is, how do you do it from within SQL Server? You are just a DBA after all and the disk system is not your domain, right?

Trying to figure it out, you come across a pretty cool function within SQL Server. The name of the function is sys.dm_os_volume_stats. Bonus! This is an excellent discovery, right? Let’s see just how it might work. First a sample query:
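A minimal sketch of such a query, assuming the function is cross applied to sys.master_files (the column aliases here are illustrative and not necessarily the ones used in the screenshots), could look like this:

```sql
-- Sketch only: one row per database file, each carrying the stats of the
-- volume that hosts the file. DISTINCT collapses identical rows.
-- The aliases (DriveLetter, TotalSpaceMB, FreeSpaceMB) are assumptions.
SELECT DISTINCT
       vs.volume_mount_point        AS DriveLetter,
       vs.logical_volume_name       AS VolumeName,
       vs.total_bytes / 1048576     AS TotalSpaceMB,
       vs.available_bytes / 1048576 AS FreeSpaceMB
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
ORDER BY DriveLetter;
```

Because the function is evaluated once per database file, any difference in the values returned between those evaluations survives the DISTINCT and shows up as an extra row for the same drive.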

If I run that on my local system, I might end up with something that looks like the following:

Looking at the image you may be wondering to yourself right now why I have highlighted a couple of things. You may also be wondering why I used the word “might” in the previous paragraph as well. The reasoning will become more evident as we progress. For now, you have resolved to continue testing the script, so you execute it again and end up with something that may look like the following (for the same server):

Whoa! What just happened there? Why are there two listings for the C: drive? Why does each register a different value for the FreeSpace column? In addition, without any additional usage on the C: drive (as verified through other tools), the FreeSpace is changing between executions as well as within the same execution. This is problematic, so you continue testing:

And yet again!

This can’t be correct, can it? Just for giggles let’s modify it just a bit to see if there are any additional clues. Using the following changed script, hopefully a clue will help shed some light on this:

This script yields the following potential results:

Look at the different highlighted areas! There are three different values for FreeSpace for the C: drive in this particular execution. The case of the C: drive letter plays no role in whether the value is recorded differently or not. This seems to be more of a bug within the dynamic management function. From execution to execution, using this particular method, one could end up with duplicate entries but distinct values. The sort order of the results could also differ between executions (though we could fix that).

All of these tests were run on my local machine and I really do only have one C: drive. I should never receive multiple entries back for any drive. If using this particular DMF to track space usage, it could be somewhat problematic if the duplicate drive data pops up. How do we get around it, you ask? Here is another example that I have that has not yet produced this duplication:

Using this version of the script is not terribly more complex, but it will prove to be more reliable. You can see I used some CTEs to provide a little trickery and ensure that I limit my results. What if it is a mount point or a non-standard drive letter? I have not tested that. Let me know how that goes. As you can see, I am restricting the drive selection by using the row_number function against the drive letter.
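The ROW_NUMBER idea can be sketched roughly as follows. This is an illustrative version under my own assumptions (the CTE names, the aliases, and the choice to keep the row with the lowest free space per drive are all made up for the example), not necessarily the exact script described above.

```sql
-- Sketch: keep exactly one row per drive, even if the DMF reports the
-- same volume several times with different available_bytes values.
WITH VolumeStats AS (
    SELECT UPPER(vs.volume_mount_point) AS DriveLetter,  -- normalize case
           vs.total_bytes,
           vs.available_bytes
    FROM sys.master_files AS mf
    CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
),
RankedVolumes AS (
    SELECT DriveLetter,
           total_bytes,
           available_bytes,
           -- Number the rows within each drive; lowest free space first
           -- (a conservative, assumed tie-breaker).
           ROW_NUMBER() OVER (PARTITION BY DriveLetter
                              ORDER BY available_bytes ASC) AS RowNum
    FROM VolumeStats
)
SELECT DriveLetter,
       total_bytes / 1048576     AS TotalSpaceMB,
       available_bytes / 1048576 AS FreeSpaceMB
FROM RankedVolumes
WHERE RowNum = 1
ORDER BY DriveLetter;
```

Partitioning by the drive letter is what guarantees a single row per drive; which of the competing values you keep is a judgment call.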

For alternative reliable methods to find your disk space consumption, I would recommend something different because it is tried and tested: a WMI call to fetch the data. Samples are provided as follows:

Easy peasy, right? Have at it and try tracking your disk space.

Thanks for reading! This has been another article in the Back to Basics series.

Passion, Challenges, and SQL

Published on: February 12, 2019

TSQL Tuesday

The second Tuesday of the month comes to us a little early this month. That means it is time again for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

The Why?

Long time friend Andy Leonard (b | t) invites us this month to do a little checkup on ourselves and talk about the “why” around what we do. This could be a very easy topic for some. Equally, this could be a very difficult topic for those same people at different times in their lives. Thus the problem: the topic is simple in nature, but it requires firm reflection on oneself and on what one has been doing.

The problem for me is less about the “why” behind what I do, and more about how to stretch it out into something more than a few sentences. Think! Think! Think!


One of the biggest reasons I do what I do boils down to the challenges that I frequently get to encounter. There is a wild satisfaction in working on a very difficult and challenging task, product, tool, profession, skill, etc. This satisfaction often involves reward and a sense of accomplishment.

The challenge can be anything from how to effectively communicate with a difficult person, to a tough-to-find internals problem in SQL Server that could be causing a performance issue, to taking over a project and turning it back from the edge of failure and onto a track of success. Sometimes, the challenge may be as simple as converting a pathetic cursor into a set-based approach and gaining an improvement of 100x in performance.

I really do enjoy some of the puzzles (challenges) that I get to work on routinely. This gives me an opportunity to improve my skillset as well as continue to learn. Being able to continually improve is a great motivation for me. The frequent challenges and continual opportunity to learn present a great opportunity to evolve oneself and one's career. In a constantly changing world, being able to naturally and easily evolve your personal career is a bonus!


“Do what you love and you will never work a day in your life.” This is a common saying in the United States. Agree or disagree – there is some truth to it. Being able to do something one loves makes the really hard days a lot easier. Knowing I may be able to solve a complex problem makes it easier to face the day.

I really enjoy the opportunity to face difficult challenges and resolve those challenges. The passion to solve these puzzles with data doesn’t end there. I also really do enjoy the opportunity to learn which brings up two other challenges that help me learn: speaking and writing.

By putting myself out there regularly to speak and write, I am becoming a better technical person. I am becoming better equipped to solve many of the puzzles I face. Those are great benefits. That said, I don’t feel I could get out there and talk about something about which I wasn’t passionate. I have learned to become passionate about writing and speaking – though I still have plenty of room for improvement (just as I do in my quest to become a good DBA).

Wrapping it Up

I really do enjoy the challenges I get to face on a frequent basis in the world of data. This is the big “WHY” for me to continue my progress in this career.

Find something you are passionate about and strive to envelop your career with as many opportunities to do that thing as possible. If that means accepting some less wanted tasks in order to do more of the thing you love, it could very well be worth it!

Interview Trick Questions

Today, I am diverging from the more technical posts that I routinely share. Instead, as the title suggests, I want to dive into something a little more fun.

Anybody that has interviewed for a job has most likely run into the trick question. Some interviewers like to throw out multiple trick questions all in an effort to trip up the candidate and get the candidate to doubt himself or herself. Sure, there can be some benefit to throwing out a trick question or four. One such benefit would be to see how the candidate performs under pressure (see them squirm).

The downside to throwing out trick questions, in my opinion, would be that you can turn a serious candidate into an uninterested candidate. So, when throwing out the tricks, tread carefully.

Let’s take a look at an interview trick question candidate. This is a more technical question and is designed to make you think a little bit. Before reading on to see the answer, I implore you to try to answer the question for yourself legitimately.

How can you insert data into two tables using a single statement without the use of triggers, service broker or some other behind-the-scenes feature?

Are you thinking about it?

Do you have your answer yet?

Now that you have your answer, go ahead and continue reading.

Is your answer to this question something along the lines of “You can’t do that and this is just a trick question”?

Well, honestly, it is a bit of a trick question. But I assure you, you can certainly perform an insert into multiple tables from a single statement. Here is one such setup that demonstrates how you can do this:

Do you see how I was able to perform that insert into multiple tables? The trick is in using the OUTPUT clause. This little feature in SQL Server can be of great use for things such as building multiple staging tables during an ETL process.
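A minimal sketch of the technique, using two hypothetical temp tables (#Orders and #OrderAudit are names invented for this illustration and are not the original demo objects), could look like this:

```sql
-- Hypothetical demo tables; names and columns are assumptions.
CREATE TABLE #Orders (
    OrderID      INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName VARCHAR(50) NOT NULL
);

CREATE TABLE #OrderAudit (
    OrderID      INT         NOT NULL,
    CustomerName VARCHAR(50) NOT NULL,
    InsertedAt   DATETIME2   NOT NULL DEFAULT SYSDATETIME()
);

-- One statement, two tables: the rows go into #Orders, and the OUTPUT
-- clause routes a copy of each inserted row into #OrderAudit.
INSERT INTO #Orders (CustomerName)
OUTPUT inserted.OrderID, inserted.CustomerName
INTO #OrderAudit (OrderID, CustomerName)
VALUES ('Alice'), ('Bob');

-- Both tables now hold the two rows.
SELECT OrderID, CustomerName FROM #Orders;
SELECT OrderID, CustomerName, InsertedAt FROM #OrderAudit;

DROP TABLE #OrderAudit;
DROP TABLE #Orders;
```

Keep in mind that the target of OUTPUT ... INTO has restrictions of its own (for example, it cannot have enabled triggers or participate in a foreign key), so a plain staging or audit table is the natural fit.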

Here is that little trick again just to highlight it.


There are cases when an interview trick question is suitable. It is when the purported question is truly more technical than trick and is really trying to evaluate your depth and knowledge of SQL Server. The puzzle during the interview boils down to figuring out when it is a trick and when it might not be. Then from there, work your way through possible solutions. But don’t be afraid to admit when you haven’t got a clue. That will be far more impressive than to try and flim-flam the interviewer.

I invite you to share your trick questions in the comments. Also, how did you solve this particular trick question?


Thanks for reading! This has been another article in the Back to Basics series.

Defaults In msdb Database

Today is a day to discuss defaults. It started with the day being TSQL Tuesday and having a topic of “Say No to Defaults.” You can read more about that from the invite – here. I already participated in the party but did also want to discuss defaults a little bit more. That said, this article is not participating in the blog party. That would seem a bit silly.

While this post is not a part of the party, the defaults to be discussed are fairly important. I have seen severe consequences due to these defaults being ignored and not changed. So today, in addition to my earlier article (you can read it here), I implore you to make some fundamental changes to your production servers with regard to various defaults.

A Trio of msdb Defaults

There aren’t really that many defaults within the msdb database that must be changed, are there? I mean, seriously, beyond the defaults that are available to every database, what could possibly be unique to this database that could have a severe consequence?

I am so glad you asked!

The defaults in the msdb database are more about what is missing than what is actually there. By default, this database is missing quite a few things that could be deemed critical to your environment.

Let’s start with an easy one – Indexes

There are a few out there that may disagree, but the proof really is in the effect on performance for backup jobs and such. I have three indexes I like to put on every instance. I have seen the implementation of these indexes aid in improved job times as well as aid in reduced time to “clean” up the database.

Easy enough. These indexes are very straightforward and pretty small in the grand scheme of things. But if the index can help improve performance by a factor of 10, then I am in favor of them (and I have seen that performance gain).
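As an illustration of the kind of supporting index being described, here is a minimal sketch of one index on the backup history table. The index name and column choice are assumptions made for this example; they are not necessarily one of the three indexes referred to above, so test against your own workload before adopting anything like it.

```sql
USE msdb;
GO

-- Assumed example: backup history cleanup filters on backup_finish_date,
-- so an index on that column can help the cleanup find old rows quickly.
IF NOT EXISTS (SELECT 1
               FROM sys.indexes
               WHERE name = N'IX_backupset_finish_date_sketch'
                 AND object_id = OBJECT_ID(N'dbo.backupset'))
BEGIN
    CREATE NONCLUSTERED INDEX IX_backupset_finish_date_sketch
        ON dbo.backupset (backup_finish_date)
        INCLUDE (media_set_id);
END;
GO
```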

Now that we have some supporting indexes to help a bit with performance, we should take a look at the next item. This one can help with job performance as well as help with keeping the msdb database nice and trim.

Data Pruning

I have walked into client instances that had backup history dating all the way back to 2005 and included two to three full backups a day per database with quarter-hourly log backups. Oh, and this was for an instance containing well north of 200 databases. Can you say sluggish backups and sluggish msdb overall?

The fix is very easy! Not only do I recommend pruning the backup history, but also the job history, mail history and maintenance plan history (eew – if you use those things). Think about it – do you really need to know that Job XYZ ran successfully in 2006 and only took 15 seconds? This is 2015 and that kind of data is probably not pertinent at this point.

The pruning of this data is not enabled by default! You have to configure this for each of the servers under your purview. Luckily, this is easy to do!

If you use this code sample, be sure to adjust the number of days shown in the retention to match your specific needs.
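A minimal sketch of such a pruning routine, assuming a single retention period for every history type (the 90-day value and the variable names are placeholders of mine, not a recommendation), might look like the following. It would typically be scheduled as a SQL Agent job.

```sql
-- Assumed retention: adjust @RetentionDays to match your needs.
DECLARE @RetentionDays INT = 90;
DECLARE @CutoffDate DATETIME = DATEADD(DAY, -@RetentionDays, GETDATE());

-- Backup and restore history
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @CutoffDate;

-- SQL Agent job history
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @CutoffDate;

-- Database Mail items and the Database Mail log
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @CutoffDate;
EXEC msdb.dbo.sysmail_delete_log_sp @logged_before = @CutoffDate;

-- Maintenance plan history
EXEC msdb.dbo.sp_maintplan_delete_log @oldest_time = @CutoffDate;
```

If the history has been accumulating for years, consider stepping the cutoff date forward in smaller increments on the first run so that a single call does not have to delete everything at once.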

Now we have addressed a couple of defaults in msdb that can impact your performance. We are tidying up the database and in a much happier state these days. There is one more default, though, that is really critical to your data’s well being. This one is set within the msdb database but it really is for all of your databases!

Configuring Alerts!

I’m not talking about just any alerts. There are some very specific alerts that really should be configured. These are the alerts that can help you intervene to minimize corruption.

If you haven’t faced a problem with corruption – you will. It is only a matter of time. Corruption happens. When it happens, the earlier one can intervene, usually the better the outcome. Every minute counts, so why not try to reduce that time as much as possible?

This one is not terribly difficult to implement. I happen to have a query ready to go for that as well. All that needs to be done is a minor adjustment to the alert email address:
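As a sketch of what such a setup can look like, the following creates an operator and three alerts. It assumes the alerts of interest are errors 823, 824, and 825 (the I/O and read-retry errors commonly associated with corruption); the operator name and email address are placeholders, and Database Mail must already be configured for the notifications to go anywhere.

```sql
USE msdb;
GO

-- Placeholder operator details: change these for your environment.
DECLARE @OperatorName  SYSNAME       = N'DBA_Alerts';
DECLARE @OperatorEmail NVARCHAR(100) = N'dba-team@example.com';

IF NOT EXISTS (SELECT 1 FROM dbo.sysoperators WHERE name = @OperatorName)
    EXEC dbo.sp_add_operator
         @name = @OperatorName,
         @enabled = 1,
         @email_address = @OperatorEmail;

-- Note: sp_add_alert raises an error if an alert of the same name exists.
EXEC dbo.sp_add_alert
     @name = N'Error 823 - I/O error',
     @message_id = 823,
     @severity = 0,
     @enabled = 1,
     @delay_between_responses = 60,
     @include_event_description_in = 1;
EXEC dbo.sp_add_notification
     @alert_name = N'Error 823 - I/O error',
     @operator_name = @OperatorName,
     @notification_method = 1;  -- 1 = email

EXEC dbo.sp_add_alert
     @name = N'Error 824 - Logical consistency I/O error',
     @message_id = 824,
     @severity = 0,
     @enabled = 1,
     @delay_between_responses = 60,
     @include_event_description_in = 1;
EXEC dbo.sp_add_notification
     @alert_name = N'Error 824 - Logical consistency I/O error',
     @operator_name = @OperatorName,
     @notification_method = 1;

EXEC dbo.sp_add_alert
     @name = N'Error 825 - Read retry required',
     @message_id = 825,
     @severity = 0,
     @enabled = 1,
     @delay_between_responses = 60,
     @include_event_description_in = 1;
EXEC dbo.sp_add_notification
     @alert_name = N'Error 825 - Read retry required',
     @operator_name = @OperatorName,
     @notification_method = 1;
GO
```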


Wow! Now there are three quick defaults that must be changed on every server. These defaults will help improve performance as well as help you stay on top of things when they start to go south (corruption). With timely notifications, and better performance, your servers will be happier, healthier, and longer lasting.

Thanks for reading! This has been another article in the Back to Basics series.

Audit SQL Agent Jobs

One probably seldom thinks of the SQL Agent jobs scheduled on the SQL Server instance – unless they fail. What if the job failed because something was changed in the job? Maybe you knew about the change, maybe you didn’t.

Once upon a time, I was in the position of trying to figure out why a job failed. After a bunch of digging and troubleshooting, it was discovered that the job had changed but nobody knew when or why. Because of that, I was asked to provide a low cost audit solution to try and at least provide answers to the when and who of the change.

Tracking who made a change to an agent job should be a task added to each database professional’s checklist / toolbox. Being caught off guard by a change to a system under your purview isn’t necessarily a fun conversation – nor is it pleasant to be the one to find that somebody changed your jobs without notice – two weeks after the fact! Usually, that means that there is little to no information about the change and you find yourself getting frustrated.

To the Rescue

When trying to come up with a low to no-cost solution to provide an audit, Extended Events (XE) is quite often very handy. XE is not the answer to everything, but it does come in handy very often. This is one of those cases where an out of the box solution from XE is pretty handy. Let’s take a look at how a session might be constructed to help track agent job changes.

With this session, I am using degree_of_parallelism as a sort of catch-all in the event that queries that cause a change are not trapped by the other two events (sql_statement_completed and sp_statement_completed). With the degree_of_parallelism event, notice I have a filter to exclude all “Select” statement types. This will trim some of the noise and help track the changes faster.
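A session along these lines can be sketched as follows. The session name, the chosen actions, the file target name, and the msdb filter are assumptions of mine for illustration. In particular, the numeric filter assumes that key 1 in the statement_type map means Select, which is worth confirming in sys.dm_xe_map_values on your build before relying on it.

```sql
-- Sketch of an audit session for SQL Agent job changes (names assumed).
CREATE EVENT SESSION [AuditAgentJobChanges_Sketch] ON SERVER
ADD EVENT sqlserver.degree_of_parallelism (
    ACTION (sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.nt_username,
            sqlserver.server_principal_name,
            sqlserver.sql_text)
    -- Agent job metadata lives in msdb; exclude Select statements
    -- (assumes map key 1 = Select for the statement_type map).
    WHERE ([sqlserver].[database_name] = N'msdb'
           AND [statement_type] <> 1)
),
ADD EVENT sqlserver.sp_statement_completed (
    ACTION (sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.nt_username,
            sqlserver.server_principal_name,
            sqlserver.sql_text)
    WHERE ([sqlserver].[database_name] = N'msdb')
),
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.nt_username,
            sqlserver.server_principal_name,
            sqlserver.sql_text)
    WHERE ([sqlserver].[database_name] = N'msdb')
)
ADD TARGET package0.event_file (
    SET filename = N'AuditAgentJobChanges_Sketch.xel'
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS,
      STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [AuditAgentJobChanges_Sketch] ON SERVER STATE = START;
```

Filtering on msdb is only a first cut at trimming noise; on a busy instance you would likely narrow it further.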

Looking at data captured by this session, I can expect to see results like the following.

And the degree_of_parallelism event will catch data such as this.

In this example, the deletion of a job was captured by the degree_of_parallelism event. In addition to catching all of the various events that fire as jobs are being changed and accessed, one will also be able to get a closer look at how SQL Agent goes about its routine.

The Wrap

Extended Events can prove helpful for many additional tasks that may not be thought of on an every day basis. With a little more thought, we can often find a cool solution via Extended Events to help us be better data professionals. In this article, we see one example of that put to use by using XE to audit Agent Job changes.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

For another interesting article about SQL Agent, check this one out!

Automating like an Enterprise DBA

Published on: January 8, 2019

TSQL Tuesday

The second Tuesday of the month comes to us a little early this month. That means it is time again for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.


The theme as chosen by Garry Bargsley (b | t) is about automation. Specifically, Garry has provided two requirements about automation for this month. As always, there is leeway in a post that participates in TSQL Tuesday.

One of the things that should seem very commonplace to a data professional is the effort to become a lazy DBA. A lazy DBA is not a bad thing. It just means the DBA works hard to automate the repetitive mundane tasks that may be tedious and/or time consuming. Time can always be better spent somewhere else, right?

If you are lacking in any ideas for what can be automated, here are a few TSQL Tuesday roundups from when we have talked about automation previously (yes it is a hot topic – ALWAYS!).

  1. August 2010 – Beach Time – what do you automate to earn beach time?
  2. February 2011 – Automation in SQL Server – Give your best tricks for making your life easier through automation.
  3. January 2014 – Automation – How much of it is the same?
  4. September 2015 – The Enterprise – How does one manage an enterprise of databases?
  5. September 2017 – PowerShell Automation – Find something and automate it.

In the past, I have written about automation a few times. Some of my favorites are automated restores, automation in the cloud, and my poor man’s automated audit.

I automate many processes and have automated loads of tasks over the years. You see, automation means I can spend more time doing other tasks that require more time, more thought/concentration, more effort, and frankly more interest. So what have I automated recently that may be different from what I have previously written? This time, I have something that may seem utterly trivial but in the end it is rather tedious and time consuming to manually check over and over and over.


When I automate a task, I generally will try to use the tool that seems the most appropriate for the task: Windows scheduler, SQL Agent, TSQL, SSIS, VB, C#, and now I am trying to add PoSH to that list. I don’t believe there is a one-size-fits-all automation tool. Sometimes, one has to be flexible enough to adapt other technologies into the tool-belt.

I have been working with a client to check their servers for SQL Server version, SSMS version, PoSH version and so on. All of this to try and get the appropriate updates installed on the server. Believe it or not, many of their servers were still running PoSH v2 and didn’t have any Service Packs installed for their database servers. OUCH!

Touching every single server (even if it is only 10 servers) is far too tedious and error prone. So, I spent a little time kludging something together with my neanderthal-level PoSH skills and found a way to retrieve various pieces of information from the servers and then store those data points in a database so I could report on the entire environment easily with TSQL. In addition, I could show change history and find approximately (at worst) when an update was installed.

Of all of the things I scripted to start tracking, the one I want to share this time can also be used to audit security on each of the database servers. I use the following script to audit the localadmins on each of the database servers in the enterprise. In order to trap each local admin on the server, I also recurse through domain groups to find all users of a group to find everybody that may have access. Here is a version of the script that is similar to what I use now.

Could I improve on the efficiency of this script? Most definitely. I believe there is room for improvement. Remember, I am very much a novice with PoSH. Scripting issues aside, it works and basically fetches a list of servers from a database, then iterates through each of those servers to fetch the complete list of local admins on each of the servers. Then the script writes out the complete list of admins for each server back to my database so I can generate a history of changes to the admins or report on who has admin access on the server.

For anybody that has admin access to a database server, the permission path (nested group path) is recorded in hierarchical form separated by the caret character (^). Using this script, I have been able to provide a report to domain admins so they can remove individuals whose access was neither intended nor necessary.

Wrapping it Up

Automation is an essential tool for every data professional. Wait, no, that’s not accurate. Automation is an essential tool in all facets of IT. Automation is a definitive method to work more efficiently and offload some of the mundane repetitive tasks that consume too much time.

Even if the task is not trivial but needs to be repeated and done so without error, the best tool is automation. Performing tasks over and over naturally leads to higher risk of error. The way to minimize that risk is to perform the task via some automation script or routine.

Short Circuiting Your Session

It isn’t very often that one would consider a short circuit to be a desired outcome. In SQL Server we have a cool exception to that rule – Extended Events (XE).

What exactly is a short circuit and why would it be undesirable in most cases? I like to think of a short circuit as a “short cut” in a sense.

I remember an experience that happened while running a marathon many years ago. A person I had pulled up next to and started chatting with needed to use the restroom. I continued along on the course and a mile later I saw the same person suddenly reappear on the course ahead of me. This person had found a short cut on the course and decided to use it. If caught, he would have been disqualified. He may have saved himself a mile of running and gotten a better time, but the act was to take a course that was not the intended official course for that race.

In electricity, a short circuit does a similar thing. The electricity will follow the path of least resistance. Sometimes, this means a path other than the one intended for the current to flow. The end result can be very bad in electrical terms as an overload may occur which can cause overheating and sparking.

Why would we want an overload?

In electricity and mechanical parts, we really don’t want anything to cause short cuts in the system. On the other hand, when we are dealing with tracing and anything that can put a load on the system, we want that load to be as small as possible.

Trying to trace for problems in the SQL Server engine comes with a cost. That cost comes in the form of additional resource requirements which could mean fewer resources available for the engine to process user requests. None of us wants the end-user to be stuck waiting in a queue for resources to free up due to our tracing activities (e.g. Profiler). So a lightweight method (to trace) is needed.

XE is that lightweight method. A big part of the reason for that is the ability of XE to short-circuit (short-cut) to the end result. How can an XE session short-circuit? Think logic constraints and predicates. I previously demonstrated how to short-cut the system by using a counter in the predicate, but the short circuit isn’t constrained to just a counter in the predicate. The short-circuit is super critical to performance and success, but it is often misunderstood and poorly explained. So, I am trying to explain it again – better.

If we follow the principle that a short-circuit is the path of least resistance, we have a construct for how to build the predicate for each event in a session. Think of it as path of least work. Just like with children, XE and electricity will evaluate each junction with a bit of logic. Do I have to do more work if I go down this path or less work? Less work? Great, I am going in this direction.

As an event is fired off and is picked up by the XE session, the session compares that event payload to the conditions in the predicate. Everything in the predicate is processed in precise order – until a predicate condition fails the comparison (the result is false). As soon as a condition evaluates to false, evaluation for that event stops: the session skips the remaining conditions and the event is discarded. Nothing more is processed.

This is why predicate order matters. If a predicate evaluates to false, the short-circuit is invoked and the evaluation ends. With that in mind, what is the most desirable condition in the predicate to be first?

I have heard multiple people state that the “most likely to succeed” predicate should be first. Well, if the “most likely success” is first what does that mean for your session? It will have to do more work! That is exactly the model that Profiler used (uses) and we all know what happens with Profiler and performance!

No! We don’t want the most likely to succeed to be the first predicate to be evaluated. We want the least likely to succeed to be first. This means less work – just as illustrated in the previous image where the short-circuit is represented by the red line. If you would like, we can also call each of the three light-bulbs “predicates” and the switch would be the event (nothing is traced in the session if the event doesn’t even match).

Which Comes First?

This brings us to the hard part. How should one order the predicates for each event? The answer to that is not as cut and dried as you probably want. There are many variables in the equation. For instance, the first variable would be the environment. Each SQL environment is different and that makes a difference in evaluating events and predicates. However, let’s use a common-ish set of criteria and say we need to decide between database name and query duration.

The questions in this case now come down to 1) how many databases are on the server? and 2) what are the chances of a query lasting more than 5 seconds? If you have 100 databases on the server and 99 of them frequently see queries over 5 seconds, then this predicate order would make sense. What if you have only 4 databases and a query over 5 seconds occurs roughly 1 in 10,000 times? Then the predicate order should be switched to the following.

If you don’t have a database by the name of “AdventureWorks2014” then the database name predicate would remain first but really it should be changed to an appropriate database name that exists.
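To make the ordering concrete, here is a minimal sketch of the second scenario (few databases, rare long-running queries), where the duration check is the least likely to succeed and so goes first. The session name and the ring_buffer target are placeholders of mine; the 5-second threshold is written as 5000000 because sql_statement_completed reports duration in microseconds.

```sql
-- Sketch: least-likely-to-succeed condition first, so most events are
-- rejected after a single comparison.
CREATE EVENT SESSION [LongQueries_Sketch] ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.database_name,
            sqlserver.sql_text)
    WHERE (
        [duration] > 5000000                                      -- rare: ~1 in 10,000
        AND [sqlserver].[database_name] = N'AdventureWorks2014'   -- common: 1 of 4 databases
    )
)
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = OFF);
```

In the first scenario (100 databases, long queries everywhere), the two conditions would simply swap places so that the database name check rejects most events first.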

The Wrap

Predicate order in an XE session is very important. A well designed predicate can lead to a highly tuned and well performing trace that will ease your life as a data professional. Just remember, contrary to various people out there, the most desirable predicate order is to have the “least likely to succeed” first and the “most likely to succeed” should be last.

And yes, we truly do want our XE sessions to short-circuit! As we aspire to do less busy work, an XE session should be configured to do as little work as is necessary.

This has been the eleventh article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Automatic Tuning Monitoring and Diagnostics

Cool new toys/tools have been made available to the data professional. Among these tools are query data store and automatic tuning. These two tools actually go hand in hand and work pretty nicely together.

With most new tools, there is usually some sort of instruction manual along with a section on how to troubleshoot the tool. In addition to the manual, you usually have some sort of guide as to whether or not the tool is working within desired specifications or not.

Thanks to Extended Events (XE), we have access to a guide of sorts that will help us better understand if our shiny new tool is operating as desired.

Operationally Sound

XE provides a handful of events to help us in evaluating the usage of Automatic Tuning in SQL Server. To find these events, we can simply issue a query such as the following.
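One way to write such a query, assuming the relevant objects all carry "automatic_tuning" in their names (the name filter and the aliases are my assumptions), is sketched here:

```sql
-- Sketch: list XE objects (events, maps, and so on) related to
-- automatic tuning, along with the package that owns each one.
SELECT xp.name        AS package_name,
       xo.name        AS xe_object_name,
       xo.object_type,
       xo.description
FROM sys.dm_xe_objects AS xo
INNER JOIN sys.dm_xe_packages AS xp
        ON xo.package_guid = xp.guid
WHERE xo.name LIKE N'%automatic_tuning%'
ORDER BY xo.object_type, xo.name;
```

Including the package name is handy later, since an event has to be referenced as package.event when it is added to a session.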

When executed, this query will provide a result set similar to the following.

I have grouped the results from this query into three sets. In the red set, I have four events that are useful in the diagnostics and monitoring of automatic tuning. These events show errors, diagnostic (and performance) data, configuration changes and state changes.

For instance, the state change event will fire when automatic tuning is enabled and will also fire when the database is started (assuming the session is running). The automatic_tuning_diagnostics event fires roughly every 30 minutes on my server to gather performance and diagnostic data that can help me understand how well the feature is performing for my workload in each database.

Highlighted in the green section are a couple of maps that show the various values for the current phase or state of the automatic tuning for each database. One can view these different values with the following query.
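A minimal sketch of a map lookup, again assuming the map names contain "automatic_tuning", could be:

```sql
-- Sketch: show the key/value pairs for the automatic tuning maps.
SELECT mv.name AS map_name,
       mv.map_key,
       mv.map_value
FROM sys.dm_xe_map_values AS mv
WHERE mv.name LIKE N'%automatic_tuning%'
ORDER BY mv.name, mv.map_key;
```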

This query yields these results.

We will see those values in use in the events in a session shortly.

We have seen some of the events and some of the maps at a very quick glance. That said, it is a good time to pull it all together and create a session.

Seeing as this session won’t produce any results without Query data store being enabled and automatic tuning being configured for a database, I have set all of that up in a demo database and have some fresh results to display.

Here I show an example of the output filtered for just the diagnostics event. Note the phase_code shows some of those map values previously discussed. I can also see that roughly every 30 minutes each database undergoes a diagnostics check.

Now, looking at another event in that same session, I can see the following.

The state_code in this event payload demonstrates more values from the maps previously discussed (CorrectionEnabled and DetectionEnabled). In this case, the automatic_tuning_state_change fired a few times for database 6 because that database was intentionally taken offline and set back online to test the event.

The use of these particular events in this session is very lightweight. I don’t have a predicate configured for any of the events because I wanted to trap everything. Of course, the number of events can increase with an increased load and usage scenarios on different servers.

The Wrap

Automatic tuning can be a pretty sharp tool in your tool-belt on your way to becoming that rock-star DBA. As you start to sharpen your skills with this tool, you will need to have some usage and diagnostic information at your fingertips to ensure everything is running steady. This event session is able to provide that diagnostic information and keep you on top of the automatic tuning engine.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the eleventh article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Event Tracing for Windows Target

There are many useful targets within SQL Server’s Extended Events. Of all of the targets, the most daunting is probably the Event Tracing for Windows (ETW) target. The ETW target represents doing something that is new for most DBAs which means spending a lot of time trying to learn the technology and figure out the little nuances and the difficulties that it can present.

With all of that in mind, I feel this is a really cool feature and it is something that can be useful in bringing the groups together that most commonly butt heads in IT (Ops, DBA, Devs) by creating a commonality in trace data and facts. There may be more on that later!

Target Rich

The ETW target is a trace file that can be merged with other ETW logs from Windows or applications (if they have enabled this kind of logging). You can easily see many default ETW traces that are running or can be run in Windows via Perfmon or from the command line with the following command.

And from the GUI…

Finding the traces is not really the difficult part with this type of trace. The difficult parts (I believe) come down to learning something new and different, and that Microsoft warns that you should have a working knowledge of it first (almost like a big flashing warning that says “Do NOT Enter!”). Let’s try to establish a small knowledge base about this target to ease some of the discomfort you may now have.

One can query the DMVs to get a first look at what some of the configurations may be for this target (optional and most come with defaults already set).
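A minimal sketch of that kind of metadata query, using the target name etw_classic_sync_target as it appears in the error messages later in this article:

```sql
-- Sketch: list the customizable options of the ETW target and their defaults.
SELECT  p.name              AS package_name,
        o.name              AS target_name,
        oc.name             AS option_name,
        oc.type_name,
        oc.column_value     AS default_value,
        oc.capabilities_desc,                 -- shows whether an option is mandatory
        oc.description
FROM    sys.dm_xe_objects        AS o
JOIN    sys.dm_xe_packages       AS p
          ON p.guid = o.package_guid
JOIN    sys.dm_xe_object_columns AS oc
          ON  oc.object_name         = o.name
          AND oc.object_package_guid = o.package_guid
WHERE   o.object_type  = N'target'
  AND   o.name         = N'etw_classic_sync_target'
  AND   oc.column_type = N'customizable'
ORDER BY oc.column_id;
```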

Six configurations in total are available for the ETW target. In the query results (just above) you will see that the default value for each configuration option is displayed. For instance, the default_xe_session_name has a default value of XE_DEFAULT_ETW_SESSION. I like to change default names and file paths, so when I see a name such as that, rest assured I will change it. (Contrary to popular belief, the path and session name default values can certainly be changed.)

As I go forward into creating an XE session using the ETW target, it is important to understand that only one ETW session can exist. This isn’t a limitation of SQL Server per se, but rather a combination of the use of the ETW Classic target (for backwards compatibility) and Windows OS constraints. If the ETW target is used in more than one XE session on the server (even in a different SQL Server instance), then all of them will use the same trace target (consumer) in Windows. This can cause a bit of confusion if several sessions are running concurrently.

My recommendation here is to use a very precise and targeted approach when dealing with the ETW target. Only run it for a single XE session at a time. This will make your job of correlating and translating the trace much easier.

The ETW target is a synchronous target and does NOT support asynchronous publication. Because the target consumes events synchronously, an event that is defined in multiple sessions will be consumed just a single time by the ETW target. This is a good thing!

Two more tidbits about the ETW target before creating an event session and looking at more metadata. The default path for the target is %TEMP%\<filename>.etl. This is not defined in the configuration properties but is hardwired. Any ideas why one might want to specify a different path? I don’t like to use the temp directory for anything other than transient files that are disposable at any time!

Whether you change the directory from the default or leave it be, understand that it cannot be changed after the session starts – even if other sessions use the same target and are started later. However, if you flush the session and stop it, then you can change it. I do recommend that it be changed!

The second tidbit is that, besides the classic target, ETW also has a manifest-based provider. Should Extended Events (XE) be updated to use the manifest-based provider, some of the nuances in translating the trace data will disappear (a future article will cover ntrace and xperf, so stay tuned). For now, understand that viewing the ETW trace data is not done via SQL Server methods. Rather, you need to view it with another tool, because ETW is an OS-level trace and not a SQL Server trace.

Session Building

If it is not clear at this point, when creating an XE session that utilizes the ETW target, two traces are, in essence, created. One trace is a SQL Server (XE) trace that can be evaluated within SQL Server. The second trace is an ETW trace that is outside the realm of SQL Server and thus requires new skills in order to review it. Again, this second trace can be of extreme help because it is more easily merged with other ETW traces (think merging Perfmon with a SQL trace).

When I create a session with an ETW target, it would not be surprising to see that I have two targets defined. One target will be the ETW target and a second may be a file target or any of the others if it makes sense. The creation of two targets is not requisite for the XE session to be created. The XE data will still be present in the livestream target even without a SQL related target.

Before creating a session, I need to cover a couple of possible errors that won’t be easy to find on Google.

Msg 25641, Level 16, State 0, Line 101 For target, “5B2DA06D-898A-43C8-9309-39BBBE93EBBD.package0.etw_classic_sync_target”, the parameter “default_etw_session_logfile_path” passed is invalid.
The operating system returned error 5 (ACCESS_DENIED) while creating an ETW tracing session.
ErrorFormat: Ensure that the SQL Server startup account is a member of the ‘Performance Log Users’ group and then retry your command.

I received this error message even with my service account being a member of the “Performance Log Users” Windows group. I found that I needed to grant explicit permissions to the service account on the logging directory that I had specified.

Msg 25641, Level 16, State 0, Line 105 For target, “5B2DA06D-898A-43C8-9309-39BBBE93EBBD.package0.etw_classic_sync_target”, the parameter “default_xe_session_name” passed is invalid.
The default ETW session has already been started with the name ‘unknown’.
Either stop the existing ETW session or specify the same name for the default ETW session and try your command again.

This error was more difficult than the first and probably should have been easier. I could not find the session called ‘unknown’ hard as I might have tried. Then it occurred to me (sheepishly) that the path probably wanted a file name too. If you provide a path and not a filename for the trace file, then this error will nag you.

I found both error cases to be slightly misleading but resolvable quickly enough.

The session is pretty straightforward here. I am just auditing logins that occur on the server and sending them to both the ETW and event_file targets. To validate the session is created and that indeed the ETW session is not present in SQL Server, I have the following script.
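A sketch of what such a session and validation script might look like is shown here. The session name, folder, and file names are hypothetical placeholders; the option name default_etw_session_logfile_path comes from the error message above, and the path deliberately includes a file name to avoid the second error.

```sql
-- Sketch: audit logins to both the ETW target and a file target.
-- ASSUMPTIONS: C:\Database\XE\ exists and the SQL Server service account
-- has explicit write permission on it; all names here are hypothetical.
CREATE EVENT SESSION [AuditLoginETW] ON SERVER
ADD EVENT sqlserver.login (
    ACTION (sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.server_principal_name)
)
ADD TARGET package0.etw_classic_sync_target (
    -- full path INCLUDING the file name
    SET default_etw_session_logfile_path = N'C:\Database\XE\sqletw.etl'
),
ADD TARGET package0.event_file (
    SET filename           = N'C:\Database\XE\AuditLoginETW.xel',
        max_file_size      = (50),
        max_rollover_files = (4)
)
WITH (STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [AuditLoginETW] ON SERVER STATE = START;
GO

-- Validation: XE sessions defined on the server and whether each is running.
-- The ETW session itself lives in Windows, so it is not expected to show up here.
SELECT  ses.name AS session_name,
        CASE WHEN dxs.address IS NULL THEN 'Stopped' ELSE 'Running' END AS session_state
FROM    sys.server_event_sessions AS ses
LEFT JOIN sys.dm_xe_sessions      AS dxs
          ON dxs.name = ses.name
ORDER BY ses.name;
```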

Despite the absence of the ETW session from SQL Server, I can still easily find it (again, either from a shell or from the Perfmon GUI). Here is what I see when checking for it from a shell.

Even though the session (or session data) is not visible from SQL Server, I can still find out a tad more about the target from the XE related DMVs and catalog views.
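One way to sketch that lookup is with two queries: the first reads the stored definition from the catalog views, and the second reads the runtime state of the target from the DMVs (which returns rows only while a session using the target is running).

```sql
-- Definition side: options explicitly set on the ETW target for each session.
SELECT  ses.name   AS session_name,
        st.package AS target_package,
        st.name    AS target_name,
        sf.name    AS option_name,
        sf.value   AS option_value
FROM    sys.server_event_sessions        AS ses
JOIN    sys.server_event_session_targets AS st
          ON st.event_session_id = ses.event_session_id
LEFT JOIN sys.server_event_session_fields AS sf
          ON  sf.event_session_id = st.event_session_id
          AND sf.object_id        = st.target_id
WHERE   st.name = N'etw_classic_sync_target';

-- Runtime side: how often the target has executed for running sessions.
SELECT  s.name AS session_name,
        t.target_name,
        t.execution_count,
        t.execution_duration_ms
FROM    sys.dm_xe_sessions        AS s
JOIN    sys.dm_xe_session_targets AS t
          ON t.event_session_address = s.address
WHERE   t.target_name = N'etw_classic_sync_target';
```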

Running that query will result in something similar to this:

The Wrap

I have just begun to scratch the surface of the ETW target. This target can provide plenty of power for troubleshooting when used in the right way. The difficulty may seem to be getting to that point of knowing what the right way is. This target may not be suitable for most troubleshooting issues, unless you really need to correlate real Windows metrics to SQL metrics and demonstrate to Joe Sysadmin that what you are seeing in SQL truly does correlate to certain conditions inside of Windows. Try it out and try to learn from it and figure out the right niche for you. In the interim, stay tuned for a follow-up article dealing with other tools and ETW.

This has been the tenth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Checking your Memory with XE

It is well known and understood that SQL Server requires a substantial amount of memory. SQL Server will also try to consume as much memory as possible from the available system memory – if you let it. Sometimes, there will be some contention / pressure with the memory.

When contention occurs, the users will probably start screaming because performance has tanked and deadlines are about to be missed. There are many different ways (e.g. here or here) to try and observe the memory conditions and even troubleshoot memory contention. Extended Events (XE) gives one more avenue to try and troubleshoot problems with memory.

Using XE to observe memory conditions is a method that is both geeky/fun and an advanced technique at the same time. If nothing else, it will certainly serve as a diversion from the mundane and give you an opportunity to dive down a rabbit hole while exploring some SQL Server internals.

Diving Straight In

I have a handful of events that I have picked for an event session to track when I might be running into some memory problems. Or I can run the session when I suspect there are memory problems to try and provide me with a “second opinion.” Here are the pre-picked events.

Investigating those specific events a little further, I can determine if the payload is close to what I need.
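A hedged sketch of a payload query is shown here. Only large_cache_memory_pressure is named in this article, so it is the only event in the IN list; add the other events you have picked. The SearchKeyword column is resolved through the event's readonly KEYWORD column and its map, and comes back NULL when an event has no category.

```sql
-- Sketch: payload (data columns) plus category for a list of events.
-- Extend the IN list with your own pre-picked event names.
SELECT  p.name        AS package_name,
        o.name        AS event_name,
        kw.map_value  AS SearchKeyword,
        oc.name       AS column_name,
        oc.type_name,
        oc.description
FROM    sys.dm_xe_objects        AS o
JOIN    sys.dm_xe_packages       AS p
          ON p.guid = o.package_guid
JOIN    sys.dm_xe_object_columns AS oc
          ON  oc.object_name         = o.name
          AND oc.object_package_guid = o.package_guid
          AND oc.column_type         = N'data'
OUTER APPLY (
        -- translate the numeric keyword into its readable category name
        SELECT  mv.map_value
        FROM    sys.dm_xe_object_columns AS k
        JOIN    sys.dm_xe_map_values     AS mv
                  ON  mv.name                = k.type_name
                  AND mv.object_package_guid = k.type_package_guid
                  AND mv.map_key             = TRY_CONVERT(int, k.column_value)
        WHERE   k.object_name         = o.name
          AND   k.object_package_guid = o.package_guid
          AND   k.name                = N'KEYWORD'
          AND   k.column_type         = N'readonly'
) AS kw
WHERE   o.object_type = N'event'
  AND   o.name IN (N'large_cache_memory_pressure')
ORDER BY o.name, oc.column_id;
```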

That is a small snippet of the payload for all of the pre-picked events. Notice that the large_cache_memory_pressure event has no “SearchKeyword” / category defined for it. There are a few other events that also do not have a category assigned which makes it a little harder to figure out related events. That said, from the results, I know that I have some “server” and some “memory” tagged events, so I can at least look at those categories for related events.
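Looking up related events by category can be sketched like this. It uses the same keyword map lookup and filters on the two categories mentioned above; treat it as one possible approach, not the only way to browse events.

```sql
-- Sketch: all events tagged with the "server" or "memory" category.
SELECT  p.name       AS package_name,
        o.name       AS event_name,
        mv.map_value AS SearchKeyword,
        o.description
FROM    sys.dm_xe_objects        AS o
JOIN    sys.dm_xe_packages       AS p
          ON p.guid = o.package_guid
JOIN    sys.dm_xe_object_columns AS k
          ON  k.object_name         = o.name
          AND k.object_package_guid = o.package_guid
          AND k.name                = N'KEYWORD'
          AND k.column_type         = N'readonly'
JOIN    sys.dm_xe_map_values     AS mv
          ON  mv.name                = k.type_name
          AND mv.object_package_guid = k.type_package_guid
          AND mv.map_key             = TRY_CONVERT(int, k.column_value)
WHERE   o.object_type = N'event'
  AND   mv.map_value IN (N'server', N'memory')
ORDER BY mv.map_value, o.name;
```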

This query will yield results similar to the following.

If you look closely at the script, I included a note about some additional interesting events that are related to both categories “server” and “memory.”

After all of the digging and researching, now it’s time to pull it together and create a session that may possibly help to identify various memory issues as they arise or to at least help confirm your sneaking suspicion that a memory issue is already present.
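As a starting point, here is a minimal sketch that builds a session around the one event named in this article. The session name and file name are hypothetical. Because I am not assuming which package owns the event, the script looks the package up first and builds the CREATE statement dynamically; add your remaining events in the same way.

```sql
-- Sketch: memory-focused session seeded with large_cache_memory_pressure.
-- Session and file names are hypothetical placeholders.
DECLARE @package sysname,
        @sql     nvarchar(max);

-- Resolve the owning package instead of hard-coding it.
SELECT TOP (1) @package = p.name
FROM    sys.dm_xe_objects  AS o
JOIN    sys.dm_xe_packages AS p
          ON p.guid = o.package_guid
WHERE   o.object_type = N'event'
  AND   o.name        = N'large_cache_memory_pressure';

IF @package IS NULL
    RAISERROR(N'Event large_cache_memory_pressure was not found on this instance.', 16, 1);
ELSE
BEGIN
    SET @sql = N'
CREATE EVENT SESSION [MemoryCheck] ON SERVER
ADD EVENT ' + @package + N'.large_cache_memory_pressure (
    ACTION (sqlserver.database_name, sqlserver.sql_text)
)
ADD TARGET package0.event_file (
    SET filename           = N''MemoryCheck.xel'',
        max_file_size      = (50),
        max_rollover_files = (4)
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = OFF);';

    EXEC sys.sp_executesql @sql;
END;
```

Capping the file size and rollover count keeps the trace files bounded, which matters for a session that can generate a flood of events.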

When running this session for a while, you will receive a flood of events as they continue to trigger and record data to your trace file. You will want to keep a steady eye on the trace files and possibly only run the session for short periods.

Here is an example of my session with events grouped by event name. Notice anything of interest between the groups?
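The same grouping can be approximated in T-SQL by reading the trace files directly. This is a sketch only: the file pattern MemoryCheck*.xel is a hypothetical placeholder and should be replaced with the file name (and path, if any) used by your session's event_file target.

```sql
-- Sketch: count captured events by event name from the file target.
-- Replace the hypothetical file pattern with your session's file name.
SELECT  f.object_name AS event_name,
        COUNT(*)      AS event_count
FROM    sys.fn_xe_file_target_read_file(N'MemoryCheck*.xel', NULL, NULL, NULL) AS f
GROUP BY f.object_name
ORDER BY event_count DESC;
```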

If the data in the session does not seem to be helpful enough, I recommend looking at adding the additional events I noted previously.

Here is another view on a system that has been monitoring these events for a while longer and does experience memory pressure.

Here we can see some of the direct results of index operations on memory as well as the effects on memory for some really bad code. Really cool is that we can easily find what query(ies) may be causing the memory pressure issues and then directly tune the offending query(ies).

The Wrap

Diving into the internals of SQL Server can be useful in troubleshooting memory issues. Extended Events provides a means to look at many memory-related events that can be integral to solving or understanding some of your memory issues. Using Extended Events to dive into the memory-related events is a powerful tool to add to the memory troubleshooting tool-belt.

Try it out on one or more of your servers and let me know how it goes.

This has been the ninth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

