Database Recovery Monitoring with XE

One of the greatest benefits of Extended Events (XE) is how the tool simplifies some of the otherwise more difficult tasks.

Recently, I rewrote my database recovery progress report script. That script touches both the error log and some DMVs, along with some fuzzy logic to join the data sets together. It may not be the most complex script out there, but it is far more complex than simply using the power of XE.

Database recovery (crash recovery) is a nerve-wracking situation under the wrong conditions. It can be as bad as a root canal, and at times just as necessary to endure. When the business is waiting on you, and you are waiting on the server to finish crash recovery, you feel nervous at best. If you can be of some use and provide some information back to the business, that anxiety dissipates and the business becomes more calm as well. While the previous script can help you get that information easily enough, I want to introduce the easiest method currently available to capture that information.

If you are interested in a history lesson first, here are the first couple of versions of the aforementioned script (here and here).

Discovery First

As always, I like to explore the event repository to see if there is an event that may be applicable to my situation. This can be done via a T-SQL script or from the XE GUI. I will almost always break out my scripts to figure out whether an event is there or not.
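A sketch of the kind of discovery script I use follows. It searches the XE metadata DMVs for events whose names match a search term (the term here matches the scenario in this article):

```sql
-- Search the XE event repository for events related to database recovery.
SELECT xp.name AS package_name,
       xo.name AS event_name,
       xo.description
FROM sys.dm_xe_objects AS xo
     INNER JOIN sys.dm_xe_packages AS xp
        ON xo.package_guid = xp.guid
WHERE xo.object_type = 'event'
  AND xo.name LIKE '%database_recovery%'
ORDER BY xo.name;
```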

This query will yield any events that match my description. In this case, I am looking for events related to “database_recovery”. This search will yield four relevant events we can use to track our database recovery progress. Those events are shown in the following image (with the event names being circled in green).

If I explore the events a little more, I will eventually come across an attribute in the database_recovery_progress_report event that leads to a map. This map is called recovery_phase. For me, that is an interesting attribute/map and makes me want to look at it further.
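Peeking at the map values is a one-liner against the map DMV; a sketch:

```sql
-- List the phases exposed by the recovery_phase map.
SELECT mv.name, mv.map_key, mv.map_value
FROM sys.dm_xe_map_values AS mv
WHERE mv.name = N'recovery_phase'
ORDER BY mv.map_key;
```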

Things are coming together a little bit now. We all know (or should know) that there are analysis, redo, and undo phases to crash recovery. This map aligns with that knowledge and throws in a couple more phases of the recovery process.

So, now we know there are four relevant events for us to use and that one of the events will tell us specifically which phase of recovery is currently processing. We have enough information that an event session can now be built.
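A trimmed sketch of such a session follows. The session name, file path, and action list are my choices here (the full session throws many more actions at it), and the companion event names are taken from the discovery step, so confirm them on your build:

```sql
CREATE EVENT SESSION [DBRecovery] ON SERVER
    ADD EVENT sqlserver.database_recovery_progress_report (
        ACTION (package0.event_sequence,
                sqlserver.database_id,
                sqlserver.database_name)),
    ADD EVENT sqlserver.database_recovery_times (
        ACTION (package0.event_sequence, sqlserver.database_name)),
    ADD EVENT sqlserver.database_recovery_trace (
        ACTION (package0.event_sequence, sqlserver.database_name))
    ADD TARGET package0.event_file (
        SET filename = N'C:\Database\XE\DBRecovery.xel')
    -- start with the instance so crash recovery after a restart is captured
    WITH (STARTUP_STATE = ON);
```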

You may notice that I have thrown a lot of actions including the kitchen sink at this event session. Some of that is for consistency across sessions and some of it is simply for exploratory wants (and not needs). Feel free to add/remove actions from this list as you explore the use of this session in your environment.

Here is what that session produces on my test server with a simple stop/start of the SQL Server instance.

In the preceding image, I have the different events circled in red. I have also added the event_sequence action so I can see the relationship between these events as they progress from one to the next. The items circled in green (and connected by green arrows) show a few different things, such as the trace message, the database name, the database id, and the recovery time remaining. Circled in blue are the “de-stress” items that let us know the recovery is 100% complete.

Wrap

SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online.

Just because you need to have patience during the crash recovery process does not mean you have to work hard to get a status of the process or that you need to stress throughout the process. This XE event session will take a lot of work and stress out of the process. I would recommend having this lightweight session running on the server with the startup state set to enabled. This will make your job easier and definitely can make you look like a rockstar DBA.

This article has demonstrated the power of Extended Events. For a lot more reading on the topic, here is a list of over 100 articles.

An Experiment with Deadlocks

Everything can be fixed with a query hint (*cough* directive), right? If a certain process is consistently causing deadlocks, a simple ROWLOCK hint can be added to prevent it, right?

Well, for whatever reason, there seems to be a myth out there that when deadlocks come a-knocking, then just throw this little directive at it and all will be well. Today, we get to test that and show what will continue to happen.

First, let’s look at what ROWLOCK actually means:

Specifies that row locks are taken when page or table locks are ordinarily taken.

This seems like a fabulous idea if the deadlocks are occurring against a table involving a massive update. Let’s take a look at a small update scenario involving just a handful of records. For the setup, we will use the same setup used in a previous article by Wayne.
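That setup boils down to two small tables and two sessions updating them in opposite order inside open transactions. A minimal sketch of the pattern (table and column names here are illustrative, not Wayne’s exact script):

```sql
-- Run once: two five-row tables.
CREATE TABLE dbo.Table1 (ID int PRIMARY KEY, Col1 int, Col2 int);
CREATE TABLE dbo.Table2 (ID int PRIMARY KEY, Col1 int, Col2 int);
INSERT dbo.Table1 VALUES (1,0,0),(2,0,0),(3,0,0),(4,0,0),(5,0,0);
INSERT dbo.Table2 VALUES (1,0,0),(2,0,0),(3,0,0),(4,0,0),(5,0,0);

-- Session 1:
BEGIN TRAN;
UPDATE dbo.Table1 SET Col1 = 1, Col2 = 1 WHERE ID = 3;
-- Session 2 (in a separate window):
--   BEGIN TRAN;
--   UPDATE dbo.Table2 SET Col1 = 1, Col2 = 1 WHERE ID = 3;
-- Back in session 1 (blocks on session 2):
UPDATE dbo.Table2 SET Col1 = 2, Col2 = 2 WHERE ID = 3;
-- Session 2 then updates dbo.Table1 for ID = 3, completing the cycle;
-- one of the two sessions is chosen as the deadlock victim.
```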

Looking at the code, we can see there are only five rows in each of the tables. In addition, an update will be performed to both col1 and col2 in each table for one specific row. So we are keeping this to a singleton type of update, and we are able to force a deadlock with this setup. Not only does a deadlock occur consistently, we also see the following in the sys.dm_tran_locks DMV as well as in the deadlock graphs.

In my environment, I used sessions 51 and 54 consistently for the deadlock repro. In each of the tests, each spid obtained page locks as well as RID locks (row locks) that were waiting on each other in order to perform an update. This is what we saw when I ran the setup without the ROWLOCK directive. What if I used the ROWLOCK directive? (Are you guessing already that there would be no change, because the deadlock occurs on the lock held for the update that is waiting on the other update to complete?) Let’s take a look at that too!

The only change to this setup is that the ROWLOCK directive has been added to the update statements. Examining the sys.dm_tran_locks DMV reveals the same locks being held as was seen without the directive. This shouldn’t be too big of a surprise since the updates are against a single row.

In addition to the same locks being held, we continue to experience the same deadlock problem. Using an Extended Events session to trap deadlock information (similar to the article previously discussed), we can pull out some pretty good info. Let’s examine some of the deadlock data trapped by an XE session.
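A simplified sketch of that query follows. The real script does considerably more; the file path and name here are assumptions, and this version shreds both event types from the same event file rather than reproducing the full join:

```sql
-- Shred deadlock-related events from the session's event file.
;WITH xevents AS (
    SELECT CAST(f.event_data AS xml) AS event_data
    FROM sys.fn_xe_file_target_read_file(
            N'C:\Database\XE\Deadlock*.xel', NULL, NULL, NULL) AS f
)
SELECT
    event_data.value('(event/@name)[1]', 'varchar(64)')     AS event_name,
    event_data.value('(event/@timestamp)[1]', 'datetime2')  AS event_time,
    event_data.value('(event/action[@name="sql_text"]/value)[1]',
                     'nvarchar(max)')                       AS sql_text,
    event_data.query('(event/data[@name="xml_report"]/value/deadlock)[1]')
                                                            AS EventDeadlockGraph
FROM xevents
WHERE event_data.value('(event/@name)[1]', 'varchar(64)')
      IN ('lock_deadlock', 'xml_deadlock_report');
```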

The results from this query will show us the deadlock graph, the event data, as well as several other pieces of data already parsed from the session data for you. And looking at the session data, one can see that the sql_text from each of the queries will demonstrate both the ROWLOCK directive and the directive-free versions of the query. In this query you can also see that I did a little black magic to match up the two event types from the event session (lock_deadlock and xml_deadlock_report). Then I was able to join the two together to produce one row per deadlock event and to see the sql_text with the deadlock graph on one row. Otherwise, the sql_text does not produce with the deadlock_report event. I leave the rest of the query to the reader to discover and question.

From the EventDeadlockGraph column, we could click the cell and take a close look at the XML generated for the deadlock event. Further, if I choose to save the xml as an XDL file and then reopen it in SSMS, I can see the deadlock graphical report as shown in the following.

We see that row locks are still in effect for the update coming from both sides. This further supports that the directive really is just a waste of time in trying to combat this type of deadlock. This is one of those cases where the best option would be to optimize the code and work things out without trying to take a shortcut.

Wrapping Up

Look to optimize the code instead of trying to take a shortcut. In addition, take a look at the deadlocks, the locks held, and the code to get a better understanding of what is truly happening.

This article demonstrates briefly the power of Extended Events while diving into deadlocks. For more on using Extended Events, start reading here! This article may also be of interest.

Easy Permissions Audit

Published on: January 21, 2019

Something I have written about more than a handful of times is the need to audit. When people think about audits, the first thing that comes to mind is most likely the IRS and taxes. Far more than taxes is auditable, though. Despite that tendency to first think taxes when somebody says “audit”, I am not writing about taxes. I will typically write about the different topics within SQL Server that can be audited. Has Johnny performed a logon to the server? When was the last time the permissions to the database changed? Did somebody change an object? When was the last time stats were updated? How about auditing success and failure of your backups (or all agent jobs for that matter)? Those are the topics I will typically write about. Today, I am going to share a simple method to help perform an easy permissions audit – on a manual basis.

Easy Permissions Audit

As the article title denotes, today I will be discussing a simple way to get quick permissions for various principals. If you are looking for a more comprehensive and human friendly report version, I recommend reading any of my many other articles on the topic such as the following article – here or here. In the second of those links there is a clue as to what tool we will be using in this easy version of the audit. That tool is called sp_helprotect.

The stored procedure sp_helprotect is a system stored procedure from Microsoft that can help divulge permissions for various principals in a simple table result set for you. Bearing in mind that I am keeping this to a simple audit, the examples will be simplistic in nature. Without further ado, here is the easy audit for your permissions.

sp_helprotect

This stored procedure has been around since the early versions of SQL Server and comes with a few parameters to help narrow the results down to a specific principal and even to any object to which that principal may have been granted permissions. Here are those parameters for quick review:

  • @name = filters the request down to a specific object, or to a statement that can be executed against an object (e.g. ALTER, CREATE, DROP)
  • @username = the name of the principal for which permissions are returned
  • @grantorname = the name of the principal that granted permissions
  • @permissionarea = the group of grantable permissions to return; there are two groups (object and statement), and the default is to return both

The easiest way to use sp_helprotect is as follows:
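In its easiest form, you execute it with no parameters at all:

```sql
EXEC sp_helprotect;
```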

Do you see how easy that is? This returns the following results for me.

Note from the results that I see results for roles and users for various different objects. This is due to how the procedure was executed – with no parameters. Using no parameters in this query, the default behavior is to return as much information as possible for all objects and principals within the database.

What if I only want the results for a principal named “Gargouille”? I can do that in the following way.
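The filter goes into the @username parameter:

```sql
EXEC sp_helprotect @username = N'Gargouille';
```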

Now, I will receive the following results:

Recap

There are many ways to produce an audit. Today, I have shown how one can produce a permissions audit when in a hurry that will produce a simple result set for database permissions. I want to underscore that this was at the database level and not the server level. While this is an adequate means for a quick peek into some of the objects and granted permissions, I do recommend using one of the other methods I have introduced in the other articles for ongoing complex audits and results that are somewhat prettier and more human friendly to read.

For more articles on audits and auditing check here and here.

Audit SQL Agent Jobs

One probably seldom thinks of the SQL Agent jobs scheduled on the SQL Server instance – unless they fail. What if the job failed because something was changed in the job? Maybe you knew about the change, maybe you didn’t.

Once upon a time, I was in the position of trying to figure out why a job failed. After a bunch of digging and troubleshooting, it was discovered that the job had changed but nobody knew when or why. Because of that, I was asked to provide a low cost audit solution to try and at least provide answers to the when and who of the change.

Tracking who made a change to an agent job should be a task added to every database professional’s checklist and toolbox. Being caught off guard by a change to a system under your purview isn’t necessarily a fun conversation – nor is it pleasant to be the one to find that somebody changed your jobs without notice – two weeks after the fact! Usually, that means there is little to no information about the change, and you find yourself getting frustrated.

To the Rescue

When trying to come up with a low to no-cost solution to provide an audit, Extended Events (XE) is quite often very handy. XE is not the answer to everything, but it does come in handy very often. This is one of those cases where an out of the box solution from XE is pretty handy. Let’s take a look at how a session might be constructed to help track agent job changes.
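A sketch of that session follows. The session name and file path are mine, and the filter on degree_of_parallelism’s statement_type field is shown symbolically; depending on your build, the predicate may require the numeric map key instead of the string (check sys.dm_xe_map_values):

```sql
CREATE EVENT SESSION [AuditAgentJobs] ON SERVER
    ADD EVENT sqlserver.sql_statement_completed (
        ACTION (sqlserver.client_app_name,
                sqlserver.server_principal_name, sqlserver.sql_text)
        WHERE sqlserver.database_name = N'msdb'),
    ADD EVENT sqlserver.sp_statement_completed (
        ACTION (sqlserver.client_app_name,
                sqlserver.server_principal_name, sqlserver.sql_text)
        WHERE sqlserver.database_name = N'msdb'),
    -- catch-all: anything touching msdb that is not a plain select
    ADD EVENT sqlserver.degree_of_parallelism (
        ACTION (sqlserver.server_principal_name, sqlserver.sql_text)
        WHERE sqlserver.database_name = N'msdb'
          AND statement_type <> 'Select')
    ADD TARGET package0.event_file (
        SET filename = N'C:\Database\XE\AuditAgentJobs.xel')
    WITH (STARTUP_STATE = ON);
```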

With this session, I am using degree_of_parallelism as a sort of catch-all in the event that queries that cause a change are not trapped by the other two events (sql_statement_completed and sp_statement_completed). With the degree_of_parallelism event, notice I have a filter to exclude all “Select” statement types. This will trim some of the noise and help track the changes faster.

Looking at data captured by this session, I can expect to see results like the following.

And the degree_of_parallelism event will catch data such as this.

In this example, the deletion of a job was captured by the degree_of_parallelism event. In addition to catching all of the various events that fire as jobs are being changed and accessed, one will also be able to get a closer look at how SQL Agent goes about its routine.

The Wrap

Extended Events can prove helpful for many additional tasks that may not be thought of on an every day basis. With a little more thought, we can often find a cool solution via Extended Events to help us be better data professionals. In this article, we see one example of that put to use by using XE to audit Agent Job changes.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

For another interesting article about SQL Agent, check this one out!

Quick Permissions Audit

Whether it is for a client, an audit, or just for good housekeeping, DBAs will often need to figure out who has access to what.  In addition, they may need to know by what means people have access within your databases.

When that need arises, it is frequently adequate to just perform a quick audit and then dive deeper if the quick audit shows anything questionable. Today, I am sharing a means to perform this quick audit. I will start with the logins (server principals), peek at the users (database principals), then look at the server roles, and wrap it up with the database roles. What I don’t do is dive into the Windows group memberships. So if a Windows group exists within SQL Server as a login, I do not try to figure out who is a member of that group. That can be obtained, but it is a separate discussion.

So, in the vein of a quick permissions audit, I will quickly get to the script to help perform this permissions audit.
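As a condensed sketch of the script’s core pattern, here is just the server-level half, using STRING_SPLIT for brevity where the real script uses DelimitedSplit8K (the login names are illustrative):

```sql
DECLARE @LoginList nvarchar(max) = N'LoginA,LoginB';  -- illustrative names

SELECT sp.name AS LoginName,
       sp.type_desc,
       -- FOR XML trick: collapse server roles to one delimited string per login
       STUFF((SELECT N', ' + r.name
              FROM sys.server_role_members AS srm
                   INNER JOIN sys.server_principals AS r
                      ON r.principal_id = srm.role_principal_id
              WHERE srm.member_principal_id = sp.principal_id
              FOR XML PATH('')), 1, 2, N'') AS ServerRoles
FROM sys.server_principals AS sp
WHERE sp.name IN (SELECT LTRIM(value) FROM STRING_SPLIT(@LoginList, N','));
```

The database-level half repeats this pattern against sys.database_principals and sys.database_role_members, driven through each database by the cursor and dynamic SQL described below.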

As you look through this code, you will notice that I have done a few interesting things.  Here is a quick recap:

  • Build a temp table based on a delimited list of databases input into a variable
  • Build a temp table based on a delimited list of logins input into a variable
  • Used a string splitter known as DelimitedSplit8K by Jeff Moden (google it) or LMGTFY
  • Used FOR XML to build a delimited string of Server Role Permissions
  • Used FOR XML to build a delimited string of Database Role Permissions
  • Used a CURSOR to build a dynamic SQL string to execute for each database

Holy cow! That is a lot of stuff for such a simple quick audit.  A cursor? Seriously? Well yeah! It is OK in this type of scenario to use a cursor.  This is a well-performing cursor for an administrative function. In addition, I chose to use sub-queries to build my delimited string of permissions so I could keep those things together in a single record per login/role.

If I run that query on my instance of SQL Server, I would end up with something that looks like the following.

Now, obviously, unless you had these same databases, logins, and even the same permissions assigned to these logins, your results will be different. So, make sure you go and change those inputs for logins and databases to be queried.

For more articles on audits and auditing check here and here.

Short Circuiting Your Session

It isn’t very often that one would consider a short circuit to be a desired outcome. In SQL Server we have a cool exception to that rule – Extended Events (XE).

What exactly is a short circuit and why would it be undesirable in most cases? I like to think of a short circuit as a “short cut” in a sense.

I remember an experience that happened while running a marathon many years ago. A person I had pulled up next to and started chatting with needed to use the restroom. I continued along on the course and a mile later I saw the same person suddenly reappear on the course ahead of me. This person had found a short cut on the course and decided to use it. If caught, he would have been disqualified. He may have saved himself a mile of running and gotten a better time, but the act was to take a course that was not the intended official course for that race.

In electricity, a short circuit does a similar thing. The electricity follows the path of least resistance. Sometimes, this means a path other than the one intended for the current to flow. The end result can be very bad in electrical terms, as an overload may occur, which can cause overheating and sparking.

Why would we want an overload?

In electricity and mechanical parts, we really don’t want anything to cause short cuts in the system. On the other hand, when we are dealing with tracing and anything that can put a load on the system, we want that load to be as small as possible.

Trying to trace for problems in the SQL Server engine comes with a cost. That cost comes in the form of additional resource requirements which could mean fewer resources available for the engine to process user requests. None of us wants for the end-user to be stuck waiting in a queue for resources to free due to our tracing activities (i.e. Profiler). So a lightweight method (to trace) is needed.

XE is that lightweight method. A big part of the reason for that is the ability of XE to short-circuit (short-cut) to the end result. How can an XE session short-circuit? Think logic constraints and predicates. I previously demonstrated how to short-cut the system by using a counter in the predicate, but the short circuit isn’t constrained to just a counter in the predicate. The short-circuit is super critical to performance and success, but it is often misunderstood and poorly explained. So, I am trying to explain it again – better.

If we follow the principle that a short circuit is the path of least resistance, we have a construct for how to build the predicate for each event in a session. Think of it as the path of least work. Just like children, XE and electricity evaluate each junction with a bit of logic: do I have to do more work down this path, or less? Less work? Great, I am going in that direction.

As an event fires and is picked up by the XE session, the session compares the event payload to the conditions in the predicate. Everything in the predicate is processed in precise order – until a condition evaluates to false. The moment a condition evaluates to false, the XE session jumps right to the end and stops evaluating that event. Nothing more is processed.

This is why predicate order matters. If a predicate evaluates to false, the short-circuit is invoked and the evaluation ends. With that in mind, what is the most desirable condition in the predicate to be first?

I have heard multiple people state that the “most likely to succeed” predicate should be first. Well, if the “most likely to succeed” is first, what does that mean for your session? It will have to do more work! That is exactly the model that Profiler used (uses), and we all know what happens with Profiler and performance!

No! We don’t want the most likely to succeed to be the first predicate to be evaluated. We want the least likely to succeed to be first. This means less work – just as illustrated in the previous image where the short-circuit is represented by the red line. If you would like, we can also call each of the three light-bulbs “predicates” and the switch would be the event (nothing is traced in the session if the event doesn’t even match).

Which Comes First?

This brings us to the hard part: how should one order the predicates for each event? The answer is not as cut and dried as you probably want. There are many variables in the equation. For instance, the first variable is the environment. Each SQL environment is different, and that makes a difference in evaluating events and predicates. However, let’s use a common-ish set of criteria and say we need to decide between database name and query duration.

The question in this case comes down to 1) how many databases are on the server, and 2) what are the chances of a query lasting more than 5 seconds? If you have 100 databases on the server and 99 of them frequently see queries over 5 seconds, then this predicate order would make sense. What if you have only 4 databases and a query over 5 seconds occurs roughly 1 in 10,000 times? Then the predicate order should be switched to the following.
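To make the two orderings concrete, here is a sketch of the predicate written both ways (the event, database name, and threshold are illustrative; duration here is in microseconds):

```sql
-- Rare long queries (say 1 in 10,000 over 5 seconds): duration is least
-- likely to pass, so evaluate it first.
ADD EVENT sqlserver.sql_statement_completed (
    WHERE duration > 5000000
      AND sqlserver.database_name = N'AdventureWorks2014')

-- Many databases, frequent long queries: the database-name check is the
-- unlikely one, so it goes first instead.
ADD EVENT sqlserver.sql_statement_completed (
    WHERE sqlserver.database_name = N'AdventureWorks2014'
      AND duration > 5000000)
```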

If you don’t have a database by the name of “AdventureWorks2014” then the database name predicate would remain first but really it should be changed to an appropriate database name that exists.

The Wrap

Predicate order in an XE session is very important. A well designed predicate can lead to a highly tuned and well performing trace that will ease your life as a data professional. Just remember, contrary to various people out there, the most desirable predicate order is to have the “least likely to succeed” first and the “most likely to succeed” should be last.

And yes, we truly do want our XE sessions to short-circuit! As we aspire to do less busy work, an XE session should be configured to do as little work as is necessary.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the eleventh article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Automatic Tuning Monitoring and Diagnostics

Cool new toys/tools have been made available to the data professional. Among these tools are Query Store and automatic tuning. These two tools actually go hand in hand and work pretty nicely together.

With most new tools, there is usually some sort of instruction manual along with a section on how to troubleshoot the tool. In addition to the manual, you usually have some sort of guide as to whether or not the tool is working within desired specifications or not.

Thanks to Extended Events (XE), we have access to a guide of sorts that will help us better understand if our shiny new tool is operating as desired.

Operationally Sound

XE provides a handful of events to help us in evaluating the usage of Automatic Tuning in SQL Server. To find these events, we can simply issue a query such as the following.
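A sketch of that discovery query:

```sql
-- Find the Extended Events objects related to automatic tuning.
SELECT xo.name, xo.object_type, xo.description
FROM sys.dm_xe_objects AS xo
WHERE xo.name LIKE '%automatic_tuning%'
ORDER BY xo.object_type, xo.name;
```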

When executed, this query will provide a result set similar to the following.

I have grouped the results from this query into three sets. In the red set, I have four events that are useful in the diagnostics and monitoring of automatic tuning. These events show errors, diagnostic (and performance) data, configuration changes and state changes.

For instance, the state change event will fire when automatic tuning is enabled and will also fire when the database is started (assuming the session is running). The automatic_tuning_diagnostics event fires roughly every 30 minutes on my server to gather performance and diagnostic data that can help me understand how well the feature is performing for my workload in each database.

Highlighted in the green section are a couple of maps that show the various values for the current phase or state of the automatic tuning for each database. One can view these different values with the following query.
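A sketch of that map-values query:

```sql
-- View the map values behind the automatic tuning phase/state codes.
SELECT mv.name, mv.map_key, mv.map_value
FROM sys.dm_xe_map_values AS mv
WHERE mv.name LIKE '%automatic_tuning%'
ORDER BY mv.name, mv.map_key;
```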

This query yields these results.

We will see those values in use in the events in a session shortly.

We have seen some of the events and some of the maps at a very quick glance. That said, it is a good time to pull it all together and create a session.
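A sketch of such a session follows. The package prefix (qds) is an assumption on my part, as are the session name and file path; confirm the exact event and package names from the discovery query on your build:

```sql
CREATE EVENT SESSION [AutomaticTuning] ON SERVER
    ADD EVENT qds.automatic_tuning_diagnostics,
    ADD EVENT qds.automatic_tuning_state_change,
    ADD EVENT qds.automatic_tuning_config_change,
    ADD EVENT qds.automatic_tuning_error
    ADD TARGET package0.event_file (
        SET filename = N'C:\Database\XE\AutomaticTuning.xel')
    WITH (STARTUP_STATE = ON);
```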

Seeing as this session won’t produce any results without Query Store being enabled and automatic tuning being configured for a database, I have set all of that up in a demo database and have some fresh results to display.

Here I show an example of the output filtered for just the diagnostics event. Note the phase_code shows some of those map values previously discussed. I can also see that roughly every 30 minutes each database undergoes a diagnostics check.

Now, looking at another event in that same session, I can see the following.

The state_code in this event payload demonstrates more values from the maps previously discussed (CorrectionEnabled and DetectionEnabled). In this case, the automatic_tuning_state_change fired a few times for database 6 because that database was intentionally taken offline and set back online to test the event.

The use of these particular events in this session is very lightweight. I don’t have a predicate configured for any of the events because I wanted to trap everything. Of course, the number of events can increase with an increased load and usage scenarios on different servers.

The Wrap

Automatic tuning can be a pretty sharp tool in your tool-belt on your way to becoming that rock-star DBA. As you start to sharpen your skills with this tool, you will need to have some usage and diagnostic information at your fingertips to ensure everything is running steady. This event session is able to provide that diagnostic information and keep you on top of the automatic tuning engine.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the eleventh article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Event Tracing for Windows Target

There are many useful targets within SQL Server’s Extended Events. Of all of the targets, the most daunting is probably the Event Tracing for Windows (ETW) target. The ETW target represents doing something that is new for most DBAs which means spending a lot of time trying to learn the technology and figure out the little nuances and the difficulties that it can present.

With all of that in mind, I feel this is a really cool feature, and it is something that can be useful in bringing together the groups that most commonly butt heads in IT (Ops, DBAs, Devs) by creating a commonality in trace data and facts. There may be more on that later!

Target Rich

The ETW target is a trace file that can be merged with other ETW logs from Windows or applications (if they have enabled this kind of logging). You can easily see many default ETW traces that are running or can be run in Windows via Perfmon or from the command line with the following command.
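From an elevated command prompt, something along these lines lists the event trace sessions currently running (logman is the built-in Windows tool for this):

```shell
logman query -ets
```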

And from the gui…

Finding the traces is not really the difficult part with this type of trace. The difficult parts (I believe) come down to learning something new and different, and the fact that Microsoft warns you should have a working knowledge of ETW first (almost like a big flashing warning that says “Do NOT Enter!”). Let’s try to establish a small knowledge base about this target to ease some of the discomfort you may now have.

One can query the DMVs to get a first look at what some of the configurations may be for this target (optional and most come with defaults already set).
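A sketch of that DMV query:

```sql
-- List the customizable configuration columns of the ETW classic target,
-- along with their default values.
SELECT oc.name, oc.column_value AS default_value, oc.description
FROM sys.dm_xe_object_columns AS oc
WHERE oc.object_name = N'etw_classic_sync_target'
  AND oc.column_type = N'customizable';
```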

Six configurations in total are available for the ETW target. In the query results (just above) you will see that the default value for each configuration option is displayed. For instance, the default_xe_session_name has a default value of XE_DEFAULT_ETW_SESSION. I like to change default names and file paths, so when I see a name such as that, rest assured I will change it. (Contrary to popular belief, the path and session name default values can certainly be changed.)

As I go forward into creating an XE session using the ETW target, it is important to understand that only one ETW session can exist. This isn’t a limitation of SQL Server per se; rather, it is a combination of the use of the classic ETW target (for backwards compatibility) and Windows OS constraints. If the ETW target is used in more than one XE session on the server (even in a different SQL Server instance), then all of them will use the same trace target in Windows (the consumer). This can cause a bit of confusion if several sessions are running concurrently.

My recommendation here is to use a very precise and targeted approach when dealing with the ETW target. Only run it for a single XE session at a time. This will make your job of correlating and translating the trace much easier.

The ETW target is a synchronous target and does NOT support asynchronous publication. Because the target consumes events synchronously, if multiple sessions define the same event, that event will be consumed just a single time by the ETW target. This is a good thing!

Two more tidbits about the ETW target before creating an event session and looking at more metadata. The default path for the target is %TEMP%\<filename>.etl. This is not defined in the configuration properties but is hardwired. Any ideas why one might want to specify a different path? I don’t like to use the temp directory for anything other than transient files that are disposable at any time!

Whether you change the directory from the default or leave it be, understand that it cannot be changed after the session starts – even if other sessions use the same target and are started later. However, if you flush the session and stop it, then you can change it. I do recommend that it be changed!

The second tidbit is that, beyond the classic target, ETW does have a manifest-based provider. Should Extended Events (XE) be updated to use the manifest-based provider, some of the nuances of translating the trace data will disappear (future article to include ntrace and xperf – stay tuned). For now, understand that viewing the ETW trace data is not done via SQL Server methods. Rather, you need to view it with another tool. This is because ETW is an OS-level trace, not a SQL Server trace.

Session Building

If it is not clear at this point, when creating an XE session that utilizes the ETW target, two traces are, in essence, created. One trace is a SQL server (XE) trace that can be evaluated within SQL Server. The second trace is an ETW trace that is outside the realm of SQL Server and thus requires new skills in order to review it. Again, this second trace can be of extreme help because it is more easily merged with other ETW traces (think merging perfmon with sql trace).

When I create a session with an ETW target, it would not be surprising to see two targets defined: the ETW target plus a second, such as a file target (or any of the others, if it makes sense). The second target is not requisite for the XE session to be created – the XE data will still be present in the livestream target even without a SQL-related target.

Before creating a session, I need to cover a couple of possible errors that won't be easy to find on Google.

Msg 25641, Level 16, State 0, Line 101
For target, “5B2DA06D-898A-43C8-9309-39BBBE93EBBD.package0.etw_classic_sync_target”, the parameter “default_etw_session_logfile_path” passed is invalid.
The operating system returned error 5 (ACCESS_DENIED) while creating an ETW tracing session.
Ensure that the SQL Server startup account is a member of the ‘Performance Log Users’ group and then retry your command.

I received this error message even with my service account being a member of the “Performance Log Users” Windows group. I found that I needed to grant the service account explicit permissions on the logging directory that I had specified.

Msg 25641, Level 16, State 0, Line 105
For target, “5B2DA06D-898A-43C8-9309-39BBBE93EBBD.package0.etw_classic_sync_target”, the parameter “default_xe_session_name” passed is invalid.
The default ETW session has already been started with the name ‘unknown’.
Either stop the existing ETW session or specify the same name for the default ETW session and try your command again.

This error was more difficult than the first, though it probably should have been easier. Try as I might, I could not find a session called ‘unknown’. Then it occurred to me (sheepishly) that the path probably wanted a file name too. If you provide a path without a file name for the trace file, this error will nag you.

I found both error cases to be slightly misleading but resolvable quickly enough.

The session is pretty straightforward here. I am just auditing logins that occur on the server and sending them to both the ETW and event_file targets. To validate that the session is created and that the ETW session is indeed not present in SQL Server, I have the following script.
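A sketch of what such a session and its validation might look like – the session name and file paths are my own illustrative choices, not prescribed values:

```sql
-- Illustrative login-audit session with both an ETW and an event_file target.
-- Remember: the ETW path wants a file name, not just a directory.
CREATE EVENT SESSION [AuditLoginsETW] ON SERVER
    ADD EVENT sqlserver.login
    ADD TARGET package0.etw_classic_sync_target
        ( SET default_etw_session_logfile_path = N'C:\Database\XE\AuditLoginsETW.etl' ),
    ADD TARGET package0.event_file
        ( SET filename = N'C:\Database\XE\AuditLoginsETW.xel' )
    WITH ( STARTUP_STATE = OFF );
GO
ALTER EVENT SESSION [AuditLoginsETW] ON SERVER STATE = START;
GO
-- Validate: the XE session and its targets are visible inside SQL Server...
SELECT es.name, est.name AS target_name
FROM sys.server_event_sessions es
JOIN sys.server_event_session_targets est
    ON est.event_session_id = es.event_session_id
WHERE es.name = N'AuditLoginsETW';
-- ...but the ETW session it spawns lives at the OS level, outside SQL Server.
```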

Despite the absence of the ETW session from SQL Server, I can still easily find it (again, either from a shell or from the Perfmon GUI). Here is what I see when checking for it from a shell.
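For reference, the shell check uses the Windows `logman` utility from an elevated prompt; the session name XE_DEFAULT_ETW_SESSION is my assumption of what XE uses when no explicit name is specified, so verify it against the full listing first:

```bat
:: List all active ETW trace sessions, then drill into the XE-created one.
logman query -ets
logman query "XE_DEFAULT_ETW_SESSION" -ets
```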

Even though the session (or session data) is not visible from SQL Server, I can still find out a tad more about the target from the XE related DMVs and catalog views.
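A sketch of such a metadata query against the XE DMVs – it filters on the target name so it will catch any running session that carries the ETW classic target:

```sql
-- Inspect running XE sessions that carry the ETW classic target.
SELECT s.name           AS session_name,
       t.target_name,
       t.execution_count,
       t.target_data
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t
    ON t.event_session_address = s.address
WHERE t.target_name = N'etw_classic_sync_target';
```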

Running that query will result in something similar to this:

The Wrap

I have just begun to scratch the surface of the ETW target. This target can provide plenty of power for troubleshooting when used in the right way; the difficulty is getting to the point of knowing what the right way is. This target may not be suitable for most troubleshooting issues – unless you really need to correlate real Windows metrics to SQL metrics and demonstrate to Joe Sysadmin that what you are seeing in SQL truly does correlate to certain conditions inside of Windows. Try it out, learn from it, and figure out the right niche for you. In the interim, stay tuned for a follow-up article dealing with other tools and ETW.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the tenth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Checking your Memory with XE

It is well known and understood that SQL Server requires a substantial amount of memory. SQL Server will also try to consume as much of the available system memory as possible – if you let it. Sometimes, there will be contention for, or pressure on, that memory.

When contention occurs, the users will probably start screaming because performance has tanked and deadlines are about to be missed. There are many different ways (e.g. here or here) to try and observe the memory conditions and even troubleshoot memory contention. Extended Events (XE) gives one more avenue to try and troubleshoot problems with memory.

Using XE to observe memory conditions is both geeky fun and an advanced technique. If nothing else, it will certainly serve as a divergence from the mundane and give you an opportunity to dive down a rabbit hole while exploring some SQL Server internals.

Diving Straight In

I have a handful of events that I have picked for an event session to track when I might be running into memory problems. Alternatively, I can run the session when I already suspect a memory problem, to get a “second opinion.” Here are the pre-picked events.

Investigating those specific events a little further, I can determine if the payload is close to what I need.
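A payload query along these lines does the job; note that the event names listed here are my own illustrative picks of memory-related events, not the author's definitive set:

```sql
-- Examine the data columns (the payload) of a few candidate events.
SELECT oc.object_name AS event_name,
       oc.name        AS column_name,
       oc.type_name
FROM sys.dm_xe_object_columns oc
WHERE oc.column_type = 'data'
  AND oc.object_name IN ( N'large_cache_memory_pressure',
                          N'memory_broker_ring_buffer_recorded',
                          N'memory_node_oom_ring_buffer_recorded',
                          N'sort_warning' )
ORDER BY oc.object_name, oc.column_id;
```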

That is a small snippet of the payload for all of the pre-picked events. Notice that the large_cache_memory_pressure event has no “SearchKeyword” / category defined for it. There are a few other events that also do not have a category assigned which makes it a little harder to figure out related events. That said, from the results, I know that I have some “server” and some “memory” tagged events, so I can at least look at those categories for related events.
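Such a category lookup can be sketched like this – the “SearchKeyword” shown in the GUI surfaces in the metadata as the read-only KEYWORD column on each event:

```sql
-- List events tagged with the 'server' or 'memory' keyword (category).
SELECT o.name         AS event_name,
       c.column_value AS keyword
FROM sys.dm_xe_objects o
JOIN sys.dm_xe_object_columns c
    ON  c.object_name         = o.name
    AND c.object_package_guid = o.package_guid
WHERE o.object_type  = 'event'
  AND c.name         = 'KEYWORD'
  AND c.column_value IN ( 'server', 'memory' )
ORDER BY c.column_value, o.name;
```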

This query will yield results similar to the following.

If you look closely at the script, I included a note about some additional interesting events that are related to both categories “server” and “memory.”

After all of the digging and researching, it’s time to pull it together and create a session that may help identify various memory issues as they arise – or at least confirm your sneaking suspicion that a memory issue is already present.
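A minimal sketch of such a session follows. The event names are memory-related picks for illustration (package prefixes are my assumption – check `sys.dm_xe_objects` on your build), and the session name and file path are likewise my own:

```sql
-- Illustrative memory-pressure monitoring session; trim or extend the event
-- list after your own discovery queries.
CREATE EVENT SESSION [MemoryPressure] ON SERVER
    ADD EVENT sqlos.large_cache_memory_pressure,
    ADD EVENT sqlos.memory_broker_ring_buffer_recorded,
    ADD EVENT sqlos.memory_node_oom_ring_buffer_recorded,
    ADD EVENT sqlserver.sort_warning
        ( ACTION ( sqlserver.sql_text, sqlserver.database_id ) )
    ADD TARGET package0.event_file
        ( SET filename = N'C:\Database\XE\MemoryPressure.xel', max_file_size = 50 )
    WITH ( MAX_MEMORY = 4096 KB, STARTUP_STATE = OFF );
```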

When running this session for a while, you will receive a flood of events as they continue to trigger and record data to your trace file. You will want to keep a steady eye on the trace files and possibly only run the session for short periods.

Here is an example of my session with events grouped by event name. Notice anything of interest between the groups?

If the data in the session does not seem to be helpful enough, I recommend looking at adding the additional events I noted previously.

Here is another view on a system that has been monitoring these events for a while longer and does experience memory pressure.

Here we can see some of the direct effects of index operations on memory, as well as the memory effects of some really bad code. Really cool is that we can easily find which queries may be causing the memory pressure issues and then directly tune the offending queries.

The Wrap

Diving into the internals of SQL Server can be useful in troubleshooting memory issues. Extended Events provides a means to look at many memory-related events that can be integral to solving or understanding those issues, making it a powerful addition to the memory troubleshooting toolbelt.

Try it out on one or more of your servers and let me know how it goes.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the ninth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.

Finding Application Session Settings

One of the underused troubleshooting and performance tuning techniques is to validate the application session settings. Things can work fabulously inside of SSMS, but run miserably inside the application. I have long been using Extended Events to help me identify these settings (and yes, XE has saved the day more than once by identifying the application settings easily). This article will show how to use XE to help save the day – or at least identify what an application is doing when connecting to SQL Server.

This is only one method; there are others. My second option is usually to drop into the DMVs – but more exist beyond that. Tara Kizer jumps into some of those other methods here.

Easy Stuff First

Before diving into XE, it first makes sense to get some more data on what the possible connection settings include. We can query SQL Server for most of the applicable information; for the extended details, we have to look them up online.

Inside SQL Server, we are given both the values and the corresponding setting names. Querying the spt_values table for the group of values of type “SOP” (think set options) gets us the results we need. That will yield results similar to this.
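The lookup itself is a short query against the real spt_values table (the number 0 row is a summary entry, so I skip it):

```sql
-- The 'SOP' (set option) rows map each @@OPTIONS bit to a readable name.
SELECT v.number, v.name
FROM master.dbo.spt_values v
WHERE v.type = 'SOP'
  AND v.number <> 0     -- skip the 'ALL SETTABLE OPTIONS' summary row
ORDER BY v.number;
```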

If I take that a little further, I can modify the query to figure out what configurations are enabled for my current session (in SSMS).
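The modification is a bitwise AND of each option value against the @@OPTIONS session variable:

```sql
-- Show which set options are enabled for the current session.
SELECT v.name,
       CASE WHEN @@OPTIONS & v.number = v.number
            THEN 1 ELSE 0
       END AS enabled
FROM master.dbo.spt_values v
WHERE v.type = 'SOP'
  AND v.number <> 0
ORDER BY v.number;
```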

For me, currently, this yields the following.

Everything marked with a “1” is enabled and the rest are disabled. Ok, easy enough. Now that we can figure out SSMS values and we have an idea of what they mean, it is time to trap the settings from the application. We will be doing that via XE.

App Settings

In order to find the application settings, we need to capture a specific data point called “collect_options_text”. To find which events have this type of data, we can query the XE infrastructure.
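A query of this shape against the XE metadata does the trick – collect_options_text is a customizable column, so its default on/off state is carried in column_value:

```sql
-- Find every event exposing the collect_options_text customizable column.
SELECT oc.object_name AS event_name,
       oc.name        AS column_name,
       oc.column_type,
       oc.column_value AS default_value
FROM sys.dm_xe_object_columns oc
WHERE oc.name = 'collect_options_text';
```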

Running the preceding query finds two events – login and existing_connection. Both indicate that the “collect_options_text” is a flag that is disabled by default. When enabled it will collect the options_text for each session (new or existing depending on your connections).

If I delve further into the “login” event, I can see some nice data points for troubleshooting and learn more about what the event does.
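That deeper look can be sketched as a query over the login event's data and customizable columns:

```sql
-- All data and customizable columns carried by the login event.
SELECT oc.name AS column_name,
       oc.column_type,
       oc.type_name,
       oc.description
FROM sys.dm_xe_object_columns oc
WHERE oc.object_name = 'login'
  AND oc.column_type IN ( 'data', 'customizable' )
ORDER BY oc.column_type, oc.column_id;
```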

Which yields this…

Everything in the orange circles is useful in various troubleshooting scenarios. Just a little side tidbit to keep in your reserves. The blue box is highlighting the options and options_text data points. The options_text becomes enabled when we flip the “collect_options_text” flag to on.

Another interesting note is the “SearchKeyword”. This is a category of sorts (it is a category when looking at it in the GUI). This can tell me all of the events that also might be related to the login event. Looking deeper at that, I can see the following.

That is another juicy tidbit to keep in your back pocket as an extra tool for future use! Seventeen events are in the “session” category and could be related, but we will not use them for this particular event session.

The Juicy Center

Having covered some of the path to getting to the events that matter and what data is available in the events, we are now ready to put a session together.
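A minimal sketch of such a session – it flips collect_options_text on for both events found earlier; the session name, actions, and file path are my own illustrative choices:

```sql
-- Capture login/existing_connection events with the options text enabled.
CREATE EVENT SESSION [AppSessionSettings] ON SERVER
    ADD EVENT sqlserver.existing_connection
        ( SET collect_options_text = (1) ),
    ADD EVENT sqlserver.login
        ( SET collect_options_text = (1)
          ACTION ( sqlserver.client_app_name, sqlserver.client_hostname ) )
    ADD TARGET package0.event_file
        ( SET filename = N'C:\Database\XE\AppSessionSettings.xel' )
    WITH ( STARTUP_STATE = OFF );
GO
ALTER EVENT SESSION [AppSessionSettings] ON SERVER STATE = START;
```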

After creating and starting the XE session, all that is needed is to wait for a login event to occur from the application. Once it does, then check the trace file and evaluate the data. As I look at the data from the application and look specifically at the options_text data, I will see something like the following.

I circled an interesting difference that pops up between the XE session and the @@Options server variable. A login captured by XE will also show the language and date settings for the connection.

The Wrap

Creating a session to capture the settings being used by an application is particularly easy. Being able to trap the relevant data and troubleshoot performance issues with it is a necessary (and, yes, easy) skill on the way to becoming a rock-star DBA. I showed how to quickly search for the necessary events as well as how to figure out relationships between events in a particular category.

Try it out on one or more of your servers and let me know how it goes.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This has been the eighth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.
