Detail Job History – Back to Basics

Recently, I covered the need to understand job failure frequency and knowing the value of your SQL Agent jobs. You can read the specifics in the article – here.

Sometimes, just knowing the frequency of the job failure is good enough. Sometimes, more information is helpful. Having more information is particularly helpful when meeting with the business to discuss the validity of the job in question.

What do you do in times like this? The most basic answer to that question is – get more data. But that barely scratches the surface. The real question being asked there is how do you go about gathering that data?

There are two methods to gather the data – the hard way and the easy way. Do you like to work hard? Or would you rather work more efficiently?

Efficiency Matters

As was discussed in the previous article, I prefer to do things just a little bit less manually where possible. The consistency of a script matters, but it also is just so much faster than doing things the hard, manual, iterative way. So let’s build a little bit on the script from the previous article.

And here is a sample of the output.

With this script, I have the ability to quick show which step is failing, what the command is for that step, what kind of process is running on that step, any passwords (in the event of an SSIS password), and of course the failure frequency. This is golden information at the fingertips. There is no need to click through the GUI to gather this information. You can get it quickly and easily in one fell swoop.

The Wrap

An important part of any DBAs job is to ensure database related jobs are running prim and proper. Sometimes that just doesn’t happen. When jobs are being overlooked, it is useful to be able to gather data related to consistency of job success or failure. This script will help you in your investigation efforts. In addition, I also recommend this article in your agent job audit efforts.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Database File Changes

Data professionals around the globe are frequently finding themselves occupied with figuring out why and when a file (data or log) for a database has changed in size. Whether that change is a growth or shrink, or if the change was expected to happen or not.

I have written previously about the need to monitor these changes and how to do it more efficiently (there is even a fail-safe that occasionally works. As SQL Server improves, so does our ability to capture these types of events.

Given that SQL Server has undergone many enhancements, let’s take a look at the enhancements related to capturing the database file size changes.

Database File Related Events

If you read the articles mentioned previously, you will have encountered a previous article that include an Extended Event session that can be used for tracking file changes. That session is really good enough in most cases. One major change that I would suggest off the top is the asynchronous file target. While that target will work on all editions of SQL Server since 2008, the name of the target was changed in SQL Server 2012 to event_file. No big deal there. The XEM file is also no longer necessary, so that piece can just be wiped out.

That said, what else has changed? Let’s cover some deprecated events that may get you frustrated if you encounter them. The following events should be avoided because they will do nothing (a couple of them have been covered in this previous article).

  • sqlserver.databases_log_growth – Databases log growth
  • sqlserver.databases_log_file_size_changed – Databases log file size changed
  • sqlserver.databases_data_file_size_changed – Databases data file size change
  • sqlserver.auto_shrink_log – Auto shrink log ended.

Each of the preceding events have been replaced by a single event called database_file_size_change.

And then there is this one that I can’t seem to get to generate any events but it might possibly still be valid. In short, don’t be surprised one way or the other if it does something.

  • sqlserver.auto_shrink_log_caught_exception – Auto shrink log caught an exception.

Great, we have some events we can avoid. Now let’s look at some events we should consider.

  • sqlserver.database_file_size_change – Occurs when any of the data or log files for a database changes size. Use this event to monitor changes to the size of database files.
  • sqlserver.databases_shrink_data_movement – Databases shrink data movement
  • sqlserver.databases_log_file_used_size_changed – Databases log file used size changed (this one gets noisy – use judiciously).
  • sqlserver.databases_log_shrink – Databases log shrink

Now that we have some events that are viable as well as some events to avoid, let’s put a session together.

All we need to do at this juncture is test the session.

Let’s drill into the database_file_size_change event and take a closer look.

There are a few things going on with this event. As previously mentioned, this event captures multiple different types of size related events. If you now look at the sections that I have circled in red, you will note that there is a flag that tracks if the event was an automatic size change. It doesn’t matter if it was a growth or shrink, both can be automatic. The way we can tell if it was an automatic shrink is due to the negative value in the size_change_kb field.

Next up, we have the green highlighted section. I have three distinct timestamps circled and separated by a dotted red line. Note the time difference between each of the groups. Each group of events is separated by 30 minutes. As it turns out, if you have Autoshrink enabled on your database, the timer is a 30 minute interval. Yes! Autoshrink just so happened to be enabled on this database – for testing purposes.

Wrapping it Up

Extended Events is a powerful tool with plenty of ease of use and flexibility. This flexibility allows the DBA to capably monitor the server for any issue be it small or large. This article demonstrated how to use Extended Events to monitor for file size changes and the same principles can be applied to any of the waits you may need to investigate. If you are in need of a little tune-up for your XE skiils, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

In addition, through the power of XE, we were able to capture a quick glimpse into some internals related to the database engine. In this case, we were able to see autoshrink in action and discover the frequency that it runs on when enabled.

Job History – Back to Basics

How necessary is that SQL Server Agent job that you have running on the server? I ask that question of my clients on a routine basis.

Sometimes, I will ask that question as a routine part of a health check for the server. Others, it might be due to a migration or upgrade that is required for the server. Generally, the answer to the question will be one of two things: a) “Yes it is necessary.” or b) “What is that job doing?”.

Believe it or not, both answers will typically spawn more questions. You see, I don’t usually ask that question unless the job is failing on a regular basis. You wouldn’t believe how many jobs exist out there that are scheduled and just fail every time as well.

When I encounter a situation like this, it means it is time to have a discussion. In order to have that discussion, one needs to have done a bit of homework in order to better understand the situation. For me, part of this homework involves running the numbers and figuring out the frequency of the job’s failure or success.

Data Gathering

For me, I like to understand how often a job has executed and what is the frequency of failure for that quantity of executions. If I see a job that has not succeeded successfully in 60 consecutive executions, it is probably a safe bet that the job is not needed. Why? Well, if nobody has noticed the job hasn’t been working for that long, the likelihood of the job providing any use to the business is slim to none. In this case, I would present a case to the business as to why it should be removed.

But, how do I get to that point? Well, you could go through the job history for each job one by one and run some manual analytics. Or, you could take advantage of a script. I prefer the script route because it is faster, more reliable and a lot less mundane.

Running that script against my sandbox, I may see something such as the following.

Here you will note that the “wtf” job has two entries. One entry for “Succeeded” (in green) and one entry for “Failed” (in red). Each row receiving counts for number of executions.

This is the type of information I can use in a meeting to discuss with the business. This is no longer a discussion of opinion, but rather one that is now based on facts and data. It now becomes very easy to demonstrate to the business that a job has failed 60/60 times and nobody noticed it or cared enough about the consistent failures to do anything about it. Imagine if the failing job happens to be the database backups. I wonder what the action items for that job failure might include.

The Wrap

An important part of any DBAs job is to ensure database related jobs are running prim and proper. Sometimes that just doesn’t happen. When jobs are being overlooked, it is useful to be able to gather data related to consistency of job success or failure. This script will help you in your investigation efforts. In addition, I also recommend this article in your agent job audit efforts.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Public Role Permissions – Back to Basics

Permissions in the database environment is such an important task. Setting permissions correctly is paramount to a successful audit and one of your best defenses against improper/unwanted access. Yet, in spite of the importance, security is often overlooked, neglected, improperly configured or just flat out ignored. Let’s not forget the times that security is intentionally misconfigured so certain individuals are permitted a backdoor access.

Security, just like performance tuning, is a perpetual (and sometimes iterative) task. There is no excuse for setting your security and forgetting it. It must be routinely reviewed.

While performing a script review for a client, I was reminded of the need to also review their security setup. The reminder was more of a slap in the face as I noticed that the developer had built in some permissions assignments for some upgrade scripts. Unfortunately, we were not permitted to alter any of the scripts due to them being from a third party vendor (and that vendor refused as well to fix the problems with the scripts but I digress).

What could be wrong with this?

I want you to digest that for just a moment. This is an example of the permissions this particular vendor insists on setting for the public role. What could possibly be wrong with that? Let’s examine a couple of the permissions like “Control” and “View Change Tracking”.

View Change Tracking

This permission is an elevated permission that is required in order to use the change tracking functions. This permission is necessary for the following reasons:

  1. Change tracking records contain the PK value for rows that have been deleted. If sensitive information was deleted that a user should not be able to access, the user would be able to reverse engineer the data from the change tracking data.
  2. A user may be denied access to a column that contains sensitive data. If data in the column is changed, the data would be stored in the change tracking and a user can determine the values that were updated for the sensitive data.

Control

I am going to take this one direct from the Microsoft documentation.

Confers ownership-like capabilities on the grantee. The grantee effectively has all defined permissions on the securable. A principal that has been granted CONTROL can also grant permissions on the securable. Because the SQL Server security model is hierarchical, CONTROL at a particular scope implicitly includes CONTROL on all the securables under that scope. For example, CONTROL on a database implies all permissions on the database, all permissions on all assemblies in the database, all permissions on all schemas in the database, and all permissions on objects within all schemas within the database.

Now digest that a bit. Once digested, consider what the public role does to user access in a database. The public role permissions are inherited by all users of the database whether the users have been granted the permission or not. You should only grant permissions to the public role that you really honestly believe that ALL users should have. If you are being serious in your role, then the amount of times you grant permissions to the public role should either be a) never, b) when you want to have a data breach, or c) you are testing in a sandbox to improve your skills.

Check for Perms

When you are uncertain of which permissions have been assigned to the public role, or you just haven’t reviewed your permissions real-estate in some time, it is best to pull out a script and start the process. As luck would have it, I have a few scripts that can help with that (here or here) and I have a new one that I am sharing now.

Let’s start with a basic query that will display all of the permissions assigned to the public role in a specific database.

There is nothing super special about this query. Looking at it, it is querying the permissions for the public role specifically. I display where the permission is a “Deny” or “Grant”. Then we list the permission name and then the schema and the object.

Let’s take that script and evolve it now. I am going to plan for the worst and expect that some permissions have been applied that shouldn’t have by some vendor upgrade script (because – well, history). Since I am expecting the worst, I am going to add some script generating code that will revoke the unwanted permissions. And still expecting the worst would be that revoking the permissions will break something, I will also add some code that can generate the appropriate “Grant” statements.

That looks better. I have a way of identifying the unwanted permissions as well as an easy script I can execute to remove the unwanted permissions. Note the use of the collate in the final two columns. As it turns out, permission_name from sys.database_permissions has a column collation of Latin1_General_CI_AS_KS_WS. Since I ran into some errors (shown below), it is easier to direct the DB engine to use the collation that matches the permission_name column.

Msg 451, Level 16, State 1, Line 11
Cannot resolve collation conflict between “SQL_Latin1_General_CP850_CS_AS” and “Latin1_General_CI_AS_KS_WS” in add operator occurring in SELECT statement column 5.
Msg 451, Level 16, State 1, Line 11
Cannot resolve collation conflict between “SQL_Latin1_General_CP850_CS_AS” and “Latin1_General_CI_AS_KS_WS” in add operator occurring in SELECT statement column 6.

Alas, this is still not quite as efficient of a script as I would like. I may have hundreds of databases on the instance and need to evaluate all of them. Time for the bigger guns.

That will take care of all of the permissions for the public role in all of the databases, with a slight caveat. I am only checking against that objects that are not flagged as is_ms_shipped. Now, isn’t there also a public role at the server scope? Indeed there is! Let’s also capture those permissions.

Now, I feel many times better about what could possibly be going wrong with the public role.

If you are in a tightly controlled environment or you are just sick of people doing this sort of thing to your servers, there are more extreme measures that can be taken. You can read about it here or here.

The Wrap

It is amazing what some people will do that just doesn’t make sense. Granting permissions to the public role is one of these cases. That behavior also explains why there are documents and procedures for hardening the public role (here and here).

If necessary, I recommend locking down your public role. It will make your job a little easier and give you better rest at night.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Get the Source of PREEMPTIVE_OS_PIPEOPS Waits

A client reached out to me for help about a wait_type they were seeing. That’s not terribly uncommon, but the specific wait_type was something that was fairly obscure.

Every day, they could see the delta on this wait_type continue to climb higher and higher. The problem was not in their ability to capture the deltas for the biggest wait_types. The problem as it were, was in figuring out what exactly was the root cause of their biggest wait_type.

What was the wait_type? Well, the obscure wait_type was called PREEMPTIVE_OS_PIPEOPS. What causes this wait? As it turns out, this is a generic wait that is caused by SQL pipe related activities such as xp_cmdshell.

Knowing this much information however does not get us to the root cause of this particular problem for this client. How do we get there? This is another case for Extended Events (XEvents).

Tracking Pipe Ops

Knowing which wait_type needs attention is most of the battle in figuring out what precisely is causing the specific wait. The second piece of the puzzle is to understand whether the wait is an internal or external. This is essential because a different event is triggered depending on whether the wait is internal or external. There will be more on that in a future article.

Since I happen to know already that the PREEMPTIVE_OS_PIPEOPS wait is an external wait, then I also know that I will want to use the sqlos.wait_info_external. The downside to this method is that conventional wisdom teaches us that we must use the integer value for the wait in order to monitor for it. To find that integer value, one can query the sys.dm_xe_map_values dmv. Since the values of many of the waits change from CU to CU, then one should validate this value on every server instead of using the same value on all servers without regard.

Depending on which server I run that code on, I may receive different results.

As you can see, I have a small sample showing five different values for this specific wait. Using the wrong value on the wrong server could result in a bit of hair tugging since nothing may be trapped.

Due to this potential for change, writing a script to monitor for the session gets a little bit trickier and a lot more dynamic – for now.

After executing that script, the session will be created with the appropriate wait_type as well as the session will be started. To verify it is working, then I can execute a query that will use xp_cmdshell.

With that information in hand, we now have a tool that will help us identify the root cause of this wait climbing to the top of the waits list. Once I am able to identify the root cause, then I can start applying an appropriate fix (or do nothing if the situation deems that necessary).

But what about this map_key that keeps on changing with updates to SQL Server? Surely there is a better method than figuring out the integer value. As luck would have it, conventional wisdom is wrong in this case. There is a much easier method of creating this event session. So much easier in fact that I don’t need to worry about the map_key value. All I need to know is the wait_type I wish to monitor. So let’s look at a better version of the script.

This will yield the same results as the first script, but without the dynamic sql and with a more friendly script for the DBA to understand. This method is actually employed in the system_health system default session if you ever decide to peruse that session.

Wrapping it Up

Extended Events is a powerful tool with plenty of ease of use and flexibility. This flexibility allows the DBA to capably monitor the server for any issue be it small or large. This article demonstrated how to use Extended Events to monitor for a specific wait_type and the same principles can be applied to any of the waits you may need to investigate. If you are in need of a little tune-up for your XE skiils, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

 

Note to Self

TSQL2sDay150x150TSQL Tuesday

The second Tuesday of the month is upon us once again. That means it is time for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

Sands of Time

This month, MOHAMMAD DARAB (b | t) invites us to do the 40 yr. old self reflection. You know that age that is almost as dreadful to reach as the righteous age of 50? From there age only gets better – or at least I have been told that. I have done a couple of articles similar to this in past TSQLTuesdays. Similar yet still different just enough. Check out my article on passion,  or this on one influence.

The problem with this particular challenge is the request to write a note to your 20 yr. old self. Well, maybe not an extreme problem other than the time travel part. As luck would have it, I recently discussed the topic with my grandmother. There was a lot of wisdom in what she taught me as we discussed the topic and I must admit, I was in agreement prior to even broaching the topic with her.

Passing Notes

Here is basically what my note to myself would say: “Self, you are doing good. Keep it up!”

That is even if I chose to send myself a note. You see, I would not want to change how things have happened. Life experience is good – both the good and the bad. All of it is necessary to grow and develop. Just like Mater’s dents, each experience is valuable and shouldn’t be changed.

Now, what if I were to create a note of advice for other more junior level professionals that I might be mentoring? That is a different story. My note to them would be simple – “Listen to me.” Maybe not so literally “me” and maybe more precisely as advice to listen to anybody who is giving their time to present, mentor, blog, or write (articles, books, etc). They are imparting of their experience and time and you should respect that and take advantage of it.

Not only should you pay respect to those giving of their time, but you should spend a little time to also give back to the community. Why? Well, I covered that pretty well in this article here. I will say it again, the more you give of yourself to help others learn your trade, the more you will personally learn and grow in your professional and personal self. It is an investment well worth taking.

 

Wrapping it Up

I don’t believe in trying to alter who I have become with any sort of visit back to my past self. The changes could cause me to change my path and not end up where I have reached due to personal growth and experience. Trial and error is good for growth. Everybody needs to learn that for themselves and cannot be forced into it. Just learn to grow and be patient with anybody you may mentor.

Easy Open Event Log Files

One of the beauties of using a tool such as Extended Events (XEvents) is the versatility and ease of use. XEvents can be used for so many things to help an aspiring DBA or Developer do their job better.

There has long been a sticking point about allowing various people access to production servers. Part of the sticking point is the developer who believes that access to read and evaluate trace files is mandatory on prod (yes I have heard many times where this has been used effectively by developers to gain prod access). But is that prod access truly necessary? Without an adequate method to provide the developer access to the trace data, the DBA loses confidence (from management) and standing ground for their side of the argument. How can a middle ground be met?

XEvents to the Rescue

With the use of XEvent tracing, a DBA is given a new realm of possibilities. This particular realm of possibilities becomes available when the XEvent trace (session) is created using a file target. Through the use of a file target, and the flexibility of XEvents, I have shown the beginnings of how a developer might be able to access the trace data easily.

That is merely the beginning though. There are a few more layers to making life easier for both the developer and DBA in regards to fetching event trace data. This article will show a few methods that will help make life just a bit easier.

Let’s start with the basics. After an XEvent session is created, it is important to know where the event file is being stored. If you don’t happen to know where it is, that is not a problem. The file source is easy enough to find and I have detailed it in this article.

Method 1

As shown in the previous article, the first quick method to open an XEvent trace file is from within SSMS as shown here.

After clicking on “Merge Extended Event Files…”, a new window titled “Merge Extended Event Files” will open. From the new window, follow the following steps.

Method 2

Now, that is an extremely simple method to open and view a trace file for XEvents. That said, would you believe there are other equally easy methods? Let’s look at the next method that is very simple as well.

As was previously mentioned, you will need to know the file location first and then navigate to that location. So let’s do that. My common location is C:\Database\XE. Once navigated to the folder, I need to choose my file to be opened and follow the prompts as shown here.

A new window will pop up. By default, SSMS will be showing. If not, you can select SSMS and enable the option to always use SSMS for this file type by clicking the check box before clicking OK.

Method 3

If the prior two methods feel unreasonably easy, then there is this next method that will spice things up a bit. It does require a bit of coordination. Once you have identified the file to be opened, you can elect to perform a drag and drop operation of the file onto SSMS. Once done properly, the file will open in SSMS and you can start evaluating the data.

Method 4

This is as equally simplistic as the previous method but requires just a scoch less coordination. This method involves a double-click method. A double-click on the file will open the file up in SSMS.

Method 5

If you paid close attention to the screenshots in method 2, you will have noticed that there was a menu option called “Open” that I did not discuss. If you select open from the context menu at that point, then the xel file will open in SSMS just like any of the other methods mentioned thus far.

Wrapping it Up

Extended Events is a powerful tool with plenty of ease of use and flexibility. This flexibility allows the DBA to better service the needs of the developers when the developers need access to the trace data. This article demonstrated another use for Extended Events. If you are in need of a little tune-up for your XE skiils, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

Implicit Conversion Insights with XE

Implicit

In a recent article, I covered a little bit about implicit conversions and an error that may crop up from time to time. It is one thing to understand what an error means, it is an entirely different problem to understand what is causing the error in the first place.

Seldom does a DBA have the opportunity to get out in front of infrequent or random errors such as implicit conversions. More often than not, it is the privilege of the DBA to find out about the problem after the fact from a developer or, worse yet, an end-user.

What if there was an easy method to track when these errors occurred? Even better, it would be ideal to understand what query caused the error as well as which types of queries are merely causing implicit conversion warnings. Even better, what if you had a way to understand which type of conversion problem is at hand.

Implicit Playground

To start this playground, let’s begin with a simple query based on the AdventureWorks database.

This query will yield a plan similar to the following.

If I hover the select operator, I will see a little window like the following.

I have a few items highlighted in this window. In the larger red box, I am outlining much of the type conversion warning. Inside of that, there is a blue box highlighting the “convert_implicit” warning. If I look further into the error message, there is text about query plan choice and “CardinalityEstimate” or “SeekPlan” for the different convert_implicit warnings. Another way of looking at it is as follows.

This seems a lot more friendly than that little context window. Here we have an easy to follow layout of what is happening. As I drill down into the warnings, I can see there are three convert_implicit warnings and they all trigger a “planaffectingconvert” type of warning. As we look closer at this, I can see there are two distinct types of convert issues.

There is one for the conversion from nvarchar to int in the HumanResources.Employee table for the NationalIDNumber field (green).

Two warnings that state that an index seek has been disabled due to the conversions ( orange and purple).

Curious to see what all of the possible “ConvertIssue” values could be for these implicit conversions? That is easy enough with the following query.

Which shows us the following results.

That is a grand total of three. Having covered the cause of two of these already, all that is left is the type called “unknown”. I think we can safely presume that the cause of this one is something else that is undocumented and hence “unknown”.

So far, so good. We have seen how these warnings can crop up in execution plans. We have also seen how to query the for the “ConvertIssue” types or “plan_affecting_convert_type” from the Extended Events (XE) maps system DMV.

Right about now, the light bulb should be popping up for you. If we can query the XE metadata for the different convert issue types, then does that mean there is a way we can use XE to track these things and review them at our own convenience? Yes there is!

XE

Without further adieu, let’s look at an XE session that can be useful in discovering more information in regards to implicit conversions in the database environment.

This session is pretty comprehensive with the ability to be very much like a fire-hose session if the commented out events are uncommented prior to session creation. Running this on a busy server ( less than 1 minute) for a client with a third party vendor application produces results that look like the following.

As we can see here, the session is trapping when I have any of those pesky implicit conversion errors (red box outline) as mentioned in the previous article, as well as the convert_implicit warnings (purple box outline) that were demonstrated in the execution plans in this article.

How am I trapping the implicit conversion error? That is done via the following piece of the event session shown above.

Error number 257 is the message id for the implicit conversion error. When filtering on that ID for this event, then I will receive only the implicit errors that match that number.

As for the implicit conversion warnings, we have the following that accomplishes that requirement.

The event, plan_affecting_convert should seem eerily familiar at this point. We have seen terminology like that in the execution plans as well as in the query for the map values.

The combination of these events will provide great insight into the issues revolving around any implicit conversions you may be having in your database environment.

Wrapping it Up

Implicit conversions are a fail-safe for when bad design or code (or both) crops up in your environment. When this fail-safe is being used in your environment, it is helpful to gain further insights via Extended Events. This article demonstrated another use for Extended Events. If you are in need of a little tune-up for your XE skiils, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

 

page 1 of 1

Calendar
June 2019
M T W T F S S
« May   Jul »
 12
3456789
10111213141516
17181920212223
24252627282930

Welcome , today is Wednesday, October 23, 2019