Detail Job History – Back to Basics

Recently, I covered the need to understand job failure frequency and knowing the value of your SQL Agent jobs. You can read the specifics in the article – here.

Sometimes, just knowing the frequency of the job failure is good enough. Sometimes, more information is helpful. Having more information is particularly helpful when meeting with the business to discuss the validity of the job in question.

What do you do in times like this? The most basic answer to that question is – get more data. But that barely scratches the surface. The real question being asked there is how do you go about gathering that data?

There are two methods to gather the data – the hard way and the easy way. Do you like to work hard? Or would you rather work more efficiently?

Efficiency Matters

As was discussed in the previous article, I prefer to do things just a little bit less manually where possible. The consistency of a script matters, but it also is just so much faster than doing things the hard, manual, iterative way. So let’s build a little bit on the script from the previous article.

And here is a sample of the output.

With this script, I have the ability to quick show which step is failing, what the command is for that step, what kind of process is running on that step, any passwords (in the event of an SSIS password), and of course the failure frequency. This is golden information at the fingertips. There is no need to click through the GUI to gather this information. You can get it quickly and easily in one fell swoop.

The Wrap

An important part of any DBAs job is to ensure database related jobs are running prim and proper. Sometimes that just doesn’t happen. When jobs are being overlooked, it is useful to be able to gather data related to consistency of job success or failure. This script will help you in your investigation efforts. In addition, I also recommend this article in your agent job audit efforts.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Job History – Back to Basics

How necessary is that SQL Server Agent job that you have running on the server? I ask that question of my clients on a routine basis.

Sometimes, I will ask that question as a routine part of a health check for the server. Others, it might be due to a migration or upgrade that is required for the server. Generally, the answer to the question will be one of two things: a) “Yes it is necessary.” or b) “What is that job doing?”.

Believe it or not, both answers will typically spawn more questions. You see, I don’t usually ask that question unless the job is failing on a regular basis. You wouldn’t believe how many jobs exist out there that are scheduled and just fail every time as well.

When I encounter a situation like this, it means it is time to have a discussion. In order to have that discussion, one needs to have done a bit of homework in order to better understand the situation. For me, part of this homework involves running the numbers and figuring out the frequency of the job’s failure or success.

Data Gathering

For me, I like to understand how often a job has executed and what is the frequency of failure for that quantity of executions. If I see a job that has not succeeded successfully in 60 consecutive executions, it is probably a safe bet that the job is not needed. Why? Well, if nobody has noticed the job hasn’t been working for that long, the likelihood of the job providing any use to the business is slim to none. In this case, I would present a case to the business as to why it should be removed.

But, how do I get to that point? Well, you could go through the job history for each job one by one and run some manual analytics. Or, you could take advantage of a script. I prefer the script route because it is faster, more reliable and a lot less mundane.

Running that script against my sandbox, I may see something such as the following.

Here you will note that the “wtf” job has two entries. One entry for “Succeeded” (in green) and one entry for “Failed” (in red). Each row receiving counts for number of executions.

This is the type of information I can use in a meeting to discuss with the business. This is no longer a discussion of opinion, but rather one that is now based on facts and data. It now becomes very easy to demonstrate to the business that a job has failed 60/60 times and nobody noticed it or cared enough about the consistent failures to do anything about it. Imagine if the failing job happens to be the database backups. I wonder what the action items for that job failure might include.

The Wrap

An important part of any DBAs job is to ensure database related jobs are running prim and proper. Sometimes that just doesn’t happen. When jobs are being overlooked, it is useful to be able to gather data related to consistency of job success or failure. This script will help you in your investigation efforts. In addition, I also recommend this article in your agent job audit efforts.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Public Role Permissions – Back to Basics

Permissions in the database environment is such an important task. Setting permissions correctly is paramount to a successful audit and one of your best defenses against improper/unwanted access. Yet, in spite of the importance, security is often overlooked, neglected, improperly configured or just flat out ignored. Let’s not forget the times that security is intentionally misconfigured so certain individuals are permitted a backdoor access.

Security, just like performance tuning, is a perpetual (and sometimes iterative) task. There is no excuse for setting your security and forgetting it. It must be routinely reviewed.

While performing a script review for a client, I was reminded of the need to also review their security setup. The reminder was more of a slap in the face as I noticed that the developer had built in some permissions assignments for some upgrade scripts. Unfortunately, we were not permitted to alter any of the scripts due to them being from a third party vendor (and that vendor refused as well to fix the problems with the scripts but I digress).

What could be wrong with this?

I want you to digest that for just a moment. This is an example of the permissions this particular vendor insists on setting for the public role. What could possibly be wrong with that? Let’s examine a couple of the permissions like “Control” and “View Change Tracking”.

View Change Tracking

This permission is an elevated permission that is required in order to use the change tracking functions. This permission is necessary for the following reasons:

  1. Change tracking records contain the PK value for rows that have been deleted. If sensitive information was deleted that a user should not be able to access, the user would be able to reverse engineer the data from the change tracking data.
  2. A user may be denied access to a column that contains sensitive data. If data in the column is changed, the data would be stored in the change tracking and a user can determine the values that were updated for the sensitive data.

Control

I am going to take this one direct from the Microsoft documentation.

Confers ownership-like capabilities on the grantee. The grantee effectively has all defined permissions on the securable. A principal that has been granted CONTROL can also grant permissions on the securable. Because the SQL Server security model is hierarchical, CONTROL at a particular scope implicitly includes CONTROL on all the securables under that scope. For example, CONTROL on a database implies all permissions on the database, all permissions on all assemblies in the database, all permissions on all schemas in the database, and all permissions on objects within all schemas within the database.

Now digest that a bit. Once digested, consider what the public role does to user access in a database. The public role permissions are inherited by all users of the database whether the users have been granted the permission or not. You should only grant permissions to the public role that you really honestly believe that ALL users should have. If you are being serious in your role, then the amount of times you grant permissions to the public role should either be a) never, b) when you want to have a data breach, or c) you are testing in a sandbox to improve your skills.

Check for Perms

When you are uncertain of which permissions have been assigned to the public role, or you just haven’t reviewed your permissions real-estate in some time, it is best to pull out a script and start the process. As luck would have it, I have a few scripts that can help with that (here or here) and I have a new one that I am sharing now.

Let’s start with a basic query that will display all of the permissions assigned to the public role in a specific database.

There is nothing super special about this query. Looking at it, it is querying the permissions for the public role specifically. I display where the permission is a “Deny” or “Grant”. Then we list the permission name and then the schema and the object.

Let’s take that script and evolve it now. I am going to plan for the worst and expect that some permissions have been applied that shouldn’t have by some vendor upgrade script (because – well, history). Since I am expecting the worst, I am going to add some script generating code that will revoke the unwanted permissions. And still expecting the worst would be that revoking the permissions will break something, I will also add some code that can generate the appropriate “Grant” statements.

That looks better. I have a way of identifying the unwanted permissions as well as an easy script I can execute to remove the unwanted permissions. Note the use of the collate in the final two columns. As it turns out, permission_name from sys.database_permissions has a column collation of Latin1_General_CI_AS_KS_WS. Since I ran into some errors (shown below), it is easier to direct the DB engine to use the collation that matches the permission_name column.

Msg 451, Level 16, State 1, Line 11
Cannot resolve collation conflict between “SQL_Latin1_General_CP850_CS_AS” and “Latin1_General_CI_AS_KS_WS” in add operator occurring in SELECT statement column 5.
Msg 451, Level 16, State 1, Line 11
Cannot resolve collation conflict between “SQL_Latin1_General_CP850_CS_AS” and “Latin1_General_CI_AS_KS_WS” in add operator occurring in SELECT statement column 6.

Alas, this is still not quite as efficient of a script as I would like. I may have hundreds of databases on the instance and need to evaluate all of them. Time for the bigger guns.

That will take care of all of the permissions for the public role in all of the databases, with a slight caveat. I am only checking against that objects that are not flagged as is_ms_shipped. Now, isn’t there also a public role at the server scope? Indeed there is! Let’s also capture those permissions.

Now, I feel many times better about what could possibly be going wrong with the public role.

If you are in a tightly controlled environment or you are just sick of people doing this sort of thing to your servers, there are more extreme measures that can be taken. You can read about it here or here.

The Wrap

It is amazing what some people will do that just doesn’t make sense. Granting permissions to the public role is one of these cases. That behavior also explains why there are documents and procedures for hardening the public role (here and here).

If necessary, I recommend locking down your public role. It will make your job a little easier and give you better rest at night.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Implicit Conversion Insights with XE

Implicit

In a recent article, I covered a little bit about implicit conversions and an error that may crop up from time to time. It is one thing to understand what an error means, it is an entirely different problem to understand what is causing the error in the first place.

Seldom does a DBA have the opportunity to get out in front of infrequent or random errors such as implicit conversions. More often than not, it is the privilege of the DBA to find out about the problem after the fact from a developer or, worse yet, an end-user.

What if there was an easy method to track when these errors occurred? Even better, it would be ideal to understand what query caused the error as well as which types of queries are merely causing implicit conversion warnings. Even better, what if you had a way to understand which type of conversion problem is at hand.

Implicit Playground

To start this playground, let’s begin with a simple query based on the AdventureWorks database.

This query will yield a plan similar to the following.

If I hover the select operator, I will see a little window like the following.

I have a few items highlighted in this window. In the larger red box, I am outlining much of the type conversion warning. Inside of that, there is a blue box highlighting the “convert_implicit” warning. If I look further into the error message, there is text about query plan choice and “CardinalityEstimate” or “SeekPlan” for the different convert_implicit warnings. Another way of looking at it is as follows.

This seems a lot more friendly than that little context window. Here we have an easy to follow layout of what is happening. As I drill down into the warnings, I can see there are three convert_implicit warnings and they all trigger a “planaffectingconvert” type of warning. As we look closer at this, I can see there are two distinct types of convert issues.

There is one for the conversion from nvarchar to int in the HumanResources.Employee table for the NationalIDNumber field (green).

Two warnings that state that an index seek has been disabled due to the conversions ( orange and purple).

Curious to see what all of the possible “ConvertIssue” values could be for these implicit conversions? That is easy enough with the following query.

Which shows us the following results.

That is a grand total of three. Having covered the cause of two of these already, all that is left is the type called “unknown”. I think we can safely presume that the cause of this one is something else that is undocumented and hence “unknown”.

So far, so good. We have seen how these warnings can crop up in execution plans. We have also seen how to query the for the “ConvertIssue” types or “plan_affecting_convert_type” from the Extended Events (XE) maps system DMV.

Right about now, the light bulb should be popping up for you. If we can query the XE metadata for the different convert issue types, then does that mean there is a way we can use XE to track these things and review them at our own convenience? Yes there is!

XE

Without further adieu, let’s look at an XE session that can be useful in discovering more information in regards to implicit conversions in the database environment.

This session is pretty comprehensive with the ability to be very much like a fire-hose session if the commented out events are uncommented prior to session creation. Running this on a busy server ( less than 1 minute) for a client with a third party vendor application produces results that look like the following.

As we can see here, the session is trapping when I have any of those pesky implicit conversion errors (red box outline) as mentioned in the previous article, as well as the convert_implicit warnings (purple box outline) that were demonstrated in the execution plans in this article.

How am I trapping the implicit conversion error? That is done via the following piece of the event session shown above.

Error number 257 is the message id for the implicit conversion error. When filtering on that ID for this event, then I will receive only the implicit errors that match that number.

As for the implicit conversion warnings, we have the following that accomplishes that requirement.

The event, plan_affecting_convert should seem eerily familiar at this point. We have seen terminology like that in the execution plans as well as in the query for the map values.

The combination of these events will provide great insight into the issues revolving around any implicit conversions you may be having in your database environment.

Wrapping it Up

Implicit conversions are a fail-safe for when bad design or code (or both) crops up in your environment. When this fail-safe is being used in your environment, it is helpful to gain further insights via Extended Events. This article demonstrated another use for Extended Events. If you are in need of a little tune-up for your XE skiils, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

 

Mass Backup All Sessions

Migrating Extended Event Sessions from one server to another should be a simple task. So simple, one would think there was no need to give it a second thought, right?

Well, I have previously written about this topic, you are welcome to read it here. The article discusses quite a bit about scripting out your XE Sessions. One thing lacking in that article is the ability to script out every session on a server.

If you are still not using Extended Events, I recommend checking out this library of articles that will cover just about all of the basics concerning Extended Events.

New and Improved

What about scripting out all of the sessions in SSMS? Surely there is an easy way to do that, right? Well, you might think that. Let me step through the problem that I have seen in SSMS (and unfortunately it is not consistent).

First, from Object Explorer Details (or F5), let’s try to script a single session.

When scripting a single session from the “Object Explorer Details”, I have several sub-menus that allow me to script the session to a “New Query Editor Window”. Now, let’s see what happens when trying to script multiple sessions.

With several sessions selected, I try yet again to script the sessions and I get an unwanted result. Notice that the “Script Session as” option is grayed out and unusable. However, if I try it again (several times or maybe just once, your mileage may vary and it seems to not be relevant to version of SSMS), I may see something like this.

Tada! Luck was with me and it finally worked that time. So, what should I do to be able to consistently script all of sessions? Well, that comes with an enhancement to the script I presented in the prior article here.

Lets just dive straight into the new script.

This is a rather lengthy script, so I won’t explain the entire thing. That said, this script will produce the exact XE Session as it was written when you deployed it to the server. In addition, the script will ensure the destination directory for the event_file target is created as a part of the script.

I can definitely hear the gears of thought churning as you ponder about this whole scenario. Surely, you have all of your XE Sessions stored in source control so there is no need whatsoever for this little script. Then again, that would be in an ideal environment. Sadly, source control is seldom considered for XE Sessions. Thus, it is always good to have a backup plan.

Why

Sadly, I had the very need of migrating a ton of sessions from one server to another recently and the methods in SSMS just wouldn’t work. There was no source control in the environment. Building out this little script saved me tons of time in migrating all of the sessions for this server and also provided me with a good script to place in source control.

Conclusion

In the article today, I have provided an excellent tool for backing up all of your XE sessions on the server. This script will help create the necessary scripts for all of your XE Sessions (or even just a single session if you like) in order to migrate the sessions to a new server or place them in source control.

To read more about Extended Events, I recommend this series of articles.

Event Log File Paths

How does one consistently find the correct path to the Extended Event Log file (XEL file)?

This is a topic that I ventured into some time ago. The previous article can be read here. In that article I covered some of the various trouble spots with capturing the file path for various XE log files. One of the main problems being that there is frequently an inconsistency in where XE logs may actually be stored.

Using what was shown in that previous article, I have some improvements and minor tweaks to fill some gaps I hadn’t completed in the previous script.

If you are still not using Extended Events, I recommend checking out this library of articles that will cover just about all of the basics concerning Extended Events.

New and Improved

First, lets just dive straight into the new script.

One of the things I wanted to accomplish with this update was to find the correct path for all of the sessions on the server. As mentioned in the previous article, sometimes there are complications with that. Due to the way log files can be specified for an XE session, behaviors can be a bit funky sometimes when trying to parse the correct paths. Due to those problems, I couldn’t quite short-cut the logic in the previous script and had to do the less desirable thing and create a cursor.

In addition to the cursor, I threw in a fix for when a full path is not declared for the session (at the time of creation) and the session was subsequently never started. In these odd cases, the script had been returning an empty result set and thus was not working properly. Now, it is fixed and here is an example of the output.

The third column in this result set is purely for informational purposes so I could determine at which point the file path was being derived. For the 30+ sessions running on my test instance, most paths are resolved via the first select. In the image, that is denoted by the label “Phase1” and circled in red. The system_health session happened to be running, but did not have a full path declared so it fell into the “Phase2” resolution group and is circled in blue. The last group includes those cases where a path could not be resolved for any number of reasons so they fall to the “FailSafe” grouping and an example is circled in green in the image.

Why

Truth be told, there is a method to short cut this script and get the results faster but I felt it would be less accurate. I could obviously just default to the “FailSafe” group automatically if a full path is not defined in the session creation. Would that be accurate though? Most of the time it would be accurate, but then there are the edge cases where occasionally we forget that something has changed. One such case of this is if after the session is created, you decide the SQL Server log files needs to be moved from the default path (this is where the XEL files default to if no path is defined)?

I have run across multiple scenarios where the logs were required (both technical as well as political) to be moved from the default location. Ideally, this move occurs prior to server startup. When the log file path is changed, the logs are not moved automatically to the new location. This, for me, is a case where it is best to be thorough rather than snake bit. I also like to document these things so I can compare them later if necessary.

Alternatively, here is the just good enough to pass muster version of that script.

 

Conclusion

In the article today, I have shown some of the internals to retrieving file paths for Extended Event Sessions. I dove into metadata to pull out the path for the session and discussed some concerns for some of these methods. In the end, you have a few viable options to help retrieve the file path in a more consistent fashion.

To read more about Extended Events, I recommend this series of articles.

Database Offline – Redux

I have written a couple articles showing how to audit database offline events via the default trace as well as via Extended Events. These are great for a more proactive approach. Sometimes the proactive approach is just not plausible for the given circumstances. A good example of this is when you inherit servers (e.g. new client, new project, new job) and there are databases on the server that were taken offline prior to your inheritance.

In a situation like this, you need to find out when the databases were taken offline, but you are somewhat limited by the data available to you for your research purposes. The default trace has rolled beyond the retention point for your server for events that were trapped. Setting up an Extended Events session would be less than helpful given the databases are already offline. So we must find an alternative to finding approximately when the databases were taken offline.

Alternatives

If your predecessor leaves you with a database that has been taken offline and no documentation about it, there is an alternative to find the approximate time it was taken offline – using TSQL. Granted, you could explore the file system and make the same sort of guess based on the file modified date. Data people generally prefer some sort of script and a script that is native to their language (tsql).

So, let’s take a look at a viable option for figuring out the mystery behind your database that has been set offline with no good info surrounding it.

Running this query, I see the following on my test server:

Note that this query uses GETUTCDATE. The differential_base_time column in sys.master_files is in UTC time. Thus, in order to compare properly, we need to ensure we use a UTC datestamp for comparison. Also, of note, this query doesn’t work if there is no full database backup for the database in question. Who doesn’t backup their databases though, right? No seriously, you really should.

Since this is my test server and I am well prepared, I just so happen to have an XEvent session running that tracks the offline events that I can use to compare the results.

Yes, there is a bit of variance in the actual database offline event and the differential_base_time column. This variance is due to timing of the backup and the actual offline event. Again, this is about finding an approximate time of when the database was taken offline as a viable alternative when other data is not really available.

Conclusion

We all aspire to having a perfect database environment where nothing surprising or unexpected happens. Unfortunately, that is the desire of dreams and fairy tales. The unexpected will happen. A database can be offline at the time that you take over responsibility of the database. You will be asked when it was taken offline. This script will help you get a reasonable deduction for that offline event in the absence of better data.

Extended Events is a powerful tool to help in troubleshooting and tuning your environment. I recommend investing a little time in reading the 60 day series about Extended Events. This is not a short series but is designed to provide an array of topics to help learn the tool over time. Don’t forget to go back and read the companion article showing how to audit these events via the default trace.

Database Recovery Monitoring with XE

On of the greatest benefits of Extended Events (xe) is how the tool simplifies some of the otherwise more difficult tasks.

Recently, I wrote a rewrite of my database recovery progress report script. That script touches on both the error log and some DMVs along with some fuzzy logic to join the data sets together. That script may not be the most complex script out there, but it is more more complex than using the power of XE.

Database recovery (crash recovery) is a nerve wrenching situation under the wrong conditions. It can be as bad as a root canal and just as necessary to endure that pain at times. When the business is waiting on you waiting on the server to finish crash recovery, you feel nervous at best. If you can be of some use and provide some information back to the business, that anxiety dissipates and the business becomes more calm as well. While the previous script can help you get that information easily enough, I want to introduce the easiest method to capture that information currently available.

If you are interested in a history lesson first, here are the first couple of versions of the aforementioned script (here and here).

Discovery First

As always, I like to explore the event repository to see if there is an event that may be applicable to my situation. This can be done via TSQL script or from the XE Gui. I will almost always break out my scripts to figure out if an event is there or not.

This query will yield any events that match my description. In this case, I am looking for events related to “database_recovery”. This search will yield four relevant events we can use to track our database recovery progress. Those events are shown in the following image (with the event names being circled in green).

If I explore the events a little more, I will eventually come across an attribute in the database_recovery_progress_report event that leads to a map. This map is called recovery_phase. For me, that is an interesting attribute/map and makes me want to look at it further.

Things are coming together a little bit now. We all know (or should know) that there are the analysis, redo and undo phases to crash recovery. This aligns with that knowledge and throws in a couple more phases of the recovery process.

So, now we know there are four relevant events for us to use and that one of the events will tell us specifically which phase of recovery is currently processing. We have enough information that an event session can now be built.

You may notice that I have thrown a lot of actions including the kitchen sink at this event session. Some of that is for consistency across sessions and some of it is simply for exploratory wants (and not needs). Feel free to add/remove actions form this list as you explore the use of this session in your environment.

Here is what that session produces on my test server with a simple stop/start of the SQL Server instance.

In the preceding image, I have the different events circled in red. I have also added the event_sequence action so I can see the relationship between these events as the progress from one to the next. If you note the items circled in green (and connected by green arrow), you will see a couple of different things such as the trace message, the database name, the database id, and the recovery time remaining). Circled in blue are the “destress” items that let us know that the recovery is 100% complete.

Wrap

SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online.

Just because you need to have patience during the crash recovery process does not mean you have to work hard to get a status of the process or that you need to stress throughout the process. This XE event session will take a lot of work and stress out of the process. I would recommend having this lightweight session running on the server with the startup state set to enabled. This will make your job easier and definitely can make you look like a rockstar DBA.

This article has demonstrated the power of Extended Events, for a lot more reading on the topic, here is a list of over 100 articles.

Database In Recovery Redux

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: April 8, 2019

One of the more gut-wrenching experiences of a DBA is to have a database get stuck in recovery. This typically happens when the server was rebooted unexpectedly and typically coincides with a rather large transaction.

The reboot and sudden stop of the transaction requires the transaction to be replayed or to experience an undo. This could take hours or days depending on worst case scenarios. Sitting there and waiting for it to complete just makes you sweat.

This happens to have happened to a client recently so I dusted off an old script (which you can find if you click this link), and started investigating and monitoring the server. While waiting and investigating, I decided to take on my challenge (quoted hereafter) from the first time I posted this query (here).

Unfortunately, this query does not demonstrate the time remaining for the rollback nor the percent complete without needing to query the error log.  Those would be awesome additions if you know how to do it (and let me know), other than via the error log.

I did not remove the dependency on the error log, rather I leveraged that dependency to allow me to get more information from the DMVs and link it to the error log messages to give a bigger more complete picture of the recovery process from a simple query.

This script is now enhanced from the 2014 version and works through 2017.

Wrap

SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online. It will require patience and this script can help achieve that patience.

Indexed Views – Performance Panacea or Plight

Indexed View(s)

An indexed view is a view where the result set from the query (the view definition) becomes materialized in lieu of the virtual table result set of a standard (non-indexed) view. Many times we see that an indexed view would be created to help improve performance. Far too often, an indexed view is created without consideration for the costs of the indexed view. In this article, I hope to cover some of the more important costs that are frequently overlooked when considering an indexed view as a performance panacea.

The Setup

To demonstrate the overlooked costs of indexed views, I have decided to use the AdventureWorks2014 database (if you don’t have this sample database, you can get your copy here). I will create a view in that database and then I will add a few indexes to that view. Prior to sharing the entire setup, and like all good DBAs, I need to first grab a baseline. For this baseline, I am doing nothing terribly complex. I will just grab table sizes and storage information for the entire database.

Here are my results for the size of the objects within the AdventureWorks2014 database:

These results show that the largest object in the database happens to be the Person.Person table at just about 30MB. Not terribly large, but the database is not terribly large as a whole. Let’s see what happens when I add a materialized view, based on the Person.Person table, to the mix. Here is the definition of that view along with the indexes that I will add to help the performance of some essential production related queries.

After executing all of that code to create this new view with its indexes, I have the following size results:

The creation of this view has chewed up a bunch of storage. It has jumped right up to the number two spot on the biggest objects list within this database. You can see that differences by comparing the highlighted rows to the previous image. The vPerson view is highlighted in red in this second image to help point it out quickly.

Surely this must be a contrived example and people don’t really do this in the real world, right? The answer to that is simply: NO! It DOES happen. I see situations like this all too often. Far too often, large text fields are added to an indexed view to make retrieval faster. I have mimicked that by adding in two XML columns from the Person.Person table. This is definitely overkill because a simple join back to the table based on the BusinessEntityID would get me those two columns. All I have effectively done is duplicated data being stored and I have achieved that at the low low cost of increased storage of 25% for this small database. If you are curious, the column count between the Person.Person table and this new view is 13 columns each.

I call the 25% increase storage cost a significant increase. An increase of 25% in storage for a single materialized view is not an uncommon occurrence. I have seen multi-terabyte databases with 25% of the total data storage being caused by a single view. If you are tight on storage, any unexpected data volume increase can cause a non-linear growth in data requirements to occur just because of the presence of indexed views. Take that into consideration when presented with an option to either performance tune the code or to take a short-cut and create an indexed view.

Band-aid Performance

I alluded to the use of indexed views as a cog to help improve performance. This is done by taking the query that sucks and turning it into a view. The thought is that it will run faster because an index is created specifically for that data set. Generally the performance gains from using an indexed view, to camouflage the bad query, are seen when the query is complex. Unfortunately, the query I provided for this article is not terribly complex. Due to the simplicity of the first example, let’s look first at the execution plan for another query that runs terribly slow but is also not terribly complex:

There are a few table spools and index scans. The most costly operator of the bunch appears to be a merge join with a many-to-many join operation. The indexed view in this case does give me a very significant improvement on this query (look at that estimated cost and then the estimated number of rows). What if I could tune the query a little and avoid the all of storage cost? Let’s see!

Consider the base query for this new view:

Looking at the query, you may be asking yourself why use so many trim functions on each of the joins? This is an extreme that can be easily fixed. None of the fields in the joins should need to be trimmed since they are all numeric in this case. Starting there and removing all of the function use in the joins should provide significant improvements in query speed and without the use of a view. Testing it out would yield similar performance results to the first view in this article. This small change would cause the query to complete in 1 second compared to about 90 seconds. That is a huge gain, and without the added cost of increased storage use.

Do I really see stuff like this in the field? Yes! Spending a little time to fix the root of the problem rather than looking for a quick fix and performance can be spectacular. There are times, though, that an indexed view may be an absolute requirement. That would be perfectly fine. There are times when it is warranted and there is no way around it. If an indexed view is an absolute must, then there are a couple more considerations to be taken into account.

Disappearing Act

I showed previously that the indexed view requires storage space and showed that the new view created quickly claimed its place as the second largest object within the database. Now, to stage the next segment, I will show once again that the indexes are present and consuming disk space. This can be shown via the following query:

And the results of that query should look something like the following (quick reminder that I created a new indexed view for the second demo and these results will be based on that second view):

Now that we know the index is present and consuming space, time to make a change to the view and see what happens. I am proceeding under the premise that I have determined that including the two XML columns in the view are completely unnecessary and are the cause of too much space consumption. I can reap the benefits of the view at a fraction of the cost if I ALTER the view and remove those columns. So I will proceed by issuing the following ALTER statement:

Take note of how long this ALTER command takes to complete – nearly instantaneous.  That rocks! Right? Let’s look at the indexes to make sure they are still in working order and don’t require a defrag. Based on that, I will re-run that query to check the indexes and sizes – for giggles.

This time when the query completes, I get a nice little surprise: I no longer have any indexes on that view. Changing the view definition instantly drops any indexes that were built on that view. If this is done in a production system, imagine the implications and problems that could arise.

Conclusion

In this article, I have covered indexed views and three different considerations that are often overlooked when looking to implement a materialized view. The three considerations for indexed views, as discussed, are: storage, changes to the view, and the band-aid effect. Overlooking the cost of storage can have serious repercussions on the production environment. Implementing an indexed view without attempting to tune the code first could lead to overlooking the storage cost and could end up just being a waste of resources. The last tidbit is that any change to an indexed view will drop the indexes. That is an easily overlooked feature of indexed views. If you forget to recreate the indexes on the view after making changes, you could be looking at a production outage.

There are a couple more critical considerations for indexed views that are often overlooked as well. These additional considerations are blocking, deadlocking, and maintenance. These considerations will be discussed further in a future article.

For more reading enjoyment try the Back to Basics or Xevents series.

«page 1 of 17

Calendar
November 2019
M T W T F S S
« Jul    
 123
45678910
11121314151617
18192021222324
252627282930  

Welcome , today is Sunday, November 17, 2019