Top 3 Database Life Hacks

TSQL Tuesday

Woohoo, it is an illuminating time of the month this second Tuesday of February 2020 and that also means it is time for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

This month, the blog party is all about hacking your database – err, hacking your database life. Jess Pomfret (b | t) wants us to each share some of the best tips we have that can help save us oodles of time or can simplify our job significantly. This has been a growing phenomenon of late, commonly called life hacks. You can find the invite from Jess – here.

I have a plethora of hacks of all sorts. But since we are supposed to focus on “life” hacks, my plethora is reduced to a smaller plethora. Whittling down that smaller plethora, I have a few that I really must recommend. A couple of these have appeared at other times in some of my articles. Check out my top 3!

Solutions

This is one that I have been using for most of my career. I started using Solutions in SSMS to help organize my scripts and keep them handy for quick use. The amount of time saved there is beyond count at this point. I adopted this method later in my career to also use it when giving presentations.

By adding all of my scripts for a presentation to a solution, it is a quick and efficient way to organize my scripts for use through the presentation. In addition, I have the solution saved so I can quickly reload it at any time when preparing to give the presentation again. Open the solution – BAM! I am ready to go.

As you can see in the associated image, I don’t just do a single presentation per solution. I create one solution to house all of my regular presentations. An added benefit to this is that I will have it already loaded just in case there is a pertinent question that can be best explained through the use of a demo from a different presentation.

I explain in more detail how to use solutions from my previous post on the topic – here.

 

Mapping it Out

This feature is one of my favorite features still – almost four years later. Even better is that MANY data professionals are still surprised to learn that it exists. I wrote about the scrollbar map in this article and here is a quick synopsis of the feature.

In the following image, I have highlighted the scrollbar as it is configured in SSMS for all instances where I use SSMS. Notice how it looks somewhat different than a typical scrollbar! To help point it out, I have circled the scrollbar in red.

How do we enable that map? That is easy, right click the scrollbar itself (as shown in the following image) and then select “Scroll Bar Options…”

After clicking that menu option, the following window will open.

Circled in red is the set of configurations that will control the look and behavior of the scrollbar. Personally, I prefer map mode over bar mode. Map mode also comes with the option of a “source preview” that will show the code as I hover over the map (as shown in the first image). Again, using this tool, I can navigate the code so much faster and save soooo much time that it is a must-have.

Clickable TSQL

One of my favorite time saving hacks is the ability to quickly find code that references a specific object. In addition to being able to quickly find the desired object references in code, I like to be able to display the referencing code quickly without having to go search for it in source control or in SSMS by name. This is where a handy dandy script that formats the code into clickable format is helpful. Not only does it give me the ability to display the code – but it displays it in the format that it was committed to the database.

All of the credit for this code nugget goes to Adam Machanic (b | t). He took the time to develop this method and to map out the “?”-able characters appropriately.
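
Adam's full script is not reproduced here, so what follows is only a minimal sketch of the underlying trick, with a placeholder object name. The idea is to wrap each module definition in an XML processing instruction so SSMS renders it as a clickable link; a complete version must also handle character sequences (such as "?>") that would break the XML, which is the mapping work credited above.

SELECT OBJECT_SCHEMA_NAME(sm.object_id) + '.' + OBJECT_NAME(sm.object_id) AS ObjectName,
       CONVERT(XML, '<?query --' + CHAR(13) + CHAR(10)
           + sm.definition
           + CHAR(13) + CHAR(10) + '--?>') AS ClickableDefinition -- click in the grid to view the committed code
FROM sys.sql_modules sm
WHERE sm.definition LIKE '%SomeObjectName%'; -- placeholder for the object you are hunting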

When running this code, I get something like the following.

When clicking the link illustrated in the preceding image (big red arrow pointing to it), I get to see something like the following.

Note here that whatever was committed to the database is exactly what is displayed. Sure, I put a sarcastic comment in the green box about the developer's notes (entirely because it is all XML and is poorly written for the SQL side imo), but there is also the important illustration in red showing the Create Procedure statement where the stored proc is actually being created. Furthermore, this is the exact format in which the proc was committed to the database. Now, I have just saved myself some time every time (potentially dozens of times a day) that I have to inquire of the database for a specific referenced object and all of the code that references that object.

Wrapping it Up

There you have it – three very powerful but simple hacks that can give you back dozens of minutes per day. These hacks can help you better navigate code and respond to some of those interrupt requests requiring either some referencing code or maybe to troubleshoot an issue that just cropped up. Check them out and try them. I am sure you will find they will save you oodles of time too!

Feel free to explore some of the other TSQL Tuesday posts I have written.

If you are in need of a little tune-up for your XE skills, I recommend reading a bit on Extended Events to get up to date. For some “back to basics” related articles, feel free to read here.

Coping with Self Doubt

TSQL Tuesday

Behold, the second Tuesday of January 2020 and that means it is time for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

This month, the blog party is almost like a masquerade ball. John Shaulis (b | t) has implored us to take hold of our inner impostor. Okay okay, he hasn’t asked us to be impostors, but rather he has requested we talk about Impostor Syndrome.

At first glance, and seeing the word “impostor”, my thoughts turned outward. I thought John wanted us to out all of those Sr. DBA candidates out there who are soooo confident but don’t know the difference between a PK and FK or even the difference between a delete and truncate (as examples).

In reality, John is asking about something entirely different. Looking up “Impostor Syndrome” I see several definitions similar to the one that John posted. Here are a couple of examples.

Impostor syndrome can be defined as a collection of feelings of inadequacy that persist despite evident success. ‘Impostors’ suffer from chronic self-doubt and a sense of intellectual fraudulence that override any feelings of success or external proof of their competence. – Harvard Business Review

And…

Impostor syndrome is a psychological pattern in which one doubts one’s accomplishments and has a persistent internalized fear of being exposed as a “fraud”. – Wikipedia

These definitions reveal something very pertinent to the topic. The syndrome consists of persistent, nagging, chronic self-doubt. Even in the face of success, the feelings of inadequacy can't be escaped. Well, this certainly sheds more light on the topic and made me ponder even further. You see, I think the syndrome is a much larger problem than having the occasional feeling of being inadequate. Persistent, chronic feelings of inadequacy and Impostor Syndrome (as defined) would make me concerned about the presence of depression as well. This can be a very serious issue and shouldn't be taken lightly. Seek help!

Well, What if…

What about the fact that so many of us do have feelings of inadequacy? The occasional feeling of not being good enough is not entirely a bad thing. (Again, the trick is figuring out where the line falls between chronic, persistent, nagging feelings and the occasional, acceptable ones.)

There have been many times when I have personally had the feeling of inadequacy as a Data Professional. These thoughts often creep up whenever doing something new or different or even something uncommon (but practiced). I have even had these thoughts occasionally right before speaking. I have a close friend who has these thoughts just about every time before he speaks.

When these thoughts occasionally crop up, what is done next is the important thing. How does one cope? Does the coping mechanism work? Do you practice your speaking in front of a mirror for hundreds of hours? Maybe you have a few people proofread an article you are writing. Maybe you build a proof of concept and then ask a couple of friends to test it and try to break it to make sure it works.

Coping mechanisms are great methods to assist in the removal of self-doubt. Another good coping mechanism is to get some physical activity. Physical activity can help with chronic self-doubt and is also something therapists may recommend for depression (important because the symptoms of Impostor Syndrome and depression are so close).

Another component of Impostor Syndrome is the fear of being discovered as a fraud. Living in fear isn’t healthy. If there is the persistent unshakable feeling of fear about being discovered as a fraud, then it is time to find help and also start troubleshooting your personal health. Here is a good starting point to help troubleshoot your health in regards to this syndrome.

One of the best coping mechanisms for Impostor Syndrome is the use of learned behaviors. Some of the learned behaviors that are great for everybody are as follows:

  1. Learn to take your mistakes in stride. Mistakes are natural. Own the mistake. Mistakes can make us better at what we do. The sooner we own a mistake, the sooner we can learn how to not make that same mistake again.
  2. Learn to accept internal validation. Not everybody needs a participation trophy. Internal validation is just as good as external validation. Also be accepting of constructive criticism.
  3. Set smaller attainable goals directed at specific skills or behaviors you wish to improve upon or learn.
  4. Learn to rely on others. Be able to reach out and ask for help.
  5. Instead of being an information hoarder, learn to disseminate knowledge. In addition, it is perfectly acceptable to admit you don’t know something and trust that you can learn it quickly or find somebody that already knows it.

Wrapping it Up

It is not abnormal to have occasional feelings of inadequacy. In IT, and particularly among those who are successful, thoughts of being a fraud can occur. The important thing is to recognize the thoughts and their frequency. When the thoughts start to creep in, learn to replace them with a coping mechanism or an alternate behavior. This article provided some coping mechanisms and some behaviors to try and learn.

In the case of experiencing chronic inescapable self-doubt and living in constant fear of being found out as a fraud, that is an entirely serious issue and professional help may be required.

Feel free to explore some of the other TSQL Tuesday posts I have written.

If you are in need of a little tune-up for your XE skills, I recommend reading a bit on Extended Events to get up to date. For some “back to basics” related articles, feel free to read here.

Cannot Open Backup Device

Your success as a DBA is directly related to your ability to ensure proper database backups are occurring. A whole slew of considerations come into play when discussing what a proper database backup actually is. To drill it down to its most basic definition, a proper database backup is one that fulfills business requirements for recovery and retention, and that has been routinely tested to ensure the backup can be restored.

Remember, a backup that cannot be restored is factually not a backup. Why? Well, going back to the definition of a backup, it is a fail-safe mechanism to be implemented in the event of a failure. If it cannot be implemented (in this case restored), then it provides no value because it cannot be used.

What about when you run into a problem even getting to the point of getting a successful backup operation? For instance, this error can be a bit troubling and may just cause you to lose plenty of time troubleshooting it.

Msg 3201, Level 16, State 1, Line 3
Cannot open backup device ‘\\SomeServer\C$\MSSQL14\MSSQL\DATA\Full_Backup2.bak’. Operating system error 1326(The user name or password is incorrect.).
Msg 3013, Level 16, State 1, Line 3
BACKUP DATABASE is terminating abnormally.

Sure, the error seems to report plenty of adequate information, but you may be surprised at the pain this particular error can cause.

Backup Device Permissions

If you search long enough you will come across numerous articles or forum posts that all pretty much point in the same direction: the problem is a permissions issue. There are a few problems with the permissions explanation, however. In a recent issue for a client, we could verify that permissions were indeed not at fault. To ensure the permissions were not at fault, we granted local admin in the OS as well as sysadmin within SQL Server for the service account. In addition, we also tried a proxy account with full-blown access in Windows and in SQL.

To take it a step further, we ensured that the special permissions for service accounts were also in effect. Those permissions are:

  1. Permission to bypass traverse checking (SeChangeNotifyPrivilege)
  2. Permission to replace a process-level token (SeAssignPrimaryTokenPrivilege)
  3. Permission to adjust memory quotas for a process (SeIncreaseQuotaPrivilege)
  4. Permission to access this computer from the network (SeNetworkLogonRight)

None of this seemed to have any effect when trying to perform a database backup in SQL Server. In our case, we also ensured that the service account had full-blown access to the folders as well.

Following the traditional troubleshooting measures, I created a virtual backup device as well only to get the same error. I tried to use the UNC from a remote server and it worked perfectly fine for all accounts in question. What could be going on? Then, I figured why not try the UNC from the server itself. Finally, a clue! Suddenly I was getting prompted for my credentials but my credentials would not work no matter what. Just for giggles, I verified my account was not locked out, and indeed my account was working just fine.

As it turns out, sometimes the problem is not really a permissions issue but a veiled security issue. You see, if I used the actual server name instead of the CNAME in all of the cases above, suddenly everything worked. The issue turns out to be more along the lines of a Windows issue and is documented here (still valid on Windows 2016).

Here is the basic SQL backup test script that led us down the path to the CNAME being the issue.
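
A minimal sketch of such a test follows, with placeholder database, CNAME, and host names (the UNC path mirrors the one from the error message):

-- Backing up via the CNAME reproduces operating system error 1326...
BACKUP DATABASE MyDB
    TO DISK = N'\\SomeCName\C$\MSSQL14\MSSQL\DATA\Full_Backup2.bak'
    WITH INIT, STATS = 25;

-- ...while the very same path via the actual host name succeeds.
BACKUP DATABASE MyDB
    TO DISK = N'\\ActualHostName\C$\MSSQL14\MSSQL\DATA\Full_Backup2.bak'
    WITH INIT, STATS = 25;

-- A virtual backup device pointed at the same UNC (purely as a testing exercise) fails the same way.
EXEC sp_addumpdevice 'disk', 'CNameBackupTest', '\\SomeCName\C$\MSSQL14\MSSQL\DATA\Full_Backup2.bak';
BACKUP DATABASE MyDB TO CNameBackupTest WITH INIT;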

Put a bow on it

When running into error 1326, it makes plenty of sense to try to create a backup dump device (only as a testing exercise) as well as to test the connectivity to the UNC path from the local server instead of a remote server. In my case, the problem was the local server blocking access to a CNAME version of the server.

Interested in learning about some deep technical information instead? Check these out!

Want to learn more about your indexes? Try this index maintenance article or this index size article.

This is the eighth article in the 2019 “12 Days of Christmas” series. For the full list of articles, please visit this page.

The Gift of the SPN

TSQL Tuesday

The second Tuesday of the month is upon us once again. That means it is time for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

This month, we are looking at the last opportunity for the year to contribute to this long-standing tradition. This month being right in the middle of the holiday season, we have an opportunity to explore how we have been blessed over the past year or so.

I personally like the topic. I have one comment concerning the topic before diving in too far. I feel it wise to reflect upon how you may have been blessed in various ways on a far more regular basis. Doing so just once a year is far too infrequent. Thanks to Mala (b | t) for bringing this topic up to remind us of how important it is.

In Mala's words:

This is a time for material gift giving, for many of us. It might also be a time to consider the many gifts we have received through the year, and perhaps use this opportunity to appreciate people or situations that we were blessed with.

SPN

Service Principal Names (SPN) are used by Kerberos to help associate a service instance to a service logon account. SPNs are useful and powerful little things. That said, sometimes they can cause wonderful little issues with connectivity to SQL Server. One of the more common issues is an error message such as this:

Cannot Generate SSPI Context

As circumstances would have it, the cause of this issue is stated in a KB article as follows.

A Service Principal Name (SPN) for the SQL Server Database Engine may either be missing, misplaced, or is a duplicate to other SPNs configured in the Active Directory of the domain.

I have known many people to run out of hair dealing with SPN issues and the dreaded SSPI context error. Gratefully, there are ways to resolve some of these issues and make life just a touch easier. One of the easier ways is a lesser-known tool that you can download from Microsoft – here.

The tool is fairly simple to install and use. The one sticking point for using it is that you need to know where to find it so you can use it after it is installed. Here is the path to the executable.

%SystemDrive%\Program Files\Microsoft\Kerberos Configuration Manager for SQL Server

The link mentioned provides the path and the install instructions along with more info on how to use the tool from the command line. Personally, I use the tool to discover the SPNs on the server in question and then to determine if any might be missing.
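
The tool automates the discovery, but if you want a quick manual cross-check, the built-in setspn utility works too. A small sketch, assuming a hypothetical domain, service account, and host:

rem List the SPNs registered to the service account
setspn -L MyDomain\SqlSvcAcct
rem Check whether a specific SPN exists anywhere in the domain
setspn -Q MSSQLSvc/sqlbox.mydomain.com:1433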

Upon opening the tool, you are greeted with this functional screen.

From the welcome screen, click on the “Connect” menu option. From there a new screen will open that prompts for the server name, service account and password. However, if you are just connecting to the local server then none of that is required and you can just proceed. After a few minutes, and with a bit of magic, the SPNs are evaluated and a report is generated. Along with this report, there is even an option to “Fix” the missing SPNs if they exist.

This tool really takes some of the room for error out of the equation and makes the job fairly easy. I recommend it. I have shared this with some sysadmins and they found it to be very helpful and time saving. Check it out!

Wrapping it Up

Every once in a while there is an extremely valuable tool that comes along. While the footprint and use frequency of this tool may not be that big, the tool is essential to making the DBA job easier to do. I recommend getting this tool – especially if you have ever run into SSPI issues.

In this article I showed the Kerberos Configuration Manager Tool for SQL Server to help troubleshoot SSPI Context errors and wayward SPNs. Another exceptionally useful tool is Extended Events. If you are in need of a little tune-up for your XE skills, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

Why oh Why? Effects of Antivirus on a DB Server

TSQL Tuesday

*shrug* my post is late due to technical issues.

The second Tuesday of the month is upon us once again. That means it is time for another group blog party called TSQLTuesday. This party that was started by Adam Machanic has now been going for long enough that changes have happened (such as Steve Jones (b | t) managing it now). For a nice long read, you can find a nice roundup of all TSQLTuesdays over here.

This month, we are looking at the 10 yr anniversary of this long standing blog party / tradition. For the 10 yr party, everybody has the opportunity to elaborate on something questionable they have seen being done as it relates to SQL Server.

This is a topic that I know has sparked a lot of commentary by the host Wayne Sheffield (b | t). In Wayne's invite, he says “Not too long ago, I ran across a situation where I was scratching my head, wondering why something had been implemented the way it had been (you can read about it here). And that gave me the idea for this T-SQL Tuesday topic.”

Take a moment to read about his “aha, this is odd” moment and then you can totally see the tone that was intended for this month's topic.

Dubious Backups

My topic (be it ever so late) is about some rather large frustrations with phantom backup processes that were interfering with normal backup operations and production performance, and that were running at random intervals throughout the day, sometimes as many as 12 times during normal business hours.

Every single time these backups kicked off, we would get I/O frozen alerts and failed logins until the I/O was unfrozen. Sometimes the freeze would last as little as seven seconds; other times it lasted for an uncomfortable minute or so.

In working with the sysadmins, they kept insisting that it must have been something configured within SQL Server. Each time they insisted, we would have to rehash the scheduled backups in SQL Server as well as validate any Windows scheduled tasks. Here is what some of those backups looked like from within SQL Server.
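
A sketch of the kind of msdb query that produces that history, assuming the DBA database discussed below:

SELECT bs.database_name,
       bs.backup_start_date,
       bs.backup_finish_date,
       bs.type AS BackupType,                  -- D = full backup
       bs.user_name,
       bmf.device_type,                        -- 7 = virtual device
       bmf.physical_device_name AS BackupPath  -- a GUID here means a virtual device, not a file path
FROM msdb.dbo.backupset bs
INNER JOIN msdb.dbo.backupmediafamily bmf
    ON bs.media_set_id = bmf.media_set_id
WHERE bs.database_name = N'DBA'
ORDER BY bs.backup_start_date DESC;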

Each of those queries produces results similar to the following.

The preceding image clearly shows that there have been multiple full backups of the DBA database each day and that there is a variable quantity of these backups each day.

In the preceding image we see more details about these full backups that have been occurring. We know that the backups are not coming from the SQL Service account. We also know that these are not SQL backups but rather that they are coming from an outside entity that is causing the backups to be sent to a virtual device. We know a virtual device is the target due to the GUID for the BackupPath instead of an actual file path.

If I take this a step further and try to trace it via an Extended Events session, I can further confirm that the backups are from an outside entity and that it appears to be something using the Windows shadow copy service – or something like that. Let's look closer at that.

League of Shadows

Here we see the properties for the volume shadow copy service for all volumes (including those involved with SQL Server). Intriguingly, despite the service being disabled for every volume, something still appears to be consuming storage toward the VSS quota. This supports the idea that something is causing the backups to be performed via VSS, but it is not the actual Windows service because that is disabled for each volume.

Let’s investigate further. Time to break out some command line queries.
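
The scheduled task inventory, for instance, can be pulled with the built-in schtasks tool:

schtasks /query /fo LIST /v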

Running that command, I can compare all scheduled tasks and their run times against the actual backups that were occurring and determine if any might be related to the backups. As it turned out, none of the tasks in Task Scheduler ran at any time remotely close to these backups.

Next, let’s look at the VSS configurations more closely. I can query the providers, shadows and shadowstorage.
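
The built-in vssadmin tool covers all three:

vssadmin list providers
vssadmin list shadows
vssadmin list shadowstorage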

Looking at the providers first to determine if there is something there, we might be introduced to something like this output.

Maybe there is an aha for you at this moment. Two services are certainly worth investigating: “CommVault VSS Hardware Provider Service” and “Galaxy VSS Provider Service” (both are related to CommVault). In this particular case, we exhausted those avenues and were able to determine that neither was related to this issue.

Let’s try the shadows.

A couple of things are of note here. The timestamps from these VSS shadows are consistent with FULL database backup timestamps in SQL Server. Next, we see that these are designed to serve as an “ApplicationRollback” and differential backup. So, something is taking an application-consistent differential backup every so often for some reason yet unknown. Despite being a VSS differential backup, it is not a differential backup inside SQL Server – it is a FULL backup. This configuration can be detrimental to your recovery plan if you are unaware of it (some ramifications here).

So, what is causing this? Well, one more clue actually helped us figure it out. The issues started on August 20th. On that day, a new antivirus software was rolled out to the servers. This software, unbeknownst to the sysadmins, was taking VSS snaps to try and protect against ransomware – supposedly. I have to say, that is a pretty dumb thing to do to a database server. A legitimate backup/recovery plan would suffice in the case of a hijacked OS, without the added burden of the snapshots from the AV software or the added stress of having something screw with the backup/recovery plan.

This is a prime example of using a wrecking ball to hammer in a screw. All in the guise of security and protection. Please stop the insanity.

Making matters more difficult is the fact that the GUIDs provided by the VSS storage and what we see listed as the virtual device in SQL Server do not correlate directly to each other. You have to base your assertions off of the timestamps.

If your sysadmins insist on having a tool (whether it be Antivirus or a backup solution) that causes VSS snaps to occur which could foul up the database recovery plan, then I recommend you insist on having access to that tool.

Wrapping it Up

I am not a fan of tools that interfere with the database recovery plans that may be requisite to properly meet RTO and RPO. At best these phantom backups cause undue headache in troubleshooting. At worst, they make it impossible to recover in the event of a database related disaster. Sadly, the database is regarded as the red-headed step child and these tools ignore best practices for the database world. Forcing a database server to fit in the same mold as a file or print server is flat out absurd and these tools make no effort to educate the sysadmins of the piss-poor practices they are forcing down the throat of your database server.

In this article I showed various tools in the command line as well as within SQL Server to troubleshoot these problematic backups. One of the tools was an XEvent Session. If you are in need of a little tune-up for your XE skills, I recommend reading a bit on Extended Events to get up to date. For other basics related articles, feel free to read here.

Ghosts in your Database

Yes Virginia, there are ghosts in your database.  More specifically, there are ghosts in your SQL Server database.  They are not there to haunt you.  They are not there just for this holiday season (speaking of Halloween Month).

How can there be ghosts in the database?

Why would there be ghosts in the database?

Do they happen because somebody issued a KILL statement?

Let's address each of those in turn.  A database ghost record is (in a very basic form) one that has just been deleted in an index on a table. Delete operations don't actually physically remove records from pages – they only mark them as having been deleted (ghosted). Now why is it done this way?  The answer here is largely performance based.  This is a performance optimization that allows delete operations to complete more quickly. Additionally, it allows the rollback of delete operations to process more quickly.  The rollback processes faster because all that needs to happen is to “flip the flag” for the records as being deleted/ghosted, instead of having to reinsert the deleted records.  That may be a bit over-generalized, but I hope you get the gist.  In short, records are marked as “ghosted” when a delete operation is performed; and to roll back, you simply undo that mark.

Now, what about this KILL statement thing?  The kill statement is pure Halloween fun and does not create ghost records.

Ghost Hunting

Now that we have established the purpose of Ghosts in the database, how do you verify the existence of Ghosts?  In other words, what can we do to prove there really are spectral things in the database?  This is where the fun really begins.  First, we need to get out the equipment and tools (as any good ghost hunter would do) so we can capture these phantasms.  Let’s call the first tool the “trap”.  Here is what you will need for it.
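
The following is a minimal sketch of such a trap under stated assumptions: the database name, column sizes, and data distribution are mine; the Halloween.Ghosts table, the million rows, the two indexes, and the padding column TheBlob come from the text.

USE master;
GO
CREATE DATABASE GhostTrap; -- hypothetical database name
GO
USE GhostTrap;
GO
CREATE SCHEMA Halloween;
GO
CREATE TABLE Halloween.Ghosts (
    GhostID    BIGINT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Ghosts PRIMARY KEY CLUSTERED,
    GhostName  VARCHAR(64) NOT NULL,
    SlimerDate DATE        NOT NULL,
    TheBlob    CHAR(8000)  NOT NULL DEFAULT REPLICATE('x', 8000) -- the space-hungry column; remove it to shrink the trap
);
GO
CREATE NONCLUSTERED INDEX IX_Ghosts_SlimerDate ON Halloween.Ghosts (SlimerDate); -- the second index
GO
-- Load one million rows via a stacked tally CTE.
;WITH E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(n)),
      E3(n) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c),
      E6(n) AS (SELECT 1 FROM E3 a CROSS JOIN E3 b)
INSERT INTO Halloween.Ghosts (GhostName, SlimerDate)
SELECT TOP (1000000)
       CONCAT('Ghost', ROW_NUMBER() OVER (ORDER BY (SELECT NULL))),
       DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 1500, '20120101')
FROM E6;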

This trap, err database, can be a bit large.  As currently configured, we will need about 16GB of disk space to support it.  If that is too much, I recommend removing the last column – “TheBlob”.  As you can see, we are setting a rather large trap.  The table we create (Halloween.Ghosts) will receive One Million records.  This is most probably overkill to catch these ghosts, so you can also cut back on the number of records to be affected.

Now, to make sure we have some data and that we can use the table, let’s just run a little test query.
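
Something as simple as this will do (names from the sketch above):

SELECT TOP (100) GhostID, GhostName, SlimerDate
FROM Halloween.Ghosts
ORDER BY GhostID;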

Excellent, we have a good sample of data.

[image: database ghost records]

At this point, it is important to note that we have done nothing that will cause database ghost records.  All that has been done is to set the framework so we can see the ghosts.  With the framework in place, let’s try to catch some ghosts.  To do so, we need to try to delete something.  Since we just happen to have had a clerical error in our database, we have 666 prime candidates to try and fix.  We happen to have several records that were supposed to be given a Slimer date of Halloween.  The clerk, being absent minded, thought that Halloween was supposed to be on Oct. 30.  Our business model dictates that the invalid records must be deleted first and then we can try to enter the replacement records.  So, let’s go ahead and try to remove those records.

Before we remove the records though, we need to discuss one important requirement for us to be able to see the ghosts.  Let’s call it spectral vision goggles.  In the database realm, we call it a trace flag.  In order to see the the ghosts on the pages, we need to enable TF 661.  We can do that with the following statement.  There is a serious side effect to this method too – it alters the behavior of the Ecto Containment Unit or automatic ghost cleanup process.  If you enable this, you will need to disable it later and/or manually run a ghost cleanup.
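
In T-SQL, that is simply:

-- TF 661 disables the automatic ghost cleanup process instance-wide; remember to undo this later
DBCC TRACEON (661, -1);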

Now that we have the last piece of equipment in place, let’s go ahead and try to delete some records.
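
Continuing with the sketch from above (the Oct. 30 predicate assumes the hypothetical data distribution):

DELETE FROM Halloween.Ghosts
WHERE SlimerDate = '20121030'; -- the clerk's mistaken Oct. 30 rows (the post's data had 666 of them)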

With all of those records deleted (all 666 of them), let’s see what we might have captured.  First, let’s take a look at some index stats.
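
A sketch of such a query against the index physical stats DMV (the DETAILED mode matters here, since ghost counts are not populated in LIMITED mode):

SELECT OBJECT_NAME(ips.object_id) AS TableName,
       i.name AS IndexName,
       ips.ghost_record_count,
       ips.version_ghost_record_count -- homework: research this column
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Halloween.Ghosts'), NULL, NULL, 'DETAILED') ips
INNER JOIN sys.indexes i
    ON ips.object_id = i.object_id
   AND ips.index_id = i.index_id;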

If we look at the output of this query, we will see that we did indeed attempt to delete 666 records.  Those records will now display in the ghost_record_count column.  We will also see that, since we had two indexes on the table, there are 666 ghost records marked on each index.

[image: idxstats_ghostcount]

Very cool!  We are definitely on the track to capturing those ghosts.  We have a trail that they exist in the database.  Let’s keep going and see where we can see them.  You should note that there is an additional column in our result set that looks like it might be related to ghost records.  We are going to leave the discovery of version_ghost_record_count as a homework experiment for you to perform.  It is beyond the current scope of this article.
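
A sketch of the sort of fn_dblog query described next (including the hex-to-integer page number conversion mentioned below):

SELECT [Current LSN],
       Operation,
       Context,
       AllocUnitName,
       [Page ID],
       CONVERT(INT, CONVERT(VARBINARY(4), RIGHT([Page ID], 8), 2)) AS PageNumber -- hex page id to integer
FROM fn_dblog(NULL, NULL)
WHERE Context = 'LCX_MARK_AS_GHOST'; -- log records that marked a row as ghosted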

Now this is getting exciting.  We have stronger evidence in the log showing that these ghosts are hanging around in the database.  Not only are they hanging around in the database, we can see which pages in the database on which they are trying to hide.

[image: dblog_output]

This is really solid information!  fn_dblog is giving us just about everything we need in order to get those ghosts.  It took a little bit of work since the log reports the page number in hex.  Converting that to an integer page number is essential for us to look at the page (besides, integer values are easier to interpret for most people).  Now I can take the PageID for any of the records reported by fn_dblog and pass it into yet another undocumented command known as DBCC PAGE.

When looking to use DBCC PAGE, we can either look at the PFS page and see more pages that have ghost record counts, or we can take the results from the fn_dblog output and then look at the contents of a page to catch those ghosts.  We will take a quick look at the PFS page first, and then at an index page.  In this database that we have created, the PFS page will show several other pages that have ghost records on them.  Due to the size (over 2 million pages), we only see index pages with ghost records in that result.  If our database were smaller, we would quite possibly see data pages in the first PFS page of the database.  Let's see a sample from the first PFS in this database.
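
A sketch of that first look, using the hypothetical GhostTrap database (trace flag 3604 routes DBCC output to the client):

DBCC TRACEON (3604);
DBCC PAGE ('GhostTrap', 1, 1, 3); -- file 1, page 1 is the first PFS page; 3 = detailed output format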

[image: ghost_displayedonpageduetotf]

We can follow that link from this point to page 126.  Page 126 happens to be an index page similar to the following.  There are a couple of indicators that this is an index page.  The first is that when we run DBCC PAGE with a format of 3, we will see two result sets, the second of which shows statistics and index information.  The second indicator is in the image attached after the query.  We will leave it as an exercise to you to find other ways to demonstrate that this is an index page.

[image: ghost_indexpage]

That is great, but we have more ghosts to find.  Let’s look at a ghost on a data page.  Randomly picking a PageID from that list that was output from fn_dblog, let’s see what DBCC Page will provide to us.
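
Same command, different page (the page number below is a stand-in for one taken from the fn_dblog results):

DBCC PAGE ('GhostTrap', 1, 2000, 3); -- substitute a PageNumber from your own fn_dblog output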

[image: ghstcntondatapage]

Conclusion

Well, isn’t that just cool!  We have trapped a bunch of ghosts and were even able to see them.  This has been a fantastic deep dive into the crypts of the database.  This is merely a scratch on the surface though.  We hope this will encourage you to explore a bit and at least try one of the homework assignments we left behind in this article.

With all of that, we have a bit of cleanup to do.  The cleanup comes in one of two methods.  Method one involves manual labor.  Method two involves our friendly little trace flag we already used.  Since most DBAs prefer the automated mechanisms over manual, let’s just discuss method two for now.  It is extremely effortless.
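
Method two amounts to turning the trace flag back off so the automatic ghost cleanup process resumes:

DBCC TRACEOFF (661, -1); -- re-enable automatic ghost cleanup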

That will put the system back to the way it was when we started (and of course we trust that nobody did this on their prod box).

This has been one of a few articles about ghosts in the database. You can check out some of the others here and here.

PowerShell ISE Crashes

Working with PowerShell brings a lot of advantages and power to help manage a server. The more current your PoSh version, the more efficiently you will be able to manage your server. Sometimes getting to the current PoSh versions comes with a little pain such as ISE crashes.

I recently had the misadventure of working through some ISE crashes after bringing some systems up to PoSh 5.1 that were either PoSh 2.0 or 3.0. It's not a very fun situation to run a WMF update and then run into a crash of any type when testing whether it worked. Your first thought is that something terrible has happened.

As it stands, the problem is more of a nuisance than a critical failure. That said, it is enough of a problem that anyone who uses the ISE or .Net applications may experience a slight cardiac event.

Fonts

As you work to quickly recover from your missed heartbeat, you start digging through logs and then hitting good old trusty Google.

Diving through the logs, you might just happen across an error similar to the following:

Problem signature:
Problem Event Name: PowerShell
NameOfExe: PowerShell_ISE.exe
FileVersionOfSystemManagementAutomation: 6.1.7600.16385
InnermostExceptionType: System.Xml.XmlException
OutermostExceptionType: System.Reflection.TargetInvocation
DeepestPowerShellFrame: Windows.PowerShell.GuiExe.Internal.GPowerShell.Main
DeepestFrame: Windows.PowerShell.GuiExe.Internal.GPowerShell.Main
ThreadName: unknown

Maybe the first error you encounter might look like this one instead:

System.TypeInitializationException

“FileFormatException: No FontFamily element found in FontFamilyCollection
that matches current OS or greater: Win7SP1”.

Inner exception originates from: CompositeFontParser

Either way, the error shoots us back to the same fundamental problem. The ISE won't load, you get an error message, and you can't confirm that the WMF patch was applied properly.

As you work your fingers faster and faster through the pages on google, you discover that this problem is caused more explicitly by a patch for the .Net framework and not necessarily the work to upgrade your PoSh version. It only waited to manifest itself after the upgrade.

That’s gravy and all, but how does one fix the problem? For me, the quickest and most reliable fix was to simply jump straight to the root of the problem – fonts. The ISE is a WPF application and it also requires a fallback font (if a character isn’t present in your font set, then the app chooses a substitute from the fallback font – or something like that).

The fix is extremely simple and really underscores why this is merely a nuisance issue and not a critical problem. Thus it shouldn't cause any sort of sinking internal feelings. There are a few plausible fixes floating around out there. I recommend just doing a manual font replacement. It is but three simple steps (copy-paste commands follow the list):

  1. Download GlobalUserInterface.CompositeFont
  2. XCOPY the font to %windir%\Microsoft.NET\Framework\v4.0.30319\WPF\Fonts
  3. XCOPY the font to %windir%\Microsoft.NET\Framework64\v4.0.30319\WPF\Fonts
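
In command form, the copy steps look like this (the download location is a placeholder):

rem Copy the fallback font into both WPF font folders
xcopy /Y "C:\Temp\GlobalUserInterface.CompositeFont" "%windir%\Microsoft.NET\Framework\v4.0.30319\WPF\Fonts\"
xcopy /Y "C:\Temp\GlobalUserInterface.CompositeFont" "%windir%\Microsoft.NET\Framework64\v4.0.30319\WPF\Fonts\"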

After you have copied the font to those two directories, then all that is needed to be done is launch the ISE. I ran into the same problem on three or four servers and the fix took no more than 5 minutes on each of the servers.

Conclusion

I previously mentioned that I have been working more and more with PoSh to try and improve my skillset there. This is one of those very low-level trinkets that I ran into as I have been working to hone my skills in that tech. For other, possibly, interesting articles about my experiences with PowerShell, you can check out these articles.


Syspolicy Phantom Health Records

SQL Server comes with a default SQL agent job installed (for most installations) to help manage the collection of system health data. I would dare say this job is ignored by most people and few probably even know it exists.

This topic is not new to me. You may recall a previous article I wrote entailing one type of failure and how to resolve that failure. That article can be found here.

I recently ran into a variant of the problem outlined in that previous article that requires just a bit of a different approach. The errors turn out to be similar on the surface but really are different upon closer inspection.

Phantom Health Records

If you are unfamiliar with the topic, I recommend reading the previous article. Then after reading that article, maybe brush up a little bit on the SQL Agent. The failures we will be looking at are within the SQL Agent and come from the job called: syspolicy_purge_history.

For this latest bout of failure, I will start basically where I left off in the last article. The job fails on a server and I have access to other servers of the same SQL version where the job works.

Before diving any further into this problem, let’s look at what the error is:

A job step received an error at line 1 in a PowerShell script.
The corresponding line is ‘set-executionpolicy RemoteSigned -scope process -Force’.
Correct the script and reschedule the job. The error information returned by PowerShell is:
‘Security error. ‘. Process Exit Code -1. The step failed.

Having the error in hand, and knowing that the job works elsewhere, my next logical step (again based on experience from the last article with this job) is to script the job from another server and use it to replace the job on the server where it fails. In principle this is an AWESOME idea.

 

Sadly, that idea was met with initial failure. As it turns out, the error remained exactly the same. This is good and unfortunate at the same time. Good in that I was able to confirm that the job was correctly configured with the following script in the job:
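
For reference, the PowerShell step of syspolicy_purge_history looks approximately like the following (the exact text varies a bit by SQL Server version; the $(ESCAPE_SQUOTE(...)) and $(ESCAPE_NONE(...)) tokens are SQL Agent replacement macros, not PowerShell):

set-executionpolicy RemoteSigned -scope process -Force
if ('$(ESCAPE_SQUOTE(INST))' -eq 'MSSQLSERVER') { $instPath = '\DEFAULT' } else { $instPath = '' }
(Get-Item SQLSERVER:\SQLPolicy\$(ESCAPE_NONE(SRVR))$instPath).EraseSystemHealthPhantomRecords()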

Since the step fails from SQL Server, let's see what else we can do to make it run. Let's take that code and try it from the PowerShell ISE. So, for giggles, let's cram that script into PowerShell and see what blows up!

Now isn't that a charming result! This result should be somewhat expected since the code I just threw into the ISE is not entirely PowerShell. If you look closer at the code, you will notice that it is using sqlcmd-like conventions to execute a parameterized PowerShell script. Now, that makes perfect sense, right? So let's clean it up to look like a standard PoSH script. We need to replace some parameters and then try again.
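
Cleaned up, with a placeholder server name substituted for the Agent tokens (default instance assumed), the script becomes:

set-executionpolicy RemoteSigned -scope process -Force
# MyServer is a placeholder; use your own server name
(Get-Item SQLSERVER:\SQLPolicy\MyServer\DEFAULT).EraseSystemHealthPhantomRecords()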

This will result in the following (resume reading after you scratch your head for a moment):

The key in this failure happens to be in the SQLSERVER: portion of the path. PoSH thinks we are trying to pass a drive letter when we are just trying to access the SQLSERVER: provider. Depending on your version of Windows Server, SQL Server, and PoSH, you may need to do one of a couple of different things. For this particular client/issue, this is what I had to try to get the script to work.
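
The exact command is not shown here, but given the pointer back to the previous article, a plausible shape of the temporary fix is to explicitly load the older SQLPS module so the SQLSERVER: provider exists before the purge call (server name remains a placeholder):

# Load the legacy module that ships the SQLSERVER: provider
Import-Module SQLPS -DisableNameChecking
(Get-Item SQLSERVER:\SQLPolicy\MyServer\DEFAULT).EraseSystemHealthPhantomRecords()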

If you read the previous article, you may notice this command looks very much like the command that was causing the problems detailed in that previous article. Yes, we have just done a 180 and returned to where we started a few years back. Suffice it to say, this is expected to be a temporary fix until we are able to update the system to PoSH 5 and install the updated SqlServer module.

As is, this script is not quite enough to make the job succeed. To finish it off, I created a ps1 file to house this script. Then from a new step (defined as a CmdExec step type) in the syspolicy purge job, I execute that PowerShell script as follows:
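
Something along these lines (the script path is a placeholder):

powershell.exe -ExecutionPolicy RemoteSigned -NoProfile -File "C:\Scripts\EraseSystemHealthPhantomRecords.ps1"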

Ta-da, the nuisance job failure alert is resolved and the system is functioning again.

Conclusion

I dare say the quickest resolution to this job is to probably just disable it. I have seen numerous servers with this job disabled entirely for the simple reason that it fails frequently and just creates noise alerts when it fails. Too many fixes abound for this particular job and too few resolve the failures permanently.

I would generally err on the side of fixing the job. Worst case, you learn 1000 ways of what not to do to fix it. 😉

Given this job is tightly related to the system_health black box sessions (sp_server_diagnostics and system_health xe session), I recommend fixing the job. In addition, I also recommend reading the following series about XE and some of those black box recorder sessions – here.

Index Cannot Be Reorganized…

Working diligently as any good DBA might, you have established maintenance routines for each of the SQL Servers under your purview.

As a part of this maintenance you have scripted solutions to intelligently manage and maintain the fragmentation levels for each of the indexes within each database on each instance.

To further show how diligent you are as a DBA, the outcomes of each maintenance run are logged and you review the logs each morning. This routine helps keep you on top of everything that is happening within the environment.

For months, maybe even years, things are running smoothly. Never a failure. Never an error. Just routine log review day after day. Then one day it happens – there is an error. The index maintenance script failed one night.

Index Cannot be Reorganized…

You receive the error message similar to the following:

Msg 2552, Level 16, State 1, Line 1
The index “blah” (partition 1) on table “blah_blah_blah” cannot be reorganized because page level locking is disabled.

Immediately, you start double-checking yourself and verifying that it worked the previous night. You even go so far as to confirm that the same index was previously reorganized. How is it possible that it is failing now on this index? What has changed? Has something changed?

Time for a little digging and investigating, so the dirty work begins. On one side of the coin you are relieved to be able to do something different. On the other side of that coin, you are rather annoyed that something seems to have changed. These feelings are perfectly normal!

First things first – you investigate the indexes in question to confirm what the error is saying. This is easily done with a query such as the following:
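
A sketch of such a query against sys.indexes:

SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name AS IndexName,
       i.type_desc,
       i.allow_page_locks,
       i.allow_row_locks
FROM sys.indexes i
WHERE i.allow_page_locks = 0; -- reorganize requires page level locking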

Scrolling through the results, you notice (eventually) that the IX_SpecialOfferProduct_ProductID in the AdventureWorks2014 database has page locks disabled. You are absolutely certain that this index was created allowing page locks. Pondering the problem for a moment, you recall having read about the default trace (there are several articles on the default trace – here) and the thought occurs to try and see if there is a breadcrumb there about the change. Using the query from that default trace article, a picture starts to unscramble. Here is that query reposted and a snippet of the results:
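
A query along the lines of the one in that default trace article (reconstructed here, not copied verbatim):

DECLARE @path NVARCHAR(260);

SELECT @path = REVERSE(SUBSTRING(REVERSE(path), CHARINDEX('\', REVERSE(path)), 260)) + N'log.trc'
FROM sys.traces
WHERE is_default = 1;

SELECT gt.StartTime,
       te.name AS EventName,
       gt.DatabaseName,
       gt.ObjectName,
       gt.HostName,
       gt.ApplicationName,
       gt.LoginName
FROM sys.fn_trace_gettable(@path, DEFAULT) gt
INNER JOIN sys.trace_events te
    ON gt.EventClass = te.trace_event_id
WHERE te.name = N'Object:Altered'
ORDER BY gt.StartTime;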

[image: index_deftrace_audit]

This is a great start. Not seen in the results is the timestamp showing when it was done – which was due solely to snipping. Also not shown is the text of the statement that was run for those three events. So close, yet so far away. This is not quite enough to have the smoking gun evidence to show Jason (me) that I did something wrong and unauthorized. What to do now?

All is not lost yet! Your stunning memory kicks in and you recall several articles about using Extended Events to audit server and database changes. Better yet, you recall that you deployed an XE session to the server where this error occurred. And yes, you are slightly embarrassed that you failed to remove the XE session after fiddling with it. We won't tell anybody that you deployed a test XE session to a production server as if it were your sandbox. The session currently deployed traps all object changes, unlike the following session, which is filtered down to just index objects.
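
A sketch of such a session (the file path is a placeholder; the index-only filter is noted in a comment because the object_type map values vary by build):

CREATE EVENT SESSION AuditIndexChange ON SERVER
ADD EVENT sqlserver.object_altered (
    ACTION (sqlserver.session_id, sqlserver.username, sqlserver.sql_text)
    -- to filter to just indexes, add a predicate on the object_type field;
    -- look up the 'INDEX' value in sys.dm_xe_map_values for your build
)
ADD TARGET package0.event_file (SET filename = N'C:\Database\XE\AuditIndexChange.xel')
WITH (STARTUP_STATE = ON);
GO
ALTER EVENT SESSION AuditIndexChange ON SERVER STATE = START;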

You query the trace file with a query like this:
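
A sketch of that shredding query (the path matches the session sketch above):

SELECT x.xdata.value('(event/@timestamp)[1]', 'DATETIME2') AS event_time,
       x.xdata.value('(event/data[@name="ddl_phase"]/text)[1]', 'VARCHAR(32)') AS ddl_phase,
       x.xdata.value('(event/data[@name="object_name"]/value)[1]', 'VARCHAR(256)') AS object_name,
       x.xdata.value('(event/action[@name="username"]/value)[1]', 'VARCHAR(128)') AS username,
       x.xdata.value('(event/action[@name="sql_text"]/value)[1]', 'VARCHAR(MAX)') AS sql_text
FROM (
    SELECT CONVERT(XML, event_data) AS xdata
    FROM sys.fn_xe_file_target_read_file(N'C:\Database\XE\AuditIndexChange*.xel', NULL, NULL, NULL)
) x;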

Wow! Look at the results! There is a smoking gun finally.

[image: index_xe_audit]

Two results happen to pin the root cause of the change squarely on my shoulders. Yup, Jason changed that index to disallow page locks. I highlighted three different areas of interest in the results. The yellow and green indicate the DDL phase. One row for the start of the statement, and another row for the commit of that statement. The red highlight shows me the exact change that was made to this index. This is all very good info!

What Now?

It really is great to have the smoking gun. If something is broken and it worked previously, it is essential to find the root cause. With a root cause under the belt, what needs to be done to fix the failure? That is a little bit easier than finding the root cause. That is, unless there is a technical reason for the index to no longer allow page locks (maybe that smoking gun is less of a smoking gun and more like baby spittle after all).

Here is how you fix the problem:
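
Re-enable page locks on the affected index (the names below match the AdventureWorks2014 example):

ALTER INDEX IX_SpecialOfferProduct_ProductID
ON Sales.SpecialOfferProduct
SET (ALLOW_PAGE_LOCKS = ON);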

But, But, But…

The Extended Events session would be very noisy and capture every alter index statement, right? It should capture statements like the following, right?
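
For example, statements such as these (run against the same AdventureWorks2014 index):

ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct REORGANIZE;
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct REBUILD;
ALTER INDEX ALL ON Sales.SpecialOfferProduct REBUILD;
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct SET (ALLOW_PAGE_LOCKS = OFF); -- the culprit change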

The answer to that question is: Yes, Yes, Yes. These statements are all captured due to the use of the ALTER statement. Here are the results of the XE session with all of these scripts having been executed:

[image: index_xe_audit2]

If you want to audit when the indexes are changing and how they are changing, this will do just the trick. If there are a ton of changes, then be prepared for a deluge of information.

Compressing Encrypted Backups

A common requirement, whether born out of pure want or true necessity, is to make a large, encrypted database backup file much smaller.

This was a knock against the early days of Transparent Data Encryption (circa SQL Server 2012). If TDE was enabled, a compressed backup (though compression was available) was not a practical option. Not only did compression in the 2012 implementation of TDE fail to make the database backup smaller, it occasionally caused the backup to be larger.

This was a problem.  And it is still a problem if you are still on SQL 2012. Having potentially seen this problem, amongst many others, Ken Wilson (blog | twitter) decided to ask us to talk about some of these things as a part of the TSQL Tuesday Blog party. Read all about that invite here.

Encrypted and Compressed

Well, thankfully Microsoft saw the shortcoming as well. With SQL Server 2014, MS released some pretty cool changes to help us encrypt and compress our database backups at rest.

Now, instead of a database backup that could potentially get larger due to encryption and compression combined, we have a significant hope of reducing the encrypted backup footprint to something much smaller. Here is a quick example using the AdventureWorks2014 database.

In this little exercise, I will perform three backups. But before I can even get to those, I need to ensure I have a Master Key set and a certificate created. The encrypted backups will require the use of that certificate.

Do this in a sandbox environment please. Do not do this on a production server.
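
A minimal sketch of those prerequisites (certificate name and password are placeholders):

USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Use_A_Real_Passphrase_Here!1';
GO
CREATE CERTIFICATE BackupCert
    WITH SUBJECT = 'Backup encryption certificate';
GO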

In the first backup, I will attempt to back up the AW database using both encryption and compression. Once that is finished, a backup that utilizes the encryption feature only will be done. And the last backup will be a compressed-only backup. The three backups should show the space savings and encryption settings of the backup if all goes well. The compressed and encrypted backup should also show savings equivalent to the compression-only backup.
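
A sketch of the three backups (paths are placeholders; the certificate comes from the step above):

BACKUP DATABASE AdventureWorks2014
    TO DISK = N'C:\Database\Backup\AW2014_EncComp.bak'
    WITH COMPRESSION,
         ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert),
         INIT, STATS = 25;

BACKUP DATABASE AdventureWorks2014
    TO DISK = N'C:\Database\Backup\AW2014_EncOnly.bak'
    WITH ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert),
         INIT, STATS = 25;

BACKUP DATABASE AdventureWorks2014
    TO DISK = N'C:\Database\Backup\AW2014_CompOnly.bak'
    WITH COMPRESSION, INIT, STATS = 25;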

With that script executed, I can query the backup information in the msdb database to take a peek at what happened.
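
A sketch of that msdb query (column aliases chosen to match the result description below):

SELECT bs.database_name,
       bs.backup_start_date,
       CONVERT(DECIMAL(10, 2), bs.backup_size / 1048576.0) AS BackSizeMB,
       CONVERT(DECIMAL(10, 2), bs.compressed_backup_size / 1048576.0) AS CompBackSizeMB,
       bs.key_algorithm,
       bs.encryptor_type
FROM msdb.dbo.backupset bs
WHERE bs.database_name = N'AdventureWorks2014'
ORDER BY bs.backup_start_date DESC;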

This should produce results similar to the following:

[image: backup_results]

Looking at the results, I can see that the compression-only backup and the compression-with-encryption backup show very similar space savings. The compression-only backup dropped to 45.50MB and the compression-with-encryption backup dropped to 45.53MB. Then the encryption-only backup showed that, interestingly, the CompBackSizeMB (compressed_backup_size, which is the actual size on disk of this particular backup) got larger.

At any rate, the compression now works with an encrypted backup and your backup footprint can be smaller while the data is protected at rest. Just don’t go using the same certificate and password for all of your encrypted backups. That would be like putting all of your eggs in one basket.

With the space savings available in 2014, and if you are using SQL 2014, why not use encrypted backups?
