Database Recovery Monitoring with XE

On of the greatest benefits of Extended Events (xe) is how the tool simplifies some of the otherwise more difficult tasks.

Recently, I wrote a rewrite of my database recovery progress report script. That script touches on both the error log and some DMVs along with some fuzzy logic to join the data sets together. That script may not be the most complex script out there, but it is more more complex than using the power of XE.

Database recovery (crash recovery) is a nerve wrenching situation under the wrong conditions. It can be as bad as a root canal and just as necessary to endure that pain at times. When the business is waiting on you waiting on the server to finish crash recovery, you feel nervous at best. If you can be of some use and provide some information back to the business, that anxiety dissipates and the business becomes more calm as well. While the previous script can help you get that information easily enough, I want to introduce the easiest method to capture that information currently available.

If you are interested in a history lesson first, here are the first couple of versions of the aforementioned script (here and here).

Discovery First

As always, I like to explore the event repository to see if there is an event that may be applicable to my situation. This can be done via TSQL script or from the XE Gui. I will almost always break out my scripts to figure out if an event is there or not.

This query will yield any events that match my description. In this case, I am looking for events related to “database_recovery”. This search will yield four relevant events we can use to track our database recovery progress. Those events are shown in the following image (with the event names being circled in green).

If I explore the events a little more, I will eventually come across an attribute in the database_recovery_progress_report event that leads to a map. This map is called recovery_phase. For me, that is an interesting attribute/map and makes me want to look at it further.

Things are coming together a little bit now. We all know (or should know) that there are the analysis, redo and undo phases to crash recovery. This aligns with that knowledge and throws in a couple more phases of the recovery process.

So, now we know there are four relevant events for us to use and that one of the events will tell us specifically which phase of recovery is currently processing. We have enough information that an event session can now be built.

You may notice that I have thrown a lot of actions including the kitchen sink at this event session. Some of that is for consistency across sessions and some of it is simply for exploratory wants (and not needs). Feel free to add/remove actions form this list as you explore the use of this session in your environment.

Here is what that session produces on my test server with a simple stop/start of the SQL Server instance.

In the preceding image, I have the different events circled in red. I have also added the event_sequence action so I can see the relationship between these events as the progress from one to the next. If you note the items circled in green (and connected by green arrow), you will see a couple of different things such as the trace message, the database name, the database id, and the recovery time remaining). Circled in blue are the “destress” items that let us know that the recovery is 100% complete.

Wrap

SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online.

Just because you need to have patience during the crash recovery process does not mean you have to work hard to get a status of the process or that you need to stress throughout the process. This XE event session will take a lot of work and stress out of the process. I would recommend having this lightweight session running on the server with the startup state set to enabled. This will make your job easier and definitely can make you look like a rockstar DBA.

This article has demonstrated the power of Extended Events, for a lot more reading on the topic, here is a list of over 100 articles.

Database In Recovery Redux

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: April 8, 2019

One of the more gut-wrenching experiences of a DBA is to have a database get stuck in recovery. This typically happens when the server was rebooted unexpectedly and typically coincides with a rather large transaction.

The reboot and sudden stop of the transaction requires the transaction to be replayed or to experience an undo. This could take hours or days depending on worst case scenarios. Sitting there and waiting for it to complete just makes you sweat.

This happens to have happened to a client recently so I dusted off an old script (which you can find if you click this link), and started investigating and monitoring the server. While waiting and investigating, I decided to take on my challenge (quoted hereafter) from the first time I posted this query (here).

Unfortunately, this query does not demonstrate the time remaining for the rollback nor the percent complete without needing to query the error log.  Those would be awesome additions if you know how to do it (and let me know), other than via the error log.

I did not remove the dependency on the error log, rather I leveraged that dependency to allow me to get more information from the DMVs and link it to the error log messages to give a bigger more complete picture of the recovery process from a simple query.

This script is now enhanced from the 2014 version and works through 2017.

Wrap

SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online. It will require patience and this script can help achieve that patience.

T-SQL Tuesday #108: New Horizons Beyond SQL Server

Comments: 1 Comment
Published on: November 13, 2018

There comes a point in one’s career when a change is requisite. Big or small there always seems to be a tipping point that mandates some sort of change. Maybe the change is an entirely new career field. Maybe the change is adapting to the ever improving features of a specific software or product. Maybe, that change means learning a tangential technology.

This is precisely the goal Malathi Mahadevan (b | t) seems to have envisioned for the 108th installment of TSQL Tuesday.

If you are interested in reading the original invite, you can find that here.

So the challenge for this T-SQL Tuesday is – pick one thing you want to learn that is not SQL Server. Write down ways and means to learn it and add it as another skill to your resume. If you are already learning it or know it – explain how you got there and how it has helped you. Your experience may help many others looking for guidance on this.”

Personally, I am not one to settle, so learning and improving are important. New technologies, changes in technologies, new features, tangential technologies – they are ways to continue to learn and improve – most of the time. Sometimes, a new technology offers a good change of pace and offers an exit from something that is becoming too standard, while providing an entrance to something different, difficult, exciting and expanding.

Through the year (2018), I created a few goals for myself around some of these new or different technologies:

  1. Become proficient at MySQL (maybe even certify who knows)
  2. Become proficient at PowerShell
  3. Work towards the TCM (I got to busy with 1 and 2 to even accord any time towards this one)

Proficient is sort of a vague term because it can have a different meaning to different people. For me, I will describe what I have been doing to become proficient in both PoSh and MySQL.

PowerShell

A shout out is absolutely necessary for Adam Machanic (twitter) for picking the right blog meme that has been able to survive so long in the SQLFamily. This party has helped many people figure out fresh topics as well as enabled them to continue to learn.

I have dabbled in PowerShell over the years – just enough to “be dangerous” as some may say. I wouldn’t call it proficient and probably not even dangerous really. I was able to muddle my way through being able to create some basic scripts to perform certain tasks.

This rudimentary ability just isn’t enough to be able to put the skill on a resume (imho). It certainly wasn’t enough skill to be able to manage a large group of servers and perform various enterprise type tasks. My objective was to be able to create a set of tools for myself that I could use and have repeatable success at great ease.

I sat down with my first set of various tasks I wanted to be able to perform and worked on them as I set time aside to learn PoSh better – every day. It took several weeks and by no means am I on the same level as Rob Sewell (b | t) or Chrissy LeMaire (b | t) or Sean McCown (b | t). That said, I do feel I am far more capable in PoSh now than I was last year.

This is a skill that I plan to continue to hone. I am learning every time I pick it up and try to do something different. It is good that I am able to keep learning. Next year, I plan on being able to say I am more proficient than I am now. I am also hopeful to be able to be good enough to properly contribute to the dbaChecks project. I do also hope to share some of the scripts I have created on my blog as well.

Here are some of my first dabbles with powershell that I have integrated into other blog posts. Yes, they are very rudimentary.

MySQL

Learning MySQL is probably not too big of a stretch to be honest. I certainly think it is far more similar to SQL Server in many regards than PoSh. After all, it is just another little DBMS and does happen to be rather popular.

I wanted to pick up MySQL so I could support clients that have it installed here there and everywhere in their environments. Many clients have more than one DBMS platform and it is good to understand and be able to administer multiple platforms with a high level of competence. Unfortunately, MySQL comes with a fair amount of gotchas. There are serious limitations depending on version and flavor. Some clients may be stuck on a rather old version of MariaDB (akin to SQL 2000). This means developing a broad set of scripts and skills quickly and on the fly.

I have a ways to go in my learning due to the varied flavors of MySQL but I am getting there. I do feel pretty comfortable hopping in and troubleshooting performance issues and doing a quick health assessment at this point. I would call that proficient. Similar to what I said about PoSh, I plan on being able to say next year that I am more proficient. More opportunity with this platform is what lends itself to better proficiency.

TSQL2sDay150x150The Wrap

I believe in continuous integration / improvement when it comes to personal growth and development. It is necessary to keep your personal skills sharp as well as keep yourself marketable.

Oh, and if you are interested in some of my community contributions (which according to Jens Vestargaard is an awesome contribution), read this series I have published.

Audit Database Offline Events

Categories: News, Professional, Scripts, SQLBP, SSC
Comments: No Comments
Published on: November 13, 2018

Knowing when an event occurred within the database environment is a very important thing. Being able to act quickly when certain events occur is equally as important. Sometimes, we may not find out about an event for a few days or weeks and then we are asked to figure out the who, when, why and how of the event. Hopefully there was enough foresight in place to plan for such a request to travel back in time.

Recently while networking at a large database related conference, I had the chance to discuss some of this with a semi-desperate DBA. In his environment, apparently a database can be taken offline at random times without the involvement of the DBA team. He was looking for a solution to prevent it from happening. For me, the simplest solution is to revoke permissions from the person doing it. If you can’t revoke the permissions, then at least capture the pertinent information to document who and when the databases are being taken offline.

Auditing and Facts

I am a huge fan of auditing the database environment to gather facts. There is comfort that can be gained in knowing that if a change happens to a database or server under your purview, that you can report some facts back up the chain when a post-mortem is required. I recently shared another example of such an auditing scenario – here.

There is a technique in that article that may be of interest for you if you are restricted to implementing this kind of audit. I won’t go into detail about the poor man audit techniques in this article, so it is worth reading that previous article.

When looking to find data about a database being taken offline, one could capture the data from the default trace or from the SQL Server Error Log. While this article will show how to capture this data from both of those sources, it is important to understand that the data just might not be available in either location depending on settings and activity of the server combined with how old the offline event was.

Lets first take a look at the default trace. That one should be easy enough.

And a snippet of some possible output looks like the following:

If you look closely at the output, you will probably notice that I don’t get a lot of information. I find out that there was an ALTER event to a database but I do not know what the event was nor do I see the text from the SQL statement that caused the change. So, at best, I get a partial view of the picture by looking at just the default trace.

With that in mind, let’s look at the error log and see if that provides better information.

This is pretty cool here. I am checking all of the log files available on SQL Server and scanning for a search term of “offline.”

Ok, this shows a little bit of promise. I can see that the database

was indeed set to offline (or “option OFFLINE” was set to ON) with a spid and a timestamp. Unfortunately, the data shown here is far from being super useful in a post-mortem. The error log does not show who took the database offline.

What if, there was a way to combine these two queries into a single result set and better correlate the results into something highly useful? Something like that seems like it could be possible given we have a spid and timestamp in each result set.

Let’s try it and see.

If I use the timestamp from each result and the spid from each result set, I can surely join the two results together. As you can see, that is exactly what I did. I had to format the spid and timestamp a little bit – but that is ok. With that little bit of magic, I get a result set similar to the following now.

Now, I can easily see who put the database into offline mode. I can also tell you what the statement was when they executed the offline. I also know what the timestamp was when the database was put offline. This is usable data now when it comes time for a post-mortem or fact finding request.

The Wrap

Is this the most eloquent method to trap this kind of data? Absolutely not, but it is usable. When it comes to something more sleek and modern, I would recommend a better tool such as Extended Events. I will show how to do this same task with greater ease and reliability through the use of XE in my next article.

Oh, if you are interested in any of my other articles about Auditing or Extended Events, I recommend you read start in here (for Auditing) or here (for XE).

T-SQL Tuesday #104: Just Can’t Cut That Cord

We all have our favorite scripts, tools or utilities. Those are the things that help make our jobs easier. Some of us may have an unhealthy relationship with some of those scripts (similar in nature to the relationship many have with their phone). Whether or not the need to cut that proverbial cord exists, today we are not discussing the health of that dependence. Suffice it to say, sometimes we simply need to upgrade our scripts. How else can we get better scripts or make our scripts better – by sharing them.

This is precisely the goal Bert Wagner (b | t) seems to have envisioned for the 104th installment of TSQL Tuesday.

If you are interested in reading the original invite, you can find that here.

For this month’s T-SQL Tuesday, I want you to write about code you’ve written that you would hate to live without.

Maybe you built a maintenance script to free up disk space, wrote a query to gather system stats for monitoring, or coded some PowerShell to clean up string data.  Your work doesn’t need to be completely original either – maybe you’ve improved the code in some open source project to better solve the problem for your particular situation.”

There is a high probability that through the sharing of your script, somebody out there can benefit from that script. In addition, it is very likely that somebody will make a suggestion to help make your script better. Worst case (emphasis on worst case here), you have the script stored somewhere with half decent instructions on what it does and making it easily accessible for you to use again and again. Just in case you forget you have it out there – you can google for it again and find it on your own blog ;).

Personally, I have been able to find and re-use some of my older scripts. Not only do I get to re-discover them, but I also get to re-imagine a new use or improvement for the script.

Brief Intermission

A shout out is absolutely necessary for Adam Machanic (twitter) for picking the right blog meme that has been able to survive so long in the SQLFamily. This party has helped many people figure out fresh topics as well as enabled them to continue to learn.

Easy Access

While pondering the topic for today, I had the thought occur about how frequently I post a script on my blog already anyway. An easy out for this topic would have been to re-share one of those old scripts. For instance, I could easily redo a recent article about server access that has a couple scripts demonstrated in it. Or I could go back a few years to my articles about foreign keys (here or here) and space use (here or here). Even more intriguing could be to re-envision some of my articles on Extended Events. But where would the fun in that be?

Rather than take the easy road and rehash something, I have something different. This one goes hand in hand with the numerous articles and scripts I have previously provided on auditing – yet it is different.

Not every shop can afford third party software or even Enterprise edition and so they have to come up with a different way to audit their database instances. One of the problems with a home grown solution is to ensure the data is not stored local to the server (lots of good reasons for that). Here is an example of what I did for one client that happened to have a developer that found a back door that was giving him SA access to the SQL Server Instance and was changing things and trying to cover his tracks – even after being warned.

First the query

This query will be run from a job on a different server that is restricted in access to just a select few people. I do rely on the use of the default trace in this query. I am also reliant upon a little bit of sneaky behavior. If I run this from a separate server, prying eyes are usually unlikely to find that it is running and thus makes it easier to catch them red-handed. In addition, if they discover via some sort of trace and by a lot of luck that it is running, then they have no access to the remote server to alter anything that was captured.

The query does go out to the default trace and pull back any changes to permissions or principals on the server in question. The captured data is then stored in a database that is also restricted to a select few people. Lastly, the captured data can be routinely queried, or automated reports can be created to send email notifications of changes encountered.

The second part of the trickery here is that I am using a linked server to perform the queries (a slight change and I could also do this via powershell which will be shown in a future article). The linked server query uses the openquery format and sends the default trace query to the remote server. Since I am running this from a job on an administrative server that pulls a limited data set, I am not overly concerned with the linked server setup here.

Storing It

Once I query the data, I need to put it somewhere on my administrative server. The table setup for that is very straight forward.

After creating this table, I am ready to store the data. All I need to do is throw the audit query into an agent job and schedule it to run on a regular schedule. For my purposes, I usually only run it once a day.

TSQL2sDay150x150The Wrap

This has been my diatribe about service and giving back to the community. When done properly, there is a natural born effect of enhancing one’s personal life equal in some way to the amount of effort given towards the community.

Oh, and if you are interested in some of my community contributions (which according to Jens Vestargaard is an awesome contribution), read this series I have published.

Incorrect Syntax – What?

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: November 17, 2017

Intuitive errors are very helpful when trying to troubleshoot a problem. On the other hand, when we receive an error that is vague, we sometimes find ourselves wanting to throw the laptop out the window. What do we do though when the error message really does look to be pretty intuitive but we can’t figure out where the problem is?

Today, I am going to cover one of these falsely intuitive errors. As it turns out, this is a problem that I had actually forgotten about since I had not seen it in many years. The error is shown here.

This error looks pretty straight forward, right? It tells me there is an incorrect syntax somewhere in my query. Running into a syntax error somewhere along the line does happen to all of us at one time or another. Because of that, I would not be terribly alarmed by the error and just proceed to fixing the syntax problem. Here is the query that is throwing that error.

Now is the time to start pulling out your hair. There is no syntax error in the query. Go ahead and look it over 10-15 times. I know I did.

Ok, if there is no syntax error, then what could possibly be the real problem? Is the database corrupt? Maybe a system table is corrupt? Grasping at straws here, but could it possibly even be some sort of royally screwed up permissions.

Everything seems to be checking out properly. There is no corruption whatsoever. Laptop is soon to be launched at this point right? Ok, maybe not launched because this is a simple query. But, had this been a production related query that was rather intense and complicated, there really may be something getting launched as the frustration mounts.

Simple Solution

Well, since the error does not fit the crime, maybe this is one of those really odd errors that truly means something else. In this case, if we dissect the query we will probably notice that there is a DMO being called here. If I query sys.stats directly there is no problem but joining to sys.dm_db_stats_properties I get the error. If I run the query from the master database, the query runs without error. In fact, running the query from within a connection to most databases succeeds.

Now we have two clues to the problem. It only fails in some databases and only fails when the DMO is used in the query. Let’s take a look at database properties and see what we can find. In this scenario I am finding this little nugget in the affected databases.

This database happens to be in SQL Server 2000 compatibility mode. I wonder why that may be and proceed to investigating as well as fixing it. Let’s see what happens after changing the compatibility mode to something more recent (and yes this is an OLD version of SQL Server and the whole thing should be updated anyway but that is a discussion for another time).

If we step through this whole script, we will see that this is a script that will reproduce the entire scenario from break to fix. First thing is to set the compatibility mode to 80 (SQL Server 2000), then run the query (results in the error). Next I change the compatibility mode to 100 (SQL Server 2008) and rerun the query (no error this time) and I get all of the stats with properties that I was hoping to get. Then I conclude with confirming that I am in the correct compatibility mode.

Conclusion

Errors will happen as we work with TSQL – that can be expected. Sometimes we type too fast or just miss something here or there. When we get a syntax error, it really sets our mind to focusing on it being a mistake that we made. In this case however, the error is thrown for a syntax problem because the DMO is not accessible in compatibility mode 80.

Due to the straight forward error – we may spend a little to much time looking at the wrong thing. Keep your mind open to the error being elsewhere than your code should you see a syntax error near “.” in your endeavors.

What’s that SSIS Password

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: October 30, 2017

A recurring theme over the past several weeks (there are always recurring themes it seems) has been an issue that relates to SSIS. Clients will call for help with an SSIS package that has started failing and they are stuck trying to figure it out. They are all stuck at the same spot in the process – trying to get the package open. As it turns out somebody has decided to password protect the package in each case and the client is unable to open the package because they don’t know the password.

Inevitably they are all stuck at the following screen:

Granted this screenshot is for the password prompt for a project password – it really is the same issue if it is a package password. This seems like pretty good security right? If nobody knows the password then nobody can alter the package, right? At least that seems to be the prevailing course of thought.

Of course this introduces various other issues, one of which I alluded to already – a failing package. What if the requirements change? What if the package needs to be migrated? What if the package needs to be documented from stem to stern? Even better -what if this type of security is not all it is purported to be?

SSIS Security

Microsoft has provided various different “protection” levels for affecting sensitive information within a package. These levels can be set from within the GUI or from the dtutil utility. Here are some of those settings.

Setting Description Value
ServerStorage Rely on SQL Server database roles for protection. Only valid if saved to msdb and not to the file system.
DontSaveSensitive Suppresses the values of sensitive properties in the package when the package is saved. 0
EncryptSensitiveWithUserKey Uses a key that is based on the current user profile to encrypt only the values of sensitive properties in the package. 1
EncryptSensitiveWithPassword Uses DPAPI to to encrypt sensitive values in the package based on a user generated password. 2
EncryptAllWithPassword Encrypt the entire password based on a user generated password. 3
EncryptAllWithUserKey Uses a key based on current user profile to encrypt the package. Only the user that encrypted the package can open or run the package. 4

Based on these descriptions, a look at the package contents (via notepad) would look something like this:

The package will have that ProtectionLevel value added to the xml of the package with the selected value based on the descriptions from the previous table. In the case of this image, it appears that the package in question has been set to “EncryptSensitiveWithPassword” which correlates to a value of 2. This is also what is causing that password prompt to be displayed when opening the package.

So What’s that SSIS Password?

First, a bit of a birdwalk. Some will say you don’t need that password. They say you can go ahead and just change the “ProtectionLevel” value in the project/package xml and all will be rosy good. That may or may not work. Try it at your own risk. That said, if your ProtectionLevel is at a value of 2, there is an alternative method you could retrieve that password. I am going to show you just how to do that.

This method is going to need to assume that the package is being executed via a SQL Agent job and that you also have access to view job properties on the server in question.

Let’s just jump straight to a script to help us with that.

Take note here that I am only querying the msdb database. There is nothing exceedingly top secret here – yet. Most DBAs should be extremely familiar with these tables and functions that I am using here.

What does this show me though? If I have a package that is being run via Agent Job in msdb, then the sensitive information needs to be decrypted somehow. So, in order to do that decryption the password needs to be passed to the package. As it turns out, the password will be stored in the msdb database following the “DECRYPT” switch for the dtutil utility. Since I happen to have a few of these packages already available, when I run this particular query, I will see something like the following in my results.

Now, let’s go ahead and take one of those packages and test this out. I will just take the password that was listed for one of the packages, find the path of that package and then try to open the package using the password I just uncovered. Let’s keep it simple and say I am trying to open the first package called “Sports Adobe Ticket Sales Data Upload”.

After finding the package in question, I enter the password – shown here.

After I click the “OK” button, I am greeted with the following.

Tada! I now have full access to the package fully decrypted. If I need to change the ProtectionLevel, I can easily do that now as well. From here, I am set to proceed with troubleshooting the failure or any of the other reasons I noted for needing to gain access to the package.

 

Conclusion

Occasionally it becomes necessary to access an SSIS package that may be encrypted with a password. This is easily accomplished (unfortunately) if the package happens to be run via a job in SQL Server. While it is possible to gain access to the package, it would be far better to not need to circumvent the security to gain access. Instead, it would seem a much better idea to properly document the necessary passwords etc in a common vault so the appropriate people could access the package in the event of an emergency.

Drop That Schema

Categories: News, Professional, Scripts, SSC
Comments: 1 Comment
Published on: October 27, 2017

An often under utilized or maybe even mis-utilized feature of SQL Server is a database object schema. In the event of the latter, there is an occasional requirement for change. When this need arises, it can turn into a bit of a problem. That is of course if we are not well prepared. In this article, I am going to explore one possible change – the dropping of a schema.

What is a Schema?

A schema is a distinct namespace to facilitate the separation, management, and ownership of database objects. It removes the tight coupling of database objects and owners to improve the security administration of database objects (Source: Technet).

Much like a skeleton is a distinct system that helps support the structural integrity of certain biological entities, a schema helps to support a distinct functioning and structure within a database. This is, of course, a very simplistic description and analogy, but it works.

Another way to look at a schema is almost like a blueprint of what has been or will be created within the database. Like many blueprints, a database schema can map out all sorts of “sub-systems” at our disposal within the database. Some of these may include functions, procedures, tables, views and so forth.

What happens when you try to take out the entire blueprint without understanding the relationship to all of the subsystems? In the case of the schema it just so happens that you will be presented with an error message informing you that you are trying to do something that is either ill-advised or not supported or both.

Here is an example of such an error.

Msg 3729, Level 16, State 1, Line 2
Cannot drop schema ‘Maintenance’ because it is being referenced by object ‘AutoStatsHistory’.

This error message proves to be helpful. I obviously have some objects bound to the schema that need to be blown away before I can drop the schema. If I only have a few objects, this may not be too terribly annoying to do one by one. But what if I have 100 or more objects? Now that becomes an entirely different story.

Drop that Schema!

I have run into this very issue where there are far too many objects in the schema to be able to drop one by one. Add to the problem that I am looking to do this via script. Due to the need to drop the schema and the (albeit self imposed) requirement of doing it via script, I came up with the following that will cover most cases that I have encountered.

And a sample of the output:

As you can see here in the output, I have set the script to generate a series of drop statements for each of the dependent objects within the schema. In addition, the drop statements are ordered to remove objects that may have dependencies on other objects first. A big key here is that this script does not drop the objects for you. Rather it just creates the scripts to do the work. You must review the output and then execute the scripts separately. That means you will be entirely responsible for the results.

Once all of the objects are out of the way, the last statement in the sequence is to finally drop the schema.

You may notice that there are a few things not included in this script. The most notable may be that the Service Broker related objects are not accounted for in this script. I leave that for a later revision.

Conclusion

Occasionally it becomes necessary to remove a schema from the database for one reason or another (e.g. somebody decided to do both Dev and Prod in the same database separated only by schemas). Dropping a schema can become a little bit of a pain without the right tools. This script will help get you on your way to a scriptable solution (to help with documentation of what was changed) and a little less pain while doing it.

Linked Servers and Stats

Linked Servers

A linked server is a fabulous feature in SQL Server to help accomplish various data tasks between local and remote servers. There is a time and a place for the use of linked servers. Unfortunately, some are of the belief that a linked server is the solution to all problems including the way to end world hunger.

You may recall that I have written about linked servers on a few occasions. Usually that is due to the relationship of a poorly performing process and the linked server. Here is a prime example here or here. On the other hand, here are some examples of using a linked server in a much better light here and here.

Using a linked server is entirely up to you. If you choose to use a linked server, you will need to understand there are risks involved – beyond just the documentation and performance nightmares it can cause. Today, I will share one of those fabulous gotchas – mostly because it has cropped up more than once with my clients.

Remote Stats

On more than one occasion I have had an emergency request because everything was broken. The everything in almost every incident is an SSIS package that is failing with error messages. The error message will typically have text similar to the following:

Could not locate statistics ‘_WA_Sys_00000015_346C780E’ in the system catalogs.

Due to the error, the package fails processing and grinds to a halt. When diving into the package it is discovered that the missing stats happen to be coming from a linked server query. This raises a big bright blaring alarm for me. Why is the SSIS package accessing the data via a linked server? This is rather counter-productive and definitely contrary to what is desired from a performance perspective.

You can certainly see why this is a sudden emergency right? Absolutely nothing is working anymore, right? Well, at least that is the way it seems for the person reporting the issue. Understanding that their realm of work has come to a halt is important in order to get to a resolution more quickly. Knowing that the point of failure is not as grande scale as claimed is critical because it can help you keep your cool while assessing the real problem.

Since the problem presents itself when accessing a specific object or even set of objects across a linked server, we have our scope of the issue scaled down quite a bit already. Now, what do we do from here? Obviously the statistic with that name is missing, right? Don’t be so hasty to assume it is actually missing. Let’s verify that the stat is indeed missing first. Even better – let’s just eliminate as many variables as possible. We can do this by querying the affected objects directly from SSMS.

Easy Troubleshooting for the DBA

For me, it is essential to eliminate variables when we run into a failure. In this case, I have just a few major variables that can potentially be causing this problem. Those variables are:

  1. SSIS
  2. Linked Server
  3. The query itself
  4. Security
  5. Bad Plan
  6. The stat is legitimately missing

I can easily assess the legitimacy of each variable through various quick tests. To eliminate or validate the “query” variable, I can issue a very simple query to retrieve data from the affected object. So let’s eliminate variables 1 and 3 in one fell swoop.

This query, in this scenario, results in the following:

This, while disappointing, is actually quite productive. This has eliminated two variables for me. I now know that SSIS is not related to the problem. I also know that query complexity is not related to the problem. There are still a few variables left to either validate or eliminate. Since I know the problem occurs when querying via linked server, let’s try querying the remote server direct (not via the linked server).

Well, while that does not entirely eliminate or validate any variables, it does tell me that the problem is still specifically related to the linked server. What if I try to use OPENQUERY instead of the traditional linked server query?

Wow, look at that? Ok, so this is a bit of trickery because I have told SQL Server to execute that query on the linked server as if it were a “local” query on that server. This does work without error and is definitely pushing the findings to be more conclusive that it is a linked server problem.

While the openquery version works, I do still need to eliminate some variables from the problem. One of the variables is security. Since the error mentions sp_table_statistics2_rowset, I googled about for that proc and found some mentions that maybe there are some column denies related to the stats in question that is giving the linked server some major fits. Or it could also be insufficient permissions to execute DBCC SHOW_Statistics. I highly doubt this to be an issue since the openquery version works while using the same pass through authentication of the linked server that the traditional linked server query would use.

In order to eliminate security as a potential cause, the test is simple (while it could be more complex, I went for the jugular to just eliminate the theory as quickly as possible) – I will add my account as the pass through account (which is a sysadmin on the remote server) and then query the linked server all over again. Suffice it to say, there was no change in the result – the error persisted.

This does not just yet eliminate the security variable because there could be a cached plan somewhere. So, just to be sure, I chose to flush the cache on both the linked server and the local server. Running the simple “Select *” query all over again yielded no difference in the query results. I can now safely say that the problem is not related to a cached plan nor is it related to the security. At this point, I set the linked server security back to the way it was. I have effectively eliminated all variables but 2 and 6 (linked server and the stat is missing).

Let’s eliminate the missing stat variable right now.

Undoubtedly you have noticed that I built a drop statement into the result set from this particular query. That aside, the green highlighted row is the very stat that was producing the error. This stat is most certainly available on the source server (linked server). This entirely eliminates the sixth variable because the stat is not missing.

This brings us to the final step – see what happens when we drop the stat and try the query again. I have the drop statement ready to roll, so let’s just run that and see what happens. My risk here is very small. This is an auto-generated stat and will be recreated if needed. After dropping the stat, I run the simple “Select *” query across the linked server again and guess what? It works perfectly fine.

In my opinion, we can just skip a few of these steps if the problem is on a system generated stat and just go straight to dropping the stat and trying again.

That said, the better solution in this case would be to do one of two things.

  1. Modify the Package to not use the linked servers in this way and create a data source connection for each of the different server connections.
  2. Modify the Package to use OPENQUERY style queries.

My preference between those two options would be to create specific data sources. It seems very silly to pull data across the linked server to then be consumed via SSIS package.

I have just illustrated what I would classify as a nuisance problem related to linked servers. Simplifying the approach as I described with the two potential solutions would alleviate this nuisance while also giving an inevitable performance boost.

Parting thought: One observation of the “fix” that involves dropping the stats is that on busy systems it is a very short lived kluge. I have seen the problem re-surface within five minutes on busy systems. The stats are not really a problem, it is just how the linked server connection interfaces with the stats.

Endpoint Owners – Back to Basics

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

While the challenge may have been for the year 2016 and I haven’t contributed in several months, it is still a worthwhile effort. In that vain, I am adding this article to that series.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Endpoints

You may have heard the term endpoints thrown around in technical discussion and wondered “what the heck is an endpoint?” Well, that is a good question. An endpoint in the simplest form is a connection or point of entry into SQL server.

Another way of thinking about an endpoint is to look at the ends of a line. Generally speaking, there is a stop point at each end of the line. At this stopping point, the line may connect to something else and therefore be a point of entry into that “something else”.

When we deal with SQL Server there is a handful of default endpoints and then a handful of other types of endpoints. To figure out what the default endpoints are, it is pretty easy. The following query will expose the default endpoints.

Executing that query will produce results similar to the following:

Some of those might look pretty straight forward and then you get to that VIA endpoint. Don’t worry about that endpoint. The VIA endpoint is on the deprecation list. All of these default endpoints will be owned by sa and don’t require you to do anything with them per se other than to understand they exist. You may have noticed that I filtered my results by looking only at endpoints with an id of less than 65536. Any endpoint id lower than this number is a default endpoint. All others are user defined endpoints.

Endpoint Owners

So far so good. This is pretty straight forward to this point. If you have implemented anything like mirroring, Availability Groups, or Service Broker, then you may be interested to know that you have also created additional endpoints in the Instance. When you create an endpoint, did you know that you become the owner of that endpoint by default? It is very similar to when you restore a database or create a database – the default owner of that database will be the person that restored/created it.

Do you know who owns your endpoints? Did you create all of the endpoints? Do you know if you have any additional endpoints beyond the default endpoints? If you cannot answer yes to all of these questions, then you will probably want to figure out what the endpoints are and who owns those endpoints. Let’s try that with a slight modification to the previous query.

This will yield results similar to the following:

I took the script a step further than necessary. I wanted to illustrate potential endpoint types that may not be in use. This script covers both in use endpoint types and those not used. In my results you may note that I have a DATABASE_MIRRORING endpoint. As reality would actually have it, it is an Availability Group endpoint but those are presented as DATABASE_MIRRORING endpoints.

Note that my mirroring endpoint (aka Hadr_endpoint) is owned by “YourDomain\DBAdmin”. What if the owner of that particular endpoint was no longer present in the organization and I wanted to change it to something more sustainable? Well, I could do the following:

In this case, the default endpoints are owned by sa and it does make enough sense to assign the owner to be sa for the mirroring endpoint. Take notice that the name of the endpoint is required in order to reassign the owner. The name of the endpoint follows the :: in the script. So, whatever your endpoint name happens to be, just place “Hadr_endpoint” that follows the :: in my script.

Recap

Endpoints are a fundamental piece of the puzzle with SQL Server.  Getting to know your endpoints and the owners of those endpoints is an essential component of knowing your environment. Who knows, it may come to pass that the owner of an endpoint may no longer exist in your environment or possibly lose permissions along the way. Knowing who owns the endpoint may just save three or four grey hairs when that day comes.

«page 1 of 8

Calendar
April 2019
M T W T F S S
« Mar    
1234567
891011121314
15161718192021
22232425262728
2930  

Welcome , today is Monday, April 22, 2019