Auditing when Database Files Change

As a Database Administrator, monitoring file growth should be part of your database audit. That means tracking when the log file grows and when the data file(s) grow. Many will argue that there is nothing worse than not knowing when a database-related file changed in size, except maybe when it grows rapidly and constantly until the disk is filled.

Beyond just trapping when these events occur, the DBA usually also wants to know what caused the file to grow or shrink. This is a problem that has plagued many a DBA over time. Sure, there are plenty of methods to assist in capturing such things. I even wrote about that quite some time ago here. In that article, I gave the tools to help track when a transaction log grew. Today, I am sharing a different method to trap the executing SQL that causes a file to grow. And unlike that previous method, this one will also help track when a file decreases in size.

I do want to point out that there is an alternate method to capture the shrink events if you wish to read more. I wrote previously about using the default trace to capture when a shrink occurs within the instance; you can read all about that over here. Despite the availability of that information within the default trace, I think the method I share here is lightweight enough that it won't hurt to have it running, at least occasionally.

Enter the Database Audit

If you have become accustomed to reading my articles, you will probably surmise that there is some degree of setup that needs to be done before I get to the meat of the matter. That holds true today. It just wouldn’t be right to share a method to do something without a harness to show how to get there. So, let’s get the setup out of the way.
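The setup script itself is not reproduced here. As a minimal sketch, all we really need is a small scratch database to beat up on; the file paths below are assumptions, and the tiny initial sizes are deliberate so growth happens quickly.

USE master;
GO
-- A small sandbox database to experiment against. File paths are assumptions;
-- growth settings are intentionally left at the defaults for this demo.
CREATE DATABASE Sandbox2
    ON PRIMARY ( NAME = N'Sandbox2',     FILENAME = N'C:\Data\Sandbox2.mdf',     SIZE = 2MB )
    LOG ON     ( NAME = N'Sandbox2_log', FILENAME = N'C:\Data\Sandbox2_log.ldf', SIZE = 1MB );
GO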

Since we want to create something that tracks when a file changes in size, we need a baseline of the file sizes for the newly created Sandbox2 database. Let's capture that with the following query. A small note on this query: I am including the tempdb-related file sizes. You don't need to include those, but they may be of interest for future testing.
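The baseline query is not shown in the original; a simple sketch against sys.master_files (size is stored in 8KB pages, hence the division by 128) would be:

SELECT  DB_NAME(mf.database_id) AS DBName,
        mf.name                 AS LogicalName,
        mf.type_desc            AS FileType,
        mf.size / 128.0         AS SizeMB
FROM sys.master_files mf
WHERE DB_NAME(mf.database_id) IN (N'Sandbox2', N'tempdb')
ORDER BY DBName, FileType;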

That is all the setup that is needed at this point. We are now ready to implement a tool that can help us monitor these changes. This tool is lightweight and pretty efficient at capturing the info that would be essential for the exceptional DBA to keep on top of the changes occurring in his/her environment. This tool comes to us in the form of an extended event. Let’s do a little checking into what is available before we dive into the actual XE session itself.

First, when you are curious whether there is an extended event that may be sufficient for your needs, you should check the event store to see what is there. We can do that via a query similar to the following.
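A sketch of that kind of metadata query, filtering the event store for anything with "file_size" in the name:

SELECT  p.name AS package_name,
        o.name AS event_name,
        o.description
FROM sys.dm_xe_objects o
INNER JOIN sys.dm_xe_packages p
        ON p.guid = o.package_guid
WHERE o.object_type = 'event'
  AND o.name LIKE '%file_size%'
ORDER BY o.name;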

This query may return something along the lines of what we see in the following image:

[Image: results listing the file-size-related extended events available in SQL Server 2008]

From that list, we can immediately see that we have some options to help us try and capture what is happening with our databases when our back is turned. From here, we can query to find what kind of data is captured with each of these events. To do that, we simply need to run a query such as the following:
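Something along these lines will list the data columns exposed by each candidate event:

SELECT  o.name  AS event_name,
        oc.name AS column_name,
        oc.type_name,
        oc.description
FROM sys.dm_xe_objects o
INNER JOIN sys.dm_xe_object_columns oc
        ON  oc.object_name = o.name
        AND oc.object_package_guid = o.package_guid
WHERE o.object_type = 'event'
  AND o.name LIKE '%file_size%'
  AND oc.column_type = 'data'
ORDER BY o.name, oc.name;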

Looking at the available data points in each of these events proves promising. It is indeed promising enough that we can now proceed with the creation of an XE session.
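The session definition is not reproduced verbatim here; the sketch below shows the general shape. Treat the event names as assumptions to verify against sys.dm_xe_objects on your build (they have shifted a bit between versions), make sure the target file path exists, and note that on SQL 2008 the file target is named package0.asynchronous_file_target rather than package0.event_file.

CREATE EVENT SESSION [FileSizeChanges] ON SERVER
ADD EVENT sqlserver.database_file_size_change
    ( ACTION ( sqlserver.sql_text, sqlserver.database_id, sqlserver.session_id ) ),
ADD EVENT sqlserver.databases_log_file_size_changed
    ( ACTION ( sqlserver.sql_text, sqlserver.database_id, sqlserver.session_id ) )
ADD TARGET package0.event_file            -- package0.asynchronous_file_target on SQL 2008
    ( SET filename = N'C:\Database\XE\FileSizeChanges.xel' ),   -- the folder must already exist
ADD TARGET package0.ring_buffer
WITH ( STARTUP_STATE = ON );
GO
ALTER EVENT SESSION [FileSizeChanges] ON SERVER STATE = START;
GO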

I decided to just focus on the two events that included file_size_changed in the name of the event. You can also see that I chose to send this to two targets. You would be fine to just send this to an asynchronous file target; just bear in mind that you do need to specify a path for the file target that actually exists, or an error will be thrown. I have also specified that this session will restart on server startup, and then I set the session to the started state.

Does it work?

That is a really good question. Now that we have a test database and a session to trap events, all we need is to figure out how to test it. We would want to do a few different things to test—like grow and shrink a file in some way. And that is exactly what we are going to do at this point.

First test, since we have a really small empty database, is to try and force the database to grow by inserting some data. We will do that with this next query.
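The original insert script is not shown; as a hedged stand-in, a throwaway table padded with wide rows will force both files to grow. The table name and row count here are arbitrary.

USE Sandbox2;
GO
-- Wide filler rows: roughly two rows per 8KB page, so this forces plenty of growth events.
CREATE TABLE dbo.GrowthTest
(
    ID     INT IDENTITY(1,1) NOT NULL,
    Filler CHAR(4000)        NOT NULL
);
GO
INSERT INTO dbo.GrowthTest (Filler)
SELECT TOP (20000) REPLICATE('x', 4000)
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2;
GO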

I didn't post the numbers earlier for the Sandbox2 database files, but they were small: a 2MB data file and a 1MB log file. Now, when I look at the results from that last query that included the file size information, I see a database that has grown a fair amount.

[Image: file size query results showing the new, larger sizes for the Sandbox2 files]

That should be enough of a change to have triggered something in our extended event session. Let’s take a look at the session. To do that, I am going to pull out a query to help parse the XML and see what has happened. Here is that query:
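A sketch of that parsing query follows. The size_change_kb field name is an assumption (field names vary between these events and between versions, so check them against the column query shown earlier), database_id may surface as a data field or as the captured action, and the wildcard path must match the session's file target. On SQL 2008 you would also pass the .xem metadata file instead of NULL for the second parameter.

SELECT  x.event_data.value('(event/@name)[1]', 'varchar(60)')                            AS event_name,
        x.event_data.value('(event/@timestamp)[1]', 'datetime2')                          AS event_time,
        x.event_data.value('(event/data[@name="database_id"]/value)[1]', 'int')           AS database_id,
        x.event_data.value('(event/data[@name="size_change_kb"]/value)[1]', 'bigint')     AS size_change_kb,
        x.event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)')  AS sql_text
FROM
(
    SELECT CONVERT(XML, t.event_data) AS event_data
    FROM sys.fn_xe_file_target_read_file(N'C:\Database\XE\FileSizeChanges*.xel', NULL, NULL, NULL) t
) x;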

Despite having two targets in my session, we will only cover the query that helps parse the data from the asynchronous file target. When I run that query against the files that exist for this session, I get a result set of 90 records on my system. That probably seems like a ton of results for such a small increase in the database. As it happens, I left the growth settings at the default increments (don't do that in a production system), which means I get a ton of growth events at very small sizes (1MB for the data file and 10% for the log file). Here is a sample of the output:

[Image: XE session output showing the captured data and log file growth events along with the sql_text that caused them]

You can see how this might be helpful when a process runs hog wild in the environment. What about going the other direction though? What if we need to know when the database, or a single file, is shrunk? Well, let's do that too. Let's try the following query.
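The shrink statements are not reproduced in the original; a sketch using the logical file names from the setup sketch above:

USE Sandbox2;
GO
-- Shrink the data file and then the log file (run more than once to generate multiple events).
DBCC SHRINKFILE (N'Sandbox2', 2);
DBCC SHRINKFILE (N'Sandbox2_log', 1);
GO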

Again, I check for a baseline on the files to see if the file sizes changed. In this case, you can run that final query and compare or just trust me on it. Having shrunk both files in the Sandbox2 database, let’s check the XE session data again:

[Image: XE session output showing the captured shrink events for both files]

Check that out! We have captured the shrink events too! There are multiple shrink events in this case only because I ran the shrink statements multiple times. This is excellent news for everybody who is trying to keep an eye on these database size changes. You can see in the XE session that I applied the sql_text() action; I did this so I can see which query caused the growth or shrink event to occur. Even better news is that this event session is perfect for those of you still on SQL 2008.

If you enjoyed this article, check out some of the follow-up articles: Data Growth Audits or Mysterious Growth.

Ghosts in your Database

Yes Virginia, there are ghosts in your database.  More specifically, there are ghosts in your SQL Server database.  They are not there to haunt you.  They are not there just for this holiday season (speaking of Halloween Month).

How can there be ghosts in the database?

Why would there be ghosts in the database?

Do they happen because somebody issued a KILL statement?

Let's address each of those in turn. A database ghost record is (in a very basic form) one that has just been deleted in an index on a table. Delete operations don't actually physically remove records from pages; they only mark them as having been deleted (ghosted). Now why is it done this way? The answer is largely performance based: ghosting allows delete operations to complete more quickly, and it allows the rollback of delete operations to process more quickly. The rollback is faster because all that needs to happen is to "flip the flag" marking the records as deleted/ghosted, instead of having to reinsert the deleted records. That may be a bit over-generalized, but I hope you get the gist. In short, records are marked as "ghosted" when a delete operation is performed; to roll back, you simply undo that mark.

Now, what about this KILL statement thing?  The kill statement is pure Halloween fun and does not create ghost records.

Ghost Hunting

Now that we have established the purpose of Ghosts in the database, how do you verify the existence of Ghosts?  In other words, what can we do to prove there really are spectral things in the database?  This is where the fun really begins.  First, we need to get out the equipment and tools (as any good ghost hunter would do) so we can capture these phantasms.  Let’s call the first tool the “trap”.  Here is what you will need for it.
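The trap script is not reproduced verbatim; the sketch below follows the description in the article. TheBlob is the padding column called out in the text, the other column names are my assumptions, and the TOP clause controls how big the trap gets. Loading a million 8KB rows will chew through disk and log space, so trim it down if you just want to see the mechanics.

USE master;
GO
CREATE DATABASE Halloween2018;   -- database name is an assumption
GO
USE Halloween2018;
GO
CREATE SCHEMA Halloween AUTHORIZATION dbo;
GO
CREATE TABLE Halloween.Ghosts
(
    GhostID      BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    GhostName    VARCHAR(128) NOT NULL,
    DateOfSlimer DATE         NOT NULL,     -- the "Slimer date" mentioned in the article
    TheBlob      CHAR(8000)   NOT NULL      -- remove this column to keep the database small
);
GO
-- A second index so the ghosts will show up on more than one index later.
CREATE NONCLUSTERED INDEX IX_Ghosts_DateOfSlimer
    ON Halloween.Ghosts (DateOfSlimer);
GO
-- Load one million records; cut the TOP value back to set a smaller trap.
INSERT INTO Halloween.Ghosts (GhostName, DateOfSlimer, TheBlob)
SELECT TOP (1000000)
       'Ghost_' + CONVERT(VARCHAR(10), ROW_NUMBER() OVER (ORDER BY (SELECT NULL))),
       '20181031',
       REPLICATE('B', 8000)
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2;
GO
-- The clerical error: 666 records end up with Oct. 30 instead of Halloween.
UPDATE TOP (666) Halloween.Ghosts
SET DateOfSlimer = '20181030';
GO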

This trap, er, database, can be a bit large. As currently configured, we will need about 16GB of disk space to support it. If that is too much, I recommend removing the last column, "TheBlob". As you can see, we are setting a rather large trap. The table we create (Halloween.Ghosts) will receive one million records. This is most probably overkill to catch these ghosts, so you can also cut back on the number of records to be affected.

Now, to make sure we have some data and that we can use the table, let’s just run a little test query.
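Any quick sanity check will do; something like this (column names come from the setup sketch above):

SELECT TOP (100) g.GhostID, g.GhostName, g.DateOfSlimer
FROM Halloween.Ghosts g
ORDER BY g.GhostID;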

Excellent, we have a good sample of data.

[Image: sample rows returned from the Halloween.Ghosts table]

At this point, it is important to note that we have done nothing that will cause database ghost records.  All that has been done is to set the framework so we can see the ghosts.  With the framework in place, let’s try to catch some ghosts.  To do so, we need to try to delete something.  Since we just happen to have had a clerical error in our database, we have 666 prime candidates to try and fix.  We happen to have several records that were supposed to be given a Slimer date of Halloween.  The clerk, being absent minded, thought that Halloween was supposed to be on Oct. 30.  Our business model dictates that the invalid records must be deleted first and then we can try to enter the replacement records.  So, let’s go ahead and try to remove those records.

Before we remove the records though, we need to discuss one important requirement for us to be able to see the ghosts. Let's call it spectral vision goggles. In the database realm, we call it a trace flag. In order to see the ghosts on the pages, we need to enable TF 661. We can do that with the following statement. There is a serious side effect to this method too: it alters the behavior of the Ecto Containment Unit, a.k.a. the automatic ghost cleanup process. If you enable this, you will need to disable it later and/or manually run a ghost cleanup.
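Turning on the goggles is a one-liner; trace flag 661 disables the automatic ghost cleanup task for the whole instance:

-- Spectral vision goggles: TF 661 disables the automatic ghost cleanup task (instance-wide).
DBCC TRACEON (661, -1);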

Now that we have the last piece of equipment in place, let’s go ahead and try to delete some records.
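The delete itself is simple; the column name comes from the setup sketch (in the original demo exactly 666 rows carried the bad date):

DELETE FROM Halloween.Ghosts
WHERE DateOfSlimer = '20181030';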

With all of those records deleted (all 666 of them), let’s see what we might have captured.  First, let’s take a look at some index stats.
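A sketch of that index stats query; the ghost counters are only populated when you ask for SAMPLED or DETAILED mode:

SELECT  OBJECT_NAME(ips.object_id) AS table_name,
        i.name                     AS index_name,
        ips.index_level,
        ips.ghost_record_count,
        ips.version_ghost_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Halloween.Ghosts'), NULL, NULL, 'DETAILED') ips
INNER JOIN sys.indexes i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id;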

If we look at the output of this query, we will see that we did indeed attempt to delete 666 records.  Those records will now display in the ghost_record_count column.  We will also see that, since we had two indexes on the table, there are 666 ghost records marked on each index.

[Image: index physical stats output showing 666 in the ghost_record_count column for each of the two indexes]

Very cool! We are definitely on track to capture those ghosts. We have a trail proving they exist in the database. Let's keep going and see where we can find them. You should note that there is an additional column in our result set that looks like it might be related to ghost records. We are going to leave the discovery of version_ghost_record_count as a homework experiment for you to perform; it is beyond the current scope of this article.
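The next piece of evidence comes out of the transaction log via the undocumented fn_dblog function. A sketch of that query, including the hex-to-integer page number conversion discussed below:

-- fn_dblog is undocumented; treat this as a diagnostic sketch only.
-- [Page ID] comes back as fileid:pageid in hex, e.g. 0001:0000007e.
SELECT  fd.[Current LSN],
        fd.Operation,
        fd.Context,
        fd.AllocUnitName,
        fd.[Page ID],
        CONVERT(INT, CONVERT(VARBINARY(4),
            RIGHT(fd.[Page ID], CHARINDEX(':', REVERSE(fd.[Page ID])) - 1), 2)) AS PageID
FROM fn_dblog(NULL, NULL) fd
WHERE fd.Context LIKE '%GHOST%'
   OR fd.Operation = 'LOP_DELETE_ROWS';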

Now this is getting exciting.  We have stronger evidence in the log showing that these ghosts are hanging around in the database.  Not only are they hanging around in the database, we can see which pages in the database on which they are trying to hide.

[Image: fn_dblog output showing the ghost-related log records and the pages on which they live]

This is really solid information! fn_dblog gives us just about everything we need in order to get those ghosts. It took a little bit of work since the log reports the page number in hex; converting that to an integer page number is essential for us to look at the page (besides, integer values are easier for most people to interpret). Now I can take the PageID reported by fn_dblog for any of those records and pass it into yet another undocumented command known as DBCC PAGE.

When looking to use DBCC PAGE, we can either look at the PFS page to see more pages that have ghost record counts, or we can take the results from the fn_dblog output and look at the contents of a specific page to catch those ghosts. We will take a quick look at the PFS page first and then look at an index page. In this database, the PFS page will show several other pages that have ghost records on them. Due to the size of the database (over 2 million pages), we only see index pages with ghost records in that result. If our database were smaller, we would quite possibly see data pages in the first PFS page of the database. Let's see a sample from the first PFS page in this database.
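The DBCC PAGE calls look something like the sketch below (the database name matches the earlier setup sketch). Trace flag 3604 routes the output to the client, and print option 3 gives the most detail. The same pattern works for the index page and the data page examined next.

DBCC TRACEON (3604);    -- send DBCC PAGE output to the client instead of the error log
GO
-- The first PFS page of the database: file 1, page 1.
DBCC PAGE ('Halloween2018', 1, 1, 3);
GO
-- Any page reported by fn_dblog, e.g. the index page 126 referenced below.
DBCC PAGE ('Halloween2018', 1, 126, 3);
GO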

[Image: DBCC PAGE output for the first PFS page, listing pages with ghost records (visible thanks to the trace flag)]

We can follow that link from this point to page 126, which happens to be an index page similar to the following. There are a couple of indicators that this is an index page. The first is that when we run DBCC PAGE with a format of 3, we see two result sets, and the second result set shows statistics and index information. The second indicator is in the image attached after the query. We will leave it as an exercise for you to find other ways to demonstrate that this is an index page.

[Image: DBCC PAGE output for index page 126, showing the ghost record count and index statistics]

That is great, but we have more ghosts to find.  Let’s look at a ghost on a data page.  Randomly picking a PageID from that list that was output from fn_dblog, let’s see what DBCC Page will provide to us.

[Image: DBCC PAGE output for a data page, showing the ghost record count on the page]

Conclusion

Well, isn’t that just cool!  We have trapped a bunch of ghosts and were even able to see them.  This has been a fantastic deep dive into the crypts of the database.  This is merely a scratch on the surface though.  We hope this will encourage you to explore a bit and at least try one of the homework assignments we left behind in this article.

With all of that, we have a bit of cleanup to do.  The cleanup comes in one of two methods.  Method one involves manual labor.  Method two involves our friendly little trace flag we already used.  Since most DBAs prefer the automated mechanisms over manual, let’s just discuss method two for now.  It is extremely effortless.
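Method two really is effortless; it amounts to turning the trace flag back off so the ghost cleanup task resumes its normal rounds:

-- Re-enable the Ecto Containment Unit (automatic ghost cleanup).
DBCC TRACEOFF (661, -1);
-- Optionally drop the demo database from the setup sketch when finished:
-- DROP DATABASE Halloween2018;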

That will put the system back to the way it was when we started (and of course we trust that nobody did this on their prod box).

This has been one of a few articles about ghosts in the database. You can check out some of the others here and here.

Azure Data Studio and XEvents

Azure Data Studio (ADS) is getting all sorts of love and attention these days. So much so that they have finally gotten around to adding Extended Events (XE) to the tool – sort of. Now we have the power to run traces on SQL Server via ADS.

The presence of XE in ADS comes via an extension and comes with a few other caveats. I will explore the extension for XE available in ADS in this article and discuss some of the caveats. As you read the article, it might be helpful to go ahead and download ADS if you do not already have it.

History

Roughly 10 years ago, Microsoft felt it necessary to introduce a cool tool called Extended Events. Soon after, they decided to deprecate the features called "Profiler" and "Trace". Unfortunately, the page with the deprecation announcement is no longer available, but some evidence of how long it has been deprecated is available here.

The deprecation announcement remains in effect (and online) for all versions since SQL Server 2012. It just may be difficult to find the 2012 announcement as we roll into newer releases of SQL Server.

Now, we have XE Profiler (or XEvent Profiler, depending on your release of SSMS; read more here). Profiler is deprecated, yet there is now some confusion in SSMS about what "Profiler" means, since the term is also used for the XE Profiler feature.

Now enter ADS. XE is not included with ADS by default. You have to install an extension to gain access to the feature. So, the first thing you will need to do is visit the extensions node and then search for “SQL Server Profiler”. I can hear you right now. It isn’t even using any part of the real feature name anymore – they are just calling it the same exact thing as the deprecated feature.

And yes, my heart breaks a little more every time I see “SQL Server Profiler”. We have been teaching Database Professionals for years to use Extended Events and not SQL Server Profiler. And they have been adopting that change in rather large numbers. This just seems like it will cause so much more confusion. Nevertheless, once you have selected the extension, look to the right hand side and you will see a screen similar to this.

After installing the extension and reloading ADS, we are ready to start using it. For the remainder of this article, I will just refer to it as the "XE extension".

Where did it go?

After the reload of ADS, finding the XE extension is not really that easy. If you read the page where you clicked install, there is some guidance on how to access it. In short, on a Windows machine, Alt-P will be your friend. Where you use that key combination matters, though. If you are in a script, for instance, and hit that key combo, no connection will be made to your server, even if your script is connected.

In addition to that not-so-obvious message, there is a more obvious message box that pops up in the bottom right corner letting you know a connection could not be established. Unfortunately, the problem can’t be resolved from this screen. Just close the tab and try again from the instance connection as shown here.

While a bit annoying, I can manage with that little caveat – just as long as I remember between uses what I did. After the XE extension is open, you should see a screen similar to the following.

In the preceding picture, I show three indicators of a connection being established. In the top right corner of the example, it is difficult to tell that there is a server name, but there is one. In my example, I am just using the shorthand notation to connect to my server, "." (a dot), which connects me to the localhost instance.

The top left indicator is a drop down list of all XE sessions I have on the server.

Beyond that, I don’t find it terribly useful. I can’t edit a session or script it from this tool yet.

Managing a session doesn’t appear to be possible at this point, so let’s try to create a new session and see what happens.

Sweet! The create session option does something useful: I can create a new session. Unfortunately, all I can do is use one of three templates, similar to the XE Profiler tool in SSMS. The full-featured XE GUI has had a much more complete list of templates and possibilities ever since SQL Server 2012. I documented that in this article.

Unfortunately, I have no use for the three default templates. So, for me, this tool drives me back to using TSQL to create my sessions if I want to use ADS. Here is the big takeaway from that statement: the use of TSQL has been the biggest detractor for most Data Professionals when it comes to XE. They want a full-featured GUI. So, if you are using ADS and must create an XE session, you will need to pull out your TSQL skills and probably pull down some of my helper scripts to get you going. The alternative is to use SSMS, where there is a full-featured GUI that is more powerful than you might think.

Conclusion

There surely will continue to be more development around this idea of an XE style profiler. More development generally means that the product will mature and get better over time. This article shows how there is more being added to the feature to try and give you better control over the tool. We love control so the addition of these options is actually a good thing. Is it enough to sway me away from using the already established, more mature, and high performing tools that have been there for several generations? Nope! I will continue to use TSQL and the GUI tools available for XE that predated the XEvent Profiler.

Some say that data professionals really want the "Profiler" tool. In my opinion, that is certainly not the majority, and calling XE by the name "Profiler" is going to cause confusion at the least. Some say that maybe this tool needs to integrate a way to shred XML faster. To that, I say there are methods already available for that, such as PowerShell, the live data viewer, the Target Data viewer, or even the tools I have provided in the 60 day series.

How Long is That Event Taking

Knowing just how long an event takes is a common requirement when troubleshooting. Sometimes, figuring out the unit of time is a bit troublesome. Is it milliseconds, seconds or microseconds?

It is really easy to mistakenly use milliseconds when microseconds is required. It is also really easy to forget that one event is measured in seconds while another might be measured in milliseconds. To add to the confusion, what if the unit of measure changes between one version of SQL Server and the next?

While one could think it should be easy and consistent to figure out time, sometimes it just takes a little more effort. The same unit of time just isn't applicable to every type of event. That is true not just in SQL Server but in life in general. You wouldn't want to use hours when timing muzzle velocity, but hours could be entirely applicable to a surgery or a training seminar.

Where does that leave us SQL Geeks when looking at timing of internal events inside of SQL Server? Well, we either need to do a little digging or we could simply read the rest of this article to find a simple script that can do the bulk of the work for us.

When dealing with events inside of SQL Server, the tool of choice to use is Extended Events (XE). If you are unfamiliar with XE, I really encourage you to take a look at these resources to become more familiar with the tool.

Time

None of us want to translate or interpret time incorrectly. Imagine the CIO looming over your shoulder asking how long before the database is back online. If you miscalculate the time and tell him 30 seconds when it really is 5 hours, the level of frustration and anger probably becomes exponentially worse.

While that may seem like a bit of an exaggeration, it’s not an experience any DBA likes to encounter. On a slightly smaller scale, when a developer asks how long a piece of code is running in production, you do want to be as accurate as possible. If the code takes 5 minutes to execute, the developer needs to know it was 5 minutes and not a miscalculated 3.9 seconds. 3.9 seconds may be entirely within the realm of acceptable for the project and thus be dismissed by the development team.

While trapping the precise time and calculating it may seem trivial, it is important (as previously mentioned) to use the correct time unit. There are many events within XE that provide a time-based measure, and the unit of measure differs across events, so a different calculation may be required depending on what you are troubleshooting at the moment.

Thankfully, there is a means to figure out whether the time measurement is in seconds, milliseconds, microseconds or something different. We just need to pull it all out of the metadata views. Here is how we get to that data.
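The original query is not reproduced here, but a sketch of the approach follows: pull the duration/time style data columns for every event, along with the column description (which normally spells out the unit), the channel and keyword metadata, and the SQL Server version. CHANNEL and KEYWORD come back as map keys that can be resolved further through sys.dm_xe_map_values if you want the friendly names.

SELECT  SERVERPROPERTY('ProductVersion') AS sql_version,
        o.name            AS event_name,
        oc.name           AS column_name,
        oc.type_name,
        oc.description    AS unit_description,   -- usually states seconds / milliseconds / microseconds
        ch.column_value   AS channel_map_key,    -- resolve via sys.dm_xe_map_values for the name
        kw.column_value   AS keyword_map_key
FROM sys.dm_xe_objects o
INNER JOIN sys.dm_xe_object_columns oc
        ON  oc.object_name = o.name
        AND oc.object_package_guid = o.package_guid
LEFT JOIN sys.dm_xe_object_columns ch
        ON  ch.object_name = o.name
        AND ch.object_package_guid = o.package_guid
        AND ch.name = 'CHANNEL'
LEFT JOIN sys.dm_xe_object_columns kw
        ON  kw.object_name = o.name
        AND kw.object_package_guid = o.package_guid
        AND kw.name = 'KEYWORD'
WHERE o.object_type = 'event'
  AND oc.column_type = 'data'
  AND (oc.name LIKE '%duration%' OR oc.name LIKE '%time%')
ORDER BY o.name, oc.name;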

Holy crap! That doesn't look easy. That is a ton more code than you might have expected. True. However, I like to have as much information as possible at my fingertips. In this case, I want to know the SQL Server version, channels, and search terms (keywords) related to the event.

Why have the extra data? When troubleshooting, it is nice to know if there are other events that might be related in nature that could shed a bit more light on the problem from a different angle. The use of keywords (think Google search) and channels is perfect for helping me find those other events.

Here is what a snippet of that data might look like.

With this snippet, I can see there are multiple different units of measure but there are also multiple different keywords. These keywords can be essential to the troubleshooting process.

You will also see from that image that some events don't define the time unit very clearly. In fact, it is just a null value for the description. That is a bit of a problem, but significantly less troublesome than not knowing the unit of measure for any of the events.

Conclusion

Figuring out the correct unit of time measurement can mean the difference between accurate troubleshooting or leaping down the wrong path. Figuring out the time units is made easier when you are able to query the metadata efficiently and have all of the pertinent details at your fingertips.

Extended Events is a powerful tool to help in troubleshooting and tuning your environment. I recommend investing a little time in reading the 60 day series about Extended Events. This is not a short series but is designed to provide an array of topics to help you learn the tool over time.

PowerShell ISE Crashes

Working with PowerShell brings a lot of advantages and power to help manage a server. The more current your PoSh version, the more efficiently you will be able to manage your server. Sometimes getting to the current PoSh versions comes with a little pain such as ISE crashes.

I recently had the misadventure of working through some ISE crashes after bringing some systems up to PoSh 5.1 that were on either PoSh 2.0 or 3.0. It's not a very fun situation to run a WMF update and then run into a crash of any type when testing whether it worked. Your first thought is that something terrible has happened.

As it stands, the problem is more of a nuisance than a critical failure. That said, it is enough of a problem that anyone who uses the ISE or .Net applications may experience a slight cardiac event.

Fonts

As you work to quickly recover from your missed heartbeat, you start digging through logs and then hitting good old trusty Google.

Diving through the logs, you might just happen across an error similar to the following:

Problem signature:
Problem Event Name: PowerShell
NameOfExe: PowerShell_ISE.exe
FileVersionOfSystemManagementAutomation: 6.1.7600.16385
InnermostExceptionType: System.Xml.XmlException
OutermostExceptionType: System.Reflection.TargetInvocation
DeepestPowerShellFrame: indows.PowerShell.GuiExe.Internal.GPowerShell.Main
DeepestFrame: indows.PowerShell.GuiExe.Internal.GPowerShell.Main
ThreadName: unknown

Maybe the first error you encounter might look like this one instead:

System.TypeInitializationException

“FileFormatException: No FontFamily element found in FontFamilyCollection
that matches current OS or greater: Win7SP1”.

Inner exception originates from: CompositeFontParser

Either way, the error shoots us back to the same fundamental problem: the ISE won't load, you get an error message, and you can't confirm that the WMF patch was applied properly.

As you work your fingers faster and faster through the pages on Google, you discover that this problem is caused more explicitly by a patch for the .NET Framework and not necessarily by the work to upgrade your PoSh version. It only waited to manifest itself until after the upgrade.

That's gravy and all, but how does one fix the problem? For me, the quickest and most reliable fix was to jump straight to the root of the problem: fonts. The ISE is a WPF application, and WPF requires a fallback font (if a character isn't present in your font set, then the app chooses a substitute from the fallback font, or something along those lines).

The fix is extremely simple and really underscores why this is merely a nuisance issue and not a critical problem; thus it shouldn't cause any sort of sinking internal feelings. There are a few plausible fixes floating around out there. I recommend just doing a manual font replacement. It takes all of three simple steps:

  1. Download GlobalUserInterface.CompositeFont
  2. XCOPY the font to %windir%\Microsoft.NET\Framework\v4.0.30319\WPF\Fonts
  3. XCOPY the font to %windir%\Microsoft.NET\Framework64\v4.0.30319\WPF\Fonts

After you have copied the font to those two directories, all that needs to be done is to launch the ISE. I ran into the same problem on three or four servers, and the fix took no more than 5 minutes on each of them.

Conclusion

I previously mentioned that I have been working more and more with PoSh to try and improve my skillset there. This is one of those very low-level trinkets that I ran into as I have been working to hone my skills in that tech. For other, possibly, interesting articles about my experiences with PowerShell, you can check out these articles.


Monitor Database Offline Events

The other day, I shared an article showing how to audit database offline events via the default trace. Today, I will show an easier method to both audit and monitor for offline events. What is the difference between auditing and monitoring? It largely depends on your implementation, but I generally consider an audit as something you do after the fact, while monitoring is a little more proactive.

Hopefully, a database being taken offline is a known event and not a surprise. Occasionally there are gremlins, in the form of users with too many permissions, that tend to do very strange things to databases and database servers.

Having read the previous article, you already know one method to try to find these database offline anomalies. That method may not be the sleekest solution, nor the most reliable, given that events can quickly roll out of your default trace files. The better, more reliable method is to use Extended Events (XE) to monitor explicitly for those types of events. If you are unfamiliar with XE, I really encourage you to take a look at these resources to become more familiar with the tool.

Monitor

Beyond the power of XE to better diagnose problems and trace events on your server, there is the ability to monitor for specific events as well. I won't go into details about how to monitor with XE until a later article, but suffice it to say I can monitor for events and immediately alert the necessary parties to take more immediate action. This is quite some power for a built-in tool, and it is better than event notifications or agent alerts when it comes to ease of use and reliability.

As I look to monitor for these odd unplanned database offline events, I have several events within XE that can provide the requisite information: sqlserver.object_altered, sqlserver.database_started, sqlserver.database_stopped, and sqlserver.errorlog_written. I can hear you asking already “Wait, this seems to be a bit like a drill sergeant – very overbearing!”

Yes, it is probably a bit excessive for this session. However, I prefer to be comprehensive, and the ability to link events together helps me better understand whether it is a single one-off or a bigger problem with the entire instance. Databases being stopped, started or set to offline should be rare events, really. With that rarity in mind, the session should be relatively quiet.
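A sketch of such a session is below. The four events are the ones named above; the session name, actions, and file path are my assumptions, so verify the event names against sys.dm_xe_objects on your build before relying on it.

CREATE EVENT SESSION [DatabaseStateChanges] ON SERVER
ADD EVENT sqlserver.object_altered
    ( ACTION ( sqlserver.sql_text, sqlserver.server_principal_name, sqlserver.session_id ) ),
ADD EVENT sqlserver.database_started
    ( ACTION ( sqlserver.sql_text, sqlserver.server_principal_name, sqlserver.session_id ) ),
ADD EVENT sqlserver.database_stopped
    ( ACTION ( sqlserver.sql_text, sqlserver.server_principal_name, sqlserver.session_id ) ),
ADD EVENT sqlserver.errorlog_written
    ( ACTION ( sqlserver.sql_text, sqlserver.server_principal_name, sqlserver.session_id ) )
ADD TARGET package0.event_file
    ( SET filename = N'C:\Database\XE\DatabaseStateChanges.xel' )   -- the folder must exist
WITH ( STARTUP_STATE = ON );
GO
ALTER EVENT SESSION [DatabaseStateChanges] ON SERVER STATE = START;
GO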

If I have that session running and then take a database offline or bring it back online, I will see something very similar to this output.

Starting from the bottom and working my way up, I can see that a command was issued to bring the 👻s database ONLINE. The very first thing that occurs is that the request is written to the error log. Then I see that the database is in the stopped state. Next comes a message that the database is starting up (because it was stopped). Then we see two events for object_altered (similar to the default trace) due to the begin and commit phases of that transaction.

After that database was brought ONLINE, you can see that I immediately took 👻s back offline – starting with the errorlog event, then a stopped event and the object_altered begin and commit events.

Capturing each of the events I noted previously not only gives me a complete picture of the event, it also helps me identify whether something happens in between the various "expected" events. If I use this session in my monitoring setup, I can be quickly alerted to problems with a database and also have an archive of the events so I can go back in time and audit or troubleshoot a database being taken offline or failing to come online.

With this XE Session running, I can be more confident that I have trapped and correlated the correct events in each of the sources. Using the default trace method, I have to make some highly likely correlations but there is still some “magic” involved. With the XE session, you will be far less likely to see any of those events roll out of the log as well. I can’t underscore the importance of that fact enough. The data will be there when you need it!

Conclusion

We all aspire to having a perfect database environment where nothing surprising or unexpected happens. Unfortunately, that is the desire of dreams and fairy tales. The unexpected will happen. A database can unexpectedly be taken offline. Are you prepared to address the problem fully to the CTO should it happen? This XE session can help you with that.

Extended Events is a powerful tool to help in troubleshooting and tuning your environment. I recommend investing a little time in reading the 60 day series about Extended Events. This is not a short series but is designed to provide an array of topics to help learn the tool over time. Don’t forget to go back and read the companion article showing how to audit these events via the default trace.

T-SQL Tuesday #108: New Horizons Beyond SQL Server


There comes a point in one’s career when a change is requisite. Big or small there always seems to be a tipping point that mandates some sort of change. Maybe the change is an entirely new career field. Maybe the change is adapting to the ever improving features of a specific software or product. Maybe, that change means learning a tangential technology.

This is precisely the goal Malathi Mahadevan (b | t) seems to have envisioned for the 108th installment of TSQL Tuesday.

If you are interested in reading the original invite, you can find that here.

"So the challenge for this T-SQL Tuesday is – pick one thing you want to learn that is not SQL Server. Write down ways and means to learn it and add it as another skill to your resume. If you are already learning it or know it – explain how you got there and how it has helped you. Your experience may help many others looking for guidance on this."

Personally, I am not one to settle, so learning and improving are important. New technologies, changes in technologies, new features, tangential technologies – they are ways to continue to learn and improve – most of the time. Sometimes, a new technology offers a good change of pace and offers an exit from something that is becoming too standard, while providing an entrance to something different, difficult, exciting and expanding.

Through the year (2018), I created a few goals for myself around some of these new or different technologies:

  1. Become proficient at MySQL (maybe even certify who knows)
  2. Become proficient at PowerShell
  3. Work towards the TCM (I got too busy with 1 and 2 to accord any time to this one)

Proficient is sort of a vague term because it can have a different meaning to different people. For me, I will describe what I have been doing to become proficient in both PoSh and MySQL.

PowerShell

A shout out is absolutely necessary for Adam Machanic (twitter) for picking the right blog meme that has been able to survive so long in the SQLFamily. This party has helped many people figure out fresh topics as well as enabled them to continue to learn.

I have dabbled in PowerShell over the years – just enough to “be dangerous” as some may say. I wouldn’t call it proficient and probably not even dangerous really. I was able to muddle my way through being able to create some basic scripts to perform certain tasks.

This rudimentary ability just isn’t enough to be able to put the skill on a resume (imho). It certainly wasn’t enough skill to be able to manage a large group of servers and perform various enterprise type tasks. My objective was to be able to create a set of tools for myself that I could use and have repeatable success at great ease.

I sat down with my first set of various tasks I wanted to be able to perform and worked on them as I set time aside to learn PoSh better – every day. It took several weeks and by no means am I on the same level as Rob Sewell (b | t) or Chrissy LeMaire (b | t) or Sean McCown (b | t). That said, I do feel I am far more capable in PoSh now than I was last year.

This is a skill that I plan to continue to hone. I am learning every time I pick it up and try to do something different. It is good that I am able to keep learning. Next year, I plan on being able to say I am more proficient than I am now. I am also hopeful to be able to be good enough to properly contribute to the dbaChecks project. I do also hope to share some of the scripts I have created on my blog as well.

Here are some of my first dabbles with PowerShell that I have integrated into other blog posts. Yes, they are very rudimentary.

MySQL

Learning MySQL is probably not too big of a stretch to be honest. I certainly think it is far more similar to SQL Server in many regards than PoSh. After all, it is just another little DBMS and does happen to be rather popular.

I wanted to pick up MySQL so I could support clients that have it installed here, there, and everywhere in their environments. Many clients have more than one DBMS platform, and it is good to understand and be able to administer multiple platforms with a high level of competence. Unfortunately, MySQL comes with a fair amount of gotchas. There are serious limitations depending on version and flavor, and some clients may be stuck on a rather old version of MariaDB (akin to SQL 2000). This means developing a broad set of scripts and skills quickly and on the fly.

I have a ways to go in my learning due to the varied flavors of MySQL but I am getting there. I do feel pretty comfortable hopping in and troubleshooting performance issues and doing a quick health assessment at this point. I would call that proficient. Similar to what I said about PoSh, I plan on being able to say next year that I am more proficient. More opportunity with this platform is what lends itself to better proficiency.

The Wrap

I believe in continuous integration / improvement when it comes to personal growth and development. It is necessary to keep your personal skills sharp as well as keep yourself marketable.

Oh, and if you are interested in some of my community contributions (which according to Jens Vestargaard is an awesome contribution), read this series I have published.

Audit Database Offline Events


Knowing when an event occurred within the database environment is a very important thing. Being able to act quickly when certain events occur is equally important. Sometimes, we may not find out about an event for a few days or weeks, and then we are asked to figure out the who, when, why and how of the event. Hopefully, there was enough foresight in place to plan for such a trip back in time.

Recently while networking at a large database related conference, I had the chance to discuss some of this with a semi-desperate DBA. In his environment, apparently a database can be taken offline at random times without the involvement of the DBA team. He was looking for a solution to prevent it from happening. For me, the simplest solution is to revoke permissions from the person doing it. If you can’t revoke the permissions, then at least capture the pertinent information to document who and when the databases are being taken offline.

Auditing and Facts

I am a huge fan of auditing the database environment to gather facts. There is comfort that can be gained in knowing that if a change happens to a database or server under your purview, that you can report some facts back up the chain when a post-mortem is required. I recently shared another example of such an auditing scenario – here.

There is a technique in that article that may be of interest if you are restricted in how you can implement this kind of audit. I won't go into detail about the poor man's audit techniques in this article, so it is worth reading that previous one.

When looking to find data about a database being taken offline, one could capture the data from the default trace or from the SQL Server Error Log. While this article will show how to capture this data from both of those sources, it is important to understand that the data just might not be available in either location depending on settings and activity of the server combined with how old the offline event was.

Let's first take a look at the default trace. That one should be easy enough.
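A sketch of that default trace query: find the default trace path, point fn_trace_gettable at the whole rollover set, and filter to the Object:Altered event class (164). Further filtering on the trace ObjectType column can narrow it to database-level alters.

DECLARE @path NVARCHAR(260);

SELECT @path = REVERSE(SUBSTRING(REVERSE([path]),
                    CHARINDEX(CHAR(92), REVERSE([path])), 260)) + N'log.trc'
FROM sys.traces
WHERE is_default = 1;

SELECT  gt.StartTime,
        gt.SPID,
        gt.LoginName,
        gt.DatabaseName,
        gt.ObjectName,
        te.name AS EventName
FROM sys.fn_trace_gettable(@path, DEFAULT) gt
INNER JOIN sys.trace_events te
        ON te.trace_event_id = gt.EventClass
WHERE gt.EventClass = 164     -- Object:Altered
ORDER BY gt.StartTime DESC;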

And a snippet of some possible output looks like the following:

If you look closely at the output, you will probably notice that I don't get a lot of information. I find out that there was an ALTER event to a database, but I do not know what the change was, nor do I see the text of the SQL statement that caused it. So, at best, I get a partial view of the picture by looking at just the default trace.

With that in mind, let’s look at the error log and see if that provides better information.
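A sketch of that error log scan: enumerate every available error log with xp_enumerrorlogs, then run xp_readerrorlog against each one with a search term of "offline".

DECLARE @logs    TABLE (ArchiveNo INT, LogDate DATETIME, LogSizeBytes BIGINT);
DECLARE @results TABLE (LogDate DATETIME, ProcessInfo NVARCHAR(64), LogText NVARCHAR(MAX));
DECLARE @lognum  INT;

INSERT INTO @logs (ArchiveNo, LogDate, LogSizeBytes)
EXEC sys.xp_enumerrorlogs;

DECLARE logcur CURSOR LOCAL FAST_FORWARD FOR
    SELECT ArchiveNo FROM @logs;

OPEN logcur;
FETCH NEXT FROM logcur INTO @lognum;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- 1 = SQL Server error log; N'offline' is the search term.
    INSERT INTO @results (LogDate, ProcessInfo, LogText)
    EXEC sys.xp_readerrorlog @lognum, 1, N'offline';

    FETCH NEXT FROM logcur INTO @lognum;
END
CLOSE logcur;
DEALLOCATE logcur;

SELECT LogDate, ProcessInfo, LogText
FROM @results
ORDER BY LogDate DESC;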

This is pretty cool here. I am checking all of the log files available on SQL Server and scanning for a search term of “offline.”

Ok, this shows a little bit of promise. I can see that the database was indeed set to offline (or "option OFFLINE" was set to ON) with a spid and a timestamp. Unfortunately, the data shown here is far from being super useful in a post-mortem. The error log does not show who took the database offline.

What if, there was a way to combine these two queries into a single result set and better correlate the results into something highly useful? Something like that seems like it could be possible given we have a spid and timestamp in each result set.

Let’s try it and see.
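A sketch of that combined version: read the error log for "offline" entries, reformat the spid (the error log stores it as "spidNN" in ProcessInfo), and join to the default trace on spid plus a small time window. Feed it from the full error log loop above if you need to reach further back in time.

DECLARE @path NVARCHAR(260);
SELECT @path = REVERSE(SUBSTRING(REVERSE([path]),
                    CHARINDEX(CHAR(92), REVERSE([path])), 260)) + N'log.trc'
FROM sys.traces
WHERE is_default = 1;

DECLARE @errlog TABLE (LogDate DATETIME, ProcessInfo NVARCHAR(64), LogText NVARCHAR(MAX));
INSERT INTO @errlog (LogDate, ProcessInfo, LogText)
EXEC sys.xp_readerrorlog 0, 1, N'offline';

SELECT  gt.StartTime,
        gt.SPID,
        gt.LoginName,
        gt.DatabaseName,
        el.LogText
FROM sys.fn_trace_gettable(@path, DEFAULT) gt
INNER JOIN @errlog el
        ON  el.ProcessInfo = N'spid' + CONVERT(NVARCHAR(12), gt.SPID)
        AND ABS(DATEDIFF(SECOND, el.LogDate, gt.StartTime)) <= 2
WHERE gt.EventClass = 164     -- Object:Altered
ORDER BY gt.StartTime DESC;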

If I use the timestamp and the spid from each result set, I can surely join the two results together. As you can see, that is exactly what I did; I had to format the spid and timestamp a little bit, but that is OK. With that little bit of magic, I now get a result set similar to the following.

Now, I can easily see who put the database into offline mode. I can also tell you what statement they executed to take it offline, and I know the timestamp when the database was put offline. This is usable data when it comes time for a post-mortem or a fact-finding request.

The Wrap

Is this the most elegant method to trap this kind of data? Absolutely not, but it is usable. When it comes to something more sleek and modern, I would recommend a better tool such as Extended Events. I will show how to do this same task with greater ease and reliability through the use of XE in my next article.

Oh, and if you are interested in any of my other articles about Auditing or Extended Events, I recommend you start here (for Auditing) or here (for XE).

Refresh SQL Modules

As is true in most facets of life, things tend to get stale and old. Sometimes this staleness can be visibly represented as the wrinkles on your face. Other times, as with SQL Server, it may mean that a stored procedure or view stops working.

Unfortunately, when it comes to SQL Server, the symptoms are not as readily visible as aging lines would be. In SQL Server, the symptoms may be as random and difficult to recognize as a stored procedure just suddenly stops working.

What makes this even more difficult is that the stored procedure (that is no longer working) may appear to be entirely unchanged. These failures can occur when other changes have occurred to the system as well as when no change has occurred. Imagine the joys you could experience while trying to troubleshoot this kind of problem.

If you were keen, you would probably implement any number of Extended Event Sessions to help troubleshoot the issue. If you are unfamiliar, you might want a refresher course on how to setup an XE session which you could read from one of these articles, here and here.

If you are curious, there are more XE articles on my blog – here.

Metadata

There are many possible causes for the metadata to become fouled up inside the SQL modules on the database server. Invariably, according to the interested parties, nothing has changed in months! Right or wrong, there is still a problem to fix. Fortunately, the problem is easy enough to fix.

Sure, all of the modules could be re-deployed from source control. But, given that the modules don't appear to have lost any of the code within them, re-deploying code seems to be a bit of overkill. We don't need to go that far. We can simply run sp_refreshsqlmodule for the stored modules (procs, triggers, functions, views, etc.). This would be pretty simple with the following script:
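A sketch of that script: build an EXEC sys.sp_refreshsqlmodule statement for every user module in the database. Note that schema-bound modules cannot be refreshed this way, so expect errors on those if you have any.

USE AdventureWorks2014;
GO
SELECT 'EXEC sys.sp_refreshsqlmodule '''
           + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name)
           + ''';' AS RefreshStmt
FROM sys.objects o
INNER JOIN sys.sql_modules sm
        ON sm.object_id = o.object_id
WHERE o.is_ms_shipped = 0
ORDER BY o.type_desc, o.name;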

In this script, I am just going to refresh all modules in the database (AdventureWorks2014 in this case). I could modify the script to trim it down to a specific module or set of modules. Instead, I leave that as homework for you.

That said, I do have a slightly different alternative that looks for any tables changed on a specific date. After finding those changed tables, I update all modules related to the changed table(s). Here is how that would look:
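A sketch of that alternative: find tables whose modify_date falls on the date of interest (the date below is a placeholder), then use sys.sql_expression_dependencies to build refresh statements for every module that references one of them.

USE AdventureWorks2014;
GO
DECLARE @ChangeDate DATE = '2018-11-13';   -- placeholder date of interest

SELECT DISTINCT
       'EXEC sys.sp_refreshsqlmodule '''
           + QUOTENAME(SCHEMA_NAME(m.schema_id)) + '.' + QUOTENAME(m.name)
           + ''';' AS RefreshStmt
FROM sys.tables t
INNER JOIN sys.sql_expression_dependencies sed
        ON sed.referenced_id = t.object_id
INNER JOIN sys.objects m
        ON m.object_id = sed.referencing_id
INNER JOIN sys.sql_modules sm
        ON sm.object_id = m.object_id
WHERE CONVERT(DATE, t.modify_date) = @ChangeDate;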

From here, suppose you want to check for a range of dates where tables were modified. Again, I will leave that as homework for you.

After executing these scripts, I will have an output similar to the following:

This leaves you with one more step to perform – copy the values from the RefreshStmt column to a new query window and execute the statements.

The Wrap

In this article I showed a very simple solution to a problem that plagues some environments – out of date metadata. Forcing a module refresh will often resolve these types of issues.

With such a beautifully simple solution, I could have grouped this article into my “Back to Basics” series, but I did not. That said, there are some pretty interesting articles in the series including (but not limited to): Backups, backup history and user logins. Check them out!

Cannot Use the Special Principal – Back to Basics

I recently had a client call me out of the blue because he happened to be getting an error while trying to add a user to a database role. The error he was getting was “Cannot use the special principal ‘dbo’.”

This error has probably cropped up on me more than a few times. And on more than a few occasions, I have forgotten about the previous experiences. Some of that is because the fix is rather easy and after a few times seeing it, muscle memory takes over and you just fix it without thinking about it too much.

Until you get to that muscle memory moment though, you may flounder a bit trying this and that and failing then proceeding on to a level of frustration that has you losing precious hair.

As luck would have it, this is an article about security and principals and is similar in nature to some other articles I wrote about some fundamental misconceptions about permissions here and here.

I do hope that there is something you will be able to learn from this basics article. If you are curious, there are more basics articles on my blog – here.

Meet Prince Apole and Rolle…

Adding a user to the db_datareader fixed database role is a pretty simple task. Most of us can likely do that in our sleep. Even using the GUI is usually pretty reliable for that. Every now and again though, somebody has decided to get tricky on us. Maybe a mistake was made somewhere in a setting on the server and nobody has caught it because nothing was "broken" – until it was.

In the aforementioned case, I was asked to help resolve the issue, and I found that there was a problem in how the database owner was set. Not only was it a problem in the current database but in 12 other databases on the same server. The systems admin was at wits' end. He was dealing with something that was just not in his knowledge base yet. I remember being in the same boat, so no big deal there. We talked about some of the things he had tried and how none of it was working. I am going to recreate the same basic scenario, along with some of the attempted fixes, in this article.

First, we need to create a database (best to break a database designed to be broken instead of an existing one used for something else already).
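A sketch of that setup (the domain account is the stand-in name used throughout the article and must exist, or be swapped for one that does):

USE master;
GO
-- No file specification, so the files land in the instance default directories.
CREATE DATABASE DummyDB;
GO
CREATE LOGIN [mydomain\svc_dummy] FROM WINDOWS;
GO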

That is pretty straightforward: the database will be created with the data files in the default directories on your SQL Server instance. In addition, a login called mydomain\svc_dummy will be created as a Windows login.

Now let’s try to set the owner of the database and then add the svc_dummy account to the datareader role.
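Something like the following, with the role addition written the way the GUI scripts it (you will see why that matters in a moment):

-- Make the service account the database owner...
ALTER AUTHORIZATION ON DATABASE::DummyDB TO [mydomain\svc_dummy];
GO
-- ...then attempt the role membership as the GUI scripts it.
USE DummyDB;
GO
ALTER ROLE db_datareader ADD MEMBER [dbo];
GO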

There is a point of interest here. I said I was going to add svc_dummy to the datareader role – not dbo. Well, I set the database owner in the preceding step to svc_dummy, so it basically became dbo. When I perform the role addition in the GUI and then script the change, this is the script that is produced. I will show why in a few moments.

The execution of the second part of the script results in the following:

Msg 15405, Level 16, State 1, Line 18

Cannot use the special principal ‘dbo’.

That is obviously not going to work. Let's try fixing the script by adding the svc_dummy principal instead of dbo.
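That attempt is simply:

USE DummyDB;
GO
ALTER ROLE db_datareader ADD MEMBER [mydomain\svc_dummy];
GO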

Unfortunately, this results in the following:

Msg 15151, Level 16, State 1, Line 22

Cannot add the principal ‘mydomain\svc_dummy’, because it does not exist or you do not have permission.

Well, maybe the problem is because the user doesn’t exist then? Let’s try to create the user and see what happens.
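The attempted user creation looks like this:

USE DummyDB;
GO
CREATE USER [mydomain\svc_dummy] FOR LOGIN [mydomain\svc_dummy];
GO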

Now we should see this message:

Msg 15063, Level 16, State 1, Line 32

The login already has an account under a different user name.

Oy vey. We seem to be going in circles. Nothing is working. The user is there but not really there. Let’s try to drop the user and just try to clean things up and start over.

I hope you see the problem with this one. Trying to drop dbo. I dunno but we should see an error here – and we do get an error.

Msg 15150, Level 16, State 1, Line 27

Cannot drop the user ‘dbo’.

Let’s fix the user then and try to drop the svc_dummy user instead of dbo.

Which in turn generates yet another error.

Msg 15151, Level 16, State 1, Line 52

Cannot drop the user ‘mydomain\svc_dummy’, because it does not exist or you do not have permission.

If I can’t resolve the problem by changing the user in the database, maybe I can just blow it out of the water by dropping the server login.

Yet another failure message will ensue. This time the message is:

Msg 15174, Level 16, State 1, Line 55

Login ‘mydomain\svc_dummy’ owns one or more database(s). Change the owner of the database(s) before dropping the login.

So far we have been able to skirt around the problem and generate six different error messages. The last one kind of gives us the best information on what we could do to resolve the issue. The login owns a database and therefore, we need to undo that ownership. Before we do that, let’s take a look at the database principal ‘dbo’.

We already know that svc_dummy is mapped to a user in the DummyDB database. We also know that we cannot add the svc_dummy user because of that prior mapping. We have also learned that when we script the permissions change from the GUI for the svc_dummy login, the generated script references the user 'dbo'. With that in mind, let's look at the dbo user in the sys.database_principals view and see what it tells us.
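A quick peek at the dbo principal, for example:

USE DummyDB;
GO
SELECT dp.name, dp.type_desc, dp.sid
FROM sys.database_principals dp
WHERE dp.name = N'dbo';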

See how the dbo database principal says it is mapped to a Windows account type? With this in mind, let's join to sys.server_principals and see which Windows account is mapped to the dbo database user.
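The join is straightforward since both views expose the SID:

USE DummyDB;
GO
SELECT  dp.name      AS database_user,
        dp.type_desc AS database_user_type,
        sp.name      AS mapped_login
FROM sys.database_principals dp
INNER JOIN sys.server_principals sp
        ON sp.sid = dp.sid
WHERE dp.name = N'dbo';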

Now we see a bit more clearly. Combined with the error messages and the principal information for both the login and the user, we have a better view of the puzzle. Changing the database owner indeed mapped the Windows account to dbo, and that mapping is now restricting certain activities when trying to manage permissions for the Windows login in the database. From here, we can easily fix the issue by changing the database owner, creating a user mapped to the Windows login, and then adding that principal to the datareader role.
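A sketch of that fix (handing ownership back to sa here; any appropriate login will do):

USE master;
GO
ALTER AUTHORIZATION ON DATABASE::DummyDB TO sa;
GO
USE DummyDB;
GO
CREATE USER [mydomain\svc_dummy] FOR LOGIN [mydomain\svc_dummy];
ALTER ROLE db_datareader ADD MEMBER [mydomain\svc_dummy];
GO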

And if we run that script for svc_dummy we will see a successful execution as shown here.

The Wrap

In this article I took a rather long route to a simple fix. It's easy to try each of the steps I showed in this article thinking it will help; it isn't illogical to try them. They just don't work, unfortunately. In the end, getting to know the settings in the database and what the errors are really trying to say is most helpful. Sometimes, it just takes a few more steps to get to the real meaning of the error.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.
