SQL Server Fixed Role Permissions

Roles and Permissions

Some of my recent articles have been focused on permissions and security. There is good reason for that – security is important and all too often it is mis-understood.

You can catch up with a couple of those articles here and here.

It is very important to understand who has what level of access within the server and databases on that server. Sometimes we see users being granted server or database access through the fixed roles available in SQL Server. How exactly do you know what permissions those individuals have via role membership? This article will help to reveal the permissions granted to the various roles and maybe a gotcha or two.

Finding Permissions

There is ample documentation on what the permissions are for each of the various fixed server and fixed database roles in SQL Server. Some of that documentation can be found here and here and here. With all of that documentation, you may be surprised to hear that it is not quite as easy to find the permissions of these roles via queries from within SQL Server – with a caveat. I am going to discuss some documented means to retrieve the permissions for the various roles and also discuss the pitfalls of these solutions.

What are the fixed roles again? Just in case you did not see them in the links from the previous paragraph, I will list the various fixed roles here.

Fixed Server Roles Fixed Database Roles
public public
sysadmin db_owner
securityadmin db_accessadmin
serveradmin db_securityadmin
setupadmin db_ddladmin
processadmin db_backupoperator
diskadmin db_datareader
dbcreator db_datawriter
bulkadmin db_denydatareader
db_denydatawriter

These default roles do have a unique set of permissions for each role. As noted, the permissions are documented well enough. Sometimes, it is preferable to just query the system to retrieve a list of the permissions for each role. This is especially true if one is in need of providing documentation for an auditor on who has what permission.

When trying to query for a list of permissions, one may feel as though they have fallen off their rocker just as the granny in this pic to the left.

Never fear, however, for there is a method to find the permissions of the fixed roles. Let’s take a look at what it takes to query the permissions associated to each fixed role.

System Queries

Unlike most principals, where one can query the various system catalogs to retrieve the permissions assigned to the principal, the fixed roles do not expose the assigned permissions in the same way. With fixed roles, there are two stored procedures that have been created to retrieve the permissions. These stored procedures are sp_dbfixedrolepermission and sp_srvrolepermission.

Immediately I have a “rocker moment” for each of these stored procedures. In the documentation there is a note that states the following.

This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.

When peaking into the secret sauce behind each of these stored procedures, there is nothing extraordinary to how the permissions are retrieved. In fact, both procedures reference a particular object in addition to the catalog view specific to the type of role (e.g server_principals or database_principals. This second object is called sys.role_permissions. Wait, I said there is no direct view or table to query with the permissions, right?

Let’s try to query that table.

Boom! We have just had another “rocker” moment. As it turns out, this table is the secret sauce to the fixed role permissions being accessible via query. This table can be queried from the stored procedures and can be queried direct – if it is queried from a DAC connection. Most will probably not connect to the DAC just to query the role permissions, so what can we do?

Here is a basic script showing what I have done.

In this script, I have taken the results from each of the stored procedures and dumped them into a temp table. Using this temp table, I can now join to this table to get a more complete list of the permissions in effect for various principals. Once that more complete list is made, then it can be handed to the auditors to satisfy them for at least a week before they ask again ;).

Now it is time for yet another “rocker” moment. Look carefully at the output from these stored procedures. Remember the notice that they will be removed (i.e. on the deprecation list)? It seems there is good reason to remove them from SQL Server – the permissions in sys.role_permissions is not being maintained. That is correct! There are permissions listed in the output of these procedures that are no longer applicable!

If the list is not entirely accurate, then what can be done to get an accurate list of permissions? As it turns out, it seems one may need to code a solution that has the permissions hard coded in the script – very similar to what these system stored procedures were doing.

Recap

Capturing fixed role permissions is possible through the use of two system stored procedures. Just like the red telephone booths, these stored procedures are soon to be a thing of the past. These stored procedures are deprecated and may be just as reliable as those old telephone booths.

Too bad there is not a better means to trap the permissions from these fixed roles. It would be really nice to be able to view them just the same as can be done with the other principals (users and logins).

Now go forth and Audit your roles.

PS

What is up with that weird granny pic? Well, it was a challenge from Grant Fritchey to use the image in a technical blog post. You can read the challenge invite over here. And yeah, I know it is some sort of Dr. Who thing.

SQL Server Permissions – Database Roles

securedb

EZ PZ Permission Squeezee

Given the critical level of importance related to permissions, one may think it is a concept that is well understood by all who are given the charge of protecting the data.

There is a great responsibility when the keys to the data kingdom are granted to a DBA. The DBA is responsible for ensuring the integrity and security of the data. To ensure the security of the data, the DBA has the responsibility of granting, revoking, or denying access, at various levels, to the data and objects within the database.

Despite this high visibility and critical nature of this concept, understanding permissions and assigning permissions does not seem to be as straight forward as it should be. Evidence of this is something I see far too frequently in the wild  as illustrated by the following image.

This screenshot is only a part of the problem that I wish to break down and discuss in this article.

SQL Server Permissions

A fundamental component of SQL Server is the security layer. A principle player in security in SQL Server comes via principals. In a previous article, I outlined the different flavors of principals while focusing primarily on the users and logins. You can brush up on that article here. While I touched lightly, in that article, on the concept of roles, I will expound on the roles a bit more here – but primarily in the scope of the effects on user permissions due to membership in various default roles.

Let’s reset back to the driving issue in the introduction. Frequently, I see what I would call a gross misunderstanding of permissions by way of how people assign permissions and role membership within SQL Server. The assignment of role membership does not stop with database roles. Rather it is usually combined with a mis-configuration of the server role memberships as well. This misunderstanding can really be broken down into one of the following errors:

  • The belief that a login cannot access a database unless added specifically to the database.
  • The belief that a login must be added to every database role.
  • The belief that a login must be added to the sysadmin role to access resources in a database.

The experienced professional will likely note that there is a direct conflict between a few of these beliefs. That said, all too often I see all three of these misconceptions implemented in every instance for nearly every user.

Let’s start looking at these misconceptions. To investigate these problems, I will create a login. After creating the login, I will add that login as a member to the sysadmin role. Once the login is added to the sysadmin role, I will then run some simple tests within my DBA database.

Sysadmin

The creation of a server principal (login) and adding the principal to the sysadmin role is fairly simple. The next couple of screenshots are followed by a quick script that will perform the same actions.

As was promised, here is the script that will do the same thing as illustrated in the GUI above.

With the user now in place, let’s test. The primary test here would be that a server principal cannot access the database since explicit database permissions have not been granted. Here is the first batch of tests that I will run.

The first statement is to allow me to impersonate the superuser login. From the impersonated connection, I first check to see I can query the sys.objects system catalog. Then I test the database_principals system catalog. Next in line is to check the list of permissions that have been granted to the superuser account. Each of these queries executes successfully without error. Here is a screen grab for these first three tests.

Notice the first two queries returned an empty set. This is not a failure, rather evidence that the select statement ran successfully. In the third result set, we can see that the superuser account has all sorts of server level permissions. In the result set there was not a single database level permission.

The last query that utilized sp_helprotect returned the following error:

Msg 15330, Level 11, State 1, Procedure sys.SP_HELPROTECT, Line 302
There are no matching rows on which to report.

This is confirmation that there is no database user called superuser.

So I can query a database without my server principal being given direct access to the database (it is worth reiterating here that this principal is in the sysadmin server role), but can I do other things such as create objects? Let’s test that with the following script.

This script is straight forward. All it does is check for a table. If that table exists, then drop it and recreate it. The last little bit will check to confirm the successful creation of the table. This script succeeds as illustrated in the following image.

That should be pretty convincing that if you add a server principal to the sysadmin server role then that user has access to the databases. These tests have illustrated that it is not necessary to add a server principal as a database principal when that server principal is in the sysadmin role (an erroneous configuration often seen). If the database principal is not necessary in this case, then what will happen if a database principal does exist?

Database Principal in Every Database Role

The next logical step in the sequence is to create a database principal for the already created superuser server principal. Once created, we will test to see what effects if any can be observed by having this database principal in every database role as well as the sysadmin role. This will help to test the first two bullet items from the list of common configurations I have seen in the wild. Let’s start with the script that will help create the principal to be used during the next iteration of tests.

The script basically creates a database principal and then adds that principal to each of the default fixed database roles available in SQL Server. Those roles are easily viewed in the first image in this article and are also listed here for ease of reference (intentionally ignoring the public role).

  • db_owner
  • db_accessadmin
  • db_securityadmin
  • db_ddladmin
  • db_backupoperator
  • db_datareader
  • db_datawriter
  • db_denydatareader
  • db_denydatawriter

The tests for this round of changes will be just like in the previous section. Here is the script to be used for the first batch of tests.

The major differences between this version of the test and the previous iteration of the test is that I have the table still in existence (I did not drop it but that will come shortly) and I have created a database principal so the first two queries will show a single row for each instead of an empty result set. The next significant difference is the last query that utilizes sp_helprotect. Instead of an error like the first time, this execution gives me the following results.

Next I will rerun the test to create an object with the following script:

This script will produce the same results as in the first example. The table, since it was already there, will be dropped and then recreated. After it is recreated, the validation script will find the table and return a single row.

This series of tests has just shown that a database principal tied to a login which is added to the sysadmin role and then added to all database roles will still be able to do pretty much everything a sysadmin can do. To this point, I have shown how bullet points one and three are not sound in reasoning. With that, there is still the test to prove (true or false) the final bullet point that a principal should be added to every database role.

Before testing the validity of the configuration, I want you to look closely at the list of database roles.

  • db_owner
  • db_accessadmin
  • db_securityadmin
  • db_ddladmin
  • db_backupoperator
  • db_datareader
  • db_datawriter
  • db_denydatareader
  • db_denydatawriter

Just reading the names of these roles should be a good indicator that adding a user to every database role is not a desired configuration. I am sure the question is bubbling up now in regards to some of those “but how is it working in the previous examples?”. The response to that is very easy: “The user is a sysadmin!”.

To test this last bit of the configuration, it is time to remove the server principal from the sysadmin role. I will do that via the following script.

With the user removed from the sysadmin role, it is time to repeat the tests.

The results are significantly different this time.

Msg 229, Level 14, State 5, Line 132
The SELECT permission was denied on the object ‘objects’, database ‘mssqlsystemresource’, schema ‘sys’.
Msg 229, Level 14, State 5, Line 136
The SELECT permission was denied on the object ‘database_principals’, database ‘mssqlsystemresource’, schema ‘sys’.

I just tried to query two objects and was refused due to a denial in each case. What if I try to query that table I created previously?

This will give me the following error.

Msg 229, Level 14, State 5, Line 141
The SELECT permission was denied on the object ‘MySuperTable’, database ‘DBA’, schema ‘dbo’.

Well that is not the desired effect or is it? Remember, I added the superuser principal to every role and that includes the “deny” roles.

How about if I try to drop and recreate that table?

Again, no dice!

Msg 229, Level 14, State 5, Line 157
The SELECT permission was denied on the object ‘objects’, database ‘mssqlsystemresource’, schema ‘sys’.

With this constant blocker of being denied cropping up, let’s take a look at relevant permissions related to the database roles to which I added the superuser principal. To evaluate those permissions, I am going to use the following script that relies on the results of a system stored procedure called sp_dbfixedrolepermission. (Word of advice, this procedure is deprecated and I will be writing more on that in a future article.)

Looking at the results of that script, one can start to see why we are having so many problems executing a select.

In this result set, one can see that by virtue of the db_owner role and the db_datareader and db_datawriter roles, we have permissions that are being GRANTed. Immediately after that, we hit the DENY. This behavior is to be expected. A DENY permission will override a GRANT except when the principal is a member of the sysadmin server role.

Seeing this in action, the next logical step is to remove the role membership from the two “deny” roles (in this scenario where all are checked) for starters. Once done, go ahead and re-run these tests, you will see that things will work differently. While you are at it, take it a step further and rerun the tests after removing superuser from the db_owner role. Think about it, do you really need to have db_owner along with all of the “grant” related roles given that db_owner gives all of those same roles?

One last bit of homework is for you to run the tests another time from the point of when the database principal was created. Only this last batch of tests, impersonate the database principal instead of the server principal. You can do that by doing the following:

Instead of using the following impersonation statement:

You may be interested by the results.

Recap

I have taken you on a journey through three common mis-configurations with principals and permissions. It really is quite unnecessary to add a principal to the sysadmin fixed server role as well as every fixed database role. Even if the principal is not a member of the sysadmin role, it is foolish to add it to every database role. There just is no good logic in setting up permissions in that manner.

Take a moment to focus and you can say EZ PZ permission squeezee.

Finding Deprecated Uses in SQL Server

 

sqlbasic_sarge

How well do you know your environment? You probably know all of the jobs that are running, the frequency that indexes need to be rebuilt, and even which users have which level of access to each object in the SQL Server instance. Do you know that your applications are accessing deprecated datatypes over 300 million times a week? What if your TSQL constructs are a bit archaic? Do you know that the code needs to be updated? Do you know how to find deprecated uses in SQL Server?

In this article, I will explore how to use Extended Events to track feature use and abuse. To be more precise, I will share how this tool can help you better understand all of the ways that your applications have been abusing your database by continuing to employ the use of deprecated features, syntax, or constructs in general. In case you are a bit behind in your exploration of XEvents, I have the perfect solution for you – my series on the topic that is continually growing. You can explore the full list of articles in the series by visiting the table of contents – here.

Audit Deprecated Uses

redxI would dare say that most data professionals think there is some use of deprecated constructs, datatypes or features within their environment. I would double down on that and say that most do not know just how bad it really may be. To find just how bad it really is, we need to audit for the use and abuse of these deprecation events.

Right here would be a good time to point out that your mileage may vary. Some items that are deprecated are more painful than others. Some may be deprecated and may have been on the list for 10+ years at this point. The point is, know your environment and then use good judgement to determine which items in your results need the most attention to fix and update. Why? Well, things really may break especially if you are looking to upgrade to a new version of SQL Server. Just because an item is still available in your current edition, that does not ensure it will still be available in a future release of SQL Server.

Now for the juicy stuff. As I mentioned, finding when and where a deprecated feature or syntax is employed, there are a couple of neat little events within Extended Events that can help to track each time a deprecated feature is accessed or employed. How do we do that? Use the deprecation_announcement and deprecation_final_support events. To help create sessions to track these events, I have the following script to create an “audit” session to do exactly that.

 

And just in case you are using 2008 or 2008R2, use this version instead.

Slight differences between these two sessions. First, in the 2008 version of the script, I rotten_orangehave to provide database ids instead of names. That is a shortcoming of 2008 and 2008R2 implementations of Extended Events. Additionally, the file target is different between the two (recall that they renamed the file target). And lastly, there are a few actions that I included in the 2012 version of the script that are not available in 2008 and R2.

With the session in place, I am now going to run through some sample scripts that will generate deprecation events to occur. I am sticking with my 2014 instance for this segment. That is important to note because different events may occur for different versions of SQL Server. Additionally, the parse script I will share will require a slight change for 2008 and r2 (again related to the file target name).

Now to take a peek at the data with this next script.

Now at long last, I can see what kind of data I am generating (they are really wide so I am just posting a snip).

deprecated_features_results

 

 

 

 

With this, I can see the feature_id along with the description and even the tsql that generated the event. What you don’t see in this is that I also trap the source machine and the user name. If there is an application name included in the connection string, I also trap that. These pieces of data can prove critical to efficiently troubleshooting and finding the source of these events.

From here, one might wish to explore all of the events generated from this session in order to ensure the environment is properly prepared for upgrade. Most tools do not evaluate the code thoroughly to trap all of these events. Instead they do a cursory look through stored procedures or at the data types. As we all should know, not every piece of SQL code is actually stored in the database or even is it cached at the time of analysis. This is the type of thing that requires a long running trace to prove that you are that rockstar DBA.

Conclusion

In the article today, I have shown how it is possible to see the deprecation alerts that may be generated in your environment. This data is what can help set you apart as a rockstar when it comes time for that migration. If you have yet to read my series on Extended Events, I highly recommend it. You can find that series here.

Database Dropped

What do you do when a developer comes to you and asks, “Where did the database go?  The database was there one minute, and the next it was not.”  Only one thing could be database droppedworse than the feeling of losing a database on your watch, and that would be losing a production database. It’s like magic—it’s there, and then it disappears. To compound the issue, when asking people if they know what might have happened, all will typically deny, deny, deny.

What do you do when you run into that missing database situation and the inevitable denial that will ensue?  This is when an audit can save the day.  Through an audit, you can discover who dropped the database and when it happened.  Then you have hard data to take back to the team to again ask what happened.  Taking the info from a previous article of mine, we can alter the script I published there and re-use it for our needs here.

This script will now query the default trace to determine when a database was dropped or created.  I am limiting this result set through the use of this filter: ObjectType = 16964.  In addition to that, I have also trimmed the result-set down to just look for drop or create events.

This is the type of information that is already available within the default trace.  What if you wished to not be entirely dependent on the default trace for that information?  As luck would have it, you don’t need to be solely dependent on the default trace.  Instead you can use the robust tool called extended events.  If you would like to be able to take advantage of Extended Events to track this information, I recommend you read my follow-up article here.

This has been a republication of my original content first posted here.

Events By Feature

sql_features_xeWithin the world of SQL Server there are a few things one can be certain of – things will change. This is true of the features in SQL Server. Additionally, Extended Events is constantly evolving which underscores this constant change.

With all of this change occurring within SQL Server, sometimes it is difficult to figure out how to troubleshoot or track issues that relate to new features. Within the need to figure out how to troubleshoot the new features, there is the need to understand what tools are out there to help troubleshoot.

It is no big secret that Profiler cannot help you trace any of the new features. To help perform the task of tracing events related to new features one must take advantage of the power of Extended Events. Even knowing that you can use XEvents, there is still a need to know what is related to the feature.

In this article, I will show how you can discover the events related to specific features. In addition, I will provide enough base information to help you perform other discovery type queries on your own as you continue your exploration of XEvents. In case you are a bit behind in your exploration of XEvents, I have the perfect solution for you – my series on the topic that is continually growing. You can explore the full list of articles in the series by visiting the table of contents – here.

Events and Features

More and more I am being asked how to track which events belong to which features. I have also been seeing more people ask for a way to list the new features supported by XEvents. Thankfully there is adequate information within the XEvent metadata to help retrieve this type of information. The downside is that there is a bit of homework that must be done across a few versions of SQL Server to help produce the desired information. I have done that work and pulled the information into the following query to help make it easier for anybody else wishing to dive in and compare features and events across versions of SQL Server.

In the preceding query, I have provided an aggregation of the events and features across multiple editions of SQL Server. This aggregate of data is necessary to help see how the features are changing and how the events are changing within the features that remain constant from one version to the next.

Once that data is aggregated, I then provide a couple of sample queries that show what can be done with the data. I need to stress here that I have not provided an exhaustive set of queries to explore this data seven ways to Sunday. I have left those types of exercises up to the reader to explore and experiment.

The first example query shows how to pull the new features that can be “Traced” in the current version of SQL Server on which the query is being executed. The second example provides a difference count between versions of SQL Server to help illustrate the evolution of XEvents within SQL Server. The third query is a simple query to list out the number of events for each feature in your version of SQL Server.

From here, one might wish to explore all of the events that are related to a specific feature. This would be easily accomplished by querying out the data from the #presel temp table based on the feature name. Orrrr…one could query the desired feature by following the guidelines in this article.

Conclusion

In the article today, I have shown how it is possible to see the new events and how they relate to the various features within SQL Server. Being able to correlate events that can trap information about new features can and will help you evolve into that rock-star DBA you are trying to become!

Finding the Right Path

xe_path1I have a fairly large backlog of articles that are in progress or that are planned for my series about Extended Events. Despite the backlog and planned articles, every now and then something else comes up that bumps things around. This article is an example of bumping the schedule around. You can see some of the backlog and the full list of articles in the series by visiting the table of contents – here.

I bring that up for a couple of reasons. First and foremost being that the topic was recently raised as a “need” by some colleagues. The second being that I see the need and how it was lacking in coverage by anything I had already written.

What is this gaping hole in the coverage of Extended Events? To be honest, it is not a very complicated topic or very difficult gap to fill. It’s just something that has been overlooked. The gap boils down to this: how does one consistently find the correct path to the Extended Event Log file (XEL file)?

Filling the Gap

The gap I will be working to fill in this article deals with consistently finding the file path for Extended Event (XE) sessions. This gap rises due a few different things such as the ability to define a target in different manners, being able to move the logs directory, or even the fact that a target may not be added to the session (let alone a file target). These causes can all contribute to a bit of frustration and may pose as different symptoms when trying to get the file path.

One additional complication is tied to the running state of a session. For that complication, I talked briefly about it in my article about better practices. The state of the session could have an impact and could cause frustration when trying to retrieve the file path. All of these things are considerations that must be made when trying to retrieve the file path.

To find the file path, let’s start with some basics. The default path for XEL files is in the log directory of the instance. In addition to this default behavior, each session stores metadata about running sessions as well as deployed sessions that is accessible from DMVs and system catalogs.

Log Path

The first basic to tackle is the log path for the instance. I can query for the log path of the error log for the instance and rely on that as the path of my xel files. Should I choose this method, then I could execute the following query.

Executing that query on my SQL 2014 instance produces the following results:

errorpath

Now the obvious problems with this method come from the fact that relying on this data is relying upon an assumption that you have set all of your XE Sessions to use the default log path. If you have declared your sessions to use a file target and did not specify a path, then the assumption is safe. However, if you are detail oriented, you probably have been somewhat explicit in how you define your event file target. This brings us to the next topic – defining the path.

Defining The Path

While a bit of bird-walk, it is necessary to cover this topic at this juncture. This brief discussion will help to understand some of the other issues with retrieving the path consistently.

When defining the path of the event path, there is a bit of flexibility in how one can define the file to be used. You can either declare the file as just the file name, or you can define the file as the folder path along with the file name. Both methods are completely legitimate. Unfortunately, this flexibility is what causes some of the pain with retrieving the file path consistently.

Let’s take a look at two quick, and acceptable, ways to add an event file to an XE Session. I will be re-using a session from a previous article for these examples.

This is a pretty standard format I use for creating my sessions. The section I want to highlight though is the set of the filename near the end of the script. I will generally define the entire path for my xel files when I create a session – just as I have done in this example. Defining this path helps me to know exactly where I am putting the session for starters. Equally as important is that this path is easier to retrieve from metadata because I have explicitly defined the path.

Take this next example of the same session but with one minor difference.

The minor difference in this example is just in how the filename was defined. It is technically accurate and acceptable to only use the filename instead of the path and filename as I did in the previous example. The problem here comes from the retrieval of the path from metadata.

Getting the Path

In the previous two examples, I showed two methods of defining the filename within a session. In the former example, this means I can more easily find the path. However, if I use the script in the beginning of the article to find my xel file paths, then I have made an inaccurate assumption. This leads to an inconsistency in how the file path is fetched. In the latter example, the assumptions concerning the default log path would be valid and could be applied here.

Since the assumptions do not prove to be consistent or accurate across the board, we need to both evaluate how the data for each would look and we would need to see how to retrieve this path more consistently. First up is looking at how the data for each of these example sessions would be stored in metadata.

In the preceding query, I am looking at the deployed session metadata for two sessions, both of which start with “AuditSelect”. In the sys.server_event_session_fields view, there is an attribute called name that contains the value “filename”. This value will only exist if the session has an event file target defined for the session. Running the query will yield the following results (on my system).

eventfile_meta

In this result set, I can see there is one of each of the two event file definition methods I described in the previous section. In the green highlight you will see that I have the entire filepath. In the red highlight, you will only see the filename. Nothing special has been done to this data in the return as you can verify from the posted script. These are the actual stored values. So this would indeed seem like we have a bit of a problem, right?

Well, this is where we get to be a little bit imaginative and employ a script such as the following:

Wow, that is considerably longer than the first example in the article. That is true! This script accounts for a few of the conditions that cause inconsistencies (but not yet all of them) in retrieving the file path for a deployed event session. The key here is to understand this will fetch the path whether you enter a complete path or not while creating the session. I also through a check in there to confirm that the session has a file target as well as a check to ensure the session is deployed to the server.

Looking at the output, I get results that are somewhat friendly and easy to understand without any of the guesswork.

xelpath_result

What if the session has never been started and did not have the full path declared? Well, that is one of the inconsistencies I am working on still.

There is an alternative to this method as well. For the time being, this method would also be recommended in the event the session being researched happens to be one of the system sessions that is “private”. This next code chunk will show two different methods to parse the file path from running session metadata in the DMVs.

Conclusion

In the article today, I have shown some of the internals to retrieving file paths for Extended Event Sessions. I dove into metadata to pull out the path for the session and discussed some concerns for some of these methods. In the end, you have a few viable options to help retrieve the file path in a more consistent fashion.

Audit All Select Statements

audit_selectLegislation and regulation sometimes dictates that certain activities must be tracked within a database. On occasion, it will be required that queries be audited and tracked to reach compliance with the legislation or regulation. To achieve this compliance, this article will demonstrate how to use Extended Events to audit statements being executed within the database.

Over the course of a few recent articles, I have included little hints here and there alluding to this article. In fact, now would be a good time to review one of these articles in preparation for what I will be sharing today. Take a moment to refresh your memory on this article about finding the right event – here.

Backstory

I have to be a little honest here. Prior to somebody asking how they could possibly achieve a statement audit via extended events, I had not considered it as a tool for the job. I would have relied on Audit (which is Extended Event related), or some home grown set of triggers. In this particular request, Audit was not fulfilling the want and custom triggers was not an option. Another option might have included the purchase of third party software but there are times when budget does not allow for nice expensive shiny software.

So, with a little prodding, I hopped into the metadata and poked around a bit to see what I could come up with to achieve this low-budget audit solution.

Low-Budget Audit

Using the handy scripts I have shown previously (and that I just linked to), I was able to explore the Extended Events metadata and find just what may work properly. The requirements in this case were that it needed to be done in XEvents and that it must capture SELECT statements. To find the events that seemed plausible for this task, I plugged the term “SELECT” into my queries as follows:

From this query, there was really only one event that made any sense for my task – “degree_of_parallelism”. There was another event returned in the result set, but it made no sense to me as a possible candidate event for auditing select statements (it was ucs_transmitter_reclassify). From the results, I then took the keyword associated to degree_of_parallelism to see what else might be pertinent. I plugged that keyword “execution” into the following query:

The results of the previous query will be quite a bit more substantial. Some make sense to include in the audit and some do not. I opted to not included any of the events to keep things as clean and simple as possible.

More Requirements

I now have the base events covered that I want to use for my audit. That said, my base requirements are just not extensive enough. When I audit something, I really want to know who did it, when it was done, where it originated and what was it that they did. The degree_of_parallelism event will capture all of the select statements but it does not meet these additional requirements.

In order to meet the additional requirements, I will attach a handful of actions to the degree_of_parallelism event. The addition of the actions will provide all of the data I want and need. Combine the event and actions together, I came up with the following session definition.

This is a very simple session to pull together. I have added a few things in on my predicate to help limit the scope of the session to the AdventureWorks2014 database and to make sure I am not trapping events related to the code completion tools shown. Notice here also that there is a potential chance to optimize this session. Can you find it? Hint: It may be in the predicate. Second hint: re-read this article about predicates.

Now the trick to what makes this work to audit only the select statements is right there in the predicate. I have instructed the session to ignore any statement_type that is not a value of 1. As it turns out, statement_type of 1 is a select statement. To see how these values map, here is a quick query and the correlated results.

dop_statementtype

Based on these values, should you want to audit a different query type, just change the predicate to use the map_key value that corresponds to the desired statement type.

Testing

Testing this session is rather simple. Here is a quick and dirty script to help test it.

Conclusion

Building low cost solutions is a common requirement for the data professional. When given the opportunity, try things out to see what you can build to provide the solution. In this case, I have demonstrated how Extended Events, however unlikely a candidate, can provide a working solution to help audit any select statements occurring within your database.

This has been another article in the 60 Days of XE series. If you have missed any of the articles, or just want a refresher, check out the TOC.

BIT9 PathNames

We live in a day and age when security (data, network, server, etc) is seemingly at the forefront of the daily news. In our quest to improve security we seemingly always find more and more products that are supposed to help with that security. Sometimes, those products require a data repository. As it happens, it is not uncommon for that repository to be in a database. Today, I am looking into a specific issue with the Bit9 product.

Why?

There are some very good reasons as a matter of fact. One really big reason is that I could find no decent information about this issue. The bigger reason is the seemingly indifferent level of response and the delays I saw in response from the Bit9 support channels.

In this specific case, the first response from their support channels took more than two days. Subsequent responses were more than a week later. For the client, this actually caused delays in a project they were working on. Was the issue significant? It was not an outage causing issue, but it was one that did seem to continue to grow and cause concerns with disk space.

The Issue

bit9It may be appropriate to discuss what Bit9 does prior to breaking into the details about the issue. If you have never had any experience with this product before, you may be running a product from a competitor such as Symantec. Bit9 is a an endpoint protection tool from Carbon Black. This helps to prevent against malware and endpoint attacks. In short it is a security based tool to protect your computing enterprise.

Go ahead and click the image to link to their site if you wish to learn more about how it works. For me, the nitty gritty specifics on how it works is a bit outside the scope of this article.

In short, Bit9 records a record of every filename and every filepath for every client machine. It stores these in a database in SQL Server. This shouldn’t be too much of an issue. For one instance serving about 300 client machines we saw about 1.2 million paths being stored. One another instance with far fewer clients and far fewer actual paths on those clients (physically checked) we had 114 million paths stored in the database. The difference in size was 20GB vs ~160GB. When the server is installed with all defaults, you can imagine how this looked for this instance – a very bloated OS volume.

Investigating this issue from a database perspective, I looked to find what was consuming so much space in the database. To do that, I ran my tablespace script and discovered the following:

bit9_tablespacehl

A significant standout there with the pathnames table. Looking at the definition of the table I see something far more interesting and disturbing all at once.

pathnames_table

I see a table with three columns and two large string fields. Each of these fields has a non-clustered index on it. This may or may not be such a big problem (other than the fact that the size of the non-clustered indexes on this table are much larger than the data) except that each of the string fields is an exact duplicate of the other. That’s right. Within this table, the data is duplicated into this second string field and each field has its own index. Not only does it appear that I have a ton of duplicated data, it appears I have entirely useless indexes (neither had been touched for a read since the server had been up)

I inquired about this to the folks at Bit9, both from a design perspective and from an archival perspective. Absolute silence on the design (expected). The inquiry about archival (or purge) of non-essential data did fetch a response – albeit a painfully slow response. The basic question is: “Is it safe to purge or archive old or unnecessary data.” The response we received was “run this and we will tell you what to do next.”

Great, they sent a script to help determine the state of data within the database. I am not posting their script here. Suffice it to say that the script they sent was not very pretty. They query about 20 tables, union the results from those tables, then perform a not in operation to see how many of the pathnames are invalid. No problem. Executing the script did reveal the following:

orphan_result

There seems to be the problem. 95.89% of the rows being stored in the pathnames table are orphaned records! This is a bit of a problem. The software does not appear to manage removal of invalid paths. From here, I knew what the course of action needed to be and acted on it. A big piece of the equation was provided from the results of the script. Another piece was provided in the makeup of the script. From these pieces of information, I created a purge script to help manage the orphan problem. I then put that script into an agent job and set it to run on a weekly basis.

If you find yourself in the same boat, here is the script needed to purge the data. If running this, I recommend disabling the two non-clustered indexes and then performing the delete (especially if you sit at 96% orphaned and over 100 million rows). After deleting the mass amount of orphans, go ahead and rebuild the indexes to stay in compliance with the software contract until Bit9 responds about the index requirement and the schema of the table.

Yep! It is just like that. The worst part of the whole thing is actually the delete. The delete operation will touch about 37 indexed views to update and delete. Be prepared for a long running process if you have a situation like I encountered.

For kicks, here is what that plan would look like in one of my favorite tools (SQL Sentry Plan Explorer).

bit9_deleteplan

What is your pain scale and story?

pain_scale

Easily Shred Event Data

shred_xeIt has been nearly a year since I started an extensive series about Extended Events. Previous to that start, I had already written a bunch of articles about Extended Events.

Among the batch of articles preceding the 60 day series one can find an article about shredding extended event data. Then as a part of the series (the first article in the 60 day series) one can find how to shred the action payload data attached to an extended event session.

You can read the article on shredding XEvent payload data here. Then you can follow that up by reading how to shred the Action data here. And once you are done with those and you really want to learn more about Extended Events, please read the whole series which can be found here.

All of that reading to catch up should keep you busy for a tad bit.

Upgrade

A year is a really long time to go without updating a script – according to some. Today, I have an update for both of the XML shredding scripts you just read about in the list of articles mentioned moments earlier in this post.

Before I dive into the script, let’s revisit a sample of the XML from an event session payload. Within an event session, you can see that there are nodes for both the data and the actions (highlighted in green and red). With all of this information found within the same XML for the session, it somewhat makes sense to try and parse all of the data at once.

action_xml

In addition to parsing all of the XML for the data and the actions at the same time, it seems to also make sense to generate the statements that would parse the XML within a single effort. As you would have noted, my previous scripts were just that – scripts. That implies executing a separate script for each the data and the actions. Maybe it would make more sense to execute a single script.

leaving_painIt is that notion of a single script that constitutes this upgraded version of the script.

One Script

In the following script, I have tried to accomplish just that – a single script to create the entire XML parser for me, for you, and for anybody wishing to use it. I don’t want to have to remember the subtle nuances of how to parse each of the events each time I need to parse the session data. I want something that is quick, easy, and repeatable.

With all of that said, here is the script that I now use to parse my session data. You should notice that it has been simplified and is more extensive now.

Not only does this script slice and dice for you…Wait that’s not quite right.

This script doesn’t just generate the XML shredding strings for the payload and action data. This script also will produce the rest of the tsql statements that should precede and conclude the XML parsing statements. Yes it uses dynamic sql. That is to your advantage in this case.

Could there be other enhancements? Absolutely! More will be coming.

Enjoy this script and happy parsing.

Extra Extra – Read All About It!

From the comments, you will see that Brent Ozar (blog | twitter) made an excellent suggestion. To be honest, I considered sharing some examples during the initial write of this article. I had opted against it then, not considering it in the same way that Brent puts it. So, in this addendum I have a quick example using an XEvent Session that I have not yet written about in all of my articles on Extended Events (pseudo spoiler alert).

Let’s use the AuditSelects XEvent Session I have deployed and running (remember a session can be deployed but may be in the “stopped” state). If I want to figure out what data has been captured for this session, I can take the session name and plug it into the script shown in this article.

parsexe_params

Ignore the green tsql comments for now (you can read them direct from the script). Highlighted in gold is the variable for the session name. By entering a valid session name for this variable, I can restrict the results to just the metadata for that specific session. In addition, note that I have highlighted, in a pale chartreuse, two variables that will give me unique column names for each of the events and actions within each session. This is important because some events have the same name for different attributes as other events. If events with the same attribute names are used in the same session, you will get a duplication of data. Maybe you want the duplicate data. Maybe you don’t.

The session name variable is used in each of the next three code segments. Each segment in the script will generate sql statements that will need to be copied into a new query window. I will leave the further breakdown of the script as an exercise for you. What I want to do here is show the results from executing this script.

When I execute the script, I will have four result sets returned to me (in grid mode). It will look something like shown in the following image.

parser_results

Breaking this down into three colors is probably easiest to see how things fit together. Highlighted in red, I have the build out of the pre-processing statements and the build of the “select” statement that will be returning our data to us. Then in blue comes the guts of the query – all of the beautiful XML parsing statements. The segments in blue correlates to the columns in the select list. And at the bottom of the image I have the finishing pieces that includes my predicates, from, and joins. Each of the highlighted segments will then be copied and pasted to a new query window and look something like the following.

parsed_gluedtogether

And there you have it. Sure there are still some manual steps in it, but you no longer need to memorize all of that glorious XML parsing syntax. You only need to copy and paste with this version of the script.

Enjoy your adventures in parsing XML!

Another Query to Help Find the Right Event

Finding the right event or combination of events to monitor may seem like quite a daunting task with so many events to explore and xvnt_crossing(frequently) too little familiarity with Extended Events. In this follow-up article to Extended Event Help Queries, I will share another means to quickly explore the Extended Events metadata in the effort to find the precise event to fit your needs.

Today I will be sharing a new query or two that I have used on more than one occasion to help track down the event(s) that I wanted to, at least, try while troubleshooting specific problems within the SQL Server Instance. I found these queries useful once again in a recent opportunity to help out some friends from the SQL Community. (Maybe, just maybe, these are really just a spin of other previously shared queries.) I will share the most recent experience in another follow-up post.

For the sake of posterity, I am also adding this to the 60 Days of Extended Events Table of Contents.

Is There an Event that Contains pertinent Data?

In my previous article I demonstrated how to find an event based solely on the name or description of the event. This is fantastic if the event name (or description) contains one of the magical words you have used. What if the event name or description has nothing to do with the terms you selected? Or, what if the data you seek may be attached to the event but wouldn’t necessarily stand out as a description for that event (by name or description details for that event)?

Now comes the more difficult task right? If the name or description of the event doesn’t relate to the search terms then you just might overlook a few events and be stuck trying to troubleshoot a problem. An equally big problem this could cause is yet another invisible barrier to using Extended Events. It would be easy to slide down the slippery slope and not transition to Extended Events just because an event, applicable to the problem at hand, could not be found.

This is where searching on the payload comes in handy. Remember from a previous article that the payload is the event data that is attached to each event. Imagine how much more accurate we can become by extending our search from the name and description of the event object to next search on the payload names and descriptions? That is exactly what I will now show.

Basically, I am searching against the object_columns metadata dmv for Extended Events just to find any columns that match my theoretical needs. In this case, I am trying to find anything that may help me find events related to the terms “select” or “statement”. When run, there will be quite a few results for these terms. That may actually be a bit of a problem (just the same as if nothing were returned in the results). There is a fix for that (well probably several fixes as you begin to tweak the query).

One possible fix is to also include map_values to refine the search. Another fix might be to change the search terms. And still other fixes might include tweaking the predicate. For today, I am going to demonstrate the inclusion of map_values as a secondary tweak to help isolate the needed event.