Join a List of Values

Comments: 1 Comment
Published on: July 2, 2019

There are multiple ways of retrieving data from a table when you have a list of values. Sometimes the method is straightforward, and sometimes we just can’t see the right solution.

Some of the methods to retrieve data from a table (based on a list of values) include one or more of the following: joining to another table, throwing an IN clause at it, dumping the values into a temp table, using subqueries, or using a table value constructor.

Let’s focus on the table value constructor. This one seems to trip a few people up from time to time. First off, what is it?

Table Value Constructor

The table value constructor is essentially a virtual table, not too different from a CTE or a subquery (in that they are all virtual tables of sorts). The table value constructor, however, can be combined with either of those other constructs and is a set of row expressions that get put into this virtual table within a single DML statement.

So, this virtual table contains a set of values that we define. Let’s go ahead and take a look at an example set of data that we can cultivate into a table value constructor.

Now, let’s say that I want to get all of the trace events that might correlate to those specific values. All I need to do is add this list of values to a query in the appropriate place. First, let’s see what it looks like when I add it via a CTE.
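Something along these lines would do the trick. The values, column names, and LIKE-based matching in this sketch are illustrative rather than the original query:

WITH SearchList (SearchTerm) AS
(
    SELECT SearchTerm
    FROM (VALUES ('Backup')
               , ('Restore')
               , ('Deadlock')
               , ('Recovery')
         ) AS vals (SearchTerm)
)
SELECT te.trace_event_id
     , te.name AS EventName
     , sl.SearchTerm
FROM sys.trace_events te
    INNER JOIN SearchList sl
        ON te.name LIKE '%' + sl.SearchTerm + '%';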

In this case, I have thrown the table value constructor into a CTE. There could be a few good reasons to do it this way such as it being easier to read or maybe some corporate style policy. For me, I like the readability and the tight grouping of the join clauses without needing to scroll. Running the query in this case would produce results similar to the following.

Now, what if I really prefer to not use a CTE? Easy enough, let’s look at this next example. I will use the same list of values as I search for various wait_types.
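A sketch of that inline version, reusing the same hypothetical list of values directly against sys.dm_os_wait_stats without the CTE:

SELECT ws.wait_type
     , ws.waiting_tasks_count
     , ws.wait_time_ms
FROM (VALUES ('Backup')
           , ('Restore')
           , ('Deadlock')
           , ('Recovery')
     ) AS sl (SearchTerm)
    INNER JOIN sys.dm_os_wait_stats ws
        ON ws.wait_type LIKE '%' + sl.SearchTerm + '%';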

Notice here that I have a little bit of a different query. I only have one table to join to after the table value constructor. That said, it works just the same as the CTE example. Once the data is constructed, I alias it like a table, then I define the column names. This query will now return results similar to the following.

Both of these examples have been pretty easy in building a single attribute array for the table value constructor. What if we had the need for multiple columns in our list of data?
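A hypothetical two-column version of the same idea (the added Category column is purely illustrative):

SELECT sl.SearchTerm
     , sl.Category
     , ws.wait_type
     , ws.wait_time_ms
FROM (VALUES ('Backup',   'Maintenance')
           , ('Restore',  'Maintenance')
           , ('Deadlock', 'Concurrency')
     ) AS sl (SearchTerm, Category)
    INNER JOIN sys.dm_os_wait_stats ws
        ON ws.wait_type LIKE '%' + sl.SearchTerm + '%';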

Notice the addition of an extra value (separated by a comma) within each set of parentheses and the additional column definition added after the alias for the table value constructor. This is a pretty simple modification. Upon execution, I can expect results such as the following.

The Wrap

Being able to take a list of data and construct it into a virtual table for further querying and data analysis is a pretty cool thing to do. The setup and syntax are fairly simple – it just takes a bit of practice and effort.

This article demonstrated three different examples of using a table value constructor to join a list of data to other tables. See what other kinds of examples you can come up with and share them.

This has been another post in the back to basics series. Other topics in the series include (but are not limited to): Backups, backup history and user logins.

Mass Backup All Sessions

Migrating Extended Event Sessions from one server to another should be a simple task. So simple, one would think there was no need to give it a second thought, right?

Well, I have previously written about this topic; you are welcome to read it here. The article discusses quite a bit about scripting out your XE Sessions. One thing lacking in that article is the ability to script out every session on a server.

If you are still not using Extended Events, I recommend checking out this library of articles that will cover just about all of the basics concerning Extended Events.

New and Improved

What about scripting out all of the sessions in SSMS? Surely there is an easy way to do that, right? Well, you might think that. Let me step through the problem that I have seen in SSMS (and unfortunately it is not consistent).

First, from Object Explorer Details (or F7), let’s try to script a single session.

When scripting a single session from the “Object Explorer Details”, I have several sub-menus that allow me to script the session to a “New Query Editor Window”. Now, let’s see what happens when trying to script multiple sessions.

With several sessions selected, I try yet again to script the sessions and I get an unwanted result. Notice that the “Script Session as” option is grayed out and unusable. However, if I try it again (several times or maybe just once – your mileage may vary, and it does not seem to be related to the version of SSMS), I may see something like this.

Tada! Luck was with me and it finally worked that time. So, what should I do to be able to consistently script all of the sessions? Well, that comes with an enhancement to the script I presented in the prior article here.

Let’s just dive straight into the new script.
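As a rough sketch of the general approach (a much-simplified approximation rather than the full script described below, which also restores the session WITH options and creates the event_file directory), something like the following builds a CREATE EVENT SESSION statement for every deployed session from the catalog views on SQL Server 2012 and later:

SELECT s.name AS session_name
     , 'CREATE EVENT SESSION ' + QUOTENAME(s.name) + ' ON SERVER'
       /* events, with their actions and predicates */
       + ISNULL(STUFF((SELECT ',' + CHAR(10) + 'ADD EVENT ' + e.package + '.' + e.name
                            + ISNULL(' (' + NULLIF(
                                  ISNULL('ACTION (' + STUFF((SELECT ', ' + a.package + '.' + a.name
                                                               FROM sys.server_event_session_actions a
                                                              WHERE a.event_session_id = e.event_session_id
                                                                AND a.event_id = e.event_id
                                                              ORDER BY a.name
                                                                FOR XML PATH('')), 1, 2, '') + ')', '')
                                + ISNULL(' WHERE ' + e.predicate, '')
                              , '') + ')', '')
                         FROM sys.server_event_session_events e
                        WHERE e.event_session_id = s.event_session_id
                        ORDER BY e.event_id
                          FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, ''), '')
       /* targets, with their SET options (filename, max_file_size, etc.) */
       + ISNULL(STUFF((SELECT ',' + CHAR(10) + 'ADD TARGET ' + t.package + '.' + t.name
                            + ISNULL(' (SET ' + STUFF((SELECT ', ' + f.name + ' = '
                                                            + CASE WHEN TRY_CONVERT(bigint, CONVERT(nvarchar(max), f.value)) IS NULL
                                                                   THEN 'N''' + CONVERT(nvarchar(max), f.value) + ''''
                                                                   ELSE CONVERT(nvarchar(max), f.value)
                                                              END
                                                         FROM sys.server_event_session_fields f
                                                        WHERE f.event_session_id = t.event_session_id
                                                          AND f.object_id = t.target_id
                                                          FOR XML PATH('')), 1, 2, '') + ')', '')
                         FROM sys.server_event_session_targets t
                        WHERE t.event_session_id = s.event_session_id
                          FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, ''), '')
       + ';' AS create_session_script
FROM sys.server_event_sessions s
ORDER BY s.name;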

This is a rather lengthy script, so I won’t explain the entire thing. That said, this script will produce the exact XE Session as it was written when you deployed it to the server. In addition, the script will ensure the destination directory for the event_file target is created as a part of the script.

I can definitely hear the gears of thought churning as you ponder this whole scenario. Surely, you have all of your XE Sessions stored in source control, so there is no need whatsoever for this little script. Then again, that would be in an ideal environment. Sadly, source control is seldom considered for XE Sessions. Thus, it is always good to have a backup plan.

Why

Sadly, I recently had the need to migrate a ton of sessions from one server to another, and the methods in SSMS just wouldn’t work. There was no source control in the environment. Building out this little script saved me tons of time in migrating all of the sessions for this server and also provided me with a good script to place in source control.

Conclusion

In the article today, I have provided an excellent tool for backing up all of your XE sessions on the server. This script will help create the necessary scripts for all of your XE Sessions (or even just a single session if you like) in order to migrate the sessions to a new server or place them in source control.

To read more about Extended Events, I recommend this series of articles.

Event Log File Paths

How does one consistently find the correct path to the Extended Event Log file (XEL file)?

This is a topic that I ventured into some time ago. The previous article can be read here. In that article I covered some of the various trouble spots with capturing the file path for various XE log files. One of the main problems is that there is frequently an inconsistency in where XE logs may actually be stored.

Using what was shown in that previous article, I have some improvements and minor tweaks to fill some gaps left in the previous script.

If you are still not using Extended Events, I recommend checking out this library of articles that will cover just about all of the basics concerning Extended Events.

New and Improved

First, let’s just dive straight into the new script.
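A simplified sketch of the approach follows, assuming SQL Server 2012 or later; the Phase1/Phase2/FailSafe resolution rules shown here are an approximation of the real script rather than the script itself:

SET NOCOUNT ON;

DECLARE @SessionName  sysname
      , @DeclaredFile nvarchar(4000)
      , @RunningFile  nvarchar(4000)
      , @LogDir       nvarchar(4000)
      , @ResolvedPath nvarchar(4000)
      , @Resolution   varchar(10);

DECLARE @Results TABLE (session_name sysname, xel_directory nvarchar(4000), resolution varchar(10));

/* FailSafe fallback: the directory that holds the instance error log */
SELECT @LogDir = LEFT(el.logpath, LEN(el.logpath) - CHARINDEX('\', REVERSE(el.logpath)))
FROM (SELECT CONVERT(nvarchar(4000), SERVERPROPERTY('ErrorLogFileName')) AS logpath) el;

DECLARE session_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT es.name, CONVERT(nvarchar(4000), esf.value)
    FROM sys.server_event_sessions es
        INNER JOIN sys.server_event_session_targets est
            ON est.event_session_id = es.event_session_id
        LEFT OUTER JOIN sys.server_event_session_fields esf
            ON esf.event_session_id = est.event_session_id
           AND esf.object_id = est.target_id
           AND esf.name = 'filename'
    WHERE est.name = 'event_file';

OPEN session_cursor;
FETCH NEXT FROM session_cursor INTO @SessionName, @DeclaredFile;

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @RunningFile = NULL, @ResolvedPath = NULL, @Resolution = NULL;

    IF @DeclaredFile LIKE '%\%'
    BEGIN
        /* Phase1: a full path was declared at creation time */
        SELECT @ResolvedPath = LEFT(@DeclaredFile, LEN(@DeclaredFile) - CHARINDEX('\', REVERSE(@DeclaredFile)))
             , @Resolution   = 'Phase1';
    END
    ELSE
    BEGIN
        /* Phase2: the session is running, so read the resolved file from the live target data */
        SELECT @RunningFile = CONVERT(xml, dxst.target_data).value('(EventFileTarget/File/@name)[1]', 'nvarchar(4000)')
        FROM sys.dm_xe_sessions dxs
            INNER JOIN sys.dm_xe_session_targets dxst
                ON dxst.event_session_address = dxs.address
        WHERE dxs.name = @SessionName
          AND dxst.target_name = 'event_file';

        IF @RunningFile LIKE '%\%'
            SELECT @ResolvedPath = LEFT(@RunningFile, LEN(@RunningFile) - CHARINDEX('\', REVERSE(@RunningFile)))
                 , @Resolution   = 'Phase2';
        ELSE   /* FailSafe: never started and no path declared (or anything else unexpected) */
            SELECT @ResolvedPath = @LogDir, @Resolution = 'FailSafe';
    END;

    INSERT INTO @Results (session_name, xel_directory, resolution)
    VALUES (@SessionName, @ResolvedPath, @Resolution);

    FETCH NEXT FROM session_cursor INTO @SessionName, @DeclaredFile;
END;

CLOSE session_cursor;
DEALLOCATE session_cursor;

SELECT session_name, xel_directory, resolution
FROM @Results
ORDER BY session_name;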

One of the things I wanted to accomplish with this update was to find the correct path for all of the sessions on the server. As mentioned in the previous article, sometimes there are complications with that. Due to the way log files can be specified for an XE session, behaviors can be a bit funky sometimes when trying to parse the correct paths. Due to those problems, I couldn’t quite short-cut the logic in the previous script and had to do the less desirable thing and create a cursor.

In addition to the cursor, I threw in a fix for when a full path is not declared for the session (at the time of creation) and the session was subsequently never started. In these odd cases, the script had been returning an empty result set and thus was not working properly. Now, it is fixed and here is an example of the output.

The third column in this result set is purely for informational purposes so I could determine at which point the file path was being derived. For the 30+ sessions running on my test instance, most paths are resolved via the first select. In the image, that is denoted by the label “Phase1” and circled in red. The system_health session happened to be running, but did not have a full path declared so it fell into the “Phase2” resolution group and is circled in blue. The last group includes those cases where a path could not be resolved for any number of reasons so they fall to the “FailSafe” grouping and an example is circled in green in the image.

Why

Truth be told, there is a method to shortcut this script and get the results faster, but I felt it would be less accurate. I could obviously just default to the “FailSafe” group automatically if a full path is not defined in the session creation. Would that be accurate though? Most of the time it would be, but then there are the edge cases where occasionally we forget that something has changed. One such case is when, after the session is created, you decide the SQL Server log files need to be moved from the default path (this is where the XEL files default to if no path is defined).

I have run across multiple scenarios where the logs were required (both technical as well as political) to be moved from the default location. Ideally, this move occurs prior to server startup. When the log file path is changed, the logs are not moved automatically to the new location. This, for me, is a case where it is best to be thorough rather than snake bit. I also like to document these things so I can compare them later if necessary.

Alternatively, here is the just good enough to pass muster version of that script.
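A minimal sketch of that shortcut version, which simply assumes the default log directory whenever a full path was not declared, might look like this:

SELECT es.name AS session_name
     , CASE WHEN CONVERT(nvarchar(4000), esf.value) LIKE '%\%'
            THEN LEFT(CONVERT(nvarchar(4000), esf.value)
                    , LEN(CONVERT(nvarchar(4000), esf.value))
                      - CHARINDEX('\', REVERSE(CONVERT(nvarchar(4000), esf.value))))
            ELSE LEFT(el.logpath, LEN(el.logpath) - CHARINDEX('\', REVERSE(el.logpath)))
       END AS xel_directory
FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_targets est
        ON est.event_session_id = es.event_session_id
    LEFT OUTER JOIN sys.server_event_session_fields esf
        ON esf.event_session_id = est.event_session_id
       AND esf.object_id = est.target_id
       AND esf.name = 'filename'
    CROSS JOIN (SELECT CONVERT(nvarchar(4000), SERVERPROPERTY('ErrorLogFileName')) AS logpath) el
WHERE est.name = 'event_file';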

 

Conclusion

In the article today, I have shown some of the internals to retrieving file paths for Extended Event Sessions. I dove into metadata to pull out the path for the session and discussed some concerns for some of these methods. In the end, you have a few viable options to help retrieve the file path in a more consistent fashion.

To read more about Extended Events, I recommend this series of articles.

Incorrect Syntax – What?

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: November 17, 2017

Intuitive errors are very helpful when trying to troubleshoot a problem. On the other hand, when we receive an error that is vague, we sometimes find ourselves wanting to throw the laptop out the window. What do we do though when the error message really does look to be pretty intuitive but we can’t figure out where the problem is?

Today, I am going to cover one of these falsely intuitive errors. As it turns out, this is a problem that I had actually forgotten about since I had not seen it in many years. The error is shown here.

This error looks pretty straightforward, right? It tells me there is an incorrect syntax somewhere in my query. Running into a syntax error somewhere along the line does happen to all of us at one time or another. Because of that, I would not be terribly alarmed by the error and just proceed to fixing the syntax problem. Here is the query that is throwing that error.
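A representative version of that query (not necessarily the exact original) simply joins sys.stats to the sys.dm_db_stats_properties DMO:

SELECT OBJECT_NAME(s.object_id) AS table_name
     , s.name AS stats_name
     , sp.last_updated
     , sp.rows
     , sp.rows_sampled
     , sp.modification_counter
FROM sys.stats s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY sp.last_updated DESC;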

Now is the time to start pulling out your hair. There is no syntax error in the query. Go ahead and look it over 10-15 times. I know I did.

Ok, if there is no syntax error, then what could possibly be the real problem? Is the database corrupt? Maybe a system table is corrupt? Grasping at straws here, but could it possibly even be some sort of royally screwed-up permissions?

Everything seems to be checking out properly. There is no corruption whatsoever. The laptop is soon to be launched at this point, right? Ok, maybe not launched, because this is a simple query. But had this been a production-related query that was rather intense and complicated, there really may be something getting launched as the frustration mounts.

Simple Solution

Well, since the error does not fit the crime, maybe this is one of those really odd errors that truly means something else. In this case, if we dissect the query we will probably notice that there is a DMO being called here. If I query sys.stats directly there is no problem but joining to sys.dm_db_stats_properties I get the error. If I run the query from the master database, the query runs without error. In fact, running the query from within a connection to most databases succeeds.

Now we have two clues to the problem. It only fails in some databases and only fails when the DMO is used in the query. Let’s take a look at database properties and see what we can find. In this scenario I am finding this little nugget in the affected databases.

This database happens to be in SQL Server 2000 compatibility mode. I wonder why that may be, and proceed to investigate as well as fix it. Let’s see what happens after changing the compatibility mode to something more recent (and yes, this is an OLD version of SQL Server and the whole thing should be updated anyway, but that is a discussion for another time).
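A sketch of that repro script, using a hypothetical database name, might look like the following:

/* hypothetical database name - substitute your own */
USE [SomeOldDB];
GO
ALTER DATABASE [SomeOldDB] SET COMPATIBILITY_LEVEL = 80;   /* SQL Server 2000 mode: breaks it */
GO
/* fails with: Incorrect syntax near '.' */
SELECT OBJECT_NAME(s.object_id) AS table_name, s.name AS stats_name, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp;
GO
ALTER DATABASE [SomeOldDB] SET COMPATIBILITY_LEVEL = 100;  /* SQL Server 2008 mode: fixes it */
GO
/* now succeeds and returns the stats with their properties */
SELECT OBJECT_NAME(s.object_id) AS table_name, s.name AS stats_name, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp;
GO
/* confirm the current compatibility mode */
SELECT name, compatibility_level FROM sys.databases WHERE name = DB_NAME();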

If we step through this whole script, we will see that this is a script that will reproduce the entire scenario from break to fix. First thing is to set the compatibility mode to 80 (SQL Server 2000), then run the query (results in the error). Next I change the compatibility mode to 100 (SQL Server 2008) and rerun the query (no error this time) and I get all of the stats with properties that I was hoping to get. Then I conclude with confirming that I am in the correct compatibility mode.

Conclusion

Errors will happen as we work with TSQL – that can be expected. Sometimes we type too fast or just miss something here or there. When we get a syntax error, it really sets our mind to focusing on it being a mistake that we made. In this case however, the error is thrown for a syntax problem because the DMO is not accessible in compatibility mode 80.

Due to the straightforward error – we may spend a little too much time looking at the wrong thing. Keep your mind open to the error being somewhere other than your code should you see a syntax error near “.” in your endeavors.

What’s that SSIS Password

Categories: News, Professional, Scripts, SSC
Comments: No Comments
Published on: October 30, 2017

A recurring theme over the past several weeks (there are always recurring themes it seems) has been an issue that relates to SSIS. Clients will call for help with an SSIS package that has started failing, and they are stuck trying to figure it out. They are all stuck at the same spot in the process – trying to get the package open. As it turns out, somebody decided to password-protect the package in each case, and the client is unable to open the package because they don’t know the password.

Inevitably they are all stuck at the following screen:

Granted this screenshot is for the password prompt for a project password – it really is the same issue if it is a package password. This seems like pretty good security right? If nobody knows the password then nobody can alter the package, right? At least that seems to be the prevailing course of thought.

Of course this introduces various other issues, one of which I alluded to already – a failing package. What if the requirements change? What if the package needs to be migrated? What if the package needs to be documented from stem to stern? Even better – what if this type of security is not all it is purported to be?

SSIS Security

Microsoft has provided various “protection” levels that affect how sensitive information within a package is handled. These levels can be set from within the GUI or from the dtutil utility. Here are some of those settings.

Setting (dtutil value) – Description

ServerStorage – Relies on SQL Server database roles for protection. Only valid if the package is saved to msdb and not to the file system.
DontSaveSensitive (0) – Suppresses the values of sensitive properties in the package when the package is saved.
EncryptSensitiveWithUserKey (1) – Uses a key that is based on the current user profile to encrypt only the values of sensitive properties in the package.
EncryptSensitiveWithPassword (2) – Uses DPAPI to encrypt sensitive values in the package based on a user generated password.
EncryptAllWithPassword (3) – Encrypts the entire package based on a user generated password.
EncryptAllWithUserKey (4) – Uses a key based on the current user profile to encrypt the package. Only the user that encrypted the package can open or run the package.

Based on these descriptions, the package contents (viewed via Notepad) would look something like this:

The package will have that ProtectionLevel value added to the xml of the package with the selected value based on the descriptions from the previous table. In the case of this image, it appears that the package in question has been set to “EncryptSensitiveWithPassword” which correlates to a value of 2. This is also what is causing that password prompt to be displayed when opening the package.

So What’s that SSIS Password?

First, a bit of a birdwalk. Some will say you don’t need that password. They say you can go ahead and just change the “ProtectionLevel” value in the project/package xml and all will be rosy good. That may or may not work. Try it at your own risk. That said, if your ProtectionLevel is at a value of 2, there is an alternative method by which you could retrieve that password. I am going to show you just how to do that.

This method assumes that the package is being executed via a SQL Agent job and that you also have access to view job properties on the server in question.

Let’s just jump straight to a script to help us with that.
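A sketch of such a query follows; it assumes the password is passed to dtexec as an unquoted token immediately following the /DECRYPT switch in the job step command:

SELECT j.name AS job_name
     , js.step_name
     , js.subsystem
     /* grab the token that immediately follows the /DECRYPT switch
        (assumes an unquoted, space-terminated password)            */
     , SUBSTRING(js.command
               , CHARINDEX('/DECRYPT', js.command) + LEN('/DECRYPT') + 1
               , CHARINDEX(' ', js.command + ' ', CHARINDEX('/DECRYPT', js.command) + LEN('/DECRYPT') + 1)
                   - (CHARINDEX('/DECRYPT', js.command) + LEN('/DECRYPT') + 1)) AS package_password
     , js.command
FROM msdb.dbo.sysjobs j
    INNER JOIN msdb.dbo.sysjobsteps js
        ON js.job_id = j.job_id
WHERE js.command LIKE '%/DECRYPT%';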

Take note here that I am only querying the msdb database. There is nothing exceedingly top secret here – yet. Most DBAs should be extremely familiar with these tables and functions that I am using here.

What does this show me though? If I have a package that is being run via Agent Job in msdb, then the sensitive information needs to be decrypted somehow. So, in order to do that decryption the password needs to be passed to the package. As it turns out, the password will be stored in the msdb database following the “DECRYPT” switch for the dtutil utility. Since I happen to have a few of these packages already available, when I run this particular query, I will see something like the following in my results.

Now, let’s go ahead and take one of those packages and test this out. I will just take the password that was listed for one of the packages, find the path of that package and then try to open the package using the password I just uncovered. Let’s keep it simple and say I am trying to open the first package called “Sports Adobe Ticket Sales Data Upload”.

After finding the package in question, I enter the password – shown here.

After I click the “OK” button, I am greeted with the following.

Tada! I now have full access to the package fully decrypted. If I need to change the ProtectionLevel, I can easily do that now as well. From here, I am set to proceed with troubleshooting the failure or any of the other reasons I noted for needing to gain access to the package.

 

Conclusion

Occasionally it becomes necessary to access an SSIS package that may be encrypted with a password. This is easily accomplished (unfortunately) if the package happens to be run via a job in SQL Server. While it is possible to gain access to the package, it would be far better to not need to circumvent the security to gain access. Instead, it would seem a much better idea to properly document the necessary passwords etc in a common vault so the appropriate people could access the package in the event of an emergency.

Drop That Schema

Categories: News, Professional, Scripts, SSC
Comments: 1 Comment
Published on: October 27, 2017

An often underutilized, or maybe even mis-utilized, feature of SQL Server is a database object schema. In the event of the latter, there is an occasional requirement for change. When this need arises, it can turn into a bit of a problem. That is, of course, if we are not well prepared. In this article, I am going to explore one possible change – the dropping of a schema.

What is a Schema?

A schema is a distinct namespace to facilitate the separation, management, and ownership of database objects. It removes the tight coupling of database objects and owners to improve the security administration of database objects (Source: Technet).

Much like a skeleton is a distinct system that helps support the structural integrity of certain biological entities, a schema helps to support a distinct functioning and structure within a database. This is, of course, a very simplistic description and analogy, but it works.

Another way to look at a schema is almost like a blueprint of what has been or will be created within the database. Like many blueprints, a database schema can map out all sorts of “sub-systems” at our disposal within the database. Some of these may include functions, procedures, tables, views and so forth.

What happens when you try to take out the entire blueprint without understanding the relationship to all of the subsystems? In the case of the schema it just so happens that you will be presented with an error message informing you that you are trying to do something that is either ill-advised or not supported or both.

Here is an example of such an error.

Msg 3729, Level 16, State 1, Line 2
Cannot drop schema ‘Maintenance’ because it is being referenced by object ‘AutoStatsHistory’.

This error message proves to be helpful. I obviously have some objects bound to the schema that need to be blown away before I can drop the schema. If I only have a few objects, this may not be too terribly annoying to do one by one. But what if I have 100 or more objects? Now that becomes an entirely different story.

Drop that Schema!

I have run into this very issue where there are far too many objects in the schema to be able to drop one by one. Add to the problem that I am looking to do this via script. Due to the need to drop the schema and the (albeit self-imposed) requirement of doing it via script, I came up with the following, which will cover most cases that I have encountered.
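A simplified sketch of that kind of script follows; it generates (but does not execute) the drop statements for the common object types in dependency order and, like the original, leaves Service Broker objects unaccounted for:

DECLARE @SchemaName sysname = N'Maintenance';   /* the schema from the error example above */

SELECT d.drop_order, d.drop_statement
FROM (
        /* foreign keys that reference tables in the schema must go before the tables */
        SELECT 1 AS drop_order
             , 'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.'
                 + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
                 + ' DROP CONSTRAINT ' + QUOTENAME(fk.name) + ';' AS drop_statement
        FROM sys.foreign_keys fk
        WHERE fk.schema_id = SCHEMA_ID(@SchemaName)
           OR fk.referenced_object_id IN (SELECT o.object_id FROM sys.objects o
                                          WHERE o.schema_id = SCHEMA_ID(@SchemaName))
        UNION ALL
        /* views, procedures, functions and synonyms before the tables they may depend on */
        SELECT 2
             , 'DROP ' + CASE o.type WHEN 'V'  THEN 'VIEW '
                                     WHEN 'P'  THEN 'PROCEDURE '
                                     WHEN 'SN' THEN 'SYNONYM '
                                     ELSE 'FUNCTION ' END
                 + QUOTENAME(@SchemaName) + '.' + QUOTENAME(o.name) + ';'
        FROM sys.objects o
        WHERE o.schema_id = SCHEMA_ID(@SchemaName)
          AND o.type IN ('V', 'P', 'SN', 'FN', 'IF', 'TF')
        UNION ALL
        /* then the tables */
        SELECT 3, 'DROP TABLE ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(t.name) + ';'
        FROM sys.tables t
        WHERE t.schema_id = SCHEMA_ID(@SchemaName)
        UNION ALL
        /* and finally the schema itself */
        SELECT 4, 'DROP SCHEMA ' + QUOTENAME(@SchemaName) + ';'
     ) d
ORDER BY d.drop_order, d.drop_statement;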

And a sample of the output:

As you can see here in the output, I have set the script to generate a series of drop statements for each of the dependent objects within the schema. In addition, the drop statements are ordered to remove objects that may have dependencies on other objects first. A big key here is that this script does not drop the objects for you. Rather it just creates the scripts to do the work. You must review the output and then execute the scripts separately. That means you will be entirely responsible for the results.

Once all of the objects are out of the way, the last statement in the sequence is to finally drop the schema.

You may notice that there are a few things not included in this script. The most notable may be that the Service Broker related objects are not accounted for in this script. I leave that for a later revision.

Conclusion

Occasionally it becomes necessary to remove a schema from the database for one reason or another (e.g. somebody decided to do both Dev and Prod in the same database separated only by schemas). Dropping a schema can become a little bit of a pain without the right tools. This script will help get you on your way to a scriptable solution (to help with documentation of what was changed) and a little less pain while doing it.

Finding Deprecated Uses in SQL Server

 


How well do you know your environment? You probably know all of the jobs that are running, the frequency that indexes need to be rebuilt, and even which users have which level of access to each object in the SQL Server instance. Do you know that your applications are accessing deprecated datatypes over 300 million times a week? What if your TSQL constructs are a bit archaic? Do you know that the code needs to be updated? Do you know how to find deprecated uses in SQL Server?

In this article, I will explore how to use Extended Events to track feature use and abuse. To be more precise, I will share how this tool can help you better understand all of the ways that your applications have been abusing your database by continuing to employ the use of deprecated features, syntax, or constructs in general. In case you are a bit behind in your exploration of XEvents, I have the perfect solution for you – my series on the topic that is continually growing. You can explore the full list of articles in the series by visiting the table of contents – here.

Audit Deprecated Uses

I would dare say that most data professionals think there is some use of deprecated constructs, datatypes or features within their environment. I would double down on that and say that most do not know just how bad it really may be. To find just how bad it really is, we need to audit for the use and abuse of these deprecation events.

Right here would be a good time to point out that your mileage may vary. Some items that are deprecated are more painful than others. Some may be deprecated and may have been on the list for 10+ years at this point. The point is, know your environment and then use good judgement to determine which items in your results need the most attention to fix and update. Why? Well, things really may break especially if you are looking to upgrade to a new version of SQL Server. Just because an item is still available in your current edition, that does not ensure it will still be available in a future release of SQL Server.

Now for the juicy stuff. As I mentioned, we need to find when and where a deprecated feature or syntax is employed. There are a couple of neat little events within Extended Events that can help to track each time a deprecated feature is accessed or employed. How do we do that? Use the deprecation_announcement and deprecation_final_support events. To help create sessions to track these events, I have the following script to create an “audit” session to do exactly that.
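A sketch of such a session for SQL Server 2012 and later might look like the following (the session name and file path are placeholders to adjust for your environment):

CREATE EVENT SESSION [AuditDeprecatedUses] ON SERVER
ADD EVENT sqlserver.deprecation_announcement
(
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname
          , sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
),
ADD EVENT sqlserver.deprecation_final_support
(
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname
          , sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
)
ADD TARGET package0.event_file
(
    SET filename = N'C:\Database\XE\AuditDeprecatedUses.xel'   /* placeholder path */
      , max_file_size = 50
      , max_rollover_files = 4
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [AuditDeprecatedUses] ON SERVER STATE = START;
GO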

 

And just in case you are using 2008 or 2008R2, use this version instead.

There are slight differences between these two sessions. First, in the 2008 version of the script, I have to provide database ids instead of names. That is a shortcoming of the 2008 and 2008R2 implementations of Extended Events. Additionally, the file target is different between the two (recall that they renamed the file target). And lastly, there are a few actions that I included in the 2012 version of the script that are not available in 2008 and R2.

With the session in place, I am now going to run through some sample scripts that will generate deprecation events. I am sticking with my 2014 instance for this segment. That is important to note because different events may occur for different versions of SQL Server. Additionally, the parse script I will share will require a slight change for 2008 and R2 (again related to the file target name).

Now to take a peek at the data with this next script.
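A sketch of that parse script follows; the field names assume the payload of the deprecation events on SQL Server 2012 and later, and the file path is the same placeholder used in the session definition above:

WITH xedata AS
(
    SELECT CONVERT(xml, fx.event_data) AS event_xml
    FROM sys.fn_xe_file_target_read_file(N'C:\Database\XE\AuditDeprecatedUses*.xel', NULL, NULL, NULL) fx
)
SELECT event_xml.value('(event/@name)[1]', 'varchar(64)')                                  AS event_name
     , event_xml.value('(event/@timestamp)[1]', 'datetime2')                               AS event_time
     , event_xml.value('(event/data[@name="feature_id"]/value)[1]', 'int')                 AS feature_id
     , event_xml.value('(event/data[@name="feature"]/value)[1]', 'varchar(256)')           AS feature
     , event_xml.value('(event/data[@name="message"]/value)[1]', 'varchar(max)')           AS message
     , event_xml.value('(event/action[@name="sql_text"]/value)[1]', 'varchar(max)')        AS sql_text
     , event_xml.value('(event/action[@name="username"]/value)[1]', 'varchar(128)')        AS username
     , event_xml.value('(event/action[@name="client_hostname"]/value)[1]', 'varchar(128)') AS client_hostname
     , event_xml.value('(event/action[@name="client_app_name"]/value)[1]', 'varchar(128)') AS client_app_name
FROM xedata;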

Now at long last, I can see what kind of data I am generating (they are really wide so I am just posting a snip).


With this, I can see the feature_id along with the description and even the tsql that generated the event. What you don’t see in this is that I also trap the source machine and the user name. If there is an application name included in the connection string, I also trap that. These pieces of data can prove critical to efficiently troubleshooting and finding the source of these events.

From here, one might wish to explore all of the events generated from this session in order to ensure the environment is properly prepared for upgrade. Most tools do not evaluate the code thoroughly enough to trap all of these events. Instead, they do a cursory look through stored procedures or at the data types. As we all should know, not every piece of SQL code is actually stored in the database, nor is it all cached at the time of analysis. This is the type of thing that requires a long-running trace to prove that you are that rockstar DBA.

Conclusion

In the article today, I have shown how it is possible to see the deprecation alerts that may be generated in your environment. This data is what can help set you apart as a rockstar when it comes time for that migration. If you have yet to read my series on Extended Events, I highly recommend it. You can find that series here.

Events By Feature

Within the world of SQL Server there are a few things one can be certain of – things will change. This is true of the features in SQL Server. Additionally, Extended Events is constantly evolving which underscores this constant change.

With all of this change occurring within SQL Server, sometimes it is difficult to figure out how to troubleshoot or track issues that relate to new features. Along with the need to figure out how to troubleshoot the new features comes the need to understand what tools are out there to help with that troubleshooting.

It is no big secret that Profiler cannot help you trace any of the new features. To help perform the task of tracing events related to new features one must take advantage of the power of Extended Events. Even knowing that you can use XEvents, there is still a need to know what is related to the feature.

In this article, I will show how you can discover the events related to specific features. In addition, I will provide enough base information to help you perform other discovery type queries on your own as you continue your exploration of XEvents. In case you are a bit behind in your exploration of XEvents, I have the perfect solution for you – my series on the topic that is continually growing. You can explore the full list of articles in the series by visiting the table of contents – here.

Events and Features

More and more I am being asked how to track which events belong to which features. I have also been seeing more people ask for a way to list the new features supported by XEvents. Thankfully there is adequate information within the XEvent metadata to help retrieve this type of information. The downside is that there is a bit of homework that must be done across a few versions of SQL Server to help produce the desired information. I have done that work and pulled the information into the following query to help make it easier for anybody else wishing to dive in and compare features and events across versions of SQL Server.

In the preceding query, I have provided an aggregation of the events and features across multiple editions of SQL Server. This aggregate of data is necessary to help see how the features are changing and how the events are changing within the features that remain constant from one version to the next.

Once that data is aggregated, I then provide a couple of sample queries that show what can be done with the data. I need to stress here that I have not provided an exhaustive set of queries to explore this data seven ways to Sunday. I have left those types of exercises up to the reader to explore and experiment.

The first example query shows how to pull the new features that can be “Traced” in the current version of SQL Server on which the query is being executed. The second example provides a difference count between versions of SQL Server to help illustrate the evolution of XEvents within SQL Server. The third query is a simple query to list out the number of events for each feature in your version of SQL Server.

From here, one might wish to explore all of the events that are related to a specific feature. This would be easily accomplished by querying out the data from the #presel temp table based on the feature name. Orrrr…one could query the desired feature by following the guidelines in this article.

Conclusion

In the article today, I have shown how it is possible to see the new events and how they relate to the various features within SQL Server. Being able to correlate events that can trap information about new features can and will help you evolve into that rock-star DBA you are trying to become!

Finding the Right Path

I have a fairly large backlog of articles that are in progress or that are planned for my series about Extended Events. Despite the backlog and planned articles, every now and then something else comes up that bumps things around. This article is an example of bumping the schedule around. You can see some of the backlog and the full list of articles in the series by visiting the table of contents – here.

I bring that up for a couple of reasons. First and foremost being that the topic was recently raised as a “need” by some colleagues. The second being that I see the need and how it was lacking in coverage by anything I had already written.

What is this gaping hole in the coverage of Extended Events? To be honest, it is not a very complicated topic or very difficult gap to fill. It’s just something that has been overlooked. The gap boils down to this: how does one consistently find the correct path to the Extended Event Log file (XEL file)?

Filling the Gap

The gap I will be working to fill in this article deals with consistently finding the file path for Extended Event (XE) sessions. This gap arises due to a few different things, such as the ability to define a target in different manners, being able to move the logs directory, or even the fact that a target may not be added to the session (let alone a file target). These causes can all contribute to a bit of frustration and may pose as different symptoms when trying to get the file path.

One additional complication is tied to the running state of a session. For that complication, I talked briefly about it in my article about better practices. The state of the session could have an impact and could cause frustration when trying to retrieve the file path. All of these things are considerations that must be made when trying to retrieve the file path.

To find the file path, let’s start with some basics. The default path for XEL files is the log directory of the instance. In addition to this default behavior, metadata about running sessions as well as deployed sessions is accessible from DMVs and system catalogs.

Log Path

The first basic to tackle is the log path for the instance. I can query for the log path of the error log for the instance and rely on that as the path of my xel files. Should I choose this method, then I could execute the following query.
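A minimal sketch of such a query, deriving the log directory from the error log file location:

SELECT LEFT(el.logpath, LEN(el.logpath) - CHARINDEX('\', REVERSE(el.logpath))) AS instance_log_directory
FROM (SELECT CONVERT(nvarchar(4000), SERVERPROPERTY('ErrorLogFileName')) AS logpath) el;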

Executing that query on my SQL 2014 instance produces the following results:


Now the obvious problems with this method come from the fact that relying on this data is relying upon an assumption that you have set all of your XE Sessions to use the default log path. If you have declared your sessions to use a file target and did not specify a path, then the assumption is safe. However, if you are detail oriented, you probably have been somewhat explicit in how you define your event file target. This brings us to the next topic – defining the path.

Defining The Path

While a bit of a bird-walk, it is necessary to cover this topic at this juncture. This brief discussion will help in understanding some of the other issues with retrieving the path consistently.

When defining the path of the event file, there is a bit of flexibility in how one can define the file to be used. You can either declare the file as just the file name, or you can define the file as the folder path along with the file name. Both methods are completely legitimate. Unfortunately, this flexibility is what causes some of the pain with retrieving the file path consistently.

Let’s take a look at two quick, and acceptable, ways to add an event file to an XE Session. I will be re-using a session from a previous article for these examples.
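The first example might look something like this; the session name, event, and path are illustrative stand-ins for the session from that previous article:

CREATE EVENT SESSION [AuditSelectFullPath] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
    ACTION (sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
)
ADD TARGET package0.event_file
(
    /* the full path is declared explicitly along with the file name */
    SET filename = N'C:\Database\XE\AuditSelectFullPath.xel'
      , max_file_size = 50
      , max_rollover_files = 4
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = ON);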

This is a pretty standard format I use for creating my sessions. The section I want to highlight, though, is the setting of the filename near the end of the script. I will generally define the entire path for my xel files when I create a session – just as I have done in this example. Defining this path helps me to know exactly where I am putting the session for starters. Equally as important is that this path is easier to retrieve from metadata because I have explicitly defined the path.

Take this next example of the same session but with one minor difference.
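Again an illustrative stand-in, identical apart from the filename:

CREATE EVENT SESSION [AuditSelectShortName] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
    ACTION (sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
)
ADD TARGET package0.event_file
(
    /* only the file name is declared; the path is left to default */
    SET filename = N'AuditSelectShortName.xel'
      , max_file_size = 50
      , max_rollover_files = 4
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = ON);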

The minor difference in this example is just in how the filename was defined. It is technically accurate and acceptable to only use the filename instead of the path and filename as I did in the previous example. The problem here comes from the retrieval of the path from metadata.

Getting the Path

In the previous two examples, I showed two methods of defining the filename within a session. In the former example, this means I can more easily find the path. However, if I use the script in the beginning of the article to find my xel file paths, then I have made an inaccurate assumption. This leads to an inconsistency in how the file path is fetched. In the latter example, the assumptions concerning the default log path would be valid and could be applied here.

Since the assumptions do not prove to be consistent or accurate across the board, we need to evaluate how the data for each would look and see how to retrieve this path more consistently. First up is looking at how the data for each of these example sessions would be stored in metadata.
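A sketch of that metadata query:

SELECT es.name AS session_name
     , est.name AS target_name
     , esf.name AS field_name
     , esf.value AS field_value
FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_targets est
        ON est.event_session_id = es.event_session_id
    INNER JOIN sys.server_event_session_fields esf
        ON esf.event_session_id = est.event_session_id
       AND esf.object_id = est.target_id
WHERE es.name LIKE 'AuditSelect%'
  AND esf.name = 'filename';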

In the preceding query, I am looking at the deployed session metadata for two sessions, both of which start with “AuditSelect”. In the sys.server_event_session_fields view, there is an attribute called name that contains the value “filename”. This value will only exist if the session has an event file target defined for the session. Running the query will yield the following results (on my system).


In this result set, I can see there is one of each of the two event file definition methods I described in the previous section. In the green highlight you will see that I have the entire filepath. In the red highlight, you will only see the filename. Nothing special has been done to this data in the return as you can verify from the posted script. These are the actual stored values. So this would indeed seem like we have a bit of a problem, right?

Well, this is where we get to be a little bit imaginative and employ a script such as the following:
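A sketch along those lines, handling both filename styles by falling back to the instance log directory when only a file name was declared:

SELECT es.name AS session_name
     , CASE WHEN CONVERT(nvarchar(4000), esf.value) LIKE '%\%'
            /* a full path was declared when the session was created */
            THEN CONVERT(nvarchar(4000), esf.value)
            /* only a file name was declared, so assume the default log directory */
            ELSE LEFT(el.logpath, LEN(el.logpath) - CHARINDEX('\', REVERSE(el.logpath)))
                 + '\' + CONVERT(nvarchar(4000), esf.value)
       END AS xel_file_path
FROM sys.server_event_sessions es                         /* deployed sessions only */
    INNER JOIN sys.server_event_session_targets est
        ON est.event_session_id = es.event_session_id
    INNER JOIN sys.server_event_session_fields esf
        ON esf.event_session_id = est.event_session_id
       AND esf.object_id = est.target_id
    CROSS JOIN (SELECT CONVERT(nvarchar(4000), SERVERPROPERTY('ErrorLogFileName')) AS logpath) el
WHERE est.name = 'event_file'                             /* only sessions with a file target */
  AND esf.name = 'filename';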

Wow, that is considerably longer than the first example in the article. That is true! This script accounts for a few of the conditions that cause inconsistencies (but not yet all of them) in retrieving the file path for a deployed event session. The key here is to understand this will fetch the path whether you enter a complete path or not while creating the session. I also throw a check in there to confirm that the session has a file target as well as a check to ensure the session is deployed to the server.

Looking at the output, I get results that are somewhat friendly and easy to understand without any of the guesswork.


What if the session has never been started and did not have the full path declared? Well, that is one of the inconsistencies I am working on still. (Update 2019/05/17 – An updated script is coming out soon that handles this little issue.)

There is an alternative to this method as well. For the time being, this method would also be recommended in the event the session being researched happens to be one of the system sessions that is “private”. This next code chunk will show two different methods to parse the file path from running session metadata in the DMVs.
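A sketch of those two methods; the first shreds the event_file target_data XML of the running session, while the second reads the configured filename column exposed for the running target:

/* Method 1: shred the event_file target_data XML of each running session */
SELECT dxs.name AS session_name
     , CONVERT(xml, dxst.target_data).value('(EventFileTarget/File/@name)[1]', 'nvarchar(4000)') AS xel_file_path
FROM sys.dm_xe_sessions dxs
    INNER JOIN sys.dm_xe_session_targets dxst
        ON dxst.event_session_address = dxs.address
WHERE dxst.target_name = 'event_file';

/* Method 2: read the configured filename column for the running target */
SELECT dxs.name AS session_name
     , CONVERT(nvarchar(4000), dxsoc.column_value) AS configured_filename
FROM sys.dm_xe_sessions dxs
    INNER JOIN sys.dm_xe_session_object_columns dxsoc
        ON dxsoc.event_session_address = dxs.address
WHERE dxsoc.object_name = 'event_file'
  AND dxsoc.column_name = 'filename';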

Conclusion

In the article today, I have shown some of the internals to retrieving file paths for Extended Event Sessions. I dove into metadata to pull out the path for the session and discussed some concerns for some of these methods. In the end, you have a few viable options to help retrieve the file path in a more consistent fashion.

To read more about Extended Events, I recommend this series of articles.

BIT9 PathNames

We live in a day and age when security (data, network, server, etc.) is seemingly at the forefront of the daily news. In our quest to improve security, we always seem to find more and more products that are supposed to help with that security. Sometimes, those products require a data repository. As it happens, it is not uncommon for that repository to be in a database. Today, I am looking into a specific issue with the Bit9 product.

Why?

There are some very good reasons as a matter of fact. One really big reason is that I could find no decent information about this issue. The bigger reason is the seemingly indifferent level of response and the delays I saw in response from the Bit9 support channels.

In this specific case, the first response from their support channels took more than two days. Subsequent responses were more than a week later. For the client, this actually caused delays in a project they were working on. Was the issue significant? It was not an outage causing issue, but it was one that did seem to continue to grow and cause concerns with disk space.

The Issue

It may be appropriate to discuss what Bit9 does prior to breaking into the details about the issue. If you have never had any experience with this product before, you may be running a product from a competitor such as Symantec. Bit9 is an endpoint protection tool from Carbon Black. This helps to prevent against malware and endpoint attacks. In short, it is a security-based tool to protect your computing enterprise.

Go ahead and click the image to link to their site if you wish to learn more about how it works. For me, the nitty gritty specifics on how it works is a bit outside the scope of this article.

In short, Bit9 records every filename and every filepath for every client machine. It stores these in a database in SQL Server. This shouldn’t be too much of an issue. For one instance serving about 300 client machines we saw about 1.2 million paths being stored. On another instance with far fewer clients and far fewer actual paths on those clients (physically checked) we had 114 million paths stored in the database. The difference in size was 20GB vs ~160GB. When the server is installed with all defaults, you can imagine how this looked for this instance – a very bloated OS volume.

Investigating this issue from a database perspective, I looked to find what was consuming so much space in the database. To do that, I ran my tablespace script and discovered the following:
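A minimal stand-in for such a tablespace query, built on sys.dm_db_partition_stats:

SELECT s.name + N'.' + t.name AS table_name
     , SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count
     , CONVERT(decimal(12, 2), SUM(ps.reserved_page_count) * 8 / 1024.0) AS reserved_mb
     , CONVERT(decimal(12, 2), SUM(ps.used_page_count) * 8 / 1024.0)     AS used_mb
FROM sys.tables t
    INNER JOIN sys.schemas s
        ON s.schema_id = t.schema_id
    INNER JOIN sys.dm_db_partition_stats ps
        ON ps.object_id = t.object_id
GROUP BY s.name, t.name
ORDER BY reserved_mb DESC;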


A significant standout there with the pathnames table. Looking at the definition of the table I see something far more interesting and disturbing all at once.


I see a table with three columns and two large string fields. Each of these fields has a non-clustered index on it. This may or may not be such a big problem (other than the fact that the size of the non-clustered indexes on this table is much larger than the data) except that each of the string fields is an exact duplicate of the other. That’s right. Within this table, the data is duplicated into this second string field and each field has its own index. Not only does it appear that I have a ton of duplicated data, it appears I have entirely useless indexes (neither had been touched for a read since the server had been up).

I inquired about this with the folks at Bit9, both from a design perspective and from an archival perspective. Absolute silence on the design (expected). The inquiry about archival (or purging) of non-essential data did fetch a response – albeit a painfully slow response. The basic question is: “Is it safe to purge or archive old or unnecessary data?” The response we received was “run this and we will tell you what to do next.”

Great, they sent a script to help determine the state of data within the database. I am not posting their script here. Suffice it to say that the script they sent was not very pretty. They query about 20 tables, union the results from those tables, then perform a NOT IN operation to see how many of the pathnames are invalid. No problem. Executing the script did reveal the following:


There seems to be the problem. 95.89% of the rows being stored in the pathnames table are orphaned records! This is a bit of a problem. The software does not appear to manage removal of invalid paths. From here, I knew what the course of action needed to be and acted on it. A big piece of the equation was provided from the results of the script. Another piece was provided in the makeup of the script. From these pieces of information, I created a purge script to help manage the orphan problem. I then put that script into an agent job and set it to run on a weekly basis.

If you find yourself in the same boat, here is the script needed to purge the data. If running this, I recommend disabling the two non-clustered indexes and then performing the delete (especially if you sit at 96% orphaned and over 100 million rows). After deleting the mass amount of orphans, go ahead and rebuild the indexes to stay in compliance with the software contract until Bit9 responds about the index requirement and the schema of the table.
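Since the real purge depends on the vendor’s list of roughly 20 referencing tables, the following is only a generalized, hypothetical sketch of the batched-delete pattern; dbo.pathnames, pathname_id and dbo.filepaths are placeholder names, not the actual Bit9 schema:

SET NOCOUNT ON;

DECLARE @BatchSize   int = 100000
      , @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    /* delete orphaned paths in batches to keep transaction log growth in check */
    DELETE TOP (@BatchSize) p
    FROM dbo.pathnames AS p                     /* placeholder table name            */
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.filepaths AS f   /* placeholder for each of the ~20
                                                   referencing tables in the vendor
                                                   supplied validation script        */
                      WHERE f.pathname_id = p.pathname_id);

    SET @RowsDeleted = @@ROWCOUNT;
END;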