Database Dropped

What do you do when a developer comes to you and asks, “Where did the database go? The database was there one minute, and the next it was not.” Only one thing could be worse than the feeling of losing a database on your watch, and that would be losing a production database. It’s like magic—it’s there, and then it disappears. To compound the issue, when asking people if they know what might have happened, all will typically deny, deny, deny.

What do you do when you run into that missing database situation and the inevitable denial that will ensue?  This is when an audit can save the day.  Through an audit, you can discover who dropped the database and when it happened.  Then you have hard data to take back to the team to again ask what happened.  Taking the info from a previous article of mine, we can alter the script I published there and re-use it for our needs here.

This script will now query the default trace to determine when a database was dropped or created. I am limiting the result set through the use of this filter: ObjectType = 16964 (the Database object type). In addition to that, I have also trimmed the result set down to just the drop and create events.
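The original published script is not reproduced here; a minimal sketch of the approach, reading the default trace and filtering to the create and drop events for the Database object type, might look like the following. Note that fn_trace_gettable reads forward from the current rollover file, so you may wish to strip the file name back to the base log name to pick up older rollover files.

-- Sketch: who created or dropped a database, according to the default trace
DECLARE @tracefile NVARCHAR(260);

SELECT @tracefile = t.path
    FROM sys.traces t
    WHERE t.is_default = 1;

SELECT gt.DatabaseName
        , gt.ObjectName
        , gt.StartTime
        , gt.LoginName
        , gt.HostName
        , gt.ApplicationName
        , te.name AS EventName
    FROM sys.fn_trace_gettable(@tracefile, DEFAULT) gt
    INNER JOIN sys.trace_events te
        ON te.trace_event_id = gt.EventClass
    WHERE gt.ObjectType = 16964                          -- Database
        AND te.name IN ('Object:Created', 'Object:Deleted')
    ORDER BY gt.StartTime DESC;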

This is the type of information that is already available within the default trace. What if you wished to not be entirely dependent on the default trace for that information? As luck would have it, you don’t need to be solely dependent on the default trace. Instead, you can use the robust tool called Extended Events. If you would like to be able to take advantage of Extended Events to track this information, I recommend you read my follow-up article here.

This has been a republication of my original content first posted here.

Default Sessions – Back to Basics

Published on: September 1, 2016

Remember When…

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Default Sessions

SQL Server is full of some pretty cool stuff. Some of it is rather basic. Some of it is a bit complex. Whether basic or complex, there is a lot that seems to get overlooked because it just might be hidden from plain sight or maybe it is overlooked just because it is already done for you.

Today, I am going to talk about some of the stuff that is already done for you. It is pre-baked and ready to roll. Since it is pre-baked, it may be a good idea to get familiar with it and understand what kinds of data it may offer to you when troubleshooting.

It is still somewhat surprising to me to discover how many people have never heard of the default trace in SQL Server. Slightly less surprising is how many people have never heard of the default Extended Events system_health session. What if I told you there were a bunch more default sessions than the default trace or system_health? I wrote about some of those default sessions previously.

With the evolution of SQL Server, there is an evolution in Extended Events. New features in SQL Server also mean there may be some additional default sessions. I want to touch on some of these other default sessions in brief.


Phone Home

First, let’s start with an easy-to-find default session – telemetry_xevents. This session is visible and behaves like a user-defined session, with the caveat that it is installed with SQL Server 2016 as a system default session. I have another post queued up to discuss this session in a bit more detail, so I will just keep it brief here.

This session is there to basically capture specific data and phone it home to Microsoft. You can presume that based on the definition of the word telemetry:

“is an automated communications process by which measurements and other data are collected at remote or inaccessible points and transmitted to receiving equipment for monitoring.”

Before you get your tin-foil hat in a crumple over sending data back to the mother-ship, understand that this data is not sensitive data. The type of data being gathered is usage metrics. Think of it in terms of gathering data about how many times a specific feature was used or how many times a group of errors occurred within the environment. This is in an effort to help improve the product.

While the session data is designed to be sent back to Microsoft, it could also be of use to the Enterprising DBA who is looking to get a better grasp of the environment or to troubleshoot various issues.

It’s a Stretch

With the release of SQL Server 2016 there is another default session that gets installed. This session is probably the most difficult to find. It is a private, hidden session when it is installed. In addition, it is not present by default with SQL 2016 RTM, but it was present by default with various CTP versions. The session is specific to the new Stretch Database feature and is called rdaxesession.

Breaking down this particular session reveals pretty easily what feature the session is designed to support. The name rdaxesession breaks down to the following: rda = remote data archive, and then the rest is self-explanatory. If I investigate the metadata for the session (a sketch of such a query follows the event list), I will discover that the following events are traced in the session:

stretch_database_enable_completed
stretch_database_disable_completed
stretch_table_codegen_completed
stretch_table_remote_creation_completed
stretch_table_row_migration_results_event
stretch_table_validation_error
stretch_table_unprovision_completed
stretch_index_reconciliation_codegen_completed
stretch_remote_index_execution_completed
stretch_database_events_submitted
stretch_table_query_error
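Here is a minimal sketch of that metadata query. Keep in mind that this is a private session, so depending on the version and whether Stretch is in use it may not show up in the catalog views at all.

-- Sketch: list the events configured for the Stretch session, if it is visible
SELECT es.name AS session_name
        , ese.package AS event_package
        , ese.name AS event_name
    FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_events ese
        ON es.event_session_id = ese.event_session_id
    WHERE es.name = 'rdaxesession'
    ORDER BY ese.name;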

Well, that certainly clears it up! This session is all about the stretch feature. If you are on RTM and don’t use the stretch feature, then you won’t see this session even if you do query the metadata.

Don’t be a HADR

The last two of the default sessions I want to cover can be combined into a category about HA and DR – loosely at least. One of the sessions deals with backups, and the second deals with clusters.

When dealing with a cluster, it is important to know about the hidden trace file that records events related to the cluster. The session target data can be found in “%PROGRAMFILES%\Microsoft SQL Server\MSSQLxx.MSSQLSERVER\MSSQL\Log” by default and cannot be seen within SQL Server metadata at all (at least not that I have been able to find). For a few details about how this trace works, I recommend reading this article. This session can be called the SQLDiag session.
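If you want to peek at that data, the XEL files in the Log directory can be read directly with sys.fn_xe_file_target_read_file. A sketch follows; the folder path and the *_SQLDIAG_* file-name pattern are assumptions and should be adjusted to your install.

-- Sketch: read the cluster diagnostic XEL files straight from the Log directory
SELECT CAST(event_data AS XML) AS event_data_xml
        , object_name
        , file_name
    FROM sys.fn_xe_file_target_read_file(
            N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\*_SQLDIAG_*.xel',
            NULL, NULL, NULL);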

The session related to backups is actually broken down into two log files and is similar in nature to the SQLDiag session. This is true in that the session is found in the same directory, is read in the same fashion, and is equally hidden from within SQL Server. The backup session deals specifically with managed backup.

What is managed backup? Well, it is a backup performed to Azure blob storage that relies upon the “smart_admin” procedures, functions, and processes.

If you don’t perform managed backups or you don’t have a cluster configured, you will not need either of these sessions (and potentially not see the smart_admin files depending on SQL Server version). If you have either of these features, you will definitely want to ensure you become acquainted with them and how to retrieve the data.

Recap

There are plenty of mechanisms in place to help troubleshoot some of the more complex features of SQL Server. Getting to know the tools will prepare you for that moment when problems arise and you are under the gun.

I did not dive deep into any of these sessions holding that for later articles. I wanted to keep this article to an introductory level. If you are interested in the more detailed articles, please stay tuned for those upcoming articles.

If you are interested in reading more about Extended Events, I recommend reading my series which can be found here.

Events By Feature

Within the world of SQL Server there are a few things one can be certain of – one of them being that things will change. This is true of the features in SQL Server. Additionally, Extended Events is constantly evolving, which underscores this constant change.

With all of this change occurring within SQL Server, sometimes it is difficult to figure out how to troubleshoot or track issues that relate to new features. Along with the need to figure out how to troubleshoot the new features comes the need to understand what tools are out there to help.

It is no big secret that Profiler cannot help you trace any of the new features. To help perform the task of tracing events related to new features one must take advantage of the power of Extended Events. Even knowing that you can use XEvents, there is still a need to know what is related to the feature.

In this article, I will show how you can discover the events related to specific features. In addition, I will provide enough base information to help you perform other discovery type queries on your own as you continue your exploration of XEvents. In case you are a bit behind in your exploration of XEvents, I have the perfect solution for you – my series on the topic that is continually growing. You can explore the full list of articles in the series by visiting the table of contents – here.

Events and Features

More and more I am being asked how to track which events belong to which features. I have also been seeing more people ask for a way to list the new features supported by XEvents. Thankfully there is adequate information within the XEvent metadata to help retrieve this type of information. The downside is that there is a bit of homework that must be done across a few versions of SQL Server to help produce the desired information. I have done that work and pulled the information into the following query to help make it easier for anybody else wishing to dive in and compare features and events across versions of SQL Server.

In the preceding query, I have provided an aggregation of the events and features across multiple versions of SQL Server. This aggregate of data is necessary to help see how the features are changing and how the events are changing within the features that remain constant from one version to the next.

Once that data is aggregated, I then provide a couple of sample queries that show what can be done with the data. I need to stress here that I have not provided an exhaustive set of queries to explore this data seven ways to Sunday. I have left those types of exercises up to the reader to explore and experiment.

The first example query shows how to pull the new features that can be “Traced” in the current version of SQL Server on which the query is being executed. The second example provides a difference count between versions of SQL Server to help illustrate the evolution of XEvents within SQL Server. The third query is a simple query to list out the number of events for each feature in your version of SQL Server.
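The original script is not shown here. As a rough single-version illustration of that third idea, treating the XEvent package as a stand-in for the feature, one might run something like the following (the private-object filter is an assumption about how you want to count):

-- Sketch: count traceable events per XEvent package on the current instance
SELECT xp.name AS package_name
        , COUNT(xo.name) AS event_count
    FROM sys.dm_xe_packages xp
    INNER JOIN sys.dm_xe_objects xo
        ON xo.package_guid = xp.guid
    WHERE xo.object_type = 'event'
        AND (xo.capabilities IS NULL OR xo.capabilities & 1 = 0)  -- exclude private objects
    GROUP BY xp.name
    ORDER BY event_count DESC;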

From here, one might wish to explore all of the events that are related to a specific feature. This would be easily accomplished by querying out the data from the #presel temp table based on the feature name. Orrrr…one could query the desired feature by following the guidelines in this article.

Conclusion

In the article today, I have shown how it is possible to see the new events and how they relate to the various features within SQL Server. Being able to correlate events that can trap information about new features can and will help you evolve into that rock-star DBA you are trying to become!

Finding the Right Path

I have a fairly large backlog of articles that are in progress or that are planned for my series about Extended Events. Despite the backlog and planned articles, every now and then something else comes up that bumps things around. This article is an example of bumping the schedule around. You can see some of the backlog and the full list of articles in the series by visiting the table of contents – here.

I bring that up for a couple of reasons. First and foremost, the topic was recently raised as a “need” by some colleagues. Second, I saw the need and how it was lacking in coverage by anything I had already written.

What is this gaping hole in the coverage of Extended Events? To be honest, it is not a very complicated topic or very difficult gap to fill. It’s just something that has been overlooked. The gap boils down to this: how does one consistently find the correct path to the Extended Event Log file (XEL file)?

Filling the Gap

The gap I will be working to fill in this article deals with consistently finding the file path for Extended Event (XE) sessions. This gap arises due to a few different things, such as the ability to define a target in different manners, being able to move the logs directory, or even the fact that a target may not be added to the session (let alone a file target). These causes can all contribute to a bit of frustration and may present as different symptoms when trying to get the file path.

One additional complication is tied to the running state of a session. I talked briefly about that complication in my article about better practices. The state of the session could have an impact and could cause frustration when trying to retrieve the file path. All of these things are considerations that must be made when trying to retrieve the file path.

To find the file path, let’s start with some basics. The default path for XEL files is the log directory of the instance. In addition to this default behavior, metadata about running sessions as well as deployed sessions is accessible from DMVs and system catalogs.

Log Path

The first basic to tackle is the log path for the instance. I can query for the path of the instance error log and rely on that directory as the path of my XEL files. Should I choose this method, then I could execute the following query.
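The original query is not reproduced here; a minimal sketch, assuming the ErrorLogFileName server property is available (SQL Server 2012 and later), would be:

-- Sketch: derive the instance Log directory from the error log path
DECLARE @errorlog NVARCHAR(260) = CAST(SERVERPROPERTY('ErrorLogFileName') AS NVARCHAR(260));

SELECT LEFT(@errorlog, LEN(@errorlog) - CHARINDEX(N'\', REVERSE(@errorlog))) AS DefaultLogPath;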

Executing that query on my SQL 2014 instance produces the following results:

[Image: query result showing the instance error log path]

Now the obvious problem with this method comes from the fact that relying on this data means relying upon the assumption that you have set all of your XE Sessions to use the default log path. If you have declared your sessions to use a file target and did not specify a path, then the assumption is safe. However, if you are detail oriented, you probably have been somewhat explicit in how you define your event file target. This brings us to the next topic – defining the path.

Defining The Path

While a bit of a bird-walk, it is necessary to cover this topic at this juncture. This brief discussion will help in understanding some of the other issues with retrieving the path consistently.

When defining the path of the event file, there is a bit of flexibility in how one can define the file to be used. You can either declare the file as just the file name, or you can define the file as the folder path along with the file name. Both methods are completely legitimate. Unfortunately, this flexibility is what causes some of the pain with retrieving the file path consistently.

Let’s take a look at two quick, and acceptable, ways to add an event file to an XE Session. I will be re-using a session from a previous article for these examples.
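The original session definition is not reproduced here. The following is a hypothetical approximation (the session name, event, actions, predicate, and path are all stand-ins) showing the full path being set on the event_file target:

-- Hypothetical session with the full path declared for the event file
CREATE EVENT SESSION [AuditSelects] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(   ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE sqlserver.like_i_sql_unicode_string(sqlserver.sql_text, N'%SELECT%')
)
ADD TARGET package0.event_file
(   SET filename = N'C:\Database\XE\AuditSelects.xel'    -- full path explicitly defined
        , max_file_size = 50
        , max_rollover_files = 4
)
WITH (MAX_MEMORY = 4096 KB
        , EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS
        , MAX_DISPATCH_LATENCY = 5 SECONDS
        , STARTUP_STATE = OFF);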

This is a pretty standard format I use for creating my sessions. The section I want to highlight though is the set of the filename near the end of the script. I will generally define the entire path for my xel files when I create a session – just as I have done in this example. Defining this path helps me to know exactly where I am putting the session for starters. Equally as important is that this path is easier to retrieve from metadata because I have explicitly defined the path.

Take this next example of the same session but with one minor difference.
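A hypothetical variation of the same definition, deployed under a second name, with only the file name supplied:

-- Hypothetical session with only a file name (no folder path) for the event file
CREATE EVENT SESSION [AuditSelects_File] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(   ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE sqlserver.like_i_sql_unicode_string(sqlserver.sql_text, N'%SELECT%')
)
ADD TARGET package0.event_file
(   SET filename = N'AuditSelects_File.xel'    -- file name only; path left to the default
        , max_file_size = 50
        , max_rollover_files = 4
)
WITH (MAX_MEMORY = 4096 KB
        , EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS
        , MAX_DISPATCH_LATENCY = 5 SECONDS
        , STARTUP_STATE = OFF);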

The minor difference in this example is just in how the filename was defined. It is technically accurate and acceptable to only use the filename instead of the path and filename as I did in the previous example. The problem here comes from the retrieval of the path from metadata.

Getting the Path

In the previous two examples, I showed two methods of defining the filename within a session. In the former example, this means I can more easily find the path. However, if I use the script in the beginning of the article to find my xel file paths, then I have made an inaccurate assumption. This leads to an inconsistency in how the file path is fetched. In the latter example, the assumptions concerning the default log path would be valid and could be applied here.

Since the assumptions do not prove to be consistent or accurate across the board, we need to both evaluate how the data for each would look and we would need to see how to retrieve this path more consistently. First up is looking at how the data for each of these example sessions would be stored in metadata.
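A sketch of that metadata query, using the hypothetical session names from the examples above:

-- Deployed-session metadata: the filename field for each AuditSelect* session
SELECT es.name AS session_name
        , esf.name AS field_name
        , esf.value AS field_value
    FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_fields esf
        ON es.event_session_id = esf.event_session_id
    WHERE es.name LIKE 'AuditSelect%'
        AND esf.name = 'filename';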

In the preceding query, I am looking at the deployed session metadata for two sessions, both of which start with “AuditSelect”. In the sys.server_event_session_fields view, there is an attribute called name that contains the value “filename”. This value will only exist if the session has an event file target defined for the session. Running the query will yield the following results (on my system).

[Image: metadata results – one session stores the full file path, the other stores only the file name]

In this result set, I can see there is one of each of the two event file definition methods I described in the previous section. In the green highlight you will see that I have the entire filepath. In the red highlight, you will only see the filename. Nothing special has been done to this data in the return as you can verify from the posted script. These are the actual stored values. So this would indeed seem like we have a bit of a problem, right?

Well, this is where we get to be a little bit imaginative and employ a script such as the following:
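The author’s script is not reproduced here. The following sketch takes the same general approach, assuming the ErrorLogFileName server property is available for the default-path fallback and that a backslash in the stored value means a full path was declared:

-- Sketch: resolve the XEL path whether a full path or just a file name was declared
DECLARE @errorlog NVARCHAR(260) = CAST(SERVERPROPERTY('ErrorLogFileName') AS NVARCHAR(260));
DECLARE @logpath  NVARCHAR(260) = LEFT(@errorlog, LEN(@errorlog) - CHARINDEX(N'\', REVERSE(@errorlog)));

SELECT es.name AS session_name
        , CASE
            WHEN CAST(esf.value AS NVARCHAR(260)) LIKE N'%\%'
                THEN CAST(esf.value AS NVARCHAR(260))                   -- full path was declared
            ELSE @logpath + N'\' + CAST(esf.value AS NVARCHAR(260))     -- assume the default log path
          END AS xel_file_path
    FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_targets est
        ON es.event_session_id = est.event_session_id
    INNER JOIN sys.server_event_session_fields esf
        ON est.event_session_id = esf.event_session_id
        AND est.target_id = esf.object_id
    WHERE est.name = 'event_file'      -- only sessions that actually have a file target
        AND esf.name = 'filename';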

Wow, that is considerably longer than the first example in the article. That is true! This script accounts for a few of the conditions that cause inconsistencies (but not yet all of them) in retrieving the file path for a deployed event session. The key here is to understand this will fetch the path whether you enter a complete path or not while creating the session. I also threw a check in there to confirm that the session has a file target as well as a check to ensure the session is deployed to the server.

Looking at the output, I get results that are somewhat friendly and easy to understand without any of the guesswork.

[Image: results showing a resolved XEL file path for each session]

What if the session has never been started and did not have the full path declared? Well, that is one of the inconsistencies I am working on still.

There is an alternative to this method as well. For the time being, this method would also be recommended in the event the session being researched happens to be one of the system sessions that is “private”. This next code chunk will show two different methods to parse the file path from running session metadata in the DMVs.
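The original code chunk is not shown here. A sketch of two such methods follows; the EventFileTarget/File/@name node layout in the target_data XML is an assumption from memory and may vary by version.

-- Method 1: the filename column exposed for the running session's event_file target
SELECT s.name AS session_name
        , soc.column_value AS xel_file_path
    FROM sys.dm_xe_sessions s
    INNER JOIN sys.dm_xe_session_object_columns soc
        ON s.address = soc.event_session_address
    WHERE soc.object_name = 'event_file'
        AND soc.column_name = 'filename';

-- Method 2: parse the actual file name out of the target_data XML
SELECT s.name AS session_name
        , CAST(st.target_data AS XML).value('(EventFileTarget/File/@name)[1]', 'nvarchar(260)') AS xel_file_path
    FROM sys.dm_xe_sessions s
    INNER JOIN sys.dm_xe_session_targets st
        ON s.address = st.event_session_address
    WHERE st.target_name = 'event_file';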

Conclusion

In the article today, I have shown some of the internals to retrieving file paths for Extended Event Sessions. I dove into metadata to pull out the path for the session and discussed some concerns for some of these methods. In the end, you have a few viable options to help retrieve the file path in a more consistent fashion.

BIT9 PathNames

We live in a day and age when security (data, network, server, etc) is seemingly at the forefront of the daily news. In our quest to improve security we seemingly always find more and more products that are supposed to help with that security. Sometimes, those products require a data repository. As it happens, it is not uncommon for that repository to be in a database. Today, I am looking into a specific issue with the Bit9 product.

Why?

There are some very good reasons as a matter of fact. One really big reason is that I could find no decent information about this issue. The bigger reason is the seemingly indifferent level of response and the delays I saw in response from the Bit9 support channels.

In this specific case, the first response from their support channels took more than two days. Subsequent responses were more than a week later. For the client, this actually caused delays in a project they were working on. Was the issue significant? It was not an outage-causing issue, but it was one that did seem to continue to grow and cause concerns with disk space.

The Issue

It may be appropriate to discuss what Bit9 does prior to breaking into the details about the issue. If you have never had any experience with this product before, you may be running a product from a competitor such as Symantec. Bit9 is an endpoint protection tool from Carbon Black. It helps to protect against malware and endpoint attacks. In short, it is a security tool to protect your computing enterprise.

Go ahead and click the image to link to their site if you wish to learn more about how it works. For me, the nitty-gritty specifics of how it works are a bit outside the scope of this article.

In short, Bit9 records every filename and every file path for every client machine. It stores these in a SQL Server database. This shouldn’t be too much of an issue. For one instance serving about 300 client machines we saw about 1.2 million paths being stored. On another instance, with far fewer clients and far fewer actual paths on those clients (physically checked), we had 114 million paths stored in the database. The difference in size was 20GB vs ~160GB. When the server is installed with all defaults, you can imagine how this looked for that instance – a very bloated OS volume.

Investigating this issue from a database perspective, I looked to find what was consuming so much space in the database. To do that, I ran my tablespace script and discovered the following:
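That tablespace script is not reproduced here; a simple stand-in that reports reserved space per table would look something like this:

-- Stand-in: reserved space and row counts per table, largest first
SELECT s.name AS schema_name
        , t.name AS table_name
        , SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb
        , SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count
    FROM sys.tables t
    INNER JOIN sys.schemas s
        ON s.schema_id = t.schema_id
    INNER JOIN sys.dm_db_partition_stats ps
        ON ps.object_id = t.object_id
    GROUP BY s.name, t.name
    ORDER BY reserved_mb DESC;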

[Image: table space results with the pathnames table highlighted as the top consumer]

A significant standout there with the pathnames table. Looking at the definition of the table I see something far more interesting and disturbing all at once.

[Image: definition of the pathnames table]

I see a table with three columns and two large string fields. Each of these fields has a non-clustered index on it. This may or may not be such a big problem (other than the fact that the size of the non-clustered indexes on this table is much larger than the data) except that each of the string fields is an exact duplicate of the other. That’s right. Within this table, the data is duplicated into this second string field and each field has its own index. Not only does it appear that I have a ton of duplicated data, it appears I have entirely useless indexes (neither had been touched for a read since the server had been up).

I inquired about this to the folks at Bit9, both from a design perspective and from an archival perspective. Absolute silence on the design (expected). The inquiry about archival (or purge) of non-essential data did fetch a response – albeit a painfully slow response. The basic question is: “Is it safe to purge or archive old or unnecessary data?” The response we received was “run this and we will tell you what to do next.”

Great, they sent a script to help determine the state of data within the database. I am not posting their script here. Suffice it to say that the script they sent was not very pretty. They query about 20 tables, union the results from those tables, then perform a NOT IN operation to see how many of the pathnames are invalid. No problem. Executing the script did reveal the following:

[Image: result showing the count and percentage of orphaned pathname rows]

There seems to be the problem. 95.89% of the rows being stored in the pathnames table are orphaned records! This is a bit of a problem. The software does not appear to manage removal of invalid paths. From here, I knew what the course of action needed to be and acted on it. A big piece of the equation was provided from the results of the script. Another piece was provided in the makeup of the script. From these pieces of information, I created a purge script to help manage the orphan problem. I then put that script into an agent job and set it to run on a weekly basis.

If you find yourself in the same boat, here is the script needed to purge the data. If running this, I recommend disabling the two non-clustered indexes and then performing the delete (especially if you sit at 96% orphaned and over 100 million rows). After deleting the mass amount of orphans, go ahead and rebuild the indexes to stay in compliance with the software contract until Bit9 responds about the index requirement and the schema of the table.
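The purge script itself is not reproduced here. As a heavily hedged skeleton only – the referencing table, key column, and index names below are placeholders that must come from the vendor-supplied script – the batched pattern looks like this:

-- Placeholder names throughout; take the real list of referencing tables from the vendor script
-- ALTER INDEX [<nc_index_1>] ON dbo.pathnames DISABLE;   -- before the purge
-- ALTER INDEX [<nc_index_2>] ON dbo.pathnames DISABLE;

WHILE 1 = 1
BEGIN
    DELETE TOP (100000) pn
        FROM dbo.pathnames pn
        WHERE pn.pathname_id NOT IN
        (
            SELECT pathname_id FROM dbo.referencing_table_1
            -- UNION the remaining referencing tables from the vendor script here
        );

    IF @@ROWCOUNT = 0
        BREAK;
END;

-- ALTER INDEX ALL ON dbo.pathnames REBUILD;              -- after the purge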

Yep! It is just like that. The worst part of the whole thing is actually the delete. The delete operation will touch about 37 indexed views to update and delete. Be prepared for a long running process if you have a situation like I encountered.

For kicks, here is what that plan would look like in one of my favorite tools (SQL Sentry Plan Explorer).

[Image: execution plan for the delete, as shown in SQL Sentry Plan Explorer]

What is your pain scale and story?

[Image: pain scale]

Easily Shred Event Data

It has been nearly a year since I started an extensive series about Extended Events. Previous to that start, I had already written a bunch of articles about Extended Events.

Among the batch of articles preceding the 60 day series one can find an article about shredding extended event data. Then as a part of the series (the first article in the 60 day series) one can find how to shred the action payload data attached to an extended event session.

You can read the article on shredding XEvent payload data here. Then you can follow that up by reading how to shred the Action data here. And once you are done with those and you really want to learn more about Extended Events, please read the whole series which can be found here.

All of that reading to catch up should keep you busy for a tad bit.

Upgrade

A year is a really long time to go without updating a script – according to some. Today, I have an update for both of the XML shredding scripts you just read about in the list of articles mentioned moments earlier in this post.

Before I dive into the script, let’s revisit a sample of the XML from an event session payload. Within an event session, you can see that there are nodes for both the data and the actions (highlighted in green and red). With all of this information found within the same XML for the session, it somewhat makes sense to try and parse all of the data at once.

[Image: sample session XML showing the data and action nodes highlighted]
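For illustration only, here is a hand-written shred that pulls one payload (“data”) node and one action node in a single pass. The session name, file path, and field names are all assumptions; the generator script discussed below produces statements of this general shape.

-- Hypothetical manual shred of an event_file target in one pass
SELECT
      xed.event_xml.value('(@name)[1]', 'varchar(128)') AS event_name
    , xed.event_xml.value('(@timestamp)[1]', 'datetimeoffset') AS event_time
    , xed.event_xml.value('(data[@name="duration"]/value)[1]', 'bigint') AS duration           -- data node
    , xed.event_xml.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text  -- action node
    FROM
    (
        SELECT CAST(event_data AS XML) AS x
            FROM sys.fn_xe_file_target_read_file(N'C:\Database\XE\AuditSelects*.xel', NULL, NULL, NULL)
    ) AS src
    CROSS APPLY src.x.nodes('/event') AS xed(event_xml);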

In addition to parsing all of the XML for the data and the actions at the same time, it seems to also make sense to generate the statements that would parse the XML within a single effort. As you would have noted, my previous scripts were just that – scripts. That implies executing a separate script for the data and another for the actions. Maybe it would make more sense to execute a single script.

It is that notion of a single script that constitutes this upgraded version of the script.

One Script

In the following script, I have tried to accomplish just that – a single script to create the entire XML parser for me, for you, and for anybody wishing to use it. I don’t want to have to remember the subtle nuances of how to parse each of the events each time I need to parse the session data. I want something that is quick, easy, and repeatable.

With all of that said, here is the script that I now use to parse my session data. You should notice that it has been simplified and is more extensive now.

Not only does this script slice and dice for you…Wait that’s not quite right.

This script doesn’t just generate the XML shredding strings for the payload and action data. This script also will produce the rest of the tsql statements that should precede and conclude the XML parsing statements. Yes it uses dynamic sql. That is to your advantage in this case.

Could there be other enhancements? Absolutely! More will be coming.

Enjoy this script and happy parsing.

Extra Extra – Read All About It!

From the comments, you will see that Brent Ozar (blog | twitter) made an excellent suggestion. To be honest, I considered sharing some examples during the initial write of this article. I had opted against it then, not considering it in the same way that Brent puts it. So, in this addendum I have a quick example using an XEvent Session that I have not yet written about in all of my articles on Extended Events (pseudo spoiler alert).

Let’s use the AuditSelects XEvent Session I have deployed and running (remember a session can be deployed but may be in the “stopped” state). If I want to figure out what data has been captured for this session, I can take the session name and plug it into the script shown in this article.

[Image: script parameters, with the session name variable and the column-uniqueness variables highlighted]

Ignore the green tsql comments for now (you can read them direct from the script). Highlighted in gold is the variable for the session name. By entering a valid session name for this variable, I can restrict the results to just the metadata for that specific session. In addition, note that I have highlighted, in a pale chartreuse, two variables that will give me unique column names for each of the events and actions within each session. This is important because some events use the same attribute names as other events. If events with the same attribute names are used in the same session, you will end up with duplicate column names. Maybe you want the duplicates. Maybe you don’t.

The session name variable is used in each of the next three code segments. Each segment in the script will generate sql statements that will need to be copied into a new query window. I will leave the further breakdown of the script as an exercise for you. What I want to do here is show the results from executing this script.

When I execute the script, I will have four result sets returned to me (in grid mode). It will look something like shown in the following image.

[Image: the four result sets returned by the parser script, highlighted in three colors]

Breaking this down into three colors is probably the easiest way to see how things fit together. Highlighted in red, I have the build-out of the pre-processing statements and the build of the “select” statement that will be returning our data to us. Then in blue comes the guts of the query – all of the beautiful XML parsing statements. The segments in blue correlate to the columns in the select list. And at the bottom of the image I have the finishing pieces that include my predicates, from, and joins. Each of the highlighted segments will then be copied and pasted to a new query window and look something like the following.

[Image: the generated statements pasted together into a single query]

And there you have it. Sure there are still some manual steps in it, but you no longer need to memorize all of that glorious XML parsing syntax. You only need to copy and paste with this version of the script.

Enjoy your adventures in parsing XML!

SQL Server CTE – Back to Basics

Categories: News, Professional, SSC
Published on: July 14, 2016

Quick Flashback

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

SQL Server CTE

Of all the fundamental concepts within SQL Server, this one drives me a bit batty from time to time. Think about that statement for just a moment. I am not going to go in depth about the syntax of a CTE.

I am not really interested in breaking down the semantics and uses of the CTE. Rather, I want to break down something commonly associated to the CTE that is far more basic. I would classify this as more of a myth and misconception than anything else.

The topic du jour happens to be around punctuation. More specifically, do we use terminators or do we use beginninators? Yes that is a made-up term, but you get the drift. Where exactly should the punctuation be placed? Better yet, is it mandatory to include that punctuation every single time?

Those are the items I wish to discuss and hopefully demystify with a few examples (if that many are even necessary).

Punctuation

Punctuation shouldn’t be too big of a deal, right? There is so little punctuation within t-sql that it really should be very basic and easy to understand. That said, there is this perpetual myth that a semi-colon is required at the beginning of a CTE. Furthermore, and somewhat more accurately, there is the notion that a semi-colon is a requirement in order for a CTE to work. Let’s take a look at both of those scenarios.

Case the First

Let’s just take a look at the most basic of scripts. A simple select statement without a CTE.
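The original snippet is not shown; a hypothetical stand-in would be something along these lines (note there is no terminating semi-colon on the select):

USE AdventureWorks2014
GO

SELECT TOP (10) p.BusinessEntityID
        , p.FirstName
        , p.LastName
    FROM Person.Person p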

Straight away, we can see that I am accessing the AdventureWorks2014 database. This happens to be on my 2014 instance as well. I have not terminated the select statement with a semi-colon. It really is somewhat of a preference here since the statement will work with or without the semi-colon just the same (and despite the previous notion that it was going to become mandatory to terminate all statements with a semi-colon, don’t believe it until it actually happens).

Here is the SQL Server 2016 version:

Now let’s go with a basic CTE example written the same way that we commonly see CTEs written on the internet.
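Again a hypothetical approximation, with the semi-colon both leading the CTE and terminating the statement:

;WITH OrderCTE AS
(
    SELECT soh.SalesOrderID
            , soh.OrderDate
            , soh.TotalDue
        FROM Sales.SalesOrderHeader soh
)
SELECT TOP (10) *
    FROM OrderCTE
    ORDER BY TotalDue DESC;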

Notice that I am using the semi-colon as a statement beginninator and terminator all in the same statement. The statement runs perfectly fine – that much is true. The usual reason for the leading semi-colon is that the CTE has to be preceded by a semi-colon. Let me see how true that is with a minor tweak!
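The tweak, sketched against the same hypothetical CTE, is simply to drop the leading semi-colon:

WITH OrderCTE AS
(
    SELECT soh.SalesOrderID
            , soh.OrderDate
            , soh.TotalDue
        FROM Sales.SalesOrderHeader soh
)
SELECT TOP (10) *
    FROM OrderCTE
    ORDER BY TotalDue DESC;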

What do you expect to happen from the execution of this statement? Well, as it stands the statement will execute without error just the same as it did in the previous example.

[Image: successful execution of the CTE without a leading semi-colon]

What? That worked? Yes. Yes it did. This is quite frankly due to the reality that a semi-colon is not required to precede a CTE. This statement was the only statement in the batch.

Case the Second

Based on those results, that must mean that a semi-colon is absolutely required when there is a statement preceding the CTE. Therefore, what is really required is that the statement immediately preceding the CTE must be terminated by a semi-colon. Not so fast there. Let’s test this one too!

Let’s use the following basic example.
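A hypothetical stand-in for that example – a preceding statement terminated with a semi-colon, followed by the CTE in the same batch:

SELECT GETDATE() AS RunTime;

WITH OrderCTE AS
(
    SELECT soh.SalesOrderID
            , soh.OrderDate
        FROM Sales.SalesOrderHeader soh
)
SELECT TOP (10) *
    FROM OrderCTE;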

That query works perfectly fine again. That must be due to the preceding semi-colon, right? Hold on to that thought. How about a rewrite as follows:
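One plausible reconstruction of that rewrite (an assumption on my part) is to “terminate” the batch separator itself:

SELECT GETDATE() AS RunTime
GO;

WITH OrderCTE AS
(
    SELECT soh.SalesOrderID
        FROM Sales.SalesOrderHeader soh
)
SELECT TOP (10) *
    FROM OrderCTE;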

Here we tweaked the query to represent the notion that all previous statements should be terminated with a semi-colon. What do you think will happen? This will result in an error. GO is a batch terminator and therefore does not require the semi-colon. Surely this reinforces the theory that the semi-colon should precede the CTE though, right? Let’s rewrite that query again.

I removed the semi-colon and now the code segment will execute without error. Wait there is more!

I want to continue tweaking this query. Let’s try multiple selects along with this CTE.

Now I have a query that has a preceding select that is not terminated with a semi-colon. I have a CTE that is created right after that select and then I select from that CTE. What could ever possibly happen with this particular query?

The short of that story is that the query will execute without error and I will end up with two result sets.

Wait! Hold on two seconds there! Surely the semi-colon is an absolute requirement because we see it everywhere that it is a mandatory requirement.

The reality is that the semi-colon requirement is not really entirely accurate. If the CTE happens to be in the same batch, then the previous statement in the batch must be terminated by the semi-colon. Take the following change for example:
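A hypothetical rendering of that change – the first statement wrapped in BEGIN…END, with the semi-colon terminating the END:

BEGIN
    SELECT GETDATE() AS RunTime
END;

WITH OrderCTE AS
(
    SELECT soh.SalesOrderID
        FROM Sales.SalesOrderHeader soh
)
SELECT TOP (10) *
    FROM OrderCTE;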

If I remove the semi-colon from after the “END” statement, then my query will terminate in an error. If I remove the BEGIN and END from around the first select statement, then the query will fail with an error. The error that I receive is part of the reason for the huge misconception about when a semi-colon is required:

Msg 319, Level 15, State 1, Line 38
Incorrect syntax near the keyword ‘with’. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

As I have already shown though, this is not accurate. Start thinking on a bigger scale than just semi-colons. Personally, I would prefer to see CTEs that are used in examples on the internet to be encapsulated with a BEGIN…END instead of using a preceding semi-colon (beginninator).

Has this behavior changed with SQL Server 2016? That is easily shown by the following example:

Recap

These examples clearly demonstrate the inaccuracy of the myth that a semi-colon is required to precede a CTE. Despite the statement made in the error message, it just is not the full story. A semi-colon to terminate the preceding statement within the same batch is necessary, but it is not required every time a CTE appears. If there is a batch separation between the preceding statement and the CTE, then a semi-colon is not required. And a semi-colon is never required to precede a CTE as such; it is only required to terminate a previous statement within the same batch.

Please don’t be led astray by some notions on the web that dictate a CTE must be preceded by a semi-colon.

SQL Server Select – Back to Basics

Published on: April 1, 2016

Prelude in SQL Minor

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics.

SQL Server Select

Of all the fundamental concepts within SQL Server, nothing is potentially more basic and fundamental than the often overlooked Select statement. Think about that statement for just a moment.

How often is the select statement overlooked and under-analyzed? Yet, this tiny little word is a rather large part of the transactions that occur within each instance of SQL Server on a routine basis.

Today, I want to break down the SQL Select statement and show the components of the statement along with the logical processing order of those components.

Let’s put on our thinking caps for a moment and think of all the possible components of this super fundamental piece of the TSQL language – the Select statement.

Anatomy of the Select

The first important piece of information with the Select is to understand what it does. What is a Select statement? What is the purpose of a Select? Really, the purpose of this statement should make sense based on the word alone – to select. In other words, to get or retrieve something that is of interest to you at that moment. In our case, that means to retrieve some piece of data from the database to meet our needs. Very simple, right? You issue a SELECT statement to get something out of the database.

Thinking through the Select statement, certain common elements probably stick out pretty easily. These elements might include the following:

  • Select
  • From
  • Join
  • Where

Then with a slightly more complex query, one might see the following components associated with a Select statement.

  • Top
  • Distinct
  • Group By
  • Order By

Then stepping it up to another level again, one might see the following:

  • Having
  • For
  • Into
  • With (Cube or Rollup)

Soak that in for a moment. Suddenly this fundamental piece of TSQL is starting to look a little more like black voodoo magic. It can get rather complex with this fundamental statement. Now soak in what the purpose of the SELECT is, as was stated just a bit ago. That purpose is to retrieve data that you need. One cannot always retrieve the desired data without some options to go with the SELECTion criteria. Suddenly, this complexity renders itself more as raw DBA power.

Knowing the various aspects of the Select, and not diving too far into what each does, you may be wondering how these pieces fit together into a query. Let’s take a quick look at a few different examples illustrating how these puzzle pieces fit together into a SELECT statement.

Here is an example showing a SELECT that utilizes the FROM, WHERE, GROUP BY, and ORDER BY clauses. When writing a SELECT statement, we will write the query in the order shown in the query. Using these clauses in a different location doesn’t read very well to humans and will throw a syntax error.
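The original query is not reproduced here; a hypothetical AdventureWorks example of that shape:

SELECT soh.SalesPersonID
        , COUNT(*) AS OrderCount
        , SUM(soh.TotalDue) AS TotalSales
    FROM Sales.SalesOrderHeader soh
    WHERE soh.OrderDate >= '20140101'
    GROUP BY soh.SalesPersonID
    ORDER BY TotalSales DESC;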

In this next example, I have used a few different clauses:
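Another hypothetical stand-in, this time adding DISTINCT, TOP, and a JOIN:

SELECT DISTINCT TOP (10)
        e.JobTitle
        , sp.SalesYTD
    FROM Sales.SalesPerson sp
    INNER JOIN HumanResources.Employee e
        ON e.BusinessEntityID = sp.BusinessEntityID
    ORDER BY sp.SalesYTD DESC;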

The real differences here being that I have used the DISTINCT, TOP and JOIN clauses. Again, trying to write this query with the TOP preceding the SELECT, or maybe with the JOIN preceding the FROM, doesn’t really make much sense when reading it. Nor would it work if we tried to execute the query, due to syntax errors.

This next example uses another option to output the results into an XML document using the FOR clause:
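Continuing the hypothetical example, with FOR XML appended:

SELECT DISTINCT TOP (10)
        e.JobTitle
        , sp.SalesYTD
    FROM Sales.SalesPerson sp
    INNER JOIN HumanResources.Employee e
        ON e.BusinessEntityID = sp.BusinessEntityID
    ORDER BY sp.SalesYTD DESC
    FOR XML PATH('SalesPerson'), ROOT('TopSellers');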

And in this last example, I take the previous example to output the results into a temp table.
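One reading of that example (the FOR XML clause is dropped here because it cannot be combined directly with SELECT…INTO):

SELECT DISTINCT TOP (10)
        e.JobTitle
        , sp.SalesYTD
    INTO #TopSellers
    FROM Sales.SalesPerson sp
    INNER JOIN HumanResources.Employee e
        ON e.BusinessEntityID = sp.BusinessEntityID
    ORDER BY sp.SalesYTD DESC;

SELECT * FROM #TopSellers;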

These are all examples of the possibilities that can be used with the SELECT statement to retrieve data from the database in order to meet our needs. Each one illustrates different variations and some level of complexities between each of the examples.

Now the question is, did you know that SQL Server does not process these statements in the same syntactic sequence or English-friendly form in which we write these SELECT statements? You may be surprised to learn that there is a different processing order for these pieces of the SELECT. This processing order is called the LOGICAL processing order.

Let’s take one of the queries already shown and compare.

The query as we may write it will look like this:

[Image: the SELECT statement as written, with each clause labeled]

The query as SQL Server will process it will look like this:

[Image: the same SELECT statement annotated in logical processing order]

Or in other words, we may see it as shown in this video:

That is significantly different than we might think, but it makes sense if you think about it long enough.

How does one know that SQL Server processes these steps in this order? Well, besides the documentation in BOL that supports this, one can dive into the internals and generate a tree that will show this in action. For instance, let’s use the same query as just shown and see what the processing tree would look like.

[Image: the optimizer’s converted tree for the example query]

Besides looking very colorful and almost as if it were in Greek, this processing tree is somewhat revealing in how SQL Server breaks down that SELECT statement we have been using. One more pic, and then a quick explanation.

[Image: SHOWPLAN_TEXT output for the example query]

This last image was generated using the SHOWPLAN_TEXT setting. I want to show this because it helps to reinforce a concept from the processing tree. In this last image, note that the OUTER REFERENCE is to the SalesPerson table. This also means (without stating it) that the inner reference is to the Employee table. I have added the INNER and OUTER table references to the image of the processing tree. Why is this important? Well, when trying to interpret the tree it is important to understand that one starts from the innermost node and then reads from bottom to top, right to left (as you ascend the nodes).

In this case, we will see that the Employee table is first referenced and processed, then the SalesPerson table, then the ON clause is evaluated. In the tree, I highlighted the ON that pertains to the join in bright green. The vertical dotted lines on the left are to help visually keep the nodes aligned as you step in/out from one level to the next. As we climb the tree to the top left corner, we will see that the last piece of the select to be processed in this case is the TOP statement. We can also see that the top is dependent on the ORDER BY (highlighted in RED) in order to get the correct desired rows.

Recap

This article jumps through the SELECT statement very briskly. The SELECT statement is a powerful tool within SQL Server. Gaining an understanding of how the SELECT statement is processed can help you write an appropriate SELECT and potentially pseudo-code that SELECT. The engine first determines the source of the data and then starts to break down the additional requirements that have been sent with the SELECT. The more options that are thrown in with the SELECT statement, the larger and more complex the processing tree becomes for that statement.

Used well, a SELECT will become your ally and a handy tool in the toolbox. This has been an introduction to the SELECT and how it is processed behind the scenes. If you would like to see more information about the different clauses related to the select statement, check out the BOL entry here.

As an aside, the tree that was displayed is called the “converted tree.” Here is a listing of the other possible trees.

[Image: listing of the optimizer tree types]

SQL Agent Job Schedules

Many moons ago, I posted a script to help report on the human-friendly scheduling information for SSRS subscriptions. You can read all about that here.

In a very similar fashion, today I want to share how easy it is to turn the internal schedules for SQL Agent jobs into something that is more human friendly. SQL Agent job schedules should be easy to read for humans if for nothing else than the fact that you have to provide this information to the auditors from time to time. We all know they can’t properly interpret the numeric job schedules and properly determine what they mean.

The job schedules work quite well with the current design – which is not human friendly. The use of numbers is a streamlined approach and far more efficient. It may be a bit more useful to us if there were some sort of translation table that was stored. A stored translation table would make it terribly easy to figure out what is happening within the schedules for the SQL Agent jobs. We don’t have that, so we often have to do a little trial and error. Or we rely on people that have done that trial and error – with a little verification on our part. After all, these scripts are posted on the internet.

SQL Agent Job Schedules

I am not going to break down the script. With the similarity to the aforementioned SSRS script, I would recommend looking there for a little bit of an explanation. Here is the script to retrieve SQL Agent Job schedules from the msdb database on your instance.
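The full script is not reproduced here. A much-simplified stand-in that decodes just the frequency type and start time, and excludes the Report Server category, looks like this (the original also filters out SSIS maintenance package jobs, which this sketch does not attempt):

-- Simplified sketch: human-friendly frequency and start time per job schedule
SELECT j.name AS job_name
        , s.name AS schedule_name
        , CASE s.freq_type
            WHEN 1   THEN 'Once'
            WHEN 4   THEN 'Daily'
            WHEN 8   THEN 'Weekly'
            WHEN 16  THEN 'Monthly'
            WHEN 32  THEN 'Monthly (relative)'
            WHEN 64  THEN 'When SQL Agent starts'
            WHEN 128 THEN 'When computer is idle'
          END AS frequency
        , STUFF(STUFF(RIGHT('000000' + CAST(s.active_start_time AS VARCHAR(6)), 6), 5, 0, ':'), 3, 0, ':') AS start_time
        , s.enabled AS schedule_enabled
    FROM msdb.dbo.sysjobs j
    INNER JOIN msdb.dbo.sysjobschedules js
        ON js.job_id = j.job_id
    INNER JOIN msdb.dbo.sysschedules s
        ON s.schedule_id = js.schedule_id
    INNER JOIN msdb.dbo.syscategories c
        ON c.category_id = j.category_id
    WHERE c.name <> 'Report Server'
    ORDER BY j.name;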

This script will only retrieve the SQL Agent Job schedules that are not SSRS related and that are not SSIS maintenance package related. The SSRS stuff has already been introduced, and a quick eval of those blog posts will show the differences between grabbing SSRS information vs. grabbing just SQL Agent Jobs.

Stay tuned. I will post a script for the SSIS related scheduling information.

Now, go forth and audit your agent jobs. Document the schedules in an easy to read format. Give that report / document to the auditors and your manager and look like a rock star for the day!

Shredding Extended Event Actions

The other day I wrote about that torrid love/hate relationship DBAs tend to have with working with XML. In that same post, I promised I would have a follow up post about XML in Extended Events.

Well, today I present to you another opportunity for you to renew your love/hate relationship with XML.

In the previous post (which you can read here), I discussed just one facet of shredding the XML related to Extended Events. Today, we have “actions” to discuss. I am not going to get into what an “action” is today. I will save that for a near future post.

For the shredding of the actions, I will use the TreeHuggerCPU Event Session I used in the last article (for the sake of consistency). The session doesn’t necessarily need to be running. I will just pull the actions related to the session from the metadata accessible via the system catalog views.

Should the session be running (and in my case it is for demonstration purposes), I could open the session data and view it as XML and see something like the following:

[Image: session XML with the data nodes and action nodes highlighted]

 

I have highlighted two different types of nodes available in the XML data. In the previous article, I discussed the “data” nodes and I have highlighted that in red here. Today, we are talking actions, and those are highlighted in green this time around. It is the “action” nodes that we will be shredding via the following script.
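The original script is not reproduced here. A sketch that pulls the action metadata from the catalog views follows; the commented event-name filter is only an example and can be loosened (or removed) for the broader searches described below.

-- Sketch: actions configured per event for a deployed XE session
SELECT es.name AS session_name
        , ese.name AS event_name
        , esa.package AS action_package
        , esa.name AS action_name
    FROM sys.server_event_sessions es
    INNER JOIN sys.server_event_session_events ese
        ON es.event_session_id = ese.event_session_id
    INNER JOIN sys.server_event_session_actions esa
        ON esa.event_session_id = ese.event_session_id
        AND esa.event_id = ese.event_id
    WHERE es.name = 'TreeHuggerCPU'                       -- all actions for one XE session
        -- AND ese.name = 'sql_statement_completed'       -- example: narrow to one event
    ORDER BY es.name, ese.name, esa.name;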

 

With this script, I can either search for all actions tied to an XE Session, for the actions tied to one event within a single XE Session, for all actions tied to a specific event across multiple sessions, or for all actions that are tied to any event of any event session deployed to the server.

Combine this with the previous script and suddenly all of that XML just got several times easier.
