Database Backups – Back to Basics

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Database Backups

db_backupIn my previous two articles, I touched on some data that is available to the data professional with regards to backups and database restores. In the article discussing the restore history data, I alluded to another topic related to restores that could have been discussed.

Today, I will be diving into that alternate direction. This direction is related to database backups. More specifically, how do you truly know if you have a successful backup? Here’s a hint: it has to deal with restores.

If you haven’t heard it before, here it is in simple terms “you do not have a backup until you have restored the backup“. If you ponder that for a minute, it is quite logical and makes plenty of sense. The only way to confirm that you have a backup is to test the purpose of the backup and that is to recover the database back to the state represented by the backup.

Testing Backups

Creating a database backup is a pretty straight-forward task. This is easy enough to do through the GUI or from tsql script, or db_restoreeven, *shudder*, from a maintenance plan. Creating the backup is the easy part. If you are not creating backups, I hope there is a good reason for it (and yes there are valid reasons for not creating a backup of certain databases).

This article will not explore the nuances of the ways to create a backup of a database. Suffice it to say, there are multiple options and methods. This article will focus on the second part of creating a successful backup – testing the backup. In other words, restoring the backup that was created.

So how does one go about testing a backup? The simple answer as already stated is to perform a restore. This means through the use of the restore command. But is it really that simple?

If I have the following backups available for my AdventureWorks2014 database, where should I test the validity of the backup?

avail_backups

Would it make much sense to take any of these backups and test the restore process on the same server as the source of the backup? To be honest, that is a question that may have an entirely different answer in each and every environment. That is a question that requires a little insight into the business needs, available resources and procedures and policies in place for the environment. For me, I typically like to create an automated restore process that will restore the previous night’s backup onto a test/stage/dev server that is not in production.

I have written previously on how I do this sort of automated restore. You can read all about it from the original article here. In that article I provide a script to assist with the restore of these backups. You are welcome to test it out and play around with it while setting up your backup validation environment. In using the script, the user assumes all risk.

I recommend an automated restore system to ensure the backups are tested on a routine basis. The benefits of doing this are more than just a few. That said here are a few of those benefits: 1. Confidence in backups, 2. Quick recovery in event of failure or disaster, 3. A usable environment for quick data comparison, 4. A passable environment for reporting, and the best benefit is that you can get a good nights sleep knowing your backups are reliable.

Caveat

With automated restores, there may be an occasional failure. Heck, you may run into a bit of a head-scratcher here or there trying to get them to work in the first place. It’s not technically easy the first time. With some practice, it gets much easier. This method is intended to be suitable for a cheap solution. There may be a pricier solution out there that can be bought. If that is in your budget – go with it. The main point is to do it.

After you get these restores working, I recommend using the scripts in this article to check the restore history from time to time. Take it an extra mile and generate some reports from that data.

Recap

I have provided some information on how and why to restore a database. The primary reason being that you never know how good your backup is until you have restored it. Some say you do not have a backup until you have restored it. This is a solution and some opinions on how to do that efficiently.

Database Backup History – Back to Basics

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Database Backup History

db_backupAs a data professional, there is little to no doubt that one of the most important tasks is to ensure the data is backed up regularly. We do this in preparation for several reasons. Whether it be to be able to recover in the event of a problem or to compare yesterdays data to how it looks today, the backup is done.

When doing a backup, how often do you report on the backup success? How frequently do you refer to the backup history to compare multiple runs or even generate reports for interested parties? The fact of the matter is that this history should be referenced on a routine basis to generate reports that should be read and not ignored.

Backup Reporting

Right about here you may be asking yourself why would somebody generate reports on the backups. After all, the job succeeds or fails and you respond appropriately to that, right? Does one really need to generate a report about the success of the backup job? Well, that is for you to decide. I would say it wouldn’t hurt to generate a report about the success rate of your backups. Management often likes to see reports concerning important processes. That said, that is not the reporting that I am referencing here.

The reporting I believe is useful in this case is indeed related to the historical information of your backups, but it is less about the success of the job and so forth. Of great benefit from the backup history data is the information on frequency of backup and also the size of the backup. These pieces of information can be somewhat useful when looking to forecast data growth. That is one example of a possible report when working with the backup history data that is stored.

Let’s take a look at a query that will help retrieve some of this backup history data.

Off the top, yes the script uses dynamic sql. I use dynamic sql because I use this script on multiple versions of SQL Server. One of the data points I gather in this script is the compressed backup size. That attribute is not available in SQL 2005. Due to that, and not wanting to maintain multiple scripts, I use the dynamic sql to account for that missing column.

I have also set the script to accept a database name parameter. If a name is provided, then only the backup history for that database is returned. If the parameter is left NULL, then the backup history for all databases will be returned. Additionally, I added a number of days parameter to limit the scope of the report to a specific range of days.

Among the data points returned in this script, you will note there is the duration of the backup, the date, and even the size of the backup. All of these attributes can help me to forecast future storage requirements both for the backup storage as well as for the data volume. Additionally, by knowing the duration of the backup and the trend of that duration, I can adjust maintenance schedules accordingly.

Caveat

In this particular script, I only retrieve the backup history for two backup types: FULL and DIFFERENTIAL. A small adjustment can be made to include other backup types if necessary. The line to change is:

Other possible values for backups are:

  • D = Database
  • I = Differential database
  • L = Log
  • F = File or filegroup
  • G =Differential file
  • P = Partial
  • Q = Differential partial

If you are interested in learning more about backup information, you can read this msdn article – here.

Recap

I have provided a quick example of how to retrieve pertinent backup history information. This information can be used to assist in forecasting storage requirements as well as assist in the job scheduling as backup jobs begin to run longer and longer as the data gets larger and larger.

For a related article, check out this article on how to retrieve the restore history from the msdb database.

Last Restore of a Database – Back to Basics

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Database Restore

db_restoreAs a data professional, one should be somewhat cognitive of backups with regards to the data. From the point of the creation of the backup, we can go in one of two directions with this article. Due to the importance of each of the directions, I will devote a basics article to each.

The order of publication of these articles in no means denotes a level of priority to the topic. I feel each is equally important. The two directions I see both deal with the restore of that backup that was created. One direction is in regards to reporting and the other direction is the actual restore.

Last Restore Date

Something that I seem to be getting asked more and more frequently is how to determine when a database was last restored. Maybe it is just a hot topic and maybe it will be short lived, but it sure seems to be something that is on the tip of peoples minds of late. So, how does one determine the last time a database was restored? The answer is actually quite simple. Within the msdb database, there is a table (that maintains the history of database restores) called restorehistory.

The restorehistory table stores information that will help you better understand what kind of restore was performed, who did the restore, when it was done, and even some information about the backup that was used to perform the restore. You can read more about the specifics of this table from the msdn article – here.

A quick means to query this table is via the following script:

Now, that is an extremely simple query looking at my TestDB database. The results of that query on my system are as follows:

db_restores_example

As is shown in the preceding illustration, I have multiple restores and restore types of this particular database. This is a good start. Let’s take this query up a notch. Not all production environments will have databases that are restored on the same server. What do we do to report on databases that are restored on the production server vs. the databases that exist on the production server? That is where this next query comes in to play.

In this latter query, I have added a few things to help for various reporting needs. You never know when the CIO or an auditor is going to come to you with a request to filter the data differently for restored databases. In this vein, I added parameters that will help filter the results down to a specific database, return all databases, or return all databases that have a restore history on the server. In this case, I decided to use the AdventureWorks2014 database as my example database. You will want to change the parameter as needed. If you pass a null value to the database name, then all databases can be returned in the result set – depending on the values used for the remaining parameters.

Caveat

This is all fine and well if the databases are restored onto the same server as the source database, right? What about the case where the database is restored to a different server? Well, the query still works for reporting the restored databases on the server where the database is restored. The complexity comes when trying to correlate back to the original production server. My recommendation here is good documentation and an administrative linked server that is not accessible via the application accounts or any user but the DBA group.

Recap

I have provided two quick examples of how to retrieve the restore history for your databases. I recommend that the restore history be checked on a routine basis. You never know when an over-permissioned user may decide to restore a database for you and then have all of the rest of the users coming to you to determine what happened. This will also be essential when we routinely test our database backups. Stay tuned for the next article where I will explore that aspect of database restores.

Login from an Untrusted Domain – Back to Basics

Categories: News, Professional, SSC
Comments: No Comments
Published on: November 8, 2016

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Untrusted Domain

SQL Server is full of wonderful features and tools. One feature that is not wonderful and is blatantly frustrating is the error message that is supposed to help you understand why you cannot connect to SQL Server.

Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.

Are you really sure the login is from an untrusted domain? I recently ran into this problem and found that it was less than helpful, though it seems basic enough in what the error should represent. When I encountered the problem, there was little useful information that pertained strictly to my problem.

Wait, am I saying that this error can be reported for various different legitimate problems dealing with logins? Absolutely, yes! Let’s recap some of my givens because these facts are important to troubleshooting the real underlying problem. Which in turn, causes this to be more of a “basics” type of post.

  1. The server was joined to the domain.
  2. The user in question was granted sa permissions to the instance.
  3. The user can login to the instance locally without error.
  4. My personal domain account could login locally without error.
  5. My personal domain account could login remotely without error.
  6. The user in question could not login remotely. Error thrown was the untrusted domain error.
  7. The version of SQL Server was SQL Server 2014 Express edition.

Now talk about a head scratcher from the get-go. The problem only seems to occur when trying to login from remote. So, as an example, here is what I was basically seeing. First, the tsql to create my test login.

add_userAnd with that user in place, let’s try to connect from a remote server to see what will happen.

pre_changeresults

Well, that is far from useful. I can definitely see that my user is properly created. I have more than adequate permissions. If the user was truly from an untrusted domain, the creation of the login would have failed.

As it turns out, there is something that is explicit to SQL Express with remote logins that generates this problem. Some have postured that the account needs to be added to the local administrators group. This is not accurate.

Better Solution

When you encounter this problem, there is an easy solution. Unfortunately it is not really a SQL Solution. Rather it is more of a sysadmin type of solution. The solution is to employ the use of group policy (either locally or a domain group policy). The specific policy that should be changed is called “Access this computer from the network”.

policytochange

My advice here would be to use a domain based group policy if you have a bunch of servers that fall into this need. In my case, we needed to touch 120+ computers. Manually setting this GPO on 120+ servers would have been very tedious.

For a single server like this example, I added it manually as shown below:

after_gpo

With the account granted this permission, all that is left is simply to try and connect.

post_connectresults

This is a great result. Now the account can connect and perform the work it was intended to do. In this case, it is a task account that would server as a proxy account to perform remote tasks such as performing backups.

Recap

 

Troubleshooting login failures is a core concept for the data professional. At times the cause of the login failure is far from intuitive. In this case, the failure is non-indicative of the actual problem. The real problem is that SQL Express does not like remote logins without a little bit of hoop jumping. Learning how to troubleshoot the problem is essential to becoming a better data professional.

Default Sessions – Back to Basics

Comments: No Comments
Published on: September 1, 2016

Remember When…

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s hope this post holds up to the intent of the challenge.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

Default Sessions

SQL Server is full of some pretty cool stuff. Some of it is rather basic. Some of it is a bit complex. Whether basic or complex, there is a lot that seems to get overlooked because it just might be hidden from plain sight or maybe it is overlooked just because it is already done for you.

Today, I am going to talk about some of the stuff that is already done for you. It is pre-baked and ready to roll. Since it is pre-baked, it may be a good idea to get familiar with it and understand what kinds of data it may offer to you when troubleshooting.

defaultsessionIt is still somewhat surprising to me to discover how many people have never heard of the default trace in SQL Server. Slightly less surprising is how many people have never heard of the default Extended Events system_health session. What if I told you there were a bunch more default sessions than the default trace or system_health? I wrote about some of those default sessions previously.

With the evolution of SQL Server, there is an evolution in Extended Events. New features in SQL Server also means there may be some additional default sessions. I want to touch on some of these other default sessions in brief.

Default Sessions

Phone Home

First, lets start with an easy to find default session – telemetry_xevents. This session is a visible session and behaves like a user defined session with the caveat that it is a session installed with SQL Server 2016 and it is a system default session. I have another post queued up to discuss this session in a bit more detail so will just keep it brief here.

This session is there to basically capture specific data and phone it home to Microsoft. You can presume that based on the definition of the word telemetry

“is an automated communications process by which measurements and other data are collected at remote or inaccessible points and transmitted to receiving equipment for monitoring.”

Before you get your tin-foil hat in a crumple over sending data back to the mother-ship, understand that this data is not sensitive data. The type of data being gathered is usage metrics. Think of it in terms of gathering data about how many times a specific feature was used or how many times a group of errors occurred within the environment. This is in an effort to help improve the product.

While the session data is designed to be sent back to Microsoft, it could also be of use to the Enterprising DBA who is looking to get a better grasp of the environment or to troubleshoot various issues.

It’s a Stretch

With the release of SQL Server 2016 there is another default session that gets installed. This session is probably the most difficult to find. This is a private hidden session when it is installed. In addition, it is not present by default with SQL 2016 RTM but it was present by default with various CTP versions. The session is specific to the new stretch feature and is called rdaxesession.

Breaking down this particular session reveals pretty easily what feature the session is designed to support. The name rdaxesession breaks down to the following: rda = remote data archive, and then the rest is self explanatory. If I investigate the metadata for the session I will discover that the following events are traced in the session:

stretch_database_enable_completed
stretch_database_disable_completed
stretch_table_codegen_completed
stretch_table_remote_creation_completed
stretch_table_row_migration_results_event
stretch_table_validation_error
stretch_table_unprovision_completed
stretch_index_reconciliation_codegen_completed
stretch_remote_index_execution_completed
stretch_database_events_submitted
stretch_table_query_error

Well, that certainly clears it up! This session is all about the stretch feature. If you are on RTM and don’t use the stretch feature, then you won’t see this session even if you do query the metadata.

Don’t be a HADR

The last of the default sessions I want to cover both can be combined into a category about HA and DR – loosely at least. One of the sessions deals with backups and the second of the sessions deals with clusters.

When dealing with a cluster, it is important to know about the hidden trace file that records events related to the cluster. The session target data can be found in “{529e71a51265b45c1f7f96357a70e3116ccf61cf0135f67b2aa293699de35170}PROGRAMFILES{529e71a51265b45c1f7f96357a70e3116ccf61cf0135f67b2aa293699de35170}\Microsoft SQL Server\MSSQLxx.MSSQLSERVER\MSSQL\Log” by default and cannot be seen within SQL Server metadata at all (not that I have been able to find). For a few details about how this trace works, I recommend reading this article. This session can be called the SQLDiag session.

The session related to backups is actually broken down into two log files and is similar in nature to the SQLDiag session. This is true in that the session is hidden, is found in the same directory, is read in the same fashion, and is equally as hidden from within SQL Server as the SQLDiag session. The backup session deals specifically with managed backup.

What is managed backup? Well, it is a type of backup performed to Azure and which relies upon the “smart_admin” procedures, functions and process.

If you don’t perform managed backups or you don’t have a cluster configured, you will not need either of these sessions (and potentially not see the smart_admin files depending on SQL Server version). If you have either of these features, you will definitely want to ensure you become acquainted with them and how to retrieve the data.

Recap

There are plenty of mechanisms in place to help troubleshoot some of the more complex features of SQL Server. Getting to know the tools will prepare you for that moment when problems arise and you are under the gun.

I did not dive deep into any of these sessions holding that for later articles. I wanted to keep this article to an introductory level. If you are interested in the more detailed articles, please stay tuned for those upcoming articles.

If you are interested in reading more about Extended Events, I recommend reading my series which can be found here.

SQL Server CTE – Back to Basics

Categories: News, Professional, SSC
Comments: 2 Comments
Published on: July 14, 2016

Quick Flashback

sqlbasic_sargeBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics. Reviewing that link, you can probably tell I am a bit behind in the monthly series.

SQL Server CTE

kapowOf all the fundamental concepts within SQL Server, this one drives me a bit batty from time to time. Think about that statement for just a moment. I am not going to go in depth about the syntax of a CTE.

I am not really interested in breaking down the semantics and uses of the CTE. Rather, I want to break down something commonly associated to the CTE that is far more basic. I would classify this as more of a myth and misconception than anything else.

The topic du jour happens to be around punctuation. More specifically, do we use terminators or do we use beginninators? Yes that is a made-up term, but you get the drift. Where exactly should the punctuation be placed? Better yet, is it mandatory to include that punctuation every single time?

Those are the items I wish to discuss and hopefully demystify with a few examples (if that many are even necessary).

Punctuation

Punctuation shouldn’t be too big of a deal right? There is so little punctuation within t-sql that it really should be very basic and easy to understand. That said, there is this perpetual myth that a semi-colon is required at the beginning of a CTE. Furthermore, and somewhat more accurate, is that a semi-colon is a requirement in order for a CTE to work. Let’s take a look at both of those scenarios.

Case the First

Let’s just take a look at the most basic of scripts. A simple select statement without a CTE.

Straight away, we can see that I am accessing the AdventureWorks2014 Database. This happens to be on my 2014 instance as well. I have not terminated the select statement with a semi-colon. It really is somewhat a preference here since the statement will work with or without the semi-colon just the same (and despite the previous notion that it was going to become mandatory to terminate all statements with a semi-colon, don’t believe it until it actually happens).

Here is the SQL Server 2016 version:

Now let’s go with a basic CTE example written the same way that we commonly see CTEs written on the internet.

Notice that I am using the semi-colon as a statement beginninator and terminator all in the same statement. The statement runs perfectly fine – that much is true. The usual reason for the leading semi-colon is that the cte has to be preceded by a semi-colon. Let me see how true that is with a minor tweak!

What do you expect to happen from the execution of this statement? Well, as it stands the statement will execute without error just the same as it did in the previous example.

basicctetweaked

What? That worked? Yes. Yes it did. This is quite frankly due to the reality that a semi-colon is not required to precede a CTE. This statement was the only statement in the batch.

Case the Second

Based on those results, that must mean that a semi-colon is absolutely required when there is a statement preceding the CTE. Therefore, what is really required is the statement that immediately precedes the CTE must be terminated by a semi-colon. Not so fast there. Let’s test this one too!

Let’s use the following basic example.

That query works perfectly fine again. That must be due to the preceding semi-colon, right? Hold on to that thought. How about a rewrite as follows:

Here we tweaked the query to represent the notion that all previous statements should be terminated with a semi-colon. What do you think will happen? This will result in an error. GO is a batch terminator and therefore does not require the semi-colon. Surely this reinforces the theory that the semi-colon should come preceding to the CTE though, right? Let’s rewrite that query again.

I removed the semi-colon and now the code segment will execute without error. Wait there is more!

I want to continue tweaking this query. Let’s try multiple selects along with this CTE.

mind_blownNow I have a query that has a preceding select that is not terminated with a semi-colon. I have a CTE that is created right after that select and then I select from that CTE. What could ever possibly happen with this particular query?

The short of that story is that the query will execute without error and I will end up with two result sets.

Wait! Hold on two seconds there! Surely the semi-colon is an absolute requirement because we see it everywhere that it is a mandatory requirement.

The reality is that the semi-colon requirement is not really entirely accurate. If the CTE happens to be in the same batch, then the previous statement in the batch must be terminated by the semi-colon. Take the following change for example:

If I remove the semi-colon from after the “END” statement, then my query will terminate in an error. If I remove the BEGIN and END from around the first select statement, then the query will fail with an error. The error that I receive is part of the reason for the huge misconception about when a semi-colon is required:

Msg 319, Level 15, State 1, Line 38
Incorrect syntax near the keyword ‘with’. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

As I have already shown though, this is not accurate. Start thinking on a bigger scale than just semi-colons. Personally, I would prefer to see CTEs that are used in examples on the internet to be encapsulated with a BEGIN…END instead of using a preceding semi-colon (beginninator).

Has this behavior changed with SQL Server 2016? That is easily shown by the following example:

Recap

These examples clearly demonstrate the inaccuracy of the myth that a semi-colon is required. Despite the statement made in the error message, it just is not entirely the full story. A semi-colon to terminate the preceding statement within a batch is necessary but it is not required every time. If there is a separation between batches and the CTE, then a semi-colon is not required. And never is a semi-colon required to precede a CTE. It is required to terminate a previous statement within the same batch.

Please don’t be led astray by some notions on the web that dictate a CTE must be preceded by a semi-colon.

SQL Server Select – Back to Basics

Comments: No Comments
Published on: April 1, 2016

Prelude in SQL Minor

translucentman_greenBack in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

With this being another installment in a monthly series, here is a link to review the other posts in the series – back to basics.

SQL Server Select

atlasian_manOf all the fundamental concepts within SQL Server, nothing is potentially more basic and fundamental than the often overlooked Select statement. Think about that statement for just a moment.

How often is the select statement overlooked and under-analyzed? Yet, this tiny little word is a rather large part of the transactions that occur within each instance of SQL Server on a routine basis.

Today, I want to break down the SQL Select statement and show the components of the statement along with the logical processing order of those components.

Let’s put on our thinking caps for a moment and think of all the possible components of this super fundamental piece of the TSQL language – the Select statement.

Anatomy of the Select

The first important piece of information with the Select is to understand what it does. What is a Select statement? What is the purpose of a Select? Really, the purpose of this statement should make sense based on the word alone – to select. In other words, to get or retrieve something that is of interest to you at that moment. In our case, that means to retrieve some piece of data from the database to meet our needs. Very simple, right? You issue a SELECT statement to get something out of the database.

Thinking through the Select statement, certain common elements probably stick out pretty easily. These elements might include the following:

  • Select
  • From
  • Join
  • Where

Then with a slightly more complex query, one might see the following components associated with a Select statement.

  • Top
  • Distinct
  • Group By
  • Order By

Then stepping it up to another level again, one might see the following:

  • Having
  • For
  • Into
  • With (Cube or Rollup)

Soak that in for a moment. Suddenly this fundamental piece of TSQL is starting to look a little more like black voodoo magic. It can get rather complex with this fundamental statement. Now soak in what the purpose of the SELECT is, as was stated just a bit ago. That purpose is to retrieve data that you need. One cannot always retrieve the desired data without some options to go with the SELECTion criteria. Suddenly, this complexity renders itself more as raw DBA power.

Knowing the various aspects of the Select, and not diving too far into what each does, you may be wondering how these pieces fit together into a query. Let’s take a quick look at a few different examples illustrating how these puzzle pieces fit together into a SELECT statement.

Here is an example showing a SELECT that utilizes the FROM, WHERE, GROUP BY, and ORDER BY clauses. When writing a SELECT statement, we will write the query in the order shown in the query. Using these clauses in a different location doesn’t read very well to humans and will throw a syntax error.

In this next example, I have used a few different clauses:

The real differences here being that I have used the DISTINCT, TOP and JOIN clauses. Again, trying to write this query with the TOP preceding the TOP or maybe with the JOIN preceding the top doesn’t really make much sense when reading it. Nor would it work if we tried to execute the query due to syntax errors.

This next example uses another option to output the results into an XML document using the FOR clause:

And in this last example, I take the previous example to output the results into a temp table.

These are all examples of the possibilities that can be used with the SELECT statement to retrieve data from the database in order to meet our needs. Each one illustrates different variations and some level of complexities between each of the examples.

Now the question is, did you know that SQL Server does not process these statements in the same syntactic sequence or English friendly forms that we write these SELECT statements? You may be surprised to learn that there is a different processing order to these pieces of the SELECT. This processing order is called the LOGICAL Processing order.

Let’s take one of the queries already shown and compare.

The query as we may write it will look like this:

Anatomy_select

The query as SQL Server will process it will look like this:

anatomy_logop

Or in other words, we may see it as shown in this video:

That is significantly different than we might think, but it makes sense if you think about it long enough.

How does one know that SQL Server processes these steps in this order? Well, besides the documentation in BOL that supports this, one can dive into the internals and generate a tree that will show this in action. For instance, let’s use the same query as just shown and see what the processing tree would look like.

select_tree_map

Besides looking very colorful and almost as if it were in Greek, this processing tree is somewhat revealing in how SQL Server breaks down that SELECT statement we have been using. One more pic, and then a quick explanation.

select_showplan_text

This last image was generated using the SHOWPLAN_TEXT setting. I want to show this because it helps to reinforce a concept from the processing tree. In this last image, note that the OUTER REFERENCE is to the SalesPerson table. This also means (without stating it) that the inner reference is to the Employee table. I have added the INNER and OUTER table references to the image of the processing tree. Why is this important? Well, when trying to interpret the tree it is important to understand that one starts from the inner most node and then read from bottom to top, right to left (as you ascend the nodes).

In this case, we will see that the Employee table is first referenced and processed, then the SalesPerson table, then the ON clause is evaluated. In the tree, I highlighted the ON that pertains to the join in bright green. The vertical dotted lines on the left are to help visually keep the nodes aligned as you step in/out from one level to the next. As we climb the tree to the top left corner, we will see that the last piece of the select to be processed in this case is the TOP statement. We can also see that the top is dependent on the ORDER BY (highlighted in RED) in order to get the correct desired rows.

Recap

This article jumps through the SELECT statement very briskly. The SELECT statement is a powerful tool within SQL Server. Gaining an understanding how the SELECT statement is processed can be helpful to write an appropriate SELECT and how to potentially pseudo code that SELECT. If you can understand that the engine first determines the source of the data and then starts to break down the additional requirements that have been sent with the SELECT. The more options that are thrown in with the SELECT statement, the larger and more complex the processing tree becomes for that statement.

Used well, a SELECT will become your ally and a handy tool in the toolbox. This has been an introduction to the SELECT and how it is processed behind the scenes. If you would like to see more information about the different clauses related to the select statement, check out the BOL entry here.

As an aside, the tree that was displayed is called the “converted tree.” Here is a listing of the other possible trees.

optimizertrees

SQL Server Locks – Back to Basics

loch_nessPrelude in SQL Minor

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

SQL Server Locks

A fundamental component of SQL Server is locking and locks. Locks within SQL Server are critical to the proper functioning of the database and the integrity of the data within the database. The presence of locks does not inherently mean there is a problem. In no way should locking within SQL Server be considered a monster, though locks may often times be misconstrued in that light.

So what is locking? What are locks? In SQL Server, locking is a pessimistic mechanism that limits, or synchronizes, access to resources between multiple different requests. A lock is the manifestation of that mechanism. Think of a lock within SQL Server as if it were a padlock that was temporarily securing a gate to a storage shed limiting access to the shed.

That is the overly simplistic illustration of a lock in SQL Server. Have you ever seen a gate with more than one lock on it? Well, just like that multi-lock gate, resources within SQL Server can have several different locks placed on them. This is where it starts to get complicated and tricky.

To help control the locks and direct access to resources within the database, SQL Server utilizes a service called the lock manager. When a query is executed, the Database Engine (more precisely the query processor) determines what resources are needed and what locks are needed for those resources. Then the query processor walks over to the lock manager and begs for the lock types needed. The lock manager takes a look at the locks in use on the resources and either grants the locks or tells the query processor to wait until the locks are available.

This process is not terribly different from a tenant requesting the use of a meeting room from the building manager (or property manager). If the meeting room is available, the building manager will grant access to the meeting room and place a hold on that room preventing other tenants from using it at the time you requested it. Well, not terribly different but on a much much larger scale.

Just as you may see in a building with multiple meeting spaces, offices, rooms, and equipment as resources, there are many types of resources within SQL Server. The lock manager has to coordinate different types of locks for the varying types of resources. Let’s take a look at some of the lock types and resource types.

Lock Types

First up is to explore the different types of locks in SQL Server. One can either explore the internet or query SQL Server direct to try and find what is available. I will show two ways to find the different types of locks available within SQL Server via the following queries:

Looking at the results from the first query, I will get results similar to the following:

name map_value
lock_mode NL
lock_mode SCH_S
lock_mode SCH_M
lock_mode S
lock_mode U
lock_mode X
lock_mode IS
lock_mode IU
lock_mode IX
lock_mode SIU
lock_mode SIX
lock_mode UIX
lock_mode BU
lock_mode RS_S
lock_mode RS_U
lock_mode RI_NL
lock_mode RI_S
lock_mode RI_U
lock_mode RI_X
lock_mode RX_S
lock_mode RX_U
lock_mode LAST_MODE

The last lock type in this list is kind of an interesting one. I believe this maps to the RangeX-X lock type seen in the documentation and in the second of the two queries provided. What do these letters mean? It looks like a bunch of alphabet soup right? Well, there is a little more info to be found once turning to the documentation (check the image by the heading of “key”). Here’s a little table to decrypt it as well:

name map_value Decrypted
lock_mode NL No Lock
lock_mode SCH_S Schema Stability Locks
lock_mode SCH_M Schema Modification Locks
lock_mode S Shared
lock_mode U Update
lock_mode X Exclusive
lock_mode IS Intent Shared
lock_mode IU Intent Update
lock_mode IX Intent Exclusive
lock_mode SIU Shared with Intent Update
lock_mode SIX Shared with Intent Exclusive
lock_mode UIX Update with Intent Exclusive
lock_mode BU Bulk Update
lock_mode RS_S Shared Range-Shared
lock_mode RS_U Shared Range-Update
lock_mode RI_NL Insert Range-Null
lock_mode RI_S Insert Range-Shared
lock_mode RI_U Insert Range-Update
lock_mode RI_X Insert Range-Exclusive
lock_mode RX_S Exclusive Range-Shared
lock_mode RX_U Exclusive Range-Update
lock_mode LAST_MODE Exclusive Range-Exclusive

I want to just cover some quick highlights about a few lock types:

Lock mode Description
Shared (S) Used for Read operations that do not change data.
Update (U) Used on resources that can be updated. Helps to reduce some deadlocks.
Exclusive (X) Used for data-modification operations, such as INSERT, UPDATE, or DELETE. Ensures that multiple updates cannot be made to the same resource at the same time.
Intent Used to establish imminent need for a lock.
Schema Used when a schema dependent operation is executing.
Bulk Update (BU) Used when bulk copying data into a table and the TABLOCK hint is specified.
Key-range Protects the range of rows read by a query when using the serializable transaction isolation level. Ensures that other transactions cannot insert rows that would qualify for the queries of the serializable transaction if the queries were run again.

Lock Resources

Lock resources are the things upon which a lock is held. This is the lock granularity and hierarchy. It is this hierarchy that helps to reduce the overall cost of locking in SQL Server (again banishing the notion of it being a monster). The hierarchy of locks can be explored via queries in SQL Server (and it can be found – to a degree – in the documentation). Here are a couple of queries to help explore the locks hierarchy:

The results of the first of these queries are:

name map_value
lock_resource_type UNKNOWN_LOCK_RESOURCE
lock_resource_type NULL_RESOURCE
lock_resource_type DATABASE
lock_resource_type FILE
lock_resource_type UNUSED1
lock_resource_type OBJECT
lock_resource_type PAGE
lock_resource_type KEY
lock_resource_type EXTENT
lock_resource_type RID
lock_resource_type APPLICATION
lock_resource_type METADATA
lock_resource_type HOBT
lock_resource_type ALLOCATION_UNIT
lock_resource_type OIB
lock_resource_type ROWGROUP
lock_resource_type LAST_RESOURCE

There are a few differences between this list and what is found in the documentation. There are also a few differences between the results of the first and second queries. This should give a pretty good idea about the level at which locks could be held from a key all the way up to database.

I have shared these because the lock resources may come in handy when troubleshooting or even designing queries and database schemas.

Lock Owners

The last bit about locks for this time is the lock owner. Another way of thinking about this one is what is happening while the lock is being held. Is the query a part of a transaction that is holding the lock? Maybe the query is a cursor and the cursor is holding locks for the queries within the construct of the cursor. The following query that shows what the possible owners are can be a little helpful in understanding what an owner is.

name map_value
lock_owner_type Transaction
lock_owner_type Cursor
lock_owner_type Session
lock_owner_type SharedXactWorkspace
lock_owner_type ExclusiveXactWorkspace
lock_owner_type LockConflictNotificationObject
lock_owner_type LockTableIterator
lock_owner_type LastLockInfoOwner

This is another one of those items to explore which could help understand the nature of the locks if you happen to be troubleshooting or investigating a certain locking issue within your environment.

Recap

 

Locking is a fundamental concept in SQL Server. Locking is the mechanism that coordinates access to resources while maintaining data integrity. Locks are the tool used in locking to ensure access is only granted to the correct requestor and preventing access to the resource to outside sessions.

This was a basic introductory article on the topic. Stay tuned for a more in-depth look at a specific scenario involving locking and deadlocking.

SQL Server Principals – Back to Basics

Categories: News, Professional, Security, SSC
Comments: 3 Comments
Published on: January 28, 2016

securedb

Prelude in SQL Minor

Back in late December of 2015, a challenge of sorts was issued by Tim Ford (twitter) to write a blog post each month on a SQL Server Basic. Some have hash-tagged this as #backtobasics. Here is the link to that challenge sent via tweet.

I did not officially accept the challenge. Was an official acceptance required? I don’t know. I do know that I think it is a good challenge and that I intend to participate in the challenge. I hope I can meet the requirements and keep the posts to “basics”. Let’s just call this first post in the challenge to be my official acceptance.

SQL Server Principals

A fundamental component of SQL Server is the security layer. A principle player in security in SQL Server comes via principals. SQL Server principals come in more than flavor. (This is where a lot of confusion gets introduced.) The different types of principals are database and server. A database principal is also called a database user (sometimes just user). A server principal is also called a server login, or login for short.

Server and database principals are quite a bit different in function and come with different permission sets. The two are sometimes used, in reference, interchangeably, but that is done incorrectly. Today I hope to unravel some of what each is and how to see permissions for each type of principal.

Let’s first look at definitions for each type of principal and how to query for basic information about each principal (e.g. name and creation date).

Principals

In general principals are entities to which permissions are granted. These entities can request access or resources. As mentioned already, these principals can be scoped to different levels. These levels as mentioned include database and server.

Server Principals

Server Principals are the kind of principals that include logins and server roles. You may be familiar with some of the server roles:

  • public
  • sysadmin
  • securityadmin
  • serveradmin
  • setupadmin
  • processadmin
  • diskadmin
  • dbcreator
  • bulkadmin

The logins can be created from a windows login or group, or be created as a SQL Login. These principals can also include custom created server roles (in addition to the system created server roles already listed). Once a principal is created, permissions may be granted to the principal. When these permissions are granted, then when the principal attempts to request a resource (related to the permission), to perform a task, the principal can complete that task.

What permissions can be granted to a principal at the server scope? A list of permissions can be created via the following query:

And a sample of the results could look like this:

server permissions

An interesting note here is in the red highlighted permissions. If you look at the documentation for server permissions you will not find those two permissions (at least not as of this writing).

From the permissions returned by the query, you will see that these are all permissions related to server administration type of tasks. Note that these permissions do not grant the ability to do the type of tasks attributed to database types of actions. For instance, the server permissions do not grant the explicit permission to create a reference, execute a procedure or create a table within a database. These are all permissions reserved for the database scope.

Database Principals

Database principals are the type of principals scoped to the database level. These principals will request resources from the database and depending on permissions granted to the principal be able to perform various tasks within the database. The types of database principals include database roles, application roles, and database users. SQL Server Logins can be mapped to a database user and thereby be granted access to the database as the database principal.

Since a database principal can include the database roles, here is a list of the potential database roles:

  • public
  • db_owner
  • db_accessadmin
  • db_securityadmin
  • db_ddladmin
  • db_backupoperator
  • db_datareader
  • db_datawriter
  • db_denydatareader
  • db_denydatawriter

These roles can significantly simplify security management within the database. One can easily assign a database principal to be a member of the db_backupoperator role and thus grant that principal the ability to backup the database without much more need to grant or deny permissions.

Considering the ease of role management, one thing that bugs me and that I see frequently is the addition of a database user to every database role in all databases. It makes little sense to add a user to the db_owner role and then also to the db_datareader and db_denydatareader roles. For one, db_owner already has the ability to read from the tables thus negating db_datareader. Consider the db_denydatareader now – it is opposite to db_datareader. Why try to grant and deny read access to the same user? It makes no sense at all.

The next thing that pains me about these roles is the public role. I have written about the public role previously, but it needs stating again. Do not grant any additional permissions to the public role. This is like enabling the guest user and opening up the database to everybody. I have seen a rash of granting “alter server state” and “view server state” to the public server level role and it is painful to see. The same advice applies to the public role whether it is at the server or database scope.

Once a database principal has been created, it is time to proceed to giving the principal the necessary access. Here are some of the possible permissions that can be granted (along with a query to find even more):

db_permissions

Within these results, it is apparent that a database principal can be granted the ability to perform backups, create procedures, execute procedures and even create encryption keys. Between the server scope and the database scope, there is decent level of granularity to control access and resources within the database instance.

Finding Permissions for Each Principal

It is not uncommon to need to know and report on who has been granted what level of permissions within the database environment. If you have been with the database since inception to conception, you probably have documentation on every permission that has been granted. If you inherit a database, your odds on good documentation about the permissions is probably significantly lower. I have shared a comprehensive script previously to show all of the permissions. Sometimes a little less info is more than adequate for the current needs.

Here is a quick alternative with just a little less info to provide database permissions and server permissions based on the input of a specific list of users and databases.

Now, this script does require the use of a function to split strings. The one I am referencing was written by Jeff Moden and can be found here. In this example, I am looking at a few test principals that I created – testuser, Phantom, Gargouille and Garguoille (which is invalid). Running the script, I would receive results such as the following:

audit_output2

This is a pretty quick running script to gather report worthy data on principals and permissions.

A Haunting TSQL Tuesday Tale

Comments: 1 Comment
Published on: October 11, 2010

Time for the ghouls and goblins to come out of the woodwork once again for another tale of deception and tyranny.  OK maybe not tyranny but definitely a haunting tale is to be told this month while trick-or-treaters plan their routes and mend their costumes.  We have SQueeLing experts suiting up for the adult party known as TSQL Tuesday.

This month the vampires, ghosts, and zombies are all breaking it down at Sankar Reddy’s place (Blog | Twitter).  The ghoulish stories to be shared are to be myths from the underSQLworld.

Truncating a Table is Unrecoverable

Have you heard this one in the past?  If you truncate a table, you cannot undo the action like you can with a delete.  What is the difference between the two methods to remove data from a table?

When you truncate a table, the pages are deallocated with minimal locking of the resources.  When you use the delete statement, the data is deleted row-by-row with more locks.  Everybody knows that a delete can be rolled back.  Many people believe that a Truncate cannot be rolled back.

Well, at this point we really need a table to test.  So let’s create a table and perform a few simple tests.

With the test table ready to go, we will begin with the baseline demonstration on the Delete command.  Here is the script to demonstrate that a Delete can be rolled back (but we already knew that).

Pretty simple and straight forward – baseline is proven with that script.  The premise that a Delete can be rolled back has been established.  Now, how about that truncate statement?  For the truncate statement, I will use the same simple script, substituting the appropriate commands in this script.

When this script is executed, we see that indeed the data is recoverable.  Here are the results I get from the truncate and associated rollback script.

Conclusion

I think this is one of those myths that is frequently floated throughout the SQL world due to another myth that is related to the Truncate statement.  It is often explained that you can’t recover a truncate because the truncate is not logged.  That is also obviously wrong (if it weren’t logged, how could you recover the table without a restore?).  The myth about Truncate being non-logged has already been debunked (as has this one – in a different series) by Paul Randal in his Myth-A-Day series.  The key to this is that the Truncate is logged and that you wrap the command in a transaction – and only commit when you are ready.

This happens to be a good post for the back to basics series. Other topics in the series include (but are not limited to): Backupsbackup history and user logins.

page 2 of 2»

Calendar
May 2018
M T W T F S S
« Apr    
 123456
78910111213
14151617181920
21222324252627
28293031  

Welcome , today is Friday, May 25, 2018