Channel: OptimalBI » SQL Server | OptimalBI

How to Write Dynamic SQL Well

Steven Ensslen

I do a lot of work in Microsoft SQL Server's Transact SQL stored procedures. Almost everything that I write performs a single operation on a series of tables: for example, computing a specific aggregate on each table in an entire database. Transact SQL does not have inheritance or the other features that are standard in more recent, object-oriented languages like Java or C#. What Transact SQL does have is dynamic SQL.

Dynamic SQL has a bad reputation. When I started as a DBA I had a code quality checklist that listed dynamic SQL as a problem. I later wrote a policy that said that dynamic object creation was forbidden. I've learned better since. If you're going to perform the same operation on a number of different tables, generating the code is the only sensible option. Otherwise you're repeating yourself, which is a far worse practice.

If you're writing Dynamic SQL you should be sure to:

  1. Work at a single level of abstraction

    A single level of abstraction is good practice for any programmer, but it is particularly important in dynamic SQL. Separate the generation of the dynamic SQL from the execution of the dynamic SQL. One part of the generator queries the database dictionary and works with strings. A separate part worries about execution. Pass the actual SQL as plain text between the two.
    This is particularly important if you run the statements that you generate rather than building procedures out of them. You need an interface that you can debug, one that shows the SQL that you are running.
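
    As a minimal sketch of that separation: the first half below only reads the dictionary and builds text, the second half only executes text, and `@sql` is the plain-text hand-off that can be printed in between. The row-count statement is just an illustrative template, and `STRING_AGG` assumes SQL Server 2017 or later.

    ```sql
    DECLARE @crlf nchar(2) = NCHAR(13) + NCHAR(10);
    DECLARE @sql  nvarchar(max);

    -- Generation: query the dictionary and build text. Nothing is executed here.
    SELECT @sql = STRING_AGG(
               CAST(N'SELECT COUNT(*) AS row_count FROM '
                    + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N';'
                    AS nvarchar(max)),
               @crlf)
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;

    -- Hand-off: the generated SQL is plain text that a person can read.
    -- (PRINT shows at most the first 4000 characters of an nvarchar value.)
    PRINT @sql;

    -- Execution: knows nothing about how the text was built.
    IF @sql IS NOT NULL
        EXEC sys.sp_executesql @sql;
    ```

    Because the execution half only ever sees a string, it can be commented out while debugging and the printed text pasted into a query window instead.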

  2. Use transactions

    Wrap all of your dynamic SQL in explicit transactions and exception handlers. If you're generating procedures then the generator should use an exception-handled explicit-transaction to create the dynamic procedure, which also contains an exception-handled explicit-transaction.
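
    A minimal sketch of such a wrapper, assuming a generated statement is already sitting in `@sql`. The table `dbo.example_table` and its column `loaded_at` are hypothetical stand-ins for whatever the generator produced.

    ```sql
    SET XACT_ABORT ON;  -- any run-time error dooms the whole transaction

    -- Stand-in for generated SQL; table and column names are hypothetical.
    DECLARE @sql nvarchar(max) =
        N'UPDATE ' + QUOTENAME(N'dbo') + N'.' + QUOTENAME(N'example_table')
        + N' SET loaded_at = SYSUTCDATETIME();';

    BEGIN TRY
        BEGIN TRANSACTION;
        EXEC sys.sp_executesql @sql;
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Undo any partial work, then re-raise the original error to the caller.
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;
    END CATCH;
    ```

    Errors raised while compiling the dynamic batch, such as a misspelled object name, happen one execution level down, so the outer `CATCH` block sees them as well.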

  3. Log obsessively

    Write a detailed log. The generator will probably execute without anyone watching. Log not only that the generator ran but also why it ran.
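
    One possible shape for such a log, as a sketch only. The table `dbo.generator_log`, its columns, and the reason text are all hypothetical; the point is that each row records who ran the generator, why, what SQL it produced, and how it ended.

    ```sql
    -- Hypothetical log table; adjust names and sizes to taste.
    CREATE TABLE dbo.generator_log (
        log_id     int IDENTITY(1,1) PRIMARY KEY,
        logged_at  datetime2(3)   NOT NULL DEFAULT SYSUTCDATETIME(),
        run_by     sysname        NOT NULL DEFAULT ORIGINAL_LOGIN(),
        reason     nvarchar(400)  NOT NULL,  -- why the generator ran
        sql_text   nvarchar(max)  NULL,      -- what it generated
        outcome    nvarchar(20)   NOT NULL,
        error_text nvarchar(4000) NULL
    );

    DECLARE @reason nvarchar(400) = N'Nightly run: new table detected';  -- illustrative
    DECLARE @sql    nvarchar(max) = N'SELECT 1;';  -- stand-in for generated SQL

    BEGIN TRY
        EXEC sys.sp_executesql @sql;
        INSERT INTO dbo.generator_log (reason, sql_text, outcome)
        VALUES (@reason, @sql, N'succeeded');
    END TRY
    BEGIN CATCH
        INSERT INTO dbo.generator_log (reason, sql_text, outcome, error_text)
        VALUES (@reason, @sql, N'failed', ERROR_MESSAGE());
        THROW;
    END CATCH;
    ```

    If the generated SQL runs inside an explicit transaction, write the failure row after the rollback, otherwise the log entry is rolled back along with everything else.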

  4. Guard against bad inputs

    SQL injection doesn't exist in my line of business. In business intelligence, users get data out but don't put data in. Still, every identifier (column name or table name) that is dynamic in the SQL you are generating needs to be wrapped as a quoted identifier. This guards against unusual characters in object names and helps with debugging. You haven't written much dynamic SQL if you haven't seen an edge case misplace an entire clause inside a set of quotes.
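
    In Transact SQL the built-in `QUOTENAME` function does this wrapping. The sketch below uses a deliberately awkward, made-up table name to show the effect; it only prints the statement and does not run it.

    ```sql
    DECLARE @schema sysname = N'dbo';
    DECLARE @table  sysname = N'Order Details]; DROP TABLE x; --';  -- made-up hostile name

    -- QUOTENAME wraps the name in brackets and doubles any closing bracket inside it.
    DECLARE @sql nvarchar(max) =
        N'SELECT COUNT(*) FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N';';

    PRINT @sql;
    -- prints: SELECT COUNT(*) FROM [dbo].[Order Details]]; DROP TABLE x; --];

    -- Values, as opposed to identifiers, go in as parameters rather than as text.
    EXEC sys.sp_executesql
        N'SELECT name FROM sys.tables WHERE name = @name;',
        N'@name sysname',
        @name = @table;
    ```

    `QUOTENAME` returns NULL when its input is longer than 128 characters, so a NULL statement is a useful signal that something other than an object name was passed in.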

  5. Generate with a sensible level of permissions

    Do NOT run your generator as a super-user or DBA privileged user. It will occasionally do unexpected things. Create an account with the privileges that are needed.
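
    A sketch of what such an account could look like, assuming the generator only needs to read tables in the `dbo` schema. The user name `sql_generator` and the exact grants are hypothetical; the real list depends on what the generated SQL does.

    ```sql
    -- Hypothetical dedicated account with only the rights the generator needs.
    CREATE USER sql_generator WITHOUT LOGIN;
    GRANT SELECT          ON SCHEMA::dbo TO sql_generator;
    GRANT VIEW DEFINITION ON SCHEMA::dbo TO sql_generator;

    -- Run the generator under that account rather than as a DBA.
    EXECUTE AS USER = N'sql_generator';
        SELECT USER_NAME() AS running_as;  -- generator call would go here
    REVERT;
    ```

    With this in place an unexpected `DROP` or `ALTER` in the generated text fails with a permissions error instead of succeeding quietly.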

You have four major options for dynamic SQL:

  1. Generate statements outside the database at runtime

    This is theoretically the best practice. The generator is written in a fully featured programming language. It includes templates (strings) of the SQL that it will run in the database. It queries the database dictionary and a configuration table to find out where to apply each template.

    Benefits:
    1. The generator is easier to write as it is written in a fully featured programming language.
    2. Much more code is reused between dynamic SQL generators. The Transact SQL language features (strong typing, even at compile time) that prevent one generic SQL statement from being applied to different tables also limit code sharing between generators written in Transact SQL.
    Disadvantages:
    1. There has to be an execution environment outside the database. That may mean extra cost or extra approvals. If you're using a cloud provider, that means a compute virtual machine in addition to the database. Even if you're using CLR stored procedures (which I strongly recommend if SQL Server is your database and you're generating outside SQL), you'll need extra privileges and will have problems if there are multiple database instances on the same operating system.
    2. The dynamic SQL environment will have different outages than the database. In-database code never has to check if the database is up.
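
    The database side of this technique can be sketched as the query an external generator might issue to learn where each template applies. The configuration table `dbo.generator_config` and its columns are hypothetical; the join to the dictionary drops any configured table that no longer exists.

    ```sql
    -- Hypothetical configuration table: which template applies to which table.
    CREATE TABLE dbo.generator_config (
        schema_name   sysname NOT NULL,
        table_name    sysname NOT NULL,
        template_name sysname NOT NULL,
        PRIMARY KEY (schema_name, table_name, template_name)
    );

    -- What the external generator reads: one row per (template, existing table),
    -- with the table name already quoted for substitution into the template.
    SELECT c.template_name,
           QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) AS quoted_table
    FROM dbo.generator_config AS c
    JOIN sys.schemas AS s ON s.name = c.schema_name
    JOIN sys.tables  AS t ON t.schema_id = s.schema_id
                         AND t.name = c.table_name
    ORDER BY c.template_name, s.name, t.name;
    ```

    The templates themselves, and the substitution of `quoted_table` into them, live in the external program.
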
  2. Generate a set of stored procedures outside the database at compile time

    If you haven't done dynamic SQL before, this option is going to seem best. Think of it as a gateway drug. It will get past overly fussy architects and DBAs. In fact, you don't even have to let anyone know that you didn't write the mountain of resulting code by hand.

    Benefits:
    1. There is nothing dynamic at run time.
    2. There is a very small run-time performance improvement, because the database dictionary is queried and the resulting SQL is parsed at compile time.
    3. You can use arbitrary tools to actually build the code. This means you're likely to use a language that is both more powerful and that you know better.
    4. The dynamic SQL is easy to get. It exists and is stable both inside and out of the database.
    5. It is easiest to use static-code-analysis on dynamic-SQL generated via this technique.
    Disadvantages:
    1. A massive pile of code is produced. Even if everyone is clear that this code is generated, there are costs. Do NOT check all this generated code into source control; that's just a waste of everyone's resources.
    2. The reason that everyone doesn't just do it this way is that the developer needs to intervene to re-generate code when the database changes. This introduces cost and delay.
    3. Worse, the generator or the generation environment can go missing in this technique. The developer gets a new computer and can't get the generator to run, as there was a dependency that no one knew of.
    4. Finally, sooner or later some genius decides to modify the generated code directly rather than through the generator. Fight one of these battles of overwriting each other's code (usually in production) and the cost of time and aggravation and application instability will make you want to use one of the other techniques.
  3. Generate a set of stored procedures inside the database as needed

    I use this technique often. The generator is a stored procedure that acts as zookeeper of the generated code.

    Benefits:
    1. You only have to know one language: the generator and target are both SQL.
    2. The code is usually static. This is a double-edged sword. You can get the code to read it, but you have to check the database to verify that it isn't changing repeatedly.
    3. You can't lose the generation environment as it is part of the execution environment.
    4. If anyone modifies the generated code the generator can detect the change and revert automatically.
    Disadvantages:
    1. SQL is not an easy language to generate other code in.
    2. Making procedures in Dynamic SQL needs more complicated generator code than just writing the statements that the procedures contain as dynamic SQL. (See the "exec exec trick").
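
    The trick itself is not spelled out here. One common reading, sketched below as an assumption, is a nested execution: `CREATE PROCEDURE` must be the first statement in its batch, so the procedure text goes into an inner batch, and an outer batch sets the database context before executing it. The table `dbo.example_table` and the `count_` naming convention are hypothetical, and `CREATE OR ALTER` assumes SQL Server 2016 SP1 or later.

    ```sql
    DECLARE @db     sysname = DB_NAME();         -- target database; here the current one
    DECLARE @schema sysname = N'dbo';
    DECLARE @table  sysname = N'example_table';  -- hypothetical

    -- Inner batch: the generated procedure, alone in its own batch.
    DECLARE @inner nvarchar(max) =
        N'CREATE OR ALTER PROCEDURE ' + QUOTENAME(@schema) + N'.'
        + QUOTENAME(N'count_' + @table) + N'
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT COUNT(*) AS row_count
        FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N';
    END;';

    -- Outer batch: switch database, then execute the inner batch there.
    -- Passing @inner as a parameter avoids a second layer of quote escaping.
    DECLARE @outer nvarchar(max) =
        N'USE ' + QUOTENAME(@db) + N'; EXEC sys.sp_executesql @inner;';

    PRINT @inner;
    EXEC sys.sp_executesql @outer, N'@inner nvarchar(max)', @inner = @inner;
    ```

    The generated body is kept trivial for readability; a real one would carry its own transaction and exception handler.
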
  4. Generate and Run Dynamic SQL in the database as needed

    This is the obvious technique. It is the same as the first technique, just writing the generator in SQL. I use this often.

    Benefits:
    1. Like the previous technique, you only have to know one language: the generator and target are both SQL.
    2. You can't lose the generation environment as it is part of the execution environment.
    3. There is no code outside of the generator for anyone to modify.
    Disadvantages:
    1. SQL is not an easy language to generate other code in.
    2. Just getting the code to read is tricky.
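
    Part of the difficulty is that `PRINT` shows at most the first 4000 characters of an `nvarchar` value, so long generated batches come out cut off. A minimal workaround, sketched here with a made-up long statement, is to print the text in chunks.

    ```sql
    -- Stand-in for a long generated batch (about 11,000 characters).
    DECLARE @sql nvarchar(max) =
        REPLICATE(CAST(N'SELECT 1;' + NCHAR(13) + NCHAR(10) AS nvarchar(max)), 1000);

    -- Print in chunks that fit under the PRINT limit.
    DECLARE @pos int = 1;
    WHILE @pos <= LEN(@sql)
    BEGIN
        PRINT SUBSTRING(@sql, @pos, 4000);
        SET @pos += 4000;
    END;
    ```

    Each `PRINT` ends with a line break, so a chunk boundary can split a statement in two. When the text has to be copied out exactly, selecting `@sql` as a column or storing it in a table is the safer route.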

Don't repeat yourself. When you need to do the same thing to a horde of different tables, Dynamic SQL is the right tool. Use it wisely. I do.

-Steven

