Run T-SQL in Parallel

Writing CLR procedures to run T-SQL concurrently is not an extremely new idea. I have seen a lot of implementations and I have written and improved it many times by myself as well. After those coding exercises, I found few important things were not (or just partially) addressed.

  • Termination of launcher session: Either the launcher session get cancelled or killed, running asynchronous workers should be cancelled.
  • Different ways to shut down a batch: Waiting Workers should be abandoned. Executing workers should be either cancelled or waited to be completed.
  • Effective monitoring: People want to see which session is running what.
  • Adjustable maximum threads in the course of execution.

Continue reading “Run T-SQL in Parallel”

NULL Values Impact the Performance of MAX() and MIN()

We know that we need to take special attention about NULL values while writing queries because they may lead us into writing incorrect query logics. NULLs may also affect the performance of aggregation function MAX() and MIN(). I have submitted this issue to Microsoft Connect. There are number of ways to get around it. I think the improvement can also be done at query engine level. Hope this can be fixed in next or future version of SQL Server. But for now, we need to know it and know how to get around it.

Continue reading “NULL Values Impact the Performance of MAX() and MIN()”

Collation Of Temp Tables

Temp tables are used frequently while coding. It might cause issues when the collation of user database is different from system default collation. The message you will see is

Cannot resolve the collation conflict between “SQL_Latin1_General_CP1_CI_AS” and “Latin1_General_CS_AS” in the equal to operation.

Continue reading “Collation Of Temp Tables”

Row Level Security (2)

In my last post, I talked about the concept of row level security impelementation. Performance issue will gradually arise while number of rows in MyData increases. This is because is_member function evaluate every row in MyData table and check whether it’s the row should be returned. In order to make index becomes Search-able, we will have to change the structure of the view.

Continue reading “Row Level Security (2)”

Row Level Security (1)

In bigger organizations, the data in a table might be sensitive to few departments but not the others. Some people may need to rows from the table that belongs to few departments where another group of people may need to access the data belongs to another few departments in one table. Those 2 groups of people might share the data from one or few departments. Implementing such logics is not a big deal for customized applications, for instance use procedures to filter rows out. What if users access data using very genaric tools, such as SSMS, in which they can arbitrarily issue queries against table. How would you selectively return rows from a table without asking uses putting filters in their queries?

Continue reading “Row Level Security (1)”

Transactions, Chatty or Chunky?

Chatty or Chunky? What do you mean? Running code block with a transaction can ensure the atomicity of the code, all done or all undone. In defult programming mode, select statement will never start a transaction automatically. Data modification language, such insert, delete, update, merge, send, receive, etc, will automatically start a transaction as the command starts if there isn’t any transactions and will commit
automatically (if there isn’t any transactions). I call this kind of strategy Chatty.

Continue reading “Transactions, Chatty or Chunky?”

@@DBTS vs MIN_ACTIVE_ROWVERSION

Both @@DBTS and Min_active_RowVersion() are used to get the current Row Version in a database. Row version, is also called timestamp formerly, is an unsigned bigint data type of a column stored and presented as a binary(8). This data type is like a identity value of a table in which every table can only have one RowVersion column and the value of the row version is managed by SQL Server rather than uses, it’s read-only. When a new record is inserted into a table with RowVersion column, a row version will be assigned to the row. When update happens to the table, the row version of updated row will be increased. The values of the row version from tables within a database is always unique.

Continue reading “@@DBTS vs MIN_ACTIVE_ROWVERSION”

Change Default DML Behavior

DML, Data Manipulation Language, is used to add data to table and modify existing rows in tables. There are 3 commands

  1. Insert : insert records to a table
  2. Delete: remove records to a table
  3. Update: modify records in a table

In SQL Server, you are allowed to change the default behaviors of those 3 commands. For instance, while inserting a record, the new record can be applied to a table as an update(can be a delete as well).

Continue reading “Change Default DML Behavior”

Generate MD5 Value from Big Data

How do you generate MD5 hash in SQL Server? You might immediately tell that it can be generated MD5 by calling HASHBYTES built-in function. That’s true, however, it only accepts generating MD5 hashes from variables which has less than 8000 bytes. How would you generate MD5 hash from big variables?

Continue reading “Generate MD5 Value from Big Data”