Performance Comparison – Tables, Temp Tables, and Table Variables

People are often confused about the differences among user tables, temp tables, and table variables in terms of insert performance. I ran a test today on my laptop – Core i7 Q720, 8 GB RAM, SSD – with both tempdb and the user database on the same drive. I tested the following scenarios over permanent tables, permanent tables in tempdb, global temp tables, local temp tables, and table variables (a minimal sketch of the test pattern follows the list).

  1. Heap: inserting records into a heap one by one; each insert is an auto-commit operation
  2. Primary Key: clustered primary key over an identity column; each insert is an auto-commit operation
  3. Heap – Transaction: all inserts for each batch are wrapped in one transaction; the table is a heap
  4. PK – Transaction: all inserts for each batch are wrapped in one transaction; the table has a clustered primary key on an identity column
  5. Heap BulkLoad: bulk load data into a heap
  6. PK BulkLoad: bulk load data into a table with a clustered index on an identity column
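
Here is that sketch, using a local temp table for scenarios 1 and 3. It is only an illustration of the approach, not the exact harness behind my numbers, and the row count is arbitrary.

set nocount on;
create table #t(i int);  -- heap

-- scenario 1 style: each insert auto-commits
declare @i int = 1, @start datetime2 = sysdatetime();
while @i <= 100000
begin
    insert #t(i) values (@i);
    set @i += 1;
end;
select datediff(ms, @start, sysdatetime()) as AutoCommitMs;

-- scenario 3 style: the whole batch in one transaction
truncate table #t;
select @i = 1, @start = sysdatetime();
begin transaction;
while @i <= 100000
begin
    insert #t(i) values (@i);
    set @i += 1;
end;
commit;
select datediff(ms, @start, sysdatetime()) as SingleTransactionMs;

drop table #t;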

Continue reading “Performance Comparison – Tables, Temp Tables, and Table Variables”

Few Things About Pooled Connections

Connection pooling is not a new concept in the SQL client. It reuses a pool of already-opened connections to serve new T-SQL requests, while presenting the application with what looks like a fresh connection every time it opens one. Behind the scenes, when you close a pooled connection it is merely returned to the pool without the physical channel being closed (SQL Server does not know the client closed it). When you open a new connection, the SQL client only opens a physical connection if none is available in the pool; otherwise it hands you one from the pool, and SQL Server still does not know the connection has been assigned to a new purpose. When you run the first command on a connection taken from the pool, the SQL client "injects" a call to sp_reset_connection ahead of your SQL statement or procedure to reset resources and mimic a brand-new connection. There are plenty of discussions about sp_reset_connection on the internet; I will excerpt some of them at the end of this blog post. Apart from that, you should know a few other things about it.
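
A quick way to convince yourself that the physical connection survives a logical close is to watch the connection and session DMVs from a monitoring session. This is just an illustrative query, nothing specific to any particular client driver.

select s.session_id,
       s.program_name,
       c.connect_time,   -- when the physical connection was established
       s.login_time      -- session-level login information
from sys.dm_exec_connections c
join sys.dm_exec_sessions s on s.session_id = c.session_id
where s.is_user_process = 1;
-- open and close the pooled connection from the application a few times:
-- connect_time stays put because the physical connection is never torn down.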
Continue reading “Few Things About Pooled Connections”

SQL Saturday 108 Redmond

SQL Saturday 108 – Redmond Feb 25, 2012

I will be presenting Parallel Task Scheduling with T-SQL. You might have attended my presentation on this topic before; this session is a revised version with new features included. It is not difficult to write a stored procedure that executes a number of tasks (SQL statements) sequentially and/or conditionally. Things become complex, however, when you decide to run the code blocks within the procedure from different sessions concurrently, with the desired ordering and a definable degree of parallelism. In this session, we will discuss different ways to implement thread schedulers, semaphores, synchronization, and process ordering, as well as process monitoring and manipulation, using pure T-SQL.
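
One of the simpler building blocks we will touch on is serializing sessions with an application lock. A rough sketch, not necessarily the exact technique from the session, with an arbitrary resource name:

begin transaction;
declare @rc int;
exec @rc = sp_getapplock @Resource    = N'MyCriticalSection',
                         @LockMode    = N'Exclusive',
                         @LockOwner   = N'Transaction',
                         @LockTimeout = 10000;      -- wait up to 10 seconds
if @rc >= 0
begin
    -- only one session at a time gets here
    print 'doing serialized work';
end;
commit;  -- releases the transaction-owned applock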
Related Blog Posts:
Parallel Task Scheduling
Locking and Blocking

Sharing Global Temp Tables

Global temp tables can be accessed by one or more sessions. This feature of SQL Server has become more and more important in my programming life, since most of the applications I have worked with, and am currently working on, rely heavily on parallel processing. Very often, data generated in one session has to be shared with many concurrent sessions, and that is where global temp tables come into play. Creating a global temp table is simple.

create table ##temp(i int)

It will be released when

  1. It is explicitly dropped by any session, or
  2. The creating session closes and no other session is still referencing the table.
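
Any other session can then read from and write to the same table, and you can check whether it is still around. A small illustration:

-- from a second session
insert ##temp(i) values (1);
select i from ##temp;

-- does the global temp table still exist?
if object_id('tempdb..##temp') is not null
    print '##temp is still alive';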

Continue reading “Sharing Global Temp Tables”

Parallel Task Scheduling (7) – Scheduled By Activator

There are always concerns when writing a multi-threaded program. In the last post, the considerations were the degree of parallelism and tasks starving while waiting for workers. The maximum number of threads for one scheduler can be restricted by applying a counting semaphore, as I showed there, but that still does not resolve the worker-starvation issue. The application structure in the last post uses a pool of threads managed by Service Broker to schedule tasks issued from different schedulers. Based on this structure, you can certainly spend time coding the activation procedure or the initiator side so that the workers execute all tasks in a round-robin manner, which reduces starvation to some degree. Thread starvation will still happen, however, when a large number of long-running tasks are scheduled, and we cannot specify a thread quantum to overcome that. We can, however, take advantage of the nature of SQL Server to work around the issue.
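
For context, the counting semaphore mentioned above can be sketched in plain T-SQL with a slot-count table. The table name and slot count below are made up for the illustration.

create table dbo.Semaphore(Name sysname primary key, FreeSlots int not null);
insert dbo.Semaphore(Name, FreeSlots) values (N'TaskPool', 4);  -- at most 4 concurrent workers

-- acquire: atomically take a slot, waiting while none is free
declare @got int = 0;
while @got = 0
begin
    update dbo.Semaphore
       set FreeSlots = FreeSlots - 1, @got = 1
     where Name = N'TaskPool' and FreeSlots > 0;
    if @got = 0
        waitfor delay '00:00:01';  -- all slots busy, back off briefly
end;

-- ... run the task here ...

-- release: give the slot back so another worker can proceed
update dbo.Semaphore set FreeSlots = FreeSlots + 1 where Name = N'TaskPool';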
Continue reading “Parallel Task Scheduling (7) – Scheduled By Activator”

Parallel Task Scheduling (6) – Scheduled By Caller

In my last post of this series, I explained how to use the Service Broker message queue to deliver tasks to the conversation target and execute them there. We did not talk about complex ordering in that scenario. We also saw that ended conversations stay in the conversation endpoints, which affects Service Broker performance. Ending a conversation with the cleanup option can resolve this issue, but it is not recommended. In this article, I am going to talk about how to implement complex ordering using Service Broker as a message conveyer and performer while reducing the number of conversation handles.
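
To see the buildup for yourself, a quick look at sys.conversation_endpoints is enough:

select state_desc, count(*) as endpoints
from sys.conversation_endpoints
group by state_desc;

-- removing an endpoint immediately is possible, but it skips the normal protocol,
-- which is why it is not recommended:
-- end conversation @conversation_handle with cleanup;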
Continue reading “Parallel Task Scheduling (6) – Scheduled By Caller”

Parallel Task Scheduling (5) – Task Conveyer

We have discussed using Service Broker as a task scheduler to read, load, and execute tasks from a queue. The issue with this approach is that the scheduler has to constantly search the queue for an available task and then load and execute it. While there is nothing in the queue, Service Broker still performs the same operation for nothing, which adds unnecessary load to the system. Why not use Service Broker itself to perform the functions of scheduler, queue, and executor?
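
This is what a blocking receive looks like, where Service Broker does the waiting instead of the scheduler polling an empty queue. The queue name here is a placeholder.

declare @handle uniqueidentifier,
        @body   varbinary(max),
        @type   sysname;

waitfor (
    receive top (1)
            @handle = conversation_handle,
            @body   = message_body,
            @type   = message_type_name
    from dbo.TaskQueue
), timeout 5000;   -- wake up after 5 seconds if nothing arrives

if @handle is not null
    print 'received a task to execute';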
Continue reading “Parallel Task Scheduling (5) – Task Conveyer”

MD5 Collision

While reading back over my post Generate MD5 Value from Big Data, I realized I should have mentioned MD5 collisions. This issue lies in the standard algorithm, not in SQL Server itself, and it rarely happens with character-based hashing. MD5 is still a very reliable algorithm for change detection, but you need to be aware of it. Here is a code snippet showing an MD5 collision.

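-- two 128-byte binary values that differ in only a handful of bytes yet hash to the same MD5 value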
declare @a varbinary(130)  = 0xd131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f8955ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5bd8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70
declare @b varbinary(130)  = 0xd131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f8955ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5bd8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70 
select LEN(@a) LengthA, LEN(@b) LengthB
select case when @a = @b then 1 else 0 end [@a = @b ?]
select HASHBYTES('MD5', @a) [Hash from @a], HASHBYTES('MD5', @b) [hash from @b];
/*
LengthA     LengthB
----------- -----------
128         128

(1 row(s) affected)

@a = @b ?
-----------
0

(1 row(s) affected)

Hash from @a                       hash from @b
---------------------------------- ----------------------------------
0x79054025255FB1A26E4BC422AEF54EB4 0x79054025255FB1A26E4BC422AEF54EB4

(1 row(s) affected)

*/
		

Parallel Task Scheduling (4) – Service Broker

The biggest issue with using SQL jobs to schedule tasks is the resource consumption caused by SQL Server Agent's inefficient use of the SQL cache. Service Broker can be employed to resolve this issue, and there are two options. One is to implement a tiny version of SQL Server Agent with Service Broker, which we will discuss in this post; the other is to use Service Broker as a task conveyer and executor, which I am going to cover in the next few posts in this series.
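
The key ingredient of the first option is internal activation on a queue, roughly like this (the object names are placeholders):

create queue dbo.TaskQueue
    with status = on,
         activation (
             status = on,
             procedure_name = dbo.usp_ExecuteTask,  -- placeholder activation procedure
             max_queue_readers = 4,                 -- up to 4 concurrent readers
             execute as owner );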
Continue reading “Parallel Task Scheduling (4) – Service Broker”

Generate MD5 Value from Big Data

How do you generate an MD5 hash in SQL Server? You might immediately answer that it can be generated by calling the HASHBYTES built-in function. That is true; however, it only generates MD5 hashes from variables of less than 8,000 bytes. How would you generate an MD5 hash from larger variables?
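
One possible workaround, shown here purely as a sketch and not necessarily the approach taken in the post, is to hash the value in 8,000-byte chunks and then hash the concatenated chunk hashes.

declare @data varbinary(max) = cast(replicate(cast('x' as varchar(max)), 100000) as varbinary(max));
declare @pos bigint = 1, @hashes varbinary(max) = 0x;

while @pos <= datalength(@data)
begin
    set @hashes = @hashes + HASHBYTES('MD5', substring(@data, @pos, 8000));
    set @pos = @pos + 8000;
end;

-- the hash of the per-chunk hashes serves as a fingerprint of the whole value
select HASHBYTES('MD5', @hashes) as BigDataFingerprint;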

Continue reading “Generate MD5 Value from Big Data”