Monday, March 16, 2015

SSIS Performance Tip: Avoid the Sort Transform

Most BI developers are familiar with the concept of blocking and non-blocking transforms and why you should avoid blocking transforms. However, it does surprise me how many times I see people using the sort transform in their SSIS packages. The sort transform is fully blocking, meaning it needs to receive all the rows of data before it can do its job. This makes it one of the worst transforms to use. Using a TSQL command in the source and a few other options, you can smite the evil sort transform and prevent bottlenecks.

Take this OLE DB Source for example. I am using a SQL statement to select columns from a table that contains world mortality stats. Be sure to include an 'Order by' clause at the end. This one is ordering by the [Year] column.

Once you have your SQL statement, click 'OK' to close the source editor. Now, right click the OLE DB Source and select 'Show Advanced Editor'. In the advanced editor, click the 'Input and Output Properties' tab. In that window. click OLE DB Source Output and to the right you will see all of the output properties. Look for the 'IsSorted' property and set it to 'True' in the drop down menu.

Next, expand the 'OLE DB Source Output' node and then expand the 'Output Columns' node.

In the list of output columns, find and select the column on which you would like to sort. To the right, you will see a list of properties of the column. Look for the property 'SortKeyPosition' and set it to a value of 1.

If you are sorting on multiple columns, be sure to set that same 'SortKeyPosition' property for each one, making sure you select the appropriate number for the order in which they should be sorted.

The end result is the data being sorted as it is being retrieved from the source, through the power of TSQL. It's a huge time saver and one of the best techniques you can incorporate in your package design.


Post a Comment