How Functions function

If you are going to use functions first you should understand how functions work as they are often over used in database applications.  Personally I will start out by saying that I only use functions when I absolutely have no other choice.  This is something that I have seen many different opinions on so let just start out by taking a look at the different types of functions and how they work.

In SQL Server there are several different types of functions so lets take a look at what they are:

  • Table Value Functions (These are functions that return a table or dataset)
    • Inline (Contains a single T-SQL Statement)
    • Multi-Statement (Contains multiple T-SQL Statements and other functions such as Cursors and While loops)
  • Scalar-Valued Functions (Returns a single value)

 

There are many different reasons that developers use functions such as repeatable code, generic datasets, common logic etc…  These are all good reasons to use functions when used in the properly.  To understand properly lets start by taking a look at how functions work.

Table Valued Functions

There is a significant difference between the Inline vs. Multi-Statement functions when it comes to performance.  To help show some of these differences I created a few functions as well as a view and stored procedure to help show some differences that you can download here and goes against the Adventure Works database.  Lets first just take a look at what happens when we call the two functions of SALES.GetSalesOrderDetail() and SALES.GetSalesOrderDetail_Multi().   If you SET STATISTICS IO ON and SET STATISTICS TIME ON you will notice that the multi statement takes almost double the CPU Time (hint also run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE to insure data isn’t cached between tests)  Most of this overhead is contained in the insert to the temporary table which is a needed overhead of multi-statement functions. 

Now before I show comparisons to views and stored procedures lets take a look at one other significant difference between these two functions and that is when we use them in a join statement.  For this test I am going to execute the following two commands

SELECT sod.*
FROM SALES.GetSalesOrderDetail() sod
join sales.SalesOrderHeader soh
on sod.SalesOrderID = soh.SalesOrderID

SELECT sod.*
FROM SALES.GetSalesOrderDetail_Multi() sod
join sales.SalesOrderHeader soh
on sod.SalesOrderID = soh.SalesOrderID

If you take a look at the execution plans you will notice a significant difference with the inline functions as the SQL can be interpreted and joined using a Merge Join operator and also notice the estimated row count is fairly accurate.  However with the multi-statement function you will see that the SQL can not be interpreted and the function is applied via a nested loop and sequence operator as well as the row count is not accurate and set to 100.  If you also take a look at the CPU time on the multi statement used with a join you will see about 3x the cost.  

Now just for a comparison run the similar query this time using the view.

SELECT sod.*
FROM SALES.SalesOrderDetail_View sod
join sales.SalesOrderHeader soh
on sod.SalesOrderID = soh.SalesOrderID

You will notice the same execution plan as the inline functions but on average you should see about 1/2 the cost of the CPU with the view compared to the function. 

Scalar-Valued Functions

Scalar Valued Functions operate very similar to the multi-step table valued functions in that they require significant CPU cycles and typically the migration path from Scalar Functions is to use inline T-SQL.  There are multiple ways to do this but lets take a look at the following example, which you can download the script to create the function here.

This is a simple function that is just concatenating some fields to format an address.  If we call the function vs do the same query inline you will see an almost 10x cost in CPU for the function. See the examples below.

SELECT
    Person.FormatAddress(AddressLine1,AddressLine2,City,StateProvinceID,PostalCode) AS FormatAddress
FROM
    Person.Address;

GO;

SELECT
    COALESCE(AddressLine1,”) + ‘ ‘
    + COALESCE(AddressLine2,”) + ‘ ‘
    + COALESCE(City,”) + ‘ , ‘
    + COALESCE(st.StateProvinceCode,”) + ‘ ‘
    + COALESCE(PostalCode,”)
FROM
    Person.Address    Addr
JOIN
    Person.StateProvince    St
ON
    Addr.StateProvinceID    =    st.StateProvinceID;

 

Wrap-up

So what does all this mean?  In general if you want to ensure that your database isn’t using unnecessary CPU cycles avoid using functions.  In almost all performance tuning engagements that I have worked when there is a CPU utilization issue we were able to find offending Functions as the cause for this.  Functions are for lack of better words CPU hogs and the more operations that are within the function the worse it gets.  So if you already have function try and migrate them to inline T-SQL. In the case of Table Value functions look to see if you can accomplish the same thing via a stored procedure or if you have to use a Function at least make sure that it is an inline Table Value Function.