Tuesday, 1 December 2009

table-values functions. Make sure you write them as inline.

Here is a multi-line statement (slow):


CREATE FUNCTION dbo.fnSomeReport
(
@someParam INT,
@field_company_id INT = NULL
)
RETURNS @result TABLE
(
[field_company_name] VARCHAR(255),
[engineer] VARCHAR(100),
[sku] VARCHAR(30)
)
AS
BEGIN
INSERT INTO @result
(
[field_company_name],
[engineer],
[sku]
)
SELECT field_company_name, engineer, sku
FROM ...
WHERE ID > someParam
AND (@field_company_id IS NULL
OR
field_company_id = @field_company_id)

RETURN
END
GO



Now as an inline statement (fast)

CREATE FUNCTION dbo.fnSomeReport
(
@someParam INT,
@field_company_id INT = NULL
)
RETURNS TABLE AS
RETURN
(
SELECT field_company_name, engineer, sku
FROM ...
WHERE ID > someParam
AND (@field_company_id IS NULL
OR
field_company_id = @field_company_id)
)
GO


Right then whats going on and what the hell is all the fuss about anyway?
The query optimizer expands the function as if it was a macro, and generates the plan as if you had provided the expanded query. Thus, there is no performance cost for packaging a SELECT statement into a table-valued function. For this reason, when you want to reuse a stored procedure that consists of a single SELECT statement, rewriting it into an inline UDF is without doubt the best choice. (Or instead of rewriting it, move the SELECT into a UDF, and rewrite the existing procedure as a wrapper on the function, so that the client is unaffected.)

Why is the multi-line function bad?
A multi-statement function has a body that can have as many statements as you like. You need to declare a return table, and you insert the data to return into that table.

You use multi-statement functions as inline functions, but in difference to inline functions, they are not expanded in place, but instead it's like you would call a stored procedure in the middle of the query, and return the data in a temp table. This permits you to move more complex stored procedure into functions.

Compared to inline functions, multi-statement functions incur some overhead due to the return table. More important, though, is that if you use the function in a query, the optimizer will have no idea of what the function returns, and will make standard assumptions. Many times this is not a real issue, but particularly if the result is huge, it can be.

In most cases, you want to avoid the use of scalar UDFs in SELECT, WHERE, GROUP BY and ORDER BY clauses. These will essentially force the optimizer to execute the function once for each row in the query, and depending on the complexity of the function, this could bring your system to its knees.

A view does not accept parameters. To restrict the rows returned by a view, you use a WHERE clause in the query that is accessing the view, not in the view itself. Table-valued functions, are more like parameterized views; in fact, they were called parameterized views during development, according to Hal Berenson.

Inline table-valued functions - These are better than scalar functions. Some will say they are preferential over stored procedures because they can easily be used in JOINs and regular SELECTs; some will say they are preferential over views because they can be parameterized. In this way, they behave just like Access parameterized views, with the added benefit that a query plan can be cached.

Other links: