Revisions to Difference between scalar, table-valued, and aggregate functions in SQL server?

added info about new "Scalar UDF Inlining" feature of SQL Server 2019

edited Jan 30, 2019 at 15:39

T-SQL (UDF): these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and always prohibit parallel execution plans.
SQLCLR (UDF): these scalar functions also typically run per each row returned or scanned, but there are two important benefits over T-SQL UDFs:
- Starting in SQL Server 2012, return values can be constant-folded into the execution plan IF the UDF does not do any data access, and if it is marked IsDeterministic = true. In this case the function wouldn't run per each row.
- SQLCLR scalar functions can work in parallel plans ( 😃 ) if they do not do any database access.

T-SQL (UDF):
- Prior to SQL Server 2019: these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and always prohibit parallel execution plans.
- Starting in SQL Server 2019: certain T-SQL scalar UDFs can be inlined, that is, have their definitions placed directly into the query such that the query does not call the UDF (similar to how iTVFs work; see below). There are restrictions that can prevent a UDF from being inlineable (if that wasn't a word before, it is now), and UDFs that can be inlined will not always be inlined, due to several factors. This feature can be disabled at the database, query, and individual UDF levels. For more information on this really cool new feature, please see: Scalar UDF Inlining (be sure to review the "requirements" section).
SQLCLR (UDF): these scalar functions also typically run per each row returned or scanned, but there are two important benefits over T-SQL UDFs:
- Starting in SQL Server 2012, return values can be constant-folded into the execution plan IF the UDF does not do any data access, and if it is marked IsDeterministic = true. In this case the function wouldn't run per each row.
- SQLCLR scalar functions can work in parallel plans ( 😃 ) if they do not do any database access.
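
To make the three "disable" levels mentioned above a bit more concrete, here is a minimal sketch. The function name `dbo.NetPrice` and the sample values are made up for illustration, and it assumes SQL Server 2019 or newer with the database at compatibility level 150 or higher. Whether a given UDF actually gets inlined still depends on the documented requirements, so treat this as a starting point and not a guarantee.

```sql
-- Hypothetical scalar UDF (the name and logic are illustrative only).
CREATE OR ALTER FUNCTION dbo.NetPrice (@Price DECIMAL(10, 2), @DiscountPct DECIMAL(5, 2))
RETURNS DECIMAL(10, 2)
AS
BEGIN
    RETURN @Price * (1 - (@DiscountPct / 100.0));
END;
GO

-- Check whether the engine considers the UDF inlineable.
SELECT OBJECT_NAME(sm.[object_id]) AS [FunctionName],
       sm.is_inlineable,
       sm.inline_type
FROM   sys.sql_modules sm
WHERE  sm.[object_id] = OBJECT_ID(N'dbo.NetPrice');

-- Query level: disable inlining for just this statement.
-- The VALUES list stands in for a real table so that the example is self-contained.
SELECT o.OrderID,
       dbo.NetPrice(o.Price, o.DiscountPct) AS [NetPrice]
FROM   (VALUES (1, 100.00, 10.00), (2, 59.99, 0.00)) o (OrderID, Price, DiscountPct)
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));

-- Database level: turn the feature off, then back on, for the current database.
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;
GO

-- UDF level: opt this one function out of inlining.
CREATE OR ALTER FUNCTION dbo.NetPrice (@Price DECIMAL(10, 2), @DiscountPct DECIMAL(5, 2))
RETURNS DECIMAL(10, 2)
WITH INLINE = OFF
AS
BEGIN
    RETURN @Price * (1 - (@DiscountPct / 100.0));
END;
GO
```

After the last statement, `is_inlineable` describes whether the function could be inlined, while `inline_type` should reflect that inlining has been turned off for it.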

fixed link now that MS Connect has been migrated incompetently with no redirects to the new yet equally horrible UserVoice system.

edited Apr 24, 2018 at 15:01

Solomon Rutzky

T-SQL MultiStatement (TVF): these TVFs, as their name implies, can have multiple statements, similar to a Stored Procedure. Whatever results they are going to return are stored in a Table Variable and returned at the very end; meaning, nothing is returned until the function is done processing. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) depends on the version of SQL Server:
- Prior to SQL Server 2014: these always report 1 (yes, just 1) row.
- SQL Server 2014 and 2016: these always report 100 rows.
- Starting in SQL Server 2017: default is to report 100 rows, BUT under some conditions the row count will be fairly accurate (based on current statistics) thanks to the new Interleaved Execution feature.
T-SQL Inline (iTVF): these TVFs can only ever be a single statement, and that statement is a full query, just like a View. And in fact, Inline TVFs are essentially a View that accepts input parameters for use in the query. They also do not cache their own query plan as their definition is placed into the query in which they are used (unlike the other objects described here), hence they can be optimized much better than the other types of TVFs ( 😃 ). These TVFs perform quite well and are preferred if the logic can be handled in a single query.
SQLCLR (TVF): these TVFs are similar to T-SQL MultiStatement TVFs in that they build up the entire result set in memory (even if it is swap / page file) before releasing all of it at the very end. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
SQLCLR Streaming (sTVF): these TVFs allow for complex C# / VB.NET code just like regular SQLCLR TVFs, but are special in that they return each row to the calling query as they are generated ( 😃 ). This model allows the calling query to start processing the results as soon as the first one is sent so the query doesn't need to wait for the entire process of the function to complete before it sees any results. And it requires less memory since the results aren't being stored in memory until the process completes. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer

minor clarifications; minor corrections

edited Oct 13, 2017 at 3:39

Solomon Rutzky

Scalar functions (sometimes referred to as User-Defined Functions / UDFs) return a single value as a return value, not as a result set, and can be used in most places within a query or SET statement, except for the FROM clause (and maybe other places?). Also, scalar functions can be called via EXEC, just like Stored Procedures, though there are not many occasions to make use of this ability (for more details on this ability, please see my answer to the following question on DBA.StackExchange: Why scalar valued functions need execute permission rather than select?). These can be created in both T-SQL and SQLCLR.

T-SQL (UDF): these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and prohibit parallel execution plans.
SQLCLR (UDF): these scalar functions also typically run per each row or scanned row, but there is the possibility of allowing SQLCLR scalar functions to work in parallel plans ( 😃 ) if they: a) are marked IsDeterministic = true, b) do not do any database access, and c) most likely need to be in an Assembly marked as WITH PERMISSION_SET = SAFE.

T-SQL MultiStatement (TVF): these TVFs, as their name implies, can have multiple statements, similar to a Stored Procedure. Whatever results they are going to return are stored in a Table Variable and returned at the very end; meaning, nothing is returned until the function is done processing.
T-SQL Inline (iTVF): these TVFs can only ever be a single statement, and that statement is a full query, just like a View. And in fact, Inline TVFs are essentially a View that accepts input parameters for use in the query. They also do not cache their own query plan as their definition is placed into the query in which they are used (unlike the other objects described here), hence they can be optimized much better than the other types of TVFs ( 😃 ). These TVFs perform quite well and are preferred if the logic can be handled in a single query.
SQLCLR (TVF): these TVFs are similar to T-SQL MultiStatement TVFs in that they build up the entire result set in memory (even if it is swap / page file) before releasing all of it at the very end.
SQLCLR Streaming (sTVF): these TVFs allow for complex C# / VB.NET code just like regular SQLCLR TVFs, but are special in that they return each row to the calling query as they are generated ( 😃 ). This model allows the calling query to start processing the results as soon as the first one is sent so the query doesn't need to wait for the entire process of the function to complete before it sees any results. And it requires less memory since the results aren't being stored in memory until the process completes.

CREATE FUNCTION (MSDN documentation)
CREATE AGGREGATE (MSDN documentation)
CLR Table-Valued Function Example with Full Streaming (STVF / TVF) (article I wrote; free registration is required for this site)

Scalar functions (sometimes referred to as User-Defined Functions / UDFs) return a single value as a return value, not as a result set, and can be used in most places within a query or SET statement, except for the FROM clause (and maybe other places?). Also, scalar functions can be called via EXEC, just like Stored Procedures, though there are not many occasions to make use of this ability (for more details on this ability, please see my answer to the following question on DBA.StackExchange: Why scalar valued functions need execute permission rather than select?). These can be created in both T-SQL and SQLCLR.
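
As a quick illustration of the calling styles described above, here is a small sketch. The function `dbo.AddOne` is hypothetical and exists only for this example.

```sql
-- Hypothetical scalar UDF used only for this example.
CREATE OR ALTER FUNCTION dbo.AddOne (@Value INT)
RETURNS INT
AS
BEGIN
    RETURN @Value + 1;
END;
GO

-- Used within a query:
SELECT dbo.AddOne(41) AS [InSelect];

-- Used in a SET statement:
DECLARE @Result INT;
SET @Result = dbo.AddOne(41);
SELECT @Result AS [ViaSet];

-- Called via EXEC, just like a Stored Procedure. The function's return value
-- is captured the same way a procedure's return code would be.
EXEC @Result = dbo.AddOne @Value = 41;
SELECT @Result AS [ViaExec];
```

All three calls produce 42. The `EXEC` form is the one that involves the `EXECUTE` permission discussed in the linked DBA.StackExchange answer.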

T-SQL (UDF): these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and always prohibit parallel execution plans.
SQLCLR (UDF): these scalar functions also typically run per each row returned or scanned, but there are two important benefits over T-SQL UDFs:
- Starting in SQL Server 2012, return values can be constant-folded into the execution plan IF the UDF does not do any data access, and if it is marked IsDeterministic = true. In this case the function wouldn't run per each row.
- SQLCLR scalar functions can work in parallel plans ( 😃 ) if they do not do any database access.

T-SQL MultiStatement (TVF): these TVFs, as their name implies, can have multiple statements, similar to a Stored Procedure. Whatever results they are going to return are stored in a Table Variable and returned at the very end; meaning, nothing is returned until the function is done processing. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) depends on the version of SQL Server:
- Prior to SQL Server 2014: these always report 1 (yes, just 1) row.
- SQL Server 2014 and 2016: these always report 100 rows.
- Starting in SQL Server 2017: default is to report 100 rows, BUT under some conditions the row count will be fairly accurate (based on current statistics) thanks to the new Interleaved Execution feature.
T-SQL Inline (iTVF): these TVFs can only ever be a single statement, and that statement is a full query, just like a View. And in fact, Inline TVFs are essentially a View that accepts input parameters for use in the query. They also do not cache their own query plan as their definition is placed into the query in which they are used (unlike the other objects described here), hence they can be optimized much better than the other types of TVFs ( 😃 ). These TVFs perform quite well and are preferred if the logic can be handled in a single query.
SQLCLR (TVF): these TVFs are similar to T-SQL MultiStatement TVFs in that they build up the entire result set in memory (even if it is swap / page file) before releasing all of it at the very end. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
SQLCLR Streaming (sTVF): these TVFs allow for complex C# / VB.NET code just like regular SQLCLR TVFs, but are special in that they return each row to the calling query as they are generated ( 😃 ). This model allows the calling query to start processing the results as soon as the first one is sent so the query doesn't need to wait for the entire process of the function to complete before it sees any results. And it requires less memory since the results aren't being stored in memory until the process completes. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
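
To show the structural difference between the two T-SQL flavors described above, here is a minimal sketch of the same logic written both ways. The function names are made up, and `sys.all_columns` is used only as a convenient source of rows.

```sql
-- Multi-statement TVF: results are collected in a table variable (@Result)
-- and nothing is returned to the caller until RETURN is reached.
CREATE OR ALTER FUNCTION dbo.GetNumbers_Multi (@Max INT)
RETURNS @Result TABLE (Num INT NOT NULL)
AS
BEGIN
    INSERT INTO @Result (Num)
        SELECT TOP (@Max) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM   sys.all_columns;

    RETURN;
END;
GO

-- Inline TVF: a single query, effectively a View that accepts parameters.
-- Its definition is placed into the calling query when that query is optimized.
CREATE OR ALTER FUNCTION dbo.GetNumbers_Inline (@Max INT)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@Max) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS [Num]
    FROM   sys.all_columns;
GO

-- Run both with the actual execution plan enabled and compare the
-- estimated number of rows reported for each function.
SELECT COUNT(*) AS [MultiStatement] FROM dbo.GetNumbers_Multi(500);
SELECT COUNT(*) AS [Inline] FROM dbo.GetNumbers_Inline(500);
```

Both queries return the same count; the interesting part is the estimated row count in the plan for the multi-statement version, which should follow the version-specific behavior listed above.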

CREATE FUNCTION (MSDN documentation)
CREATE AGGREGATE (MSDN documentation)
CLR Table-Valued Function Example with Full Streaming (STVF / TVF) (article I wrote)

replaced http://dba.stackexchange.com/ with https://dba.stackexchange.com/

edited Apr 13, 2017 at 12:42

URL Rewriter Bot

minor edit and added link

edited Jan 11, 2017 at 22:45

Solomon Rutzky

Added links to articles

edited Jan 14, 2016 at 17:25

Solomon Rutzky

added 60 characters in body

edited Jan 14, 2016 at 6:51

Solomon Rutzky

answered Jan 14, 2016 at 6:41

Solomon Rutzky
