<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David Barbarin &#187; SQL Server 2019</title>
	<atom:link href="https://blog.developpez.com/mikedavem/ptag/sql-server-2019/feed" rel="self" type="application/rss+xml" />
	<link>https://blog.developpez.com/mikedavem</link>
	<description>MVP DataPlatform - MCM SQL Server</description>
	<lastBuildDate>Thu, 09 Sep 2021 21:19:50 +0000</lastBuildDate>
	<language>fr-FR</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.42</generator>
	<item>
		<title>Mitigating Scalar UDF&#8217;s procedural code performance with SQL 2019 and Scalar UDF Inlining capabilities</title>
		<link>https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites</link>
		<comments>https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites#comments</comments>
		<pubDate>Thu, 05 Mar 2020 15:10:02 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL Server 2019]]></category>
		<category><![CDATA[Imperative]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[Procedural code]]></category>
		<category><![CDATA[Programing]]></category>
		<category><![CDATA[Scalar UDF Inlining]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1516</guid>
		<description><![CDATA[A couple of days ago, I read the write-up of my former colleague @FranckPachot about refactoring procedural code to SQL. This is a recurrent subject in the database world and I was interested in transposing this article to SQL Server because &#8230; <a href="https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites">Read more <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A couple of days ago, I read the write-up by my former colleague <a href="https://twitter.com/FranckPachot" rel="noopener" target="_blank">@FranckPachot</a> about <a href="https://blog.dbi-services.com/refactoring-procedural-to-sql-an-example-with-mysql-sakila/" rel="noopener" target="_blank">refactoring procedural code to SQL</a>. This is a recurrent subject in the database world, and I was interested in transposing his article to SQL Server because it was about refactoring a scalar-valued function into a SQL view. The view is a great alternative when it comes to performance, but SQL Server 2019 shipped something new that could address (or at least mitigate) this recurrent scenario.</p>
<p><span id="more-1516"></span></p>
<p>First of all, scalar-valued functions (from the User-Defined Function category) are interesting objects for code modularity, factoring and reusability, so it is no surprise to see them widely used by DEVs. But they are not always well suited to performance, especially because of the “impedance mismatch” problem. This term refers to the problems that occur due to differences between the database model and the programming language model. On one side is the database world, with a declarative SQL language and queries that are set- or multiset-oriented. On the other side is the programming world, with imperative languages that access each tuple individually for processing.</p>
<p>To cut a long story short, scalar UDFs provide programming benefits for DEVs, but when performance matters we discourage their use for the reasons above. Before continuing, note that all the scripts and demos in the next sections are based on the <a href="https://github.com/jOOQ/jOOQ/tree/master/jOOQ-examples/Sakila" rel="noopener" target="_blank">Sakila</a> sample database project on GitHub. Franck Pachot used the MySQL version and, fortunately, a version for SQL Server exists as well. The MySQL function Franck used as his initial example may be translated to SQL Server as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Scalar function<br />
CREATE OR ALTER FUNCTION inventory_in_stock (@p_inventory_id INT) <br />
RETURNS BIT<br />
BEGIN<br />
&nbsp; &nbsp; DECLARE @v_rentals INT;<br />
&nbsp; &nbsp; DECLARE @v_out &nbsp; &nbsp; INT;<br />
&nbsp; &nbsp; DECLARE @verif &nbsp; &nbsp; BIT;<br />
&nbsp; &nbsp; <br />
<br />
&nbsp; &nbsp; --AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE<br />
&nbsp; &nbsp; --FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED<br />
<br />
&nbsp; &nbsp; SET @v_rentals = (SELECT COUNT(*) FROM rental WHERE inventory_id = @p_inventory_id);<br />
<br />
&nbsp; &nbsp; IF @v_rentals = 0 <br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 1<br />
&nbsp; &nbsp; END<br />
&nbsp; &nbsp; ELSE<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @v_out = (SELECT COUNT(rental_id) <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM inventory <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LEFT JOIN rental ON inventory.inventory_id = rental.inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE inventory.inventory_id = @p_inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND rental.return_date IS NULL)<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; IF @v_out &gt; 0 <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 0;<br />
&nbsp; &nbsp; &nbsp; &nbsp; ELSE<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 1;<br />
&nbsp; &nbsp; END;<br />
<br />
&nbsp; &nbsp; RETURN @verif;<br />
END <br />
GO</div></div>
<p>In his write-up, Franck provided a natural alternative to this UDF based on a SQL view, and here is a similar solution for SQL Server:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE OR ALTER VIEW v_inventory_stock_status <br />
AS<br />
<br />
SELECT <br />
&nbsp; &nbsp; i.inventory_id,<br />
&nbsp; &nbsp; CASE <br />
&nbsp; &nbsp; &nbsp; &nbsp; WHEN NOT EXISTS (SELECT 1 FROM dbo.rental AS r WHERE r.inventory_id = &nbsp;i.inventory_id AND r.return_date IS NULL) THEN 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; ELSE 0<br />
&nbsp; &nbsp; END AS inventory_in_stock<br />
FROM dbo.inventory AS i<br />
GO</div></div>
<p>Then similar to what Franck did, we can join this view with the inventory table to get the expected outcome:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join v_inventory_stock_status AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-1-Query-OutPut.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-1-Query-OutPut.jpg" alt="156 - 1 - Query OutPut" width="356" height="82" class="alignnone size-full wp-image-1518" /></a></p>
<p>Another alternative is to use a CTE rather than a T-SQL view, as follows. The performance is similar in both cases, so it is up to each DEV to pick the solution that fits their needs:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">;with cte<br />
as<br />
(<br />
&nbsp; &nbsp; SELECT <br />
&nbsp; &nbsp; &nbsp; &nbsp; i.inventory_id,<br />
&nbsp; &nbsp; &nbsp; &nbsp; CASE <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHEN NOT EXISTS (SELECT 1 FROM dbo.rental AS r WHERE r.inventory_id = &nbsp;i.inventory_id AND r.return_date IS NULL) THEN 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ELSE 0<br />
&nbsp; &nbsp; &nbsp; &nbsp; END AS inventory_in_stock<br />
&nbsp; &nbsp; FROM dbo.inventory AS i<br />
)<br />
select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join cte AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p>I then compared the performance of the UDF-based version and the T-SQL view:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- udf<br />
select count(*),dbo.inventory_in_stock(inventory_id) <br />
from inventory <br />
group by dbo.inventory_in_stock(inventory_id)<br />
GO<br />
-- view<br />
select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join v_inventory_stock_status AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p>The outcome below (CPU, Reads, Writes, Duration) is as expected. The SQL view is the winner by far. </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-2-UDF-vs-View-performance-e1583417465245.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-2-UDF-vs-View-performance-e1583417465245.jpg" alt="156 - 2 - UDF vs View performance" width="1000" height="166" class="alignnone size-full wp-image-1519" /></a></p>
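<p>As a side note, one simple way to collect comparable CPU, duration and read figures yourself is the standard SET STATISTICS session options. This is only a minimal sketch under that assumption and not necessarily the tooling used for the figures above:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Report CPU time and elapsed time for each statement<br />
SET STATISTICS TIME ON;<br />
-- Report logical and physical reads per table<br />
SET STATISTICS IO ON;<br />
GO<br />
-- View-based query from above; the figures appear in the Messages output<br />
select count(v.inventory_id), v.inventory_in_stock<br />
from inventory AS i<br />
left join v_inventory_stock_status AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
GO<br />
-- Switch the options off again for the session<br />
SET STATISTICS TIME OFF;<br />
SET STATISTICS IO OFF;<br />
GO</div></div>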
<p>Similar to Franck’s finding, the performance gain comes at the cost of rewriting the code for DEVs in this scenario. But SQL Server 2019 provides another interesting way to keep using the UDF abstraction without compromising on performance: the <a href="https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/scalar-udf-inlining?view=sql-server-ver15" rel="noopener" target="_blank">Scalar T-SQL UDF Inlining</a> feature. I was curious to see how much improvement we get from it in this scenario.</p>
<p>The first time I executed the following UDF-based T-SQL script on SQL Server 2019 RTM (make sure the database is in compatibility level 150), I ran into some OOM issues with the second query:</p>
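<p>As a side note, inlining only applies when a few conditions hold: the database runs at compatibility level 150, the TSQL_SCALAR_UDF_INLINING database scoped configuration is enabled, and the function itself is considered inlineable. The following sketch, which assumes SQL Server 2019 and the dbo.inventory_in_stock function defined above, checks these three points through catalog views:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- 1. Compatibility level of the current database (150 is needed for inlining)<br />
SELECT name, compatibility_level<br />
FROM sys.databases<br />
WHERE name = DB_NAME();<br />
GO<br />
-- 2. Database scoped configuration (value = 1 means inlining is enabled)<br />
SELECT name, value<br />
FROM sys.database_scoped_configurations<br />
WHERE name = 'TSQL_SCALAR_UDF_INLINING';<br />
GO<br />
-- 3. Does SQL Server consider the function inlineable?<br />
SELECT OBJECT_NAME(object_id) AS function_name, is_inlineable, inline_type<br />
FROM sys.sql_modules<br />
WHERE object_id = OBJECT_ID('dbo.inventory_in_stock');<br />
GO</div></div>
<p>Keep in mind that is_inlineable = 1 only says the function body qualifies. Whether a given query actually gets the inlined plan still depends on how the function is called.</p>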
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- SQL 2017-<br />
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;<br />
GO<br />
SELECT dbo.inventory_in_stock(10)<br />
GO<br />
-- SQL 2019+<br />
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;<br />
GO<br />
SELECT dbo.inventory_in_stock(10)</div></div>
<p></p>
<blockquote><p>Msg 8624, Level 16, State 17, Line 14<br />
Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.</p></blockquote>
<p>To be honest, this error was not a surprise because I was already aware of the issue from reading a blog post by <a href="https://twitter.com/sqL_handLe" rel="noopener" target="_blank">@sqL_handle</a> a couple of weeks ago. Updating to CU2 fixed my issue, and the second attempt revealed some interesting outcomes.<br />
The query plan of the first query (&lt;= SQL 2017) is what we usually expect from executing a T-SQL scalar function. From an execution perspective, the function is a black box that shows up only as a Compute Scalar operator, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2017-query-plan.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2017-query-plan.jpg" alt="156 - 3 - UDF 2017 query plan" width="610" height="203" class="alignnone size-full wp-image-1521" /></a></p>
<p>But the story changes with the Scalar UDF Inlining capability. This is illustrated by the pictures below, which are samples of a larger execution plan:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-query-plan-e1583417692798.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-query-plan-e1583417692798.jpg" alt="156 - 3 - UDF 2019 query plan" width="1000" height="375" class="alignnone size-full wp-image-1522" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-2-query-plan-e1583420250648.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-2-query-plan-e1583420250648.jpg" alt="156 - 3 - UDF 2019 2 query plan" width="1000" height="329" class="alignnone size-full wp-image-1524" /></a></p>
<p>The query optimizer has inferred relational operations from my (imperative) scalar UDF thanks to the <a href="https://www.microsoft.com/en-us/research/project/froid/" rel="noopener" target="_blank">Froid framework</a>, which brings several benefits including compiler optimizations and parallelism (previously not possible with scalar UDFs).</p>
<p>Let’s run the same benchmark I used to compare the UDF-based and the T-SQL view-based queries. In fact, I had to use a slight variation of the query to get the Scalar UDF Inlining capability to kick in, because a scalar UDF referenced in a GROUP BY clause is not eligible for inlining:</p>
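<p>To give an intuition of what turning imperative code into relational operations means, the IF/ELSE logic of dbo.inventory_in_stock can be written by hand as a single CASE expression over subqueries. This is only a conceptual sketch of the idea, assuming the same Sakila tables; it is not the actual expression the optimizer generates internally:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Hand-written, set-based equivalent of the UDF body (illustration only)<br />
SELECT <br />
&nbsp; &nbsp; i.inventory_id,<br />
&nbsp; &nbsp; CAST(CASE <br />
&nbsp; &nbsp; &nbsp; &nbsp; -- Branch 1 of the UDF: no rental rows at all for the item<br />
&nbsp; &nbsp; &nbsp; &nbsp; WHEN (SELECT COUNT(*) FROM dbo.rental AS r WHERE r.inventory_id = i.inventory_id) = 0 THEN 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; -- Branch 2 of the UDF: at least one rental without a return_date<br />
&nbsp; &nbsp; &nbsp; &nbsp; WHEN (SELECT COUNT(r2.rental_id) <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM dbo.inventory AS i2 <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LEFT JOIN dbo.rental AS r2 ON i2.inventory_id = r2.inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE i2.inventory_id = i.inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND r2.return_date IS NULL) &gt; 0 THEN 0<br />
&nbsp; &nbsp; &nbsp; &nbsp; -- Otherwise every rental has been returned<br />
&nbsp; &nbsp; &nbsp; &nbsp; ELSE 1<br />
&nbsp; &nbsp; END AS BIT) AS inventory_in_stock<br />
FROM dbo.inventory AS i;<br />
GO</div></div>
<p>Once the logic is expressed this way, the optimizer can treat the subqueries like any other part of the calling query instead of invoking a black box row by row.</p>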
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- First UDF query <br />
select count(*),dbo.inventory_in_stock(inventory_id) <br />
from inventory <br />
group by dbo.inventory_in_stock(inventory_id)<br />
GO<br />
<br />
-- Variation of the first query<br />
;with cte<br />
as<br />
(<br />
&nbsp; &nbsp; select inventory_id,dbo.inventory_in_stock(inventory_id) as inventory_in_stock<br />
&nbsp; &nbsp; from inventory <br />
)<br />
select &nbsp;count(*), inventory_in_stock<br />
from cte<br />
group by inventory_in_stock<br />
GO</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-4-UDF-2019-benchmark-query-plan-e1583420352841.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-4-UDF-2019-benchmark-query-plan-e1583420352841.jpg" alt="156 - 4 - UDF 2019 benchmark query plan" width="1000" height="357" class="alignnone size-full wp-image-1525" /></a></p>
<p>From a performance perspective, it is worth noting that the improvement shows up less in read operations than in CPU time and duration.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-5-UDF-vs-UDF-inline-performance-e1583420400334.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-5-UDF-vs-UDF-inline-performance-e1583420400334.jpg" alt="156 - 5 - UDF vs UDF inline performance" width="1000" height="110" class="alignnone size-full wp-image-1527" /></a></p>
<p>But let’s push the tests further by increasing the amount of data. As a reminder, the performance of the test is tied to the number of UDF executions and therefore to the number of records in the inventory table.</p>
<p>So, let’s add a bunch of records to the Inventory table …</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">INSERT inventory (film_id, store_id, last_update)<br />
SELECT <br />
&nbsp; &nbsp; film_id,<br />
&nbsp; &nbsp; store_id,<br />
&nbsp; &nbsp; GETDATE()<br />
FROM inventory;</div></div>
<p>&#8230; and let&rsquo;s execute this script repeatedly to reach a total of 146592 and then 2345472 rows for the two tests. Here are the corresponding performance outcomes:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-6-UDF-vs-UDF-inline-performance-add-more-rows-e1583420498722.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-6-UDF-vs-UDF-inline-performance-add-more-rows-e1583420498722.jpg" alt="156 - 6 - UDF vs UDF inline performance - add more rows" width="1000" height="225" class="alignnone size-full wp-image-1528" /></a></p>
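<p>Each execution of the INSERT &#8230; SELECT statement above doubles the table. Assuming the sample inventory table starts with 4581 rows (the usual size in the Sakila sample, not stated in this post), five doublings give 146592 rows and nine give 2345472. A minimal sketch of a loop that reaches such a target, where @target is a hypothetical variable introduced for illustration, could be:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Hypothetical helper: double dbo.inventory until it holds at least @target rows<br />
DECLARE @target INT = 146592; -- use 2345472 for the second test<br />
<br />
WHILE (SELECT COUNT(*) FROM dbo.inventory) &lt; @target<br />
BEGIN<br />
&nbsp; &nbsp; -- Same statement as above: copy every existing row once<br />
&nbsp; &nbsp; INSERT dbo.inventory (film_id, store_id, last_update)<br />
&nbsp; &nbsp; SELECT film_id, store_id, GETDATE()<br />
&nbsp; &nbsp; FROM dbo.inventory;<br />
END;<br />
<br />
-- Confirm the resulting row count before running the benchmark<br />
SELECT COUNT(*) AS inventory_rows FROM dbo.inventory;<br />
GO</div></div>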
<p>I noticed that the more rows there are in the inventory table, the greater the performance gain for each corresponding test:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-7-UDF-vs-UDF-inline-performance-chart-cpu.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-7-UDF-vs-UDF-inline-performance-chart-cpu.jpg" alt="156 - 7 - UDF vs UDF inline performance - chart cpu" width="876" height="384" class="alignnone size-full wp-image-1529" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-8-UDF-vs-UDF-inline-performance-chart-duration.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-8-UDF-vs-UDF-inline-performance-chart-duration.jpg" alt="156 - 8 - UDF vs UDF inline performance - chart duration" width="868" height="384" class="alignnone size-full wp-image-1530" /></a></p>
<p>Well, an interesting outcome without rewriting any code, isn’t it? On average, an 80% decrease in query duration and a 61% decrease in CPU time. For the sake of curiosity, let’s take a look at the different query plans:</p>
<p><strong>Scalar UDF Inlining not enabled</strong></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-10-UDF-more-rows-execution-plan-e1583420738157.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-10-UDF-more-rows-execution-plan-e1583420738157.jpg" alt="156 - 10 - UDF - more rows execution plan" width="1000" height="155" class="alignnone size-full wp-image-1531" /></a></p>
<p>Again, the real cost is hidden by the UDF black box behind the Compute Scalar operator, but we can easily guess that every row processed by the Compute Scalar operator triggers a call to the dbo.inventory_in_stock() function.</p>
<p><strong>Scalar UDF Inlining enabled</strong></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-11-UDF-inlining-more-rows-execution-plan-e1583420789863.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-11-UDF-inlining-more-rows-execution-plan-e1583420789863.jpg" alt="156 - 11 - UDF inlining - more rows execution plan" width="1000" height="255" class="alignnone size-full wp-image-1532" /></a></p>
<p>Without going into the details of the execution plan, what draws attention is that the optimizer’s tricks kicked in, including parallelism. All the optimization work done by the query processor helps improve the overall performance of the query.</p>
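<p>As a side note, the two plans can also be compared without switching the database-wide setting back and forth. A minimal sketch, assuming SQL Server 2019 and the CTE variation of the query used earlier, relies on the documented DISABLE_TSQL_SCALAR_UDF_INLINING query hint to turn inlining off for a single statement:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Same query as the variation above, with inlining disabled for this statement only<br />
;with cte<br />
as<br />
(<br />
&nbsp; &nbsp; select inventory_id,dbo.inventory_in_stock(inventory_id) as inventory_in_stock<br />
&nbsp; &nbsp; from inventory <br />
)<br />
select count(*), inventory_in_stock<br />
from cte<br />
group by inventory_in_stock<br />
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));<br />
GO</div></div>
<p>Running the statement with and without the hint in the same session makes it easy to put the two execution plans side by side.</p>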
<p>So last point, does Scalar UDF Inlining scale better than the SQL view? </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-9-UDF-inline-vs-view-performance-e1583420871437.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-9-UDF-inline-vs-view-performance-e1583420871437.jpg" alt="156 - 9 - UDF inline vs view performance" width="1200" height="74" class="alignnone size-full wp-image-1533" /></a></p>
<p>This last output seems to confirm that the SQL view remains the winner among the alternatives in this specific scenario. You will have to choose the best solution, and likely an acceptable tradeoff, that fits your context.</p>
<p>See you!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
