<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David Barbarin &#187; performance</title>
	<atom:link href="https://blog.developpez.com/mikedavem/ptag/performance/feed" rel="self" type="application/rss+xml" />
	<link>https://blog.developpez.com/mikedavem</link>
	<description>MVP DataPlatform - MCM SQL Server</description>
	<lastBuildDate>Thu, 09 Sep 2021 21:19:50 +0000</lastBuildDate>
	<language>fr-FR</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.42</generator>
	<item>
		<title>Azure monitor as observability platform for Azure SQL Databases and more</title>
		<link>https://blog.developpez.com/mikedavem/p13205/sql-azure/azure-monitor-as-observability-platform-for-azure-sql-databases</link>
		<comments>https://blog.developpez.com/mikedavem/p13205/sql-azure/azure-monitor-as-observability-platform-for-azure-sql-databases#comments</comments>
		<pubDate>Mon, 08 Feb 2021 16:57:26 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[DevOps]]></category>
		<category><![CDATA[SQL Azure]]></category>
		<category><![CDATA[Azure Monitor]]></category>
		<category><![CDATA[Azure SQL Analytics]]></category>
		<category><![CDATA[Azure SQL Database]]></category>
		<category><![CDATA[devops]]></category>
		<category><![CDATA[Log Analytics]]></category>
		<category><![CDATA[observability]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1762</guid>
		<description><![CDATA[In a previous blog post, I wrote about reasons we moved our monitoring of on-prem SQL Server instances on Prometheus and Grafana. But what about Cloud and database services? We have different options and obviously in my company we thought &#8230; <a href="https://blog.developpez.com/mikedavem/p13205/sql-azure/azure-monitor-as-observability-platform-for-azure-sql-databases">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In a previous <a href="https://blog.developpez.com/mikedavem/p13203/sql-server-2014/why-we-moved-sql-server-monitoring-on-prometheus-and-grafana" rel="noopener" target="_blank">blog post</a>, I wrote about reasons we moved our monitoring of on-prem SQL Server instances on Prometheus and Grafana. But what about Cloud and database services? </p>
<p><span id="more-1762"></span></p>
<p>We have different options and obviously in my company we first thought of moving our Azure SQL Database workload telemetry to our on-prem central monitoring infrastructure as well. But the main blocker is the serverless compute tier: the Telegraf server agent would keep a connection open, which could prevent the database from auto-pausing, or at least it would make monitoring more complex because it would assume a predictable workload at all times. </p>
<p>The second option was to rely on Azure Monitor, which is a common platform combining several logging, monitoring and dashboard solutions across a wide set of Azure resources. It is a scalable, fully managed platform that provides a powerful query language and native features like alerts when logs or metrics match specific conditions. Another important point is that there is no vendor lock-in with this solution, as we can always fall back to our self-hosted Prometheus and Grafana instances if the compute tier no longer fits or if Azure Monitor is no longer an option! </p>
<p>Firstly, to achieve good observability with Azure SQL Database, we need to put both diagnostic telemetry and SQL Server audit events in a common Log Analytics workspace. A quick illustration below: </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-0-Azure-SQL-DB-Monitor-architecture.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-0-Azure-SQL-DB-Monitor-architecture-1024x387.jpg" alt="173 - 0 - Azure SQL DB Monitor architecture" width="584" height="221" class="alignnone size-large wp-image-1763" /></a></p>
<p>Diagnostic settings are configured per database and include basic metrics (CPU, IO, memory, etc.) as well as different SQL Server internal metrics such as deadlocks, blocked processes, and Query Store information about query execution statistics and waits. For more details please refer to the Microsoft <a href="https://docs.microsoft.com/en-us/azure/azure-sql/database/metrics-diagnostic-telemetry-logging-streaming-export-configure?tabs=azure-portal" rel="noopener" target="_blank">BOL</a>.</p>
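<p>As an illustration, once the diagnostic telemetry is flowing, a quick KQL query against the Log Analytics workspace can surface deadlock events per hour. This is only a sketch: the category name and the server placeholder below are assumptions to adapt to your own environment:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">// Count deadlock events per hour for one logical server (names are placeholders)<br />
AzureDiagnostics<br />
| where Category == 'Deadlocks' and LogicalServerName_s == 'xxxx'<br />
| summarize count() by bin(TimeGenerated, 1h)<br />
| render timechart</div></div>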
<p>Azure SQL Database auditing can be configured at either the server level or the database level. In our context, we defined a template of events at the server level which is then applied to all databases within the logical server. By default, 3 events are automatically audited:<br />
&#8211;	BATCH_COMPLETED_GROUP<br />
&#8211;	SUCCESSFUL_DATABASE_AUTHENTICATION_GROUP<br />
&#8211;	FAILED_DATABASE_AUTHENTICATION_GROUP</p>
<p>The first one in the list is debatable depending on the environment because of its impact, but in our context it is acceptable because we are dealing with a data warehouse workload. However, we added other ones to meet our security requirements:<br />
&#8211;	PERMISSION_CHANGE_GROUP<br />
&#8211;	DATABASE_PRINCIPAL_CHANGE_GROUP<br />
&#8211;	DATABASE_ROLE_MEMBER_CHANGE_GROUP<br />
&#8211;	USER_CHANGE_PASSWORD_GROUP</p>
<p>Note that if you look at Log Analytics as a target for SQL audits, you will notice it is still a feature in preview, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-4-Audit-target.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-4-Audit-target.jpg" alt="173 - 4 - Audit target" width="484" height="153" class="alignnone size-full wp-image-1765" /></a></p>
<p>To be clear, we usually don’t consider using Azure preview features in production, especially when they remain in this state for a long time, but in this specific context we were interested in the observability capabilities of the platform. On the one hand, we get very useful performance insights through SQL Analytics dashboards (again in preview) and on the other hand we can easily query logs and traces through Log Analytics for correlation with other metrics. Obviously, we hope Microsoft moves a step further and provides this feature in GA in the near future. </p>
<p>Let’s talk briefly about SQL Analytics first. It is an advanced and free cloud monitoring solution for Azure SQL Database performance, and it relies mainly on your Azure diagnostic metrics and Azure Monitor views to present data in a structured way through performance dashboards.</p>
<p>Here is an example of the built-in dashboards we are using to track activity and high CPU / IO bound queries against our data warehouse.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-1-SQL-Analytics-general-dashboard-e1612797920282.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-1-SQL-Analytics-general-dashboard-1024x410.jpg" alt="173 - 1 - SQL Analytics general dashboard" width="584" height="234" class="alignnone size-large wp-image-1768" /></a></p>
<p>You can use drill-down capabilities to different contextual dashboards to get insights into resource-intensive queries. For example, we identified some LOG IO intensive queries against a clustered columnstore index and, after refactoring an UPDATE statement into DELETE + INSERT, we drastically reduced LOG IO waits.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-2-SQL-Analytics-IO-e1612797960660.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-2-SQL-Analytics-IO-1024x316.jpg" alt="173 - 2 - SQL Analytics IO" width="584" height="180" class="alignnone size-large wp-image-1767" /></a></p>
<p>In addition, Azure Monitor helped us in another scenario, where we tried to figure out recent workload patterns and to know if the current compute tier still fits them. As said previously, we are relying on the serverless compute tier to handle the data warehouse-oriented workload with both auto-scaling and auto-pausing capabilities. At first glance, we might expect a typical nightly workload as illustrated in the Microsoft <a href="https://docs.microsoft.com/en-us/azure/azure-sql/database/serverless-tier-overview#:~:text=Serverless%20is%20a%20compute%20tier,of%20compute%20used%20per%20second." rel="noopener" target="_blank">BOL</a>, with cost optimized for this workload:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-6-Serverless-pattern.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-6-Serverless-pattern.jpg" alt="173 - 6 - Serverless pattern" width="516" height="316" class="alignnone size-full wp-image-1769" /></a></p>
<p><em>Images from Microsoft BOL</em></p>
<p>It could have been true when the activity started on Azure, but the game has changed with new projects coming in over time. Starting with the general performance dashboard, the workload seems to follow the right pattern for the serverless compute tier, but we noticed billing kept going during unexpected timeframes, as shown below. Note that I deliberately show only a sample of two days, but this pattern is a good representation of the general workload in our context. </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-3-General-performance-dashboard.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-3-General-performance-dashboard-1024x556.jpg" alt="173 - 3 - General performance dashboard" width="584" height="317" class="alignnone size-large wp-image-1771" /></a></p>
<p>Indeed, the workload should be mostly nightly-oriented with sporadic activity during the day, but a quick correlation with other basic metrics like CPU or memory percentage usage confirmed persistent activity all day. We have CPU spikes and probably small batches that keep a minimum of memory in use at other moments. </p>
<p>As per the <a href="https://docs.microsoft.com/en-us/azure/azure-sql/database/serverless-tier-overview#:~:text=Serverless%20is%20a%20compute%20tier,of%20compute%20used%20per%20second." rel="noopener" target="_blank">Microsoft documentation</a>, the minimum auto-pausing delay value is 1h and requires an inactive database (number of sessions = 0 and CPU = 0 for user workload) during this timeframe. Basic metrics didn’t provide any further insight about the connections, applications or users that could generate such &laquo;&nbsp;noisy&nbsp;&raquo; activity, so we had to go another way by looking at the SQL audit logs stored in Azure Monitor Logs. Data can be read through KQL, which stands for Kusto Query Language (and not Kibana Query Language <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":-)" class="wp-smiley" /> ). It’s the language used to query the Azure log databases (Azure Monitor Logs, Azure Monitor Application Insights and others) and it is pretty similar to SQL in its constructs. </p>
<p>Here is the first query I used to correlate the number of events that could prevent auto-pausing from kicking in for the database concerned, including RPC COMPLETED, BATCH COMPLETED, DATABASE AUTHENTICATION SUCCEEDED and DATABASE AUTHENTICATION FAILED:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">AzureDiagnostics<br />
| where Category == 'SQLSecurityAuditEvents' and (action_name_s in ('RPC COMPLETED','BATCH COMPLETED') or action_name_s contains &quot;DATABASE AUTHENTICATION&quot;) and LogicalServerName_s == 'xxxx' and database_name_s == 'xxxx'<br />
| summarize count() by bin(event_time_t, 1h),action_name_s<br />
| render columnchart</div></div>
<p>Results are aggregated and bucketized per hour on the generated event time with the bin() function. Finally, for a quick and easy read, I chose a simple, unformatted column chart render. Here is the outcome:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-7-Audit-per-hour-per-event-e1612798279257.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-7-Audit-per-hour-per-event-1024x459.jpg" alt="173 - 7 - Audit per hour per event" width="584" height="262" class="alignnone size-large wp-image-1772" /></a></p>
<p>As you probably noticed, daily activity is pretty small compared to the nightly one and seems to consist of SQL batches and remote procedure calls. Even from this unclear picture, we can confirm the daily workload is enough to keep billing going, because there is no one-hour timeframe without any activity. </p>
<p>Let’s write another KQL query to draw a clearer picture of which applications ran during the daily timeframe 07:00 – 20:00:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">let start=datetime(&quot;2021-01-26&quot;);<br />
let end=datetime(&quot;2021-01-29&quot;);<br />
let dailystart=7;<br />
let dailyend=20;<br />
let timegrain=1d;<br />
AzureDiagnostics<br />
| project action_name_s, event_time_t, application_name_s, server_principal_name_s, Category, LogicalServerName_s, database_name_s<br />
| where Category == 'SQLSecurityAuditEvents' and (action_name_s in ('RPC COMPLETED','BATCH COMPLETED') or action_name_s contains &quot;DATABASE AUTHENTICATION&quot;)<br />
| where LogicalServerName_s == 'xxxx' and database_name_s == 'xxxx' <br />
| where event_time_t &gt; start and event_time_t &lt; end<br />
| where datetime_part(&quot;Hour&quot;, event_time_t) between (dailystart .. dailyend)<br />
| summarize count() by bin(event_time_t, 1h), application_name_s<br />
| render columnchart with (xtitle = &#039;Date&#039;, ytitle = &#039;Nb events&#039;, title = &#039;Prod SQL Workload pattern&#039;)</div></div>
<p>And here the new outcome:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-8-Audit-per-hour-per-application.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-8-Audit-per-hour-per-application-1024x380.jpg" alt="173 - 8 - Audit per hour per application" width="584" height="217" class="alignnone size-large wp-image-1774" /></a></p>
<p>The new chart reveals some activity from SQL Server Management Studio, but most of it concerns applications using the .NET SQL data provider. For better clarity, we needed more information about the applications and, in my context, I managed to address the point by narrowing the search scope to the service principal name that issued the related audit event. This results in a new outcome, pretty similar to the previous one:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2021/02/173-9-Audit-per-hour-per-sp.jpg"><img src="http://blog.developpez.com/mikedavem/files/2021/02/173-9-Audit-per-hour-per-sp-1024x362.jpg" alt="173 - 9 - Audit per hour per sp" width="584" height="206" class="alignnone size-large wp-image-1775" /></a></p>
<p>Good job so far. For the sake of clarity, the service principal obfuscated above is used by our Reporting Services infrastructure and its reports to get data from this data warehouse. By going this way to investigate daily activity at different moments on the Azure SQL database concerned, we came to the conclusion that using the serverless compute tier no longer made sense and that we will likely need to move to another compute tier.</p>
<p><strong>Additional thoughts</strong></p>
<p>Azure Monitor is definitely a must-have if you are running resources on Azure and don’t own a platform for observability (metrics, logs and traces). Otherwise, it can even be beneficial for freeing up your on-prem monitoring infrastructure resources if scalability is a concern. Furthermore, there is no vendor lock-in, and you can decide to stream Azure Monitor data elsewhere, at the cost of additional network transfer fees depending on the target scenario. For example, Azure Monitor can be used directly as a data source with Grafana, Azure SQL telemetry can be collected with the Telegraf agent, and audit logs can be recorded in another logging system like Kibana. In this blog post, we have only scratched the surface of Azure Monitor's capabilities but, as demonstrated above, performing deep correlation analysis across different sources in very few steps is a strong point of this platform.</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Building a more robust and efficient statistic maintenance with large tables</title>
		<link>https://blog.developpez.com/mikedavem/p13201/sql-server-vnext/building-a-more-robust-and-efficient-statistic-maintenance-with-large-tables</link>
		<comments>https://blog.developpez.com/mikedavem/p13201/sql-server-vnext/building-a-more-robust-and-efficient-statistic-maintenance-with-large-tables#comments</comments>
		<pubDate>Mon, 26 Oct 2020 21:05:34 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[rebuild index]]></category>
		<category><![CDATA[update statistic]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1689</guid>
		<description><![CDATA[In the past, I have gone different ways to improve update statistic maintenance in different shops according to their context, requirements and constraints, as well as the SQL Server version used at the time. All are important inputs for creating &#8230; <a href="https://blog.developpez.com/mikedavem/p13201/sql-server-vnext/building-a-more-robust-and-efficient-statistic-maintenance-with-large-tables">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In the past, I have gone different ways to improve update statistic maintenance in different shops according to their context, requirements and constraints, as well as the SQL Server version used at the time. All are important inputs for creating a good maintenance strategy, which can be as simple as executing sp_updatestats or as specialized as scripts focusing on some tables.  </p>
<p><span id="more-1689"></span></p>
<p>One of my latest experiences on this topic was probably one of the best, although we went a circuitous way to deal with a long update statistic maintenance task on a large database. We used a mix of statistic analysis and improvements provided by SQL Server 2014 SP1 CU6, including parallel update statistic capabilities. I wrote a <a href="https://blog.dbi-services.com/experiencing-updating-statistics-on-a-big-table-by-unusual-ways/" rel="noopener" target="_blank">blog post</a> if you are interested in learning more about this experience.</p>
<p>I’m now working for a new company, meaning a different context &#8230; At the time of this write-up, we are running SQL Server 2017 CU21 and database sizes are in a different order of magnitude (more than 100GB compressed) compared to my previous experience. However, switching from the default sampling method to FULLSCAN for some large tables drastically increased the update statistic task beyond the allowed maintenance window (00:00AM to 03:00AM) without any optimization. </p>
<p><strong>Why change the update statistic sampling method? </strong></p>
<p>Let’s start from the beginning: why do we need to change the default statistic sampling? This topic has already been covered in detail on the internet and, to make the story short, good statistics are part of the recipe for efficient execution plans and queries. The default sampling size used by both the auto-update mechanism and the UPDATE STATISTICS command without any specification comes from a <a href="https://docs.microsoft.com/en-us/archive/blogs/srgolla/sql-server-statistics-explained" rel="noopener" target="_blank">non-linear algorithm</a> and may not produce good histograms for large tables. Indeed, the sampling size decreases as the table gets bigger, leading to a rough picture of the values in the table which may affect cardinality estimation in execution plans &#8230; exactly the side effects we experienced with a couple of our queries and wanted to minimize in the future. Therefore, we decided to improve cardinality estimation by switching to the FULLSCAN method, only for some big tables, to produce better histograms. But this method also comes at the cost of a direct impact on consumed resources and execution time, because the engine needs to read more data to build a better picture of data distribution, sometimes with higher <a href="https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql?redirectedfrom=MSDN&amp;view=sql-server-ver15" rel="noopener" target="_blank">tempdb usage</a>. Our first attempt on the ACC environment increased the update statistic maintenance task from initially 5min with the default sampling size to 3.5 hours with the FULLSCAN method for the large tables only &#8230; obviously an unsatisfactory solution because we were outside the allowed maintenance window. </p>
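<p>As a reminder, the switch itself is a one-liner per table (the table name below is a placeholder):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Default sampling: sample size comes from the non-linear algorithm<br />
UPDATE STATISTICS dbo.BigTable;<br />
-- Full scan: reads every row to build the histogram (better estimates, higher cost)<br />
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;</div></div>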
<p><strong>Context matters</strong></p>
<p>But first let’s set the context a little more: the term “large” can be relative according to the environment. In my context, it means tables with more than 100M rows and less than 100GB in size for the biggest ones, and 10M rows and 10GB in size for the smaller ones. Note that for partitioned tables the total size includes the archive partition’s compression. </p>
<p>Another gritty detail: the databases concerned are part of availability groups and maxdop for the primary replica was set to 1. There is a long story behind this value, with side effects encountered in the past when switching to <strong>maxdop &gt; 1 and cost threshold for parallelism = 50</strong>. At certain times of the year, the workload increases a lot and we faced memory allocation issues for some parallel queries (parallel queries usually require more memory). This is something we need to investigate further, but we switched back to maxdop = 1 for now and I would say so far so good &#8230;</p>
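<p>For reference, these two server-level settings are managed with sp_configure (the values shown are simply the ones discussed above, not a recommendation):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">EXEC sp_configure 'show advanced options', 1;<br />
RECONFIGURE;<br />
-- Current server-wide setting in our context<br />
EXEC sp_configure 'max degree of parallelism', 1;<br />
-- Value we experimented with before switching back<br />
EXEC sp_configure 'cost threshold for parallelism', 50;<br />
RECONFIGURE;</div></div>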
<p>Because our index structures are not heavily fragmented between two rebuild operations, we’re not in favor of frequent index rebuilds. Even if such an operation can be done online or resumed with SQL Server 2017 EE, it remains a very resource-intensive operation, including log block replication on the underlying Always On infrastructure. In addition, there is a strong commitment to minimizing resource overhead during the maintenance window because of the concurrent business workload in the same timeframe.  </p>
<p><strong>Options available to speed-up update statistic task</strong></p>
<p> <strong>Using MAXDOP / PERSIST_SAMPLE_PERCENT with UPDATE STATISTICS command</strong></p>
<p><a href="https://support.microsoft.com/en-us/help/4041809/kb4041809-update-adds-support-for-maxdop-for-create-statistics-and-upd" rel="noopener" target="_blank">KB4041809</a> describes new support for the MAXDOP option in the CREATE STATISTICS and UPDATE STATISTICS statements in Microsoft SQL Server 2014, 2016 and 2017. This is especially helpful to override MAXDOP settings defined at the server or database-scope level. As a reminder, the maxdop value is forced to 1 in our context on availability group primary replicas. </p>
<p>For partitioned tables we don’t go with this setting because update statistics is done at the partition level (see next section). The tables concerned own 2 partitions, respectively CURRENT and ARCHIVE. We keep the former small in size and with a relatively low number of rows (only the last 2 weeks of data). Therefore, there is no real benefit in using MAXDOP to force update statistics to run in parallel in this case.</p>
<p>But non-partitioned large tables (&gt;= 10GB) are good candidates. According to the following picture, we noticed an execution time reduction of 57% by increasing the maxdop value to 4 for some large tables with these specifications:<br />
&#8211;	~= 10GB<br />
&#8211;	~ 11M rows<br />
&#8211;	112 columns<br />
&#8211;	71 statistics</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-11-maxdop-nonpartitioned-tables.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-11-maxdop-nonpartitioned-tables.jpg" alt="168 - 11 - maxdop - nonpartitioned tables" width="481" height="289" class="alignnone size-full wp-image-1691" /></a></p>
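<p>The corresponding command is straightforward (the table name is a placeholder):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Override the instance-level maxdop (forced to 1 on our primary replicas)<br />
-- for this statistics update only (support added by KB4041809)<br />
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN, MAXDOP = 4;</div></div>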
<p>Another feature we went through is described in <a href="https://support.microsoft.com/en-us/help/4039284/kb4039284-enhancement-new-keyword-is-added-to-create-and-update-statis" rel="noopener" target="_blank">KB4039284</a> and has been available since SQL Server 2016. In our context, the maintenance of statistics relies on a custom stored procedure (not Ola's maintenance scripts yet); we have configured the default sampling rate method for all statistics and wanted to make an exception only for the targeted large tables. In the past, we had to use the <a href="https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql?view=sql-server-ver15" rel="noopener" target="_blank">NO_RECOMPUTE</a> option to exclude statistics from automatic updates. The new PERSIST_SAMPLE_PERCENT option tells SQL Server to lock the sampling rate for future update operations, and we are using it for non-partitioned large tables. </p>
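<p>A minimal sketch of what this looks like (placeholder names again):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Full scan now, and keep the 100 percent sampling rate for subsequent<br />
-- automatic or unspecified manual updates of this table's statistics<br />
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;</div></div>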
<p> <strong>Incremental statistics</strong></p>
<p>SQL Server 2017 provides interesting options to reduce maintenance overhead. Surprisingly, some large tables were already partitioned but no incremental statistics were configured. Incremental statistics are especially useful for tables where only a few partitions change at a time and are a great feature to improve the efficiency of statistic maintenance, because operations are done at the partition level since SQL Server 2014. I wrote another <a href="https://blog.dbi-services.com/sql-server-2014-new-incremental-statistics/" rel="noopener" target="_blank">blog post</a> about them a couple of years ago, and here was a great opportunity to apply theoretical concepts to a practical use case. Because we had already implemented partition-level maintenance for indexes, it made sense to apply the same method to statistics, to minimize the overhead of the FULLSCAN method and to benefit from the statistic update threshold at the partition level. As said in the previous section, partitioned tables own 2 partitions, CURRENT (last 2 weeks) and ARCHIVE, and the goal was to only update statistics on the CURRENT partition on a daily basis. However, note that although statistic objects are managed at the partition level, the SQL Server optimizer is not able to use them directly (no change from SQL Server 2014 to SQL Server 2019 as far as I know) and refers instead to the global statistic object.</p>
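<p>In practice, this boils down to creating the statistic as incremental and then refreshing only the partition that changed. The column name and partition number below are hypothetical:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- The statistic must be created (or rebuilt) as incremental first<br />
CREATE STATISTICS XXXX_OID ON dbo.[BIG TABLE] (OID)<br />
WITH FULLSCAN, INCREMENTAL = ON;<br />
-- Daily maintenance: refresh only the CURRENT partition (here partition 1)<br />
UPDATE STATISTICS dbo.[BIG TABLE] (XXXX_OID)<br />
WITH RESAMPLE ON PARTITIONS (1);</div></div>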
<p>Let’s demonstrate with the following example:</p>
<p>Let&rsquo;s consider BIG TABLE with 2 partitions for CURRENT (last 2 weeks) and ARCHIVE values as shown below:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT <br />
&nbsp; &nbsp; s.object_id,<br />
&nbsp; &nbsp; s.name AS stat_name,<br />
&nbsp; &nbsp; sp.rows,<br />
&nbsp; &nbsp; sp.rows_sampled,<br />
&nbsp; &nbsp; sp.node_id,<br />
&nbsp; &nbsp; sp.left_boundary,<br />
&nbsp; &nbsp; sp.right_boundary,<br />
&nbsp; &nbsp; sp.partition_number<br />
FROM sys.stats AS s<br />
CROSS APPLY sys.dm_db_stats_properties_internal(s.object_id, s.stats_id) AS sp<br />
WHERE s.object_id = OBJECT_ID('[dbo].[BIG TABLE]')<br />
AND s.name = 'XXXX_OID'</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-2-Stats-Partition-e1603745274341.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-2-Stats-Partition-e1603745274341.jpg" alt="168 - 2 - Stats Partition" width="800" height="113" class="alignnone size-full wp-image-1692" /></a></p>
<p>The statistic object is incremental, and we get an internal picture of the per-partition statistics and the global one. You need to enable trace flag 2309 and add the node id reference to the DBCC SHOW_STATISTICS command as well. Let’s dig into the ARCHIVE partition to find a specific value within a histogram step:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">DBCC TRACEON ( 2309 );<br />
GO<br />
DBCC SHOW_STATISTICS('[dbo].[BIG TABLE]', 'XXX_OID', 7) WITH HISTOGRAM;</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-3-histogram-partition-1.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-3-histogram-partition-1.jpg" alt="168 - 3 - histogram partition 1" width="825" height="157" class="alignnone size-full wp-image-1693" /></a></p>
<p>Then, I used the value 9246258 in the WHERE clause of the following query:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT *<br />
FROM dbo.[BIG TABLE]<br />
WHERE XXXX_OID = 9246258</div></div>
<p>It gives an estimated cardinality of 37.689 rows, as shown below &#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-4-query.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-4-query.jpg" alt="168 - 4 - query" width="614" height="186" class="alignnone size-full wp-image-1694" /></a></p>
<p>&#8230; The cardinality estimation is 37.689 rows while we should expect a value of 12 rows here, referring to the statistic histogram above. Let’s now have a look at the global statistic (nodeid = 1):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">DBCC SHOW_STATISTICS('[dbo].[BIG TABLE]', 'XXX_OID', 1) WITH HISTOGRAM;</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-5-histogram-partition-global.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-5-histogram-partition-global.jpg" alt="168 - 5 - histogram partition global" width="822" height="139" class="alignnone size-full wp-image-1695" /></a></p>
<p>In fact, the query optimizer estimates rows by using the AVG_RANGE_ROWS value between 9189129 and 9473685 in the global statistic. Well, it is likely not as perfect as we may expect. Incremental statistics do help in reducing the time taken to gather stats for sure, but they may not be enough to represent the entire data distribution in the table: we are still limited to 200 steps in the global statistic object. Pragmatically, I think we may mitigate this point by saying things could be worse if we had to either use the default sampling algorithm or decrease the sample size of the update statistic operation. </p>
<p>Let’s illustrate with the BIG TABLE. To keep things simple, I have voluntarily chosen a (real) statistic where data is evenly distributed. Here are some pictures of the real data distribution:</p>
<p>The first one is a simple view of the MIN and MAX boundaries as well as the average number of occurrences (let’s say duplicate records, for a better understanding) per distinct value:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-6-nb_occurences_per_value.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-6-nb_occurences_per_value.jpg" alt="168 - 6 - nb_occurences_per_value" width="457" height="104" class="alignnone size-full wp-image-1696" /></a></p>
<p>Referring to the picture above, we may notice there is no high variation in the number of occurrences per distinct value represented by the leading XXX_OID column in the related index. The picture below shows another representation of the data distribution, where each histogram bucket contains the number of distinct values per number of occurrences. </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-10-histogram_per_nb-occurences.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-10-histogram_per_nb-occurences.jpg" alt="168 - 10 - histogram_per_nb occurences" width="481" height="289" class="alignnone size-full wp-image-1697" /></a></p>
<p>For example, roughly 2.3% of the distinct values in the BIG TABLE have 29 duplicate records. The same applies for values 28, 31 and so on … In short, this histogram confirms a certain degree of homogeneity in the data distribution, and the avg_occurences value is not so far from the truth.</p>
<p>Let’s use the default sample value for UPDATE STATISTICS. A very low sample of rows is taken into account, leading to very approximate statistics, as shown below:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT <br />
&nbsp; &nbsp; rows,<br />
&nbsp; &nbsp; rows_sampled,<br />
&nbsp; &nbsp; CAST(rows_sampled * 100. / rows AS DECIMAL(5,2)) AS [sample_%],<br />
&nbsp; &nbsp; steps<br />
FROM sys.dm_db_stats_properties(OBJECT_ID('[dbo].[BIG TABLE]'), 1)</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-7-default_sample_value.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-7-default_sample_value.jpg" alt="168 - 7 - default_sample_value" width="421" height="56" class="alignnone size-full wp-image-1699" /></a></p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT *<br />
FROM sys.dm_db_stats_histogram(OBJECT_ID('[dbo].[BIG TABLE]'), 1)</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-8-default_sample_histogram-e1603745718861.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-8-default_sample_histogram-e1603745718861.jpg" alt="168 - 8 - default_sample_histogram" width="800" height="218" class="alignnone size-full wp-image-1700" /></a></p>
<p>Focusing on the average_range_rows column values, we may notice the estimation is not representative of the real distribution in the BIG TABLE. </p>
<p>After running the UPDATE STATISTICS command with the FULLSCAN method, the story has changed, and the estimation is now closer to reality:</p>
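<p>For reference, here is a sketch of such an update (the statistic name is illustrative). On recent builds (SQL Server 2016 SP1 CU4 and later), the sample rate can also be persisted so that subsequent automatic updates reuse it:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">UPDATE STATISTICS [dbo].[BIG TABLE] ([stat_XXX_OID]) WITH FULLSCAN;<br />
<br />
-- Optionally persist the sample rate for subsequent (auto) updates<br />
UPDATE STATISTICS [dbo].[BIG TABLE] ([stat_XXX_OID])<br />
WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;</div></div>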
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-9-fullscan_histogram-e1603745769635.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-9-fullscan_histogram-e1603745769635.jpg" alt="168 - 9 - fullscan_histogram" width="800" height="255" class="alignnone size-full wp-image-1701" /></a></p>
<p>As a side note, one additional benefit of using the FULLSCAN method is getting a representative statistic histogram in fewer steps. This is well explained in the SQL Tiger team&rsquo;s <a href="https://docs.microsoft.com/en-us/archive/blogs/sql_server_team/perfect-statistics-histogram-in-just-few-steps" rel="noopener" target="_blank">blog post</a>, and we noticed this specific behavior with some statistic histograms where the frequency is low … mainly statistics related to primary keys and unique indexes.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-1-statistic-histogram-before-after-e1603745894373.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-1-statistic-histogram-before-after-e1603745894373.jpg" alt="168 - 1 - statistic histogram before after" width="800" height="196" class="alignnone size-full wp-image-1702" /></a></p>
<p><strong>How beneficial were incremental statistics? </strong></p>
<p>The picture below refers to one of our biggest partitioned tables, with the following characteristics:<br />
&#8211;	~ 410M rows<br />
&#8211;	~ 63GB in size (including compressed partition size)<br />
&#8211;	67 columns<br />
&#8211;	30 statistics </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/168-12-maxdop-partitioned-tables.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/168-12-maxdop-partitioned-tables.jpg" alt="168 - 12 - maxdop - partitioned tables" width="738" height="289" class="alignnone size-full wp-image-1703" /></a></p>
<p>As noticed in the picture above, overriding the maxdop setting at the database-scoped level resulted in an interesting drop in execution time when the FULLSCAN method is used (from 03h30 to 17s in the best case).<br />
Similarly, combining the efforts done for both non-partitioned and partitioned large tables reduced the execution time of the update statistics task from ~ 03h30 to 15min – 30min in production, which is a better fit with our requirements. </p>
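<p>For illustration, the parallelism override can be sketched as follows (the MAXDOP value is illustrative and should be sized to your hardware):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Database-scoped override picked up by UPDATE STATISTICS<br />
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 8;<br />
<br />
-- Or per statement, starting with SQL Server 2017 CU3 / 2016 SP2<br />
UPDATE STATISTICS [dbo].[BIG TABLE] WITH FULLSCAN, MAXDOP = 8;</div></div>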
<p>Going through a more sophisticated process to update statistics may seem more complicated, but it is strongly required in some specific scenarios. Fortunately, SQL Server provides different features to help optimize this process. I’m looking forward to seeing the features that will be shipped with the next versions of SQL Server.</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Curious case of locking scenario with SQL Server audits</title>
		<link>https://blog.developpez.com/mikedavem/p13200/sql-server-vnext/curious-case-of-locking-scenario-including-sql-server-audits</link>
		<comments>https://blog.developpez.com/mikedavem/p13200/sql-server-vnext/curious-case-of-locking-scenario-including-sql-server-audits#comments</comments>
		<pubDate>Mon, 05 Oct 2020 19:25:47 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[blocking]]></category>
		<category><![CDATA[dbatools]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL Server audit]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1673</guid>
		<description><![CDATA[In high mission-critical environments, ensuring high level of availability is a prerequisite and usually IT department addresses required SLAs (the famous 9’s) with high available architecture solutions. As stated by Wikipedia: availability measurement is subject to some degree of interpretation. &#8230; <a href="https://blog.developpez.com/mikedavem/p13200/sql-server-vnext/curious-case-of-locking-scenario-including-sql-server-audits">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In high mission-critical environments, ensuring a high level of availability is a prerequisite, and the IT department usually addresses the required SLAs (the famous 9’s) with highly available architecture solutions. As stated by <a href="https://en.wikipedia.org/wiki/High_availability" rel="noopener" target="_blank">Wikipedia</a>: <strong><em>availability measurement is subject to some degree of interpretation</em></strong>. Thus, IT departments generally focus on the uptime metric, whereas for other departments availability is often related to application response time or tied to slowness / unresponsiveness complaints. The latter is about application throughput, and database locks may contribute to reducing it. This is something we are constantly monitoring in addition to uptime in my company. </p>
<p><span id="more-1673"></span></p>
<p>A couple of weeks ago, we suddenly began to experience some unexpected blocking issues involving a specific query pattern and the SQL Server audit feature. This is all the more important as this scenario started from one specific database and created a long tree of blocked processes, with a blocked SQL Server audit operation first, which then propagated to all databases on the SQL Server instance. A very bad scenario we definitely want to avoid … Here is a sample of the blocking process tree:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-1-blocking-scenarios-e1601924652500.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-1-blocking-scenarios-e1601924652500.jpg" alt="167 - 1 - blocking scenarios" width="800" height="56" class="alignnone size-full wp-image-1674" /></a></p>
<p>First, let’s set the context:</p>
<p>We have been using SQL Server audit for different purposes since SQL Server 2014, and we are actually running SQL Server 2017 CU21 at the moment of this write-up. The obvious purpose is security regulatory compliance with login events. We also rely on SQL Server audits to extend the observability of our monitoring system (based on Prometheus and Grafana). Configuration changes are audited with specific events, and we link the concerned events with annotations in our SQL Server Grafana dashboards. Thus, we are able to quickly correlate events with behavior changes that may occur on the database side. The high-level view of the audit infrastructure is as follows:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-0-audit-architecture-e1601924728531.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-0-audit-architecture-e1601924728531.jpg" alt="167 - 0 - audit architecture" width="800" height="417" class="alignnone size-full wp-image-1675" /></a></p>
<p>As shown in the picture above, a PowerShell script carries out stopping and restarting the audit target, and then we use the archived audit file to import the related data into a dedicated database.<br />
Let’s be clear: we had used this process without any issue for a couple of years, and we were surprised to experience such behavior at this moment. Surprising enough for me to write a blog post <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /> &#8230; Digging further for the root cause, we pointed to a specific pattern that seemed to be the culprit of our specific issue:</p>
<p><strong><br />
1.	Open transaction<br />
2.	Foreach row in a file execute an UPSERT statement<br />
3.	Commit transaction<br />
</strong></p>
<p>This is a <a href="https://www.red-gate.com/simple-talk/sql/t-sql-programming/rbar-row-by-agonizing-row/" rel="noopener" target="_blank">RBAR pattern</a> and it may become slow according to the number of rows it has to deal with. In addition, the logic is encapsulated within a single transaction, leading to locks accumulating for the whole transaction duration. Thinking about it, we hadn’t faced this specific locking issue with other queries so far because they are executed within short transactions by design. </p>
<p>This point is important because enabling SQL Server audits also implies extra metadata locks. We decided to mimic this behavior on a TEST environment in order to figure out what happened exactly.</p>
<p>Here are the scripts we used for that purpose:</p>
<p><strong>TSQL script:</strong></p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Create audit<br />
USE [master]<br />
GO<br />
<br />
CREATE SERVER AUDIT [Audit-Target-Login]<br />
TO FILE <br />
( &nbsp; FILEPATH = N'/var/opt/mssql/log/'<br />
&nbsp; &nbsp; ,MAXSIZE = 0 MB<br />
&nbsp; &nbsp; ,MAX_ROLLOVER_FILES = 2147483647<br />
&nbsp; &nbsp; ,RESERVE_DISK_SPACE = OFF<br />
)<br />
WITH<br />
( &nbsp; QUEUE_DELAY = 1000<br />
&nbsp; &nbsp; ,ON_FAILURE = CONTINUE<br />
)<br />
WHERE (<br />
&nbsp; &nbsp; [server_principal_name] like '%\%' <br />
&nbsp; &nbsp; AND NOT [server_principal_name] like '%\svc%' <br />
&nbsp; &nbsp; AND NOT [server_principal_name] like 'NT SERVICE\%' <br />
&nbsp; &nbsp; AND NOT [server_principal_name] like 'NT AUTHORITY\%' <br />
&nbsp; &nbsp; AND NOT [server_principal_name] like '%XDCP%'<br />
);<br />
<br />
ALTER SERVER AUDIT [Audit-Target-Login] WITH (STATE = ON);<br />
GO<br />
<br />
CREATE SERVER AUDIT SPECIFICATION [Server-Audit-Target-Login]<br />
FOR SERVER AUDIT [Audit-Target-Login]<br />
ADD (FAILED_DATABASE_AUTHENTICATION_GROUP),<br />
ADD (SUCCESSFUL_DATABASE_AUTHENTICATION_GROUP),<br />
ADD (FAILED_LOGIN_GROUP),<br />
ADD (SUCCESSFUL_LOGIN_GROUP),<br />
ADD (LOGOUT_GROUP)<br />
WITH (STATE = ON)<br />
GO<br />
<br />
USE [DBA] <br />
GO <br />
<br />
-- Tables to simulate the scenario<br />
CREATE TABLE dbo.T ( <br />
&nbsp; &nbsp; id INT, <br />
&nbsp; &nbsp; col1 VARCHAR(50) <br />
);<br />
<br />
CREATE TABLE dbo.T2 ( <br />
&nbsp; &nbsp; id INT, <br />
&nbsp; &nbsp; col1 VARCHAR(50) <br />
); <br />
<br />
INSERT INTO dbo.T VALUES (1, REPLICATE('T',20));<br />
INSERT INTO dbo.T2 VALUES (1, REPLICATE('T',20));</div></div>
<p><strong>PowerShell scripts:</strong></p>
<p>Session 1: Simulating SQL pattern</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"># Scenario simulation &nbsp;<br />
$server ='127.0.0.1' <br />
$Database ='DBA' <br />
<br />
$Connection =New-Object System.Data.SQLClient.SQLConnection <br />
$Connection.ConnectionString = &quot;Server=$server;Initial Catalog=$Database;Integrated Security=false;User ID=sa;Password=P@SSw0rd1;Application Name=TESTLOCK&quot; <br />
$Connection.Open() <br />
<br />
$Command = New-Object System.Data.SQLClient.SQLCommand <br />
$Command.Connection = $Connection <br />
$Command.CommandTimeout = 500<br />
<br />
$sql = <br />
&quot; <br />
MERGE T AS T <br />
USING T2 AS S ON T.id = S.id <br />
WHEN MATCHED THEN UPDATE SET T.col1 = 'TT' <br />
WHEN NOT MATCHED THEN INSERT (col1) VALUES ('TT'); <br />
<br />
WAITFOR DELAY '00:00:03' &nbsp;<br />
&quot; &nbsp;<br />
<br />
#Begin Transaction <br />
$command.Transaction = $connection.BeginTransaction() <br />
<br />
# Simulate for each row in the file =&gt; Execute merge statement<br />
while(1 -eq 1){<br />
<br />
&nbsp; &nbsp; $Command.CommandText =$sql <br />
&nbsp; &nbsp; $Result =$Command.ExecuteNonQuery() <br />
<br />
}<br />
&nbsp; &nbsp; &nbsp;<br />
$command.Transaction.Commit() <br />
$Connection.Close()</div></div>
<p>Session 2: Simulating stopping / starting SQL Server audit for archiving purpose</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$creds = New-Object System.Management.Automation.PSCredential -ArgumentList ($user, $password)<br />
<br />
$Query = &quot;<br />
&nbsp; &nbsp; USE master;<br />
&nbsp; &nbsp; ALTER SERVER AUDIT [Audit-Target-Login]<br />
&nbsp; &nbsp; WITH ( STATE = OFF );<br />
<br />
&nbsp; &nbsp; ALTER SERVER AUDIT [Audit-Target-Login]<br />
&nbsp; &nbsp; WITH ( STATE = ON );<br />
&quot;<br />
<br />
Invoke-DbaQuery `<br />
&nbsp; &nbsp; -SqlInstance $server `<br />
&nbsp; &nbsp; -Database $Database `<br />
&nbsp; &nbsp; -SqlCredential $creds `<br />
&nbsp; &nbsp; -Query $Query</div></div>
<p>First, we wanted to get a comprehensive picture of the locks acquired during the execution of this specific SQL pattern, with an extended event session and the lock_acquired event, as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE EVENT SESSION [locks] <br />
ON SERVER <br />
ADD EVENT sqlserver.lock_acquired<br />
(<br />
&nbsp; &nbsp; ACTION(sqlserver.client_app_name,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;sqlserver.session_id,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;sqlserver.transaction_id)<br />
&nbsp; &nbsp; WHERE ([sqlserver].[client_app_name]=N'TESTLOCK'))<br />
ADD TARGET package0.histogram<br />
(<br />
&nbsp; &nbsp; SET filtering_event_name=N'sqlserver.lock_acquired',<br />
&nbsp; &nbsp; source=N'resource_type',source_type=(0)<br />
)<br />
WITH <br />
(<br />
&nbsp; &nbsp; MAX_MEMORY=4096 KB,<br />
&nbsp; &nbsp; EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,<br />
&nbsp; &nbsp; MAX_DISPATCH_LATENCY=30 SECONDS,<br />
&nbsp; &nbsp; MAX_EVENT_SIZE=0 KB,<br />
&nbsp; &nbsp; MEMORY_PARTITION_MODE=NONE,<br />
&nbsp; &nbsp; TRACK_CAUSALITY=OFF,<br />
&nbsp; &nbsp; STARTUP_STATE=OFF<br />
)<br />
GO</div></div>
<p>Here is the output we got after running the first PowerShell session:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-2-xe-lock-output.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-2-xe-lock-output.jpg" alt="167 - 2 - xe lock output" width="327" height="158" class="alignnone size-full wp-image-1676" /></a></p>
<p>We confirm METADATA locks in addition to the usual locks acquired on the concerned structures. We correlated this output with sp_WhoIsActive (and @get_locks = 1) after running the second PowerShell session. Note that you will likely have to run the 2nd query several times to reproduce the initial issue.  </p>
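<p>For reference, the call we used is simply:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- @get_locks = 1 adds an XML column detailing locks held by each session<br />
EXEC sp_WhoIsActive @get_locks = 1;</div></div>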
<p>Here is a picture of the locks respectively acquired by session 1 and in waiting state for session 2:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-3-sp_WhoIsActiveGetLocks-e1601925071999.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-3-sp_WhoIsActiveGetLocks-e1601925071999.jpg" alt="167 - 3 - sp_WhoIsActiveGetLocks" width="800" height="344" class="alignnone size-full wp-image-1677" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-4-sp_WhoIsActiveGetLocks2-e1601925104990.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-4-sp_WhoIsActiveGetLocks2-e1601925104990.jpg" alt="167 - 4 - sp_WhoIsActiveGetLocks2" width="800" height="122" class="alignnone size-full wp-image-1678" /></a></p>
<p>We may clearly identify the metadata lock acquired on the SQL Server audit itself (METADATA.AUDIT_ACTIONS with Sch-S) and the second query, with the ALTER SERVER AUDIT … WITH (STATE = OFF) statement, waiting on the same resource (Sch-M). Unfortunately, my Google-fu didn’t provide any relevant information on this topic except the documentation related to the <a href="https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-tran-locks-transact-sql?view=sql-server-ver15" rel="noopener" target="_blank">sys.dm_tran_locks</a> DMV. My guess is that writing events to audits requires a stable underlying infrastructure, and SQL Server needs to protect the concerned components (with Sch-S) against concurrent modifications (Sch-M). Anyway, it is easy to figure out that subsequent queries could be blocked (their Sch-S requests being incompatible with the pending Sch-M on the audit resource) while the previous ones are running.  </p>
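<p>To spot this situation on a live instance, the granted Sch-S and waiting Sch-M locks on the audit metadata can be listed directly from the DMV. A sketch, filtering on the resource subtype we observed in our case:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT request_session_id,<br />
&nbsp; &nbsp; &nbsp; &nbsp;resource_type,<br />
&nbsp; &nbsp; &nbsp; &nbsp;resource_subtype,<br />
&nbsp; &nbsp; &nbsp; &nbsp;request_mode,<br />
&nbsp; &nbsp; &nbsp; &nbsp;request_status<br />
FROM sys.dm_tran_locks<br />
WHERE resource_type = 'METADATA'<br />
&nbsp; &nbsp; AND resource_subtype = 'AUDIT_ACTIONS';</div></div>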
<p>The query pattern exposed previously (unlike short transactions) is a good catalyst for such a blocking scenario due to the accumulation and duration of locks within one single transaction. This is confirmed by the XE output:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-5-lock_sch_s_same_transaction-e1601925276612.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-5-lock_sch_s_same_transaction-e1601925276612.jpg" alt="167 - 5 - lock_sch_s_same_transaction" width="800" height="543" class="alignnone size-full wp-image-1681" /></a></p>
<p>We managed to get a reproducible scenario with the TSQL and PowerShell scripts. In addition, I also ran queries from other databases to confirm it may compromise the responsiveness of the entire workload on the same instance (respectively the DBA3 and DBA4 databases in my test). </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/10/167-6-lock_tree-e1601925310889.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/10/167-6-lock_tree-e1601925310889.jpg" alt="167 - 6 - lock_tree" width="800" height="78" class="alignnone size-full wp-image-1682" /></a></p>
<p><strong>How did we fix this issue?</strong></p>
<p>Even if it is only one part of the solution, I’m a strong believer that this pattern remains a performance killer and that using a set-based approach may drastically reduce the number and duration of locks, and implicitly the chances of making this blocking scenario happen again. Note that it is not only about the MERGE statement, because I managed to reproduce the same issue with INSERT and UPDATE statements as well.</p>
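<p>As an illustration, the RBAR loop could be replaced by bulk-loading the file into a staging table first (e.g. with BULK INSERT) and then applying all rows in one set-based statement. A sketch, where the #staging table is hypothetical:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- All rows applied at once, so locks are held for one short transaction only<br />
MERGE dbo.T AS T<br />
USING #staging AS S<br />
&nbsp; &nbsp; ON T.id = S.id<br />
WHEN MATCHED THEN UPDATE SET T.col1 = S.col1<br />
WHEN NOT MATCHED THEN INSERT (id, col1) VALUES (S.id, S.col1);</div></div>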
<p>Then, this scenario really made us think about a long-term solution, because we cannot guarantee this pattern will not be used by other teams in the future. Looking further at the PowerShell script which carries out the steps of archiving the audit file and inserting data into the audit database, we finally added a QueryTimeout parameter value of 5s to the concerned Invoke-DbaQuery command as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">...<br />
<br />
$query = &quot;<br />
&nbsp; &nbsp; USE [master];<br />
<br />
&nbsp; &nbsp; IF EXISTS (SELECT 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM &nbsp;sys.dm_server_audit_status<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE [name] = '$InstanceAuditPrefix-$AuditName')<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; ALTER SERVER AUDIT [$InstanceAuditPrefix-$AuditName]<br />
&nbsp; &nbsp; &nbsp; &nbsp; WITH (STATE = OFF);<br />
&nbsp; &nbsp; END<br />
<br />
&nbsp; &nbsp; ALTER SERVER AUDIT [$InstanceAuditPrefix-$AuditName]<br />
&nbsp; &nbsp; WITH (STATE = ON);<br />
&quot;<br />
<br />
Invoke-DbaQuery `<br />
&nbsp; &nbsp; -SqlInstance $Instance `<br />
&nbsp; &nbsp; -SqlCredential $SqlCredential `<br />
&nbsp; &nbsp; -Database master `<br />
&nbsp; &nbsp; -Query $query `<br />
&nbsp; &nbsp; -EnableException `<br />
&nbsp; &nbsp; -QueryTimeout 5 <br />
<br />
...</div></div>
<p>Therefore, because we want to prioritize the business workload over the SQL Server audit operation, if such a situation occurs again, stopping the SQL Server audit will time out after 5s, which was relevant in our context. The next iteration of the PowerShell script is able to restart at the last stage executed previously. </p>
<p>Hope this blog post helps.</p>
<p>See you!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SQL Server index rebuild online and blocking scenario</title>
		<link>https://blog.developpez.com/mikedavem/p13199/sql-server-2012/sql-server-index-rebuid-online-and-blocking-scenario</link>
		<comments>https://blog.developpez.com/mikedavem/p13199/sql-server-2012/sql-server-index-rebuid-online-and-blocking-scenario#comments</comments>
		<pubDate>Sun, 30 Aug 2020 21:18:28 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Server 2012]]></category>
		<category><![CDATA[SQL Server 2014]]></category>
		<category><![CDATA[SQL Server 2016]]></category>
		<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[SQL Server 2019]]></category>
		<category><![CDATA[blocking]]></category>
		<category><![CDATA[online operation]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1664</guid>
		<description><![CDATA[A couple of months ago, I experienced a problem about index rebuild online operation on SQL Server. In short, the operation was supposed to be online and to never block concurrent queries. But in fact, it was not the case &#8230; <a href="https://blog.developpez.com/mikedavem/p13199/sql-server-2012/sql-server-index-rebuid-online-and-blocking-scenario">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A couple of months ago, I experienced a problem with an online index operation on SQL Server. In short, the operation was supposed to be online and to never block concurrent queries. But in fact, it was not the case (or to be more precise, it was only partially the case) and, to make the scenario more complex, we experienced different behaviors depending on the context. Let’s start the story with the initial context: in my company, we usually go through continuous deployment including SQL modification scripts, and because we usually rely on a daily pipeline, we must ensure the related SQL operations are not too disruptive, to avoid impacting the user experience.</p>
<p><span id="more-1664"></span></p>
<p>Sometimes, we must introduce new indexes in deployment scripts and, according to how disruptive the script can be, a discussion between Devs and Ops is initiated; it results in either manual deployment managed by the Ops team or automatic deployment through the deployment pipeline by Devs. </p>
<p>Non-disruptive operations can be achieved in many ways, and the ONLINE capabilities of SQL Server may be part of the solution; this is what I suggested for one of our scripts. Let’s illustrate this context with the following example. I created a table named dbo.t1 with a bunch of rows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">USE [test];<br />
<br />
SET NOCOUNT ON;<br />
<br />
DROP TABLE IF EXISTS dbo.t1;<br />
GO<br />
<br />
CREATE TABLE dbo.t1 (<br />
&nbsp; &nbsp; id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,<br />
&nbsp; &nbsp; col1 VARCHAR(50) NULL<br />
);<br />
GO<br />
<br />
INSERT INTO dbo.t1 (col1) VALUES (REPLICATE('T', 50));<br />
GO …<br />
EXEC sp_spaceused 'dbo.t1'<br />
--name&nbsp; rows&nbsp; &nbsp; reserved&nbsp; &nbsp; data&nbsp; &nbsp; index_size&nbsp; unused<br />
--t1&nbsp; &nbsp; 5226496 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 1058000 KB&nbsp; 696872 KB &nbsp; 342888 KB &nbsp; 18240 KB</div></div>
<p>Go ahead and let’s set the context with a pattern of script deployment we went through during this specific deployment. Let’s precise this script is over-simplified, but I keep it voluntarily simple to focus only on the most important part. You will notice the script includes two steps with operations on the same table: updating / fixing values in col1 first and then creating an index on col1.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">/* Code before */<br />
<br />
-- Update some values in the col1 column<br />
UPDATE [dbo].[t1]<br />
SET col1 = REPLICATE('B', 50)<br />
<br />
-- Then create an index on col1 column<br />
CREATE INDEX [col1]<br />
ON [dbo].[t1] (col1) WITH (ONLINE = ON);<br />
GO</div></div>
<p>At the initial stage, the index creation used the default (OFFLINE) mode. Having discussed this point with the DEV team, we decided to create the index ONLINE in this context. The choice between an OFFLINE and an ONLINE operation is often not trivial and should be evaluated carefully, but to keep it simple, let’s say it was the right way to go in our context. Generally speaking, online operations are slower, but the tradeoff was acceptable in order to minimize blocking issues during this deployment. At least, this is what I thought …</p>
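<p>As a side note, since SQL Server 2014 the lock behavior of online index rebuild operations can also be tuned with the WAIT_AT_LOW_PRIORITY option, so the DDL yields instead of blocking the business workload. A sketch (the values are illustrative; this is not the option we used here):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">ALTER INDEX [col1] ON [dbo].[t1]<br />
REBUILD WITH (ONLINE = ON (WAIT_AT_LOW_PRIORITY<br />
&nbsp; &nbsp; (MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF)));</div></div>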
<p>In my demo, without any concurrent workload against the dbo.t1 table, creating the index offline took 6s compared to 12s with the online method. So, an expected result here …</p>
<p>Let’s run another query in another session:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">SELECT id, col1<br />
FROM dbo.t1<br />
WHERE id BETWEEN 1 AND 2</div></div>
<p>In a normal situation, this query should be blocked for a short time, corresponding to the duration of the update operation. But once the update is done, the blocking situation should disappear, even while the index operation is being performed ONLINE. </p>
<p>But now let’s add <a href="https://flywaydb.org/" rel="noopener" target="_blank">Flyway</a> to the context. Flyway is an open source tool we are using for automatic deployment of SQL objects. The deployment script was executed from it in the ACC environment and we noticed longer blocking of concurrent accesses this time. This goes against what we would ideally like. Digging through this issue with the DEV team, we also noticed the following message when running the deployment script:</p>
<p><em>Warning: Online index operation on table &lsquo;dbo.t1&rsquo; will proceed but concurrent access to the table may be limited due to residual lock on the table from a previous operation in the same transaction.<br />
</em></p>
<p>This is something I didn’t notice from SQL Server Management Studio when I tested the same deployment script. So, what happened here?</p>
<p>Referring to the <a href="https://flywaydb.org/documentation/migrations#transactions" rel="noopener" target="_blank">Flyway documentation</a>, it is mentioned that Flyway always wraps the execution of an entire migration within a single transaction by default, and this was exactly the root cause of the issue.</p>
<p>Let’s try with some experimentations: </p>
<p><strong>Test 1</strong>: Update + creating the index online in autocommit mode (one transaction per statement).</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Update some values in the col1 column<br />
UPDATE [dbo].[t1]<br />
SET col1 = REPLICATE('B', 50)<br />
<br />
-- Then create an index on col1 column<br />
CREATE INDEX [col1]<br />
ON [dbo].[t1] (col1) WITH (ONLINE = ON);<br />
GO<br />
-- In another session<br />
SELECT id, col1<br />
FROM dbo.t1<br />
WHERE id BETWEEN 1 AND 2</div></div>
<p><strong>Test 2</strong>: Update + creating the index online within one single explicit transaction</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">BEGIN TRAN;<br />
<br />
-- Update some values in the col1 column<br />
UPDATE [dbo].[t1]<br />
SET col1 = REPLICATE('B', 50)<br />
<br />
-- Then create an index on col1 column<br />
CREATE INDEX [col1]<br />
ON [dbo].[t1] (col1) WITH (ONLINE = ON);<br />
GO<br />
COMMIT TRAN;<br />
-- In another session<br />
SELECT id, col1<br />
FROM dbo.t1<br />
WHERE id BETWEEN 1 AND 2</div></div>
<p>After running these two scripts, we can notice that the blocking duration of the SELECT query is longer in test 2, as shown in the picture below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/08/166-1-blocked-process.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/08/166-1-blocked-process.jpg" alt="166 - 1 - blocked process" width="890" height="358" class="alignnone size-full wp-image-1665" /></a></p>
<p>In test 1, the blocking duration corresponds to that of the update operation (the first step of the script). In test 2, however, we must add the time needed to create the index; to be precise, the index creation is not the blocking operation at all, but it extends the residual lock taken by the previous update operation. In short, this is exactly what the warning message is telling us. You can easily imagine the impact such a situation may have if the index creation takes a long time: you may get exactly the opposite of what you really expected. </p>
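<p>While the second session is blocked, the residual lock left by the UPDATE can be observed directly. A minimal sketch, run from a third session while the explicit transaction of test 2 is still open (table name as in the tests above):</p>

```sql
-- Show object-level locks held on dbo.t1: the IX lock taken by the UPDATE
-- remains until COMMIT, even while the ONLINE index build is running
SELECT tl.request_session_id,
       tl.resource_type,
       tl.request_mode,
       tl.request_status
FROM sys.dm_tran_locks AS tl
WHERE tl.resource_database_id = DB_ID()
  AND tl.resource_type = 'OBJECT'
  AND tl.resource_associated_entity_id = OBJECT_ID(N'dbo.t1');
```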
<p>Obviously, this is not a recommended situation, and creating an index should be run in a very narrow and constrained transaction. But from my experience, things are never that obvious and, depending on your context, you should keep an eye on how transactions are managed, especially when it comes to automatic deployment tooling that can quickly fall out of the scope of the DBA / Ops team. Strong collaboration with the DEV team is recommended to anticipate this kind of issue.</p>
<p>See you !!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Universal usage of NVARCHAR type and performance impact</title>
		<link>https://blog.developpez.com/mikedavem/p13195/sql-server-vnext/universal-usage-of-nvarchar-type-and-performance-impact</link>
		<comments>https://blog.developpez.com/mikedavem/p13195/sql-server-vnext/universal-usage-of-nvarchar-type-and-performance-impact#comments</comments>
		<pubDate>Wed, 27 May 2020 17:06:24 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[convert_implicit]]></category>
		<category><![CDATA[nvarchar]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[Query Store]]></category>
		<category><![CDATA[sqlserver]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1604</guid>
		<description><![CDATA[A couple of weeks ago, I read an article from Brent Ozar about using NVARCHAR as a universal parameter. It was a good reminder and from my experience, I confirm this habit has never been a good idea. Although it depends &#8230; <a href="https://blog.developpez.com/mikedavem/p13195/sql-server-vnext/universal-usage-of-nvarchar-type-and-performance-impact">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A couple of weeks ago, I read an <a href="https://www.brentozar.com/archive/2020/04/can-you-use-nvarchar-as-a-universal-parameter-almost/" rel="noopener" target="_blank">article</a> from Brent Ozar about using NVARCHAR as a universal parameter. It was a good reminder and, from my experience, I confirm this habit has never been a good idea. Although it depends on the context, chances are you will find an exception that proves the rule. </p>
<p><span id="more-1604"></span></p>
<p>A couple of days ago, I fell into a situation that perfectly illustrated this issue and, in this blog post, I decided to share my experience and demonstrate what the impact may be in a real production scenario.<br />
So, let’s start with the culprit. I voluntarily masked some contextual information, but the principle is here. The query is pretty simple:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">DECLARE @P0 DATETIME <br />
DECLARE @P1 INT<br />
DECLARE @P2 NVARCHAR(4000) <br />
DECLARE @P3 DATETIME <br />
DECLARE @P4 NVARCHAR(4000)<br />
<br />
UPDATE TABLE SET DATE = @P0<br />
WHERE ID = @P1<br />
&nbsp;AND IDENTIFIER = @P2<br />
&nbsp;AND P_DATE &gt;= @P3<br />
&nbsp;AND W_O_ID = (<br />
&nbsp; &nbsp;SELECT TOP 1 ID FROM TABLE2<br />
&nbsp; &nbsp;WHERE Identifier = @P4<br />
&nbsp; &nbsp;ORDER BY ID DESC)</div></div>
<p>And the corresponding execution plan: </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-1-excution_plan_with_implicit_conversion-e1590596511773.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-1-excution_plan_with_implicit_conversion-e1590596511773.jpg" alt="162 - 1 - excution_plan_with_implicit_conversion" width="1000" height="380" class="alignnone size-full wp-image-1605" /></a></p>
<p>The most interesting part concerns the TABLE2 table. As you may notice, the @P4 input parameter type is NVARCHAR, and we clearly get a CONVERT_IMPLICIT in the concerned Predicate section above. The CONVERT_IMPLICIT function is required because of <a href="https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-ver15" rel="noopener" target="_blank">data type precedence</a>. It results in a costly operator that scans all the data from TABLE2. As you probably know, CONVERT_IMPLICIT on the column makes the condition non-sargable, whereas a seek is normally what we could expect here, referring to the value distribution in the statistics histogram and the underlying index on the Identifier column.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">EXEC sp_helpindex 'TABLE2';</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-8-index-config.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-8-index-config.jpg" alt="162 - 8 - index config" width="1035" height="135" class="alignnone size-full wp-image-1606" /></a></p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">DBCC SHOW_STATISTICS ('TABLE2', 'IX___IDENTIFIER')<br />
WITH HISTOGRAM;</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-10-histogram-stats.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-10-histogram-stats.jpg" alt="162 - 10 - histogram stats" width="877" height="435" class="alignnone size-full wp-image-1620" /></a></p>
<p>Another important point to keep in mind is that scanning all the data from the TABLE2 table comes at a certain cost (&gt; 1GB), even if the data resides in memory.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">EXEC sp_spaceused 'TABLE2'</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-9-index-space-used.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-9-index-space-used.jpg" alt="162 - 9 - index space used" width="725" height="62" class="alignnone size-full wp-image-1607" /></a></p>
<p>The execution plan warning confirms the potential overhead of retrieving a few rows from the TABLE2 table:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-2-excution_plan_with_implicit_conversion-arning.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-2-excution_plan_with_implicit_conversion-arning.jpg" alt="162 - 2 - excution_plan_with_implicit_conversion arning" width="1160" height="108" class="alignnone size-full wp-image-1608" /></a></p>
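<p>The behaviour is easy to reproduce in isolation. A minimal sketch with hypothetical object names, to compare the two predicate shapes in the actual execution plan:</p>

```sql
-- A VARCHAR column indexed for lookups
CREATE TABLE dbo.demo (
    id INT IDENTITY PRIMARY KEY,
    Identifier VARCHAR(50) NOT NULL
);
CREATE INDEX IX_demo_Identifier ON dbo.demo (Identifier);
GO

DECLARE @p NVARCHAR(4000) = N'ABC123';

-- NVARCHAR outranks VARCHAR in data type precedence, so the column side is
-- converted: CONVERT_IMPLICIT -> non-sargable predicate -> index scan
SELECT id FROM dbo.demo WHERE Identifier = @p;

-- Moving the conversion to the parameter keeps the column untouched:
-- sargable predicate -> index seek
SELECT id FROM dbo.demo WHERE Identifier = CAST(@p AS VARCHAR(50));
```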
<p>To set the context a little bit more, the concerned application queries are mainly based on JDBC prepared statements, which implies using NVARCHAR(4000) for string parameters regardless of the column type in the database (VARCHAR / NVARCHAR). This is at least what I noticed during my investigations. </p>
<p>So, what? Well, in our DEV environment the impact was imperceptible. We had interesting discussions with the DEV team on this topic, and we basically need to improve awareness and visibility in this area (another discussion and probably another blog post) … </p>
<p>But chances are your PROD environment will tell you a different story when it comes to a bigger workload and concurrent query executions. In my context, from an infrastructure standpoint, the symptom was an abnormal increase in CPU consumption a couple of days ago. Usually, CPU consumption was roughly 20% to 30%; in fact, the issue had been around for a longer period, but we didn’t catch it due to the &laquo;&nbsp;normal&nbsp;&raquo; CPU footprint on this server. </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-3-SQL-Processor-dashboard.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-3-SQL-Processor-dashboard.jpg" alt="162 - 3 - SQL Processor dashboard" width="668" height="631" class="alignnone size-full wp-image-1612" /></a></p>
<p>So, what happened here? We&rsquo;re using SQL Server 2017 with Query Store enabled on the concerned database. This feature came to the rescue and brought attention to the first clue: a query plan regression that led to increased IO consumption in the second case (and implicitly additional CPU resource consumption as well).</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-4-QS-regression-plan-e1590597560224.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-4-QS-regression-plan-e1590597560224.jpg" alt="162 - 4 - QS regression plan" width="1000" height="575" class="alignnone size-full wp-image-1613" /></a></p>
<p>You have probably noticed that both execution plans use an index scan on the right, but the more expensive one (at the bottom) uses a different index strategy. Instead of using the primary key and clustered index (PK_xxx), the second execution plan uses the non-clustered index IX_xxx_Identifier on the Identifier column, with the same CONVERT_IMPLICIT issue. </p>
<p>According to the Query Store statistics, the number of executions per business day is roughly 25000, with ~ 8.5h of CPU time consumed during this period (18.05.2020 – 26.05.2020), which is a very different order of magnitude compared to what we may have in the DEV environment <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /></p>
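<p>Numbers like these can be gathered directly from the Query Store catalog views. A sketch of the kind of aggregation used (standard SQL Server 2017 views; the time-interval filter is omitted here for brevity):</p>

```sql
-- Top queries by total CPU time over the collected runtime intervals
-- (avg_cpu_time is reported in microseconds)
SELECT TOP (10)
       q.query_id,
       SUM(rs.count_executions) AS total_executions,
       SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 AS total_cpu_seconds
FROM sys.query_store_query AS q
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
GROUP BY q.query_id
ORDER BY total_cpu_seconds DESC;
```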
<p>At this stage, I would say investigating why the plan regression occurred doesn’t really matter, because in both cases the most expensive operator is an index scan where, again, we expect an index seek. Getting rid of the implicit conversion by using the VARCHAR type, to make the conditional clause sargable, was the better option for us. Thus, the execution plan would be:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-7-Execution-plan-with-seek-e1590597830429.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-7-Execution-plan-with-seek-e1590597830429.jpg" alt="162 - 7 - Execution plan with seek" width="1000" height="158" class="alignnone size-full wp-image-1615" /></a></p>
<p>The first workaround in mind was to force the better plan in the Query Store (automatic tuning with FORCE_LAST_GOOD_PLAN = ON is disabled), but having discussed this point with the DEV team, we managed to deploy a fix very quickly, which drastically reduced the CPU consumption on this SQL Server instance, as shown below. The picture is self-explanatory: </p>
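<p>For reference, forcing a plan from the Query Store is a one-liner. A sketch with hypothetical ids (retrieve the real ones from sys.query_store_plan first):</p>

```sql
-- Pin the plan that uses the clustered index (ids are hypothetical)
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 114;

-- Once a proper fix is deployed, remove the forcing
EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 114;
```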
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-6-SQL-Processor-dashboard-after-optimization-e1590597880869.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-6-SQL-Processor-dashboard-after-optimization-e1590597880869.jpg" alt="162 - 6 - SQL Processor dashboard after optimization" width="1000" height="460" class="alignnone size-full wp-image-1616" /></a></p>
<p>The fix consisted of adding a CAST / CONVERT function on the right side of the equality (the parameter, not the column) to avoid side effects on the JDBC driver. Therefore, we get another version of the query, and a different query hash as well. The updated query is pretty similar to the following one:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">DECLARE @P0 DATETIME <br />
DECLARE @P1 INT<br />
DECLARE @P2 NVARCHAR(4000) <br />
DECLARE @P3 DATETIME <br />
DECLARE @P4 NVARCHAR(4000)<br />
<br />
UPDATE TABLE SET DATE = @P0<br />
WHERE ID = @P1<br />
&nbsp;AND IDENTIFIER = CAST(@P2 AS varchar(50))<br />
&nbsp;AND P_DATE &gt;= @P3<br />
&nbsp;AND W_O_ID = (<br />
&nbsp; &nbsp;SELECT TOP 1 ID FROM TABLE2<br />
&nbsp; &nbsp;WHERE Identifier = CAST(@P4 AS varchar(50))<br />
&nbsp; &nbsp;ORDER BY ID DESC)</div></div>
<p>Sometime later, we gathered Query Store statistics for both the former and the new query to confirm the performance improvement, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/05/162-5-QS-stats-after-optimization.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/05/162-5-QS-stats-after-optimization.jpg" alt="162 - 5 - QS stats after optimization" width="923" height="98" class="alignnone size-full wp-image-1617" /></a></p>
<p>Finally, changing the data type enabled the use of an index seek operator, drastically reducing the SQL Server CPU consumption and logical read operations. </p>
<p>QED!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SQL Server on Linux and new FUA support for XFS filesystem</title>
		<link>https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem</link>
		<comments>https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem#comments</comments>
		<pubDate>Mon, 13 Apr 2020 17:34:32 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[SQL Server 2019]]></category>
		<category><![CDATA[blktrace]]></category>
		<category><![CDATA[FUA]]></category>
		<category><![CDATA[iostats]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[xfs]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1568</guid>
		<description><![CDATA[I wrote a (dbi services) blog post concerning Linux and SQL Server IO behavior changes before and after SQL Server 2017 CU6. Now, I was looking forward to seeing some new improvements with Force Unit Access (FUA) that was implemented with &#8230; <a href="https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I wrote a (dbi services) <a href="https://blog.dbi-services.com/sql-server-on-linux-io-internal-thoughts/" rel="noopener" target="_blank">blog post</a> concerning Linux and SQL Server IO behavior changes before and after SQL Server 2017 CU6. Now, I was looking forward to seeing some new improvements with Force Unit Access (FUA), implemented through the Linux XFS enhancements since kernel 4.18.</p>
<p><span id="more-1568"></span></p>
<p>As a reminder, SQL Server 2017 CU6 added a way to guarantee data durability by using the &laquo;&nbsp;forced flush&nbsp;&raquo; mechanism explained <a href="https://support.microsoft.com/en-us/help/4131496/enable-forced-flush-mechanism-in-sql-server-2017-on-linux" rel="noopener" target="_blank">here</a>. To cut a long story short, SQL Server has strict storage requirements such as write ordering and FUA, and things work differently on Linux than on Windows to achieve durability. What is FUA and why is it important for SQL Server? From <a href="https://en.wikipedia.org/wiki/Disk_buffer#Force_Unit_Access_(FUA)" rel="noopener" target="_blank">Wikipedia</a>: Force Unit Access (aka FUA) is an I/O write command option that forces written data all the way to stable storage. FUA appeared in the SCSI command set but, good news, it was later adopted by other standards over time. SQL Server relies on it to meet its WAL and ACID requirements. </p>
<p>In the Linux world, before kernel 4.18, FUA was handled and optimized only for filesystem journaling. Data writes, however, always went through the multi-step flush process, which could introduce SQL Server IO slowness (issue the write to the block device for the data + issue a block device flush to ensure durability with O_DSYNC). </p>
<p>In the Windows world, installing and using a SQL Server instance assumes you are compliant with the Microsoft storage requirements, and therefore the first RTM version shipped on Linux came only with O_DIRECT, assuming you had already ensured that SQL Server IOs could be written directly to non-volatile storage through the kernel, drivers and hardware before acknowledgement. The forced flush mechanism &#8211; based on fdatasync() &#8211; was then introduced to address scenarios with no safe DIRECT_IO capabilities. </p>
<p>But referring to the Bob Dorr <a href="https://bobsql.com/sql-server-on-linux-forced-unit-access-fua-internals/" rel="noopener" target="_blank">article</a>, Linux kernel 4.18 comes with XFS enhancements to handle FUA for data writes, which is obviously of benefit to SQL Server. FUA support is intended to improve write requests by shortening their path, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-1-IO-worklow-e1586796506268.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-1-IO-worklow-e1586796506268.jpg" alt="160 - 1 - IO worklow" width="1000" height="539" class="alignnone size-full wp-image-1569" /></a></p>
<p><em>Picture of the existing IO workflow, from Bob Dorr&rsquo;s article</em></p>
<p>This is an interesting improvement for write-intensive workloads, and it seems to be confirmed by the tests performed by Microsoft and Bob Dorr in his article. </p>
<p>Let the experiment begin with my lab environment, based on CentOS 7 on Hyper-V with an upgraded kernel version: 5.6.3-1.el7.elrepo.x86_64.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$uname -r<br />
5.6.3-1.el7.elrepo.x86_64<br />
<br />
$cat /etc/os-release | grep VERSION<br />
VERSION=&quot;7 (Core)&quot;<br />
VERSION_ID=&quot;7&quot;<br />
CENTOS_MANTISBT_PROJECT_VERSION=&quot;7&quot;<br />
REDHAT_SUPPORT_PRODUCT_VERSION=&quot;7&quot;</div></div>
<p>Let me point out that my tests are purely experimental; instead of upgrading the kernel to a newer version, you may directly rely on RHEL 8 based distros, which ship with kernel 4.18, for example.</p>
<p>My lab environment includes 2 separate SSD disks to host the DATA + TLOG database files as follows:</p>
<p>I:\ drive : SQL Data volume (sdb – XFS filesystem)<br />
T:\ drive : SQL TLog volume (sda – XFS filesystem)</p>
<p>The general performance is not so bad <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-6-diskmark-tests-storage-env-e1586796679451.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-6-diskmark-tests-storage-env-e1586796679451.jpg" alt="160 - 6 - diskmark tests storage env" width="1000" height="362" class="alignnone size-full wp-image-1571" /></a></p>
<p>Initially I dedicated just one disk to both SQL DATA and TLOG, but I quickly noticed some IO waits (iostat output below) that made me unconfident in my test results:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-3-iostats-before-optimization.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-3-iostats-before-optimization.jpg" alt="160 - 3 - iostats before optimization" width="975" height="447" class="alignnone size-full wp-image-1572" /></a></p>
<p>Spreading the IO over physically separate volumes helped reduce these phenomena drastically afterwards:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-4-iostats-after-optimization.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-4-iostats-after-optimization.jpg" alt="160 - 4 - iostats after optimization" width="984" height="531" class="alignnone size-full wp-image-1573" /></a> </p>
<p>First, I enabled FUA capabilities on Hyper-V side as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">Set-VMHardDiskDrive -VMName CENTOS7 -ControllerType SCSI -OverrideCacheAttributes WriteCacheAndFUAEnabled<br />
<br />
Get-VMHardDiskDrive -VMName CENTOS7 | `<br />
&nbsp; &nbsp; ft VMName, ControllerType, &nbsp;ControllerLocation, Path, WriteHardeningMethod -AutoSize</div></div>
<p>Then I checked whether FUA was enabled and supported from an OS perspective, for both the sda (TLOG) and sdb (SQL DATA) disks:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ lsblk -f<br />
NAME &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;FSTYPE &nbsp; &nbsp; &nbsp;LABEL UUID &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; MOUNTPOINT<br />
sdb<br />
└─sdb1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 06910f69-27a3-4711-9093-f8bf80d15d72 &nbsp; /sqldata<br />
sr0<br />
sda<br />
├─sda2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; f5a9bded-130f-4642-bd6f-9f27563a4e16 &nbsp; /boot<br />
├─sda3 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;LVM2_member &nbsp; &nbsp; &nbsp; QsbKEt-28yT-lpfZ-VCbj-v5W5-vnVr-2l7nih<br />
│ ├─centos-swap swap &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7eebbb32-cef5-42e9-87c3-7df1a0b79f11 &nbsp; [SWAP]<br />
│ └─centos-root xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 90f6eb2f-dd39-4bef-a7da-67aa75d1843d &nbsp; /<br />
└─sda1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;vfat &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7529-979E &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;/boot/efi<br />
<br />
$ dmesg | grep sda<br />
[ &nbsp; &nbsp;1.665478] sd 0:0:0:0: [sda] 83886080 512-byte logical blocks: (42.9 GB/40.0 GiB)<br />
[ &nbsp; &nbsp;1.665479] sd 0:0:0:0: [sda] 4096-byte physical blocks<br />
[ &nbsp; &nbsp;1.665774] sd 0:0:0:0: [sda] Write Protect is off<br />
[ &nbsp; &nbsp;1.665775] sd 0:0:0:0: [sda] Mode Sense: 0f 00 10 00<br />
[ &nbsp; &nbsp;1.670321] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA<br />
[ &nbsp; &nbsp;1.683833] &nbsp;sda: sda1 sda2 sda3<br />
[ &nbsp; &nbsp;1.708938] sd 0:0:0:0: [sda] Attached SCSI disk<br />
[ &nbsp; &nbsp;5.607914] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)</div></div>
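<p>Since kernel 4.18, the block layer also exposes the FUA capability per device through sysfs. A quick way to list it, assuming the <code>queue/fua</code> attribute is present on your kernel (a value of 1 means the device accepts FUA writes):</p>

```shell
# Print the FUA support flag for every block device exposing the attribute
fua_status() {
    for f in /sys/block/*/queue/fua; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(echo "$f" | cut -d/ -f4)" "$(cat "$f")"
    done
}
fua_status
```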
<p>Finally, according to the documentation, I configured <strong>trace flag 3979</strong> and the <strong>control.alternatewritethrough=0</strong> parameter for my SQL Server instance:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ /opt/mssql/bin/mssql-conf traceflag 3979 on<br />
<br />
$ /opt/mssql/bin/mssql-conf set control.alternatewritethrough 0<br />
<br />
$ systemctl restart mssql-server</div></div>
<p>The first test I performed was pretty similar to the ones in my previous (dbi services) <a href="https://blog.dbi-services.com/sql-server-on-linux-io-internal-thoughts/">blog post</a>.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE TABLE dummy_test (<br />
&nbsp; &nbsp; id INT IDENTITY,<br />
&nbsp; &nbsp; col1 VARCHAR(2000) DEFAULT REPLICATE('T', 2000)<br />
);<br />
<br />
INSERT INTO dummy_test DEFAULT VALUES;<br />
GO 67</div></div>
<p>For the sake of curiosity, I looked at the corresponding strace output:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ cat sql_strace_fua.txt<br />
% time &nbsp; &nbsp; seconds &nbsp;usecs/call &nbsp; &nbsp; calls &nbsp; &nbsp;errors syscall<br />
------ ----------- ----------- --------- --------- ----------------<br />
&nbsp;78.13 &nbsp;360.618066 &nbsp; &nbsp; &nbsp; 61739 &nbsp; &nbsp; &nbsp;5841 &nbsp; &nbsp; &nbsp;2219 futex<br />
&nbsp; 6.88 &nbsp; 31.731833 &nbsp; &nbsp; 1511040 &nbsp; &nbsp; &nbsp; &nbsp;21 &nbsp; &nbsp; &nbsp; &nbsp;15 restart_syscall<br />
&nbsp; 3.81 &nbsp; 17.592176 &nbsp; &nbsp; &nbsp;130312 &nbsp; &nbsp; &nbsp; 135 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; io_getevents<br />
&nbsp; 2.95 &nbsp; 13.607314 &nbsp; &nbsp; &nbsp; 98604 &nbsp; &nbsp; &nbsp; 138 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; epoll_wait<br />
&nbsp; 2.88 &nbsp; 13.313667 &nbsp; &nbsp; &nbsp;633984 &nbsp; &nbsp; &nbsp; &nbsp;21 &nbsp; &nbsp; &nbsp; &nbsp;21 rt_sigtimedwait<br />
&nbsp; 2.60 &nbsp; 11.997925 &nbsp; &nbsp; 1333103 &nbsp; &nbsp; &nbsp; &nbsp; 9 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; nanosleep<br />
&nbsp; 1.79 &nbsp; &nbsp;8.279781 &nbsp; &nbsp; &nbsp; &nbsp; 242 &nbsp; &nbsp; 34256 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; gettid<br />
&nbsp; 0.84 &nbsp; &nbsp;3.876021 &nbsp; &nbsp; &nbsp; &nbsp; 226 &nbsp; &nbsp; 17124 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; getcpu<br />
&nbsp; 0.03 &nbsp; &nbsp;0.138836 &nbsp; &nbsp; &nbsp; &nbsp; 347 &nbsp; &nbsp; &nbsp; 400 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_yield<br />
&nbsp; 0.01 &nbsp; &nbsp;0.062348 &nbsp; &nbsp; &nbsp; &nbsp; 254 &nbsp; &nbsp; &nbsp; 245 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; getrusage<br />
&nbsp; 0.01 &nbsp; &nbsp;0.056065 &nbsp; &nbsp; &nbsp; &nbsp; 406 &nbsp; &nbsp; &nbsp; 138 &nbsp; &nbsp; &nbsp; &nbsp;69 readv<br />
&nbsp; 0.01 &nbsp; &nbsp;0.038107 &nbsp; &nbsp; &nbsp; &nbsp; 343 &nbsp; &nbsp; &nbsp; 111 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; read<br />
&nbsp; 0.01 &nbsp; &nbsp;0.037883 &nbsp; &nbsp; &nbsp; &nbsp; 743 &nbsp; &nbsp; &nbsp; &nbsp;51 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; mmap<br />
&nbsp; 0.01 &nbsp; &nbsp;0.037498 &nbsp; &nbsp; &nbsp; &nbsp; 180 &nbsp; &nbsp; &nbsp; 208 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; epoll_ctl<br />
&nbsp; 0.01 &nbsp; &nbsp;0.035654 &nbsp; &nbsp; &nbsp; &nbsp; 517 &nbsp; &nbsp; &nbsp; &nbsp;69 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; writev<br />
&nbsp; 0.01 &nbsp; &nbsp;0.025542 &nbsp; &nbsp; &nbsp; &nbsp; 370 &nbsp; &nbsp; &nbsp; &nbsp;69 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; io_submit<br />
&nbsp; 0.00 &nbsp; &nbsp;0.019760 &nbsp; &nbsp; &nbsp; &nbsp; 282 &nbsp; &nbsp; &nbsp; &nbsp;70 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; write<br />
&nbsp; 0.00 &nbsp; &nbsp;0.019555 &nbsp; &nbsp; &nbsp; &nbsp; 477 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; open<br />
&nbsp; 0.00 &nbsp; &nbsp;0.016285 &nbsp; &nbsp; &nbsp; &nbsp;1629 &nbsp; &nbsp; &nbsp; &nbsp;10 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rt_sigaction<br />
&nbsp; 0.00 &nbsp; &nbsp;0.012359 &nbsp; &nbsp; &nbsp; &nbsp; 301 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; close<br />
&nbsp; 0.00 &nbsp; &nbsp;0.010069 &nbsp; &nbsp; &nbsp; &nbsp; 205 &nbsp; &nbsp; &nbsp; &nbsp;49 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; munmap<br />
&nbsp; 0.00 &nbsp; &nbsp;0.006977 &nbsp; &nbsp; &nbsp; &nbsp; 303 &nbsp; &nbsp; &nbsp; &nbsp;23 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rt_sigprocmask<br />
&nbsp; 0.00 &nbsp; &nbsp;0.006256 &nbsp; &nbsp; &nbsp; &nbsp; 153 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fstat<br />
&nbsp; 0.00 &nbsp; &nbsp;0.004646 &nbsp; &nbsp; &nbsp; &nbsp; 465 &nbsp; &nbsp; &nbsp; &nbsp;10 &nbsp; &nbsp; &nbsp; &nbsp;10 stat<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000860 &nbsp; &nbsp; &nbsp; &nbsp; 215 &nbsp; &nbsp; &nbsp; &nbsp; 4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; madvise<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000321 &nbsp; &nbsp; &nbsp; &nbsp; 161 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_setaffinity<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000295 &nbsp; &nbsp; &nbsp; &nbsp; 148 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; set_robust_list<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000281 &nbsp; &nbsp; &nbsp; &nbsp; 141 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; clone<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000236 &nbsp; &nbsp; &nbsp; &nbsp; 118 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sigaltstack<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000093 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;47 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; arch_prctl<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000046 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;23 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_getaffinity<br />
------ ----------- ----------- --------- --------- ----------------<br />
100.00 &nbsp;461.546755 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 59137 &nbsp; &nbsp; &nbsp;2334 total</div></div>
<p>… And as I expected, with FUA enabled there are no fsync() / fdatasync() calls anymore; writing to stable storage is achieved directly by FUA commands. iomap_dio_rw() now determines whether REQ_FUA can be used and whether issuing generic_write_sync() is still necessary. To dig further into the IO layer, we need to rely on another tool, blktrace (mentioned in Bob Dorr&rsquo;s article as well).</p>
<p>In my case I got two different pictures of blktrace output between the forced flush mechanism (the default) and FUA-oriented IO:</p>
<p>-&gt; With forced flush</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">34.694734500 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17164 &nbsp;A &nbsp;WS &nbsp; &nbsp; &nbsp; 2048 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694735000 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17165 &nbsp;Q &nbsp;WS &nbsp; &nbsp; &nbsp; 2048 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694737000 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17166 &nbsp;X &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694738100 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17167 &nbsp;G &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694739800 &nbsp; &nbsp; &nbsp;14225 18426216 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17169 &nbsp;G &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694740900 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17171 &nbsp;D &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694747200 &nbsp; &nbsp; &nbsp;14225 18426216 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17174 &nbsp;D &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.713665000 &nbsp; &nbsp; &nbsp;14225 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8,16 &nbsp; 0 &nbsp; &nbsp;17175 &nbsp;Q FWS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.713668100 &nbsp; &nbsp; &nbsp;14225 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8,16 &nbsp; 0 &nbsp; &nbsp;17176 &nbsp;G FWS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr</div></div>
<p>WS (Write Synchronous) is performed but SQL Server still needs to go through the multi-step flush process with the additional FWS (PREFLUSH|WRITE|SYNC).</p>
<p>-&gt; FUA</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">0.000000000 &nbsp; &nbsp; &nbsp;16305 55106536 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;1 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000000400 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;2 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000001100 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;3 &nbsp;Q WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000005200 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;4 &nbsp;G WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.001377800 &nbsp; &nbsp; &nbsp;16305 55106544 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;6 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; 16 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr</div></div>
<p>FWS has disappeared, leaving only WFS commands, which are basically <strong>REQ_WRITE combined with the REQ_FUA flag</strong>.</p>
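<p>As a recap, switching between the two behaviors on SQL Server on Linux relies on the settings discussed in Bob Dorr&rsquo;s article. The sketch below shows the combination as I understand it; treat the trace flag and mssql-conf settings as assumptions to double-check against your CU level:</p>

```sql
-- Assumed configuration (per Bob Dorr's FUA internals article) -- verify on your build:
--   $ sudo /opt/mssql/bin/mssql-conf set control.writethrough 1
--   $ sudo /opt/mssql/bin/mssql-conf set control.alternatewritethrough 0
--   $ sudo systemctl restart mssql-server
DBCC TRACEON (3979, -1);  -- trace flag reported to enable the FUA-based IO path
```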
<p>I spent some time reading other interesting discussions in addition to Bob Dorr&rsquo;s wonderful article. Here is an interesting <a href="https://lkml.org/lkml/2019/12/3/316" rel="noopener" target="_blank">pointer</a> to a discussion about REQ_FUA, for instance.</p>
<p><strong>But what about performance gain? </strong></p>
<p>I had two simple scenarios to play with in order to bring out FUA&rsquo;s helpfulness: hardening the dirty pages of the buffer pool to disk during the checkpoint process, and hardening the log buffer to disk during the commit phase. When the forced flush method is used, each component relies on an additional FlushFileBuffers() call to achieve durability. This can be easily tracked from an XE session including the <strong>flush_file_buffers</strong> and <strong>make_writes_durable</strong> events.</p>
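<p>For completeness, here is a minimal sketch of such an XE session. The session name is arbitrary, and the lightweight event_counter target simply counts occurrences of the two events mentioned above:</p>

```sql
-- Minimal XE session counting the durability-related events (session name is arbitrary)
CREATE EVENT SESSION [fua_durability_trace] ON SERVER
ADD EVENT sqlserver.flush_file_buffers,
ADD EVENT sqlserver.make_writes_durable
ADD TARGET package0.event_counter        -- just counts events, near-zero overhead
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);
GO
ALTER EVENT SESSION [fua_durability_trace] ON SERVER STATE = START;
```

<p>The counters can then be read back from sys.dm_xe_session_targets while the workload runs.</p>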
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-1-1-flushfilebuffers-worklflow.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-1-1-flushfilebuffers-worklflow.jpg" alt="160 - 1 - 1 - flushfilebuffers worklflow" width="839" height="505" class="alignnone size-full wp-image-1575" /></a></p>
<p><strong>First scenario (10K inserts within a transaction and checkpoint)</strong></p>
<p>In this scenario my intention was to stress the checkpoint process with a bunch of buffers and dirty pages to flush to disk when it kicks in.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">USE dummy;<br />
<br />
SET NOCOUNT ON;<br />
-- Disable checkpoint to control when it will kick in<br />
DBCC TRACEON(3505);<br />
-- Check traceflag<br />
DBCC TRACESTATUS;<br />
<br />
DECLARE @i INT = 0;<br />
DECLARE @iteration INT = 0;<br />
DECLARE @start_upd DATETIME;<br />
DECLARE @start_chkpt DATETIME;<br />
DECLARE @end_upd DATETIME;<br />
DECLARE @end_chkpt DATETIME;<br />
<br />
TRUNCATE TABLE dummy_test;<br />
<br />
WHILE @iteration &amp;lt; 251<br />
BEGIN<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; BEGIN TRAN;<br />
<br />
&nbsp; &nbsp; WHILE @i &amp;lt;= 10000<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; INSERT INTO dummy_test DEFAULT VALUES;<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @i += 1;<br />
&nbsp; &nbsp; END<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; COMMIT TRAN;<br />
<br />
&nbsp; &nbsp; SET @end_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; SET @i = 0;<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_chkpt = GETDATE();<br />
&nbsp; &nbsp; CHECKPOINT;<br />
&nbsp; &nbsp; SET @end_chkpt = GETDATE();<br />
&nbsp; &nbsp; PRINT &amp;#039;INS: &amp;#039; + CAST(DATEDIFF(ms, @start_upd, @end_upd) AS VARCHAR(50)) + &amp;#039; - CHKPT: &amp;#039; + CAST(DATEDIFF(ms, @start_chkpt, @end_chkpt) AS VARCHAR(50));<br />
<br />
&nbsp; &nbsp; SET @iteration += 1;<br />
END</div></div>
<p>The result is as follows:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-5-test-perfs-250_10K_chkpt.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-5-test-perfs-250_10K_chkpt.jpg" alt="160 - 5 - test perfs 250_10K_chkpt" width="974" height="298" class="alignnone size-full wp-image-1576" /></a></p>
<p>In my case, I noticed ~17% improvement for the checkpoint process and ~7% for the insert transaction, including the commit phase that flushes data to the TLog. In parallel, the aggregated output of the extended event session confirms that FUA avoids a lot of the additional operations needed to persist data on disk, as illustrated by the flush_file_buffers and make_writes_durable events.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-6-xe-flush-file-buffers-e1586798220100.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-6-xe-flush-file-buffers-e1586798220100.jpg" alt="160 - 6 - xe flush file buffers" width="1000" height="178" class="alignnone size-full wp-image-1577" /></a></p>
<p><strong>Second scenario (100 transactions of 1 insert each, and checkpoint)</strong></p>
<p>In this scenario, I wanted to stress the log writer by forcing a lot of small transactions to commit. I updated the TSQL code as shown below:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">USE dummy;<br />
<br />
SET NOCOUNT ON;<br />
-- Disable checkpoint to control when it will kick in<br />
DBCC TRACEON(3505);<br />
-- Check traceflag<br />
DBCC TRACESTATUS;<br />
<br />
DECLARE @i INT = 0;<br />
DECLARE @iteration INT = 0;<br />
DECLARE @start_upd DATETIME;<br />
DECLARE @start_chkpt DATETIME;<br />
DECLARE @end_upd DATETIME;<br />
DECLARE @end_chkpt DATETIME;<br />
<br />
TRUNCATE TABLE dummy_test;<br />
<br />
WHILE @iteration &amp;lt; 251<br />
BEGIN<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; WHILE @i &amp;lt;= 100<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; INSERT INTO dummy_test DEFAULT VALUES;<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @i += 1;<br />
&nbsp; &nbsp; END<br />
<br />
&nbsp; &nbsp; SET @end_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; SET @i = 0;<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_chkpt = GETDATE();<br />
&nbsp; &nbsp; CHECKPOINT;<br />
&nbsp; &nbsp; SET @end_chkpt = GETDATE();<br />
&nbsp; &nbsp; PRINT &amp;#039;INS: &amp;#039; + CAST(DATEDIFF(ms, @start_upd, @end_upd) AS VARCHAR(50)) + &amp;#039; - CHKPT: &amp;#039; + CAST(DATEDIFF(ms, @start_chkpt, @end_chkpt) AS VARCHAR(50));<br />
<br />
&nbsp; &nbsp; SET @iteration += 1;<br />
END</div></div>
<p>The new picture is the following:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-7-test-perfs-250_100_1K_chkpt.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-7-test-perfs-250_100_1K_chkpt.jpg" alt="160 - 7 - test perfs 250_100_1K_chkpt" width="974" height="298" class="alignnone size-full wp-image-1580" /></a></p>
<p>This time the improvement is definitely more impressive, with a decrease of ~80% in execution time for the INSERT + COMMIT part and ~77% for the checkpoint phase!</p>
<p>Looking at the extended event session confirms the shortened IO path has something to do with it <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-7-xe-flush-file-buffers-2-e1586798367112.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-7-xe-flush-file-buffers-2-e1586798367112.jpg" alt="160 - 7 - xe flush file buffers 2" width="1000" height="170" class="alignnone size-full wp-image-1578" /></a></p>
<p>Well, shortening the IO path and relying directly on native FUA instructions was definitely a good idea, both for performance and for meeting WAL and ACID requirements. Anyway, I&rsquo;m glad to see Microsoft contributing improvements to the Linux kernel!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Mitigating Scalar UDF&#8217;s procedural code performance with SQL 2019 and  Scalar UDF Inlining capabilities</title>
		<link>https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites</link>
		<comments>https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites#comments</comments>
		<pubDate>Thu, 05 Mar 2020 15:10:02 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL Server 2019]]></category>
		<category><![CDATA[Imperative]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[Procedural code]]></category>
		<category><![CDATA[Programing]]></category>
		<category><![CDATA[Scalar UDF Inlining]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1516</guid>
		<description><![CDATA[A couple of days ago, I read the write-up of my former colleague @FranckPachot about refactoring procedural code to SQL. This is recurrent subject in the database world and I was interested in transposing this article to SQL Server because &#8230; <a href="https://blog.developpez.com/mikedavem/p13189/performance/mitigating-scalar-udf-procedural-code-performance-with-sql-2019-udf-inline-capabilites">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A couple of days ago, I read the write-up of my former colleague <a href="https://twitter.com/FranckPachot" rel="noopener" target="_blank">@FranckPachot</a> about <a href="https://blog.dbi-services.com/refactoring-procedural-to-sql-an-example-with-mysql-sakila/" rel="noopener" target="_blank">refactoring procedural code to SQL</a>. This is a recurrent subject in the database world and I was interested in transposing this article to SQL Server, because it was about refactoring a Scalar-Valued function into a SQL view. The latter is a great alternative when it comes to performance, but something new shipped with SQL Server 2019 that could address (or at least mitigate) this recurrent scenario. </p>
<p><span id="more-1516"></span></p>
<p>First of all, Scalar-Valued functions (from the User Defined Function category) are interesting objects for code modularity, factoring and reusability. No surprise to see them widely used by DEVs. But they are not always suited to performance considerations, especially when it comes to the “impedance mismatch” problem. This term refers to the problems that occur due to differences between the database model and the programming language model. On one side, the database world, with a SQL language that is declarative and with queries that are set- or multiset-oriented. On the other side, the programming world, with imperative languages that need to access each tuple individually for processing.</p>
<p>To cut the story short, Scalar UDFs provide programming benefits for DEVs, but when performance matters we discourage using them for the aforementioned reasons. Before continuing, let’s note that all the scripts and demos in the next sections are based on the <a href="https://github.com/jOOQ/jOOQ/tree/master/jOOQ-examples/Sakila" rel="noopener" target="_blank">sakila-db</a> project on GitHub. Franck Pachot used the MySQL version and fortunately there is a sample for SQL Server as well. Furthermore, the MySQL function used as the initial example by Franck may be translated to SQL Server as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Scalar function<br />
CREATE OR ALTER FUNCTION inventory_in_stock (@p_inventory_id INT) <br />
RETURNS BIT<br />
BEGIN<br />
&nbsp; &nbsp; DECLARE @v_rentals INT;<br />
&nbsp; &nbsp; DECLARE @v_out &nbsp; &nbsp; INT;<br />
&nbsp; &nbsp; DECLARE @verif &nbsp; &nbsp; BIT;<br />
&nbsp; &nbsp; <br />
<br />
&nbsp; &nbsp; --AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE<br />
&nbsp; &nbsp; --FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED<br />
<br />
&nbsp; &nbsp; SET @v_rentals = (SELECT COUNT(*) FROM rental WHERE inventory_id = @p_inventory_id);<br />
<br />
&nbsp; &nbsp; IF @v_rentals = 0 <br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 1<br />
&nbsp; &nbsp; END<br />
&nbsp; &nbsp; ELSE<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @v_out = (SELECT COUNT(rental_id) <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM inventory <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LEFT JOIN rental ON inventory.inventory_id = rental.inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE inventory.inventory_id = @p_inventory_id<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND rental.return_date IS NULL)<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; IF @v_out &amp;gt; 0 <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 0;<br />
&nbsp; &nbsp; &nbsp; &nbsp; ELSE<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SET @verif = 1;<br />
&nbsp; &nbsp; END;<br />
<br />
&nbsp; &nbsp; RETURN @verif;<br />
END <br />
GO</div></div>
<p>In his write-up, Franck provided a natural alternative to this UDF based on a SQL view; here is a similar solution applied to SQL Server:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE OR ALTER VIEW v_inventory_stock_status <br />
AS<br />
<br />
SELECT <br />
&nbsp; &nbsp; i.inventory_id,<br />
&nbsp; &nbsp; CASE <br />
&nbsp; &nbsp; &nbsp; &nbsp; WHEN NOT EXISTS (SELECT 1 FROM dbo.rental AS r WHERE r.inventory_id = &nbsp;i.inventory_id AND r.return_date IS NULL) THEN 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; ELSE 0<br />
&nbsp; &nbsp; END AS inventory_in_stock<br />
FROM dbo.inventory AS i<br />
GO</div></div>
<p>Then, similar to what Franck did, we can join this view with the inventory table to get the expected outcome:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join v_inventory_stock_status AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-1-Query-OutPut.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-1-Query-OutPut.jpg" alt="156 - 1 - Query OutPut" width="356" height="82" class="alignnone size-full wp-image-1518" /></a></p>
<p>There is another alternative that could be used here, based on a CTE rather than a TSQL view, as follows. The performance is similar in both cases and it is up to each DEV to decide which solution fits their needs:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">;with cte<br />
as<br />
(<br />
&nbsp; &nbsp; SELECT <br />
&nbsp; &nbsp; &nbsp; &nbsp; i.inventory_id,<br />
&nbsp; &nbsp; &nbsp; &nbsp; CASE <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHEN NOT EXISTS (SELECT 1 FROM dbo.rental AS r WHERE r.inventory_id = &nbsp;i.inventory_id AND r.return_date IS NULL) THEN 1<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ELSE 0<br />
&nbsp; &nbsp; &nbsp; &nbsp; END AS inventory_in_stock<br />
&nbsp; &nbsp; FROM dbo.inventory AS i<br />
)<br />
select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join cte AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p>I compared then the performance between the UDF based version and the TSQL view:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- udf<br />
select count(*),dbo.inventory_in_stock(inventory_id) <br />
from inventory <br />
group by dbo.inventory_in_stock(inventory_id)<br />
GO<br />
-- view<br />
select count(v.inventory_id),inventory_in_stock<br />
from inventory AS i<br />
left join v_inventory_stock_status AS v ON i.inventory_id = v.inventory_id<br />
group by v.inventory_in_stock;<br />
go</div></div>
<p>The outcome below (CPU, Reads, Writes, Duration) is as expected. The SQL view is the winner by far. </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-2-UDF-vs-View-performance-e1583417465245.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-2-UDF-vs-View-performance-e1583417465245.jpg" alt="156 - 2 - UDF vs View performance" width="1000" height="166" class="alignnone size-full wp-image-1519" /></a></p>
<p>Similar to Franck’s finding, the performance gain comes at the cost of rewriting the code for DEVs in this scenario. But SQL Server 2019 provides another interesting way to keep using the UDF abstraction without compromising on performance: the <a href="https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/scalar-udf-inlining?view=sql-server-ver15" rel="noopener" target="_blank">Scalar T-SQL UDF Inlining</a> feature, and I was curious to see how much improvement we get with such capabilities for this scenario. </p>
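<p>As a quick sanity check before testing, SQL Server 2019 exposes whether it considers a given scalar UDF eligible for inlining through the is_inlineable column of sys.sql_modules:</p>

```sql
-- Check whether the scalar UDF qualifies for inlining (SQL Server 2019, compat level 150)
SELECT OBJECT_NAME(m.object_id) AS udf_name,
       m.is_inlineable              -- 1 = the optimizer may inline this UDF
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.inventory_in_stock');
```

<p>Inlining can also be controlled per function with the WITH INLINE = ON / OFF clause of CREATE OR ALTER FUNCTION, in addition to the database scoped configuration.</p>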
<p>The first time I executed the following UDF-based TSQL script on SQL Server 2019 RTM (be sure to be in 150 compatibility mode), I ran into an internal query processor error for the second query:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- SQL 2017-<br />
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;<br />
GO<br />
SELECT dbo.inventory_in_stock(10)<br />
GO<br />
-- SQL 2019+<br />
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;<br />
GO<br />
SELECT dbo.inventory_in_stock(10)</div></div>
<blockquote><p>Msg 8624, Level 16, State 17, Line 14<br />
Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.</p></blockquote>
<p>To be honest, this was not a surprise because I was already aware of the issue from reading the blog post of <a href="https://twitter.com/sqL_handLe" rel="noopener" target="_blank">@sqL_handle</a> a couple of weeks ago. Updating to CU2 fixed my issue, and the second attempt revealed some interesting outcomes.<br />
The query plan of the first query (&lt;= SQL 2017) is what we usually expect from executing a TSQL scalar function. From an execution perspective, this black box is materialized in the form of the Compute Scalar operator, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2017-query-plan.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2017-query-plan.jpg" alt="156 - 3 - UDF 2017 query plan" width="610" height="203" class="alignnone size-full wp-image-1521" /></a></p>
<p>But the story changes with the Scalar UDF Inlining capability. This is illustrated by the pictures below, which are samples of a larger execution plan:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-query-plan-e1583417692798.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-query-plan-e1583417692798.jpg" alt="156 - 3 - UDF 2019 query plan" width="1000" height="375" class="alignnone size-full wp-image-1522" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-2-query-plan-e1583420250648.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-3-UDF-2019-2-query-plan-e1583420250648.jpg" alt="156 - 3 - UDF 2019 2 query plan" width="1000" height="329" class="alignnone size-full wp-image-1524" /></a></p>
<p>The query optimizer has inferred relational operations from my (imperative) scalar UDF, based on the <a href="https://www.microsoft.com/en-us/research/project/froid/" rel="noopener" target="_blank">Froid framework</a>, and provides several benefits including compiler optimizations and parallelism (initially not possible with UDFs).</p>
<p>Let’s perform the same benchmark test that I ran between the UDF-based and the TSQL view based queries. In fact, I had to propose a slight variation of the query in the hope of kicking in the Scalar UDF Inlining capability:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- First UDF query <br />
select count(*),dbo.inventory_in_stock(inventory_id) <br />
from inventory <br />
group by dbo.inventory_in_stock(inventory_id)<br />
GO<br />
<br />
-- Variation of the first query<br />
;with cte<br />
as<br />
(<br />
&nbsp; &nbsp; select inventory_id,dbo.inventory_in_stock(inventory_id) as inventory_in_stock<br />
&nbsp; &nbsp; from inventory <br />
)<br />
select &nbsp;count(*), inventory_in_stock<br />
from cte<br />
group by inventory_in_stock<br />
GO</div></div>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-4-UDF-2019-benchmark-query-plan-e1583420352841.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-4-UDF-2019-benchmark-query-plan-e1583420352841.jpg" alt="156 - 4 - UDF 2019 benchmark query plan" width="1000" height="357" class="alignnone size-full wp-image-1525" /></a></p>
<p>From a performance perspective, it is worth noting that the improvement is not necessarily on the read operations but more on the CPU and duration times.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-5-UDF-vs-UDF-inline-performance-e1583420400334.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-5-UDF-vs-UDF-inline-performance-e1583420400334.jpg" alt="156 - 5 - UDF vs UDF inline performance" width="1000" height="110" class="alignnone size-full wp-image-1527" /></a></p>
<p>But let’s push the tests further by increasing the amount of data. As a reminder, the performance of the test is tied to the number of UDF executions and, implicitly, to the number of records in the Inventory table. </p>
<p>So, let’s add a bunch of records to the Inventory table …</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">INSERT inventory (film_id, store_id, last_update)<br />
SELECT <br />
&nbsp; &nbsp; film_id,<br />
&nbsp; &nbsp; store_id,<br />
&nbsp; &nbsp; GETDATE()<br />
FROM inventory;</div></div>
<p>&#8230; and let&rsquo;s execute this script to get respectively a total of 146592 and 2345472 rows for each test. Here are the corresponding performance outcomes:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-6-UDF-vs-UDF-inline-performance-add-more-rows-e1583420498722.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-6-UDF-vs-UDF-inline-performance-add-more-rows-e1583420498722.jpg" alt="156 - 6 - UDF vs UDF inline performance - add more rows" width="1000" height="225" class="alignnone size-full wp-image-1528" /></a></p>
<p>I noticed that the more rows there are in the inventory table, the better the performance we get for each corresponding test:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-7-UDF-vs-UDF-inline-performance-chart-cpu.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-7-UDF-vs-UDF-inline-performance-chart-cpu.jpg" alt="156 - 7 - UDF vs UDF inline performance - chart cpu" width="876" height="384" class="alignnone size-full wp-image-1529" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-8-UDF-vs-UDF-inline-performance-chart-duration.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-8-UDF-vs-UDF-inline-performance-chart-duration.jpg" alt="156 - 8 - UDF vs UDF inline performance - chart duration" width="868" height="384" class="alignnone size-full wp-image-1530" /></a></p>
<p>Well, an interesting outcome without rewriting any code, isn’t it? An 80% decrease on average in query duration and 61% in CPU time. For the sake of curiosity, let’s take a look at the different query plans:</p>
<p><strong>Scalar UDF Inlining not enabled</strong></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-10-UDF-more-rows-execution-plan-e1583420738157.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-10-UDF-more-rows-execution-plan-e1583420738157.jpg" alt="156 - 10 - UDF - more rows execution plan" width="1000" height="155" class="alignnone size-full wp-image-1531" /></a></p>
<p>Again, the real cost is hidden by the UDF black box behind the Compute Scalar operator, but we can easily guess that every row processed by the Compute Scalar operator invokes the dbo.inventory_in_stock() function. </p>
<p><strong>Scalar UDF Inlining enabled</strong></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-11-UDF-inlining-more-rows-execution-plan-e1583420789863.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-11-UDF-inlining-more-rows-execution-plan-e1583420789863.jpg" alt="156 - 11 - UDF inlining - more rows execution plan" width="1000" height="255" class="alignnone size-full wp-image-1532" /></a></p>
<p>Without going into the details of the execution plan, something that draws attention is that query optimizer tricks kicked in, including parallelism. All the optimization work done by the query processor helps improve the overall performance of the query.</p>
<p>So, last point: does Scalar UDF Inlining scale better than the SQL view? </p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/03/156-9-UDF-inline-vs-view-performance-e1583420871437.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/03/156-9-UDF-inline-vs-view-performance-e1583420871437.jpg" alt="156 - 9 - UDF inline vs view performance" width="1200" height="74" class="alignnone size-full wp-image-1533" /></a></p>
<p>This last output seems to confirm that the SQL view remains the winner among the alternatives in this specific scenario; you will have to choose the best solution, and likely the acceptable tradeoff, that fits your context.</p>
<p>See you!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQL DB Azure, performance scaling thoughts</title>
		<link>https://blog.developpez.com/mikedavem/p13188/sql-azure/sql-db-azure-performance-scaling-thoughts</link>
		<comments>https://blog.developpez.com/mikedavem/p13188/sql-azure/sql-db-azure-performance-scaling-thoughts#comments</comments>
		<pubDate>Thu, 20 Feb 2020 21:09:54 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Azure]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL Azure DB]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1490</guid>
		<description><![CDATA[Let’s continue with Azure stories and performance scaling &#8230; A couple of weeks ago, we studied opportunities to replace existing clustered indexes (CI) with columnstore indexes (CCI) for some facts. To cut the story short and to focus on the &#8230; <a href="https://blog.developpez.com/mikedavem/p13188/sql-azure/sql-db-azure-performance-scaling-thoughts">Lire la suite <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Let’s continue with Azure stories and performance scaling &#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-0-banner-e1582232926354.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-0-banner-e1582232926354.jpg" alt="155 - 0 - banner" width="500" height="288" class="alignnone size-full wp-image-1507" /></a></p>
<p>A couple of weeks ago, we studied opportunities to replace existing clustered indexes (CI) with columnstore indexes (CCI) for some fact tables. To cut the story short and to focus on the right topic of this write-up, we prepared a creation script for specific CCIs based on a variation of <a href="http://www.nikoport.com/2014/04/16/clustered-columnstore-indexes-part-29-data-loading-for-better-segment-elimination/" rel="noopener" target="_blank">Niko&rsquo;s technique</a> (no MAXDOP = 1, meaning we enable parallelism) in order to get better segment alignment. </p>
<p><span id="more-1490"></span></p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Recreation of clustered index<br />
CREATE CLUSTERED INDEX [PK_FACT_IDX] <br />
ON dbo.FactTable (KeyColumn)<br />
WITH (DROP_EXISTING = ON, DATA_COMPRESSION = PAGE);<br />
<br />
-- Creation of the CCI<br />
CREATE CLUSTERED COLUMNSTORE INDEX [PK_FACT_IDX] <br />
ON dbo.FactTable <br />
WITH (DROP_EXISTING = ON);<br />
<br />
-- Recreation of [1 ... n] nonclustered indexes<br />
CREATE INDEX [IDX_xxx … n]<br />
ON dbo.FactTable (column)<br />
WITH (DROP_EXISTING = ON, DATA_COMPRESSION = PAGE);</div></div>
<p>Before deploying those indexes in our SQL DB Azure environment, we staged a first scenario in an on-premises instance, and the creation of all indexes took ~1h. It is worth noting that our tests are based on the same database with the same data in all cases. But guess what, the story was different in Azure <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" />: I got feedback from another team, responsible for deploying the indexes in Azure, that the creation script ran much longer (~4h).<br />
I definitely enjoyed this story because it gave us a deeper understanding of the DB Azure performance topic.</p>
<p><strong>=&gt; Does moving to the cloud mean slower performance? </strong></p>
<p>Before drawing conclusions too quickly, a good habit is to compare specifications between environments: it&rsquo;s not about comparing apples and oranges. Well, let&rsquo;s set my own context. On one side, the on-premises virtual SQL Server environment specification includes 8 vCPUs (Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz), 64 GB of RAM and a high-performance storage array with micro-latency devices dedicated to our IO-intensive workloads. From the vendor specifications, we may expect very interesting IO performance, with a general throughput greater than 100 KIOPs (random) or 1GB/s (sequential). On the other side, the SQL DB Azure is based on the General Purpose: Serverless Gen5, 8 vCores service pricing tier. We use the vCore purchasing model and, referring to the <a href="https://docs.microsoft.com/bs-latn-ba/Azure/sql-database/sql-database-vcore-resource-limits-single-databases" rel="noopener" target="_blank">Microsoft documentation</a>, hardware generation 5 includes a compute specification based on Intel E5-2673 v4 (Broadwell) 2.3-GHz and Intel SP-8160 (Skylake) processors. Added to this, the service pricing tier comes with remote SSD-based storage, with IO latency around 5-7ms and 2560 IOPs max. Given the elasticity of the infrastructure, we could scale up to 16 vCores, 48GB of RAM and 5120 IOPs for data. Obviously, latency remains the same in this case.</p>
<p>As an illustration, the creation of all indexes (CI + CCI + NCIs) performed in our on-premises environment gave the following storage performance figures: ~700MB/s and 13K IOPs as maximum values, aggregating DATA + LOG activity on the D: drive. Rebuilding indexes is also a highly CPU-consuming operation, and we noticed CPU saturation at different steps of the operation.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-1-on-premises-storage-performance-e1582231532623.gif"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-1-on-premises-storage-performance-e1582231532623.gif" alt="155 - 1 - on-premises-storage-performance" width="900" height="448" class="alignnone size-full wp-image-1491" /></a></p>
<p>&#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-2-on-premises-cpu-performance-e1582231566692.gif"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-2-on-premises-cpu-performance-e1582231566692.gif" alt="155 - 2 - on-premises-cpu-performance" width="800" height="398" class="alignnone size-full wp-image-1492" /></a></p>
<p>As an aside, we may notice that the creation of the CCI is a less resource-intensive operation, and we see the same pattern in Azure below. Talking of which, let&rsquo;s compare with our SQL Azure DB. There are different ways to get performance metrics, including the portal, which enables monitoring performance through an easy-to-use interface, or per-database DMVs like sys.dm_db_resource_stats. It is worth noting that in SQL Azure DB metrics are expressed as a percentage of the service tier limit, so you need to adjust your analysis to the tier you&rsquo;re using. First, we observed the same resource utilization pattern for all steps of the creation script, but within a different timeline: duration increased to 4h (as mentioned by the other team). There is a clear picture of reaching the limit of the configured service tier, especially for Log IO (green line), even though we had already switched from the GP_S_Gen5_8 to the GP_S_Gen5_16 service tier.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-3-Az-CCI_Gen5_16_General_Purpose_CI_CCI_compressed_page-e1582231670221.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-3-Az-CCI_Gen5_16_General_Purpose_CI_CCI_compressed_page-e1582231670221.jpg" alt="155 - 3 - Az - CCI_Gen5_16_General_Purpose_CI_CCI_compressed_page" width="1200" height="278" class="alignnone size-full wp-image-1494" /></a></p>
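<p>As a minimal sketch, the DMV mentioned above can be queried directly from the user database; the columns below come from the documented view and return one row roughly every 15 seconds for the last hour:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Resource usage as a percentage of the service tier limits<br />
SELECT end_time,<br />
&nbsp;&nbsp;&nbsp;avg_cpu_percent,<br />
&nbsp;&nbsp;&nbsp;avg_data_io_percent,<br />
&nbsp;&nbsp;&nbsp;avg_log_write_percent<br />
FROM sys.dm_db_resource_stats<br />
ORDER BY end_time DESC;</div></div>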
<p>In addition, Wait stats gave interesting insights as well:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-5-wait_stats_CCI_index_Gen5_8_16_GP_CI_CCI_compressed_page_-e1582231763875.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-5-wait_stats_CCI_index_Gen5_8_16_GP_CI_CCI_compressed_page_-e1582231763875.jpg" alt="155 - 5 - wait_stats_CCI_index_Gen5_8_16_GP_CI_CCI_compressed_page_" width="1200" height="226" class="alignnone size-full wp-image-1496" /></a></p>
<p>Excluding the traditional PAGEIOLATCH_xx waits, the LOG_RATE_GOVERNOR wait type appeared in the top waits, confirming that we bumped into the limits imposed on transaction log I/O by our service tier.</p>
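<p>For reference, these wait statistics come from the database-scoped DMV; a minimal sketch to list the top waits (further filtering of benign wait types is left to taste):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Top waits for the current Azure SQL database since the last reset<br />
SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count<br />
FROM sys.dm_db_wait_stats<br />
ORDER BY wait_time_ms DESC;</div></div>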
<p><strong>=&gt; Scaling vs Upgrading the Service for better performance?  </strong></p>
<p>With SQL DB Azure PaaS, we may benefit from an elastic architecture. Firstly, scaling the number of vCores is a factor of improvement, and there is a direct relationship with storage (IOPs), memory, or the disk space allocated to tempdb, for instance. But the order of magnitude varies with the service tier as shown below:</p>
<p>For General Purpose ServerLess Generation 5 service tier &#8211; Resources per Core</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-6-Gen5_8_16_GP_service_tier_perf_.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-6-Gen5_8_16_GP_service_tier_perf_.jpg" alt="155 - 6 - Gen5_8_16_GP_service_tier_perf_" width="1002" height="175" class="alignnone size-full wp-image-1499" /></a></p>
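<p>Switching between the service objectives listed above can be scripted as well; a sketch, assuming a database named [MyDWDatabase] (hypothetical name):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">-- Scale the serverless General Purpose tier from 8 to 16 vCores<br />
ALTER DATABASE [MyDWDatabase]<br />
MODIFY (SERVICE_OBJECTIVE = 'GP_S_Gen5_16');</div></div>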
<p>Something relevant here: even if performance increases with the number of vCores provisioned, the Log IO saturation observed in our Azure test (especially during the first step of the CI creation) results from the max log rate limitation, which doesn&rsquo;t scale in the same way. This is especially relevant here because, as said previously, index creation can be a resource-intensive operation with a huge impact on the transaction log.</p>
<p><strong>What would be a solution to speed up this operation? </strong></p>
<p>A first viable solution in our context would be to switch to the SIMPLE recovery model, which fits perfectly with our scenario: we could get minimally-logged capabilities and a lower impact on the transaction log, and it is suitable for DW environments. Unfortunately, at the time of this write-up, this is not supported, and I suggest you vote on <a href="https://feedback.azure.com/forums/217321-sql-database/suggestions/36400585-allow-recovery-model-to-be-changed-to-simple-in-az" rel="noopener" target="_blank">feedback Azure</a> if you are interested.<br />
From an infrastructure standpoint, improving the max log rate throughput is only possible by upgrading to a higher service tier (at the cost of higher fees, obviously). For the sake of curiosity, I did a try with the <strong>BC_Gen5_16</strong> service tier specifications:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-6-Gen5_8_16_BC_service_tier_perf_.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-6-Gen5_8_16_BC_service_tier_perf_.jpg" alt="155 - 6 - Gen5_8_16_BC_service_tier_perf_" width="1002" height="175" class="alignnone size-full wp-image-1500" /></a></p>
<p>Even if this new service tier seems to be a better fit (as suggested by the relative percentage of resource usage) &#8230;</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/02/155-4-CCI_index_Gen5_16_Business_Critical_CI_CCI_compressed_page_-e1582232230338.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/02/155-4-CCI_index_Gen5_16_Business_Critical_CI_CCI_compressed_page_-e1582232230338.jpg" alt="155 - 4 - CCI_index_Gen5_16_Business_Critical_CI_CCI_compressed_page_" width="1200" height="203" class="alignnone size-full wp-image-1501" /></a></p>
<p>… there are important notes here:</p>
<p>1) Business Critical Tier is not available for Serverless architecture</p>
<p>2) Moving to a different service tier is not instantaneous and may require several hours depending on the database size (~3h for a total database size of ~500GB in my case). Well, this is not a viable option even if we get better performance. Indeed, we have to add the time to upgrade to a higher service tier (3h) to the time to run the creation script (3h, a 25% performance gain compared to the previous GP_S_Gen5_16 service tier). We may obviously upgrade again to reach performance closer to our on-premises environment, but is it worth fighting here only for an index creation script? </p>
<p>Concerning our scenario (Data Warehouse), it is generally easy to schedule an off-peak time frame that doesn&rsquo;t overlap with the processing-oriented workload, but that may not be the case for everyone!  </p>
<p>See you!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Experimenting with updating statistics on a big table through unusual ways</title>
		<link>https://blog.developpez.com/mikedavem/p13167/sql-server-2014/experimentation-dune-mise-a-jour-de-statistiques-sur-une-grosse-table-par-des-voies-detournees</link>
		<comments>https://blog.developpez.com/mikedavem/p13167/sql-server-2014/experimentation-dune-mise-a-jour-de-statistiques-sur-une-grosse-table-par-des-voies-detournees#comments</comments>
		<pubDate>Thu, 25 Jan 2018 06:52:08 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Server 2014]]></category>
		<category><![CDATA[SQL Server 2016]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[sqlserver]]></category>
		<category><![CDATA[statistiques]]></category>
		<category><![CDATA[TF7471]]></category>
		<category><![CDATA[update statistic]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1375</guid>
		<description><![CDATA[This is my first blog post of 2018, and my first in quite a while. Indeed, last year I put all my energy into refreshing my Linux knowledge in line with Microsoft&#8217;s new Open Source strategy. But at the same time, I &#8230; <a href="https://blog.developpez.com/mikedavem/p13167/sql-server-2014/experimentation-dune-mise-a-jour-de-statistiques-sur-une-grosse-table-par-des-voies-detournees">Read more <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This is my first blog post of 2018, and my first in quite a while. Indeed, last year I put all my energy into refreshing my Linux knowledge in line with Microsoft&rsquo;s new Open Source strategy. But at the same time, I carried out a number of interesting tasks for some customers, and here is one to start this new year. In this post, I would like to highlight a particular (in my opinion) approach to optimizing a statistics update on a big table.</p>
<p>&gt; <a href="https://blog.dbi-services.com/experiencing-updating-statistics-on-a-big-table-by-unusual-ways/" rel="noopener" target="_blank">Lire la suite</a> (en anglais)</p>
<p>David Barbarin<br />
MVP &amp; MCM SQL Server</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>AlwaysOn availability groups and a statistics issue on secondary replicas</title>
		<link>https://blog.developpez.com/mikedavem/p13130/sql-server-2012/groupes-de-disponibilites-alwayson-and-probleme-de-statistique-sur-les-secondaires</link>
		<comments>https://blog.developpez.com/mikedavem/p13130/sql-server-2012/groupes-de-disponibilites-alwayson-and-probleme-de-statistique-sur-les-secondaires#comments</comments>
		<pubDate>Sun, 15 Jan 2017 15:30:34 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[SQL Server 2012]]></category>
		<category><![CDATA[SQL Server 2014]]></category>
		<category><![CDATA[SQL Server 2016]]></category>
		<category><![CDATA[availability groups]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[réplica en lecture seule]]></category>
		<category><![CDATA[reporting]]></category>
		<category><![CDATA[secondary replicas]]></category>
		<category><![CDATA[statistiques]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1289</guid>
		<description><![CDATA[I would like to share with you an interesting statistics issue you may encounter with read-only replicas in an availability group infrastructure. For those who use them for reporting purposes, keep reading &#8230; <a href="https://blog.developpez.com/mikedavem/p13130/sql-server-2012/groupes-de-disponibilites-alwayson-and-probleme-de-statistique-sur-les-secondaires">Read more <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I would like to share with you an interesting statistics issue you may encounter with read-only replicas in an availability group infrastructure. For those who use them for reporting purposes, keep reading this post: it concerns a statistics update behavior on those replicas that can lead to cardinality estimation issues, with potentially serious consequences for the performance of your queries.</p>
<p>&gt; <a href="http://blog.dbi-services.com/sql-server-alwayson-availability-groups-and-statistic-issues-on-secondaries/" target="_blank">Lire la suite</a> (en anglais)</p>
<p>David Barbarin<br />
MVP &amp; MCM SQL Server</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
