<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David Barbarin &#187; xfs</title>
	<atom:link href="https://blog.developpez.com/mikedavem/ptag/xfs/feed" rel="self" type="application/rss+xml" />
	<link>https://blog.developpez.com/mikedavem</link>
	<description>MVP DataPlatform - MCM SQL Server</description>
	<lastBuildDate>Thu, 09 Sep 2021 21:19:50 +0000</lastBuildDate>
	<language>fr-FR</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.42</generator>
	<item>
		<title>SQL Server on Linux and new FUA support for XFS filesystem</title>
		<link>https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem</link>
		<comments>https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem#comments</comments>
		<pubDate>Mon, 13 Apr 2020 17:34:32 +0000</pubDate>
		<dc:creator><![CDATA[mikedavem]]></dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL Server 2017]]></category>
		<category><![CDATA[SQL Server 2019]]></category>
		<category><![CDATA[blktrace]]></category>
		<category><![CDATA[FUA]]></category>
		<category><![CDATA[iostats]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[xfs]]></category>

		<guid isPermaLink="false">http://blog.developpez.com/mikedavem/?p=1568</guid>
		<description><![CDATA[I wrote a (dbi services) blog post about the changes in SQL Server IO behavior on Linux before and after SQL Server 2017 CU6. Since then, I had been looking forward to seeing the improvements around Force Unit Access (FUA) that was implemented with &#8230; <a href="https://blog.developpez.com/mikedavem/p13193/sql-server-vnext/sql-server-on-linux-and-new-fua-support-for-xfs-filesystem">Read more <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I wrote a (dbi services) <a href="https://blog.dbi-services.com/sql-server-on-linux-io-internal-thoughts/" rel="noopener" target="_blank">blog post</a> about the changes in SQL Server IO behavior on Linux before and after SQL Server 2017 CU6. Since then, I had been looking forward to seeing the improvements around Force Unit Access (FUA) that came with the Linux XFS enhancements in Linux kernel 4.18.</p>
<p><span id="more-1568"></span></p>
<p>As a reminder, SQL Server 2017 CU6 added a way to guarantee data durability by using the &ldquo;forced flush&rdquo; mechanism explained <a href="https://support.microsoft.com/en-us/help/4131496/enable-forced-flush-mechanism-in-sql-server-2017-on-linux" rel="noopener" target="_blank">here</a>. To cut a long story short, SQL Server has strict storage requirements such as write ordering and FUA, and durability is achieved differently on Linux than on Windows. What is FUA and why is it important for SQL Server? From <a href="https://en.wikipedia.org/wiki/Disk_buffer#Force_Unit_Access_(FUA)" rel="noopener" target="_blank">Wikipedia</a>: Force Unit Access (aka FUA) is an I/O write command option that forces written data all the way to stable storage. FUA first appeared in the SCSI command set and, good news, it was later adopted by other standards. SQL Server relies on it to meet its WAL and ACID requirements.</p>
<p>In the Linux world, before kernel 4.18, FUA was handled and optimized only for filesystem journaling. Data writes, however, always went through a multi-step flush process that could slow down SQL Server IO: issue the write to the block device for the data, then issue a block device flush to ensure durability with O_DSYNC.</p>
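<p>To make the two durability styles concrete, here is a minimal Python sketch. It is not what SQL Server does internally (SQL Server uses O_DIRECT and asynchronous IO, which are left out here); it only contrasts an explicit write followed by fdatasync() with a file opened with O_DSYNC. The function names and the temporary files are purely illustrative. Whether an O_DSYNC write actually reaches the device as a single FUA write depends on the kernel version, the filesystem, the use of direct IO and the device itself, which this sketch cannot show.</p>

```python
import os
import tempfile


def write_then_flush(path, payload):
    """Two-step durability: write the data, then request a separate flush."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fdatasync(fd)  # explicit flush issued after the write
    finally:
        os.close(fd)


def write_dsync(path, payload):
    """One-step durability: with O_DSYNC each write returns once data is stable."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)


def demo():
    """Write one 4 KiB page with each method and report the resulting file sizes."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, writer in (("flush", write_then_flush), ("dsync", write_dsync)):
            path = os.path.join(tmp, name + ".bin")
            writer(path, b"x" * 4096)
            sizes[name] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    print(demo())
```

<p>Both functions leave the same bytes on disk; the difference is only in how many requests the storage stack has to process before the write is considered durable.</p>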
<p>In the Windows world, installing and using a SQL Server instance assumes that you comply with the Microsoft storage requirements. The first RTM version shipped on Linux therefore came with O_DIRECT only, assuming that you had already made sure SQL Server IO is written directly to non-volatile storage, through the kernel, drivers and hardware, before it is acknowledged. The forced flush mechanism, based on fdatasync(), was then introduced to address scenarios where direct IO cannot be trusted to be durable.</p>
<p>According to Bob Dorr&rsquo;s <a href="https://bobsql.com/sql-server-on-linux-forced-unit-access-fua-internals/" rel="noopener" target="_blank">article</a>, however, Linux kernel 4.18 comes with XFS enhancements that handle FUA for data storage, which obviously benefits SQL Server. FUA support is intended to improve write performance by shortening the path of write requests, as shown below:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-1-IO-worklow-e1586796506268.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-1-IO-worklow-e1586796506268.jpg" alt="160 - 1 - IO worklow" width="1000" height="539" class="alignnone size-full wp-image-1569" /></a></p>
<p><em>Picture of the existing IO workflow, from Bob Dorr&rsquo;s article</em></p>
<p>This is an interesting improvement for write-intensive workloads, and it seems to be confirmed by the tests performed by Microsoft and Bob Dorr in his article.</p>
<p>Let the experiment begin with my lab environment, based on CentOS 7 on Hyper-V with an upgraded kernel version: 5.6.3-1.el7.elrepo.x86_64.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ uname -r<br />
5.6.3-1.el7.elrepo.x86_64<br />
<br />
$ cat /etc/os-release | grep VERSION<br />
VERSION=&quot;7 (Core)&quot;<br />
VERSION_ID=&quot;7&quot;<br />
CENTOS_MANTISBT_PROJECT_VERSION=&quot;7&quot;<br />
REDHAT_SUPPORT_PRODUCT_VERSION=&quot;7&quot;</div></div>
<p>Note that my tests are purely experimental. Instead of upgrading the kernel to a newer version, you may rely directly on RHEL 8 based distros, which come with kernel version 4.18.</p>
<p>My lab environment includes 2 separate SSD disks to host the DATA + TLOG database files as follows:</p>
<p>I:\ drive : SQL Data volume (sdb – XFS filesystem)<br />
T:\ drive : SQL TLog volume (sda – XFS filesystem)</p>
<p>The general performance is not so bad <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-6-diskmark-tests-storage-env-e1586796679451.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-6-diskmark-tests-storage-env-e1586796679451.jpg" alt="160 - 6 - diskmark tests storage env" width="1000" height="362" class="alignnone size-full wp-image-1571" /></a></p>
<p>Initially I dedicated a single disk to both SQL DATA and TLOG, but I quickly noticed some IO waits (iostat output) that left me unsure about my test results:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-3-iostats-before-optimization.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-3-iostats-before-optimization.jpg" alt="160 - 3 - iostats before optimization" width="975" height="447" class="alignnone size-full wp-image-1572" /></a></p>
<p>Spreading IO across physically separate volumes drastically reduced these waits afterwards:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-4-iostats-after-optimization.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-4-iostats-after-optimization.jpg" alt="160 - 4 - iostats after optimization" width="984" height="531" class="alignnone size-full wp-image-1573" /></a> </p>
<p>First, I enabled FUA capabilities on the Hyper-V side as follows:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">Set-VMHardDiskDrive -VMName CENTOS7 -ControllerType SCSI -OverrideCacheAttributes WriteCacheAndFUAEnabled<br />
<br />
Get-VMHardDiskDrive -VMName CENTOS7 | `<br />
&nbsp; &nbsp; ft VMName, ControllerType, &nbsp;ControllerLocation, Path, WriteHardeningMethod -AutoSize</div></div>
<p>Then I checked that FUA is enabled and supported from an OS perspective for both the sda (TLOG) and sdb (SQL DATA) disks:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ lsblk -f<br />
NAME &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;FSTYPE &nbsp; &nbsp; &nbsp;LABEL UUID &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; MOUNTPOINT<br />
sdb<br />
└─sdb1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 06910f69-27a3-4711-9093-f8bf80d15d72 &nbsp; /sqldata<br />
sr0<br />
sda<br />
├─sda2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; f5a9bded-130f-4642-bd6f-9f27563a4e16 &nbsp; /boot<br />
├─sda3 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;LVM2_member &nbsp; &nbsp; &nbsp; QsbKEt-28yT-lpfZ-VCbj-v5W5-vnVr-2l7nih<br />
│ ├─centos-swap swap &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7eebbb32-cef5-42e9-87c3-7df1a0b79f11 &nbsp; [SWAP]<br />
│ └─centos-root xfs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 90f6eb2f-dd39-4bef-a7da-67aa75d1843d &nbsp; /<br />
└─sda1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;vfat &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7529-979E &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;/boot/efi<br />
<br />
$ dmesg | grep sda<br />
[ &nbsp; &nbsp;1.665478] sd 0:0:0:0: [sda] 83886080 512-byte logical blocks: (42.9 GB/40.0 GiB)<br />
[ &nbsp; &nbsp;1.665479] sd 0:0:0:0: [sda] 4096-byte physical blocks<br />
[ &nbsp; &nbsp;1.665774] sd 0:0:0:0: [sda] Write Protect is off<br />
[ &nbsp; &nbsp;1.665775] sd 0:0:0:0: [sda] Mode Sense: 0f 00 10 00<br />
[ &nbsp; &nbsp;1.670321] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA<br />
[ &nbsp; &nbsp;1.683833] &nbsp;sda: sda1 sda2 sda3<br />
[ &nbsp; &nbsp;1.708938] sd 0:0:0:0: [sda] Attached SCSI disk<br />
[ &nbsp; &nbsp;5.607914] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)</div></div>
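<p>When several disks are involved, the same check can be scripted. The following is a small sketch, assuming the kernel log contains a &ldquo;Write cache&rdquo; line in the same shape as the one shown above; the regular expression, the function name and the sample text are illustrative, and any device whose line does not end with &ldquo;supports DPO and FUA&rdquo; is simply reported as not supporting FUA.</p>

```python
import re

# Matches the sd driver line that reports cache settings, for example:
# "sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA"
CACHE_LINE = re.compile(
    r"\[(sd[a-z]+)\] Write cache: (\w+), read cache: (\w+), (.+)$"
)


def fua_support(dmesg_text):
    """Return {device: bool} for every 'Write cache' line found in the text."""
    result = {}
    for line in dmesg_text.splitlines():
        match = CACHE_LINE.search(line)
        if match:
            device, _write_cache, _read_cache, tail = match.groups()
            result[device] = tail.strip() == "supports DPO and FUA"
    return result


# Two lines copied from the dmesg output above.
SAMPLE = (
    "[    1.665775] sd 0:0:0:0: [sda] Mode Sense: 0f 00 10 00\n"
    "[    1.670321] sd 0:0:0:0: [sda] Write cache: enabled, "
    "read cache: enabled, supports DPO and FUA\n"
)

if __name__ == "__main__":
    print(fua_support(SAMPLE))
```

<p>In practice the input would be the full <code>dmesg</code> text, for instance captured to a file and read back into the function.</p>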
<p>Finally, according to the documentation, I configured <strong>trace flag 3979</strong> and <strong>control.alternatewritethrough=0</strong> as startup parameters for my SQL Server instance.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ /opt/mssql/bin/mssql-conf traceflag 3979 on<br />
<br />
$ /opt/mssql/bin/mssql-conf set control.alternatewritethrough 0<br />
<br />
$ systemctl restart mssql-server</div></div>
<p>The first test I performed was pretty similar to the one in my previous (dbi services) <a href="https://blog.dbi-services.com/sql-server-on-linux-io-internal-thoughts/">blog post</a>.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE TABLE dummy_test (<br />
&nbsp; &nbsp; id INT IDENTITY,<br />
&nbsp; &nbsp; col1 VARCHAR(2000) DEFAULT REPLICATE('T', 2000)<br />
);<br />
<br />
INSERT INTO dummy_test DEFAULT VALUES;<br />
GO 67</div></div>
<p>Out of curiosity, I looked at the corresponding strace output:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">$ cat sql_strace_fua.txt<br />
% time &nbsp; &nbsp; seconds &nbsp;usecs/call &nbsp; &nbsp; calls &nbsp; &nbsp;errors syscall<br />
------ ----------- ----------- --------- --------- ----------------<br />
&nbsp;78.13 &nbsp;360.618066 &nbsp; &nbsp; &nbsp; 61739 &nbsp; &nbsp; &nbsp;5841 &nbsp; &nbsp; &nbsp;2219 futex<br />
&nbsp; 6.88 &nbsp; 31.731833 &nbsp; &nbsp; 1511040 &nbsp; &nbsp; &nbsp; &nbsp;21 &nbsp; &nbsp; &nbsp; &nbsp;15 restart_syscall<br />
&nbsp; 3.81 &nbsp; 17.592176 &nbsp; &nbsp; &nbsp;130312 &nbsp; &nbsp; &nbsp; 135 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; io_getevents<br />
&nbsp; 2.95 &nbsp; 13.607314 &nbsp; &nbsp; &nbsp; 98604 &nbsp; &nbsp; &nbsp; 138 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; epoll_wait<br />
&nbsp; 2.88 &nbsp; 13.313667 &nbsp; &nbsp; &nbsp;633984 &nbsp; &nbsp; &nbsp; &nbsp;21 &nbsp; &nbsp; &nbsp; &nbsp;21 rt_sigtimedwait<br />
&nbsp; 2.60 &nbsp; 11.997925 &nbsp; &nbsp; 1333103 &nbsp; &nbsp; &nbsp; &nbsp; 9 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; nanosleep<br />
&nbsp; 1.79 &nbsp; &nbsp;8.279781 &nbsp; &nbsp; &nbsp; &nbsp; 242 &nbsp; &nbsp; 34256 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; gettid<br />
&nbsp; 0.84 &nbsp; &nbsp;3.876021 &nbsp; &nbsp; &nbsp; &nbsp; 226 &nbsp; &nbsp; 17124 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; getcpu<br />
&nbsp; 0.03 &nbsp; &nbsp;0.138836 &nbsp; &nbsp; &nbsp; &nbsp; 347 &nbsp; &nbsp; &nbsp; 400 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_yield<br />
&nbsp; 0.01 &nbsp; &nbsp;0.062348 &nbsp; &nbsp; &nbsp; &nbsp; 254 &nbsp; &nbsp; &nbsp; 245 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; getrusage<br />
&nbsp; 0.01 &nbsp; &nbsp;0.056065 &nbsp; &nbsp; &nbsp; &nbsp; 406 &nbsp; &nbsp; &nbsp; 138 &nbsp; &nbsp; &nbsp; &nbsp;69 readv<br />
&nbsp; 0.01 &nbsp; &nbsp;0.038107 &nbsp; &nbsp; &nbsp; &nbsp; 343 &nbsp; &nbsp; &nbsp; 111 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; read<br />
&nbsp; 0.01 &nbsp; &nbsp;0.037883 &nbsp; &nbsp; &nbsp; &nbsp; 743 &nbsp; &nbsp; &nbsp; &nbsp;51 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; mmap<br />
&nbsp; 0.01 &nbsp; &nbsp;0.037498 &nbsp; &nbsp; &nbsp; &nbsp; 180 &nbsp; &nbsp; &nbsp; 208 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; epoll_ctl<br />
&nbsp; 0.01 &nbsp; &nbsp;0.035654 &nbsp; &nbsp; &nbsp; &nbsp; 517 &nbsp; &nbsp; &nbsp; &nbsp;69 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; writev<br />
&nbsp; 0.01 &nbsp; &nbsp;0.025542 &nbsp; &nbsp; &nbsp; &nbsp; 370 &nbsp; &nbsp; &nbsp; &nbsp;69 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; io_submit<br />
&nbsp; 0.00 &nbsp; &nbsp;0.019760 &nbsp; &nbsp; &nbsp; &nbsp; 282 &nbsp; &nbsp; &nbsp; &nbsp;70 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; write<br />
&nbsp; 0.00 &nbsp; &nbsp;0.019555 &nbsp; &nbsp; &nbsp; &nbsp; 477 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; open<br />
&nbsp; 0.00 &nbsp; &nbsp;0.016285 &nbsp; &nbsp; &nbsp; &nbsp;1629 &nbsp; &nbsp; &nbsp; &nbsp;10 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rt_sigaction<br />
&nbsp; 0.00 &nbsp; &nbsp;0.012359 &nbsp; &nbsp; &nbsp; &nbsp; 301 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; close<br />
&nbsp; 0.00 &nbsp; &nbsp;0.010069 &nbsp; &nbsp; &nbsp; &nbsp; 205 &nbsp; &nbsp; &nbsp; &nbsp;49 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; munmap<br />
&nbsp; 0.00 &nbsp; &nbsp;0.006977 &nbsp; &nbsp; &nbsp; &nbsp; 303 &nbsp; &nbsp; &nbsp; &nbsp;23 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rt_sigprocmask<br />
&nbsp; 0.00 &nbsp; &nbsp;0.006256 &nbsp; &nbsp; &nbsp; &nbsp; 153 &nbsp; &nbsp; &nbsp; &nbsp;41 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fstat<br />
&nbsp; 0.00 &nbsp; &nbsp;0.004646 &nbsp; &nbsp; &nbsp; &nbsp; 465 &nbsp; &nbsp; &nbsp; &nbsp;10 &nbsp; &nbsp; &nbsp; &nbsp;10 stat<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000860 &nbsp; &nbsp; &nbsp; &nbsp; 215 &nbsp; &nbsp; &nbsp; &nbsp; 4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; madvise<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000321 &nbsp; &nbsp; &nbsp; &nbsp; 161 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_setaffinity<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000295 &nbsp; &nbsp; &nbsp; &nbsp; 148 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; set_robust_list<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000281 &nbsp; &nbsp; &nbsp; &nbsp; 141 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; clone<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000236 &nbsp; &nbsp; &nbsp; &nbsp; 118 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sigaltstack<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000093 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;47 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; arch_prctl<br />
&nbsp; 0.00 &nbsp; &nbsp;0.000046 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;23 &nbsp; &nbsp; &nbsp; &nbsp; 2 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sched_getaffinity<br />
------ ----------- ----------- --------- --------- ----------------<br />
100.00 &nbsp;461.546755 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 59137 &nbsp; &nbsp; &nbsp;2334 total</div></div>
<p>As I expected, with FUA enabled fsync() / fdatasync() are no longer called, and writing to stable storage is achieved directly by FUA commands. iomap_dio_rw() now determines whether REQ_FUA can be used and whether issuing generic_write_sync() is still necessary. To dig further into the IO layer, we need to rely on another tool, blktrace (also mentioned in Bob Dorr&rsquo;s article).</p>
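<p>The absence of fsync() / fdatasync() in an strace summary can also be checked with a few lines of Python rather than by eye. This is only a sketch: it assumes the text has the column layout of the summary shown above (percentage, seconds, usecs/call, calls, optional errors, syscall name), and the function names are mine. The sample reuses a few rows from my output.</p>

```python
def syscalls_in_summary(summary):
    """Map syscall name -> number of calls from an strace summary table."""
    counts = {}
    for line in summary.splitlines():
        fields = line.split()
        # Data rows have 5 or 6 columns; skip short lines and the total row.
        if len(fields) < 5 or fields[-1] == "total":
            continue
        try:
            float(fields[0])  # % time
            float(fields[1])  # seconds
            calls = int(fields[3])
        except ValueError:
            continue  # header or ruler line
        counts[fields[-1]] = calls
    return counts


def flush_calls(summary):
    """Return only the explicit flush syscalls found in the summary."""
    counts = syscalls_in_summary(summary)
    return {name: counts[name] for name in ("fsync", "fdatasync") if name in counts}


SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.13  360.618066       61739      5841      2219 futex
  3.81   17.592176      130312       135           io_getevents
  0.01    0.025542         370        69           io_submit
------ ----------- ----------- --------- --------- ----------------
100.00  461.546755                 59137      2334 total
"""

if __name__ == "__main__":
    print(flush_calls(SAMPLE))
```

<p>An empty result means no explicit flush syscall was recorded during the trace, which is the behavior expected with FUA enabled.</p>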
<p>In my case I got two different pictures of the blktrace output between the forced flush mechanism (the default) and FUA-oriented IO:</p>
<p>-&gt; With forced flush</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">34.694734500 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17164 &nbsp;A &nbsp;WS &nbsp; &nbsp; &nbsp; 2048 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694735000 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17165 &nbsp;Q &nbsp;WS &nbsp; &nbsp; &nbsp; 2048 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694737000 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17166 &nbsp;X &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694738100 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17167 &nbsp;G &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694739800 &nbsp; &nbsp; &nbsp;14225 18426216 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17169 &nbsp;G &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694740900 &nbsp; &nbsp; &nbsp;14225 18425192 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17171 &nbsp;D &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.694747200 &nbsp; &nbsp; &nbsp;14225 18426216 &nbsp; &nbsp; 8,16 &nbsp; 0 &nbsp; &nbsp;17174 &nbsp;D &nbsp;WS &nbsp; &nbsp; &nbsp; 1024 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.713665000 &nbsp; &nbsp; &nbsp;14225 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8,16 &nbsp; 0 &nbsp; &nbsp;17175 &nbsp;Q FWS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
34.713668100 &nbsp; &nbsp; &nbsp;14225 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8,16 &nbsp; 0 &nbsp; &nbsp;17176 &nbsp;G FWS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr</div></div>
<p>WS (write synchronous) requests are issued, but SQL Server still needs to go through the multi-step flush process with the additional FWS (PREFLUSH|WRITE|SYNC) requests.</p>
<p>-&gt; FUA</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">0.000000000 &nbsp; &nbsp; &nbsp;16305 55106536 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;1 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000000400 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;2 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000001100 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;3 &nbsp;Q WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.000005200 &nbsp; &nbsp; &nbsp;16305 57615336 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;4 &nbsp;G WFS &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;8 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr<br />
0.001377800 &nbsp; &nbsp; &nbsp;16305 55106544 &nbsp; &nbsp; 8,0 &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp;6 &nbsp;A WFS &nbsp; &nbsp; &nbsp; &nbsp; 16 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sqlservr</div></div>
<p>FWS has disappeared, leaving only WFS commands, which are basically <strong>REQ_WRITE with the REQ_FUA flag</strong>.</p>
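<p>The difference between FWS and WFS comes down to where the letter F sits in the RWBS field. The sketch below is a simplified decoder based on my reading of how the kernel builds that string (an F before the operation letter is a pre-flush, an F after it is FUA); treat the mapping as an assumption and check the blkparse documentation for your version. The function name and flag labels are illustrative.</p>

```python
# Operation letters and trailing modifiers as I understand the RWBS field.
OPERATIONS = {"R": "READ", "W": "WRITE", "D": "DISCARD", "N": "NONE"}
MODIFIERS = {"F": "FUA", "A": "READAHEAD", "S": "SYNC", "M": "META"}


def decode_rwbs(rwbs):
    """Split an RWBS string such as 'FWS' or 'WFS' into readable flag names."""
    flags = []
    index = 0
    # A leading 'F' followed by an operation letter marks a pre-flush.
    if len(rwbs) > 1 and rwbs[0] == "F" and rwbs[1] in OPERATIONS:
        flags.append("PREFLUSH")
        index = 1
    if index < len(rwbs):
        letter = rwbs[index]
        if letter in OPERATIONS:
            flags.append(OPERATIONS[letter])
        elif letter == "F":
            flags.append("FLUSH")  # a flush with no data operation
        else:
            flags.append("UNKNOWN(" + letter + ")")
        index += 1
    # Everything after the operation letter is a modifier.
    for letter in rwbs[index:]:
        flags.append(MODIFIERS.get(letter, "UNKNOWN(" + letter + ")"))
    return flags


if __name__ == "__main__":
    for value in ("WS", "FWS", "WFS"):
        print(value, decode_rwbs(value))
```

<p>Applied to the traces above, WS decodes to a synchronous write, FWS to a pre-flush attached to a synchronous write, and WFS to a synchronous write carrying the FUA flag.</p>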
<p>I spent some time reading some interesting discussions in addition to Bob Dorr&rsquo;s wonderful article. Here is an interesting <a href="https://lkml.org/lkml/2019/12/3/316" rel="noopener" target="_blank">pointer</a> to a discussion about REQ_FUA, for instance.</p>
<p><strong>But what about performance gain? </strong></p>
<p>I had 2 simple scenarios to play with in order to bring out the benefit of FUA: hardening the dirty pages in the buffer pool (BP) with the checkpoint process, and hardening the log buffer to disk during the commit phase. When the forced flush method is used, each component relies on an additional FlushFileBuffers() call to achieve durability. These calls can easily be tracked with an XE session that includes the <strong>flush_file_buffers</strong> and <strong>make_writes_durable</strong> events.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-1-1-flushfilebuffers-worklflow.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-1-1-flushfilebuffers-worklflow.jpg" alt="160 - 1 - 1 - flushfilebuffers worklflow" width="839" height="505" class="alignnone size-full wp-image-1575" /></a></p>
<p><strong>First scenario (10K inserts within a transaction and checkpoint)</strong></p>
<p>In this scenario my intention was to stress the checkpoint process with a bunch of buffers and dirty pages to flush to disk when it kicks in.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">USE dummy;<br />
<br />
SET NOCOUNT ON;<br />
-- Disable checkpoint to control when it will kick in<br />
DBCC TRACEON(3505);<br />
-- Check traceflag<br />
DBCC TRACESTATUS;<br />
<br />
DECLARE @i INT = 0;<br />
DECLARE @iteration INT = 0;<br />
DECLARE @start_upd DATETIME;<br />
DECLARE @start_chkpt DATETIME;<br />
DECLARE @end_upd DATETIME;<br />
DECLARE @end_chkpt DATETIME;<br />
<br />
TRUNCATE TABLE dummy_test;<br />
<br />
WHILE @iteration &lt; 251<br />
BEGIN<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; BEGIN TRAN;<br />
<br />
&nbsp; &nbsp; WHILE @i &lt;= 10000<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; INSERT INTO dummy_test DEFAULT VALUES;<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @i += 1;<br />
&nbsp; &nbsp; END<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; COMMIT TRAN;<br />
<br />
&nbsp; &nbsp; SET @end_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; SET @i = 0;<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_chkpt = GETDATE();<br />
&nbsp; &nbsp; CHECKPOINT;<br />
&nbsp; &nbsp; SET @end_chkpt = GETDATE();<br />
&nbsp; &nbsp; PRINT &#039;INS: &#039; + CAST(DATEDIFF(ms, @start_upd, @end_upd) AS VARCHAR(50)) + &#039; - CHKPT: &#039; + CAST(DATEDIFF(ms, @start_chkpt, @end_chkpt) AS VARCHAR(50));<br />
<br />
&nbsp; &nbsp; SET @iteration += 1;<br />
END</div></div>
<p>The result is as follows:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-5-test-perfs-250_10K_chkpt.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-5-test-perfs-250_10K_chkpt.jpg" alt="160 - 5 - test perfs 250_10K_chkpt" width="974" height="298" class="alignnone size-full wp-image-1576" /></a></p>
<p>In my case, I noticed an improvement of ~17% for the checkpoint process and ~7% for the insert transaction, including the commit phase that flushes data to the TLog. In parallel, the aggregated extended event output confirms that FUA avoids a lot of additional operations to persist data on disk, as illustrated by the flush_file_buffers and make_writes_durable events.</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-6-xe-flush-file-buffers-e1586798220100.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-6-xe-flush-file-buffers-e1586798220100.jpg" alt="160 - 6 - xe flush file buffers" width="1000" height="178" class="alignnone size-full wp-image-1577" /></a></p>
<p><strong>Second scenario (100x 1 insert within a transaction and checkpoint)</strong></p>
<p>In this scenario, I wanted to stress the log writer by forcing a lot of small transactions to commit. I updated the T-SQL code as shown below:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:650px;height:450px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">USE dummy;<br />
<br />
SET NOCOUNT ON;<br />
-- Disable checkpoint to control when it will kick in<br />
DBCC TRACEON(3505);<br />
-- Check traceflag<br />
DBCC TRACESTATUS;<br />
<br />
DECLARE @i INT = 0;<br />
DECLARE @iteration INT = 0;<br />
DECLARE @start_upd DATETIME;<br />
DECLARE @start_chkpt DATETIME;<br />
DECLARE @end_upd DATETIME;<br />
DECLARE @end_chkpt DATETIME;<br />
<br />
TRUNCATE TABLE dummy_test;<br />
<br />
WHILE @iteration &lt; 251<br />
BEGIN<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; WHILE @i &lt;= 100<br />
&nbsp; &nbsp; BEGIN<br />
&nbsp; &nbsp; &nbsp; &nbsp; INSERT INTO dummy_test DEFAULT VALUES;<br />
&nbsp; &nbsp; &nbsp; &nbsp; SET @i += 1;<br />
&nbsp; &nbsp; END<br />
<br />
&nbsp; &nbsp; SET @end_upd = GETDATE();<br />
<br />
&nbsp; &nbsp; SET @i = 0;<br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; SET @start_chkpt = GETDATE();<br />
&nbsp; &nbsp; CHECKPOINT;<br />
&nbsp; &nbsp; SET @end_chkpt = GETDATE();<br />
&nbsp; &nbsp; PRINT &#039;INS: &#039; + CAST(DATEDIFF(ms, @start_upd, @end_upd) AS VARCHAR(50)) + &#039; - CHKPT: &#039; + CAST(DATEDIFF(ms, @start_chkpt, @end_chkpt) AS VARCHAR(50));<br />
<br />
&nbsp; &nbsp; SET @iteration += 1;<br />
END</div></div>
<p>The new picture is the following:</p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-7-test-perfs-250_100_1K_chkpt.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-7-test-perfs-250_100_1K_chkpt.jpg" alt="160 - 7 - test perfs 250_100_1K_chkpt" width="974" height="298" class="alignnone size-full wp-image-1580" /></a></p>
<p>This time the improvement is definitely more impressive, with the execution time decreasing by ~80% for the INSERT + COMMIT and by ~77% for the checkpoint phase!</p>
<p>Looking at the extended event session confirms that the shortened IO path has something to do with it <img src="https://blog.developpez.com/mikedavem/wp-includes/images/smilies/icon_smile.gif" alt=":)" class="wp-smiley" /></p>
<p><a href="http://blog.developpez.com/mikedavem/files/2020/04/160-7-xe-flush-file-buffers-2-e1586798367112.jpg"><img src="http://blog.developpez.com/mikedavem/files/2020/04/160-7-xe-flush-file-buffers-2-e1586798367112.jpg" alt="160 - 7 - xe flush file buffers 2" width="1000" height="170" class="alignnone size-full wp-image-1578" /></a></p>
<p>Well, shortening the IO path and relying directly on FUA instructions was definitely a good idea, both to gain performance and to meet WAL and ACID requirements. Anyway, I’m glad to see Microsoft contributing improvements to the Linux kernel!</p>
]]></content:encoded>
			<wfw:commentRss></wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
