SQLIOSim - Outstanding I/O Warnings
I have encountered some questions pertaining to the SQLIOSim outstanding IO warnings and I am posting some details.
The Display monitor thread is responsible for updating the display (progress outputs) and checking for long IOs (CheckForLongIO/ReportLongIo). Each time it wakes (from 1 to 10 second range depending on state) it fires the check.
Each logical CPU (scheduler) in the simulator has an IO list. The CheckForLongIO loops over each logical CPU and checks the pending IO list of each logical scheduler.
When the IO is posted (async) we grab the current time and add the IO to the logical scheduler specific list. When the check is performed each IO in the list is checked with HasOverlappedIoCompleted . This is a WIN32 macro that checks the status stored in the OVERLAPPED structure of the IO to see if the system has reported the IO as complete. When FALSE is returned and the duration of the IO is > 15 seconds the condition is flagged as a ‘stalled’ IO as the OS and sub-system have not completed the specific IO.
The number of these outstanding IOs is accumulated and reported after all logical CPUs have been checked.
The following output shows 2 logical CPUs being used by the simulator.
03/10/07 |
15:35:58 |
3264 |
System |
CPUCount = 2, Affinity = 0, IOAffinity = 0, MaxMemory = 630 MB, StopOnError = Yes, TestCycles = 1, TestCycleDuration = 300 s, CacheHitRatio = 100.00% |
At 15:36:46 you have 5 IO requests that exceeded 15 seconds. One second later another 72 requests exceeded the 15 second mark. (The same IO is not reported again, once flagged as long it won’t be counted on secondary passes.)
What this does not show is how many IOs have completed. SQLIOSim could have completed 1000s of IOs during this 15 second window but some of the requests are being stalled.
03/10/07 |
15:36:45 |
3764 |
Display Monitor |
Error: 0x00000000 Error Text: Description: 5 IO requests are outstanding for more than 15 sec. |
03/10/07 |
15:36:46 |
3764 |
Display Monitor |
Error: 0x00000000 Error Text: Description: 72 IO requests are outstanding for more than 15 sec. |
The files are open with NO buffering so they should not involve system cache components. You should review filter drivers and the like. Likely something like elevator seek or something of this nature is at play when you see the outstanding IO warnings.
03/10/07 |
15:35:58 |
3264 |
System |
NoBuffering = Yes, WriteThrough = Yes, UseScatterGather = Yes, ForceReadAhead = No, MaxOutstandingIO = 0, TargetIODuration = 100 ms |
LOGICALCONFIG::PrintConfigurationToLog |
e:yukonsosbranchsqlntdbmsstorengutilsqliosimlogicalconfig.cpp |
228 |
|
03/10/07 |
15:35:58 |
3264 |
System |
AllowIOBursts = Yes, DeleteFilesAtStartup = Yes, DeleteFilesAtShutdown = No, StampFiles = No |
The following error text is reported for each, individual file no more than every 30 seconds, indicating the number of total IOs identified as taking longer than 15 seconds.
The following indicates that 826 IO requests for the file sqliosim.mdx exceeded 15 seconds to complete in the last 30 second processing window. The last IO that is currently STATUS_PENDING (256) is at the specified offset on the specified handle.
03/10/07 |
15:37:16 |
3764 |
Display Monitor |
Comments
Anonymous
June 28, 2007
One of the things that’s great about my job is the fact that I get to meet people who are a lot smarterAnonymous
July 18, 2007
I had run the test for HP MSA drive wiht one local drive. But for MSA drvie it is printing following errors:- 19/07 12:43:56 11688 CreateFileStream Error: 0x80070032 Error Text: The request is not supported. Description: Unable to get disk cache info for Z:Anonymous
October 08, 2007
The utility attempts to check the caching state. When it can't it logs an error. This should be a warning and will be updated in future versions. [RDORR Jan 24, 2008] This is correct and is already updated in the SQL 2008 CTP builds.Anonymous
March 04, 2009
"[RDORR Jan 24, 2008] This is correct and is already updated in the SQL 2008 CTP builds." In SQL2008 on Windows 2008, I still get a similar error message from "CreateFileStream": Error: 0x80070001 Error Text: Incorrect function. Description: Unable to get disk cache info for C: Version: 2007.100.1600.22 Is it safe to say that it is just a warning? not an error?Anonymous
June 09, 2009
According to this post, "The CheckForLongIO calls ReportLongIO but ReportLongIO has the 30 second reporting throttle." In my case it looks like it is in the reverse order. Can you please help me to understand is this is a potential problem. 10:47:09 Description: 1 long IO requests are outstanding for file E:sqliosim.mdx. Handle 0x598. Offset of latest long IO 0xab5f40000, duration 15366 ms CLogicalFile::ReportLongIO 10:47:09 Description: 298 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:11 Description: 385 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:13 Description: 282 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:14 Description: 915 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:15 Description: 830 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:16 Description: 723 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:17 Description: 1446 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIO 10:47:18 Description: 101 IO requests are outstanding for more than 15 sec. CLogicalCPUs::CheckForLongIOAnonymous
August 24, 2010
wonderful blog thanks for sharing such useful post. http://www.billergeeks.comAnonymous
October 01, 2010
Thanks for sharing this information, its quite helpful. www.crystal-limousine.comAnonymous
May 16, 2013
The comment has been removedAnonymous
November 25, 2018
I have a basic query:In case of underlying device not being accessible for few seconds.How does sqliosim behaves.I see Cachehit Ratio =1000 (10%) being set.My expectation, it should hold the IO for some time before it gives an Error dump.But in actual, i am seeing an sqliosim error dumo immediately.1st Dump Snippet:Data mismatch between the expected disk data and the read buffer:File: E:\sqliosim.mdxOffset: 0x1A6D0000Expected FileId: 0x0Received FileId: 0x0Expected PageId: 0xD368Received PageId: 0x2DD70000 (does not match expected)Expected CheckSum: 0xCDE3D5F4Received CheckSum: 0xD05B4ED0 (does not match expected)Calculated CheckSum: 0xD05B4ED0Expected Buffer Length: 0x2000Received Buffer Length: 0x2000Synchronous read was not successful after 15 attempts++++++Difference data available at the end of dump+++++++Do anyone have an idea, whether its a Storage problem or a tool problem.I ran it in Windows 2008,12,2016 Server edition.Will appreciate a quick response.Thanks,TanayAnonymous
November 25, 2018
Any idea, how sqliosim would handle if underlying device is offline for few seconds