Inside disk i/o

Beta version

* Identify I/O bottlenecks by using sp_sysmon, and resolutions
* Useful MDA to find disk I/O contention
* Disk I/O structures
* I/O scheduler
* Semaphore




Identify io bottlenecks by using sp_sysmon and resolutions:-

The following sp_sysmon reports help identify I/O problems:

1) Lock Management report

        * Last page locks on Heaps

2) Task Management report
     
       * Task context switch due to "i/o Device contention"

3) Disk i/o Management report

       * Max outstanding i/o
       * i/o delayed by
       * Completed disk i/o
       * Device Activity details

-------------->>

1) Lock Management report

 * Last page locks on Heaps

  Last Page Locks on Heaps
    Granted                        1617.8          23.5       32356      99.9 %
    Waited                            1.4           0.0          28       0.1 %
  -------------------------  ------------  ------------  ----------  ----------
  Total Last Pg Locks              1619.2          23.5       32384     100.0 %


2) Task Management report

    * Task context switch due to "i/o Device contention"

Task Context Switches Due To:
    Voluntary Yields                 49.5           0.7         989       7.4 %
    Cache Search Misses               1.7           0.0          34       0.3 %
    System Disk Writes                2.1           0.0          42       0.3 %
    I/O Pacing                       37.4           0.5         747       5.6 %
    Logical Lock Contention           2.5           0.0          50       0.4 %
    Address Lock Contention           0.0           0.0           0       0.0 %
    Latch Contention                  0.0           0.0           0       0.0 %
    Log Semaphore Contention          0.8           0.0          15       0.1 %
    PLC Lock Contention               0.3           0.0           5       0.0 %
    Group Commit Sleeps               3.2           0.0          64       0.5 %
    Last Log Page Writes             46.3           0.7         926       6.9 %
    Modify Conflicts                  3.3           0.0          66       0.5 %
    I/O Device Contention             0.0           0.0           0       0.0 %

I/O Device Contention counts how often a task had to wait for the semaphore of a particular device. It shows up on SMP systems and where mirroring is in use. In layman's terms: if two ASE engines try to get an I/O structure for the same device at the same time, one of them has to wait until the other has finished.

In this case the resolution is to add devices and move some tables and indexes onto them, so that the I/O is spread across more devices.

-------->
3) Disk i/o Management report

       * Max outstanding i/o
       * i/o delayed by
       * Completed disk i/o
       * Device Activity details


Disk I/O Management
-------------------
  Max Outstanding I/Os            per sec      per xact       count  % of total
  -------------------------  ------------  ------------  ----------  ----------
    Server                            n/a           n/a          16       n/a
    Engine 0                          n/a           n/a          14       n/a
    Engine 1                          n/a           n/a          14       n/a

Max Outstanding I/Os: the count column reports the maximum number of I/Os that were pending, for the server as a whole and for each engine, during the sample interval.

Compare the count column across several sp_sysmon runs. If it does not come down, the I/O-related configuration in ASE and the corresponding OS parameters have to be changed.

** It is the most useful indicator for identifying an I/O bottleneck.
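
The comparison between runs can be sketched as below. This is only an illustration: the function name and the input format are hypothetical, and apart from the 16 taken from the report above, the sample counts are made up.

```python
def outstanding_io_trend(counts):
    """Classify how the Max Outstanding I/Os count moved across sp_sysmon runs.

    counts: values of the 'count' column, oldest run first (hypothetical format).
    """
    if len(counts) < 2:
        return "need at least two runs"
    first, last = counts[0], counts[-1]
    if last < first:
        return "coming down"
    return "not coming down: review ASE and OS I/O settings"


# Server-level counts from three runs; only the 16 comes from the report above.
print(outstanding_io_trend([16, 18, 21]))
print(outstanding_io_trend([16, 12, 9]))
```

Comparing only the first and last run is a deliberate simplification; with many runs a proper trend check would be more appropriate.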

  I/Os Delayed by                 per sec      per xact       count  % of total
  -------------------------  ------------  ------------  ----------  ----------
    Disk I/O Structures               n/a           n/a           0       n/a
    Server Config Limit               n/a           n/a           0       n/a
    Engine Config Limit               n/a           n/a           0       n/a
    Operating System Limit            n/a           n/a           0       n/a

ASE delays an I/O when it cannot get a disk I/O control block (structure), or when it hits a configured or OS limit, before initiating the I/O request. If any count is greater than 0, the matching parameter has to be raised. "Operating System Limit" has to be fixed at the OS level. "Disk I/O Structures" maps to an ASE parameter:

sp_configure "disk i/o structures"

"Server Config Limit" and "Engine Config Limit" are hit when the ASE async I/O limits are reached, and these in turn have to fit within what the OS allows for async I/O per system or per process. Reconfigure the async I/O (aio) settings at the OS level where needed, and raise the ASE parameters, for example:

sp_configure "max async i/os per server"
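
As a quick reference, the rows of "I/Os Delayed by" can be mapped to the setting to review. This is a rough sketch: the dictionary layout and function name are assumptions for illustration, and the mapping only restates the notes above.

```python
# Which setting to review for each non-zero "I/Os Delayed by" row.
# The row names come from the sp_sysmon report; the structure is illustrative.
DELAYED_BY_ACTION = {
    "Disk I/O Structures": 'ASE: sp_configure "disk i/o structures"',
    "Server Config Limit": 'ASE: sp_configure "max async i/os per server" (plus OS aio limits)',
    "Engine Config Limit": 'ASE: sp_configure "max async i/os per engine" (plus OS aio limits)',
    "Operating System Limit": "OS: asynchronous I/O (aio) limits",
}


def settings_to_review(delayed_by_counts):
    """Return the settings to look at for every row whose count is above 0."""
    return [
        DELAYED_BY_ACTION[row]
        for row, count in delayed_by_counts.items()
        if count > 0
    ]


# Counts from the report above: nothing was delayed, so nothing to review.
sample = {
    "Disk I/O Structures": 0,
    "Server Config Limit": 0,
    "Engine Config Limit": 0,
    "Operating System Limit": 0,
}
print(settings_to_review(sample))  # []
```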


Max async i/os per engine:

Specifies the maximum number of outstanding asynchronous disk I/O requests for a single engine at one time. On Linux, the upper bound for 'max async i/os per engine' is given in /proc/sys/fs/aio-max-nr. If the requested value exceeds the maximum allowed value, the io_setup() call fails with EAGAIN.

Max async i/os per server:

Specifies the maximum number of asynchronous disk I/O requests that can be outstanding for Adaptive Server at one time. This limit is not affected by the number of online engines per Adaptive Server; max async i/os per server limits the total number of asynchronous I/Os a server can issue at one time, regardless of how many online engines it has. max async i/os per engine limits the number of outstanding I/Os per engine.
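
On Linux, a candidate value can be checked against the kernel limit in /proc/sys/fs/aio-max-nr before changing the ASE setting. A minimal sketch, assuming a Linux host; the helper names and the candidate value 4096 are hypothetical and not a recommendation.

```python
from pathlib import Path

AIO_MAX_NR = Path("/proc/sys/fs/aio-max-nr")  # Linux system-wide AIO request limit


def fits_aio_limit(requested, aio_max_nr):
    """True if the requested number of async I/Os is within the kernel limit."""
    return requested <= aio_max_nr


def read_aio_max_nr(path=AIO_MAX_NR):
    """Return the kernel limit, or None when the file is missing (non-Linux host)."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


limit = read_aio_max_nr()
if limit is not None:
    # 4096 is a made-up candidate for 'max async i/os per engine'.
    print("aio-max-nr:", limit, "candidate fits:", fits_aio_limit(4096, limit))
```

Because aio-max-nr is a system-wide limit, other processes on the same host use part of it as well, so passing this check is necessary but not sufficient.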

 http://www.sybase.com/files/White_Papers/Sybase_ASE_on_Linux_wp.pdf

Asynchronous I/O (AIO):- It allows a process to initiate a number of I/O operations without having to block or wait for any of them to complete. At some later time, or after being notified of I/O completion, the process can retrieve the results of the I/O.

** In this sub-report, "Total Requested Disk I/Os" and "Total Completed I/Os" should be (almost) equal.

  Total Requested Disk I/Os         118.0           1.7        2359
  Completed Disk I/O's
    Engine 0                         52.5           0.8        1049      44.4 %
    Engine 1                         65.7           1.0        1313      55.6 %
  -------------------------  ------------  ------------  ----------
  Total Completed I/Os              118.1           1.7        2362
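
The gap between requested and completed I/Os in the report above can be worked out as follows. The function name is hypothetical; the two counts are the ones shown in the report.

```python
def io_completion_gap(requested, completed):
    """Relative difference between requested and completed disk I/O counts."""
    if requested == 0:
        return 0.0
    return abs(requested - completed) / requested


gap = io_completion_gap(2359, 2362)
print(f"{gap:.2%}")  # prints 0.13%
```

A gap this small is what we want to see; a large or growing gap would mean that requested I/Os are not completing within the sample interval.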
==============================================

Useful MDA to find the disk io contentions:-

-- per-device counters for the data devices (assumes they are named data*)
select Reads, APFReads, Writes, DevSemaphoreRequests, DevSemaphoreWaits, IOTime,
       substring(LogicalName, 1, 30) LogicalName, substring(PhysicalName, 1, 30) PhysicalName
from master..monDeviceIO
where LogicalName like 'data%'

-- total I/O time and average time per I/O for the data devices
select 'Data Segments', sum(IOTime) 'Total IOTime', sum(Reads + APFReads + Writes) 'Total R+APFr+W', sum(IOTime)/sum(Reads + APFReads + Writes) 'ms/IO'
from master..monDeviceIO
where LogicalName like 'data%'

-- per-device counters for the log devices (assumes they are named log*)
select Reads, APFReads, Writes, DevSemaphoreRequests, DevSemaphoreWaits, IOTime,
       substring(LogicalName, 1, 30) LogicalName, substring(PhysicalName, 1, 30) PhysicalName
from master..monDeviceIO
where LogicalName like 'log%'

-- total I/O time and average time per I/O for the log devices
select 'Log Segments', sum(IOTime) 'Total IOTime', sum(Reads + APFReads + Writes) 'Total R+APFr+W', sum(IOTime)/sum(Reads + APFReads + Writes) 'ms/IO'
from master..monDeviceIO
where LogicalName like 'log%'
go
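
Once the rows are fetched, the two figures of interest per device are the average time per I/O and the share of device semaphore requests that had to wait. A small sketch of that arithmetic; the function name and the sample row are made up for illustration, only the column names come from monDeviceIO.

```python
def device_io_stats(reads, apf_reads, writes, io_time_ms, sem_requests, sem_waits):
    """Return (ms per I/O, % of device semaphore requests that waited)."""
    total_io = reads + apf_reads + writes
    ms_per_io = io_time_ms / total_io if total_io else 0.0
    wait_pct = 100.0 * sem_waits / sem_requests if sem_requests else 0.0
    return ms_per_io, wait_pct


# Hypothetical row: Reads, APFReads, Writes, IOTime,
# DevSemaphoreRequests, DevSemaphoreWaits
ms_per_io, wait_pct = device_io_stats(1200, 300, 500, 9000, 2000, 50)
print(ms_per_io, wait_pct)  # 4.5 2.5
```

Guarding against a zero divisor matters here because a device with no activity since the counters were reset would otherwise raise a division error.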

disk i/o structures:-
--------------------

Here is the current advice from ASE Product Support Engineering (example for 5 engines). A general rule of thumb is to allow 400 per engine and then monitor with sp_sysmon. If the Max Outstanding I/Os Per Engine gets over about 60%, then "disk i/o structures" should be increased.

The same is true if I/Os Delayed By: Disk I/O Structures is non-zero. In this case, with 5 engines, we would recommend setting "disk i/o structures" to 5 x 400 = 2000 -> round up to 2048.


There should be no negative side effect of changing "disk i/o structures" from 512 to 2048 in the server. ASE will use a small amount of additional memory when it boots: a couple hundred bytes each, so around half a MB or less for the increase from 512 to 2048. This is a relatively low cost to prevent a constriction on throughput.
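
The sizing rule and the memory estimate can be checked with a few lines. This is a sketch under assumptions: rounding up to the next power of two is my reading of the 2000 -> 2048 example (the advice does not state the rounding rule), and 200 and 300 bytes are illustrative bounds for "a couple hundred bytes each".

```python
def recommended_disk_io_structures(engines, per_engine=400):
    """Rule of thumb from the advice above: 400 per engine, rounded up.

    Rounding to the next power of two is an assumption that matches the
    2000 -> 2048 example.
    """
    needed = engines * per_engine
    size = 1
    while size < needed:
        size *= 2
    return size


def extra_memory_bytes(old, new, bytes_each):
    """Additional memory for raising the setting, given a per-structure size."""
    return (new - old) * bytes_each


new_value = recommended_disk_io_structures(5)
print(new_value)  # 2048
print(extra_memory_bytes(512, new_value, 200))  # 307200 bytes, about 300 KB
print(extra_memory_bytes(512, new_value, 300))  # 460800 bytes, about 450 KB
```

Both estimates stay under half a MB, which agrees with the figure quoted in the advice.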

===========================================

The sp_sysmon shows that there are very high numbers for  I/Os delayed by disk i/o structures:-
---------------------------------------------------------------------------------------------

In this case: 1) Increase "disk i/o structures" at the ASE level, together with the corresponding OS parameters. 2) Check the cache spinlock contention. Here sp_sysmon also showed a significant number of samples where the cache spinlock contention exceeded 10%, and adding some cache partitions also helped speed up the pending I/O requests.

=========================================

io scheduler:-

Does Sybase make any recommendation or have any guidelines on the Linux io scheduler with ASE?

The write-up below would suggest that the deadline scheduler would be more appropriate for ASE rather than the CFQ default scheduler.

Flexible I/O scheduler:
----------------------------------------------------------
The new I/O scheduler allows administrators to tune the server to match its usage with four
(Exclusive) I/O behavior policies:

1. CFQ (default scheduler): Complete Fair Queuing, is suitable for a wide variety of applications, especially desktop and multimedia workloads. It is the default I/O scheduler. CFQ treats all competing processes equally by assigning each process a unique request queue and giving each queue equal bandwidth.

2. Deadline: The deadline I/O scheduler implements a per-request service deadline to ensure that no requests are neglected, thus providing excellent request latency while maintaining good disk throughput. Deadline policy is best for disk-intensive database applications.

3. Anticipatory: The anticipatory I/O scheduler uses the deadline mechanism plus an anticipation heuristic to predict the actions of applications. This provides greater disk throughput but slightly increases latency. The anticipation heuristic is suitable for file servers but does not work as well for database workloads.

4. No-Op: This no-operation mode does no sorting at all and is used only for disks that perform their own scheduling or that are randomly accessible. The first three behaviors group and merge requests to maximize request sizes, cutting down on the number of searches performed.
----------------------------------------------------------


From experience it seems that CFQ (the default setting) or Deadline performs best with ASE. However, like most things which can be tuned, the 'best' Linux scheduler setting depends upon the I/O load, disk performance, the types of I/O being done, and the sizes of the I/Os.

A document released by SAP at TechWave 2011 gives the following:

Storage       Method
SAN           noop
SSD           noop
local disk    deadline

setting of scheduler:-
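
On Linux the active scheduler of a block device is exposed in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets, and writing a scheduler name to that file switches it at runtime. A minimal sketch of reading and setting it; the helper names and the device name "sda" are hypothetical, and the recommendation table only repeats the storage/method list above.

```python
from pathlib import Path

# Storage type -> scheduler, as listed in the table above.
RECOMMENDED = {"SAN": "noop", "SSD": "noop", "local disk": "deadline"}


def active_scheduler(text):
    """Return the scheduler shown in [brackets] in a queue/scheduler line."""
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def scheduler_for_device(device, sys_block=Path("/sys/block")):
    """Read the active scheduler for a device, or None if it cannot be read."""
    try:
        return active_scheduler((sys_block / device / "queue" / "scheduler").read_text())
    except OSError:
        return None


def set_scheduler(device, name, sys_block=Path("/sys/block")):
    """Switch the scheduler at runtime (needs root, does not survive a reboot)."""
    (sys_block / device / "queue" / "scheduler").write_text(name)


print(active_scheduler("noop anticipatory deadline [cfq]"))  # cfq
print(scheduler_for_device("sda"))  # "sda" is an example device name
```

set_scheduler is defined but deliberately not called here. To make a change permanent it has to be applied at boot, and how to do that depends on the distribution.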
=========================================

semaphore:-

Semaphores can be thought of as simple counters that indicate the status of a resource. The counter is a protected variable that the user cannot access directly; the kernel guards it. Using a semaphore is simple: if the counter is greater than 0, the resource is available, and if the counter is 0 or less, the resource is busy or being used by someone else. This simple mechanism helps in synchronizing multithreaded and multiprocess applications. Semaphores were invented and proposed by Edsger Dijkstra and are still used in operating systems today for synchronization purposes. The same mechanism is available to application developers too, and it is one of the most important aspects of interprocess communication.
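
The counter behaviour can be tried out with Python's threading.Semaphore. This is only an illustration of the general mechanism, not of how ASE implements its device semaphores; the task function and names are made up.

```python
import threading

# Counter starts at 1: the "device" is free.
device_semaphore = threading.Semaphore(1)
completed = []


def do_io(task_id):
    # Acquire: counter 1 -> 0. Any other task arriving now blocks here.
    with device_semaphore:
        completed.append(task_id)
    # Leaving the block releases the semaphore: counter 0 -> 1.


threads = [threading.Thread(target=do_io, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(completed))  # [0, 1, 2, 3]
```

With a starting value of 1, only one task at a time is inside the protected block, which is the same situation as two engines wanting the same device: one proceeds and the other waits.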


========================================