Beta version
* Identify I/O bottlenecks by using sp_sysmon, and resolutions
* Useful MDA tables to find disk I/O contention
* disk i/o structures
* io scheduler
* semaphore
Identify I/O bottlenecks by using sp_sysmon, and resolutions:-
The following sp_sysmon reports help to identify I/O problems:
1) Lock Management report
* Last Page Locks on Heaps
2) Task Management report
* Task context switches due to "I/O Device Contention"
3) Disk I/O Management report
* Max Outstanding I/Os
* I/Os Delayed by
* Completed Disk I/Os
* Device Activity details
-------------->>
1) Lock Management report
* Last page locks on Heaps
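Last Page Locks on Heaps         per sec      per xact       count  % of total
-------------------------   ------------  ------------  ----------  ----------
  Granted                         1617.8          23.5       32356      99.9 %
  Waited                             1.4           0.0          28       0.1 %
-------------------------   ------------  ------------  ----------  ----------
Total Last Pg Locks               1619.2          23.5       32384     100.0 %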
2) Task Management report
* Task context switches due to "I/O Device Contention"
Task Context Switches Due To:    per sec      per xact       count  % of total
  Voluntary Yields                  49.5           0.7         989       7.4 %
  Cache Search Misses                1.7           0.0          34       0.3 %
  System Disk Writes                 2.1           0.0          42       0.3 %
  I/O Pacing                        37.4           0.5         747       5.6 %
  Logical Lock Contention            2.5           0.0          50       0.4 %
  Address Lock Contention            0.0           0.0           0       0.0 %
  Latch Contention                   0.0           0.0           0       0.0 %
  Log Semaphore Contention           0.8           0.0          15       0.1 %
  PLC Lock Contention                0.3           0.0           5       0.0 %
  Group Commit Sleeps                3.2           0.0          64       0.5 %
  Last Log Page Writes              46.3           0.7         926       6.9 %
  Modify Conflicts                   3.3           0.0          66       0.5 %
  I/O Device Contention              0.0           0.0           0      10.0 %
I/O Device Contention means the task was waiting for a semaphore for a particular device. It happens with mirroring and on SMP servers. In layman's terms: if two ASE engines try to get an I/O structure for the same device at the same time, one of them has to wait until the other task has completed.
In this case the resolution is to add some devices and move some tables and indexes onto them.
-------->
3) Disk I/O Management report
* Max Outstanding I/Os
* I/Os Delayed by
* Completed Disk I/Os
* Device Activity details
Disk I/O Management
-------------------
Max Outstanding I/Os             per sec      per xact       count  % of total
-------------------------   ------------  ------------  ----------  ----------
  Server                             n/a           n/a          16       n/a
  Engine 0                           n/a           n/a          14       n/a
  Engine 1                           n/a           n/a          14       n/a
Max Outstanding I/Os: the count column reports the maximum number of I/Os pending for ASE as a whole and for each engine during the sample interval.
Compare the count column across different runs. If it is not coming down, change the I/O-related configuration in ASE and the corresponding OS parameters.
** It is the most useful figure for identifying an I/O bottleneck.
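The run-to-run comparison described above can be sketched in Python. This is only an illustration: the function name and the second run's numbers are made up, and the counts are assumed to have been copied by hand from the "count" column of each sp_sysmon report.

```python
def outstanding_io_trend(runs):
    """Return the labels whose Max Outstanding I/Os count never came down.

    runs: list of dicts in chronological order, each mapping a label
    ("Server", "Engine 0", ...) to the sp_sysmon count column.
    """
    if len(runs) < 2:
        return []
    stuck = []
    for label in runs[0]:
        counts = [run[label] for run in runs if label in run]
        never_dropped = all(b >= a for a, b in zip(counts, counts[1:]))
        if len(counts) >= 2 and never_dropped:
            stuck.append(label)
    return stuck


runs = [
    {"Server": 16, "Engine 0": 14, "Engine 1": 14},  # sample report above
    {"Server": 18, "Engine 0": 15, "Engine 1": 12},  # hypothetical later run
]
print(outstanding_io_trend(runs))  # ['Server', 'Engine 0']
```

Labels returned by the function are the ones worth a closer look before changing any ASE or OS setting.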
I/Os Delayed by                  per sec      per xact       count  % of total
  Disk I/O Structures                n/a           n/a           0       n/a
  Server Config Limit                n/a           n/a           0       n/a
  Engine Config Limit                n/a           n/a           0       n/a
  Operating System Limit             n/a           n/a           0       n/a
ASE delays an I/O when it runs out of disk I/O control resources before it can initiate the I/O request. If any count is > 0, the matching limit has to be raised. "Disk I/O Structures" is an ASE parameter; the rest also involve OS-level settings.
sp_configure "disk i/o structures"
Server Config Limit and Engine Config Limit relate to the ASE parameters "max async i/os per server" and "max async i/os per engine"; Operating System Limit means the OS is restricting the async I/O limits per system or per process. Reconfigure the async I/O (aio) settings at the OS level and configure the ASE level accordingly, for example:
sp_configure "max async i/os per server"
max async i/os per engine:
Specifies the maximum number of outstanding asynchronous disk I/O requests for a single engine at one time. On Linux, the maximum value for 'max async i/os per engine' is given in /proc/sys/fs/aio-max-nr. If the configured value exceeds the maximum allowed value, the io_setup() call fails with EAGAIN.
max async i/os per server:
Specifies the maximum number of asynchronous disk I/O requests that can be outstanding for Adaptive Server at one time. This limit is not affected by the number of online engines: max async i/os per server limits the total number of asynchronous I/Os a server can issue at one time, regardless of how many online engines it has, while max async i/os per engine limits the number of outstanding I/Os per engine.
http://www.sybase.com/files/White_Papers/Sybase_ASE_on_Linux_wp.pdf
Asynchronous I/O (AIO):- It allows a process to initiate a number of I/O operations without having to block or wait for any of them to complete. At some later time, or after being notified of I/O completion, the process can retrieve the results of the I/O.
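As a rough pre-check before changing 'max async i/os per engine' on Linux, the kernel limit mentioned above can be read and compared with the planned value. This is a minimal sketch: the helper names and the planned value of 4096 are assumptions for illustration, not values recommended by Sybase.

```python
from pathlib import Path

AIO_MAX_NR = Path("/proc/sys/fs/aio-max-nr")  # Linux only


def read_aio_max_nr(path=AIO_MAX_NR):
    """Return the kernel AIO limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def fits_aio_limit(requested, aio_max_nr):
    """True if the planned ASE value is within the kernel limit."""
    return aio_max_nr is not None and requested <= aio_max_nr


limit = read_aio_max_nr()
planned = 4096  # hypothetical 'max async i/os per engine' value
print("aio-max-nr:", limit, "fits:", fits_aio_limit(planned, limit))
```

If the check fails, the OS limit would have to be raised first, otherwise io_setup() can be expected to fail with EAGAIN as described above.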
** In the "Completed Disk I/O's" sub-report, "Total Requested Disk I/Os" and "Total Completed I/Os" should be equal, or very nearly so.
Total Requested Disk I/Os          118.0           1.7        2359
Completed Disk I/O's
  Engine 0                          52.5           0.8        1049      44.4 %
  Engine 1                          65.7           1.0        1313      55.6 %
-------------------------   ------------  ------------  ----------
Total Completed I/Os               118.1           1.7        2362
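The requested-versus-completed comparison can be sketched as a small calculation. The function name is made up for illustration; the numbers are the count values from the report above.

```python
def io_completion_gap(requested, completed_per_engine):
    """Compare Total Requested Disk I/Os with the sum of completed I/Os.

    Returns (total_completed, gap, gap_percent_of_requested).
    """
    total_completed = sum(completed_per_engine.values())
    gap = total_completed - requested
    pct = abs(gap) / requested * 100 if requested else 0.0
    return total_completed, gap, pct


total, gap, pct = io_completion_gap(2359, {"Engine 0": 1049, "Engine 1": 1313})
print(total, gap)  # 2362 3
print(f"gap is {pct:.2f} % of requested I/Os")
```

A gap of a few I/Os is expected because requests issued just before the interval can complete inside it. A large or growing gap suggests I/Os are queuing up.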
==============================================
Useful MDA tables to find disk I/O contention:-
-- Per-device counters for the data devices (logical names starting with 'data')
select Reads, APFReads, Writes, DevSemaphoreRequests, DevSemaphoreWaits, IOTime,
       substring(LogicalName, 1, 30) LogicalName, substring(PhysicalName, 1, 30) PhysicalName
from master..monDeviceIO
where LogicalName like 'data%'

-- Average time per I/O over all data devices
select 'Data Segments', sum(IOTime) 'Total IOTime', sum(Reads + APFReads + Writes) 'Total R+APFr+W', sum(IOTime)/sum(Reads + APFReads + Writes) 'ms/IO'
from master..monDeviceIO
where LogicalName like 'data%'

-- Per-device counters for the log devices (logical names starting with 'log')
select Reads, APFReads, Writes, DevSemaphoreRequests, DevSemaphoreWaits, IOTime,
       substring(LogicalName, 1, 30) LogicalName, substring(PhysicalName, 1, 30) PhysicalName
from master..monDeviceIO
where LogicalName like 'log%'

-- Average time per I/O over all log devices
select 'Log Segments', sum(IOTime) 'Total IOTime', sum(Reads + APFReads + Writes) 'Total R+APFr+W', sum(IOTime)/sum(Reads + APFReads + Writes) 'ms/IO'
from master..monDeviceIO
where LogicalName like 'log%'
go
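The monDeviceIO counters are cumulative, so the ms/IO figure is usually more telling when computed over an interval. The sketch below assumes two samples of the same device row taken some time apart; the function name and the sample numbers are made up for illustration.

```python
def ms_per_io(before, after):
    """Average I/O time in ms between two monDeviceIO samples of one device.

    Each sample is a dict with Reads, APFReads, Writes and IOTime.
    Returns None when no I/O happened in the interval.
    """
    ios = sum(after[k] - before[k] for k in ("Reads", "APFReads", "Writes"))
    io_time = after["IOTime"] - before["IOTime"]
    return io_time / ios if ios > 0 else None


# Hypothetical samples, not taken from a real server
before = {"Reads": 1000, "APFReads": 200, "Writes": 800, "IOTime": 10000}
after = {"Reads": 1500, "APFReads": 300, "Writes": 1200, "IOTime": 15000}
print(ms_per_io(before, after))  # 5.0
```

The same idea applies to DevSemaphoreWaits versus DevSemaphoreRequests: the ratio over an interval shows how often tasks had to wait for the device semaphore.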
disk i/o structures:-
--------------------
Here is current advice from ASE Product Support Engineering (example for 5 engines). A general rule of thumb is to allow 400 per engine and then monitor with sp_sysmon. If the Max Outstanding I/Os per engine gets over about 60%, then "disk i/o structures" should be increased. The same is true if I/Os Delayed By: Disk I/O Structures is non-zero. In this case, with 5 engines, we would recommend setting "disk i/o structures" to 5 x 400 = 2000, rounded up to 2048.
There should be no negative side effect of changing "disk i/o structures" from 512 to 2048 in the server. ASE will use a small amount of additional memory when it boots -- a couple hundred bytes each, so around half a MB or less for the increase from 512 to 2048. This is a relatively low cost to prevent a constriction on throughput.
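The arithmetic behind this rule of thumb can be written down as a small sketch. Two things here are assumptions rather than part of the advice: rounding up to the next power of two (the text only says 2000 is rounded up to 2048), and the figure of 200 bytes per structure (the text says "a couple hundred bytes").

```python
def recommended_disk_io_structures(engines, per_engine=400):
    """engines x 400, rounded up to the next power of two (assumed rule)."""
    raw = engines * per_engine
    value = 1
    while value < raw:
        value *= 2
    return value


def extra_memory_bytes(old, new, bytes_each=200):
    """Rough extra memory for the increase; bytes_each is an assumption."""
    return max(new - old, 0) * bytes_each


new_value = recommended_disk_io_structures(5)
print(new_value)                           # 2048
print(extra_memory_bytes(512, new_value))  # 307200 bytes, about 300 KB
```

The resulting value would then be applied with sp_configure "disk i/o structures" and checked again with sp_sysmon.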
===========================================
sp_sysmon shows very high numbers for I/Os delayed by disk i/o structures:-
---------------------------------------------------------------------------------------------
In this case: 1) Increase "disk i/o structures" at the Sybase ASE level, along with the corresponding OS parameter. 2) sp_sysmon also showed a significant number of samples where the cache spinlock contention exceeded 10%. Adding some additional cache partitions also helped speed up the pending requested I/Os.
=========================================
io scheduler:-
Does Sybase make any recommendation or have any guidelines on the Linux I/O scheduler with ASE?
The write-up below would suggest that the deadline scheduler is more appropriate for ASE than the default CFQ scheduler.
Flexible I/O scheduler:
----------------------------------------------------------
The new I/O scheduler allows administrators to tune the server to match its usage with four (exclusive) I/O behavior policies:
1. CFQ (default scheduler): Completely Fair Queuing is suitable for a wide variety of applications, especially desktop and multimedia workloads. It is the default I/O scheduler. CFQ treats all competing processes equally by assigning each process a unique request queue and giving each queue equal bandwidth.
2. Deadline: The deadline I/O scheduler implements a per-request service deadline to ensure that no requests are neglected, thus providing excellent request latency while maintaining good disk throughput. Deadline policy is best for disk-intensive database applications.
3. Anticipatory: The anticipatory I/O scheduler uses the deadline mechanism plus an anticipation heuristic to predict the actions of applications. This provides greater disk throughput but slightly increases latency. The anticipation heuristic is suitable for file servers but does not work as well for database workloads.
4. No-Op: This no-operation mode does no sorting at all and is used only for disks that perform their own scheduling or that are randomly accessible. The first three behaviors group and merge requests to maximize request sizes, cutting down on the number of searches performed.
----------------------------------------------------------
From experience it seems that CFQ (the default setting) or Deadline performs best with ASE. However, as with most things which can be tuned, the 'best' Linux scheduler setting depends upon the I/O load, disk performance, the types of I/O being done, and the sizes of the I/Os.
As per the TechWave 2011 document released by SAP:
Storage       Method
SAN           noop
SSD           noop
local disk    deadline
setting of scheduler:-
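On Linux the active scheduler of a block device is exposed in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. The sketch below reads and parses that file and shows how a new value could be written. The helper names and the device name "sda" are assumptions for illustration. Writing needs root, does not survive a reboot, and newer kernels may offer different scheduler names than the ones listed above.

```python
from pathlib import Path


def scheduler_path(device):
    """sysfs file holding the scheduler list for a block device."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def parse_scheduler(text):
    """Return (active, available) from a line like 'noop deadline [cfq]'."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def set_scheduler(device, name):
    """Switch the scheduler at runtime (root only, not persistent)."""
    scheduler_path(device).write_text(name)


print(parse_scheduler("noop deadline [cfq]"))
# ('cfq', ['noop', 'deadline', 'cfq'])

# Hypothetical usage on a real host, as root:
#   set_scheduler("sda", "deadline")
```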
=========================================
semaphore:-
Semaphores can be thought of as simple counters that indicate the status of a resource. This counter is a protected variable and cannot be accessed by the user directly. The shield around this variable is provided by none other than the kernel. The usage of this semaphore variable is simple: if the counter is greater than 0, then the resource is available, and if the counter is 0 or less, then that resource is busy or being used by someone else. This simple mechanism helps in synchronizing multithreaded and multiprocess applications. Semaphores were invented and proposed by Edsger Dijkstra, and are still used in operating systems today for synchronization purposes. The same mechanism is now available to application developers too. It's one of the most important aspects of interprocess communication.
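The counter behaviour can be illustrated with Python's threading.Semaphore. This is a user-space analogy, not the ASE device semaphore itself: six hypothetical tasks compete for a "device" that allows two concurrent holders, and the rest wait until the counter goes above 0 again.

```python
import threading
import time

device_slots = threading.Semaphore(2)  # counter starts at 2
lock = threading.Lock()                # protects the bookkeeping below
active = 0
peak = 0


def do_io(task_id):
    """Take a slot, pretend to do I/O, then release the slot."""
    global active, peak
    with device_slots:        # blocks while the counter is 0
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)      # simulated I/O
        with lock:
            active -= 1


threads = [threading.Thread(target=do_io, args=(i,)) for i in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent holders:", peak)  # never more than 2
```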
========================================