2011-09-16

filesystemio_options and disk_asynch_io

 Asynchronous I/O Support on OCFS/OCFS2 and Related Settings: filesystemio_options, disk_asynch_io [ID 432854.1]

Applies to:

Linux Kernel - Version: 2.4.9 to 2.6.18-128.1.10.0.1
OCFS - Version: 1.0.14 to 1.0.15
OCFS2 - Version: 1.2.0-1 to 1.4.2-1
Linux x86
Linux x86-64

*** Checked for relevance on 27-Dec-2010 ***

Purpose

The options concerning Direct I/O and Asynchronous I/O, and the parameters filesystemio_options and disk_asynch_io, are often confusing to customers, and even to some engineers.

This article tries to clarify the difference among these concepts and options.

Scope and Application

This article first introduces some concepts about I/O on the Linux platform, and then focuses on the two parameters: filesystemio_options and disk_asynch_io.

The introduction to AIO and all the related pictures come from the article "Boost Application Performance using Asynchronous I/O" by M. Tim Jones. At the time this document was written (June 7, 2007), no copyright violation applies to this material as long as it is referenced explicitly.

Asynchronous I/O Support on OCFS/OCFS2 and Related Settings: filesystemio_options, disk_asynch_io

1. Introduction to AIO

Linux asynchronous I/O is a relatively recent addition to the Linux kernel. It's a standard feature of the 2.6 kernel, but you can find patches for 2.4. The basic idea behind AIO is to allow a process to initiate a number of I/O operations without having to block or wait for any to complete. At some later time, or after being notified of I/O completion, the process can retrieve the results of the I/O.

2. I/O Models

Let's explore the different I/O models that are available under Linux. This isn't intended as an exhaustive review, but rather aims to cover the most common models to illustrate their differences from asynchronous I/O. Figure 1 shows synchronous and asynchronous models, as well as blocking and non-blocking models.

Figure 1. Simplified matrix of basic Linux I/O models


Each of these I/O models has usage patterns that are advantageous for particular applications. This section briefly explores each one.

- Synchronous Blocking I/O
One of the most common models is the synchronous blocking I/O model. In this model, the user-space application performs a system call that results in the application blocking. This means that the application waits until the system call is complete (data transferred or error). The calling application is in a state where it consumes no CPU and simply awaits the response, so it is efficient from a processing perspective.

I/O-bound versus CPU-bound processes

A process that is I/O bound is one that performs more I/O than processing. A CPU-bound process does more processing than I/O. The Linux 2.6 scheduler actually favors I/O-bound processes because they commonly initiate an I/O and then block, which means other work can be efficiently interlaced between them.

Figure 2 illustrates the traditional blocking I/O model, which is also the most common model used in applications today. Its behaviors are well understood, and its usage is efficient for typical applications. When the read system call is invoked, the application waits and the context switches to the kernel. The read is then initiated, and when the response returns (from the device from which you're reading), the data is moved to the user-space buffer. Then the application is unblocked (and the read call returns).

Figure 2. Typical flow of the synchronous blocking I/O model



From the application's perspective, the read call spans a long duration. But, in fact, the application is actually blocked while the read is multiplexed with other work in the kernel.
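
To make the model concrete, here is a minimal sketch in C (not part of the original article) of a synchronous blocking read; /etc/hostname is just an arbitrary readable file:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        int fd = open("/etc/hostname", O_RDONLY);   /* any readable file */
        if (fd < 0) { perror("open"); return 1; }

        /* The process sleeps inside read() until the data has been
         * copied into buf (or an error occurs). */
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) { perror("read"); close(fd); return 1; }

        printf("read %zd bytes\n", n);
        close(fd);
        return 0;
    }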

- Synchronous Non-blocking I/O

A less efficient variant of synchronous blocking is synchronous non-blocking I/O. In this model, a device is opened as non-blocking. This means that instead of completing an I/O immediately, a read may return an error code indicating that the command could not be immediately satisfied (EAGAIN or EWOULDBLOCK), as shown in Figure 3.

Figure 3. Typical flow of the synchronous non-blocking I/O model



The implication of non-blocking is that an I/O command may not be satisfied immediately, requiring that the application make numerous calls to await completion. This can be extremely inefficient because in many cases the application must busy-wait until the data is available or attempt to do other work while the command is performed in the kernel. As also shown in Figure 3, this method can introduce latency in the I/O because any gap between the data becoming available in the kernel and the user calling read to return it can reduce the overall data throughput.
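
A minimal sketch of this model in C (again not from the original article). Stdin is used as the descriptor here because regular files never return EAGAIN; only devices, pipes and sockets do:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[256];

        /* Put stdin into non-blocking mode. */
        int flags = fcntl(STDIN_FILENO, F_GETFL, 0);
        fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK);

        for (;;) {
            ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
            if (n >= 0) {
                printf("read %zd bytes\n", n);
                break;
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                /* No data yet: busy-wait (or do other work) and retry. */
                usleep(100000);
                continue;
            }
            perror("read");
            return 1;
        }
        return 0;
    }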

- Asynchronous Blocking I/O

Another blocking paradigm is non-blocking I/O with blocking notifications. In this model, non-blocking I/O is configured, and then the blocking select system call is used to determine when there's any activity for an I/O descriptor. What makes the select call interesting is that it can be used to provide notification for not just one descriptor, but many. For each descriptor, you can request notification of the descriptor's ability to write data, availability of read data, and also whether an error has occurred.

Figure 4. Typical flow of the asynchronous blocking I/O model (select)




The primary issue with the select call is that it's not very efficient. While it's a convenient model for asynchronous notification, its use for high-performance I/O is not advised.
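
A minimal sketch of the select-based model in C. For brevity only one descriptor (stdin) is registered, although the strength of select is watching many descriptors at once:

    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    int main(void)
    {
        fd_set readfds;
        char buf[256];

        FD_ZERO(&readfds);
        FD_SET(STDIN_FILENO, &readfds);

        /* Block in select() until stdin becomes readable. */
        if (select(STDIN_FILENO + 1, &readfds, NULL, NULL, NULL) < 0) {
            perror("select");
            return 1;
        }

        if (FD_ISSET(STDIN_FILENO, &readfds)) {
            ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
            printf("read %zd bytes\n", n);
        }
        return 0;
    }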

- Asynchronous Non-blocking I/O (AIO)

Finally, the asynchronous non-blocking I/O model is one of overlapping processing with I/O. The read request returns immediately, indicating that the read was successfully initiated. The application can then perform other processing while the background read operation completes. When the read response arrives, a signal or a thread-based callback can be generated to complete the I/O transaction.

Figure 5. Typical flow of the asynchronous non-blocking I/O model



The ability to overlap computation and I/O processing in a single process for potentially multiple I/O requests takes advantage of the gap between processing speed and I/O speed. While one or more slow I/O requests are pending, the CPU can perform other tasks or, more commonly, operate on already completed I/Os while other I/Os are initiated.
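
A minimal sketch of this model using the POSIX AIO interface (the librt flavour discussed in section 4 below; link with -lrt). The file name is arbitrary, and a real application would do useful work instead of spinning on aio_error():

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        struct aiocb cb;

        int fd = open("/etc/hostname", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof(buf);
        cb.aio_offset = 0;

        /* aio_read() returns immediately; the read proceeds in the
         * background while the process is free to do other work. */
        if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

        while (aio_error(&cb) == EINPROGRESS) {
            /* ... overlap other processing here ... */
        }

        printf("read %zd bytes\n", aio_return(&cb));
        close(fd);
        return 0;
    }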

- Motivation for Asynchronous I/O

From the previous taxonomy of I/O models, you can see the motivation for AIO. The blocking models require the initiating application to block when the I/O has started. This means that it isn't possible to overlap processing and I/O at the same time. The synchronous non-blocking model allows overlap of processing and I/O, but it requires that the application check the status of the I/O on a recurring basis. This leaves asynchronous non-blocking I/O, which permits overlap of processing and I/O, including notification of I/O completion.

The functionality provided by the select function (asynchronous blocking I/O) is similar to AIO, except that it still blocks. However, it blocks on notifications instead of the I/O call.

3. What do filesystemio_options and disk_asynch_io Do?

disk_asynch_io is a master switch: it turns Async I/O to database files on or off for any type of storage, whether raw device or filesystem. The filesystemio_options parameter gives finer control over I/O to database files on filesystems: it allows you to turn off async I/O for filesystem files while keeping async I/O to raw devices, provided the "master" switch disk_asynch_io is set to true.

The instance initialization parameter filesystemio_options has four possible values (see the sketch below for what they imply at the Linux system-call level):
1. "asynch"   : buffered I/O + Async I/O
2. "directIO" : Direct I/O only
3. "setall"   : Direct I/O + Async I/O
4. "none"     : disables both Async I/O and Direct I/O

One should always use at least Direct I/O with OCFS/OCFS2. In fact, there is no choice: the database automatically adds that mode whenever it sees that a file is on an OCFS volume, or on an OCFS2 volume mounted with the datavolume mount option.
If you want Async I/O with OCFS/OCFS2, use setall.
If you want Async I/O with ASM/ASMLib, set disk_asynch_io=true; ASM bypasses the filesystem layer in that case, so ASM I/O is controlled entirely by the DISK_ASYNCH_IO parameter. Async I/O is enabled or disabled by setting disk_asynch_io to TRUE or FALSE; see Note 413389.1.
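
As a rough illustration only, here is a sketch of what the four values imply at the Linux system-call level; this is not Oracle's actual code, just the combinations of open() flags and I/O calls they correspond to:

    #define _GNU_SOURCE             /* for O_DIRECT */
    #include <fcntl.h>

    /*
     *   none     : open(path, O_RDWR)           + synchronous pread()/pwrite()
     *   directIO : open(path, O_RDWR|O_DIRECT)  + synchronous pread()/pwrite()
     *   asynch   : open(path, O_RDWR)           + io_submit()/io_getevents()
     *   setall   : open(path, O_RDWR|O_DIRECT)  + io_submit()/io_getevents()
     */
    int open_datafile(const char *path, int want_direct)
    {
        int flags = O_RDWR;
        if (want_direct)
            flags |= O_DIRECT;      /* bypass the filesystem buffer cache */
        return open(path, flags);
    }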

4. O_DIRECT and Async I/O

O_DIRECT is a flag passed to open() to tell the kernel that the calling application wants to bypass the filesystem buffer cache, transferring data directly between its user-space buffers and the storage device.
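
A minimal sketch of an O_DIRECT read in C; the file name "testfile" is hypothetical. Note that O_DIRECT requires suitably aligned buffers and transfer sizes (typically multiples of the logical block size):

    #define _GNU_SOURCE             /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        void *buf;

        /* The data goes straight between this buffer and the device,
         * so the buffer must be aligned (512 bytes here). */
        if (posix_memalign(&buf, 512, 4096) != 0) return 1;

        int fd = open("testfile", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open(O_DIRECT)"); return 1; }

        ssize_t n = read(fd, buf, 4096);
        printf("direct read: %zd bytes\n", n);

        close(fd);
        free(buf);
        return 0;
    }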

Async I/O is a totally different aspect of doing I/O; it is implemented with an OS library, namely libaio or librt.

"libaio" is used by Oracle database, and it's a Linux-native asynchronous I/O facility, you can see Note 225751.1 (Question 11) for more information.

"librt" contains the the POSIX.1b standard asynchronous I/O functionality which we call it AIO. The implementation of these functions can be done using support in the kernel (if available) or using an implementation based on threads at userlevel. In the latter case it might be necessary to link applications with the thread library libpthread in addition to librt.

"rtkaio" is provided by SuSE and Red Hat via glibc which allows users to make use of the kernel's asynchronous I/O interfaces. This additional library offers a higher performing alternative to the existing user-space only asynchronous I/O support provided by "librt".

Keep in mind that O_DIRECT is about the filesystem buffer cache, while Async I/O is about blocking versus non-blocking behavior. There is, however, a relationship between these two concepts:

According to Red Hat Enterprise Linux AS 4 Release Notes:

Asynchronous I/O (AIO) on file systems is currently only supported in O_DIRECT, or non-buffered mode. Also note that the asynchronous poll interface is no longer present, and that AIO on pipes is no longer supported.

In other words, if you (or a customer) want to use OCFS (even version 1.0.14 or later) with Async I/O, Direct I/O must be enabled. In fact, any version of OCFS requires Direct I/O.
According to the AIO SourceForge page:

Support for kernel AIO has been included in the 2.6 Linux kernel.

What Works?

  •  AIO read and write on raw (and O_DIRECT on blockdev)
  •  AIO read and write on files opened with O_DIRECT on ext2, ext3, jfs, xfs
What Does Not Work?
  •  AIO read and write on files opened without O_DIRECT (i.e. normal buffered filesystem AIO). On ext2, ext3, jfs, xfs and nfs, these do not return an explicit error, but quietly default to synchronous, or rather non-AIO, behaviour (i.e. io_submit waits for I/O to complete in these cases). For most other filesystems, -EINVAL is reported.
  •  AIO fsync (not supported for any filesystem)
  •  AIO read and write on sockets (doesn't return an explicit error, but quietly defaults to synchronous or rather non-AIO behavior)
  •  AIO read and write on pipes (reports -EINVAL)
  •  Not all devices (including TTY) support AIO (typically return -EINVAL)

Additional functionality is available through separate patches:
  • Buffered filesystem AIO, i.e. AIO read and write on files opened without O_DIRECT, on ext2, ext3, jfs, xfs, reiserfs.
  • Does not include AIO fsync
  • AIO read and write on pipes (from Chris Mason)


The kernel patches done by Suparna Bhattacharya for Async I/O are different from the AIO in the mainline kernel. With the AIO implementation in the mainline kernel, O_DIRECT is required for the asynchronous I/O code path to be executed. Therefore, on RHEL 3, 4, 5 and SLES 10 (which use the mainline kernel AIO implementation), Async I/O is only enabled along with O_DIRECT. SLES 8 and 9 actually ship Suparna Bhattacharya's asynchronous I/O patches instead of the mainline kernel AIO, so there Async I/O is available even without O_DIRECT.
(Based on an e-mail from Chris Mason)

Therefore you need to enable Direct I/O if you use Async I/O on an OCFS2 volume or another filesystem on RHEL and SLES 10. Namely, for the Oracle RDBMS:

filesystemio_options = setall
 
