How does fio issue I/O to a file? It does so through one of the following I/O engines:
**sync**
Uses read(), write(), and lseek() to read, write, and position the I/O.
**psync**
Uses pread() and pwrite() for reads and writes. Supported on all systems except Windows.
With a separate lseek followed by read (or write), the kernel may suspend the process between the two calls, which creates a synchronization problem. Calling pread is equivalent to calling lseek and then read, and calling pwrite is equivalent to calling lseek and then write, but in both cases the two steps are bundled into a single atomic operation.
The positioning and the transfer in these two calls cannot be separated by an interruption, and neither call updates the file offset.
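The claim that pread() and pwrite() leave the file offset alone can be checked with a short sketch. It uses Python's `os.pread` and `os.pwrite` wrappers on a temporary file; the function name `positioned_io_demo` is made up for illustration and is not part of fio.

```python
import os
import tempfile


def positioned_io_demo():
    """Return (data_read, offset_before, offset_after) for a pwrite/pread pair."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")            # plain write(): offset moves to 10
        before = os.lseek(fd, 0, os.SEEK_CUR)  # query the current offset
        os.pwrite(fd, b"AB", 4)                # positioned write at offset 4
        data = os.pread(fd, 4, 3)              # positioned read: 4 bytes from offset 3
        after = os.lseek(fd, 0, os.SEEK_CUR)   # offset is still where write() left it
        return data, before, after
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(positioned_io_demo())
```

Because the offset is passed with each call, several threads can share one file descriptor without racing on a shared file position.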
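**vsync**
Emulates queuing by coalescing I/Os, keeping the number of submissions as low as possible.
readv/writev: read or write multiple non-contiguous buffers in a single call. The buffers are described by an array of iovec structures, which reduces the number of system calls.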
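As a rough illustration of vectored I/O, the sketch below writes three separate buffers with one `os.writev` call and reads them back into three pre-sized buffers with one `os.readv` call. The function name `vectored_io_demo` is hypothetical; fio itself does this in C with iovec arrays.

```python
import os
import tempfile


def vectored_io_demo():
    """Return (bytes_written, bytes_read, buffers) for one writev/readv pair."""
    fd, path = tempfile.mkstemp()
    try:
        # One system call writes three non-contiguous buffers back to back.
        written = os.writev(fd, [b"head", b"-body-", b"tail"])
        os.lseek(fd, 0, os.SEEK_SET)
        # One system call fills three separate buffers in order.
        bufs = [bytearray(4), bytearray(6), bytearray(4)]
        read = os.readv(fd, bufs)
        return written, read, [bytes(b) for b in bufs]
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(vectored_io_demo())
```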
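**pvsync**
The pwritev() system call combines the functionality of writev() and pwrite(). It performs the same task as writev() but adds a fourth argument, offset, which specifies the file offset at which the output operation is performed. These system calls do not change the file offset. In other words: atomic positioning and transfer, plus I/O coalescing.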
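The combination described above can be sketched with Python's `os.pwritev` and `os.preadv` (available on Linux with Python 3.7 or later). The helper name `positioned_vectored_demo` and the sample data are assumptions made for the example.

```python
import os
import tempfile


def positioned_vectored_demo():
    """Return (bytes_read, first_buf, second_buf, file_offset)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"." * 16)                 # fill the file; offset is now 16
        os.pwritev(fd, [b"AB", b"CD"], 4)       # two buffers written at offset 4
        bufs = [bytearray(3), bytearray(3)]
        n = os.preadv(fd, bufs, 3)              # two buffers read from offset 3
        offset = os.lseek(fd, 0, os.SEEK_CUR)   # unchanged by pwritev/preadv
        return n, bytes(bufs[0]), bytes(bufs[1]), offset
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(positioned_vectored_demo())
```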
**pvsync2**
Similar to pvsync, with a fifth argument that sets flags for the individual operation, for example:
RWF_DSYNC (since Linux 4.7) Provide a per-write equivalent of the O_DSYNC open(2) flag. This flag is meaningful only for pwritev2(), and its effect applies only to the data range written by the system call. (O_DSYNC: each write waits for the physical I/O to complete, but does not wait for file attribute updates when they do not affect reading back the data just written.)
RWF_HIPRI (since Linux 4.6) High priority read/write. Allows block-based filesystems to use polling of the device, which provides lower latency, but may use additional resources. (Currently, this feature is usable only on a file descriptor opened using the O_DIRECT flag.)
RWF_SYNC (since Linux 4.7) Provide a per-write equivalent of the O_SYNC open(2) flag. This flag is meaningful only for pwritev2(), and its effect applies only to the data range written by the system call. (O_SYNC: each write waits for the physical I/O to complete, including the file attribute updates caused by the write.)
RWF_NOWAIT (since Linux 4.14) Do not wait for data which is not immediately available. If this flag is specified, the preadv2() system call will return instantly if it would have to read data from the backing storage or wait for a lock. If some data was successfully read, it will return the number of bytes read. If no bytes were read, it will return -1 and set errno to EAGAIN. Currently, this flag is meaningful only for preadv2().
RWF_APPEND (since Linux 4.16) Provide a per-write equivalent of the O_APPEND open(2) flag. This flag is meaningful only for pwritev2(), and its effect applies only to the data range written by the system call. The offset argument does not affect the write operation; the data is always appended to the end of the file. However, if the offset argument is -1, the current file offset is updated.
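A minimal sketch of a per-call flag, assuming a Linux kernel and Python build that expose `os.RWF_DSYNC`. The helper `durable_pwritev` is a hypothetical name, and its fallback to a plain write followed by `os.fdatasync` is an illustrative choice, not something fio does.

```python
import os
import tempfile


def durable_pwritev(fd, buffers, offset):
    """Write buffers at offset, asking for O_DSYNC-like behaviour for this call only."""
    flags = getattr(os, "RWF_DSYNC", 0)   # 0 if this platform does not define the flag
    try:
        return os.pwritev(fd, buffers, offset, flags)
    except (OSError, NotImplementedError):
        # Assumption: the kernel or filesystem rejected the flag.
        # Fall back to an unflagged write plus an explicit data sync.
        written = os.pwritev(fd, buffers, offset)
        os.fdatasync(fd)
        return written


def pvsync2_demo():
    """Return (bytes_written, data_read_back)."""
    fd, path = tempfile.mkstemp()
    try:
        written = durable_pwritev(fd, [b"fio-", b"data"], 0)
        return written, os.pread(fd, 8, 0)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(pvsync2_demo())
```

The point of the fifth argument is that durability (or polling, or append behaviour) can be requested for one operation without reopening the file with different open(2) flags.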
**io_uring**
An upgraded successor to Linux native AIO that is both easy to use and efficient. Supported since Linux kernel 5.1.
**libaio**
Linux native asynchronous, non-blocking I/O, available since the 2.6 kernel. This ioengine defines engine specific options.
**posixaio**
The standard asynchronous I/O interface defined by the POSIX 1003.1b real-time extensions: aio_read, aio_write, aio_fsync, aio_cancel, aio_error, aio_return, aio_suspend, and lio_listio. This set of APIs is used to perform asynchronous I/O.
**solarisaio**
Uses the native asynchronous I/O interface of Solaris.
**windowsaio**
Uses the native I/O interface of Windows.
**mmap**
The file is memory-mapped into user space, and data is written and read with memcpy.
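The idea behind the mmap engine can be sketched with Python's `mmap` module, where slice assignment plays the role of memcpy into the mapping. The function name `mmap_io_demo` is made up for the example.

```python
import mmap
import os
import tempfile


def mmap_io_demo():
    """Return (bytes_seen_through_mapping, bytes_seen_through_pread)."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)            # a mapping needs a non-empty file
        with mmap.mmap(fd, mmap.PAGESIZE) as m:    # shared read/write mapping
            m[0:5] = b"hello"                      # "write": copy bytes into the mapping
            m.flush()                              # push dirty pages to the file
            from_map = bytes(m[0:5])               # "read": copy bytes out of the mapping
        from_file = os.pread(fd, 5, 0)             # the file now holds the same bytes
        return from_map, from_file
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(mmap_io_demo())
```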
**splice**
Uses splice and vmsplice to transfer data between user space and the kernel.
**sg**
SCSI generic sg v3 I/O. It can either be synchronous, using the SG_IO ioctl, or, if the target is an sg character device, use read and write to perform asynchronous I/O.
**null**
Transfers no data and only pretends to. Mainly used to exercise fio itself, or for basic debugging and testing.
**net**
Transfers data over the network to the given host:port. Depending on the protocol used, the hostname, port, listen, and filename options specify what kind of connection to make, while the protocol option determines which protocol is used.
**netsplice**
Like net, but uses splice/vmsplice to map data and to send/receive it.
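**cpuio**
Transfers no data, but burns CPU cycles according to the cpuload= and cpuchunks= options (older fio documentation calls the latter cpucycle=). For example, cpuload=85 makes the job do no real I/O but consume 85% of the CPU cycles. On SMP machines, use numjobs=<no_of_cpu> to load the desired number of CPUs, because cpuload only loads a single CPU at the requested percentage.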
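**guasi**
The GUASI I/O engine uses the Generic Userspace Asynchronous Syscall Interface for asynchronous I/O.
**rdma**
The RDMA I/O engine supports both RDMA memory semantics (RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the InfiniBand, RoCE, and iWARP protocols.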
**falloc**
I/O engine that does regular fallocate to simulate data transfer as a fio ioengine.
DDIR_READ: does fallocate(,mode = FALLOC_FL_KEEP_SIZE,).
DDIR_WRITE: does fallocate(,mode = 0).
DDIR_TRIM: does fallocate(,mode = FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
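Only the DDIR_WRITE case (mode = 0) is easy to show from Python, through `os.posix_fallocate`, which on Linux is normally backed by fallocate with mode 0. The FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE modes are not exposed by the `os` module and are left out of this sketch. The function name `preallocate_demo` is hypothetical.

```python
import os
import tempfile


def preallocate_demo(length=8192):
    """Return (size_before, size_after) around a mode-0 style preallocation."""
    fd, path = tempfile.mkstemp()
    try:
        size_before = os.fstat(fd).st_size
        # Reserve `length` bytes starting at offset 0; no data is transferred,
        # but the file grows to cover the allocated range.
        os.posix_fallocate(fd, 0, length)
        size_after = os.fstat(fd).st_size
        return size_before, size_after
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(preallocate_demo())
```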
**ftruncate**
I/O engine that sends ftruncate(2) operations in response
to write (DDIR_WRITE) events. Each ftruncate issued sets the file's
size to the current block offset. The `blocksize` option is ignored.
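A small sketch of what "sets the file's size to the current block offset" means in practice: each simulated write event becomes one `os.ftruncate` call. The function name `ftruncate_sequence` and the sample offsets are assumptions for the example.

```python
import os
import tempfile


def ftruncate_sequence(offsets):
    """Apply one ftruncate per offset and return the file size after each call."""
    fd, path = tempfile.mkstemp()
    try:
        sizes = []
        for offset in offsets:
            os.ftruncate(fd, offset)              # file size becomes the block offset
            sizes.append(os.fstat(fd).st_size)
        return sizes
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(ftruncate_sequence([4096, 8192, 0, 12288]))
```

The file can shrink as well as grow, since the size simply follows whichever offset comes next.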
**e4defrag**
I/O engine that does regular EXT4_IOC_MOVE_EXT ioctls to simulate
defragment activity in response to DDIR_WRITE events.
**rados**
I/O engine supporting direct access to Ceph Reliable Autonomic
Distributed Object Store (RADOS) via librados. This ioengine
defines engine specific options.
**rbd**
I/O engine supporting direct access to Ceph Rados Block Devices
(RBD) via librbd without the need to use the kernel rbd driver. This
ioengine defines engine specific options.
**http**
I/O engine supporting GET/PUT requests over HTTP(S) with libcurl to
a WebDAV or S3 endpoint. This ioengine defines engine specific options.
This engine only supports direct I/O with iodepth=1; you need to scale this
via numjobs. blocksize defines the size of the objects to be created.
TRIM is translated to object deletion.
**gfapi**
Uses the GlusterFS libgfapi sync interface for direct access to
GlusterFS volumes without having to go through FUSE. This ioengine
defines engine specific options.
**gfapi_async**
Uses the GlusterFS libgfapi async interface for direct access to
GlusterFS volumes without having to go through FUSE. This ioengine
defines engine specific options.
**libhdfs**
Read and write through Hadoop (HDFS). The `filename` option
is used to specify the host,port of the HDFS name-node to connect to. This
engine interprets offsets a little differently. In HDFS, files once
created cannot be modified, so random writes are not possible. To
imitate this, the libhdfs engine expects a bunch of small files to be
created over HDFS and will randomly pick a file from them
based on the offset generated by the fio backend (see the example
job file to create such files, and use the `rw=write` option). Please
note that it may be necessary to set environment variables to work
with HDFS/libhdfs properly. Each job uses its own connection to
HDFS.
**mtd**
Read, write and erase an MTD character device (e.g.,
/dev/mtd0). Discards are treated as erases. Depending on the
underlying device type, the I/O may have to go in a certain pattern,
e.g., on NAND, writing sequentially to erase blocks and discarding
before overwriting. The `trimwrite` mode works well for this
constraint.
**pmemblk**
Read and write using filesystem DAX to a file on a filesystem
mounted with DAX on a persistent memory device through the PMDK
libpmemblk library.
**dev-dax**
Read and write using device DAX to a persistent memory device (e.g.,
/dev/dax0.0) through the PMDK libpmem library.
**external**
Prefix to specify loading an external I/O engine object file. Append
the engine filename, e.g. `ioengine=external:/tmp/foo.o` to load
ioengine foo.o in /tmp. The path can be either
absolute or relative. See engines/skeleton_external.c for
details of writing an external I/O engine.
**filecreate**
Simply create the files and do no I/O to them. You still need to
set `filesize` so that all the accounting still occurs, but no
actual I/O will be done other than creating the file.
**filestat**
Simply do stat() and do no I/O to the file. You need to set `filesize`
and `nrfiles`, so that files will be created.
This engine is meant to measure file lookup and metadata access.
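As a rough sketch of what a stat-only workload looks like, the code below creates a few sparse files of a given size and then times nothing but the `os.stat` calls. The function name `stat_files` and its parameters mirror the `nrfiles` and `filesize` options by name only; this is an illustration, not fio's implementation.

```python
import os
import tempfile
import time


def stat_files(nrfiles, filesize):
    """Create nrfiles files of filesize bytes, stat each one, return (sizes, seconds)."""
    with tempfile.TemporaryDirectory() as directory:
        paths = []
        for i in range(nrfiles):
            path = os.path.join(directory, f"file.{i}")
            with open(path, "wb") as f:
                f.truncate(filesize)          # set the size without writing data
            paths.append(path)
        start = time.perf_counter()
        sizes = [os.stat(p).st_size for p in paths]   # lookup + metadata access only
        elapsed = time.perf_counter() - start
    return sizes, elapsed


if __name__ == "__main__":
    sizes, elapsed = stat_files(4, 4096)
    print(sizes, f"{elapsed:.6f}s")
```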
**libpmem**
Read and write using mmap I/O to a file on a filesystem
mounted with DAX on a persistent memory device through the PMDK
libpmem library.
**ime_psync**
Synchronous read and write using DDN's Infinite Memory Engine (IME).
This engine is very basic and issues calls to IME whenever an IO is
queued.
**ime_psyncv**
Synchronous read and write using DDN's Infinite Memory Engine (IME).
This engine uses iovecs and will try to stack as many I/Os as possible
(if the I/Os are "contiguous" and the I/O depth is not exceeded)
before issuing a call to IME.
**ime_aio**
Asynchronous read and write using DDN's Infinite Memory Engine (IME).
This engine will try to stack as many I/Os as possible by creating
requests for IME. FIO will then decide when to commit these requests.
**libiscsi**
Read and write an iSCSI LUN with libiscsi.
**nbd**
Read and write a Network Block Device (NBD).
References: