Welcome to KvikIO's C++ documentation!

KvikIO is a Python and C++ library for high-performance file IO. It provides C++ and Python bindings to cuFile, which enables GPUDirect Storage (GDS). KvikIO also works efficiently when GDS isn't available and can read and write both host and device data seamlessly.

KvikIO C++ is part of the RAPIDS suite of open-source software libraries for GPU-accelerated data science.


Note that this is the documentation for the C++ library. For the Python documentation, see kvikio.


Features

  • Object-oriented API.
  • Exception handling.
  • Concurrent reads and writes using an internal thread pool.
  • Non-blocking API.
  • Handle both host and device IO seamlessly.

Installation

For convenience, we release Conda packages that make it easy to include KvikIO in your CMake projects.

Conda/Mamba

We strongly recommend using mamba in place of conda, which we will do throughout the documentation.

Install the stable release from the rapidsai channel with the following:

# Install in existing environment
mamba install -c rapidsai -c conda-forge libkvikio
# Create new environment (CUDA 11.8)
mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=11.8 libkvikio
# Create new environment (CUDA 12.5)
mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=12.5 libkvikio

Install the nightly release from the rapidsai-nightly channel with the following:

# Install in existing environment
mamba install -c rapidsai-nightly -c conda-forge libkvikio
# Create new environment (CUDA 11.8)
mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.12 cuda-version=11.8 libkvikio
# Create new environment (CUDA 12.5)
mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.12 cuda-version=12.5 libkvikio

Note: if the nightly install doesn't work, set channel_priority: flexible in your .condarc.


Include KvikIO in a CMake project

An example of how to include KvikIO in an existing CMake project can be found here: https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/downstream/.

Build from source

To build the C++ example run:

./build.sh libkvikio

Then run the example:

./examples/basic_io

Runtime Settings

Compatibility Mode (KVIKIO_COMPAT_MODE)

When KvikIO runs in compatibility mode, it doesn't load libcufile.so. Instead, reads and writes are done using POSIX. Note that this is not the same as the compatibility mode in cuFile: KvikIO may perform I/O in non-compatibility mode through the cuFile library while cuFile itself is configured to operate in its own compatibility mode. For more details, refer to cuFile compatibility mode and cuFile environment variables.

The environment variable KVIKIO_COMPAT_MODE has three options (case-insensitive):

  • ON (aliases: TRUE, YES, 1): Enable the compatibility mode.
  • OFF (aliases: FALSE, NO, 0): Disable the compatibility mode, and enforce cuFile I/O. GDS will be activated if the system requirements for cuFile are met and cuFile is properly configured. However, if the system is not suited for cuFile, I/O operations under the OFF option may error out, crash or hang.
  • AUTO: Try cuFile I/O first, and fall back to POSIX I/O if the system requirements for cuFile are not met.
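The three options and their aliases can be pictured as a small case-insensitive parser. The following is only a sketch of that mapping, not KvikIO's implementation: the names CompatModeSketch, parse_compat_mode and compat_mode_usage are hypothetical, and returning an empty optional for unrecognized input is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical enum for this sketch; not taken from the KvikIO headers.
enum class CompatModeSketch { OFF, ON, AUTO };

// Map a KVIKIO_COMPAT_MODE-style string to a mode, ignoring case.
// Unrecognized input yields std::nullopt (an assumption of this sketch).
inline std::optional<CompatModeSketch> parse_compat_mode(std::string value)
{
  std::transform(value.begin(), value.end(), value.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  if (value == "on" || value == "true" || value == "yes" || value == "1") {
    return CompatModeSketch::ON;
  }
  if (value == "off" || value == "false" || value == "no" || value == "0") {
    return CompatModeSketch::OFF;
  }
  if (value == "auto") { return CompatModeSketch::AUTO; }
  return std::nullopt;
}

// Usage: every alias maps to the same mode regardless of case.
inline void compat_mode_usage()
{
  assert(parse_compat_mode("TRUE") == CompatModeSketch::ON);
  assert(parse_compat_mode("no") == CompatModeSketch::OFF);
  assert(parse_compat_mode("Auto") == CompatModeSketch::AUTO);
  assert(!parse_compat_mode("maybe").has_value());
}
```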

Under AUTO, KvikIO falls back to the compatibility mode:

  • when libcufile.so cannot be found.
  • when running in Windows Subsystem for Linux (WSL).
  • when /run/udev isn't readable, which typically happens when running inside a Docker container not launched with --volume /run/udev:/run/udev:ro.

This setting can also be programmatically controlled by defaults::set_compat_mode() and defaults::compat_mode_reset().

Thread Pool (KVIKIO_NTHREADS)

KvikIO can use multiple threads for IO automatically. Set the environment variable KVIKIO_NTHREADS to the number of threads in the thread pool. If not set, the default value is 1.

This setting can also be controlled by defaults::thread_pool_nthreads() and defaults::thread_pool_nthreads_reset().
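As a rough illustration of "read the variable, otherwise default to 1", the sketch below parses a thread count from a raw environment value. The helper names are hypothetical, and silently falling back on malformed or zero values is an assumption of this sketch; KvikIO itself may reject such input instead.

```cpp
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <cstring>
#include <system_error>

// Parse a thread count from a raw environment value (may be nullptr).
// Returns `fallback` when the value is missing, malformed, or zero.
inline unsigned int nthreads_from_value(const char* raw, unsigned int fallback = 1)
{
  assert(fallback > 0);  // the fallback itself must be a usable thread count
  if (raw == nullptr) { return fallback; }
  unsigned int value = 0;
  const char* end    = raw + std::strlen(raw);
  auto [ptr, ec]     = std::from_chars(raw, end, value);
  if (ec != std::errc{} || ptr != end || value == 0) { return fallback; }
  return value;
}

// Convenience wrapper that looks up KVIKIO_NTHREADS in the process environment.
inline unsigned int nthreads_from_env()
{
  return nthreads_from_value(std::getenv("KVIKIO_NTHREADS"));
}
```

For example, launching a program with KVIKIO_NTHREADS=8 in its environment would make nthreads_from_env() return 8 in this sketch.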

Task Size (KVIKIO_TASK_SIZE)

KvikIO splits parallel IO operations into multiple tasks. Set the environment variable KVIKIO_TASK_SIZE to the maximum task size (in bytes). If not set, the default value is 4194304 (4 MiB).

This setting can also be controlled by defaults::task_size() and defaults::task_size_reset().
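To make the effect of the task size concrete, the sketch below splits one request into chunks of at most task_size bytes. It is a simplified model under stated assumptions, not KvikIO's scheduler: TaskSketch and split_into_tasks are hypothetical names, and the real library may distribute work differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One chunk of a larger IO request (hypothetical type for this sketch).
struct TaskSketch {
  std::size_t offset;  // byte offset into the file
  std::size_t size;    // number of bytes covered by this task
};

// Split `total_size` bytes starting at `file_offset` into tasks of at most
// `task_size` bytes each. The default mirrors the documented 4 MiB.
inline std::vector<TaskSketch> split_into_tasks(std::size_t total_size,
                                                std::size_t file_offset,
                                                std::size_t task_size = 4194304)
{
  assert(task_size > 0);
  std::vector<TaskSketch> tasks;
  std::size_t done = 0;
  while (done < total_size) {
    std::size_t const n = std::min(task_size, total_size - done);
    tasks.push_back({file_offset + done, n});
    done += n;
  }
  return tasks;
}
```

With the default task size, a 10 MiB request yields three tasks in this model: two of 4 MiB and a final one of 2 MiB.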

GDS Threshold (KVIKIO_GDS_THRESHOLD)

To improve the performance of small IO requests, .pread() and .pwrite() implement a shortcut that circumvents the thread pool and uses the POSIX backend directly. Set the environment variable KVIKIO_GDS_THRESHOLD to the minimum request size (in bytes) at which GDS is used. If not set, the default value is 1048576 (1 MiB).

This setting can also be controlled by defaults::gds_threshold() and defaults::gds_threshold_reset().
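The threshold amounts to a size comparison made before any work is scheduled. A minimal sketch of that decision follows, assuming requests strictly smaller than the threshold take the shortcut; IoPathSketch, choose_io_path and io_path_usage are hypothetical names, not part of the KvikIO API.

```cpp
#include <cassert>
#include <cstddef>

// The two routes described above (hypothetical names for this sketch).
enum class IoPathSketch {
  DirectPosix,  // small request: bypass the thread pool, use POSIX directly
  ThreadPool    // large request: split into tasks and run on the thread pool
};

// Pick a route for a request of `request_size` bytes. The default threshold
// mirrors the documented 1 MiB.
constexpr IoPathSketch choose_io_path(std::size_t request_size,
                                      std::size_t gds_threshold = 1048576)
{
  return request_size < gds_threshold ? IoPathSketch::DirectPosix : IoPathSketch::ThreadPool;
}

// Usage: a 4 KiB read takes the shortcut, a 1 MiB read does not.
inline void io_path_usage()
{
  assert(choose_io_path(4096) == IoPathSketch::DirectPosix);
  assert(choose_io_path(1048576) == IoPathSketch::ThreadPool);
}
```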

Size of the Bounce Buffer (KVIKIO_BOUNCE_BUFFER_SIZE)

KvikIO might have to use intermediate host buffers (one per thread) when copying between files and device memory. Set the environment variable KVIKIO_BOUNCE_BUFFER_SIZE to the size (in bytes) of these "bounce" buffers. If not set, the default value is 16777216 (16 MiB).

This setting can also be controlled by defaults::bounce_buffer_size() and defaults::bounce_buffer_size_reset().
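The role of a bounce buffer can be sketched with plain host memory: data moves in pieces no larger than the buffer, each piece staged once before reaching its destination. This is a host-only illustration, not KvikIO code; copy_via_bounce_buffer is a hypothetical helper, and the two memcpy calls stand in for the file read and the host-to-device copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy `size` bytes from `src` to `dst` through an intermediate buffer of at
// most `bounce_size` bytes. Returns the number of staging rounds performed.
// The default mirrors the documented 16 MiB.
inline std::size_t copy_via_bounce_buffer(void* dst,
                                          const void* src,
                                          std::size_t size,
                                          std::size_t bounce_size = 16777216)
{
  assert(bounce_size > 0);
  std::vector<unsigned char> bounce(std::min(size, bounce_size));
  auto* out      = static_cast<unsigned char*>(dst);
  auto const* in = static_cast<unsigned char const*>(src);
  std::size_t rounds = 0;
  std::size_t done   = 0;
  while (done < size) {
    std::size_t const n = std::min(bounce.size(), size - done);
    std::memcpy(bounce.data(), in + done, n);   // stands in for reading the file into host memory
    std::memcpy(out + done, bounce.data(), n);  // stands in for the host-to-device copy
    done += n;
    ++rounds;
  }
  return rounds;
}
```

A larger buffer means fewer rounds per transfer at the cost of more host memory per thread, which is the trade-off this setting exposes.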

Example

#include <cstddef>
#include <future>
#include <cuda_runtime.h>
#include <kvikio/defaults.hpp>
#include <kvikio/file_handle.hpp>

int main()
{
  // Allocate two device buffers `a` and `b`
  constexpr std::size_t size = 100;
  void* a = nullptr;
  void* b = nullptr;
  cudaMalloc(&a, size);
  cudaMalloc(&b, size);

  // Write `a` to file
  kvikio::FileHandle fw("test-file", "w");
  std::size_t written = fw.write(a, size);
  fw.close();

  // Read file into `b`
  kvikio::FileHandle fr("test-file", "r");
  std::size_t read = fr.read(b, size);
  fr.close();

  // Read file into `b` in parallel using 16 threads
  kvikio::defaults::thread_pool_nthreads_reset(16);
  {
    kvikio::FileHandle f("test-file", "r");
    std::future<std::size_t> future = f.pread(b, size, 0);  // Non-blocking
    std::size_t read_parallel = future.get();               // Blocking
    // Notice, `f` closes automatically on destruction.
  }

  // Release the device buffers
  cudaFree(a);
  cudaFree(b);
  return 0;
}

For a full runnable example see https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/basic_io.cpp.